W0827 12:30:23.489000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:23.489000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:23.489000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0827 12:30:23.489000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:24.844000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:24.844000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:24.844000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0827 12:30:24.844000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:26.236000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:26.236000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:26.236000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0827 12:30:26.236000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:27.440000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:27.440000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:27.440000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0827 12:30:27.440000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:29.683000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:29.683000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:29.683000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0827 12:30:29.683000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:31.998000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:31.998000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:31.998000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0827 12:30:31.998000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:32.661000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:32.661000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:32.661000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0827 12:30:32.661000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:34.217000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:30:34.217000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:30:34.217000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0827 12:30:34.217000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** [2025-08-27 12:30:46,072] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,089] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,090] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,092] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,098] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,100] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,123] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,125] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,226] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,233] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directorydf: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzydf: : No such file or directory /tmp/triton_lzy: No such file or directory [2025-08-27 12:30:46,238] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,238] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto 
detect) [2025-08-27 12:30:46,240] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,241] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,242] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,243] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 12:30:46,615] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,646] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,646] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,649] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,656] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,656] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,656] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,659] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,664] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,745] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to 
cuda (auto detect) [2025-08-27 12:30:46,774] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,778] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,784] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,802] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,805] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:46,807] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: df: /tmp/triton_lzy: No such file or directory/tmp/triton_lzy df: : No such file or directory /tmp/triton_lzydf: df: : No such file or directory/tmp/triton_lzy /tmp/triton_lzy: No such file or directory: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 12:30:48,241] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,293] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,320] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,324] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,333] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,341] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,346] [INFO] 
[real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:48,347] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory : No such file or directory [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,538] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,563] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,564] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,738] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,738] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,738] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,738] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,738] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,739] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,739] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2025-08-27 12:30:48,739] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,767] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:48,975] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:49,228] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,258] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,268] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,275] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:49,275] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,278] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:49,279] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,280] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,370] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,370] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,370] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,371] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,371] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,371] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,371] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,373] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,530] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,531] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,531] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,531] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,531] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,532] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,532] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,532] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:49,546] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,550] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:49,554] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:49,557] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:49,557] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,557] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:49,558] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:49,882] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,023] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:50,195] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,196] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,197] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,197] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,202] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,204] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,207] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:50,330] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,331] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,335] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,338] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:50,344] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,345] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:50,346] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:51,427] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,427] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,427] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,427] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,427] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,428] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,428] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:51,431] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:30:52,473] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:30:52,896] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:52,897] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:52,897] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:52,905] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:52,905] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:52,906] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:30:52,906] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 12:30:55,268] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,457] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,472] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,476] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,482] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,484] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,491] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:55,495] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzydf: : No such file or directory df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory : No such file or directory df: df: /tmp/triton_lzydf: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory: No such file or directory : No such file or directory [2025-08-27 12:30:57,030] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,156] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,214] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,225] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,231] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,242] [INFO] 
[real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,253] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,254] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzydf: : No such file or directory /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 12:30:57,611] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,818] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,838] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,841] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,846] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,847] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,848] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:30:57,851] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory 
[2025-08-27 12:31:00,751] [INFO] [comm.py:652:init_distributed] cdb=None
[2025-08-27 12:31:01,345] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2025-08-27 12:31:07,619] [INFO] [partition_parameters.py:348:__exit__] finished initializing model - num_params = 729, num_elems = 8.29B
Loading checkpoint shards: 0%| | 0/5 [00:00
[rank49]: Traceback (most recent call last):
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json
[rank49]:     with open(path, "r") as f:
[rank49]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'
[rank49]: During handling of the above exception, another exception occurred:
[rank49]: Traceback (most recent call last):
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in <module>
[rank49]:     train(attn_implementation="flash_attention_2")
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train
[rank49]:     data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args)
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module
[rank49]:     train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args)
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__
[rank49]:     data = self.load_json(data_args.data_path, parse_line=True)
[rank49]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json
[rank49]:     raise ValueError(f"Failed to load json file {path}")
[rank49]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'
Exception:[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank18]: Traceback (most recent call last): [rank18]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank18]: with open(path, "r") as f: [rank18]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank18]: During handling of the above exception, another exception occurred: [rank18]: Traceback (most recent call last): [rank18]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank18]: train(attn_implementation="flash_attention_2") [rank18]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank18]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank18]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank18]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank18]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank18]: data = self.load_json(data_args.data_path, parse_line=True) [rank18]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank18]: raise ValueError(f"Failed to load json file {path}") [rank18]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank16]: Traceback (most recent call last): [rank16]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank16]: with open(path, "r") as f: [rank16]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank16]: During handling of the above exception, another exception occurred: [rank16]: Traceback (most recent call last): [rank16]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank16]: train(attn_implementation="flash_attention_2") [rank16]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank16]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank16]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank16]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank16]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank16]: data = self.load_json(data_args.data_path, parse_line=True) [rank16]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank16]: raise ValueError(f"Failed to load json file {path}") [rank16]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank20]: Traceback (most recent call last): [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank20]: with open(path, "r") as f: [rank20]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank20]: During handling of the above exception, another exception occurred: [rank20]: Traceback (most recent call last): [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank20]: train(attn_implementation="flash_attention_2") [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank20]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank20]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank20]: data = self.load_json(data_args.data_path, parse_line=True) [rank20]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank20]: raise ValueError(f"Failed to load json file {path}") [rank20]: ValueError: Failed to load json file 
../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank23]: Traceback (most recent call last): [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank23]: with open(path, "r") as f: [rank23]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank23]: During handling of the above exception, another exception occurred: [rank23]: Traceback (most recent call last): [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank23]: train(attn_implementation="flash_attention_2") [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank23]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank23]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank23]: data = self.load_json(data_args.data_path, parse_line=True) [rank23]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank23]: raise ValueError(f"Failed to load json file {path}") [rank23]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank22]: Traceback (most recent call last): [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspacFailed to load json file 
../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank7]: Traceback (most recent call last): [rank7]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank7]: with open(path, "r") as f: [rank7]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank7]: During handling of the above exception, another exception occurred: [rank7]: Traceback (most recent call last): [rank7]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank7]: train(attn_implementation="flash_attention_2") [rank7]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank7]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank7]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank7]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank7]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank7]: data = self.load_json(data_args.data_path, parse_line=True) [rank7]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank7]: raise ValueError(f"Failed to load json file {path}") [rank7]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank3]: Traceback (most recent call last): [rank3]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank3]: with open(path, "r") as f: [rank3]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank3]: During handling of the above exception, another exception occurred: [rank3]: Traceback (most recent call last): [rank3]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank3]: train(attn_implementation="flash_attention_2") [rank3]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank3]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank3]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank3]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank3]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank3]: data = self.load_json(data_args.data_path, parse_line=True) [rank3]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank3]: raise ValueError(f"Failed to load json file {path}") [rank3]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank5]: Traceback (most recent call last): [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank5]: with open(path, "r") as f: [rank5]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank5]: During handling of the above exception, another exception occurred: [rank5]: Traceback (most recent call last): [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank5]: train(attn_implementation="flash_attention_2") [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank5]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank5]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank5]: data = self.load_json(data_args.data_path, parse_line=True) [rank5]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank5]: raise ValueError(f"Failed to load json file {path}") [rank5]: ValueError: Failed to load json file 
../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank4]: Traceback (most recent call last): [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank4]: with open(path, "r") as f: [rank4]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank4]: During handling of the above exception, another exception occurred: [rank4]: Traceback (most recent call last): [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank4]: train(attn_implementation="flash_attention_2") [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank4]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank4]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank4]: data = self.load_json(data_args.data_path, parse_line=True) [rank4]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank4]: raise ValueError(f"Failed to load json file {path}") [rank4]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank6]: Traceback (most recent call last): [rank6]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank6]: 
with open(path, "r") as f: [rank6]: FileNotFoundError: [Errno 2] No such fie/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank22]: with open(path, "r") as f: [rank22]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank22]: During handling of the above exception, another exception occurred: [rank22]: Traceback (most recent call last): [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank22]: train(attn_implementation="flash_attention_2") [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank22]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank22]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank22]: data = self.load_json(data_args.data_path, parse_line=True) [rank22]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank22]: raise ValueError(f"Failed to load json file {path}") [rank22]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank17]: Traceback (most recent call last): [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank17]: with open(path, "r") as f: [rank17]: FileNotFoundError: [Errno 2] No such file or 
directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank17]: During handling of the above exception, another exception occurred: [rank17]: Traceback (most recent call last): [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank17]: train(attn_implementation="flash_attention_2") [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank17]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank17]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank17]: data = self.load_json(data_args.data_path, parse_line=True) [rank17]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank17]: raise ValueError(f"Failed to load json file {path}") [rank17]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank19]: Traceback (most recent call last): [rank19]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank19]: with open(path, "r") as f: [rank19]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank19]: During handling of the above exception, another exception occurred: [rank19]: Traceback (most recent call last): [rank19]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank19]: train(attn_implementation="flash_attention_2") [rank19]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank19]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank19]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank19]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank19]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank19]: data = self.load_json(data_args.data_path, parse_line=True) [rank19]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank19]: raise ValueError(f"Failed to load json file {path}") [rank19]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank21]: Traceback (most recent call last): [rank21]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank21]: with open(path, "r") as f: [rank21]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank21]: During handling of the above exception, another exception occurred: [rank21]: Traceback (most recent call last): [rank21]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank21]: train(attn_implementation="flash_attention_2") [rank21]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank21]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank21]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank21]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank21]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank21]: data = self.load_json(data_args.data_path, parse_line=True) [rank21]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank21]: raise ValueError(f"Failed to load json file {path}") [rank21]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json le or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank6]: During handling of the above exception, another exception occurred: [rank6]: Traceback (most recent call last): [rank6]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank6]: train(attn_implementation="flash_attention_2") [rank6]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank6]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank6]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank6]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank6]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank6]: data = self.load_json(data_args.data_path, parse_line=True) [rank6]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank6]: raise ValueError(f"Failed to load json file {path}") [rank6]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank2]: Traceback (most recent call last): [rank2]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank2]: with open(path, "r") as f: [rank2]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank2]: During handling of the above exception, another exception occurred: [rank2]: Traceback (most recent call last): [rank2]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank2]: train(attn_implementation="flash_attention_2") [rank2]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank2]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank2]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank2]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank2]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank2]: data = self.load_json(data_args.data_path, parse_line=True) [rank2]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank2]: raise ValueError(f"Failed to load json file {path}") [rank2]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank1]: Traceback (most recent call last): [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank1]: with open(path, "r") as f: [rank1]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank1]: During handling of the above exception, another exception occurred: [rank1]: Traceback (most recent call last): [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank1]: train(attn_implementation="flash_attention_2") [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank1]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank1]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank1]: data = self.load_json(data_args.data_path, parse_line=True) [rank1]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank1]: raise ValueError(f"Failed to load json file {path}") [rank1]: ValueError: Failed to load json file 
../internvl_chat/data/internvl_meta/meta/meta_250827_2.json Loading checkpoint shards: 100%|██████████| 5/5 [00:21<00:00, 3.50s/it] Loading checkpoint shards: 100%|██████████| 5/5 [00:21<00:00, 4.38s/it] Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank58]: Traceback (most recent call last): [rank58]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank58]: with open(path, "r") as f: 
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model.
`use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
[rank48]:[W827 12:31:29.413110601 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
Vision Module - Attention Blocks:
  Trainable Block Indices: None
  Non-Trainable Block Indices: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
Merger Module Trainable: True
LLM Module - Embed Tokens Trainable: True
LLM Module - Trainable Layer Indices: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
LLM Module - Non-Trainable Layer Indices: None
Rank 0: [TCSLoader] config_path: ~/petreloss.conf
Rank 0: --> before Client(conf_path)
Rank 0: --> after Client(conf_path)
Rank 0: Loading datasets: ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json.
Exception: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'
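The [Errno 2] above comes from a relative data_path (`../internvl_chat/...`), which resolves against the process's current working directory, not the script's location. A small demonstration with invented paths (mirroring the log's layout, not the actual cluster filesystem) shows how the same relative path succeeds or fails depending only on where the command is launched from:

```shell
# Hypothetical demo directories, not the real cluster layout.
mkdir -p /tmp/cwd_demo/internvl_chat/data/internvl_meta/meta
mkdir -p /tmp/cwd_demo/qwen-vl-finetune
mkdir -p /tmp/cwd_demo/other/launch
echo '{}' > /tmp/cwd_demo/internvl_chat/data/internvl_meta/meta/meta_250827_2.json

# Launched from the expected sibling directory, the relative path resolves.
cd /tmp/cwd_demo/qwen-vl-finetune
test -f ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json && echo "found"

# Launched from a directory without that parent layout, the same relative
# path yields "No such file or directory" even though the file exists.
cd /tmp/cwd_demo/other/launch
test -f ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json || echo "missing"
```

In a torchrun job this means the launch directory (or an absolute data_path) must match what the relative path assumes, on every node.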
Exception: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception:[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'[Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank28]: Traceback (most recent call last): [rank28]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank28]: with open(path, "r") as f: [rank28]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank28]: During handling of the above exception, another exception occurred: [rank28]: Traceback (most recent call last): [rank28]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank28]: train(attn_implementation="flash_attention_2") [rank28]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank28]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank28]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank28]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank28]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank28]: data = self.load_json(data_args.data_path, parse_line=True) [rank28]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank28]: raise ValueError(f"Failed to load json file {path}") [rank28]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank30]: Traceback (most recent call last): [rank30]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank30]: with open(path, "r") as f: [rank30]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank30]: During handling of the above exception, another exception occurred: [rank30]: Traceback (most recent call last): [rank30]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank30]: train(attn_implementation="flash_attention_2") [rank30]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank30]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank30]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank30]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank30]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank30]: data = self.load_json(data_args.data_path, parse_line=True) [rank30]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank30]: raise ValueError(f"Failed to load json file {path}") [rank30]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank25]: Traceback (most recent call last): [rank25]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank25]: with open(path, "r") as f: [rank25]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank25]: During handling of the above exception, another exception occurred: [rank25]: Traceback (most recent call last): [rank25]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank25]: train(attn_implementation="flash_attention_2") [rank25]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank25]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank25]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank25]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank25]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank25]: data = self.load_json(data_args.data_path, parse_line=True) [rank25]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank25]: raise ValueError(f"Failed to load json file {path}") [rank25]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank24]: Traceback (most recent call last): [rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank24]: with open(path, "r") as f: [rank24]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank24]: During handling of the above exception, another exception occurred: [rank24]: Traceback (most recent call last): [rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank24]: train(attn_implementation="flash_attention_2") [rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank24]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank24]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) 
[rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank24]: data = self.load_json(data_args.data_path, parse_line=True) [rank24]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank24]: raise ValueError(f"Failed to load json file {path}") [rank24]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank27]: Traceback (most recent call last): [rank27]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank27]: with open(path, "r") as f: [rank27]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank27]: During handling of the above exception, another exception occurred: [rank27]: Traceback (most recent call last): [rank27]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank27]: train(attn_implementation="flash_attention_2") [rank27]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank27]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank27]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank27]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank27]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank27]: data = self.load_json(data_args.data_path, parse_line=True) [rank27]: File 
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json
[rank27]:     raise ValueError(f"Failed to load json file {path}")
[rank27]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
[rank26]: Traceback (most recent call last):
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json
[rank26]:     with open(path, "r") as f:
[rank26]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'
[rank26]: During handling of the above exception, another exception occurred:
[rank26]: Traceback (most recent call last):
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in <module>
[rank26]:     train(attn_implementation="flash_attention_2")
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train
[rank26]:     data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args)
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module
[rank26]:     train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args)
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__
[rank26]:     data = self.load_json(data_args.data_path, parse_line=True)
[rank26]:   File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json
[rank26]:     raise ValueError(f"Failed to load json file {path}")
[rank26]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
[rank0]:[W827 12:31:30.489554142 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group.
This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json. Exception: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json'
W0827 12:31:31.055000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 907 closing signal SIGTERM
W0827 12:31:31.154000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 937 closing signal SIGTERM
W0827 12:31:31.155000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 904 closing signal SIGTERM
W0827 12:31:31.350000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 889 closing signal SIGTERM
E0827 12:31:31.461000 905 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 3 (pid: 910) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_12:31:31
  host      : 10-102-217-33.kubebrain-node-exporter.kubebrain.svc.pjlab.local
  rank      : 11 (local_rank: 3)
  exitcode  : 1 (pid: 910)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank36]: train(attn_implementation="flash_attention_2") [rank36]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank36]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank36]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank36]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank36]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank36]: data = self.load_json(data_args.data_path, parse_line=True) [rank36]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank36]: raise ValueError(f"Failed to load json file {path}") [rank36]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank33]: Traceback (most recent call last): [rank33]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank33]: with open(path, "r") as f: [rank33]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank33]: During handling of the above exception, another exception occurred: [rank33]: Traceback (most recent call last): [rank33]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank33]: train(attn_implementation="flash_attention_2") [rank33]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank33]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank33]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank33]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank33]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank33]: data = self.load_json(data_args.data_path, parse_line=True) [rank33]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank33]: raise ValueError(f"Failed to load json file {path}") [rank33]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json [rank39]: Traceback (most recent call last): [rank39]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 806, in load_json [rank39]: with open(path, "r") as f: [rank39]: FileNotFoundError: [Errno 2] No such file or directory: '../internvl_chat/data/internvl_meta/meta/meta_250827_2.json' [rank39]: During handling of the above exception, another exception occurred: [rank39]: Traceback (most recent call last): [rank39]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 203, in [rank39]: train(attn_implementation="flash_attention_2") [rank39]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/train/train_qwen.py", line 172, in train [rank39]: data_module = make_supervised_data_module(tokenizer=tokenizer, data_args=data_args) [rank39]: File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1741, in make_supervised_data_module [rank39]: train_dataset = LazySupervisedDataset(tokenizer=tokenizer, data_args=data_args) [rank39]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 708, in __init__ [rank39]: data = self.load_json(data_args.data_path, parse_line=True) [rank39]: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 812, in load_json [rank39]: raise ValueError(f"Failed to load json file {path}") [rank39]: ValueError: Failed to load json file ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json E0827 12:31:31.585000 902 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 1 (pid: 905) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:31 host : 10-102-200-17.monitoring-dcgm-exporter.kubebrain.svc.pjlab.local rank : 49 (local_rank: 1) exitcode : 1 (pid: 905) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ W0827 12:31:31.648000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM W0827 12:31:31.649000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM W0827 12:31:31.649000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM W0827 12:31:31.649000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM W0827 12:31:31.649000 917 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 924 closing signal SIGTERM W0827 12:31:31.649000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 925 closing signal SIGTERM E0827 12:31:31.836000 935 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 5 (pid: 942) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( 
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:31 host : 10-102-214-58.csi-metrics-rbdplugin.ceph-csi.svc.pjlab.local rank : 61 (local_rank: 5) exitcode : 1 (pid: 942) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ E0827 12:31:31.866000 886 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 1 (pid: 890) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, 
list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: [1]: time : 2025-08-27_12:31:31 host : 10-102-210-14.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local rank : 3 (local_rank: 3) exitcode : 1 (pid: 892) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:31 host : 10-102-210-14.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local rank : 1 (local_rank: 1) exitcode : 1 (pid: 890) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ [rank32]:[W827 12:31:32.839132400 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. 
This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator()) W0827 12:31:32.166000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM W0827 12:31:32.166000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM W0827 12:31:32.167000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM W0827 12:31:32.167000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 923 closing signal SIGTERM W0827 12:31:32.168000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 924 closing signal SIGTERM W0827 12:31:32.168000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 925 closing signal SIGTERM W0827 12:31:32.169000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 926 closing signal SIGTERM E0827 12:31:32.365000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 4 (pid: 923) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call 
last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: [1]: time : 2025-08-27_12:31:31 host : 10-102-206-20.csi-metrics-rbdplugin.ceph-csi.svc.pjlab.local rank : 23 (local_rank: 7) exitcode : 1 (pid: 926) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:31 host : 10-102-206-20.csi-metrics-rbdplugin.ceph-csi.svc.pjlab.local rank : 20 (local_rank: 4) exitcode : 1 (pid: 923) error_file: traceback : To enable traceback see: 
https://pytorch.org/docs/stable/elastic/errors.html ============================================================ W0827 12:31:32.498000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM W0827 12:31:32.498000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM W0827 12:31:32.499000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM W0827 12:31:32.499000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 923 closing signal SIGTERM W0827 12:31:32.499000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 924 closing signal SIGTERM W0827 12:31:32.499000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 926 closing signal SIGTERM W0827 12:31:32.499000 918 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 927 closing signal SIGTERM E0827 12:31:32.812000 917 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 1 (pid: 920) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): 
File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:32 host : 10-102-202-21.node-local-dns.kube-system.svc.pjlab.local rank : 25 (local_rank: 1) exitcode : 1 (pid: 920) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ E0827 12:31:32.957000 918 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 5 (pid: 925) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:32 host : 10-102-201-36.monitoring-dcgm-exporter.kubebrain.svc.pjlab.local rank : 45 (local_rank: 5) 
exitcode : 1 (pid: 925) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ W0827 12:31:33.064000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 929 closing signal SIGTERM W0827 12:31:33.065000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 930 closing signal SIGTERM W0827 12:31:33.065000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 931 closing signal SIGTERM W0827 12:31:33.065000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 932 closing signal SIGTERM W0827 12:31:33.065000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 933 closing signal SIGTERM W0827 12:31:33.066000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 934 closing signal SIGTERM W0827 12:31:33.066000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 936 closing signal SIGTERM E0827 12:31:33.495000 927 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 6 (pid: 935) of binary: 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_12:31:33 host : 10-102-202-24.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local rank : 38 (local_rank: 6) exitcode : 1 (pid: 935) error_file: traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html ============================================================ W0827 
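Every rank dies the same way: load_json hits a FileNotFoundError on the relative metadata path and re-raises it as a ValueError, which is why each rank prints two chained tracebacks ("During handling of the above exception, another exception occurred"). A minimal sketch of that pattern, assuming a hypothetical reimplementation (the actual body of data_qwen_2.py's load_json may differ), using an explicit `raise ... from` so the original cause stays attached:

```python
import json


def load_json(path, parse_line=False):
    """Sketch of a loader like data_qwen_2.py's load_json (names assumed).

    With parse_line=True the file is treated as JSONL: one JSON object
    per non-empty line. Any I/O error is re-raised as a ValueError that
    keeps the original FileNotFoundError as its explicit __cause__.
    """
    try:
        with open(path, "r") as f:
            if parse_line:
                return [json.loads(line) for line in f if line.strip()]
            return json.load(f)
    except OSError as exc:
        # `from exc` reproduces the chained-traceback shape seen in the log.
        raise ValueError(f"Failed to load json file {path}") from exc
```

Calling this on a missing path surfaces a ValueError whose `__cause__` is the underlying FileNotFoundError, matching the per-rank output above.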
W0827 12:39:26.512000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793]
W0827 12:39:26.512000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 12:39:26.512000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0827 12:39:26.512000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
(identical OMP_NUM_THREADS warning from the remaining agents: 924 at 12:39:26.558, 892 at 12:39:26.999, 904 at 12:39:27.946, 900 at 12:39:28.767, 912 at 12:39:31.216)
W0827 12:39:31.216000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:39:34.063000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:39:34.063000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:39:34.063000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0827 12:39:34.063000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:39:35.625000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] W0827 12:39:35.625000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** W0827 12:39:35.625000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
W0827 12:39:35.625000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] ***************************************** [2025-08-27 12:39:41,448] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,484] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 12:39:41,632] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,642] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,648] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,651] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,658] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,661] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,675] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,675] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,676] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,676] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,678] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,681] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,692] [INFO] 
[real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:41,699] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:42,586] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,701] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,882] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,883] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,887] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,898] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,898] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,906] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,914] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,932] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,952] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,987] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,989] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,991] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,991] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:42,992] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:43,225] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:39:43,346] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
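The failure summary above reports exitcode 1 for rank 38 but leaves `error_file` empty and points at the PyTorch elastic error docs instead of a traceback: torchrun can only surface the failing rank's traceback when the child entrypoint opts in. A minimal sketch of that opt-in, assuming the training script exposes a `main()` function (the name and body here are illustrative, not taken from `qwenvl/train/train_qwen.py`):

```python
# Sketch: opt in to torchrun's per-rank error recording so that a
# ChildFailedError report carries the failing rank's traceback and a
# populated error_file (https://pytorch.org/docs/stable/elastic/errors.html).
from torch.distributed.elastic.multiprocessing.errors import record


@record  # if main() raises, the elastic agent writes the traceback to an error file
def main():
    ...  # hypothetical training entrypoint body


if __name__ == "__main__":
    main()
```

With the decorator in place, the `error_file` and `traceback` fields of the failure summary are filled in on the next crash, so the root cause no longer has to be dug out of per-rank stdout.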
[2025-08-27 12:39:47,628] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /tmp/triton_lzy: No such file or directory
[2025-08-27 12:39:50,330] [INFO] [comm.py:652:init_distributed] cdb=None
[2025-08-27 12:39:50,832] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2025-08-27 12:39:51,497] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
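The repeated `df: /tmp/triton_lzy: No such file or directory` lines come from a disk check (`df`) being run against a cache directory that does not exist on the node yet. A minimal pre-launch fix sketch, assuming `/tmp/triton_lzy` (taken verbatim from the log) is a Triton-style kernel cache path and that it can be configured via the `TRITON_CACHE_DIR` environment variable (that variable name is our assumption, not something shown in this log):

```shell
# Create the cache directory on each node before torchrun starts,
# so the df disk check has a path to stat instead of erroring.
CACHE_DIR="${TRITON_CACHE_DIR:-/tmp/triton_lzy}"
mkdir -p "$CACHE_DIR"
df "$CACHE_DIR" > /dev/null && echo "cache dir ready: $CACHE_DIR"
```

Because `/tmp` is node-local, this has to run on every node (for example in the per-node launch wrapper), not just on the rank-0 host.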
[2025-08-27 12:39:55,195] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,334] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,340] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,396] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,405] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,410] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,414] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 12:39:55,416] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory : No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 12:39:59,948] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:39:59,949] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 12:40:00,526] [INFO] [config.py:733:__init__] Config mesh_device 
None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:40:00,870] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:40:00,874] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:40:00,879] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:40:00,881] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 12:40:00,886] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:40:00,887] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 [2025-08-27 12:40:00,887] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 64 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
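The Flash Attention warning and the config/accelerator INFO lines above are printed once per rank (world_size = 64), which is why the raw log repeats them dozens of times with only the timestamp changing. A minimal sketch of a post-processing filter that collapses such per-rank duplicates (a hypothetical helper, not part of the training code; the timestamp pattern is assumed from the log format above):

```python
import re

# Matches a leading "[YYYY-MM-DD HH:MM:SS,mmm] " timestamp prefix.
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\] ")

def collapse_duplicates(lines):
    """Keep the first occurrence of each message, ignoring timestamps.

    Useful when every rank of a distributed job prints the same
    warning or INFO line and only the timestamp differs.
    """
    seen = set()
    out = []
    for line in lines:
        key = TIMESTAMP.sub("", line).strip()
        if key and key not in seen:
            seen.add(key)
            out.append(line)
    return out
```

Applied to this log, the 64 copies of the `config.py:733` line and of the Flash Attention warning would each collapse to a single entry.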
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Loading checkpoint shards: 0%| | 0/5 [00:00<?, ?it/s]
Rank 0: --> before Client(conf_path)
Rank 0: --> after Client(conf_path)
Rank 0: Loading datasets: ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
Rank 0: Loading guienv
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl with all sampling strategy
Rank 0: Loaded 327972 samples from VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl
Rank 0: Loading omniact
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl with all sampling strategy
Rank 0: Loaded 6720 samples from VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl
Rank 0: Loading ricoig16k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16133 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl
Rank 0: Loading ricosca
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl with all sampling strategy
Rank 0: Loaded 173212 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl
Rank 0: Loading seeclick
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl with all sampling strategy
Rank 0: Loaded 271121 samples from VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl
Rank 0: Loading ui_refexp
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl with all sampling strategy
Rank 0: Loaded 15624 samples from VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl
Rank 0: Loading webui350k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 57389 samples from VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl
Rank 0: Loading widget_captioning
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl with all sampling strategy
Rank 0: Loaded 101426 samples from VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl
Rank 0: Loading aitw-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl
Rank 0: Loading aitw-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl
Rank 0: Loading aitw-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl
Rank 0: Loading amex-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl
Rank 0: Loading amex-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl
Rank 0: Loading amex-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl
Rank 0: Loading android_control
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 149428 samples from VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl
Rank 0: Loading coat
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl with all sampling strategy
Rank 0: Loaded 11833 samples from VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl
Rank 0: Loading guiact-web-multi-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl
Rank 0: Loading guiact-web-multi-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl
Rank 0: Loading guiact-web-multi-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl
Rank 0: Loading guiact-web-single
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl with all sampling strategy
Rank 0: Loaded 67396 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl
Rank 0: Loading guide
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl with all sampling strategy
Rank 0: Loaded 13544 samples from VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl
Rank 0: Loading gui-odyssey-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl
Rank 0: Loading gui-odyssey-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl
Rank 0: Loading gui-odyssey-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl
Rank 0: Loading mind2web-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl
Rank 0: Loading mind2web-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl
Rank 0: Loading mind2web-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl
Rank 0: Loading miniwob-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl
Rank 0: Loading miniwob-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl
Rank 0: Loading miniwob-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl
Rank 0: Loading aguvis_android_control-v2
Rank 0: Skipping aguvis_android_control-v2 due to repeat_time=0
Rank 0: Loading aguvis_coat-v2
Rank 0: Skipping aguvis_coat-v2 due to repeat_time=0
Rank 0: Loading aguvis_docvqa_grounding
Rank 0: Skipping aguvis_docvqa_grounding due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-multi
Rank 0: Skipping aguvis_guiact-web-multi due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-single-v2
Rank 0: Skipping aguvis_guiact-web-single-v2 due to repeat_time=0
Rank 0: Loading aguvis_guide_si_10k-v2
Rank 0: Skipping aguvis_guide_si_10k-v2 due to repeat_time=0
Rank 0: Loading aguvis_guienv
Rank 0: Skipping aguvis_guienv due to repeat_time=0
Rank 0: Loading aguvis_mind2web_train_v1.0.1
Rank 0: Skipping aguvis_mind2web_train_v1.0.1 due to repeat_time=0
Rank 0: Loading aguvis_omniact
Rank 0: Skipping aguvis_omniact due to repeat_time=0
Rank 0: Loading aguvis_osatlas_ui_tars_cleaned
Rank 0: Skipping aguvis_osatlas_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_ricoig16k
Rank 0: Skipping aguvis_ricoig16k due to repeat_time=0
Rank 0: Loading aguvis_ricosca
Rank 0: Skipping aguvis_ricosca due to repeat_time=0
Rank 0: Loading aguvis_seeclick_mi_ui_tars_cleaned
Rank 0: Skipping aguvis_seeclick_mi_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_seeclick_ui_tars_cleaned_fixed
Rank 0: Skipping aguvis_seeclick_ui_tars_cleaned_fixed due to repeat_time=0
Rank 0: Loading aguvis_ui_refexp
Rank 0: Skipping aguvis_ui_refexp due to repeat_time=0
Rank 0: Loading aguvis_webui350k
Rank 0: Skipping aguvis_webui350k due to repeat_time=0
Rank 0: Loading aguvis_widget_captioning
Rank 0: Skipping aguvis_widget_captioning due to repeat_time=0
Rank 0: Loading icon_caption_icon_v0222_description
Rank 0: Skipping icon_caption_icon_v0222_description due to repeat_time=0
Rank 0: Loading icon_grounding_icon_v0222_grounding
Rank 0: Skipping icon_grounding_icon_v0222_grounding due to repeat_time=0
Rank 0: Loading refusal_component_final_1.5m
Rank 0: Skipping refusal_component_final_1.5m due to repeat_time=0
Rank 0: Loading refusal_component_library_snap_icon_data_grounding
Rank 0: Skipping refusal_component_library_snap_icon_data_grounding due to repeat_time=0
Rank 0: Loading refusal_component_v1_130k
Rank 0: Skipping refusal_component_v1_130k due to repeat_time=0
Rank 0: Loading refusal_guienv
Rank 0: Skipping refusal_guienv due to repeat_time=0
Rank 0: Loading refusal_icon_v0222_grounding
Rank 0: Skipping refusal_icon_v0222_grounding due to repeat_time=0
Rank 0: Loading refusal_osatlas_ui_tars_cleaned
Rank 0: Skipping refusal_osatlas_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading refusal_ricosca
Rank 0: Skipping refusal_ricosca due to repeat_time=0
Rank 0: Loading refusal_seeclick_mi_ui_tars_cleaned
Rank 0: Skipping refusal_seeclick_mi_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading refusal_seeclick_ui_tars_cleaned_fixed
Rank 0: Skipping refusal_seeclick_ui_tars_cleaned_fixed due to repeat_time=0
Rank 0: Loading refusal_training_data_icon_grounded_merged
Rank 0: Skipping refusal_training_data_icon_grounded_merged due to repeat_time=0
Rank 0: Loading component_generated_component_final_1.5m_cleaned_split
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl with random:10% sampling strategy
Rank 0: Loaded 3987 samples from VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl
Rank 0: Loading component_generated_component_library_snap_icon_data_description
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl with random:50% sampling strategy
Rank 0: Loaded 11061 samples from VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl
Rank 0: Loading component_generated_component_library_snap_icon_data_grounding
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 4424 samples from VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl
Rank 0: Loading component_generated_component_v1_130k
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 26376 samples from VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl
Rank 0: Loading component_rule-based_doc_data_new
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 3153 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl
Rank 0: Loading component_rule-based_doc_scroll_data_new
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 603 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl
Rank 0: Loading component_rule-based_ethercalc_v1
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 2012 samples from VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl
Rank 0: Loading component_rule-based_slide_v1_17k
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl with random:20% sampling strategy
Rank 0: Loaded 2363 samples from VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl
Rank 0: Loading icon_caption_ios_app_data
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy
Rank 0: Loaded 49498 samples from VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl
Rank 0: Loading icon_caption_mac_app_data
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy
Rank 0: Loaded 18083 samples from VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl
Rank 0: Loading icon_caption_training_data_icon
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl with random:50% sampling strategy
Rank 0: Loaded 75874 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl
Rank 0: Loading icon_grounding_training_data_icon_grounded_merged
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl with random:30% sampling strategy
Rank 0: Loaded 5466 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl
Rank 0: Loading layout_layout200k_training_data_qwen25
Rank 0: Skipping layout_layout200k_training_data_qwen25 due to repeat_time=0
Rank 0: Loading layout_layout200k_grounding_training_data_qwen25
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl with random:10% sampling strategy
Rank 0: Loaded 158612 samples from VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl
Rank 0: Loading layout_layout400k_claude_training_data_qwen25_split
Rank 0: Skipping layout_layout400k_claude_training_data_qwen25_split due to repeat_time=0
Rank 0: Loading layout_layout400k_claude_grounding_training_data_qwen25_split
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy
Rank 0: Loaded 7540 samples from VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl
Rank 0: Loading layout_os_layout_v1
Rank 0: Skipping layout_os_layout_v1 due to repeat_time=0
Rank 0: Loading layout_os_layout_v1_grounding
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy
Rank 0: Loaded 7857 samples from VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl
Rank 0: Loading mind2web_raw_image
Rank 0: Loading VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl with all sampling strategy
Rank 0: Loaded 5740 samples from VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl
Rank 0: Loading ws_android_navigation_20250328
Rank 0: Skipping ws_android_navigation_20250328 due to repeat_time=0
Rank 0: Loading ws_android_navigation_20250407
Rank 0: Skipping ws_android_navigation_20250407 due to repeat_time=0
Rank 0: Loading ws_web_navigation_w_history_20250328
Rank 0: Skipping ws_web_navigation_w_history_20250328 due to repeat_time=0
Rank 0: Loading ws_web_navigation_wo_history_20250328
Rank 0: Skipping ws_web_navigation_wo_history_20250328 due to repeat_time=0
Rank 0: Loading ws_web_navigation_20250421
Rank 0: Skipping ws_web_navigation_20250421 due to repeat_time=0
Rank 0: Loading ws_ubuntu_navigation_20250328
Rank 0: Skipping ws_ubuntu_navigation_20250328 due to repeat_time=0
Rank 0: Loading ws_android_navigation_20250505
Rank 0: Skipping ws_android_navigation_20250505 due to repeat_time=0
Rank 0: Loading internal_android_navigation_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 48814 samples from VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_navigation_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 19042 samples from VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 8363 samples from VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_navigation_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 26412 samples from VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_navigation_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 57522 samples from VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_navigation_boost_instruction_20250624
Rank 0: Loading VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 1342 samples from VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250624
Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 15766 samples from VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_android_navigation_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 19280 samples from VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_navigation_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 3560 samples from VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_navigation_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 9258 samples from VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_navigation_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 420 samples from VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_navigation_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 8898 samples from VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 21026 samples from VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_android_navigation_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 4490 samples from VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_navigation_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 22154 samples from VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_navigation_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 11614 samples from VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy
Rank 0: Loaded 16767 samples from VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_navigation_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl with random:50% sampling strategy
Rank 0: Loaded 746 samples from VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl
Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl with random:50% sampling strategy
Rank 0: Loaded 9856 samples from VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl
Rank 0: Loading internal_windows_navigation_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl with random:50% sampling strategy
Rank 0: Loaded 1564 samples from VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl
Rank 0: Loading internal_windows_navigation_boost_instruction_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl with random:50% sampling strategy
Rank 0: Loaded 1564 samples from VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl
Rank 0: Loading internal_android_planning_cot_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 146442 samples from VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 57126 samples from VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 16726 samples from VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 52824 samples from VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_planning_cot_boost_instruction_20250612
Rank 0: Loading VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 115044 samples from VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_planning_cot_boost_instruction_20250624
Rank 0: Loading VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2684 samples from VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250624
Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 31532 samples from VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_android_planning_cot_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 38560 samples from VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 10680 samples from VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_planning_cot_boost_instruction_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 18516 samples from VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 1260 samples from VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 17796 samples from VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_android_planning_cot_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 26937 samples from VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 42051 samples from VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 44307 samples from VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_web_planning_cot_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 23229 samples from VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy
Rank 0: Loaded 33534 samples from VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl
Rank 0: Loading internal_windows_planning_cot_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 6254 samples from VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl
Rank 0: Loading internal_windows_planning_cot_boost_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl with all sampling strategy
Rank 0: Loaded 3127 samples from VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2984 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl
Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250813
Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl with all sampling strategy
Rank 0: Loaded 19712 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl
Rank 0: Loading private_aig_share_0815_logo_oral_operation_d240924_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl with all sampling strategy
Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl
Rank 0: Loading private_aig_share_0815_logo_region_caption_d240924_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl with all sampling strategy
Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl
Rank 0: Loading private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl with all sampling strategy
Rank 0: Loaded 20293 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl
Rank 0: Loading private_ui_phone_comment_20240606_json_d20241023_v2
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl with all sampling strategy
Rank 0: Loaded 1055 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl
Rank 0: Loading private_ui_internal_aig_json_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6837 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl Rank 0: Loading private_ui_internal_aig_xml_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6873 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl Rank 0: Loading OS_Altas_androidworld_grounding_d241120_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl with all sampling strategy Rank 0: Loaded 89860 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl Rank 0: Loading private_ui_aig_share_long_caption_20240604_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl with repeat:4 sampling strategy Rank 0: Loaded 3156 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl Rank 0: Loading aw_1218_grounding Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl Rank 0: Loading aw_1218_regioncaption Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl Rank 0: Loading aw_1218_oral_operation Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl Rank 0: Loading private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl with all sampling strategy Rank 0: Loaded 6600 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl Rank 0: Loading private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl with all sampling strategy Rank 0: Loaded 24620 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl Rank 0: Loading private_ui_phone_2403_long_caption_d20240604_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl with all sampling strategy Rank 0: Loaded 17196 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl Rank 0: Loading private_ui_phone_2403_long_caption_d240430_v1 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 5998 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d240430_v1.jsonl Rank 0: Loading private_ui_phone_2403_ocr_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ocr_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 31276 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ocr_d240430_v1.jsonl Rank 0: Loading screen_qa_with_bbox_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_with_bbox_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 62401 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_with_bbox_d240430_v1.jsonl Rank 0: Loading screenai_layout_20240604_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/screenai_layout_20240604_v1.jsonl with all sampling strategy Rank 0: Loaded 22076 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screenai_layout_20240604_v1.jsonl Rank 0: Loading amex_grounding_d240813_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/amex_grounding_d240813_v1.jsonl with all sampling strategy Rank 0: Loaded 102007 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/amex_grounding_d240813_v1.jsonl Rank 0: Loading guicourse_guienv_text_grounding_1_d240815_v3 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_1_d240815_v3.jsonl with all sampling strategy Rank 0: Loaded 63581 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_1_d240815_v3.jsonl Rank 0: Loading guicourse_guienv_text_grounding_2_d240815_v3 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_2_d240815_v3.jsonl with all sampling strategy Rank 0: Loaded 6852 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_2_d240815_v3.jsonl Rank 0: Loading private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1.jsonl Rank 0: Loading private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1.jsonl Rank 0: Loading screen_qa_short_d240430_v1 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_short_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 27880 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_short_d240430_v1.jsonl Rank 0: Loading private_aig_share_0815_logo_grounding_d240924_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_grounding_d240924_v1.jsonl with all sampling strategy Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_grounding_d240924_v1.jsonl Rank 0: Loading private_schedual_extract_20240520_v2_r464_reprompt_d240607 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_schedual_extract_20240520_v2_r464_reprompt_d240607.jsonl with repeat:2 sampling strategy Rank 0: Loaded 928 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_schedual_extract_20240520_v2_r464_reprompt_d240607.jsonl Rank 0: Loading private_ui2json_app_d20240822_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_app_d20240822_v1.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2488 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_app_d20240822_v1.jsonl Rank 0: Loading private_ui2json_os_d20240822_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_os_d20240822_v1.jsonl with repeat:2 sampling strategy Rank 0: Loaded 1242 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_os_d20240822_v1.jsonl Rank 0: Loading private_ui2json_web_d20240822_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_web_d20240822_v1.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2360 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_web_d20240822_v1.jsonl Rank 0: Loading 
private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607.jsonl with all sampling strategy Rank 0: Loaded 3791 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607.jsonl Rank 0: Loading private_ui_aig_share_2405_marker_recognition_d240605_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_marker_recognition_d240605_v1.jsonl with all sampling strategy Rank 0: Loaded 5179 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_marker_recognition_d240605_v1.jsonl Rank 0: Loading private_ui_aig_share_2405_ocr_d240605_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_ocr_d240605_v1.jsonl with all sampling strategy Rank 0: Loaded 5090 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_ocr_d240605_v1.jsonl Rank 0: Loading private_ui_aig_share_2405_operation_oral_d240605_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_operation_oral_d240605_v1.jsonl with all sampling strategy Rank 0: Loaded 5070 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_operation_oral_d240605_v1.jsonl Rank 0: Loading private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1.jsonl with all sampling strategy Rank 0: Loaded 5248 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1.jsonl Rank 0: Loading private_ui_aig_share_2408_region_caption_d240903_v1 Rank 0: 
Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2408_region_caption_d240903_v1.jsonl with all sampling strategy Rank 0: Loaded 5854 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2408_region_caption_d240903_v1.jsonl Rank 0: Loading private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1.jsonl Rank 0: Loading uground_web_direct_150k_description_filtered Rank 0: Loading VC:s3://gui/new_annotations/uground/web_direct_150k_description_filtered_20250826.jsonl with all sampling strategy Rank 0: Loaded 133523 samples from VC:s3://gui/new_annotations/uground/web_direct_150k_description_filtered_20250826.jsonl Rank 0: Loading uground_web_direct_258k_function_filtered Rank 0: Loading VC:s3://gui/new_annotations/uground/web_direct_258k_function_filtered_20250826.jsonl with all sampling strategy Rank 0: Loaded 169889 samples from VC:s3://gui/new_annotations/uground/web_direct_258k_function_filtered_20250826.jsonl Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826.jsonl with all sampling strategy Rank 0: Loaded 400000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826.jsonl Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_2 Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_2.jsonl with all sampling strategy Rank 0: Loaded 300000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_2.jsonl Rank 0: 
Loading uground_web_hybrid_773k_max_25qa_filtered_3 Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_3.jsonl with all sampling strategy Rank 0: Loaded 161474 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_3.jsonl Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_4 Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_4.jsonl with all sampling strategy Rank 0: Loaded 239854 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_4.jsonl Rank 0: Loading altas_windows Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826.jsonl with all sampling strategy Rank 0: Loaded 200000 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826.jsonl Rank 0: Loading altas_windows_2 Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826_2.jsonl with all sampling strategy Rank 0: Loaded 552883 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826_2.jsonl Rank 0: Loading altas_linux Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_linux_splited_20250826.jsonl with all sampling strategy Rank 0: Loaded 32538 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_linux_splited_20250826.jsonl Rank 0: Loading atlas_macos_uitars_coord Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826_2.jsonl with all sampling strategy Rank 0: Loaded 14197 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826_2.jsonl Rank 0: Loading atlas_macos_uitars_filtered Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826.jsonl with random:30% 
sampling strategy Rank 0: Loaded 4133 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826.jsonl Rank 0: Loading android_action_grounding_20250328 Rank 0: Loading VC:s3://gui/data_20250328/android/filter_action_grounding_20250405_202507011.jsonl with all sampling strategy Rank 0: Loaded 11242 samples from VC:s3://gui/data_20250328/android/filter_action_grounding_20250405_202507011.jsonl Rank 0: Loading windows_action_grounding_20250328 Rank 0: Loading VC:s3://gui-agent/data_20250328/windows/action_grounding_20250409_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 23961 samples from VC:s3://gui-agent/data_20250328/windows/action_grounding_20250409_202507011_20250722.jsonl Rank 0: Loading web_action_grounding_20250328 Rank 0: Loading VC:s3://gui-agent/data_20250328/web_25k/action_grounding_20250404_202507011.jsonl with all sampling strategy Rank 0: Loaded 18918 samples from VC:s3://gui-agent/data_20250328/web_25k/action_grounding_20250404_202507011.jsonl Rank 0: Loading ubuntu_action_grounding_20250310 Rank 0: Loading VC:s3://gui/data_20250310/ubuntu/action_grounding_20250407_202507011.jsonl with all sampling strategy Rank 0: Loaded 657 samples from VC:s3://gui/data_20250310/ubuntu/action_grounding_20250407_202507011.jsonl Rank 0: Loading ubuntu_action_grounding_20250317 Rank 0: Loading VC:s3://gui/data_20250317/ubuntu/action_grounding_20250407_202507011.jsonl with all sampling strategy Rank 0: Loaded 107 samples from VC:s3://gui/data_20250317/ubuntu/action_grounding_20250407_202507011.jsonl Rank 0: Loading windows_action_grounding_20250317 Rank 0: Loading VC:s3://gui/data_20250317/windows/action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 480 samples from VC:s3://gui/data_20250317/windows/action_grounding_20250421_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250317 Rank 0: Loading 
VC:s3://gui/data_20250317/windows/crop_action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 480 samples from VC:s3://gui/data_20250317/windows/crop_action_grounding_20250421_202507011_20250722.jsonl Rank 0: Loading windows_action_grounding_20250310 Rank 0: Loading VC:s3://gui/data_20250310/windows/action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 944 samples from VC:s3://gui/data_20250310/windows/action_grounding_20250421_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250310 Rank 0: Loading VC:s3://gui/data_20250310/windows/crop_action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 944 samples from VC:s3://gui/data_20250310/windows/crop_action_grounding_20250421_202507011_20250722.jsonl Rank 0: Loading mac_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/mac/action_grounding_20250410_202507011.jsonl with all sampling strategy Rank 0: Loaded 1578 samples from VC:s3://gui-agent/data_20250407/mac/action_grounding_20250410_202507011.jsonl Rank 0: Loading iphone_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/iphone/white/action_grounding_20250410_202507011.jsonl with all sampling strategy Rank 0: Loaded 20394 samples from VC:s3://gui-agent/data_20250407/iphone/white/action_grounding_20250410_202507011.jsonl Rank 0: Loading web_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/web/action_grounding_20250414_202507011.jsonl with random:20% sampling strategy Rank 0: Loaded 14285 samples from VC:s3://gui-agent/data_20250407/web/action_grounding_20250414_202507011.jsonl Rank 0: Loading android_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/android/action_grounding_20250410_202507011.jsonl with all sampling strategy Rank 0: Loaded 7180 samples from 
VC:s3://gui-agent/data_20250407/android/action_grounding_20250410_202507011.jsonl Rank 0: Loading windows_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/action_grounding_20250416_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 42845 samples from VC:s3://gui-agent/data_20250407/windows/action_grounding_20250416_202507011_20250722.jsonl Rank 0: Loading windows_human_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/human_action_grounding_20250416_202507011_20250722.jsonl with repeat:3 sampling strategy Rank 0: Loaded 150 samples from VC:s3://gui-agent/data_20250407/windows/human_action_grounding_20250416_202507011_20250722.jsonl Rank 0: Loading windows_aug_cropping_action_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/sub_action_grounding_20250421_202507011.jsonl with random:50% sampling strategy Rank 0: Loaded 15350 samples from VC:s3://gui-agent/data_20250407/windows/sub_action_grounding_20250421_202507011.jsonl Rank 0: Loading iphone_action_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/action_grounding_20250417_202507011.jsonl with all sampling strategy Rank 0: Loaded 20116 samples from VC:s3://gui-agent/data_20250414/iphone/action_grounding_20250417_202507011.jsonl Rank 0: Loading iphone_human_action_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/human_action_grounding_20250421_202507011.jsonl with repeat:3 sampling strategy Rank 0: Loaded 3780 samples from VC:s3://gui-agent/data_20250414/iphone/human_action_grounding_20250421_202507011.jsonl Rank 0: Loading mac_human_action_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/mac/human_action_grounding_20250418_202507011.jsonl with repeat:3 sampling strategy Rank 0: Loaded 11721 samples from VC:s3://gui-agent/data_20250414/mac/human_action_grounding_20250418_202507011.jsonl Rank 0: Loading android_action_grounding_20250421 Rank 0: 
Loading VC:s3://gui-agent/data_20250421/Android/action_grounding_20250429_202507011.jsonl with all sampling strategy Rank 0: Loaded 35675 samples from VC:s3://gui-agent/data_20250421/Android/action_grounding_20250429_202507011.jsonl Rank 0: Loading android_action_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/Android/action_grounding_20250429_202507011.jsonl with all sampling strategy Rank 0: Loaded 18016 samples from VC:s3://gui-agent/data_20250428/Android/action_grounding_20250429_202507011.jsonl Rank 0: Loading web_canvas_action_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/web_canvas/action_grounding_20250429_202507011.jsonl with random:20% sampling strategy Rank 0: Loaded 624 samples from VC:s3://gui-agent/data_20250428/web_canvas/action_grounding_20250429_202507011.jsonl Rank 0: Loading web_action_grounding_20250421 Rank 0: Loading VC:s3://gui-agent/data_20250421/web/action_grounding_20250505_202507011.jsonl with all sampling strategy Rank 0: Loaded 201304 samples from VC:s3://gui-agent/data_20250421/web/action_grounding_20250505_202507011.jsonl Rank 0: Loading ubuntu_action_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/ubuntu/action_grounding_20250505_202507011.jsonl with all sampling strategy Rank 0: Loaded 28346 samples from VC:s3://gui-agent/data_20250428/ubuntu/action_grounding_20250505_202507011.jsonl Rank 0: Loading android_action_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/android/action_grounding_20250506_202507011.jsonl with all sampling strategy Rank 0: Loaded 9814 samples from VC:s3://gui-agent/data_20250505/android/action_grounding_20250506_202507011.jsonl Rank 0: Loading windows_action_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/action_grounding_20250508_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 5270 samples from VC:s3://gui-agent/data_20250505/windows/action_grounding_20250508_202507011_20250722.jsonl Rank 
0: Loading windows_crop_action_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/crop_action_grounding_20250508_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 10468 samples from VC:s3://gui-agent/data_20250505/windows/crop_action_grounding_20250508_202507011_20250722.jsonl Rank 0: Loading ubuntu_action_grounding_20250508 Rank 0: Loading VC:s3://gui-agent/data_20250508/ubuntu/action_grounding_20250509_202507011.jsonl with all sampling strategy Rank 0: Loaded 3404 samples from VC:s3://gui-agent/data_20250508/ubuntu/action_grounding_20250509_202507011.jsonl Rank 0: Loading windows_action_grounding_20250526_1 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250510_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250510_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250526_1 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250510_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 22840 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250510_202507011_20250722.jsonl Rank 0: Loading windows_action_grounding_20250526_2 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250526_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250526_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250526_2 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250526_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 5242 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250526_202507011_20250722.jsonl Rank 0: Loading windows_action_grounding_20250526_3 Rank 0: Loading 
VC:s3://gui-agent/data_20250526/windows/action_grounding_20250527_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250527_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250526_3 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250527_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 6530 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250527_202507011_20250722.jsonl Rank 0: Loading windows_action_grounding_20250526_4 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250529_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 34101 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250529_202507011_20250722.jsonl Rank 0: Loading windows_crop_action_grounding_20250526_4 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250529_202507011_20250722.jsonl with all sampling strategy Rank 0: Loaded 68202 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250529_202507011_20250722.jsonl Rank 0: Loading windows_aug_action_grounding_20250609_1 Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250510_concat_202507011.jsonl with all sampling strategy Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250510_concat_202507011.jsonl Rank 0: Loading windows_aug_action_grounding_20250609_2 Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250526_concat_202507011.jsonl with all sampling strategy Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250526_concat_202507011.jsonl Rank 0: Loading windows_aug_action_grounding_20250609_3 Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250527_concat_202507011.jsonl with all 
sampling strategy
Rank 0: Loaded 3331 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250527_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_4
Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250529_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250529_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_1
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250510_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_2
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250526_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_3
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250527_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_4
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250529_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_5
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250510_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_6
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250526_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_7
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250527_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_8
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250529_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/action_grounding_20250619_202507011_20250722.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 4695 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250619_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 3130 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_hover_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/action_grounding_20250620_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_crop_hover_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250620_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_1
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_2
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_3
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_4
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 1430 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_5
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_6
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/action_grounding_20250627_202507011_20250722.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 11919 samples from VC:s3://gui-agent/data_20250630/windows/action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/crop_action_grounding_20250627_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 7946 samples from VC:s3://gui-agent/data_20250630/windows/crop_action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_1
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/action_grounding_20250630_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_action_grounding_20250630_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/action_grounding_20250703_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_action_grounding_20250703_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_4
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_5
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_6
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_7
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_8
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_9
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/action_grounding_20250708_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_action_grounding_20250708_20250722.jsonl with all sampling strategy
Rank 0: Loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_1
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 343 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_2
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_3
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_human_action_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/action_grounding_20250717_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/action_grounding_20250717_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_action_grounding_20250717_20250722.jsonl with all sampling strategy
Rank 0: Loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_action_grounding_20250717_20250722.jsonl
Rank 0: Loading android_ocr_20250328
Rank 0: Loading VC:s3://gui/data_20250328/android/text_ocr_20250409.jsonl with all sampling strategy
Rank 0: Loaded 29878 samples from VC:s3://gui/data_20250328/android/text_ocr_20250409.jsonl
Rank 0: Loading mac_orc_20250328
Rank 0: Loading VC:s3://gui/data_20250328/mac/element_ocr_20250328.jsonl with all sampling strategy
Rank 0: Loaded 4393 samples from VC:s3://gui/data_20250328/mac/element_ocr_20250328.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/ubuntu/internvl_grounding_20250407.jsonl with random:50% sampling strategy
Rank 0: Loaded 158 samples from VC:s3://gui/data_20250310/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_click_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/internvl_grounding_function_20250421.jsonl with random:50% sampling strategy
Rank 0: Loaded 1126 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_function_20250421.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/ubuntu/internvl_grounding_20250407.jsonl with random:50% sampling strategy
Rank 0: Loaded 33 samples from VC:s3://gui/data_20250317/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/internvl_grounding_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 4111 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/crop_internvl_grounding_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 2054 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_click_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/internvl_grounding_function_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 338 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_crop_click_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/crop_internvl_grounding_function_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 169 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/internvl_grounding_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 6818 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/crop_internvl_grounding_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 3408 samples from VC:s3://gui/data_20250310/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading android_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/android/internvl_grounding_20250409.jsonl with random:50% sampling strategy
Rank 0: Loaded 8719 samples from VC:s3://gui/data_20250328/android/internvl_grounding_20250409.jsonl
Rank 0: Loading windows_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 7744 samples from VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl
Rank 0: Loading web_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl with random:50% sampling strategy
Rank 0: Loaded 8376 samples from VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl
Rank 0: Loading icon_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl with all sampling strategy
Rank 0: Loaded 81303 samples from VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl
Rank 0: Loading mac_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 626 samples from VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl
Rank 0: Loading iphone_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 13924 samples from VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl
Rank 0: Loading web_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl with random:50% sampling strategy
Rank 0: Loaded 32254 samples from VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl
Rank 0: Loading android_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 3926 samples from VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl
Rank 0: Loading windows_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 16226 samples from VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl
Rank 0: Loading windows_cropping_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl with random:30% sampling strategy
Rank 0: Loaded 7322 samples from VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl
Rank 0: Loading iphone_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl with random:50% sampling strategy
Rank 0: Loaded 8448 samples from VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl
Rank 0: Loading iphone_human_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl with all sampling strategy
Rank 0: Loaded 927 samples from VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl
Rank 0: Loading mac_human_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl with all sampling strategy
Rank 0: Loaded 3051 samples from VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl
Rank 0: Loading android_internvl_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy
Rank 0: Loaded 15760 samples from VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl
Rank 0: Loading android_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy
Rank 0: Loaded 7923 samples from VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl
Rank 0: Loading web_canvas_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl with random:30% sampling strategy
Rank 0: Loaded 1174 samples from VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl
Rank 0: Loading web_internvl_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl with random:50% sampling strategy
Rank 0: Loaded 108856 samples from VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl with random:50% sampling strategy
Rank 0: Loaded 15538 samples from VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl
Rank 0: Loading android_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl with random:50% sampling strategy
Rank 0: Loaded 5714 samples from VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl
Rank 0: Loading windows_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 2643 samples from VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 2625 samples from VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250508
Rank 0: Loading VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl with random:50% sampling strategy
Rank 0: Loaded 1792 samples from VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl
Rank 0: Loading windows_hover_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl
Rank 0: Loading windows_crop_hover_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_1
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 686 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_2
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_3
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_4
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 295 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_5
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_6
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1058 samples from VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 4230 samples from VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_1
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 846 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1338 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1338 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_4
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 535 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_5
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_6
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_7
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 215 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_8
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_9
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_1
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 146 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_2
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_3
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl
Rank 0: Loading uibert_train_ground_d240430_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl with all sampling strategy
Rank 0: Loaded 4646 samples from VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl
Rank 0: Loading openapp_taperception_grounding_d240815_v2
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl with all sampling strategy
Rank 0: Loaded 2500 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl
Rank 0: Loading openapp_widget_grounding_d240815_v2
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl with all sampling strategy
Rank 0: Loaded 14878 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl
Rank 0: Loading openapp_mug_grounding_d240812
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl with all sampling strategy
Rank 0: Loaded 26090 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl
Rank 0: Loading private_ui_phone_2403_ground_d240430_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl with all sampling strategy
Rank 0: Loaded 24798 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_ground_d240521_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl with all sampling strategy
Rank 0: Loaded 5008 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl
Rank 0: Loading private_ui_aig_share_2406_ground_d240612_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl with all sampling strategy
Rank 0: Loaded 7903 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl
Rank 0: Loading windows_pc_agent_e_planning_cot
Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 83346 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl
Rank 0: Loading windows_pc_agent_e_navigation
Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl with all sampling strategy
Rank 0: Loaded 27782 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl
Rank 0: Loading ubuntu_agentnet_planning_cot
Rank 0: Loading VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl with random:65% sampling strategy
Rank 0: Loaded 53435 samples from VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl
Rank 0: Loading ubuntu_agentnet_navigation
Rank 0: Loading VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl with random:25% sampling strategy
Rank 0: Loaded 20552 samples from VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl
Rank 0: Loading windows_mac_agentnet_planning_cot
Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl with random:30% sampling strategy
Rank 0: Loaded 100078 samples from VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl
Rank 0: Loading windows_mac_agentnet_navigation
Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl with random:15% sampling strategy
Rank 0: Loaded 50039 samples from VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl
Rank 0: Loading os_genesis_ac_training_data
Rank 0: Skipping os_genesis_ac_training_data due to repeat_time=0
Rank 0: Loading os_genesis_aw_training_data
Rank 0: Skipping os_genesis_aw_training_data due to repeat_time=0
Rank 0: Loading os_genesis_web_training
Rank 0: Skipping os_genesis_web_training due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_1
Rank 0: Skipping gui_odyssey_plus_1 due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_2
Rank 0: Skipping gui_odyssey_plus_2 due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_custom_3
Rank 0: Skipping gui_odyssey_plus_custom_3 due to repeat_time=0
Rank 0: Loading mm_gui_mid
Rank 0: Skipping mm_gui_mid due to repeat_time=0
Rank 0: Loading text_gui_mid
Rank 0: Skipping text_gui_mid due to repeat_time=0
Rank 0: Loading gui_mid_trajectory
Rank 0: Skipping gui_mid_trajectory due to repeat_time=0
Rank 0: Loading ubuntu_rag
Rank 0: Loading VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 7024 samples from VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl
Rank 0: Loading windows_rag
Rank 0: Loading VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 3144 samples from VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl
Rank 0: Loading sharegpt4o_review_negative_en_20240825
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 37455 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl
Rank 0: Loading internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017
Rank 0: Loading
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 59981 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl Rank 0: Loading ai2d_cot_gpt4o_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 14724 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl Rank 0: Loading scienceqa_multi_choice_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 23400 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl Rank 0: Loading fsc147_train_en_20241007 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 4025 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl Rank 0: Loading 
docreason_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 31829 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl Rank 0: Loading mmtab_instruct_pretrain_en_20240902 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 83057 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl Rank 0: Loading textvqa_en_20240611 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 42560 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl Rank 0: Loading textcap_gpt4o_en_20240905 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 26596 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl Rank 0: Loading eaten_passport_zh_20240402 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl with random:23% sampling strategy Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl Rank 0: Loading textocr_gpt4v_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 26329 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl Rank 0: Loading laion_gpt4v_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 13468 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl Rank 0: Loading llavar_inhouse_sft_chat_en_20240521 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19908 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl Rank 0: Loading llavar_inhouse_sft_longcap_en_20240521 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19916 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl Rank 0: Loading icdar2019_art_task1_3_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 6782 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl Rank 0: Loading chinese_ocr_zh_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 68312 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl Rank 0: Loading cocotextv2_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19938 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl Rank 0: Loading mtwi_zh_20240805 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11424 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl Rank 0: Loading textocr_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 22488 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl Rank 0: Loading arxiv_table_65k_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 79283 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl Rank 0: Loading arxiv_ocr_162k_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl with random:74% sampling strategy Rank 0: Loaded 120223 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl Rank 0: Loading iam_multi_turn_en_20240621 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 12168 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl Rank 0: Loading poie_multi_turn_en_20240620 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2768 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl Rank 0: Loading sroie_multi_turn_en_20240620 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 770 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl Rank 0: Loading ocrvqa_en_20241116 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl with random:37% sampling strategy Rank 0: Loaded 76358 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl Rank 0: Loading edrawsvg_caption_13k_zh_20240522 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11457 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl Rank 0: Loading wired_table_zh_20240627 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl with random:37% sampling strategy Rank 0: Loaded 36850 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl Rank 0: Loading hme100k_en_20240620 Rank 0: Skipping hme100k_en_20240620 due to repeat_time=0 Rank 0: Loading synth_calligraphy_poetry_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 123000 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl Rank 0: Loading chrome_writting_en_20240814 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 10855 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl 
Rank 0: Loading vcr_wiki_en_easy_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 20357 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl Rank 0: Loading vcr_wiki_en_hard_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl Rank 0: Loading vcr_wiki_zh_easy_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 19569 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl Rank 0: Loading gpt4gen_rd_boxcot_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 4620 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl Rank 0: Loading math_150_gpt4o_zh_20240626 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 184 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl Rank 0: Loading math_2k_gpt4o_zh_20240626 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2453 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl Rank 0: Loading geoqa+_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 88951 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl Rank 0: Loading tqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 24741 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl Rank 0: Loading tqa_cot_gpt4o_en_20240621 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 21340 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl Rank 0: Loading geometry3k_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 12921 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl Rank 0: Loading geometry3k_cot_gpt4o_en_20240621 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11370 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl Rank 0: Loading unigeo_calc_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 25734 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl Rank 0: Loading super_clevr_en_20240402 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 73800 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl Rank 0: Loading mavis_math_function_caption_to_question_en_20240821 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 36414 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl Rank 0: Loading geomverse_en_20240814 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11437 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl Rank 0: Loading cmm_math_cot_zh_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 16172 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl Rank 0: Loading qwen_filtered_gpt4v_mathqa_en_20240402 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 8497 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl Rank 0: Loading qwen_filtered_mathqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl with random:55% sampling strategy Rank 0: Loaded 2709 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl Rank 0: Loading screenai_layout_en_20241102 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 27152 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl Rank 0: Loading qwen_filtered_infinitymath_en_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 116490 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl Rank 0: Loading 
qwen_filtered_sft_code_sensetime_en_zh_20240920 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl with random:66% sampling strategy Rank 0: Loaded 459932 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl Rank 0: Loading qwen_filtered_know_saraswati_cot_en_20240520 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 148371 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl Rank 0: Loading qwen_filtered_leetcode_en_zh_20240520 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 1642 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl Rank 0: Loading data_gpt_generalquestion_correction_cn_43k_v2_20240813 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 52892 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl
Rank 0: Loading SynthCode_leetcode_vqa_4k_v1
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 5517 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl
Rank 0: Loading SynthCode_llmapi_vqa_187_v1
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 230 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl
Rank 0: Loading captcha_feedback_619_v1
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 761 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl
Rank 0: Loading open_r1_math_en_20250212
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 427940 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl
Rank 0: Loading open_thoughts_114k_en_20250212
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl with random:62% sampling strategy
Rank 0: Loaded 69746 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl
Rank 0: Loading lmsys_single_turn
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl with random:62% sampling strategy
Rank 0: Loaded 207732 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl
Rank 0: Loading SCP_116K_filter
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 61205 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl
Rank 0: Loading Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl with random:62% sampling strategy
Rank 0: Loaded 154952 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl
Rank 0: Loading longcite_en_zh_20240912
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl with random:83% sampling strategy
Rank 0: Loaded 35439 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl
Rank 0: Loading long_instruct_with_paraphrasing_en_zh_20240912
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl with repeat:1.1 sampling strategy
Rank 0: Loaded 9417 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl
Rank 0: Loading qwen_filtered_tomb_evolved_en_20240913
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl with repeat:1.1 sampling strategy
Rank 0: Loaded 21483 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl
Rank 0: Loading qwen_filtered_xcoder_80k_en_20240913
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl with repeat:1.1 sampling strategy
Rank 0: Loaded 85474 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl
Rank 0: Loading qwen_filtered_sft_general_zhuguan_zh_20241002
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl with random:88% sampling strategy
[rank53]:[E827 13:45:42.088958579 ProcessGroupNCCL.cpp:616] [Rank 53] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800075 milliseconds before timing out.
[rank53]:[E827 13:45:42.089192379 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 53] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank53]:[E827 13:45:42.310932794 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 53] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank53]:[E827 13:45:42.310949914 ProcessGroupNCCL.cpp:630] [Rank 53] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank53]:[E827 13:45:42.310953699 ProcessGroupNCCL.cpp:636] [Rank 53] To avoid data inconsistency, we are taking the entire process down.
[rank53]:[E827 13:45:42.312106709 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 53] Process group watchdog thread terminated with exception: [Rank 53] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800075 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9995125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f99963ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f99963b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f99963b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f99e73b65c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f99ef0a3aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f99ef130c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 53] Process group watchdog thread terminated with exception: [Rank 53] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800075 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9995125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f99963ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f99963b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f99963b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f99e73b65c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f99ef0a3aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f99ef130c3c in /lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9995125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f999601fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f99e73b65c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f99ef0a3aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f99ef130c3c in /lib/x86_64-linux-gnu/libc.so.6)

W0827 13:45:45.162000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 891 closing signal SIGTERM
W0827 13:45:45.166000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 892 closing signal SIGTERM
W0827 13:45:45.166000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 893 closing signal SIGTERM
W0827 13:45:45.166000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 894 closing signal SIGTERM
W0827 13:45:45.167000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 895 closing signal SIGTERM
W0827 13:45:45.167000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM
W0827 13:45:45.167000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM
E0827 13:45:50.504000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 5 (pid: 896) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_13:45:45
  host      : 10-102-200-17.networking-agent.kubebrain-networking.svc.pjlab.local
  rank      : 53 (local_rank: 5)
  exitcode  : -6 (pid: 896)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 896
============================================================
Rank 0: Loaded 124772 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl
Rank 0: Loading merged_mmmu_knowledge_point_gpt4o_en_20241118
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl with repeat:1.1 sampling strategy
[rank33]:[E827 13:46:00.369576078 ProcessGroupNCCL.cpp:616] [Rank 33] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800084 milliseconds before timing out.
[rank33]:[E827 13:46:00.369750824 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 33] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank33]:[E827 13:46:01.040943262 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 33] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank33]:[E827 13:46:01.040965338 ProcessGroupNCCL.cpp:630] [Rank 33] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank33]:[E827 13:46:01.040969362 ProcessGroupNCCL.cpp:636] [Rank 33] To avoid data inconsistency, we are taking the entire process down.
[rank33]:[E827 13:46:01.042187937 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 33] Process group watchdog thread terminated with exception: [Rank 33] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800084 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f860547a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f86067ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f86067b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f86067b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f86577475c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f865f434aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f865f4c1c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 33] Process group watchdog thread terminated with exception: [Rank 33] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800084 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f860547a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f86067ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f86067b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f86067b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f86577475c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f865f434aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f865f4c1c3c in /lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f860547a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f860641fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f86577475c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f865f434aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f865f4c1c3c in /lib/x86_64-linux-gnu/libc.so.6)

W0827 13:46:04.083000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 914 closing signal SIGTERM
W0827 13:46:04.087000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM
W0827 13:46:04.088000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM
W0827 13:46:04.088000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM
W0827 13:46:04.089000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM
W0827 13:46:04.089000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM
W0827 13:46:04.090000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM
[rank60]:[E827 13:46:05.771980525 ProcessGroupNCCL.cpp:616] [Rank 60] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800000 milliseconds before timing out.
[rank60]:[E827 13:46:05.772174241 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 60] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank60]:[E827 13:46:05.283139442 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 60] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank60]:[E827 13:46:05.283153478 ProcessGroupNCCL.cpp:630] [Rank 60] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank60]:[E827 13:46:05.283156644 ProcessGroupNCCL.cpp:636] [Rank 60] To avoid data inconsistency, we are taking the entire process down.
[rank60]:[E827 13:46:05.284283500 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 60] Process group watchdog thread terminated with exception: [Rank 60] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800000 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fe80bf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7fe80d1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fe80d1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fe80d1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fe85e2095c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fe865ef6aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fe865f83c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 60] Process group watchdog thread terminated with exception: [Rank 60] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800000 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fe80bf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7fe80d1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fe80d1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fe80d1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7fe85e2095c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7fe865ef6aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7fe865f83c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fe80bf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7fe80ce1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7fe85e2095c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7fe865ef6aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7fe865f83c3c in /lib/x86_64-linux-gnu/libc.so.6) W0827 13:46:07.978000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 908 closing signal SIGTERM W0827 13:46:07.981000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 909 closing signal SIGTERM W0827 13:46:07.982000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 910 closing signal SIGTERM W0827 13:46:07.982000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 911 closing signal SIGTERM W0827 13:46:07.983000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 913 closing signal SIGTERM W0827 13:46:07.983000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 914 closing signal SIGTERM W0827 13:46:07.983000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 915 closing signal SIGTERM E0827 13:46:09.902000 912 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 
-6) local_rank: 1 (pid: 915) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_13:46:04 host : 10-102-204-7.monitoring-dcgm-exporter.kubebrain.svc.pjlab.local rank : 33 (local_rank: 1) exitcode : -6 (pid: 915) error_file: traceback : Signal 6 (SIGABRT) received by PID 915 ============================================================ Rank 0: Loaded 47383 
samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl Rank 0: Loading android_ui_longcap_qwen_zh_20240409 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl with repeat:2.46 sampling strategy Rank 0: Loaded 13528 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl Rank 0: Loading screen2words_longcap_gpt4o_en_20240819 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 18106 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl Rank 0: Loading drawing_to_html_en_20240628 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl with repeat:1.23 sampling strategy E0827 13:46:13.559000 906 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 4 (pid: 912) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in 
sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_13:46:07 host : 10-102-216-33.networking-agent.kubebrain-networking.svc.pjlab.local rank : 60 (local_rank: 4) exitcode : -6 (pid: 912) error_file: traceback : Signal 6 (SIGABRT) received by PID 912 ============================================================ Rank 0: Loaded 2090 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl Rank 0: Loading airplane_app_longcap_gpt4o_zh_20240627 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1368 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading taobao_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1925 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading wechat_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1344 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading websight_en_20240814
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 5349 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl
Rank 0: Total training samples: 11313173
Rank 0: Formatting inputs...Skip in lazy mode
Rank 0: Resize images between 3136 to 2109744
[rank20]:[E827 13:46:46.263467555 ProcessGroupNCCL.cpp:616] [Rank 20] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800082 milliseconds before timing out.
[rank20]:[E827 13:46:46.263709453 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 20] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank20]:[E827 13:46:46.482309584 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 20] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank20]:[E827 13:46:46.482323818 ProcessGroupNCCL.cpp:630] [Rank 20] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank20]:[E827 13:46:46.482326975 ProcessGroupNCCL.cpp:636] [Rank 20] To avoid data inconsistency, we are taking the entire process down.
[rank20]:[E827 13:46:46.483494871 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 20] Process group watchdog thread terminated with exception: [Rank 20] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800082 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f458dd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f458efab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f458efb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f458efb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x145c0 (0x7f45dffdc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f45e7cc9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f45e7d56c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 20] Process group watchdog thread terminated with exception: [Rank 20] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800082 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f458dd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f458efab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f458efb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f458efb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f45dffdc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f45e7cc9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f45e7d56c3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f458dd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f458ec1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f45dffdc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f45e7cc9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f45e7d56c3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 13:46:49.571000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
W0827 13:46:49.574000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 903 closing signal SIGTERM
W0827 13:46:49.575000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 904 closing signal SIGTERM
W0827 13:46:49.575000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 905 closing signal SIGTERM
W0827 13:46:49.576000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 907 closing signal SIGTERM
W0827 13:46:49.576000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 908 closing signal SIGTERM
W0827 13:46:49.576000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 909 closing signal SIGTERM
Rank 0: Length of multimodal samples: 9328384, pure textual samples: 1984512
E0827 13:46:55.590000 900 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 4 (pid: 906) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2025-08-27_13:46:49
  host       : 10-102-199-8.monitoring-prometheus-node-exporter.kubebrain.svc.pjlab.local
  rank       : 20 (local_rank: 4)
  exitcode   : -6 (pid: 906)
  error_file : <N/A>
  traceback  : Signal 6 (SIGABRT) received by PID 906
============================================================
[rank28]:[E827 13:47:17.095462880 ProcessGroupNCCL.cpp:616] [Rank 28] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800077 milliseconds before timing out.
[rank28]:[E827 13:47:17.095674630 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 28] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank28]:[E827 13:47:18.795380619 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 28] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank28]:[E827 13:47:18.795398413 ProcessGroupNCCL.cpp:630] [Rank 28] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank28]:[E827 13:47:18.795402056 ProcessGroupNCCL.cpp:636] [Rank 28] To avoid data inconsistency, we are taking the entire process down.
[rank28]:[E827 13:47:18.796632682 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 28] Process group watchdog thread terminated with exception: [Rank 28] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800077 milliseconds before timing out.
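(Annotation, not part of the original log.) The agent's `exitcode : -6` entries above follow the POSIX convention for a child killed by a signal: the exit code is the negated signal number, which is why they pair with the `Signal 6 (SIGABRT) received by PID ...` traceback lines. A minimal, generic Python sketch of that mapping:

```python
import signal

# A negative worker exit code reported by torch.distributed.elastic means the
# process died from a signal: exitcode == -signum. Decode -6 back to its name.
exitcode = -6
sig = signal.Signals(-exitcode)
print(sig.name)  # SIGABRT
```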
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1b4d617446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f1b4e9484d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f1b4e94f913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f1b4e95137d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f1b9e27d5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f1ba755eaa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f1ba75ebc3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 28] Process group watchdog thread terminated with exception: [Rank 28] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800077 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1b4d617446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f1b4e9484d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f1b4e94f913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f1b4e95137d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f1b9e27d5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f1ba755eaa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f1ba75ebc3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1b4d617446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f1b4e5bcceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f1b9e27d5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f1ba755eaa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f1ba75ebc3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 13:47:20.973000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 906 closing signal SIGTERM
W0827 13:47:20.976000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 907 closing signal SIGTERM
W0827 13:47:20.976000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 908 closing signal SIGTERM
W0827 13:47:20.977000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 909 closing signal SIGTERM
W0827 13:47:20.977000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 911 closing signal SIGTERM
W0827 13:47:20.978000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 912 closing signal SIGTERM
W0827 13:47:20.978000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 913 closing signal SIGTERM
E0827 13:47:26.765000 904 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 4 (pid: 910) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2025-08-27_13:47:20
  host       : 10-102-204-19.smartctl-exporter.smartctl-exporter.svc.pjlab.local
  rank       : 28 (local_rank: 4)
  exitcode   : -6 (pid: 910)
  error_file : <N/A>
  traceback  : Signal 6 (SIGABRT) received by PID 910
============================================================
Parameter Offload: Total persistent parameters: 848896 in 368 params
[rank6]:[E827 13:47:49.432505213 ProcessGroupNCCL.cpp:616] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800083 milliseconds before timing out.
[rank6]:[E827 13:47:49.433275600 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 6] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank6]:[E827 13:47:50.903029468 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 6] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank6]:[E827 13:47:50.903055025 ProcessGroupNCCL.cpp:630] [Rank 6] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank6]:[E827 13:47:50.903059094 ProcessGroupNCCL.cpp:636] [Rank 6] To avoid data inconsistency, we are taking the entire process down.
[rank6]:[E827 13:47:50.904498066 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 6] Process group watchdog thread terminated with exception: [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800083 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f56c267a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f56c39ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f56c39b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f56c39b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f57148fc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f571c5e9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f571c676c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 6] Process group watchdog thread terminated with exception: [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800083 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f56c267a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f56c39ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f56c39b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f56c39b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f57148fc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f571c5e9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f571c676c3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f56c267a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f56c361fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f57148fc5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f571c5e9aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f571c676c3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 13:47:52.960000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 895 closing signal SIGTERM
W0827 13:47:52.963000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 896 closing signal SIGTERM
W0827 13:47:52.964000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM
W0827 13:47:52.964000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM
W0827 13:47:52.965000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 899 closing signal SIGTERM
W0827 13:47:52.965000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 900 closing signal SIGTERM
W0827 13:47:52.965000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
E0827 13:47:58.515000 892 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 6 (pid: 901) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2025-08-27_13:47:52
  host       : 10-102-223-16.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local
  rank       : 6 (local_rank: 6)
  exitcode   : -6 (pid: 901)
  error_file : <N/A>
  traceback  : Signal 6 (SIGABRT) received by PID 901
============================================================
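(Annotation, not part of the original log.) Every watchdog line above reports `Timeout(ms)=1800000`, i.e. the 30-minute default NCCL collective timeout, and the failures occur while rank 0 is still indexing the 11,313,173-sample dataset; the other ranks' pending ALLREDUCE therefore expires before rank 0 rejoins. A hedged sketch of one common mitigation, raising the process-group timeout; the `init_process_group` call site is an assumption about `train_qwen.py`, not taken from the log, and actually running it requires a live multi-process launch:

```python
import datetime

# The log's Timeout(ms)=1800000 is exactly the 30-minute default:
default_timeout = datetime.timedelta(milliseconds=1_800_000)
print(default_timeout == datetime.timedelta(minutes=30))  # True

# Sketch (assumption): pass a larger timeout when the process group is created,
# so long single-rank work (e.g. building an 11M-sample index) does not trip
# the NCCL watchdog:
#
#   torch.distributed.init_process_group(
#       backend="nccl",
#       timeout=datetime.timedelta(hours=2),
#   )
```

An alternative worth considering is moving the slow dataset indexing before process-group creation or performing it on one rank and broadcasting the result, so no collective is left pending for that long.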
[rank9]:[E827 13:48:12.315378591 ProcessGroupNCCL.cpp:616] [Rank 9] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800088 milliseconds before timing out.
[rank9]:[E827 13:48:12.315621382 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 9] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank9]:[E827 13:48:12.759778515 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 9] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank9]:[E827 13:48:12.759798411 ProcessGroupNCCL.cpp:630] [Rank 9] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank9]:[E827 13:48:12.759802162 ProcessGroupNCCL.cpp:636] [Rank 9] To avoid data inconsistency, we are taking the entire process down.
[rank9]:[E827 13:48:12.761052362 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 9] Process group watchdog thread terminated with exception: [Rank 9] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800088 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd161f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fd1631ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fd1631b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fd1631b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fd1b41e75c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fd1bbed4aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fd1bbf61c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 9] Process group watchdog thread terminated with exception: [Rank 9] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800088 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd161f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7fd1631ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fd1631b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fd1631b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7fd1b41e75c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7fd1bbed4aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7fd1bbf61c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd161f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7fd162e1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7fd1b41e75c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7fd1bbed4aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7fd1bbf61c3c in /lib/x86_64-linux-gnu/libc.so.6) W0827 13:48:15.192000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 893 closing signal SIGTERM W0827 13:48:15.198000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 895 closing signal SIGTERM W0827 13:48:15.198000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 896 closing signal SIGTERM W0827 13:48:15.199000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM W0827 13:48:15.199000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM W0827 13:48:15.200000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 899 closing signal SIGTERM W0827 13:48:15.200000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 900 closing signal SIGTERM E0827 13:48:20.790000 891 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 
-6) local_rank: 1 (pid: 894) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_13:48:15 host : 10-102-204-48.smartctl-exporter.smartctl-exporter.svc.pjlab.local rank : 9 (local_rank: 1) exitcode : -6 (pid: 894) error_file: traceback : Signal 6 (SIGABRT) received by PID 894 ============================================================ [rank42]:[E827 
13:48:53.688087705 ProcessGroupNCCL.cpp:616] [Rank 42] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800048 milliseconds before timing out. [rank42]:[E827 13:48:53.688259368 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 42] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank42]:[E827 13:48:54.580299041 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 42] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank42]:[E827 13:48:54.580316466 ProcessGroupNCCL.cpp:630] [Rank 42] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank42]:[E827 13:48:54.580319739 ProcessGroupNCCL.cpp:636] [Rank 42] To avoid data inconsistency, we are taking the entire process down. [rank42]:[E827 13:48:54.581425310 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 42] Process group watchdog thread terminated with exception: [Rank 42] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800048 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f142cf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f142e1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f142e1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f142e1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f147f1e45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f1486ed1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f1486f5ec3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 42] Process group watchdog thread terminated with exception: [Rank 42] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800048 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f142cf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f142e1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f142e1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f142e1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f147f1e45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f1486ed1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f1486f5ec3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f142cf25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f142de1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f147f1e45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f1486ed1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f1486f5ec3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 13:48:57.097000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 926 closing signal SIGTERM
W0827 13:48:57.102000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 927 closing signal SIGTERM
W0827 13:48:57.103000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 929 closing signal SIGTERM
W0827 13:48:57.103000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 930 closing signal SIGTERM
W0827 13:48:57.104000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 931 closing signal SIGTERM
W0827 13:48:57.104000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 932 closing signal SIGTERM
W0827 13:48:57.104000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 933 closing signal SIGTERM
E0827 13:49:02.520000 924 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 2 (pid: 928) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_13:48:57
  host : 10-102-202-24.csi-metrics-rbdplugin.ceph-csi.svc.pjlab.local
  rank : 42 (local_rank: 2)
  exitcode : -6 (pid: 928)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 928
============================================================
W0827 14:27:28.258000
910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793]
W0827 14:27:28.258000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 14:27:28.258000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0827 14:27:28.258000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 14:27:36.583000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793]
W0827 14:27:36.583000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 14:27:36.583000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0827 14:27:36.583000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
[2025-08-27 14:27:41,488] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /tmp/triton_lzy: No such file or directory
[2025-08-27 14:27:42,671] [INFO] [comm.py:652:init_distributed] cdb=None
[2025-08-27 14:27:42,883] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2025-08-27 14:27:42,970] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:42,982] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:42,985] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,993] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,994] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:42,997] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,007] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,015] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,023] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,035] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,058] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,066] [INFO] [real_accelerator.py:219:get_accelerator] 
Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,066] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,068] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,069] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,077] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,078] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,093] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,109] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,111] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,112] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,116] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,117] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,120] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:43,121] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,123] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,124] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,148] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,155] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,165] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,169] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,172] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,240] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,241] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,243] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,258] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,281] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,330] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,587] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,605] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,614] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,622] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,647] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,654] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,660] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,676] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,678] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,680] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:43,682] [INFO] [comm.py:652:init_distributed] cdb=None You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,703] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,712] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,718] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,718] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,719] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,725] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,740] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,740] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,745] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,748] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,792] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:43,872] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:43,873] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,881] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,892] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,894] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:43,895] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,894] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:43,909] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,928] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,928] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,935] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,942] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:43,947] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,950] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 14:27:43,975] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,979] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:43,984] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,992] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:43,995] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,043] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,045] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,049] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,065] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,084] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,091] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,107] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,114] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,122] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,122] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,130] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,134] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,142] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,154] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,177] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,178] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,180] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 14:27:44,183] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,184] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,213] [INFO] 
[comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,229] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,238] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,245] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,246] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,256] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,256] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,256] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,268] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,269] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,274] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,275] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,278] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,281] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,281] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,283] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,285] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,283] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,325] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,328] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,338] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,349] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,362] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,365] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,372] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,373] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,375] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,375] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,378] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,377] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,383] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,385] [INFO] [comm.py:652:init_distributed] cdb=None You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,397] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,403] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,404] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,405] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,417] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,419] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,440] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,442] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,455] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,464] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,467] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,470] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,471] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,471] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2025-08-27 14:27:44,473] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,481] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,486] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,504] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,525] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,525] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,526] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,529] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,529] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,530] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,531] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,583] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,618] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,619] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,623] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,626] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,629] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,675] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,688] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,698] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,700] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,704] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,722] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,725] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 14:27:44,791] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,795] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,797] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,804] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 14:27:44,804] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 14:27:44,859] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 14:27:44,861] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2025-08-27 14:27:45,543] [INFO] [comm.py:652:init_distributed] cdb=None
[2025-08-27 14:27:47,185] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /tmp/triton_lzy: No such file or directory
Loading checkpoint shards:   0%| | 0/5 [00:00
Rank 0: --> before Client(conf_path)
Rank 0: --> after Client(conf_path)
Rank 0: Loading datasets: ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
Rank 0: Loading guienv
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl with all sampling strategy
Rank 0: Loaded 327972 samples from VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl
Rank 0: Loading omniact
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl with all sampling strategy
Rank 0: Loaded 6720 samples from VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl
Rank 0: Loading ricoig16k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16133 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl
Rank 0: Loading ricosca
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl with all sampling strategy
Rank 0: Loaded 173212 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl
Rank 0: Loading seeclick
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl with all sampling strategy
Rank 0: Loaded 271121 samples from VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl
Rank 0: Loading ui_refexp
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl with all sampling strategy
Rank 0: Loaded 15624 samples from VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl
Rank 0: Loading webui350k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 57389 samples from VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl
Rank 0: Loading widget_captioning
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl with all sampling strategy
Rank 0: Loaded 101426 samples from VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl
Rank 0: Loading aitw-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl
Rank 0: Loading aitw-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl
Rank 0: Loading aitw-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl
Rank 0: Loading amex-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl
Rank 0: Loading amex-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl
Rank 0: Loading amex-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl
Rank 0: Loading android_control
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 149428 samples from VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl
Rank 0: Loading coat
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl with all sampling strategy
Rank 0: Loaded 11833 samples from VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl
Rank 0: Loading guiact-web-multi-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl
Rank 0: Loading guiact-web-multi-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl
Rank 0: Loading guiact-web-multi-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl
Rank 0: Loading guiact-web-single
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl with all sampling strategy
Rank 0: Loaded 67396 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl
Rank 0: Loading guide
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl with all sampling strategy
Rank 0: Loaded 13544 samples from VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl
Rank 0: Loading gui-odyssey-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl
Rank 0: Loading gui-odyssey-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl
Rank 0: Loading gui-odyssey-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl
Rank 0: Loading mind2web-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl
Rank 0: Loading mind2web-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl
Rank 0: Loading mind2web-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl
Rank 0: Loading miniwob-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl
Rank 0: Loading miniwob-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl
Rank 0: Loading miniwob-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl
Rank 0: Loading aguvis_android_control-v2
Rank 0: Skipping aguvis_android_control-v2 due to repeat_time=0
Rank 0: Loading aguvis_coat-v2
Rank 0: Skipping aguvis_coat-v2 due to repeat_time=0
Rank 0: Loading aguvis_docvqa_grounding
Rank 0: Skipping aguvis_docvqa_grounding due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-multi
Rank 0: Skipping aguvis_guiact-web-multi due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-single-v2
Rank 0: Skipping aguvis_guiact-web-single-v2 due to repeat_time=0
Rank 0: Loading aguvis_guide_si_10k-v2
Rank 0: Skipping aguvis_guide_si_10k-v2 due to repeat_time=0
Rank 0: Loading aguvis_guienv
Rank 0: Skipping aguvis_guienv due to repeat_time=0
Rank 0: Loading aguvis_mind2web_train_v1.0.1
Rank 0: Skipping aguvis_mind2web_train_v1.0.1 due to repeat_time=0
Rank 0: Loading aguvis_omniact
Rank 0: Skipping aguvis_omniact due to repeat_time=0
Rank 0: Loading aguvis_osatlas_ui_tars_cleaned
Rank 0: Skipping aguvis_osatlas_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_ricoig16k
Rank 0: Skipping aguvis_ricoig16k due to repeat_time=0
Rank 0: Loading aguvis_ricosca
Rank 0: Skipping aguvis_ricosca due to repeat_time=0
Rank 0: Loading aguvis_seeclick_mi_ui_tars_cleaned
Rank 0: Skipping aguvis_seeclick_mi_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_seeclick_ui_tars_cleaned_fixed
Rank 0: Skipping aguvis_seeclick_ui_tars_cleaned_fixed due to repeat_time=0
Rank 0: Loading aguvis_ui_refexp
Rank 0: Skipping aguvis_ui_refexp due to repeat_time=0
Rank 0: Loading aguvis_webui350k
Rank 0: Skipping aguvis_webui350k due to repeat_time=0
Rank 0: Loading aguvis_widget_captioning
Rank 0: Skipping aguvis_widget_captioning due to repeat_time=0
Rank 0: Loading icon_caption_icon_v0222_description
Rank 0: Skipping icon_caption_icon_v0222_description due to repeat_time=0
Rank 0: Loading icon_grounding_icon_v0222_grounding
Rank 0: Skipping icon_grounding_icon_v0222_grounding due to repeat_time=0
Rank 0: Loading refusal_component_final_1.5m
Rank 0: Skipping refusal_component_final_1.5m due to repeat_time=0
Rank 0: Loading refusal_component_library_snap_icon_data_grounding
Rank 0: Skipping refusal_component_library_snap_icon_data_grounding due to repeat_time=0
Rank 0: Loading refusal_component_v1_130k
Rank 0: Skipping refusal_component_v1_130k due to repeat_time=0
Rank 0: Loading refusal_guienv
Rank 0: Skipping refusal_guienv due to repeat_time=0
Rank 0: Loading refusal_icon_v0222_grounding
Rank 0: Skipping refusal_icon_v0222_grounding due to repeat_time=0
Rank 0: Loading refusal_osatlas_ui_tars_cleaned
Rank 0: Skipping refusal_osatlas_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading refusal_ricosca
Rank 0: Skipping refusal_ricosca due to repeat_time=0
Rank 0: Loading refusal_seeclick_mi_ui_tars_cleaned
Rank 0: Skipping refusal_seeclick_mi_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading refusal_seeclick_ui_tars_cleaned_fixed
Rank 0: Skipping refusal_seeclick_ui_tars_cleaned_fixed due to repeat_time=0
Rank 0: Loading refusal_training_data_icon_grounded_merged
Rank 0: Skipping refusal_training_data_icon_grounded_merged due to repeat_time=0
Rank 0: Loading component_generated_component_final_1.5m_cleaned_split
Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl with random:10% sampling strategy
Rank 0: Loaded 3987 samples from VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl
Rank 0: Loading component_generated_component_library_snap_icon_data_description
Rank
0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl with random:50% sampling strategy Rank 0: Loaded 11061 samples from VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl Rank 0: Loading component_generated_component_library_snap_icon_data_grounding Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 4424 samples from VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl Rank 0: Loading component_generated_component_v1_130k Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 26376 samples from VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl Rank 0: Loading component_rule-based_doc_data_new Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 3153 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl Rank 0: Loading component_rule-based_doc_scroll_data_new Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 603 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl Rank 0: Loading component_rule-based_ethercalc_v1 Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 2012 samples from VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl Rank 0: Loading component_rule-based_slide_v1_17k Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl with random:20% sampling strategy 
Rank 0: Loaded 2363 samples from VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl Rank 0: Loading icon_caption_ios_app_data Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy Rank 0: Loaded 49498 samples from VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_caption_mac_app_data Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy Rank 0: Loaded 18083 samples from VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_caption_training_data_icon Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl with random:50% sampling strategy Rank 0: Loaded 75874 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_grounding_training_data_icon_grounded_merged Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 5466 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl Rank 0: Loading layout_layout200k_training_data_qwen25 Rank 0: Skipping layout_layout200k_training_data_qwen25 due to repeat_time=0 Rank 0: Loading layout_layout200k_grounding_training_data_qwen25 Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl with random:10% sampling strategy Rank 0: Loaded 158612 samples from 
VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl Rank 0: Loading layout_layout400k_claude_training_data_qwen25_split Rank 0: Skipping layout_layout400k_claude_training_data_qwen25_split due to repeat_time=0 Rank 0: Loading layout_layout400k_claude_grounding_training_data_qwen25_split Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 7540 samples from VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl Rank 0: Loading layout_os_layout_v1 Rank 0: Skipping layout_os_layout_v1 due to repeat_time=0 Rank 0: Loading layout_os_layout_v1_grounding Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 7857 samples from VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl Rank 0: Loading mind2web_raw_image Rank 0: Loading VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl with all sampling strategy Rank 0: Loaded 5740 samples from VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl Rank 0: Loading ws_android_navigation_20250328 Rank 0: Skipping ws_android_navigation_20250328 due to repeat_time=0 Rank 0: Loading ws_android_navigation_20250407 Rank 0: Skipping ws_android_navigation_20250407 due to repeat_time=0 Rank 0: Loading ws_web_navigation_w_history_20250328 Rank 0: Skipping ws_web_navigation_w_history_20250328 due to repeat_time=0 Rank 0: Loading ws_web_navigation_wo_history_20250328 Rank 0: Skipping ws_web_navigation_wo_history_20250328 due to repeat_time=0 Rank 0: Loading ws_web_navigation_20250421 Rank 0: Skipping ws_web_navigation_20250421 due to repeat_time=0 Rank 0: Loading ws_ubuntu_navigation_20250328 Rank 0: Skipping ws_ubuntu_navigation_20250328 
due to repeat_time=0 Rank 0: Loading ws_android_navigation_20250505 Rank 0: Skipping ws_android_navigation_20250505 due to repeat_time=0 Rank 0: Loading internal_android_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 48814 samples from VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 19042 samples from VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 8363 samples from VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 26412 samples from VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 57522 samples from VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 1342 samples from VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl Rank 0: 
Loading internal_ubuntu_navigation_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 15766 samples from VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 19280 samples from VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 3560 samples from VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 9258 samples from VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 420 samples from VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 8898 samples from VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250707 Rank 0: Loading 
VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 21026 samples from VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 4490 samples from VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 22154 samples from VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 11614 samples from VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 16767 samples from VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl with random:50% sampling strategy Rank 0: Loaded 746 samples from VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl with random:50% sampling strategy Rank 0: Loaded 9856 samples from 
VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl Rank 0: Loading internal_windows_navigation_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl with random:50% sampling strategy Rank 0: Loaded 1564 samples from VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl with random:50% sampling strategy Rank 0: Loaded 1564 samples from VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 146442 samples from VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 57126 samples from VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 16726 samples from VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 52824 samples from VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250612 Rank 0: Loading 
VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 115044 samples from VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2684 samples from VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 31532 samples from VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 38560 samples from VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 10680 samples from VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 18516 samples from VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 1260 
samples from VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 17796 samples from VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 26937 samples from VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 42051 samples from VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 44307 samples from VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 23229 samples from VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 33534 samples from VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading 
internal_windows_planning_cot_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl with repeat:2 sampling strategy Rank 0: Loaded 6254 samples from VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl Rank 0: Loading internal_windows_planning_cot_boost_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl with all sampling strategy Rank 0: Loaded 3127 samples from VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl Rank 0: Loading internal_ubuntu_planning_cot_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2984 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl with all sampling strategy Rank 0: Loaded 19712 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl Rank 0: Loading private_aig_share_0815_logo_oral_operation_d240924_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl with all sampling strategy Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl Rank 0: Loading private_aig_share_0815_logo_region_caption_d240924_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl with all sampling strategy Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl Rank 0: Loading private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl with all sampling strategy Rank 0: Loaded 20293 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl Rank 0: Loading private_ui_phone_comment_20240606_json_d20241023_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl with all sampling strategy Rank 0: Loaded 1055 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl Rank 0: Loading private_ui_internal_aig_json_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6837 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl Rank 0: Loading private_ui_internal_aig_xml_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6873 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl Rank 0: Loading OS_Altas_androidworld_grounding_d241120_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl with all sampling strategy Rank 0: Loaded 89860 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl Rank 0: Loading private_ui_aig_share_long_caption_20240604_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl with repeat:4 sampling strategy Rank 0: Loaded 3156 samples from 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl Rank 0: Loading aw_1218_grounding Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl Rank 0: Loading aw_1218_regioncaption Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl Rank 0: Loading aw_1218_oral_operation Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl Rank 0: Loading private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl with all sampling strategy Rank 0: Loaded 6600 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl Rank 0: Loading private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl with all sampling strategy Rank 0: Loaded 24620 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl Rank 0: Loading private_ui_phone_2403_long_caption_d20240604_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl with all sampling strategy Rank 0: Loaded 17196 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl Rank 0: Loading private_ui_phone_2403_long_caption_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 5998 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d240430_v1.jsonl Rank 0: Loading private_ui_phone_2403_ocr_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ocr_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 31276 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ocr_d240430_v1.jsonl Rank 0: Loading screen_qa_with_bbox_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_with_bbox_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 62401 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_with_bbox_d240430_v1.jsonl Rank 0: Loading screenai_layout_20240604_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/screenai_layout_20240604_v1.jsonl with all sampling strategy Rank 0: Loaded 22076 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screenai_layout_20240604_v1.jsonl Rank 0: 
Loading amex_grounding_d240813_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/amex_grounding_d240813_v1.jsonl with all sampling strategy Rank 0: Loaded 102007 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/amex_grounding_d240813_v1.jsonl Rank 0: Loading guicourse_guienv_text_grounding_1_d240815_v3 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_1_d240815_v3.jsonl with all sampling strategy Rank 0: Loaded 63581 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_1_d240815_v3.jsonl Rank 0: Loading guicourse_guienv_text_grounding_2_d240815_v3 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_2_d240815_v3.jsonl with all sampling strategy Rank 0: Loaded 6852 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_2_d240815_v3.jsonl Rank 0: Loading private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1.jsonl Rank 0: Loading private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1 Rank 0: Loading 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1.jsonl with all sampling strategy
Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1.jsonl
Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1.jsonl with all sampling strategy
Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1.jsonl
Rank 0: Loading screen_qa_short_d240430_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_short_d240430_v1.jsonl with all sampling strategy
Rank 0: Loaded 27880 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_short_d240430_v1.jsonl
Rank 0: Loading private_aig_share_0815_logo_grounding_d240924_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_grounding_d240924_v1.jsonl with all sampling strategy
Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_grounding_d240924_v1.jsonl
Rank 0: Loading private_schedual_extract_20240520_v2_r464_reprompt_d240607
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_schedual_extract_20240520_v2_r464_reprompt_d240607.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 928 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_schedual_extract_20240520_v2_r464_reprompt_d240607.jsonl
Rank 0: Loading private_ui2json_app_d20240822_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_app_d20240822_v1.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2488 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_app_d20240822_v1.jsonl
Rank 0: Loading private_ui2json_os_d20240822_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_os_d20240822_v1.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 1242 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_os_d20240822_v1.jsonl
Rank 0: Loading private_ui2json_web_d20240822_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_web_d20240822_v1.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2360 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_web_d20240822_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607.jsonl with all sampling strategy
Rank 0: Loaded 3791 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607.jsonl
Rank 0: Loading private_ui_aig_share_2405_marker_recognition_d240605_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_marker_recognition_d240605_v1.jsonl with all sampling strategy
Rank 0: Loaded 5179 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_marker_recognition_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_ocr_d240605_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_ocr_d240605_v1.jsonl with all sampling strategy
Rank 0: Loaded 5090 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_ocr_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_operation_oral_d240605_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_operation_oral_d240605_v1.jsonl with all sampling strategy
Rank 0: Loaded 5070 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_operation_oral_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1.jsonl with all sampling strategy
Rank 0: Loaded 5248 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2408_region_caption_d240903_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2408_region_caption_d240903_v1.jsonl with all sampling strategy
Rank 0: Loaded 5854 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2408_region_caption_d240903_v1.jsonl
Rank 0: Loading private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1
Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1.jsonl with all sampling strategy
Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1.jsonl
Rank 0: Loading uground_web_direct_150k_description_filtered
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_direct_150k_description_filtered_20250826.jsonl with all sampling strategy
Rank 0: Loaded 133523 samples from VC:s3://gui/new_annotations/uground/web_direct_150k_description_filtered_20250826.jsonl
Rank 0: Loading uground_web_direct_258k_function_filtered
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_direct_258k_function_filtered_20250826.jsonl with all sampling strategy
Rank 0: Loaded 169889 samples from VC:s3://gui/new_annotations/uground/web_direct_258k_function_filtered_20250826.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826.jsonl with all sampling strategy
Rank 0: Loaded 400000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_2
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_2.jsonl with all sampling strategy
Rank 0: Loaded 300000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_2.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_3
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_3.jsonl with all sampling strategy
Rank 0: Loaded 161474 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_3.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_4
Rank 0: Loading VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_4.jsonl with all sampling strategy
Rank 0: Loaded 239854 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_4.jsonl
Rank 0: Loading altas_windows
Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826.jsonl with all sampling strategy
Rank 0: Loaded 200000 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826.jsonl
Rank 0: Loading altas_windows_2
Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826_2.jsonl with all sampling strategy
Rank 0: Loaded 552883 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826_2.jsonl
Rank 0: Loading altas_linux
Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_linux_splited_20250826.jsonl with all sampling strategy
Rank 0: Loaded 32538 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_linux_splited_20250826.jsonl
Rank 0: Loading atlas_macos_uitars_coord
Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826_2.jsonl with all sampling strategy
Rank 0: Loaded 14197 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826_2.jsonl
Rank 0: Loading atlas_macos_uitars_filtered
Rank 0: Loading VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826.jsonl with random:30% sampling strategy
Rank 0: Loaded 4133 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826.jsonl
Rank 0: Loading android_action_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/android/filter_action_grounding_20250405_202507011.jsonl with all sampling strategy
Rank 0: Loaded 11242 samples from VC:s3://gui/data_20250328/android/filter_action_grounding_20250405_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250328
Rank 0: Loading VC:s3://gui-agent/data_20250328/windows/action_grounding_20250409_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 23961 samples from VC:s3://gui-agent/data_20250328/windows/action_grounding_20250409_202507011_20250722.jsonl
Rank 0: Loading web_action_grounding_20250328
Rank 0: Loading VC:s3://gui-agent/data_20250328/web_25k/action_grounding_20250404_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18918 samples from VC:s3://gui-agent/data_20250328/web_25k/action_grounding_20250404_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/ubuntu/action_grounding_20250407_202507011.jsonl with all sampling strategy
Rank 0: Loaded 657 samples from VC:s3://gui/data_20250310/ubuntu/action_grounding_20250407_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/ubuntu/action_grounding_20250407_202507011.jsonl with all sampling strategy
Rank 0: Loaded 107 samples from VC:s3://gui/data_20250317/ubuntu/action_grounding_20250407_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 480 samples from VC:s3://gui/data_20250317/windows/action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/crop_action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 480 samples from VC:s3://gui/data_20250317/windows/crop_action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 944 samples from VC:s3://gui/data_20250310/windows/action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/crop_action_grounding_20250421_202507011_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 944 samples from VC:s3://gui/data_20250310/windows/crop_action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading mac_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/mac/action_grounding_20250410_202507011.jsonl with all sampling strategy
Rank 0: Loaded 1578 samples from VC:s3://gui-agent/data_20250407/mac/action_grounding_20250410_202507011.jsonl
Rank 0: Loading iphone_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/iphone/white/action_grounding_20250410_202507011.jsonl with all sampling strategy
Rank 0: Loaded 20394 samples from VC:s3://gui-agent/data_20250407/iphone/white/action_grounding_20250410_202507011.jsonl
Rank 0: Loading web_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/web/action_grounding_20250414_202507011.jsonl with random:20% sampling strategy
Rank 0: Loaded 14285 samples from VC:s3://gui-agent/data_20250407/web/action_grounding_20250414_202507011.jsonl
Rank 0: Loading android_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/android/action_grounding_20250410_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7180 samples from VC:s3://gui-agent/data_20250407/android/action_grounding_20250410_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/action_grounding_20250416_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 42845 samples from VC:s3://gui-agent/data_20250407/windows/action_grounding_20250416_202507011_20250722.jsonl
Rank 0: Loading windows_human_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/human_action_grounding_20250416_202507011_20250722.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 150 samples from VC:s3://gui-agent/data_20250407/windows/human_action_grounding_20250416_202507011_20250722.jsonl
Rank 0: Loading windows_aug_cropping_action_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/sub_action_grounding_20250421_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 15350 samples from VC:s3://gui-agent/data_20250407/windows/sub_action_grounding_20250421_202507011.jsonl
Rank 0: Loading iphone_action_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/action_grounding_20250417_202507011.jsonl with all sampling strategy
Rank 0: Loaded 20116 samples from VC:s3://gui-agent/data_20250414/iphone/action_grounding_20250417_202507011.jsonl
Rank 0: Loading iphone_human_action_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/human_action_grounding_20250421_202507011.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 3780 samples from VC:s3://gui-agent/data_20250414/iphone/human_action_grounding_20250421_202507011.jsonl
Rank 0: Loading mac_human_action_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/mac/human_action_grounding_20250418_202507011.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 11721 samples from VC:s3://gui-agent/data_20250414/mac/human_action_grounding_20250418_202507011.jsonl
Rank 0: Loading android_action_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/Android/action_grounding_20250429_202507011.jsonl with all sampling strategy
Rank 0: Loaded 35675 samples from VC:s3://gui-agent/data_20250421/Android/action_grounding_20250429_202507011.jsonl
Rank 0: Loading android_action_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/Android/action_grounding_20250429_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18016 samples from VC:s3://gui-agent/data_20250428/Android/action_grounding_20250429_202507011.jsonl
Rank 0: Loading web_canvas_action_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/web_canvas/action_grounding_20250429_202507011.jsonl with random:20% sampling strategy
Rank 0: Loaded 624 samples from VC:s3://gui-agent/data_20250428/web_canvas/action_grounding_20250429_202507011.jsonl
Rank 0: Loading web_action_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/web/action_grounding_20250505_202507011.jsonl with all sampling strategy
Rank 0: Loaded 201304 samples from VC:s3://gui-agent/data_20250421/web/action_grounding_20250505_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/ubuntu/action_grounding_20250505_202507011.jsonl with all sampling strategy
Rank 0: Loaded 28346 samples from VC:s3://gui-agent/data_20250428/ubuntu/action_grounding_20250505_202507011.jsonl
Rank 0: Loading android_action_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/android/action_grounding_20250506_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9814 samples from VC:s3://gui-agent/data_20250505/android/action_grounding_20250506_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/action_grounding_20250508_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 5270 samples from VC:s3://gui-agent/data_20250505/windows/action_grounding_20250508_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/crop_action_grounding_20250508_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 10468 samples from VC:s3://gui-agent/data_20250505/windows/crop_action_grounding_20250508_202507011_20250722.jsonl
Rank 0: Loading ubuntu_action_grounding_20250508
Rank 0: Loading VC:s3://gui-agent/data_20250508/ubuntu/action_grounding_20250509_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3404 samples from VC:s3://gui-agent/data_20250508/ubuntu/action_grounding_20250509_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250510_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250510_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250510_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 22840 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250510_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250526_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250526_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250526_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 5242 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250526_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_3
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250527_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250527_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_3
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250527_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 6530 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250527_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/action_grounding_20250529_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 34101 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250529_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250529_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 68202 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250529_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_1
Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250510_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250510_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_2
Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250526_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250526_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_3
Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250527_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3331 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250527_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_4
Rank 0: Loading VC:s3://gui-agent/data_20250609/windows/action_grounding_20250529_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250529_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_1
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250510_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_2
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250526_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_3
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250527_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_4
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250529_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_5
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250510_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_6
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250526_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_7
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250527_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_8
Rank 0: Loading VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250529_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/action_grounding_20250619_202507011_20250722.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 4695 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250619_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 3130 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_hover_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/action_grounding_20250620_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_crop_hover_action_grounding_20250623
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250620_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_1
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_2
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_3
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_4
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 1430 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_5
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_6
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/action_grounding_20250627_202507011_20250722.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 11919 samples from VC:s3://gui-agent/data_20250630/windows/action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/crop_action_grounding_20250627_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 7946 samples from VC:s3://gui-agent/data_20250630/windows/crop_action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_1
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/action_grounding_20250630_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_action_grounding_20250630_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/action_grounding_20250703_202507011_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_action_grounding_20250703_202507011_20250722.jsonl with all sampling strategy
Rank 0: Loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_4
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_5
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_6
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_7
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_concat_202507011.jsonl with all sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_8
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_9
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_pure_paste_202507011.jsonl with random:50% sampling strategy
Rank 0: Loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/action_grounding_20250708_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_action_grounding_20250708_20250722.jsonl with all sampling strategy
Rank 0: Loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_1
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 343 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_2
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_3
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_action_grounding_20250708.jsonl with random:50% sampling strategy
Rank 0: Loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_human_action_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/action_grounding_20250717_20250722.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/action_grounding_20250717_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_action_grounding_20250717_20250722.jsonl with all sampling strategy
Rank 0: Loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_action_grounding_20250717_20250722.jsonl
Rank 0: Loading android_ocr_20250328
Rank 0: Loading VC:s3://gui/data_20250328/android/text_ocr_20250409.jsonl with all sampling strategy
Rank 0: Loaded 29878 samples from VC:s3://gui/data_20250328/android/text_ocr_20250409.jsonl
Rank 0: Loading mac_orc_20250328
Rank 0: Loading VC:s3://gui/data_20250328/mac/element_ocr_20250328.jsonl with all sampling strategy
Rank 0: Loaded 4393 samples from VC:s3://gui/data_20250328/mac/element_ocr_20250328.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/ubuntu/internvl_grounding_20250407.jsonl with random:50% sampling strategy
Rank 0: Loaded 158 samples from VC:s3://gui/data_20250310/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_click_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/internvl_grounding_function_20250421.jsonl with random:50% sampling strategy
Rank 0: Loaded 1126 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_function_20250421.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/ubuntu/internvl_grounding_20250407.jsonl with random:50% sampling strategy
Rank 0: Loaded 33 samples from VC:s3://gui/data_20250317/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/internvl_grounding_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 4111 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/crop_internvl_grounding_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 2054 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_click_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/internvl_grounding_function_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 338 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_crop_click_internvl_grounding_20250317
Rank 0: Loading VC:s3://gui/data_20250317/windows/crop_internvl_grounding_function_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 169 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/internvl_grounding_20250421_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 6818 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250310
Rank 0: Loading VC:s3://gui/data_20250310/windows/crop_internvl_grounding_20250421_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 3408 samples from VC:s3://gui/data_20250310/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading android_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/android/internvl_grounding_20250409.jsonl with random:50% sampling strategy
Rank 0: Loaded 8719 samples from VC:s3://gui/data_20250328/android/internvl_grounding_20250409.jsonl
Rank 0: Loading windows_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 7744 samples from VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl
Rank 0: Loading web_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl with random:50% sampling strategy
Rank 0: Loaded 8376 samples from VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl
Rank 0: Loading icon_internvl_grounding_20250328
Rank 0: Loading VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl with all sampling strategy
Rank 0: Loaded 81303 samples from VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl
Rank 0: Loading mac_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 626 samples from VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl
Rank 0: Loading iphone_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 13924 samples from VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl
Rank 0: Loading web_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl with random:50% sampling strategy
Rank 0: Loaded 32254 samples from VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl
Rank 0: Loading android_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl with random:50% sampling strategy
Rank 0: Loaded 3926 samples from VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl
Rank 0: Loading windows_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 16226 samples from VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl
Rank 0: Loading windows_cropping_internvl_grounding_20250407
Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl with random:30% sampling strategy
Rank 0: Loaded 7322 samples from VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl
Rank 0: Loading iphone_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl with random:50% sampling strategy
Rank 0: Loaded 8448 samples from VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl
Rank 0: Loading iphone_human_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl with all sampling strategy
Rank 0: Loaded 927 samples from VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl
Rank 0: Loading mac_human_internvl_grounding_20250414
Rank 0: Loading VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl with all sampling strategy
Rank 0: Loaded 3051 samples from VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl
Rank 0: Loading android_internvl_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy
Rank 0: Loaded 15760 samples from VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl
Rank 0: Loading android_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy
Rank 0: Loaded 7923 samples from VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl
Rank 0: Loading web_canvas_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl with random:30% sampling strategy
Rank 0: Loaded 1174 samples from VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl
Rank 0: Loading web_internvl_grounding_20250421
Rank 0: Loading VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl with random:50% sampling strategy
Rank 0: Loaded 108856 samples from VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250428
Rank 0: Loading VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl with random:50% sampling strategy
Rank 0: Loaded 15538 samples from VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl
Rank 0: Loading android_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl with random:50% sampling strategy
Rank 0: Loaded 5714 samples from VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl
Rank 0: Loading windows_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 2643 samples from VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250505
Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 2625 samples from VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250508
Rank 0: Loading VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl with random:50% sampling strategy
Rank 0: Loaded 1792 samples from VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_1
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_2
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250526_4
Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl
Rank 0: Loading windows_hover_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl
Rank 0: Loading windows_crop_hover_internvl_grounding_20250523
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_1
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 686 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_2
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_3
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_4
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 295 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_5
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250623_6
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1058 samples from VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 4230 samples from VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_1
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 846 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630_2
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 1338 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250630_3
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 1338 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_4
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 535 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_5
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_6
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_7
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl with random:20% sampling strategy
Rank 0: Loaded 215 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_8
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250630_9
Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl with random:20% sampling strategy
Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250707
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_1
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 146 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_2
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_aug_internvl_grounding_20250707_3
Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy
Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl
Rank 0: Loading windows_human_internvl_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl with random:50% sampling strategy
Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl
Rank 0: Loading windows_crop_human_internvl_grounding_20250714
Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl with random:25% sampling strategy
Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl
Rank 0: Loading uibert_train_ground_d240430_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl with all sampling strategy
Rank 0: Loaded 4646 samples from VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl
Rank 0: Loading openapp_taperception_grounding_d240815_v2
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl with all sampling strategy
Rank 0: Loaded 2500 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl
Rank 0: Loading openapp_widget_grounding_d240815_v2
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl with all sampling strategy
Rank 0: Loaded 14878 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl
Rank 0: Loading openapp_mug_grounding_d240812
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl with all sampling strategy
Rank 0: Loaded 26090 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl
Rank 0: Loading private_ui_phone_2403_ground_d240430_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl with all sampling strategy
Rank 0: Loaded 24798 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_ground_d240521_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl with all sampling strategy
Rank 0: Loaded 5008 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl
Rank 0: Loading private_ui_aig_share_2406_ground_d240612_v1
Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl with all sampling strategy
Rank 0: Loaded 7903 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl
Rank 0: Loading windows_pc_agent_e_planning_cot
Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl with repeat:3 sampling strategy
Rank 0: Loaded 83346 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl
Rank 0: Loading windows_pc_agent_e_navigation
Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl with all sampling strategy
Rank 0: Loaded 27782 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl
Rank 0: Loading ubuntu_agentnet_planning_cot
Rank 0: Loading VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl with random:65% sampling strategy
Rank 0: Loaded 53435 samples from VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl
Rank 0: Loading ubuntu_agentnet_navigation
Rank 0: Loading VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl with random:25% sampling strategy
Rank 0: Loaded 20552 samples from VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl
Rank 0: Loading windows_mac_agentnet_planning_cot
Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl with random:30% sampling strategy
Rank 0: Loaded 100078 samples from VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl
Rank 0: Loading windows_mac_agentnet_navigation
Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl with random:15% sampling strategy
Rank 0: Loaded 50039 samples from VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl
Rank 0: Loading os_genesis_ac_training_data
Rank 0: Skipping os_genesis_ac_training_data due to repeat_time=0
Rank 0: Loading os_genesis_aw_training_data
Rank 0: Skipping os_genesis_aw_training_data due to repeat_time=0
Rank 0: Loading os_genesis_web_training
Rank 0: Skipping os_genesis_web_training due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_1
Rank 0: Skipping gui_odyssey_plus_1 due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_2
Rank 0: Skipping gui_odyssey_plus_2 due to repeat_time=0
Rank 0: Loading gui_odyssey_plus_custom_3
Rank 0: Skipping gui_odyssey_plus_custom_3 due to repeat_time=0
Rank 0: Loading mm_gui_mid
Rank 0: Skipping mm_gui_mid due to repeat_time=0
Rank 0: Loading text_gui_mid
Rank 0: Skipping text_gui_mid due to repeat_time=0
Rank 0: Loading gui_mid_trajectory
Rank 0: Skipping gui_mid_trajectory due to repeat_time=0
Rank 0: Loading ubuntu_rag
Rank 0: Loading VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 7024 samples from VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl
Rank 0: Loading windows_rag
Rank 0: Loading VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 3144 samples from VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl
Rank 0: Loading sharegpt4o_review_negative_en_20240825
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 37455 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl
Rank 0: Loading internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 59981 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl
Rank 0: Loading ai2d_cot_gpt4o_en_20240805
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 14724 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl
Rank 0: Loading scienceqa_multi_choice_en_20240402
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 23400 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl
Rank 0: Loading fsc147_train_en_20241007
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl with repeat:1.1 sampling strategy
Rank 0: Loaded 4025 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl
Rank 0: Loading docreason_en_20240403
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 31829 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl
Rank 0: Loading mmtab_instruct_pretrain_en_20240902
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 83057 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl
Rank 0: Loading textvqa_en_20240611
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 42560 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl
Rank 0: Loading textcap_gpt4o_en_20240905
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 26596 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl
Rank 0: Loading eaten_passport_zh_20240402
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl with random:23% sampling strategy
Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl
Rank 0: Loading textocr_gpt4v_en_20240402
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 26329 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl
Rank 0: Loading laion_gpt4v_en_20240402
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 13468 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl
Rank 0: Loading llavar_inhouse_sft_chat_en_20240521
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 19908 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl
Rank 0: Loading llavar_inhouse_sft_longcap_en_20240521
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 19916 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl
Rank 0: Loading icdar2019_art_task1_3_zh_20240805
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 6782 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl
Rank 0: Loading chinese_ocr_zh_20240402
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 68312 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl
Rank 0: Loading cocotextv2_en_20240805
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 19938 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl
Rank 0: Loading mtwi_zh_20240805
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 11424 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl
Rank 0: Loading textocr_en_20240805
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 22488 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl
Rank 0: Loading arxiv_table_65k_en_20240403
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 79283 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl
Rank 0: Loading arxiv_ocr_162k_en_20240403
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl with random:74% sampling strategy
Rank 0: Loaded 120223 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl
Rank 0: Loading iam_multi_turn_en_20240621
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 12168 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl
Rank 0: Loading poie_multi_turn_en_20240620
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 2768 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl
Rank 0: Loading sroie_multi_turn_en_20240620
Rank 0: Loading
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 770 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl Rank 0: Loading ocrvqa_en_20241116 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl with random:37% sampling strategy Rank 0: Loaded 76358 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl Rank 0: Loading edrawsvg_caption_13k_zh_20240522 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11457 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl Rank 0: Loading wired_table_zh_20240627 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl with random:37% sampling strategy Rank 0: Loaded 36850 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl Rank 0: Loading hme100k_en_20240620 Rank 0: Skipping hme100k_en_20240620 due to repeat_time=0 Rank 0: Loading 
synth_calligraphy_poetry_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 123000 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl Rank 0: Loading chrome_writting_en_20240814 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 10855 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl Rank 0: Loading vcr_wiki_en_easy_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 20357 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl Rank 0: Loading vcr_wiki_en_hard_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl Rank 0: Loading vcr_wiki_zh_easy_20240907 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 19569 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl Rank 0: Loading gpt4gen_rd_boxcot_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 4620 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl Rank 0: Loading math_150_gpt4o_zh_20240626 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 184 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl Rank 0: Loading math_2k_gpt4o_zh_20240626 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2453 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl Rank 0: Loading geoqa+_en_20240402 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 88951 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl Rank 0: Loading tqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 24741 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl Rank 0: Loading tqa_cot_gpt4o_en_20240621 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 21340 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl Rank 0: Loading geometry3k_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 12921 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl Rank 0: Loading geometry3k_cot_gpt4o_en_20240621 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11370 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl Rank 0: Loading unigeo_calc_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 25734 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl Rank 0: Loading super_clevr_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 73800 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl Rank 0: Loading mavis_math_function_caption_to_question_en_20240821 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 36414 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl Rank 0: Loading geomverse_en_20240814 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11437 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl Rank 0: Loading cmm_math_cot_zh_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 16172 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl Rank 0: Loading qwen_filtered_gpt4v_mathqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 8497 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl Rank 0: Loading qwen_filtered_mathqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl with random:55% sampling strategy Rank 0: Loaded 2709 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl Rank 0: Loading screenai_layout_en_20241102 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 27152 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl Rank 0: Loading qwen_filtered_infinitymath_en_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 116490 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl Rank 0: Loading qwen_filtered_sft_code_sensetime_en_zh_20240920 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl with random:66% sampling strategy Rank 0: Loaded 459932 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl Rank 0: Loading qwen_filtered_know_saraswati_cot_en_20240520 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 148371 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl Rank 0: Loading qwen_filtered_leetcode_en_zh_20240520 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 1642 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl Rank 0: Loading data_gpt_generalquestion_correction_cn_43k_v2_20240813 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 52892 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl Rank 0: Loading SynthCode_leetcode_vqa_4k_v1 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 5517 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl Rank 0: Loading SynthCode_llmapi_vqa_187_v1 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 230 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl Rank 0: Loading captcha_feedback_619_v1 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 761 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl Rank 0: Loading open_r1_math_en_20250212 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 427940 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl Rank 0: Loading open_thoughts_114k_en_20250212 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl with random:62% sampling strategy Rank 0: Loaded 69746 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl Rank 0: Loading 
lmsys_single_turn Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl with random:62% sampling strategy Rank 0: Loaded 207732 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl Rank 0: Loading SCP_116K_filter Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 61205 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl Rank 0: Loading Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl with random:62% sampling strategy Rank 0: Loaded 154952 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl Rank 0: Loading longcite_en_zh_20240912 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl with random:83% sampling strategy Rank 0: Loaded 35439 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl Rank 0: Loading long_instruct_with_paraphrasing_en_zh_20240912 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 9417 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl Rank 0: Loading qwen_filtered_tomb_evolved_en_20240913 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 21483 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl Rank 0: Loading qwen_filtered_xcoder_80k_en_20240913 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl with repeat:1.1 sampling strategy [rank57]:[E827 15:16:50.342578567 ProcessGroupNCCL.cpp:616] [Rank 57] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800002 milliseconds before timing out. [rank57]:[E827 15:16:50.342743289 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 57] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank57]:[E827 15:16:51.071766381 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 57] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. 
[rank57]:[E827 15:16:51.071781530 ProcessGroupNCCL.cpp:630] [Rank 57] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank57]:[E827 15:16:51.071784960 ProcessGroupNCCL.cpp:636] [Rank 57] To avoid data inconsistency, we are taking the entire process down. [rank57]:[E827 15:16:51.073027854 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 57] Process group watchdog thread terminated with exception: [Rank 57] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800002 milliseconds before timing out. Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9d39525446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f9d3a7ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f9d3a7b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f9d3a7b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f9d8b82c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9d93519aa4 in 
/lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9d935a6c3c in /lib/x86_64-linux-gnu/libc.so.6) terminate called after throwing an instance of 'c10::DistBackendError' what(): [PG ID 0 PG GUID 0(default_pg) Rank 57] Process group watchdog thread terminated with exception: [Rank 57] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800002 milliseconds before timing out. Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9d39525446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f9d3a7ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f9d3a7b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f9d3a7b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f9d8b82c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9d93519aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9d935a6c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most 
recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9d39525446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f9d3a41fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f9d8b82c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f9d93519aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f9d935a6c3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:16:53.904000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM
W0827 15:16:53.909000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 899 closing signal SIGTERM
W0827 15:16:53.909000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 900 closing signal SIGTERM
W0827 15:16:53.910000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 901 closing signal SIGTERM
W0827 15:16:53.910000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
W0827 15:16:53.910000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 903 closing signal SIGTERM
W0827 15:16:53.911000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 904 closing signal SIGTERM
E0827 15:16:59.191000 895 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 1 (pid: 898) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:16:53
  host      : 10-102-213-47.node-local-dns.kube-system.svc.pjlab.local
  rank      : 57 (local_rank: 1)
  exitcode  : -6 (pid: 898)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 898
============================================================
Rank 0: Loaded 85474 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl
Rank 0: Loading qwen_filtered_sft_general_zhuguan_zh_20241002
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl with random:88% sampling strategy
[rank50]:[E827 15:17:03.108868405 ProcessGroupNCCL.cpp:616] [Rank 50] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out.
[rank50]:[E827 15:17:03.109004885 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 50] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank50]:[E827 15:17:04.782382174 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 50] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank50]:[E827 15:17:04.782403918 ProcessGroupNCCL.cpp:630] [Rank 50] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank50]:[E827 15:17:04.782407806 ProcessGroupNCCL.cpp:636] [Rank 50] To avoid data inconsistency, we are taking the entire process down.
[rank50]:[E827 15:17:04.783585042 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 50] Process group watchdog thread terminated with exception: [Rank 50] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fad1bd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fad1cfab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fad1cfb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fad1cfb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fad6dfd45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fad75cc1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fad75d4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 50] Process group watchdog thread terminated with exception: [Rank 50] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fad1bd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fad1cfab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fad1cfb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fad1cfb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fad6dfd45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fad75cc1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fad75d4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fad1bd25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7fad1cc1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7fad6dfd45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7fad75cc1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7fad75d4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:17:06.553000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM
W0827 15:17:06.559000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM
W0827 15:17:06.559000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM
W0827 15:17:06.559000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM
W0827 15:17:06.560000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM
W0827 15:17:06.560000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM
W0827 15:17:06.561000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 923 closing signal SIGTERM
E0827 15:17:12.161000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 2 (pid: 918) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:17:06
  host      : 10-102-217-7.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local
  rank      : 50 (local_rank: 2)
  exitcode  : -6 (pid: 918)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 918
============================================================
Rank 0: Loaded 124772 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl
Rank 0: Loading merged_mmmu_knowledge_point_gpt4o_en_20241118
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl with repeat:1.1 sampling strategy
[rank123]:[E827 15:17:21.036674448 ProcessGroupNCCL.cpp:616] [Rank 123] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800016 milliseconds before timing out.
[rank123]:[E827 15:17:21.036861898 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 123] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank41]:[E827 15:17:21.469534296 ProcessGroupNCCL.cpp:616] [Rank 41] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800030 milliseconds before timing out.
[rank41]:[E827 15:17:21.469705994 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 41] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank123]:[E827 15:17:22.753268046 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 123] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank123]:[E827 15:17:22.753282230 ProcessGroupNCCL.cpp:630] [Rank 123] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank123]:[E827 15:17:22.753286466 ProcessGroupNCCL.cpp:636] [Rank 123] To avoid data inconsistency, we are taking the entire process down.
[rank123]:[E827 15:17:22.754628814 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 123] Process group watchdog thread terminated with exception: [Rank 123] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800016 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7ff57ad25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7ff57bfab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7ff57bfb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7ff57bfb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7ff5ccfa45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7ff5d4c91aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7ff5d4d1ec3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 123] Process group watchdog thread terminated with exception: [Rank 123] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800016 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7ff57ad25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7ff57bfab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7ff57bfb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7ff57bfb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7ff5ccfa45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7ff5d4c91aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7ff5d4d1ec3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7ff57ad25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7ff57bc1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7ff5ccfa45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7ff5d4c91aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7ff5d4d1ec3c in /lib/x86_64-linux-gnu/libc.so.6)
[rank41]:[E827 15:17:22.179969182 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 41] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank41]:[E827 15:17:22.179992775 ProcessGroupNCCL.cpp:630] [Rank 41] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank41]:[E827 15:17:22.179997283 ProcessGroupNCCL.cpp:636] [Rank 41] To avoid data inconsistency, we are taking the entire process down.
[rank41]:[E827 15:17:22.181283559 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 41] Process group watchdog thread terminated with exception: [Rank 41] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800030 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8fdfe7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f8fe11ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f8fe11b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f8fe11b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f90320d35c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f9039dc0aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f9039e4dc3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 41] Process group watchdog thread terminated with exception: [Rank 41] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800030 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8fdfe7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f8fe11ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f8fe11b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f8fe11b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f90320d35c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f9039dc0aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f9039e4dc3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8fdfe7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f8fe0e1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f90320d35c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f9039dc0aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f9039e4dc3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:17:24.539000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 945 closing signal SIGTERM
W0827 15:17:24.543000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 946 closing signal SIGTERM
W0827 15:17:24.544000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 947 closing signal SIGTERM
W0827 15:17:24.544000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 949 closing signal SIGTERM
W0827 15:17:24.545000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 950 closing signal SIGTERM
W0827 15:17:24.545000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 951 closing signal SIGTERM
W0827 15:17:24.545000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 952 closing signal SIGTERM
W0827 15:17:25.160000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 912
closing signal SIGTERM W0827 15:17:25.164000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 914 closing signal SIGTERM W0827 15:17:25.165000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 915 closing signal SIGTERM W0827 15:17:25.165000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM W0827 15:17:25.165000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM W0827 15:17:25.166000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM W0827 15:17:25.166000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM [rank12]:[E827 15:17:29.682000295 ProcessGroupNCCL.cpp:616] [Rank 12] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800014 milliseconds before timing out. [rank12]:[E827 15:17:29.682176121 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 12] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. 
[rank12]:[E827 15:17:29.136933886 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 12] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank12]:[E827 15:17:29.136950595 ProcessGroupNCCL.cpp:630] [Rank 12] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank12]:[E827 15:17:29.136955107 ProcessGroupNCCL.cpp:636] [Rank 12] To avoid data inconsistency, we are taking the entire process down. [rank12]:[E827 15:17:29.138243551 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 12] Process group watchdog thread terminated with exception: [Rank 12] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800014 milliseconds before timing out. Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f93be27a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f93bf5ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f93bf5b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f93bf5b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 
(0x7f941053e5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f941822baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f94182b8c3c in /lib/x86_64-linux-gnu/libc.so.6) terminate called after throwing an instance of 'c10::DistBackendError' what(): [PG ID 0 PG GUID 0(default_pg) Rank 12] Process group watchdog thread terminated with exception: [Rank 12] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800014 milliseconds before timing out. Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f93be27a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f93bf5ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f93bf5b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f93bf5b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f941053e5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f941822baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c 
(0x7f94182b8c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f93be27a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7f93bf21fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7f941053e5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7f941822baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7f94182b8c3c in /lib/x86_64-linux-gnu/libc.so.6) E0827 15:17:29.894000 943 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 3 (pid: 948) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_15:17:24 host : 10-102-214-17.monitoring-dcgm-exporter.kubebrain.svc.pjlab.local rank : 123 (local_rank: 3) exitcode : -6 (pid: 948) error_file: traceback : Signal 6 (SIGABRT) received by PID 948 ============================================================ E0827 15:17:31.131000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 1 (pid: 913) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) 
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_15:17:25
  host : 10-102-217-16.smartctl-exporter.smartctl-exporter.svc.pjlab.local
  rank : 41 (local_rank: 1)
  exitcode : -6 (pid: 913)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 913
============================================================
[rank9]:[E827 15:17:32.383274755 ProcessGroupNCCL.cpp:616] [Rank 9] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800068 milliseconds before timing out.
[rank9]:[E827 15:17:32.383413133 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 9] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
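[Editorial annotation, not part of the original log.] The watchdog records above report that an ALLREDUCE (SeqNum=2188) exceeded the default 30-minute process-group timeout (Timeout(ms)=1800000). One common mitigation, sketched below under assumptions, is to pass a longer `timeout` to `torch.distributed.init_process_group`; the single-process gloo setup here is only so the sketch is runnable standalone, since in the actual job torchrun supplies the `nccl` backend, rank, and world size. Raising the timeout only hides the symptom if one rank is genuinely hung.

```python
# Hedged sketch: extend the default process-group timeout.
import os
from datetime import timedelta

import torch.distributed as dist


def init_with_long_timeout(timeout_minutes: int = 120) -> None:
    """Initialize the default process group with an extended collective timeout.

    Demo values: single process on gloo. In a real torchrun job the backend
    would be "nccl" and rank/world size come from the launcher's environment.
    """
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    dist.init_process_group(
        backend="gloo",
        rank=0,
        world_size=1,
        # Collectives may now wait up to `timeout_minutes` before the
        # watchdog aborts the process, instead of the 30-minute default.
        timeout=timedelta(minutes=timeout_minutes),
    )
```

Usage: call `init_with_long_timeout()` once per process before any collective, and `dist.destroy_process_group()` at shutdown.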
W0827 15:17:33.180000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 915 closing signal SIGTERM
W0827 15:17:33.185000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM
W0827 15:17:33.185000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM
W0827 15:17:33.186000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM
W0827 15:17:33.186000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM
W0827 15:17:33.187000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM
W0827 15:17:33.187000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM
[rank64]:[E827 15:17:35.141447628 ProcessGroupNCCL.cpp:616] [Rank 64] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800031 milliseconds before timing out.
[rank64]:[E827 15:17:35.141656156 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 64] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank64]:[E827 15:17:35.542626517 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 64] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank64]:[E827 15:17:35.542650548 ProcessGroupNCCL.cpp:630] [Rank 64] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank64]:[E827 15:17:35.542654709 ProcessGroupNCCL.cpp:636] [Rank 64] To avoid data inconsistency, we are taking the entire process down.
[rank64]:[E827 15:17:35.543996093 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 64] Process group watchdog thread terminated with exception: [Rank 64] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800031 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f74b607a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f74b73ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f74b73b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f74b73b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f750831a5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f7510007aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f7510094c3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 64] Process group watchdog thread terminated with exception: [Rank 64] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800031 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f74b607a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f74b73ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f74b73b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f74b73b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f750831a5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f7510007aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f7510094c3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f74b607a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f74b701fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f750831a5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f7510007aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f7510094c3c in /lib/x86_64-linux-gnu/libc.so.6)
Rank 0: Loaded 47383 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl
Rank 0: Loading android_ui_longcap_qwen_zh_20240409
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl with repeat:2.46 sampling strategy
Rank 0: Loaded 13528 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl
Rank 0: Loading screen2words_longcap_gpt4o_en_20240819
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl with repeat:1.23 sampling strategy
E0827 15:17:38.623000 913 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 4 (pid: 919) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_15:17:33
  host : 10-102-210-23.node-local-dns.kube-system.svc.pjlab.local
  rank : 12 (local_rank: 4)
  exitcode : -6 (pid: 919)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 919
============================================================
W0827 15:17:38.675000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 901 closing signal SIGTERM
W0827 15:17:38.681000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
W0827 15:17:38.682000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 903 closing signal SIGTERM
W0827 15:17:38.682000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 904 closing signal SIGTERM
W0827 15:17:38.683000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 905 closing signal SIGTERM
W0827 15:17:38.683000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 906 closing signal SIGTERM
W0827 15:17:38.683000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 907 closing signal SIGTERM
Rank 0: Loaded 18106 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl
Rank 0: Loading drawing_to_html_en_20240628
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 2090 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl
Rank 0: Loading airplane_app_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1368 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading taobao_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1925 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading wechat_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1344 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading websight_en_20240814
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 5349 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl
Rank 0: Total training samples: 11313173
E0827 15:17:44.032000 898 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 0 (pid: 900) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_15:17:38
  host : 10-102-199-24.monitoring-prometheus-node-exporter.kubebrain.svc.pjlab.local
  rank : 64 (local_rank: 0)
  exitcode : -6 (pid: 900)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 900
============================================================
[rank75]:[E827 15:17:54.375577294 ProcessGroupNCCL.cpp:616] [Rank 75] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800067 milliseconds before timing out.
[rank75]:[E827 15:17:54.375731431 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 75] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank75]:[E827 15:17:55.102483514 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 75] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank75]:[E827 15:17:55.102504726 ProcessGroupNCCL.cpp:630] [Rank 75] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank75]:[E827 15:17:55.102509426 ProcessGroupNCCL.cpp:636] [Rank 75] To avoid data inconsistency, we are taking the entire process down.
[rank75]:[E827 15:17:55.103673810 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 75] Process group watchdog thread terminated with exception: [Rank 75] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800067 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7febdca7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7febdddab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7febdddb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7febdddb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fec2ed645c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fec36a51aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fec36adec3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 75] Process group watchdog thread terminated with exception: [Rank 75] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800067 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7febdca7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7febdddab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7febdddb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7febdddb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fec2ed645c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fec36a51aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fec36adec3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7febdca7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7febdda1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7fec2ed645c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7fec36a51aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7fec36adec3c in /lib/x86_64-linux-gnu/libc.so.6)
Rank 0: Formatting inputs...Skip in lazy mode
Rank 0: Resize images between 3136 to 2109744
W0827 15:17:58.216000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM
W0827 15:17:58.220000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM
W0827 15:17:58.220000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM
W0827 15:17:58.221000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM
W0827 15:17:58.221000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM
W0827 15:17:58.222000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM
W0827 15:17:58.222000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 923 closing signal SIGTERM
[rank104]:[E827 15:17:58.528517471 ProcessGroupNCCL.cpp:616] [Rank 104] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800024 milliseconds before timing out.
[rank104]:[E827 15:17:58.528681812 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 104] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank104]:[E827 15:17:59.999291981 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 104] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank104]:[E827 15:17:59.999306169 ProcessGroupNCCL.cpp:630] [Rank 104] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank104]:[E827 15:17:59.999310096 ProcessGroupNCCL.cpp:636] [Rank 104] To avoid data inconsistency, we are taking the entire process down.
[rank104]:[E827 15:17:59.000418439 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 104] Process group watchdog thread terminated with exception: [Rank 104] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800024 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f5bc4f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f5bc61ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f5bc61b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f5bc61b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f5c171d45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f5c1eec1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f5c1ef4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 104] Process group watchdog thread terminated with exception: [Rank 104] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800024 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f5bc4f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7f5bc61ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f5bc61b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f5bc61b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7f5c171d45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7f5c1eec1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7f5c1ef4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f5bc4f25446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7f5bc5e1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7f5c171d45c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7f5c1eec1aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7f5c1ef4ec3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:18:01.575000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 925 closing signal SIGTERM
W0827 15:18:01.583000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 926 closing signal SIGTERM
W0827 15:18:01.584000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 927 closing signal SIGTERM
W0827 15:18:01.585000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 928 closing signal SIGTERM
W0827 15:18:01.585000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 929 closing signal SIGTERM
W0827 15:18:01.585000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 930 closing signal SIGTERM
W0827 15:18:01.586000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 931 closing signal SIGTERM
[rank25]:[E827 15:18:01.481852856 ProcessGroupNCCL.cpp:616] [Rank 25] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800064 milliseconds before timing out.
[rank25]:[E827 15:18:01.481990148 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 25] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank25]:[E827 15:18:01.814295529 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 25] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank25]:[E827 15:18:01.814307739 ProcessGroupNCCL.cpp:630] [Rank 25] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank25]:[E827 15:18:01.814310966 ProcessGroupNCCL.cpp:636] [Rank 25] To avoid data inconsistency, we are taking the entire process down.
[rank25]:[E827 15:18:01.815466351 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 25] Process group watchdog thread terminated with exception: [Rank 25] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800064 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fc28be7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fc28d1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fc28d1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fc28d1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fc2de12c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fc2e5e19aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fc2e5ea6c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 25] Process group watchdog thread terminated with exception: [Rank 25] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800064 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fc28be7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fc28d1ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fc28d1b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fc28d1b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fc2de12c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fc2e5e19aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fc2e5ea6c3c in /lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fc28be7a446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7fc28ce1fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7fc2de12c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7fc2e5e19aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7fc2e5ea6c3c in /lib/x86_64-linux-gnu/libc.so.6)

E0827 15:18:04.273000 914 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 3 (pid: 919) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:17:58
  host      : 10-102-199-13.node-local-dns.kube-system.svc.pjlab.local
  rank      : 75 (local_rank: 3)
  exitcode  : -6 (pid: 919)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 919
============================================================

W0827 15:18:04.296000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 896 closing signal SIGTERM
W0827 15:18:04.305000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM
W0827 15:18:04.306000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 899 closing signal SIGTERM
W0827 15:18:04.306000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 900 closing signal SIGTERM
W0827 15:18:04.306000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 901 closing signal SIGTERM
W0827 15:18:04.307000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
W0827 15:18:04.307000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 903 closing signal SIGTERM

E0827 15:18:07.862000 922 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 0 (pid: 924) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:18:01
  host      : 10-102-202-18.networking-agent.kubebrain-networking.svc.pjlab.local
  rank      : 104 (local_rank: 0)
  exitcode  : -6 (pid: 924)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 924
============================================================

E0827 15:18:09.710000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 1 (pid: 897) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:18:04
  host      : 10-102-205-42.monitoring-kube-prometheus-kubelet.kube-system.svc.pjlab.local
  rank      : 25 (local_rank: 1)
  exitcode  : -6 (pid: 897)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 897
============================================================

[rank85]:[E827 15:18:14.283458593 ProcessGroupNCCL.cpp:616] [Rank 85] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800012 milliseconds before timing out.
[rank85]:[E827 15:18:14.283595285 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 85] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank85]:[E827 15:18:15.020682207 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 85] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank85]:[E827 15:18:15.020695259 ProcessGroupNCCL.cpp:630] [Rank 85] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank85]:[E827 15:18:15.020698047 ProcessGroupNCCL.cpp:636] [Rank 85] To avoid data inconsistency, we are taking the entire process down.
[rank85]:[E827 15:18:15.021735232 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 85] Process group watchdog thread terminated with exception: [Rank 85] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800012 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd159925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fd15abab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fd15abb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fd15abb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fd1abb885c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fd1b3875aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fd1b3902c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 85] Process group watchdog thread terminated with exception: [Rank 85] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800012 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd159925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7fd15abab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7fd15abb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fd15abb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7fd1abb885c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7fd1b3875aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7fd1b3902c3c in /lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fd159925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7fd15a81fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7fd1abb885c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7fd1b3875aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7fd1b3902c3c in /lib/x86_64-linux-gnu/libc.so.6)

W0827 15:18:18.011000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM
W0827 15:18:18.018000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM
W0827 15:18:18.019000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 920 closing signal SIGTERM
W0827 15:18:18.019000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 921 closing signal SIGTERM
W0827 15:18:18.020000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM
W0827 15:18:18.020000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 924 closing signal SIGTERM
W0827 15:18:18.020000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 925 closing signal SIGTERM

Rank 0: Length of multimodal samples: 9328128, pure textual samples: 1984512

[rank37]:[E827 15:18:20.477167920 ProcessGroupNCCL.cpp:616] [Rank 37] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800062 milliseconds before timing out.
[rank37]:[E827 15:18:20.477313270 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 37] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank37]:[E827 15:18:21.891721450 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 37] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank37]:[E827 15:18:21.891733706 ProcessGroupNCCL.cpp:630] [Rank 37] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank37]:[E827 15:18:21.891736768 ProcessGroupNCCL.cpp:636] [Rank 37] To avoid data inconsistency, we are taking the entire process down.
[rank37]:[E827 15:18:21.892917648 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 37] Process group watchdog thread terminated with exception: [Rank 37] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800062 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7efbda325446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7efbdb5ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7efbdb5b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7efbdb5b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7efc2c63c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7efc34329aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7efc343b6c3c in /lib/x86_64-linux-gnu/libc.so.6)

terminate called after throwing an instance of 'c10::DistBackendError'
  what():  [PG ID 0 PG GUID 0(default_pg) Rank 37] Process group watchdog thread terminated with exception: [Rank 37] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800062 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7efbda325446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional<std::chrono::duration<long, std::ratio<1l, 1000l> > >) + 0x282 (0x7efbdb5ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7efbdb5b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7efbdb5b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: <unknown function> + 0x145c0 (0x7efc2c63c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x9caa4 (0x7efc34329aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #6: <unknown function> + 0x129c3c (0x7efc343b6c3c in /lib/x86_64-linux-gnu/libc.so.6)

Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7efbda325446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0xe65ceb (0x7efbdb21fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x145c0 (0x7efc2c63c5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x9caa4 (0x7efc34329aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x129c3c (0x7efc343b6c3c in /lib/x86_64-linux-gnu/libc.so.6)

W0827 15:18:23.327000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 891 closing signal SIGTERM
W0827 15:18:23.333000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 892 closing signal SIGTERM
W0827 15:18:23.333000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 893 closing signal SIGTERM
W0827 15:18:23.334000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 894 closing signal SIGTERM
W0827 15:18:23.334000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 895 closing signal SIGTERM
W0827 15:18:23.334000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM
W0827 15:18:23.335000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM

E0827 15:18:23.646000 916 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 5 (pid: 923) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:18:18
  host      : 10-102-205-8.monitoring-prometheus-node-exporter.kubebrain.svc.pjlab.local
  rank      : 85 (local_rank: 5)
  exitcode  : -6 (pid: 923)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 923
============================================================

E0827 15:18:28.223000 889 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 5 (pid: 896) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-08-27_15:18:23
  host      : 10-102-204-48.monitoring-dcgm-exporter.kubebrain.svc.pjlab.local
  rank      : 37 (local_rank: 5)
  exitcode  : -6 (pid: 896)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 896
============================================================

[rank98]:[E827 15:18:43.156990986 ProcessGroupNCCL.cpp:616] [Rank 98] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out.
[rank98]:[E827 15:18:43.157112605 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 98] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank98]:[E827 15:18:43.351007753 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 98] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank98]:[E827 15:18:43.351020334 ProcessGroupNCCL.cpp:630] [Rank 98] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank98]:[E827 15:18:43.351023142 ProcessGroupNCCL.cpp:636] [Rank 98] To avoid data inconsistency, we are taking the entire process down.
[rank98]:[E827 15:18:43.352037429 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 98] Process group watchdog thread terminated with exception: [Rank 98] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out.
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9bea925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f9bebbab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f9bebbb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f9bebbb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f9c3cc0d5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9c448faaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9c44987c3c in /lib/x86_64-linux-gnu/libc.so.6) terminate called after throwing an instance of 'c10::DistBackendError' what(): [PG ID 0 PG GUID 0(default_pg) Rank 98] Process group watchdog thread terminated with exception: [Rank 98] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800051 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9bea925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f9bebbab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f9bebbb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f9bebbb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f9c3cc0d5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9c448faaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9c44987c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f9bea925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7f9beb81fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7f9c3cc0d5c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7f9c448faaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7f9c44987c3c in /lib/x86_64-linux-gnu/libc.so.6) W0827 15:18:46.023000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 922 closing signal SIGTERM W0827 15:18:46.031000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 923 closing signal SIGTERM W0827 15:18:46.031000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 925 closing signal SIGTERM W0827 15:18:46.032000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 926 closing signal SIGTERM W0827 15:18:46.032000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 927 closing signal SIGTERM W0827 15:18:46.033000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 928 closing signal SIGTERM W0827 15:18:46.033000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 929 closing signal SIGTERM E0827 15:18:51.485000 920 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 
-6) local_rank: 2 (pid: 924) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_15:18:46 host : 10-102-202-24.fluent-bit.fluent.svc.pjlab.local rank : 98 (local_rank: 2) exitcode : -6 (pid: 924) error_file: traceback : Signal 6 (SIGABRT) received by PID 924 ============================================================ Parameter Offload: Total persistent 
parameters: 848896 in 368 params [rank91]:[E827 15:19:09.562185374 ProcessGroupNCCL.cpp:616] [Rank 91] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800086 milliseconds before timing out. [rank91]:[E827 15:19:09.562405767 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 91] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank91]:[E827 15:19:10.233217826 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 91] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank91]:[E827 15:19:10.233240554 ProcessGroupNCCL.cpp:630] [Rank 91] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank91]:[E827 15:19:10.233245081 ProcessGroupNCCL.cpp:636] [Rank 91] To avoid data inconsistency, we are taking the entire process down. [rank91]:[E827 15:19:10.234778702 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 91] Process group watchdog thread terminated with exception: [Rank 91] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800086 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1c76925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f1c77bab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f1c77bb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f1c77bb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f1cc8bf05c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f1cd08ddaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f1cd096ac3c in /lib/x86_64-linux-gnu/libc.so.6) terminate called after throwing an instance of 'c10::DistBackendError' what(): [PG ID 0 PG GUID 0(default_pg) Rank 91] Process group watchdog thread terminated with exception: [Rank 91] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800086 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1c76925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f1c77bab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f1c77bb2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f1c77bb437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f1cc8bf05c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f1cd08ddaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f1cd096ac3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f1c76925446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7f1c7781fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7f1cc8bf05c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7f1cd08ddaa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7f1cd096ac3c in /lib/x86_64-linux-gnu/libc.so.6) W0827 15:19:14.696000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 932 closing signal SIGTERM W0827 15:19:14.699000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 933 closing signal SIGTERM W0827 15:19:14.699000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 934 closing signal SIGTERM W0827 15:19:14.700000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 936 closing signal SIGTERM W0827 15:19:14.700000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 937 closing signal SIGTERM W0827 15:19:14.701000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 938 closing signal SIGTERM W0827 15:19:14.701000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 939 closing signal SIGTERM E0827 15:19:22.729000 930 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 
-6) local_rank: 3 (pid: 935) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_15:19:14 host : 10-102-204-16.monitoring-kube-prometheus-kube-proxy.kube-system.svc.pjlab.local rank : 91 (local_rank: 3) exitcode : -6 (pid: 935) error_file: traceback : Signal 6 (SIGABRT) received by PID 935 ============================================================ 
[rank115]:[E827 15:19:29.177194889 ProcessGroupNCCL.cpp:616] [Rank 115] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800038 milliseconds before timing out. [rank115]:[E827 15:19:29.177424992 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 115] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank115]:[E827 15:19:30.894916218 ProcessGroupNCCL.cpp:1834] [PG ID 0 PG GUID 0(default_pg) Rank 115] Timeout at NCCL work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank115]:[E827 15:19:30.894933514 ProcessGroupNCCL.cpp:630] [Rank 115] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank115]:[E827 15:19:30.894937806 ProcessGroupNCCL.cpp:636] [Rank 115] To avoid data inconsistency, we are taking the entire process down. [rank115]:[E827 15:19:30.896385537 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 115] Process group watchdog thread terminated with exception: [Rank 115] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800038 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f99a8125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f99a93ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f99a93b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f99a93b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f99fa3ae5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9a0209baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9a02128c3c in /lib/x86_64-linux-gnu/libc.so.6) terminate called after throwing an instance of 'c10::DistBackendError' what(): [PG ID 0 PG GUID 0(default_pg) Rank 115] Process group watchdog thread terminated with exception: [Rank 115] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) ran for 1800038 milliseconds before timing out. 
Exception raised from checkTimeout at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:618 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f99a8125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout(std::optional > >) + 0x282 (0x7f99a93ab4d2 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::watchdogHandler() + 0x233 (0x7f99a93b2913 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f99a93b437d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: + 0x145c0 (0x7f99fa3ae5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #5: + 0x9caa4 (0x7f9a0209baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #6: + 0x129c3c (0x7f9a02128c3c in /lib/x86_64-linux-gnu/libc.so.6) Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f99a8125446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: + 0xe65ceb (0x7f99a901fceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: + 0x145c0 (0x7f99fa3ae5c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #3: + 0x9caa4 (0x7f9a0209baa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #4: + 0x129c3c (0x7f9a02128c3c in /lib/x86_64-linux-gnu/libc.so.6) W0827 15:19:34.388000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 948 closing signal SIGTERM W0827 15:19:34.392000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 949 closing signal SIGTERM W0827 15:19:34.392000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 950 closing signal SIGTERM W0827 15:19:34.393000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 952 closing signal SIGTERM W0827 15:19:34.393000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 953 closing signal SIGTERM W0827 15:19:34.394000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 954 closing signal SIGTERM W0827 15:19:34.394000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 955 closing signal SIGTERM E0827 15:19:41.996000 946 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 
-6) local_rank: 3 (pid: 951) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')()) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper return f(*args, **kwargs) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main run(args) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run elastic_launch( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__ return launch_agent(self._config, self._entrypoint, list(args)) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ============================================================ qwenvl/train/train_qwen.py FAILED ------------------------------------------------------------ Failures: ------------------------------------------------------------ Root Cause (first observed failure): [0]: time : 2025-08-27_15:19:34 host : 10-102-202-28.fluent-bit.fluent.svc.pjlab.local rank : 115 (local_rank: 3) exitcode : -6 (pid: 951) error_file: traceback : Signal 6 (SIGABRT) received by PID 951 ============================================================ [rank0]:[E827 15:21:24.852437440 
ProcessGroupNCCL.cpp:542] [Rank 0] Collective WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) raised the following async exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5 ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely. Last error: NET/IB: Got completion from peer 10.102.199.24<51910> with status=5 opcode=129 len=266240 vendor err 244 (Recv) localGid ::ffff:10.103.6.44 remoteGids::ffff:10.103.0.74 hca mlx5_0 Exception raised from checkForNCCLErrorsInternal at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:2027 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8c3883f446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so) frame #1: c10d::ProcessGroupNCCL::checkForNCCLErrorsInternal(std::shared_ptr&) + 0x220 (0x7f8c39b6fce0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #2: c10d::ProcessGroupNCCL::WorkNCCL::checkAndSetException() + 0x7c (0x7f8c39b6ff2c in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #3: c10d::ProcessGroupNCCL::watchdogHandler() + 0x213 (0x7f8c39b778f3 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #4: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f8c39b7937d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so) frame #5: + 0x145c0 (0x7f8c894af5c0 in 
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so) frame #6: + 0x9caa4 (0x7f8c92783aa4 in /lib/x86_64-linux-gnu/libc.so.6) frame #7: + 0x129c3c (0x7f8c92810c3c in /lib/x86_64-linux-gnu/libc.so.6) [rank0]:[E827 15:21:24.854086478 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 0] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187. [rank0]:[E827 15:21:24.432559517 ProcessGroupNCCL.cpp:630] [Rank 0] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank0]:[E827 15:21:24.432576881 ProcessGroupNCCL.cpp:636] [Rank 0] To avoid data inconsistency, we are taking the entire process down. [rank0]:[E827 15:21:24.432710314 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 0] Process group watchdog thread terminated with exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5 ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely. 
Last error: NET/IB: Got completion from peer 10.102.199.24<51910> with status=5 opcode=129 len=266240 vendor err 244 (Recv) localGid ::ffff:10.103.6.44 remoteGids::ffff:10.103.0.74 hca mlx5_0
Exception raised from checkForNCCLErrorsInternal at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:2027 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8c3883f446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::checkForNCCLErrorsInternal(std::shared_ptr&) + 0x220 (0x7f8c39b6fce0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::WorkNCCL::checkAndSetException() + 0x7c (0x7f8c39b6ff2c in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::watchdogHandler() + 0x213 (0x7f8c39b778f3 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7f8c39b7937d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #5: + 0x145c0 (0x7f8c894af5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #6: + 0x9caa4 (0x7f8c92783aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #7: + 0x129c3c (0x7f8c92810c3c in /lib/x86_64-linux-gnu/libc.so.6)
terminate called after throwing an instance of 'c10::DistBackendError'
what(): [PG ID 0 PG GUID 0(default_pg) Rank 0] Process group watchdog thread terminated with exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5
ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely.
Last error: NET/IB: Got completion from peer 10.102.199.24<51910> with status=5 opcode=129 len=266240 vendor err 244 (Recv) localGid ::ffff:10.103.6.44 remoteGids::ffff:10.103.0.74 hca mlx5_0
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7f8c3883f446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: + 0xe65ceb (0x7f8c397e4ceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: + 0x145c0 (0x7f8c894af5c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: + 0x9caa4 (0x7f8c92783aa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: + 0x129c3c (0x7f8c92810c3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:21:27.383000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 913 closing signal SIGTERM
W0827 15:21:27.388000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 914 closing signal SIGTERM
W0827 15:21:27.389000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 915 closing signal SIGTERM
W0827 15:21:27.389000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 916 closing signal SIGTERM
W0827 15:21:27.389000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 917 closing signal SIGTERM
W0827 15:21:27.390000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 918 closing signal SIGTERM
W0827 15:21:27.390000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 919 closing signal SIGTERM
E0827 15:21:33.591000 909 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 0 (pid: 912) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_15:21:27
  host : 10-102-210-14.monitoring-prometheus-node-exporter.kubebrain.svc.pjlab.local
  rank : 0 (local_rank: 0)
  exitcode : -6 (pid: 912)
  error_file:
  traceback : Signal 6 (SIGABRT) received by PID 912
============================================================
[rank16]:[E827 15:24:32.616750320 ProcessGroupNCCL.cpp:542] [Rank 16] Collective WorkNCCL(SeqNum=2188, OpType=ALLREDUCE, NumelIn=1, NumelOut=1, Timeout(ms)=1800000) raised the following async exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5
ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely.
Last error: NET/IB: Got completion from peer 10.102.204.48<60201> with status=12 opcode=32699 len=0 vendor err 129 (Send) localGid ::ffff:10.103.4.79 remoteGids::ffff:10.103.3.163 hca mlx5_0
Exception raised from checkForNCCLErrorsInternal at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:2027 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fbe3a812446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::checkForNCCLErrorsInternal(std::shared_ptr&) + 0x220 (0x7fbe3bb42ce0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: c10d::ProcessGroupNCCL::WorkNCCL::checkAndSetException() + 0x7c (0x7fbe3bb42f2c in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: c10d::ProcessGroupNCCL::watchdogHandler() + 0x213 (0x7fbe3bb4a8f3 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: c10d::ProcessGroupNCCL::ncclCommWatchdog() + 0x14d (0x7fbe3bb4c37d in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #5: + 0x145c0 (0x7fbe8b4785c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #6: + 0x9caa4 (0x7fbe9474eaa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #7: + 0x129c3c (0x7fbe947dbc3c in /lib/x86_64-linux-gnu/libc.so.6)
[rank16]:[E827 15:24:32.618090878 ProcessGroupNCCL.cpp:1785] [PG ID 0 PG GUID 0(default_pg) Rank 16] Exception (either an error or timeout) detected by watchdog at work: 2188, last enqueued NCCL work: 2188, last completed NCCL work: 2187.
[rank16]:[E827 15:24:33.292944554 ProcessGroupNCCL.cpp:630] [Rank 16] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[rank16]:[E827 15:24:33.292957750 ProcessGroupNCCL.cpp:636] [Rank 16] To avoid data inconsistency, we are taking the entire process down.
[rank16]:[E827 15:24:33.293070037 ProcessGroupNCCL.cpp:1595] [PG ID 0 PG GUID 0(default_pg) Rank 16] Process group watchdog thread terminated with exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5
ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely.
Last error: NET/IB: Got completion from peer 10.102.204.48<60201> with status=12 opcode=32699 len=0 vendor err 129 (Send) localGid ::ffff:10.103.4.79 remoteGids::ffff:10.103.3.163 hca mlx5_0
terminate called after throwing an instance of 'c10::DistBackendError'
what(): [PG ID 0 PG GUID 0(default_pg) Rank 16] Process group watchdog thread terminated with exception: NCCL error: remote process exited or there was a network error, NCCL version 2.21.5
ncclRemoteError: A call failed possibly due to a network error or a remote process exiting prematurely.
Last error: NET/IB: Got completion from peer 10.102.204.48<60201> with status=12 opcode=32699 len=0 vendor err 129 (Send) localGid ::ffff:10.103.4.79 remoteGids::ffff:10.103.3.163 hca mlx5_0
Exception raised from ncclCommWatchdog at /opt/conda/conda-bld/pytorch_1729647382455/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1601 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7fbe3a812446 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: + 0xe65ceb (0x7fbe3b7b7ceb in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: + 0x145c0 (0x7fbe8b4785c0 in /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/lib/libtorch.so)
frame #3: + 0x9caa4 (0x7fbe9474eaa4 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: + 0x129c3c (0x7fbe947dbc3c in /lib/x86_64-linux-gnu/libc.so.6)
W0827 15:24:33.961000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793]
W0827 15:24:33.961000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 15:24:33.961000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0827 15:24:33.961000 910 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py:793] *****************************************
W0827 15:24:36.971000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 897 closing signal SIGTERM
W0827 15:24:36.977000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 898 closing signal SIGTERM
W0827 15:24:36.978000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 899 closing signal SIGTERM
W0827 15:24:36.978000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 900 closing signal SIGTERM
W0827 15:24:36.979000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 901 closing signal SIGTERM
W0827 15:24:36.979000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 902 closing signal SIGTERM
W0827 15:24:36.980000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 903 closing signal SIGTERM
E0827 15:24:42.382000 894 /mnt/shared-storage-user/intern7shared/liuzhaoyang/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: -6) local_rank: 0 (pid: 896) of binary: /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/python
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==2.5.1', 'console_scripts', 'torchrun')())
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 919, in main
    run(args)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/run.py", line 910, in run
    elastic_launch(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 138, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
qwenvl/train/train_qwen.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2025-08-27_15:24:36
  host : 10-102-205-41.kubebrain-node-exporter.kubebrain.svc.pjlab.local
  rank : 16 (local_rank: 0)
  exitcode : -6 (pid: 896)
  error_file:
  traceback : Signal 6 (SIGABRT) received by PID 896
============================================================
[2025-08-27 15:24:58,950] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /tmp/triton_lzy: No such file or directory
[2025-08-27 15:24:59,979] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:24:59,989] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:24:59,990] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:24:59,991] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,178] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,183] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,184] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,185] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,186] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,192] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,192] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,193] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:00,682] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,793] [INFO] 
[real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,800] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,853] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,855] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,857] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,858] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,859] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:00,950] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,952] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,960] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,961] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,971] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,971] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,971] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,971] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,972] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,972] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,972] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,972] [INFO] 
[comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2025-08-27 15:25:00,978] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,982] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:00,992] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:00,993] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,007] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,008] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,009] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,042] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,046] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,047] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,049] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,051] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,053] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 
15:25:01,410] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,498] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:01,503] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,538] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,544] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,545] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,547] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,548] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,551] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,552] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,552] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,555] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,558] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,614] [INFO] [real_accelerator.py:219:get_accelerator] 
Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,615] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,616] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,617] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,619] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:01,623] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,624] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,626] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,675] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,682] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,684] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,685] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,688] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,690] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:01,693] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:01,808] [INFO] [config.py:733:__init__] Config mesh_device None 
world_size = 128 [2025-08-27 15:25:01,808] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:01,811] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:01,811] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:01,812] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:01,815] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:01,816] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
df: df: /tmp/triton_lzy/tmp/triton_lzy: No such file or directory: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:01,941] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:01,942] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,022] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,331] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,334] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:02,336] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,338] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,342] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,345] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:02,346] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,407] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,726] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,668] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,669] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,678] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:02,727] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,728] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,730] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,732] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:02,733] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:02,737] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:03,232] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:03,360] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,426] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,453] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,453] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,453] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,453] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,453] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,454] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,454] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,454] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,479] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,484] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] 
[comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,485] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,501] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,504] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,509] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,512] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,518] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,535] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,561] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:03,561] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. 
Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:03,571] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:03,572] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:03,572] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:03,577] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:03,578] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:03,798] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,843] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,856] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,864] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,864] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,872] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,874] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,875] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,879] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,880] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,884] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,885] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,886] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,890] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,893] [INFO] 
[real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,894] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-08-27 15:25:03,908] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:03,922] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,922] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,922] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,923] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,923] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,923] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,923] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,923] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,953] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,954] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,954] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,954] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,955] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,955] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,955] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,955] [INFO] [comm.py:652:init_distributed] cdb=None [2025-08-27 15:25:03,968] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:04,045] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. df: /tmp/triton_lzy: No such file or directory df: /tmp/triton_lzy: No such file or directory [2025-08-27 15:25:04,201] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:04,215] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:04,216] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 [2025-08-27 15:25:04,216] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. [2025-08-27 15:25:04,217] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128 You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. 
[2025-08-27 15:25:04,222] [INFO] [config.py:733:__init__] Config mesh_device None world_size = 128
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
[2025-08-27 15:25:04,468] [INFO] [comm.py:652:init_distributed] cdb=None
[2025-08-27 15:25:07,952] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /tmp/triton_lzy: No such file or directory
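The repeated `df: /tmp/triton_lzy: No such file or directory` messages above suggest a Triton cache directory was probed before it existed. A minimal sketch of a pre-launch fix, assuming the path is meant to be configured through the `TRITON_CACHE_DIR` environment variable (which Triton honors for its kernel cache); the exact directory name here is taken from the log, not from any config file:

```python
import os
import tempfile

# Hedged sketch: pre-create the cache directory that `df` complains about,
# so each rank finds it at startup. The directory name mirrors the log;
# adjust it to match your launcher's actual configuration.
cache_dir = os.path.join(tempfile.gettempdir(), "triton_lzy")  # e.g. /tmp/triton_lzy
os.makedirs(cache_dir, exist_ok=True)            # no-op if it already exists
os.environ.setdefault("TRITON_CACHE_DIR", cache_dir)
```

Running this in the launch script (or an `env`-setting wrapper) before `torchrun` starts the workers should silence the probe on every rank.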
[2025-08-27 15:25:22,881] [INFO] [partition_parameters.py:348:__exit__] finished initializing model - num_params = 729, num_elems = 8.29B
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Rank 0: --> before Client(conf_path)
Rank 0: --> after Client(conf_path)
Rank 0: Loading datasets: ../internvl_chat/data/internvl_meta/meta/meta_250827_2.json
Rank 0: Loading guienv
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl with all sampling strategy
Rank 0: Loaded 327972 samples from VC:s3://gui/new_annotations/aguvis/stage1/guienv_202507011.jsonl
Rank 0: Loading omniact
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl with all sampling strategy
Rank 0: Loaded 6720 samples from VC:s3://gui/new_annotations/aguvis/stage1/omniact_fix_202507011.jsonl
Rank 0: Loading ricoig16k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16133 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricoig16k_202507011.jsonl
Rank 0: Loading ricosca
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl with all sampling strategy
Rank 0: Loaded 173212 samples from VC:s3://gui/new_annotations/aguvis/stage1/ricosca_202507011.jsonl
Rank 0: Loading seeclick
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl with all sampling strategy
Rank 0: Loaded 271121 samples from VC:s3://gui/new_annotations/aguvis/stage1/seeclick_202507011.jsonl
Rank 0: Loading ui_refexp
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl with all sampling strategy
Rank 0: Loaded 15624 samples from VC:s3://gui/new_annotations/aguvis/stage1/ui_refexp_202507011.jsonl
Rank 0: Loading webui350k
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl with all sampling strategy
Rank 0: Loaded 57389 samples from VC:s3://gui/new_annotations/aguvis/stage1/webui350k_202507011.jsonl
Rank 0: Loading widget_captioning
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl with all sampling strategy
Rank 0: Loaded 101426 samples from VC:s3://gui/new_annotations/aguvis/stage1/widget_captioning_202507011.jsonl
Rank 0: Loading aitw-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l1_202507011.jsonl
Rank 0: Loading aitw-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l2_202507011.jsonl
Rank 0: Loading aitw-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 18992 samples from VC:s3://gui/new_annotations/aguvis/stage2/aitw-l3_202507011.jsonl
Rank 0: Loading amex-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l1_202507011.jsonl
Rank 0: Loading amex-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l2_202507011.jsonl
Rank 0: Loading amex-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 38469 samples from VC:s3://gui/new_annotations/aguvis/stage2/amex-l3_202507011.jsonl
Rank 0: Loading android_control
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl with repeat:2 sampling strategy
Rank 0: Loaded 149428 samples from VC:s3://gui/new_annotations/aguvis/stage2/android_control_202507011.jsonl
Rank 0: Loading coat
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl with all sampling strategy
Rank 0: Loaded 11833 samples from VC:s3://gui/new_annotations/aguvis/stage2/coat_filtered_202507011.jsonl
Rank 0: Loading guiact-web-multi-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l1_202507011.jsonl
Rank 0: Loading guiact-web-multi-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l2_202507011.jsonl
Rank 0: Loading guiact-web-multi-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 16704 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-multi-l3_202507011.jsonl
Rank 0: Loading guiact-web-single
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl with all sampling strategy
Rank 0: Loaded 67396 samples from VC:s3://gui/new_annotations/aguvis/stage2/guiact-web-single_202507013.jsonl
Rank 0: Loading guide
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl with all sampling strategy
Rank 0: Loaded 13544 samples from VC:s3://gui/new_annotations/aguvis/stage2/guide_202507011.jsonl
Rank 0: Loading gui-odyssey-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l1_202507011.jsonl
Rank 0: Loading gui-odyssey-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l2_202507011.jsonl
Rank 0: Loading gui-odyssey-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 118282 samples from VC:s3://gui/new_annotations/aguvis/stage2/gui-odyssey-l3_202507011.jsonl
Rank 0: Loading mind2web-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l1_202507011.jsonl
Rank 0: Loading mind2web-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l2_202507011.jsonl
Rank 0: Loading mind2web-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 7591 samples from VC:s3://gui/new_annotations/aguvis/stage2/mind2web-l3_202507011.jsonl
Rank 0: Loading miniwob-l1
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l1_202507011.jsonl
Rank 0: Loading miniwob-l2
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l2_202507011.jsonl
Rank 0: Loading miniwob-l3
Rank 0: Loading VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl with all sampling strategy
Rank 0: Loaded 9826 samples from VC:s3://gui/new_annotations/aguvis/stage2/miniwob-l3_202507011.jsonl
Rank 0: Loading aguvis_android_control-v2
Rank 0: Skipping aguvis_android_control-v2 due to repeat_time=0
Rank 0: Loading aguvis_coat-v2
Rank 0: Skipping aguvis_coat-v2 due to repeat_time=0
Rank 0: Loading aguvis_docvqa_grounding
Rank 0: Skipping aguvis_docvqa_grounding due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-multi
Rank 0: Skipping aguvis_guiact-web-multi due to repeat_time=0
Rank 0: Loading aguvis_guiact-web-single-v2
Rank 0: Skipping aguvis_guiact-web-single-v2 due to repeat_time=0
Rank 0: Loading aguvis_guide_si_10k-v2
Rank 0: Skipping aguvis_guide_si_10k-v2 due to repeat_time=0
Rank 0: Loading aguvis_guienv
Rank 0: Skipping aguvis_guienv due to repeat_time=0
Rank 0: Loading aguvis_mind2web_train_v1.0.1
Rank 0: Skipping aguvis_mind2web_train_v1.0.1 due to repeat_time=0
Rank 0: Loading aguvis_omniact
Rank 0: Skipping aguvis_omniact due to repeat_time=0
Rank 0: Loading aguvis_osatlas_ui_tars_cleaned
Rank 0: Skipping aguvis_osatlas_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_ricoig16k
Rank 0: Skipping aguvis_ricoig16k due to repeat_time=0
Rank 0: Loading aguvis_ricosca
Rank 0: Skipping aguvis_ricosca due to repeat_time=0
Rank 0: Loading aguvis_seeclick_mi_ui_tars_cleaned
Rank 0: Skipping aguvis_seeclick_mi_ui_tars_cleaned due to repeat_time=0
Rank 0: Loading aguvis_seeclick_ui_tars_cleaned_fixed
Rank 0: Skipping aguvis_seeclick_ui_tars_cleaned_fixed due to repeat_time=0
Rank 0: Loading aguvis_ui_refexp
Rank 0: Skipping aguvis_ui_refexp due to repeat_time=0
Rank 0: Loading aguvis_webui350k
Rank 0: Skipping aguvis_webui350k due to repeat_time=0
Rank 0: Loading aguvis_widget_captioning
Rank 0: Skipping aguvis_widget_captioning due to repeat_time=0
Rank 0: Loading icon_caption_icon_v0222_description
Rank 0: Skipping icon_caption_icon_v0222_description due to repeat_time=0
Rank 0: Loading icon_grounding_icon_v0222_grounding
Rank 0: Skipping icon_grounding_icon_v0222_grounding due to repeat_time=0
Rank 0: Loading refusal_component_final_1.5m
Rank 0: Skipping refusal_component_final_1.5m due to repeat_time=0
Rank 0: Loading refusal_component_library_snap_icon_data_grounding
Rank 0: Skipping refusal_component_library_snap_icon_data_grounding
due to repeat_time=0 Rank 0: Loading refusal_component_v1_130k Rank 0: Skipping refusal_component_v1_130k due to repeat_time=0 Rank 0: Loading refusal_guienv Rank 0: Skipping refusal_guienv due to repeat_time=0 Rank 0: Loading refusal_icon_v0222_grounding Rank 0: Skipping refusal_icon_v0222_grounding due to repeat_time=0 Rank 0: Loading refusal_osatlas_ui_tars_cleaned Rank 0: Skipping refusal_osatlas_ui_tars_cleaned due to repeat_time=0 Rank 0: Loading refusal_ricosca Rank 0: Skipping refusal_ricosca due to repeat_time=0 Rank 0: Loading refusal_seeclick_mi_ui_tars_cleaned Rank 0: Skipping refusal_seeclick_mi_ui_tars_cleaned due to repeat_time=0 Rank 0: Loading refusal_seeclick_ui_tars_cleaned_fixed Rank 0: Skipping refusal_seeclick_ui_tars_cleaned_fixed due to repeat_time=0 Rank 0: Loading refusal_training_data_icon_grounded_merged Rank 0: Skipping refusal_training_data_icon_grounded_merged due to repeat_time=0 Rank 0: Loading component_generated_component_final_1.5m_cleaned_split Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl with random:10% sampling strategy Rank 0: Loaded 3987 samples from VC:s3://gui-agent/jedi/annotations_250713/component_final_1.5m_cleaned_split_20250713.jsonl Rank 0: Loading component_generated_component_library_snap_icon_data_description Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl with random:50% sampling strategy Rank 0: Loaded 11061 samples from VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_description_conversations_20250713.jsonl Rank 0: Loading component_generated_component_library_snap_icon_data_grounding Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 4424 samples from 
VC:s3://gui-agent/jedi/annotations_250713/component_library_snap_icon_data_grounding_conversations_20250713.jsonl Rank 0: Loading component_generated_component_v1_130k Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 26376 samples from VC:s3://gui-agent/jedi/annotations_250713/component_v1_130k_20250713.jsonl Rank 0: Loading component_rule-based_doc_data_new Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 3153 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_data_new_20250713.jsonl Rank 0: Loading component_rule-based_doc_scroll_data_new Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 603 samples from VC:s3://gui-agent/jedi/annotations_250713/doc_scroll_data_new_20250713.jsonl Rank 0: Loading component_rule-based_ethercalc_v1 Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 2012 samples from VC:s3://gui-agent/jedi/annotations_250713/ethercalc_v1_20250713.jsonl Rank 0: Loading component_rule-based_slide_v1_17k Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl with random:20% sampling strategy Rank 0: Loaded 2363 samples from VC:s3://gui-agent/jedi/annotations_250713/slide_v1_17k_20250713.jsonl Rank 0: Loading icon_caption_ios_app_data Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy Rank 0: Loaded 49498 samples from VC:s3://gui-agent/jedi/annotations_250713/ios_app_data_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_caption_mac_app_data Rank 0: Loading 
VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl with all sampling strategy Rank 0: Loaded 18083 samples from VC:s3://gui-agent/jedi/annotations_250713/mac_app_data_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_caption_training_data_icon Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl with random:50% sampling strategy Rank 0: Loaded 75874 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_pure_color_background_20250713.jsonl Rank 0: Loading icon_grounding_training_data_icon_grounded_merged Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 5466 samples from VC:s3://gui-agent/jedi/annotations_250713/training_data_icon_conversations-images_grounded_merged_20250713.jsonl Rank 0: Loading layout_layout200k_training_data_qwen25 Rank 0: Skipping layout_layout200k_training_data_qwen25 due to repeat_time=0 Rank 0: Loading layout_layout200k_grounding_training_data_qwen25 Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl with random:10% sampling strategy Rank 0: Loaded 158612 samples from VC:s3://gui-agent/jedi/annotations_250713/layout200k_grounding_training_data_qwen25_20250713.jsonl Rank 0: Loading layout_layout400k_claude_training_data_qwen25_split Rank 0: Skipping layout_layout400k_claude_training_data_qwen25_split due to repeat_time=0 Rank 0: Loading layout_layout400k_claude_grounding_training_data_qwen25_split Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 7540 samples from 
VC:s3://gui-agent/jedi/annotations_250713/layout400k_claude_grounding_training_data_qwen25_split_20250713.jsonl Rank 0: Loading layout_os_layout_v1 Rank 0: Skipping layout_os_layout_v1 due to repeat_time=0 Rank 0: Loading layout_os_layout_v1_grounding Rank 0: Loading VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl with random:30% sampling strategy Rank 0: Loaded 7857 samples from VC:s3://gui-agent/jedi/annotations_250713/os_layout_v1_grounding_training_data_qwen25_split_20250713.jsonl Rank 0: Loading mind2web_raw_image Rank 0: Loading VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl with all sampling strategy Rank 0: Loaded 5740 samples from VC:s3://gui-agent/mind2web_train/navigation_20250705.jsonl Rank 0: Loading ws_android_navigation_20250328 Rank 0: Skipping ws_android_navigation_20250328 due to repeat_time=0 Rank 0: Loading ws_android_navigation_20250407 Rank 0: Skipping ws_android_navigation_20250407 due to repeat_time=0 Rank 0: Loading ws_web_navigation_w_history_20250328 Rank 0: Skipping ws_web_navigation_w_history_20250328 due to repeat_time=0 Rank 0: Loading ws_web_navigation_wo_history_20250328 Rank 0: Skipping ws_web_navigation_wo_history_20250328 due to repeat_time=0 Rank 0: Loading ws_web_navigation_20250421 Rank 0: Skipping ws_web_navigation_20250421 due to repeat_time=0 Rank 0: Loading ws_ubuntu_navigation_20250328 Rank 0: Skipping ws_ubuntu_navigation_20250328 due to repeat_time=0 Rank 0: Loading ws_android_navigation_20250505 Rank 0: Skipping ws_android_navigation_20250505 due to repeat_time=0 Rank 0: Loading internal_android_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 48814 samples from VC:s3://gui-agent/data_20250612/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250612 Rank 0: 
Loading VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 19042 samples from VC:s3://gui-agent/data_20250612/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 8363 samples from VC:s3://gui-agent/data_20250612/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 26412 samples from VC:s3://gui-agent/data_20250612/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 57522 samples from VC:s3://gui-agent/data_20250612/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 1342 samples from VC:s3://gui-agent/data_20250624/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 15766 samples from VC:s3://gui-agent/data_20250624/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 19280 
samples from VC:s3://gui-agent/data_20250630/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 3560 samples from VC:s3://gui-agent/data_20250630/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 9258 samples from VC:s3://gui-agent/data_20250630/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 420 samples from VC:s3://gui-agent/data_20250707/mac/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 8898 samples from VC:s3://gui-agent/data_20250707/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 21026 samples from VC:s3://gui-agent/data_20250707/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_navigation_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 4490 samples from VC:s3://gui-agent/data_20250707/android/navigation_20250720_boost_instruction.jsonl Rank 0: Loading 
internal_windows_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 22154 samples from VC:s3://gui-agent/data_20250714/windows/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 11614 samples from VC:s3://gui-agent/data_20250714/web/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl with random:50% sampling strategy Rank 0: Loaded 16767 samples from VC:s3://gui-agent/data_20250714/ubuntu/navigation_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_navigation_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl with random:50% sampling strategy Rank 0: Loaded 746 samples from VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822.jsonl Rank 0: Loading internal_ubuntu_navigation_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl with random:50% sampling strategy Rank 0: Loaded 9856 samples from VC:s3://gui-agent/data_20250813/ubuntu/navigation_20250822_boost.jsonl Rank 0: Loading internal_windows_navigation_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl with random:50% sampling strategy Rank 0: Loaded 1564 samples from VC:s3://gui-agent/data_20250813/windows/navigation_20250822.jsonl Rank 0: Loading internal_windows_navigation_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl with random:50% sampling strategy Rank 0: Loaded 1564 samples from 
VC:s3://gui-agent/data_20250813/windows/navigation_20250822_boost.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 146442 samples from VC:s3://gui-agent/data_20250612/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 57126 samples from VC:s3://gui-agent/data_20250612/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 16726 samples from VC:s3://gui-agent/data_20250612/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 52824 samples from VC:s3://gui-agent/data_20250612/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250612 Rank 0: Loading VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 115044 samples from VC:s3://gui-agent/data_20250612/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2684 samples from VC:s3://gui-agent/data_20250624/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading 
internal_ubuntu_planning_cot_boost_instruction_20250624 Rank 0: Loading VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 31532 samples from VC:s3://gui-agent/data_20250624/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 38560 samples from VC:s3://gui-agent/data_20250630/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 10680 samples from VC:s3://gui-agent/data_20250630/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl with repeat:2 sampling strategy Rank 0: Loaded 18516 samples from VC:s3://gui-agent/data_20250630/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_mac_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 1260 samples from VC:s3://gui-agent/data_20250707/mac/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 17796 samples from VC:s3://gui-agent/data_20250707/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_android_planning_cot_boost_instruction_20250707 Rank 0: Loading 
VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl with repeat:3 sampling strategy Rank 0: Loaded 26937 samples from VC:s3://gui-agent/data_20250707/android/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 42051 samples from VC:s3://gui-agent/data_20250707/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 44307 samples from VC:s3://gui-agent/data_20250714/windows/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_web_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 23229 samples from VC:s3://gui-agent/data_20250714/web/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl with all sampling strategy Rank 0: Loaded 33534 samples from VC:s3://gui-agent/data_20250714/ubuntu/planning_20250720_boost_instruction.jsonl Rank 0: Loading internal_windows_planning_cot_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl with repeat:2 sampling strategy Rank 0: Loaded 6254 samples from VC:s3://gui-agent/data_20250813/windows/planning_20250822.jsonl Rank 0: Loading internal_windows_planning_cot_boost_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl with all sampling strategy Rank 0: Loaded 3127 samples from 
VC:s3://gui-agent/data_20250813/windows/planning_20250822_boost.jsonl Rank 0: Loading internal_ubuntu_planning_cot_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl with repeat:2 sampling strategy Rank 0: Loaded 2984 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822.jsonl Rank 0: Loading internal_ubuntu_planning_cot_boost_instruction_20250813 Rank 0: Loading VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl with all sampling strategy Rank 0: Loaded 19712 samples from VC:s3://gui-agent/data_20250813/ubuntu/planning_20250822_boost.jsonl Rank 0: Loading private_aig_share_0815_logo_oral_operation_d240924_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl with all sampling strategy Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_oral_operation_d240924_v1.jsonl Rank 0: Loading private_aig_share_0815_logo_region_caption_d240924_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl with all sampling strategy Rank 0: Loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_region_caption_d240924_v1.jsonl Rank 0: Loading private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl with all sampling strategy Rank 0: Loaded 20293 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ui_operation_oral_wbox_d241023_v2.jsonl Rank 0: Loading private_ui_phone_comment_20240606_json_d20241023_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl with all sampling strategy Rank 0: 
Loaded 1055 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_comment_20240606_json_d20241023_v2.jsonl Rank 0: Loading private_ui_internal_aig_json_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6837 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_json_d241126.jsonl Rank 0: Loading private_ui_internal_aig_xml_d241126 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl with repeat:3 sampling strategy Rank 0: Loaded 6873 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_internal_aig_xml_d241126.jsonl Rank 0: Loading OS_Altas_androidworld_grounding_d241120_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl with all sampling strategy Rank 0: Loaded 89860 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/OS_Altas_androidworld_grounding_d241120_v1.jsonl Rank 0: Loading private_ui_aig_share_long_caption_20240604_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl with repeat:4 sampling strategy Rank 0: Loaded 3156 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_long_caption_20240604_v1.jsonl Rank 0: Loading aw_1218_grounding Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/grounding_new.jsonl Rank 0: Loading aw_1218_regioncaption Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from 
VC:s3://gui/new_annotations/st_data/20250222/annotations/regioncaption_new.jsonl Rank 0: Loading aw_1218_oral_operation Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl with all sampling strategy Rank 0: Loaded 863 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/oral_operation_new.jsonl Rank 0: Loading private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl with all sampling strategy Rank 0: Loaded 6600 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240812_grounding_dataset_20240812_v1_r6600.jsonl Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl with all sampling strategy Rank 0: Loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_element_recognition_d20240416_v1.jsonl Rank 0: Loading private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl with all sampling strategy Rank 0: Loaded 24620 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_element_recognition_d240605_v1_correct_d240607.jsonl Rank 0: Loading private_ui_phone_2403_long_caption_d20240604_v2 Rank 0: Loading VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl with all sampling strategy Rank 0: Loaded 17196 samples from 
VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d20240604_v2.jsonl
Rank 0: Loading private_ui_phone_2403_long_caption_d240430_v1 (all sampling): loaded 5998 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_long_caption_d240430_v1.jsonl
Rank 0: Loading private_ui_phone_2403_ocr_d240430_v1 (all sampling): loaded 31276 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_phone_2403_ocr_d240430_v1.jsonl
Rank 0: Loading screen_qa_with_bbox_d240430_v1 (all sampling): loaded 62401 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_with_bbox_d240430_v1.jsonl
Rank 0: Loading screenai_layout_20240604_v1 (all sampling): loaded 22076 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screenai_layout_20240604_v1.jsonl
Rank 0: Loading amex_grounding_d240813_v1 (all sampling): loaded 102007 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/amex_grounding_d240813_v1.jsonl
Rank 0: Loading guicourse_guienv_text_grounding_1_d240815_v3 (all sampling): loaded 63581 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_1_d240815_v3.jsonl
Rank 0: Loading guicourse_guienv_text_grounding_2_d240815_v3 (all sampling): loaded 6852 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/guicourse_guienv_text_grounding_2_d240815_v3.jsonl
Rank 0: Loading private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1 (all sampling): loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_detection_d20240418_v1.jsonl
Rank 0: Loading private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1 (all sampling): loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_ground_d20240416_v1.jsonl
Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1 (all sampling): loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_detection_d20240418_v1.jsonl
Rank 0: Loading private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1 (all sampling): loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_tablet_20240416_v7_ground_d20240416_v1.jsonl
Rank 0: Loading screen_qa_short_d240430_v1 (all sampling): loaded 27880 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/screen_qa_short_d240430_v1.jsonl
Rank 0: Loading private_aig_share_0815_logo_grounding_d240924_v1 (all sampling): loaded 1405 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_aig_share_0815_logo_grounding_d240924_v1.jsonl
Rank 0: Loading private_schedual_extract_20240520_v2_r464_reprompt_d240607 (repeat:2 sampling): loaded 928 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_schedual_extract_20240520_v2_r464_reprompt_d240607.jsonl
Rank 0: Loading private_ui2json_app_d20240822_v1 (repeat:2 sampling): loaded 2488 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_app_d20240822_v1.jsonl
Rank 0: Loading private_ui2json_os_d20240822_v1 (repeat:2 sampling): loaded 1242 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_os_d20240822_v1.jsonl
Rank 0: Loading private_ui2json_web_d20240822_v1 (repeat:2 sampling): loaded 2360 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui2json_web_d20240822_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607 (all sampling): loaded 3791 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_element_recognition_d240605_v1_correct_d240607.jsonl
Rank 0: Loading private_ui_aig_share_2405_marker_recognition_d240605_v1 (all sampling): loaded 5179 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_marker_recognition_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_ocr_d240605_v1 (all sampling): loaded 5090 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_ocr_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_operation_oral_d240605_v1 (all sampling): loaded 5070 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_operation_oral_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1 (all sampling): loaded 5248 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2405_visual_prompt_with_bbox_d240605_v1.jsonl
Rank 0: Loading private_ui_aig_share_2408_region_caption_d240903_v1 (all sampling): loaded 5854 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_aig_share_2408_region_caption_d240903_v1.jsonl
Rank 0: Loading private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1 (all sampling): loaded 2000 samples from VC:s3://gui/new_annotations/st_data/20250222/annotations/private_ui_homescreen_phone_20240416_v7_element_recognition_d20240416_v1.jsonl
Rank 0: Loading uground_web_direct_150k_description_filtered (all sampling): loaded 133523 samples from VC:s3://gui/new_annotations/uground/web_direct_150k_description_filtered_20250826.jsonl
Rank 0: Loading uground_web_direct_258k_function_filtered (all sampling): loaded 169889 samples from VC:s3://gui/new_annotations/uground/web_direct_258k_function_filtered_20250826.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered (all sampling): loaded 400000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_2 (all sampling): loaded 300000 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_2.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_3 (all sampling): loaded 161474 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_3.jsonl
Rank 0: Loading uground_web_hybrid_773k_max_25qa_filtered_4 (all sampling): loaded 239854 samples from VC:s3://gui/new_annotations/uground/web_hybrid_773k_max_25qa_filtered_new_20250826_4.jsonl
Rank 0: Loading altas_windows (all sampling): loaded 200000 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826.jsonl
Rank 0: Loading altas_windows_2 (all sampling): loaded 552883 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_windows_splited_20250826_2.jsonl
Rank 0: Loading altas_linux (all sampling): loaded 32538 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_linux_splited_20250826.jsonl
Rank 0: Loading atlas_macos_uitars_coord (all sampling): loaded 14197 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826_2.jsonl
Rank 0: Loading atlas_macos_uitars_filtered (random:30% sampling): loaded 4133 samples from VC:s3://gui/new_annotations/OS-Atlas/windows_desktop/processed_macos_splited_20250826.jsonl
Rank 0: Loading android_action_grounding_20250328 (all sampling): loaded 11242 samples from VC:s3://gui/data_20250328/android/filter_action_grounding_20250405_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250328 (all sampling): loaded 23961 samples from VC:s3://gui-agent/data_20250328/windows/action_grounding_20250409_202507011_20250722.jsonl
Rank 0: Loading web_action_grounding_20250328 (all sampling): loaded 18918 samples from VC:s3://gui-agent/data_20250328/web_25k/action_grounding_20250404_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250310 (all sampling): loaded 657 samples from VC:s3://gui/data_20250310/ubuntu/action_grounding_20250407_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250317 (all sampling): loaded 107 samples from VC:s3://gui/data_20250317/ubuntu/action_grounding_20250407_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250317 (random:50% sampling): loaded 480 samples from VC:s3://gui/data_20250317/windows/action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250317 (random:50% sampling): loaded 480 samples from VC:s3://gui/data_20250317/windows/crop_action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250310 (random:50% sampling): loaded 944 samples from VC:s3://gui/data_20250310/windows/action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250310 (random:50% sampling): loaded 944 samples from VC:s3://gui/data_20250310/windows/crop_action_grounding_20250421_202507011_20250722.jsonl
Rank 0: Loading mac_action_grounding_20250407 (all sampling): loaded 1578 samples from VC:s3://gui-agent/data_20250407/mac/action_grounding_20250410_202507011.jsonl
Rank 0: Loading iphone_action_grounding_20250407 (all sampling): loaded 20394 samples from VC:s3://gui-agent/data_20250407/iphone/white/action_grounding_20250410_202507011.jsonl
Rank 0: Loading web_action_grounding_20250407 (random:20% sampling): loaded 14285 samples from VC:s3://gui-agent/data_20250407/web/action_grounding_20250414_202507011.jsonl
Rank 0: Loading android_action_grounding_20250407 (all sampling): loaded 7180 samples from VC:s3://gui-agent/data_20250407/android/action_grounding_20250410_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250407 (all sampling): loaded 42845 samples from VC:s3://gui-agent/data_20250407/windows/action_grounding_20250416_202507011_20250722.jsonl
Rank 0: Loading windows_human_action_grounding_20250407 (repeat:3 sampling): loaded 150 samples from VC:s3://gui-agent/data_20250407/windows/human_action_grounding_20250416_202507011_20250722.jsonl
Rank 0: Loading windows_aug_cropping_action_grounding_20250407 (random:50% sampling): loaded 15350 samples from VC:s3://gui-agent/data_20250407/windows/sub_action_grounding_20250421_202507011.jsonl
Rank 0: Loading iphone_action_grounding_20250414 (all sampling): loaded 20116 samples from VC:s3://gui-agent/data_20250414/iphone/action_grounding_20250417_202507011.jsonl
Rank 0: Loading iphone_human_action_grounding_20250414 (repeat:3 sampling): loaded 3780 samples from VC:s3://gui-agent/data_20250414/iphone/human_action_grounding_20250421_202507011.jsonl
Rank 0: Loading mac_human_action_grounding_20250414 (repeat:3 sampling): loaded 11721 samples from VC:s3://gui-agent/data_20250414/mac/human_action_grounding_20250418_202507011.jsonl
Rank 0: Loading android_action_grounding_20250421 (all sampling): loaded 35675 samples from VC:s3://gui-agent/data_20250421/Android/action_grounding_20250429_202507011.jsonl
Rank 0: Loading android_action_grounding_20250428 (all sampling): loaded 18016 samples from VC:s3://gui-agent/data_20250428/Android/action_grounding_20250429_202507011.jsonl
Rank 0: Loading web_canvas_action_grounding_20250428 (random:20% sampling): loaded 624 samples from VC:s3://gui-agent/data_20250428/web_canvas/action_grounding_20250429_202507011.jsonl
Rank 0: Loading web_action_grounding_20250421 (all sampling): loaded 201304 samples from VC:s3://gui-agent/data_20250421/web/action_grounding_20250505_202507011.jsonl
Rank 0: Loading ubuntu_action_grounding_20250428 (all sampling): loaded 28346 samples from VC:s3://gui-agent/data_20250428/ubuntu/action_grounding_20250505_202507011.jsonl
Rank 0: Loading android_action_grounding_20250505 (all sampling): loaded 9814 samples from VC:s3://gui-agent/data_20250505/android/action_grounding_20250506_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250505 (all sampling): loaded 5270 samples from VC:s3://gui-agent/data_20250505/windows/action_grounding_20250508_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250505 (all sampling): loaded 10468 samples from VC:s3://gui-agent/data_20250505/windows/crop_action_grounding_20250508_202507011_20250722.jsonl
Rank 0: Loading ubuntu_action_grounding_20250508 (all sampling): loaded 3404 samples from VC:s3://gui-agent/data_20250508/ubuntu/action_grounding_20250509_202507011.jsonl
Rank 0: Loading windows_action_grounding_20250526_1 (all sampling): loaded 11420 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250510_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_1 (all sampling): loaded 22840 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250510_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_2 (all sampling): loaded 2621 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250526_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_2 (all sampling): loaded 5242 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250526_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_3 (all sampling): loaded 3265 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250527_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_3 (all sampling): loaded 6530 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250527_202507011_20250722.jsonl
Rank 0: Loading windows_action_grounding_20250526_4 (all sampling): loaded 34101 samples from VC:s3://gui-agent/data_20250526/windows/action_grounding_20250529_202507011_20250722.jsonl
Rank 0: Loading windows_crop_action_grounding_20250526_4 (all sampling): loaded 68202 samples from VC:s3://gui-agent/data_20250526/windows/crop_action_grounding_20250529_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_1 (all sampling): loaded 11420 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250510_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_2 (all sampling): loaded 2621 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250526_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_3 (all sampling): loaded 3331 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250527_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250609_4 (all sampling): loaded 10571 samples from VC:s3://gui-agent/data_20250609/windows/action_grounding_20250529_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_1 (random:50% sampling): loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_2 (random:50% sampling): loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_3 (random:50% sampling): loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_4 (random:50% sampling): loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_5 (random:50% sampling): loaded 11420 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250510_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_6 (random:50% sampling): loaded 2621 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250526_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_7 (random:50% sampling): loaded 3265 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250527_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250616_8 (random:50% sampling): loaded 10571 samples from VC:s3://gui-agent/data_20250616/windows_pure_paste/action_grounding_20250529_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250623 (repeat:3 sampling): loaded 4695 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250623 (all sampling): loaded 3130 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250619_202507011_20250722.jsonl
Rank 0: Loading windows_hover_action_grounding_20250623 (repeat:2 sampling): loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_crop_hover_action_grounding_20250623 (all sampling): loaded 6666 samples from VC:s3://gui-agent/data_20250623/windows/crop_action_grounding_20250620_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_1 (all sampling): loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_2 (random:50% sampling): loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_3 (random:50% sampling): loaded 3333 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250620_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_4 (all sampling): loaded 1430 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_5 (random:50% sampling): loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250623_6 (random:50% sampling): loaded 1565 samples from VC:s3://gui-agent/data_20250623/windows_augment/action_grounding_20250619_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630 (repeat:3 sampling): loaded 11919 samples from VC:s3://gui-agent/data_20250630/windows/action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630 (all sampling): loaded 7946 samples from VC:s3://gui-agent/data_20250630/windows/crop_action_grounding_20250627_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_1 (all sampling): loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_2 (random:50% sampling): loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_3 (random:50% sampling): loaded 3973 samples from VC:s3://gui-agent/data_20250630/windows_augment/action_grounding_20250627_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_2 (repeat:2 sampling): loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_2 (all sampling): loaded 1990 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_action_grounding_20250630_202507011_20250722.jsonl
Rank 0: Loading windows_human_action_grounding_20250630_3 (repeat:2 sampling): loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250630_3 (all sampling): loaded 5040 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_action_grounding_20250703_202507011_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_4 (all sampling): loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_5 (random:50% sampling): loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_6 (random:50% sampling): loaded 2520 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/action_grounding_20250703_pure_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_7 (all sampling): loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_concat_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_8 (random:50% sampling): loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_paste_202507011.jsonl
Rank 0: Loading windows_aug_action_grounding_20250630_9 (random:50% sampling): loaded 995 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/action_grounding_20250630_pure_paste_202507011.jsonl
Rank 0: Loading windows_human_action_grounding_20250707 (repeat:2 sampling): loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250707 (all sampling): loaded 2538 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_action_grounding_20250708_20250722.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_1 (random:50% sampling): loaded 343 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_2 (random:50% sampling): loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_aug_action_grounding_20250707_3 (random:50% sampling): loaded 1269 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_action_grounding_20250708.jsonl
Rank 0: Loading windows_human_action_grounding_20250714 (repeat:2 sampling): loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/action_grounding_20250717_20250722.jsonl
Rank 0: Loading windows_crop_human_action_grounding_20250714 (all sampling): loaded 2832 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_action_grounding_20250717_20250722.jsonl
Rank 0: Loading android_ocr_20250328 (all sampling): loaded 29878 samples from VC:s3://gui/data_20250328/android/text_ocr_20250409.jsonl
Rank 0: Loading mac_orc_20250328 (all sampling): loaded 4393 samples from VC:s3://gui/data_20250328/mac/element_ocr_20250328.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250310 (random:50% sampling): loaded 158 samples from VC:s3://gui/data_20250310/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_click_grounding_20250310 (random:50% sampling): loaded 1126 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_function_20250421.jsonl
Rank 0: Loading ubuntu_internvl_grounding_20250317 (random:50% sampling): loaded 33 samples from VC:s3://gui/data_20250317/ubuntu/internvl_grounding_20250407.jsonl
Rank 0: Loading windows_internvl_grounding_20250317 (random:50% sampling): loaded 4111 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250317 (random:25% sampling): loaded 2054 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_click_internvl_grounding_20250317 (random:50% sampling): loaded 338 samples from VC:s3://gui/data_20250317/windows/internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_crop_click_internvl_grounding_20250317 (random:25% sampling): loaded 169 samples from VC:s3://gui/data_20250317/windows/crop_internvl_grounding_function_20250421_20250722.jsonl
Rank 0: Loading windows_internvl_grounding_20250310 (random:50% sampling): loaded 6818 samples from VC:s3://gui/data_20250310/windows/internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading windows_crop_internvl_grounding_20250310 (random:25% sampling): loaded 3408 samples from VC:s3://gui/data_20250310/windows/crop_internvl_grounding_20250421_20250722.jsonl
Rank 0: Loading android_internvl_grounding_20250328 (random:50% sampling): loaded 8719 samples from VC:s3://gui/data_20250328/android/internvl_grounding_20250409.jsonl
Rank 0: Loading windows_internvl_grounding_20250328 Rank 0: Loading
VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 7744 samples from VC:s3://gui-agent/data_20250328/windows/internvl_grounding_20250425_20250722.jsonl Rank 0: Loading web_internvl_grounding_20250328 Rank 0: Loading VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl with random:50% sampling strategy Rank 0: Loaded 8376 samples from VC:s3://gui/data_20250328/web_25k/internvl_grounding_20250409.jsonl Rank 0: Loading icon_internvl_grounding_20250328 Rank 0: Loading VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl with all sampling strategy Rank 0: Loaded 81303 samples from VC:s3://gui/data_20250328/icon_canva/icon_anno_20250328.jsonl Rank 0: Loading mac_internvl_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl with random:50% sampling strategy Rank 0: Loaded 626 samples from VC:s3://gui-agent/data_20250407/mac/internvl_grounding_20250410.jsonl Rank 0: Loading iphone_internvl_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl with random:50% sampling strategy Rank 0: Loaded 13924 samples from VC:s3://gui-agent/data_20250407/iphone/white/internvl_grounding_20250410.jsonl Rank 0: Loading web_internvl_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl with random:50% sampling strategy Rank 0: Loaded 32254 samples from VC:s3://gui-agent/data_20250407/web/internvl_grounding_20250414.jsonl Rank 0: Loading android_internvl_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl with random:50% sampling strategy Rank 0: Loaded 3926 samples from VC:s3://gui-agent/data_20250407/android/internvl_grounding_20250410.jsonl Rank 0: Loading windows_internvl_grounding_20250407 Rank 0: Loading 
VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 16226 samples from VC:s3://gui-agent/data_20250407/windows/internvl_grounding_20250416_20250722.jsonl Rank 0: Loading windows_cropping_internvl_grounding_20250407 Rank 0: Loading VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl with random:30% sampling strategy Rank 0: Loaded 7322 samples from VC:s3://gui-agent/data_20250407/windows/sub_internvl_grounding_20250421.jsonl Rank 0: Loading iphone_internvl_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl with random:50% sampling strategy Rank 0: Loaded 8448 samples from VC:s3://gui-agent/data_20250414/iphone/internvl_grounding_20250417.jsonl Rank 0: Loading iphone_human_internvl_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl with all sampling strategy Rank 0: Loaded 927 samples from VC:s3://gui-agent/data_20250414/iphone/human_internvl_grounding_20250421.jsonl Rank 0: Loading mac_human_internvl_grounding_20250414 Rank 0: Loading VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl with all sampling strategy Rank 0: Loaded 3051 samples from VC:s3://gui-agent/data_20250414/mac/human_internvl_grounding_20250418.jsonl Rank 0: Loading android_internvl_grounding_20250421 Rank 0: Loading VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy Rank 0: Loaded 15760 samples from VC:s3://gui-agent/data_20250421/Android/internvl_grounding_20250429.jsonl Rank 0: Loading android_internvl_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl with random:50% sampling strategy Rank 0: Loaded 7923 samples from VC:s3://gui-agent/data_20250428/Android/internvl_grounding_20250429.jsonl Rank 0: Loading 
web_canvas_internvl_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl with random:30% sampling strategy Rank 0: Loaded 1174 samples from VC:s3://gui-agent/data_20250428/web_canvas/internvl_grounding_20250429.jsonl Rank 0: Loading web_internvl_grounding_20250421 Rank 0: Loading VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl with random:50% sampling strategy Rank 0: Loaded 108856 samples from VC:s3://gui-agent/data_20250421/web/internvl_grounding_20250505.jsonl Rank 0: Loading ubuntu_internvl_grounding_20250428 Rank 0: Loading VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl with random:50% sampling strategy Rank 0: Loaded 15538 samples from VC:s3://gui-agent/data_20250428/ubuntu/internvl_grounding_20250505.jsonl Rank 0: Loading android_internvl_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl with random:50% sampling strategy Rank 0: Loaded 5714 samples from VC:s3://gui-agent/data_20250505/android/internvl_grounding_20250506.jsonl Rank 0: Loading windows_internvl_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 2643 samples from VC:s3://gui-agent/data_20250505/windows/internvl_grounding_20250508_20250722.jsonl Rank 0: Loading windows_crop_internvl_grounding_20250505 Rank 0: Loading VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 2625 samples from VC:s3://gui-agent/data_20250505/windows/crop_internvl_grounding_20250508_20250722.jsonl Rank 0: Loading ubuntu_internvl_grounding_20250508 Rank 0: Loading VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl with random:50% sampling strategy Rank 0: Loaded 1792 samples from 
VC:s3://gui-agent/data_20250508/ubuntu/internvl_grounding_20250509.jsonl Rank 0: Loading windows_internvl_grounding_20250526_1 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250510_20250722.jsonl Rank 0: Loading windows_crop_internvl_grounding_20250526_1 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 5934 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250510_20250722.jsonl Rank 0: Loading windows_internvl_grounding_20250526_2 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250526_20250722.jsonl Rank 0: Loading windows_crop_internvl_grounding_20250526_2 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 1419 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250526_20250722.jsonl Rank 0: Loading windows_internvl_grounding_20250526_4 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/internvl_grounding_20250529_20250722.jsonl Rank 0: Loading windows_crop_internvl_grounding_20250526_4 Rank 0: Loading VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 17792 samples from VC:s3://gui-agent/data_20250526/windows/crop_internvl_grounding_20250529_20250722.jsonl Rank 0: Loading windows_human_internvl_grounding_20250523 
Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250619_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250523 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 810 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250619_20250722.jsonl Rank 0: Loading windows_hover_internvl_grounding_20250523 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/internvl_grounding_20250620_20250722.jsonl Rank 0: Loading windows_crop_hover_internvl_grounding_20250523 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 1714 samples from VC:s3://gui-agent/data_20250623/windows/crop_internvl_grounding_20250620_20250722.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_1 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl with random:20% sampling strategy Rank 0: Loaded 686 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_concat.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_2 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_3 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl 
with random:20% sampling strategy Rank 0: Loaded 1372 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250620_pure_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_4 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl with random:20% sampling strategy Rank 0: Loaded 295 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_concat.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_5 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250623_6 Rank 0: Loading VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 648 samples from VC:s3://gui-agent/data_20250623/windows_augment/internvl_grounding_20250619_pure_paste.jsonl Rank 0: Loading windows_human_internvl_grounding_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 1058 samples from VC:s3://gui-agent/data_20250630/windows/internvl_grounding_20250627_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250630 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 4230 samples from VC:s3://gui-agent/data_20250630/windows/crop_internvl_grounding_20250627_20250722.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_1 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl with random:20% sampling strategy Rank 0: Loaded 846 samples from 
VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_concat.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_2 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_3 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 1692 samples from VC:s3://gui-agent/data_20250630/windows_augment/internvl_grounding_20250627_pure_paste.jsonl Rank 0: Loading windows_human_internvl_grounding_20250630_2 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/internvl_grounding_20250630_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250630_2 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250630/windows_data_20250630/crop_internvl_grounding_20250630_20250722.jsonl Rank 0: Loading windows_human_internvl_grounding_20250630_3 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 1338 samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/internvl_grounding_20250703_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250630_3 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 1338 
samples from VC:s3://gui-agent/data_20250630/windows_data_20250703/crop_internvl_grounding_20250703_20250722.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_4 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl with random:20% sampling strategy Rank 0: Loaded 535 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_concat.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_5 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_6 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 1070 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/internvl_grounding_20250703_pure_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_7 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl with random:20% sampling strategy Rank 0: Loaded 215 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_concat.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_8 Rank 0: Loading VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_paste.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250630_9 Rank 0: Loading 
VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl with random:20% sampling strategy Rank 0: Loaded 431 samples from VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/internvl_grounding_20250630_pure_paste.jsonl Rank 0: Loading windows_human_internvl_grounding_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/internvl_grounding_20250708_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250707 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 672 samples from VC:s3://gui-agent/data_20250707/windows_data_20250707/crop_internvl_grounding_20250708_20250722.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250707_1 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl with random:20% sampling strategy Rank 0: Loaded 146 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/concat_internvl_grounding_20250708.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250707_2 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy Rank 0: Loaded 538 samples from VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/paste_internvl_grounding_20250708.jsonl Rank 0: Loading windows_aug_internvl_grounding_20250707_3 Rank 0: Loading VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl with random:20% sampling strategy Rank 0: Loaded 538 samples from 
VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/pure_paste_internvl_grounding_20250708.jsonl Rank 0: Loading windows_human_internvl_grounding_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl with random:50% sampling strategy Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/internvl_grounding_20250717_20250722.jsonl Rank 0: Loading windows_crop_human_internvl_grounding_20250714 Rank 0: Loading VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl with random:25% sampling strategy Rank 0: Loaded 754 samples from VC:s3://gui-agent/data_20250714/windows_data_20250714/crop_internvl_grounding_20250717_20250722.jsonl Rank 0: Loading uibert_train_ground_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 4646 samples from VC:s3://gui/new_annotations/gui_data_grounding/uibert_train_ground_d240430_v1.jsonl Rank 0: Loading openapp_taperception_grounding_d240815_v2 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl with all sampling strategy Rank 0: Loaded 2500 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_taperception_grounding_d240815_v2.jsonl Rank 0: Loading openapp_widget_grounding_d240815_v2 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl with all sampling strategy Rank 0: Loaded 14878 samples from VC:s3://gui/new_annotations/gui_data_grounding/openapp_widget_grounding_d240815_v2.jsonl Rank 0: Loading openapp_mug_grounding_d240812 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl with all sampling strategy Rank 0: Loaded 26090 samples from 
VC:s3://gui/new_annotations/gui_data_grounding/openapp_mug_grounding_d240812.jsonl Rank 0: Loading private_ui_phone_2403_ground_d240430_v1 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl with all sampling strategy Rank 0: Loaded 24798 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_phone_2403_ground_d240430_v1.jsonl Rank 0: Loading private_ui_aig_share_2405_ground_d240521_v1 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl with all sampling strategy Rank 0: Loaded 5008 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2405_ground_d240521_v1.jsonl Rank 0: Loading private_ui_aig_share_2406_ground_d240612_v1 Rank 0: Loading VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl with all sampling strategy Rank 0: Loaded 7903 samples from VC:s3://gui/new_annotations/gui_data_grounding/private_ui_aig_share_2406_ground_d240612_v1.jsonl Rank 0: Loading windows_pc_agent_e_planning_cot Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl with repeat:3 sampling strategy Rank 0: Loaded 83346 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data.jsonl Rank 0: Loading windows_pc_agent_e_navigation Rank 0: Loading VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl with all sampling strategy Rank 0: Loaded 27782 samples from VC:s3://gui-agent/data_20250609/pc_agent_e/pc_agent_e_training_data_without_think.jsonl Rank 0: Loading ubuntu_agentnet_planning_cot Rank 0: Loading VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl with random:65% sampling strategy Rank 0: Loaded 53435 samples from VC:s3://gui-agent/agentnet/ubuntu_planning_20250818.jsonl Rank 0: Loading ubuntu_agentnet_navigation Rank 0: Loading 
VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl with random:25% sampling strategy Rank 0: Loaded 20552 samples from VC:s3://gui-agent/agentnet/ubuntu_navigation_20250818.jsonl Rank 0: Loading windows_mac_agentnet_planning_cot Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl with random:30% sampling strategy Rank 0: Loaded 100078 samples from VC:s3://gui-agent/agentnet/win_mac_planning_20250818.jsonl Rank 0: Loading windows_mac_agentnet_navigation Rank 0: Loading VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl with random:15% sampling strategy Rank 0: Loaded 50039 samples from VC:s3://gui-agent/agentnet/win_mac_navigation_20250818.jsonl Rank 0: Loading os_genesis_ac_training_data Rank 0: Skipping os_genesis_ac_training_data due to repeat_time=0 Rank 0: Loading os_genesis_aw_training_data Rank 0: Skipping os_genesis_aw_training_data due to repeat_time=0 Rank 0: Loading os_genesis_web_training Rank 0: Skipping os_genesis_web_training due to repeat_time=0 Rank 0: Loading gui_odyssey_plus_1 Rank 0: Skipping gui_odyssey_plus_1 due to repeat_time=0 Rank 0: Loading gui_odyssey_plus_2 Rank 0: Skipping gui_odyssey_plus_2 due to repeat_time=0 Rank 0: Loading gui_odyssey_plus_custom_3 Rank 0: Skipping gui_odyssey_plus_custom_3 due to repeat_time=0 Rank 0: Loading mm_gui_mid Rank 0: Skipping mm_gui_mid due to repeat_time=0 Rank 0: Loading text_gui_mid Rank 0: Skipping text_gui_mid due to repeat_time=0 Rank 0: Loading gui_mid_trajectory Rank 0: Skipping gui_mid_trajectory due to repeat_time=0 Rank 0: Loading ubuntu_rag Rank 0: Loading VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl with repeat:2 sampling strategy Rank 0: Loaded 7024 samples from VC:s3://gui-agent/cua_text_rag/ubuntu_rag.jsonl Rank 0: Loading windows_rag Rank 0: Loading VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl with repeat:2 sampling strategy Rank 0: Loaded 3144 samples from VC:s3://gui-agent/cua_text_rag/windows_rag.jsonl Rank 0: Loading 
sharegpt4o_review_negative_en_20240825 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 37455 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_sharegpt4o_review_negative_en_20240825.jsonl Rank 0: Loading internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 59981 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_reflection_internvl_sa1b_caption_gpt4o_review_gpt4o_en_20241017.jsonl Rank 0: Loading ai2d_cot_gpt4o_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 14724 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_ai2d_cot_gpt4o_en_20240805.jsonl Rank 0: Loading scienceqa_multi_choice_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 23400 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_science_scienceqa_multi_choice_en_20240402.jsonl Rank 0: Loading fsc147_train_en_20241007 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 4025 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_general_fsc147_train_en_20241007.jsonl Rank 0: Loading docreason_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 31829 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_document_docreason_en_20240403.jsonl Rank 0: Loading mmtab_instruct_pretrain_en_20240902 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 83057 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_chart_mmtab_instruct_pretrain_en_20240902.jsonl Rank 0: Loading textvqa_en_20240611 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 42560 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textvqa_en_20240611.jsonl Rank 0: Loading textcap_gpt4o_en_20240905 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 26596 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textcap_gpt4o_en_20240905.jsonl Rank 0: Loading eaten_passport_zh_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl with random:23% sampling strategy Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_eaten_passport_zh_20240402.jsonl Rank 0: Loading textocr_gpt4v_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 26329 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_gpt4v_en_20240402.jsonl Rank 0: Loading laion_gpt4v_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 13468 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_laion_gpt4v_en_20240402.jsonl Rank 0: Loading llavar_inhouse_sft_chat_en_20240521 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19908 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_chat_en_20240521.jsonl Rank 0: Loading llavar_inhouse_sft_longcap_en_20240521 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19916 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_llavar_inhouse_sft_longcap_en_20240521.jsonl Rank 0: Loading icdar2019_art_task1_3_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 6782 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_icdar2019_art_task1_3_zh_20240805.jsonl Rank 0: Loading chinese_ocr_zh_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 68312 samples 
from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chinese_ocr_zh_20240402.jsonl Rank 0: Loading cocotextv2_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 19938 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_cocotextv2_en_20240805.jsonl Rank 0: Loading mtwi_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11424 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_mtwi_zh_20240805.jsonl Rank 0: Loading textocr_en_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 22488 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_textocr_en_20240805.jsonl Rank 0: Loading arxiv_table_65k_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 79283 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_table_65k_en_20240403.jsonl Rank 0: Loading arxiv_ocr_162k_en_20240403 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl with random:74% sampling strategy Rank 0: Loaded 120223 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_arxiv_ocr_162k_en_20240403.jsonl Rank 0: Loading iam_multi_turn_en_20240621 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 12168 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_iam_multi_turn_en_20240621.jsonl Rank 0: Loading poie_multi_turn_en_20240620 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2768 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_poie_multi_turn_en_20240620.jsonl Rank 0: Loading sroie_multi_turn_en_20240620 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 770 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_sroie_multi_turn_en_20240620.jsonl Rank 0: Loading ocrvqa_en_20241116 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl with random:37% sampling strategy Rank 0: Loaded 76358 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_ocrvqa_en_20241116.jsonl Rank 0: Loading edrawsvg_caption_13k_zh_20240522 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11457 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_edrawsvg_caption_13k_zh_20240522.jsonl Rank 0: Loading wired_table_zh_20240627 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl with random:37% sampling strategy Rank 0: Loaded 36850 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_wired_table_zh_20240627.jsonl Rank 0: Loading hme100k_en_20240620 Rank 0: Skipping hme100k_en_20240620 due to repeat_time=0 Rank 0: Loading synth_calligraphy_poetry_zh_20240805 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl with repeat:1.23 sampling strategy Rank 
0: Loaded 123000 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_synth_calligraphy_poetry_zh_20240805.jsonl Rank 0: Loading chrome_writting_en_20240814 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 10855 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_chrome_writting_en_20240814.jsonl Rank 0: Loading vcr_wiki_en_easy_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 20357 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_easy_20240907.jsonl Rank 0: Loading vcr_wiki_en_hard_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 22540 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_en_hard_20240907.jsonl Rank 0: Loading vcr_wiki_zh_easy_20240907 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl with random:74% sampling strategy Rank 0: Loaded 19569 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_ocr_vcr_wiki_zh_easy_20240907.jsonl Rank 0: Loading gpt4gen_rd_boxcot_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 4620 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_grounding_gpt4gen_rd_boxcot_en_20240402.jsonl Rank 0: Loading math_150_gpt4o_zh_20240626 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 184 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_150_gpt4o_zh_20240626.jsonl Rank 0: Loading math_2k_gpt4o_zh_20240626 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2453 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_math_2k_gpt4o_zh_20240626.jsonl Rank 0: Loading geoqa+_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 88951 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geoqa+_en_20240402.jsonl Rank 0: Loading tqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 24741 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_en_20240402.jsonl Rank 0: Loading tqa_cot_gpt4o_en_20240621 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 21340 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_tqa_cot_gpt4o_en_20240621.jsonl Rank 0: Loading geometry3k_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 12921 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_en_20240402.jsonl Rank 0: Loading geometry3k_cot_gpt4o_en_20240621 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11370 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geometry3k_cot_gpt4o_en_20240621.jsonl Rank 0: Loading unigeo_calc_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 25734 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_unigeo_calc_en_20240402.jsonl Rank 0: Loading super_clevr_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 73800 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_super_clevr_en_20240402.jsonl Rank 0: Loading mavis_math_function_caption_to_question_en_20240821 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 36414 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_mavis_math_function_caption_to_question_en_20240821.jsonl Rank 0: Loading geomverse_en_20240814 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 11437 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_geomverse_en_20240814.jsonl Rank 0: Loading cmm_math_cot_zh_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 16172 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_math_cmm_math_cot_zh_20240924.jsonl Rank 0: Loading qwen_filtered_gpt4v_mathqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 8497 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_gpt4v_mathqa_en_20240402.jsonl Rank 0: Loading qwen_filtered_mathqa_en_20240402 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl with random:55% sampling strategy Rank 0: Loaded 2709 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_mathqa_en_20240402.jsonl Rank 0: Loading screenai_layout_en_20241102 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 27152 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screenai_layout_en_20241102.jsonl Rank 0: Loading qwen_filtered_infinitymath_en_20240924 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 116490 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_infinitymath_en_20240924.jsonl Rank 0: Loading qwen_filtered_sft_code_sensetime_en_zh_20240920 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl with random:66% sampling strategy Rank 0: Loaded 459932 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_code_sensetime_en_zh_20240920.jsonl Rank 0: Loading qwen_filtered_know_saraswati_cot_en_20240520 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 148371 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_know_saraswati_cot_en_20240520.jsonl Rank 0: Loading qwen_filtered_leetcode_en_zh_20240520 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 1642 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_leetcode_en_zh_20240520.jsonl Rank 0: Loading data_gpt_generalquestion_correction_cn_43k_v2_20240813 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 52892 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_data_gpt_generalquestion_correction_cn_43k_v2_20240813.jsonl Rank 0: Loading SynthCode_leetcode_vqa_4k_v1 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 5517 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_leetcode_vqa_4k_v1.jsonl Rank 0: Loading SynthCode_llmapi_vqa_187_v1 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 230 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_SynthCode_llmapi_vqa_187_v1.jsonl 
Rank 0: Loading captcha_feedback_619_v1 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 761 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_zjg_download_temp_data_internvl3_data_st2pj_20250222_annotations_captcha_feedback_619_v1.jsonl Rank 0: Loading open_r1_math_en_20250212 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 427940 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_r1_math_en_20250212.jsonl Rank 0: Loading open_thoughts_114k_en_20250212 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl with random:62% sampling strategy Rank 0: Loaded 69746 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_opensource_open_thoughts_114k_en_20250212.jsonl Rank 0: Loading lmsys_single_turn Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl with random:62% sampling strategy Rank 0: Loaded 207732 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241226_std_reasoning_deepseek_inhouse_lmsys_single_turn.jsonl Rank 0: Loading SCP_116K_filter Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 61205 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_SCP_116K_filter-conv-anno.jsonl Rank 0: Loading Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl with random:62% sampling strategy Rank 0: Loaded 154952 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_inspurfs_share_data_wangweiyun_share_data_nlp_long_cot_sft_Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-conv-anno.jsonl Rank 0: Loading longcite_en_zh_20240912 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl with random:83% sampling strategy Rank 0: Loaded 35439 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_longcite_en_zh_20240912.jsonl Rank 0: Loading long_instruct_with_paraphrasing_en_zh_20240912 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 9417 samples from 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data_long_instruct_with_paraphrasing_en_zh_20240912.jsonl Rank 0: Loading qwen_filtered_tomb_evolved_en_20240913 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 21483 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_tomb_evolved_en_20240913.jsonl Rank 0: Loading qwen_filtered_xcoder_80k_en_20240913 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 85474 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_xcoder_80k_en_20240913.jsonl Rank 0: Loading qwen_filtered_sft_general_zhuguan_zh_20241002 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl with random:88% sampling strategy Rank 0: Loaded 124772 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_nlp_data2_qwen_filtered_sft_general_zhuguan_zh_20241002.jsonl Rank 0: Loading merged_mmmu_knowledge_point_gpt4o_en_20241118 Rank 0: Loading 
VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl with repeat:1.1 sampling strategy Rank 0: Loaded 47383 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_k12_merged_mmmu_knowledge_point_gpt4o_en_20241118.jsonl Rank 0: Loading android_ui_longcap_qwen_zh_20240409 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl with repeat:2.46 sampling strategy Rank 0: Loaded 13528 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_android_ui_longcap_qwen_zh_20240409.jsonl Rank 0: Loading screen2words_longcap_gpt4o_en_20240819 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 18106 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_screen2words_longcap_gpt4o_en_20240819.jsonl Rank 0: Loading drawing_to_html_en_20240628 Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl with repeat:1.23 sampling strategy Rank 0: Loaded 2090 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_drawing_to_html_en_20240628.jsonl Rank 0: Loading 
airplane_app_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1368 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_airplane_app_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading taobao_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1925 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_taobao_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading wechat_longcap_gpt4o_zh_20240627
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 1344 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_wechat_longcap_gpt4o_zh_20240627.jsonl
Rank 0: Loading websight_en_20240814
Rank 0: Loading VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl with repeat:1.23 sampling strategy
Rank 0: Loaded 5349 samples from VC:s3://local2xsky/annotation/annotation_sft/_mnt_petrelfs_wangweiyun_workspace_cz_InternVL_internvl_chat_dev_metas2_stage3_v5_20241014_std_gui_websight_en_20240814.jsonl
Rank 0: Total training samples: 11313173
Rank 0: Formatting inputs...Skip in lazy mode
Rank 0: Resize images between 3136 to 2109744
Rank 0: Length of multimodal samples: 9328128, pure textual samples: 1984512
Parameter Offload: Total persistent parameters: 848896 in 368 params
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408531 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10725, 'image': 'vrdu_table_final_2/astro-ph.CO/c257a434-1cc5-4bd4-8243-815d868b8dcc.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (41070 > 40960). Running this sequence through the model will result in indexing errors
  0%|          | 0/22095 [00:00<?, ?it/s]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
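The ValueError above is raised because one training image (17 px wide) falls below the loader's 28-pixel minimum side length. A way to avoid paying this cost at fetch time is to pre-filter the annotation records by their recorded `image_wh` before training. The sketch below is illustrative only: `MIN_IMAGE_SIZE`, `is_image_large_enough`, and `filter_samples` are hypothetical names, and it assumes each sample dict carries an `image_wh` list of `[width, height]` pairs as in the log's problematic sample.

```python
MIN_IMAGE_SIZE = 28  # minimum side length enforced by the loader in the log above


def is_image_large_enough(image_wh, min_size=MIN_IMAGE_SIZE):
    """True if every recorded (width, height) pair meets the minimum side length."""
    return all(w >= min_size and h >= min_size for w, h in image_wh)


def filter_samples(samples, min_size=MIN_IMAGE_SIZE):
    """Drop samples whose recorded image sizes fall below the minimum."""
    return [s for s in samples if is_image_large_enough(s.get("image_wh", []), min_size)]


# The failing sample from the log: one image of width 17, height 28.
bad = {"id": 10725, "image_wh": [[17, 28]]}
ok = {"id": 1, "image_wh": [[100, 100]]}
kept = filter_samples([bad, ok])  # only the 100x100 sample survives
```

Running the filter once over the JSONL metadata keeps undersized images from ever reaching `__getitem__`, so the retry machinery (`[Try #0] Failed to fetch sample ...`) is not triggered mid-epoch.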
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
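The warning repeated above comes from `torch.utils.checkpoint` when a checkpointed segment receives only inputs with `requires_grad=False` (typical when the embedding layer is frozen, e.g. in LoRA or partial fine-tuning). A minimal sketch, assuming plain PyTorch, that reproduces the warning and then applies the usual remedy of making the segment input require grad; in Hugging Face Transformers the analogous switch is `model.enable_input_require_grads()`:

```python
# Minimal sketch (assumes torch is installed): reproduce and then silence the
# "None of the inputs have requires_grad=True" warning from torch.utils.checkpoint.
import torch
from torch.utils import checkpoint

layer = torch.nn.Linear(4, 4)

x = torch.randn(2, 4)  # requires_grad=False -> triggers the UserWarning above
_ = checkpoint.checkpoint(layer, x, use_reentrant=True)

# Remedy: make the checkpointed segment's *input* require grad so the
# recomputed subgraph stays connected to the trainable parameters.
x.requires_grad_(True)
y = checkpoint.checkpoint(layer, x, use_reentrant=True)
y.sum().backward()
assert x.grad is not None  # gradients now flow through the checkpointed segment
```

The names here (`layer`, `x`) are illustrative, not taken from the training script; the warning itself is harmless when the frozen inputs genuinely need no gradients.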
  0%| | 1/22095 [00:27<169:13:16, 27.57s/it] {'loss': 0.8106, 'grad_norm': 8.66013199834246, 'learning_rate': 0.0, 'epoch': 0.0}
  0%| | 2/22095 [00:32<88:57:25, 14.50s/it] {'loss': 0.8228, 'grad_norm': 8.116963999167655, 'learning_rate': 1.5082956259426848e-08, 'epoch': 0.0}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  0%| | 3/22095 [00:37<61:46:17, 10.07s/it] {'loss': 0.8379, 'grad_norm': 8.805888630344679, 'learning_rate': 3.0165912518853697e-08, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (51984 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (159015 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70435 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51923 > 40960). Running this sequence through the model will result in indexing errors
  0%| | 4/22095 [00:42<48:14:08, 7.86s/it] {'loss': 0.8215, 'grad_norm': 10.413908076660961, 'learning_rate': 4.524886877828055e-08, 'epoch': 0.0}
  0%| | 5/22095 [00:45<39:04:28, 6.37s/it] {'loss': 0.7408, 'grad_norm': 10.270466065239447, 'learning_rate': 6.033182503770739e-08, 'epoch': 0.0}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  0%| | 6/22095 [00:49<34:15:45, 5.58s/it] {'loss': 0.871, 'grad_norm': 8.667172848979332, 'learning_rate': 7.541478129713425e-08, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (52504 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44250 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75463 > 40960). Running this sequence through the model will result in indexing errors
  0%| | 7/22095 [00:53<29:57:01, 4.88s/it] {'loss': 0.826, 'grad_norm': 7.875222542229044, 'learning_rate': 9.04977375565611e-08, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (58920 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91275 > 40960). Running this sequence through the model will result in indexing errors
  0%| | 8/22095 [00:57<29:22:56, 4.79s/it] {'loss': 0.7683, 'grad_norm': 9.852921507702614, 'learning_rate': 1.0558069381598795e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  0%| | 9/22095 [01:08<39:45:04, 6.48s/it] {'loss': 0.7524, 'grad_norm': 2.931646823848608, 'learning_rate': 1.2066365007541479e-07, 'epoch': 0.0}
  0%| | 10/22095 [01:12<35:51:03, 5.84s/it] {'loss': 0.8115, 'grad_norm': 7.699231177320019, 'learning_rate': 1.3574660633484163e-07, 'epoch': 0.0}
  0%| | 11/22095 [01:16<31:53:29, 5.20s/it] {'loss': 0.8242, 'grad_norm': 8.48382482276139, 'learning_rate': 1.508295625942685e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (64913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48667 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52924 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45624 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90694 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86076 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78960 > 40960). Running this sequence through the model will result in indexing errors 0%| | 12/22095 [01:23<36:01:26, 5.87s/it] {'loss': 0.7765, 'grad_norm': 3.0335535418133057, 'learning_rate': 1.6591251885369535e-07, 'epoch': 0.0} 0%| | 12/22095 [01:23<36:01:26, 5.87s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 0%| | 13/22095 [01:28<33:14:14, 5.42s/it] {'loss': 0.8172, 'grad_norm': 7.977093909893501, 'learning_rate': 1.809954751131222e-07, 'epoch': 0.0} 0%| | 13/22095 [01:28<33:14:14, 5.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41695 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42331 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50980 > 40960). Running this sequence through the model will result in indexing errors 0%| | 14/22095 [01:31<28:33:41, 4.66s/it] {'loss': 0.8067, 'grad_norm': 8.497572549254802, 'learning_rate': 1.9607843137254904e-07, 'epoch': 0.0} 0%| | 14/22095 [01:31<28:33:41, 4.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 0%| | 15/22095 [01:37<31:43:46, 5.17s/it] {'loss': 0.7585, 'grad_norm': 2.8940758258319446, 'learning_rate': 2.111613876319759e-07, 'epoch': 0.0} 0%| | 15/22095 [01:37<31:43:46, 5.17s/it] 0%| | 16/22095 [01:48<41:45:36, 6.81s/it] {'loss': 0.771, 'grad_norm': 2.855482184271788, 'learning_rate': 2.2624434389140273e-07, 'epoch': 0.0} 0%| | 16/22095 [01:48<41:45:36, 6.81s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 0%| | 17/22095 [01:53<38:30:30, 6.28s/it] {'loss': 0.8223, 'grad_norm': 8.986529342997084, 'learning_rate': 2.4132730015082957e-07, 'epoch': 0.0} 0%| | 17/22095 [01:53<38:30:30, 6.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66196 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58288 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89764 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43010 > 40960). Running this sequence through the model will result in indexing errors 0%| | 18/22095 [01:56<33:04:47, 5.39s/it] {'loss': 0.7447, 'grad_norm': 6.395648167097882, 'learning_rate': 2.564102564102564e-07, 'epoch': 0.0} 0%| | 18/22095 [01:56<33:04:47, 5.39s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 0%| | 19/22095 [02:00<30:24:05, 4.96s/it] {'loss': 0.829, 'grad_norm': 4.9798889312227725, 'learning_rate': 2.7149321266968326e-07, 'epoch': 0.0} 0%| | 19/22095 [02:00<30:24:05, 4.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 0%| | 20/22095 [02:10<40:34:04, 6.62s/it] {'loss': 0.7568, 'grad_norm': 2.6740823934447224, 'learning_rate': 2.865761689291101e-07, 'epoch': 0.0} 0%| | 20/22095 [02:10<40:34:04, 6.62s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 0%| | 21/22095 [02:16<38:39:41, 6.31s/it] {'loss': 0.8026, 'grad_norm': 4.052800002632703, 'learning_rate': 3.01659125188537e-07, 'epoch': 0.0} 0%| | 21/22095 [02:16<38:39:41, 6.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 [2025-08-27 16:00:25,782] [WARNING] [stage3.py:2118:step] 1 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. 
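The repeated `checkpoint.py:87` UserWarning above fires when activation checkpointing runs on a segment where no input tensor requires grad (typical when the vision tower or embeddings are frozen). A minimal sketch of the usual fix, forcing grads on the checkpointed input, which is what HF's `enable_input_require_grads()` does for the embedding output; the toy `nn.Linear` here is illustrative, not this codebase:

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

layer = nn.Linear(8, 8)
x = torch.randn(2, 8)

# If x does not require grad, checkpointing has no tensor to attach the
# graph to -> "None of the inputs have requires_grad=True" and x.grad stays None.
x.requires_grad_(True)  # force input grads before checkpointing
out = checkpoint(layer, x, use_reentrant=False)
out.sum().backward()
assert x.grad is not None  # gradient now flows through the checkpointed segment
```

Passing `use_reentrant=False` also sidesteps the reentrant-checkpoint limitations that produce this warning in newer torch versions.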
0%| | 22/22095 [02:27<47:31:26, 7.75s/it] {'loss': 0.7492, 'grad_norm': 2.9369933535648247, 'learning_rate': 3.167420814479638e-07, 'epoch': 0.0}
0%| | 23/22095 [02:31<41:28:59, 6.77s/it] {'loss': 0.7575, 'grad_norm': 4.247413851142735, 'learning_rate': 3.318250377073907e-07, 'epoch': 0.0}
0%| | 24/22095 [02:35<35:51:11, 5.85s/it] {'loss': 0.759, 'grad_norm': 3.879005288265461, 'learning_rate': 3.4690799396681754e-07, 'epoch': 0.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8345324 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11978, 'image': 'vrdu_table_final_2/astro-ph.CO/d68e3990-c36f-4d1c-a303-87443fa558a0.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]}
0%| | 25/22095 [02:39<32:50:04, 5.36s/it] {'loss': 0.7482, 'grad_norm': 3.6440819833985922, 'learning_rate': 3.619909502262444e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 26/22095 [02:48<39:39:12, 6.47s/it] {'loss': 0.7626, 'grad_norm': 2.757922595973135, 'learning_rate': 3.770739064856712e-07, 'epoch': 0.0}
0%| | 27/22095 [02:53<36:21:31, 5.93s/it] {'loss': 0.7556, 'grad_norm': 3.832157008102444, 'learning_rate': 3.921568627450981e-07, 'epoch': 0.0}
0%| | 28/22095 [02:57<32:55:21, 5.37s/it] {'loss': 0.7646, 'grad_norm': 3.372216966665436, 'learning_rate': 4.072398190045249e-07, 'epoch': 0.0}
0%| | 29/22095 [03:01<29:50:24, 4.87s/it] {'loss': 0.7665, 'grad_norm': 3.13813749845113, 'learning_rate': 4.223227752639518e-07, 'epoch': 0.0}
0%| | 30/22095 [03:05<28:54:54, 4.72s/it] {'loss': 0.748, 'grad_norm': 2.8809864229440163, 'learning_rate': 4.374057315233786e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (56010 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46359 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77961 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45216 > 40960). Running this sequence through the model will result in indexing errors
0%| | 31/22095 [03:09<27:01:34, 4.41s/it] {'loss': 0.7738, 'grad_norm': 2.8897137086592086, 'learning_rate': 4.5248868778280546e-07, 'epoch': 0.0}
0%| | 32/22095 [03:12<24:52:00, 4.06s/it] {'loss': 0.738, 'grad_norm': 2.942818955850594, 'learning_rate': 4.675716440422323e-07, 'epoch': 0.0}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
0%| | 33/22095 [03:16<24:32:58, 4.01s/it] {'loss': 0.7678, 'grad_norm': 2.6309860751703624, 'learning_rate': 4.826546003016591e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (60592 > 40960). Running this sequence through the model will result in indexing errors
0%| | 34/22095 [03:19<22:20:52, 3.65s/it] {'loss': 0.7834, 'grad_norm': 2.8062178485999776, 'learning_rate': 4.977375565610859e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (80566 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106419 > 40960). Running this sequence through the model will result in indexing errors
0%| | 35/22095 [03:22<21:52:24, 3.57s/it] {'loss': 0.7633, 'grad_norm': 2.8268426312674646, 'learning_rate': 5.128205128205128e-07, 'epoch': 0.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8403473 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5646, 'image': 'vrdu_table_final_2/astro-ph.CO/35965147-786b-4d28-8d25-abe2b820e323.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
0%| | 36/22095 [03:26<21:17:57, 3.48s/it] {'loss': 0.7394, 'grad_norm': 2.4556470307247205, 'learning_rate': 5.279034690799397e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (116358 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49883 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69391 > 40960). Running this sequence through the model will result in indexing errors
0%| | 37/22095 [03:29<20:44:12, 3.38s/it] {'loss': 0.7233, 'grad_norm': 2.472058064042505, 'learning_rate': 5.429864253393665e-07, 'epoch': 0.0}
0%| | 38/22095 [03:32<19:54:25, 3.25s/it] {'loss': 0.7058, 'grad_norm': 2.1758732925169255, 'learning_rate': 5.580693815987934e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (74853 > 40960). Running this sequence through the model will result in indexing errors
0%| | 39/22095 [03:42<32:35:00, 5.32s/it] {'loss': 0.7554, 'grad_norm': 2.5628467709923193, 'learning_rate': 5.731523378582202e-07, 'epoch': 0.0}
0%| | 40/22095 [03:46<30:01:20, 4.90s/it] {'loss': 0.7271, 'grad_norm': 2.3023149629245503, 'learning_rate': 5.882352941176471e-07, 'epoch': 0.0}
0%| | 41/22095 [03:50<28:07:23, 4.59s/it] {'loss': 0.6804, 'grad_norm': 2.0801763787360965, 'learning_rate': 6.03318250377074e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 42/22095 [03:58<34:20:26, 5.61s/it] {'loss': 0.7269, 'grad_norm': 2.2400032166503094, 'learning_rate': 6.184012066365008e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (42175 > 40960). Running this sequence through the model will result in indexing errors
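The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` tracebacks above come from a size guard in `_get_item`. A minimal sketch of that guard as a standalone pre-filter (the function name `check_image_size` is hypothetical; the constant and message mirror the log, so undersized samples can be screened out of the manifest before training):

```python
MIN_IMAGE_SIZE = 28  # smallest side the vision processor accepts, per the log

def check_image_size(width, height, min_size=MIN_IMAGE_SIZE):
    """Raise if an image's smaller side is below the processor minimum."""
    if min(width, height) < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_size}."
        )
    return True

assert check_image_size(100, 100)
try:
    check_image_size(6, 14)  # the first problematic sample above, image_wh [[6, 14]]
except ValueError as err:
    assert "Minimum size is 28" in str(err)
```

Running such a check over each sample's `image_wh` field offline would remove these retries instead of paying for them mid-epoch.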
0%| | 43/22095 [04:06<40:12:58, 6.57s/it] {'loss': 0.724, 'grad_norm': 2.074448591273433, 'learning_rate': 6.334841628959276e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 364, but got module 1
0%| | 44/22095 [04:11<36:26:53, 5.95s/it] {'loss': 0.7597, 'grad_norm': 2.052458166169399, 'learning_rate': 6.485671191553546e-07, 'epoch': 0.0}
0%| | 45/22095 [04:19<40:35:42, 6.63s/it] {'loss': 0.775, 'grad_norm': 2.3000279668851826, 'learning_rate': 6.636500754147814e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 364, but got module 1
0%| | 46/22095 [04:24<37:16:34, 6.09s/it] {'loss': 0.6826, 'grad_norm': 2.040178744317726, 'learning_rate': 6.787330316742082e-07, 'epoch': 0.0}
0%| | 47/22095 [04:34<44:12:58, 7.22s/it] {'loss': 0.7196, 'grad_norm': 2.169650593616676, 'learning_rate': 6.938159879336351e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 364, but got module 1
0%| | 48/22095 [04:38<38:42:47, 6.32s/it] {'loss': 0.6534, 'grad_norm': 2.243237967368775, 'learning_rate': 7.088989441930619e-07, 'epoch': 0.0}
0%| | 49/22095 [04:42<34:07:43, 5.57s/it] {'loss': 0.7239, 'grad_norm': 1.805594330039211, 'learning_rate': 7.239819004524888e-07, 'epoch': 0.0}
0%| | 50/22095 [04:45<29:18:37, 4.79s/it] {'loss': 0.7892, 'grad_norm': 1.95060511614219, 'learning_rate': 7.390648567119156e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 51/22095 [04:53<35:03:57, 5.73s/it] {'loss': 0.7541, 'grad_norm': 2.0050821311562967, 'learning_rate': 7.541478129713424e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (101995 > 40960). Running this sequence through the model will result in indexing errors
0%| | 52/22095 [04:57<31:54:59, 5.21s/it] {'loss': 0.653, 'grad_norm': 1.591345649647068, 'learning_rate': 7.692307692307694e-07, 'epoch': 0.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930855 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54008, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 4cm\nB. 6cm\nC. 2cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
0%| | 53/22095 [05:00<29:09:13, 4.76s/it] {'loss': 0.7579, 'grad_norm': 1.5359586585503124, 'learning_rate': 7.843137254901962e-07, 'epoch': 0.0}
0%| | 54/22095 [05:04<27:38:23, 4.51s/it] {'loss': 0.6981, 'grad_norm': 1.4903440630350036, 'learning_rate': 7.993966817496229e-07, 'epoch': 0.0}
0%| | 55/22095 [05:08<26:23:48, 4.31s/it] {'loss': 0.7096, 'grad_norm': 1.576261684717775, 'learning_rate': 8.144796380090498e-07, 'epoch': 0.0}
0%| | 56/22095 [05:13<26:35:08, 4.34s/it] {'loss': 0.7056, 'grad_norm': 1.4623041693607248, 'learning_rate': 8.295625942684766e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (41231 > 40960). Running this sequence through the model will result in indexing errors
0%| | 57/22095 [05:16<24:24:43, 3.99s/it] {'loss': 0.7375, 'grad_norm': 1.3299793426063247, 'learning_rate': 8.446455505279036e-07, 'epoch': 0.0}
0%| | 58/22095 [05:19<22:43:53, 3.71s/it] {'loss': 0.6838, 'grad_norm': 1.3229170043287524, 'learning_rate': 8.597285067873304e-07, 'epoch': 0.0}
0%| | 59/22095 [05:23<23:17:19, 3.80s/it] {'loss': 0.7115, 'grad_norm': 1.3144587271599693, 'learning_rate': 8.748114630467572e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 60/22095 [05:30<29:36:35, 4.84s/it] {'loss': 0.7371, 'grad_norm': 1.6142307994153728, 'learning_rate': 8.898944193061841e-07, 'epoch': 0.0}
0%| | 61/22095 [05:35<28:50:59, 4.71s/it] {'loss': 0.6543, 'grad_norm': 1.244580858498938, 'learning_rate': 9.049773755656109e-07, 'epoch': 0.0}
0%| | 62/22095 [05:39<27:36:13, 4.51s/it] {'loss': 0.6459, 'grad_norm': 1.2418783287357869, 'learning_rate': 9.200603318250378e-07, 'epoch': 0.0}
0%| | 63/22095 [05:43<27:12:27, 4.45s/it] {'loss': 0.7234, 'grad_norm': 1.3734658575051808, 'learning_rate': 9.351432880844646e-07, 'epoch': 0.0}
0%| | 64/22095 [05:47<27:05:41, 4.43s/it] {'loss': 0.7172, 'grad_norm': 1.5154654997771089, 'learning_rate': 9.502262443438914e-07, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 65/22095 [05:56<35:53:06, 5.86s/it] {'loss': 0.7091, 'grad_norm': 1.5993349941418993, 'learning_rate': 9.653092006033183e-07, 'epoch': 0.0}
0%| | 66/22095 [06:01<33:53:53, 5.54s/it] {'loss': 0.6245, 'grad_norm': 1.2952398954480047, 'learning_rate': 9.80392156862745e-07, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (85995 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60692 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62846 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (142179 > 40960). Running this sequence through the model will result in indexing errors
0%| | 67/22095 [06:04<29:33:43, 4.83s/it] {'loss': 0.6454, 'grad_norm': 1.3771464561355673, 'learning_rate': 9.954751131221719e-07, 'epoch': 0.0}
0%| | 68/22095 [06:10<31:01:43, 5.07s/it] {'loss': 0.6814, 'grad_norm': 1.412649273270314, 'learning_rate': 1.0105580693815989e-06, 'epoch': 0.0}
0%| | 69/22095 [06:15<30:09:20, 4.93s/it] {'loss': 0.682, 'grad_norm': 1.3193487969255955, 'learning_rate': 1.0256410256410257e-06, 'epoch': 0.0}
0%| | 70/22095 [06:18<27:01:48, 4.42s/it] {'loss': 0.7166, 'grad_norm': 1.273629024869017, 'learning_rate': 1.0407239819004527e-06, 'epoch': 0.0}
0%| | 71/22095 [06:21<24:52:25, 4.07s/it] {'loss': 0.7045, 'grad_norm': 1.1381038066518965, 'learning_rate': 1.0558069381598795e-06, 'epoch': 0.0}
0%| | 72/22095 [06:24<22:41:52, 3.71s/it] {'loss': 0.6771, 'grad_norm': 1.2047824209988878, 'learning_rate': 1.0708898944193063e-06, 'epoch': 0.0}
0%| | 73/22095 [06:27<21:24:43, 3.50s/it] {'loss': 0.6741, 'grad_norm': 1.2391818859284285, 'learning_rate': 1.085972850678733e-06, 'epoch': 0.0}
0%| | 74/22095 [06:31<21:49:04, 3.57s/it] {'loss': 0.6855, 'grad_norm': 1.2521185180506431, 'learning_rate': 1.1010558069381598e-06, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350705 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17379, 'image': 'vrdu_table_final_2/astro-ph.CO/5945513e-46c0-4b91-bf41-4b423c80fe37.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
0%| | 75/22095 [06:38<27:46:52, 4.54s/it] {'loss': 0.7001, 'grad_norm': 1.3882698005588492, 'learning_rate': 1.1161387631975868e-06, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (76074 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71992 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43017 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46194 > 40960) for 4 sample(s). Truncating to 5234 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (115023 > 40960). Running this sequence through the model will result in indexing errors
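The many `Token indices sequence length is longer than ...` warnings above mean some packed samples exceed the model's 40960-token limit; the `Rank 0: ... Truncating to 5234 with 3 samples.` line shows the loader recovering by dropping samples until the pack fits. A minimal sketch of that recovery under the same assumptions (the helper name `fit_to_max_len` and the list-of-token-id-lists input are hypothetical, not this codebase):

```python
MAX_LEN = 40960  # model maximum sequence length, from the warnings above

def fit_to_max_len(samples, max_len=MAX_LEN):
    """Greedily keep leading samples of a packed batch until the total
    token count fits under max_len, dropping the rest (as the log's
    'Truncating to N with M samples' message describes)."""
    kept, total = [], 0
    for ids in samples:
        if total + len(ids) > max_len:
            break
        kept.append(ids)
        total += len(ids)
    return kept, total

# e.g. four samples totalling 46194 tokens > 40960: only the first three fit
kept, total = fit_to_max_len([[0] * 20000, [0] * 15000, [0] * 5000, [0] * 6194])
assert len(kept) == 3 and total == 40000
```

Filtering such over-length samples out of the manifest ahead of time would silence the tokenizer warnings entirely, since they are emitted before any truncation happens.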
0%| | 76/22095 [06:42<26:51:28, 4.39s/it] {'loss': 0.6747, 'grad_norm': 1.179314695521784, 'learning_rate': 1.1312217194570136e-06, 'epoch': 0.0}
0%| | 77/22095 [06:45<24:44:49, 4.05s/it] {'loss': 0.6286, 'grad_norm': 1.5552928150597494, 'learning_rate': 1.1463046757164404e-06, 'epoch': 0.0}
0%| | 78/22095 [06:48<23:02:38, 3.77s/it] {'loss': 0.6742, 'grad_norm': 1.1734748345240251, 'learning_rate': 1.1613876319758674e-06, 'epoch': 0.0}
0%| | 79/22095 [06:51<22:36:37, 3.70s/it] {'loss': 0.6426, 'grad_norm': 1.131287052435935, 'learning_rate': 1.1764705882352942e-06, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (42273 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114792 > 40960). Running this sequence through the model will result in indexing errors
0%| | 80/22095 [06:55<22:30:59, 3.68s/it] {'loss': 0.6618, 'grad_norm': 1.3295414806754717, 'learning_rate': 1.1915535444947212e-06, 'epoch': 0.0}
0%| | 81/22095 [06:59<22:48:58, 3.73s/it] {'loss': 0.6729, 'grad_norm': 1.2939253815058376, 'learning_rate': 1.206636500754148e-06, 'epoch': 0.0}
0%| | 82/22095 [07:02<21:36:14, 3.53s/it] {'loss': 0.5791, 'grad_norm': 1.059572927368594, 'learning_rate': 1.2217194570135748e-06, 'epoch': 0.0}
0%| | 83/22095 [07:05<20:50:25, 3.41s/it] {'loss': 0.6478, 'grad_norm': 1.1326662378816292, 'learning_rate': 1.2368024132730016e-06, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 84/22095 [07:17<36:31:02, 5.97s/it] {'loss': 0.7105, 'grad_norm': 1.4420484198243764, 'learning_rate': 1.2518853695324284e-06, 'epoch': 0.0}
0%| | 85/22095 [07:21<32:52:11, 5.38s/it] {'loss': 0.6299, 'grad_norm': 1.054691744896305, 'learning_rate': 1.2669683257918552e-06, 'epoch': 0.0}
0%| | 86/22095 [07:25<30:15:22, 4.95s/it] {'loss': 0.6486, 'grad_norm': 1.1882611476012201, 'learning_rate': 1.282051282051282e-06, 'epoch': 0.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8582837 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5826, 'image': '962600350.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Barbara Brooks Kimmel'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Citizenship Made Simple: An Easy to Read Guide to the U.S. Citizenship Process'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Test Preparation'}, {'from': 'human', 'value': 'Is this book related to Test Preparation? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Yes'}, {'from': 'human', 'value': 'Is this book related to Arts & Photography? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
0%| | 87/22095 [07:28<27:27:39, 4.49s/it] {'loss': 0.6162, 'grad_norm': 1.059033257046168, 'learning_rate': 1.2971342383107092e-06, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
0%| | 88/22095 [07:38<37:09:20, 6.08s/it] {'loss': 0.7638, 'grad_norm': 1.4556419443337114, 'learning_rate': 1.312217194570136e-06, 'epoch': 0.0}
0%| | 89/22095 [07:43<34:57:30, 5.72s/it] {'loss': 0.6258, 'grad_norm': 1.1239534063715333, 'learning_rate': 1.3273001508295628e-06, 'epoch': 0.0}
0%| | 90/22095 [07:47<32:07:46, 5.26s/it] {'loss': 0.6705, 'grad_norm': 1.106624334983117, 'learning_rate': 1.3423831070889896e-06, 'epoch': 0.0}
0%| | 91/22095 [07:51<29:29:36, 4.83s/it] {'loss': 0.6365, 'grad_norm': 1.0594735066509668, 'learning_rate': 1.3574660633484164e-06, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (50631 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46110 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75562 > 40960). Running this sequence through the model will result in indexing errors
0%| | 92/22095 [07:55<27:14:15, 4.46s/it] {'loss': 0.6674, 'grad_norm': 1.1407310916855053, 'learning_rate': 1.3725490196078434e-06, 'epoch': 0.0}
0%| | 93/22095 [07:58<25:05:31, 4.11s/it] {'loss': 0.6586, 'grad_norm': 1.0868736007753048, 'learning_rate': 1.3876319758672702e-06, 'epoch': 0.0}
0%| | 94/22095 [08:01<22:53:03, 3.74s/it] {'loss': 0.6344, 'grad_norm': 1.0787095460749092, 'learning_rate': 1.402714932126697e-06, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (74248 > 40960). Running this sequence through the model will result in indexing errors
0%| | 95/22095 [08:05<23:33:57, 3.86s/it] {'loss': 0.6751, 'grad_norm': 1.1365671650774605, 'learning_rate': 1.4177978883861237e-06, 'epoch': 0.0}
Token indices sequence length is longer than the specified maximum sequence length for this model (59906 > 40960). Running this sequence through the model will result in indexing errors
0%| | 96/22095 [08:09<23:04:04, 3.77s/it] {'loss': 0.657, 'grad_norm': 1.0487165231214994, 'learning_rate': 1.4328808446455505e-06, 'epoch': 0.0}
0%| | 97/22095 [08:12<22:53:28, 3.75s/it] {'loss': 0.6337, 'grad_norm': 0.9787033497294746, 'learning_rate': 1.4479638009049775e-06, 'epoch': 0.0}
0%| | 98/22095 [08:16<22:59:33, 3.76s/it] {'loss': 0.6081, 'grad_norm': 1.0254485226838135, 'learning_rate': 1.4630467571644043e-06, 'epoch': 0.0}
0%| | 99/22095 [08:19<21:18:33, 3.49s/it] {'loss': 0.5998, 'grad_norm': 1.3872002279212687, 'learning_rate': 1.4781297134238311e-06, 'epoch': 0.0}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [395, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8499079 in VC:s3://internvl-moe-sft-data/. Exception: Image size [395, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 68302, 'image': 'vrdu_texteq/astro-ph.CO/d8e074e6-d382-42d0-80b8-f329ddf43fda.png', 'image_wh': [[395, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $\\delta^\\mathrm{K}$ is the Kronecker delta.'}]} 0%| | 100/22095 [08:28<31:26:32, 5.15s/it] {'loss': 0.6809, 'grad_norm': 1.2984889703154123, 'learning_rate': 1.493212669683258e-06, 'epoch': 0.0} 0%| | 100/22095 [08:28<31:26:32, 5.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42528 > 40960). Running this sequence through the model will result in indexing errors 0%| | 101/22095 [08:31<28:08:30, 4.61s/it] {'loss': 0.6208, 'grad_norm': 1.0452806515562352, 'learning_rate': 1.5082956259426847e-06, 'epoch': 0.0} 0%| | 101/22095 [08:31<28:08:30, 4.61s/it] 0%| | 102/22095 [08:34<24:44:43, 4.05s/it] {'loss': 0.6415, 'grad_norm': 1.0097601200328257, 'learning_rate': 1.5233785822021115e-06, 'epoch': 0.0} 0%| | 102/22095 [08:34<24:44:43, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43024 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47611 > 40960). 
Running this sequence through the model will result in indexing errors 0%| | 103/22095 [08:40<28:31:33, 4.67s/it] {'loss': 0.7088, 'grad_norm': 1.269723703523461, 'learning_rate': 1.5384615384615387e-06, 'epoch': 0.0} 0%| | 103/22095 [08:40<28:31:33, 4.67s/it] 0%| | 104/22095 [08:50<37:17:09, 6.10s/it] {'loss': 0.7053, 'grad_norm': 1.2470497789460646, 'learning_rate': 1.5535444947209655e-06, 'epoch': 0.0} 0%| | 104/22095 [08:50<37:17:09, 6.10s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 0%| | 105/22095 [08:53<31:56:38, 5.23s/it] {'loss': 0.6018, 'grad_norm': 0.9368939620497067, 'learning_rate': 1.5686274509803923e-06, 'epoch': 0.0} 0%| | 105/22095 [08:53<31:56:38, 5.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59612 > 40960). Running this sequence through the model will result in indexing errors 0%| | 106/22095 [08:56<29:01:59, 4.75s/it] {'loss': 0.6662, 'grad_norm': 1.1604437387684339, 'learning_rate': 1.583710407239819e-06, 'epoch': 0.0} 0%| | 106/22095 [08:56<29:01:59, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (143960 > 40960). Running this sequence through the model will result in indexing errors 0%| | 107/22095 [08:59<25:52:29, 4.24s/it] {'loss': 0.6463, 'grad_norm': 1.0358552080502414, 'learning_rate': 1.5987933634992459e-06, 'epoch': 0.0} 0%| | 107/22095 [09:00<25:52:29, 4.24s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [598, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8470962 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [598, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 2775, 'image': 'vrdu_texteq/astro-ph.CO/77045f59-29ef-4111-8228-1c26aa5002ad.png', 'image_wh': [[598, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $\\mathcal{R}^{-1}$ is the inverted correlation matrix and'}]} 0%| | 108/22095 [09:04<26:16:01, 4.30s/it] {'loss': 0.6304, 'grad_norm': 1.0368493258936142, 'learning_rate': 1.6138763197586729e-06, 'epoch': 0.0} 0%| | 108/22095 [09:04<26:16:01, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348317 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 14986, 'image': 'vrdu_table_final_2/astro-ph.CO/fddf6561-fd18-4af1-aef1-ce082e8d0f7d.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 0%| | 109/22095 [09:14<36:53:11, 6.04s/it] {'loss': 0.668, 'grad_norm': 1.1302662456729546, 'learning_rate': 1.6289592760180997e-06, 'epoch': 0.0} 0%| | 109/22095 [09:14<36:53:11, 6.04s/it] 0%| | 110/22095 [09:17<31:47:56, 5.21s/it] {'loss': 0.5568, 'grad_norm': 1.0351636351517919, 'learning_rate': 1.6440422322775265e-06, 'epoch': 0.0} 0%| | 110/22095 [09:17<31:47:56, 5.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 1%| | 111/22095 [09:24<35:04:12, 5.74s/it] {'loss': 0.6857, 'grad_norm': 1.0501568943457311, 'learning_rate': 1.6591251885369533e-06, 'epoch': 0.01} 1%| | 111/22095 [09:24<35:04:12, 5.74s/it] 1%| | 112/22095 [09:28<30:35:09, 5.01s/it] {'loss': 0.6603, 'grad_norm': 1.0207344968356886, 'learning_rate': 1.67420814479638e-06, 'epoch': 0.01} 1%| | 112/22095 [09:28<30:35:09, 5.01s/it] 1%| | 113/22095 [09:31<26:46:11, 4.38s/it] {'loss': 0.6509, 'grad_norm': 1.0310491415955947, 'learning_rate': 1.6892911010558073e-06, 'epoch': 0.01} 1%| | 113/22095 [09:31<26:46:11, 4.38s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 1%| | 114/22095 [09:38<32:59:02, 5.40s/it] {'loss': 0.7051, 'grad_norm': 0.9639024601411313, 'learning_rate': 1.704374057315234e-06, 'epoch': 0.01} 1%| | 114/22095 [09:38<32:59:02, 5.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57316 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58637 > 40960). Running this sequence through the model will result in indexing errors
1%| | 115/22095 [09:42<30:44:11, 5.03s/it] {'loss': 0.6142, 'grad_norm': 0.9586419580660428, 'learning_rate': 1.7194570135746609e-06, 'epoch': 0.01}
1%| | 116/22095 [09:46<28:11:32, 4.62s/it] {'loss': 0.6466, 'grad_norm': 1.04842344522062, 'learning_rate': 1.7345399698340876e-06, 'epoch': 0.01}
1%| | 117/22095 [09:50<26:16:32, 4.30s/it] {'loss': 0.5516, 'grad_norm': 0.9173299249262994, 'learning_rate': 1.7496229260935144e-06, 'epoch': 0.01}
1%| | 118/22095 [09:55<27:41:32, 4.54s/it] {'loss': 0.5931, 'grad_norm': 1.0831856068854298, 'learning_rate': 1.7647058823529414e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930853 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54006, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396969 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63822, 'image': 'vrdu_table_final_2/astro-ph.EP/327b49f1-4032-4a87-aeb3-f1b08b9e3883.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_y$\\end{tabular}\n```"}]}
1%| | 119/22095 [09:59<26:53:39, 4.41s/it] {'loss': 0.577, 'grad_norm': 0.8946723582329622, 'learning_rate': 1.7797888386123682e-06, 'epoch': 0.01}
1%| | 120/22095 [10:02<24:37:19, 4.03s/it] {'loss': 0.6203, 'grad_norm': 1.1010913291353444, 'learning_rate': 1.794871794871795e-06, 'epoch': 0.01}
1%| | 121/22095 [10:05<22:37:36, 3.71s/it] {'loss': 0.6101, 'grad_norm': 0.9085173100560293, 'learning_rate': 1.8099547511312218e-06, 'epoch': 0.01}
1%| | 122/22095 [10:08<21:03:47, 3.45s/it] {'loss': 0.5623, 'grad_norm': 0.9320802267784369, 'learning_rate': 1.8250377073906486e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (70879 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60845 > 40960). Running this sequence through the model will result in indexing errors
1%| | 123/22095 [10:15<28:26:40, 4.66s/it] {'loss': 0.6649, 'grad_norm': 0.8195711642520936, 'learning_rate': 1.8401206636500756e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (46284 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77560 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72082 > 40960). Running this sequence through the model will result in indexing errors
1%| | 124/22095 [10:19<25:49:13, 4.23s/it] {'loss': 0.5649, 'grad_norm': 1.010253752587403, 'learning_rate': 1.8552036199095024e-06, 'epoch': 0.01}
1%| | 125/22095 [10:22<24:13:58, 3.97s/it] {'loss': 0.5982, 'grad_norm': 0.991330010194571, 'learning_rate': 1.8702865761689292e-06, 'epoch': 0.01}
1%| | 126/22095 [10:25<22:17:26, 3.65s/it] {'loss': 0.6453, 'grad_norm': 1.1475476851088662, 'learning_rate': 1.885369532428356e-06, 'epoch': 0.01}
1%| | 127/22095 [10:28<21:20:40, 3.50s/it] {'loss': 0.5992, 'grad_norm': 0.9235417511336075, 'learning_rate': 1.9004524886877828e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (73732 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88859 > 40960). Running this sequence through the model will result in indexing errors
1%| | 128/22095 [10:31<20:31:11, 3.36s/it] {'loss': 0.6304, 'grad_norm': 0.9847063346731131, 'learning_rate': 1.91553544494721e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (91584 > 40960). Running this sequence through the model will result in indexing errors
1%| | 129/22095 [10:34<19:58:51, 3.27s/it] {'loss': 0.6046, 'grad_norm': 1.0269890471208356, 'learning_rate': 1.9306184012066366e-06, 'epoch': 0.01}
1%| | 130/22095 [10:38<20:52:17, 3.42s/it] {'loss': 0.6118, 'grad_norm': 0.9258189317090748, 'learning_rate': 1.9457013574660634e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 131/22095 [10:48<32:59:04, 5.41s/it] {'loss': 0.6506, 'grad_norm': 0.7492225521686104, 'learning_rate': 1.96078431372549e-06, 'epoch': 0.01}
1%| | 132/22095 [10:56<37:51:21, 6.21s/it] {'loss': 0.6702, 'grad_norm': 0.7421506908255864, 'learning_rate': 1.975867269984917e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
1%| | 133/22095 [10:59<32:38:35, 5.35s/it] {'loss': 0.5685, 'grad_norm': 0.9545777179324334, 'learning_rate': 1.9909502262443437e-06, 'epoch': 0.01}
1%| | 134/22095 [11:03<29:06:56, 4.77s/it] {'loss': 0.5939, 'grad_norm': 0.9622591564554519, 'learning_rate': 2.006033182503771e-06, 'epoch': 0.01}
1%| | 135/22095 [11:06<26:44:41, 4.38s/it] {'loss': 0.6169, 'grad_norm': 0.9109690493087242, 'learning_rate': 2.0211161387631978e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 136/22095 [11:14<33:15:01, 5.45s/it] {'loss': 0.6737, 'grad_norm': 0.6813272311762413, 'learning_rate': 2.0361990950226245e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307762 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2AB8HdZic_eJjSZFnXXXVwVXa_!!1981295988.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请给出图片里所有的文字,我需要知道它们是什么。'}, {'from': 'gpt', 'value': '图中所有文字:\n送\n佳节\n买\n一\n一\n首选\n按节能静音蹲便水箱\n厂家直销\n大容量\n大冲力\n黄金牛\n五年质保\n35元\n包邮\n破损包赔'}]}
1%| | 137/22095 [11:24<42:11:51, 6.92s/it] {'loss': 0.6571, 'grad_norm': 0.6720777722130816, 'learning_rate': 2.0512820512820513e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
1%| | 138/22095 [11:28<35:58:09, 5.90s/it] {'loss': 0.5855, 'grad_norm': 0.9667588053275596, 'learning_rate': 2.066365007541478e-06, 'epoch': 0.01}
1%| | 139/22095 [11:32<31:58:14, 5.24s/it] {'loss': 0.6125, 'grad_norm': 0.9344038736230162, 'learning_rate': 2.0814479638009053e-06, 'epoch': 0.01}
1%| | 140/22095 [11:36<30:24:58, 4.99s/it] {'loss': 0.5956, 'grad_norm': 0.9465344342175216, 'learning_rate': 2.096530920060332e-06, 'epoch': 0.01}
1%| | 141/22095 [11:40<27:58:03, 4.59s/it] {'loss': 0.6398, 'grad_norm': 1.2591164988076586, 'learning_rate': 2.111613876319759e-06, 'epoch': 0.01}
1%| | 142/22095 [11:43<25:23:43, 4.16s/it] {'loss': 0.6047, 'grad_norm': 0.9677788926426463, 'learning_rate': 2.1266968325791857e-06, 'epoch': 0.01}
1%| | 143/22095 [11:47<25:35:52, 4.20s/it] {'loss': 0.6125, 'grad_norm': 0.9872448306229715, 'learning_rate': 2.1417797888386125e-06, 'epoch': 0.01}
1%| | 144/22095 [11:51<25:13:04, 4.14s/it] {'loss': 0.5664, 'grad_norm': 1.0035271309418503, 'learning_rate': 2.1568627450980393e-06, 'epoch': 0.01}
1%| | 145/22095 [11:54<23:13:41, 3.81s/it] {'loss': 0.5886, 'grad_norm': 0.9880506990276164, 'learning_rate': 2.171945701357466e-06, 'epoch': 0.01}
1%| | 146/22095 [11:59<24:22:41, 4.00s/it] {'loss': 0.6753, 'grad_norm': 1.0103893385719878, 'learning_rate': 2.187028657616893e-06, 'epoch': 0.01}
1%| | 147/22095 [12:03<24:19:05, 3.99s/it] {'loss': 0.6041, 'grad_norm': 1.011094895629863, 'learning_rate': 2.2021116138763197e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922716 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45869, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 3\nB. 4\nC. 1\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AD=\\frac{1}{3}AB,AB=12,∴AD=4,∵C是AD的中点,∴AC=\\frac{1}{2}AD=2.'}]}
1%| | 148/22095 [12:07<25:47:50, 4.23s/it] {'loss': 0.6077, 'grad_norm': 0.9464836309060873, 'learning_rate': 2.2171945701357465e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (121079 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85858 > 40960). Running this sequence through the model will result in indexing errors
1%| | 149/22095 [12:11<24:15:16, 3.98s/it] {'loss': 0.5735, 'grad_norm': 0.9495819965291539, 'learning_rate': 2.2322775263951737e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8410450 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12649, 'image': 'vrdu_table_final_2/astro-ph.CO/a6864068-5772-4096-8990-ef20771f71ed.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
1%| | 150/22095 [12:14<22:45:10, 3.73s/it] {'loss': 0.6186, 'grad_norm': 1.1858654132024509, 'learning_rate': 2.2473604826546005e-06, 'epoch': 0.01}
1%| | 151/22095 [12:18<23:13:08, 3.81s/it] {'loss': 0.5791, 'grad_norm': 0.9009709214733077, 'learning_rate': 2.2624434389140273e-06, 'epoch': 0.01}
1%| | 152/22095 [12:21<22:32:40, 3.70s/it] {'loss': 0.5288, 'grad_norm': 1.9798213247768532, 'learning_rate': 2.277526395173454e-06, 'epoch': 0.01}
1%| | 153/22095 [12:24<20:55:01, 3.43s/it] {'loss': 0.5388, 'grad_norm': 0.8768656065313846, 'learning_rate': 2.292609351432881e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 154/22095 [12:28<21:14:15, 3.48s/it] {'loss': 0.621, 'grad_norm': 0.9853972537936977, 'learning_rate': 2.307692307692308e-06, 'epoch': 0.01}
1%| | 155/22095 [12:31<20:46:24, 3.41s/it] {'loss': 0.5722, 'grad_norm': 1.183982065728492, 'learning_rate': 2.322775263951735e-06, 'epoch': 0.01}
1%| | 156/22095 [12:35<21:03:04, 3.45s/it] {'loss': 0.5931, 'grad_norm': 0.9122814416114049, 'learning_rate': 2.3378582202111617e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 157/22095 [12:43<30:50:46, 5.06s/it] {'loss': 0.6359, 'grad_norm': 0.6050120092736706, 'learning_rate': 2.3529411764705885e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924513 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47666, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nA. 4cm\nB. 6cm\nC. 2cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
1%| | 158/22095 [12:48<29:34:59, 4.85s/it] {'loss': 0.5777, 'grad_norm': 0.8846810488876649, 'learning_rate': 2.3680241327300152e-06, 'epoch': 0.01}
1%| | 159/22095 [12:51<26:13:40, 4.30s/it] {'loss': 0.6196, 'grad_norm': 0.9791346336803588, 'learning_rate': 2.3831070889894425e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887883 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11036, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1.5\nB. 2\nC. 0.5\nD. 1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 160/22095 [12:55<25:35:05, 4.20s/it] {'loss': 0.5649, 'grad_norm': 1.0364431416834374, 'learning_rate': 2.3981900452488693e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (41608 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52064 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94839 > 40960). Running this sequence through the model will result in indexing errors
1%| | 161/22095 [12:58<24:01:18, 3.94s/it] {'loss': 0.5675, 'grad_norm': 1.0907811411319832, 'learning_rate': 2.413273001508296e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (157327 > 40960). Running this sequence through the model will result in indexing errors
1%| | 162/22095 [13:02<23:15:15, 3.82s/it] {'loss': 0.5868, 'grad_norm': 1.0702644046402912, 'learning_rate': 2.428355957767723e-06, 'epoch': 0.01}
1%| | 163/22095 [13:05<21:42:08, 3.56s/it] {'loss': 0.6126, 'grad_norm': 1.1961484830707276, 'learning_rate': 2.4434389140271496e-06, 'epoch': 0.01}
1%| | 164/22095 [13:08<21:59:36, 3.61s/it] {'loss': 0.5593, 'grad_norm': 0.9997853040755823, 'learning_rate': 2.4585218702865764e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 165/22095 [13:11<20:54:36, 3.43s/it] {'loss': 0.5741, 'grad_norm': 1.5306620336532337, 'learning_rate': 2.4736048265460032e-06, 'epoch': 0.01}
1%| | 166/22095 [13:15<20:31:15, 3.37s/it] {'loss': 0.597, 'grad_norm': 1.0365389768096527, 'learning_rate': 2.48868778280543e-06, 'epoch': 0.01}
1%| | 167/22095 [13:17<19:39:43, 3.23s/it] {'loss': 0.6228, 'grad_norm': 0.9300028853685578, 'learning_rate': 2.503770739064857e-06, 'epoch': 0.01}
1%| | 168/22095 [13:20<19:01:55, 3.12s/it] {'loss': 0.5888, 'grad_norm': 0.9730959927324633, 'learning_rate': 2.5188536953242836e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [381, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8469960 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [381, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 134010, 'image': 'vrdu_texteq/astro-ph.CO/3c81b182-8671-4793-a754-9b439b638402.png', 'image_wh': [[381, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'We now define the matrix $\\mathbf{D}$ as'}]}
1%| | 169/22095 [13:28<26:42:48, 4.39s/it] {'loss': 0.6655, 'grad_norm': 0.7004727910376336, 'learning_rate': 2.5339366515837104e-06, 'epoch': 0.01}
1%| | 170/22095 [13:31<25:23:57, 4.17s/it] {'loss': 0.61, 'grad_norm': 1.1770735730846689, 'learning_rate': 2.549019607843137e-06, 'epoch': 0.01}
1%| | 171/22095 [13:35<24:05:47, 3.96s/it] {'loss': 0.6124, 'grad_norm': 0.9265069767412805, 'learning_rate': 2.564102564102564e-06, 'epoch': 0.01}
1%| | 172/22095 [13:38<22:15:19, 3.65s/it] {'loss': 0.5759, 'grad_norm': 1.0026834295632552, 'learning_rate': 2.5791855203619916e-06, 'epoch': 0.01}
1%| | 173/22095 [13:41<21:02:11, 3.45s/it] {'loss': 0.6195, 'grad_norm': 0.893051784847375, 'learning_rate': 2.5942684766214184e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 174/22095 [13:45<21:51:17, 3.59s/it] {'loss': 0.5924, 'grad_norm': 1.0294544753456212, 'learning_rate': 2.609351432880845e-06, 'epoch': 0.01}
1%| | 175/22095 [13:49<22:44:00, 3.73s/it] {'loss': 0.6401, 'grad_norm': 0.983431175208823, 'learning_rate': 2.624434389140272e-06, 'epoch': 0.01}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8373195 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39968, 'image': 'vrdu_table_final_2/astro-ph.CO/2353b9b9-623e-418f-acd6-d376ac3311e6.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
1%| | 176/22095 [13:52<21:54:27, 3.60s/it] {'loss': 0.5839, 'grad_norm': 0.9732249669160397, 'learning_rate': 2.6395173453996988e-06, 'epoch': 0.01}
1%| | 177/22095 [13:57<23:31:48, 3.86s/it] {'loss': 0.5342, 'grad_norm': 1.0460819509049557, 'learning_rate': 2.6546003016591256e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 178/22095 [14:04<30:55:06, 5.08s/it] {'loss': 0.6642, 'grad_norm': 0.6725530926649889, 'learning_rate': 2.6696832579185524e-06, 'epoch': 0.01}
1%| | 179/22095 [14:08<28:58:04, 4.76s/it] {'loss': 0.5543, 'grad_norm': 0.947656576932551, 'learning_rate': 2.684766214177979e-06, 'epoch': 0.01}
1%| | 180/22095 [14:12<26:01:25, 4.27s/it] {'loss': 0.5781, 'grad_norm': 0.890279138583737, 'learning_rate': 2.699849170437406e-06, 'epoch': 0.01}
1%| | 181/22095 [14:15<23:39:02, 3.89s/it] {'loss': 0.5896, 'grad_norm': 0.9491950792758711, 'learning_rate': 2.7149321266968327e-06, 'epoch': 0.01}
1%| | 182/22095 [14:18<22:36:45, 3.71s/it] {'loss': 0.5462, 'grad_norm': 0.9199486672240875, 'learning_rate': 2.7300150829562595e-06, 'epoch': 0.01}
1%| | 183/22095 [14:22<23:01:26, 3.78s/it] {'loss': 0.5749, 'grad_norm': 4.539286243882512, 'learning_rate': 2.7450980392156867e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 184/22095 [14:25<21:28:35, 3.53s/it] {'loss': 0.5577, 'grad_norm': 0.9266218360089107, 'learning_rate': 2.7601809954751135e-06, 'epoch': 0.01}
1%| | 185/22095 [14:28<20:37:31, 3.39s/it] {'loss': 0.5218, 'grad_norm': 0.9869037300499367, 'learning_rate': 2.7752639517345403e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (55460 > 40960). Running this sequence through the model will result in indexing errors
1%| | 186/22095 [14:32<21:44:52, 3.57s/it] {'loss': 0.5768, 'grad_norm': 0.9711731377333721, 'learning_rate': 2.790346907993967e-06, 'epoch': 0.01}
1%| | 187/22095 [14:36<23:45:10, 3.90s/it] {'loss': 0.587, 'grad_norm': 1.2170837862401673, 'learning_rate': 2.805429864253394e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (43924 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50273 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99149 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67313 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47413 > 40960). Running this sequence through the model will result in indexing errors 1%| | 188/22095 [14:40<23:24:57, 3.85s/it] {'loss': 0.5275, 'grad_norm': 0.8855600221161494, 'learning_rate': 2.8205128205128207e-06, 'epoch': 0.01} 1%| | 188/22095 [14:40<23:24:57, 3.85s/it] 1%| | 189/22095 [14:44<23:33:37, 3.87s/it] {'loss': 0.5574, 'grad_norm': 0.9718908666916015, 'learning_rate': 2.8355957767722475e-06, 'epoch': 0.01} 1%| | 189/22095 [14:44<23:33:37, 3.87s/it] 1%| | 190/22095 [14:47<22:25:31, 3.69s/it] {'loss': 0.5696, 'grad_norm': 0.9121332025706556, 'learning_rate': 2.8506787330316743e-06, 'epoch': 0.01} 1%| | 190/22095 [14:47<22:25:31, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50760 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58159 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46027 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65343 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110218 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44505 > 40960). Running this sequence through the model will result in indexing errors
1%| | 191/22095 [14:50<20:48:44, 3.42s/it] {'loss': 0.5831, 'grad_norm': 0.9606521465966855, 'learning_rate': 2.865761689291101e-06, 'epoch': 0.01}
1%| | 191/22095 [14:50<20:48:44, 3.42s/it]
1%| | 192/22095 [14:53<19:43:24, 3.24s/it] {'loss': 0.6033, 'grad_norm': 0.9248220902665218, 'learning_rate': 2.880844645550528e-06, 'epoch': 0.01}
1%| | 192/22095 [14:53<19:43:24, 3.24s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [167, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8480481 in VC:s3://internvl-moe-sft-data/. Exception: Image size [167, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 155413, 'image': 'vrdu_texteq/astro-ph.CO/b6c44111-7c12-46bd-977c-b274ebfdbe5e.png', 'image_wh': [[167, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'at $>95\\%$ CL.'}]}
1%| | 193/22095 [14:56<19:19:51, 3.18s/it] {'loss': 0.5711, 'grad_norm': 1.245130579446091, 'learning_rate': 2.895927601809955e-06, 'epoch': 0.01}
1%| | 193/22095 [14:56<19:19:51, 3.18s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 194/22095 [15:07<32:52:00, 5.40s/it] {'loss': 0.6706, 'grad_norm': 0.6920799519601023, 'learning_rate': 2.911010558069382e-06, 'epoch': 0.01}
1%| | 194/22095 [15:07<32:52:00, 5.40s/it]
1%| | 195/22095 [15:10<29:04:14, 4.78s/it] {'loss': 0.5236, 'grad_norm': 0.9225517446215864, 'learning_rate': 2.9260935143288087e-06, 'epoch': 0.01}
1%| | 195/22095 [15:10<29:04:14, 4.78s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 196/22095 [15:20<38:19:48, 6.30s/it] {'loss': 0.6724, 'grad_norm': 0.6084442686451063, 'learning_rate': 2.9411764705882355e-06, 'epoch': 0.01}
1%| | 196/22095 [15:20<38:19:48, 6.30s/it]
1%| | 197/22095 [15:23<33:05:19, 5.44s/it] {'loss': 0.5737, 'grad_norm': 1.0686412628431616, 'learning_rate': 2.9562594268476623e-06, 'epoch': 0.01}
1%| | 197/22095 [15:23<33:05:19, 5.44s/it]
1%| | 198/22095 [15:26<28:56:22, 4.76s/it] {'loss': 0.5786, 'grad_norm': 1.0296132076998055, 'learning_rate': 2.971342383107089e-06, 'epoch': 0.01}
1%| | 198/22095 [15:26<28:56:22, 4.76s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 199/22095 [15:37<39:17:37, 6.46s/it] {'loss': 0.6535, 'grad_norm': 0.6266104875319409, 'learning_rate': 2.986425339366516e-06, 'epoch': 0.01}
1%| | 199/22095 [15:37<39:17:37, 6.46s/it]
1%| | 200/22095 [15:47<46:19:35, 7.62s/it] {'loss': 0.647, 'grad_norm': 0.5458319005831402, 'learning_rate': 3.0015082956259426e-06, 'epoch': 0.01}
1%| | 200/22095 [15:47<46:19:35, 7.62s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047832 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 6cm\nB. 6.5cm\nC. 5cm\nD. 5.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
1%| | 201/22095 [15:51<38:37:16, 6.35s/it] {'loss': 0.5753, 'grad_norm': 0.963218116755177, 'learning_rate': 3.0165912518853694e-06, 'epoch': 0.01}
1%| | 201/22095 [15:51<38:37:16, 6.35s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (46983 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83206 > 40960). 
Running this sequence through the model will result in indexing errors 1%| | 202/22095 [15:54<33:26:39, 5.50s/it] {'loss': 0.5516, 'grad_norm': 1.1478861970452678, 'learning_rate': 3.0316742081447962e-06, 'epoch': 0.01} 1%| | 202/22095 [15:54<33:26:39, 5.50s/it] 1%| | 203/22095 [15:58<30:23:43, 5.00s/it] {'loss': 0.597, 'grad_norm': 0.8944848254240031, 'learning_rate': 3.046757164404223e-06, 'epoch': 0.01} 1%| | 203/22095 [15:58<30:23:43, 5.00s/it] 1%| | 204/22095 [16:01<27:47:02, 4.57s/it] {'loss': 0.5382, 'grad_norm': 0.9869650281799124, 'learning_rate': 3.0618401206636506e-06, 'epoch': 0.01} 1%| | 204/22095 [16:01<27:47:02, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62223 > 40960). Running this sequence through the model will result in indexing errors 1%| | 205/22095 [16:04<24:52:13, 4.09s/it] {'loss': 0.5331, 'grad_norm': 1.130476760242102, 'learning_rate': 3.0769230769230774e-06, 'epoch': 0.01} 1%| | 205/22095 [16:04<24:52:13, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47822 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104397 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66611 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104533 > 40960). 
Running this sequence through the model will result in indexing errors 1%| | 206/22095 [16:10<27:45:36, 4.57s/it] {'loss': 0.618, 'grad_norm': 0.6252836318965277, 'learning_rate': 3.0920060331825042e-06, 'epoch': 0.01} 1%| | 206/22095 [16:10<27:45:36, 4.57s/it] 1%| | 207/22095 [16:13<25:21:57, 4.17s/it] {'loss': 0.5858, 'grad_norm': 0.9759516513557152, 'learning_rate': 3.107088989441931e-06, 'epoch': 0.01} 1%| | 207/22095 [16:13<25:21:57, 4.17s/it] 1%| | 208/22095 [16:18<25:34:12, 4.21s/it] {'loss': 0.5768, 'grad_norm': 0.8682518804136213, 'learning_rate': 3.122171945701358e-06, 'epoch': 0.01} 1%| | 208/22095 [16:18<25:34:12, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107698 > 40960). Running this sequence through the model will result in indexing errors 1%| | 209/22095 [16:21<24:26:38, 4.02s/it] {'loss': 0.529, 'grad_norm': 1.236201277393762, 'learning_rate': 3.1372549019607846e-06, 'epoch': 0.01} 1%| | 209/22095 [16:21<24:26:38, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (57351 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101337 > 40960). 
Running this sequence through the model will result in indexing errors 1%| | 210/22095 [16:30<33:53:06, 5.57s/it] {'loss': 0.6492, 'grad_norm': 0.49697934177661585, 'learning_rate': 3.1523378582202114e-06, 'epoch': 0.01} 1%| | 210/22095 [16:30<33:53:06, 5.57s/it] 1%| | 211/22095 [16:34<30:15:04, 4.98s/it] {'loss': 0.563, 'grad_norm': 0.8940926607531419, 'learning_rate': 3.167420814479638e-06, 'epoch': 0.01} 1%| | 211/22095 [16:34<30:15:04, 4.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 1%| | 212/22095 [16:37<27:15:12, 4.48s/it] {'loss': 0.5136, 'grad_norm': 0.858566838909827, 'learning_rate': 3.182503770739065e-06, 'epoch': 0.01} 1%| | 212/22095 [16:37<27:15:12, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (51009 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78993 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49840 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47108 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51920 > 40960). Running this sequence through the model will result in indexing errors 1%| | 213/22095 [16:44<31:03:35, 5.11s/it] {'loss': 0.6387, 'grad_norm': 0.4250007793137212, 'learning_rate': 3.1975867269984918e-06, 'epoch': 0.01} 1%| | 213/22095 [16:44<31:03:35, 5.11s/it] 1%| | 214/22095 [16:54<40:28:59, 6.66s/it] {'loss': 0.6461, 'grad_norm': 0.4830424869851058, 'learning_rate': 3.212669683257919e-06, 'epoch': 0.01} 1%| | 214/22095 [16:54<40:28:59, 6.66s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 1%| | 215/22095 [16:58<35:01:56, 5.76s/it] {'loss': 0.5798, 'grad_norm': 0.9492593456229038, 'learning_rate': 3.2277526395173458e-06, 'epoch': 0.01} 1%| | 215/22095 [16:58<35:01:56, 5.76s/it] 1%| | 216/22095 [17:02<32:01:43, 5.27s/it] {'loss': 0.5629, 'grad_norm': 0.9052606044295339, 'learning_rate': 3.2428355957767726e-06, 'epoch': 0.01} 1%| | 216/22095 [17:02<32:01:43, 5.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 1%| | 217/22095 [17:11<38:00:06, 6.25s/it] {'loss': 0.6497, 'grad_norm': 0.46529736780797787, 'learning_rate': 3.2579185520361994e-06, 'epoch': 0.01} 1%| | 217/22095 [17:11<38:00:06, 6.25s/it] 1%| | 218/22095 [17:17<37:41:06, 6.20s/it] {'loss': 0.6619, 'grad_norm': 0.44782130490666033, 'learning_rate': 3.273001508295626e-06, 'epoch': 0.01} 1%| | 218/22095 [17:17<37:41:06, 6.20s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 1%| | 219/22095 [17:20<32:26:39, 5.34s/it] {'loss': 0.4668, 'grad_norm': 0.8659706800572665, 'learning_rate': 3.288084464555053e-06, 'epoch': 0.01} 1%| | 219/22095 [17:20<32:26:39, 5.34s/it] 1%| | 220/22095 [17:23<28:35:08, 4.70s/it] {'loss': 0.517, 'grad_norm': 0.8891708692268783, 'learning_rate': 3.3031674208144797e-06, 'epoch': 0.01} 1%| | 220/22095 [17:23<28:35:08, 
4.70s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59852 > 40960). Running this sequence through the model will result in indexing errors
1%| | 221/22095 [17:34<39:34:18, 6.51s/it] {'loss': 0.6212, 'grad_norm': 0.44320004826346965, 'learning_rate': 3.3182503770739065e-06, 'epoch': 0.01}
1%| | 221/22095 [17:34<39:34:18, 6.51s/it]
1%| | 222/22095 [17:38<34:43:03, 5.71s/it] {'loss': 0.612, 'grad_norm': 1.0023703249828462, 'learning_rate': 3.3333333333333333e-06, 'epoch': 0.01}
1%| | 222/22095 [17:38<34:43:03, 5.71s/it]
1%| | 223/22095 [17:42<31:46:52, 5.23s/it] {'loss': 0.5485, 'grad_norm': 0.9391998913476327, 'learning_rate': 3.34841628959276e-06, 'epoch': 0.01}
1%| | 223/22095 [17:42<31:46:52, 5.23s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7322182 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250508_161646_1/images/before_screenshot_1_id_73_internvl_element-caption_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nRetrieve the coordinates from the reference: The 'Reference' dropdown button in AutoCAD Mechanical's ribbon interface, positioned within the Block section of the toolbar."}, {'from': 'gpt', 'value': "The 'Reference' dropdown button in AutoCAD Mechanical's ribbon interface, positioned within the Block section of the toolbar.[[449, 168]]"}], 'width': 3600, 'height': 2338}
1%| | 224/22095 [17:45<27:51:19, 4.59s/it] {'loss': 0.5324, 'grad_norm': 0.9215626336665612, 'learning_rate': 3.3634992458521878e-06, 'epoch': 0.01}
1%| | 224/22095 [17:45<27:51:19, 4.59s/it]
1%| | 225/22095 [17:48<25:15:27, 4.16s/it] {'loss': 0.5148, 'grad_norm': 0.9650536964226126, 'learning_rate': 3.3785822021116145e-06, 'epoch': 0.01}
1%| | 225/22095 [17:48<25:15:27, 4.16s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 226/22095 [17:58<36:18:54, 5.98s/it] {'loss': 0.6352, 'grad_norm': 0.44025580492320604, 'learning_rate': 3.3936651583710413e-06, 'epoch': 0.01}
1%| | 226/22095 [17:58<36:18:54, 5.98s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (45799 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41914 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45415 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84816 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (140979 > 40960). Running this sequence through the model will result in indexing errors
1%| | 227/22095 [18:02<32:27:46, 5.34s/it] {'loss': 0.5221, 'grad_norm': 0.9860234711060266, 'learning_rate': 3.408748114630468e-06, 'epoch': 0.01}
1%| | 227/22095 [18:02<32:27:46, 5.34s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396944 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63797, 'image': 'vrdu_table_final_2/astro-ph.EP/2f6e82fd-bc5c-4d6f-9889-08a69af56eb2.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_y$\\end{tabular}\n```"}]}
1%| | 228/22095 [18:06<30:05:00, 4.95s/it] {'loss': 0.4942, 'grad_norm': 0.8750523725384597, 'learning_rate': 3.423831070889895e-06, 'epoch': 0.01}
1%| | 228/22095 [18:06<30:05:00, 4.95s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 229/22095 [18:12<31:17:47, 5.15s/it] {'loss': 0.6168, 'grad_norm': 0.40786929784799386, 'learning_rate': 3.4389140271493217e-06, 'epoch': 0.01}
1%| | 229/22095 [18:12<31:17:47, 5.15s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 230/22095 [18:15<27:54:08, 4.59s/it] {'loss': 0.5732, 'grad_norm': 0.9581620031963619, 'learning_rate': 3.4539969834087485e-06, 'epoch': 0.01}
1%| | 230/22095 [18:15<27:54:08, 4.59s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (123065 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44049 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85622 > 40960). 
Running this sequence through the model will result in indexing errors 1%| | 231/22095 [18:18<25:30:16, 4.20s/it] {'loss': 0.5713, 'grad_norm': 1.078295738150313, 'learning_rate': 3.4690799396681753e-06, 'epoch': 0.01} 1%| | 231/22095 [18:18<25:30:16, 4.20s/it] 1%| | 232/22095 [18:21<23:11:04, 3.82s/it] {'loss': 0.5712, 'grad_norm': 1.005658714168155, 'learning_rate': 3.484162895927602e-06, 'epoch': 0.01} 1%| | 232/22095 [18:21<23:11:04, 3.82s/it] 1%| | 233/22095 [18:24<21:48:37, 3.59s/it] {'loss': 0.5375, 'grad_norm': 0.9349868398807674, 'learning_rate': 3.499245852187029e-06, 'epoch': 0.01} 1%| | 233/22095 [18:24<21:48:37, 3.59s/it] 1%| | 234/22095 [18:27<20:36:10, 3.39s/it] {'loss': 0.4573, 'grad_norm': 0.8593073232849099, 'learning_rate': 3.5143288084464557e-06, 'epoch': 0.01} 1%| | 234/22095 [18:27<20:36:10, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 1%| | 235/22095 [18:36<31:00:29, 5.11s/it] {'loss': 0.6272, 'grad_norm': 0.46015794426951856, 'learning_rate': 3.529411764705883e-06, 'epoch': 0.01} 1%| | 235/22095 [18:36<31:00:29, 5.11s/it] 1%| | 236/22095 [18:40<27:32:38, 4.54s/it] {'loss': 0.5125, 'grad_norm': 1.0336342436633565, 'learning_rate': 3.5444947209653097e-06, 'epoch': 0.01} 1%| | 236/22095 [18:40<27:32:38, 4.54s/it] 1%| | 237/22095 [18:43<26:01:37, 4.29s/it] {'loss': 0.5745, 'grad_norm': 0.8832268250263099, 'learning_rate': 3.5595776772247365e-06, 'epoch': 0.01} 1%| | 237/22095 [18:43<26:01:37, 4.29s/it] 1%| | 238/22095 [18:47<25:18:48, 4.17s/it] {'loss': 0.5466, 'grad_norm': 0.9516494399278698, 'learning_rate': 3.5746606334841633e-06, 'epoch': 0.01} 1%| | 238/22095 [18:47<25:18:48, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (75567 > 40960). 
Running this sequence through the model will result in indexing errors
1%| | 239/22095 [18:56<33:02:08, 5.44s/it] {'loss': 0.6034, 'grad_norm': 0.4369975511253576, 'learning_rate': 3.58974358974359e-06, 'epoch': 0.01}
1%| | 239/22095 [18:56<33:02:08, 5.44s/it]
1%| | 240/22095 [18:59<29:54:28, 4.93s/it] {'loss': 0.5792, 'grad_norm': 0.9401986167737724, 'learning_rate': 3.604826546003017e-06, 'epoch': 0.01}
1%| | 240/22095 [18:59<29:54:28, 4.93s/it]
1%| | 241/22095 [19:02<26:14:55, 4.32s/it] {'loss': 0.6009, 'grad_norm': 1.0006366096995205, 'learning_rate': 3.6199095022624436e-06, 'epoch': 0.01}
1%| | 241/22095 [19:02<26:14:55, 4.32s/it]
1%| | 242/22095 [19:05<23:47:52, 3.92s/it] {'loss': 0.5477, 'grad_norm': 0.9510760230842723, 'learning_rate': 3.6349924585218704e-06, 'epoch': 0.01}
1%| | 242/22095 [19:05<23:47:52, 3.92s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%| | 243/22095 [19:09<23:13:46, 3.83s/it] {'loss': 0.544, 'grad_norm': 0.9288859948652094, 'learning_rate': 3.6500754147812972e-06, 'epoch': 0.01}
1%| | 243/22095 [19:09<23:13:46, 3.83s/it]
1%| | 244/22095 [19:12<22:35:27, 3.72s/it] {'loss': 0.5416, 'grad_norm': 0.9863343501078657, 'learning_rate': 3.665158371040724e-06, 'epoch': 0.01}
1%| | 244/22095 [19:12<22:35:27, 3.72s/it]
1%| | 245/22095 [19:15<20:57:07, 3.45s/it] {'loss': 0.4932, 'grad_norm': 0.8605513996689019, 'learning_rate': 3.6802413273001512e-06, 'epoch': 0.01}
1%| | 245/22095 [19:15<20:57:07, 3.45s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047573 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2.5\nB. 4.5\nC. 7\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
1%| | 246/22095 [19:19<22:13:42, 3.66s/it] {'loss': 0.6034, 'grad_norm': 0.916463140897627, 'learning_rate': 3.695324283559578e-06, 'epoch': 0.01}
1%| | 246/22095 [19:19<22:13:42, 3.66s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
[2025-08-27 16:17:28,483] [WARNING] [stage3.py:2118:step] 1 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
1%| | 247/22095 [19:30<34:27:17, 5.68s/it] {'loss': 0.6056, 'grad_norm': 0.4956358437007129, 'learning_rate': 3.710407239819005e-06, 'epoch': 0.01}
1%| | 247/22095 [19:30<34:27:17, 5.68s/it]
1%| | 248/22095 [19:33<30:37:19, 5.05s/it] {'loss': 0.5617, 'grad_norm': 0.9163501523417295, 'learning_rate': 3.7254901960784316e-06, 'epoch': 0.01}
1%| | 248/22095 [19:33<30:37:19, 5.05s/it]
1%| | 249/22095 [19:36<27:07:50, 4.47s/it] {'loss': 0.546, 'grad_norm': 0.9778185038718414, 'learning_rate': 3.7405731523378584e-06, 'epoch': 0.01}
1%| | 249/22095 [19:36<27:07:50, 4.47s/it]
1%| | 250/22095 [19:39<24:18:39, 4.01s/it] {'loss': 0.5811, 'grad_norm': 0.9520944809364426, 'learning_rate': 3.755656108597285e-06, 'epoch': 0.01}
1%| | 250/22095 [19:39<24:18:39, 4.01s/it]
1%| | 251/22095 [19:42<22:15:04, 3.67s/it] {'loss': 
0.5386, 'grad_norm': 0.8823450931650489, 'learning_rate': 3.770739064856712e-06, 'epoch': 0.01} 1%| | 251/22095 [19:42<22:15:04, 3.67s/it] 1%| | 252/22095 [19:45<21:27:17, 3.54s/it] {'loss': 0.5205, 'grad_norm': 0.9512201225676397, 'learning_rate': 3.7858220211161388e-06, 'epoch': 0.01} 1%| | 252/22095 [19:45<21:27:17, 3.54s/it] 1%| | 253/22095 [19:49<22:05:38, 3.64s/it] {'loss': 0.5838, 'grad_norm': 0.9448051386395275, 'learning_rate': 3.8009049773755656e-06, 'epoch': 0.01} 1%| | 253/22095 [19:49<22:05:38, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62759 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83038 > 40960). Running this sequence through the model will result in indexing errors 1%| | 254/22095 [19:52<21:00:24, 3.46s/it] {'loss': 0.5709, 'grad_norm': 0.8613594278239519, 'learning_rate': 3.815987933634992e-06, 'epoch': 0.01} 1%| | 254/22095 [19:52<21:00:24, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43289 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63302 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43975 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83678 > 40960). 
Running this sequence through the model will result in indexing errors
1%| | 255/22095 [19:56<20:46:50, 3.43s/it] {'loss': 0.551, 'grad_norm': 0.9715804592335193, 'learning_rate': 3.83107088989442e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (43221 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116807 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52670 > 40960). Running this sequence through the model will result in indexing errors
1%| | 256/22095 [19:58<19:18:32, 3.18s/it] {'loss': 0.5178, 'grad_norm': 0.8766071855257321, 'learning_rate': 3.846153846153847e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (130804 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (140639 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69763 > 40960). Running this sequence through the model will result in indexing errors
1%| | 257/22095 [20:01<18:32:47, 3.06s/it] {'loss': 0.5359, 'grad_norm': 0.8675288371592743, 'learning_rate': 3.861236802413273e-06, 'epoch': 0.01}
1%| | 258/22095 [20:05<19:55:01, 3.28s/it] {'loss': 0.531, 'grad_norm': 0.9401513608490044, 'learning_rate': 3.8763197586727e-06, 'epoch': 0.01}
1%| | 259/22095 [20:08<19:56:10, 3.29s/it] {'loss': 0.536, 'grad_norm': 0.8856111486307915, 'learning_rate': 3.891402714932127e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (71858 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49692 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102110 > 40960). Running this sequence through the model will result in indexing errors
1%| | 260/22095 [20:15<26:37:38, 4.39s/it] {'loss': 0.617, 'grad_norm': 0.6697874287839527, 'learning_rate': 3.906485671191554e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (99330 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54062 > 40960). Running this sequence through the model will result in indexing errors
1%| | 261/22095 [20:18<24:07:19, 3.98s/it] {'loss': 0.536, 'grad_norm': 0.9666300735640014, 'learning_rate': 3.92156862745098e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 262/22095 [20:29<35:49:17, 5.91s/it] {'loss': 0.6328, 'grad_norm': 0.5422182479370271, 'learning_rate': 3.9366515837104075e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (44850 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61749 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117419 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119569 > 40960). Running this sequence through the model will result in indexing errors
1%| | 263/22095 [20:39<43:11:39, 7.12s/it] {'loss': 0.5888, 'grad_norm': 0.4923913584350843, 'learning_rate': 3.951734539969834e-06, 'epoch': 0.01}
VC:s3://sa-1b/sa_000001/sa_22344.jpg 2025-08-27 16:18:37.335164 load time: 1428.78 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-27 16:18:38.207577 load time: 1017.77 ms
1%| | 264/22095 [20:49<48:58:12, 8.08s/it] {'loss': 0.5906, 'grad_norm': 0.43537169302489886, 'learning_rate': 3.966817496229261e-06, 'epoch': 0.01}
1%| | 265/22095 [20:59<52:40:28, 8.69s/it] {'loss': 0.6193, 'grad_norm': 0.495426622721109, 'learning_rate': 3.9819004524886875e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
1%| | 266/22095 [21:03<43:52:47, 7.24s/it] {'loss': 0.5928, 'grad_norm': 1.0064974142968786, 'learning_rate': 3.9969834087481156e-06, 'epoch': 0.01}
1%| | 267/22095 [21:07<38:27:41, 6.34s/it] {'loss': 0.4592, 'grad_norm': 0.9836738785638621, 'learning_rate': 4.012066365007542e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-27 16:19:06.749563 load time: 1051.12 ms
1%| | 268/22095 [21:18<46:13:14, 7.62s/it] {'loss': 0.5993, 'grad_norm': 0.6441927181482473, 'learning_rate': 4.027149321266969e-06, 'epoch': 0.01}
1%| | 269/22095 [21:28<50:43:17, 8.37s/it] {'loss': 0.6138, 'grad_norm': 0.5193332967918078, 'learning_rate': 4.0422322775263955e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected
module 364, but got module 1
1%| | 270/22095 [21:31<41:46:11, 6.89s/it] {'loss': 0.5578, 'grad_norm': 0.9500292114748106, 'learning_rate': 4.057315233785823e-06, 'epoch': 0.01}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_023154_before_screenshot_sub3.png 2025-08-27 16:19:30.017013 load time: 1043.64 ms
1%| | 271/22095 [21:42<49:30:06, 8.17s/it] {'loss': 0.608, 'grad_norm': 0.4149215761090333, 'learning_rate': 4.072398190045249e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
1%| | 272/22095 [21:47<42:24:21, 7.00s/it] {'loss': 0.5623, 'grad_norm': 1.0754271378193447, 'learning_rate': 4.087481146304676e-06, 'epoch': 0.01}
1%| | 273/22095 [21:50<35:43:58, 5.89s/it] {'loss': 0.5349, 'grad_norm': 0.925293932357001, 'learning_rate': 4.102564102564103e-06, 'epoch': 0.01}
1%| | 274/22095 [21:53<30:51:24, 5.09s/it] {'loss': 0.5174, 'grad_norm': 1.133479898251426, 'learning_rate': 4.11764705882353e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%| | 275/22095 [22:01<36:06:35, 5.96s/it] {'loss': 0.6002, 'grad_norm': 0.5209625544150582, 'learning_rate': 4.132730015082956e-06, 'epoch': 0.01}
1%| | 276/22095 [22:04<31:19:26, 5.17s/it] {'loss': 0.5646, 'grad_norm': 1.0776330878254095, 'learning_rate': 4.1478129713423835e-06, 'epoch': 0.01}
1%|▏ | 277/22095 [22:08<28:07:40, 4.64s/it] {'loss': 0.5618, 'grad_norm': 1.1219212069602342, 'learning_rate': 4.162895927601811e-06, 'epoch': 0.01}
1%|▏ | 278/22095 [22:11<25:27:06, 4.20s/it] {'loss': 0.5335, 'grad_norm': 0.9540197245213747, 'learning_rate': 4.177978883861237e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (44487 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54915 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52656 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 279/22095 [22:15<24:56:40, 4.12s/it] {'loss': 0.5274, 'grad_norm': 1.0409140098530238, 'learning_rate': 4.193061840120664e-06, 'epoch': 0.01}
1%|▏ | 280/22095 [22:18<22:38:54, 3.74s/it] {'loss': 0.5062, 'grad_norm': 0.9479380496867869, 'learning_rate': 4.208144796380091e-06, 'epoch': 0.01}
1%|▏ | 281/22095 [22:21<21:16:56, 3.51s/it] {'loss': 0.5108, 'grad_norm': 0.8985186859100934, 'learning_rate': 4.223227752639518e-06, 'epoch': 0.01}
1%|▏ | 282/22095 [22:24<21:10:14, 3.49s/it] {'loss': 0.5079, 'grad_norm': 0.8673565799474766, 'learning_rate': 4.238310708898944e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (81508 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 283/22095 [22:27<20:29:05, 3.38s/it] {'loss': 0.5285, 'grad_norm': 0.9738525782248589, 'learning_rate': 4.2533936651583714e-06, 'epoch': 0.01}
1%|▏ | 284/22095 [22:30<19:12:33, 3.17s/it] {'loss': 0.5087, 'grad_norm': 0.9445428406253431, 'learning_rate': 4.268476621417798e-06, 'epoch': 0.01}
1%|▏ | 285/22095 [22:33<19:37:15, 3.24s/it] {'loss': 0.5435, 'grad_norm': 0.9482994825503861, 'learning_rate': 4.283559577677225e-06, 'epoch': 0.01}
1%|▏ | 286/22095 [22:37<20:24:18, 3.37s/it] {'loss': 0.4748, 'grad_norm': 0.9428865202414309, 'learning_rate': 4.298642533936652e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 287/22095 [22:45<28:02:16, 4.63s/it] {'loss': 0.6106, 'grad_norm': 0.48671128565727295, 'learning_rate': 4.313725490196079e-06, 'epoch': 0.01}
1%|▏ | 288/22095 [22:49<27:39:26, 4.57s/it] {'loss': 0.5244, 'grad_norm': 0.8757214278700259, 'learning_rate': 4.328808446455506e-06, 'epoch': 0.01}
1%|▏ | 289/22095 [22:52<24:43:07, 4.08s/it] {'loss': 0.5239, 'grad_norm': 1.316171662310517, 'learning_rate': 4.343891402714932e-06, 'epoch': 0.01}
1%|▏ | 290/22095 [22:56<24:43:35, 4.08s/it] {'loss': 0.549, 'grad_norm': 0.9589527483195361, 'learning_rate': 4.358974358974359e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 291/22095 [23:07<36:13:27, 5.98s/it] {'loss': 0.6291, 'grad_norm': 0.4200317179916816, 'learning_rate': 4.374057315233786e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (41949 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64826 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 292/22095 [23:11<32:52:41, 5.43s/it] {'loss': 0.558, 'grad_norm': 0.9318035726460884, 'learning_rate': 4.389140271493213e-06, 'epoch': 0.01}
1%|▏ | 293/22095 [23:14<28:31:10, 4.71s/it] {'loss': 0.5025, 'grad_norm': 1.0019818689732092, 'learning_rate': 4.404223227752639e-06, 'epoch': 0.01}
1%|▏ | 294/22095 [23:18<27:22:06, 4.52s/it] {'loss': 0.5599, 'grad_norm': 1.113539719910521, 'learning_rate': 4.419306184012067e-06, 'epoch': 0.01}
1%|▏ | 295/22095 [23:22<25:57:58, 4.29s/it] {'loss': 0.5216, 'grad_norm': 1.0569103377615405, 'learning_rate': 4.434389140271493e-06, 'epoch': 0.01}
1%|▏ | 296/22095 [23:24<23:25:26, 3.87s/it] {'loss': 0.4954, 'grad_norm': 0.9664867476522317, 'learning_rate': 4.44947209653092e-06, 'epoch': 0.01}
1%|▏ | 297/22095 [23:27<21:30:07, 3.55s/it] {'loss': 0.5027, 'grad_norm': 0.896450043739785, 'learning_rate': 4.464555052790347e-06, 'epoch': 0.01}
1%|▏ | 298/22095 [23:31<21:23:33, 3.53s/it] {'loss': 0.5497, 'grad_norm': 0.9422311268833776, 'learning_rate': 4.479638009049775e-06, 'epoch': 0.01}
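The recurring tokenizer warning above ("Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)") means some samples exceed the model's 40960-token context. A minimal sketch of the two standard remedies, dropping or clamping (helper names are assumptions for illustration, not the actual pipeline code):

```python
MAX_SEQ_LEN = 40960  # the model maximum reported by the tokenizer warnings above

def fits_context(token_ids, max_len=MAX_SEQ_LEN):
    """True when a tokenized sample fits the context window."""
    return len(token_ids) <= max_len

def truncate_to_context(token_ids, max_len=MAX_SEQ_LEN):
    """Clamp a sequence to the context window. Dropping the sample instead
    may be preferable when the tail (e.g. the answer turn) would be cut off."""
    return token_ids[:max_len]
```

Either applied before batching would silence the warning and, more importantly, avoid the indexing errors it predicts.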
1%|▏ | 299/22095 [23:34<20:00:43, 3.31s/it] {'loss': 0.56, 'grad_norm': 1.035399348209044, 'learning_rate': 4.494720965309201e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 300/22095 [23:44<32:24:08, 5.35s/it] {'loss': 0.5795, 'grad_norm': 0.5018088786562147, 'learning_rate': 4.509803921568628e-06, 'epoch': 0.01}
1%|▏ | 301/22095 [23:47<28:39:29, 4.73s/it] {'loss': 0.5114, 'grad_norm': 0.9210544017751315, 'learning_rate': 4.5248868778280546e-06, 'epoch': 0.01}
1%|▏ | 302/22095 [23:50<26:23:18, 4.36s/it] {'loss': 0.5583, 'grad_norm': 1.291852571021736, 'learning_rate': 4.539969834087482e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%|▏ | 303/22095 [23:53<23:40:01, 3.91s/it] {'loss': 0.4886, 'grad_norm': 1.0262904351383784, 'learning_rate': 4.555052790346908e-06, 'epoch': 0.01}
1%|▏ | 304/22095 [23:56<21:49:25, 3.61s/it] {'loss': 0.5207, 'grad_norm': 0.9650289716611675, 'learning_rate': 4.570135746606335e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 305/22095 [24:03<27:03:26, 4.47s/it] {'loss': 0.6426, 'grad_norm': 0.48075664142150215, 'learning_rate': 4.585218702865762e-06, 'epoch': 0.01}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
1%|▏ | 306/22095 [24:06<25:48:11, 4.26s/it] {'loss': 0.5104, 'grad_norm': 0.8763321117534213, 'learning_rate': 4.600301659125189e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (52336 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63528 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43701 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 307/22095 [24:10<23:56:31, 3.96s/it] {'loss': 0.5378, 'grad_norm': 0.941361972531803, 'learning_rate': 4.615384615384616e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%|▏ | 308/22095 [24:13<22:59:24, 3.80s/it] {'loss': 0.5498, 'grad_norm': 1.5059100405196593, 'learning_rate': 4.6304675716440425e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (59493 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 309/22095 [24:17<23:20:04, 3.86s/it] {'loss': 0.5412, 'grad_norm': 0.917428534519775, 'learning_rate': 4.64555052790347e-06, 'epoch': 0.01}
1%|▏ | 310/22095 [24:21<22:33:54, 3.73s/it] {'loss': 0.5303, 'grad_norm': 0.9499732405768552, 'learning_rate': 4.660633484162896e-06, 'epoch': 0.01}
1%|▏ | 311/22095 [24:23<20:55:20, 3.46s/it] {'loss': 0.552, 'grad_norm': 0.9636543513031476, 'learning_rate': 4.675716440422323e-06, 'epoch': 0.01}
1%|▏ | 312/22095 [24:27<21:55:42, 3.62s/it] {'loss': 0.4908, 'grad_norm': 0.9242787914214284, 'learning_rate': 4.69079939668175e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (120668 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58980 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68273 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93624 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 313/22095 [24:31<21:37:54, 3.58s/it] {'loss': 0.5389, 'grad_norm': 1.081272531811085, 'learning_rate': 4.705882352941177e-06, 'epoch': 0.01}
1%|▏ | 314/22095 [24:35<22:08:18, 3.66s/it] {'loss': 0.5807, 'grad_norm': 0.9253075444801346, 'learning_rate': 4.720965309200603e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 315/22095 [24:45<33:37:21, 5.56s/it] {'loss': 0.5956, 'grad_norm': 0.49351422143918683, 'learning_rate': 4.7360482654600305e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (91950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43500 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50876 > 40960).
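The Pillow warning above ("Palette images with Transparency expressed in bytes should be converted to RGBA images") fires for palette-mode ('P') images that carry a transparency entry. A minimal sketch of the check, assuming a standard Pillow image with `.mode` and `.info` attributes (the helper name is an assumption; only the decision logic is shown, so the sketch stays Pillow-free):

```python
def needs_rgba_conversion(mode, info):
    """Palette ('P') images carrying transparency should be converted to RGBA
    before downstream processing, per the Pillow UserWarning logged above.

    `mode` and `info` mirror PIL.Image.Image.mode / .info.
    """
    return mode == "P" and "transparency" in info

# With Pillow (assumed available in this training env), the fix would look like:
#   if needs_rgba_conversion(img.mode, img.info):
#       img = img.convert("RGBA")
```

Converting eagerly at load time keeps the warning out of the log and avoids lossy implicit conversions later.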
Running this sequence through the model will result in indexing errors
1%|▏ | 316/22095 [24:48<28:58:41, 4.79s/it] {'loss': 0.5276, 'grad_norm': 0.9741852578327058, 'learning_rate': 4.751131221719457e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 317/22095 [24:58<38:18:48, 6.33s/it] {'loss': 0.6086, 'grad_norm': 0.42603570176596806, 'learning_rate': 4.766214177978885e-06, 'epoch': 0.01}
1%|▏ | 318/22095 [25:01<32:28:59, 5.37s/it] {'loss': 0.5729, 'grad_norm': 0.9694041654859664, 'learning_rate': 4.781297134238311e-06, 'epoch': 0.01}
1%|▏ | 319/22095 [25:05<30:46:01, 5.09s/it] {'loss': 0.5127, 'grad_norm': 0.8927176798511097, 'learning_rate': 4.7963800904977385e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-27 16:23:03.982364 load time: 1049.3 ms
1%|▏ | 320/22095 [25:16<41:55:18, 6.93s/it] {'loss': 0.6003, 'grad_norm': 0.504617438746344, 'learning_rate': 4.811463046757165e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (95568 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73669 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 321/22095 [25:27<48:17:04, 7.98s/it] {'loss': 0.6335, 'grad_norm': 0.4540099755279664, 'learning_rate': 4.826546003016592e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (109137 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 322/22095 [25:30<40:14:56, 6.65s/it] {'loss': 0.5226, 'grad_norm': 1.1454920132721478, 'learning_rate': 4.8416289592760185e-06, 'epoch': 0.01}
Token indices sequence length is longer than the specified maximum sequence length for this model (54458 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51455 > 40960). Running this sequence through the model will result in indexing errors
1%|▏ | 323/22095 [25:34<34:49:38, 5.76s/it] {'loss': 0.523, 'grad_norm': 0.9128245799208095, 'learning_rate': 4.856711915535446e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
1%|▏ | 324/22095 [25:45<44:49:23, 7.41s/it] {'loss': 0.6013, 'grad_norm': 0.42641660214287797, 'learning_rate': 4.871794871794872e-06, 'epoch': 0.01}
1%|▏ | 325/22095 [25:49<37:19:15, 6.17s/it] {'loss': 0.5273, 'grad_norm': 0.9227995753110831, 'learning_rate': 4.886877828054299e-06, 'epoch': 0.01}
1%|▏ | 326/22095 [25:52<31:27:51, 5.20s/it] {'loss': 0.51, 'grad_norm': 0.859961581416437, 'learning_rate': 4.901960784313726e-06, 'epoch': 0.01}
1%|▏ | 327/22095 [25:55<27:40:28, 4.58s/it] {'loss': 0.4988, 'grad_norm': 0.9260793685617191, 'learning_rate': 4.917043740573153e-06, 'epoch': 0.01}
1%|▏ | 328/22095 [25:58<25:25:15, 4.20s/it] {'loss': 0.5041, 'grad_norm': 0.9780194052175253, 'learning_rate': 4.93212669683258e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%|▏ | 329/22095 [26:09<37:08:45, 6.14s/it] {'loss': 0.5843, 'grad_norm': 0.5176024283585878, 'learning_rate': 4.9472096530920064e-06, 'epoch': 0.01}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
1%|▏ | 330/22095 [26:15<38:12:34, 6.32s/it] {'loss': 0.594, 'grad_norm': 0.4840598729523819, 'learning_rate': 4.962292609351434e-06, 'epoch': 0.01}
1%|▏ | 331/22095 [26:22<38:19:04, 6.34s/it] {'loss': 0.5979, 'grad_norm': 0.38051714853312013, 'learning_rate': 4.97737556561086e-06, 'epoch': 0.01}
Invalidate trace cache @ step 2: expected module 364, but got module 1
2%|▏ | 332/22095 [26:26<34:04:16, 5.64s/it] {'loss': 0.5146, 'grad_norm': 0.9995763964660412, 'learning_rate': 4.992458521870287e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (69277 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80033 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74423 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41885 > 40960).
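The repeated "Rank 0: Number of image tokens 0 does not match number of images 1 / Fixed image tokens in the conversation" pairs above indicate samples whose conversation text lacks an image placeholder for an attached image, which the loader then repairs. A minimal sketch of one plausible repair (the `<image>` token string and the prepend strategy are assumptions, not the confirmed data_qwen_2.py behavior):

```python
IMAGE_TOKEN = "<image>"  # placeholder string; assumed, Qwen-VL-style datasets use this form

def fix_image_tokens(conversations, num_images):
    """If the first human turn has fewer image placeholders than attached
    images, prepend the missing ones -- a guess at what the
    'Fixed image tokens in the conversation' log line does."""
    convs = [dict(turn) for turn in conversations]  # don't mutate the caller's sample
    for turn in convs:
        if turn.get("from") == "human":
            missing = num_images - turn["value"].count(IMAGE_TOKEN)
            if missing > 0:
                turn["value"] = (IMAGE_TOKEN + "\n") * missing + turn["value"]
            break
    return convs
```

Repairing in place like this keeps the sample usable; the alternative is to drop mismatched samples during dataset validation so the warning never fires at train time.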
Running this sequence through the model will result in indexing errors
2%|▏ | 333/22095 [26:29<30:27:01, 5.04s/it] {'loss': 0.5428, 'grad_norm': 0.974717356695917, 'learning_rate': 5.007541478129714e-06, 'epoch': 0.02}
2%|▏ | 334/22095 [26:33<28:02:32, 4.64s/it] {'loss': 0.5165, 'grad_norm': 0.9470024372074591, 'learning_rate': 5.022624434389141e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 335/22095 [26:37<27:09:24, 4.49s/it] {'loss': 0.5282, 'grad_norm': 1.1126527648607103, 'learning_rate': 5.037707390648567e-06, 'epoch': 0.02}
2%|▏ | 336/22095 [26:41<26:16:55, 4.35s/it] {'loss': 0.5556, 'grad_norm': 1.0448542636444378, 'learning_rate': 5.052790346907994e-06, 'epoch': 0.02}
2%|▏ | 337/22095 [26:45<24:30:10, 4.05s/it] {'loss': 0.4889, 'grad_norm': 1.0030573569299337, 'learning_rate': 5.067873303167421e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (75584 > 40960). Running this sequence through the model will result in indexing errors
2%|▏ | 338/22095 [26:47<22:00:22, 3.64s/it] {'loss': 0.5208, 'grad_norm': 1.010629182095211, 'learning_rate': 5.082956259426848e-06, 'epoch': 0.02}
2%|▏ | 339/22095 [26:51<22:00:16, 3.64s/it] {'loss': 0.4962, 'grad_norm': 0.8679356439233207, 'learning_rate': 5.098039215686274e-06, 'epoch': 0.02}
2%|▏ | 340/22095 [26:55<22:57:22, 3.80s/it] {'loss': 0.5067, 'grad_norm': 0.9161320845529888, 'learning_rate': 5.1131221719457016e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 341/22095 [27:05<34:13:04, 5.66s/it] {'loss': 0.5856, 'grad_norm': 1.1114898863130425, 'learning_rate': 5.128205128205128e-06, 'epoch': 0.02}
2%|▏ | 342/22095 [27:09<29:59:35, 4.96s/it] {'loss': 0.5067, 'grad_norm': 0.9330607826133401, 'learning_rate': 5.143288084464555e-06, 'epoch': 0.02}
2%|▏ | 343/22095 [27:12<27:59:32, 4.63s/it] {'loss': 0.5558, 'grad_norm': 0.944407245141589, 'learning_rate': 5.158371040723983e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 344/22095 [27:15<24:53:01, 4.12s/it] {'loss': 0.5319, 'grad_norm': 1.1333977043998347, 'learning_rate': 5.1734539969834096e-06, 'epoch': 0.02}
2%|▏ | 345/22095 [27:19<24:51:36, 4.11s/it] {'loss': 0.5263, 'grad_norm': 0.873397313220358, 'learning_rate': 5.188536953242837e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
2%|▏ | 346/22095 [27:28<33:10:22, 5.49s/it] {'loss': 0.6024, 'grad_norm': 0.763591213995416, 'learning_rate': 5.203619909502263e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (49891 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72123 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101002 > 40960). Running this sequence through the model will result in indexing errors
2%|▏ | 347/22095 [27:31<29:03:40, 4.81s/it] {'loss': 0.5125, 'grad_norm': 1.1260558397244373, 'learning_rate': 5.21870286576169e-06, 'epoch': 0.02}
2%|▏ | 348/22095 [27:35<26:37:09, 4.41s/it] {'loss': 0.5169, 'grad_norm': 0.9889199273006318, 'learning_rate': 5.233785822021117e-06, 'epoch': 0.02}
2%|▏ | 349/22095 [27:38<23:47:12, 3.94s/it] {'loss': 0.4975, 'grad_norm': 0.9146677322541702, 'learning_rate': 5.248868778280544e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 350/22095 [27:41<23:34:16, 3.90s/it] {'loss': 0.4895, 'grad_norm': 0.8626251002848274, 'learning_rate': 5.26395173453997e-06, 'epoch': 0.02}
2%|▏ | 351/22095 [27:45<23:40:45, 3.92s/it] {'loss': 0.4803, 'grad_norm': 1.009332375677646, 'learning_rate': 5.2790346907993975e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 352/22095 [27:49<22:40:04, 3.75s/it] {'loss': 0.5219, 'grad_norm': 1.04132783036604, 'learning_rate': 5.294117647058824e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (92790 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58258 > 40960). Running this sequence through the model will result in indexing errors
2%|▏ | 353/22095 [27:52<21:48:01, 3.61s/it] {'loss': 0.5251, 'grad_norm': 0.8918609702573956, 'learning_rate': 5.309200603318251e-06, 'epoch': 0.02}
2%|▏ | 354/22095 [27:57<24:08:50, 4.00s/it] {'loss': 0.5295, 'grad_norm': 0.9777210319694408, 'learning_rate': 5.3242835595776775e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
2%|▏ | 355/22095 [28:07<34:51:53, 5.77s/it] {'loss': 0.5909, 'grad_norm': 0.7434341418735497, 'learning_rate': 5.339366515837105e-06, 'epoch': 0.02}
2%|▏ | 356/22095 [28:11<31:44:51, 5.26s/it] {'loss': 0.5164, 'grad_norm': 1.1328565365265588, 'learning_rate': 5.354449472096531e-06, 'epoch': 0.02}
VC:s3://gui-agent/data_20250630/windows_augment/images/inventor/handmade_annotation_1/images/2_id_43_internvl_position_crop_0_grounding_instructions_random_paste.png 2025-08-27 16:26:11.601987 load time: 1277.52 ms
2%|▏ | 357/22095 [28:15<29:31:52, 4.89s/it] {'loss': 0.5455, 'grad_norm': 0.9334067691411315, 'learning_rate': 5.369532428355958e-06, 'epoch': 0.02}
2%|▏ | 358/22095 [28:18<26:12:04, 4.34s/it] {'loss': 0.5212, 'grad_norm': 0.8700250681219853, 'learning_rate': 5.384615384615385e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8310724 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1D_2pd2MTUeJjSZFKXXagopXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否将这幅图像中的所有文字详细地列出来?'}, {'from': 'gpt', 'value': '图中所有文字:\n笔画\n练字轩人教版课本同步\n®\n练\n字\n轩\n偏旁部首\n槽\n学\n凹\n生\n练字帖\n数字\n轻松\n练成\n一手好字\n21天\n同步(1-2)年级\n拼音\n(1-\n常用成语\n年级'}]}
 2%|▏ | 359/22095 [28:22<24:57:38, 4.13s/it] {'loss': 0.5332, 'grad_norm': 0.8928502707857849, 'learning_rate': 5.399698340874812e-06, 'epoch': 0.02}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31234.png 2025-08-27 16:26:21.593836 load time: 1118.41 ms
 2%|▏ | 360/22095 [28:26<25:08:29, 4.16s/it] {'loss': 0.5707, 'grad_norm': 1.0109170289938698, 'learning_rate': 5.414781297134238e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 361/22095 [28:34<32:10:50, 5.33s/it] {'loss': 0.5669, 'grad_norm': 0.47521107319293504, 'learning_rate': 5.4298642533936655e-06, 'epoch': 0.02}
 2%|▏ | 362/22095 [28:38<30:20:35, 5.03s/it] {'loss': 0.4959, 'grad_norm': 0.9306733378237048, 'learning_rate': 5.444947209653092e-06, 'epoch': 0.02}
 2%|▏ | 363/22095 [28:42<28:43:20, 4.76s/it] {'loss': 0.5333, 'grad_norm': 0.9194562783119116, 'learning_rate': 5.460030165912519e-06, 'epoch': 0.02}
 2%|▏ | 364/22095 [28:46<26:03:28, 4.32s/it] {'loss': 0.4786, 'grad_norm': 1.0013647007578086, 'learning_rate': 5.475113122171946e-06, 'epoch': 0.02}
 2%|▏ | 365/22095 [28:49<24:55:11, 4.13s/it] {'loss': 0.5266, 'grad_norm': 0.961270745112822, 'learning_rate': 5.4901960784313735e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8310833 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
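The ValueError above is a dataset-side sanity check: Qwen2.5-VL's vision tower cannot patch images with a side shorter than 28 px, and this sample's recorded image_wh of [0, 0] fails that bound. A minimal sketch of such a check — the function name and the (width, height) calling convention are assumptions for illustration, not the actual data_qwen_2.py implementation:

```python
MIN_SIDE = 28  # minimum image side length, per the error message above

def check_image_size(width, height, min_side=MIN_SIDE):
    """Hypothetical re-creation of the guard: reject images whose shorter
    side is below the model's minimum supported size."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_side}."
        )
```

The "[Try #0] Failed to fetch sample ..." line indicates the loader catches this exception and retries with a different sample, so a bad record is skipped rather than killing the run.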
Problematic sample: {'image': 'TB1ELryLpXXXXcIXFXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nIdentify text from the picture.'}, {'from': 'gpt', 'value': 'All words in the image:\n英格索兰正品气动隔膜泵中国区域总代理\n厂家电话\n13663713257\nARO进口气动隔膜泵\nBQG70/0.2隔膜泵\nBQG150/0.2隔膜泵\nBQG350/0.2隔膜泵\n广西桂林工具有限公司\nBQG450/0.2隔膜泵'}]}
 2%|▏ | 366/22095 [28:53<24:51:34, 4.12s/it] {'loss': 0.5228, 'grad_norm': 0.9891712982691216, 'learning_rate': 5.505279034690801e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 367/22095 [28:57<24:31:04, 4.06s/it] {'loss': 0.532, 'grad_norm': 0.9361874821945951, 'learning_rate': 5.520361990950227e-06, 'epoch': 0.02}
 2%|▏ | 368/22095 [29:01<24:28:54, 4.06s/it] {'loss': 0.5114, 'grad_norm': 1.01010060145753, 'learning_rate': 5.535444947209654e-06, 'epoch': 0.02}
 2%|▏ | 369/22095 [29:04<22:11:15, 3.68s/it] {'loss': 0.5285, 'grad_norm': 0.9379850817433849, 'learning_rate': 5.550527903469081e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 370/22095 [29:07<20:42:54, 3.43s/it] {'loss': 0.4726, 'grad_norm': 1.016504623607867, 'learning_rate': 5.565610859728508e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (73129 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 371/22095 [29:10<20:30:13, 3.40s/it] {'loss': 0.5037, 'grad_norm': 1.082966973882081, 'learning_rate': 5.580693815987934e-06, 'epoch': 0.02}
 2%|▏ | 372/22095 [29:14<21:00:59, 3.48s/it] {'loss': 0.5725, 'grad_norm': 0.997881502882024, 'learning_rate': 5.5957767722473614e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 373/22095 [29:21<27:49:02, 4.61s/it] {'loss': 0.592, 'grad_norm': 0.5965032348908443, 'learning_rate': 5.610859728506788e-06, 'epoch': 0.02}
 2%|▏ | 374/22095 [29:25<26:57:32, 4.47s/it] {'loss': 0.5252, 'grad_norm': 0.9535705328720748, 'learning_rate': 5.625942684766215e-06, 'epoch': 0.02}
 2%|▏ | 375/22095 [29:30<27:31:34, 4.56s/it] {'loss': 0.5908, 'grad_norm': 0.9635724454974052, 'learning_rate': 5.641025641025641e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250508_132635_1/images/before_screenshot_1_id_117_function_2_crop_0_grounding_instructions_point_o_paste.png 2025-08-27 16:27:29.676848 load time: 1512.36 ms
 2%|▏ | 376/22095 [29:34<26:54:46, 4.46s/it] {'loss': 0.573, 'grad_norm': 0.920845074054737, 'learning_rate': 5.656108597285069e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (65152 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67397 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49346 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 377/22095 [29:38<24:46:19, 4.11s/it] {'loss': 0.5179, 'grad_norm': 0.8683200822495826, 'learning_rate': 5.671191553544495e-06, 'epoch': 0.02}
 2%|▏ | 378/22095 [29:40<22:16:14, 3.69s/it] {'loss': 0.4988, 'grad_norm': 0.9308637166603803, 'learning_rate': 5.686274509803922e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 379/22095 [29:49<30:24:43, 5.04s/it] {'loss': 0.6014, 'grad_norm': 0.5493833687064147, 'learning_rate': 5.7013574660633486e-06, 'epoch': 0.02}
 2%|▏ | 380/22095 [29:53<28:21:41, 4.70s/it] {'loss': 0.5174, 'grad_norm': 0.9211708064016626, 'learning_rate': 5.716440422322776e-06, 'epoch': 0.02}
 2%|▏ | 381/22095 [29:55<24:56:46, 4.14s/it] {'loss': 0.5162, 'grad_norm': 0.936329971148168, 'learning_rate': 5.731523378582202e-06, 'epoch': 0.02}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_1/images/step_0.png 2025-08-27 16:27:55.281134 load time: 1115.95 ms
 2%|▏ | 382/22095 [29:59<23:25:48, 3.88s/it] {'loss': 0.4509, 'grad_norm': 1.064735933806856, 'learning_rate': 5.746606334841629e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 383/22095 [30:03<23:19:49, 3.87s/it] {'loss': 0.531, 'grad_norm': 0.9441184309473762, 'learning_rate': 5.761689291101056e-06, 'epoch': 0.02}
 2%|▏ | 384/22095 [30:06<23:07:20, 3.83s/it] {'loss': 0.5829, 'grad_norm': 0.891932270187907, 'learning_rate': 5.776772247360483e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 385/22095 [30:17<34:44:18, 5.76s/it] {'loss': 0.5725, 'grad_norm': 0.5207360378802349, 'learning_rate': 5.79185520361991e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (48982 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44239 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 386/22095 [30:20<29:58:48, 4.97s/it] {'loss': 0.5431, 'grad_norm': 0.9828487126912975, 'learning_rate': 5.806938159879337e-06, 'epoch': 0.02}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-27 16:28:18.451254 load time: 1028.8 ms
 2%|▏ | 387/22095 [30:23<26:09:06, 4.34s/it] {'loss': 0.571, 'grad_norm': 0.879686558754779, 'learning_rate': 5.822021116138764e-06, 'epoch': 0.02}
 2%|▏ | 388/22095 [30:26<24:15:58, 4.02s/it] {'loss': 0.4815, 'grad_norm': 0.867256262719138, 'learning_rate': 5.837104072398191e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (45077 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 389/22095 [30:30<24:29:01, 4.06s/it] {'loss': 0.571, 'grad_norm': 0.9583720954401007, 'learning_rate': 5.852187028657617e-06, 'epoch': 0.02}
 2%|▏ | 390/22095 [30:33<22:54:28, 3.80s/it] {'loss': 0.5015, 'grad_norm': 0.8976973382623153, 'learning_rate': 5.8672699849170446e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [281, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8470662 in VC:s3://internvl-moe-sft-data/. Exception: Image size [281, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47515, 'image': 'vrdu_texteq/astro-ph.CO/3921801d-c50d-42e6-8366-c84b7ec38daf.png', 'image_wh': [[281, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'where $Y \\equiv \\alpha L_{\\rm max}$ with'}]}
 2%|▏ | 391/22095 [30:36<21:20:40, 3.54s/it] {'loss': 0.4882, 'grad_norm': 0.8935749651867885, 'learning_rate': 5.882352941176471e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (61113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66143 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48317 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 392/22095 [30:39<20:25:34, 3.39s/it] {'loss': 0.5131, 'grad_norm': 0.8648779834677215, 'learning_rate': 5.897435897435898e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (53700 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56324 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 393/22095 [30:42<20:14:47, 3.36s/it] {'loss': 0.4949, 'grad_norm': 0.9431042049088775, 'learning_rate': 5.9125188536953245e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 394/22095 [30:53<32:55:49, 5.46s/it] {'loss': 0.5519, 'grad_norm': 0.5114510065663831, 'learning_rate': 5.927601809954752e-06, 'epoch': 0.02}
 2%|▏ | 395/22095 [30:56<28:47:06, 4.78s/it] {'loss': 0.4862, 'grad_norm': 1.0818380636393499, 'learning_rate': 5.942684766214178e-06, 'epoch': 0.02}
 2%|▏ | 396/22095 [31:00<26:42:50, 4.43s/it] {'loss': 0.5129, 'grad_norm': 0.9493273137564501, 'learning_rate': 5.957767722473605e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 397/22095 [31:09<36:14:28, 6.01s/it] {'loss': 0.5866, 'grad_norm': 0.40185565651478844, 'learning_rate': 5.972850678733032e-06, 'epoch': 0.02}
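The paired messages "Number of image tokens 0 does not match number of images 1" and "Fixed image tokens in the conversation" above indicate a repair pass: when a sample ships an image but its conversation text lacks the image placeholder, the loader injects one so the vision features can be aligned. A rough sketch, under the assumption that the placeholder literal is '<image>' and that missing tokens are prepended to the first human turn (both guesses at the actual fix, not the real data_qwen_2.py code):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder literal

def fix_image_tokens(conversations, num_images):
    """Hypothetical repair: ensure the conversation references every image.
    Missing placeholders are prepended to the first human turn."""
    have = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    missing = num_images - have
    if missing > 0:
        for turn in conversations:
            if turn["from"] == "human":
                turn["value"] = (IMAGE_TOKEN + "\n") * missing + turn["value"]
                break
    return conversations
```

This matches the pattern in the "Problematic sample" dumps, whose human turns begin with a bare '\n' where a placeholder would normally sit.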
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30198.png 2025-08-27 16:29:09.282332 load time: 1348.06 ms
 2%|▏ | 398/22095 [31:13<31:59:00, 5.31s/it] {'loss': 0.4719, 'grad_norm': 0.8638395089291984, 'learning_rate': 5.987933634992459e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (55044 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 399/22095 [31:16<28:03:32, 4.66s/it] {'loss': 0.5821, 'grad_norm': 1.0007749499746714, 'learning_rate': 6.003016591251885e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 400/22095 [31:25<36:34:47, 6.07s/it] {'loss': 0.6172, 'grad_norm': 0.5148875748111391, 'learning_rate': 6.0180995475113125e-06, 'epoch': 0.02}
 2%|▏ | 401/22095 [31:36<45:03:19, 7.48s/it] {'loss': 0.5892, 'grad_norm': 0.4632624398180578, 'learning_rate': 6.033182503770739e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (60499 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60512 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47559 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88841 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 402/22095 [31:40<38:31:37, 6.39s/it] {'loss': 0.5006, 'grad_norm': 0.9138373595599119, 'learning_rate': 6.048265460030166e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8334231 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 841, 'image': 'vrdu_table_final_2/astro-ph.CO/cf8b34f9-5549-46ad-9817-87218933c07a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
 2%|▏ | 403/22095 [31:44<33:25:23, 5.55s/it] {'loss': 0.5384, 'grad_norm': 1.002764279783093, 'learning_rate': 6.0633484162895924e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 404/22095 [31:47<30:09:25, 5.01s/it] {'loss': 0.4955, 'grad_norm': 0.9330664682506379, 'learning_rate': 6.07843137254902e-06, 'epoch': 0.02}
 2%|▏ | 405/22095 [31:50<26:31:14, 4.40s/it] {'loss': 0.5018, 'grad_norm': 0.8863526835691686, 'learning_rate': 6.093514328808446e-06, 'epoch': 0.02}
 2%|▏ | 406/22095 [31:53<23:49:59, 3.96s/it] {'loss': 0.5304, 'grad_norm': 0.8723865491435912, 'learning_rate': 6.108597285067874e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 407/22095 [32:00<28:49:59, 4.79s/it] {'loss': 0.5871, 'grad_norm': 0.7326950236339584, 'learning_rate': 6.123680241327301e-06, 'epoch': 0.02}
 2%|▏ | 408/22095 [32:04<27:23:12, 4.55s/it] {'loss': 0.5121, 'grad_norm': 1.1416361451930475, 'learning_rate': 6.138763197586728e-06, 'epoch': 0.02}
 2%|▏ | 409/22095 [32:07<24:38:19, 4.09s/it] {'loss': 0.4931, 'grad_norm': 0.9460814774142131, 'learning_rate': 6.153846153846155e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 410/22095 [32:18<36:45:01, 6.10s/it] {'loss': 0.577, 'grad_norm': 0.43571051461036575, 'learning_rate': 6.168929110105581e-06, 'epoch': 0.02}
 2%|▏ | 411/22095 [32:22<32:52:59, 5.46s/it] {'loss': 0.5226, 'grad_norm': 0.9005078431887299, 'learning_rate': 6.1840120663650085e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (121698 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81993 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86573 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 412/22095 [32:26<29:53:31, 4.96s/it] {'loss': 0.4509, 'grad_norm': 1.0887276487488817, 'learning_rate': 6.199095022624435e-06, 'epoch': 0.02}
 2%|▏ | 413/22095 [32:29<27:44:37, 4.61s/it] {'loss': 0.4858, 'grad_norm': 0.8428854780274067, 'learning_rate': 6.214177978883862e-06, 'epoch': 0.02}
 2%|▏ | 414/22095 [32:33<25:56:30, 4.31s/it] {'loss': 0.4748, 'grad_norm': 0.9252658031535186, 'learning_rate': 6.229260935143288e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 415/22095 [32:36<23:26:58, 3.89s/it] {'loss': 0.5015, 'grad_norm': 0.9412040538128673, 'learning_rate': 6.244343891402716e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (46275 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51899 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69278 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68853 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 416/22095 [32:39<21:35:05, 3.58s/it] {'loss': 0.4821, 'grad_norm': 0.9138434969166862, 'learning_rate': 6.259426847662142e-06, 'epoch': 0.02}
 2%|▏ | 417/22095 [32:42<20:29:48, 3.40s/it] {'loss': 0.5217, 'grad_norm': 0.8656063917318203, 'learning_rate': 6.274509803921569e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887885 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11038, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1.5'}]}
 2%|▏ | 418/22095 [32:45<20:40:58, 3.43s/it] {'loss': 0.5392, 'grad_norm': 0.94991968312089, 'learning_rate': 6.2895927601809956e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (50526 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57776 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45905 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 419/22095 [32:48<19:34:55, 3.25s/it] {'loss': 0.4799, 'grad_norm': 0.9372279572345605, 'learning_rate': 6.304675716440423e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 420/22095 [32:58<31:51:03, 5.29s/it] {'loss': 0.5858, 'grad_norm': 0.9533178482186907, 'learning_rate': 6.319758672699849e-06, 'epoch': 0.02}
 2%|▏ | 421/22095 [33:02<29:27:25, 4.89s/it] {'loss': 0.5204, 'grad_norm': 0.9034610556331467, 'learning_rate': 6.334841628959276e-06, 'epoch': 0.02}
 2%|▏ | 422/22095 [33:06<27:17:08, 4.53s/it] {'loss': 0.5085, 'grad_norm': 0.9342840894396373, 'learning_rate': 6.349924585218703e-06, 'epoch': 0.02}
 2%|▏ | 423/22095 [33:09<25:35:51, 4.25s/it] {'loss': 0.4955, 'grad_norm': 0.8600603730023817, 'learning_rate': 6.36500754147813e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 424/22095 [33:21<39:05:22, 6.49s/it] {'loss': 0.5843, 'grad_norm': 0.6397248799825535, 'learning_rate': 6.380090497737556e-06, 'epoch': 0.02}
 2%|▏ | 425/22095 [33:25<34:01:12, 5.65s/it] {'loss': 0.512, 'grad_norm': 1.0534317002502531, 'learning_rate': 6.3951734539969835e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 426/22095 [33:28<30:17:14, 5.03s/it] {'loss': 0.5013, 'grad_norm': 0.9708662444314573, 'learning_rate': 6.410256410256412e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (68011 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 427/22095 [33:39<39:57:31, 6.64s/it] {'loss': 0.5656, 'grad_norm': 0.6262308260974387, 'learning_rate': 6.425339366515838e-06, 'epoch': 0.02}
 2%|▏ | 428/22095 [33:43<35:59:01, 5.98s/it] {'loss': 0.5492, 'grad_norm': 1.104132946757567, 'learning_rate': 6.440422322775265e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 429/22095 [33:46<30:27:42, 5.06s/it] {'loss': 0.513, 'grad_norm': 1.0780508879991042, 'learning_rate': 6.4555052790346916e-06, 'epoch': 0.02}
 2%|▏ | 430/22095 [33:50<28:12:53, 4.69s/it] {'loss': 0.5113, 'grad_norm': 0.8014315963924707, 'learning_rate': 6.470588235294119e-06, 'epoch': 0.02}
 2%|▏ | 431/22095 [33:53<25:36:48, 4.26s/it] {'loss': 0.5298, 'grad_norm': 0.9294933497639883, 'learning_rate': 6.485671191553545e-06, 'epoch': 0.02}
 2%|▏ | 432/22095 [33:57<24:21:06, 4.05s/it] {'loss': 0.4905, 'grad_norm': 1.0134726540671148, 'learning_rate': 6.500754147812972e-06, 'epoch': 0.02}
 2%|▏ | 433/22095 [34:01<24:07:34, 4.01s/it] {'loss': 0.5372, 'grad_norm': 0.935200214361148, 'learning_rate': 6.515837104072399e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333768 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 377, 'image': 'vrdu_table_final_2/astro-ph.CO/acc39a41-8397-46e0-b5ad-0b57ec647b79.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]}
 2%|▏ | 434/22095 [34:04<22:41:45, 3.77s/it] {'loss': 0.463, 'grad_norm': 0.9332971164790316, 'learning_rate': 6.530920060331826e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 435/22095 [34:13<32:20:10, 5.37s/it] {'loss': 0.5862, 'grad_norm': 0.867110760375185, 'learning_rate': 6.546003016591252e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (46174 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68387 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 436/22095 [34:16<28:18:49, 4.71s/it] {'loss': 0.5442, 'grad_norm': 0.8950935667741385, 'learning_rate': 6.5610859728506795e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (48294 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109746 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 437/22095 [34:19<25:30:30, 4.24s/it] {'loss': 0.4917, 'grad_norm': 0.9721133660738163, 'learning_rate': 6.576168929110106e-06, 'epoch': 0.02}
 2%|▏ | 438/22095 [34:23<23:48:52, 3.96s/it] {'loss': 0.49, 'grad_norm': 0.8960955325001373, 'learning_rate': 6.591251885369533e-06, 'epoch': 0.02}
 2%|▏ | 439/22095 [34:27<23:56:27, 3.98s/it] {'loss': 0.5162, 'grad_norm': 0.9513126805609666, 'learning_rate': 6.6063348416289595e-06, 'epoch': 0.02}
 2%|▏ | 440/22095 [34:30<22:14:09, 3.70s/it] {'loss': 0.5152, 'grad_norm': 0.9823253412027514, 'learning_rate': 6.621417797888387e-06, 'epoch': 0.02}
 2%|▏ | 441/22095 [34:33<20:41:00, 3.44s/it] {'loss': 0.527, 'grad_norm': 0.9045371589165514, 'learning_rate': 6.636500754147813e-06, 'epoch': 0.02}
 2%|▏ | 442/22095 [34:36<20:43:35, 3.45s/it] {'loss': 0.5669, 'grad_norm': 0.8782531559250729, 'learning_rate': 6.65158371040724e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [495, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8438388 in VC:s3://internvl-moe-sft-data/. Exception: Image size [495, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 89225, 'image': 'vrdu_texteq/astro-ph.CO/79d43050-90dd-455f-8158-a52f9c614c0f.png', 'image_wh': [[495, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'The action with the $\\pi$ field introduced is'}]}
 2%|▏ | 443/22095 [34:39<19:55:07, 3.31s/it] {'loss': 0.4585, 'grad_norm': 0.8859706150934012, 'learning_rate': 6.666666666666667e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 444/22095 [34:49<32:04:47, 5.33s/it] {'loss': 0.5682, 'grad_norm': 0.6991315890774688, 'learning_rate': 6.681749622926094e-06, 'epoch': 0.02}
 2%|▏ | 445/22095 [34:53<28:52:50, 4.80s/it] {'loss': 0.5208, 'grad_norm': 0.9485402451951149, 'learning_rate': 6.69683257918552e-06, 'epoch': 0.02}
 2%|▏ | 446/22095 [34:56<25:57:50, 4.32s/it] {'loss': 0.5101, 'grad_norm': 0.9873924486992607, 'learning_rate': 6.7119155354449474e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (46641 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42732 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54964 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69643 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75114 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47705 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (48922 > 40960) for 4 sample(s). Truncating to 7635 with 2 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (41524 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 447/22095 [34:58<23:03:13, 3.83s/it] {'loss': 0.5268, 'grad_norm': 0.9276034420485871, 'learning_rate': 6.7269984917043755e-06, 'epoch': 0.02}
 2%|▏ | 448/22095 [35:01<21:06:50, 3.51s/it] {'loss': 0.5419, 'grad_norm': 0.9603097707029032, 'learning_rate': 6.742081447963802e-06, 'epoch': 0.02}
 2%|▏ | 449/22095 [35:05<20:54:15, 3.48s/it] {'loss': 0.527, 'grad_norm': 0.9290487002204546, 'learning_rate': 6.757164404223229e-06, 'epoch': 0.02}
 2%|▏ | 450/22095 [35:08<20:53:07, 3.47s/it] {'loss': 0.486, 'grad_norm': 1.0113822614648986, 'learning_rate': 6.7722473604826555e-06, 'epoch': 0.02}
 2%|▏ | 451/22095 [35:13<23:07:48, 3.85s/it] {'loss': 0.5126, 'grad_norm': 0.9560484644844321, 'learning_rate': 6.787330316742083e-06, 'epoch': 0.02}
 2%|▏ | 452/22095 [35:17<23:16:49, 3.87s/it] {'loss': 0.5092, 'grad_norm': 0.9513817484526905, 'learning_rate': 6.802413273001509e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (54582 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119134 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87190 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 453/22095 [35:20<22:01:07, 3.66s/it] {'loss': 0.4636, 'grad_norm': 1.0341828220958489, 'learning_rate': 6.817496229260936e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (73482 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 454/22095 [35:24<22:25:45, 3.73s/it] {'loss': 0.4935, 'grad_norm': 0.836876325711259, 'learning_rate': 6.832579185520363e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 455/22095 [35:34<34:13:41, 5.69s/it] {'loss': 0.5816, 'grad_norm': 0.7154257246692087, 'learning_rate': 6.84766214177979e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 456/22095 [35:38<31:23:40, 5.22s/it] {'loss': 0.525, 'grad_norm': 0.8883803720828303, 'learning_rate': 6.862745098039216e-06, 'epoch': 0.02}
 2%|▏ | 457/22095 [35:41<27:27:28, 4.57s/it] {'loss': 0.4803, 'grad_norm': 0.8812155648935334, 'learning_rate': 6.8778280542986434e-06, 'epoch': 0.02}
 2%|▏ | 458/22095 [35:44<24:52:55, 4.14s/it] {'loss': 0.5146, 'grad_norm': 0.8914599212352133, 'learning_rate': 6.89291101055807e-06, 'epoch': 0.02}
 2%|▏ | 459/22095 [35:48<24:32:06, 4.08s/it] {'loss': 0.4794, 'grad_norm': 0.9610108620275446, 'learning_rate': 6.907993966817497e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (81627 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79717 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 460/22095 [35:52<23:00:16, 3.83s/it] {'loss': 0.5321, 'grad_norm': 0.9236182851861297, 'learning_rate': 6.923076923076923e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54509 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60423 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97634 > 40960).
Running this sequence through the model will result in indexing errors
 2%|▏ | 461/22095 [36:01<33:02:01, 5.50s/it] {'loss': 0.6144, 'grad_norm': 0.7895517788163228, 'learning_rate': 6.938159879336351e-06, 'epoch': 0.02}
 2%|▏ | 462/22095 [36:04<28:56:04, 4.82s/it] {'loss': 0.558, 'grad_norm': 1.4224341444472233, 'learning_rate': 6.953242835595777e-06, 'epoch': 0.02}
 2%|▏ | 463/22095 [36:08<26:43:51, 4.45s/it] {'loss': 0.5194, 'grad_norm': 0.8534888875322985, 'learning_rate': 6.968325791855204e-06, 'epoch': 0.02}
 2%|▏ | 464/22095 [36:12<26:05:13, 4.34s/it] {'loss': 0.484, 'grad_norm': 0.8878493254499618, 'learning_rate': 6.9834087481146306e-06, 'epoch': 0.02}
 2%|▏ | 465/22095 [36:15<23:34:24, 3.92s/it] {'loss': 0.4687, 'grad_norm': 0.7689726276830536, 'learning_rate': 6.998491704374058e-06, 'epoch': 0.02}
 2%|▏ | 466/22095 [36:18<22:14:50, 3.70s/it] {'loss': 0.4896, 'grad_norm': 0.8431041350728392, 'learning_rate': 7.013574660633484e-06, 'epoch': 0.02}
 2%|▏ | 467/22095 [36:22<22:33:05, 3.75s/it] {'loss': 0.5267, 'grad_norm': 0.9131741584792719, 'learning_rate': 7.028657616892911e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [498, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8466994 in VC:s3://internvl-moe-sft-data/. Exception: Image size [498, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 138223, 'image': 'vrdu_texteq/astro-ph.CO/ca8d3049-98ee-4cd7-a243-1647147b5dd6.png', 'image_wh': [[498, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'First we consider the case: $ M=2$. Then'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 468/22095 [36:32<33:53:39, 5.64s/it] {'loss': 0.5891, 'grad_norm': 0.7000414706451139, 'learning_rate': 7.0437405731523386e-06, 'epoch': 0.02}
 2%|▏ | 469/22095 [36:42<42:19:58, 7.05s/it] {'loss': 0.5661, 'grad_norm': 0.5274632467798588, 'learning_rate': 7.058823529411766e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 2%|▏ | 470/22095 [36:46<36:13:24, 6.03s/it] {'loss': 0.5283, 'grad_norm': 1.0489074712596098, 'learning_rate': 7.073906485671192e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (41532 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 471/22095 [36:50<32:16:54, 5.37s/it] {'loss': 0.4743, 'grad_norm': 0.8608740580398528, 'learning_rate': 7.088989441930619e-06, 'epoch': 0.02}
 2%|▏ | 472/22095 [36:53<28:08:33, 4.69s/it] {'loss': 0.5414, 'grad_norm': 0.8993120418263926, 'learning_rate': 7.104072398190046e-06, 'epoch': 0.02}
 2%|▏ | 473/22095 [36:57<26:42:19, 4.45s/it] {'loss': 0.4785, 'grad_norm': 0.8519655085775868, 'learning_rate': 7.119155354449473e-06, 'epoch': 0.02}
 2%|▏ | 474/22095 [37:00<23:51:29, 3.97s/it] {'loss': 0.4768, 'grad_norm': 0.9992364234055222, 'learning_rate': 7.134238310708899e-06, 'epoch': 0.02}
 2%|▏ | 475/22095 [37:03<22:17:34, 3.71s/it] {'loss': 0.512, 'grad_norm': 0.8211878630216686, 'learning_rate': 7.1493212669683265e-06, 'epoch': 0.02}
 2%|▏ | 476/22095 [37:06<20:43:12, 3.45s/it] {'loss': 0.4972, 'grad_norm': 0.8673840729651399, 'learning_rate': 7.164404223227753e-06, 'epoch': 0.02}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30499.png 2025-08-27 16:35:02.964575 load time: 1066.65 ms
 2%|▏ | 477/22095 [37:10<22:03:18, 3.67s/it] {'loss': 0.5009, 'grad_norm': 0.9626081409011095, 'learning_rate': 7.17948717948718e-06, 'epoch': 0.02}
 2%|▏ | 478/22095 [37:13<20:56:44, 3.49s/it] {'loss': 0.5291, 'grad_norm': 0.9324671863783013, 'learning_rate': 7.1945701357466065e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 479/22095 [37:22<30:41:45, 5.11s/it] {'loss': 0.5974, 'grad_norm': 1.6119787378474646, 'learning_rate': 7.209653092006034e-06, 'epoch': 0.02}
 2%|▏ | 480/22095 [37:25<27:20:36, 4.55s/it] {'loss': 0.4819, 'grad_norm': 0.8936322895814439, 'learning_rate': 7.22473604826546e-06, 'epoch': 0.02}
 2%|▏ | 481/22095 [37:28<24:47:34, 4.13s/it] {'loss': 0.5015, 'grad_norm': 1.1833559492371089, 'learning_rate': 7.239819004524887e-06, 'epoch': 0.02}
 2%|▏ | 482/22095 [37:31<22:47:51, 3.80s/it] {'loss': 0.5349, 'grad_norm': 0.9542089624937165, 'learning_rate': 7.2549019607843145e-06, 'epoch': 0.02}
 2%|▏ | 483/22095 [37:35<22:58:23, 3.83s/it] {'loss': 0.4842, 'grad_norm': 0.8717232550801574, 'learning_rate': 7.269984917043741e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (67873 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 484/22095 [37:38<21:08:52, 3.52s/it] {'loss': 0.4522, 'grad_norm': 0.8784660417369917, 'learning_rate': 7.285067873303168e-06, 'epoch': 0.02}
 2%|▏ | 485/22095 [37:41<21:07:18, 3.52s/it] {'loss': 0.486, 'grad_norm': 0.8714297614485044, 'learning_rate': 7.3001508295625945e-06, 'epoch': 0.02}
 2%|▏ | 486/22095 [37:45<20:48:48, 3.47s/it] {'loss': 0.518, 'grad_norm': 1.0117867219550658, 'learning_rate': 7.315233785822022e-06, 'epoch': 0.02}
 2%|▏ | 487/22095 [37:49<22:27:19, 3.74s/it] {'loss': 0.5309, 'grad_norm': 0.8551402658731775, 'learning_rate': 7.330316742081448e-06, 'epoch': 0.02}
 2%|▏ | 488/22095 [37:52<21:39:53, 3.61s/it] {'loss': 0.4849, 'grad_norm': 0.8520731544633671, 'learning_rate': 7.345399698340876e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (76954 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 489/22095 [37:56<22:08:58, 3.69s/it] {'loss': 0.5183, 'grad_norm': 0.945220429406945, 'learning_rate': 7.3604826546003025e-06, 'epoch': 0.02}
 2%|▏ | 490/22095 [38:00<21:35:11, 3.60s/it] {'loss': 0.5068, 'grad_norm': 0.8965594690346885, 'learning_rate': 7.37556561085973e-06, 'epoch': 0.02}
 2%|▏ | 491/22095 [38:03<20:26:28, 3.41s/it] {'loss': 0.5172, 'grad_norm': 0.9383416027268775, 'learning_rate': 7.390648567119156e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 492/22095 [38:09<26:23:31, 4.40s/it] {'loss': 0.6656, 'grad_norm': 2.8137379093185415, 'learning_rate': 7.405731523378583e-06, 'epoch': 0.02}
 2%|▏ | 493/22095 [38:13<24:18:16, 4.05s/it] {'loss': 0.517, 'grad_norm': 0.8911437812376426, 'learning_rate': 7.42081447963801e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (84215 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80783 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 494/22095 [38:16<23:56:56, 3.99s/it] {'loss': 0.476, 'grad_norm': 0.8746627228365279, 'learning_rate': 7.435897435897437e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (78034 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41037 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62285 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71044 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 495/22095 [38:19<21:50:20, 3.64s/it] {'loss': 0.4964, 'grad_norm': 1.0210061302597564, 'learning_rate': 7.450980392156863e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 496/22095 [38:29<32:53:58, 5.48s/it] {'loss': 0.5769, 'grad_norm': 1.3930772745192121, 'learning_rate': 7.4660633484162904e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881986 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5139, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "<image>\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 4\nB. 6\nC. 2\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 2%|▏ | 497/22095 [38:33<30:43:18, 5.12s/it] {'loss': 0.5273, 'grad_norm': 0.9354966630918502, 'learning_rate': 7.481146304675717e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (42571 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59537 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 498/22095 [38:37<27:41:25, 4.62s/it] {'loss': 0.5322, 'grad_norm': 1.0404930623959125, 'learning_rate': 7.496229260935144e-06, 'epoch': 0.02}
 2%|▏ | 499/22095 [38:40<25:01:40, 4.17s/it] {'loss': 0.5197, 'grad_norm': 0.9212989516000655, 'learning_rate': 7.51131221719457e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (42368 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45654 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69930 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 500/22095 [38:43<23:10:18, 3.86s/it] {'loss': 0.4601, 'grad_norm': 1.1837040152817129, 'learning_rate': 7.526395173453998e-06, 'epoch': 0.02}
 2%|▏ | 501/22095 [38:47<24:18:50, 4.05s/it] {'loss': 0.5082, 'grad_norm': 0.9831508145374277, 'learning_rate': 7.541478129713424e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 502/22095 [38:56<32:05:55, 5.35s/it] {'loss': 0.6088, 'grad_norm': 1.7317500998832773, 'learning_rate': 7.556561085972851e-06, 'epoch': 0.02}
 2%|▏ | 503/22095 [39:06<40:16:49, 6.72s/it] {'loss': 0.5859, 'grad_norm': 1.4020978387603402, 'learning_rate': 7.5716440422322776e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 2%|▏ | 504/22095 [39:10<35:37:04, 5.94s/it] {'loss': 0.4931, 'grad_norm': 0.9968935655490997, 'learning_rate': 7.586726998491705e-06, 'epoch': 0.02}
 2%|▏ | 505/22095 [39:13<30:36:17, 5.10s/it] {'loss': 0.5629, 'grad_norm': 0.9737894931726774, 'learning_rate': 7.601809954751131e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 506/22095 [39:23<39:25:14, 6.57s/it] {'loss': 0.5703, 'grad_norm': 0.8123883604016281, 'learning_rate': 7.616892911010558e-06, 'epoch': 0.02}
 2%|▏ | 507/22095 [39:27<34:17:23, 5.72s/it] {'loss': 0.4761, 'grad_norm': 0.9854901638439508, 'learning_rate': 7.631975867269985e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 508/22095 [39:37<41:51:16, 6.98s/it] {'loss': 0.5448, 'grad_norm': 1.4197575971354461, 'learning_rate': 7.647058823529411e-06, 'epoch': 0.02}
 2%|▏ | 509/22095 [39:40<35:49:36, 5.98s/it] {'loss': 0.5432, 'grad_norm': 0.923807733562872, 'learning_rate': 7.66214177978884e-06, 'epoch': 0.02}
 2%|▏ | 510/22095 [39:44<31:29:17, 5.25s/it] {'loss': 0.507, 'grad_norm': 0.9263412579857541, 'learning_rate': 7.677224736048267e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 511/22095 [39:47<26:57:53, 4.50s/it] {'loss': 0.4931, 'grad_norm': 0.854932864691546, 'learning_rate': 7.692307692307694e-06, 'epoch': 0.02}
 2%|▏ | 512/22095 [39:50<25:16:49, 4.22s/it] {'loss': 0.5055, 'grad_norm': 0.9010694714875057, 'learning_rate': 7.70739064856712e-06, 'epoch': 0.02}
 2%|▏ | 513/22095 [39:54<23:40:40, 3.95s/it] {'loss': 0.5165, 'grad_norm': 0.9302825746028861, 'learning_rate': 7.722473604826546e-06, 'epoch': 0.02}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 2%|▏ | 514/22095 [39:57<22:24:46, 3.74s/it] {'loss': 0.5036, 'grad_norm': 0.815629604194157, 'learning_rate': 7.737556561085974e-06, 'epoch': 0.02}
 2%|▏ | 515/22095 [40:00<21:32:17, 3.59s/it] {'loss': 0.509, 'grad_norm': 0.873286630615026, 'learning_rate': 7.7526395173454e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (69946 > 40960).
Running this sequence through the model will result in indexing errors
 2%|▏ | 516/22095 [40:11<34:43:06, 5.79s/it] {'loss': 0.5643, 'grad_norm': 1.3120833057706391, 'learning_rate': 7.767722473604827e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (62461 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41499 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102385 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 517/22095 [40:14<30:23:22, 5.07s/it] {'loss': 0.5009, 'grad_norm': 0.8369186657274733, 'learning_rate': 7.782805429864253e-06, 'epoch': 0.02}
 2%|▏ | 518/22095 [40:18<27:06:38, 4.52s/it] {'loss': 0.4528, 'grad_norm': 0.8544449557940702, 'learning_rate': 7.797888386123682e-06, 'epoch': 0.02}
 2%|▏ | 519/22095 [40:20<23:59:29, 4.00s/it] {'loss': 0.5267, 'grad_norm': 1.0406352425448877, 'learning_rate': 7.812971342383108e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (47477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51247 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42829 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 520/22095 [40:25<24:34:40, 4.10s/it] {'loss': 0.5368, 'grad_norm': 0.8706975403009801, 'learning_rate': 7.828054298642534e-06, 'epoch': 0.02}
 2%|▏ | 521/22095 [40:28<23:02:46, 3.85s/it] {'loss': 0.476, 'grad_norm': 0.8513980861118772, 'learning_rate': 7.84313725490196e-06, 'epoch': 0.02}
 2%|▏ | 522/22095 [40:32<22:48:07, 3.81s/it] {'loss': 0.5099, 'grad_norm': 0.8850946942100224, 'learning_rate': 7.858220211161389e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (48249 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74631 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49015 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 523/22095 [40:35<22:13:49, 3.71s/it] {'loss': 0.5034, 'grad_norm': 0.9757505338538881, 'learning_rate': 7.873303167420815e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 524/22095 [40:45<33:12:40, 5.54s/it] {'loss': 0.5835, 'grad_norm': 0.8863131755025008, 'learning_rate': 7.888386123680241e-06, 'epoch': 0.02}
 2%|▏ | 525/22095 [40:55<41:56:06, 7.00s/it] {'loss': 0.5908, 'grad_norm': 0.7221129697816255, 'learning_rate': 7.903469079939668e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 2%|▏ | 526/22095 [40:59<35:21:41, 5.90s/it] {'loss': 0.5293, 'grad_norm': 1.1644628967771642, 'learning_rate': 7.918552036199096e-06, 'epoch': 0.02}
 2%|▏ | 527/22095 [41:03<32:06:40, 5.36s/it] {'loss': 0.5114, 'grad_norm': 0.9771835472342165, 'learning_rate': 7.933634992458522e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 528/22095 [41:13<41:04:14, 6.86s/it] {'loss': 0.5591, 'grad_norm': 0.84653531874801, 'learning_rate': 7.948717948717949e-06, 'epoch': 0.02}
 2%|▏ | 529/22095 [41:17<34:54:59, 5.83s/it] {'loss': 0.4927, 'grad_norm': 1.1156798418233669, 'learning_rate': 7.963800904977375e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 2%|▏ | 530/22095 [41:24<38:30:28, 6.43s/it] {'loss': 0.5492, 'grad_norm': 0.6547282981964502, 'learning_rate': 7.978883861236803e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (64357 > 40960). Running this sequence through the model will result in indexing errors
 2%|▏ | 531/22095 [41:34<44:06:47, 7.36s/it] {'loss': 0.553, 'grad_norm': 0.4978885832627992, 'learning_rate': 7.993966817496231e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 2%|▏ | 532/22095 [41:37<36:32:19, 6.10s/it] {'loss': 0.4927, 'grad_norm': 0.9638003747027917, 'learning_rate': 8.009049773755657e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (42363 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48063 > 40960).
Running this sequence through the model will result in indexing errors 2%|▏ | 533/22095 [41:41<32:19:12, 5.40s/it] {'loss': 0.5389, 'grad_norm': 0.9485547114845543, 'learning_rate': 8.024132730015084e-06, 'epoch': 0.02} 2%|▏ | 533/22095 [41:41<32:19:12, 5.40s/it] 2%|▏ | 534/22095 [41:44<28:19:35, 4.73s/it] {'loss': 0.5218, 'grad_norm': 0.970251325477242, 'learning_rate': 8.03921568627451e-06, 'epoch': 0.02} 2%|▏ | 534/22095 [41:44<28:19:35, 4.73s/it] 2%|▏ | 535/22095 [41:47<24:57:16, 4.17s/it] {'loss': 0.5245, 'grad_norm': 1.066770352433271, 'learning_rate': 8.054298642533938e-06, 'epoch': 0.02} 2%|▏ | 535/22095 [41:47<24:57:16, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 2%|▏ | 536/22095 [41:57<34:56:11, 5.83s/it] {'loss': 0.5851, 'grad_norm': 1.2508405584119022, 'learning_rate': 8.069381598793365e-06, 'epoch': 0.02} 2%|▏ | 536/22095 [41:57<34:56:11, 5.83s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item raise ValueError( ValueError: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None [Try #0] Failed to fetch sample 1096386 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. 
Exception: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None
Problematic sample: {'image': ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'], 'conversations': [{'from': 'human', 'value': "\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nI want to book a hotel in london, prize should be less than $600, guest rating is 8+, 4 star rating, breakfast included\n\nPrevious operations:\nStep 1: Tap on the Chrome app to start searching for hotels in London.\nStep 2: Type 'Chrome' in the search bar to open the Chrome browser.\nStep 3: Tap on the Hotels.com app icon to begin searching for a hotel.\nStep 4: Wait for the Hotels.com app to finish loading to proceed with the hotel search.\nStep 5: Tap on the 'Going to' field to enter London as the destination.\nStep 6: Tap on 'London' from Recent Searches to select it as the destination for the hotel search.\nStep 7: Tap on 'London, England, United Kingdom' to select it as the destination for the hotel search.\nStep 8: Tap on the 'Search' button to view available hotels in London.\nStep 9: Tap on the filter icon or option to apply filters to the hotel search results.\nStep 10: Swipe up on the screen to reveal more filter options, such as guest rating and hotel stars.\nStep 11: Swipe up to reveal more filter options, specifically guest ratings and hotel stars.\nStep 12: Tap on the 4-star rating box to select it.\nStep 13: Tap 'Done' to confirm the current filter selections and proceed.\nStep 14: Swipe up on the screen to locate and select the 'breakfast included' filter option.\nStep 15: Tap on 'Breakfast included' to select it.\nStep 16: Tap on the 'Done' button to apply the filters and view available hotel options.\nStep 17: Tap on the topmost hotel card to view its details.\nStep 18: Tap on the 'Select a room' button to check available room options and their prices.\nStep 19: Tap on 'Select and customize' to proceed with the room selection and customization options.\nStep 20: Tap on 'Breakfast buffet' to select it, then tap on 'Reserve' to book the room."}, {'from': 'gpt', 'value': "\nGiven the sequence of booking actions, it seems the process is almost complete. The loading sign indicates that the system is processing the final booking or confirmation step.\n\n\nWait for the loading to complete to confirm the booking status.\n\n\nterminate(status='success')\n"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952528 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3363, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 8\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
2%|▏ | 537/22095 [42:01<31:57:08, 5.34s/it] {'loss': 0.4804, 'grad_norm': 0.8977303585116398, 'learning_rate': 8.084464555052791e-06, 'epoch': 0.02}
2%|▏ | 538/22095 [42:04<27:16:57, 4.56s/it] {'loss': 0.4385, 'grad_norm': 0.8550682869491487, 'learning_rate': 8.099547511312217e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
2%|▏ | 539/22095 [42:14<37:15:55, 6.22s/it] {'loss': 0.5873, 'grad_norm': 0.5604442567533277, 'learning_rate': 8.114630467571645e-06, 'epoch': 0.02}
2%|▏ | 540/22095 [42:17<32:04:00, 5.36s/it] {'loss': 0.4735, 'grad_norm': 1.0044342170560538, 'learning_rate': 8.129713423831072e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
2%|▏ | 541/22095 [42:27<39:55:48, 6.67s/it] {'loss': 0.5873, 'grad_norm': 0.6410626077944116, 'learning_rate': 8.144796380090498e-06, 'epoch': 0.02}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8949354 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
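The "Image size ... is too small. Minimum size is 28." failures above come from samples whose images fall below the loader's 28-pixel minimum side. A minimal pre-filter sketch, assuming the sample layout printed in the log ('image_wh' as a list of [width, height] pairs); the helper names are illustrative, not taken from data_qwen_2.py:

```python
# Sketch: drop samples whose images are below the 28-pixel minimum that
# data_qwen_2.py enforces at fetch time. Helper names are assumptions.
MIN_IMAGE_SIDE = 28

def is_image_large_enough(image_wh, min_side=MIN_IMAGE_SIDE):
    """Return True if both width and height meet the minimum side length."""
    width, height = image_wh
    return width >= min_side and height >= min_side

def filter_samples(samples, min_side=MIN_IMAGE_SIDE):
    """Keep only samples whose every image passes the size check."""
    kept = []
    for sample in samples:
        sizes = sample.get("image_wh", [])
        if all(is_image_large_enough(wh, min_side) for wh in sizes):
            kept.append(sample)
    return kept
```

Running such a filter once over the manifest would surface the bad geoqa+ entries up front instead of as per-step fetch retries.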
Problematic sample: {'id': 189, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 5cm\nB. 15cm\nC. 16cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
2%|▏ | 542/22095 [42:30<34:23:30, 5.74s/it] {'loss': 0.4954, 'grad_norm': 0.9384899463609325, 'learning_rate': 8.159879336349925e-06, 'epoch': 0.02}
2%|▏ | 543/22095 [42:34<30:06:28, 5.03s/it] {'loss': 0.5302, 'grad_norm': 0.9117160909143945, 'learning_rate': 8.174962292609353e-06, 'epoch': 0.02}
2%|▏ | 544/22095 [42:37<27:16:51, 4.56s/it] {'loss': 0.5089, 'grad_norm': 0.9444687825633534, 'learning_rate': 8.190045248868779e-06, 'epoch': 0.02}
Token indices sequence length is longer than the specified maximum sequence length for this model (44788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68432 > 40960). Running this sequence through the model will result in indexing errors
2%|▏ | 545/22095 [42:40<24:17:44, 4.06s/it] {'loss': 0.4722, 'grad_norm': 0.8911153707648989, 'learning_rate': 8.205128205128205e-06, 'epoch': 0.02}
2%|▏ | 546/22095 [42:43<22:09:59, 3.70s/it] {'loss': 0.5297, 'grad_norm': 0.8783990335001564, 'learning_rate': 8.220211161387632e-06, 'epoch': 0.02}
2%|▏ | 547/22095 [42:46<20:43:35, 3.46s/it] {'loss': 0.5089, 'grad_norm': 0.9749831735866099, 'learning_rate': 8.23529411764706e-06, 'epoch': 0.02}
2%|▏ | 548/22095 [42:49<19:57:05, 3.33s/it] {'loss': 0.4903, 'grad_norm': 0.9534394716345836, 'learning_rate': 8.250377073906486e-06, 'epoch': 0.02}
2%|▏ | 549/22095 [42:51<18:48:51, 3.14s/it] {'loss': 0.5271, 'grad_norm': 0.986230968075295, 'learning_rate': 8.265460030165913e-06, 'epoch': 0.02}
2%|▏ | 550/22095 [42:55<20:00:51, 3.34s/it] {'loss': 0.5049, 'grad_norm': 0.9445929420045507, 'learning_rate': 8.280542986425339e-06, 'epoch': 0.02}
2%|▏ | 551/22095 [43:00<22:07:12, 3.70s/it] {'loss': 0.4706, 'grad_norm': 0.8445435072400197, 'learning_rate': 8.295625942684767e-06, 'epoch': 0.02}
2%|▏ | 552/22095 [43:03<21:53:40, 3.66s/it] {'loss': 0.495, 'grad_norm': 0.8146518680172017, 'learning_rate': 8.310708898944195e-06, 'epoch': 0.02}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 553/22095 [43:10<26:23:27, 4.41s/it] {'loss': 0.5775, 'grad_norm': 1.2530851684753053, 'learning_rate': 8.325791855203621e-06, 'epoch': 0.03}
3%|▎ | 554/22095 [43:16<30:30:55, 5.10s/it] {'loss': 0.6055, 'grad_norm': 0.8546866880086258, 'learning_rate': 8.340874811463048e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (44923 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44151 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43468 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 555/22095 [43:19<26:52:32, 4.49s/it] {'loss': 0.5157, 'grad_norm': 1.1621279479943107, 'learning_rate': 8.355957767722474e-06, 'epoch': 0.03}
3%|▎ | 556/22095 [43:23<24:57:04, 4.17s/it] {'loss': 0.4294, 'grad_norm': 0.9994237527753784, 'learning_rate': 8.371040723981902e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (110542 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62958 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50391 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 557/22095 [43:26<23:00:07, 3.84s/it] {'loss': 0.4898, 'grad_norm': 0.8970428825907083, 'learning_rate': 8.386123680241329e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 558/22095 [43:29<22:28:34, 3.76s/it] {'loss': 0.5023, 'grad_norm': 1.1646020881114376, 'learning_rate': 8.401206636500755e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (47601 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59825 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 559/22095 [43:33<21:41:38, 3.63s/it] {'loss': 0.4947, 'grad_norm': 1.0114259306681477, 'learning_rate': 8.416289592760181e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (52100 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42295 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49008 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60707 > 40960).
Running this sequence through the model will result in indexing errors
3%|▎ | 560/22095 [43:37<22:41:41, 3.79s/it] {'loss': 0.4837, 'grad_norm': 1.0786177518368725, 'learning_rate': 8.43137254901961e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 561/22095 [43:41<23:54:47, 4.00s/it] {'loss': 0.5464, 'grad_norm': 1.1702295424811555, 'learning_rate': 8.446455505279036e-06, 'epoch': 0.03}
3%|▎ | 562/22095 [43:45<22:39:11, 3.79s/it] {'loss': 0.5263, 'grad_norm': 0.9776929278671919, 'learning_rate': 8.461538461538462e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 563/22095 [43:55<34:55:16, 5.84s/it] {'loss': 0.5842, 'grad_norm': 2.158204647935258, 'learning_rate': 8.476621417797888e-06, 'epoch': 0.03}
3%|▎ | 564/22095 [43:59<31:34:20, 5.28s/it] {'loss': 0.5163, 'grad_norm': 1.2324208092362534, 'learning_rate': 8.491704374057317e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 565/22095 [44:09<40:11:38, 6.72s/it] {'loss': 0.5762, 'grad_norm': 1.1914425382161538, 'learning_rate': 8.506787330316743e-06, 'epoch': 0.03}
3%|▎ | 566/22095 [44:13<34:06:23, 5.70s/it] {'loss': 0.4902, 'grad_norm': 0.9005565179098027, 'learning_rate': 8.52187028657617e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55920 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 567/22095 [44:23<41:43:04, 6.98s/it] {'loss': 0.5824, 'grad_norm': 0.8924173594211934, 'learning_rate': 8.536953242835596e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 568/22095 [44:32<45:34:49, 7.62s/it] {'loss': 0.5912, 'grad_norm': 1.0033799904869471, 'learning_rate': 8.552036199095024e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
3%|▎ | 569/22095 [44:35<37:58:33, 6.35s/it] {'loss': 0.5026, 'grad_norm': 1.3324490342886244, 'learning_rate': 8.56711915535445e-06, 'epoch': 0.03}
3%|▎ | 570/22095 [44:39<32:57:01, 5.51s/it] {'loss': 0.4814, 'grad_norm': 1.013502674657888, 'learning_rate': 8.582202111613876e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 571/22095 [44:42<28:56:40, 4.84s/it] {'loss': 0.4891, 'grad_norm': 1.0831743227926793, 'learning_rate': 8.597285067873304e-06, 'epoch': 0.03}
3%|▎ | 572/22095 [44:45<26:27:45, 4.43s/it] {'loss': 0.4562, 'grad_norm': 1.2332349946569197, 'learning_rate': 8.612368024132731e-06, 'epoch': 0.03}
3%|▎ | 573/22095 [44:48<23:30:42, 3.93s/it] {'loss': 0.4895, 'grad_norm': 0.9177188668689421, 'learning_rate': 8.627450980392157e-06, 'epoch': 0.03}
3%|▎ | 574/22095 [44:51<21:53:14, 3.66s/it] {'loss': 0.5196, 'grad_norm': 1.1644222868364908, 'learning_rate': 8.642533936651585e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (115859 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70837 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41218 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53388 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59971 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 575/22095 [44:58<26:47:13, 4.48s/it] {'loss': 0.6046, 'grad_norm': 3.131192676139937, 'learning_rate': 8.657616892911012e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922712 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
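The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings flag samples that exceed the 40960-token context window. A hedged sketch of a skip-or-clip policy over precomputed token lengths; the names below are assumptions for illustration, not the training code's API:

```python
# Sketch: decide what to do with over-long samples before they reach the
# model, using the 40960 limit seen in the log. Names are assumptions.
MAX_SEQ_LEN = 40960

def should_skip(token_len, max_len=MAX_SEQ_LEN):
    """True when a sample would trigger the over-length warning."""
    return token_len > max_len

def truncate_ids(input_ids, max_len=MAX_SEQ_LEN):
    """Alternative to skipping: clip a token id sequence to the window."""
    return input_ids[:max_len]
```

Skipping keeps multimodal samples intact, while truncation risks cutting `<image>` spans mid-sequence, so a skip policy is likely the safer default here.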
Problematic sample: {'id': 45865, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 4\nB. 1\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
3%|▎ | 576/22095 [45:01<24:50:24, 4.16s/it] {'loss': 0.541, 'grad_norm': 1.5353274640022994, 'learning_rate': 8.672699849170438e-06, 'epoch': 0.03}
3%|▎ | 577/22095 [45:05<23:47:46, 3.98s/it] {'loss': 0.5182, 'grad_norm': 1.0917678418917096, 'learning_rate': 8.687782805429864e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (65241 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 578/22095 [45:09<24:14:59, 4.06s/it] {'loss': 0.5366, 'grad_norm': 1.5763631164913774, 'learning_rate': 8.702865761689292e-06, 'epoch': 0.03}
3%|▎ | 579/22095 [45:12<22:21:59, 3.74s/it] {'loss': 0.5253, 'grad_norm': 1.0361965455336255, 'learning_rate': 8.717948717948719e-06, 'epoch': 0.03}
3%|▎ | 580/22095 [45:16<22:23:33, 3.75s/it] {'loss': 0.5001, 'grad_norm': 1.083518580000751, 'learning_rate': 8.733031674208145e-06, 'epoch': 0.03}
3%|▎ | 581/22095 [45:19<21:23:38, 3.58s/it] {'loss': 0.5259, 'grad_norm': 1.155423558488019, 'learning_rate': 8.748114630467572e-06, 'epoch': 0.03}
3%|▎ | 582/22095 [45:22<20:12:14, 3.38s/it] {'loss': 0.5254, 'grad_norm': 1.0320656726255666, 'learning_rate': 8.763197586727e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 583/22095 [45:32<33:07:06, 5.54s/it] {'loss': 0.5735, 'grad_norm': 1.2339557419740619, 'learning_rate': 8.778280542986426e-06, 'epoch': 0.03}
3%|▎ | 584/22095 [45:36<29:33:14, 4.95s/it] {'loss': 0.5164, 'grad_norm': 0.9770656904444553, 'learning_rate': 8.793363499245852e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 585/22095 [45:39<26:00:11, 4.35s/it] {'loss': 0.487, 'grad_norm': 1.2688613297497584, 'learning_rate': 8.808446455505279e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (72059 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75162 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133019 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92047 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 586/22095 [45:49<36:20:50, 6.08s/it] {'loss': 0.5501, 'grad_norm': 1.034534302426547, 'learning_rate': 8.823529411764707e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 587/22095 [45:52<31:31:33, 5.28s/it] {'loss': 0.5342, 'grad_norm': 1.0523407452463565, 'learning_rate': 8.838612368024133e-06, 'epoch': 0.03}
3%|▎ | 588/22095 [45:56<28:07:24, 4.71s/it] {'loss': 0.484, 'grad_norm': 1.0254835996160458, 'learning_rate': 8.85369532428356e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (67201 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48001 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67199 > 40960).
Running this sequence through the model will result in indexing errors
3%|▎ | 589/22095 [45:59<26:25:04, 4.42s/it] {'loss': 0.5375, 'grad_norm': 0.9622678518511545, 'learning_rate': 8.868778280542986e-06, 'epoch': 0.03}
3%|▎ | 590/22095 [46:03<25:02:02, 4.19s/it] {'loss': 0.5067, 'grad_norm': 0.9677691980588815, 'learning_rate': 8.883861236802414e-06, 'epoch': 0.03}
3%|▎ | 591/22095 [46:06<22:36:06, 3.78s/it] {'loss': 0.5187, 'grad_norm': 0.9627736768818012, 'learning_rate': 8.89894419306184e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (58133 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97055 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 592/22095 [46:09<20:50:13, 3.49s/it] {'loss': 0.4758, 'grad_norm': 0.9440199612227524, 'learning_rate': 8.914027149321268e-06, 'epoch': 0.03}
3%|▎ | 593/22095 [46:12<20:19:03, 3.40s/it] {'loss': 0.5087, 'grad_norm': 0.93767786410755, 'learning_rate': 8.929110105580695e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 594/22095 [46:22<33:05:38, 5.54s/it] {'loss': 0.5766, 'grad_norm': 1.1964384917445936, 'learning_rate': 8.944193061840121e-06, 'epoch': 0.03}
3%|▎ | 595/22095 [46:26<28:57:51, 4.85s/it] {'loss': 0.4784, 'grad_norm': 1.0117724924229, 'learning_rate': 8.95927601809955e-06, 'epoch': 0.03}
3%|▎ | 596/22095 [46:29<26:47:49, 4.49s/it] {'loss': 0.4617, 'grad_norm': 0.9925820906569525, 'learning_rate': 8.974358974358976e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (78097 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 597/22095 [46:38<33:45:35, 5.65s/it] {'loss': 0.5879, 'grad_norm': 0.6677243498813, 'learning_rate': 8.989441930618402e-06, 'epoch': 0.03}
3%|▎ | 598/22095 [46:42<30:23:57, 5.09s/it] {'loss': 0.5333, 'grad_norm': 0.9912709074709852, 'learning_rate': 9.004524886877828e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (70249 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45398 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75760 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 599/22095 [46:44<26:19:46, 4.41s/it] {'loss': 0.521, 'grad_norm': 1.0272835433313774, 'learning_rate': 9.019607843137256e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 600/22095 [46:54<34:54:13, 5.85s/it] {'loss': 0.5702, 'grad_norm': 0.683172105248878, 'learning_rate': 9.034690799396683e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 601/22095 [46:58<31:59:20, 5.36s/it] {'loss': 0.4317, 'grad_norm': 0.8123114594776241, 'learning_rate': 9.049773755656109e-06, 'epoch': 0.03}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30493.png 2025-08-27 16:44:53.908798 load time: 1721.98 ms
3%|▎ | 602/22095 [47:01<28:59:17, 4.86s/it] {'loss': 0.4948, 'grad_norm': 0.8680303400075743, 'learning_rate': 9.064856711915535e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [475, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465587 in VC:s3://internvl-moe-sft-data/. Exception: Image size [475, 23, 100, 100] is too small. Minimum size is 28.
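The "Number of image tokens 0 does not match number of images 1 ... Fixed image tokens in the conversation" messages suggest a repair pass that reconciles `<image>` placeholders in the prompt with the attached image list. A sketch of one such repair, assuming a literal `<image>` placeholder string; the function name is hypothetical, not from the training code:

```python
# Sketch: make the <image> placeholder count in a conversation turn match
# the number of attached images, mirroring the "Fixed image tokens"
# behavior the log reports. Placeholder string is an assumption.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversation_text, num_images):
    """Prepend or drop <image> placeholders so their count equals num_images."""
    count = conversation_text.count(IMAGE_TOKEN)
    if count < num_images:
        # Missing placeholders: prepend the shortfall.
        conversation_text = IMAGE_TOKEN * (num_images - count) + conversation_text
    elif count > num_images:
        # Surplus placeholders: remove extras from the front.
        for _ in range(count - num_images):
            conversation_text = conversation_text.replace(IMAGE_TOKEN, "", 1)
    return conversation_text
```

This covers both directions seen in the log: zero tokens with one image, and two tokens with one image.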
Problematic sample: {'id': 15325, 'image': 'vrdu_texteq/astro-ph.CO/3f0a1e10-aa41-4790-933e-95830b1b8c33.png', 'image_wh': [[475, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $V$ is the volume of the wind cell.'}]}
3%|▎ | 603/22095 [47:06<27:50:27, 4.66s/it] {'loss': 0.5081, 'grad_norm': 0.8525579255911381, 'learning_rate': 9.079939668174964e-06, 'epoch': 0.03}
3%|▎ | 604/22095 [47:09<25:56:08, 4.34s/it] {'loss': 0.474, 'grad_norm': 0.8102455407975535, 'learning_rate': 9.09502262443439e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 605/22095 [47:19<35:46:49, 5.99s/it] {'loss': 0.5591, 'grad_norm': 0.8259249514304119, 'learning_rate': 9.110105580693816e-06, 'epoch': 0.03}
3%|▎ | 606/22095 [47:22<30:55:38, 5.18s/it] {'loss': 0.4717, 'grad_norm': 0.8613389960609926, 'learning_rate': 9.125188536953243e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 607/22095 [47:29<32:41:50, 5.48s/it] {'loss': 0.5509, 'grad_norm': 0.6540097401182121, 'learning_rate': 9.14027149321267e-06, 'epoch': 0.03}
3%|▎ | 608/22095 [47:33<30:08:27, 5.05s/it] {'loss': 0.4877, 'grad_norm': 0.9461405945810982, 'learning_rate': 9.155354449472097e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 609/22095 [47:36<28:03:41, 4.70s/it] {'loss': 0.5194, 'grad_norm': 0.9013544304168348, 'learning_rate': 9.170437405731523e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8959459 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10294, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]}
3%|▎ | 610/22095 [47:39<24:51:28, 4.17s/it] {'loss': 0.4829, 'grad_norm': 1.1402918439029173, 'learning_rate': 9.18552036199095e-06, 'epoch': 0.03}
3%|▎ | 611/22095 [47:42<22:31:57, 3.78s/it] {'loss': 0.5196, 'grad_norm': 0.9047977925822974, 'learning_rate': 9.200603318250378e-06, 'epoch': 0.03}
3%|▎ | 612/22095 [47:46<22:24:49, 3.76s/it] {'loss': 0.4846, 'grad_norm': 0.9207106618040918, 'learning_rate': 9.215686274509804e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (75190 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71712 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47338 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 613/22095 [47:50<23:36:08, 3.96s/it] {'loss': 0.46, 'grad_norm': 0.8252106661434595, 'learning_rate': 9.230769230769232e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (66614 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 614/22095 [47:53<21:18:31, 3.57s/it] {'loss': 0.5667, 'grad_norm': 0.9010037438995498, 'learning_rate': 9.245852187028659e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 615/22095 [47:57<21:33:27, 3.61s/it] {'loss': 0.4788, 'grad_norm': 0.8647698992314747, 'learning_rate': 9.260935143288085e-06, 'epoch': 0.03}
3%|▎ | 616/22095 [48:00<21:13:27, 3.56s/it] {'loss': 0.5116, 'grad_norm': 0.9697514771488839, 'learning_rate': 9.276018099547513e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 617/22095 [48:03<19:47:34, 3.32s/it] {'loss': 0.5, 'grad_norm': 0.8975155881110004, 'learning_rate': 9.29110105580694e-06, 'epoch': 0.03}
3%|▎ | 618/22095 [48:07<20:57:01, 3.51s/it] {'loss': 0.4704, 'grad_norm': 0.9331083075291978, 'learning_rate': 9.306184012066366e-06, 'epoch': 0.03}
3%|▎ | 619/22095 [48:10<20:03:12, 3.36s/it] {'loss': 0.466, 'grad_norm': 0.8356134579308436, 'learning_rate': 9.321266968325792e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 620/22095 [48:13<20:09:03, 3.38s/it] {'loss': 0.4672, 'grad_norm': 1.0477143205408235, 'learning_rate': 9.33634992458522e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 621/22095 [48:23<31:50:44, 5.34s/it] {'loss': 0.5469, 'grad_norm': 1.4144678930589356, 'learning_rate': 9.351432880844647e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (43025 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42400 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 622/22095 [48:33<39:24:50, 6.61s/it] {'loss': 0.5242, 'grad_norm': 0.7897408114102736, 'learning_rate': 9.366515837104073e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
3%|▎ | 623/22095 [48:36<33:45:06, 5.66s/it] {'loss': 0.5051, 'grad_norm': 0.9966835571160938, 'learning_rate': 9.3815987933635e-06, 'epoch': 0.03}
3%|▎ | 624/22095 [48:39<29:19:50, 4.92s/it] {'loss': 0.4974, 'grad_norm': 1.0012891418820922, 'learning_rate': 9.396681749622927e-06, 'epoch': 0.03}
3%|▎ | 625/22095 [48:43<25:58:53, 4.36s/it] {'loss': 0.478, 'grad_norm': 0.8643487579115816, 'learning_rate': 9.411764705882354e-06, 'epoch': 0.03}
3%|▎ | 626/22095 [48:47<26:01:12, 4.36s/it] {'loss': 0.4871, 'grad_norm': 1.0272295174288926, 'learning_rate': 9.42684766214178e-06, 'epoch': 0.03}
3%|▎ | 627/22095 [48:51<24:45:25, 4.15s/it] {'loss': 0.4988, 'grad_norm': 0.9497133963033242, 'learning_rate': 9.441930618401207e-06, 'epoch': 0.03}
3%|▎ | 628/22095 [48:54<23:50:59, 4.00s/it] {'loss': 0.5156, 'grad_norm': 0.885996701031616, 'learning_rate': 9.457013574660635e-06, 'epoch': 0.03}
3%|▎ | 629/22095 [48:57<22:00:04, 3.69s/it] {'loss': 0.4292, 'grad_norm': 0.9061066748339962, 'learning_rate': 9.472096530920061e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 630/22095 [49:04<27:36:03, 4.63s/it] {'loss': 0.5426, 'grad_norm': 2.3887004974429993, 'learning_rate': 9.487179487179487e-06, 'epoch': 0.03}
3%|▎ | 631/22095 [49:08<26:59:10, 4.53s/it] {'loss': 0.5036, 'grad_norm': 1.0015617083513035, 'learning_rate': 9.502262443438914e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (78362 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56221 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 632/22095 [49:12<24:50:10, 4.17s/it] {'loss': 0.519, 'grad_norm': 1.1310600352677773, 'learning_rate': 9.517345399698342e-06, 'epoch': 0.03}
3%|▎ | 633/22095 [49:15<23:14:33, 3.90s/it] {'loss': 0.5245, 'grad_norm': 1.0516032606117616, 'learning_rate': 9.53242835595777e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (81778 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 634/22095 [49:18<21:46:07, 3.65s/it] {'loss': 0.4858, 'grad_norm': 0.9627150600636653, 'learning_rate': 9.547511312217196e-06, 'epoch': 0.03}
3%|▎ | 635/22095 [49:22<23:01:07, 3.86s/it] {'loss': 0.5066, 'grad_norm': 0.9908734595681895, 'learning_rate': 9.562594268476623e-06, 'epoch': 0.03}
3%|▎ | 636/22095 [49:27<23:48:12, 3.99s/it] {'loss': 0.4889, 'grad_norm': 0.8558945955157928, 'learning_rate': 9.577677224736049e-06, 'epoch': 0.03}
3%|▎ | 637/22095 [49:30<23:05:12, 3.87s/it] {'loss': 0.4829, 'grad_norm': 1.1896758395015434, 'learning_rate': 9.592760180995477e-06, 'epoch': 0.03}
3%|▎ | 638/22095 [49:33<21:45:47, 3.65s/it] {'loss': 0.5054, 'grad_norm': 1.0550264618476963, 'learning_rate': 9.607843137254903e-06, 'epoch': 0.03}
3%|▎ | 639/22095 [49:36<20:26:43, 3.43s/it] {'loss': 0.5131, 'grad_norm': 0.9297556565489958, 'learning_rate': 9.62292609351433e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 640/22095 [49:46<32:36:14, 5.47s/it] {'loss': 0.5572, 'grad_norm': 1.8024189426984918, 'learning_rate': 9.638009049773756e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [442, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8442317 in VC:s3://internvl-moe-sft-data/. Exception: Image size [442, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 86681, 'image': 'vrdu_texteq/astro-ph.CO/2272086f-826d-4195-8d3c-49f8bd3339dd.png', 'image_wh': [[442, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $\\delta^{3}$ is the Dirac delta function.'}]}
3%|▎ | 641/22095 [49:50<28:35:35, 4.80s/it] {'loss': 0.4433, 'grad_norm': 1.031101602190451, 'learning_rate': 9.653092006033184e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 642/22095 [50:00<38:16:12, 6.42s/it] {'loss': 0.5722, 'grad_norm': 0.9243046569458107, 'learning_rate': 9.66817496229261e-06, 'epoch': 0.03}
3%|▎ | 643/22095 [50:04<33:25:26, 5.61s/it] {'loss': 0.5154, 'grad_norm': 0.9572833331050026, 'learning_rate': 9.683257918552037e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 644/22095 [50:07<29:36:19, 4.97s/it] {'loss': 0.5023, 'grad_norm': 0.9545263635012465, 'learning_rate': 9.698340874811463e-06, 'epoch': 0.03}
3%|▎ | 645/22095 [50:10<26:00:54, 4.37s/it] {'loss': 0.4989, 'grad_norm': 1.1494387901770513, 'learning_rate': 9.713423831070891e-06, 'epoch': 0.03}
3%|▎ | 646/22095 [50:13<23:47:58, 3.99s/it] {'loss': 0.4753, 'grad_norm': 0.811442686170712, 'learning_rate': 9.728506787330318e-06, 'epoch': 0.03}
3%|▎ | 647/22095 [50:17<23:42:53, 3.98s/it] {'loss': 0.4897, 'grad_norm': 0.9594851119019182, 'learning_rate':
9.743589743589744e-06, 'epoch': 0.03} 3%|▎ | 647/22095 [50:17<23:42:53, 3.98s/it] 3%|▎ | 648/22095 [50:21<22:42:47, 3.81s/it] {'loss': 0.5069, 'grad_norm': 0.9577158737270751, 'learning_rate': 9.75867269984917e-06, 'epoch': 0.03} 3%|▎ | 648/22095 [50:21<22:42:47, 3.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51890 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88147 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 649/22095 [50:24<21:25:58, 3.60s/it] {'loss': 0.4707, 'grad_norm': 0.9677059426724289, 'learning_rate': 9.773755656108599e-06, 'epoch': 0.03} 3%|▎ | 649/22095 [50:24<21:25:58, 3.60s/it] 3%|▎ | 650/22095 [50:27<21:10:07, 3.55s/it] {'loss': 0.5233, 'grad_norm': 0.9131491960385424, 'learning_rate': 9.788838612368025e-06, 'epoch': 0.03} 3%|▎ | 650/22095 [50:27<21:10:07, 3.55s/it] 3%|▎ | 651/22095 [50:31<21:48:07, 3.66s/it] {'loss': 0.494, 'grad_norm': 0.8903706265312409, 'learning_rate': 9.803921568627451e-06, 'epoch': 0.03} 3%|▎ | 651/22095 [50:31<21:48:07, 3.66s/it] 3%|▎ | 652/22095 [50:35<22:32:17, 3.78s/it] {'loss': 0.5024, 'grad_norm': 0.8897802621593002, 'learning_rate': 9.819004524886878e-06, 'epoch': 0.03} 3%|▎ | 652/22095 [50:35<22:32:17, 3.78s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8399280 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 1434, 'image': 'vrdu_table_final_2/astro-ph.CO/6a7e6999-6bb0-4b2f-a7a4-48b9833b7bc0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 3%|▎ | 653/22095 [50:39<22:56:19, 3.85s/it] {'loss': 0.4947, 'grad_norm': 0.886489545618575, 'learning_rate': 9.834087481146306e-06, 'epoch': 0.03} 3%|▎ | 653/22095 [50:39<22:56:19, 3.85s/it] 3%|▎ | 654/22095 [50:42<21:37:53, 3.63s/it] {'loss': 0.5074, 'grad_norm': 1.142183417297171, 'learning_rate': 9.849170437405732e-06, 'epoch': 0.03} 3%|▎ | 654/22095 [50:42<21:37:53, 3.63s/it] 3%|▎ | 655/22095 [50:45<20:50:25, 3.50s/it] {'loss': 0.4285, 'grad_norm': 0.7920238350891496, 'learning_rate': 9.86425339366516e-06, 'epoch': 0.03} 3%|▎ | 655/22095 [50:45<20:50:25, 3.50s/it] 3%|▎ | 656/22095 [50:50<22:50:30, 3.84s/it] {'loss': 0.4933, 'grad_norm': 0.7817895238319269, 'learning_rate': 9.879336349924586e-06, 'epoch': 0.03} 3%|▎ | 656/22095 [50:50<22:50:30, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 657/22095 [50:56<26:55:21, 4.52s/it] {'loss': 0.6201, 'grad_norm': 3.3052962392165166, 'learning_rate': 9.894419306184013e-06, 'epoch': 0.03} 3%|▎ | 657/22095 [50:56<26:55:21, 4.52s/it] 3%|▎ | 658/22095 [51:06<35:58:55, 6.04s/it] {'loss': 0.5664, 'grad_norm': 1.8967947995531824, 'learning_rate': 9.90950226244344e-06, 'epoch': 0.03} 3%|▎ | 658/22095 [51:06<35:58:55, 6.04s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 3%|▎ | 659/22095 [51:10<32:09:49, 5.40s/it] {'loss': 0.468, 'grad_norm': 1.016181045275243, 'learning_rate': 9.924585218702867e-06, 'epoch': 0.03} 3%|▎ | 659/22095 [51:10<32:09:49, 5.40s/it] 3%|▎ | 660/22095 [51:13<29:17:03, 4.92s/it] {'loss': 0.5306, 'grad_norm': 
1.0906829653780257, 'learning_rate': 9.939668174962294e-06, 'epoch': 0.03} 3%|▎ | 660/22095 [51:13<29:17:03, 4.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48427 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121182 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 661/22095 [51:17<26:05:03, 4.38s/it] {'loss': 0.4589, 'grad_norm': 1.0567718662960948, 'learning_rate': 9.95475113122172e-06, 'epoch': 0.03} 3%|▎ | 661/22095 [51:17<26:05:03, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58195 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84227 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44741 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109748 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (49227 > 40960) for 4 sample(s). Truncating to 8267 with 3 samples. 
3%|▎ | 662/22095 [51:20<24:26:40, 4.11s/it] {'loss': 0.5076, 'grad_norm': 1.155731681459567, 'learning_rate': 9.969834087481146e-06, 'epoch': 0.03}
3%|▎ | 663/22095 [51:23<22:27:10, 3.77s/it] {'loss': 0.432, 'grad_norm': 0.9147348944501993, 'learning_rate': 9.984917043740574e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (78536 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 664/22095 [51:27<22:41:26, 3.81s/it] {'loss': 0.4823, 'grad_norm': 0.9209779765845878, 'learning_rate': 1e-05, 'epoch': 0.03}
3%|▎ | 665/22095 [51:31<23:08:58, 3.89s/it] {'loss': 0.5029, 'grad_norm': 0.9271797626630768, 'learning_rate': 9.999999946282679e-06, 'epoch': 0.03}
3%|▎ | 666/22095 [51:35<22:48:03, 3.83s/it] {'loss': 0.5299, 'grad_norm': 1.039138585186304, 'learning_rate': 9.999999785130714e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (73694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49607 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57041 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 667/22095 [51:38<22:00:45, 3.70s/it] {'loss': 0.4611, 'grad_norm': 0.969861000408359, 'learning_rate': 9.999999516544111e-06, 'epoch': 0.03}
3%|▎ | 668/22095 [51:41<20:41:40, 3.48s/it] {'loss': 0.5032, 'grad_norm': 1.077682631894601, 'learning_rate': 9.999999140522874e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 669/22095 [51:51<32:28:53, 5.46s/it] {'loss': 0.7728, 'grad_norm': 5.948473688153386, 'learning_rate': 9.999998657067014e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8932784 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55937, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为直线段AB的上点,P点为AC的中点,Q点为BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵点P是AC的中点,点Q是BC的中点,线段AC=8cm,线段BC=4cm,∴CP=4cm,CQ=2cm,∴PQ=4+2=6cm.'}]}
3%|▎ | 670/22095 [52:01<39:49:31, 6.69s/it] {'loss': 0.6763, 'grad_norm': 3.676232499482188, 'learning_rate': 9.999998066176536e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
3%|▎ | 671/22095 [52:04<33:25:05, 5.62s/it] {'loss': 0.5174, 'grad_norm': 1.2775316773570315, 'learning_rate': 9.999997367851456e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (81062 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70914 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54539 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44935 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 672/22095 [52:14<41:02:36, 6.90s/it] {'loss': 0.5926, 'grad_norm': 1.6908085425895913, 'learning_rate': 9.999996562091792e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11237679 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 776, 'image': ['MMMU/Architecture_and_Engineering/test_108_image_1.png', 'MMMU/Architecture_and_Engineering/test_108_image_2.png', 'MMMU/Architecture_and_Engineering/test_108_image_3.png', 'MMMU/Architecture_and_Engineering/test_108_image_4.png', 'MMMU/Architecture_and_Engineering/test_108_image_5.png', 'MMMU/Architecture_and_Engineering/test_108_image_6.png'], 'image_wh': [[174, 169], [188, 172], [197, 21], [194, 51], [194, 42], [193, 45]], 'conversations': [{'from': 'human', 'value': "Image-1: \nImage-2: \nImage-3: \nImage-4: \nImage-5: \nImage-6: \nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. 
You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: Match List I with List II and select the correct answer using the codes given below the lists:\nOptions: ['', '', '', '']"}, {'from': 'gpt', 'value': '### Introduction\n\nThe task involves matching objects related to fencing with their symbolic representations. This exercise is a typical type of question found in exams that test conceptual understanding and application of knowledge in a topic like agricultural sciences, landscape architecture, or property management. By recognizing symbols and their meanings, we can develop an understanding of different types of fencing and their uses.\n\n### Knowledge Points\n\n1. **Types of Fencing**\n \n Fencing serves as a physical barrier intended to demarcate boundaries, provide security, and manage agricultural practices. There are several types of fencing, each with distinct characteristics and applications. The types listed in this exercise are:\n\n - **Hedge Fencing**:\n - **Concept**: A living fence formed by planting bushes or small trees closely together.\n - **Applications**: Used for property boundaries, windbreaks, and wildlife habitats.\n - **Examples**: Common hedges include species like hawthorn or privet. 
These plants grow densely and can be trimmed to maintain the desired height and thickness.\n\n - **Wire Fencing**:\n - **Concept**: A fence constructed using metal wires, which may be barbed or smooth.\n - **Applications**: Commonly used in agriculture to contain livestock, as well as for security purposes.\n - **Examples**: Barbed wire is used in rural settings to deter animals or trespassers, whereas electrified wire fences are often used for livestock control.\n\n - **Pipe Fencing**:\n - **Concept**: Made of vertical and horizontal steel or iron pipes connected to create a strong barrier.\n - **Applications**: Often seen in arenas, farms, and equestrian facilities for defining boundary areas.\n - **Examples**: This type of fencing is favored for its durability and is often found in rodeos or along ranch perimeters.\n\n - **Wood Fencing**:\n - **Concept**: Traditional fencing made of wooden posts and rails.\n - **Applications**: Used in residential, agricultural, and aesthetic landscapes.\n - **Examples**: Picket fences around homes or rail fences in horse pastures. It provides aesthetic value in suburban neighborhoods.\n\n2. **Symbolic Representation in Diagrams**\n\n Understanding and interpreting symbols is crucial in many fields such as cartography, mechanical drawing, and in various standardized tests. Symbols simplify information and make complex ideas easier to communicate at a glance.\n\n - **Hedge Symbol**: Typically a series of arches, representing the natural contour of hedge plants.\n - **Wire Fence Symbol**: Horizontal lines intersected by Xs, indicative of barbed wire crosses.\n - **Pipe Fence Symbol**: Horizontal line with circles or filled dots depicting pipes joint.\n - **Wood Fence Symbol**: Often represented with parallel lines to simulate the wooden rails.\n\n3. **Applications and Context**\n\n Fencing types are intricately linked with their applications based on context, need, and local regulations. 
Here are some considerations:\n\n - **Agriculture**:\n - **Purpose**: Protect crops from wildlife, delineate property, and manage livestock.\n - **Type Choice**: Barbed wire is prevalent due to cost efficiency for large areas.\n\n - **Residential**:\n - **Purpose**: Privacy, decoration, and boundary marking.\n - **Type Choice**: Wood and hedge fences add aesthetic appeal.\n\n - **Commercial/Industrial**:\n - **Purpose**: Security and access control.\n - **Type Choice**: High wire fencing or electrified fences provide necessary deterrence.\n\n4. **Environmental and Maintenance Considerations**\n\n When selecting fencing, evaluating environmental impact and maintenance requirements is essential:\n\n - **Hedge Fencing**:\n - **Environmental Impact**: Encourages biodiversity by providing habitats; requires regular pruning for shape and size control.\n - **Maintenance**: Periodic trimming and potential planting to fill gaps.\n\n - **Wire Fencing**:\n - **Environmental Impact**: Minimal; may pose risk to wildlife if not monitored.\n - **Maintenance**: Occasional tightening and rust prevention.\n\n - **Pipe Fencing**:\n - **Environmental Impact**: Durable with minimal upkeep needed.\n - **Maintenance**: Inspections for loosening joints and repainting to prevent corrosion.\n\n - **Wood Fencing**:\n - **Environmental Impact**: Possible use of sustainably sourced timber reduces the carbon footprint.\n - **Maintenance**: Frequent inspections for rot, termite infestation, and repainting or staining as required.\n\n5. 
**Cultural and Historical Context**\n\n Fencing has not only practical but also cultural implications throughout history:\n\n - **Historical Significance**: The invention and widespread use of barbed wire in the late 19th century revolutionized agriculture, enabling the development of the American West.\n - **Cultural Traditions**: In many cultures, fences symbolize privacy and protection, serving as a defense against potential outsiders.\n - **Aesthetic Values**: Hedge and wood fences are integral to the visual and architectural landscape in various regions, reflecting tradition and local craftsmanship.\n\n### Conclusion\n\nMastering the ability to correctly pair objects with their symbolic representations requires an understanding of both their functional roles and visual indicators. By comprehending the nuances of each fencing type and the contexts in which they are employed, we gain insights into broader topics of environmental management, property delineation, and cultural practices. Such knowledge proves invaluable in professional fields ranging from landscape design to agricultural planning, contributing to informed decision-making in both rural and urban settings. Understanding these constructs facilitates a deeper appreciation for how we organize and interact with our environments, effectively merging utility with symbolism in the practical world.'}]}
3%|▎ | 673/22095 [52:18<36:07:15, 6.07s/it] {'loss': 0.4605, 'grad_norm': 1.2246977137958381, 'learning_rate': 9.999995648897555e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (76051 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46684 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48317 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51248 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80457 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 674/22095 [52:21<31:39:19, 5.32s/it] {'loss': 0.5713, 'grad_norm': 1.3528227854220647, 'learning_rate': 9.99999462826877e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 675/22095 [52:32<41:11:09, 6.92s/it] {'loss': 0.651, 'grad_norm': 3.362052766446197, 'learning_rate': 9.999993500205456e-06, 'epoch': 0.03}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302490 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1MDqqdgDD8KJjy0FdXXcjvXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n识别图片中的文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n加宽果冻轮\n一秒折叠\n加宽四轮闪光'}]}
3%|▎ | 676/22095 [52:36<35:35:34, 5.98s/it] {'loss': 0.5178, 'grad_norm': 0.958564212895757, 'learning_rate': 9.999992264707636e-06, 'epoch': 0.03}
3%|▎ | 677/22095 [52:39<30:26:48, 5.12s/it] {'loss': 0.5313, 'grad_norm': 1.0395509750067198, 'learning_rate': 9.999990921775341e-06, 'epoch': 0.03}
3%|▎ | 678/22095 [52:43<28:46:01, 4.84s/it] {'loss': 0.5664, 'grad_norm': 1.120139810370774, 'learning_rate': 9.999989471408598e-06, 'epoch': 0.03}
3%|▎ | 679/22095 [52:47<27:40:04, 4.65s/it] {'loss': 0.4841, 'grad_norm': 1.032136947745328, 'learning_rate': 9.999987913607437e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (58942 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83325 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 680/22095 [52:57<35:55:56, 6.04s/it] {'loss': 0.6585, 'grad_norm': 2.467769359435823, 'learning_rate': 9.999986248371889e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (112535 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47316 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114850 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 681/22095 [53:00<31:31:06, 5.30s/it] {'loss': 0.5235, 'grad_norm': 1.0114156966604324, 'learning_rate': 9.999984475701996e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76658 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80104 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44929 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121838 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 682/22095 [53:07<34:28:05, 5.79s/it] {'loss': 0.6423, 'grad_norm': 2.1678025103559753, 'learning_rate': 9.999982595597793e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (80893 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 683/22095 [53:10<29:28:15, 4.95s/it] {'loss': 0.4334, 'grad_norm': 1.0222023327367475, 'learning_rate': 9.99998060805932e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (128055 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 684/22095 [53:13<26:30:45, 4.46s/it] {'loss': 0.5546, 'grad_norm': 1.0264965752501367, 'learning_rate': 9.999978513086617e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
3%|▎ | 685/22095 [53:22<34:04:24, 5.73s/it] {'loss': 0.5974, 'grad_norm': 1.2861062861613923, 'learning_rate': 9.999976310679735e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 686/22095 [53:26<30:42:50, 5.16s/it] {'loss': 0.4734, 'grad_norm': 0.9369415311783983, 'learning_rate': 9.999974000838716e-06, 'epoch': 0.03}
3%|▎ | 687/22095 [53:30<28:55:41, 4.86s/it] {'loss': 0.5228, 'grad_norm': 0.9003760578237145, 'learning_rate': 9.999971583563615e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8895850 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19003, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nA. 8cm\nB. 1lcm\nC. 13cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 688/22095 [53:37<32:20:01, 5.44s/it] {'loss': 0.5705, 'grad_norm': 0.79968319184062, 'learning_rate': 9.99996905885448e-06, 'epoch': 0.03}
3%|▎ | 689/22095 [53:46<39:20:21, 6.62s/it] {'loss': 0.5749, 'grad_norm': 0.7660832635821488, 'learning_rate': 9.999966426711364e-06, 'epoch': 0.03}
3%|▎ | 690/22095 [53:56<44:50:02, 7.54s/it] {'loss': 0.5822, 'grad_norm': 0.6958497061532944, 'learning_rate': 9.99996368713433e-06, 'epoch': 0.03}
3%|▎ | 691/22095 [54:05<46:57:11, 7.90s/it] {'loss': 0.6022, 'grad_norm': 0.8745363662409338, 'learning_rate': 9.999960840123428e-06, 'epoch': 0.03}
Invalidate trace cache @ step 2: expected module 364, but got module 1
3%|▎ | 692/22095 [54:09<40:23:22, 6.79s/it] {'loss': 0.4984, 'grad_norm': 1.3696394690577367, 'learning_rate': 9.999957885678725e-06, 'epoch': 0.03}
3%|▎ | 693/22095 [54:13<34:53:11, 5.87s/it] {'loss': 0.5122, 'grad_norm': 1.0403799882410576, 'learning_rate': 9.999954823800287e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 694/22095 [54:17<31:25:58, 5.29s/it] {'loss': 0.509, 'grad_norm': 1.2900716363078422, 'learning_rate': 9.99995165448817e-06, 'epoch': 0.03}
3%|▎ | 695/22095 [54:20<28:32:16, 4.80s/it] {'loss': 0.5161, 'grad_norm': 1.0807483041647399, 'learning_rate': 9.999948377742453e-06, 'epoch': 0.03}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
3%|▎ | 696/22095 [54:24<27:00:10, 4.54s/it] {'loss': 0.5443, 'grad_norm': 1.0028670752550366, 'learning_rate': 9.9999449935632e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (43340 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 697/22095 [54:29<26:49:34, 4.51s/it] {'loss': 0.4537, 'grad_norm': 0.7870478530335979, 'learning_rate': 9.999941501950484e-06, 'epoch': 0.03}
3%|▎ | 698/22095 [54:32<25:36:46, 4.31s/it] {'loss': 0.4937, 'grad_norm': 1.0264232534437348, 'learning_rate': 9.999937902904382e-06, 'epoch': 0.03}
3%|▎ | 699/22095 [54:36<23:39:16, 3.98s/it] {'loss': 0.459, 'grad_norm': 1.0829604759779226, 'learning_rate': 9.999934196424972e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (57395 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60479 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55213 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 700/22095 [54:39<22:34:47, 3.80s/it] {'loss': 0.5105, 'grad_norm': 0.966309088508198, 'learning_rate': 9.999930382512331e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (50755 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42013 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (138003 > 40960). Running this sequence through the model will result in indexing errors
3%|▎ | 701/22095 [54:42<21:13:33, 3.57s/it] {'loss': 0.4671, 'grad_norm': 1.0366245263100786, 'learning_rate': 9.999926461166541e-06, 'epoch': 0.03}
3%|▎ | 702/22095 [54:46<20:57:16, 3.53s/it] {'loss': 0.4912, 'grad_norm': 1.2234134756053456, 'learning_rate': 9.99992243238769e-06, 'epoch': 0.03}
Token indices sequence length is longer than the specified maximum sequence length for this model (69644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85308 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51300 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79924 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 703/22095 [54:48<19:37:58, 3.30s/it] {'loss': 0.5133, 'grad_norm': 0.9590494692266611, 'learning_rate': 9.99991829617586e-06, 'epoch': 0.03} 3%|▎ | 703/22095 [54:48<19:37:58, 3.30s/it] 3%|▎ | 704/22095 [54:52<19:51:01, 3.34s/it] {'loss': 0.5341, 'grad_norm': 1.3428036136373722, 'learning_rate': 9.999914052531143e-06, 'epoch': 0.03} 3%|▎ | 704/22095 [54:52<19:51:01, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 705/22095 [55:00<28:39:13, 4.82s/it] {'loss': 0.5954, 'grad_norm': 1.4649979468611838, 'learning_rate': 9.999909701453629e-06, 'epoch': 0.03} 3%|▎ | 705/22095 [55:00<28:39:13, 4.82s/it] 3%|▎ | 706/22095 [55:03<26:01:05, 4.38s/it] {'loss': 0.5009, 'grad_norm': 1.4747032246725622, 'learning_rate': 9.99990524294341e-06, 'epoch': 0.03} 3%|▎ | 706/22095 [55:03<26:01:05, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80328 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43063 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63317 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 707/22095 [55:07<25:23:54, 4.28s/it] {'loss': 0.5205, 'grad_norm': 0.8628686083875655, 'learning_rate': 9.999900677000584e-06, 'epoch': 0.03} 3%|▎ | 707/22095 [55:07<25:23:54, 4.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (77572 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60928 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120562 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92550 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 708/22095 [55:17<34:57:26, 5.88s/it] {'loss': 0.5734, 'grad_norm': 0.7491389733083518, 'learning_rate': 9.99989600362525e-06, 'epoch': 0.03} 3%|▎ | 708/22095 [55:17<34:57:26, 5.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65622 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 709/22095 [55:27<42:41:16, 7.19s/it] {'loss': 0.5635, 'grad_norm': 0.5221698127186557, 'learning_rate': 9.999891222817507e-06, 'epoch': 0.03} 3%|▎ | 709/22095 [55:27<42:41:16, 7.19s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (55923 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106705 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 710/22095 [55:31<36:31:32, 6.15s/it] {'loss': 0.5399, 'grad_norm': 1.759719976659955, 'learning_rate': 9.999886334577456e-06, 'epoch': 0.03} 3%|▎ | 710/22095 [55:31<36:31:32, 6.15s/it] 3%|▎ | 711/22095 [55:41<44:05:03, 7.42s/it] {'loss': 0.5543, 'grad_norm': 0.616650168587763, 'learning_rate': 9.999881338905204e-06, 'epoch': 0.03} 3%|▎ | 711/22095 [55:41<44:05:03, 7.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 3%|▎ | 712/22095 [55:45<36:28:33, 6.14s/it] {'loss': 0.4935, 'grad_norm': 0.9961499565001543, 'learning_rate': 9.999876235800859e-06, 'epoch': 0.03} 3%|▎ | 712/22095 [55:45<36:28:33, 6.14s/it] 3%|▎ | 713/22095 [55:48<31:08:17, 5.24s/it] {'loss': 0.4725, 'grad_norm': 1.0072414232661828, 'learning_rate': 9.999871025264528e-06, 'epoch': 0.03} 3%|▎ | 713/22095 [55:48<31:08:17, 5.24s/it] 3%|▎ | 714/22095 [55:52<28:49:31, 4.85s/it] {'loss': 0.5199, 'grad_norm': 1.0472179363754566, 'learning_rate': 9.999865707296326e-06, 'epoch': 0.03} 3%|▎ | 714/22095 [55:52<28:49:31, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 715/22095 [55:58<32:23:48, 5.46s/it] {'loss': 0.5595, 'grad_norm': 0.8728295602150761, 'learning_rate': 9.999860281896366e-06, 'epoch': 0.03} 3%|▎ | 715/22095 [55:58<32:23:48, 5.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52403 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49260 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61871 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68404 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41034 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90406 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 716/22095 [56:08<39:39:49, 6.68s/it] {'loss': 0.5402, 'grad_norm': 0.8090690179012936, 'learning_rate': 9.999854749064764e-06, 'epoch': 0.03} 3%|▎ | 716/22095 [56:08<39:39:49, 6.68s/it] 3%|▎ | 717/22095 [56:15<39:45:12, 6.69s/it] {'loss': 0.5795, 'grad_norm': 0.6068390919302309, 'learning_rate': 9.999849108801637e-06, 'epoch': 0.03} 3%|▎ | 717/22095 [56:15<39:45:12, 6.69s/it] 3%|▎ | 718/22095 [56:25<46:12:27, 7.78s/it] {'loss': 0.5798, 'grad_norm': 0.5094045186994052, 'learning_rate': 9.999843361107111e-06, 'epoch': 0.03} 3%|▎ | 718/22095 [56:25<46:12:27, 7.78s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 3%|▎ | 719/22095 [56:29<40:08:10, 6.76s/it] {'loss': 0.5546, 'grad_norm': 1.3118168143611104, 'learning_rate': 9.999837505981308e-06, 'epoch': 0.03} 3%|▎ | 719/22095 [56:29<40:08:10, 6.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46665 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45557 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66583 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91143 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 720/22095 [56:33<34:20:36, 5.78s/it] {'loss': 0.5243, 'grad_norm': 1.0749377351159282, 'learning_rate': 9.99983154342435e-06, 'epoch': 0.03} 3%|▎ | 720/22095 [56:33<34:20:36, 5.78s/it] 3%|▎ | 721/22095 [56:37<30:51:02, 5.20s/it] {'loss': 0.4979, 'grad_norm': 1.4815041547299679, 'learning_rate': 9.99982547343637e-06, 'epoch': 0.03} 3%|▎ | 721/22095 [56:37<30:51:02, 5.20s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047663 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 8cm\nB. 10cm\nC. 16cm\nD. 
4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 3%|▎ | 722/22095 [56:41<28:24:38, 4.79s/it] {'loss': 0.4916, 'grad_norm': 1.1919403721482398, 'learning_rate': 9.999819296017496e-06, 'epoch': 0.03} 3%|▎ | 722/22095 [56:41<28:24:38, 4.79s/it] 3%|▎ | 723/22095 [56:43<24:54:24, 4.20s/it] {'loss': 0.4929, 'grad_norm': 0.9116590492540936, 'learning_rate': 9.999813011167861e-06, 'epoch': 0.03} 3%|▎ | 723/22095 [56:43<24:54:24, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 724/22095 [56:53<35:17:53, 5.95s/it] {'loss': 0.5855, 'grad_norm': 1.2629616711327594, 'learning_rate': 9.9998066188876e-06, 'epoch': 0.03} 3%|▎ | 724/22095 [56:53<35:17:53, 5.95s/it] 3%|▎ | 725/22095 [56:57<31:24:09, 5.29s/it] {'loss': 0.5008, 'grad_norm': 0.9248013179371635, 'learning_rate': 9.99980011917685e-06, 'epoch': 0.03} 3%|▎ | 725/22095 [56:57<31:24:09, 5.29s/it] 3%|▎ | 726/22095 [57:01<27:59:42, 4.72s/it] {'loss': 0.4596, 'grad_norm': 1.0678999576310209, 'learning_rate': 9.999793512035751e-06, 'epoch': 0.03} 3%|▎ | 726/22095 [57:01<27:59:42, 4.72s/it] 3%|▎ | 727/22095 [57:04<25:09:26, 4.24s/it] {'loss': 0.5499, 'grad_norm': 0.8909248268182276, 'learning_rate': 9.999786797464446e-06, 'epoch': 0.03} 3%|▎ | 727/22095 [57:04<25:09:26, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 728/22095 [57:12<31:55:57, 5.38s/it] {'loss': 0.5702, 'grad_norm': 0.7216813737007891, 'learning_rate': 9.999779975463079e-06, 'epoch': 0.03} 3%|▎ | 728/22095 [57:12<31:55:57, 5.38s/it] 3%|▎ | 729/22095 [57:16<29:26:43, 4.96s/it] {'loss': 0.5352, 'grad_norm': 1.0018022162445315, 'learning_rate': 9.999773046031795e-06, 'epoch': 0.03} 3%|▎ | 729/22095 [57:16<29:26:43, 4.96s/it] 3%|▎ | 730/22095 [57:19<26:23:05, 4.45s/it] {'loss': 0.4565, 'grad_norm': 0.919647573300584, 'learning_rate': 9.999766009170743e-06, 'epoch': 0.03} 3%|▎ | 730/22095 [57:19<26:23:05, 4.45s/it]Token indices sequence 
length is longer than the specified maximum sequence length for this model (54107 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106027 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58162 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78262 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 731/22095 [57:23<25:05:24, 4.23s/it] {'loss': 0.4837, 'grad_norm': 0.8796753543539122, 'learning_rate': 9.999758864880078e-06, 'epoch': 0.03} 3%|▎ | 731/22095 [57:23<25:05:24, 4.23s/it] 3%|▎ | 732/22095 [57:27<25:32:08, 4.30s/it] {'loss': 0.4907, 'grad_norm': 0.849129211511892, 'learning_rate': 9.999751613159947e-06, 'epoch': 0.03} 3%|▎ | 732/22095 [57:27<25:32:08, 4.30s/it] 3%|▎ | 733/22095 [57:31<24:04:19, 4.06s/it] {'loss': 0.5033, 'grad_norm': 0.8562079088711478, 'learning_rate': 9.99974425401051e-06, 'epoch': 0.03} 3%|▎ | 733/22095 [57:31<24:04:19, 4.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 3%|▎ | 734/22095 [57:34<22:23:00, 3.77s/it] {'loss': 0.4548, 'grad_norm': 0.8722424149204497, 'learning_rate': 9.999736787431927e-06, 'epoch': 0.03} 3%|▎ | 734/22095 [57:34<22:23:00, 3.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65673 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45080 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88563 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70441 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67096 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 735/22095 [57:37<20:47:52, 3.51s/it] {'loss': 0.4592, 'grad_norm': 0.8075078284399309, 'learning_rate': 9.999729213424355e-06, 'epoch': 0.03} 3%|▎ | 735/22095 [57:37<20:47:52, 3.51s/it] 3%|▎ | 736/22095 [57:39<19:31:07, 3.29s/it] {'loss': 0.4876, 'grad_norm': 0.9474371118555418, 'learning_rate': 9.999721531987958e-06, 'epoch': 0.03} 3%|▎ | 736/22095 [57:39<19:31:07, 3.29s/it] 3%|▎ | 737/22095 [57:42<18:38:33, 3.14s/it] {'loss': 0.5161, 'grad_norm': 0.9740697650234126, 'learning_rate': 9.999713743122898e-06, 'epoch': 0.03} 3%|▎ | 737/22095 [57:42<18:38:33, 3.14s/it] 3%|▎ | 738/22095 [57:45<18:14:02, 3.07s/it] {'loss': 0.4738, 'grad_norm': 0.7762903958059998, 'learning_rate': 9.999705846829348e-06, 'epoch': 0.03} 3%|▎ | 738/22095 [57:45<18:14:02, 3.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 739/22095 [57:55<30:39:19, 5.17s/it] {'loss': 0.5282, 'grad_norm': 0.6913393805537109, 'learning_rate': 9.999697843107475e-06, 'epoch': 0.03} 3%|▎ | 739/22095 [57:55<30:39:19, 5.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 3%|▎ | 740/22095 [57:59<27:22:26, 4.61s/it] {'loss': 0.4951, 'grad_norm': 0.9631280627628708, 'learning_rate': 9.99968973195745e-06, 'epoch': 0.03} 3%|▎ | 740/22095 [57:59<27:22:26, 4.61s/it] 3%|▎ | 741/22095 
[58:01<24:14:12, 4.09s/it] {'loss': 0.496, 'grad_norm': 1.1401384416707492, 'learning_rate': 9.999681513379447e-06, 'epoch': 0.03} 3%|▎ | 741/22095 [58:01<24:14:12, 4.09s/it] 3%|▎ | 742/22095 [58:05<22:45:19, 3.84s/it] {'loss': 0.4793, 'grad_norm': 0.8209194165070912, 'learning_rate': 9.999673187373644e-06, 'epoch': 0.03} 3%|▎ | 742/22095 [58:05<22:45:19, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (48237 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 743/22095 [58:15<33:44:50, 5.69s/it] {'loss': 0.5692, 'grad_norm': 0.43857969694489585, 'learning_rate': 9.99966475394022e-06, 'epoch': 0.03} 3%|▎ | 743/22095 [58:15<33:44:50, 5.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89828 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77674 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 744/22095 [58:18<29:04:53, 4.90s/it] {'loss': 0.4542, 'grad_norm': 1.0790565795223694, 'learning_rate': 9.999656213079356e-06, 'epoch': 0.03} 3%|▎ | 744/22095 [58:18<29:04:53, 4.90s/it] 3%|▎ | 745/22095 [58:22<27:27:58, 4.63s/it] {'loss': 0.4916, 'grad_norm': 0.8621474937716063, 'learning_rate': 9.999647564791234e-06, 'epoch': 0.03} 3%|▎ | 745/22095 [58:22<27:27:58, 4.63s/it] 3%|▎ | 746/22095 [58:25<25:39:05, 4.33s/it] {'loss': 0.4985, 'grad_norm': 0.8492549723137861, 'learning_rate': 9.999638809076043e-06, 'epoch': 0.03} 3%|▎ | 746/22095 [58:25<25:39:05, 4.33s/it] 3%|▎ | 747/22095 [58:28<23:34:08, 3.97s/it] {'loss': 0.5065, 'grad_norm': 1.0321269679631884, 'learning_rate': 9.999629945933967e-06, 'epoch': 0.03} 3%|▎ | 747/22095 [58:28<23:34:08, 3.97s/it] 3%|▎ | 748/22095 [58:32<23:30:10, 3.96s/it] {'loss': 0.5104, 'grad_norm': 1.0591850566806524, 'learning_rate': 9.9996209753652e-06, 'epoch': 0.03} 3%|▎ | 748/22095 [58:32<23:30:10, 3.96s/it] 3%|▎ | 749/22095 [58:36<23:34:16, 3.98s/it] {'loss': 0.5013, 'grad_norm': 0.9046110188284266, 'learning_rate': 9.999611897369933e-06, 'epoch': 0.03} 3%|▎ | 749/22095 [58:36<23:34:16, 3.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8914853 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 38006, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 5\nB. 6\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 3%|▎ | 750/22095 [58:39<21:51:35, 3.69s/it] {'loss': 0.4607, 'grad_norm': 0.8114624134750734, 'learning_rate': 9.999602711948362e-06, 'epoch': 0.03} 3%|▎ | 750/22095 [58:39<21:51:35, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 751/22095 [58:50<34:07:27, 5.76s/it] {'loss': 0.5433, 'grad_norm': 0.4443264274419598, 'learning_rate': 9.999593419100683e-06, 'epoch': 0.03} 3%|▎ | 751/22095 [58:50<34:07:27, 5.76s/it] 3%|▎ | 752/22095 [59:00<41:36:53, 7.02s/it] {'loss': 0.5734, 'grad_norm': 0.43502422992640793, 'learning_rate': 9.999584018827097e-06, 'epoch': 0.03} 3%|▎ | 752/22095 [59:00<41:36:53, 7.02s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (102853 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62439 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 753/22095 [59:04<36:12:18, 6.11s/it] {'loss': 0.5093, 'grad_norm': 1.2444799823008759, 'learning_rate': 9.999574511127806e-06, 'epoch': 0.03} 3%|▎ | 753/22095 [59:04<36:12:18, 6.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 3%|▎ | 754/22095 [59:08<32:12:44, 5.43s/it] {'loss': 0.5308, 'grad_norm': 1.0738710755456755, 'learning_rate': 9.999564896003013e-06, 'epoch': 0.03} 3%|▎ | 754/22095 [59:08<32:12:44, 5.43s/it] 3%|▎ | 755/22095 [59:11<28:09:55, 4.75s/it] {'loss': 0.5181, 'grad_norm': 0.8692108687564998, 'learning_rate': 9.999555173452925e-06, 'epoch': 0.03} 3%|▎ | 755/22095 [59:11<28:09:55, 4.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047795 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 3\nB. 10\nC. 5\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 3%|▎ | 756/22095 [59:14<25:29:12, 4.30s/it] {'loss': 0.4986, 'grad_norm': 0.9737398693974376, 'learning_rate': 9.999545343477752e-06, 'epoch': 0.03} 3%|▎ | 756/22095 [59:14<25:29:12, 4.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [606, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8422818 in VC:s3://internvl-moe-sft-data/. Exception: Image size [606, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8317, 'image': 'vrdu_texteq/astro-ph.CO/62acebc0-d27a-4b5e-87cc-6b36d5362afb.png', 'image_wh': [[606, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'where $\\theta_{E}$ is the critical Einstein radius defined as:'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [870, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8480207 in VC:s3://internvl-moe-sft-data/. Exception: Image size [870, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 127314, 'image': 'vrdu_texteq/astro-ph.CO/8c3765ed-c0d7-4c2b-a91e-9269ad49657a.png', 'image_wh': [[870, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'where we have used $m\\simeq 0.511 \\mbox{MeV}=h\\nu$ with $h=4.135\n\\times 10^{-15}\\mbox{eV}\\times \\mbox{s}$.'}]} 3%|▎ | 757/22095 [59:17<22:55:52, 3.87s/it] {'loss': 0.5009, 'grad_norm': 0.9555525860038144, 'learning_rate': 9.999535406077706e-06, 'epoch': 0.03} 3%|▎ | 757/22095 [59:17<22:55:52, 3.87s/it] 3%|▎ | 758/22095 [59:20<21:54:29, 3.70s/it] {'loss': 0.4735, 'grad_norm': 0.8457483843791993, 'learning_rate': 9.999525361252996e-06, 'epoch': 0.03} 3%|▎ | 758/22095 [59:20<21:54:29, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41514 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58927 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55657 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 759/22095 [59:23<20:07:37, 3.40s/it] {'loss': 0.459, 'grad_norm': 0.9139228689176659, 'learning_rate': 9.999515209003842e-06, 'epoch': 0.03} 3%|▎ | 759/22095 [59:23<20:07:37, 3.40s/it] 3%|▎ | 760/22095 [59:27<21:13:41, 3.58s/it] {'loss': 0.477, 'grad_norm': 1.0397618730118945, 'learning_rate': 9.99950494933046e-06, 'epoch': 0.03} 3%|▎ | 760/22095 [59:27<21:13:41, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 3%|▎ | 761/22095 [59:36<30:54:30, 5.22s/it] {'loss': 0.5614, 'grad_norm': 0.6038324799341142, 'learning_rate': 9.999494582233074e-06, 'epoch': 0.03} 3%|▎ | 761/22095 [59:36<30:54:30, 5.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 3%|▎ | 762/22095 [59:41<29:44:41, 5.02s/it] {'loss': 0.4982, 'grad_norm': 0.8974012894329598, 'learning_rate': 9.999484107711904e-06, 'epoch': 0.03} 3%|▎ | 762/22095 [59:41<29:44:41, 5.02s/it] 3%|▎ | 763/22095 [59:44<26:12:16, 4.42s/it] {'loss': 0.4917, 'grad_norm': 1.0877263021222965, 'learning_rate': 9.999473525767173e-06, 'epoch': 0.03} 3%|▎ | 763/22095 [59:44<26:12:16, 4.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 3%|▎ | 764/22095 [59:47<23:35:41, 3.98s/it] {'loss': 0.4959, 'grad_norm': 0.7823596745969484, 'learning_rate': 9.999462836399112e-06, 'epoch': 0.03} 3%|▎ | 764/22095 [59:47<23:35:41, 3.98s/it] 3%|▎ | 765/22095 [59:50<21:55:58, 3.70s/it] {'loss': 0.4362, 'grad_norm': 0.8617310112728173, 'learning_rate': 9.999452039607948e-06, 'epoch': 0.03} 3%|▎ | 765/22095 [59:50<21:55:58, 3.70s/it] 3%|▎ | 766/22095 [59:53<21:54:31, 3.70s/it] {'loss': 0.462, 'grad_norm': 0.8658866389167234, 'learning_rate': 9.999441135393917e-06, 'epoch': 0.03} 3%|▎ | 766/22095 [59:53<21:54:31, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence 
length is longer than the specified maximum sequence length for this model (98382 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103182 > 40960). Running this sequence through the model will result in indexing errors 3%|▎ | 767/22095 [59:59<26:08:03, 4.41s/it] {'loss': 0.5354, 'grad_norm': 0.41883019477992395, 'learning_rate': 9.99943012375725e-06, 'epoch': 0.03} 3%|▎ | 767/22095 [59:59<26:08:03, 4.41s/it] 3%|▎ | 768/22095 [1:00:03<24:33:19, 4.14s/it] {'loss': 0.4495, 'grad_norm': 0.813158518464342, 'learning_rate': 9.999419004698182e-06, 'epoch': 0.03} 3%|▎ | 768/22095 [1:00:03<24:33:19, 4.14s/it] 3%|▎ | 769/22095 [1:00:07<24:13:35, 4.09s/it] {'loss': 0.4945, 'grad_norm': 0.8880549031334604, 'learning_rate': 9.999407778216957e-06, 'epoch': 0.03} 3%|▎ | 769/22095 [1:00:07<24:13:35, 4.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42226 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81639 > 40960). 
Running this sequence through the model will result in indexing errors 3%|▎ | 770/22095 [1:00:10<23:09:45, 3.91s/it] {'loss': 0.5068, 'grad_norm': 0.9003853192572606, 'learning_rate': 9.999396444313811e-06, 'epoch': 0.03} 3%|▎ | 770/22095 [1:00:10<23:09:45, 3.91s/it] 3%|▎ | 771/22095 [1:00:14<23:16:36, 3.93s/it] {'loss': 0.4964, 'grad_norm': 0.8752619001869574, 'learning_rate': 9.99938500298899e-06, 'epoch': 0.03} 3%|▎ | 771/22095 [1:00:14<23:16:36, 3.93s/it] 3%|▎ | 772/22095 [1:00:17<21:25:48, 3.62s/it] {'loss': 0.4564, 'grad_norm': 0.8271100830472271, 'learning_rate': 9.99937345424274e-06, 'epoch': 0.03} 3%|▎ | 772/22095 [1:00:17<21:25:48, 3.62s/it] 3%|▎ | 773/22095 [1:00:21<21:54:22, 3.70s/it] {'loss': 0.4577, 'grad_norm': 0.7203352211017896, 'learning_rate': 9.99936179807531e-06, 'epoch': 0.03} 3%|▎ | 773/22095 [1:00:21<21:54:22, 3.70s/it] 4%|▎ | 774/22095 [1:00:25<21:34:14, 3.64s/it] {'loss': 0.4508, 'grad_norm': 0.8500740344609287, 'learning_rate': 9.999350034486948e-06, 'epoch': 0.04} 4%|▎ | 774/22095 [1:00:25<21:34:14, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 4%|▎ | 775/22095 [1:00:34<31:52:03, 5.38s/it] {'loss': 0.5338, 'grad_norm': 0.39547626177355233, 'learning_rate': 9.99933816347791e-06, 'epoch': 0.04} 4%|▎ | 775/22095 [1:00:34<31:52:03, 5.38s/it] 4%|▎ | 776/22095 [1:00:38<28:20:13, 4.79s/it] {'loss': 0.5512, 'grad_norm': 0.9311817658265301, 'learning_rate': 9.999326185048447e-06, 'epoch': 0.04} 4%|▎ | 776/22095 [1:00:38<28:20:13, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66298 > 40960). 
Running this sequence through the model will result in indexing errors
4%|▎ | 777/22095 [1:00:42<27:38:14, 4.67s/it] {'loss': 0.5399, 'grad_norm': 0.8800233195349324, 'learning_rate': 9.99931409919882e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ | 778/22095 [1:00:51<35:56:24, 6.07s/it] {'loss': 0.5335, 'grad_norm': 0.3313894794355621, 'learning_rate': 9.999301905929286e-06, 'epoch': 0.04}
4%|▎ | 779/22095 [1:00:54<30:37:39, 5.17s/it] {'loss': 0.4468, 'grad_norm': 0.8317066378641015, 'learning_rate': 9.999289605240109e-06, 'epoch': 0.04}
4%|▎ | 780/22095 [1:00:58<27:59:11, 4.73s/it] {'loss': 0.4734, 'grad_norm': 0.7974651494986721, 'learning_rate': 9.999277197131551e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (65593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47513 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66485 > 40960).
Running this sequence through the model will result in indexing errors
4%|▎ | 781/22095 [1:01:01<25:01:53, 4.23s/it] {'loss': 0.5195, 'grad_norm': 0.7944228025196886, 'learning_rate': 9.999264681603881e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [437, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8503512 in VC:s3://internvl-moe-sft-data/. Exception: Image size [437, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 104805, 'image': 'vrdu_texteq/astro-ph.CO/e6f0f083-3b08-4fdf-966e-6f6b50e04fd3.png', 'image_wh': [[437, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'We introduce the $XY$-coordinate as'}]}
4%|▎ | 782/22095 [1:01:05<24:17:57, 4.10s/it] {'loss': 0.4342, 'grad_norm': 0.8433968983805096, 'learning_rate': 9.99925205865737e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11262856 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small.
Minimum size is 28. Problematic sample: {'id': 956, 'image': 'MMMU_Pro/standard/test_955_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze all the relevant background knowledge involved in this question and give an in-depth and detailed introduction to each relevant knowledge point. You need to provide a clear explanation of each knowledge point involved in the question and answer, including its concept, definition, practical application, importance and influence in related fields. Please provide appropriate examples to support the explanation of each knowledge point, just like the detailed explanation in the textbook. Make sure the answer is clearly structured, well-organized, and covers all important aspects of each knowledge point. The answer requires at least 2,000 words, and avoid directly answering the answer to the specific question, and focus on the analysis and explanation of background knowledge.\nHere are the content and options of this question:\nQuestion: In the following reaction , what mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: ['38.50 grams', '40.00 grams', '35.50 grams', '37.00 grams', '34.78 grams', ' 36.78 grams', '36.50 grams', ' 36.76 grams', ' 36.77 grams']"}, {'from': 'gpt', 'value': "To provide a comprehensive understanding of the reaction given in the equation \\( \\text{H}_2\\text{SO}_4 + \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + \\text{H}_2\\text{O} \\), it's essential to delve into several key chemistry concepts. These include the nature of acids and bases, stoichiometry, types of chemical reactions, molar mass calculations, and practical applications of such reactions. Each of these concepts plays a pivotal role in understanding the process and outcomes of the chemical reaction presented.\n\n### 1. 
Nature of Acids and Bases\n\n#### Concept and Definition\nAcids and bases are fundamental categories of compounds in chemistry that interact in various ways. According to the Brønsted-Lowry theory:\n\n- **Acids** are substances that donate a proton (\\( \\text{H}^+ \\)) in a chemical reaction.\n- **Bases** are substances that accept a proton.\n\nIn our reaction, sulfuric acid (\\( \\text{H}_2\\text{SO}_4 \\)) is an acid, and sodium hydroxide (\\( \\text{NaOH} \\)) is a base.\n\n#### Practical Application and Importance\nAcid-base reactions are common and crucial in various industrial processes, including wastewater treatment, fertilization, and the production of various chemicals. Understanding acid-base interactions allows chemists to predict reaction products and design processes accordingly.\n\n#### Example\nIn titration, a known concentration of acid can be used to determine the concentration of a base, or vice versa. Such neutralization reactions help in manufacturing and laboratory settings to analyze compound purity or formulation accuracy.\n\n### 2. Stoichiometry\n\n#### Concept and Definition\nStoichiometry involves calculating the quantities of reactants and products in chemical reactions. It is based on the law of conservation of mass, which states that matter cannot be created or destroyed in an isolated system.\n\n#### Practical Application and Importance\nStoichiometry is crucial in ensuring the correct proportions of substances when conducting experiments or industrial reactions. This prevents wastage of resources and optimizes the yield of desired products.\n\n#### Example\nIn the reaction between sulfuric acid and sodium hydroxide, stoichiometry helps in determining the exact amount of acid required to react completely with a given amount of base to achieve neutralization without excess reagents.\n\n### 3. 
Types of Chemical Reactions\n\n#### Concept and Definition\nChemical reactions can be classified into several types, such as synthesis, decomposition, single replacement, double replacement, and combustion. The reaction between \\( \\text{H}_2\\text{SO}_4 \\) and \\( \\text{NaOH} \\) is a **double replacement reaction**, where parts of two compounds switch places to form two new compounds.\n\n#### Practical Application and Importance\nUnderstanding the type of chemical reaction is critical for predicting product formation and reaction feasibility. Double replacement reactions are widely used in various environmental and industrial applications, such as water purification and the manufacturing of polyhalite-based fertilizers.\n\n#### Example\nIn double replacement reactions, such as the one forming sodium sulfate (\\( \\text{Na}_2\\text{SO}_4 \\)) and water (\\( \\text{H}_2\\text{O} \\)), it's essential to evaluate the solubility and identity of the products to predict reaction outcomes effectively. \n\n### 4. Molar Mass Calculations\n\n#### Concept and Definition\nMolar mass is the mass of one mole of a substance, calculated by summing the atomic masses of all atoms in a molecule. It is expressed in grams per mole (g/mol).\n\n- **Sulfuric Acid (H₂SO₄)**: Molar mass = 2(1.01) + 32.07 + 4(16.00) = 98.09 g/mol\n- **Sodium Hydroxide (NaOH)**: Molar mass = 22.99 + 15.999 + 1.008 = 39.997 g/mol\n\n#### Practical Application and Importance\nAccurate molar mass calculations are essential for determining how much of each substance is required or produced in a reaction. This ensures the efficiency and correctness of chemical processes.\n\n#### Example\nIn the given reaction, we can use molar masses to calculate the exact mass of sulfuric acid needed to react with a specified amount of sodium hydroxide. This is crucial, for instance, in formulating cleaning agents where precise acid-base reactions ensure effectiveness and safety.\n\n### 5. 
Reaction Balancing and Conservation of Mass\n\n#### Concept and Definition\nChemical equations must be balanced to obey the law of conservation of mass. This means that the number of atoms of each element must be the same on both sides of the equation.\n\n#### Practical Application and Importance\nBalanced reactions are a cornerstone of chemical science, allowing predictions about reactant and product amounts and ensuring that equations accurately reflect the real-world behavior of chemicals.\n\n#### Example\nFor the reaction \\( \\text{H}_2\\text{SO}_4 + 2 \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2 \\text{H}_2\\text{O} \\), balancing ensures that two moles of sodium hydroxide fully react with one mole of sulfuric acid, producing sodium sulfate and water in stoichiometric proportions.\n\n### 6. Equilibrium and Reaction Dynamics\n\n#### Concept and Definition\nChemical reactions can be reversible, reaching a state of equilibrium where the rate of the forward reaction equals the rate of the backward reaction. However, the reaction in question is not typically reversible under standard conditions.\n\n#### Practical Application and Importance\nUnderstanding equilibrium is vital in processes such as drug formulation, where reaching a specific equilibrium ensures product stability and optimal performance.\n\n#### Example\nIn production processes involving acids and bases, knowing whether a reaction reaches completion or equilibrium helps in designing reaction conditions and equipment for optimal yield and efficiency.\n\n### 7. Neutralization Reactions\n\n#### Concept and Definition\nNeutralization is a specific type of acid-base reaction resulting in water and a salt. 
Neutralization reactions are essential in various applications, from daily life (like antacids) to large-scale industrial processes.\n\n#### Practical Application and Importance\nNeutralization reactions are used in waste treatment to adjust pH levels before discharging materials into the environment. They also play a role in biomedical applications, where maintaining proper pH is essential for bodily functions.\n\n#### Example\nIn the lab, neutralization can be applied for pH adjustments in buffer solutions, vital for enzyme reactions that require specific pH ranges to function correctly.\n\n### 8. Industrial and Environmental Implications\n\n#### Concept and Definition\nChemical reactions such as the neutralization of sulfuric acid have far-reaching implications in industries and the environment. Industries use such reactions for manufacturing and waste management, while environmental applications include pollution control.\n\n#### Practical Application and Importance\nUnderstanding the environmental impact of chemical processes is crucial for designing sustainable practices and solutions. Proper handling and neutralization of waste acids, for example, prevent soil and water pollution.\n\n#### Example\nIn the production of fertilizers, neutralization reactions help in forming compounds like ammonium nitrate, which are essential for agricultural productivity.\n\n### 9. Safety and Handling of Chemicals\n\n#### Concept and Definition\nHandling chemicals like sulfuric acid and sodium hydroxide requires caution due to their corrosive nature. Safety protocols in labs and industries are essential to prevent accidents and health hazards.\n\n#### Practical Application and Importance\nSafety measures include using appropriate personal protective equipment (PPE), understanding material safety data sheets (MSDS), and adhering to proper storage and disposal procedures. 
These practices protect individuals and the environment.\n\n#### Example\nIn a laboratory setting, dilution of acids like sulfuric acid should be performed by adding acid to water, not the reverse, to prevent exothermic reactions that can cause splattering and burns.\n\n### 10. Calculating Reactant Mass in Reactions\n\n#### Concept and Definition\nCalculating the mass of reactants needed in a chemical reaction involves using stoichiometry and molar masses. This process is crucial for both academic and industrial chemistry.\n\n#### Practical Application and Importance\nCorrect calculations ensure that reactions proceed efficiently, minimizing waste and ensuring complete reactions. This is particularly important in large-scale industrial processes where cost efficiency and yield optimization are critical.\n\n#### Example\nGiven a balanced equation, the process would involve using the mol ratio to determine how many grams of one reactant are needed to react with a given amount of another. This is applied in processes such as pharmaceuticals to ensure proper dosages.\n\n### Conclusion\n\nAn understanding of the above concepts forms the foundation for analyzing and interpreting chemical reactions, such as the one provided. These concepts are vital across various domains of chemistry, influencing scientific research, industrial applications, and environmental management.\n\nBy mastering these principles, chemists can optimize reactions for different purposes, whether in developing new materials, ensuring environmental compliance, or designing safe and effective pharmaceuticals. 
In educational settings, this knowledge equips students with the tools to explore and innovate in the field of chemistry, contributing to advancements in technology and society."}]}
4%|▎ | 783/22095 [1:01:08<22:02:17, 3.72s/it] {'loss': 0.5107, 'grad_norm': 0.8021298240348729, 'learning_rate': 9.999239328292283e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (46535 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118906 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74554 > 40960). Running this sequence through the model will result in indexing errors
4%|▎ | 784/22095 [1:01:11<21:06:42, 3.57s/it] {'loss': 0.4739, 'grad_norm': 0.9387637066650226, 'learning_rate': 9.999226490508897e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
4%|▎ | 785/22095 [1:01:21<32:15:02, 5.45s/it] {'loss': 0.554, 'grad_norm': 0.4872050722136279, 'learning_rate': 9.999213545307488e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (55413 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81037 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55924 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48814 > 40960). Running this sequence through the model will result in indexing errors
4%|▎ | 786/22095 [1:01:24<27:52:11, 4.71s/it] {'loss': 0.5362, 'grad_norm': 0.9012196342130966, 'learning_rate': 9.999200492688334e-06, 'epoch': 0.04}
4%|▎ | 787/22095 [1:01:27<24:59:49, 4.22s/it] {'loss': 0.4624, 'grad_norm': 0.9506473252668751, 'learning_rate': 9.999187332651716e-06, 'epoch': 0.04}
4%|▎ | 788/22095 [1:01:30<22:40:35, 3.83s/it] {'loss': 0.4634, 'grad_norm': 0.8901086070418615, 'learning_rate': 9.999174065197916e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ | 789/22095 [1:01:37<28:01:15, 4.73s/it] {'loss': 0.5775, 'grad_norm': 0.36537172901012394, 'learning_rate': 9.999160690327218e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
4%|▎ | 790/22095 [1:01:40<26:04:19, 4.41s/it] {'loss': 0.4727, 'grad_norm': 1.3411043185866383, 'learning_rate': 9.999147208039912e-06, 'epoch': 0.04}
4%|▎ | 791/22095 [1:01:44<25:31:30, 4.31s/it] {'loss': 0.523, 'grad_norm': 0.9409963730058601, 'learning_rate': 9.999133618336285e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ |
792/22095 [1:01:52<30:55:29, 5.23s/it] {'loss': 0.5649, 'grad_norm': 0.377734894810001, 'learning_rate': 9.99911992121663e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
4%|▎ | 793/22095 [1:01:55<27:54:06, 4.72s/it] {'loss': 0.5063, 'grad_norm': 1.0227197057908568, 'learning_rate': 9.999106116681243e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ | 794/22095 [1:02:06<37:52:08, 6.40s/it] {'loss': 0.5136, 'grad_norm': 0.3577445076404578, 'learning_rate': 9.999092204730418e-06, 'epoch': 0.04}
4%|▎ | 795/22095 [1:02:09<32:27:54, 5.49s/it] {'loss': 0.4668, 'grad_norm': 0.8851794694546333, 'learning_rate': 9.999078185364455e-06, 'epoch': 0.04}
4%|▎ | 796/22095 [1:02:12<28:05:23, 4.75s/it] {'loss': 0.5141, 'grad_norm': 0.822351255754194, 'learning_rate': 9.999064058583657e-06, 'epoch': 0.04}
4%|▎ | 797/22095 [1:02:15<25:11:09, 4.26s/it] {'loss': 0.5556, 'grad_norm': 0.8727820448841807, 'learning_rate': 9.999049824388324e-06, 'epoch': 0.04}
4%|▎ | 798/22095 [1:02:19<24:05:42, 4.07s/it] {'loss': 0.5346, 'grad_norm': 0.9039671640308168, 'learning_rate': 9.999035482778764e-06, 'epoch': 0.04}
4%|▎ | 799/22095 [1:02:22<23:35:36, 3.99s/it] {'loss': 0.5, 'grad_norm': 0.8457605924012196, 'learning_rate': 9.999021033755286e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (76072 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44104 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42456 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84722 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51967 > 40960). Running this sequence through the model will result in indexing errors
4%|▎ | 800/22095 [1:02:25<21:28:05, 3.63s/it] {'loss': 0.4503, 'grad_norm': 0.9032493102294806, 'learning_rate': 9.999006477318197e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (51526 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90691 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49907 > 40960).
Running this sequence through the model will result in indexing errors
4%|▎ | 801/22095 [1:02:28<19:47:10, 3.35s/it] {'loss': 0.4662, 'grad_norm': 0.7868413613589126, 'learning_rate': 9.998991813467814e-06, 'epoch': 0.04}
4%|▎ | 802/22095 [1:02:31<19:15:00, 3.25s/it] {'loss': 0.4836, 'grad_norm': 0.93526853556894, 'learning_rate': 9.998977042204449e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (75558 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66256 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79267 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45821 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51452 > 40960).
Running this sequence through the model will result in indexing errors
4%|▎ | 803/22095 [1:02:35<20:05:47, 3.40s/it] {'loss': 0.4806, 'grad_norm': 0.8455875935885705, 'learning_rate': 9.998962163528421e-06, 'epoch': 0.04}
4%|▎ | 804/22095 [1:02:39<20:49:13, 3.52s/it] {'loss': 0.4291, 'grad_norm': 0.8344451586941963, 'learning_rate': 9.998947177440048e-06, 'epoch': 0.04}
4%|▎ | 805/22095 [1:02:43<21:44:32, 3.68s/it] {'loss': 0.5004, 'grad_norm': 0.8522693202810683, 'learning_rate': 9.998932083939657e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [939, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8439655 in VC:s3://internvl-moe-sft-data/. Exception: Image size [939, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 159661, 'image': 'vrdu_texteq/astro-ph.CO/d1e58ad7-590d-4aea-9dbd-fa4495158c45.png', 'image_wh': [[939, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'and an unbiased estimate of the true covariance matrix $C^t$ from these data is'}]}
4%|▎ | 806/22095 [1:02:46<20:25:01, 3.45s/it] {'loss': 0.4948, 'grad_norm': 0.905163368072537, 'learning_rate': 9.998916883027565e-06, 'epoch': 0.04}
4%|▎ | 807/22095 [1:02:50<21:25:53, 3.62s/it] {'loss': 0.4602, 'grad_norm': 0.73147243210702, 'learning_rate': 9.998901574704102e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ | 808/22095 [1:03:00<34:13:59, 5.79s/it] {'loss': 0.5818, 'grad_norm': 0.4513364849291583, 'learning_rate': 9.9988861589696e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308237 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2QQ0keMb.PuJjSZFpXXbuFpXa_!!3422147592.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n疯狂热卖\n全包玻璃时尚龟缸\n从小到大多种选择\n包邮\n包损\n3\n1\n赠\n买\n+\n+'}]}
4%|▎ | 809/22095 [1:03:08<37:10:57, 6.29s/it] {'loss': 0.5678, 'grad_norm': 0.40403113807498325, 'learning_rate': 9.998870635824385e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
4%|▎ | 810/22095 [1:03:11<31:39:13, 5.35s/it] {'loss': 0.495, 'grad_norm': 1.07302313691336, 'learning_rate': 9.998855005268794e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308106 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2LG7EamYH8KJjSspdXXcRgVXa_!!780201377.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you extract the written content from this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n粉色\n推荐\n有支架\n38\ncm\n38x150cm\n120\n厂家直销\n无需验货\n150\n破损补发\n高清镜面\n45'}]}
4%|▎ | 811/22095 [1:03:19<35:50:14, 6.06s/it] {'loss': 0.5311, 'grad_norm': 0.3781018473275081, 'learning_rate': 9.998839267303163e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
4%|▎ | 812/22095 [1:03:22<30:56:47, 5.23s/it] {'loss': 0.4647, 'grad_norm': 0.8388566810721623, 'learning_rate': 9.998823421927826e-06, 'epoch': 0.04}
4%|▎ | 813/22095 [1:03:26<28:18:20, 4.79s/it] {'loss': 0.4687, 'grad_norm': 0.83966169332251, 'learning_rate': 9.998807469143129e-06, 'epoch': 0.04}
4%|▎ | 814/22095 [1:03:30<26:26:47, 4.47s/it] {'loss': 0.4837, 'grad_norm': 0.860897967280018, 'learning_rate': 9.998791408949408e-06, 'epoch': 0.04}
4%|▎ | 815/22095 [1:03:33<24:32:47, 4.15s/it] {'loss': 0.4928, 'grad_norm': 0.8062004578592479, 'learning_rate': 9.998775241347017e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
4%|▎ | 816/22095 [1:03:37<23:42:06, 4.01s/it] {'loss': 0.4549, 'grad_norm': 0.7764292399554144, 'learning_rate': 9.998758966336296e-06, 'epoch': 0.04}
4%|▎ | 817/22095 [1:03:40<22:48:31, 3.86s/it] {'loss': 0.4537, 'grad_norm': 0.7996994905115854, 'learning_rate': 9.998742583917598e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (79128 >
40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108923 > 40960). Running this sequence through the model will result in indexing errors
4%|▎ | 818/22095 [1:03:43<21:06:40, 3.57s/it] {'loss': 0.4563, 'grad_norm': 0.8265343193477142, 'learning_rate': 9.998726094091275e-06, 'epoch': 0.04}
4%|▎ | 819/22095 [1:03:47<21:33:42, 3.65s/it] {'loss': 0.5115, 'grad_norm': 0.82518409601697, 'learning_rate': 9.99870949685768e-06, 'epoch': 0.04}
4%|▎ | 820/22095 [1:03:51<21:55:11, 3.71s/it] {'loss': 0.4773, 'grad_norm': 0.8645585275224266, 'learning_rate': 9.99869279221717e-06, 'epoch': 0.04}
4%|▎ | 821/22095 [1:03:55<22:15:19, 3.77s/it] {'loss': 0.4519, 'grad_norm': 0.8706170286664132, 'learning_rate': 9.998675980170106e-06, 'epoch': 0.04}
4%|▎ | 822/22095 [1:03:58<20:53:15, 3.53s/it] {'loss': 0.4296, 'grad_norm': 0.7940021232898752, 'learning_rate': 9.998659060716844e-06, 'epoch': 0.04}
4%|▎ | 823/22095 [1:04:01<20:12:33, 3.42s/it] {'loss': 0.4646, 'grad_norm': 0.8167594306281625, 'learning_rate': 9.998642033857753e-06, 'epoch': 0.04}
4%|▎ | 824/22095 [1:04:05<21:00:26, 3.56s/it] {'loss': 0.475, 'grad_norm': 1.7764689686555373, 'learning_rate': 9.998624899593197e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (44759 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49172 > 40960). Running this sequence through the model will result in indexing errors
4%|▎ | 825/22095 [1:04:07<19:36:16, 3.32s/it] {'loss': 0.4667, 'grad_norm': 0.8476519593212136, 'learning_rate': 9.998607657923545e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
4%|▎ | 826/22095 [1:04:17<30:02:07, 5.08s/it] {'loss': 0.57, 'grad_norm': 0.6990683330559072, 'learning_rate': 9.998590308849164e-06, 'epoch': 0.04}
4%|▎ | 827/22095 [1:04:20<27:57:52, 4.73s/it] {'loss': 0.4403, 'grad_norm': 0.7878434383789473, 'learning_rate': 9.998572852370432e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (67201 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41708 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▎ | 828/22095 [1:04:24<26:13:35, 4.44s/it] {'loss': 0.4564, 'grad_norm': 0.8452597507527643, 'learning_rate': 9.998555288487719e-06, 'epoch': 0.04}
  4%|▍ | 829/22095 [1:04:27<23:12:42, 3.93s/it] {'loss': 0.4431, 'grad_norm': 0.8291505593866499, 'learning_rate': 9.998537617201405e-06, 'epoch': 0.04}
  4%|▍ | 830/22095 [1:04:31<22:31:59, 3.81s/it] {'loss': 0.5116, 'grad_norm': 0.8739561479854944, 'learning_rate': 9.998519838511872e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8391792 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 58616, 'image': 'vrdu_table_final_2/astro-ph.EP/02dba466-91bd-4b39-8626-3019613c2fa1.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 831/22095 [1:04:34<22:47:00, 3.86s/it] {'loss': 0.4549, 'grad_norm': 0.8685439340711074, 'learning_rate': 9.998501952419496e-06, 'epoch': 0.04}
  4%|▍ | 832/22095 [1:04:38<21:18:10, 3.61s/it] {'loss': 0.4933, 'grad_norm': 0.8176380913106567, 'learning_rate': 9.998483958924666e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 833/22095 [1:04:48<34:03:26, 5.77s/it] {'loss': 0.5262, 'grad_norm': 0.5896711368819945, 'learning_rate': 9.998465858027769e-06, 'epoch': 0.04}
  4%|▍ | 834/22095 [1:04:52<29:34:18, 5.01s/it] {'loss': 0.4775, 'grad_norm': 0.9204416208370488, 'learning_rate': 9.99844764972919e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (44463 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51289 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53619 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52412 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46978 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124076 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 835/22095 [1:04:55<26:46:30, 4.53s/it] {'loss': 0.438, 'grad_norm': 0.8974279098852376, 'learning_rate': 9.998429334029323e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 836/22095 [1:05:04<35:32:06, 6.02s/it] {'loss': 0.5537, 'grad_norm': 0.4566914236668775, 'learning_rate': 9.998410910928562e-06, 'epoch': 0.04}
  4%|▍ | 837/22095 [1:05:08<31:03:16, 5.26s/it] {'loss': 0.473, 'grad_norm': 0.9354308373334939, 'learning_rate': 9.998392380427302e-06, 'epoch': 0.04}
  4%|▍ | 838/22095 [1:05:11<27:07:44, 4.59s/it] {'loss': 0.4528, 'grad_norm': 0.9415680530950489, 'learning_rate': 9.998373742525941e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (118399 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 839/22095 [1:05:15<25:46:05, 4.36s/it] {'loss': 0.4451, 'grad_norm': 0.8566551875156379, 'learning_rate': 9.998354997224879e-06, 'epoch': 0.04}
  4%|▍ | 840/22095 [1:05:18<23:50:38, 4.04s/it] {'loss': 0.488, 'grad_norm': 0.8394095318186099, 'learning_rate': 9.998336144524521e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 841/22095 [1:05:22<23:45:47, 4.03s/it] {'loss': 0.5012, 'grad_norm': 0.8683511332108413, 'learning_rate': 9.998317184425268e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 842/22095 [1:05:26<23:02:01, 3.90s/it] {'loss': 0.5156, 'grad_norm': 0.8028785010939439, 'learning_rate': 9.998298116927532e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 843/22095 [1:05:33<28:39:54, 4.86s/it] {'loss': 0.5676, 'grad_norm': 0.9152126508013158, 'learning_rate': 9.99827894203172e-06, 'epoch': 0.04}
  4%|▍ | 844/22095 [1:05:36<25:58:55, 4.40s/it] {'loss': 0.5213, 'grad_norm': 0.9331450310430767, 'learning_rate': 9.998259659738243e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047738 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 845/22095 [1:05:40<25:47:50, 4.37s/it] {'loss': 0.47, 'grad_norm': 0.8624765340420999, 'learning_rate': 9.998240270047519e-06, 'epoch': 0.04}
  4%|▍ | 846/22095 [1:05:44<23:57:59, 4.06s/it] {'loss': 0.5183, 'grad_norm': 0.9051223838196641, 'learning_rate': 9.998220772959962e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 847/22095 [1:05:55<35:58:02, 6.09s/it] {'loss': 0.563, 'grad_norm': 0.38497024559095944, 'learning_rate': 9.998201168475991e-06, 'epoch': 0.04}
  4%|▍ | 848/22095 [1:05:58<30:47:55, 5.22s/it] {'loss': 0.4387, 'grad_norm': 1.0886459904950612, 'learning_rate': 9.998181456596027e-06, 'epoch': 0.04}
  4%|▍ | 849/22095 [1:06:01<27:23:01, 4.64s/it] {'loss': 0.5013, 'grad_norm': 0.906151568063953, 'learning_rate': 9.998161637320495e-06, 'epoch': 0.04}
  4%|▍ | 850/22095 [1:06:05<25:43:48, 4.36s/it] {'loss': 0.4966, 'grad_norm': 0.8703851030575666, 'learning_rate': 9.998141710649822e-06, 'epoch': 0.04}
  4%|▍ | 851/22095 [1:06:08<24:19:59, 4.12s/it] {'loss': 0.4399, 'grad_norm': 0.8775907858377049, 'learning_rate': 9.998121676584432e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 852/22095 [1:06:13<25:35:03, 4.34s/it] {'loss': 0.5776, 'grad_norm': 0.5793465331166096, 'learning_rate': 9.998101535124758e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (47609 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88530 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79896 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 853/22095 [1:06:16<23:20:13, 3.96s/it] {'loss': 0.5187, 'grad_norm': 0.8748602831935688, 'learning_rate': 9.998081286271234e-06, 'epoch': 0.04}
  4%|▍ | 854/22095 [1:06:20<22:21:55, 3.79s/it] {'loss': 0.5106, 'grad_norm': 0.7891627206628827, 'learning_rate': 9.99806093002429e-06, 'epoch': 0.04}
  4%|▍ | 855/22095 [1:06:23<21:04:47, 3.57s/it] {'loss': 0.4622, 'grad_norm': 0.804830597119037, 'learning_rate': 9.99804046638437e-06, 'epoch': 0.04}
  4%|▍ | 856/22095 [1:06:26<20:26:12, 3.46s/it] {'loss': 0.4602, 'grad_norm': 0.8470820634732646, 'learning_rate': 9.99801989535191e-06, 'epoch': 0.04}
  4%|▍ | 857/22095 [1:06:29<19:21:17, 3.28s/it] {'loss': 0.4819, 'grad_norm': 0.7899585020497784, 'learning_rate': 9.997999216927352e-06, 'epoch': 0.04}
  4%|▍ | 858/22095 [1:06:32<18:28:31, 3.13s/it] {'loss': 0.4689, 'grad_norm': 0.7852368236713659, 'learning_rate': 9.997978431111142e-06, 'epoch': 0.04}
  4%|▍ | 859/22095 [1:06:35<18:25:57, 3.12s/it] {'loss': 0.4799, 'grad_norm': 0.7943676416848869, 'learning_rate': 9.997957537903727e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (71755 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81012 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100354 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 860/22095 [1:06:38<17:57:48, 3.05s/it] {'loss': 0.4807, 'grad_norm': 0.892529635684416, 'learning_rate': 9.997936537305551e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46499 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 861/22095 [1:06:47<29:10:33, 4.95s/it] {'loss': 0.5707, 'grad_norm': 0.45745492269747656, 'learning_rate': 9.997915429317071e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (68775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72965 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48421 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44518 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 862/22095 [1:06:50<26:19:22, 4.46s/it] {'loss': 0.4792, 'grad_norm': 0.8100899395043848, 'learning_rate': 9.997894213938738e-06, 'epoch': 0.04}
  4%|▍ | 863/22095 [1:06:54<24:25:06, 4.14s/it] {'loss': 0.5256, 'grad_norm': 0.8251870779862273, 'learning_rate': 9.997872891171009e-06, 'epoch': 0.04}
  4%|▍ | 864/22095 [1:06:56<22:06:26, 3.75s/it] {'loss': 0.46, 'grad_norm': 0.9052449244530871, 'learning_rate': 9.99785146101434e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 865/22095 [1:07:00<21:09:54, 3.59s/it] {'loss': 0.5034, 'grad_norm': 0.8233054083332147, 'learning_rate': 9.997829923469194e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (57479 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119205 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109800 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 866/22095 [1:07:03<20:22:40, 3.46s/it] {'loss': 0.4548, 'grad_norm': 0.7663398755861442, 'learning_rate': 9.997808278536032e-06, 'epoch': 0.04}
  4%|▍ | 867/22095 [1:07:06<19:19:58, 3.28s/it] {'loss': 0.51, 'grad_norm': 0.8238704523328892, 'learning_rate': 9.99778652621532e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (94347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41467 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 868/22095 [1:07:15<29:58:15, 5.08s/it] {'loss': 0.5214, 'grad_norm': 0.4089041222135545, 'learning_rate': 9.997764666507523e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (64560 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55606 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98909 > 40960).
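The recurring pair "Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" indicates the loader repairs conversations whose text is missing an image placeholder for an attached image. A minimal sketch of such a repair, assuming a `<image>` placeholder string and the conversation dict layout shown in the "Problematic sample" dumps; the function name and exact repair strategy (prepending to the first human turn) are assumptions, not the actual code in data_qwen_2.py:

```python
IMAGE_TOKEN = "<image>"  # placeholder the collator expands into vision features (assumed)


def fix_image_tokens(conversations, num_images):
    """Ensure the conversation contains one placeholder per attached image.

    Counts existing placeholders across all turns; any missing ones are
    prepended to the first human turn, mirroring the log's
    'Fixed image tokens in the conversation' repair message.
    """
    have = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    missing = num_images - have
    if missing > 0:
        first_human = next(t for t in conversations if t["from"] == "human")
        first_human["value"] = (IMAGE_TOKEN + "\n") * missing + first_human["value"]
    return conversations
```

A conversation that already carries the right number of placeholders passes through unchanged, so the repair is idempotent.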
Running this sequence through the model will result in indexing errors
  4%|▍ | 869/22095 [1:07:19<27:34:24, 4.68s/it] {'loss': 0.448, 'grad_norm': 0.8337966943875073, 'learning_rate': 9.997742699413115e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47249 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (141072 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 870/22095 [1:07:26<31:22:54, 5.32s/it] {'loss': 0.5238, 'grad_norm': 0.3718911507575922, 'learning_rate': 9.997720624932566e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (50115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115420 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47516 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86679 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 871/22095 [1:07:30<29:13:58, 4.96s/it] {'loss': 0.492, 'grad_norm': 0.8330502280175536, 'learning_rate': 9.99769844306635e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047927 in VC:s3://multi-modal/UniGeo/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 15cm\nB. 13cm\nC. 11cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
  4%|▍ | 872/22095 [1:07:34<27:43:18, 4.70s/it] {'loss': 0.523, 'grad_norm': 0.8869907730560409, 'learning_rate': 9.997676153814944e-06, 'epoch': 0.04}
  4%|▍ | 873/22095 [1:07:38<26:20:11, 4.47s/it] {'loss': 0.5044, 'grad_norm': 0.8564115335442091, 'learning_rate': 9.997653757178824e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 874/22095 [1:07:48<36:29:38, 6.19s/it] {'loss': 0.5594, 'grad_norm': 0.3409783012433021, 'learning_rate': 9.997631253158477e-06, 'epoch': 0.04}
  4%|▍ | 875/22095 [1:07:51<31:27:03, 5.34s/it] {'loss': 0.477, 'grad_norm': 0.8857825685787568, 'learning_rate': 9.997608641754381e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (60935 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62431 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119961 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 876/22095 [1:08:00<37:12:18, 6.31s/it] {'loss': 0.5545, 'grad_norm': 0.40045123455066095, 'learning_rate': 9.997585922967026e-06, 'epoch': 0.04}
  4%|▍ | 877/22095 [1:08:11<44:58:21, 7.63s/it] {'loss': 0.5172, 'grad_norm': 0.3827260476019697, 'learning_rate': 9.997563096796899e-06, 'epoch': 0.04}
  4%|▍ | 878/22095 [1:08:21<49:26:50, 8.39s/it] {'loss': 0.5662, 'grad_norm': 0.34473254672516823, 'learning_rate': 9.997540163244487e-06, 'epoch': 0.04}
  4%|▍ | 879/22095 [1:08:29<48:45:15, 8.27s/it] {'loss': 0.553, 'grad_norm': 0.3722334175738685, 'learning_rate': 9.997517122310287e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (41262 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 880/22095 [1:08:33<41:18:45, 7.01s/it] {'loss': 0.4872, 'grad_norm': 1.1596370622984065, 'learning_rate': 9.997493973994793e-06, 'epoch': 0.04}
  4%|▍ | 881/22095 [1:08:36<34:55:51, 5.93s/it] {'loss': 0.5366, 'grad_norm': 0.9422109365205645, 'learning_rate': 9.997470718298503e-06, 'epoch': 0.04}
  4%|▍ | 882/22095 [1:08:40<30:54:43, 5.25s/it] {'loss': 0.5092, 'grad_norm': 0.9376681347956652, 'learning_rate': 9.997447355221915e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (113086 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55141 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 883/22095 [1:08:43<27:20:20, 4.64s/it] {'loss': 0.5064, 'grad_norm': 0.9194276554143219, 'learning_rate': 9.997423884765532e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 884/22095 [1:08:51<32:31:32, 5.52s/it] {'loss': 0.5557, 'grad_norm': 0.5705723711046609, 'learning_rate': 9.99740030692986e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 885/22095 [1:08:54<28:40:02, 4.87s/it] {'loss': 0.4448, 'grad_norm': 1.1046214239155359, 'learning_rate': 9.9973766217154e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8381160 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47949, 'image': 'vrdu_table_final_2/astro-ph.CO/75399ea1-fd0d-4999-9cf1-625593103c17.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```'}]}
  4%|▍ | 886/22095 [1:08:57<26:07:19, 4.43s/it] {'loss': 0.4865, 'grad_norm': 0.8280146005762715, 'learning_rate': 9.997352829122667e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 887/22095 [1:09:00<23:44:39, 4.03s/it] {'loss': 0.4488, 'grad_norm': 0.8686670011300119, 'learning_rate': 9.99732892915217e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (69977 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 888/22095 [1:09:08<29:54:41, 5.08s/it] {'loss': 0.5701, 'grad_norm': 0.43280572475101964, 'learning_rate': 9.99730492180442e-06, 'epoch': 0.04}
  4%|▍ | 889/22095 [1:09:11<26:39:09, 4.52s/it] {'loss': 0.4958, 'grad_norm': 0.8997349614022049, 'learning_rate': 9.997280807079938e-06, 'epoch': 0.04}
  4%|▍ | 890/22095 [1:09:14<23:59:02, 4.07s/it] {'loss': 0.4799, 'grad_norm': 0.8726634607887849, 'learning_rate': 9.997256584979239e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [973, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8450906 in VC:s3://internvl-moe-sft-data/. Exception: Image size [973, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69816, 'image': 'vrdu_texteq/astro-ph.CO/af6a1f89-1a82-46ca-9a80-68dbb35bd57c.png', 'image_wh': [[973, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': "if $z=1$ this formula is know as the Parseval's theorem\nfor the Mellin transform."}]}
  4%|▍ | 891/22095 [1:09:21<27:51:44, 4.73s/it] {'loss': 0.5752, 'grad_norm': 0.4248470639109258, 'learning_rate': 9.997232255502842e-06, 'epoch': 0.04}
  4%|▍ | 892/22095 [1:09:24<25:26:41, 4.32s/it] {'loss': 0.5101, 'grad_norm': 0.8720384694863267, 'learning_rate': 9.997207818651273e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (59542 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89396 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 893/22095 [1:09:27<22:55:29, 3.89s/it] {'loss': 0.4938, 'grad_norm': 0.9066900693499089, 'learning_rate': 9.997183274425058e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (134797 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52650 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 894/22095 [1:09:30<22:21:50, 3.80s/it] {'loss': 0.4438, 'grad_norm': 0.9292081807321683, 'learning_rate': 9.997158622824719e-06, 'epoch': 0.04}
  4%|▍ | 895/22095 [1:09:34<21:40:32, 3.68s/it] {'loss': 0.4396, 'grad_norm': 1.057943792521059, 'learning_rate': 9.99713386385079e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 896/22095 [1:09:39<25:16:18, 4.29s/it] {'loss': 0.5388, 'grad_norm': 0.44618197731567005, 'learning_rate': 9.9971089975038e-06, 'epoch': 0.04}
  4%|▍ | 897/22095 [1:09:43<23:23:24, 3.97s/it] {'loss': 0.4858, 'grad_norm': 1.0347025476143397, 'learning_rate': 9.997084023784286e-06, 'epoch': 0.04}
  4%|▍ | 898/22095 [1:09:46<21:58:03, 3.73s/it] {'loss': 0.4632, 'grad_norm': 0.9476775847258264, 'learning_rate': 9.997058942692786e-06, 'epoch': 0.04}
  4%|▍ | 899/22095 [1:09:49<20:19:53, 3.45s/it] {'loss': 0.4905, 'grad_norm': 0.7912189795373192, 'learning_rate': 9.997033754229835e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (41834 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 900/22095 [1:09:52<20:02:10, 3.40s/it] {'loss': 0.4605, 'grad_norm': 0.9864123150456614, 'learning_rate': 9.997008458395975e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 901/22095 [1:10:01<30:44:58, 5.22s/it] {'loss': 0.551, 'grad_norm': 0.40215490381501273, 'learning_rate': 9.996983055191752e-06, 'epoch': 0.04}
  4%|▍ | 902/22095 [1:10:08<33:49:32, 5.75s/it] {'loss': 0.5483, 'grad_norm': 0.3822696497597169, 'learning_rate': 9.99695754461771e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (69808 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97307 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62781 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 903/22095 [1:10:12<29:52:56, 5.08s/it] {'loss': 0.4921, 'grad_norm': 1.0110205632233296, 'learning_rate': 9.996931926674396e-06, 'epoch': 0.04}
  4%|▍ | 904/22095 [1:10:15<26:56:49, 4.58s/it] {'loss': 0.5194, 'grad_norm': 0.9296152330438265, 'learning_rate': 9.996906201362361e-06, 'epoch': 0.04}
  4%|▍ | 905/22095 [1:10:19<25:39:27, 4.36s/it] {'loss': 0.4809, 'grad_norm': 0.8565404745687755, 'learning_rate': 9.99688036868216e-06, 'epoch': 0.04}
  4%|▍ | 906/22095 [1:10:22<23:49:20, 4.05s/it] {'loss': 0.4932, 'grad_norm': 0.8590242523016869, 'learning_rate': 9.996854428634348e-06, 'epoch': 0.04}
  4%|▍ | 907/22095 [1:10:26<22:52:12, 3.89s/it] {'loss': 0.4946, 'grad_norm': 0.8493136418082664, 'learning_rate': 9.996828381219479e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 908/22095 [1:10:32<27:04:12, 4.60s/it] {'loss': 0.5735, 'grad_norm': 0.6124772508865317, 'learning_rate': 9.996802226438117e-06, 'epoch': 0.04}
  4%|▍ | 909/22095 [1:10:36<26:06:35, 4.44s/it] {'loss': 0.4609, 'grad_norm': 0.8954342427228704, 'learning_rate': 9.996775964290819e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (43620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59993 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144079 > 40960). Running this sequence through the model will result in indexing errors 4%|▍ | 910/22095 [1:10:40<24:06:41, 4.10s/it] {'loss': 0.4952, 'grad_norm': 0.8279174198415556, 'learning_rate': 9.996749594778153e-06, 'epoch': 0.04} 4%|▍ | 910/22095 [1:10:40<24:06:41, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 4%|▍ | 911/22095 [1:10:50<34:57:14, 5.94s/it] {'loss': 0.5361, 'grad_norm': 0.3864797621662077, 'learning_rate': 9.996723117900684e-06, 'epoch': 0.04} 4%|▍ | 911/22095 [1:10:50<34:57:14, 5.94s/it] 4%|▍ | 912/22095 [1:10:53<30:18:06, 5.15s/it] {'loss': 0.4824, 'grad_norm': 0.900106757577596, 'learning_rate': 9.996696533658981e-06, 'epoch': 0.04} 4%|▍ | 912/22095 [1:10:53<30:18:06, 5.15s/it] 4%|▍ | 913/22095 [1:10:57<28:38:30, 4.87s/it] {'loss': 0.4586, 'grad_norm': 0.8550971074550466, 'learning_rate': 9.996669842053617e-06, 'epoch': 0.04} 4%|▍ | 913/22095 [1:10:57<28:38:30, 4.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41459 > 40960). Running this sequence through the model will result in indexing errors 4%|▍ | 914/22095 [1:11:00<25:20:26, 4.31s/it] {'loss': 0.4704, 'grad_norm': 0.9222917536345409, 'learning_rate': 9.996643043085164e-06, 'epoch': 0.04} 4%|▍ | 914/22095 [1:11:00<25:20:26, 4.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76237 > 40960). 
Running this sequence through the model will result in indexing errors 4%|▍ | 915/22095 [1:11:03<23:11:32, 3.94s/it] {'loss': 0.4926, 'grad_norm': 0.8242120274667823, 'learning_rate': 9.996616136754198e-06, 'epoch': 0.04} 4%|▍ | 915/22095 [1:11:03<23:11:32, 3.94s/it] 4%|▍ | 916/22095 [1:11:07<22:35:15, 3.84s/it] {'loss': 0.4383, 'grad_norm': 0.7706995384843042, 'learning_rate': 9.996589123061297e-06, 'epoch': 0.04} 4%|▍ | 916/22095 [1:11:07<22:35:15, 3.84s/it] 4%|▍ | 917/22095 [1:11:10<20:52:33, 3.55s/it] {'loss': 0.4811, 'grad_norm': 0.7484393968818605, 'learning_rate': 9.996562002007042e-06, 'epoch': 0.04} 4%|▍ | 917/22095 [1:11:10<20:52:33, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8941399 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 64552, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6.4cm'}]}
  4%|▍ | 918/22095 [1:11:20<31:43:58, 5.39s/it] {'loss': 0.561, 'grad_norm': 0.5945477305149327, 'learning_rate': 9.996534773592016e-06, 'epoch': 0.04}
  4%|▍ | 919/22095 [1:11:23<27:47:18, 4.72s/it] {'loss': 0.4753, 'grad_norm': 0.88348900798707, 'learning_rate': 9.9965074378168e-06, 'epoch': 0.04}
  4%|▍ | 920/22095 [1:11:26<24:44:02, 4.21s/it] {'loss': 0.4554, 'grad_norm': 0.843212052146494, 'learning_rate': 9.996479994681989e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8948335 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
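Note on the recurring "Image size ... is too small. Minimum size is 28." failures: they all come from geoqa+ annotations whose images are only ~18-22 px tall (the `image_wh` fields in the problematic samples), below the 28 px minimum side the vision processor accepts. Rather than hitting the ValueError and retrying at runtime, such records can be dropped in a pre-filtering pass over the annotation list. The sketch below is a minimal stand-in that assumes the sample dict layout shown in the log (`image_wh` holding `[[width, height], ...]`); it is not the actual `data_qwen_2.py` logic.

```python
# Sketch: pre-filter annotation records whose images are smaller than the
# processor's minimum side (28 px), assuming the sample layout from the log.
MIN_SIDE = 28

def is_trainable(sample: dict) -> bool:
    """True if every image in the sample meets the minimum side length."""
    return all(min(w, h) >= MIN_SIDE for w, h in sample.get("image_wh", []))

samples = [
    {"id": 64552, "image_wh": [[166, 20]]},   # 20 px tall -> rejected
    {"id": 12345, "image_wh": [[640, 480]]},  # large enough -> kept
]
kept = [s for s in samples if is_trainable(s)]
print([s["id"] for s in kept])  # -> [12345]
```

Samples without an `image_wh` field (text-only) pass the check unchanged, since `all()` over an empty list is true.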
Problematic sample: {'id': 71488, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
  4%|▍ | 921/22095 [1:11:30<24:44:04, 4.21s/it] {'loss': 0.446, 'grad_norm': 0.8267504174103804, 'learning_rate': 9.996452444188166e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (79718 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49903 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 922/22095 [1:11:40<34:39:51, 5.89s/it] {'loss': 0.5399, 'grad_norm': 0.39741826226838095, 'learning_rate': 9.996424786335925e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 923/22095 [1:11:49<39:48:09, 6.77s/it] {'loss': 0.5416, 'grad_norm': 0.3831494947571296, 'learning_rate': 9.996397021125862e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
  4%|▍ | 924/22095 [1:11:53<34:48:35, 5.92s/it] {'loss': 0.449, 'grad_norm': 0.8422075131373115, 'learning_rate': 9.996369148558573e-06, 'epoch': 0.04}
  4%|▍ | 925/22095 [1:11:56<30:30:09, 5.19s/it] {'loss': 0.4113, 'grad_norm': 0.8486980902404835, 'learning_rate': 9.996341168634653e-06, 'epoch': 0.04}
  4%|▍ | 926/22095 [1:12:00<27:40:04, 4.71s/it] {'loss': 0.4738, 'grad_norm': 0.889356449999675, 'learning_rate': 9.99631308135471e-06, 'epoch': 0.04}
  4%|▍ | 927/22095 [1:12:03<25:15:46, 4.30s/it] {'loss': 0.508, 'grad_norm': 0.8520583531149387, 'learning_rate': 9.996284886719342e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 928/22095 [1:12:06<23:48:35, 4.05s/it] {'loss': 0.4892, 'grad_norm': 0.8429515410229458, 'learning_rate': 9.996256584729157e-06, 'epoch': 0.04}
  4%|▍ | 929/22095 [1:12:10<22:29:16, 3.82s/it] {'loss': 0.5311, 'grad_norm': 0.8478899080075085, 'learning_rate': 9.996228175384764e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 930/22095 [1:12:16<27:29:53, 4.68s/it] {'loss': 0.561, 'grad_norm': 0.6460493207491624, 'learning_rate': 9.996199658686769e-06, 'epoch': 0.04}
  4%|▍ | 931/22095 [1:12:21<26:41:36, 4.54s/it] {'loss': 0.4714, 'grad_norm': 0.7654622141397179, 'learning_rate': 9.99617103463579e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 932/22095 [1:12:24<24:25:55, 4.16s/it] {'loss': 0.4912, 'grad_norm': 1.0725569589615345, 'learning_rate': 9.99614230323244e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (71362 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43645 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 933/22095 [1:12:27<22:00:44, 3.74s/it] {'loss': 0.4458, 'grad_norm': 0.7822955176560418, 'learning_rate': 9.996113464477337e-06, 'epoch': 0.04}
  4%|▍ | 934/22095 [1:12:30<21:30:55, 3.66s/it] {'loss': 0.5008, 'grad_norm': 0.803557154231793, 'learning_rate': 9.996084518371101e-06, 'epoch': 0.04}
  4%|▍ | 935/22095 [1:12:34<21:49:54, 3.71s/it] {'loss': 0.4699, 'grad_norm': 0.8629657209874251, 'learning_rate': 9.996055464914351e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 936/22095 [1:12:38<21:29:18, 3.66s/it] {'loss': 0.4988, 'grad_norm': 0.9756308146322324, 'learning_rate': 9.996026304107713e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (53663 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49464 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 937/22095 [1:12:40<20:17:02, 3.45s/it] {'loss': 0.4693, 'grad_norm': 0.8561456799235027, 'learning_rate': 9.995997035951816e-06, 'epoch': 0.04}
  4%|▍ | 938/22095 [1:12:43<19:21:24, 3.29s/it] {'loss': 0.4863, 'grad_norm': 0.8924522605663066, 'learning_rate': 9.995967660447285e-06, 'epoch': 0.04}
  4%|▍ | 939/22095 [1:12:46<18:44:14, 3.19s/it] {'loss': 0.4681, 'grad_norm': 0.8418644621286018, 'learning_rate': 9.995938177594753e-06, 'epoch': 0.04}
  4%|▍ | 940/22095 [1:12:50<19:07:26, 3.25s/it] {'loss': 0.469, 'grad_norm': 1.2365889696528232, 'learning_rate': 9.995908587394854e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 941/22095 [1:13:00<31:28:53, 5.36s/it] {'loss': 0.5168, 'grad_norm': 0.6024320407477483, 'learning_rate': 9.995878889848223e-06, 'epoch': 0.04}
  4%|▍ | 942/22095 [1:13:04<29:16:12, 4.98s/it] {'loss': 0.4466, 'grad_norm': 0.9491137737305888, 'learning_rate': 9.995849084955498e-06, 'epoch': 0.04}
  4%|▍ | 943/22095 [1:13:08<27:07:57, 4.62s/it] {'loss': 0.4709, 'grad_norm': 0.851853575868263, 'learning_rate': 9.99581917271732e-06, 'epoch': 0.04}
  4%|▍ | 944/22095 [1:13:11<23:55:00, 4.07s/it] {'loss': 0.4656, 'grad_norm': 0.8028920300451445, 'learning_rate': 9.995789153134333e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41962 > 40960).
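Note on the recurring "Rank 0: Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" pair: the loader detects conversations that carry an image but contain no `<image>` placeholder, and patches the placeholder in before tokenization. A minimal sketch of such a fix-up follows, assuming the sample dict layout shown in the problematic samples and the `<image>` placeholder convention of Qwen-VL-style loaders; the actual repair in `data_qwen_2.py` may differ in detail.

```python
# Sketch: reconcile '<image>' placeholders in a conversation with the number
# of attached images, mirroring the "Fixed image tokens" log message.
# Hypothetical helper; not the actual data_qwen_2.py implementation.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(sample: dict) -> dict:
    """Prepend missing '<image>' placeholders to the first human turn."""
    images = sample.get("image")
    n_images = len(images) if isinstance(images, list) else int(images is not None)
    first_human = next(t for t in sample["conversations"] if t["from"] == "human")
    n_tokens = first_human["value"].count(IMAGE_TOKEN)
    if n_tokens < n_images:
        # Prepend the missing placeholders so token count matches image count.
        first_human["value"] = IMAGE_TOKEN * (n_images - n_tokens) + "\n" + first_human["value"]
    return sample

sample = {"image": "images/4924.png",
          "conversations": [{"from": "human", "value": "如图,求MN的长"},
                            {"from": "gpt", "value": "6.4cm"}]}
fixed = fix_image_tokens(sample)
print(fixed["conversations"][0]["value"].count(IMAGE_TOKEN))  # -> 1
```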
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53877 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84896 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 945/22095 [1:13:16<26:39:40, 4.54s/it] {'loss': 0.5244, 'grad_norm': 0.4079955958058016, 'learning_rate': 9.995759026207179e-06, 'epoch': 0.04}
  4%|▍ | 946/22095 [1:13:24<32:45:28, 5.58s/it] {'loss': 0.5344, 'grad_norm': 0.4484634107614797, 'learning_rate': 9.995728791936505e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
  4%|▍ | 947/22095 [1:13:30<32:05:20, 5.46s/it] {'loss': 0.4561, 'grad_norm': 0.906131738275771, 'learning_rate': 9.995698450322965e-06, 'epoch': 0.04}
  4%|▍ | 948/22095 [1:13:34<29:36:28, 5.04s/it] {'loss': 0.4295, 'grad_norm': 1.0408129265137542, 'learning_rate': 9.995668001367208e-06, 'epoch': 0.04}
  4%|▍ | 949/22095 [1:13:37<26:13:10, 4.46s/it] {'loss': 0.4805, 'grad_norm': 0.8137121642370845, 'learning_rate': 9.995637445069889e-06, 'epoch': 0.04}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957886 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8721, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 5\nB. 2\nC. 3\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
  4%|▍ | 950/22095 [1:13:40<24:24:54, 4.16s/it] {'loss': 0.5397, 'grad_norm': 0.8978674731733924, 'learning_rate': 9.995606781431664e-06, 'epoch': 0.04}
  4%|▍ | 951/22095 [1:13:44<23:15:57, 3.96s/it] {'loss': 0.4466, 'grad_norm': 0.8323695041137925, 'learning_rate': 9.99557601045319e-06, 'epoch': 0.04}
  4%|▍ | 952/22095 [1:13:47<23:04:47, 3.93s/it] {'loss': 0.4916, 'grad_norm': 0.8388981810246476, 'learning_rate': 9.995545132135133e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 953/22095 [1:13:55<29:51:31, 5.08s/it] {'loss': 0.5394, 'grad_norm': 0.5734877288235656, 'learning_rate': 9.995514146478152e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (50919 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64279 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51389 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57191 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45115 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
  4%|▍ | 954/22095 [1:13:59<27:42:16, 4.72s/it] {'loss': 0.4966, 'grad_norm': 0.9670137618853054, 'learning_rate': 9.995483053482917e-06, 'epoch': 0.04}
  4%|▍ | 955/22095 [1:14:02<25:11:23, 4.29s/it] {'loss': 0.4859, 'grad_norm': 0.8820830424535521, 'learning_rate': 9.995451853150091e-06, 'epoch': 0.04}
  4%|▍ | 956/22095 [1:14:06<23:38:32, 4.03s/it] {'loss': 0.512, 'grad_norm': 0.7973854009399443, 'learning_rate': 9.995420545480349e-06, 'epoch': 0.04}
  4%|▍ | 957/22095 [1:14:09<21:54:40, 3.73s/it] {'loss': 0.4833, 'grad_norm': 0.8844927712303589, 'learning_rate': 9.99538913047436e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (42492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48146 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62512 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42472 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 958/22095 [1:14:12<21:13:55, 3.62s/it] {'loss': 0.4711, 'grad_norm': 0.8569865302662765, 'learning_rate': 9.9953576081328e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (87033 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45389 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55757 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 959/22095 [1:14:15<20:18:40, 3.46s/it] {'loss': 0.439, 'grad_norm': 0.8153974165603668, 'learning_rate': 9.995325978456349e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (82690 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 960/22095 [1:14:18<19:08:37, 3.26s/it] {'loss': 0.445, 'grad_norm': 0.8034190547420145, 'learning_rate': 9.995294241445685e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 961/22095 [1:14:21<18:55:12, 3.22s/it] {'loss': 0.4781, 'grad_norm': 0.8022765956843442, 'learning_rate': 9.995262397101489e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59739 > 40960).
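Note on the flood of "Token indices sequence length is longer than the specified maximum sequence length" warnings: the tokenizer reports sequences up to 144079 tokens against the model's 40960 limit, and the trainer eventually clamps them ("Truncating to 40960"). A loader-side clamp makes that behavior explicit and silences the per-sample warning. The sketch below is a pure-Python stand-in that assumes only a list of token ids; real code would apply the same cut to the tokenizer output (and its labels/attention mask) together.

```python
# Sketch: clamp over-long token sequences to the model's maximum length
# (40960 in this run) before they reach the model, so indices past the
# limit can never cause indexing errors.
MAX_LEN = 40960

def clamp_sequence(input_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Truncate token ids that exceed the maximum sequence length."""
    if len(input_ids) > max_len:
        return input_ids[:max_len]  # keep the leading max_len tokens
    return input_ids

ids = list(range(45115))          # one of the lengths reported in the log
print(len(clamp_sequence(ids)))   # -> 40960
```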
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90603 > 40960). Running this sequence through the model will result in indexing errors
[2025-08-27 17:12:30,148] [WARNING] [stage3.py:2118:step] 1 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to reduce memory consumption. If you are unable to make the cache flushes go away consider adding get_accelerator().empty_cache() calls in your training loop to ensure that all ranks flush their caches at the same time
  4%|▍ | 962/22095 [1:14:31<31:05:58, 5.30s/it] {'loss': 0.544, 'grad_norm': 0.5816020147571523, 'learning_rate': 9.995230445424446e-06, 'epoch': 0.04}
  4%|▍ | 963/22095 [1:14:35<27:42:08, 4.72s/it] {'loss': 0.4766, 'grad_norm': 0.9986172227674865, 'learning_rate': 9.995198386415241e-06, 'epoch': 0.04}
  4%|▍ | 964/22095 [1:14:38<25:38:40, 4.37s/it] {'loss': 0.4591, 'grad_norm': 0.791646512962987, 'learning_rate': 9.995166220074566e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 965/22095 [1:14:42<24:40:22, 4.20s/it] {'loss': 0.4547, 'grad_norm': 0.8885119876985477, 'learning_rate': 9.995133946403111e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 966/22095 [1:14:52<35:11:44, 6.00s/it] {'loss': 0.5103, 'grad_norm': 0.4175285658178469, 'learning_rate': 9.995101565401566e-06, 'epoch': 0.04}
  4%|▍ | 967/22095 [1:14:59<36:31:56, 6.22s/it] {'loss': 0.5563, 'grad_norm': 0.44806347188990125, 'learning_rate': 9.995069077070632e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 364, but got module 1
  4%|▍ | 968/22095 [1:15:02<30:58:05, 5.28s/it] {'loss': 0.492, 'grad_norm': 1.2349779804925443, 'learning_rate': 9.995036481411005e-06, 'epoch': 0.04}
  4%|▍ | 969/22095 [1:15:05<27:33:19, 4.70s/it] {'loss': 0.4622, 'grad_norm': 0.9400022460617656, 'learning_rate': 9.995003778423383e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56226 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92014 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84197 > 40960).
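Note on the stage3.py allocator warning above: DeepSpeed recommends adding `get_accelerator().empty_cache()` calls in the training loop so that all ranks flush their caches at the same time. The sketch below isolates only the scheduling part of that advice as a step-interval hook; the flush callable is injected so the logic stands alone, and in real code it would be `deepspeed.accelerator.get_accelerator().empty_cache`. The interval of 100 steps is an illustrative choice, not a recommended value.

```python
# Sketch: call a cache-flush function on a fixed step interval so every rank
# flushes at the same step, per the DeepSpeed stage3 warning. The flush_fn
# argument is a stand-in for get_accelerator().empty_cache.
def make_cache_flusher(flush_fn, every_n_steps: int = 100):
    """Return a per-step hook that calls flush_fn every `every_n_steps`."""
    def hook(step: int) -> bool:
        if step > 0 and step % every_n_steps == 0:
            flush_fn()  # same step on every rank -> synchronized flushes
            return True
        return False
    return hook

calls = []
hook = make_cache_flusher(lambda: calls.append(1), every_n_steps=100)
for step in range(301):
    hook(step)
print(len(calls))  # -> 3  (flushed at steps 100, 200, 300)
```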
Running this sequence through the model will result in indexing errors
  4%|▍ | 970/22095 [1:15:16<37:26:46, 6.38s/it] {'loss': 0.536, 'grad_norm': 0.42943118298688115, 'learning_rate': 9.994970968108473e-06, 'epoch': 0.04}
  4%|▍ | 971/22095 [1:15:20<33:04:44, 5.64s/it] {'loss': 0.4773, 'grad_norm': 1.161378142208096, 'learning_rate': 9.994938050466976e-06, 'epoch': 0.04}
  4%|▍ | 972/22095 [1:15:23<29:39:51, 5.06s/it] {'loss': 0.5134, 'grad_norm': 1.1044222313320406, 'learning_rate': 9.994905025499602e-06, 'epoch': 0.04}
  4%|▍ | 973/22095 [1:15:27<27:40:49, 4.72s/it] {'loss': 0.4961, 'grad_norm': 0.8840976288998802, 'learning_rate': 9.994871893207058e-06, 'epoch': 0.04}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
  4%|▍ | 974/22095 [1:15:31<25:17:10, 4.31s/it] {'loss': 0.5218, 'grad_norm': 0.9243901215383543, 'learning_rate': 9.99483865359006e-06, 'epoch': 0.04}
  4%|▍ | 975/22095 [1:15:34<23:14:36, 3.96s/it] {'loss': 0.464, 'grad_norm': 0.924911837548586, 'learning_rate': 9.99480530664932e-06, 'epoch': 0.04}
  4%|▍ | 976/22095 [1:15:37<22:22:35, 3.81s/it] {'loss': 0.4863, 'grad_norm': 0.875085936045096, 'learning_rate': 9.994771852385552e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (62721 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 977/22095 [1:15:41<21:30:36, 3.67s/it] {'loss': 0.4658, 'grad_norm': 0.8127041858417101, 'learning_rate': 9.994738290799479e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 978/22095 [1:15:50<31:13:36, 5.32s/it] {'loss': 0.5463, 'grad_norm': 0.5493520559000986, 'learning_rate': 9.99470462189182e-06, 'epoch': 0.04}
  4%|▍ | 979/22095 [1:15:54<29:00:14, 4.94s/it] {'loss': 0.463, 'grad_norm': 1.02408537343092, 'learning_rate': 9.994670845663297e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (42741 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45726 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44948 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42246 > 40960). Running this sequence through the model will result in indexing errors
  4%|▍ | 980/22095 [1:15:57<26:00:06, 4.43s/it] {'loss': 0.4892, 'grad_norm': 0.916174894899694, 'learning_rate': 9.99463696211464e-06, 'epoch': 0.04}
  4%|▍ | 981/22095 [1:16:02<26:28:31, 4.51s/it] {'loss': 0.4441, 'grad_norm': 0.8152376323392234, 'learning_rate': 9.994602971246573e-06, 'epoch': 0.04}
Token indices sequence length is longer than the specified maximum sequence length for this model (70606 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 982/22095 [1:16:05<24:38:25, 4.20s/it] {'loss': 0.4403, 'grad_norm': 0.8328042792025057, 'learning_rate': 9.994568873059829e-06, 'epoch': 0.04}
  4%|▍ | 983/22095 [1:16:08<22:12:39, 3.79s/it] {'loss': 0.5063, 'grad_norm': 2.0243382061600372, 'learning_rate': 9.994534667555138e-06, 'epoch': 0.04}
  4%|▍ | 984/22095 [1:16:11<20:40:45, 3.53s/it] {'loss': 0.4542, 'grad_norm': 0.8762795255672268, 'learning_rate': 9.994500354733238e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  4%|▍ | 985/22095 [1:16:22<33:10:06, 5.66s/it] {'loss': 0.5622, 'grad_norm': 0.4677123917860248, 'learning_rate': 9.994465934594863e-06, 'epoch': 0.04}
  4%|▍ | 986/22095 [1:16:26<31:32:17, 5.38s/it] {'loss': 0.4326, 'grad_norm': 1.0928191852342053, 'learning_rate': 9.994431407140757e-06, 'epoch': 0.04}
  4%|▍ | 987/22095 [1:16:30<28:22:40, 4.84s/it] {'loss': 0.4831, 'grad_norm': 0.8365114456102918, 'learning_rate': 9.994396772371658e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954485 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5320, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
  4%|▍ | 988/22095 [1:16:40<37:55:24, 6.47s/it] {'loss': 0.5172, 'grad_norm': 0.40524873724140714, 'learning_rate': 9.994362030288312e-06, 'epoch': 0.04}
  4%|▍ | 989/22095 [1:16:44<32:46:46, 5.59s/it] {'loss': 0.4512, 'grad_norm': 0.7934455189770376, 'learning_rate': 9.994327180891462e-06, 'epoch': 0.04}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53899 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72971 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111444 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52996 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51179 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114622 > 40960).
Running this sequence through the model will result in indexing errors
  4%|▍ | 990/22095 [1:16:52<37:47:12, 6.45s/it] {'loss': 0.5455, 'grad_norm': 0.43443711211319097, 'learning_rate': 9.994292224181864e-06, 'epoch': 0.04}
  4%|▍ | 991/22095 [1:16:56<33:05:05, 5.64s/it] {'loss': 0.4939, 'grad_norm': 1.078608052597576, 'learning_rate': 9.994257160160263e-06, 'epoch': 0.04}
  4%|▍ | 992/22095 [1:17:00<29:31:15, 5.04s/it] {'loss': 0.4862, 'grad_norm': 0.8074285900663641, 'learning_rate': 9.994221988827415e-06, 'epoch': 0.04}
  4%|▍ | 993/22095 [1:17:03<27:10:26, 4.64s/it] {'loss': 0.43, 'grad_norm': 0.790319743842942, 'learning_rate': 9.994186710184073e-06, 'epoch': 0.04}
  4%|▍ | 994/22095 [1:17:06<23:56:19, 4.08s/it] {'loss': 0.4658, 'grad_norm': 0.8873418220441408, 'learning_rate': 9.994151324231e-06, 'epoch': 0.04}
  5%|▍ | 995/22095 [1:17:09<22:26:31, 3.83s/it] {'loss': 0.417, 'grad_norm': 0.8225925871404844, 'learning_rate': 9.994115830968951e-06, 'epoch': 0.05}
  5%|▍ | 996/22095 [1:17:12<20:41:57, 3.53s/it] {'loss': 0.474, 'grad_norm': 0.8076197995626312, 'learning_rate': 9.994080230398693e-06, 'epoch': 0.05}
  5%|▍ | 997/22095 [1:17:15<20:09:47, 3.44s/it] {'loss': 0.5055, 'grad_norm': 0.8566188848933955, 'learning_rate': 9.994044522520988e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  5%|▍ | 998/22095 [1:17:25<31:51:27, 5.44s/it] {'loss': 0.5656, 'grad_norm': 0.7046398217059011, 'learning_rate': 9.994008707336604e-06, 'epoch': 0.05}
  5%|▍ | 999/22095 [1:17:29<27:46:54, 4.74s/it] {'loss': 0.5063, 'grad_norm': 1.267284206330529, 'learning_rate': 9.99397278484631e-06, 'epoch': 0.05}
  5%|▍ | 1000/22095 [1:17:32<24:47:24, 4.23s/it] {'loss': 0.5099, 'grad_norm': 0.8217192392842306, 'learning_rate': 9.993936755050881e-06, 'epoch': 0.05}
  5%|▍ | 1001/22095 [1:17:35<23:39:35, 4.04s/it] {'loss': 0.4602, 'grad_norm': 0.866571381344001, 'learning_rate': 9.993900617951087e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (55947 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123449 > 40960). Running this sequence through the model will result in indexing errors
  5%|▍ | 1002/22095 [1:17:38<21:39:21, 3.70s/it] {'loss': 0.4631, 'grad_norm': 0.8053116757932492, 'learning_rate': 9.993864373547707e-06, 'epoch': 0.05}
  5%|▍ | 1003/22095 [1:17:42<21:18:24, 3.64s/it] {'loss': 0.4536, 'grad_norm': 0.8906153508438541, 'learning_rate': 9.993828021841518e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (98676 > 40960). Running this sequence through the model will result in indexing errors
  5%|▍ | 1004/22095 [1:17:45<20:34:41, 3.51s/it] {'loss': 0.4591, 'grad_norm': 0.7761743131090554, 'learning_rate': 9.993791562833303e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (52781 > 40960). Running this sequence through the model will result in indexing errors
  5%|▍ | 1005/22095 [1:17:48<20:43:05, 3.54s/it] {'loss': 0.5054, 'grad_norm': 0.8438694699439588, 'learning_rate': 9.993754996523846e-06, 'epoch': 0.05}
  5%|▍ | 1006/22095 [1:17:52<20:03:52, 3.43s/it] {'loss': 0.4456, 'grad_norm': 0.9633033027437268, 'learning_rate': 9.99371832291393e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  5%|▍ | 1007/22095 [1:18:02<32:40:53, 5.58s/it] {'loss': 0.5288, 'grad_norm': 0.7624238288048312, 'learning_rate': 9.993681542004343e-06, 'epoch': 0.05}
  5%|▍ | 1008/22095 [1:18:06<29:37:17, 5.06s/it] {'loss': 0.4678, 'grad_norm': 1.0441430641968361, 'learning_rate': 9.99364465379588e-06, 'epoch': 0.05}
  5%|▍ | 1009/22095 [1:18:10<26:55:54, 4.60s/it] {'loss': 0.428, 'grad_norm': 0.8104564043918575, 'learning_rate': 9.993607658289325e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  5%|▍ | 1010/22095 [1:18:20<36:34:45, 6.25s/it] {'loss': 0.5174, 'grad_norm': 0.39650819228795653, 'learning_rate': 9.993570555485484e-06, 'epoch': 0.05}
  5%|▍ | 1011/22095 [1:18:23<31:23:41, 5.36s/it] {'loss': 0.5068, 'grad_norm': 1.0830099479547428, 'learning_rate': 9.993533345385145e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
  5%|▍ | 1012/22095 [1:18:31<36:25:06, 6.22s/it] {'loss': 0.5052, 'grad_norm': 0.5169965065083604, 'learning_rate': 9.993496027989112e-06, 'epoch': 0.05}
  5%|▍ | 1013/22095 [1:18:41<42:35:29, 7.27s/it] {'loss': 0.5631, 'grad_norm': 0.4772035033679194, 'learning_rate': 
9.993458603298184e-06, 'epoch': 0.05} 5%|▍ | 1013/22095 [1:18:41<42:35:29, 7.27s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 5%|▍ | 1014/22095 [1:18:45<36:31:37, 6.24s/it] {'loss': 0.4949, 'grad_norm': 1.0771229663771391, 'learning_rate': 9.993421071313168e-06, 'epoch': 0.05} 5%|▍ | 1014/22095 [1:18:45<36:31:37, 6.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86002 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81632 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52307 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1015/22095 [1:18:48<30:56:31, 5.28s/it] {'loss': 0.4857, 'grad_norm': 0.9200033400407627, 'learning_rate': 9.993383432034869e-06, 'epoch': 0.05} 5%|▍ | 1015/22095 [1:18:48<30:56:31, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57318 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1016/22095 [1:18:51<27:53:30, 4.76s/it] {'loss': 0.4762, 'grad_norm': 0.8675512644866923, 'learning_rate': 9.993345685464097e-06, 'epoch': 0.05} 5%|▍ | 1016/22095 [1:18:51<27:53:30, 4.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (123630 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46609 > 40960) for 4 sample(s). Truncating to 5649 with 3 samples. 
5%|▍ | 1017/22095 [1:18:55<25:15:39, 4.31s/it] {'loss': 0.4652, 'grad_norm': 0.8495883898640996, 'learning_rate': 9.993307831601661e-06, 'epoch': 0.05} 5%|▍ | 1017/22095 [1:18:55<25:15:39, 4.31s/it] 5%|▍ | 1018/22095 [1:18:58<24:04:46, 4.11s/it] {'loss': 0.4401, 'grad_norm': 0.8723604577429355, 'learning_rate': 9.993269870448375e-06, 'epoch': 0.05} 5%|▍ | 1018/22095 [1:18:58<24:04:46, 4.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47166 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61216 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42543 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42224 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1019/22095 [1:19:01<21:36:59, 3.69s/it] {'loss': 0.4295, 'grad_norm': 0.695706230625535, 'learning_rate': 9.993231802005056e-06, 'epoch': 0.05} 5%|▍ | 1019/22095 [1:19:01<21:36:59, 3.69s/it] 5%|▍ | 1020/22095 [1:19:04<21:17:20, 3.64s/it] {'loss': 0.4856, 'grad_norm': 0.8270086618965578, 'learning_rate': 9.99319362627252e-06, 'epoch': 0.05} 5%|▍ | 1020/22095 [1:19:04<21:17:20, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51950 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67785 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64241 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1021/22095 [1:19:09<22:11:09, 3.79s/it] {'loss': 0.476, 'grad_norm': 0.8445513154871833, 'learning_rate': 9.993155343251592e-06, 'epoch': 0.05} 5%|▍ | 1021/22095 [1:19:09<22:11:09, 3.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1022/22095 [1:19:12<21:45:47, 3.72s/it] {'loss': 0.5158, 'grad_norm': 0.8926816239824479, 'learning_rate': 9.993116952943087e-06, 'epoch': 0.05} 5%|▍ | 1022/22095 [1:19:12<21:45:47, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1023/22095 [1:19:21<31:35:54, 5.40s/it] {'loss': 0.5605, 'grad_norm': 1.0848942171687697, 'learning_rate': 9.993078455347835e-06, 'epoch': 0.05} 5%|▍ | 1023/22095 [1:19:22<31:35:54, 5.40s/it] 5%|▍ | 1024/22095 [1:19:28<32:46:02, 5.60s/it] {'loss': 0.523, 'grad_norm': 0.7114123741793746, 'learning_rate': 9.993039850466664e-06, 'epoch': 0.05} 5%|▍ | 1024/22095 [1:19:28<32:46:02, 5.60s/it] 5%|▍ | 1025/22095 [1:19:37<39:19:03, 6.72s/it] {'loss': 0.5228, 'grad_norm': 0.41718334147884434, 'learning_rate': 9.9930011383004e-06, 'epoch': 0.05} 5%|▍ | 1025/22095 [1:19:37<39:19:03, 6.72s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = 
[self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader img = Image.open(buff) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open im = _open_core(fp, filename, prefix, formats) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core _decompression_bomb_check(im.size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check raise DecompressionBombError(msg) PIL.Image.DecompressionBombError: Image size (189987600 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. [Try #0] Failed to fetch sample 7927875 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (189987600 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. 
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/38238.png', 'image_wh': [[23340, 8140]], 'conversations': [{'from': 'human', 'value': '\nWhat is the top most technique to sell the company in the interview? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Give a tour of the site.\nAccording to the findings of a survey conducted on UK based pharmaceutical professionals and specialist recruiters, giving a tour of the company site is considered to be the top most technique to sell the company in an interview. This technique allows candidates to have a personal and interactive experience with the company and its work culture. It gives them a chance to see the work environment, meet potential colleagues and experience the day-to-day operations of the company. This technique can create a positive impression on the candidates and may make them more interested in accepting the job offer.'}]} 5%|▍ | 1026/22095 [1:19:41<35:03:58, 5.99s/it] {'loss': 0.4418, 'grad_norm': 1.0948263552696134, 'learning_rate': 9.992962318849876e-06, 'epoch': 0.05} 5%|▍ | 1026/22095 [1:19:41<35:03:58, 5.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1027/22095 [1:19:51<41:59:30, 7.18s/it] {'loss': 0.5414, 'grad_norm': 1.3184136429115096, 'learning_rate': 9.992923392115927e-06, 'epoch': 0.05} 5%|▍ | 1027/22095 [1:19:51<41:59:30, 7.18s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 5%|▍ | 1028/22095 [1:19:55<36:42:09, 6.27s/it] {'loss': 0.4914, 'grad_norm': 0.988322664954066, 'learning_rate': 9.992884358099389e-06, 'epoch': 0.05} 5%|▍ | 1028/22095 [1:19:55<36:42:09, 6.27s/it] 5%|▍ | 1029/22095 [1:20:04<40:38:52, 6.95s/it] {'loss': 0.5398, 'grad_norm': 1.03237806355917, 'learning_rate': 9.9928452168011e-06, 'epoch': 0.05} 5%|▍ | 1029/22095 [1:20:04<40:38:52, 6.95s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 5%|▍ | 
1030/22095 [1:20:08<34:59:47, 5.98s/it] {'loss': 0.4913, 'grad_norm': 1.0051916183489378, 'learning_rate': 9.992805968221902e-06, 'epoch': 0.05} 5%|▍ | 1030/22095 [1:20:08<34:59:47, 5.98s/it] 5%|▍ | 1031/22095 [1:20:11<30:39:10, 5.24s/it] {'loss': 0.5054, 'grad_norm': 0.823719675186997, 'learning_rate': 9.99276661236264e-06, 'epoch': 0.05} 5%|▍ | 1031/22095 [1:20:11<30:39:10, 5.24s/it] 5%|▍ | 1032/22095 [1:20:14<26:38:42, 4.55s/it] {'loss': 0.4993, 'grad_norm': 0.8540554077712061, 'learning_rate': 9.992727149224155e-06, 'epoch': 0.05} 5%|▍ | 1032/22095 [1:20:14<26:38:42, 4.55s/it] 5%|▍ | 1033/22095 [1:20:17<24:24:39, 4.17s/it] {'loss': 0.477, 'grad_norm': 0.9522023086516167, 'learning_rate': 9.992687578807296e-06, 'epoch': 0.05} 5%|▍ | 1033/22095 [1:20:17<24:24:39, 4.17s/it] 5%|▍ | 1034/22095 [1:20:20<21:59:18, 3.76s/it] {'loss': 0.525, 'grad_norm': 1.1578859999622484, 'learning_rate': 9.992647901112918e-06, 'epoch': 0.05} 5%|▍ | 1034/22095 [1:20:20<21:59:18, 3.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45107 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72796 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104970 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51237 > 40960). 
Running this sequence through the model will result in indexing errors 5%|▍ | 1035/22095 [1:20:24<22:03:55, 3.77s/it] {'loss': 0.4579, 'grad_norm': 0.7745434808777305, 'learning_rate': 9.992608116141868e-06, 'epoch': 0.05} 5%|▍ | 1035/22095 [1:20:24<22:03:55, 3.77s/it] 5%|▍ | 1036/22095 [1:20:27<21:42:22, 3.71s/it] {'loss': 0.4992, 'grad_norm': 0.7848365984109472, 'learning_rate': 9.992568223895007e-06, 'epoch': 0.05} 5%|▍ | 1036/22095 [1:20:27<21:42:22, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45261 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1037/22095 [1:20:31<22:13:51, 3.80s/it] {'loss': 0.4865, 'grad_norm': 1.4260545518131067, 'learning_rate': 9.992528224373184e-06, 'epoch': 0.05} 5%|▍ | 1037/22095 [1:20:31<22:13:51, 3.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81776 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41934 > 40960). 
Running this sequence through the model will result in indexing errors 5%|▍ | 1038/22095 [1:20:34<20:17:22, 3.47s/it] {'loss': 0.4634, 'grad_norm': 0.973652136914254, 'learning_rate': 9.992488117577265e-06, 'epoch': 0.05} 5%|▍ | 1038/22095 [1:20:34<20:17:22, 3.47s/it] 5%|▍ | 1039/22095 [1:20:38<21:08:38, 3.62s/it] {'loss': 0.4649, 'grad_norm': 0.7644367544615681, 'learning_rate': 9.99244790350811e-06, 'epoch': 0.05} 5%|▍ | 1039/22095 [1:20:38<21:08:38, 3.62s/it] 5%|▍ | 1040/22095 [1:20:41<20:09:30, 3.45s/it] {'loss': 0.474, 'grad_norm': 0.8781215453997191, 'learning_rate': 9.992407582166582e-06, 'epoch': 0.05} 5%|▍ | 1040/22095 [1:20:41<20:09:30, 3.45s/it] 5%|▍ | 1041/22095 [1:20:44<19:01:12, 3.25s/it] {'loss': 0.446, 'grad_norm': 0.8450792350739496, 'learning_rate': 9.992367153553549e-06, 'epoch': 0.05} 5%|▍ | 1041/22095 [1:20:44<19:01:12, 3.25s/it] 5%|▍ | 1042/22095 [1:20:47<19:00:20, 3.25s/it] {'loss': 0.455, 'grad_norm': 0.9136083369013077, 'learning_rate': 9.992326617669876e-06, 'epoch': 0.05} 5%|▍ | 1042/22095 [1:20:47<19:00:20, 3.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 5%|▍ | 1043/22095 [1:20:57<30:49:19, 5.27s/it] {'loss': 0.5615, 'grad_norm': 2.3247242416841916, 'learning_rate': 9.99228597451644e-06, 'epoch': 0.05} 5%|▍ | 1043/22095 [1:20:57<30:49:19, 5.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61981 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64003 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52001 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49838 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1044/22095 [1:21:00<27:11:54, 4.65s/it] {'loss': 0.4526, 'grad_norm': 0.980824771184377, 'learning_rate': 9.99224522409411e-06, 'epoch': 0.05} 5%|▍ | 1044/22095 [1:21:00<27:11:54, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 5%|▍ | 1045/22095 [1:21:10<36:36:27, 6.26s/it] {'loss': 0.5381, 'grad_norm': 0.7777888346815359, 'learning_rate': 9.992204366403761e-06, 'epoch': 0.05} 5%|▍ | 1045/22095 [1:21:10<36:36:27, 6.26s/it] 5%|▍ | 1046/22095 [1:21:20<42:11:36, 7.22s/it] {'loss': 0.5143, 'grad_norm': 0.568469961832735, 'learning_rate': 9.992163401446274e-06, 'epoch': 0.05} 5%|▍ | 1046/22095 [1:21:20<42:11:36, 7.22s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 5%|▍ | 1047/22095 [1:21:23<34:57:39, 5.98s/it] {'loss': 0.514, 'grad_norm': 1.0356275444400265, 'learning_rate': 9.992122329222527e-06, 'epoch': 0.05} 5%|▍ | 1047/22095 [1:21:23<34:57:39, 5.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1048/22095 [1:21:26<30:29:03, 5.21s/it] {'loss': 0.4338, 'grad_norm': 0.9332493931025058, 'learning_rate': 9.992081149733404e-06, 'epoch': 0.05} 5%|▍ | 1048/22095 [1:21:26<30:29:03, 5.21s/it] 5%|▍ | 1049/22095 [1:21:30<28:15:06, 4.83s/it] {'loss': 0.5054, 'grad_norm': 1.0333833578504823, 'learning_rate': 9.99203986297979e-06, 'epoch': 0.05} 5%|▍ | 1049/22095 [1:21:30<28:15:06, 4.83s/it] 5%|▍ | 1050/22095 [1:21:33<25:02:21, 4.28s/it] {'loss': 0.4828, 'grad_norm': 0.8779286388663978, 'learning_rate': 9.99199846896257e-06, 'epoch': 0.05} 5%|▍ | 1050/22095 [1:21:33<25:02:21, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model 
(55025 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52574 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91431 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1051/22095 [1:21:37<24:18:04, 4.16s/it] {'loss': 0.488, 'grad_norm': 0.7946450784816697, 'learning_rate': 9.991956967682635e-06, 'epoch': 0.05} 5%|▍ | 1051/22095 [1:21:37<24:18:04, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1052/22095 [1:21:47<33:30:56, 5.73s/it] {'loss': 0.5759, 'grad_norm': 2.8390151308444898, 'learning_rate': 9.991915359140876e-06, 'epoch': 0.05} 5%|▍ | 1052/22095 [1:21:47<33:30:56, 5.73s/it] 5%|▍ | 1053/22095 [1:21:50<28:51:22, 4.94s/it] {'loss': 0.4438, 'grad_norm': 0.795466215922787, 'learning_rate': 9.991873643338187e-06, 'epoch': 0.05} 5%|▍ | 1053/22095 [1:21:50<28:51:22, 4.94s/it] 5%|▍ | 1054/22095 [1:21:54<26:54:26, 4.60s/it] {'loss': 0.509, 'grad_norm': 0.89522116285953, 'learning_rate': 9.991831820275466e-06, 'epoch': 0.05} 5%|▍ | 1054/22095 [1:21:54<26:54:26, 4.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396943 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63796, 'image': 'vrdu_table_final_2/astro-ph.EP/cb5d0383-d74d-442c-80a6-3de96f98dfb9.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_x$\\end{tabular}\n```"}]} 5%|▍ | 1055/22095 [1:21:57<24:52:26, 4.26s/it] {'loss': 0.5155, 'grad_norm': 0.8460939918115581, 'learning_rate': 9.99178988995361e-06, 'epoch': 0.05} 5%|▍ | 1055/22095 [1:21:57<24:52:26, 4.26s/it] 5%|▍ | 1056/22095 [1:22:01<24:06:57, 4.13s/it] {'loss': 0.4917, 'grad_norm': 0.9491912394916352, 'learning_rate': 9.991747852373522e-06, 'epoch': 0.05} 5%|▍ | 1056/22095 [1:22:01<24:06:57, 4.13s/it] 5%|▍ | 1057/22095 [1:22:04<22:42:36, 3.89s/it] {'loss': 0.4638, 'grad_norm': 1.1064849521692737, 'learning_rate': 9.9917057075361e-06, 'epoch': 0.05} 5%|▍ | 1057/22095 [1:22:04<22:42:36, 3.89s/it] 5%|▍ | 1058/22095 [1:22:08<22:14:41, 3.81s/it] {'loss': 0.498, 'grad_norm': 0.8443091766555462, 'learning_rate': 9.991663455442255e-06, 'epoch': 0.05} 5%|▍ | 1058/22095 [1:22:08<22:14:41, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1059/22095 [1:22:18<33:32:47, 5.74s/it] {'loss': 0.5387, 'grad_norm': 1.4157963360282313, 'learning_rate': 9.991621096092895e-06, 'epoch': 0.05} 5%|▍ | 1059/22095 [1:22:18<33:32:47, 5.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1060/22095 [1:22:21<29:28:55, 5.05s/it] {'loss': 0.4364, 'grad_norm': 0.8466878016153724, 'learning_rate': 9.991578629488926e-06, 'epoch': 0.05} 5%|▍ | 1060/22095 [1:22:21<29:28:55, 5.05s/it] 5%|▍ | 1061/22095 
[1:22:25<26:27:37, 4.53s/it] {'loss': 0.4207, 'grad_norm': 0.907026124451798, 'learning_rate': 9.991536055631263e-06, 'epoch': 0.05} 5%|▍ | 1061/22095 [1:22:25<26:27:37, 4.53s/it] 5%|▍ | 1062/22095 [1:22:28<24:04:29, 4.12s/it] {'loss': 0.5143, 'grad_norm': 1.3059124793214225, 'learning_rate': 9.99149337452082e-06, 'epoch': 0.05} 5%|▍ | 1062/22095 [1:22:28<24:04:29, 4.12s/it] 5%|▍ | 1063/22095 [1:22:32<23:58:43, 4.10s/it] {'loss': 0.4878, 'grad_norm': 1.0645852262999596, 'learning_rate': 9.991450586158515e-06, 'epoch': 0.05} 5%|▍ | 1063/22095 [1:22:32<23:58:43, 4.10s/it] 5%|▍ | 1064/22095 [1:22:35<22:33:58, 3.86s/it] {'loss': 0.493, 'grad_norm': 0.8253282261334028, 'learning_rate': 9.991407690545267e-06, 'epoch': 0.05} 5%|▍ | 1064/22095 [1:22:35<22:33:58, 3.86s/it] 5%|▍ | 1065/22095 [1:22:38<20:41:08, 3.54s/it] {'loss': 0.4481, 'grad_norm': 0.841953712734116, 'learning_rate': 9.991364687681998e-06, 'epoch': 0.05} 5%|▍ | 1065/22095 [1:22:38<20:41:08, 3.54s/it] 5%|▍ | 1066/22095 [1:22:41<19:33:03, 3.35s/it] {'loss': 0.4993, 'grad_norm': 1.8495652063684374, 'learning_rate': 9.991321577569632e-06, 'epoch': 0.05} 5%|▍ | 1066/22095 [1:22:41<19:33:03, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1067/22095 [1:22:44<19:05:44, 3.27s/it] {'loss': 0.4786, 'grad_norm': 0.7895317716602368, 'learning_rate': 9.991278360209094e-06, 'epoch': 0.05} 5%|▍ | 1067/22095 [1:22:44<19:05:44, 3.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (116610 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53451 > 40960). 
Running this sequence through the model will result in indexing errors 5%|▍ | 1068/22095 [1:22:48<19:47:03, 3.39s/it] {'loss': 0.4064, 'grad_norm': 0.7317553707364586, 'learning_rate': 9.991235035601314e-06, 'epoch': 0.05} 5%|▍ | 1068/22095 [1:22:48<19:47:03, 3.39s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8883162 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 6315, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 7\nB. 2\nC. 2.5\nD. 
4.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1069/22095 [1:22:51<19:27:24, 3.33s/it] {'loss': 0.4873, 'grad_norm': 4.832414373312705, 'learning_rate': 9.991191603747223e-06, 'epoch': 0.05} 5%|▍ | 1069/22095 [1:22:51<19:27:24, 3.33s/it] 5%|▍ | 1070/22095 [1:22:54<19:37:02, 3.36s/it] {'loss': 0.4324, 'grad_norm': 0.8183438730813508, 'learning_rate': 9.991148064647753e-06, 'epoch': 0.05} 5%|▍ | 1070/22095 [1:22:54<19:37:02, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954303 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5138, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 8\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 5%|▍ | 1071/22095 [1:22:58<20:26:27, 3.50s/it] {'loss': 0.4406, 'grad_norm': 0.8433086571476347, 'learning_rate': 9.99110441830384e-06, 'epoch': 0.05} 5%|▍ | 1071/22095 [1:22:58<20:26:27, 3.50s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (106300000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 5%|▍ | 1072/22095 [1:23:01<19:44:29, 3.38s/it] {'loss': 0.4656, 'grad_norm': 0.8047137664912013, 'learning_rate': 9.991060664716423e-06, 'epoch': 0.05} 5%|▍ | 1072/22095 [1:23:01<19:44:29, 3.38s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/37949.png 2025-08-27 17:20:57.452960 load time: 1505.75 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▍ | 1073/22095 [1:23:04<19:06:58, 3.27s/it] {'loss': 0.4845, 'grad_norm': 0.8109246254977853, 'learning_rate': 9.991016803886441e-06, 'epoch': 0.05} 5%|▍ | 1073/22095 [1:23:04<19:06:58, 3.27s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8878406 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 1559, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 2\nB. 3\nC. 4\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:根据题意,AC=12cm,CB=\\frac{2}{3}AC,所以CB=8cm,所以AB=AC+CB=20cm,又D、E分别为AC、AB的中点,所以DE=AE-AD=\\frac{1}{2}(AB-AC)=4cm.即DE=4cm.'}]} 5%|▍ | 1074/22095 [1:23:07<18:49:50, 3.22s/it] {'loss': 0.4508, 'grad_norm': 0.88284711254973, 'learning_rate': 9.990972835814836e-06, 'epoch': 0.05} 5%|▍ | 1074/22095 [1:23:07<18:49:50, 3.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44435 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41650 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42786 > 40960). Running this sequence through the model will result in indexing errors 5%|▍ | 1075/22095 [1:23:11<19:08:53, 3.28s/it] {'loss': 0.4699, 'grad_norm': 0.9783194550591081, 'learning_rate': 9.990928760502554e-06, 'epoch': 0.05} 5%|▍ | 1075/22095 [1:23:11<19:08:53, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 5%|▍ | 1076/22095 [1:23:18<26:51:35, 4.60s/it] {'loss': 0.5487, 'grad_norm': 0.8693059736340535, 'learning_rate': 9.990884577950542e-06, 'epoch': 0.05} 5%|▍ | 1076/22095 [1:23:18<26:51:35, 4.60s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30810.png 2025-08-27 17:21:17.754874 load time: 1269.11 ms 5%|▍ | 1077/22095 [1:23:23<26:02:37, 4.46s/it] {'loss': 0.4735, 'grad_norm': 0.8105918598711073, 'learning_rate': 9.990840288159747e-06, 'epoch': 0.05} 5%|▍ | 1077/22095 [1:23:23<26:02:37, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (78842 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (40961 > 40960). 
Running this sequence through the model will result in indexing errors
5%|▍ | 1078/22095 [1:23:30<31:21:11, 5.37s/it] {'loss': 0.564, 'grad_norm': 0.4456518130885961, 'learning_rate': 9.990795891131125e-06, 'epoch': 0.05}
5%|▍ | 1079/22095 [1:23:33<27:28:09, 4.71s/it] {'loss': 0.4433, 'grad_norm': 0.7670428008042319, 'learning_rate': 9.990751386865624e-06, 'epoch': 0.05}
5%|▍ | 1080/22095 [1:23:36<24:26:53, 4.19s/it] {'loss': 0.4988, 'grad_norm': 0.917325505413355, 'learning_rate': 9.990706775364204e-06, 'epoch': 0.05}
5%|▍ | 1081/22095 [1:23:40<23:59:40, 4.11s/it] {'loss': 0.4192, 'grad_norm': 0.8182023141846203, 'learning_rate': 9.990662056627825e-06, 'epoch': 0.05}
5%|▍ | 1082/22095 [1:23:44<22:39:47, 3.88s/it] {'loss': 0.5304, 'grad_norm': 0.913312253395318, 'learning_rate': 9.990617230657446e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63542 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47425 > 40960).
Running this sequence through the model will result in indexing errors
5%|▍ | 1083/22095 [1:23:53<33:09:11, 5.68s/it] {'loss': 0.5529, 'grad_norm': 0.8520793636486157, 'learning_rate': 9.990572297454031e-06, 'epoch': 0.05}
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250508_132635_1/images/before_screenshot_1_id_115_function_1_crop_0_grounding_instructions_point_o_paste.png 2025-08-27 17:21:53.201427 load time: 1157.87 ms
5%|▍ | 1084/22095 [1:23:57<29:18:51, 5.02s/it] {'loss': 0.4579, 'grad_norm': 0.7872118274791575, 'learning_rate': 9.990527257018544e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (46667 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61686 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56430 > 40960).
Running this sequence through the model will result in indexing errors
5%|▍ | 1085/22095 [1:24:01<27:05:47, 4.64s/it] {'loss': 0.4798, 'grad_norm': 0.7971591462553888, 'learning_rate': 9.990482109351951e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▍ | 1086/22095 [1:24:11<36:17:18, 6.22s/it] {'loss': 0.5453, 'grad_norm': 0.5425751844160779, 'learning_rate': 9.990436854455228e-06, 'epoch': 0.05}
5%|▍ | 1087/22095 [1:24:14<31:03:44, 5.32s/it] {'loss': 0.4667, 'grad_norm': 0.885036195458033, 'learning_rate': 9.990391492329341e-06, 'epoch': 0.05}
5%|▍ | 1088/22095 [1:24:17<27:51:21, 4.77s/it] {'loss': 0.4687, 'grad_norm': 0.8812596963416286, 'learning_rate': 9.99034602297527e-06, 'epoch': 0.05}
5%|▍ | 1089/22095 [1:24:20<24:31:12, 4.20s/it] {'loss': 0.4574, 'grad_norm': 0.7548044713322737, 'learning_rate': 9.990300446393988e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▍ | 1090/22095 [1:24:23<22:57:08, 3.93s/it] {'loss': 0.4417, 'grad_norm': 1.3872353232776937, 'learning_rate': 9.990254762586477e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (89139 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44606 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41348 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85859 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79525 > 40960). Running this sequence through the model will result in indexing errors
5%|▍ | 1091/22095 [1:24:32<31:02:51, 5.32s/it] {'loss': 0.5387, 'grad_norm': 0.6820528986305062, 'learning_rate': 9.990208971553716e-06, 'epoch': 0.05}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20441.png 2025-08-27 17:22:30.764858 load time: 1808.43 ms
5%|▍ | 1092/22095 [1:24:35<27:37:37, 4.74s/it] {'loss': 0.4802, 'grad_norm': 0.8846682032693766, 'learning_rate': 9.990163073296692e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▍ | 1093/22095 [1:24:46<38:14:20, 6.55s/it] {'loss': 0.5377, 'grad_norm': 0.4990089890256931, 'learning_rate': 9.99011706781639e-06, 'epoch': 0.05}
5%|▍ | 1094/22095 [1:24:49<32:22:35, 5.55s/it] {'loss': 0.4238, 'grad_norm': 0.7752264950430418, 'learning_rate': 9.990070955113798e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [575, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8433992 in VC:s3://internvl-moe-sft-data/. Exception: Image size [575, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61517, 'image': 'vrdu_texteq/astro-ph.CO/eadf258c-288f-4895-9b55-1f1488cd49e0.png', 'image_wh': [[575, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': 'For $60\\, e$-folds and $\\Gamma = 0.1$ the observables read:'}]}
5%|▍ | 1095/22095 [1:24:53<28:48:40, 4.94s/it] {'loss': 0.4477, 'grad_norm': 0.8435283800432144, 'learning_rate': 9.990024735189907e-06, 'epoch': 0.05}
5%|▍ | 1096/22095 [1:24:56<25:24:08, 4.35s/it] {'loss': 0.4369, 'grad_norm': 0.7803389134460659, 'learning_rate': 9.989978408045709e-06, 'epoch': 0.05}
5%|▍ | 1097/22095 [1:24:59<23:19:14, 4.00s/it] {'loss': 0.5141, 'grad_norm': 0.7334389651169826, 'learning_rate': 9.989931973682202e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (50858 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53914 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127068 > 40960).
Running this sequence through the model will result in indexing errors
5%|▍ | 1098/22095 [1:25:02<21:56:09, 3.76s/it] {'loss': 0.5037, 'grad_norm': 0.820447034179538, 'learning_rate': 9.989885432100381e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (132859 > 40960). Running this sequence through the model will result in indexing errors
5%|▍ | 1099/22095 [1:25:12<31:44:13, 5.44s/it] {'loss': 0.5438, 'grad_norm': 0.6248412960611177, 'learning_rate': 9.989838783301248e-06, 'epoch': 0.05}
5%|▍ | 1100/22095 [1:25:15<27:41:54, 4.75s/it] {'loss': 0.4732, 'grad_norm': 0.8232431368916311, 'learning_rate': 9.989792027285805e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8551917 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 14321, 'image': '435070088.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a sci-fi book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
5%|▍ | 1101/22095 [1:25:18<25:05:59, 4.30s/it] {'loss': 0.4218, 'grad_norm': 0.8808472161055362, 'learning_rate': 9.989745164055056e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (58193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55043 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86430 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93604 > 40960). Running this sequence through the model will result in indexing errors
5%|▍ | 1102/22095 [1:25:22<24:28:47, 4.20s/it] {'loss': 0.4684, 'grad_norm': 0.8609138479748117, 'learning_rate': 9.989698193610007e-06, 'epoch': 0.05}
5%|▍ | 1103/22095 [1:25:25<23:11:31, 3.98s/it] {'loss': 0.4427, 'grad_norm': 0.7324041656570668, 'learning_rate': 9.98965111595167e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (62776 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104685 > 40960).
Running this sequence through the model will result in indexing errors
5%|▍ | 1104/22095 [1:25:29<22:35:22, 3.87s/it] {'loss': 0.46, 'grad_norm': 0.7668886253685008, 'learning_rate': 9.989603931081055e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [139, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8419815 in VC:s3://internvl-moe-sft-data/. Exception: Image size [139, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 88081, 'image': 'vrdu_texteq/astro-ph.CO/57a34b79-5f66-46b7-b07c-873289038bfd.png', 'image_wh': [[139, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'while $ D $ is :'}]}
5%|▌ | 1105/22095 [1:25:32<20:59:24, 3.60s/it] {'loss': 0.5022, 'grad_norm': 0.8435203086437428, 'learning_rate': 9.989556638999175e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1106/22095 [1:25:42<32:25:12, 5.56s/it] {'loss': 0.5337, 'grad_norm': 0.4261132058486116, 'learning_rate': 9.989509239707047e-06, 'epoch': 0.05}
5%|▌ | 1107/22095 [1:25:46<29:11:17, 5.01s/it] {'loss': 0.4439, 'grad_norm': 0.8578128070222978, 'learning_rate': 9.989461733205692e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [459, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8494779 in VC:s3://internvl-moe-sft-data/. Exception: Image size [459, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 70363, 'image': 'vrdu_texteq/astro-ph.CO/ca702bf1-14b0-42ea-9f61-9cf307da858e.png', 'image_wh': [[459, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'Here $N_b$ is the size of the data vector.'}]}
5%|▌ | 1108/22095 [1:25:49<25:39:34, 4.40s/it] {'loss': 0.4679, 'grad_norm': 0.836209356153558, 'learning_rate': 9.989414119496126e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (75076 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (138132 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51930 > 40960).
Running this sequence through the model will result in indexing errors
5%|▌ | 1109/22095 [1:25:52<23:22:08, 4.01s/it] {'loss': 0.4532, 'grad_norm': 0.7941865579121077, 'learning_rate': 9.989366398579375e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1110/22095 [1:26:02<34:32:06, 5.92s/it] {'loss': 0.5278, 'grad_norm': 0.35779933739851233, 'learning_rate': 9.989318570456463e-06, 'epoch': 0.05}
5%|▌ | 1111/22095 [1:26:06<30:39:43, 5.26s/it] {'loss': 0.4225, 'grad_norm': 0.8641871319948795, 'learning_rate': 9.989270635128418e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1112/22095 [1:26:15<37:58:51, 6.52s/it] {'loss': 0.5678, 'grad_norm': 0.5511830790669976, 'learning_rate': 9.989222592596272e-06, 'epoch': 0.05}
5%|▌ | 1113/22095 [1:26:19<33:06:28, 5.68s/it] {'loss': 0.4781, 'grad_norm': 0.8417673671234892, 'learning_rate': 9.989174442861056e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1114/22095 [1:26:23<29:36:41, 5.08s/it] {'loss': 0.4283, 'grad_norm': 0.8276051715311175, 'learning_rate': 9.989126185923803e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (49909 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72182 > 40960).
Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44055 > 40960) for 4 sample(s). Truncating to 1473 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (72936 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1115/22095 [1:26:26<26:03:29, 4.47s/it] {'loss': 0.4295, 'grad_norm': 0.8593958832271233, 'learning_rate': 9.989077821785552e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (51787 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1116/22095 [1:26:31<26:39:25, 4.57s/it] {'loss': 0.4601, 'grad_norm': 0.7195755355913743, 'learning_rate': 9.98902935044734e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1117/22095 [1:26:41<36:18:10, 6.23s/it] {'loss': 0.5335, 'grad_norm': 0.4017362161163235, 'learning_rate': 9.988980771910213e-06, 'epoch': 0.05}
5%|▌ | 1118/22095 [1:26:44<30:46:07, 5.28s/it] {'loss': 0.471, 'grad_norm': 0.8845074196050715, 'learning_rate': 9.988932086175209e-06, 'epoch': 0.05}
5%|▌ | 1119/22095 [1:26:47<27:15:20, 4.68s/it] {'loss': 0.4394, 'grad_norm': 0.761327368968983, 'learning_rate': 9.988883293243378e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [64, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365280 in VC:s3://internvl-moe-sft-data/. Exception: Image size [64, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32021, 'image': 'vrdu_table_final_2/astro-ph.CO/b21fa58c-51fa-4f01-8f1c-a9076048e1df.png', 'image_wh': [[64, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}0.066\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1120/22095 [1:26:50<24:05:52, 4.14s/it] {'loss': 0.4287, 'grad_norm': 0.7783332130399511, 'learning_rate': 9.988834393115768e-06, 'epoch': 0.05}
5%|▌ | 1121/22095 [1:26:54<24:08:25, 4.14s/it] {'loss': 0.4478, 'grad_norm': 0.7309564495384006, 'learning_rate': 9.988785385793427e-06, 'epoch': 0.05}
5%|▌ | 1122/22095 [1:26:57<21:39:41, 3.72s/it] {'loss': 0.4197, 'grad_norm': 0.8208787390443257, 'learning_rate': 9.98873627127741e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1123/22095 [1:27:07<31:53:39, 5.47s/it] {'loss': 0.5226, 'grad_norm': 0.4006151609810976, 'learning_rate': 9.988687049568772e-06, 'epoch': 0.05}
5%|▌ | 1124/22095 [1:27:13<33:11:08, 5.70s/it] {'loss': 0.5386, 'grad_norm': 0.36589360608826726, 'learning_rate': 9.988637720668573e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
5%|▌ | 1125/22095 [1:27:16<29:26:37, 5.05s/it] {'loss': 0.4795, 'grad_norm': 0.7734634813347808, 'learning_rate': 9.98858828457787e-06, 'epoch': 0.05}
5%|▌ | 1126/22095 [1:27:20<26:20:30, 4.52s/it] {'loss': 0.4619, 'grad_norm': 0.8394436286581666, 'learning_rate': 9.988538741297724e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (73307 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77308 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1127/22095 [1:27:22<23:10:13, 3.98s/it] {'loss': 0.4081, 'grad_norm': 0.8843306463699511, 'learning_rate': 9.988489090829204e-06, 'epoch': 0.05}
5%|▌ | 1128/22095 [1:27:26<23:16:26, 4.00s/it] {'loss': 0.4804, 'grad_norm': 0.7403667217778743, 'learning_rate': 9.988439333173373e-06, 'epoch': 0.05}
5%|▌ | 1129/22095 [1:27:30<23:26:58, 4.03s/it] {'loss': 0.4803, 'grad_norm': 1.013403764971607, 'learning_rate': 9.988389468331304e-06, 'epoch': 0.05}
5%|▌ | 1130/22095 [1:27:33<21:33:51, 3.70s/it] {'loss': 0.4575, 'grad_norm': 0.757680606016578, 'learning_rate': 9.988339496304062e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1131/22095 [1:27:37<21:02:50, 3.61s/it] {'loss': 0.4178, 'grad_norm': 0.7396867410136777, 'learning_rate': 9.988289417092729e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1132/22095 [1:27:41<21:24:25, 3.68s/it] {'loss': 0.5086, 'grad_norm': 0.8297804902848872, 'learning_rate': 9.988239230698373e-06, 'epoch': 0.05}
5%|▌ | 1133/22095 [1:27:44<20:14:09, 3.48s/it] {'loss': 0.4618, 'grad_norm': 0.7912505856561427, 'learning_rate': 9.988188937122078e-06, 'epoch': 0.05}
5%|▌ | 1134/22095 [1:27:48<21:01:05, 3.61s/it] {'loss': 0.4813, 'grad_norm': 0.9367111562458711, 'learning_rate': 9.988138536364922e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918156 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41309, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 32cm'}, {'from': 'gpt', 'value': '∵点D是AC的中点,∴bc=\\frac{1}{2}AC,∵点E是BC的中点,∴CE=\\frac{1}{2}CB,∴DE=DC+CE=\\frac{1}{2}(AC+CB)=8cm,故选B'}]}
5%|▌ | 1135/22095 [1:27:58<33:16:10, 5.71s/it] {'loss': 0.5245, 'grad_norm': 0.6628874626437402, 'learning_rate': 9.988088028427992e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (55720 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86023 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85618 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49605 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1136/22095 [1:28:02<29:42:34, 5.10s/it] {'loss': 0.4558, 'grad_norm': 1.7104204042871356, 'learning_rate': 9.988037413312365e-06, 'epoch': 0.05}
5%|▌ | 1137/22095 [1:28:05<26:01:13, 4.47s/it] {'loss': 0.4825, 'grad_norm': 0.8066081014907509, 'learning_rate': 9.987986691019136e-06, 'epoch': 0.05}
5%|▌ | 1138/22095 [1:28:08<23:08:51, 3.98s/it] {'loss': 0.4536, 'grad_norm': 0.806073020603354, 'learning_rate': 9.987935861549393e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1139/22095 [1:28:11<22:05:46, 3.80s/it] {'loss': 0.4765, 'grad_norm': 0.7535675067601677, 'learning_rate': 9.987884924904228e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (49801 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94103 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1140/22095 [1:28:15<21:59:26, 3.78s/it] {'loss': 0.4676, 'grad_norm': 0.8168490660049519, 'learning_rate': 9.987833881084734e-06, 'epoch': 0.05}
5%|▌ | 1141/22095 [1:28:18<20:48:07, 3.57s/it] {'loss': 0.485, 'grad_norm': 0.849879574073609, 'learning_rate': 9.987782730092009e-06, 'epoch': 0.05}
5%|▌ | 1142/22095 [1:28:22<21:35:50, 3.71s/it] {'loss': 0.4804, 'grad_norm': 0.8437989117176353, 'learning_rate': 9.987731471927152e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (86055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83092 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45174 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1143/22095 [1:28:26<22:00:27, 3.78s/it] {'loss': 0.4653, 'grad_norm': 1.1720073013495276, 'learning_rate': 9.987680106591264e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (65273 > 40960).
Running this sequence through the model will result in indexing errors
5%|▌ | 1144/22095 [1:28:36<32:58:26, 5.67s/it] {'loss': 0.5479, 'grad_norm': 0.6529217354151006, 'learning_rate': 9.98762863408545e-06, 'epoch': 0.05}
5%|▌ | 1145/22095 [1:28:39<28:57:22, 4.98s/it] {'loss': 0.4758, 'grad_norm': 0.8299860927045014, 'learning_rate': 9.987577054410813e-06, 'epoch': 0.05}
5%|▌ | 1146/22095 [1:28:43<26:54:23, 4.62s/it] {'loss': 0.4813, 'grad_norm': 0.7964057437485828, 'learning_rate': 9.987525367568464e-06, 'epoch': 0.05}
5%|▌ | 1147/22095 [1:28:46<24:15:45, 4.17s/it] {'loss': 0.4751, 'grad_norm': 0.7868668624743436, 'learning_rate': 9.987473573559514e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1148/22095 [1:28:55<32:24:22, 5.57s/it] {'loss': 0.5253, 'grad_norm': 0.36553766406529065, 'learning_rate': 9.987421672385073e-06, 'epoch': 0.05}
5%|▌ | 1149/22095 [1:29:00<30:33:15, 5.25s/it] {'loss': 0.474, 'grad_norm': 0.8803059079274371, 'learning_rate': 9.98736966404626e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (59326 > 40960).
Running this sequence through the model will result in indexing errors 5%|▌ | 1150/22095 [1:29:03<27:22:46, 4.71s/it] {'loss': 0.4543, 'grad_norm': 0.7268443082420789, 'learning_rate': 9.98731754854419e-06, 'epoch': 0.05} 5%|▌ | 1150/22095 [1:29:03<27:22:46, 4.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108679428 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 5%|▌ | 1151/22095 [1:29:14<37:34:39, 6.46s/it] {'loss': 0.5388, 'grad_norm': 0.38910048527339913, 'learning_rate': 9.987265325879983e-06, 'epoch': 0.05} 5%|▌ | 1151/22095 [1:29:14<37:34:39, 6.46s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047230 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12cm'}]} 5%|▌ | 1152/22095 [1:29:23<42:58:06, 7.39s/it] {'loss': 0.5392, 'grad_norm': 0.37753851519302517, 'learning_rate': 9.98721299605476e-06, 'epoch': 0.05} 5%|▌ | 1152/22095 [1:29:23<42:58:06, 7.39s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 5%|▌ | 1153/22095 [1:29:26<35:52:11, 6.17s/it] {'loss': 0.4305, 'grad_norm': 1.2202861369230587, 'learning_rate': 9.987160559069649e-06, 'epoch': 0.05} 5%|▌ | 1153/22095 [1:29:26<35:52:11, 6.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 5%|▌ | 1154/22095 [1:29:31<32:49:44, 5.64s/it] {'loss': 0.5202, 'grad_norm': 0.8826481532913922, 'learning_rate': 9.987108014925772e-06, 'epoch': 0.05} 5%|▌ | 1154/22095 [1:29:31<32:49:44, 5.64s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8938309 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 61462, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
5%|▌ | 1155/22095 [1:29:34<28:30:20, 4.90s/it] {'loss': 0.4912, 'grad_norm': 0.8066686419753973, 'learning_rate': 9.987055363624263e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (65344 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1156/22095 [1:29:37<25:23:58, 4.37s/it] {'loss': 0.4606, 'grad_norm': 0.9693526840801042, 'learning_rate': 9.98700260516625e-06, 'epoch': 0.05}
5%|▌ | 1157/22095 [1:29:41<23:57:23, 4.12s/it] {'loss': 0.4528, 'grad_norm': 0.7720688308741186, 'learning_rate': 9.986949739552867e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (102873 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88103 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56447 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70736 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1158/22095 [1:29:48<30:20:05, 5.22s/it] {'loss': 0.5449, 'grad_norm': 0.6195630876054363, 'learning_rate': 9.98689676678525e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (41812 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62262 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64779 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1159/22095 [1:29:51<26:37:28, 4.58s/it] {'loss': 0.4996, 'grad_norm': 1.0037880322159685, 'learning_rate': 9.986843686864538e-06, 'epoch': 0.05}
5%|▌ | 1160/22095 [1:29:54<23:45:05, 4.08s/it] {'loss': 0.4897, 'grad_norm': 0.8977655363937432, 'learning_rate': 9.986790499791872e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (65430 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59232 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (136443 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41228 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1161/22095 [1:30:02<29:52:50, 5.14s/it] {'loss': 0.5529, 'grad_norm': 0.42159467768136355, 'learning_rate': 9.986737205568393e-06, 'epoch': 0.05}
5%|▌ | 1162/22095 [1:30:05<26:54:43, 4.63s/it] {'loss': 0.4662, 'grad_norm': 0.7786886256636473, 'learning_rate': 9.986683804195248e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (44254 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43531 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57007 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66647 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87137 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1163/22095 [1:30:09<25:39:27, 4.41s/it] {'loss': 0.5036, 'grad_norm': 0.9184508577564199, 'learning_rate': 9.98663029567358e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1164/22095 [1:30:18<32:59:33, 5.67s/it] {'loss': 0.5074, 'grad_norm': 0.445883140071055, 'learning_rate': 9.986576680004546e-06, 'epoch': 0.05}
5%|▌ | 1165/22095 [1:30:22<30:24:59, 5.23s/it] {'loss': 0.4596, 'grad_norm': 0.7543499978541274, 'learning_rate': 9.986522957189293e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1166/22095 [1:30:33<39:23:03, 6.77s/it] {'loss': 0.5426, 'grad_norm': 0.3332988296091536, 'learning_rate': 9.986469127228977e-06, 'epoch': 0.05}
5%|▌ | 1167/22095 [1:30:42<44:08:15, 7.59s/it] {'loss': 0.5026, 'grad_norm': 0.29582601007727477, 'learning_rate': 9.986415190124754e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
5%|▌ | 1168/22095 [1:30:46<37:18:05, 6.42s/it] {'loss': 0.442, 'grad_norm': 0.9204990477506604, 'learning_rate': 9.986361145877783e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (48815 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76921 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1169/22095 [1:30:52<37:52:08, 6.51s/it] {'loss': 0.5665, 'grad_norm': 0.4160967570525296, 'learning_rate': 9.986306994489226e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (54679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69750 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (137879 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1170/22095 [1:30:56<32:22:52, 5.57s/it] {'loss': 0.4393, 'grad_norm': 0.8304239963080823, 'learning_rate': 9.986252735960245e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (72508 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1171/22095 [1:31:05<39:21:22, 6.77s/it] {'loss': 0.5234, 'grad_norm': 0.3926909854775528, 'learning_rate': 9.986198370292007e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (41525 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45374 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1172/22095 [1:31:09<34:14:48, 5.89s/it] {'loss': 0.5123, 'grad_norm': 0.8733750774138885, 'learning_rate': 9.98614389748568e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (49564 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63209 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42493 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1173/22095 [1:31:12<29:17:34, 5.04s/it] {'loss': 0.5097, 'grad_norm': 0.8887474864844891, 'learning_rate': 9.986089317542434e-06, 'epoch': 0.05}
5%|▌ | 1174/22095 [1:31:16<26:28:38, 4.56s/it] {'loss': 0.4245, 'grad_norm': 0.7862722297713525, 'learning_rate': 9.986034630463443e-06, 'epoch': 0.05}
5%|▌ | 1175/22095 [1:31:19<24:28:33, 4.21s/it] {'loss': 0.4794, 'grad_norm': 0.7915565801064068, 'learning_rate': 9.985979836249882e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (46589 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127159 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1176/22095 [1:31:22<22:28:12, 3.87s/it] {'loss': 0.4669, 'grad_norm': 0.8468539133747583, 'learning_rate': 9.985924934902927e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1177/22095 [1:31:32<32:13:13, 5.55s/it] {'loss': 0.5246, 'grad_norm': 0.6026584322786569, 'learning_rate': 9.985869926423757e-06, 'epoch': 0.05}
5%|▌ | 1178/22095 [1:31:42<40:14:02, 6.92s/it] {'loss': 0.5579, 'grad_norm': 0.49181922663684496, 'learning_rate': 9.985814810813556e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
5%|▌ | 1179/22095 [1:31:46<34:43:38, 5.98s/it] {'loss': 0.4746, 'grad_norm': 0.865929858522939, 'learning_rate': 9.985759588073508e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1180/22095 [1:31:51<34:17:55, 5.90s/it] {'loss': 0.5232, 'grad_norm': 0.4006188616165955, 'learning_rate': 9.985704258204798e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (41372 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127717 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1181/22095 [1:32:02<41:54:51, 7.21s/it] {'loss': 0.5318, 'grad_norm': 0.44661749061090317, 'learning_rate': 9.985648821208616e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (77007 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81345 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90943 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1182/22095 [1:32:05<34:45:35, 5.98s/it] {'loss': 0.472, 'grad_norm': 0.8150202842917483, 'learning_rate': 9.985593277086155e-06, 'epoch': 0.05}
5%|▌ | 1183/22095 [1:32:08<30:10:43, 5.20s/it] {'loss': 0.4737, 'grad_norm': 0.9321471164895254, 'learning_rate': 9.985537625838603e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1184/22095 [1:32:18<37:46:55, 6.50s/it] {'loss': 0.5248, 'grad_norm': 0.4787161789655883, 'learning_rate': 9.985481867467162e-06, 'epoch': 0.05}
5%|▌ | 1185/22095 [1:32:21<32:26:54, 5.59s/it] {'loss': 0.447, 'grad_norm': 0.781597037845619, 'learning_rate': 9.985426001973026e-06, 'epoch': 0.05}
5%|▌ | 1186/22095 [1:32:24<27:56:00, 4.81s/it] {'loss': 0.4877, 'grad_norm': 0.7739348385383168, 'learning_rate': 9.985370029357399e-06, 'epoch': 0.05}
5%|▌ | 1187/22095 [1:32:27<24:38:04, 4.24s/it] {'loss': 0.4825, 'grad_norm': 0.8249985271286937, 'learning_rate': 9.98531394962148e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [81, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369871 in VC:s3://internvl-moe-sft-data/. Exception: Image size [81, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36623, 'image': 'vrdu_table_final_2/astro-ph.CO/9471abcd-fbf1-48fc-b32e-8b5c954f4b74.png', 'image_wh': [[81, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[t]{c}\n \\@author\n \\end{tabular}\n```'}]}
5%|▌ | 1188/22095 [1:32:31<24:15:44, 4.18s/it] {'loss': 0.4585, 'grad_norm': 0.7769368520306797, 'learning_rate': 9.985257762766476e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1189/22095 [1:32:39<30:31:19, 5.26s/it] {'loss': 0.5321, 'grad_norm': 0.49925545381022285, 'learning_rate': 9.985201468793593e-06, 'epoch': 0.05}
5%|▌ | 1190/22095 [1:32:48<37:59:44, 6.54s/it] {'loss': 0.5298, 'grad_norm': 0.3838928358822514, 'learning_rate': 9.985145067704042e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (42996 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99014 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121946 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1191/22095 [1:32:51<32:05:15, 5.53s/it] {'loss': 0.4555, 'grad_norm': 0.9635548790408048, 'learning_rate': 9.985088559499032e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (57263 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1192/22095 [1:32:55<29:05:30, 5.01s/it] {'loss': 0.476, 'grad_norm': 0.8111640620370049, 'learning_rate': 9.985031944179781e-06, 'epoch': 0.05}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1193/22095 [1:33:00<27:43:21, 4.77s/it] {'loss': 0.4952, 'grad_norm': 0.8288410981511287, 'learning_rate': 9.984975221747505e-06, 'epoch': 0.05}
5%|▌ | 1194/22095 [1:33:03<24:46:32, 4.27s/it] {'loss': 0.4564, 'grad_norm': 0.8598494946424741, 'learning_rate': 9.984918392203421e-06, 'epoch': 0.05}
5%|▌ | 1195/22095 [1:33:06<23:24:41, 4.03s/it] {'loss': 0.4415, 'grad_norm': 0.7828458336222965, 'learning_rate': 9.98486145554875e-06, 'epoch': 0.05}
5%|▌ | 1196/22095 [1:33:09<21:11:17, 3.65s/it] {'loss': 0.4606, 'grad_norm': 0.7851268025912339, 'learning_rate': 9.984804411784717e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (54315 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44260 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74504 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86950 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1197/22095 [1:33:12<20:38:40, 3.56s/it] {'loss': 0.4906, 'grad_norm': 0.7997777564795878, 'learning_rate': 9.984747260912546e-06, 'epoch': 0.05}
5%|▌ | 1198/22095 [1:33:16<20:15:24, 3.49s/it] {'loss': 0.4964, 'grad_norm': 0.7557512579413929, 'learning_rate': 9.984690002933465e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
5%|▌ | 1199/22095 [1:33:22<25:23:56, 4.38s/it] {'loss': 0.5612, 'grad_norm': 1.010347993778274, 'learning_rate': 9.984632637848708e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (48327 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1200/22095 [1:33:30<32:23:14, 5.58s/it] {'loss': 0.5502, 'grad_norm': 0.5411492624911337, 'learning_rate': 9.984575165659503e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 364, but got module 1
5%|▌ | 1201/22095 [1:33:34<29:18:29, 5.05s/it] {'loss': 0.4881, 'grad_norm': 1.0723421207044346, 'learning_rate': 9.984517586367088e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308342 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
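The many `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings above are emitted at tokenization time; the over-long sequences would index out of range if fed to the model unchanged. A minimal length guard, assuming the 40960 limit reported by the tokenizer (the function names are hypothetical, not from the training code):

```python
MAX_SEQ_LEN = 40960  # model maximum reported by the tokenizer warnings above


def fits_context(token_ids: list, max_len: int = MAX_SEQ_LEN) -> bool:
    """True when the tokenized sample fits the model's context window."""
    return len(token_ids) <= max_len


def skip_or_truncate(token_ids, max_len: int = MAX_SEQ_LEN, truncate: bool = False):
    """Either drop an over-long sample (return None) or hard-truncate it.

    Truncation is lossy for multimodal samples, since it can cut through
    image placeholder tokens, so skipping is usually the safer default.
    """
    if fits_context(token_ids, max_len):
        return token_ids
    return token_ids[:max_len] if truncate else None
```

Applying such a check when the dataset manifest is built would silence these warnings and avoid training on clipped sequences.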
Problematic sample: {'image': 'TB2asCSoDnI8KJjy0FfXXcdoVXa_!!2638823772.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you please help me read its text content?'}, {'from': 'gpt', 'value': 'All words in the image:\n奥亚\n多功能\nAY\n密封条\n包邮\n买贵补差价\n5米装\n浴室门缝\n玻璃门\n窗缝'}]}
5%|▌ | 1202/22095 [1:33:38<26:24:15, 4.55s/it] {'loss': 0.4582, 'grad_norm': 1.024700727373943, 'learning_rate': 9.984459899972696e-06, 'epoch': 0.05}
5%|▌ | 1203/22095 [1:33:41<24:07:05, 4.16s/it] {'loss': 0.4682, 'grad_norm': 0.8541022994896453, 'learning_rate': 9.984402106477572e-06, 'epoch': 0.05}
5%|▌ | 1204/22095 [1:33:44<22:33:47, 3.89s/it] {'loss': 0.4897, 'grad_norm': 0.8690265217212084, 'learning_rate': 9.984344205882954e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1205/22095 [1:33:54<33:54:55, 5.84s/it] {'loss': 0.5448, 'grad_norm': 1.574950294258745, 'learning_rate': 9.984286198190087e-06, 'epoch': 0.05}
5%|▌ | 1206/22095 [1:33:58<29:45:25, 5.13s/it] {'loss': 0.48, 'grad_norm': 0.9788112855639516, 'learning_rate': 9.984228083400218e-06, 'epoch': 0.05}
5%|▌ | 1207/22095 [1:34:02<28:05:29, 4.84s/it] {'loss': 0.4679, 'grad_norm': 0.8732371443748891, 'learning_rate': 9.984169861514597e-06, 'epoch': 0.05}
Invalidate trace cache @ step 2: expected module 1, but got module 364
5%|▌ | 1208/22095 [1:34:10<33:55:00, 5.85s/it] {'loss': 0.5437, 'grad_norm': 0.6303672878123504, 'learning_rate': 9.98411153253447e-06, 'epoch': 0.05}
5%|▌ | 1209/22095 [1:34:14<31:00:11, 5.34s/it] {'loss': 0.4852, 'grad_norm': 0.9558686425270596, 'learning_rate': 9.984053096461098e-06, 'epoch': 0.05}
5%|▌ | 1210/22095 [1:34:19<28:47:29, 4.96s/it] {'loss': 0.4598, 'grad_norm': 1.087269887409945, 'learning_rate': 9.983994553295728e-06, 'epoch': 0.05}
5%|▌ | 1211/22095 [1:34:23<27:32:24, 4.75s/it] {'loss': 0.4845, 'grad_norm': 0.8303647766368976, 'learning_rate': 9.983935903039625e-06, 'epoch': 0.05}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [212, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8512846 in VC:s3://internvl-moe-sft-data/. Exception: Image size [212, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 51015, 'image': 'vrdu_texteq/astro-ph.CO/fecf3277-7055-47e4-8adf-ad4ade6759c7.png', 'image_wh': [[212, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'with $\\mathcal{G}$ defined as'}]}
5%|▌ | 1212/22095 [1:34:27<26:03:18, 4.49s/it] {'loss': 0.5078, 'grad_norm': 0.8183369574706792, 'learning_rate': 9.983877145694046e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (51726 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1213/22095 [1:34:30<23:31:26, 4.06s/it] {'loss': 0.4419, 'grad_norm': 0.8522053359132349, 'learning_rate': 9.983818281260253e-06, 'epoch': 0.05}
5%|▌ | 1214/22095 [1:34:33<22:04:45, 3.81s/it] {'loss': 0.4218, 'grad_norm': 0.9621572091092583, 'learning_rate': 9.983759309739512e-06, 'epoch': 0.05}
Token indices sequence length is longer than the specified maximum sequence length for this model (67291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71180 > 40960). Running this sequence through the model will result in indexing errors
5%|▌ | 1215/22095 [1:34:36<20:31:38, 3.54s/it] {'loss': 0.5094, 'grad_norm': 0.9053413715699286, 'learning_rate': 9.98370023113309e-06, 'epoch': 0.05}
6%|▌ | 1216/22095 [1:34:40<21:02:11, 3.63s/it] {'loss': 0.4825, 'grad_norm': 0.7835898344779059, 'learning_rate': 9.983641045442256e-06, 'epoch': 0.06}
6%|▌ | 1217/22095 [1:34:43<20:49:21, 3.59s/it] {'loss': 0.5178, 'grad_norm': 0.9485400715813633, 'learning_rate': 9.983581752668283e-06, 'epoch': 0.06}
6%|▌ | 1218/22095 [1:34:47<20:30:37, 3.54s/it] {'loss': 0.4506, 'grad_norm': 0.7566415295491471, 'learning_rate': 9.983522352812443e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (51501 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1219/22095 [1:34:49<19:22:47, 3.34s/it] {'loss': 0.4485, 'grad_norm': 0.7828487329955025, 'learning_rate': 9.983462845876015e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (113417 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1220/22095 [1:34:53<19:34:47, 3.38s/it] {'loss': 0.4515, 'grad_norm': 0.8264869845338306, 'learning_rate': 9.983403231860273e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1221/22095 [1:34:59<23:56:29, 4.13s/it] {'loss': 0.5786, 'grad_norm': 1.5117493089257694, 'learning_rate': 9.983343510766504e-06, 'epoch': 0.06}
6%|▌ | 1222/22095 [1:35:04<25:37:37, 4.42s/it] {'loss': 0.4313, 'grad_norm': 0.9014398579583309, 'learning_rate': 9.983283682595986e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1223/22095 [1:35:16<39:35:48, 6.83s/it] {'loss': 0.5528, 'grad_norm': 0.614288103847374, 'learning_rate': 9.983223747350008e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1224/22095 [1:35:20<33:45:45, 5.82s/it] {'loss': 0.4304, 'grad_norm': 0.9713214207122499, 'learning_rate': 9.983163705029857e-06, 'epoch': 0.06}
6%|▌ | 1225/22095 [1:35:48<72:49:27, 12.56s/it] {'loss': 0.4615, 'grad_norm': 0.9031593512931938, 'learning_rate': 9.983103555636821e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1226/22095 [1:37:30<228:03:58, 39.34s/it] {'loss': 0.4663, 'grad_norm': 0.9316822418803643, 'learning_rate': 9.983043299172195e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250624/ubuntu/images/libreoffice_calc/6a805b54-8819-4a8a-a154-af211a5b07fd/images/step_3.png 2025-08-27 17:35:28.723973 load time: 1233.93 ms
6%|▌ | 1227/22095 [1:37:38<174:03:46, 30.03s/it] {'loss': 0.5504, 'grad_norm': 1.5925339118559703, 'learning_rate': 9.982982935637272e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8411500 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 6, 100, 100] is too small. Minimum size is 28.
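The recurring `Rank 0: Number of image tokens N does not match number of images M` / `Rank 0: Fixed image tokens in the conversation` pairs indicate the loader patches conversations whose placeholder count disagrees with the attached images. A sketch of such a repair, assuming a literal `<image>` placeholder string; the actual token and repair logic in `data_qwen_2.py` may differ:

```python
IMG_TOKEN = "<image>"  # assumed placeholder; the real dataset may use another marker


def fix_image_tokens(text: str, num_images: int) -> str:
    """Make the number of placeholders in `text` match `num_images`.

    If the counts already agree the text is returned unchanged; otherwise all
    existing placeholders are stripped and one per image is prepended, each on
    its own line, mirroring the '\n<image>' layout of the samples in this log.
    """
    if text.count(IMG_TOKEN) == num_images:
        return text
    body = text.replace(IMG_TOKEN, "").lstrip("\n")
    return "\n".join([IMG_TOKEN] * num_images + [body]) if num_images else body
```

Logging which samples needed this repair (as the training run does) helps locate manifest entries whose conversations were exported without their placeholders.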
Problematic sample: {'id': 13707, 'image': 'vrdu_table_final_2/astro-ph.CO/5bd7fff3-4ec8-4c27-b6bd-2b4b07fcef0e.png', 'image_wh': [[23, 6]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}ccccccccccc@{}}\n...\n\\end{tabular}\n```"}]}
6%|▌ | 1228/22095 [1:37:42<127:42:54, 22.03s/it] {'loss': 0.4547, 'grad_norm': 0.8044607515862033, 'learning_rate': 9.98292246503335e-06, 'epoch': 0.06}
VC:s3://gui/data_20250328/icon_canva/images/mobile_1440x3120_1743153888_canvas.png 2025-08-27 17:35:40.399179 load time: 1049.75 ms
6%|▌ | 1229/22095 [1:37:45<95:14:57, 16.43s/it] {'loss': 0.4676, 'grad_norm': 0.7912690572472879, 'learning_rate': 9.982861887361728e-06, 'epoch': 0.06}
VC:s3://sa-1b/sa_000000/sa_1415.jpg 2025-08-27 17:35:43.765927 load time: 1068.68 ms
6%|▌ | 1230/22095 [1:39:03<202:33:42, 34.95s/it] {'loss': 0.4764, 'grad_norm': 0.9431332129547105, 'learning_rate': 9.982801202623708e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (42652 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58526 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1231/22095 [1:39:32<192:35:14, 33.23s/it] {'loss': 0.4274, 'grad_norm': 0.8081337093685754, 'learning_rate': 9.982740410820595e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (104283 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1232/22095 [1:40:24<224:38:48, 38.76s/it] {'loss': 0.406, 'grad_norm': 0.763449935761009, 'learning_rate': 9.98267951195369e-06, 'epoch': 0.06}
6%|▌ | 1233/22095 [1:42:29<374:30:30, 64.63s/it] {'loss': 0.451, 'grad_norm': 0.8422970501795058, 'learning_rate': 9.982618506024309e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (62062 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55601 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104945 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1234/22095 [1:45:05<532:37:52, 91.92s/it] {'loss': 0.4609, 'grad_norm': 0.757141131265391, 'learning_rate': 9.982557393033758e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (41445 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49523 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1235/22095 [1:45:58<465:02:55, 80.26s/it] {'loss': 0.4758, 'grad_norm': 0.7797311152704947, 'learning_rate': 9.98249617298335e-06, 'epoch': 0.06}
6%|▌ | 1236/22095 [1:46:49<415:23:36, 71.69s/it] {'loss': 0.4443, 'grad_norm': 0.7571561418356503, 'learning_rate': 9.982434845874405e-06, 'epoch': 0.06}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_115534_before_screenshot.png 2025-08-27 17:44:48.136382 load time: 1031.38 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1237/22095 [1:47:59<412:26:28, 71.19s/it] {'loss': 0.4379, 'grad_norm': 0.940424347691789, 'learning_rate': 9.982373411708237e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1238/22095 [1:52:04<713:32:35, 123.16s/it] {'loss': 0.5175, 'grad_norm': 0.8378688150058623, 'learning_rate': 9.982311870486166e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250616/windows_paste/images/autocad/20250509_145301_364946_2818_1/images/before_screenshot_1_id_0_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-27 17:50:02.574781 load time: 1303.79 ms
6%|▌ | 1239/22095 [1:53:05<605:10:21, 104.46s/it] {'loss': 0.5485, 'grad_norm': 0.7725216832344305, 'learning_rate': 9.982250222209513e-06, 'epoch': 0.06}
6%|▌ | 1240/22095 [1:55:08<638:13:29, 110.17s/it] {'loss': 0.5171, 'grad_norm': 0.5805096150729553, 'learning_rate': 9.982188466879607e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (59544 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1241/22095 [1:57:44<717:19:53, 123.83s/it] {'loss': 0.432, 'grad_norm': 0.857878552541859, 'learning_rate': 9.98212660449777e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1242/22095 [1:59:08<648:23:45, 111.94s/it] {'loss': 0.5439, 'grad_norm': 0.4158987046032542, 'learning_rate': 9.982064635065336e-06, 'epoch': 0.06}
6%|▌ | 1243/22095 [2:00:52<634:59:17, 109.63s/it] {'loss': 0.5426, 'grad_norm': 0.4930079344799815, 'learning_rate': 9.982002558583633e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [46, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7796110 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [46, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '17450', 'image': '31946.jpg', 'image_wh': [[46, 23]], 'conversations': [{'from': 'human', 'value': '\nI am responding to the following question using the image as a reference: \nHow do the objects or people in the image interact with each other?
\nHere is my answer: \nThe image showcases a mathematical formulation, which can be visualized as:\n\\[ \\widehat{\\mu}(x; y) \\]\n\nThis formulation prominently displays a caret symbol (^) above the Greek letter "mu" (\\(\\mu\\)), denoting a transformed or estimated version of mu. The entire term is a function of two variables, \\( x \\) and \\( y \\), both encapsulated within parentheses and demarcated by a semicolon.\n\n### Detailed Breakdown:\n\n1. **Caret ( ^ ) Symbol**: Positioned atop the symbol \\(\\mu\\), the caret usually symbolizes an estimation or an alternate form of the variable it is superimposed on.\n2. **Mu (\\(\\mu\\))**: The Greek letter commonly utilized to denote the mean or expectation in statistical analyses. Here, it appears in the context of a mean-like quantity.\n3. **Variables \\( x \\) and \\( y \\)**: Serving as inputs within the parentheses, these conventionally symbolize variables of interest or indices pertinent to the function. The semicolon\'s inclusion implies that these variables might have separate impacts on the function.\n4. **Parentheses ((); )**: These enclose the variables on which the expression\'s outcome hinges.\n\n### Potential Context:\n\n- This formulation seems to embody a mean-related function, where mean, often represented by \\(\\mu\\), is engaged in statistical and probability analyses to signify the average value of a set of numbers. Here, the caret might denote an estimated or adjusted mean.\n- The reliance on two variables \\( x \\) and \\( y \\) implies that this expression was formulated within a framework where the mean is analyzed as a function of these two parameters. 
This scenario might be situated within a temporal dimension (where \\( x \\) could symbolize time) and another dimension (\\( y \\)).\n\n### Conclusion:\n\nGiven \\( \\widehat{\\mu}(x; y) \\), we discern:\n- It is an estimated or modified mean function.\n- It hinges on two parameters, \\( x \\) and \\( y \\), which are set apart by a semicolon, pointing to the possibility of distinct roles for these variables within the function.\n\nContemplating these aspects, the expression could be situated within a statistical model where mean estimation is conducted over a time horizon represented by \\( x \\) or across different variables symbolized by \\( y \\). \nKindly correct my answer based on the image, and if it\'s wrong, please explain the reason.'}, {'from': 'gpt', 'value': 'Your description for the mathematical expression is thorough and well-articulated, showing a deep understanding of the notation and its potential contexts in statistical analysis. However, there are a few points where your answer diverges from the image provided:\n\n1. **Mismatch in Symbol Description**: Initially, you described a caret (^) above the Greek letter mu (μ), which denotes an estimated or transformed version of mu. However, the image shows a tilde (~) above sigma (σ), which generally indicates an approximation or alternative form.\n\n2. **Incorrect Greek Letter**: You interpreted the Greek letter as mu (μ), which is typically associated with the mean in statistical contexts. The image, however, features sigma squared (σ²), which is often related to variance or the square of the standard deviation.\n\n3. **Exponent Overlooked**: Your original answer did not include the exponent indicating the square of the Greek letter. The image clearly shows σ squared.\n\n4. 
**Variable Names**: In your initial response, you described the variables as x and y, while the image uses t and s.\n\nThe rest of your explanation regarding the variables, parentheses, and the potential roles of the variables within the function is sound and accurately reflects the conventions of mathematical notation in statistical models. Your detailed breakdown and potential context for the expression are insightful and appropriately interpret the function as it might be used in statistical analysis.'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/app_store_ios/336309916866.png 2025-08-27 17:58:51.027120 load time: 1041.29 ms
VC:s3://gui/aguvis/aguvis-stage1/widget_captioning/images/15948.jpg 2025-08-27 17:58:51.025398 load time: 1053.96 ms
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_185445_before_screenshot.png 2025-08-27 17:58:51.027088 load time: 1042.43 ms
VC:s3://gui/aguvis/aguvis-stage2/amex/images/19ff2e31474140638551e99f55e09b0fstep4.png 2025-08-27 17:58:51.025186 load time: 1058.01 ms
VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/4852351861639435_12.png 2025-08-27 17:58:51.025569 load time: 1075.91 ms
6%|▌ | 1244/22095 [2:01:42<531:40:51, 91.80s/it] {'loss': 0.4712, 'grad_norm': 0.9266098650508937, 'learning_rate': 9.981940375053996e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250630/android/images/Cantook/multi_cantook_2/images/015_click_1750148747400.png 2025-08-27 17:59:41.220389 load time: 1031.09 ms
6%|▌ | 1245/22095 [2:02:41<473:46:36, 81.80s/it] {'loss': 0.4478, 'grad_norm': 0.8588623400330749, 'learning_rate': 9.981878084477764e-06, 'epoch': 0.06}
6%|▌ | 1246/22095 [2:04:54<562:47:45, 97.18s/it] {'loss': 0.4534, 'grad_norm': 0.7961382205379587, 'learning_rate': 9.981815686856268e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (71960 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47580 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1247/22095 [2:07:47<693:41:04, 119.78s/it] {'loss': 0.4689, 'grad_norm': 0.8930930962002508, 'learning_rate': 9.981753182190856e-06, 'epoch': 0.06}
6%|▌ | 1248/22095 [2:10:23<758:05:17, 130.91s/it] {'loss': 0.4321, 'grad_norm': 0.907959707580663, 'learning_rate': 9.981690570482869e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48091 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113084 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74451 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1249/22095 [2:10:31<544:11:16, 93.98s/it] {'loss': 0.5635, 'grad_norm': 0.7678510229566248, 'learning_rate': 9.981627851733651e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (48219 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52403 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1250/22095 [2:10:35<387:16:01, 66.88s/it] {'loss': 0.4135, 'grad_norm': 0.8515278105378038, 'learning_rate': 9.98156502594455e-06, 'epoch': 0.06}
6%|▌ | 1251/22095 [2:12:24<459:54:59, 79.43s/it] {'loss': 0.4919, 'grad_norm': 0.7744151104367206, 'learning_rate': 9.981502093116917e-06, 'epoch': 0.06}
6%|▌ | 1252/22095 [2:13:16<412:27:20, 71.24s/it] {'loss': 0.4288, 'grad_norm': 0.7920093313730772, 'learning_rate': 9.981439053252102e-06, 'epoch': 0.06}
6%|▌ | 1253/22095 [2:14:51<454:49:47, 78.56s/it] {'loss': 0.4679, 'grad_norm': 0.8486807298027813, 'learning_rate': 9.981375906351463e-06, 'epoch': 0.06}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_469864.png 2025-08-27 18:12:50.112760 load time: 1134.28 ms
6%|▌ | 1254/22095 [2:16:13<460:32:18, 79.55s/it] {'loss': 0.5354, 'grad_norm': 1.0169113659412827, 'learning_rate': 9.981312652416353e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42525 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70866 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41163 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102839 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1255/22095 [2:17:30<455:44:45, 78.73s/it] {'loss': 0.5397, 'grad_norm': 0.5822200648528999, 'learning_rate': 9.981249291448134e-06, 'epoch': 0.06}
6%|▌ | 1256/22095 [2:17:33<324:21:32, 56.03s/it] {'loss': 0.4753, 'grad_norm': 0.8313948880129977, 'learning_rate': 9.981185823448166e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1257/22095 [2:18:22<312:06:20, 53.92s/it] {'loss': 0.5286, 'grad_norm': 0.43699857233451633, 'learning_rate': 9.981122248417815e-06, 'epoch': 0.06}
6%|▌ | 1258/22095 [2:19:22<321:49:35, 55.60s/it] {'loss': 0.5527, 'grad_norm': 0.3821736230004524, 'learning_rate': 9.981058566358443e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://st2pj/20250222/images/sam-all/images/sa_554413.jpg 2025-08-27 18:17:20.368957 load time: 1034.29 ms
6%|▌ | 1259/22095 [2:19:25<231:18:03, 39.96s/it] {'loss': 0.4282, 'grad_norm': 0.7856472855765193, 'learning_rate': 9.98099477727142e-06, 'epoch': 0.06}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_045946_before_screenshot_sub0.png 2025-08-27 18:17:23.845866 load time: 1033.63 ms
6%|▌ | 1260/22095 [2:19:55<213:36:38, 36.91s/it] {'loss': 0.5309, 'grad_norm': 0.4076765456528284, 'learning_rate': 9.98093088115812e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250609/pc_agent_e/images/screenshot/b75b_44724446_5.png 2025-08-27 18:17:53.626891 load time: 1027.02 ms
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_074216_before_screenshot_sub0.png 2025-08-27 18:17:53.626823 load time: 1029.54 ms
VC:s3://gui-agent/agentnet/win_mac_images/a94f3c67-a3bd-457b-96ff-8cef03d6593b.png 2025-08-27 18:17:53.626982 load time: 1021.42 ms
VC:s3://gui-agent/data_20250612/web/images/yang_0610222229/10_140_52_49_0610224734/img/1.png 2025-08-27 18:17:53.626965 load time: 1022.93 ms
6%|▌ | 1261/22095 [2:20:23<198:22:02, 34.28s/it] {'loss': 0.4614, 'grad_norm': 0.8071172893195849, 'learning_rate': 9.980866878019911e-06, 'epoch': 0.06}
6%|▌ | 1262/22095 [2:20:53<190:44:47, 32.96s/it] {'loss': 0.4701, 'grad_norm': 0.8475654460872483, 'learning_rate': 9.98080276785817e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8412340 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 14549, 'image': 'vrdu_table_final_2/astro-ph.CO/eb8a3b6a-8b63-4113-8756-fcc459921732.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
6%|▌ | 1263/22095 [2:21:28<193:53:34, 33.51s/it] {'loss': 0.5347, 'grad_norm': 0.47535055779038554, 'learning_rate': 9.980738550674277e-06, 'epoch': 0.06}
6%|▌ | 1264/22095 [2:21:57<187:08:16, 32.34s/it] {'loss': 0.485, 'grad_norm': 0.7909484322536614, 'learning_rate': 9.980674226469608e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-27 18:19:57.530696 load time: 1093.75 ms
6%|▌ | 1265/22095 [2:22:30<187:26:25, 32.39s/it] {'loss': 0.4867, 'grad_norm': 0.8709597748130423, 'learning_rate': 9.980609795245548e-06, 'epoch': 0.06}
6%|▌ | 1266/22095 [2:23:02<187:15:43, 32.37s/it] {'loss': 0.4603, 'grad_norm': 0.8005551517657652, 'learning_rate': 9.980545257003481e-06, 'epoch': 0.06}
VC:s3://ocr/coco/train2014/COCO_train2014_000000331492.jpg 2025-08-27 18:21:00.947413 load time: 1024.0 ms
VC:s3://internvl2/datasets/VCR-wiki-en-easy/images/0017961.jpg 2025-08-27 18:21:00.947885 load time: 1041.15 ms
VC:s3://gui-agent/data_20250612/android/images/Total_data_windows_0612_hard_data_device1_Simple_Gallery_Pro/ExpenseAddMultipleFromGallery/images/015_click_1749005831752.png 2025-08-27 18:21:00.948052 load time: 1034.12 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_572081.png 2025-08-27 18:21:00.945801 load time: 1028.84 ms
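The repeated `ValueError: Image size ... is too small. Minimum size is 28` tracebacks come from the dataset's `_get_item` rejecting images with a side shorter than 28 px. A hedged sketch of an offline scan that would flag such records up front; the field names follow the `Problematic sample` dicts logged here, but the helper itself is hypothetical:

```python
MIN_SIDE = 28  # matches the "Minimum size is 28" check raised in the log

def undersized_samples(records, min_side=MIN_SIDE):
    """Return ids of records whose image has a side shorter than min_side.

    Each record is assumed to carry 'id' and 'image_wh': [[width, height]],
    mirroring the 'Problematic sample' entries in this log.
    """
    bad = []
    for rec in records:
        width, height = rec["image_wh"][0]
        if min(width, height) < min_side:
            bad.append(rec["id"])
    return bad

# Toy usage; the first record reproduces the 14x23 sample from the log:
records = [
    {"id": 14549, "image_wh": [[14, 23]]},
    {"id": "ok", "image_wh": [[640, 480]]},
]
```

Running such a scan once over the annotation files avoids paying a fetch-and-retry cycle for every undersized crop during training.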
VC:s3://gui-agent/data_20250714/ubuntu/images/chrome/fac8e48e-85c6-4bc9-bac2-34c9a37e1bcc/images/step_2.png 2025-08-27 18:21:00.947747 load time: 1057.14 ms
6%|▌ | 1267/22095 [2:23:30<179:59:38, 31.11s/it] {'loss': 0.4336, 'grad_norm': 0.7964388047236337, 'learning_rate': 9.980480611744791e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [73, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8514432 in VC:s3://internvl-moe-sft-data/. Exception: Image size [73, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 154451, 'image': 'vrdu_texteq/astro-ph.CO/ed6617af-2baa-4571-be0d-5afd614264ad.png', 'image_wh': [[73, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': '\\(k_{n-1}x\\)'}]}
6%|▌ | 1268/22095 [2:23:33<130:56:19, 22.63s/it] {'loss': 0.4721, 'grad_norm': 0.7983523114274277, 'learning_rate': 9.980415859470872e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250407/windows/images/excel/20250404_091720_5/images/before_screenshot_42.png 2025-08-27 18:21:31.911262 load time: 1019.15 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [378, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8477367 in VC:s3://internvl-moe-sft-data/. Exception: Image size [378, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25074, 'image': 'vrdu_texteq/astro-ph.CO/5f2c02ee-1a53-4bb1-b6f2-dd204f6c52c1.png', 'image_wh': [[378, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and extract any recognizable text from it.'}, {'from': 'gpt', 'value': 'where $N$ is the number of bins.'}]}
6%|▌ | 1269/22095 [2:24:34<196:53:04, 34.03s/it] {'loss': 0.4537, 'grad_norm': 0.829019588478564, 'learning_rate': 9.980351000183108e-06, 'epoch': 0.06}
6%|▌ | 1270/22095 [2:25:02<186:34:10, 32.25s/it] {'loss': 0.4967, 'grad_norm': 0.8097302752292416, 'learning_rate': 9.9802860338829e-06, 'epoch': 0.06}
VC:s3://gui/aguvis/aguvis-stage2/android_control/images/7571/screenshot_3.png 2025-08-27 18:23:00.636439 load time: 1030.08 ms
6%|▌ | 1271/22095 [2:25:05<136:00:14, 23.51s/it] {'loss': 0.4537, 'grad_norm': 0.9222055660580988, 'learning_rate': 9.98022096057164e-06, 'epoch': 0.06}
VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/986779d8a42026a85fb9c6e2b760143d.png 2025-08-27 18:23:03.759469 load time: 1029.25 ms
6%|▌ | 1272/22095 [2:25:08<100:43:56, 17.42s/it] {'loss': 0.4337, 'grad_norm': 0.8110170634260588, 'learning_rate': 9.980155780250728e-06, 'epoch': 0.06}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_196991.png 2025-08-27 18:23:06.951578 load time: 1043.65 ms
6%|▌ | 1273/22095 [2:25:12<76:38:03, 13.25s/it] {'loss': 0.4801, 'grad_norm': 0.7978741699808711, 'learning_rate': 9.980090492921563e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348842 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15512, 'image': 'vrdu_table_final_2/astro-ph.CO/284a63fa-6219-4c77-99ad-9e1e81274252.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$S_{3}$\\end{tabular}\n```"}]}
VC:s3://multi-modal/playground/data/geoqa+/images/7531.png 2025-08-27 18:23:10.473983 load time: 1022.46 ms
VC:s3://multi-modal/agent_data/AndroidUI/20240327/20240327_filtered/ximalaya/screen_00000190.jpg 2025-08-27 18:23:10.474231 load time: 1024.91 ms
6%|▌ | 1274/22095 [2:25:15<59:34:05, 10.30s/it] {'loss': 0.5115, 'grad_norm': 0.8338140737252998, 'learning_rate': 9.98002509858555e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880209 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3362, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, D is the midpoint of segment CB, CD=3, AB=11; what is the length of AC? ()\nA. 6\nB. 8\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
6%|▌ | 1275/22095 [2:25:18<46:45:54, 8.09s/it] {'loss': 0.4572, 'grad_norm': 0.7310445282087151, 'learning_rate': 9.979959597244089e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-27 18:23:18.629356 load time: 1002.9 ms
6%|▌ | 1276/22095 [2:25:22<38:59:37, 6.74s/it] {'loss': 0.4782, 'grad_norm': 0.7531542525602791, 'learning_rate': 9.979893988898592e-06, 'epoch': 0.06}
6%|▌ | 1277/22095 [2:25:25<33:07:08, 5.73s/it] {'loss': 0.4425, 'grad_norm': 0.7523160239405309, 'learning_rate': 9.97982827355047e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1278/22095 [2:25:34<38:20:17, 6.63s/it] {'loss': 0.5545, 'grad_norm': 0.5112046197859239, 'learning_rate': 9.979762451201132e-06, 'epoch': 0.06}
6%|▌ | 1279/22095 [2:25:37<32:38:11, 5.64s/it] {'loss': 0.4949, 'grad_norm': 0.8668024028019767, 'learning_rate': 9.979696521851992e-06, 'epoch': 0.06}
6%|▌ | 1280/22095 [2:25:40<27:54:04, 4.83s/it] {'loss': 0.4369, 'grad_norm': 0.779857898757288, 'learning_rate': 9.979630485504468e-06, 'epoch': 0.06}
6%|▌ | 1281/22095 [2:25:44<26:13:03, 4.53s/it] {'loss': 0.4641, 'grad_norm': 0.751721968005427, 'learning_rate': 9.97956434215998e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (56814 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44477 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1282/22095 [2:25:47<23:51:39, 4.13s/it] {'loss': 0.429, 'grad_norm': 0.7025221316380706, 'learning_rate': 9.979498091819946e-06, 'epoch': 0.06}
6%|▌ | 1283/22095 [2:25:51<23:53:27, 4.13s/it] {'loss': 0.4303, 'grad_norm': 0.7194116567943578, 'learning_rate': 9.979431734485794e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1284/22095 [2:25:54<21:35:57, 3.74s/it] {'loss': 0.4604, 'grad_norm': 0.7585130981008433, 'learning_rate': 9.979365270158945e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (71768 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1285/22095 [2:25:57<20:47:03, 3.60s/it] {'loss': 0.4644, 'grad_norm': 0.7978623772369193, 'learning_rate': 9.979298698840829e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (54319 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125480 > 40960).
Running this sequence through the model will result in indexing errors
6%|▌ | 1286/22095 [2:26:01<20:58:14, 3.63s/it] {'loss': 0.4468, 'grad_norm': 0.7284784348084953, 'learning_rate': 9.979232020532877e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1287/22095 [2:26:04<20:38:21, 3.57s/it] {'loss': 0.4488, 'grad_norm': 0.7619367485973766, 'learning_rate': 9.979165235236523e-06, 'epoch': 0.06}
6%|▌ | 1288/22095 [2:26:08<21:03:57, 3.64s/it] {'loss': 0.4614, 'grad_norm': 0.6968847961889845, 'learning_rate': 9.979098342953198e-06, 'epoch': 0.06}
6%|▌ | 1289/22095 [2:26:11<20:13:50, 3.50s/it] {'loss': 0.4614, 'grad_norm': 1.0272658772877736, 'learning_rate': 9.979031343684344e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1290/22095 [2:26:17<24:35:56, 4.26s/it] {'loss': 0.5455, 'grad_norm': 0.5435257113968114, 'learning_rate': 9.978964237431396e-06, 'epoch': 0.06}
6%|▌ | 1291/22095 [2:26:22<25:31:36, 4.42s/it] {'loss': 0.4309, 'grad_norm': 0.8353831810505, 'learning_rate': 9.978897024195801e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (90184 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83102 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110303 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1292/22095 [2:26:26<23:50:02, 4.12s/it] {'loss': 0.4579, 'grad_norm': 0.7290873281764264, 'learning_rate': 9.978829703978999e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1293/22095 [2:26:36<34:51:25, 6.03s/it] {'loss': 0.5133, 'grad_norm': 0.3411679847978857, 'learning_rate': 9.978762276782438e-06, 'epoch': 0.06}
6%|▌ | 1294/22095 [2:26:43<35:33:47, 6.15s/it] {'loss': 0.5583, 'grad_norm': 0.3718275823828351, 'learning_rate': 9.978694742607566e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
6%|▌ | 1295/22095 [2:26:46<31:26:04, 5.44s/it] {'loss': 0.4861, 'grad_norm': 0.9738128977028798, 'learning_rate': 9.978627101455836e-06, 'epoch': 0.06}
6%|▌ | 1296/22095 [2:26:50<27:59:49, 4.85s/it] {'loss': 0.5127, 'grad_norm': 0.9236918922061743, 'learning_rate': 9.9785593533287e-06, 'epoch': 0.06}
6%|▌ | 1297/22095 [2:26:54<27:18:48, 4.73s/it] {'loss': 0.4449, 'grad_norm': 0.6878931249632689, 'learning_rate': 9.978491498227615e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (43291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116651 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1298/22095 [2:26:58<25:28:01, 4.41s/it] {'loss': 0.4625, 'grad_norm': 0.7557270130278592, 'learning_rate': 9.978423536154036e-06, 'epoch': 0.06}
6%|▌ | 1299/22095 [2:27:01<23:04:36, 3.99s/it] {'loss': 0.4336, 'grad_norm': 1.0642306750672659, 'learning_rate': 9.978355467109427e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
6%|▌ | 1300/22095 [2:27:11<33:04:55, 5.73s/it] {'loss': 0.5213, 'grad_norm': 0.5117178183518919, 'learning_rate': 9.978287291095248e-06, 'epoch': 0.06}
6%|▌ | 1301/22095 [2:27:14<29:30:31, 5.11s/it] {'loss': 0.4541, 'grad_norm': 0.794491939155005, 'learning_rate': 9.978219008112965e-06, 'epoch': 0.06}
6%|▌ | 1302/22095 [2:27:18<26:48:59, 4.64s/it] {'loss': 0.4702, 'grad_norm': 0.8099921246940341, 'learning_rate': 9.978150618164044e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
6%|▌ | 1303/22095 [2:27:21<23:54:44, 4.14s/it] {'loss': 0.4413, 'grad_norm': 0.7695831619969989, 'learning_rate': 9.978082121249957e-06, 'epoch': 0.06}
6%|▌ | 1304/22095 [2:27:24<22:53:25, 3.96s/it] {'loss': 0.4808, 'grad_norm': 0.8738306669138687, 'learning_rate': 9.978013517372173e-06, 'epoch': 0.06}
6%|▌ | 1305/22095 [2:27:28<21:58:21, 3.80s/it] {'loss': 0.5147, 'grad_norm': 0.8250768119015344, 'learning_rate': 9.977944806532169e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42582 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91975 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50722 > 40960). Running this sequence through the model will result in indexing errors
6%|▌ | 1306/22095 [2:27:37<31:38:13, 5.48s/it] {'loss': 0.5311, 'grad_norm': 0.49320793021794157, 'learning_rate': 9.977875988731418e-06, 'epoch': 0.06}
6%|▌ | 1307/22095 [2:27:41<27:50:55, 4.82s/it] {'loss': 0.411, 'grad_norm': 0.6911527640318451, 'learning_rate': 9.977807063971401e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_6.png 2025-08-27 18:25:40.533640 load time: 1008.22 ms
6%|▌ | 1308/22095 [2:27:43<24:22:22, 4.22s/it] {'loss': 0.4715, 'grad_norm': 0.7413756021870055, 'learning_rate': 9.977738032253598e-06, 'epoch': 0.06}
6%|▌ | 1309/22095 [2:27:46<21:58:09, 3.80s/it] {'loss': 0.4609, 'grad_norm': 0.8133220610132956, 'learning_rate': 9.977668893579493e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (46795 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46970 > 40960).
Running this sequence through the model will result in indexing errors 6%|▌ | 1310/22095 [2:27:50<22:01:15, 3.81s/it] {'loss': 0.4714, 'grad_norm': 0.8196571698370698, 'learning_rate': 9.977599647950572e-06, 'epoch': 0.06} 6%|▌ | 1310/22095 [2:27:50<22:01:15, 3.81s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 6%|▌ | 1311/22095 [2:27:54<21:28:35, 3.72s/it] {'loss': 0.4819, 'grad_norm': 0.8644105862836049, 'learning_rate': 9.977530295368321e-06, 'epoch': 0.06} 6%|▌ | 1311/22095 [2:27:54<21:28:35, 3.72s/it] 6%|▌ | 1312/22095 [2:27:56<19:45:02, 3.42s/it] {'loss': 0.4529, 'grad_norm': 1.0647079301335312, 'learning_rate': 9.977460835834231e-06, 'epoch': 0.06} 6%|▌ | 1312/22095 [2:27:56<19:45:02, 3.42s/it] 6%|▌ | 1313/22095 [2:27:59<18:51:22, 3.27s/it] {'loss': 0.458, 'grad_norm': 0.8206674886176298, 'learning_rate': 9.977391269349795e-06, 'epoch': 0.06} 6%|▌ | 1313/22095 [2:27:59<18:51:22, 3.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▌ | 1314/22095 [2:28:09<30:55:27, 5.36s/it] {'loss': 0.5106, 'grad_norm': 0.5318759671503385, 'learning_rate': 9.977321595916507e-06, 'epoch': 0.06} 6%|▌ | 1314/22095 [2:28:09<30:55:27, 5.36s/it] 6%|▌ | 1315/22095 [2:28:13<27:55:29, 4.84s/it] {'loss': 0.4952, 'grad_norm': 0.8727685781150271, 'learning_rate': 9.977251815535867e-06, 'epoch': 0.06} 6%|▌ | 1315/22095 [2:28:13<27:55:29, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48538 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (151654 > 40960). 
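The interleaved `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` pairs above indicate the data loader repairs conversations whose text lacks an `<image>` placeholder for an attached image. A minimal sketch of such a repair, purely illustrative (the helper name and repair strategy are assumptions; the actual fix lives in `data_qwen_2.py`):

```python
# Illustration only: reconcile the number of "<image>" placeholders in a
# conversation with the number of attached images, as the "Fixed image
# tokens" log line reports. The real repair in data_qwen_2.py may differ.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversations, num_images):
    """Prepend any missing <image> placeholders to the first human turn."""
    present = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    missing = num_images - present
    if missing > 0:
        for turn in conversations:
            if turn["from"] == "human":
                turn["value"] = IMAGE_TOKEN * missing + "\n" + turn["value"]
                break
    return conversations

conv = [{"from": "human", "value": "Please recognize the text in the image."}]
fixed = fix_image_tokens(conv, num_images=1)
# fixed[0]["value"] now begins with "<image>"
```

A conversation that already carries the right number of placeholders passes through unchanged, which matches the log: the message is only printed when a mismatch was actually detected.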
Running this sequence through the model will result in indexing errors
 6%|▌ | 1316/22095 [2:28:16<25:21:41, 4.39s/it] {'loss': 0.4188, 'grad_norm': 0.744440840375188, 'learning_rate': 9.97718192820937e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [720, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8424776 in VC:s3://internvl-moe-sft-data/. Exception: Image size [720, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 90192, 'image': 'vrdu_texteq/astro-ph.CO/409d034d-fc37-45e0-8648-ef4ae9989b03.png', 'image_wh': [[720, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'The likelihood $\\mathcal{L}$ is assumed to be multivariate Gaussian as'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1317/22095 [2:28:20<23:12:07, 4.02s/it] {'loss': 0.4429, 'grad_norm': 1.0437273788657893, 'learning_rate': 9.977111933938519e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
Invalidate trace cache @ step 2: expected module 1, but got module 364
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8502335 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 9552, 'image': 'vrdu_texteq/astro-ph.CO/cfd5a0a5-7243-4507-a6e1-64bd145238c5.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': '\\vspace{0.2cm}\n$\\Downarrow$'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified
    visual_processed = processor.preprocess(image, return_tensors="pt")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess
    patches, image_grid_thw = self._preprocess(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess
    resized_height, resized_width = smart_resize(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize
    raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}")
ValueError: height:21 and width:135 must be larger than factor:28
[Try #0] Failed to fetch sample 2074782 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:21 and width:135 must be larger than factor:28
Problematic sample: {'image': 'a4b739659e7c325d58fab0a3e55e135c875dc9a485e2b2af942dbafdf4662908.png', 'conversations': [{'from': 'human', 'value': '\nThe position of this Icon can be described as:\nThe icon is located in the top navigation bar, slightly to the right of the center. It is positioned between the RustyLoot logo on the left and a series of other icons on the right, such as user profile and settings icons.\n\nFunctional capabilities of the Icon:\nThis icon likely serves as a status indicator or a shortcut to a specific feature or section within the application, possibly related to user achievements or rewards.'}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]', 'recipient': 'all', 'end_turn': True}]}
 6%|▌ | 1318/22095 [2:28:30<35:12:55, 6.10s/it] {'loss': 0.525, 'grad_norm': 0.46107440696628565, 'learning_rate': 9.97704183272482e-06, 'epoch': 0.06}
 6%|▌ | 1319/22095 [2:28:35<31:53:59, 5.53s/it] {'loss': 0.4579, 'grad_norm': 0.9797169729349386, 'learning_rate': 9.976971624569776e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396967 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63820, 'image': 'vrdu_table_final_2/astro-ph.EP/9d130570-c73b-4194-978a-f0f3bc175a7b.png', 'image_wh': [[20, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{l}x\\end{tabular}\n```"}]}
 6%|▌ | 1320/22095 [2:28:39<29:26:38, 5.10s/it] {'loss': 0.4914, 'grad_norm': 0.8694097288243677, 'learning_rate': 9.9769013094749e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (49772 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82566 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86366 > 40960).
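The `Image size [...] is too small. Minimum size is 28` and `height:21 and width:135 must be larger than factor:28` failures above all come from the same constraint: Qwen2-VL's `smart_resize` requires each image side to be at least the patch factor (28 px). A pre-filter over the dataset's `image_wh` metadata can drop such samples before they reach the processor; the predicate below is a sketch of such a screen, not the training code or the transformers implementation:

```python
# Sketch: screen samples whose image sides fall below the smart_resize
# factor (28 px for Qwen2-VL), which otherwise raise
# "height:h and width:w must be larger than factor:28" at preprocess time.
FACTOR = 28  # patch factor taken from the logged errors

def image_ok(width, height, factor=FACTOR):
    """True if both sides meet the minimum size smart_resize enforces."""
    return width >= factor and height >= factor

# 'image_wh' values taken from the failures logged above:
bad_sizes = [(720, 23), (17, 28), (135, 21), (20, 14)]
checks = [image_ok(w, h) for w, h in bad_sizes]  # every entry is False
```

Note that a large side does not help: the 720x23 text-equation crop fails just as the 20x14 one does, because a single side under 28 px cannot be mapped to a whole patch.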
Running this sequence through the model will result in indexing errors
 6%|▌ | 1321/22095 [2:28:43<28:31:01, 4.94s/it] {'loss': 0.4999, 'grad_norm': 0.8524501643852321, 'learning_rate': 9.976830887441699e-06, 'epoch': 0.06}
 6%|▌ | 1322/22095 [2:28:47<26:02:27, 4.51s/it] {'loss': 0.4622, 'grad_norm': 0.8575124599427206, 'learning_rate': 9.976760358471687e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957888 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8723, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 2\nB. 3\nC. 10\nD. 5'}, {'from': 'gpt', 'value': '【解答】解:(1)由点M、N分别是线段AC、BC的中点,得MC=\\frac{1}{2}AC=\\frac{1}{2}×4=2,NC=\\frac{1}{2}BC=\\frac{1}{2}×6=3.由线段的和差,得MN=MC+NC=2+3=5;'}]}
 6%|▌ | 1323/22095 [2:28:50<23:03:52, 4.00s/it] {'loss': 0.4204, 'grad_norm': 0.786340660897482, 'learning_rate': 9.976689722566379e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250623/windows_augment/images/android_studio/2025-06-18_182417/images/step_1_id_62_internvl_appearance_crop_0_grounding_instructions_random_paste.png 2025-08-27 18:26:49.590688 load time: 1324.92 ms
 6%|▌ | 1324/22095 [2:28:59<32:31:20, 5.64s/it] {'loss': 0.5612, 'grad_norm': 0.4433058625534793, 'learning_rate': 9.976618979727295e-06, 'epoch': 0.06}
 6%|▌ | 1325/22095 [2:29:02<28:13:44, 4.89s/it] {'loss': 0.4716, 'grad_norm': 0.8653521916140352, 'learning_rate': 9.976548129955953e-06, 'epoch': 0.06}
 6%|▌ | 1326/22095 [2:29:06<25:52:13, 4.48s/it] {'loss': 0.4307, 'grad_norm': 0.796213598131806, 'learning_rate': 9.976477173253878e-06, 'epoch': 0.06}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-27 18:27:05.075560 load time: 1499.06 ms
 6%|▌ | 1327/22095 [2:29:09<23:37:22, 4.09s/it] {'loss': 0.4347, 'grad_norm': 0.8431570163012182, 'learning_rate': 9.97640610962259e-06, 'epoch': 0.06}
 6%|▌ | 1328/22095 [2:29:12<21:50:54, 3.79s/it] {'loss': 0.4251, 'grad_norm': 0.7192342906246789, 'learning_rate': 9.97633493906362e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 6%|▌ | 1329/22095 [2:29:19<27:34:35, 4.78s/it] {'loss': 0.5414, 'grad_norm': 0.3889310353578457, 'learning_rate': 9.976263661578495e-06, 'epoch': 0.06}
 6%|▌ | 1330/22095 [2:29:22<25:01:23, 4.34s/it] {'loss': 0.455, 'grad_norm': 0.8408450911647659, 'learning_rate': 9.976192277168748e-06, 'epoch': 0.06}
 6%|▌ | 1331/22095 [2:29:26<24:25:10, 4.23s/it] {'loss': 0.4486, 'grad_norm': 0.7791459262815543, 'learning_rate': 9.976120785835912e-06, 'epoch': 0.06}
 6%|▌ | 1332/22095 [2:29:30<23:10:07, 4.02s/it] {'loss': 0.4438, 'grad_norm': 1.5368903877653917, 'learning_rate': 9.976049187581523e-06, 'epoch': 0.06}
 6%|▌ | 1333/22095 [2:29:33<22:05:04, 3.83s/it] {'loss': 0.4697, 'grad_norm': 0.8423946896063987, 'learning_rate': 9.97597748240712e-06, 'epoch': 0.06}
 6%|▌ | 1334/22095 [2:29:37<21:12:37, 3.68s/it] {'loss': 0.4845, 'grad_norm': 0.7581531607654294, 'learning_rate': 9.975905670314243e-06, 'epoch': 0.06}
 6%|▌ | 1335/22095 [2:29:40<20:31:04, 3.56s/it] {'loss': 0.44, 'grad_norm': 0.8046542695421153, 'learning_rate': 9.975833751304435e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 6%|▌ | 1336/22095 [2:29:48<27:33:47, 4.78s/it] {'loss': 0.5409, 'grad_norm': 0.4528822879915298, 'learning_rate': 9.975761725379243e-06, 'epoch': 0.06}
 6%|▌ | 1337/22095 [2:29:51<24:46:54, 4.30s/it] {'loss': 0.452, 'grad_norm': 0.850037337299667, 'learning_rate': 9.975689592540214e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302257 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1KB8ILXXXXXcTXpXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nOCR这张照片,只输出文字内容。'}, {'from': 'gpt', 'value': '图中所有文字:\n6376E\n6376E3\NF\n奔翔汽配\n部件批发\n6371\6373\n包邮\n6400\n疯狂抢购\n电话:\nQQ:15803191460\n752853109\n5号6390\n6388\n盗\n正品配件'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1338/22095 [2:29:54<23:06:34, 4.01s/it] {'loss': 0.4389, 'grad_norm': 0.99034556951951, 'learning_rate': 9.975617352788897e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (65167 > 40960). Running this sequence through the model will result in indexing errors
 6%|▌ | 1339/22095 [2:29:57<22:02:22, 3.82s/it] {'loss': 0.5071, 'grad_norm': 1.0271152953869538, 'learning_rate': 9.975545006126843e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (64044 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59218 > 40960).
Running this sequence through the model will result in indexing errors
 6%|▌ | 1340/22095 [2:30:00<20:33:51, 3.57s/it] {'loss': 0.4876, 'grad_norm': 1.1420043625492036, 'learning_rate': 9.975472552555609e-06, 'epoch': 0.06}
 6%|▌ | 1341/22095 [2:30:04<20:46:20, 3.60s/it] {'loss': 0.4107, 'grad_norm': 0.6971624850612915, 'learning_rate': 9.975399992076752e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (52240 > 40960). Running this sequence through the model will result in indexing errors
 6%|▌ | 1342/22095 [2:30:08<20:50:22, 3.62s/it] {'loss': 0.4331, 'grad_norm': 0.9569774396437504, 'learning_rate': 9.975327324691828e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1343/22095 [2:30:11<19:35:22, 3.40s/it] {'loss': 0.5065, 'grad_norm': 0.7914603074563847, 'learning_rate': 9.9752545504024e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1344/22095 [2:30:14<19:11:50, 3.33s/it] {'loss': 0.4993, 'grad_norm': 0.9088018964302117, 'learning_rate': 9.975181669210034e-06, 'epoch': 0.06}
 6%|▌ | 1345/22095 [2:30:17<18:35:21, 3.23s/it] {'loss': 0.5226, 'grad_norm': 1.1608321903492493, 'learning_rate': 9.975108681116293e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1346/22095 [2:30:20<18:48:35, 3.26s/it] {'loss': 0.4445, 'grad_norm': 0.7616726835673043, 'learning_rate': 9.975035586122746e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 6%|▌ | 1347/22095 [2:30:26<23:57:59, 4.16s/it] {'loss': 0.5431, 'grad_norm': 0.4804652401570814, 'learning_rate': 9.974962384230965e-06, 'epoch': 0.06}
 6%|▌ | 1348/22095 [2:30:35<31:53:44, 5.53s/it] {'loss': 0.5413, 'grad_norm': 0.4173257563644276, 'learning_rate': 9.97488907544252e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 6%|▌ | 1349/22095 [2:30:38<27:47:33, 4.82s/it] {'loss': 0.4772, 'grad_norm': 0.962069270717757, 'learning_rate': 9.97481565975899e-06, 'epoch': 0.06}
 6%|▌ | 1350/22095 [2:30:48<35:50:03, 6.22s/it] {'loss': 0.5404, 'grad_norm': 0.36001383419618155, 'learning_rate': 9.97474213718195e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 6%|▌ | 1351/22095 [2:30:52<31:43:22, 5.51s/it] {'loss': 0.4357, 'grad_norm': 0.7685197910763001, 'learning_rate': 9.974668507712979e-06, 'epoch': 0.06}
 6%|▌ | 1352/22095 [2:30:55<27:49:04, 4.83s/it] {'loss': 0.4166, 'grad_norm': 0.7682770127489988, 'learning_rate': 9.974594771353662e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (41714 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70702 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76296 > 40960). Running this sequence through the model will result in indexing errors
 6%|▌ | 1353/22095 [2:30:58<25:15:38, 4.38s/it] {'loss': 0.4639, 'grad_norm': 0.8399578495600116, 'learning_rate': 9.97452092810558e-06, 'epoch': 0.06}
 6%|▌ | 1354/22095 [2:31:01<22:43:28, 3.94s/it] {'loss': 0.489, 'grad_norm': 0.7164554717054931, 'learning_rate': 9.974446977970322e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1355/22095 [2:31:04<21:13:38, 3.68s/it] {'loss': 0.4225, 'grad_norm': 0.8019348898901124, 'learning_rate': 9.974372920949478e-06, 'epoch': 0.06}
 6%|▌ | 1356/22095 [2:31:07<20:04:44, 3.49s/it] {'loss': 0.4416, 'grad_norm': 0.7457433655397221, 'learning_rate': 9.974298757044636e-06, 'epoch': 0.06}
 6%|▌ | 1357/22095 [2:31:11<20:55:35, 3.63s/it] {'loss': 0.4496, 'grad_norm': 0.8093746951320654, 'learning_rate': 9.97422448625739e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (59725 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53003 > 40960). Running this sequence through the model will result in indexing errors
 6%|▌ | 1358/22095 [2:31:15<20:17:03, 3.52s/it] {'loss': 0.4122, 'grad_norm': 0.7702467986450054, 'learning_rate': 9.974150108589338e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (45092 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78335 > 40960). Running this sequence through the model will result in indexing errors
 6%|▌ | 1359/22095 [2:31:19<21:20:38, 3.71s/it] {'loss': 0.5027, 'grad_norm': 0.74345550836823, 'learning_rate': 9.974075624042076e-06, 'epoch': 0.06}
 6%|▌ | 1360/22095 [2:31:23<21:42:59, 3.77s/it] {'loss': 0.4435, 'grad_norm': 0.7484354102674841, 'learning_rate': 9.974001032617208e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8941400 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64553, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 7cm'}, {'from': 'gpt', 'value': '【解答】解:∵M是AC的中点,N是DB的中点,CD=3cm,AB=9.8cm,∴MC+DN=\\frac{1}{2}(AB-CD)=3.4cm,∴MN=MC+DN+CD=3.4+3=6.4cm.'}]}
 6%|▌ | 1361/22095 [2:31:33<33:23:08, 5.80s/it] {'loss': 0.5201, 'grad_norm': 0.749996191394422, 'learning_rate': 9.973926334316332e-06, 'epoch': 0.06}
 6%|▌ | 1362/22095 [2:31:36<28:56:25, 5.03s/it] {'loss': 0.4307, 'grad_norm': 0.8440489977478883, 'learning_rate': 9.973851529141056e-06, 'epoch': 0.06}
 6%|▌ | 1363/22095 [2:31:39<25:17:38, 4.39s/it] {'loss': 0.4501, 'grad_norm': 0.8720090740465758, 'learning_rate': 9.973776617092988e-06, 'epoch': 0.06}
 6%|▌ | 1364/22095 [2:31:43<23:35:41, 4.10s/it] {'loss': 0.5035, 'grad_norm': 0.7660819683254514, 'learning_rate': 9.973701598173736e-06, 'epoch': 0.06}
 6%|▌ | 1365/22095 [2:31:46<22:41:49, 3.94s/it] {'loss': 0.4673, 'grad_norm': 0.8494876594535249, 'learning_rate': 9.973626472384911e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 6%|▌ | 1366/22095 [2:31:55<31:46:01, 5.52s/it] {'loss': 0.5281, 'grad_norm': 0.48390760598129084, 'learning_rate': 9.973551239728129e-06, 'epoch': 0.06}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 6%|▌ | 1367/22095 [2:32:02<33:31:29, 5.82s/it] {'loss': 0.5271, 'grad_norm': 0.4369140872787393, 'learning_rate': 9.973475900205005e-06, 'epoch': 0.06}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 6%|▌ | 1368/22095 [2:32:06<30:06:09, 5.23s/it] {'loss': 0.457, 'grad_norm': 0.7571630206601964, 'learning_rate': 9.97340045381716e-06, 'epoch': 0.06}
 6%|▌ | 1369/22095
[2:32:09<27:08:40, 4.71s/it] {'loss': 0.4085, 'grad_norm': 0.7646121483191878, 'learning_rate': 9.973324900566214e-06, 'epoch': 0.06}
 6%|▌ | 1370/22095 [2:32:13<24:37:48, 4.28s/it] {'loss': 0.4159, 'grad_norm': 1.3464464658815254, 'learning_rate': 9.973249240453789e-06, 'epoch': 0.06}
 6%|▌ | 1371/22095 [2:32:15<22:00:52, 3.82s/it] {'loss': 0.4295, 'grad_norm': 0.903357108813622, 'learning_rate': 9.973173473481513e-06, 'epoch': 0.06}
 6%|▌ | 1372/22095 [2:32:19<21:45:49, 3.78s/it] {'loss': 0.4428, 'grad_norm': 0.7900153899952508, 'learning_rate': 9.973097599651013e-06, 'epoch': 0.06}
 6%|▌ | 1373/22095 [2:32:23<21:43:16, 3.77s/it] {'loss': 0.4394, 'grad_norm': 0.9107180209952301, 'learning_rate': 9.973021618963919e-06, 'epoch': 0.06}
 6%|▌ | 1374/22095 [2:32:26<20:23:56, 3.54s/it] {'loss': 0.4603, 'grad_norm': 0.7416484939825573, 'learning_rate': 9.972945531421863e-06, 'epoch': 0.06}
 6%|▌ | 1375/22095 [2:32:30<21:26:02, 3.72s/it] {'loss': 0.4159, 'grad_norm': 1.465734093495972, 'learning_rate': 9.972869337026482e-06, 'epoch': 0.06}
 6%|▌ | 1376/22095 [2:32:34<22:38:11, 3.93s/it] {'loss': 0.4091, 'grad_norm': 0.780362708639645, 'learning_rate': 9.972793035779412e-06, 'epoch': 0.06}
 6%|▌ | 1377/22095 [2:32:37<21:03:15, 3.66s/it] {'loss': 0.494, 'grad_norm': 0.808375725476795, 'learning_rate': 9.972716627682292e-06, 'epoch': 0.06}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [425, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8464724 in VC:s3://internvl-moe-sft-data/. Exception: Image size [425, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 138228, 'image': 'vrdu_texteq/astro-ph.CO/09c26465-dd39-4b3d-84f3-b75a6aa2b251.png', 'image_wh': [[425, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'Here we assume that $M=6$. Then'}]}
 6%|▌ | 1378/22095 [2:32:41<21:12:06, 3.68s/it] {'loss': 0.5243, 'grad_norm': 0.8677968454058532, 'learning_rate': 9.972640112736764e-06, 'epoch': 0.06}
 6%|▌ | 1379/22095 [2:32:45<20:55:17, 3.64s/it] {'loss': 0.4548, 'grad_norm': 0.7190882477891677, 'learning_rate': 9.972563490944474e-06, 'epoch': 0.06}
 6%|▌ | 1380/22095 [2:32:47<19:31:07, 3.39s/it] {'loss': 0.4349, 'grad_norm': 0.8145344529184204, 'learning_rate': 9.972486762307064e-06, 'epoch': 0.06}
Token indices sequence length is longer than the specified maximum sequence length for this model (54830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91696 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66308 > 40960).
Running this sequence through the model will result in indexing errors 6%|▋ | 1381/22095 [2:32:50<18:29:29, 3.21s/it] {'loss': 0.419, 'grad_norm': 0.82848896121603, 'learning_rate': 9.972409926826188e-06, 'epoch': 0.06} 6%|▋ | 1381/22095 [2:32:50<18:29:29, 3.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1382/22095 [2:33:01<31:42:11, 5.51s/it] {'loss': 0.5263, 'grad_norm': 0.8915242371627252, 'learning_rate': 9.972332984503493e-06, 'epoch': 0.06} 6%|▋ | 1382/22095 [2:33:01<31:42:11, 5.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (91802 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61023 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1383/22095 [2:33:05<28:18:29, 4.92s/it] {'loss': 0.4617, 'grad_norm': 0.8686588179710028, 'learning_rate': 9.972255935340631e-06, 'epoch': 0.06} 6%|▋ | 1383/22095 [2:33:05<28:18:29, 4.92s/it] 6%|▋ | 1384/22095 [2:33:09<26:35:57, 4.62s/it] {'loss': 0.4577, 'grad_norm': 0.7461021561719915, 'learning_rate': 9.972178779339264e-06, 'epoch': 0.06} 6%|▋ | 1384/22095 [2:33:09<26:35:57, 4.62s/it] 6%|▋ | 1385/22095 [2:33:12<24:04:56, 4.19s/it] {'loss': 0.468, 'grad_norm': 0.767860526526951, 'learning_rate': 9.972101516501043e-06, 'epoch': 0.06} 6%|▋ | 1385/22095 [2:33:12<24:04:56, 4.19s/it] 6%|▋ | 1386/22095 [2:33:15<22:03:56, 3.84s/it] {'loss': 0.4581, 'grad_norm': 0.7755025230348713, 'learning_rate': 9.972024146827633e-06, 'epoch': 0.06} 6%|▋ | 1386/22095 [2:33:15<22:03:56, 3.84s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 6%|▋ | 1387/22095 [2:33:19<22:13:39, 3.86s/it] {'loss': 0.4258, 'grad_norm': 0.7136222251907728, 'learning_rate': 9.971946670320693e-06, 'epoch': 0.06} 6%|▋ | 1387/22095 
[2:33:19<22:13:39, 3.86s/it] 6%|▋ | 1388/22095 [2:33:22<20:35:50, 3.58s/it] {'loss': 0.418, 'grad_norm': 0.7962994471669949, 'learning_rate': 9.971869086981892e-06, 'epoch': 0.06} 6%|▋ | 1388/22095 [2:33:22<20:35:50, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (91063 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1389/22095 [2:33:32<31:51:28, 5.54s/it] {'loss': 0.5509, 'grad_norm': 0.9369097066828213, 'learning_rate': 9.971791396812891e-06, 'epoch': 0.06} 6%|▋ | 1389/22095 [2:33:32<31:51:28, 5.54s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250501_133654_1/images/before_screenshot_1_id_192_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-27 18:31:31.420143 load time: 1475.3 ms 6%|▋ | 1390/22095 [2:33:35<28:43:18, 4.99s/it] {'loss': 0.4512, 'grad_norm': 0.85928628848663, 'learning_rate': 9.971713599815364e-06, 'epoch': 0.06} 6%|▋ | 1390/22095 [2:33:35<28:43:18, 4.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1391/22095 [2:33:41<29:16:52, 5.09s/it] {'loss': 0.5308, 'grad_norm': 0.4760552251345332, 'learning_rate': 9.971635695990981e-06, 'epoch': 0.06} 6%|▋ | 1391/22095 [2:33:41<29:16:52, 5.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 6%|▋ | 1392/22095 [2:33:44<26:41:44, 4.64s/it] {'loss': 0.5004, 'grad_norm': 0.8016075510443832, 'learning_rate': 9.971557685341415e-06, 'epoch': 0.06} 6%|▋ | 1392/22095 [2:33:44<26:41:44, 4.64s/it] 6%|▋ | 1393/22095 [2:33:47<23:38:40, 4.11s/it] {'loss': 0.447, 'grad_norm': 0.8390371784654099, 'learning_rate': 9.971479567868345e-06, 'epoch': 0.06} 6%|▋ | 1393/22095 [2:33:47<23:38:40, 4.11s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8921713 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44866, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 6%|▋ | 1394/22095 [2:33:50<21:48:11, 3.79s/it] {'loss': 0.464, 'grad_norm': 0.8424654439840261, 'learning_rate': 9.971401343573448e-06, 'epoch': 0.06} 6%|▋ | 1394/22095 [2:33:50<21:48:11, 3.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93716 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44550 > 40960). 
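The repeated `ValueError: Image size [...] is too small. Minimum size is 28` failures above all carry an `image_wh` entry below the 28-px floor, so they could be screened out before training instead of raising mid-epoch. A minimal sketch of such a pre-filter, assuming samples follow the `image_wh` layout shown in the log (`MIN_IMAGE_SIDE`, `is_trainable_sample`, and `filter_samples` are illustrative names, not part of the actual `data_qwen_2.py` code):

```python
MIN_IMAGE_SIDE = 28  # the floor named in the log's error message


def is_trainable_sample(sample: dict, min_side: int = MIN_IMAGE_SIDE) -> bool:
    """Return False when any recorded width/height falls below min_side."""
    for w, h in sample.get("image_wh", []):
        if w < min_side or h < min_side:
            return False
    return True


def filter_samples(samples: list) -> list:
    """Drop undersized samples up front rather than retrying them at fetch time."""
    return [s for s in samples if is_trainable_sample(s)]
```

Running this once over the manifest would also surface how many samples (like the 164x26 and 188x24 geometry images above) are affected.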
Running this sequence through the model will result in indexing errors 6%|▋ | 1395/22095 [2:33:54<21:33:43, 3.75s/it] {'loss': 0.4349, 'grad_norm': 0.7607383258531611, 'learning_rate': 9.971323012458403e-06, 'epoch': 0.06} 6%|▋ | 1395/22095 [2:33:54<21:33:43, 3.75s/it] 6%|▋ | 1396/22095 [2:33:58<21:44:30, 3.78s/it] {'loss': 0.4361, 'grad_norm': 0.8127223782573194, 'learning_rate': 9.971244574524897e-06, 'epoch': 0.06} 6%|▋ | 1396/22095 [2:33:58<21:44:30, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1397/22095 [2:34:07<30:59:25, 5.39s/it] {'loss': 0.5568, 'grad_norm': 1.0790888839572594, 'learning_rate': 9.97116602977461e-06, 'epoch': 0.06} 6%|▋ | 1397/22095 [2:34:07<30:59:25, 5.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50517 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43213 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95996 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1398/22095 [2:34:10<27:17:41, 4.75s/it] {'loss': 0.4803, 'grad_norm': 1.2668559111678692, 'learning_rate': 9.971087378209235e-06, 'epoch': 0.06} 6%|▋ | 1398/22095 [2:34:10<27:17:41, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (108834 > 40960). 
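The `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings recur throughout this run. One common guard is to clamp tokenized sequences to the model maximum before they reach the model; a minimal sketch under the assumption that keeping the head of the sequence is acceptable (`clamp_token_ids` is an illustrative helper, not the repo's actual handling):

```python
MAX_SEQ_LEN = 40960  # model maximum reported in the log warnings


def clamp_token_ids(token_ids: list, max_len: int = MAX_SEQ_LEN) -> list:
    """Truncate a token-id sequence so position indexing cannot overflow.

    Left-truncation (keeping the tail) may be preferable for chat data;
    that is a policy choice, not something the log specifies.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len]
```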
Running this sequence through the model will result in indexing errors VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31232.png 2025-08-27 18:32:10.935130 load time: 1053.71 ms 6%|▋ | 1399/22095 [2:34:14<25:23:16, 4.42s/it] {'loss': 0.4364, 'grad_norm': 0.755449835673747, 'learning_rate': 9.97100861983046e-06, 'epoch': 0.06} 6%|▋ | 1399/22095 [2:34:14<25:23:16, 4.42s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (143038128 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 6%|▋ | 1400/22095 [2:34:18<24:12:00, 4.21s/it] {'loss': 0.4613, 'grad_norm': 0.8453741315660476, 'learning_rate': 9.970929754639976e-06, 'epoch': 0.06} 6%|▋ | 1400/22095 [2:34:18<24:12:00, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60726 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1401/22095 [2:34:21<22:29:02, 3.91s/it] {'loss': 0.4654, 'grad_norm': 0.8970543727300306, 'learning_rate': 9.970850782639478e-06, 'epoch': 0.06} 6%|▋ | 1401/22095 [2:34:21<22:29:02, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047146 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3'}]} 6%|▋ | 1402/22095 [2:34:24<21:59:17, 3.83s/it] {'loss': 0.3953, 'grad_norm': 0.9167840194035326, 'learning_rate': 9.970771703830666e-06, 'epoch': 0.06} 6%|▋ | 1402/22095 [2:34:24<21:59:17, 3.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1403/22095 [2:34:34<31:50:58, 5.54s/it] {'loss': 0.5232, 'grad_norm': 0.5676117809549915, 'learning_rate': 9.970692518215236e-06, 'epoch': 0.06} 6%|▋ | 1403/22095 [2:34:34<31:50:58, 5.54s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader img = Image.open(buff) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open im = _open_core(fp, filename, prefix, formats) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core _decompression_bomb_check(im.size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check raise DecompressionBombError(msg) PIL.Image.DecompressionBombError: Image size (255713934 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. [Try #0] Failed to fetch sample 7926320 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (255713934 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/38454.png', 'image_wh': [[19362, 13207]], 'conversations': [{'from': 'human', 'value': '\nwhich are the 2 years mentioned Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': '2016 and 2018.\nThe image text mentions two years, which are 2016 and 2018. These years are related to the UN Summit for Refugees and Migrants. It is not clear from the image what happened in those years, but it is possible that important decisions or events related to global migration policies took place during those times. 
For more information, the text directs readers to the website refugeesmigrants.un.org and to use the hashtag #UN4RefugeesMigrants.'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 6%|▋ | 1404/22095 [2:34:37<28:01:12, 4.88s/it] {'loss': 0.4419, 'grad_norm': 0.8284466275722184, 'learning_rate': 9.970613225794887e-06, 'epoch': 0.06} 6%|▋ | 1404/22095 [2:34:37<28:01:12, 4.88s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047661 in VC:s3://multi-modal/UniGeo/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 
2cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,M是AB中点,∴BM=\\frac{1}{2}AB=5cm,又∵NB=2cm,∴MN=BM-BN=5-2=3cm.'}]} 6%|▋ | 1405/22095 [2:34:40<24:33:28, 4.27s/it] {'loss': 0.4603, 'grad_norm': 0.916990534856046, 'learning_rate': 9.970533826571329e-06, 'epoch': 0.06} 6%|▋ | 1405/22095 [2:34:40<24:33:28, 4.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1406/22095 [2:34:49<32:27:16, 5.65s/it] {'loss': 0.512, 'grad_norm': 0.5007907202491658, 'learning_rate': 9.970454320546264e-06, 'epoch': 0.06} 6%|▋ | 1406/22095 [2:34:49<32:27:16, 5.65s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-27 18:32:48.346332 load time: 1396.04 ms 6%|▋ | 1407/22095 [2:34:53<28:54:12, 5.03s/it] {'loss': 0.4818, 'grad_norm': 0.8791752963483216, 'learning_rate': 9.9703747077214e-06, 'epoch': 0.06} 6%|▋ | 1407/22095 [2:34:53<28:54:12, 5.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42089 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1408/22095 [2:34:56<25:21:24, 4.41s/it] {'loss': 0.4426, 'grad_norm': 0.8003276708620475, 'learning_rate': 9.970294988098452e-06, 'epoch': 0.06} 6%|▋ | 1408/22095 [2:34:56<25:21:24, 4.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (59984 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42502 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42987 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1409/22095 [2:35:05<33:57:30, 5.91s/it] {'loss': 0.5424, 'grad_norm': 0.45111031550043196, 'learning_rate': 9.970215161679126e-06, 'epoch': 0.06} 6%|▋ | 1409/22095 [2:35:05<33:57:30, 5.91s/it] 6%|▋ | 1410/22095 [2:35:09<30:08:48, 5.25s/it] {'loss': 0.4591, 'grad_norm': 0.7672189227879728, 'learning_rate': 9.970135228465144e-06, 'epoch': 0.06} 6%|▋ | 1410/22095 [2:35:09<30:08:48, 5.25s/it] 6%|▋ | 1411/22095 [2:35:13<27:45:49, 4.83s/it] {'loss': 0.4507, 'grad_norm': 0.9335458891124578, 'learning_rate': 9.970055188458219e-06, 'epoch': 0.06} 6%|▋ | 1411/22095 [2:35:13<27:45:49, 4.83s/it] 6%|▋ | 1412/22095 [2:35:16<25:17:24, 4.40s/it] {'loss': 0.4377, 'grad_norm': 0.7114136093026613, 'learning_rate': 9.969975041660073e-06, 'epoch': 0.06} 6%|▋ | 1412/22095 [2:35:16<25:17:24, 4.40s/it] 6%|▋ | 1413/22095 [2:35:19<22:26:56, 3.91s/it] {'loss': 0.3875, 'grad_norm': 0.8747199646442678, 'learning_rate': 9.969894788072427e-06, 'epoch': 0.06} 6%|▋ | 1413/22095 [2:35:19<22:26:56, 3.91s/it] 6%|▋ | 1414/22095 [2:35:22<22:16:37, 3.88s/it] {'loss': 0.4383, 'grad_norm': 0.829182408703576, 'learning_rate': 9.969814427697007e-06, 'epoch': 0.06} 6%|▋ | 1414/22095 [2:35:22<22:16:37, 3.88s/it] 6%|▋ | 1415/22095 [2:35:26<22:04:31, 3.84s/it] {'loss': 0.4635, 'grad_norm': 0.7319960516188317, 'learning_rate': 9.969733960535537e-06, 'epoch': 0.06} 6%|▋ | 1415/22095 [2:35:26<22:04:31, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42271 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51551 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84946 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70376 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1416/22095 [2:35:35<30:26:38, 5.30s/it] {'loss': 0.5478, 'grad_norm': 0.9267499056159455, 'learning_rate': 9.969653386589749e-06, 'epoch': 0.06} 6%|▋ | 1416/22095 [2:35:35<30:26:38, 5.30s/it] 6%|▋ | 1417/22095 [2:35:42<32:42:13, 5.69s/it] {'loss': 0.5244, 'grad_norm': 0.4920810844031071, 'learning_rate': 9.969572705861371e-06, 'epoch': 0.06} 6%|▋ | 1417/22095 [2:35:42<32:42:13, 5.69s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (63257 > 40960). 
Running this sequence through the model will result in indexing errors 6%|▋ | 1418/22095 [2:35:45<28:59:52, 5.05s/it] {'loss': 0.4943, 'grad_norm': 0.8843609262456265, 'learning_rate': 9.96949191835214e-06, 'epoch': 0.06} 6%|▋ | 1418/22095 [2:35:45<28:59:52, 5.05s/it] 6%|▋ | 1419/22095 [2:35:50<28:18:38, 4.93s/it] {'loss': 0.4671, 'grad_norm': 0.7939118677093964, 'learning_rate': 9.96941102406379e-06, 'epoch': 0.06} 6%|▋ | 1419/22095 [2:35:50<28:18:38, 4.93s/it] 6%|▋ | 1420/22095 [2:35:53<26:10:39, 4.56s/it] {'loss': 0.4537, 'grad_norm': 0.8074312906259381, 'learning_rate': 9.969330022998057e-06, 'epoch': 0.06} 6%|▋ | 1420/22095 [2:35:53<26:10:39, 4.56s/it] 6%|▋ | 1421/22095 [2:35:57<24:35:25, 4.28s/it] {'loss': 0.4675, 'grad_norm': 0.7941660570380797, 'learning_rate': 9.969248915156689e-06, 'epoch': 0.06} 6%|▋ | 1421/22095 [2:35:57<24:35:25, 4.28s/it] 6%|▋ | 1422/22095 [2:36:00<22:10:43, 3.86s/it] {'loss': 0.4102, 'grad_norm': 0.739298603734896, 'learning_rate': 9.96916770054142e-06, 'epoch': 0.06} 6%|▋ | 1422/22095 [2:36:00<22:10:43, 3.86s/it] 6%|▋ | 1423/22095 [2:36:03<21:08:58, 3.68s/it] {'loss': 0.4473, 'grad_norm': 0.7447203329144491, 'learning_rate': 9.969086379154e-06, 'epoch': 0.06} 6%|▋ | 1423/22095 [2:36:03<21:08:58, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (111125556 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10005.png 2025-08-27 18:34:03.086728 load time: 1277.12 ms 6%|▋ | 1424/22095 [2:36:08<23:40:59, 4.12s/it] {'loss': 0.5181, 'grad_norm': 1.1690352979094978, 'learning_rate': 9.969004950996175e-06, 'epoch': 0.06} 6%|▋ | 1424/22095 [2:36:08<23:40:59, 4.12s/it] 6%|▋ | 1425/22095 [2:36:12<22:34:35, 3.93s/it] {'loss': 0.4239, 'grad_norm': 0.6911976096981919, 'learning_rate': 9.968923416069694e-06, 'epoch': 0.06} 6%|▋ | 1425/22095 [2:36:12<22:34:35, 3.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (71271 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100067 > 40960). Running this sequence through the model will result in indexing errors 6%|▋ | 1426/22095 [2:36:20<28:59:51, 5.05s/it] {'loss': 0.514, 'grad_norm': 0.5637242347474052, 'learning_rate': 9.96884177437631e-06, 'epoch': 0.06} 6%|▋ | 1426/22095 [2:36:20<28:59:51, 5.05s/it] 6%|▋ | 1427/22095 [2:36:24<27:32:14, 4.80s/it] {'loss': 0.467, 'grad_norm': 0.7441386514999966, 'learning_rate': 9.968760025917777e-06, 'epoch': 0.06} 6%|▋ | 1427/22095 [2:36:24<27:32:14, 4.80s/it] 6%|▋ | 1428/22095 [2:36:27<24:15:07, 4.22s/it] {'loss': 0.4294, 'grad_norm': 0.784708982521003, 'learning_rate': 9.968678170695851e-06, 'epoch': 0.06} 6%|▋ | 1428/22095 [2:36:27<24:15:07, 4.22s/it] 6%|▋ | 1429/22095 [2:36:31<24:25:13, 4.25s/it] {'loss': 0.4495, 'grad_norm': 0.6990928541859651, 'learning_rate': 9.968596208712293e-06, 'epoch': 0.06} 6%|▋ | 1429/22095 [2:36:31<24:25:13, 4.25s/it] 6%|▋ | 1430/22095 [2:36:34<23:06:23, 4.03s/it] {'loss': 0.4917, 'grad_norm': 1.0658537709299807, 'learning_rate': 9.968514139968862e-06, 'epoch': 0.06} 6%|▋ | 1430/22095 [2:36:34<23:06:23, 4.03s/it] 6%|▋ | 
1431/22095 [2:36:38<22:24:22, 3.90s/it] {'loss': 0.4447, 'grad_norm': 0.7858687905006959, 'learning_rate': 9.96843196446732e-06, 'epoch': 0.06} 6%|▋ | 1431/22095 [2:36:38<22:24:22, 3.90s/it] 6%|▋ | 1432/22095 [2:36:42<22:03:22, 3.84s/it] {'loss': 0.4743, 'grad_norm': 0.8138918788039744, 'learning_rate': 9.968349682209434e-06, 'epoch': 0.06} 6%|▋ | 1432/22095 [2:36:42<22:03:22, 3.84s/it] 6%|▋ | 1433/22095 [2:36:45<20:46:14, 3.62s/it] {'loss': 0.4507, 'grad_norm': 0.7115302776794071, 'learning_rate': 9.968267293196976e-06, 'epoch': 0.06} 6%|▋ | 1433/22095 [2:36:45<20:46:14, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 6%|▋ | 1434/22095 [2:36:54<30:43:52, 5.35s/it] {'loss': 0.5437, 'grad_norm': 1.566811431079173, 'learning_rate': 9.96818479743171e-06, 'epoch': 0.06} 6%|▋ | 1434/22095 [2:36:54<30:43:52, 5.35s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-27 18:34:53.030975 load time: 1210.15 ms VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/30fa924765d794b1119ecbe77ef7c9d78045bed35211f46887007cb504e9c0b0.png 2025-08-27 18:34:53.580406 load time: 1177.9 ms 6%|▋ | 1435/22095 [2:36:58<27:55:36, 4.87s/it] {'loss': 0.4627, 'grad_norm': 0.7712599008875365, 'learning_rate': 9.968102194915411e-06, 'epoch': 0.06} 6%|▋ | 1435/22095 [2:36:58<27:55:36, 4.87s/it] 6%|▋ | 1436/22095 [2:37:02<25:38:10, 4.47s/it] {'loss': 0.4519, 'grad_norm': 0.8098764007806255, 'learning_rate': 9.968019485649856e-06, 'epoch': 0.06} 6%|▋ | 1436/22095 [2:37:02<25:38:10, 4.47s/it] 7%|▋ | 1437/22095 [2:37:05<23:27:30, 4.09s/it] {'loss': 0.4679, 'grad_norm': 0.730530054382985, 'learning_rate': 9.967936669636818e-06, 'epoch': 0.07} 7%|▋ | 1437/22095 [2:37:05<23:27:30, 4.09s/it] 7%|▋ | 1438/22095 [2:37:08<21:46:13, 3.79s/it] {'loss': 0.3934, 'grad_norm': 0.7298010549478803, 'learning_rate': 9.96785374687808e-06, 'epoch': 0.07} 7%|▋ | 1438/22095 [2:37:08<21:46:13, 3.79s/it] 7%|▋ | 
1439/22095 [2:37:11<19:57:37, 3.48s/it] {'loss': 0.4488, 'grad_norm': 0.7719987609607117, 'learning_rate': 9.967770717375423e-06, 'epoch': 0.07} 7%|▋ | 1439/22095 [2:37:11<19:57:37, 3.48s/it] 7%|▋ | 1440/22095 [2:37:14<19:44:26, 3.44s/it] {'loss': 0.4493, 'grad_norm': 0.7576083931078323, 'learning_rate': 9.967687581130632e-06, 'epoch': 0.07} 7%|▋ | 1440/22095 [2:37:14<19:44:26, 3.44s/it] 7%|▋ | 1441/22095 [2:37:17<19:30:39, 3.40s/it] {'loss': 0.434, 'grad_norm': 0.7456916900585282, 'learning_rate': 9.967604338145488e-06, 'epoch': 0.07} 7%|▋ | 1441/22095 [2:37:17<19:30:39, 3.40s/it] 7%|▋ | 1442/22095 [2:37:20<19:04:48, 3.33s/it] {'loss': 0.4175, 'grad_norm': 0.7609509382040559, 'learning_rate': 9.967520988421788e-06, 'epoch': 0.07} 7%|▋ | 1442/22095 [2:37:20<19:04:48, 3.33s/it] 7%|▋ | 1443/22095 [2:37:24<19:10:10, 3.34s/it] {'loss': 0.465, 'grad_norm': 0.8318454322580573, 'learning_rate': 9.967437531961316e-06, 'epoch': 0.07} 7%|▋ | 1443/22095 [2:37:24<19:10:10, 3.34s/it] 7%|▋ | 1444/22095 [2:37:27<18:23:14, 3.21s/it] {'loss': 0.4394, 'grad_norm': 0.7591737815983305, 'learning_rate': 9.967353968765868e-06, 'epoch': 0.07} 7%|▋ | 1444/22095 [2:37:27<18:23:14, 3.21s/it] 7%|▋ | 1445/22095 [2:37:30<18:08:39, 3.16s/it] {'loss': 0.47, 'grad_norm': 0.7765098393214346, 'learning_rate': 9.967270298837239e-06, 'epoch': 0.07} 7%|▋ | 1445/22095 [2:37:30<18:08:39, 3.16s/it] 7%|▋ | 1446/22095 [2:37:35<21:19:11, 3.72s/it] {'loss': 0.3877, 'grad_norm': 0.7409110390658145, 'learning_rate': 9.967186522177228e-06, 'epoch': 0.07} 7%|▋ | 1446/22095 [2:37:35<21:19:11, 3.72s/it] 7%|▋ | 1447/22095 [2:37:38<20:32:01, 3.58s/it] {'loss': 0.4224, 'grad_norm': 0.9312488082329916, 'learning_rate': 9.967102638787634e-06, 'epoch': 0.07} 7%|▋ | 1447/22095 [2:37:38<20:32:01, 3.58s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8932783 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 55936, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为直线段AB的上点,P点为AC的中点,Q点为BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6cm'}]} 7%|▋ | 1448/22095 [2:37:41<20:13:22, 3.53s/it] {'loss': 0.4626, 'grad_norm': 0.749522516130679, 'learning_rate': 9.96701864867026e-06, 'epoch': 0.07} 7%|▋ | 1448/22095 [2:37:41<20:13:22, 3.53s/it] 7%|▋ | 1449/22095 [2:37:45<20:03:28, 3.50s/it] {'loss': 0.5151, 'grad_norm': 0.8037830099599014, 'learning_rate': 9.96693455182691e-06, 'epoch': 0.07} 7%|▋ | 1449/22095 [2:37:45<20:03:28, 3.50s/it] 7%|▋ | 1450/22095 [2:37:48<19:24:57, 3.39s/it] {'loss': 0.5014, 'grad_norm': 0.8896635354781706, 'learning_rate': 9.96685034825939e-06, 'epoch': 0.07} 7%|▋ | 1450/22095 [2:37:48<19:24:57, 3.39s/it] 7%|▋ | 1451/22095 [2:37:51<18:55:05, 3.30s/it] {'loss': 0.48, 'grad_norm': 0.9075654261092597, 'learning_rate': 9.966766037969512e-06, 'epoch': 0.07} 7%|▋ | 1451/22095 [2:37:51<18:55:05, 3.30s/it] 7%|▋ | 1452/22095 [2:37:54<18:13:29, 3.18s/it] {'loss': 0.4136, 'grad_norm': 0.779067559983051, 'learning_rate': 9.966681620959085e-06, 'epoch': 0.07} 7%|▋ | 1452/22095 [2:37:54<18:13:29, 3.18s/it] 7%|▋ | 1453/22095 [2:37:57<17:58:12, 3.13s/it] {'loss': 0.4761, 'grad_norm': 0.8202536297853454, 'learning_rate': 9.966597097229925e-06, 'epoch': 0.07} 7%|▋ | 1453/22095 [2:37:57<17:58:12, 3.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65947 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1454/22095 [2:38:00<17:44:48, 3.10s/it] {'loss': 0.468, 'grad_norm': 1.5381184164373207, 'learning_rate': 9.966512466783846e-06, 'epoch': 0.07} 7%|▋ | 1454/22095 [2:38:00<17:44:48, 3.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43566 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118122 > 40960). Running this sequence through the model will result in indexing errors 7%|▋ | 1455/22095 [2:38:03<18:24:11, 3.21s/it] {'loss': 0.4855, 'grad_norm': 0.7781534864045977, 'learning_rate': 9.966427729622668e-06, 'epoch': 0.07} 7%|▋ | 1455/22095 [2:38:03<18:24:11, 3.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75739 > 40960). 
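The recurring pair `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` suggests the loader reconciles `<image>` placeholders with the attached images when a conversation is missing them. The actual fix-up in `data_qwen_2.py` is not shown in this log; a minimal sketch of what such a repair could look like, with illustrative names:

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder string; not confirmed by the log


def fix_image_tokens(text: str, num_images: int) -> str:
    """Ensure the conversation carries one placeholder per attached image,
    prepending any that are missing (the log implies an automatic repair)."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing > 0:
        text = (IMAGE_TOKEN + "\n") * missing + text
    return text
```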
Running this sequence through the model will result in indexing errors
7%|▋ | 1456/22095 [2:38:06<17:52:30, 3.12s/it] {'loss': 0.4623, 'grad_norm': 0.7798522596941327, 'learning_rate': 9.966342885748212e-06, 'epoch': 0.07}
7%|▋ | 1456/22095 [2:38:06<17:52:30, 3.12s/it]
7%|▋ | 1457/22095 [2:38:10<18:58:32, 3.31s/it] {'loss': 0.483, 'grad_norm': 0.7384237234710249, 'learning_rate': 9.9662579351623e-06, 'epoch': 0.07}
7%|▋ | 1457/22095 [2:38:10<18:58:32, 3.31s/it]
7%|▋ | 1458/22095 [2:38:14<19:12:20, 3.35s/it] {'loss': 0.4736, 'grad_norm': 0.7826289098351965, 'learning_rate': 9.966172877866757e-06, 'epoch': 0.07}
7%|▋ | 1458/22095 [2:38:14<19:12:20, 3.35s/it]
7%|▋ | 1459/22095 [2:38:16<18:18:37, 3.19s/it] {'loss': 0.4721, 'grad_norm': 1.0297726700330934, 'learning_rate': 9.966087713863412e-06, 'epoch': 0.07}
7%|▋ | 1459/22095 [2:38:16<18:18:37, 3.19s/it]
7%|▋ | 1460/22095 [2:38:20<18:30:03, 3.23s/it] {'loss': 0.4884, 'grad_norm': 0.8561847199763449, 'learning_rate': 9.966002443154095e-06, 'epoch': 0.07}
7%|▋ | 1460/22095 [2:38:20<18:30:03, 3.23s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1461/22095 [2:38:25<22:51:55, 3.99s/it] {'loss': 0.5438, 'grad_norm': 2.3350403237905915, 'learning_rate': 9.965917065740636e-06, 'epoch': 0.07}
7%|▋ | 1461/22095 [2:38:25<22:51:55, 3.99s/it]
7%|▋ | 1462/22095 [2:38:29<21:50:21, 3.81s/it] {'loss': 0.473, 'grad_norm': 0.8562932774188359, 'learning_rate': 9.965831581624872e-06, 'epoch': 0.07}
7%|▋ | 1462/22095 [2:38:29<21:50:21, 3.81s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1463/22095 [2:38:38<31:45:27, 5.54s/it] {'loss': 0.547, 'grad_norm': 0.6826153820493718, 'learning_rate': 9.965745990808638e-06, 'epoch': 0.07}
7%|▋ | 1463/22095 [2:38:38<31:45:27, 5.54s/it]
7%|▋ | 1464/22095 [2:38:43<29:42:35, 5.18s/it] {'loss': 0.4536, 'grad_norm': 0.7976139884418457, 'learning_rate': 9.965660293293773e-06, 'epoch': 0.07}
7%|▋ | 1464/22095 [2:38:43<29:42:35, 5.18s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1465/22095 [2:38:54<39:27:55, 6.89s/it] {'loss': 0.5063, 'grad_norm': 0.9966975784442647, 'learning_rate': 9.96557448908212e-06, 'epoch': 0.07}
7%|▋ | 1465/22095 [2:38:54<39:27:55, 6.89s/it]
7%|▋ | 1466/22095 [2:38:57<33:42:59, 5.88s/it] {'loss': 0.4333, 'grad_norm': 0.8040762001174595, 'learning_rate': 9.965488578175522e-06, 'epoch': 0.07}
7%|▋ | 1466/22095 [2:38:57<33:42:59, 5.88s/it]
7%|▋ | 1467/22095 [2:39:00<28:33:48, 4.98s/it] {'loss': 0.4308, 'grad_norm': 0.8261180768400028, 'learning_rate': 9.965402560575825e-06, 'epoch': 0.07}
7%|▋ | 1467/22095 [2:39:00<28:33:48, 4.98s/it]
7%|▋ | 1468/22095 [2:39:04<26:57:32, 4.71s/it] {'loss': 0.4597, 'grad_norm': 0.9011136586096863, 'learning_rate': 9.965316436284877e-06, 'epoch': 0.07}
7%|▋ | 1468/22095 [2:39:04<26:57:32, 4.71s/it]
7%|▋ | 1469/22095 [2:39:07<24:01:36, 4.19s/it] {'loss': 0.4384, 'grad_norm': 1.111941679470362, 'learning_rate': 9.965230205304528e-06, 'epoch': 0.07}
7%|▋ | 1469/22095 [2:39:07<24:01:36, 4.19s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (72252 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90067 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1470/22095 [2:39:10<22:14:40, 3.88s/it] {'loss': 0.4624, 'grad_norm': 0.7204158964166693, 'learning_rate': 9.96514386763663e-06, 'epoch': 0.07}
7%|▋ | 1470/22095 [2:39:10<22:14:40, 3.88s/it]
7%|▋ | 1471/22095 [2:39:13<20:53:56, 3.65s/it] {'loss': 0.4667, 'grad_norm': 0.9149661678745, 'learning_rate': 9.965057423283043e-06, 'epoch': 0.07}
7%|▋ | 1471/22095 [2:39:13<20:53:56, 3.65s/it]
7%|▋ | 1472/22095 [2:39:16<19:30:44, 3.41s/it] {'loss': 0.4311, 'grad_norm': 0.7932039742728114, 'learning_rate': 9.964970872245618e-06, 'epoch': 0.07}
7%|▋ | 1472/22095 [2:39:16<19:30:44, 3.41s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1473/22095 [2:39:24<27:24:39, 4.79s/it] {'loss': 0.5952, 'grad_norm': 1.89014538464223, 'learning_rate': 9.96488421452622e-06, 'epoch': 0.07}
7%|▋ | 1473/22095 [2:39:24<27:24:39, 4.79s/it]
7%|▋ | 1474/22095 [2:39:28<25:01:54, 4.37s/it] {'loss': 0.443, 'grad_norm': 0.8032671819143476, 'learning_rate': 9.964797450126708e-06, 'epoch': 0.07}
7%|▋ | 1474/22095 [2:39:28<25:01:54, 4.37s/it]
7%|▋ | 1475/22095 [2:39:31<23:46:05, 4.15s/it] {'loss': 0.4828, 'grad_norm': 0.7481664383546026, 'learning_rate': 9.964710579048947e-06, 'epoch': 0.07}
7%|▋ | 1475/22095 [2:39:31<23:46:05, 4.15s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (128286 > 40960).
Running this sequence through the model will result in indexing errors
7%|▋ | 1476/22095 [2:39:35<22:53:48, 4.00s/it] {'loss': 0.4415, 'grad_norm': 0.7386169349436833, 'learning_rate': 9.964623601294802e-06, 'epoch': 0.07}
7%|▋ | 1476/22095 [2:39:35<22:53:48, 4.00s/it]
7%|▋ | 1477/22095 [2:39:38<20:59:12, 3.66s/it] {'loss': 0.4436, 'grad_norm': 0.779597142522534, 'learning_rate': 9.964536516866146e-06, 'epoch': 0.07}
7%|▋ | 1477/22095 [2:39:38<20:59:12, 3.66s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1478/22095 [2:39:45<26:19:13, 4.60s/it] {'loss': 0.5342, 'grad_norm': 0.6576504922840656, 'learning_rate': 9.964449325764846e-06, 'epoch': 0.07}
7%|▋ | 1478/22095 [2:39:45<26:19:13, 4.60s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960772 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11607, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 2cm\nB. 5cm\nC. 4cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
7%|▋ | 1479/22095 [2:39:48<24:46:39, 4.33s/it] {'loss': 0.4446, 'grad_norm': 0.8003162853179788, 'learning_rate': 9.964362027992777e-06, 'epoch': 0.07}
7%|▋ | 1479/22095 [2:39:48<24:46:39, 4.33s/it]
7%|▋ | 1480/22095 [2:39:51<22:17:56, 3.89s/it] {'loss': 0.4222, 'grad_norm': 0.7921636322644379, 'learning_rate': 9.964274623551814e-06, 'epoch': 0.07}
7%|▋ | 1480/22095 [2:39:51<22:17:56, 3.89s/it]
7%|▋ | 1481/22095 [2:39:54<20:27:54, 3.57s/it] {'loss': 0.4191, 'grad_norm': 0.8123469531045502, 'learning_rate': 9.964187112443839e-06, 'epoch': 0.07}
7%|▋ | 1481/22095 [2:39:54<20:27:54, 3.57s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8583842 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 9779, 'image': '385320175.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Mystery, Thriller & Suspense? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Sports & Outdoors? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
7%|▋ | 1482/22095 [2:39:57<19:21:58, 3.38s/it] {'loss': 0.459, 'grad_norm': 0.820058490946606, 'learning_rate': 9.964099494670727e-06, 'epoch': 0.07}
7%|▋ | 1482/22095 [2:39:57<19:21:58, 3.38s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (51206 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46536 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81739 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85947 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59847 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41300 > 40960).
Running this sequence through the model will result in indexing errors
7%|▋ | 1483/22095 [2:40:00<18:30:11, 3.23s/it] {'loss': 0.4884, 'grad_norm': 0.7893140272997659, 'learning_rate': 9.964011770234364e-06, 'epoch': 0.07}
7%|▋ | 1483/22095 [2:40:00<18:30:11, 3.23s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1484/22095 [2:40:07<25:04:16, 4.38s/it] {'loss': 0.5305, 'grad_norm': 1.0204350131333586, 'learning_rate': 9.963923939136632e-06, 'epoch': 0.07}
7%|▋ | 1484/22095 [2:40:07<25:04:16, 4.38s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (77048 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45386 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1485/22095 [2:40:11<24:05:58, 4.21s/it] {'loss': 0.47, 'grad_norm': 1.1363273251767987, 'learning_rate': 9.963836001379423e-06, 'epoch': 0.07}
7%|▋ | 1485/22095 [2:40:11<24:05:58, 4.21s/it]
7%|▋ | 1486/22095 [2:40:14<21:52:52, 3.82s/it] {'loss': 0.4702, 'grad_norm': 0.8064235598914097, 'learning_rate': 9.963747956964623e-06, 'epoch': 0.07}
7%|▋ | 1486/22095 [2:40:14<21:52:52, 3.82s/it]
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369952 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36704, 'image': 'vrdu_table_final_2/astro-ph.CO/15a4d4a6-a378-4939-902f-c289d196ef16.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-27 18:38:12.349325 load time: 1033.05 ms
7%|▋ | 1487/22095 [2:40:17<21:30:15, 3.76s/it] {'loss': 0.4592, 'grad_norm': 0.728234755597837, 'learning_rate': 9.963659805894123e-06, 'epoch': 0.07}
7%|▋ | 1487/22095 [2:40:17<21:30:15, 3.76s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1488/22095 [2:40:21<21:17:15, 3.72s/it] {'loss': 0.4776, 'grad_norm': 0.8039241658387191, 'learning_rate': 9.96357154816982e-06, 'epoch': 0.07}
7%|▋ | 1488/22095 [2:40:21<21:17:15, 3.72s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1489/22095 [2:40:24<20:53:05, 3.65s/it] {'loss': 0.4504, 'grad_norm': 0.7661464342835728, 'learning_rate': 9.963483183793606e-06, 'epoch': 0.07}
7%|▋ | 1489/22095 [2:40:24<20:53:05, 3.65s/it]
7%|▋ | 1490/22095 [2:40:28<20:42:14, 3.62s/it] {'loss': 0.4531, 'grad_norm': 0.737037447449985, 'learning_rate': 9.963394712767385e-06, 'epoch': 0.07}
7%|▋ | 1490/22095 [2:40:28<20:42:14, 3.62s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1491/22095 [2:40:38<31:51:29, 5.57s/it] {'loss': 0.5121, 'grad_norm': 0.5540904667559855, 'learning_rate': 9.963306135093054e-06, 'epoch': 0.07}
7%|▋ | 1491/22095 [2:40:38<31:51:29, 5.57s/it]
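The recurring `ValueError: Image size [...] is too small. Minimum size is 28` failures above are all raised by the dataset's `_get_item` validation against samples whose recorded `image_wh` falls below a 28-pixel minimum side. One way to avoid paying a fetch-and-retry cycle per bad record is to pre-filter the metadata before training; the sketch below is illustrative only (`MIN_SIDE`, `is_valid_sample`, and `filter_samples` are hypothetical names, not functions from `data_qwen_2.py`), using sample dicts shaped like the "Problematic sample" records printed in this log:

```python
# Sketch (assumed helper, not part of the training code): drop samples whose
# recorded image size is below the model's minimum side length, mirroring the
# "Minimum size is 28" ValueError seen repeatedly in the log above.
MIN_SIDE = 28  # minimum width/height accepted by the pipeline in this run

def is_valid_sample(sample: dict) -> bool:
    """Return False if any recorded (width, height) pair is under MIN_SIDE."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIDE or h < MIN_SIDE:
            return False
    return True

def filter_samples(samples: list[dict]) -> list[dict]:
    """Keep only samples whose images meet the minimum size."""
    return [s for s in samples if is_valid_sample(s)]
```

Note that this also catches the `[[0, 0]]` entries in the log, which indicate missing or unreadable images rather than genuinely tiny ones.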
7%|▋ | 1492/22095 [2:40:41<28:21:11, 4.95s/it] {'loss': 0.4768, 'grad_norm': 0.8491171776657304, 'learning_rate': 9.96321745077252e-06, 'epoch': 0.07}
7%|▋ | 1492/22095 [2:40:41<28:21:11, 4.95s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1493/22095 [2:40:51<36:24:26, 6.36s/it] {'loss': 0.5299, 'grad_norm': 0.43270575186531257, 'learning_rate': 9.963128659807684e-06, 'epoch': 0.07}
7%|▋ | 1493/22095 [2:40:51<36:24:26, 6.36s/it]
7%|▋ | 1494/22095 [2:40:54<30:57:42, 5.41s/it] {'loss': 0.4687, 'grad_norm': 0.8745420939942341, 'learning_rate': 9.963039762200457e-06, 'epoch': 0.07}
7%|▋ | 1494/22095 [2:40:54<30:57:42, 5.41s/it]
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-27 18:38:54.245459 load time: 1068.89 ms
7%|▋ | 1495/22095 [2:40:57<26:46:47, 4.68s/it] {'loss': 0.459, 'grad_norm': 0.8373452601614464, 'learning_rate': 9.96295075795275e-06, 'epoch': 0.07}
7%|▋ | 1495/22095 [2:40:57<26:46:47, 4.68s/it]
7%|▋ | 1496/22095 [2:41:01<25:13:00, 4.41s/it] {'loss': 0.5082, 'grad_norm': 0.7698332952658042, 'learning_rate': 9.962861647066472e-06, 'epoch': 0.07}
7%|▋ | 1496/22095 [2:41:01<25:13:00, 4.41s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (74749 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74850 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1497/22095 [2:41:05<23:52:52, 4.17s/it] {'loss': 0.456, 'grad_norm': 0.8100270250771073, 'learning_rate': 9.962772429543539e-06, 'epoch': 0.07}
7%|▋ | 1497/22095 [2:41:05<23:52:52, 4.17s/it]
7%|▋ | 1498/22095 [2:41:08<21:33:16, 3.77s/it] {'loss': 0.4587, 'grad_norm': 0.7843116494944076, 'learning_rate': 9.96268310538587e-06, 'epoch': 0.07}
7%|▋ | 1498/22095 [2:41:08<21:33:16, 3.77s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (41388 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43882 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75950 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1499/22095 [2:41:10<20:09:13, 3.52s/it] {'loss': 0.438, 'grad_norm': 0.7879384228895121, 'learning_rate': 9.962593674595382e-06, 'epoch': 0.07}
7%|▋ | 1499/22095 [2:41:10<20:09:13, 3.52s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304953 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1f.3QhLBNTKJjSszcXXbO2VXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n对图片进行OCR。'}, {'from': 'gpt', 'value': '图中所有文字:\n2017\n双11来了\n特价\n18/21速后拨\n19.9\n包邮\n¥\nLEMA\nLEMA\n真功夫单车折扣店\nWRLCOMETOMYSHOP\n入行六年,只卖正品\n技师指导安装\n山地自行车后拨\n变速顺畅金秋特惠'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1500/22095 [2:41:14<21:01:03, 3.67s/it] {'loss': 0.4559, 'grad_norm': 0.7403443187198006, 'learning_rate': 9.962504137173997e-06, 'epoch': 0.07}
7%|▋ | 1500/22095 [2:41:14<21:01:03, 3.67s/it]
7%|▋ | 1501/22095 [2:41:17<19:46:28, 3.46s/it] {'loss': 0.4454, 'grad_norm': 0.7619760500772016, 'learning_rate': 9.96241449312364e-06, 'epoch': 0.07}
7%|▋ | 1501/22095 [2:41:17<19:46:28, 3.46s/it]
7%|▋ | 1502/22095 [2:41:20<18:37:15, 3.26s/it] {'loss': 0.4684, 'grad_norm': 0.8274554589173836, 'learning_rate': 9.962324742446237e-06, 'epoch': 0.07}
7%|▋ | 1502/22095 [2:41:20<18:37:15, 3.26s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (87700 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63866 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1503/22095 [2:41:23<18:10:01, 3.18s/it] {'loss': 0.4558, 'grad_norm': 0.8122743257002134, 'learning_rate': 9.962234885143715e-06, 'epoch': 0.07}
7%|▋ | 1503/22095 [2:41:23<18:10:01, 3.18s/it]
7%|▋ | 1504/22095 [2:41:26<17:56:29, 3.14s/it] {'loss': 0.5069, 'grad_norm': 0.8125418203275906, 'learning_rate': 9.962144921218005e-06, 'epoch': 0.07}
7%|▋ | 1504/22095 [2:41:26<17:56:29, 3.14s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42748 > 40960).
Running this sequence through the model will result in indexing errors
7%|▋ | 1505/22095 [2:41:36<29:55:37, 5.23s/it] {'loss': 0.5401, 'grad_norm': 1.197255811058083, 'learning_rate': 9.962054850671042e-06, 'epoch': 0.07}
7%|▋ | 1505/22095 [2:41:36<29:55:37, 5.23s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (48810 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51205 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1506/22095 [2:41:40<26:47:38, 4.68s/it] {'loss': 0.3907, 'grad_norm': 0.7745585414790322, 'learning_rate': 9.961964673504759e-06, 'epoch': 0.07}
7%|▋ | 1506/22095 [2:41:40<26:47:38, 4.68s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1507/22095 [2:41:50<37:03:33, 6.48s/it] {'loss': 0.5421, 'grad_norm': 0.493849832077196, 'learning_rate': 9.961874389721095e-06, 'epoch': 0.07}
7%|▋ | 1507/22095 [2:41:50<37:03:33, 6.48s/it]
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30183.png 2025-08-27 18:39:49.234058 load time: 1167.5 ms
7%|▋ | 1508/22095 [2:41:54<32:01:25, 5.60s/it] {'loss': 0.4773, 'grad_norm': 0.7940022723097947, 'learning_rate': 9.96178399932199e-06, 'epoch': 0.07}
7%|▋ | 1508/22095 [2:41:54<32:01:25, 5.60s/it]
7%|▋ | 1509/22095 [2:41:57<28:05:03, 4.91s/it] {'loss': 0.4817, 'grad_norm': 0.8717583485304328, 'learning_rate': 9.961693502309385e-06, 'epoch': 0.07}
7%|▋ | 1509/22095 [2:41:57<28:05:03, 4.91s/it]
7%|▋ | 1510/22095 [2:42:00<25:03:25, 4.38s/it] {'loss': 0.4492, 'grad_norm': 0.8235441414406076, 'learning_rate': 9.961602898685225e-06, 'epoch': 0.07}
7%|▋ | 1510/22095 [2:42:00<25:03:25, 4.38s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76986 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113329 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46734 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121986 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1511/22095 [2:42:09<32:38:17, 5.71s/it] {'loss': 0.5486, 'grad_norm': 1.0502485038616896, 'learning_rate': 9.961512188451458e-06, 'epoch': 0.07}
7%|▋ | 1511/22095 [2:42:09<32:38:17, 5.71s/it]
7%|▋ | 1512/22095 [2:42:13<28:48:03, 5.04s/it] {'loss': 0.4708, 'grad_norm': 0.7696161022692278, 'learning_rate': 9.961421371610034e-06, 'epoch': 0.07}
7%|▋ | 1512/22095 [2:42:13<28:48:03, 5.04s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/notes_1/images/step_0.png 2025-08-27 18:40:12.664590 load time: 1132.93 ms
7%|▋ | 1513/22095 [2:42:23<38:23:27, 6.71s/it] {'loss': 0.5473, 'grad_norm': 0.7810384269518527, 'learning_rate': 9.9613304481629e-06, 'epoch': 0.07}
7%|▋ | 1513/22095 [2:42:23<38:23:27, 6.71s/it]
7%|▋ | 1514/22095 [2:42:33<43:27:45, 7.60s/it] {'loss': 0.5205, 'grad_norm': 0.5406957384016672, 'learning_rate': 9.961239418112013e-06, 'epoch': 0.07}
7%|▋ | 1514/22095 [2:42:33<43:27:45, 7.60s/it]
Invalidate trace cache @ step 2: expected module 364, but got module 1
7%|▋ | 1515/22095 [2:42:36<35:49:14, 6.27s/it] {'loss': 0.4817, 'grad_norm': 0.9623025256050911, 'learning_rate': 9.961148281459328e-06, 'epoch': 0.07}
7%|▋ | 1515/22095 [2:42:36<35:49:14, 6.27s/it]
7%|▋ | 1516/22095 [2:42:40<31:37:20, 5.53s/it] {'loss': 0.4505, 'grad_norm': 0.8031406777854115, 'learning_rate': 9.961057038206804e-06, 'epoch': 0.07}
7%|▋ | 1516/22095 [2:42:40<31:37:20, 5.53s/it]
7%|▋ | 1517/22095 [2:42:43<27:38:52, 4.84s/it] {'loss': 0.4735, 'grad_norm': 0.8468517938114531, 'learning_rate': 9.960965688356401e-06, 'epoch': 0.07}
7%|▋ | 1517/22095 [2:42:43<27:38:52, 4.84s/it]
7%|▋ | 1518/22095 [2:42:47<25:05:11, 4.39s/it] {'loss': 0.4216, 'grad_norm': 0.7971237582102288, 'learning_rate': 9.960874231910081e-06, 'epoch': 0.07}
7%|▋ | 1518/22095 [2:42:47<25:05:11, 4.39s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-27 18:40:45.886091 load time: 1013.91 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38144.png 2025-08-27 18:40:46.274875 load time: 1256.12 ms
7%|▋ | 1519/22095 [2:42:56<34:04:49, 5.96s/it] {'loss': 0.5635, 'grad_norm': 1.1521788910843045, 'learning_rate': 9.960782668869811e-06, 'epoch': 0.07}
7%|▋ | 1519/22095 [2:42:56<34:04:49, 5.96s/it]
7%|▋ | 1520/22095 [2:43:00<29:43:10, 5.20s/it] {'loss': 0.489, 'grad_norm': 0.7942431019486376, 'learning_rate': 9.960690999237555e-06, 'epoch': 0.07}
7%|▋ | 1520/22095 [2:43:00<29:43:10, 5.20s/it]
7%|▋ | 1521/22095 [2:43:04<27:40:04, 4.84s/it] {'loss': 0.43, 'grad_norm': 0.7702082381966745, 'learning_rate': 9.960599223015287e-06, 'epoch': 0.07}
7%|▋ | 1521/22095 [2:43:04<27:40:04, 4.84s/it]
7%|▋ | 1522/22095 [2:43:08<26:22:41, 4.62s/it] {'loss': 0.4509, 'grad_norm': 0.6724548920276606, 'learning_rate': 9.960507340204977e-06, 'epoch': 0.07}
7%|▋ | 1522/22095 [2:43:08<26:22:41, 4.62s/it]
7%|▋ | 1523/22095 [2:43:11<23:40:56, 4.14s/it] {'loss': 0.4686, 'grad_norm': 0.8431250938808247, 'learning_rate': 9.960415350808598e-06, 'epoch': 0.07}
7%|▋ | 1523/22095 [2:43:11<23:40:56, 4.14s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881984 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5137, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 2\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
7%|▋ | 1524/22095 [2:43:14<22:07:12, 3.87s/it] {'loss': 0.4505, 'grad_norm': 0.7809711877829256, 'learning_rate': 9.960323254828129e-06, 'epoch': 0.07}
7%|▋ | 1524/22095 [2:43:14<22:07:12, 3.87s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1525/22095 [2:43:20<25:34:11, 4.48s/it] {'loss': 0.5229, 'grad_norm': 0.6213389147642807, 'learning_rate': 9.960231052265548e-06, 'epoch': 0.07}
7%|▋ | 1525/22095 [2:43:20<25:34:11, 4.48s/it]
7%|▋ | 1526/22095 [2:43:24<24:20:59, 4.26s/it] {'loss': 0.4357, 'grad_norm': 0.759452138984007, 'learning_rate': 9.960138743122835e-06, 'epoch': 0.07}
7%|▋ | 1526/22095 [2:43:24<24:20:59, 4.26s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (44165 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65687 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75937 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45126 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1527/22095 [2:43:27<22:43:46, 3.98s/it] {'loss': 0.4177, 'grad_norm': 0.7715047876751977, 'learning_rate': 9.960046327401975e-06, 'epoch': 0.07}
7%|▋ | 1527/22095 [2:43:27<22:43:46, 3.98s/it]
7%|▋ | 1528/22095 [2:43:31<22:23:02, 3.92s/it] {'loss': 0.4023, 'grad_norm': 0.7671577146844714, 'learning_rate': 9.959953805104953e-06, 'epoch': 0.07}
7%|▋ | 1528/22095 [2:43:31<22:23:02, 3.92s/it]
7%|▋ | 1529/22095 [2:43:34<20:27:57, 3.58s/it] {'loss': 0.4729, 'grad_norm': 0.7603786601404778, 'learning_rate': 9.959861176233756e-06, 'epoch': 0.07}
7%|▋ | 1529/22095 [2:43:34<20:27:57, 3.58s/it]
7%|▋ | 1530/22095 [2:43:38<21:32:37, 3.77s/it] {'loss': 0.4791, 'grad_norm': 1.2263879084410303, 'learning_rate': 9.959768440790377e-06, 'epoch': 0.07}
7%|▋ | 1530/22095 [2:43:38<21:32:37, 3.77s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (48290 > 40960).
Running this sequence through the model will result in indexing errors
7%|▋ | 1531/22095 [2:43:42<21:34:29, 3.78s/it] {'loss': 0.4855, 'grad_norm': 0.8042621884641189, 'learning_rate': 9.959675598776805e-06, 'epoch': 0.07}
7%|▋ | 1531/22095 [2:43:42<21:34:29, 3.78s/it]
7%|▋ | 1532/22095 [2:43:45<21:45:05, 3.81s/it] {'loss': 0.4193, 'grad_norm': 0.7607417462032463, 'learning_rate': 9.95958265019504e-06, 'epoch': 0.07}
7%|▋ | 1532/22095 [2:43:45<21:45:05, 3.81s/it]
7%|▋ | 1533/22095 [2:43:48<20:01:55, 3.51s/it] {'loss': 0.4299, 'grad_norm': 0.7396140867796288, 'learning_rate': 9.959489595047074e-06, 'epoch': 0.07}
7%|▋ | 1533/22095 [2:43:48<20:01:55, 3.51s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (51134 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72243 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1534/22095 [2:43:51<19:25:26, 3.40s/it] {'loss': 0.5049, 'grad_norm': 0.8098941502403151, 'learning_rate': 9.959396433334907e-06, 'epoch': 0.07}
7%|▋ | 1534/22095 [2:43:51<19:25:26, 3.40s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (90998 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1535/22095 [2:44:01<29:47:11, 5.22s/it] {'loss': 0.5169, 'grad_norm': 0.6644968450292126, 'learning_rate': 9.959303165060546e-06, 'epoch': 0.07}
7%|▋ | 1535/22095 [2:44:01<29:47:11, 5.22s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (51372 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1536/22095 [2:44:04<26:15:42, 4.60s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (69507 > 40960). Running this sequence through the model will result in indexing errors
{'loss': 0.4529, 'grad_norm': 0.8017479067575274, 'learning_rate': 9.959209790225987e-06, 'epoch': 0.07}
7%|▋ | 1536/22095 [2:44:04<26:15:42, 4.60s/it]
7%|▋ | 1537/22095 [2:44:07<23:23:29, 4.10s/it] {'loss': 0.4178, 'grad_norm': 1.6091086506950214, 'learning_rate': 9.959116308833244e-06, 'epoch': 0.07}
7%|▋ | 1537/22095 [2:44:07<23:23:29, 4.10s/it]
7%|▋ | 1538/22095 [2:44:11<22:45:56, 3.99s/it] {'loss': 0.4357, 'grad_norm': 0.7569888031961698, 'learning_rate': 9.959022720884321e-06, 'epoch': 0.07}
7%|▋ | 1538/22095 [2:44:11<22:45:56, 3.99s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1539/22095 [2:44:20<31:58:38, 5.60s/it] {'loss': 0.5254, 'grad_norm': 0.34109004209053556, 'learning_rate': 9.95892902638123e-06, 'epoch': 0.07}
7%|▋ | 1539/22095 [2:44:20<31:58:38, 5.60s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (57242 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64521 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49137 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98064 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1540/22095 [2:44:23<28:02:57, 4.91s/it] {'loss': 0.4158, 'grad_norm': 0.829481890585957, 'learning_rate': 9.958835225325984e-06, 'epoch': 0.07}
7%|▋ | 1540/22095 [2:44:23<28:02:57, 4.91s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (132895 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1541/22095 [2:44:27<25:31:54, 4.47s/it] {'loss': 0.3823, 'grad_norm': 0.7756169671760623, 'learning_rate': 9.9587413177206e-06, 'epoch': 0.07}
7%|▋ | 1541/22095 [2:44:27<25:31:54, 4.47s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1542/22095 [2:44:35<32:21:50, 5.67s/it] {'loss': 0.5427, 'grad_norm': 0.44633982647223575, 'learning_rate': 9.958647303567094e-06, 'epoch': 0.07}
7%|▋ | 1542/22095 [2:44:35<32:21:50, 5.67s/it]
7%|▋ | 1543/22095 [2:44:39<29:18:20, 5.13s/it] {'loss': 0.4481, 'grad_norm': 0.883531560550747, 'learning_rate': 9.958553182867488e-06, 'epoch': 0.07}
7%|▋ | 1543/22095 [2:44:39<29:18:20, 5.13s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1544/22095 [2:44:49<36:49:20, 6.45s/it] {'loss': 0.5135, 'grad_norm': 0.4300749273865749, 'learning_rate': 9.958458955623802e-06, 'epoch': 0.07}
7%|▋ | 1544/22095 [2:44:49<36:49:20, 6.45s/it]
7%|▋ | 1545/22095 [2:44:52<31:24:54, 5.50s/it] {'loss': 0.4874, 'grad_norm': 0.7996971504896987, 'learning_rate': 9.958364621838062e-06, 'epoch': 0.07}
7%|▋ | 1545/22095 [2:44:52<31:24:54, 5.50s/it]
7%|▋ | 1546/22095 [2:44:55<27:43:55, 4.86s/it] {'loss': 0.4484, 'grad_norm': 0.7474858485034108, 'learning_rate': 9.958270181512295e-06, 'epoch': 0.07}
7%|▋ | 1546/22095 [2:44:55<27:43:55, 4.86s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (62317 > 40960).
Running this sequence through the model will result in indexing errors 7%|▋ | 1547/22095 [2:44:59<26:07:34, 4.58s/it] {'loss': 0.4148, 'grad_norm': 0.8790061160587301, 'learning_rate': 9.95817563464853e-06, 'epoch': 0.07} 7%|▋ | 1547/22095 [2:44:59<26:07:34, 4.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1548/22095 [2:45:09<34:32:26, 6.05s/it] {'loss': 0.5323, 'grad_norm': 0.49396382028010033, 'learning_rate': 9.958080981248798e-06, 'epoch': 0.07} 7%|▋ | 1548/22095 [2:45:09<34:32:26, 6.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1549/22095 [2:45:12<29:43:44, 5.21s/it] {'loss': 0.4221, 'grad_norm': 0.7998562590968145, 'learning_rate': 9.957986221315134e-06, 'epoch': 0.07} 7%|▋ | 1549/22095 [2:45:12<29:43:44, 5.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1550/22095 [2:45:15<26:32:56, 4.65s/it] {'loss': 0.4593, 'grad_norm': 0.8086282984928738, 'learning_rate': 9.957891354849573e-06, 'epoch': 0.07} 7%|▋ | 1550/22095 [2:45:15<26:32:56, 4.65s/it] 7%|▋ | 1551/22095 [2:45:18<23:37:09, 4.14s/it] {'loss': 0.4597, 'grad_norm': 0.8162934805047718, 'learning_rate': 9.957796381854152e-06, 'epoch': 0.07} 7%|▋ | 1551/22095 [2:45:18<23:37:09, 4.14s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [12, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8346214 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 12871, 'image': 'vrdu_table_final_2/astro-ph.CO/27850288-601b-4679-a2fd-859894298097.png', 'image_wh': [[12, 17]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\footnotesize #1\n\\end{tabular}\n```"}]} 7%|▋ | 1552/22095 [2:45:21<21:51:03, 3.83s/it] {'loss': 0.4515, 'grad_norm': 0.7663579788681755, 'learning_rate': 9.957701302330915e-06, 'epoch': 0.07} 7%|▋ | 1552/22095 [2:45:21<21:51:03, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46021 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63875 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58456 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46608 > 40960). 
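The `Image size ... is too small. Minimum size is 28` failures above are raised by a size guard in `_get_item` (data_qwen_2.py, line 1335). A minimal sketch of that kind of guard, with hypothetical names — the actual implementation is not shown in this log, only its error message:

```python
MIN_IMAGE_SIZE = 28  # smallest side length the vision encoder accepts, per the log


def validate_image_size(width: int, height: int, extra=None) -> None:
    """Raise if either side of the image is below the minimum.

    `extra` only echoes additional geometry into the message, mirroring the
    `[w, h, 100, 100]` shape seen in the tracebacks above (hypothetical).
    """
    if width < MIN_IMAGE_SIZE or height < MIN_IMAGE_SIZE:
        size = [width, height] + list(extra or [])
        raise ValueError(
            f"Image size {size} is too small. Minimum size is {MIN_IMAGE_SIZE}."
        )
```

A 12x17 thumbnail like sample 8346214 would be rejected by this check, while an ordinary screenshot passes.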
Running this sequence through the model will result in indexing errors
7%|▋ | 1553/22095 [2:45:24<20:30:03, 3.59s/it] {'loss': 0.4443, 'grad_norm': 0.7919053756584956, 'learning_rate': 9.957606116281905e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1554/22095 [2:45:27<19:27:06, 3.41s/it] {'loss': 0.4557, 'grad_norm': 0.8133292991518659, 'learning_rate': 9.957510823709165e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1555/22095 [2:45:33<23:59:13, 4.20s/it] {'loss': 0.5269, 'grad_norm': 0.6882397202006979, 'learning_rate': 9.957415424614742e-06, 'epoch': 0.07}
7%|▋ | 1556/22095 [2:45:37<22:32:45, 3.95s/it] {'loss': 0.435, 'grad_norm': 0.8998616260792923, 'learning_rate': 9.957319919000687e-06, 'epoch': 0.07}
7%|▋ | 1557/22095 [2:45:40<21:11:32, 3.71s/it] {'loss': 0.497, 'grad_norm': 0.7549419519105975, 'learning_rate': 9.957224306869053e-06, 'epoch': 0.07}
7%|▋ | 1558/22095 [2:45:43<20:33:32, 3.60s/it] {'loss': 0.4492, 'grad_norm': 0.7234189199464256, 'learning_rate': 9.957128588221895e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (59114 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43432 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55718 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50164 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42277 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1559/22095 [2:45:47<20:51:29, 3.66s/it] {'loss': 0.3977, 'grad_norm': 0.7526551273534683, 'learning_rate': 9.957032763061264e-06, 'epoch': 0.07}
7%|▋ | 1560/22095 [2:45:50<20:21:46, 3.57s/it] {'loss': 0.479, 'grad_norm': 0.7661108290816389, 'learning_rate': 9.956936831389228e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43127 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78975 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1561/22095 [2:46:00<30:16:29, 5.31s/it] {'loss': 0.5161, 'grad_norm': 0.7268298686703955, 'learning_rate': 9.956840793207841e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (112278 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41158 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53849 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89657 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1562/22095 [2:46:03<27:30:56, 4.82s/it] {'loss': 0.4463, 'grad_norm': 0.8351450729731379, 'learning_rate': 9.95674464851917e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (48364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74514 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48216 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1563/22095 [2:46:06<24:09:57, 4.24s/it] {'loss': 0.4756, 'grad_norm': 0.7466578559219901, 'learning_rate': 9.95664839732528e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (77975 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52065 > 40960).
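The repeated `Token indices sequence length is longer ...` lines are the Hugging Face tokenizer warning that a sample tokenized past the model's 40960-token limit. A hedged sketch of the guard one would apply before collation — names are hypothetical, and whether this run truncates or drops such samples is not visible in the log:

```python
MAX_SEQ_LEN = 40960  # model maximum sequence length reported in the warnings


def clip_to_max_len(token_ids: list) -> list:
    """Warn (like the tokenizer does) and truncate over-long samples."""
    if len(token_ids) > MAX_SEQ_LEN:
        print(
            "Token indices sequence length is longer than the specified "
            f"maximum sequence length for this model ({len(token_ids)} > {MAX_SEQ_LEN}). "
            "Running this sequence through the model will result in indexing errors"
        )
        return token_ids[:MAX_SEQ_LEN]
    return token_ids
```

Without such a guard (or the tokenizer's own `truncation=True`), the over-long position indices cause the indexing errors the warning describes.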
Running this sequence through the model will result in indexing errors
7%|▋ | 1564/22095 [2:46:09<21:52:46, 3.84s/it] {'loss': 0.4096, 'grad_norm': 0.7005965800541605, 'learning_rate': 9.956552039628237e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1565/22095 [2:46:13<21:30:37, 3.77s/it] {'loss': 0.4333, 'grad_norm': 0.7945484153877097, 'learning_rate': 9.956455575430115e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    raise ValueError(
ValueError: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None
[Try #0] Failed to fetch sample 1134855 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. Exception: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None
Problematic sample: {'image': ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'], 'conversations': [{'from': 'human', 'value': "\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nI want to book a hotel in london, prize should be less than $600, guest rating is 8+, 4 star rating, breakfast included\n\nPrevious operations:\nStep 1: Tap on the Chrome app to start searching for hotels in London.\nStep 2: Type 'Chrome' in the search bar to open the Chrome browser.\nStep 3: Tap on the Hotels.com app icon to begin searching for a hotel.\nStep 4: Wait for the Hotels.com app to finish loading to proceed with the hotel search.\nStep 5: Tap on the 'Going to' field to enter London as the destination.\nStep 6: Tap on 'London' from Recent Searches to select it as the destination for the hotel search.\nStep 7: Tap on 'London, England, United Kingdom' to select it as the destination for the hotel search.\nStep 8: Tap on the 'Search' button to view available hotels in London.\nStep 9: Tap on the filter icon or option to apply filters to the hotel search results.\nStep 10: Swipe up on the screen to reveal more filter options, such as guest rating and hotel stars.\nStep 11: Swipe up to reveal more filter options, specifically guest ratings and hotel stars.\nStep 12: Tap on the 4-star rating box to select it.\nStep 13: Tap 'Done' to confirm the current filter selections and proceed.\nStep 14: Swipe up on the screen to locate and select the 'breakfast included' filter option.\nStep 15: Tap on 'Breakfast included' to select it.\nStep 16: Tap on the 'Done' button to apply the filters and view available hotel options.\nStep 17: Tap on the topmost hotel card to view its details.\nStep 18: Tap on the 'Select a room' button to check available room options and their prices.\nStep 19: Tap on 'Select and customize' to proceed with the room selection and customization options.\nStep 20: Tap on 'Breakfast buffet' to select it, then tap on 'Reserve' to book the room."}, {'from': 'gpt', 'value': "\nThe screen is displaying a loading indicator, suggesting that the reservation process might be finalizing or confirmation details are being loaded. Given the sequence of booking actions, it seems the process is almost complete. The loading sign indicates that the system is processing the final booking or confirmation step.\n\n\nWait for the loading to complete to confirm the booking status.\n\n\nterminate(status='success')\n"}]}
7%|▋ | 1566/22095 [2:46:16<19:58:24, 3.50s/it] {'loss': 0.4668, 'grad_norm': 0.7964664261929095, 'learning_rate': 9.956359004732986e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1567/22095 [2:46:26<31:46:25, 5.57s/it] {'loss': 0.5346, 'grad_norm': 0.4334662715457474, 'learning_rate': 9.956262327538924e-06, 'epoch': 0.07}
7%|▋ | 1568/22095 [2:46:30<29:09:48, 5.11s/it] {'loss': 0.4911, 'grad_norm': 0.8466745017880247, 'learning_rate': 9.956165543850007e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914675 in VC:s3://multi-modal/playground/data/geoqa+/.
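The `Number of image tokens ... does not match number of images` errors (and the `Rank 0: Fixed image tokens in the conversation` messages) point to a consistency check between image placeholders in the conversation text and the images attached to the sample. A minimal sketch of such a check, assuming `<image>` is the placeholder string — both the token and the repair logic are hypothetical here; the real versions live in `data_qwen_2.py`:

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; not shown verbatim in the log


def image_tokens_match(sample: dict) -> bool:
    """Compare placeholder count in the text with the number of attached images."""
    images = sample.get("image")
    if images is None:
        images = []
    elif isinstance(images, str):
        images = [images]
    n_tokens = sum(
        turn.get("value", "").count(IMAGE_TOKEN)
        for turn in sample.get("conversations", [])
    )
    return n_tokens == len(images)
```

Samples that fail this check are either repaired in place (the `Fixed image tokens` path) or rejected with the ValueError seen in the traceback.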
Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37828, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
7%|▋ | 1569/22095 [2:46:40<37:26:10, 6.57s/it] {'loss': 0.5066, 'grad_norm': 0.33019104388466125, 'learning_rate': 9.956068653668314e-06, 'epoch': 0.07}
7%|▋ | 1570/22095 [2:46:47<38:20:21, 6.72s/it] {'loss': 0.5325, 'grad_norm': 0.3623870468116331, 'learning_rate': 9.955971656995927e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1571/22095 [2:46:57<43:23:54, 7.61s/it] {'loss': 0.5381, 'grad_norm': 0.3741449779498846, 'learning_rate': 9.955874553834928e-06, 'epoch': 0.07}
7%|▋ | 1572/22095 [2:47:03<41:19:42, 7.25s/it] {'loss': 0.5692, 'grad_norm': 0.3768306568499116, 'learning_rate': 9.955777344187407e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 364, but got module 1
7%|▋ | 1573/22095 [2:47:07<35:32:07, 6.23s/it] {'loss': 0.4282, 'grad_norm': 1.085101554006371, 'learning_rate': 9.955680028055453e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (51427 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51981 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48924 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1574/22095 [2:47:10<30:26:55, 5.34s/it] {'loss': 0.4335, 'grad_norm': 0.783608464689036, 'learning_rate': 9.955582605441154e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1575/22095 [2:47:14<27:19:20, 4.79s/it] {'loss': 0.4817, 'grad_norm': 0.9016573804866266, 'learning_rate': 9.955485076346605e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (42543 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89349 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45493 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1576/22095 [2:47:18<25:29:06, 4.47s/it] {'loss': 0.4541, 'grad_norm': 0.8464255427197236, 'learning_rate': 9.955387440773902e-06, 'epoch': 0.07}
7%|▋ | 1577/22095 [2:47:21<23:12:37, 4.07s/it] {'loss': 0.5068, 'grad_norm': 0.8352333785219662, 'learning_rate': 9.955289698725141e-06, 'epoch': 0.07}
7%|▋ | 1578/22095 [2:47:24<22:10:14, 3.89s/it] {'loss': 0.4476, 'grad_norm': 0.7678664991367014, 'learning_rate': 9.955191850202424e-06, 'epoch': 0.07}
7%|▋ | 1579/22095 [2:47:29<22:53:07, 4.02s/it] {'loss': 0.4361, 'grad_norm': 0.7432997228864974, 'learning_rate': 9.955093895207853e-06, 'epoch': 0.07}
7%|▋ | 1580/22095 [2:47:32<21:27:51, 3.77s/it] {'loss': 0.466, 'grad_norm': 1.043347006843631, 'learning_rate': 9.954995833743532e-06, 'epoch': 0.07}
7%|▋ | 1581/22095 [2:47:35<20:45:02, 3.64s/it] {'loss': 0.4462, 'grad_norm': 0.7511877513268755, 'learning_rate': 9.95489766581157e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1582/22095 [2:47:42<25:36:18, 4.49s/it] {'loss': 0.5603, 'grad_norm': 0.7138314733618735, 'learning_rate': 9.954799391414073e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (64140 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56495 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85708 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93409 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1583/22095 [2:47:45<23:18:36, 4.09s/it] {'loss': 0.45, 'grad_norm': 1.0080615313646797, 'learning_rate': 9.954701010553156e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1584/22095 [2:47:54<32:27:29, 5.70s/it] {'loss': 0.5231, 'grad_norm': 0.44949036623651845, 'learning_rate': 9.95460252323093e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1585/22095 [2:47:58<28:30:37, 5.00s/it] {'loss': 0.4641, 'grad_norm': 0.814469135656858, 'learning_rate': 9.954503929449513e-06, 'epoch': 0.07}
7%|▋ | 1586/22095 [2:48:01<24:58:48, 4.38s/it] {'loss': 0.4783, 'grad_norm': 0.780742667966943, 'learning_rate': 9.954405229211025e-06, 'epoch': 0.07}
7%|▋ | 1587/22095 [2:48:04<22:44:25, 3.99s/it] {'loss': 0.4102, 'grad_norm': 0.9876274201894842, 'learning_rate': 9.954306422517583e-06, 'epoch': 0.07}
7%|▋ | 1588/22095 [2:48:07<21:43:07, 3.81s/it] {'loss': 0.5033, 'grad_norm': 0.8347915917303684, 'learning_rate': 9.954207509371313e-06, 'epoch': 0.07}
7%|▋ | 1589/22095 [2:48:10<21:02:59, 3.70s/it] {'loss': 0.4545, 'grad_norm': 0.7436863290744097, 'learning_rate': 9.954108489774339e-06, 'epoch': 0.07}
7%|▋ | 1590/22095 [2:48:14<20:12:21, 3.55s/it] {'loss': 0.483, 'grad_norm': 0.7512176662693054, 'learning_rate': 9.95400936372879e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1591/22095 [2:48:23<29:44:30, 5.22s/it] {'loss': 0.5386, 'grad_norm': 0.6968344530808637, 'learning_rate': 9.953910131236793e-06, 'epoch': 0.07}
7%|▋ | 1592/22095 [2:48:26<26:44:36, 4.70s/it] {'loss': 0.4793, 'grad_norm': 0.9115392018944554, 'learning_rate': 9.953810792300482e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (71244 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1593/22095 [2:48:29<24:06:59, 4.23s/it] {'loss': 0.4179, 'grad_norm': 0.7593058631745393, 'learning_rate': 9.953711346921994e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1594/22095 [2:48:32<21:43:33, 3.82s/it] {'loss': 0.4427, 'grad_norm': 0.8363184414562198, 'learning_rate': 9.953611795103462e-06, 'epoch': 0.07}
7%|▋ | 1595/22095 [2:48:36<21:17:00, 3.74s/it] {'loss': 0.4654, 'grad_norm': 0.7961544165941885, 'learning_rate': 9.953512136847026e-06, 'epoch': 0.07}
7%|▋ | 1596/22095 [2:48:40<21:48:38, 3.83s/it] {'loss': 0.4491, 'grad_norm': 0.8475644647503084, 'learning_rate': 9.953412372154826e-06, 'epoch': 0.07}
7%|▋ | 1597/22095 [2:48:44<22:03:13, 3.87s/it] {'loss': 0.4721, 'grad_norm': 0.7591172312405083, 'learning_rate': 9.95331250102901e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (45746 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42728 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54297 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1598/22095 [2:48:47<20:58:42, 3.68s/it] {'loss': 0.4261, 'grad_norm': 1.2064906579866868, 'learning_rate': 9.95321252347172e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (89301 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1599/22095 [2:48:50<19:36:50, 3.45s/it] {'loss': 0.4595, 'grad_norm': 0.8037664893843064, 'learning_rate': 9.953112439485107e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1600/22095 [2:49:00<30:37:45, 5.38s/it] {'loss': 0.5278, 'grad_norm': 0.6037274223376878, 'learning_rate': 9.95301224907132e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1601/22095 [2:49:04<28:30:19, 5.01s/it] {'loss': 0.468, 'grad_norm': 0.8423451104175681, 'learning_rate': 9.95291195223251e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307063 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1xgtKLXXXXXbWXpXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能告诉我图片中文字吗?'}, {'from': 'gpt', 'value': '图中所有文字:\n提高免疫力\n护发\n解缓慢性疾病\n,\n,\n全场满\n38\n元包邮\n东北农家自种有机花芸豆'}]}
7%|▋ | 1602/22095 [2:49:07<25:09:58, 4.42s/it] {'loss': 0.4573, 'grad_norm': 0.734562033088208, 'learning_rate': 9.952811548970834e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (55350 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60436 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41155 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1603/22095 [2:49:10<22:56:25, 4.03s/it] {'loss': 0.4247, 'grad_norm': 0.7980025152030702, 'learning_rate': 9.952711039288451e-06, 'epoch': 0.07}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_2/images/step_3.png 2025-08-27 18:47:09.499937 load time: 1153.91 ms
7%|▋ | 1604/22095 [2:49:14<22:22:27, 3.93s/it] {'loss': 0.4378, 'grad_norm': 0.8268544917981663, 'learning_rate': 9.952610423187516e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1605/22095 [2:49:17<20:49:34, 3.66s/it] {'loss': 0.4295, 'grad_norm': 0.7288491743965827, 'learning_rate': 9.952509700670197e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10359.png 2025-08-27 18:47:14.091744 load time: 1220.43 ms
7%|▋ | 1606/22095 [2:49:24<27:24:18, 4.82s/it] {'loss': 0.5117, 'grad_norm': 0.574267631988978, 'learning_rate': 9.952408871738652e-06, 'epoch': 0.07}
7%|▋ | 1607/22095 [2:49:28<25:42:09, 4.52s/it] {'loss': 0.4359, 'grad_norm': 0.810368491171752, 'learning_rate': 9.952307936395054e-06, 'epoch': 0.07}
7%|▋ | 1608/22095 [2:49:32<24:16:18, 4.27s/it] {'loss': 0.4529, 'grad_norm': 0.7467048796188573, 'learning_rate': 9.952206894641565e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (47680 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114951 > 40960). Running this sequence through the model will result in indexing errors
7%|▋ | 1609/22095 [2:49:35<22:41:11, 3.99s/it] {'loss': 0.4459, 'grad_norm': 0.7670762293890041, 'learning_rate': 9.952105746480361e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (118347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103158 > 40960).
Running this sequence through the model will result in indexing errors 7%|▋ | 1610/22095 [2:49:38<21:21:56, 3.75s/it] {'loss': 0.5131, 'grad_norm': 0.8677749241903157, 'learning_rate': 9.952004491913613e-06, 'epoch': 0.07} 7%|▋ | 1610/22095 [2:49:38<21:21:56, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1611/22095 [2:49:50<34:37:39, 6.09s/it] {'loss': 0.5079, 'grad_norm': 0.43473157187032946, 'learning_rate': 9.9519031309435e-06, 'epoch': 0.07} 7%|▋ | 1611/22095 [2:49:50<34:37:39, 6.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1612/22095 [2:49:54<31:16:42, 5.50s/it] {'loss': 0.4676, 'grad_norm': 0.7709482615496647, 'learning_rate': 9.951801663572194e-06, 'epoch': 0.07} 7%|▋ | 1612/22095 [2:49:54<31:16:42, 5.50s/it] 7%|▋ | 1613/22095 [2:49:57<26:42:28, 4.69s/it] {'loss': 0.4816, 'grad_norm': 0.7734986634642428, 'learning_rate': 9.951700089801879e-06, 'epoch': 0.07} 7%|▋ | 1613/22095 [2:49:57<26:42:28, 4.69s/it] 7%|▋ | 1614/22095 [2:50:00<24:12:38, 4.26s/it] {'loss': 0.4398, 'grad_norm': 0.712129054682761, 'learning_rate': 9.951598409634738e-06, 'epoch': 0.07} 7%|▋ | 1614/22095 [2:50:00<24:12:38, 4.26s/it] 7%|▋ | 1615/22095 [2:50:04<23:57:55, 4.21s/it] {'loss': 0.4412, 'grad_norm': 0.6855724174862319, 'learning_rate': 9.951496623072955e-06, 'epoch': 0.07} 7%|▋ | 1615/22095 [2:50:04<23:57:55, 4.21s/it] 7%|▋ | 1616/22095 [2:50:08<23:16:44, 4.09s/it] {'loss': 0.4403, 'grad_norm': 0.7685554643700105, 'learning_rate': 9.951394730118717e-06, 'epoch': 0.07} 7%|▋ | 1616/22095 [2:50:08<23:16:44, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1617/22095 [2:50:14<26:00:19, 4.57s/it] {'loss': 0.5291, 'grad_norm': 0.4755960850993418, 'learning_rate': 9.951292730774213e-06, 'epoch': 0.07} 7%|▋ | 1617/22095 [2:50:14<26:00:19, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for 
this model (43854 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44395 > 40960). Running this sequence through the model will result in indexing errors 7%|▋ | 1618/22095 [2:50:23<34:19:14, 6.03s/it] {'loss': 0.5251, 'grad_norm': 0.4204314232545453, 'learning_rate': 9.951190625041634e-06, 'epoch': 0.07} 7%|▋ | 1618/22095 [2:50:23<34:19:14, 6.03s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 7%|▋ | 1619/22095 [2:50:26<29:28:43, 5.18s/it] {'loss': 0.4715, 'grad_norm': 0.7848766554034355, 'learning_rate': 9.951088412923175e-06, 'epoch': 0.07} 7%|▋ | 1619/22095 [2:50:26<29:28:43, 5.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70351 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80321 > 40960). Running this sequence through the model will result in indexing errors 7%|▋ | 1620/22095 [2:50:31<27:42:20, 4.87s/it] {'loss': 0.471, 'grad_norm': 0.7101940377324135, 'learning_rate': 9.950986094421033e-06, 'epoch': 0.07} 7%|▋ | 1620/22095 [2:50:31<27:42:20, 4.87s/it] 7%|▋ | 1621/22095 [2:50:34<24:40:29, 4.34s/it] {'loss': 0.4493, 'grad_norm': 0.9173872355738407, 'learning_rate': 9.950883669537405e-06, 'epoch': 0.07} 7%|▋ | 1621/22095 [2:50:34<24:40:29, 4.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 9047669 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 5\nB. 4\nC. 3\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 7%|▋ | 1622/22095 [2:50:36<21:48:22, 3.83s/it] {'loss': 0.4719, 'grad_norm': 0.8585729042659949, 'learning_rate': 9.950781138274494e-06, 'epoch': 0.07} 7%|▋ | 1622/22095 [2:50:36<21:48:22, 3.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1623/22095 [2:50:43<27:07:02, 4.77s/it] {'loss': 0.5371, 'grad_norm': 0.7234837833909473, 'learning_rate': 9.950678500634501e-06, 'epoch': 0.07} 7%|▋ | 1623/22095 [2:50:43<27:07:02, 4.77s/it] 7%|▋ | 1624/22095 [2:50:47<24:38:24, 4.33s/it] {'loss': 0.4383, 'grad_norm': 0.7175744371928864, 'learning_rate': 9.95057575661963e-06, 'epoch': 0.07} 7%|▋ | 1624/22095 [2:50:47<24:38:24, 4.33s/it] 7%|▋ | 1625/22095 [2:50:49<22:03:50, 3.88s/it] {'loss': 0.4679, 'grad_norm': 0.7903698818392364, 'learning_rate': 9.950472906232091e-06, 'epoch': 0.07} 7%|▋ | 1625/22095 [2:50:49<22:03:50, 3.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1626/22095 [2:50:57<27:46:14, 4.88s/it] {'loss': 0.5099, 'grad_norm': 0.3575880075219209, 'learning_rate': 9.950369949474095e-06, 'epoch': 0.07} 7%|▋ | 1626/22095 [2:50:57<27:46:14, 4.88s/it] 7%|▋ | 1627/22095 [2:51:00<25:15:17, 4.44s/it] {'loss': 0.4453, 'grad_norm': 0.785889252127557, 'learning_rate': 9.950266886347852e-06, 'epoch': 0.07} 7%|▋ | 1627/22095 [2:51:00<25:15:17, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1628/22095 
[2:51:08<30:28:56, 5.36s/it] {'loss': 0.5208, 'grad_norm': 0.35653778542954306, 'learning_rate': 9.950163716855578e-06, 'epoch': 0.07} 7%|▋ | 1628/22095 [2:51:08<30:28:56, 5.36s/it] 7%|▋ | 1629/22095 [2:51:11<26:43:40, 4.70s/it] {'loss': 0.4549, 'grad_norm': 1.5450850497688764, 'learning_rate': 9.950060440999486e-06, 'epoch': 0.07} 7%|▋ | 1629/22095 [2:51:11<26:43:40, 4.70s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_4/images/before_screenshot_21_id_87_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-27 18:49:09.963379 load time: 1024.01 ms 7%|▋ | 1630/22095 [2:51:14<23:50:45, 4.19s/it] {'loss': 0.4813, 'grad_norm': 0.7693214263174912, 'learning_rate': 9.949957058781802e-06, 'epoch': 0.07} 7%|▋ | 1630/22095 [2:51:14<23:50:45, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46820 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51495 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53888 > 40960). 
Running this sequence through the model will result in indexing errors
7%|▋ | 1631/22095 [2:51:17<21:43:40, 3.82s/it] {'loss': 0.4426, 'grad_norm': 0.8123776819315764, 'learning_rate': 9.949853570204742e-06, 'epoch': 0.07}
7%|▋ | 1632/22095 [2:51:21<22:34:28, 3.97s/it] {'loss': 0.4226, 'grad_norm': 0.745047866822542, 'learning_rate': 9.94974997527053e-06, 'epoch': 0.07}
7%|▋ | 1633/22095 [2:51:24<20:56:28, 3.68s/it] {'loss': 0.473, 'grad_norm': 1.1293663462337062, 'learning_rate': 9.949646273981394e-06, 'epoch': 0.07}
7%|▋ | 1634/22095 [2:51:27<19:28:59, 3.43s/it] {'loss': 0.4421, 'grad_norm': 0.725690333793923, 'learning_rate': 9.949542466339561e-06, 'epoch': 0.07}
7%|▋ | 1635/22095 [2:51:31<20:11:06, 3.55s/it] {'loss': 0.4429, 'grad_norm': 0.7562534174587207, 'learning_rate': 9.949438552347262e-06, 'epoch': 0.07}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
7%|▋ | 1636/22095 [2:51:34<20:04:38, 3.53s/it] {'loss': 0.4806, 'grad_norm': 0.9048310824567884, 'learning_rate': 9.94933453200673e-06, 'epoch': 0.07}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8915459 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38612, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 6\nB. 6.5\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
7%|▋ | 1637/22095 [2:51:37<19:05:23, 3.36s/it] {'loss': 0.4829, 'grad_norm': 0.7639953887241905, 'learning_rate': 9.949230405320198e-06, 'epoch': 0.07}
7%|▋ | 1638/22095 [2:51:42<21:06:46, 3.72s/it] {'loss': 0.4355, 'grad_norm': 0.7243486351207513, 'learning_rate': 9.949126172289905e-06, 'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1639/22095 [2:51:48<26:17:13, 4.63s/it] {'loss': 0.5237, 'grad_norm': 0.6082662853357984, 'learning_rate': 9.949021832918092e-06, 'epoch': 0.07}
Token indices sequence length is longer than the specified maximum sequence length for this model (103161 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68303 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88823 > 40960).
Running this sequence through the model will result in indexing errors 7%|▋ | 1640/22095 [2:51:52<24:14:29, 4.27s/it] {'loss': 0.4821, 'grad_norm': 0.8794722206912542, 'learning_rate': 9.948917387206999e-06, 'epoch': 0.07} 7%|▋ | 1640/22095 [2:51:52<24:14:29, 4.27s/it] 7%|▋ | 1641/22095 [2:51:55<22:45:13, 4.00s/it] {'loss': 0.4723, 'grad_norm': 0.8696598828377334, 'learning_rate': 9.948812835158872e-06, 'epoch': 0.07} 7%|▋ | 1641/22095 [2:51:55<22:45:13, 4.00s/it] 7%|▋ | 1642/22095 [2:51:58<21:03:08, 3.71s/it] {'loss': 0.4216, 'grad_norm': 0.7322349900873856, 'learning_rate': 9.948708176775954e-06, 'epoch': 0.07} 7%|▋ | 1642/22095 [2:51:58<21:03:08, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118911 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56080 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54367 > 40960). Running this sequence through the model will result in indexing errors 7%|▋ | 1643/22095 [2:52:02<21:18:02, 3.75s/it] {'loss': 0.4641, 'grad_norm': 0.8653735420761837, 'learning_rate': 9.948603412060498e-06, 'epoch': 0.07} 7%|▋ | 1643/22095 [2:52:02<21:18:02, 3.75s/it] 7%|▋ | 1644/22095 [2:52:06<21:15:41, 3.74s/it] {'loss': 0.4344, 'grad_norm': 0.7424619254038732, 'learning_rate': 9.948498541014752e-06, 'epoch': 0.07} 7%|▋ | 1644/22095 [2:52:06<21:15:41, 3.74s/it] 7%|▋ | 1645/22095 [2:52:09<20:51:23, 3.67s/it] {'loss': 0.4083, 'grad_norm': 0.7369603541667813, 'learning_rate': 9.94839356364097e-06, 'epoch': 0.07} 7%|▋ | 1645/22095 [2:52:09<20:51:23, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44625 > 40960). 
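The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings mean some samples tokenize past the model's 40960-token context; per the warning, running them through the model would cause indexing errors. A minimal sketch of a pre-filter that drops such samples, assuming a hypothetical `tokenize` callable (this is not the training script's actual API):

```python
MAX_LEN = 40960  # maximum sequence length reported in the log warnings

def filter_overlong(samples, tokenize):
    """Drop samples whose tokenized length exceeds MAX_LEN.

    `tokenize` is any callable returning a sequence of token ids;
    both names here are illustrative, not the real data pipeline.
    """
    kept = []
    for sample in samples:
        n = len(tokenize(sample))
        if n > MAX_LEN:
            # These are the samples that trigger the warning in the log.
            print(f"Dropping sample: token length {n} > {MAX_LEN}")
            continue
        kept.append(sample)
    return kept
```

Filtering once at dataset-build time avoids paying the tokenization cost (and the warning spam) on every epoch.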
Running this sequence through the model will result in indexing errors 7%|▋ | 1646/22095 [2:52:12<19:48:34, 3.49s/it] {'loss': 0.4047, 'grad_norm': 0.7140327682794468, 'learning_rate': 9.94828847994141e-06, 'epoch': 0.07} 7%|▋ | 1646/22095 [2:52:12<19:48:34, 3.49s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_0.png 2025-08-27 18:50:11.125005 load time: 1202.32 ms 7%|▋ | 1647/22095 [2:52:15<18:42:10, 3.29s/it] {'loss': 0.4515, 'grad_norm': 0.7216918324659736, 'learning_rate': 9.948183289918327e-06, 'epoch': 0.07} 7%|▋ | 1647/22095 [2:52:15<18:42:10, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 7%|▋ | 1648/22095 [2:52:21<23:32:28, 4.14s/it] {'loss': 0.5288, 'grad_norm': 0.6294579395283374, 'learning_rate': 9.948077993573983e-06, 'epoch': 0.07} 7%|▋ | 1648/22095 [2:52:21<23:32:28, 4.14s/it] 7%|▋ | 1649/22095 [2:52:25<22:55:57, 4.04s/it] {'loss': 0.5042, 'grad_norm': 0.7958923321672456, 'learning_rate': 9.947972590910639e-06, 'epoch': 0.07} 7%|▋ | 1649/22095 [2:52:25<22:55:57, 4.04s/it] 7%|▋ | 1650/22095 [2:52:28<21:20:39, 3.76s/it] {'loss': 0.4503, 'grad_norm': 0.8417226582025827, 'learning_rate': 9.94786708193056e-06, 'epoch': 0.07} 7%|▋ | 1650/22095 [2:52:28<21:20:39, 3.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 7%|▋ | 1651/22095 [2:52:38<31:09:09, 5.49s/it] {'loss': 0.5113, 'grad_norm': 0.3480153243530504, 'learning_rate': 9.947761466636014e-06, 'epoch': 0.07} 7%|▋ | 1651/22095 [2:52:38<31:09:09, 5.49s/it] 7%|▋ | 1652/22095 [2:52:41<27:45:20, 4.89s/it] {'loss': 0.4482, 'grad_norm': 0.7706541931247691, 'learning_rate': 9.94765574502927e-06, 'epoch': 0.07} 7%|▋ | 1652/22095 [2:52:41<27:45:20, 4.89s/it] 7%|▋ | 1653/22095 [2:52:44<24:10:44, 4.26s/it] {'loss': 0.4164, 'grad_norm': 0.7555172102442301, 'learning_rate': 9.947549917112601e-06, 
'epoch': 0.07}
Invalidate trace cache @ step 2: expected module 1, but got module 364
7%|▋ | 1654/22095 [2:52:51<28:43:21, 5.06s/it] {'loss': 0.5317, 'grad_norm': 0.4662745249690845, 'learning_rate': 9.947443982888279e-06, 'epoch': 0.07}
7%|▋ | 1655/22095 [2:52:55<26:24:46, 4.65s/it] {'loss': 0.4558, 'grad_norm': 0.838637572397322, 'learning_rate': 9.947337942358579e-06, 'epoch': 0.07}
VC:s3://gui-agent/mind2web_train/images/f57e6c0a-8f8b-4756-9f1d-1bdea7a0af5c/images/3.png 2025-08-27 18:50:54.316056 load time: 1111.82 ms
7%|▋ | 1656/22095 [2:52:58<24:00:30, 4.23s/it] {'loss': 0.4353, 'grad_norm': 0.816122648381445, 'learning_rate': 9.947231795525782e-06, 'epoch': 0.07}
7%|▋ | 1657/22095 [2:53:01<22:29:41, 3.96s/it] {'loss': 0.5044, 'grad_norm': 0.7673786090994253, 'learning_rate': 9.94712554239217e-06, 'epoch': 0.07}
8%|▊ | 1658/22095 [2:53:06<23:12:27, 4.09s/it] {'loss': 0.449, 'grad_norm': 0.704259071940599, 'learning_rate': 9.947019182960023e-06, 'epoch': 0.08}
8%|▊ | 1659/22095 [2:53:09<21:52:29, 3.85s/it] {'loss': 0.4738, 'grad_norm': 0.7164377948782436, 'learning_rate': 9.946912717231625e-06, 'epoch': 0.08}
8%|▊ | 1660/22095 [2:53:12<20:16:19, 3.57s/it] {'loss': 0.4134, 'grad_norm': 0.6995814898789535, 'learning_rate': 9.94680614520927e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8910452 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33605, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 16cm'}, {'from': 'gpt', 'value': '【解答】解:∵点D、E分别是AC和BC的中点,∴DE=DC+CE=\\frac{1}{2}AC+\\frac{1}{2}BC=\\frac{1}{2}AB而AB=16cm,∴DE=\\frac{1}{2}×16=8(cm).'}]}
8%|▊ | 1661/22095 [2:53:15<19:19:50, 3.41s/it] {'loss': 0.4305, 'grad_norm': 0.857002781818847, 'learning_rate': 9.94669946689524e-06, 'epoch': 0.08}
8%|▊ | 1662/22095 [2:53:18<18:25:22, 3.25s/it] {'loss': 0.4683, 'grad_norm': 0.7424988428382787, 'learning_rate': 9.946592682291834e-06, 'epoch': 0.08}
8%|▊ | 1663/22095 [2:53:21<18:18:32, 3.23s/it] {'loss': 0.4157, 'grad_norm': 0.7630417153053298, 'learning_rate': 9.94648579140134e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (66486 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42229 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68748 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55106 > 40960).
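The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries come from a dataset-side size guard in `data_qwen_2.py`. The exact check is not shown in the log (the four logged numbers presumably combine the raw size with a target crop size), but a minimal sketch of such a guard on width/height alone, with hypothetical names, might look like:

```python
MIN_SIDE = 28  # minimum side length enforced by the dataset code in the log

def check_image_size(image_wh):
    """Raise the same kind of error the dataset raises when either side
    of an image is below MIN_SIDE. `image_wh` is an assumed [width, height]
    pair, matching the 'image_wh' field of the problematic samples above.
    """
    w, h = image_wh
    if min(w, h) < MIN_SIDE:
        raise ValueError(
            f"Image size {image_wh} is too small. Minimum size is {MIN_SIDE}."
        )
    return image_wh
```

Running such a check over the dataset manifest before training would surface thin images (e.g. the `[163, 21]` sample above) without paying for a failed fetch mid-epoch.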
Running this sequence through the model will result in indexing errors 8%|▊ | 1664/22095 [2:53:24<17:27:41, 3.08s/it] {'loss': 0.4425, 'grad_norm': 0.7893442002173218, 'learning_rate': 9.946378794226062e-06, 'epoch': 0.08} 8%|▊ | 1664/22095 [2:53:24<17:27:41, 3.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43188 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1665/22095 [2:53:27<18:02:33, 3.18s/it] {'loss': 0.4324, 'grad_norm': 0.6564682733195059, 'learning_rate': 9.946271690768295e-06, 'epoch': 0.08} 8%|▊ | 1665/22095 [2:53:27<18:02:33, 3.18s/it] 8%|▊ | 1666/22095 [2:53:30<17:41:54, 3.12s/it] {'loss': 0.4887, 'grad_norm': 0.8107525179394984, 'learning_rate': 9.946164481030339e-06, 'epoch': 0.08} 8%|▊ | 1666/22095 [2:53:30<17:41:54, 3.12s/it] 8%|▊ | 1667/22095 [2:53:34<18:32:18, 3.27s/it] {'loss': 0.4952, 'grad_norm': 0.7832898959544311, 'learning_rate': 9.9460571650145e-06, 'epoch': 0.08} 8%|▊ | 1667/22095 [2:53:34<18:32:18, 3.27s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1668/22095 [2:53:37<18:46:55, 3.31s/it] {'loss': 0.4314, 'grad_norm': 0.7275954330682075, 'learning_rate': 9.945949742723083e-06, 'epoch': 0.08} 8%|▊ | 1668/22095 [2:53:37<18:46:55, 3.31s/it] 8%|▊ | 1669/22095 [2:53:41<19:25:19, 3.42s/it] {'loss': 0.4737, 'grad_norm': 0.8272072509791674, 'learning_rate': 9.945842214158397e-06, 'epoch': 0.08} 8%|▊ | 1669/22095 [2:53:41<19:25:19, 3.42s/it] 8%|▊ | 1670/22095 [2:53:45<20:03:00, 3.53s/it] {'loss': 0.446, 'grad_norm': 0.7302759518481456, 'learning_rate': 9.94573457932275e-06, 'epoch': 0.08} 8%|▊ | 1670/22095 [2:53:45<20:03:00, 3.53s/it] 8%|▊ | 1671/22095 [2:53:48<19:31:04, 3.44s/it] {'loss': 0.439, 'grad_norm': 0.7285007459322589, 'learning_rate': 9.945626838218458e-06, 'epoch': 0.08} 8%|▊ | 1671/22095 [2:53:48<19:31:04, 3.44s/it]Invalidate trace cache @ step 2: expected 
module 1, but got module 364 8%|▊ | 1672/22095 [2:53:53<23:01:46, 4.06s/it] {'loss': 0.5277, 'grad_norm': 0.600806103249455, 'learning_rate': 9.945518990847835e-06, 'epoch': 0.08} 8%|▊ | 1672/22095 [2:53:53<23:01:46, 4.06s/it] 8%|▊ | 1673/22095 [2:53:57<21:50:29, 3.85s/it] {'loss': 0.4852, 'grad_norm': 0.7644840288674761, 'learning_rate': 9.945411037213198e-06, 'epoch': 0.08} 8%|▊ | 1673/22095 [2:53:57<21:50:29, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53869 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49987 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54175 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82103 > 40960). 
Running this sequence through the model will result in indexing errors 8%|▊ | 1674/22095 [2:54:00<21:21:33, 3.77s/it] {'loss': 0.4512, 'grad_norm': 0.7606077392588975, 'learning_rate': 9.945302977316864e-06, 'epoch': 0.08} 8%|▊ | 1674/22095 [2:54:00<21:21:33, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1675/22095 [2:54:04<20:52:55, 3.68s/it] {'loss': 0.4448, 'grad_norm': 0.717075980182358, 'learning_rate': 9.94519481116116e-06, 'epoch': 0.08} 8%|▊ | 1675/22095 [2:54:04<20:52:55, 3.68s/it] 8%|▊ | 1676/22095 [2:54:07<19:27:05, 3.43s/it] {'loss': 0.4802, 'grad_norm': 0.8581548432084788, 'learning_rate': 9.945086538748407e-06, 'epoch': 0.08} 8%|▊ | 1676/22095 [2:54:07<19:27:05, 3.43s/it] 8%|▊ | 1677/22095 [2:54:10<19:43:44, 3.48s/it] {'loss': 0.4677, 'grad_norm': 0.7421139594180161, 'learning_rate': 9.944978160080932e-06, 'epoch': 0.08} 8%|▊ | 1677/22095 [2:54:10<19:43:44, 3.48s/it] 8%|▊ | 1678/22095 [2:54:14<21:12:58, 3.74s/it] {'loss': 0.4558, 'grad_norm': 0.7329069408677304, 'learning_rate': 9.944869675161062e-06, 'epoch': 0.08} 8%|▊ | 1678/22095 [2:54:14<21:12:58, 3.74s/it] 8%|▊ | 1679/22095 [2:54:19<21:47:45, 3.84s/it] {'loss': 0.4025, 'grad_norm': 0.6783854819129181, 'learning_rate': 9.944761083991131e-06, 'epoch': 0.08} 8%|▊ | 1679/22095 [2:54:19<21:47:45, 3.84s/it] 8%|▊ | 1680/22095 [2:54:22<20:30:00, 3.62s/it] {'loss': 0.4869, 'grad_norm': 0.7926734375394854, 'learning_rate': 9.944652386573472e-06, 'epoch': 0.08} 8%|▊ | 1680/22095 [2:54:22<20:30:00, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48961 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50543 > 40960). 
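The paired lines "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" indicate samples whose text lacks an image placeholder for each attached image, which the loader repairs in place. A simplified sketch of such a repair, assuming a `<image>` placeholder convention and a conversation layout like the problematic samples above (both assumptions, not the repo's verified behavior):

```python
IMG_TOKEN = "<image>"  # assumed placeholder token, not verified against the repo

def fix_image_tokens(conversation, num_images):
    """Ensure the first human turn contains `num_images` image placeholders,
    prepending any that are missing. This mirrors the 'Fixed image tokens'
    log line in spirit; the real repair logic may differ.
    """
    text = conversation[0]["value"]
    missing = num_images - text.count(IMG_TOKEN)
    if missing > 0:
        text = (IMG_TOKEN + "\n") * missing + text
        conversation[0]["value"] = text
    return conversation
```

Counting placeholders against the image list up front is what lets training continue instead of failing later when vision features and text positions are merged.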
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43026 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55584 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1681/22095 [2:54:25<20:50:09, 3.67s/it] {'loss': 0.4493, 'grad_norm': 0.7747418613948901, 'learning_rate': 9.944543582910417e-06, 'epoch': 0.08} 8%|▊ | 1681/22095 [2:54:25<20:50:09, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1682/22095 [2:54:33<27:22:06, 4.83s/it] {'loss': 0.5339, 'grad_norm': 0.6748968711002219, 'learning_rate': 9.944434673004308e-06, 'epoch': 0.08} 8%|▊ | 1682/22095 [2:54:33<27:22:06, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48264 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104583 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1683/22095 [2:54:37<26:05:03, 4.60s/it] {'loss': 0.4748, 'grad_norm': 0.8246562125565327, 'learning_rate': 9.944325656857485e-06, 'epoch': 0.08} 8%|▊ | 1683/22095 [2:54:37<26:05:03, 4.60s/it] 8%|▊ | 1684/22095 [2:54:40<23:04:56, 4.07s/it] {'loss': 0.4123, 'grad_norm': 0.7619113155947301, 'learning_rate': 9.944216534472287e-06, 'epoch': 0.08} 8%|▊ | 1684/22095 [2:54:40<23:04:56, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63720 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91106 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1685/22095 [2:54:43<21:31:00, 3.80s/it] {'loss': 0.4604, 'grad_norm': 0.8584651446158137, 'learning_rate': 9.94410730585106e-06, 'epoch': 0.08} 8%|▊ | 1685/22095 [2:54:43<21:31:00, 3.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1686/22095 [2:54:46<20:50:57, 3.68s/it] {'loss': 0.4572, 'grad_norm': 0.8277957948886835, 'learning_rate': 9.943997970996153e-06, 'epoch': 0.08} 8%|▊ | 1686/22095 [2:54:46<20:50:57, 3.68s/it] 8%|▊ | 1687/22095 [2:54:49<19:40:00, 3.47s/it] {'loss': 0.492, 'grad_norm': 0.7801979933818924, 'learning_rate': 9.943888529909916e-06, 'epoch': 0.08} 8%|▊ | 1687/22095 [2:54:49<19:40:00, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45227 > 40960). 
Running this sequence through the model will result in indexing errors
8%|▊ | 1688/22095 [2:54:53<19:07:52, 3.37s/it] {'loss': 0.449, 'grad_norm': 0.8391519761417111, 'learning_rate': 9.943778982594695e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1689/22095 [2:55:04<32:15:20, 5.69s/it] {'loss': 0.5261, 'grad_norm': 0.5006693997641047, 'learning_rate': 9.943669329052848e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1690/22095 [2:55:07<28:24:05, 5.01s/it] {'loss': 0.4434, 'grad_norm': 0.8170731618555016, 'learning_rate': 9.943559569286731e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1691/22095 [2:55:16<34:45:02, 6.13s/it] {'loss': 0.5147, 'grad_norm': 0.3748653606553218, 'learning_rate': 9.943449703298703e-06, 'epoch': 0.08}
8%|▊ | 1692/22095 [2:55:19<30:13:03, 5.33s/it] {'loss': 0.437, 'grad_norm': 0.7888227413039046, 'learning_rate': 9.943339731091122e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1693/22095 [2:55:30<38:34:56, 6.81s/it] {'loss': 0.5146, 'grad_norm': 0.38129147182439493, 'learning_rate': 9.943229652666353e-06, 'epoch': 0.08}
8%|▊ | 1694/22095 [2:55:34<34:17:16, 6.05s/it] {'loss': 0.4428, 'grad_norm': 0.7875704407185139, 'learning_rate': 9.94311946802676e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350700 in VC:s3://internvl-moe-sft-data/. Exception: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17374, 'image': 'vrdu_table_final_2/astro-ph.CO/5ee9bb19-68d4-4759-a82b-3e436b55ded2.png', 'image_wh': [[92, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}Finalist\\end{tabular}\n```"}]}
8%|▊ | 1695/22095 [2:55:37<29:10:07, 5.15s/it] {'loss': 0.4613, 'grad_norm': 1.7171840921170323, 'learning_rate': 9.943009177174712e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1696/22095 [2:55:46<36:36:44, 6.46s/it] {'loss': 0.54, 'grad_norm': 0.5894610471131189, 'learning_rate': 9.942898780112578e-06, 'epoch': 0.08}
8%|▊ | 1697/22095 [2:55:50<31:25:13, 5.55s/it] {'loss': 0.4184, 'grad_norm': 0.9081380234402502, 'learning_rate': 9.94278827684273e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (46490 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54520 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95268 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53613 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1698/22095 [2:55:53<27:23:29, 4.83s/it] {'loss': 0.4672, 'grad_norm': 0.7823344173089999, 'learning_rate': 9.942677667367541e-06, 'epoch': 0.08} 8%|▊ | 1698/22095 [2:55:53<27:23:29, 4.83s/it] 8%|▊ | 1699/22095 [2:55:57<25:24:55, 4.49s/it] {'loss': 0.4232, 'grad_norm': 0.704900119854084, 'learning_rate': 9.942566951689391e-06, 'epoch': 0.08} 8%|▊ | 1699/22095 [2:55:57<25:24:55, 4.49s/it] 8%|▊ | 1700/22095 [2:56:00<24:17:34, 4.29s/it] {'loss': 0.4317, 'grad_norm': 0.9097669547846146, 'learning_rate': 9.942456129810658e-06, 'epoch': 0.08} 8%|▊ | 1700/22095 [2:56:00<24:17:34, 4.29s/it] 8%|▊ | 1701/22095 [2:56:04<22:51:05, 4.03s/it] {'loss': 0.4731, 'grad_norm': 0.7346865602242609, 'learning_rate': 9.942345201733722e-06, 'epoch': 0.08} 8%|▊ | 1701/22095 [2:56:04<22:51:05, 4.03s/it] 8%|▊ | 1702/22095 [2:56:08<22:53:55, 4.04s/it] {'loss': 0.4739, 'grad_norm': 0.7587183898655953, 'learning_rate': 9.942234167460966e-06, 'epoch': 0.08} 8%|▊ | 1702/22095 [2:56:08<22:53:55, 4.04s/it] 8%|▊ | 1703/22095 [2:56:12<22:21:18, 3.95s/it] {'loss': 0.4157, 'grad_norm': 0.8201545355377298, 'learning_rate': 9.942123026994776e-06, 'epoch': 0.08} 8%|▊ | 1703/22095 [2:56:12<22:21:18, 3.95s/it] 8%|▊ | 1704/22095 [2:56:15<20:56:50, 3.70s/it] {'loss': 0.4148, 'grad_norm': 0.7966173846718374, 'learning_rate': 9.942011780337542e-06, 'epoch': 0.08} 8%|▊ | 1704/22095 [2:56:15<20:56:50, 3.70s/it] 8%|▊ | 1705/22095 [2:56:18<20:33:42, 3.63s/it] {'loss': 0.4087, 'grad_norm': 0.7314677884036063, 'learning_rate': 9.941900427491652e-06, 'epoch': 0.08} 8%|▊ | 1705/22095 [2:56:18<20:33:42, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1706/22095 [2:56:28<30:09:19, 5.32s/it] {'loss': 0.5203, 'grad_norm': 
0.5109860171092886, 'learning_rate': 9.941788968459502e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1707/22095 [2:56:31<27:44:13, 4.90s/it] {'loss': 0.4036, 'grad_norm': 0.8133081034298886, 'learning_rate': 9.941677403243482e-06, 'epoch': 0.08}
8%|▊ | 1708/22095 [2:56:35<26:15:11, 4.64s/it] {'loss': 0.4689, 'grad_norm': 0.7755811147059083, 'learning_rate': 9.941565731845993e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8941395 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64548, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
8%|▊ | 1709/22095 [2:56:40<26:13:47, 4.63s/it] {'loss': 0.4253, 'grad_norm': 0.7232739678213123, 'learning_rate': 9.941453954269434e-06, 'epoch': 0.08}
8%|▊ | 1710/22095 [2:56:43<24:05:01, 4.25s/it] {'loss': 0.4546, 'grad_norm': 0.8649248462101308, 'learning_rate': 9.941342070516205e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [87, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358062 in VC:s3://internvl-moe-sft-data/. Exception: Image size [87, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24773, 'image': 'vrdu_table_final_2/astro-ph.CO/0c5fac35-8138-4a32-87d7-d9b5cdcde35d.png', 'image_wh': [[87, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$\\alpha$-basis\\end{tabular}\n```"}]}
8%|▊ | 1711/22095 [2:56:47<22:48:19, 4.03s/it] {'loss': 0.4374, 'grad_norm': 0.8732318494015278, 'learning_rate': 9.941230080588711e-06, 'epoch': 0.08}
8%|▊ | 1712/22095 [2:56:51<22:43:24, 4.01s/it] {'loss': 0.44, 'grad_norm': 0.7775579716115757, 'learning_rate': 9.941117984489358e-06, 'epoch': 0.08}
8%|▊ | 1713/22095 [2:56:54<21:18:54, 3.76s/it] {'loss': 0.4301, 'grad_norm': 0.8352613535775968, 'learning_rate': 9.941005782220557e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (114793 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85738 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71281 > 40960).
Running this sequence through the model will result in indexing errors
8%|▊ | 1714/22095 [2:56:57<20:07:34, 3.55s/it] {'loss': 0.4748, 'grad_norm': 1.600225175835997, 'learning_rate': 9.940893473784714e-06, 'epoch': 0.08}
8%|▊ | 1715/22095 [2:57:01<20:23:08, 3.60s/it] {'loss': 0.468, 'grad_norm': 0.7210361373387383, 'learning_rate': 9.940781059184246e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8953369 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4204, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
8%|▊ | 1716/22095 [2:57:05<21:02:29, 3.72s/it] {'loss': 0.4502, 'grad_norm': 0.7215114442344711, 'learning_rate': 9.940668538421569e-06, 'epoch': 0.08}
8%|▊ | 1717/22095 [2:57:08<20:27:11, 3.61s/it] {'loss': 0.4619, 'grad_norm': 0.7494374543013032, 'learning_rate': 9.940555911499098e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (64347 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47801 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73085 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79242 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1718/22095 [2:57:12<20:06:59, 3.55s/it] {'loss': 0.4807, 'grad_norm': 0.8060329308220311, 'learning_rate': 9.940443178419255e-06, 'epoch': 0.08} 8%|▊ | 1718/22095 [2:57:12<20:06:59, 3.55s/it] 8%|▊ | 1719/22095 [2:57:15<19:03:00, 3.37s/it] {'loss': 0.4063, 'grad_norm': 0.7348165098234182, 'learning_rate': 9.940330339184461e-06, 'epoch': 0.08} 8%|▊ | 1719/22095 [2:57:15<19:03:00, 3.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1720/22095 [2:57:18<18:28:10, 3.26s/it] {'loss': 0.4637, 'grad_norm': 0.7759411143569876, 'learning_rate': 9.94021739379714e-06, 'epoch': 0.08} 8%|▊ | 1720/22095 [2:57:18<18:28:10, 3.26s/it] 8%|▊ | 1721/22095 [2:57:21<17:51:34, 3.16s/it] {'loss': 0.4375, 'grad_norm': 0.6940545281776684, 'learning_rate': 9.940104342259721e-06, 'epoch': 0.08} 8%|▊ | 1721/22095 [2:57:21<17:51:34, 3.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1722/22095 [2:57:28<25:32:02, 4.51s/it] {'loss': 0.5262, 'grad_norm': 0.6427432439226576, 'learning_rate': 9.939991184574632e-06, 'epoch': 0.08} 8%|▊ | 1722/22095 [2:57:28<25:32:02, 4.51s/it] 8%|▊ | 
1723/22095 [2:57:32<24:45:52, 4.38s/it] {'loss': 0.4516, 'grad_norm': 0.7415240348599722, 'learning_rate': 9.939877920744305e-06, 'epoch': 0.08} 8%|▊ | 1723/22095 [2:57:32<24:45:52, 4.38s/it] 8%|▊ | 1724/22095 [2:57:35<22:38:26, 4.00s/it] {'loss': 0.4324, 'grad_norm': 0.7042025867455868, 'learning_rate': 9.939764550771172e-06, 'epoch': 0.08} 8%|▊ | 1724/22095 [2:57:35<22:38:26, 4.00s/it] 8%|▊ | 1725/22095 [2:57:38<21:02:23, 3.72s/it] {'loss': 0.4533, 'grad_norm': 0.7534567023710437, 'learning_rate': 9.939651074657672e-06, 'epoch': 0.08} 8%|▊ | 1725/22095 [2:57:38<21:02:23, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1726/22095 [2:57:49<32:05:21, 5.67s/it] {'loss': 0.5252, 'grad_norm': 0.5824720249805632, 'learning_rate': 9.939537492406239e-06, 'epoch': 0.08} 8%|▊ | 1726/22095 [2:57:49<32:05:21, 5.67s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1727/22095 [2:57:52<28:32:51, 5.05s/it] {'loss': 0.4304, 'grad_norm': 0.7635615190018861, 'learning_rate': 9.939423804019316e-06, 'epoch': 0.08} 8%|▊ | 1727/22095 [2:57:52<28:32:51, 5.05s/it] 8%|▊ | 1728/22095 [2:57:55<25:07:02, 4.44s/it] {'loss': 0.4557, 'grad_norm': 0.806469733151042, 'learning_rate': 9.939310009499348e-06, 'epoch': 0.08} 8%|▊ | 1728/22095 [2:57:55<25:07:02, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42916 > 40960). 
Running this sequence through the model will result in indexing errors 8%|▊ | 1729/22095 [2:58:02<28:49:31, 5.10s/it] {'loss': 0.5468, 'grad_norm': 0.46661235464145895, 'learning_rate': 9.939196108848777e-06, 'epoch': 0.08} 8%|▊ | 1729/22095 [2:58:02<28:49:31, 5.10s/it] 8%|▊ | 1730/22095 [2:58:06<27:30:39, 4.86s/it] {'loss': 0.3917, 'grad_norm': 0.7065872520486806, 'learning_rate': 9.93908210207005e-06, 'epoch': 0.08} 8%|▊ | 1730/22095 [2:58:06<27:30:39, 4.86s/it] 8%|▊ | 1731/22095 [2:58:09<24:25:04, 4.32s/it] {'loss': 0.4622, 'grad_norm': 0.7716556106668118, 'learning_rate': 9.93896798916562e-06, 'epoch': 0.08} 8%|▊ | 1731/22095 [2:58:09<24:25:04, 4.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1732/22095 [2:58:13<23:59:22, 4.24s/it] {'loss': 0.4564, 'grad_norm': 0.7320673064635882, 'learning_rate': 9.938853770137935e-06, 'epoch': 0.08} 8%|▊ | 1732/22095 [2:58:13<23:59:22, 4.24s/it] 8%|▊ | 1733/22095 [2:58:17<22:17:26, 3.94s/it] {'loss': 0.453, 'grad_norm': 0.7696490103344769, 'learning_rate': 9.938739444989452e-06, 'epoch': 0.08} 8%|▊ | 1733/22095 [2:58:17<22:17:26, 3.94s/it] 8%|▊ | 1734/22095 [2:58:21<22:39:06, 4.01s/it] {'loss': 0.4298, 'grad_norm': 0.7618050527451413, 'learning_rate': 9.938625013722625e-06, 'epoch': 0.08} 8%|▊ | 1734/22095 [2:58:21<22:39:06, 4.01s/it] 8%|▊ | 1735/22095 [2:58:24<21:00:48, 3.72s/it] {'loss': 0.4492, 'grad_norm': 0.7287327328684455, 'learning_rate': 9.938510476339915e-06, 'epoch': 0.08} 8%|▊ | 1735/22095 [2:58:24<21:00:48, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70836 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44098 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59438 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1736/22095 [2:58:27<19:40:06, 3.48s/it] {'loss': 0.4397, 'grad_norm': 0.7355669694141859, 'learning_rate': 9.938395832843784e-06, 'epoch': 0.08} 8%|▊ | 1736/22095 [2:58:27<19:40:06, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41078 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1737/22095 [2:58:36<29:59:08, 5.30s/it] {'loss': 0.5109, 'grad_norm': 0.5986083608600049, 'learning_rate': 9.938281083236692e-06, 'epoch': 0.08} 8%|▊ | 1737/22095 [2:58:36<29:59:08, 5.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8934540 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57693, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nA. 8\nB. 4\nC. 6\nD. 
7.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 8%|▊ | 1738/22095 [2:58:40<26:45:58, 4.73s/it] {'loss': 0.464, 'grad_norm': 0.8267103372429406, 'learning_rate': 9.938166227521106e-06, 'epoch': 0.08} 8%|▊ | 1738/22095 [2:58:40<26:45:58, 4.73s/it] 8%|▊ | 1739/22095 [2:58:43<23:41:09, 4.19s/it] {'loss': 0.4495, 'grad_norm': 0.8498829525864762, 'learning_rate': 9.938051265699495e-06, 'epoch': 0.08} 8%|▊ | 1739/22095 [2:58:43<23:41:09, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47916 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41301 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42789 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1740/22095 [2:58:46<21:59:34, 3.89s/it] {'loss': 0.493, 'grad_norm': 0.7371846718545788, 'learning_rate': 9.937936197774328e-06, 'epoch': 0.08} 8%|▊ | 1740/22095 [2:58:46<21:59:34, 3.89s/it] 8%|▊ | 1741/22095 [2:58:49<20:58:13, 3.71s/it] {'loss': 0.506, 'grad_norm': 0.7612591001481491, 'learning_rate': 9.937821023748077e-06, 'epoch': 0.08} 8%|▊ | 1741/22095 [2:58:49<20:58:13, 3.71s/it] 8%|▊ | 1742/22095 [2:58:53<21:00:23, 3.72s/it] {'loss': 0.4469, 'grad_norm': 0.7921542623351243, 'learning_rate': 9.93770574362322e-06, 'epoch': 0.08} 8%|▊ | 1742/22095 [2:58:53<21:00:23, 3.72s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in 
_get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8394434 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 61269, 'image': 'vrdu_table_final_2/astro-ph.EP/191a8f69-41c6-4733-9f32-2c9802b23690.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 8%|▊ | 1743/22095 [2:58:56<20:12:14, 3.57s/it] {'loss': 0.4762, 'grad_norm': 0.7471390352933903, 'learning_rate': 9.937590357402229e-06, 'epoch': 0.08} 8%|▊ | 1743/22095 [2:58:56<20:12:14, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1744/22095 [2:59:06<30:57:10, 5.48s/it] {'loss': 0.5168, 'grad_norm': 0.6834612586390713, 'learning_rate': 9.937474865087588e-06, 'epoch': 0.08} 8%|▊ | 1744/22095 [2:59:06<30:57:10, 5.48s/it] 8%|▊ | 1745/22095 [2:59:11<29:27:34, 5.21s/it] {'loss': 0.5015, 'grad_norm': 0.4588021154978975, 'learning_rate': 9.937359266681774e-06, 'epoch': 0.08} 8%|▊ | 1745/22095 [2:59:11<29:27:34, 5.21s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 8%|▊ | 1746/22095 [2:59:14<25:57:41, 4.59s/it] {'loss': 0.3916, 'grad_norm': 0.8696667108506297, 'learning_rate': 9.937243562187276e-06, 'epoch': 0.08} 8%|▊ | 1746/22095 [2:59:14<25:57:41, 4.59s/it] 8%|▊ | 1747/22095 [2:59:17<24:32:45, 4.34s/it] {'loss': 0.4699, 'grad_norm': 0.9082653531391771, 'learning_rate': 9.937127751606577e-06, 'epoch': 0.08} 8%|▊ | 1747/22095 [2:59:17<24:32:45, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices 
sequence length is longer than the specified maximum sequence length for this model (44821 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55455 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73023 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1748/22095 [2:59:26<32:00:40, 5.66s/it] {'loss': 0.5257, 'grad_norm': 0.5583039423714269, 'learning_rate': 9.937011834942165e-06, 'epoch': 0.08} 8%|▊ | 1748/22095 [2:59:26<32:00:40, 5.66s/it] 8%|▊ | 1749/22095 [2:59:31<29:51:23, 5.28s/it] {'loss': 0.4654, 'grad_norm': 0.7838632052997255, 'learning_rate': 9.936895812196531e-06, 'epoch': 0.08} 8%|▊ | 1749/22095 [2:59:31<29:51:23, 5.28s/it] 8%|▊ | 1750/22095 [2:59:34<26:45:25, 4.73s/it] {'loss': 0.4551, 'grad_norm': 0.8033674155474345, 'learning_rate': 9.936779683372169e-06, 'epoch': 0.08} 8%|▊ | 1750/22095 [2:59:34<26:45:25, 4.73s/it] 8%|▊ | 1751/22095 [2:59:38<25:48:54, 4.57s/it] {'loss': 0.4933, 'grad_norm': 0.7960851566293581, 'learning_rate': 9.936663448471573e-06, 'epoch': 0.08} 8%|▊ | 1751/22095 [2:59:38<25:48:54, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56638 > 40960). 
Running this sequence through the model will result in indexing errors
8%|▊ | 1752/22095 [2:59:42<24:44:06, 4.38s/it] {'loss': 0.4504, 'grad_norm': 0.75375389407784, 'learning_rate': 9.936547107497243e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [314, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8433820 in VC:s3://internvl-moe-sft-data/. Exception: Image size [314, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49409, 'image': 'vrdu_texteq/astro-ph.CO/2f1416d2-07b3-4541-bd3c-57ccaf7dd66d.png', 'image_wh': [[314, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'and $R_{\\rm d}$ is the scale radius'}]}
8%|▊ | 1753/22095 [2:59:46<23:47:35, 4.21s/it] {'loss': 0.4346, 'grad_norm': 0.8160771745066261, 'learning_rate': 9.936430660451676e-06, 'epoch': 0.08}
8%|▊ | 1754/22095 [2:59:49<21:48:14, 3.86s/it] {'loss': 0.4562, 'grad_norm': 0.7558067831789662, 'learning_rate': 9.936314107337375e-06, 'epoch': 0.08}
8%|▊ | 1755/22095 [2:59:52<20:59:56, 3.72s/it] {'loss': 0.4346, 'grad_norm': 0.7699610133117941, 'learning_rate': 9.936197448156845e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [395, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8525605 in VC:s3://internvl-moe-sft-data/. Exception: Image size [395, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36230, 'image': 'vrdu_texteq/astro-ph.CO/85a2a317-21ba-498f-bd59-e5efb9d03913.png', 'image_wh': [[395, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': '$7.7\\sigma$ for the low redshift bin and'}]}
8%|▊ | 1756/22095 [2:59:58<24:05:10, 4.26s/it] {'loss': 0.54, 'grad_norm': 0.4979692975191816, 'learning_rate': 9.936080682912594e-06, 'epoch': 0.08}
8%|▊ | 1757/22095 [3:00:02<23:14:58, 4.12s/it] {'loss': 0.4594, 'grad_norm': 0.8028363930031575, 'learning_rate': 9.935963811607127e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (63848 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1758/22095 [3:00:05<21:47:42, 3.86s/it] {'loss': 0.4588, 'grad_norm': 0.7932475224690867, 'learning_rate': 9.935846834242956e-06, 'epoch': 0.08}
8%|▊ | 1759/22095 [3:00:09<21:52:14, 3.87s/it] {'loss': 0.4699, 'grad_norm': 0.7325005955848916, 'learning_rate': 9.935729750822598e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63749 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65840 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1760/22095 [3:00:17<28:30:37, 5.05s/it] {'loss': 0.5169, 'grad_norm': 0.3710574617410941, 'learning_rate': 9.935612561348566e-06, 'epoch': 0.08} 8%|▊ | 1760/22095 [3:00:17<28:30:37, 5.05s/it] 8%|▊ | 1761/22095 [3:00:20<25:24:34, 4.50s/it] {'loss': 0.4356, 'grad_norm': 0.7945140363859222, 'learning_rate': 9.935495265823379e-06, 'epoch': 0.08} 8%|▊ | 1761/22095 [3:00:20<25:24:34, 4.50s/it] 8%|▊ | 1762/22095 [3:00:23<23:41:28, 4.19s/it] {'loss': 0.4829, 'grad_norm': 1.1506262279272965, 'learning_rate': 9.935377864249558e-06, 'epoch': 0.08} 8%|▊ | 1762/22095 [3:00:23<23:41:28, 4.19s/it] 8%|▊ | 1763/22095 [3:00:27<22:09:10, 3.92s/it] {'loss': 0.4144, 'grad_norm': 0.7278323845638668, 'learning_rate': 9.935260356629623e-06, 'epoch': 0.08} 8%|▊ | 1763/22095 [3:00:27<22:09:10, 3.92s/it] 8%|▊ | 1764/22095 [3:00:30<20:29:57, 3.63s/it] {'loss': 0.3977, 'grad_norm': 0.7497010631773515, 'learning_rate': 9.935142742966099e-06, 'epoch': 0.08} 8%|▊ | 1764/22095 [3:00:30<20:29:57, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53173 > 40960). 
Running this sequence through the model will result in indexing errors 8%|▊ | 1765/22095 [3:00:33<20:37:17, 3.65s/it] {'loss': 0.4732, 'grad_norm': 0.7404458281691546, 'learning_rate': 9.935025023261518e-06, 'epoch': 0.08} 8%|▊ | 1765/22095 [3:00:33<20:37:17, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1766/22095 [3:00:43<30:30:59, 5.40s/it] {'loss': 0.5196, 'grad_norm': 0.46932200375510835, 'learning_rate': 9.934907197518405e-06, 'epoch': 0.08} 8%|▊ | 1766/22095 [3:00:43<30:30:59, 5.40s/it] 8%|▊ | 1767/22095 [3:00:46<26:44:00, 4.73s/it] {'loss': 0.4248, 'grad_norm': 0.7566804198935556, 'learning_rate': 9.934789265739291e-06, 'epoch': 0.08} 8%|▊ | 1767/22095 [3:00:46<26:44:00, 4.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1768/22095 [3:00:54<33:02:20, 5.85s/it] {'loss': 0.5084, 'grad_norm': 0.36487050083608025, 'learning_rate': 9.934671227926714e-06, 'epoch': 0.08} 8%|▊ | 1768/22095 [3:00:54<33:02:20, 5.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49196 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59000 > 40960). 
Running this sequence through the model will result in indexing errors 8%|▊ | 1769/22095 [3:00:58<28:49:44, 5.11s/it] {'loss': 0.4203, 'grad_norm': 0.7996572718306318, 'learning_rate': 9.934553084083205e-06, 'epoch': 0.08} 8%|▊ | 1769/22095 [3:00:58<28:49:44, 5.11s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1770/22095 [3:01:01<26:12:27, 4.64s/it] {'loss': 0.4812, 'grad_norm': 0.7649702206192669, 'learning_rate': 9.934434834211309e-06, 'epoch': 0.08} 8%|▊ | 1770/22095 [3:01:01<26:12:27, 4.64s/it] 8%|▊ | 1771/22095 [3:01:04<23:06:32, 4.09s/it] {'loss': 0.4276, 'grad_norm': 0.7872672685702111, 'learning_rate': 9.93431647831356e-06, 'epoch': 0.08} 8%|▊ | 1771/22095 [3:01:04<23:06:32, 4.09s/it] 8%|▊ | 1772/22095 [3:01:08<22:52:45, 4.05s/it] {'loss': 0.4045, 'grad_norm': 0.7374095690002961, 'learning_rate': 9.934198016392507e-06, 'epoch': 0.08} 8%|▊ | 1772/22095 [3:01:08<22:52:45, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1773/22095 [3:01:15<28:12:38, 5.00s/it] {'loss': 0.5258, 'grad_norm': 0.46296745243555587, 'learning_rate': 9.934079448450692e-06, 'epoch': 0.08} 8%|▊ | 1773/22095 [3:01:15<28:12:38, 5.00s/it] 8%|▊ | 1774/22095 [3:01:19<25:14:05, 4.47s/it] {'loss': 0.4316, 'grad_norm': 0.9402761167337232, 'learning_rate': 9.933960774490663e-06, 'epoch': 0.08} 8%|▊ | 1774/22095 [3:01:19<25:14:05, 4.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1775/22095 [3:01:21<22:33:57, 4.00s/it] {'loss': 0.4937, 'grad_norm': 0.770679153259081, 'learning_rate': 9.933841994514972e-06, 'epoch': 0.08} 8%|▊ | 1775/22095 [3:01:21<22:33:57, 4.00s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_4/images/before_screenshot_38_id_302_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-27 18:59:20.847308 load time: 1020.29 ms 8%|▊ | 1776/22095 
[3:01:24<20:40:08, 3.66s/it] {'loss': 0.456, 'grad_norm': 0.7540461540716024, 'learning_rate': 9.933723108526168e-06, 'epoch': 0.08} 8%|▊ | 1776/22095 [3:01:24<20:40:08, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8392004 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 58830, 'image': 'vrdu_table_final_2/astro-ph.EP/c7d80045-c7ed-47ce-aa79-bbbdf4bd507d.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1777/22095 [3:01:34<31:08:09, 5.52s/it] {'loss': 0.5169, 'grad_norm': 0.4096177953587281, 'learning_rate': 9.933604116526807e-06, 'epoch': 0.08} 8%|▊ | 1777/22095 [3:01:34<31:08:09, 5.52s/it] 8%|▊ | 1778/22095 [3:01:38<27:31:26, 4.88s/it] {'loss': 0.4286, 'grad_norm': 0.846509551981744, 'learning_rate': 9.933485018519448e-06, 'epoch': 0.08} 8%|▊ | 1778/22095 [3:01:38<27:31:26, 4.88s/it] 8%|▊ | 1779/22095 [3:01:41<24:52:21, 4.41s/it] {'loss': 0.4304, 'grad_norm': 0.7339440286381098, 'learning_rate': 9.933365814506646e-06, 'epoch': 0.08} 8%|▊ | 1779/22095 [3:01:41<24:52:21, 4.41s/it]Rank 0: Number of image tokens 0 does 
not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1780/22095 [3:01:44<22:42:26, 4.02s/it] {'loss': 0.4615, 'grad_norm': 0.7571291353344685, 'learning_rate': 9.933246504490966e-06, 'epoch': 0.08} 8%|▊ | 1780/22095 [3:01:44<22:42:26, 4.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1781/22095 [3:01:47<21:21:10, 3.78s/it] {'loss': 0.4042, 'grad_norm': 0.785352191096861, 'learning_rate': 9.933127088474968e-06, 'epoch': 0.08} 8%|▊ | 1781/22095 [3:01:47<21:21:10, 3.78s/it] 8%|▊ | 1782/22095 [3:01:51<20:48:31, 3.69s/it] {'loss': 0.4385, 'grad_norm': 0.7196603766142141, 'learning_rate': 9.93300756646122e-06, 'epoch': 0.08} 8%|▊ | 1782/22095 [3:01:51<20:48:31, 3.69s/it] 8%|▊ | 1783/22095 [3:01:56<23:05:38, 4.09s/it] {'loss': 0.4111, 'grad_norm': 0.790373245354847, 'learning_rate': 9.932887938452292e-06, 'epoch': 0.08} 8%|▊ | 1783/22095 [3:01:56<23:05:38, 4.09s/it] 8%|▊ | 1784/22095 [3:01:59<21:46:07, 3.86s/it] {'loss': 0.4257, 'grad_norm': 0.7157238169714207, 'learning_rate': 9.932768204450751e-06, 'epoch': 0.08} 8%|▊ | 1784/22095 [3:01:59<21:46:07, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69034 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85137 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57051 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54340 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49278 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101782 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1785/22095 [3:02:02<19:58:31, 3.54s/it] {'loss': 0.4602, 'grad_norm': 0.8043970204059999, 'learning_rate': 9.932648364459172e-06, 'epoch': 0.08} 8%|▊ | 1785/22095 [3:02:02<19:58:31, 3.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (110300 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1786/22095 [3:02:05<18:42:45, 3.32s/it] {'loss': 0.4547, 'grad_norm': 0.8122044645865316, 'learning_rate': 9.93252841848013e-06, 'epoch': 0.08} 8%|▊ | 1786/22095 [3:02:05<18:42:45, 3.32s/it] 8%|▊ | 1787/22095 [3:02:08<18:02:03, 3.20s/it] {'loss': 0.4828, 'grad_norm': 0.7857914577282751, 'learning_rate': 9.932408366516198e-06, 'epoch': 0.08} 8%|▊ | 1787/22095 [3:02:08<18:02:03, 3.20s/it] 8%|▊ | 1788/22095 [3:02:11<17:38:27, 3.13s/it] {'loss': 0.3924, 'grad_norm': 0.7751491400233553, 'learning_rate': 9.932288208569961e-06, 'epoch': 0.08} 8%|▊ | 1788/22095 [3:02:11<17:38:27, 3.13s/it] 8%|▊ | 1789/22095 [3:02:14<17:40:02, 3.13s/it] {'loss': 0.4319, 'grad_norm': 0.7582334131958042, 'learning_rate': 9.932167944643998e-06, 'epoch': 0.08} 8%|▊ | 1789/22095 [3:02:14<17:40:02, 3.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8965709 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16544, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 3cm\nB. 2cm\nC. 5cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 8%|▊ | 1790/22095 [3:02:18<19:23:17, 3.44s/it] {'loss': 0.4755, 'grad_norm': 0.7161260925348351, 'learning_rate': 9.932047574740895e-06, 'epoch': 0.08} 8%|▊ | 1790/22095 [3:02:18<19:23:17, 3.44s/it] 8%|▊ | 1791/22095 [3:02:21<19:34:48, 3.47s/it] {'loss': 0.4175, 'grad_norm': 0.8180310284049336, 'learning_rate': 9.931927098863237e-06, 'epoch': 0.08} 8%|▊ | 1791/22095 [3:02:21<19:34:48, 3.47s/it]Traceback (most recent call last): Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11234126 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8617, 'image': 'MMMU/Pharmacy/test_19_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: Sulfuric acid reacts with sodium hydroxide according to the following:,What mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: [' 36.76 grams', ' 36.77 grams', ' 36.78 grams']"}, {'from': 'gpt', 'value': "## Introduction\n\nThe reaction between sulfuric acid \\((\\text{H}_2\\text{SO}_4)\\) and sodium hydroxide \\((\\text{NaOH})\\) is a classic example of an acid-base reaction. Understanding this reaction requires knowledge of several fundamental chemical principles including stoichiometry, molecular weight calculation, balancing chemical equations, and the concept of molarity. 
In this discussion, I will provide a detailed overview of each relevant concept to ensure a comprehensive understanding of the underlying principles.\n\n### Sulfuric Acid \\((\\text{H}_2\\text{SO}_4)\\)\n\n#### Chemical Structure and Properties:\n- **Chemical Formula**: \\(\\text{H}_2\\text{SO}_4\\)\n- **Molecular Weight**: The molecular weight of sulfuric acid is:\n - Hydrogen (H): \\(1 \\, \\text{amu} \\times 2 = 2\\, \\text{amu}\\)\n - Sulfur (S): \\(32 \\, \\text{amu} = 32\\, \\text{amu}\\)\n - Oxygen (O): \\(16 \\, \\text{amu} \\times 4 = 64\\, \\text{amu}\\)\n - **Total**: \\(2 + 32 + 64 = 98 \\, \\text{amu}\\)\n- **Properties**: Sulfuric acid is a highly corrosive, dense liquid that is soluble in water at all concentrations. It is a strong mineral acid known for its capability to dehydrate carbohydrates and organic materials.\n\n#### Applications:\n- **Industrial Uses**: Sulfuric acid is widely used in the manufacture of fertilizers, chemicals, pharmaceuticals, and in petroleum refining.\n- **Laboratory Uses**: It is used in various chemical synthesis reactions, as a dehydrating agent, and in titrations.\n\n#### Acid-Base Behavior:\nSulfuric acid is a **diprotic acid**, which means it can donate two protons \\((\\text{H}^+)\\) in aqueous solutions:\n- First dissociation: \\(\\text{H}_2\\text{SO}_4 \\rightarrow \\text{H}^+ + \\text{HSO}_4^-\\)\n- Second dissociation: \\(\\text{HSO}_4^- \\rightarrow \\text{H}^+ + \\text{SO}_4^{2-}\\)\n\n### Sodium Hydroxide \\((\\text{NaOH})\\)\n\n#### Chemical Structure and Properties:\n- **Chemical Formula**: \\(\\text{NaOH}\\)\n- **Molecular Weight**: Calculated as:\n - Sodium (Na): \\(23 \\, \\text{amu} = 23\\, \\text{amu}\\)\n - Oxygen (O): \\(16 \\, \\text{amu} = 16\\, \\text{amu}\\)\n - Hydrogen (H): \\(1 \\, \\text{amu} = 1\\, \\text{amu}\\)\n - **Total**: \\(23 + 16 + 1 = 40 \\, \\text{amu}\\)\n- **Properties**: Sodium hydroxide, also known as lye or caustic soda, is a strong base that is highly soluble in water and 
forms a slippery solution.\n\n#### Applications:\n- **Industrial Uses**: Used in cleaning agents, the manufacture of paper and textiles, and in the saponification process for soap production.\n- **Laboratory Uses**: Commonly used as a reagent in laboratory experiments, particularly in titrations.\n\n#### Base Behavior:\nSodium hydroxide dissociates in water to form sodium ions and hydroxide ions:\n\\[\n\\text{NaOH} \\rightarrow \\text{Na}^+ + \\text{OH}^-\n\\]\n\n### Stoichiometry and Chemical Equations\n\n#### Balancing Chemical Equations:\nThe balanced chemical equation for the reaction between sulfuric acid and sodium hydroxide is:\n\\[\n\\text{H}_2\\text{SO}_4 + 2\\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2\\text{H}_2\\text{O}\n\\]\nThis equation indicates that one mole of sulfuric acid reacts with two moles of sodium hydroxide to produce one mole of sodium sulfate and two moles of water.\n\n#### Stoichiometric Calculations:\nStoichiometry involves the quantitative relationships between the reactants and products in a chemical reaction. 
It uses the balanced equation to convert between moles of reactants and products.\n\n- **Mole Concept**: The mole is a fundamental unit in chemistry that represents a specific number of particles (Avogadro's number, \\(6.022 \\times 10^{23}\\)).\n\n- **Molar Ratio**: From the balanced equation, the reaction requires a 1:2 molar ratio of \\(\\text{H}_2\\text{SO}_4\\) to \\(\\text{NaOH}\\).\n\n### Molecular Weight and Molar Mass\n\n#### Molecular Weight Calculation:\n- **Molar Mass**: The molar mass (in grams per mole) is numerically equal to the molecular weight but converted to grams.\n\n- **Example Calculation for \\(\\text{H}_2\\text{SO}_4\\)**: \n - As calculated earlier, sulfuric acid has a molecular weight of 98 amu, which corresponds to a molar mass of 98 g/mol.\n\n#### Importance in Calculations:\nKnowing the molar mass is essential for converting between the mass of a substance and the moles of the substance, which is critical in stoichiometric calculations.\n\n### Practical Applications and Examples\n\n#### Neutralization Reactions:\n- **Definition**: A neutralization reaction is between an acid and a base to form water and a salt.\n \n- **Example**: The reaction between \\(\\text{H}_2\\text{SO}_4\\) and \\(\\text{NaOH}\\) is a neutralization reaction, producing \\(\\text{Na}_2\\text{SO}_4\\) (sodium sulfate) and water.\n\n#### Titrations:\n- **Concept**: Titration is a method to determine the concentration of a solution by reacting it with a solution of known concentration.\n \n- **Relevance**: Understanding stoichiometric ratios is crucial in titrations to calculate unknown concentrations.\n\n#### Industrial Synthesis:\n- **Chemical Manufacturing**: Many industrial processes rely on precise stoichiometric calculations to optimize the yield and efficiency of chemical syntheses.\n\n### Mass Calculations in Chemical Reactions\n\nGiven 0.75 mol of \\(\\text{NaOH}\\), we can determine the mass of \\(\\text{H}_2\\text{SO}_4\\) required using the balanced 
equation and stoichiometry:\n\n1. **Determine Moles Required**:\n - According to the balanced equation, 1 mole of \\(\\text{H}_2\\text{SO}_4\\) reacts with 2 moles of \\(\\text{NaOH}\\).\n - Therefore, 0.75 moles of \\(\\text{NaOH}\\) require 0.375 moles of \\(\\text{H}_2\\text{SO}_4\\).\n\n2. **Convert Moles to Grams**:\n - Molar mass of \\(\\text{H}_2\\text{SO}_4\\) is 98 g/mol.\n - Mass of \\(\\text{H}_2\\text{SO}_4\\) = \\(0.375 \\times 98 = 36.75\\) grams.\n\nThrough these detailed explorations, this classic chemical reaction demonstrates key principles in chemistry, providing valuable insights into stoichiometry, neutralization reactions, and practical applications in industrial and laboratory settings."}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047920 in VC:s3://multi-modal/UniGeo/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 2\nB. 4\nC. 8\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
8%|▊ | 1792/22095 [3:02:25<19:13:36, 3.41s/it] {'loss': 0.4034, 'grad_norm': 0.6835845823674249, 'learning_rate': 9.931806517013612e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1793/22095 [3:02:35<31:14:27, 5.54s/it] {'loss': 0.4926, 'grad_norm': 0.577581697777701, 'learning_rate': 9.931685829194612e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [709, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8414500 in VC:s3://internvl-moe-sft-data/. Exception: Image size [709, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 155553, 'image': 'vrdu_texteq/astro-ph.CO/06198ef3-4fd7-4387-a5e2-5201657bb43f.png', 'image_wh': [[709, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'where from the Planck measurement $n=0.9616\\pm 0.0094$~.'}]}
8%|▊ | 1794/22095 [3:02:39<27:49:41, 4.93s/it] {'loss': 0.4507, 'grad_norm': 0.788125945271417, 'learning_rate': 9.931565035408833e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (115570 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47605 > 40960).
Running this sequence through the model will result in indexing errors
8%|▊ | 1795/22095 [3:02:42<24:58:35, 4.43s/it] {'loss': 0.4145, 'grad_norm': 0.7967217507636467, 'learning_rate': 9.931444135658864e-06, 'epoch': 0.08}
8%|▊ | 1796/22095 [3:02:45<22:47:26, 4.04s/it] {'loss': 0.4061, 'grad_norm': 0.7126756325676225, 'learning_rate': 9.931323129947306e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [448, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8473211 in VC:s3://internvl-moe-sft-data/. Exception: Image size [448, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47381, 'image': 'vrdu_texteq/astro-ph.CO/1b0361ca-f34d-4192-8d04-5c27814de898.png', 'image_wh': [[448, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $X$ is the iteration number and'}]}
8%|▊ | 1797/22095 [3:02:48<20:46:32, 3.68s/it] {'loss': 0.4138, 'grad_norm': 0.7425565928836916, 'learning_rate': 9.931202018276761e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (49827 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97159 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1798/22095 [3:02:51<20:16:32, 3.60s/it] {'loss': 0.4607, 'grad_norm': 0.7435477086974337, 'learning_rate': 9.93108080064983e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (50742 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57316 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1799/22095 [3:02:55<20:25:57, 3.62s/it] {'loss': 0.4587, 'grad_norm': 0.772240860997725, 'learning_rate': 9.930959477069117e-06, 'epoch': 0.08}
8%|▊ | 1800/22095 [3:02:58<19:17:04, 3.42s/it] {'loss': 0.4311, 'grad_norm': 0.7435660703263862, 'learning_rate': 9.930838047537228e-06, 'epoch': 0.08}
8%|▊ | 1801/22095 [3:03:01<18:38:59, 3.31s/it] {'loss': 0.4285, 'grad_norm': 0.6831940907334916, 'learning_rate': 9.930716512056775e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (75756 > 40960).
Running this sequence through the model will result in indexing errors
8%|▊ | 1802/22095 [3:03:04<18:04:32, 3.21s/it] {'loss': 0.4521, 'grad_norm': 0.8106025492808707, 'learning_rate': 9.930594870630365e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1803/22095 [3:03:10<22:27:31, 3.98s/it] {'loss': 0.5216, 'grad_norm': 0.752269782845294, 'learning_rate': 9.930473123260618e-06, 'epoch': 0.08}
8%|▊ | 1804/22095 [3:03:13<21:36:49, 3.83s/it] {'loss': 0.4725, 'grad_norm': 0.7751702345109988, 'learning_rate': 9.930351269950144e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (102984 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46005 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42304 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79132 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52831 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1805/22095 [3:03:16<19:50:55, 3.52s/it] {'loss': 0.424, 'grad_norm': 0.7742768817473613, 'learning_rate': 9.930229310701563e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63944 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101065 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51438 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1806/22095 [3:03:22<24:21:12, 4.32s/it] {'loss': 0.55, 'grad_norm': 0.46471716913368016, 'learning_rate': 9.930107245517498e-06, 'epoch': 0.08}
8%|▊ | 1807/22095 [3:03:30<30:25:58, 5.40s/it] {'loss': 0.5319, 'grad_norm': 0.3865056561970093, 'learning_rate': 9.929985074400569e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 364, but got module 1
8%|▊ | 1808/22095 [3:03:33<26:49:28, 4.76s/it] {'loss': 0.4585, 'grad_norm': 0.9958802419213363, 'learning_rate': 9.929862797353402e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047852 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
8%|▊ | 1809/22095 [3:03:37<25:07:42, 4.46s/it] {'loss': 0.4438, 'grad_norm': 0.8995192761447293, 'learning_rate': 9.929740414378625e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1810/22095 [3:03:47<33:29:16, 5.94s/it] {'loss': 0.5199, 'grad_norm': 0.5321022825681664, 'learning_rate': 9.929617925478868e-06, 'epoch': 0.08}
8%|▊ | 1811/22095 [3:03:50<28:57:25, 5.14s/it] {'loss': 0.4453, 'grad_norm': 0.9978659299427117, 'learning_rate': 9.92949533065676e-06, 'epoch': 0.08}
8%|▊ | 1812/22095 [3:03:53<25:37:49, 4.55s/it] {'loss': 0.4665, 'grad_norm': 0.9841211011383583, 'learning_rate': 9.929372629914937e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1813/22095 [3:04:03<34:07:03, 6.06s/it] {'loss': 0.4928, 'grad_norm': 0.47226880302660784, 'learning_rate': 9.929249823256037e-06, 'epoch': 0.08}
8%|▊ | 1814/22095 [3:04:12<39:50:50, 7.07s/it] {'loss': 0.5312, 'grad_norm': 0.44030382896499054, 'learning_rate': 9.929126910682697e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (47495 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (57286 > 40960) for 4 sample(s). Truncating to 7645 with 2 samples.
8%|▊ | 1815/22095 [3:04:16<33:50:17, 6.01s/it] {'loss': 0.437, 'grad_norm': 1.0380986223119235, 'learning_rate': 9.929003892197558e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8337286 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3908, 'image': 'vrdu_table_final_2/astro-ph.CO/fd803df4-4ecd-42ca-bd18-401c733136c7.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (99231 > 40960).
Running this sequence through the model will result in indexing errors
8%|▊ | 1816/22095 [3:04:19<28:54:00, 5.13s/it] {'loss': 0.4426, 'grad_norm': 0.8010625789004381, 'learning_rate': 9.928880767803264e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (85197 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1817/22095 [3:04:22<26:37:23, 4.73s/it] {'loss': 0.444, 'grad_norm': 0.8061411711841202, 'learning_rate': 9.928757537502458e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365495 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32236, 'image': 'vrdu_table_final_2/astro-ph.CO/fd9cb0d4-0fc0-4bb5-ac5a-5902f063a3b4.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
8%|▊ | 1818/22095 [3:04:25<23:31:14, 4.18s/it] {'loss': 0.4575, 'grad_norm': 0.9451107724245381, 'learning_rate': 9.928634201297793e-06, 'epoch': 0.08}
8%|▊ | 1819/22095 [3:04:29<22:12:28, 3.94s/it] {'loss': 0.4367, 'grad_norm': 0.7274563128661418, 'learning_rate': 9.928510759191914e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922715 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45868, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '2'}]}
8%|▊ | 1820/22095 [3:04:32<21:11:37, 3.76s/it] {'loss': 0.43, 'grad_norm': 0.8482490358684864, 'learning_rate': 9.928387211187478e-06, 'epoch': 0.08}
8%|▊ | 1821/22095 [3:04:35<20:36:43, 3.66s/it] {'loss': 0.4179, 'grad_norm': 0.9110378559090746, 'learning_rate': 9.928263557287135e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1822/22095 [3:04:40<21:18:00, 3.78s/it] {'loss': 0.4434, 'grad_norm': 0.9439672447412728, 'learning_rate': 9.928139797493545e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (70357 > 40960).
Running this sequence through the model will result in indexing errors
8%|▊ | 1823/22095 [3:04:50<32:17:57, 5.74s/it] {'loss': 0.5682, 'grad_norm': 0.6352867452932015, 'learning_rate': 9.928015931809368e-06, 'epoch': 0.08}
8%|▊ | 1824/22095 [3:04:53<28:38:46, 5.09s/it] {'loss': 0.4506, 'grad_norm': 0.8239888115018645, 'learning_rate': 9.927891960237261e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1825/22095 [3:05:04<37:29:38, 6.66s/it] {'loss': 0.498, 'grad_norm': 0.42021768765074907, 'learning_rate': 9.927767882779892e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1826/22095 [3:05:07<32:03:08, 5.69s/it] {'loss': 0.4705, 'grad_norm': 1.2031228767214417, 'learning_rate': 9.927643699439925e-06, 'epoch': 0.08}
8%|▊ | 1827/22095 [3:05:10<27:36:16, 4.90s/it] {'loss': 0.4587, 'grad_norm': 0.7864408720289386, 'learning_rate': 9.92751941022003e-06, 'epoch': 0.08}
8%|▊ | 1828/22095 [3:05:13<24:34:22, 4.36s/it] {'loss': 0.4978, 'grad_norm': 0.8146428310929719, 'learning_rate': 9.927395015122876e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047685 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 3\nB. 4\nC. 1\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AD=\\frac{1}{3}AB,AB=12,∴AD=4,∵C是AD的中点,∴AC=\\frac{1}{2}AD=2.'}]}
8%|▊ | 1829/22095 [3:05:17<23:32:28, 4.18s/it] {'loss': 0.4838, 'grad_norm': 0.9403173929391236, 'learning_rate': 9.927270514151137e-06, 'epoch': 0.08}
8%|▊ | 1830/22095 [3:05:21<23:48:50, 4.23s/it] {'loss': 0.4569, 'grad_norm': 0.7782652098882176, 'learning_rate': 9.927145907307486e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908005 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Invalidate trace cache @ step 2: expected module 1, but got module 364
Problematic sample: {'id': 31158, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nA. 6\nB. 2\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1831/22095 [3:05:31<32:34:05, 5.79s/it] {'loss': 0.5406, 'grad_norm': 1.0164338183003059, 'learning_rate': 9.927021194594604e-06, 'epoch': 0.08}
8%|▊ | 1832/22095 [3:05:34<28:50:00, 5.12s/it] {'loss': 0.4387, 'grad_norm': 0.7846927910640304, 'learning_rate': 9.926896376015168e-06, 'epoch': 0.08}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [492, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8498092 in VC:s3://internvl-moe-sft-data/. Exception: Image size [492, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 140455, 'image': 'vrdu_texteq/astro-ph.CO/3dda0c5e-106e-4de6-821d-222eed92e068.png', 'image_wh': [[492, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': '$P_{shot}$ is the unknown Poisson shot noise.'}]}
8%|▊ | 1833/22095 [3:05:38<26:15:52, 4.67s/it] {'loss': 0.5215, 'grad_norm': 0.8219142743159832, 'learning_rate': 9.926771451571862e-06, 'epoch': 0.08}
8%|▊ | 1834/22095 [3:05:41<23:21:18, 4.15s/it] {'loss': 0.4201, 'grad_norm': 0.6988546359400207, 'learning_rate': 9.926646421267366e-06, 'epoch': 0.08}
8%|▊ | 1835/22095 [3:05:44<22:11:25, 3.94s/it] {'loss': 0.4536, 'grad_norm': 0.7129991767098324, 'learning_rate': 9.926521285104371e-06, 'epoch': 0.08}
8%|▊ | 1836/22095 [3:05:49<22:33:39, 4.01s/it] {'loss': 0.4813, 'grad_norm': 0.7806065574483061, 'learning_rate': 9.926396043085564e-06, 'epoch': 0.08}
8%|▊ | 1837/22095 [3:05:51<20:30:18, 3.64s/it] {'loss': 0.4476, 'grad_norm': 0.7805909948187303, 'learning_rate': 9.926270695213638e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48263 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115663 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110238 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1838/22095 [3:05:57<24:38:35, 4.38s/it] {'loss': 0.4907, 'grad_norm': 0.5948129410350549, 'learning_rate': 9.926145241491283e-06, 'epoch': 0.08}
8%|▊ | 1839/22095 [3:06:01<23:09:34, 4.12s/it] {'loss': 0.4772, 'grad_norm': 0.8754548731717574, 'learning_rate': 9.926019681921196e-06, 'epoch': 0.08}
8%|▊ | 1840/22095 [3:06:05<23:18:50, 4.14s/it] {'loss': 0.4527, 'grad_norm': 0.9208126996590229, 'learning_rate': 9.925894016506076e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (44288 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46274 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46843 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42021 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43826 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1841/22095 [3:06:08<21:33:10, 3.83s/it] {'loss': 0.4071, 'grad_norm': 1.202719505119469, 'learning_rate': 9.925768245248622e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (59575 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1842/22095 [3:06:12<21:31:42, 3.83s/it] {'loss': 0.4702, 'grad_norm': 0.8513587887535525, 'learning_rate': 9.925642368151536e-06, 'epoch': 0.08}
8%|▊ | 1843/22095 [3:06:15<19:52:23, 3.53s/it] {'loss': 0.438, 'grad_norm': 0.8056976169739187, 'learning_rate': 9.925516385217524e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (71084 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1844/22095 [3:06:19<20:06:56, 3.58s/it] {'loss': 0.4457, 'grad_norm': 0.9725598362627249, 'learning_rate': 9.925390296449293e-06, 'epoch': 0.08}
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46636 > 40960) for 4 sample(s). Truncating to 40673 with 2 samples.
8%|▊ | 1845/22095 [3:06:23<20:59:30, 3.73s/it] {'loss': 0.429, 'grad_norm': 0.8121083302424683, 'learning_rate': 9.925264101849552e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1846/22095 [3:06:32<29:36:46, 5.26s/it] {'loss': 0.5364, 'grad_norm': 0.7744796561216578, 'learning_rate': 9.925137801421011e-06, 'epoch': 0.08}
8%|▊ | 1847/22095 [3:06:35<27:03:41, 4.81s/it] {'loss': 0.3983, 'grad_norm': 0.800946010472054, 'learning_rate': 9.925011395166387e-06, 'epoch': 0.08}
8%|▊ | 1848/22095 [3:06:39<25:50:13, 4.59s/it] {'loss': 0.51, 'grad_norm': 0.7276221221900879, 'learning_rate': 9.924884883088392e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 1, but got module 364
8%|▊ | 1849/22095 [3:06:49<34:12:34, 6.08s/it] {'loss': 0.5154, 'grad_norm': 0.4285744008499849, 'learning_rate': 9.924758265189746e-06, 'epoch': 0.08}
8%|▊ | 1850/22095 [3:06:58<38:30:26, 6.85s/it] {'loss': 0.5345, 'grad_norm': 0.4265868429049617, 'learning_rate': 9.924631541473174e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 364, but got module 1
8%|▊ | 1851/22095 [3:07:02<33:43:03, 6.00s/it] {'loss': 0.4562, 'grad_norm': 0.7860118715571093, 'learning_rate': 9.924504711941391e-06, 'epoch': 0.08}
8%|▊ | 1852/22095 [3:07:11<40:05:41, 7.13s/it] {'loss': 0.5181, 'grad_norm': 0.41053072199017704, 'learning_rate': 9.924377776597128e-06, 'epoch': 0.08}
8%|▊ | 1853/22095 [3:07:22<45:09:20, 8.03s/it] {'loss': 0.5228, 'grad_norm': 0.41695682875868034, 'learning_rate': 9.92425073544311e-06, 'epoch': 0.08}
Invalidate trace cache @ step 2: expected module 364, but got module 1
8%|▊ | 1854/22095 [3:07:25<38:07:20, 6.78s/it] {'loss': 0.4461, 'grad_norm': 0.8054863739728471, 'learning_rate': 9.924123588482068e-06, 'epoch': 0.08}
8%|▊ | 1855/22095 [3:07:29<33:28:13, 5.95s/it] {'loss': 0.4478, 'grad_norm': 0.7739457395814601, 'learning_rate': 9.923996335716732e-06, 'epoch': 0.08}
8%|▊ | 1856/22095 [3:07:32<28:30:33, 5.07s/it] {'loss': 0.4644, 'grad_norm': 0.7623799852036963, 'learning_rate': 9.92386897714984e-06, 'epoch': 0.08}
8%|▊ | 1857/22095 [3:07:37<27:10:21, 4.83s/it] {'loss': 0.4439, 'grad_norm': 0.7923643933942867, 'learning_rate': 9.923741512784124e-06, 'epoch': 0.08}
8%|▊ | 1858/22095 [3:07:40<24:32:23, 4.37s/it] {'loss': 0.4717, 'grad_norm': 0.7400779069636888, 'learning_rate': 9.923613942622326e-06, 'epoch': 0.08}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
8%|▊ | 1859/22095 [3:07:44<24:07:21, 4.29s/it] {'loss': 0.4649, 'grad_norm': 0.7773476459849654, 'learning_rate': 9.923486266667186e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (42923 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51798 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120631 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1860/22095 [3:07:47<21:36:39, 3.84s/it] {'loss': 0.4666, 'grad_norm': 0.7908800432339458, 'learning_rate': 9.923358484921447e-06, 'epoch': 0.08}
8%|▊ | 1861/22095 [3:07:50<20:58:08, 3.73s/it] {'loss': 0.4084, 'grad_norm': 0.7190776961384768, 'learning_rate': 9.923230597387856e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (40967 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46376 > 40960). Running this sequence through the model will result in indexing errors
8%|▊ | 1862/22095 [3:07:55<21:48:17, 3.88s/it] {'loss': 0.4366, 'grad_norm': 0.7009628564366115, 'learning_rate': 9.92310260406916e-06, 'epoch': 0.08}
8%|▊ | 1863/22095 [3:07:58<20:42:54, 3.69s/it] {'loss': 0.4222, 'grad_norm': 0.7328454796007702, 'learning_rate': 9.922974504968107e-06, 'epoch': 0.08}
Token indices sequence length is longer than the specified maximum sequence length for this model (124791 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68970 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93905 > 40960).
Running this sequence through the model will result in indexing errors 8%|▊ | 1864/22095 [3:08:01<19:56:25, 3.55s/it] {'loss': 0.4821, 'grad_norm': 0.8678391023098952, 'learning_rate': 9.922846300087454e-06, 'epoch': 0.08} 8%|▊ | 1864/22095 [3:08:01<19:56:25, 3.55s/it] 8%|▊ | 1865/22095 [3:08:04<19:19:09, 3.44s/it] {'loss': 0.4217, 'grad_norm': 0.8433282876733529, 'learning_rate': 9.922717989429954e-06, 'epoch': 0.08} 8%|▊ | 1865/22095 [3:08:04<19:19:09, 3.44s/it] 8%|▊ | 1866/22095 [3:08:07<18:33:44, 3.30s/it] {'loss': 0.5084, 'grad_norm': 0.7218042482897301, 'learning_rate': 9.922589572998362e-06, 'epoch': 0.08} 8%|▊ | 1866/22095 [3:08:07<18:33:44, 3.30s/it] 8%|▊ | 1867/22095 [3:08:10<18:29:10, 3.29s/it] {'loss': 0.4069, 'grad_norm': 0.8602340789259252, 'learning_rate': 9.922461050795438e-06, 'epoch': 0.08} 8%|▊ | 1867/22095 [3:08:10<18:29:10, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1868/22095 [3:08:20<28:44:22, 5.12s/it] {'loss': 0.5391, 'grad_norm': 0.8309640447672152, 'learning_rate': 9.922332422823945e-06, 'epoch': 0.08} 8%|▊ | 1868/22095 [3:08:20<28:44:22, 5.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1869/22095 [3:08:23<25:54:15, 4.61s/it] {'loss': 0.4768, 'grad_norm': 0.8035478084960879, 'learning_rate': 9.922203689086647e-06, 'epoch': 0.08} 8%|▊ | 1869/22095 [3:08:23<25:54:15, 4.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 1870/22095 [3:08:31<31:41:08, 5.64s/it] {'loss': 0.5419, 'grad_norm': 0.4148660540710434, 'learning_rate': 9.922074849586308e-06, 'epoch': 0.08} 8%|▊ | 1870/22095 [3:08:31<31:41:08, 5.64s/it] 8%|▊ | 1871/22095 [3:08:36<29:22:12, 5.23s/it] {'loss': 0.43, 'grad_norm': 0.8219306255439327, 'learning_rate': 9.921945904325697e-06, 'epoch': 0.08} 8%|▊ | 1871/22095 [3:08:36<29:22:12, 5.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 8%|▊ | 
1872/22095 [3:08:47<39:20:55, 7.00s/it] {'loss': 0.5292, 'grad_norm': 0.5312965844343916, 'learning_rate': 9.921816853307587e-06, 'epoch': 0.08} 8%|▊ | 1872/22095 [3:08:47<39:20:55, 7.00s/it] 8%|▊ | 1873/22095 [3:08:51<33:54:54, 6.04s/it] {'loss': 0.4661, 'grad_norm': 0.7367383375649544, 'learning_rate': 9.921687696534747e-06, 'epoch': 0.08} 8%|▊ | 1873/22095 [3:08:51<33:54:54, 6.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (102551 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47655 > 40960). Running this sequence through the model will result in indexing errors 8%|▊ | 1874/22095 [3:08:54<29:56:04, 5.33s/it] {'loss': 0.4288, 'grad_norm': 0.7950765321616564, 'learning_rate': 9.921558434009955e-06, 'epoch': 0.08} 8%|▊ | 1874/22095 [3:08:54<29:56:04, 5.33s/it] 8%|▊ | 1875/22095 [3:08:59<28:25:31, 5.06s/it] {'loss': 0.4307, 'grad_norm': 0.7644887707411738, 'learning_rate': 9.921429065735988e-06, 'epoch': 0.08} 8%|▊ | 1875/22095 [3:08:59<28:25:31, 5.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 8%|▊ | 1876/22095 [3:09:02<25:35:34, 4.56s/it] {'loss': 0.4117, 'grad_norm': 0.7419573945037351, 'learning_rate': 9.921299591715624e-06, 'epoch': 0.08} 8%|▊ | 1876/22095 [3:09:02<25:35:34, 4.56s/it] 8%|▊ | 1877/22095 [3:09:07<26:03:46, 4.64s/it] {'loss': 0.4789, 'grad_norm': 0.7484051985152155, 'learning_rate': 9.921170011951647e-06, 'epoch': 0.08} 8%|▊ | 1877/22095 [3:09:07<26:03:46, 4.64s/it] 8%|▊ | 1878/22095 [3:09:10<23:49:53, 4.24s/it] {'loss': 0.4159, 'grad_norm': 0.7464732006200073, 'learning_rate': 9.921040326446843e-06, 
'epoch': 0.08} 8%|▊ | 1878/22095 [3:09:10<23:49:53, 4.24s/it] 9%|▊ | 1879/22095 [3:09:13<22:06:23, 3.94s/it] {'loss': 0.4521, 'grad_norm': 0.7313580668980304, 'learning_rate': 9.920910535203994e-06, 'epoch': 0.09} 9%|▊ | 1879/22095 [3:09:13<22:06:23, 3.94s/it] 9%|▊ | 1880/22095 [3:09:16<20:26:03, 3.64s/it] {'loss': 0.4779, 'grad_norm': 0.8321482272182403, 'learning_rate': 9.92078063822589e-06, 'epoch': 0.09} 9%|▊ | 1880/22095 [3:09:16<20:26:03, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46959 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83345 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (128434 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1881/22095 [3:09:20<21:03:26, 3.75s/it] {'loss': 0.3927, 'grad_norm': 0.6967085482083333, 'learning_rate': 9.920650635515325e-06, 'epoch': 0.09} 9%|▊ | 1881/22095 [3:09:20<21:03:26, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▊ | 1882/22095 [3:09:30<30:44:49, 5.48s/it] {'loss': 0.5055, 'grad_norm': 0.9794195771812104, 'learning_rate': 9.92052052707509e-06, 'epoch': 0.09} 9%|▊ | 1882/22095 [3:09:30<30:44:49, 5.48s/it] 9%|▊ | 1883/22095 [3:09:34<28:04:20, 5.00s/it] {'loss': 0.4186, 'grad_norm': 0.6899236636306616, 'learning_rate': 9.92039031290798e-06, 'epoch': 0.09} 9%|▊ | 1883/22095 [3:09:34<28:04:20, 5.00s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▊ | 1884/22095 [3:09:37<24:51:31, 4.43s/it] {'loss': 0.4871, 'grad_norm': 0.725913617196428, 'learning_rate': 9.920259993016797e-06, 'epoch': 0.09} 9%|▊ | 1884/22095 [3:09:37<24:51:31, 4.43s/it] 9%|▊ | 
1885/22095 [3:09:40<22:39:53, 4.04s/it] {'loss': 0.4206, 'grad_norm': 0.8332609166219846, 'learning_rate': 9.920129567404335e-06, 'epoch': 0.09} 9%|▊ | 1885/22095 [3:09:40<22:39:53, 4.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63668 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91223 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1886/22095 [3:09:43<21:24:21, 3.81s/it] {'loss': 0.4109, 'grad_norm': 0.7069291434494992, 'learning_rate': 9.9199990360734e-06, 'epoch': 0.09} 9%|▊ | 1886/22095 [3:09:43<21:24:21, 3.81s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Token indices sequence length is longer than the specified maximum sequence length for this model (162655 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▊ | 1887/22095 [3:09:46<20:20:42, 3.62s/it] {'loss': 0.3979, 'grad_norm': 0.7991002911044793, 'learning_rate': 9.919868399026797e-06, 'epoch': 0.09} 9%|▊ | 1887/22095 [3:09:46<20:20:42, 3.62s/it] 9%|▊ | 1888/22095 [3:09:50<19:35:18, 3.49s/it] {'loss': 0.4882, 'grad_norm': 0.8346063775026414, 'learning_rate': 9.919737656267335e-06, 'epoch': 0.09} 9%|▊ | 1888/22095 [3:09:50<19:35:18, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▊ | 1889/22095 [3:10:00<30:48:49, 5.49s/it] {'loss': 0.522, 'grad_norm': 0.6406919207588929, 'learning_rate': 9.919606807797817e-06, 'epoch': 0.09} 9%|▊ | 1889/22095 [3:10:00<30:48:49, 5.49s/it] 9%|▊ | 1890/22095 [3:10:03<26:59:43, 4.81s/it] {'loss': 0.4645, 'grad_norm': 0.7523586431196658, 'learning_rate': 9.919475853621058e-06, 'epoch': 0.09} 9%|▊ | 1890/22095 [3:10:03<26:59:43, 4.81s/it] 9%|▊ | 1891/22095 [3:10:06<23:48:38, 4.24s/it] {'loss': 0.4176, 'grad_norm': 0.6798200716411895, 'learning_rate': 9.919344793739874e-06, 'epoch': 0.09} 9%|▊ | 1891/22095 [3:10:06<23:48:38, 4.24s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 9%|▊ | 1892/22095 [3:10:10<22:44:16, 4.05s/it] {'loss': 0.441, 'grad_norm': 0.736139778725739, 'learning_rate': 9.919213628157078e-06, 'epoch': 0.09} 9%|▊ | 1892/22095 [3:10:10<22:44:16, 4.05s/it] 9%|▊ | 1893/22095 [3:10:13<21:24:01, 3.81s/it] {'loss': 0.4291, 'grad_norm': 0.6998787747239904, 'learning_rate': 9.91908235687549e-06, 'epoch': 0.09} 9%|▊ | 1893/22095 [3:10:13<21:24:01, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▊ | 1894/22095 [3:10:20<27:38:59, 4.93s/it] {'loss': 0.5303, 'grad_norm': 0.5684390273842443, 'learning_rate': 9.918950979897928e-06, 'epoch': 0.09} 9%|▊ 
| 1894/22095 [3:10:20<27:38:59, 4.93s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8955477 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6312, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2\nB. 2.5\nC. 4.5\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
9%|▊ | 1895/22095 [3:10:24<24:46:09, 4.41s/it] {'loss': 0.4609, 'grad_norm': 0.7239694026683012, 'learning_rate': 9.91881949722722e-06, 'epoch': 0.09} 9%|▊ | 1895/22095 [3:10:24<24:46:09, 4.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52636 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1896/22095 [3:10:27<24:02:57, 4.29s/it] {'loss': 0.4247, 'grad_norm': 0.7182694255332468, 'learning_rate': 9.918687908866185e-06, 'epoch': 0.09} 9%|▊ | 1896/22095 [3:10:28<24:02:57, 4.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41957 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119359 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106385 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64540 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65691 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1897/22095 [3:10:30<21:44:33, 3.88s/it] {'loss': 0.4404, 'grad_norm': 0.7599687280674057, 'learning_rate': 9.918556214817655e-06, 'epoch': 0.09} 9%|▊ | 1897/22095 [3:10:30<21:44:33, 3.88s/it] 9%|▊ | 1898/22095 [3:10:34<21:46:22, 3.88s/it] {'loss': 0.4639, 'grad_norm': 0.7937333688188135, 'learning_rate': 9.918424415084458e-06, 'epoch': 0.09} 9%|▊ | 1898/22095 [3:10:34<21:46:22, 3.88s/it] 9%|▊ | 1899/22095 [3:10:37<20:17:36, 3.62s/it] {'loss': 0.4284, 'grad_norm': 0.8085633869903383, 'learning_rate': 9.918292509669426e-06, 'epoch': 0.09} 9%|▊ | 1899/22095 [3:10:37<20:17:36, 3.62s/it] 9%|▊ | 1900/22095 [3:10:41<20:05:33, 3.58s/it] {'loss': 0.4654, 'grad_norm': 0.776740327788825, 'learning_rate': 9.918160498575394e-06, 'epoch': 0.09} 9%|▊ | 1900/22095 [3:10:41<20:05:33, 3.58s/it] 9%|▊ | 1901/22095 [3:10:44<18:53:37, 3.37s/it] {'loss': 0.422, 'grad_norm': 0.7632130970803997, 'learning_rate': 9.918028381805196e-06, 'epoch': 0.09} 9%|▊ | 1901/22095 [3:10:44<18:53:37, 3.37s/it] 9%|▊ | 1902/22095 [3:10:47<19:34:59, 3.49s/it] {'loss': 0.4558, 'grad_norm': 0.7443778623445391, 'learning_rate': 9.917896159361674e-06, 'epoch': 0.09} 9%|▊ | 1902/22095 [3:10:47<19:34:59, 3.49s/it] 9%|▊ | 1903/22095 [3:10:50<18:42:24, 3.34s/it] {'loss': 0.4543, 'grad_norm': 0.7202079960261414, 'learning_rate': 9.917763831247667e-06, 'epoch': 0.09} 9%|▊ | 1903/22095 
[3:10:50<18:42:24, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44661 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116486 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76919 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79788 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1904/22095 [3:10:58<25:03:14, 4.47s/it] {'loss': 0.5185, 'grad_norm': 0.7044512276975228, 'learning_rate': 9.91763139746602e-06, 'epoch': 0.09} 9%|▊ | 1904/22095 [3:10:58<25:03:14, 4.47s/it] 9%|▊ | 1905/22095 [3:11:08<34:33:08, 6.16s/it] {'loss': 0.5281, 'grad_norm': 0.5471247904879195, 'learning_rate': 9.917498858019577e-06, 'epoch': 0.09} 9%|▊ | 1905/22095 [3:11:08<34:33:08, 6.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 9%|▊ | 1906/22095 [3:11:11<29:36:00, 5.28s/it] {'loss': 0.3936, 'grad_norm': 0.7310463517891044, 'learning_rate': 9.917366212911187e-06, 'epoch': 0.09} 9%|▊ | 1906/22095 [3:11:11<29:36:00, 5.28s/it] 9%|▊ | 1907/22095 [3:11:14<26:41:52, 4.76s/it] {'loss': 0.4251, 'grad_norm': 0.7507045212274692, 'learning_rate': 9.917233462143698e-06, 'epoch': 0.09} 9%|▊ | 1907/22095 [3:11:14<26:41:52, 4.76s/it] 9%|▊ | 1908/22095 [3:11:17<23:38:39, 4.22s/it] {'loss': 0.4258, 'grad_norm': 0.7295758228462808, 'learning_rate': 9.917100605719968e-06, 'epoch': 0.09} 9%|▊ | 1908/22095 [3:11:17<23:38:39, 4.22s/it] 9%|▊ | 1909/22095 [3:11:21<22:07:04, 3.94s/it] {'loss': 0.4553, 'grad_norm': 
0.7133428447258995, 'learning_rate': 9.916967643642844e-06, 'epoch': 0.09} 9%|▊ | 1909/22095 [3:11:21<22:07:04, 3.94s/it] 9%|▊ | 1910/22095 [3:11:24<21:23:52, 3.82s/it] {'loss': 0.417, 'grad_norm': 0.7734628826971062, 'learning_rate': 9.916834575915186e-06, 'epoch': 0.09} 9%|▊ | 1910/22095 [3:11:24<21:23:52, 3.82s/it] 9%|▊ | 1911/22095 [3:11:28<21:20:44, 3.81s/it] {'loss': 0.4119, 'grad_norm': 0.6985026626342046, 'learning_rate': 9.916701402539857e-06, 'epoch': 0.09} 9%|▊ | 1911/22095 [3:11:28<21:20:44, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (46007 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47927 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1912/22095 [3:11:38<31:35:50, 5.64s/it] {'loss': 0.5522, 'grad_norm': 1.1633875585458455, 'learning_rate': 9.916568123519713e-06, 'epoch': 0.09} 9%|▊ | 1912/22095 [3:11:38<31:35:50, 5.64s/it] 9%|▊ | 1913/22095 [3:11:42<28:18:44, 5.05s/it] {'loss': 0.4336, 'grad_norm': 0.8129193812189842, 'learning_rate': 9.916434738857621e-06, 'epoch': 0.09} 9%|▊ | 1913/22095 [3:11:42<28:18:44, 5.05s/it] 9%|▊ | 1914/22095 [3:11:46<26:54:05, 4.80s/it] {'loss': 0.4838, 'grad_norm': 0.7459738133804278, 'learning_rate': 9.916301248556446e-06, 'epoch': 0.09} 9%|▊ | 1914/22095 [3:11:46<26:54:05, 4.80s/it] 9%|▊ | 1915/22095 [3:11:50<25:34:00, 4.56s/it] {'loss': 0.4234, 'grad_norm': 0.7379164873726534, 'learning_rate': 9.916167652619058e-06, 'epoch': 0.09} 9%|▊ | 1915/22095 [3:11:50<25:34:00, 4.56s/it] 9%|▊ | 1916/22095 [3:11:53<22:46:04, 4.06s/it] {'loss': 0.4448, 'grad_norm': 0.8413281330999343, 'learning_rate': 9.916033951048322e-06, 'epoch': 0.09} 9%|▊ | 1916/22095 [3:11:53<22:46:04, 4.06s/it] 9%|▊ | 1917/22095 [3:11:56<21:45:46, 
3.88s/it] {'loss': 0.3915, 'grad_norm': 0.7615897239798799, 'learning_rate': 9.915900143847119e-06, 'epoch': 0.09} 9%|▊ | 1917/22095 [3:11:56<21:45:46, 3.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▊ | 1918/22095 [3:12:00<21:43:45, 3.88s/it] {'loss': 0.466, 'grad_norm': 0.7395346710437531, 'learning_rate': 9.915766231018317e-06, 'epoch': 0.09} 9%|▊ | 1918/22095 [3:12:00<21:43:45, 3.88s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047572 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2\nB. 2.5\nC. 4.5\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
9%|▊ | 1919/22095 [3:12:04<22:15:36, 3.97s/it] {'loss': 0.4975, 'grad_norm': 0.7461409358070803, 'learning_rate': 9.915632212564798e-06, 'epoch': 0.09} 9%|▊ | 1919/22095 [3:12:04<22:15:36, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▊ | 1920/22095 [3:12:14<31:14:36, 5.58s/it] {'loss': 0.5407, 'grad_norm': 0.8443578605281495, 'learning_rate': 9.91549808848944e-06, 'epoch': 0.09} 9%|▊ | 1920/22095 [3:12:14<31:14:36, 5.58s/it] 9%|▊ | 1921/22095 [3:12:18<29:00:47, 5.18s/it] {'loss': 0.4272, 'grad_norm': 0.8118802915223485, 'learning_rate': 9.915363858795125e-06, 'epoch': 0.09} 9%|▊ | 1921/22095 [3:12:18<29:00:47, 5.18s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308382 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2c.dsiPnD8KJjSspbXXbbEXXa_!!3249380751.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能输出一下图片中的所有文字吗?'}, {'from': 'gpt', 'value': '图中所有文字:\n第二件\n1元\n送\n俩用型挂绳\n指环扣支架\n男人\n可以不帅\n可以没钱\ncool\n但是\n一定要骚!'}]}
9%|▊ | 1922/22095 [3:12:21<25:19:12, 4.52s/it] {'loss': 0.4269, 'grad_norm': 0.7345275937375815, 'learning_rate': 9.915229523484736e-06, 'epoch': 0.09} 9%|▊ | 1922/22095 [3:12:21<25:19:12, 4.52s/it] 9%|▊ | 1923/22095 [3:12:24<23:15:05, 4.15s/it] {'loss': 0.4409, 'grad_norm': 0.7122964942038054, 'learning_rate': 9.915095082561161e-06, 'epoch': 0.09} 9%|▊ | 1923/22095 [3:12:24<23:15:05, 4.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68079 > 40960). Running this sequence through the model will result in indexing errors 9%|▊ | 1924/22095 [3:12:28<22:43:32, 4.06s/it] {'loss': 0.4316, 'grad_norm': 0.7396962298691463, 'learning_rate': 9.914960536027289e-06, 'epoch': 0.09} 9%|▊ | 1924/22095 [3:12:28<22:43:32, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45279 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45591 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70934 > 40960).
Running this sequence through the model will result in indexing errors 9%|▊ | 1925/22095 [3:12:31<21:34:08, 3.85s/it] {'loss': 0.4782, 'grad_norm': 0.7307867562497987, 'learning_rate': 9.91482588388601e-06, 'epoch': 0.09} 9%|▊ | 1925/22095 [3:12:31<21:34:08, 3.85s/it] 9%|▊ | 1926/22095 [3:12:34<20:19:01, 3.63s/it] {'loss': 0.4382, 'grad_norm': 0.7484534005439776, 'learning_rate': 9.914691126140216e-06, 'epoch': 0.09} 9%|▊ | 1926/22095 [3:12:34<20:19:01, 3.63s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▊ | 1927/22095 [3:12:38<20:37:19, 3.68s/it] {'loss': 0.5151, 'grad_norm': 0.825731600659895, 'learning_rate': 9.914556262792805e-06, 'epoch': 0.09} 9%|▊ | 1927/22095 [3:12:38<20:37:19, 3.68s/it] 9%|▊ | 1928/22095 [3:12:41<19:07:13, 3.41s/it] {'loss': 0.4662, 'grad_norm': 0.7123185730894329, 'learning_rate': 9.914421293846675e-06, 'epoch': 0.09} 9%|▊ | 1928/22095 [3:12:41<19:07:13, 3.41s/it] 9%|▊ | 1929/22095 [3:12:45<19:59:32, 3.57s/it] {'loss': 0.4698, 'grad_norm': 0.7781423627660821, 'learning_rate': 9.914286219304724e-06, 'epoch': 0.09} 9%|▊ | 1929/22095 [3:12:45<19:59:32, 3.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▊ | 1930/22095 [3:12:48<19:00:02, 3.39s/it] {'loss': 0.3905, 'grad_norm': 0.6957365833355711, 'learning_rate': 9.914151039169855e-06, 'epoch': 0.09} 9%|▊ | 1930/22095 [3:12:48<19:00:02, 3.39s/it] 9%|▊ | 1931/22095 [3:12:52<20:08:22, 3.60s/it] {'loss': 0.4447, 'grad_norm': 0.6685636570567208, 'learning_rate': 9.914015753444973e-06, 'epoch': 0.09} 9%|▊ | 1931/22095 [3:12:52<20:08:22, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▊ | 1932/22095 [3:12:59<26:11:41, 4.68s/it] {'loss': 0.5551, 'grad_norm': 1.0053401661334387, 'learning_rate': 9.913880362132984e-06, 'epoch': 0.09} 9%|▊ | 1932/22095 [3:12:59<26:11:41, 4.68s/it] 9%|▊ | 1933/22095 [3:13:10<36:32:26, 
6.52s/it] {'loss': 0.5343, 'grad_norm': 0.6118565640438647, 'learning_rate': 9.913744865236798e-06, 'epoch': 0.09} 9%|▊ | 1933/22095 [3:13:10<36:32:26, 6.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 9%|▉ | 1934/22095 [3:13:15<33:15:27, 5.94s/it] {'loss': 0.4909, 'grad_norm': 1.0514009365642782, 'learning_rate': 9.913609262759326e-06, 'epoch': 0.09} 9%|▉ | 1934/22095 [3:13:15<33:15:27, 5.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (131736 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87372 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (133304 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 1935/22095 [3:13:18<29:21:38, 5.24s/it] {'loss': 0.4809, 'grad_norm': 0.855627611367071, 'learning_rate': 9.913473554703483e-06, 'epoch': 0.09} 9%|▉ | 1935/22095 [3:13:18<29:21:38, 5.24s/it] 9%|▉ | 1936/22095 [3:13:22<26:55:43, 4.81s/it] {'loss': 0.4447, 'grad_norm': 0.7468569420197058, 'learning_rate': 9.913337741072183e-06, 'epoch': 0.09} 9%|▉ | 1936/22095 [3:13:22<26:55:43, 4.81s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924297 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nIn which state is the Columbia River Dam? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Washington state.\nThe text does not mention the Columbia River Dam specifically, but it does mention the state of Washington, which is where the dam is located.'}]}
9%|▉ | 1937/22095 [3:13:25<23:39:15, 4.22s/it] {'loss': 0.4235, 'grad_norm': 0.7679281923297259, 'learning_rate': 9.913201821868345e-06, 'epoch': 0.09} 9%|▉ | 1937/22095 [3:13:25<23:39:15, 4.22s/it] 9%|▉ | 1938/22095 [3:13:28<21:50:18, 3.90s/it] {'loss': 0.3993, 'grad_norm': 0.695156803389326, 'learning_rate': 9.913065797094893e-06, 'epoch': 0.09} 9%|▉ | 1938/22095 [3:13:28<21:50:18, 3.90s/it] 9%|▉ | 1939/22095 [3:13:31<20:49:26, 3.72s/it] {'loss': 0.422, 'grad_norm': 0.7702685215584218, 'learning_rate': 9.912929666754741e-06, 'epoch': 0.09} 9%|▉ | 1939/22095 [3:13:31<20:49:26, 3.72s/it] 9%|▉ | 1940/22095 [3:13:35<20:10:14, 3.60s/it] {'loss': 0.4281, 'grad_norm': 0.7598651277864084, 'learning_rate': 9.912793430850822e-06, 'epoch': 0.09} 9%|▉ | 1940/22095
[3:13:35<20:10:14, 3.60s/it] 9%|▉ | 1941/22095 [3:13:38<20:08:04, 3.60s/it] {'loss': 0.4467, 'grad_norm': 0.6621538356949686, 'learning_rate': 9.912657089386062e-06, 'epoch': 0.09} 9%|▉ | 1941/22095 [3:13:38<20:08:04, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41636 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74249 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 1942/22095 [3:13:41<18:40:31, 3.34s/it] {'loss': 0.4615, 'grad_norm': 0.738213356518308, 'learning_rate': 9.912520642363387e-06, 'epoch': 0.09} 9%|▉ | 1942/22095 [3:13:41<18:40:31, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 1943/22095 [3:13:50<28:57:28, 5.17s/it] {'loss': 0.6037, 'grad_norm': 2.1635848608348716, 'learning_rate': 9.912384089785731e-06, 'epoch': 0.09} 9%|▉ | 1943/22095 [3:13:50<28:57:28, 5.17s/it] 9%|▉ | 1944/22095 [3:13:54<25:57:10, 4.64s/it] {'loss': 0.4801, 'grad_norm': 1.0352199439365746, 'learning_rate': 9.91224743165603e-06, 'epoch': 0.09} 9%|▉ | 1944/22095 [3:13:54<25:57:10, 4.64s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30505.png 2025-08-27 19:11:50.880442 load time: 1642.18 ms 9%|▉ | 1945/22095 [3:13:58<25:17:48, 4.52s/it] {'loss': 0.5165, 'grad_norm': 0.8709803651125759, 'learning_rate': 9.912110667977218e-06, 'epoch': 0.09} 9%|▉ | 1945/22095 [3:13:58<25:17:48, 4.52s/it] 9%|▉ | 1946/22095 [3:14:01<22:19:55, 3.99s/it] {'loss': 0.4637, 'grad_norm': 0.8736736235706593, 'learning_rate': 9.911973798752232e-06, 'epoch': 0.09} 9%|▉ | 1946/22095 [3:14:01<22:19:55, 3.99s/it] 9%|▉ | 1947/22095 [3:14:04<21:30:25, 3.84s/it] {'loss': 0.4777, 'grad_norm': 0.8532192502651776, 'learning_rate': 9.911836823984018e-06, 'epoch': 0.09} 9%|▉ | 1947/22095 [3:14:04<21:30:25, 3.84s/it] 9%|▉ | 1948/22095 [3:14:07<19:49:01, 3.54s/it] {'loss': 0.4497, 'grad_norm': 0.7883996558296882, 'learning_rate': 9.911699743675513e-06, 'epoch': 0.09} 9%|▉ | 1948/22095 [3:14:07<19:49:01, 3.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 1949/22095 [3:14:10<18:32:49, 3.31s/it] {'loss': 0.4458, 'grad_norm': 0.768457003127408, 'learning_rate': 9.911562557829668e-06, 'epoch': 0.09} 9%|▉ | 1949/22095 [3:14:10<18:32:49, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [214, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8374655 in VC:s3://internvl-moe-sft-data/. Exception: Image size [214, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 41431, 'image': 'vrdu_table_final_2/astro-ph.CO/12d6880b-3266-4425-9bd9-400af319237c.png', 'image_wh': [[214, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{c}\n\\epsfxsize=.75\\linewidth\n\\epsfbox{red-bin-2.eps}\n\\vspace*{-4ex}\n\\end{tabular}\n```"}]} 9%|▉ | 1950/22095 [3:14:20<30:27:06, 5.44s/it] {'loss': 0.521, 'grad_norm': 1.2044000998025475, 'learning_rate': 9.911425266449428e-06, 'epoch': 0.09} 9%|▉ | 1950/22095 [3:14:20<30:27:06, 5.44s/it] 9%|▉ | 1951/22095 [3:14:24<26:57:24, 4.82s/it] {'loss': 0.4367, 'grad_norm': 0.8025949214413347, 'learning_rate': 9.911287869537744e-06, 'epoch': 0.09} 9%|▉ | 1951/22095 [3:14:24<26:57:24, 4.82s/it] 9%|▉ | 1952/22095 [3:14:28<25:42:00, 4.59s/it] {'loss': 0.4638, 'grad_norm': 0.8465178290443464, 'learning_rate': 9.911150367097566e-06, 'epoch': 0.09} 9%|▉ | 1952/22095 [3:14:28<25:42:00, 4.59s/it] 9%|▉ | 1953/22095 [3:14:31<23:37:11, 4.22s/it] {'loss': 0.438, 'grad_norm': 0.8250438554174135, 'learning_rate': 9.911012759131852e-06, 'epoch': 0.09} 9%|▉ | 1953/22095 [3:14:31<23:37:11, 4.22s/it] 9%|▉ | 1954/22095 [3:14:34<21:45:05, 3.89s/it] {'loss': 0.4175, 'grad_norm': 0.7497252987029995, 'learning_rate': 9.910875045643555e-06, 'epoch': 0.09} 9%|▉ | 1954/22095 [3:14:34<21:45:05, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61423 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43936 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 1955/22095 [3:14:37<20:40:40, 3.70s/it] {'loss': 0.4404, 'grad_norm': 0.7660876420547476, 'learning_rate': 9.910737226635636e-06, 'epoch': 0.09} 9%|▉ | 1955/22095 [3:14:37<20:40:40, 3.70s/it] 9%|▉ | 1956/22095 [3:14:40<19:28:44, 3.48s/it] {'loss': 0.4182, 'grad_norm': 0.7384189585829709, 'learning_rate': 9.910599302111057e-06, 'epoch': 0.09} 9%|▉ | 1956/22095 [3:14:40<19:28:44, 3.48s/it] 9%|▉ | 1957/22095 [3:14:44<20:01:21, 3.58s/it] {'loss': 0.4084, 'grad_norm': 0.7495072122782613, 'learning_rate': 9.91046127207278e-06, 'epoch': 0.09} 9%|▉ | 1957/22095 [3:14:44<20:01:21, 3.58s/it] 9%|▉ | 1958/22095 [3:14:47<18:59:50, 3.40s/it] {'loss': 0.4389, 'grad_norm': 0.7627524376716129, 'learning_rate': 9.910323136523773e-06, 'epoch': 0.09} 9%|▉ | 1958/22095 [3:14:47<18:59:50, 3.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348833 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 15503, 'image': 'vrdu_table_final_2/astro-ph.CO/3e5900db-d593-4483-8400-80e8ebf0b35d.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 9%|▉ | 1959/22095 [3:14:50<18:41:48, 3.34s/it] {'loss': 0.4454, 'grad_norm': 0.7313100522135206, 'learning_rate': 9.910184895467001e-06, 'epoch': 0.09} 9%|▉ | 1959/22095 [3:14:50<18:41:48, 3.34s/it] 9%|▉ | 1960/22095 [3:14:54<19:16:21, 3.45s/it] {'loss': 0.4231, 'grad_norm': 0.712201543133738, 'learning_rate': 9.910046548905437e-06, 'epoch': 0.09} 9%|▉ | 1960/22095 [3:14:54<19:16:21, 3.45s/it] 9%|▉ | 1961/22095 [3:14:57<18:54:56, 3.38s/it] {'loss': 0.4943, 'grad_norm': 0.7388104424423865, 'learning_rate': 9.909908096842053e-06, 'epoch': 0.09} 9%|▉ | 1961/22095 [3:14:57<18:54:56, 3.38s/it] 9%|▉ | 1962/22095 [3:15:00<17:55:02, 3.20s/it] {'loss': 0.473, 'grad_norm': 0.7075371931646569, 'learning_rate': 9.909769539279823e-06, 'epoch': 0.09} 9%|▉ | 1962/22095 [3:15:00<17:55:02, 3.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 1963/22095 [3:15:08<25:08:22, 4.50s/it] {'loss': 0.5189, 'grad_norm': 0.6600754349658474, 'learning_rate': 9.909630876221726e-06, 'epoch': 0.09} 9%|▉ | 1963/22095 [3:15:08<25:08:22, 4.50s/it] 9%|▉ | 1964/22095 [3:15:11<24:07:35, 4.31s/it] {'loss': 0.4538, 'grad_norm': 0.7682343599469311, 'learning_rate': 9.909492107670737e-06, 'epoch': 0.09} 9%|▉ | 1964/22095 [3:15:12<24:07:35, 4.31s/it] 9%|▉ | 1965/22095 [3:15:14<21:53:17, 3.91s/it] {'loss': 0.4521, 'grad_norm': 0.7616573920258807, 'learning_rate': 9.909353233629844e-06, 'epoch': 0.09} 9%|▉ | 1965/22095 [3:15:14<21:53:17, 3.91s/it] 9%|▉ | 1966/22095 [3:15:18<21:18:41, 3.81s/it] {'loss': 0.5001, 'grad_norm': 0.7655653986533902, 'learning_rate': 
9.909214254102027e-06, 'epoch': 0.09} 9%|▉ | 1966/22095 [3:15:18<21:18:41, 3.81s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 1967/22095 [3:15:21<19:57:36, 3.57s/it] {'loss': 0.4719, 'grad_norm': 0.7915882076561631, 'learning_rate': 9.909075169090275e-06, 'epoch': 0.09} 9%|▉ | 1967/22095 [3:15:21<19:57:36, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047851 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 6cm\nB. 2cm\nC. 3cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 9%|▉ | 1968/22095 [3:15:24<18:36:43, 3.33s/it] {'loss': 0.4376, 'grad_norm': 0.8313921499635212, 'learning_rate': 9.90893597859757e-06, 'epoch': 0.09} 9%|▉ | 1968/22095 [3:15:24<18:36:43, 3.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45013 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116723 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 1969/22095 [3:15:27<18:13:21, 3.26s/it] {'loss': 0.4728, 'grad_norm': 0.7865427117270996, 'learning_rate': 9.908796682626911e-06, 'epoch': 0.09} 9%|▉ | 1969/22095 [3:15:27<18:13:21, 3.26s/it] 9%|▉ | 1970/22095 [3:15:30<17:27:31, 3.12s/it] {'loss': 0.4036, 'grad_norm': 0.7169640304743307, 'learning_rate': 9.908657281181289e-06, 'epoch': 0.09} 9%|▉ | 1970/22095 [3:15:30<17:27:31, 3.12s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (113699 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50317 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 1971/22095 [3:15:33<17:49:56, 3.19s/it] {'loss': 0.4504, 'grad_norm': 0.7259107241970512, 'learning_rate': 9.908517774263694e-06, 'epoch': 0.09} 9%|▉ | 1971/22095 [3:15:33<17:49:56, 3.19s/it] 9%|▉ | 1972/22095 [3:15:37<19:05:10, 3.41s/it] {'loss': 0.4208, 'grad_norm': 0.8039701538287216, 'learning_rate': 9.90837816187713e-06, 'epoch': 0.09} 9%|▉ | 1972/22095 [3:15:37<19:05:10, 3.41s/it] 9%|▉ | 1973/22095 [3:15:40<18:08:08, 3.24s/it] {'loss': 0.4544, 'grad_norm': 0.8988560217541354, 'learning_rate': 9.908238444024593e-06, 'epoch': 0.09} 9%|▉ | 1973/22095 [3:15:40<18:08:08, 3.24s/it] 9%|▉ | 1974/22095 [3:15:44<19:49:57, 3.55s/it] {'loss': 0.459, 'grad_norm': 0.7390784236507443, 'learning_rate': 9.908098620709088e-06, 'epoch': 0.09} 9%|▉ | 1974/22095 [3:15:44<19:49:57, 3.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise 
ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047229 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 12cm\nB. 6cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 9%|▉ | 1975/22095 [3:15:48<20:14:26, 3.62s/it] {'loss': 0.4309, 'grad_norm': 0.71640823902626, 'learning_rate': 9.907958691933616e-06, 'epoch': 0.09} 9%|▉ | 1975/22095 [3:15:48<20:14:26, 3.62s/it] 9%|▉ | 1976/22095 [3:15:51<19:49:02, 3.55s/it] {'loss': 0.4342, 'grad_norm': 0.7413322149073592, 'learning_rate': 9.907818657701185e-06, 'epoch': 0.09} 9%|▉ | 1976/22095 [3:15:51<19:49:02, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 1977/22095 [3:16:01<29:40:17, 5.31s/it] {'loss': 0.5497, 'grad_norm': 0.7103835938842725, 'learning_rate': 9.907678518014805e-06, 'epoch': 0.09} 9%|▉ | 1977/22095 [3:16:01<29:40:17, 5.31s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 9%|▉ | 1978/22095 [3:16:04<25:59:23, 4.65s/it] {'loss': 0.4103, 'grad_norm': 0.8569735720494202, 'learning_rate': 9.907538272877487e-06, 'epoch': 0.09} 9%|▉ | 1978/22095 [3:16:04<25:59:23, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 1979/22095 [3:16:13<33:38:36, 6.02s/it] {'loss': 0.5308, 'grad_norm': 0.4764015148465107, 'learning_rate': 9.907397922292244e-06, 'epoch': 0.09} 9%|▉ | 1979/22095 [3:16:13<33:38:36, 6.02s/it]Rank 0: Number of image tokens 0 does not match number of 
images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 1980/22095 [3:16:17<29:41:04, 5.31s/it] {'loss': 0.4092, 'grad_norm': 0.7423437895217264, 'learning_rate': 9.90725746626209e-06, 'epoch': 0.09} 9%|▉ | 1980/22095 [3:16:17<29:41:04, 5.31s/it] 9%|▉ | 1981/22095 [3:16:20<25:50:25, 4.62s/it] {'loss': 0.4369, 'grad_norm': 0.7474565450120157, 'learning_rate': 9.907116904790046e-06, 'epoch': 0.09} 9%|▉ | 1981/22095 [3:16:20<25:50:25, 4.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948710 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71863, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB段上有两个点C和D,AD=\\ frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nA. 4\nB. 1\nC. 2\nD. 
3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 9%|▉ | 1982/22095 [3:16:23<23:01:34, 4.12s/it] {'loss': 0.4647, 'grad_norm': 0.7462368731786069, 'learning_rate': 9.90697623787913e-06, 'epoch': 0.09} 9%|▉ | 1982/22095 [3:16:23<23:01:34, 4.12s/it] 9%|▉ | 1983/22095 [3:16:27<22:58:31, 4.11s/it] {'loss': 0.4692, 'grad_norm': 0.7712449048788309, 'learning_rate': 9.906835465532364e-06, 'epoch': 0.09} 9%|▉ | 1983/22095 [3:16:27<22:58:31, 4.11s/it] 9%|▉ | 1984/22095 [3:16:30<21:16:44, 3.81s/it] {'loss': 0.3805, 'grad_norm': 0.711234265516189, 'learning_rate': 9.906694587752777e-06, 'epoch': 0.09} 9%|▉ | 1984/22095 [3:16:30<21:16:44, 3.81s/it] 9%|▉ | 1985/22095 [3:16:33<20:38:42, 3.70s/it] {'loss': 0.3879, 'grad_norm': 0.8239657790937962, 'learning_rate': 9.906553604543392e-06, 'epoch': 0.09} 9%|▉ | 1985/22095 [3:16:33<20:38:42, 3.70s/it] 9%|▉ | 1986/22095 [3:16:37<20:20:48, 3.64s/it] {'loss': 0.4371, 'grad_norm': 0.6990732298441272, 'learning_rate': 9.90641251590724e-06, 'epoch': 0.09} 9%|▉ | 1986/22095 [3:16:37<20:20:48, 3.64s/it] 9%|▉ | 1987/22095 [3:16:40<20:09:03, 3.61s/it] {'loss': 0.4598, 'grad_norm': 0.7636960030695994, 'learning_rate': 9.906271321847349e-06, 'epoch': 0.09} 9%|▉ | 1987/22095 [3:16:40<20:09:03, 3.61s/it] 9%|▉ | 1988/22095 [3:16:43<18:51:24, 3.38s/it] {'loss': 0.442, 'grad_norm': 0.7243944420146078, 'learning_rate': 9.906130022366757e-06, 'epoch': 0.09} 9%|▉ | 1988/22095 [3:16:43<18:51:24, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73574 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55157 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 1989/22095 [3:16:47<19:45:18, 3.54s/it] {'loss': 0.4358, 'grad_norm': 0.7077679135943856, 'learning_rate': 9.905988617468501e-06, 'epoch': 0.09} 9%|▉ | 1989/22095 [3:16:47<19:45:18, 3.54s/it] 9%|▉ | 1990/22095 [3:16:50<18:45:53, 3.36s/it] {'loss': 0.3987, 'grad_norm': 0.69551264922042, 'learning_rate': 9.905847107155615e-06, 'epoch': 0.09} 9%|▉ | 1990/22095 [3:16:50<18:45:53, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954494 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5329, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 8cm\nB. 10cm\nC. 12cm\nD. 
6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 9%|▉ | 1991/22095 [3:16:53<18:26:51, 3.30s/it] {'loss': 0.3991, 'grad_norm': 0.7493605737496202, 'learning_rate': 9.905705491431143e-06, 'epoch': 0.09} 9%|▉ | 1991/22095 [3:16:53<18:26:51, 3.30s/it] 9%|▉ | 1992/22095 [3:16:56<18:25:33, 3.30s/it] {'loss': 0.4341, 'grad_norm': 0.7035523261648925, 'learning_rate': 9.905563770298126e-06, 'epoch': 0.09} 9%|▉ | 1992/22095 [3:16:56<18:25:33, 3.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8956351 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7186, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1cm'}]} 9%|▉ | 1993/22095 [3:17:00<18:54:38, 3.39s/it] {'loss': 0.4504, 'grad_norm': 0.6974086929552428, 'learning_rate': 9.905421943759611e-06, 'epoch': 0.09} 9%|▉ | 1993/22095 [3:17:01<18:54:38, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53101 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50604 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43588 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 1994/22095 [3:17:04<19:34:18, 3.51s/it] {'loss': 0.4629, 'grad_norm': 0.724066794803269, 'learning_rate': 9.905280011818642e-06, 'epoch': 0.09} 9%|▉ | 1994/22095 [3:17:04<19:34:18, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56040 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100573 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 1995/22095 [3:17:08<20:37:45, 3.69s/it] {'loss': 0.4473, 'grad_norm': 0.856557517052248, 'learning_rate': 9.905137974478274e-06, 'epoch': 0.09} 9%|▉ | 1995/22095 [3:17:08<20:37:45, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 1996/22095 [3:17:15<26:56:14, 4.82s/it] {'loss': 0.5605, 'grad_norm': 1.5435171830573482, 'learning_rate': 9.904995831741553e-06, 'epoch': 0.09} 9%|▉ | 1996/22095 [3:17:15<26:56:14, 4.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 1997/22095 [3:17:23<30:54:26, 5.54s/it] {'loss': 0.5319, 'grad_norm': 0.8950493639427329, 'learning_rate': 9.904853583611537e-06, 'epoch': 0.09} 9%|▉ | 1997/22095 [3:17:23<30:54:26, 5.54s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 9%|▉ | 1998/22095 [3:17:26<27:25:04, 4.91s/it] {'loss': 0.4585, 'grad_norm': 0.885813931552701, 'learning_rate': 9.904711230091284e-06, 'epoch': 0.09} 9%|▉ | 1998/22095 [3:17:26<27:25:04, 4.91s/it] 9%|▉ | 1999/22095 [3:17:29<24:46:28, 4.44s/it] {'loss': 0.405, 'grad_norm': 0.8122692652550383, 'learning_rate': 
9.904568771183848e-06, 'epoch': 0.09} 9%|▉ | 1999/22095 [3:17:29<24:46:28, 4.44s/it] 9%|▉ | 2000/22095 [3:17:32<21:56:22, 3.93s/it] {'loss': 0.4255, 'grad_norm': 0.7032039282943026, 'learning_rate': 9.904426206892292e-06, 'epoch': 0.09} 9%|▉ | 2000/22095 [3:17:32<21:56:22, 3.93s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 9%|▉ | 2001/22095 [3:18:25<103:59:40, 18.63s/it] {'loss': 0.4146, 'grad_norm': 0.730949878970527, 'learning_rate': 9.90428353721968e-06, 'epoch': 0.09} 9%|▉ | 2001/22095 [3:18:25<103:59:40, 18.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2002/22095 [3:18:36<91:21:13, 16.37s/it] {'loss': 0.5715, 'grad_norm': 1.8957984523607985, 'learning_rate': 9.904140762169079e-06, 'epoch': 0.09} 9%|▉ | 2002/22095 [3:18:36<91:21:13, 16.37s/it] 9%|▉ | 2003/22095 [3:18:44<77:14:48, 13.84s/it] {'loss': 0.5678, 'grad_norm': 1.6179371382040924, 'learning_rate': 9.903997881743552e-06, 'epoch': 0.09} 9%|▉ | 2003/22095 [3:18:44<77:14:48, 13.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 9%|▉ | 2004/22095 [3:18:48<60:19:44, 10.81s/it] {'loss': 0.4557, 'grad_norm': 0.8557369834319728, 'learning_rate': 9.903854895946174e-06, 'epoch': 0.09} 9%|▉ | 2004/22095 [3:18:48<60:19:44, 10.81s/it] 9%|▉ | 2005/22095 [3:18:53<50:15:06, 9.00s/it] {'loss': 0.4157, 'grad_norm': 0.9088274857229913, 'learning_rate': 9.903711804780015e-06, 'epoch': 0.09} 9%|▉ | 2005/22095 [3:18:53<50:15:06, 9.00s/it] 9%|▉ | 2006/22095 [3:18:56<41:07:55, 7.37s/it] {'loss': 0.4635, 'grad_norm': 0.8200194902087647, 'learning_rate': 9.90356860824815e-06, 'epoch': 0.09} 9%|▉ | 2006/22095 [3:18:56<41:07:55, 7.37s/it] 9%|▉ | 2007/22095 [3:19:00<34:56:14, 6.26s/it] {'loss': 0.48, 'grad_norm': 0.8415326148921095, 'learning_rate': 9.903425306353656e-06, 
'epoch': 0.09} 9%|▉ | 2007/22095 [3:19:00<34:56:14, 6.26s/it] 9%|▉ | 2008/22095 [3:19:04<31:54:10, 5.72s/it] {'loss': 0.4895, 'grad_norm': 0.8056663597851978, 'learning_rate': 9.90328189909961e-06, 'epoch': 0.09} 9%|▉ | 2008/22095 [3:19:04<31:54:10, 5.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96513 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73948 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2009/22095 [3:19:08<28:14:17, 5.06s/it] {'loss': 0.4835, 'grad_norm': 0.8676749506416057, 'learning_rate': 9.903138386489097e-06, 'epoch': 0.09} 9%|▉ | 2009/22095 [3:19:08<28:14:17, 5.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68751 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (140198 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2010/22095 [3:19:11<25:08:36, 4.51s/it] {'loss': 0.4351, 'grad_norm': 0.7626279262973016, 'learning_rate': 9.902994768525199e-06, 'epoch': 0.09} 9%|▉ | 2010/22095 [3:19:11<25:08:36, 4.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954486 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 5321, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BD=4cm,∴AD=AB-BD=10-4=6(cm),∵点C是AD中点,∴CD=\\frac{1}{2}AD=3cm,则BC=CD+BD=7cm,'}]} 9%|▉ | 2011/22095 [3:19:16<25:41:20, 4.60s/it] {'loss': 0.4682, 'grad_norm': 0.9920892948015188, 'learning_rate': 9.902851045211e-06, 'epoch': 0.09} 9%|▉ | 2011/22095 [3:19:16<25:41:20, 4.60s/it] 9%|▉ | 2012/22095 [3:19:21<25:49:20, 4.63s/it] {'loss': 0.4427, 'grad_norm': 0.8113461558852759, 'learning_rate': 9.902707216549592e-06, 'epoch': 0.09} 9%|▉ | 2012/22095 [3:19:21<25:49:20, 4.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2013/22095 [3:19:25<25:20:15, 4.54s/it] {'loss': 0.4191, 'grad_norm': 0.7395981957725416, 'learning_rate': 9.902563282544061e-06, 'epoch': 0.09} 9%|▉ | 2013/22095 [3:19:25<25:20:15, 4.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56275 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80804 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83342 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 2014/22095 [3:19:28<23:23:46, 4.19s/it] {'loss': 0.4024, 'grad_norm': 0.7629633005452424, 'learning_rate': 9.902419243197505e-06, 'epoch': 0.09} 9%|▉ | 2014/22095 [3:19:28<23:23:46, 4.19s/it] 9%|▉ | 2015/22095 [3:19:31<21:32:20, 3.86s/it] {'loss': 0.4209, 'grad_norm': 0.7563524409039875, 'learning_rate': 9.902275098513015e-06, 'epoch': 0.09} 9%|▉ | 2015/22095 [3:19:31<21:32:20, 3.86s/it] 9%|▉ | 2016/22095 [3:19:34<20:01:54, 3.59s/it] {'loss': 0.436, 'grad_norm': 0.7238738044857984, 'learning_rate': 9.90213084849369e-06, 'epoch': 0.09} 9%|▉ | 2016/22095 [3:19:34<20:01:54, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11306992 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 553, 'image': 'chat/617.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请解释一下图中的聊天内容。'}, {'from': 'gpt', 'value': '这张图片展示了一个聊天界面,其中显示了多个联系人和最近的消息记录。以下是每个联系人以及他们最近发送的消息:\n\n1. **lion2**\n - 时间:04.25 18:00\n - 消息内容:这会在干嘛\n - 解释:看起来是询问对方当前在做什么。\n\n2. **家装顾问**\n - 时间:04.25 18:00\n - 消息内容:电影院见\n - 解释:这是安排见面的消息,约定在电影院见。\n\n3. **lion**\n - 时间:04.20 17:52\n - 消息内容:我们已经是好友了,和我打声招呼吧\n - 解释:提醒对方他们已经互加好友,并请求对方打招呼。\n\n4. **张东林**\n - 时间:04.20 17:00\n - 消息内容:要的要的\n - 解释:这可能是对之前讨论内容的回应,表示肯定。\n\n5. **XBB**\n - 时间:12.24 11:33\n - 消息内容:RWQRQWT\n - 解释:这条消息看起来像是无意义的字母组合,可能是打字错误或者是内行才懂的代码。\n\n6. 
**二姨**\n - 时间:09.18 23:46\n - 消息内容:(空白)\n - 解释:没有显示近期的消息内容,可能是空消息或者之前的消息已被删除。\n\n7. **老李**\n - 时间:09.17 18:25\n - 消息内容:嗯\n - 解释:简单的“嗯”,表示同意或者回应。\n\n这些聊天记录总结了用户与不同联系人的交流,从朋友互动到安排见面,再到简单的确认和回应。'}]} 9%|▉ | 2017/22095 [3:19:45<32:02:05, 5.74s/it] {'loss': 0.5675, 'grad_norm': 1.764351025615634, 'learning_rate': 9.901986493142629e-06, 'epoch': 0.09} 9%|▉ | 2017/22095 [3:19:45<32:02:05, 5.74s/it] 9%|▉ | 2018/22095 [3:19:50<31:02:14, 5.57s/it] {'loss': 0.5895, 'grad_norm': 1.4324836252716153, 'learning_rate': 9.901842032462931e-06, 'epoch': 0.09} 9%|▉ | 2018/22095 [3:19:50<31:02:14, 5.57s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 9%|▉ | 2019/22095 [3:19:54<27:58:13, 5.02s/it] {'loss': 0.4472, 'grad_norm': 0.8962622205166951, 'learning_rate': 9.901697466457706e-06, 'epoch': 0.09} 9%|▉ | 2019/22095 [3:19:54<27:58:13, 5.02s/it] 9%|▉ | 2020/22095 [3:19:58<25:58:07, 4.66s/it] {'loss': 0.3986, 'grad_norm': 0.8641869503878485, 'learning_rate': 9.901552795130054e-06, 'epoch': 0.09} 9%|▉ | 2020/22095 [3:19:58<25:58:07, 4.66s/it] 9%|▉ | 2021/22095 [3:20:02<24:44:12, 4.44s/it] {'loss': 0.4555, 'grad_norm': 0.9679911613055479, 'learning_rate': 9.901408018483087e-06, 'epoch': 0.09} 9%|▉ | 2021/22095 [3:20:02<24:44:12, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44496 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71462 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49985 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54720 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63076 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2022/22095 [3:20:11<32:57:07, 5.91s/it] {'loss': 0.5488, 'grad_norm': 1.7195222230010936, 'learning_rate': 9.901263136519917e-06, 'epoch': 0.09} 9%|▉ | 2022/22095 [3:20:11<32:57:07, 5.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2023/22095 [3:20:14<28:37:30, 5.13s/it] {'loss': 0.4668, 'grad_norm': 0.8581028478777686, 'learning_rate': 9.901118149243653e-06, 'epoch': 0.09} 9%|▉ | 2023/22095 [3:20:14<28:37:30, 5.13s/it] 9%|▉ | 2024/22095 [3:20:18<26:34:40, 4.77s/it] {'loss': 0.4235, 'grad_norm': 0.7784153654002759, 'learning_rate': 9.900973056657414e-06, 'epoch': 0.09} 9%|▉ | 2024/22095 [3:20:18<26:34:40, 4.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58105 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54195 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41416 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79086 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52474 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2025/22095 [3:20:22<24:00:51, 4.31s/it] {'loss': 0.4437, 'grad_norm': 0.7236386410380111, 'learning_rate': 9.900827858764315e-06, 'epoch': 0.09} 9%|▉ | 2025/22095 [3:20:22<24:00:51, 4.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2026/22095 [3:20:25<22:01:08, 3.95s/it] {'loss': 0.5253, 'grad_norm': 0.8316898164519387, 'learning_rate': 9.900682555567478e-06, 'epoch': 0.09} 9%|▉ | 2026/22095 [3:20:25<22:01:08, 3.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2027/22095 [3:20:28<21:10:18, 3.80s/it] {'loss': 0.3917, 'grad_norm': 0.792182988032171, 'learning_rate': 9.900537147070025e-06, 'epoch': 0.09} 9%|▉ | 2027/22095 [3:20:28<21:10:18, 3.80s/it] 9%|▉ | 2028/22095 [3:20:32<20:43:35, 3.72s/it] {'loss': 0.4041, 'grad_norm': 0.7853677656386151, 'learning_rate': 9.900391633275079e-06, 'epoch': 0.09} 9%|▉ | 2028/22095 [3:20:32<20:43:35, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45935 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56480 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50284 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42448 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71486 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2029/22095 [3:20:35<19:55:23, 3.57s/it] {'loss': 0.4462, 'grad_norm': 0.9331508273972187, 'learning_rate': 9.900246014185765e-06, 'epoch': 0.09} 9%|▉ | 2029/22095 [3:20:35<19:55:23, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2030/22095 [3:20:44<29:53:13, 5.36s/it] {'loss': 0.5497, 'grad_norm': 1.40698347576658, 'learning_rate': 9.900100289805217e-06, 'epoch': 0.09} 9%|▉ | 2030/22095 [3:20:44<29:53:13, 5.36s/it] 9%|▉ | 2031/22095 [3:20:48<26:18:23, 4.72s/it] {'loss': 0.3621, 'grad_norm': 0.8484668466241169, 'learning_rate': 9.899954460136563e-06, 'epoch': 0.09} 9%|▉ | 2031/22095 [3:20:48<26:18:23, 4.72s/it] 9%|▉ | 2032/22095 [3:20:51<23:53:37, 4.29s/it] {'loss': 0.4725, 'grad_norm': 0.7649499435717951, 'learning_rate': 9.899808525182935e-06, 'epoch': 0.09} 9%|▉ | 2032/22095 [3:20:51<23:53:37, 4.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2033/22095 [3:21:01<34:10:13, 6.13s/it] {'loss': 0.5284, 'grad_norm': 0.8057014650689612, 'learning_rate': 9.899662484947473e-06, 'epoch': 0.09} 9%|▉ | 2033/22095 [3:21:01<34:10:13, 6.13s/it] 9%|▉ | 2034/22095 [3:21:05<30:28:49, 5.47s/it] {'loss': 0.4134, 'grad_norm': 1.1381958528632035, 'learning_rate': 9.899516339433308e-06, 'epoch': 0.09} 9%|▉ | 2034/22095 [3:21:05<30:28:49, 5.47s/it] 9%|▉ | 2035/22095 [3:21:09<26:49:38, 4.81s/it] {'loss': 0.455, 'grad_norm': 0.7743025263168356, 'learning_rate': 9.899370088643589e-06, 'epoch': 0.09} 9%|▉ | 2035/22095 [3:21:09<26:49:38, 4.81s/it]Rank 0: Number of image 
tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908007 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31160, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nA. 8\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
9%|▉ | 2036/22095 [3:21:12<23:54:06, 4.29s/it] {'loss': 0.4497, 'grad_norm': 0.8291030740640418, 'learning_rate': 9.899223732581452e-06, 'epoch': 0.09}
Invalidate trace cache @ step 2: expected module 1, but got module 364
9%|▉ | 2037/22095 [3:21:21<32:37:54, 5.86s/it] {'loss': 0.5065, 'grad_norm': 0.6110725868672015, 'learning_rate': 9.899077271250043e-06, 'epoch': 0.09}
9%|▉ | 2038/22095 [3:21:24<28:05:54, 5.04s/it] {'loss': 0.4626, 'grad_norm': 0.9969532555112366, 'learning_rate': 9.898930704652512e-06, 'epoch': 0.09}
9%|▉ | 2039/22095 [3:21:28<26:08:10, 4.69s/it] {'loss': 0.4295, 'grad_norm': 0.7716306199234213, 'learning_rate': 9.898784032792005e-06, 'epoch': 0.09}
9%|▉ | 2040/22095 [3:21:33<25:28:56, 4.57s/it] {'loss': 0.4323, 'grad_norm': 0.7635557014521751, 'learning_rate': 9.898637255671674e-06, 'epoch': 0.09}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [545, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8433876 in VC:s3://internvl-moe-sft-data/. Exception: Image size [545, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 13951, 'image': 'vrdu_texteq/astro-ph.CO/d4b65d53-700f-4785-93c1-5da8fff77e35.png', 'image_wh': [[545, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': "where T = 10$^5$ K and ${h}$ is Planck's constant."}]}
9%|▉ | 2041/22095 [3:21:35<22:48:14, 4.09s/it] {'loss': 0.4321, 'grad_norm': 0.832082874007693, 'learning_rate': 9.898490373294673e-06, 'epoch': 0.09}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8341286 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7931, 'image': 'vrdu_table_final_2/astro-ph.CO/0bc6ad13-46ad-4969-b9e8-fa04fae33931.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
9%|▉ | 2042/22095 [3:21:38<20:55:39, 3.76s/it] {'loss': 0.4305, 'grad_norm': 0.9848162238601245, 'learning_rate': 9.898343385664161e-06, 'epoch': 0.09}
9%|▉ | 2043/22095 [3:21:43<21:28:32, 3.86s/it] {'loss': 0.4903, 'grad_norm': 0.745330890800276, 'learning_rate': 9.898196292783291e-06, 'epoch': 0.09}
9%|▉ | 2044/22095 [3:21:46<20:29:23, 3.68s/it] {'loss': 0.4719, 'grad_norm': 0.9633770165986147, 'learning_rate': 9.898049094655229e-06, 'epoch': 0.09}
9%|▉ | 2045/22095 [3:21:49<20:10:16, 3.62s/it] {'loss': 0.4433, 'grad_norm': 0.8418196344795355, 'learning_rate': 9.897901791283133e-06, 'epoch': 0.09}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8340484 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7128, 'image': 'vrdu_table_final_2/astro-ph.CO/e6577296-a0cd-4548-986e-2ffd979cbb29.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
9%|▉ | 2046/22095 [3:21:56<25:43:11, 4.62s/it] {'loss': 0.55, 'grad_norm': 1.1470649580813783, 'learning_rate': 9.897754382670171e-06, 'epoch': 0.09}
9%|▉ | 2047/22095 [3:22:01<25:59:08, 4.67s/it] {'loss': 0.4489, 'grad_norm': 0.8762812853825404, 'learning_rate': 9.897606868819508e-06, 'epoch': 0.09}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
9%|▉ | 2048/22095 [3:22:04<23:18:22, 4.19s/it] {'loss': 0.4265, 'grad_norm': 0.8423179020239366, 'learning_rate': 9.897459249734318e-06, 'epoch': 0.09}
9%|▉ | 2049/22095 [3:22:08<22:35:09, 4.06s/it] {'loss': 0.4449, 'grad_norm': 0.7325380667861581, 'learning_rate': 9.89731152541777e-06, 'epoch': 0.09}
9%|▉ | 2050/22095 [3:22:11<20:37:38, 3.70s/it] {'loss': 0.4298, 'grad_norm': 0.7717625136781339, 'learning_rate': 9.897163695873036e-06, 'epoch': 0.09}
9%|▉ | 2051/22095 [3:22:15<20:47:07, 3.73s/it] {'loss': 0.4139, 'grad_norm': 0.7557315397060594, 'learning_rate': 9.897015761103298e-06, 'epoch': 0.09}
9%|▉ | 2052/22095 [3:22:17<19:31:08, 3.51s/it] {'loss': 0.4668, 'grad_norm': 0.7589363242557137, 'learning_rate': 9.896867721111726e-06, 'epoch': 0.09}
9%|▉ | 2052/22095 [3:22:18<19:31:08, 3.51s/it]
9%|▉ | 2053/22095 [3:22:21<19:40:52, 3.54s/it] {'loss': 0.4654, 'grad_norm': 0.8135784534771674, 'learning_rate': 9.89671957590151e-06, 'epoch': 0.09}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8955479 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6314, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 4.5\nB. 7\nC. 2\nD. 2.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
9%|▉ | 2054/22095 [3:22:25<19:57:48, 3.59s/it] {'loss': 0.4593, 'grad_norm': 0.770208676126074, 'learning_rate': 9.89657132547583e-06, 'epoch': 0.09}
9%|▉ | 2055/22095 [3:22:28<19:15:31, 3.46s/it] {'loss': 0.4724, 'grad_norm': 0.7130610024807864, 'learning_rate': 9.89642296983787e-06, 'epoch': 0.09}
Token indices sequence length is longer than the specified maximum sequence length for this model (55989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65623 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41425 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93122 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2056/22095 [3:22:31<18:10:18, 3.26s/it] {'loss': 0.4605, 'grad_norm': 0.8139952994416343, 'learning_rate': 9.896274508990818e-06, 'epoch': 0.09} 9%|▉ | 2056/22095 [3:22:31<18:10:18, 3.26s/it] 9%|▉ | 2057/22095 [3:22:34<17:50:13, 3.20s/it] {'loss': 0.4247, 'grad_norm': 0.7597358274853931, 'learning_rate': 9.896125942937865e-06, 'epoch': 0.09} 9%|▉ | 2057/22095 [3:22:34<17:50:13, 3.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42798 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77913 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2058/22095 [3:22:37<17:42:31, 3.18s/it] {'loss': 0.4706, 'grad_norm': 0.8903289322010103, 'learning_rate': 9.895977271682203e-06, 'epoch': 0.09} 9%|▉ | 2058/22095 [3:22:37<17:42:31, 3.18s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_1/images/step_0.png 2025-08-27 19:20:35.745982 load time: 1031.68 ms 9%|▉ | 2059/22095 [3:22:40<17:27:28, 3.14s/it] {'loss': 0.4398, 'grad_norm': 0.8455330761971988, 'learning_rate': 9.895828495227026e-06, 'epoch': 0.09} 9%|▉ | 2059/22095 [3:22:40<17:27:28, 3.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58506 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53809 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48499 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2060/22095 [3:22:43<17:35:23, 3.16s/it] {'loss': 0.5244, 'grad_norm': 0.7911632261161675, 'learning_rate': 9.89567961357553e-06, 'epoch': 0.09} 9%|▉ | 2060/22095 [3:22:43<17:35:23, 3.16s/it] 9%|▉ | 2061/22095 [3:22:46<16:52:43, 3.03s/it] {'loss': 0.4424, 'grad_norm': 0.7714784849728722, 'learning_rate': 9.895530626730917e-06, 'epoch': 0.09} 9%|▉ | 2061/22095 [3:22:46<16:52:43, 3.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77776 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2062/22095 [3:22:49<17:35:44, 3.16s/it] {'loss': 0.4432, 'grad_norm': 0.7595825000992268, 'learning_rate': 9.895381534696385e-06, 'epoch': 0.09} 9%|▉ | 2062/22095 [3:22:49<17:35:44, 3.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2063/22095 [3:22:53<17:56:52, 3.23s/it] {'loss': 0.4948, 'grad_norm': 0.7933673014458721, 'learning_rate': 9.89523233747514e-06, 'epoch': 0.09} 9%|▉ | 2063/22095 [3:22:53<17:56:52, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2064/22095 [3:23:03<29:34:29, 5.32s/it] {'loss': 0.5417, 'grad_norm': 0.6566601566759311, 'learning_rate': 9.895083035070386e-06, 'epoch': 0.09} 9%|▉ | 2064/22095 [3:23:03<29:34:29, 5.32s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-27 19:21:03.379329 load time: 1037.41 ms 9%|▉ | 2065/22095 [3:23:08<28:27:14, 5.11s/it] {'loss': 0.4582, 'grad_norm': 0.7426330048072817, 'learning_rate': 9.894933627485332e-06, 'epoch': 0.09} 9%|▉ | 2065/22095 [3:23:08<28:27:14, 5.11s/it]Invalidate trace cache 
@ step 2: expected module 1, but got module 364
9%|▉ | 2066/22095 [3:23:16<34:39:32, 6.23s/it] {'loss': 0.5136, 'grad_norm': 0.5430416280961405, 'learning_rate': 9.894784114723186e-06, 'epoch': 0.09}
9%|▉ | 2067/22095 [3:23:21<31:01:48, 5.58s/it] {'loss': 0.4623, 'grad_norm': 0.8886218681236913, 'learning_rate': 9.894634496787166e-06, 'epoch': 0.09}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396953 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63806, 'image': 'vrdu_table_final_2/astro-ph.EP/86ac179f-8d68-43a5-aa43-6d1dde4e544b.png', 'image_wh': [[12, 14]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}z\\end{tabular}\n```'}]}
9%|▉ | 2068/22095 [3:23:30<38:15:08, 6.88s/it] {'loss': 0.4935, 'grad_norm': 0.346019844103863, 'learning_rate': 9.89448477368048e-06, 'epoch': 0.09}
9%|▉ | 2069/22095 [3:23:35<33:55:17, 6.10s/it] {'loss': 0.4279, 'grad_norm': 0.7603346261585872, 'learning_rate': 9.89433494540635e-06, 'epoch': 0.09}
Invalidate trace cache @ step 2: expected module 1, but got module 364
9%|▉ | 2070/22095 [3:23:43<37:17:17, 6.70s/it] {'loss': 0.5457, 'grad_norm': 0.44835072827611927, 'learning_rate': 9.894185011967994e-06, 'epoch': 0.09}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
9%|▉ | 2071/22095 [3:23:46<31:55:08, 5.74s/it] {'loss': 0.4298, 'grad_norm': 0.8254020666481041, 'learning_rate': 9.894034973368633e-06, 'epoch': 0.09}
9%|▉ | 2072/22095 [3:23:50<27:45:07, 4.99s/it] {'loss': 0.4348, 'grad_norm': 0.7212908946016292, 'learning_rate': 9.89388482961149e-06, 'epoch': 0.09}
9%|▉ | 2073/22095 [3:23:53<25:41:48, 4.62s/it] {'loss': 0.4529, 'grad_norm': 0.8305662270330699, 'learning_rate': 9.893734580699796e-06, 'epoch': 0.09}
9%|▉ | 2074/22095 [3:23:57<23:25:05, 4.21s/it] {'loss': 0.3987, 'grad_norm': 0.7498327315800464, 'learning_rate': 9.893584226636773e-06, 'epoch': 0.09}
9%|▉ | 2075/22095 
[3:23:59<20:55:53, 3.76s/it] {'loss': 0.4056, 'grad_norm': 0.7576967573230682, 'learning_rate': 9.893433767425655e-06, 'epoch': 0.09} 9%|▉ | 2075/22095 [3:23:59<20:55:53, 3.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_3/images/step_0.png 2025-08-27 19:21:58.059217 load time: 1040.16 ms 9%|▉ | 2076/22095 [3:24:03<20:05:39, 3.61s/it] {'loss': 0.4209, 'grad_norm': 0.8421167565131027, 'learning_rate': 9.893283203069675e-06, 'epoch': 0.09} 9%|▉ | 2076/22095 [3:24:03<20:05:39, 3.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2077/22095 [3:24:06<19:53:04, 3.58s/it] {'loss': 0.4547, 'grad_norm': 0.9312987055219514, 'learning_rate': 9.893132533572067e-06, 'epoch': 0.09} 9%|▉ | 2077/22095 [3:24:06<19:53:04, 3.58s/it] 9%|▉ | 2078/22095 [3:24:09<18:44:29, 3.37s/it] {'loss': 0.419, 'grad_norm': 0.7287349796188367, 'learning_rate': 9.892981758936069e-06, 'epoch': 0.09} 9%|▉ | 2078/22095 [3:24:09<18:44:29, 3.37s/it] 9%|▉ | 2079/22095 [3:24:12<18:25:39, 3.31s/it] {'loss': 0.446, 'grad_norm': 0.7886625124732145, 'learning_rate': 9.89283087916492e-06, 'epoch': 0.09} 9%|▉ | 2079/22095 [3:24:12<18:25:39, 3.31s/it] 9%|▉ | 2080/22095 [3:24:16<19:06:47, 3.44s/it] {'loss': 0.4047, 'grad_norm': 0.7342267624673294, 'learning_rate': 9.892679894261865e-06, 'epoch': 0.09} 9%|▉ | 2080/22095 [3:24:16<19:06:47, 3.44s/it] 9%|▉ | 2081/22095 [3:24:19<18:27:04, 3.32s/it] {'loss': 0.4536, 'grad_norm': 0.7551290893052203, 'learning_rate': 9.892528804230144e-06, 'epoch': 0.09} 9%|▉ | 2081/22095 [3:24:19<18:27:04, 3.32s/it] 9%|▉ | 2082/22095 [3:24:22<17:57:33, 3.23s/it] {'loss': 0.4232, 'grad_norm': 0.6795891080594817, 'learning_rate': 9.892377609073006e-06, 
'epoch': 0.09}
Invalidate trace cache @ step 2: expected module 1, but got module 364
9%|▉ | 2083/22095 [3:24:31<27:45:57, 4.99s/it] {'loss': 0.51, 'grad_norm': 1.0309268697491385, 'learning_rate': 9.892226308793697e-06, 'epoch': 0.09}
9%|▉ | 2084/22095 [3:24:35<25:27:21, 4.58s/it] {'loss': 0.4124, 'grad_norm': 0.999083959427165, 'learning_rate': 9.892074903395472e-06, 'epoch': 0.09}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396949 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63802, 'image': 'vrdu_table_final_2/astro-ph.EP/c2ab4f80-6d7b-4b07-b070-081a5e4c000b.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$\\phi$\\end{tabular}\n```"}]}
9%|▉ | 2085/22095 [3:24:38<22:52:31, 4.12s/it] {'loss': 0.4113, 'grad_norm': 0.6626173222054419, 'learning_rate': 9.891923392881581e-06, 'epoch': 0.09}
9%|▉ | 2086/22095 [3:24:41<20:52:50, 3.76s/it] {'loss': 0.4096, 'grad_norm': 0.8008472966595057, 'learning_rate': 9.89177177725528e-06, 'epoch': 0.09}
9%|▉ | 2087/22095 [3:24:44<20:16:29, 3.65s/it] {'loss': 0.4698, 'grad_norm': 0.7070000181672658, 'learning_rate': 9.89162005651983e-06, 'epoch': 0.09}
9%|▉ | 2088/22095 [3:24:48<20:05:43, 3.62s/it] {'loss': 0.3912, 'grad_norm': 0.7102922590146776, 'learning_rate': 9.891468230678487e-06, 'epoch': 0.09}
Token indices sequence length is longer than the specified maximum sequence length for this model (75560 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78149 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82126 > 40960).
Running this sequence through the model will result in indexing errors 9%|▉ | 2089/22095 [3:24:51<19:43:40, 3.55s/it] {'loss': 0.4661, 'grad_norm': 0.7965351685251075, 'learning_rate': 9.891316299734514e-06, 'epoch': 0.09} 9%|▉ | 2089/22095 [3:24:51<19:43:40, 3.55s/it] 9%|▉ | 2090/22095 [3:24:54<19:35:36, 3.53s/it] {'loss': 0.4163, 'grad_norm': 0.7133001147134137, 'learning_rate': 9.891164263691178e-06, 'epoch': 0.09} 9%|▉ | 2090/22095 [3:24:54<19:35:36, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45133 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65964 > 40960). Running this sequence through the model will result in indexing errors 9%|▉ | 2091/22095 [3:24:58<18:58:56, 3.42s/it] {'loss': 0.4251, 'grad_norm': 0.6919466012287274, 'learning_rate': 9.891012122551742e-06, 'epoch': 0.09} 9%|▉ | 2091/22095 [3:24:58<18:58:56, 3.42s/it] 9%|▉ | 2092/22095 [3:25:03<21:40:04, 3.90s/it] {'loss': 0.4708, 'grad_norm': 0.8172173083042604, 'learning_rate': 9.890859876319479e-06, 'epoch': 0.09} 9%|▉ | 2092/22095 [3:25:03<21:40:04, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76864 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43966 > 40960). 
Running this sequence through the model will result in indexing errors 9%|▉ | 2093/22095 [3:25:06<20:29:19, 3.69s/it] {'loss': 0.4183, 'grad_norm': 0.7157802079281965, 'learning_rate': 9.890707524997657e-06, 'epoch': 0.09} 9%|▉ | 2093/22095 [3:25:06<20:29:19, 3.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 9%|▉ | 2094/22095 [3:25:10<21:11:28, 3.81s/it] {'loss': 0.4581, 'grad_norm': 0.8012655492564951, 'learning_rate': 9.890555068589552e-06, 'epoch': 0.09} 9%|▉ | 2094/22095 [3:25:10<21:11:28, 3.81s/it] 9%|▉ | 2095/22095 [3:25:13<19:38:51, 3.54s/it] {'loss': 0.4321, 'grad_norm': 0.6616316877473514, 'learning_rate': 9.890402507098437e-06, 'epoch': 0.09} 9%|▉ | 2095/22095 [3:25:13<19:38:51, 3.54s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (92139516 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10135.png 2025-08-27 19:23:11.527625 load time: 1561.1 ms 9%|▉ | 2096/22095 [3:25:16<19:40:29, 3.54s/it] {'loss': 0.4048, 'grad_norm': 0.6796374624881355, 'learning_rate': 9.890249840527593e-06, 'epoch': 0.09} 9%|▉ | 2096/22095 [3:25:16<19:40:29, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2097/22095 [3:25:25<27:28:54, 4.95s/it] {'loss': 0.5378, 'grad_norm': 0.6595102338230455, 'learning_rate': 9.8900970688803e-06, 'epoch': 0.09} 9%|▉ | 2097/22095 [3:25:25<27:28:54, 4.95s/it] 9%|▉ | 2098/22095 [3:25:28<25:26:31, 4.58s/it] {'loss': 0.4311, 'grad_norm': 0.7408710873526465, 'learning_rate': 9.88994419215984e-06, 'epoch': 0.09} 9%|▉ | 2098/22095 [3:25:28<25:26:31, 4.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 9%|▉ | 2099/22095 [3:25:38<34:23:19, 6.19s/it] {'loss': 0.5408, 'grad_norm': 0.5541647525789047, 'learning_rate': 9.889791210369496e-06, 'epoch': 0.09} 9%|▉ | 2099/22095 [3:25:38<34:23:19, 6.19s/it] 10%|▉ | 2100/22095 [3:25:41<29:18:08, 5.28s/it] {'loss': 0.4689, 'grad_norm': 0.7911667363009298, 'learning_rate': 9.889638123512557e-06, 'epoch': 0.1} 10%|▉ | 2100/22095 [3:25:41<29:18:08, 5.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 10%|▉ | 2101/22095 [3:25:51<36:57:59, 6.66s/it] {'loss': 0.5257, 'grad_norm': 0.42855301210537444, 'learning_rate': 9.889484931592313e-06, 'epoch': 0.1} 10%|▉ | 2101/22095 [3:25:51<36:57:59, 6.66s/it] 10%|▉ | 2102/22095 [3:25:54<31:14:19, 5.62s/it] {'loss': 0.4347, 'grad_norm': 0.9473251263121182, 'learning_rate': 9.889331634612053e-06, 'epoch': 0.1} 10%|▉ | 2102/22095 [3:25:54<31:14:19, 5.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 10%|▉ | 2103/22095 [3:26:02<33:54:10, 6.10s/it] {'loss': 0.4935, 'grad_norm': 0.4418629563484785, 'learning_rate': 9.889178232575074e-06, 'epoch': 0.1} 10%|▉ | 2103/22095 
[3:26:02<33:54:10, 6.10s/it] 10%|▉ | 2104/22095 [3:26:06<30:07:28, 5.42s/it] {'loss': 0.4619, 'grad_norm': 0.8028317808705686, 'learning_rate': 9.889024725484672e-06, 'epoch': 0.1} 10%|▉ | 2104/22095 [3:26:06<30:07:28, 5.42s/it] 10%|▉ | 2105/22095 [3:26:09<26:27:42, 4.77s/it] {'loss': 0.4343, 'grad_norm': 0.7433539611390096, 'learning_rate': 9.888871113344144e-06, 'epoch': 0.1} 10%|▉ | 2105/22095 [3:26:09<26:27:42, 4.77s/it] 10%|▉ | 2106/22095 [3:26:12<23:37:24, 4.25s/it] {'loss': 0.4586, 'grad_norm': 0.7133886009247582, 'learning_rate': 9.888717396156788e-06, 'epoch': 0.1} 10%|▉ | 2106/22095 [3:26:12<23:37:24, 4.25s/it] 10%|▉ | 2107/22095 [3:26:15<21:53:41, 3.94s/it] {'loss': 0.4374, 'grad_norm': 0.6706564434461998, 'learning_rate': 9.88856357392591e-06, 'epoch': 0.1} 10%|▉ | 2107/22095 [3:26:15<21:53:41, 3.94s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-27 19:24:13.786847 load time: 1130.5 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 10%|▉ | 2108/22095 [3:26:18<20:00:09, 3.60s/it] {'loss': 0.4282, 'grad_norm': 0.8111131097505304, 'learning_rate': 9.888409646654818e-06, 'epoch': 0.1} 10%|▉ | 2108/22095 [3:26:18<20:00:09, 3.60s/it] 10%|▉ | 2109/22095 [3:26:21<19:43:05, 3.55s/it] {'loss': 0.457, 'grad_norm': 0.7686171681775432, 'learning_rate': 9.888255614346813e-06, 'epoch': 0.1} 10%|▉ | 2109/22095 [3:26:21<19:43:05, 3.55s/it] 10%|▉ | 2110/22095 [3:26:25<20:05:19, 3.62s/it] {'loss': 0.4647, 'grad_norm': 0.7054175753477833, 'learning_rate': 9.88810147700521e-06, 'epoch': 0.1} 10%|▉ | 2110/22095 [3:26:25<20:05:19, 3.62s/it] 10%|▉ | 2111/22095 [3:26:29<19:51:47, 3.58s/it] {'loss': 0.4704, 'grad_norm': 1.0515788174529657, 'learning_rate': 9.887947234633318e-06, 'epoch': 0.1} 10%|▉ | 2111/22095 [3:26:29<19:51:47, 3.58s/it] 10%|▉ | 2112/22095 [3:26:32<18:54:31, 3.41s/it] {'loss': 0.4552, 'grad_norm': 0.7175508450358566, 
'learning_rate': 9.887792887234453e-06, 'epoch': 0.1} 10%|▉ | 2112/22095 [3:26:32<18:54:31, 3.41s/it] 10%|▉ | 2113/22095 [3:26:35<19:35:08, 3.53s/it] {'loss': 0.4396, 'grad_norm': 0.7729794430588015, 'learning_rate': 9.88763843481193e-06, 'epoch': 0.1} 10%|▉ | 2113/22095 [3:26:35<19:35:08, 3.53s/it] 10%|▉ | 2114/22095 [3:26:38<18:29:37, 3.33s/it] {'loss': 0.4271, 'grad_norm': 0.706629623489225, 'learning_rate': 9.887483877369068e-06, 'epoch': 0.1} 10%|▉ | 2114/22095 [3:26:38<18:29:37, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 10%|▉ | 2115/22095 [3:26:42<19:09:19, 3.45s/it] {'loss': 0.4409, 'grad_norm': 0.7475391012848127, 'learning_rate': 9.88732921490919e-06, 'epoch': 0.1} 10%|▉ | 2115/22095 [3:26:42<19:09:19, 3.45s/it] 10%|▉ | 2116/22095 [3:26:45<18:09:07, 3.27s/it] {'loss': 0.4801, 'grad_norm': 0.7293429645265976, 'learning_rate': 9.887174447435615e-06, 'epoch': 0.1} 10%|▉ | 2116/22095 [3:26:45<18:09:07, 3.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48320 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116649 > 40960). 
Running this sequence through the model will result in indexing errors
10%|▉ | 2117/22095 [3:26:48<18:06:37, 3.26s/it] {'loss': 0.4306, 'grad_norm': 0.6990341610953973, 'learning_rate': 9.88701957495167e-06, 'epoch': 0.1}
10%|▉ | 2118/22095 [3:26:52<19:27:26, 3.51s/it] {'loss': 0.4297, 'grad_norm': 0.7544727476330307, 'learning_rate': 9.886864597460686e-06, 'epoch': 0.1}
10%|▉ | 2119/22095 [3:26:56<20:05:11, 3.62s/it] {'loss': 0.4923, 'grad_norm': 0.7856611371489157, 'learning_rate': 9.88670951496599e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-27 19:24:55.582503 load time: 1193.29 ms
10%|▉ | 2120/22095 [3:26:59<19:11:06, 3.46s/it] {'loss': 0.4295, 'grad_norm': 0.7044813533443429, 'learning_rate': 9.886554327470917e-06, 'epoch': 0.1}
10%|▉ | 2121/22095 [3:27:02<18:53:57, 3.41s/it] {'loss': 0.452, 'grad_norm': 0.872281271578073, 'learning_rate': 9.886399034978798e-06, 'epoch': 0.1}
10%|▉ | 2122/22095 [3:27:06<19:22:01, 3.49s/it] {'loss': 0.4103, 'grad_norm': 0.6719432262640053, 'learning_rate': 9.886243637492969e-06, 'epoch': 0.1}
10%|▉ | 2123/22095 [3:27:09<18:55:23, 3.41s/it] {'loss': 0.4032, 'grad_norm': 0.7155813838531678, 'learning_rate': 9.886088135016773e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8403952 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6128, 'image': 'vrdu_table_final_2/astro-ph.CO/9dd28813-b3fb-489f-9db1-98a9978dbf48.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]}
10%|▉ | 2124/22095 [3:27:12<17:58:33, 3.24s/it] {'loss': 0.4494, 'grad_norm': 0.7312506486965615, 'learning_rate': 9.88593252755355e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2125/22095 [3:27:21<27:29:12, 4.96s/it] {'loss': 0.5495, 'grad_norm': 1.817955979217866, 'learning_rate': 9.885776815106643e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8334822 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1434, 'image': 'vrdu_table_final_2/astro-ph.CO/6a7e6999-6bb0-4b2f-a7a4-48b9833b7bc0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
10%|▉ | 2126/22095 [3:27:25<25:50:20, 4.66s/it] {'loss': 0.4755, 'grad_norm': 0.7271498190042965, 'learning_rate': 9.885620997679397e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (54301 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48330 > 40960).
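[Annotation] The recurring `ValueError: Image size ... is too small. Minimum size is 28` entries above are raised by the dataset loader for samples whose `image_wh` contains a side shorter than 28 px (likely the vision patch granularity of the model, so such images cannot yield a vision token). A minimal sketch of an offline pre-filter that would drop these samples before training, assuming the annotation layout shown in the log (`image_wh` as a list of `[width, height]` pairs); the helper name `is_trainable_sample` is hypothetical:

```python
MIN_SIDE = 28  # mirrors "Minimum size is 28" raised by the loader in this log

def is_trainable_sample(sample: dict, min_side: int = MIN_SIDE) -> bool:
    """Return False if any image in the sample has a side below min_side.

    Assumes the log's annotation layout: sample["image_wh"] is a list of
    [width, height] pairs, one per image in the sample.
    """
    for w, h in sample.get("image_wh", []):
        if min(w, h) < min_side:
            return False
    return True

# Shapes seen in the log: 14x23 fails, and 517x23 fails too (height < 28).
samples = [
    {"id": 1, "image_wh": [[14, 23]]},
    {"id": 2, "image_wh": [[517, 23]]},
    {"id": 3, "image_wh": [[640, 480]]},
]
kept = [s for s in samples if is_trainable_sample(s)]
```

Running such a filter once over the annotation files would avoid the repeated fetch-retry cost visible in the step times around each traceback.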
Running this sequence through the model will result in indexing errors
10%|▉ | 2127/22095 [3:27:28<23:42:40, 4.27s/it] {'loss': 0.4454, 'grad_norm': 0.9057747660426797, 'learning_rate': 9.88546507527516e-06, 'epoch': 0.1}
10%|▉ | 2128/22095 [3:27:31<21:18:36, 3.84s/it] {'loss': 0.405, 'grad_norm': 0.7188105353417086, 'learning_rate': 9.885309047897285e-06, 'epoch': 0.1}
10%|▉ | 2129/22095 [3:27:35<20:58:08, 3.78s/it] {'loss': 0.4987, 'grad_norm': 0.824647330223287, 'learning_rate': 9.88515291554912e-06, 'epoch': 0.1}
10%|▉ | 2130/22095 [3:27:39<21:03:50, 3.80s/it] {'loss': 0.4606, 'grad_norm': 0.8047057457137237, 'learning_rate': 9.884996678234024e-06, 'epoch': 0.1}
10%|▉ | 2131/22095 [3:27:42<20:08:06, 3.63s/it] {'loss': 0.4225, 'grad_norm': 0.7541646312384799, 'learning_rate': 9.884840335955354e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250507_011522_1/images/before_screenshot_3_id_96_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-27 19:25:42.369574 load time: 1076.3 ms
10%|▉ | 2132/22095 [3:27:46<20:12:02, 3.64s/it] {'loss': 0.4381, 'grad_norm': 0.7657084288201877, 'learning_rate': 9.884683888716466e-06, 'epoch': 0.1}
10%|▉ | 2133/22095 [3:27:48<18:54:34, 3.41s/it] {'loss': 0.4376, 'grad_norm': 0.733840982809254, 'learning_rate': 9.884527336520724e-06, 'epoch': 0.1}
10%|▉ | 2134/22095 [3:27:52<19:31:26, 3.52s/it] {'loss': 0.4756, 'grad_norm': 0.7450007075565378, 'learning_rate': 9.88437067937149e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2135/22095 [3:28:02<29:59:56, 5.41s/it] {'loss': 0.5089, 'grad_norm': 0.8394335018999397, 'learning_rate': 9.884213917272133e-06, 'epoch': 0.1}
10%|▉ | 2136/22095 [3:28:06<27:29:25, 4.96s/it] {'loss': 0.4506, 'grad_norm': 0.7165101390264119, 'learning_rate': 9.88405705022602e-06, 'epoch': 0.1}
10%|▉ | 2137/22095 [3:28:09<24:30:38, 4.42s/it] {'loss': 0.3896, 'grad_norm': 0.7136018139841266, 'learning_rate': 9.883900078236519e-06, 'epoch': 0.1}
10%|▉ | 2138/22095 [3:28:12<22:29:22, 4.06s/it] {'loss': 0.3826, 'grad_norm': 1.0973366170490864, 'learning_rate': 9.883743001307007e-06, 'epoch': 0.1}
10%|▉ | 2139/22095 [3:28:16<22:36:13, 4.08s/it] {'loss': 0.4749, 'grad_norm': 0.7498597906623329, 'learning_rate': 9.883585819440854e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396930 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63783, 'image': 'vrdu_table_final_2/astro-ph.EP/8ffa60d2-f133-4417-8686-fdd10727d7ce.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_z$\\end{tabular}\n```"}]}
10%|▉ | 2140/22095 [3:28:21<22:40:56, 4.09s/it] {'loss': 0.4289, 'grad_norm': 0.7252975386637314, 'learning_rate': 9.883428532641445e-06, 'epoch': 0.1}
10%|▉ | 2141/22095 [3:28:24<21:17:19, 3.84s/it] {'loss': 0.45, 'grad_norm': 0.7609438996643995, 'learning_rate': 9.883271140912153e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (48030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55759 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92317 > 40960).
Running this sequence through the model will result in indexing errors
10%|▉ | 2142/22095 [3:28:28<20:56:49, 3.78s/it] {'loss': 0.4543, 'grad_norm': 0.6732376326562703, 'learning_rate': 9.88311364425636e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|▉ | 2143/22095 [3:28:31<20:19:05, 3.67s/it] {'loss': 0.391, 'grad_norm': 0.6372499949868802, 'learning_rate': 9.882956042677457e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|▉ | 2144/22095 [3:28:35<21:06:54, 3.81s/it] {'loss': 0.453, 'grad_norm': 0.7951653118971782, 'learning_rate': 9.882798336178821e-06, 'epoch': 0.1}
10%|▉ | 2145/22095 [3:28:38<20:17:55, 3.66s/it] {'loss': 0.4693, 'grad_norm': 0.7507677694816026, 'learning_rate': 9.882640524763847e-06, 'epoch': 0.1}
10%|▉ | 2146/22095 [3:28:41<19:11:56, 3.46s/it] {'loss': 0.3988, 'grad_norm': 0.6870245167157061, 'learning_rate': 9.882482608435924e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (136485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45911 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50760 > 40960). Running this sequence through the model will result in indexing errors
10%|▉ | 2147/22095 [3:28:45<18:49:36, 3.40s/it] {'loss': 0.4471, 'grad_norm': 0.7071117925139719, 'learning_rate': 9.882324587198446e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (85664 > 40960). Running this sequence through the model will result in indexing errors
10%|▉ | 2148/22095 [3:28:48<18:04:45, 3.26s/it] {'loss': 0.4139, 'grad_norm': 0.6773979773976042, 'learning_rate': 9.882166461054806e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2149/22095 [3:28:53<22:10:23, 4.00s/it] {'loss': 0.5461, 'grad_norm': 0.6139954213727978, 'learning_rate': 9.882008230008403e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [67, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369588 in VC:s3://internvl-moe-sft-data/. Exception: Image size [67, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36340, 'image': 'vrdu_table_final_2/astro-ph.CO/9fc98cdb-e9fb-414a-ae5b-22a8f9eb0ba2.png', 'image_wh': [[67, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\Delta Y_{\\rm tot}$\\end{tabular}\n```"}]}
10%|▉ | 2150/22095 [3:29:03<31:22:12, 5.66s/it] {'loss': 0.5198, 'grad_norm': 0.5598559951748219, 'learning_rate': 9.881849894062639e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [517, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8352308 in VC:s3://internvl-moe-sft-data/. Exception: Image size [517, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 18989, 'image': 'vrdu_table_final_2/astro-ph.CO/8a75a7fd-2416-4df7-817b-2b55037d386b.png', 'image_wh': [[517, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{cccccc}\n\\multicolumn{1}{c}{\\footnotesize $^a$ These are listed without 0 superscript by PDG .} \n\\end{tabular}\n```"}]}
10%|▉ | 2151/22095 [3:29:06<27:39:22, 4.99s/it] {'loss': 0.4562, 'grad_norm': 0.7831826157949604, 'learning_rate': 9.881691453220912e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308190 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2OWuBeMjN8KJjSZFCXXb3GpXa_!!864860621.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请提取照片里的文本。'}, {'from': 'gpt', 'value': '图中所有文字:\n没\n有\n煎\n苦逼\n熬\n的\n傻逼\n付出\n的\n哪\n有\n牛逼闪闪\n天\nTOMORROW\n的\n明'}]}
10%|▉ | 2152/22095 [3:29:10<25:30:50, 4.61s/it] {'loss': 0.4838, 'grad_norm': 0.7644327354430474, 'learning_rate': 9.88153290748663e-06, 'epoch': 0.1}
10%|▉ | 2153/22095 [3:29:15<25:23:49, 4.58s/it] {'loss': 0.4401, 'grad_norm': 0.824965801673457, 'learning_rate': 9.8813742568632e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (44067 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54305 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43549 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50819 > 40960). Running this sequence through the model will result in indexing errors
10%|▉ | 2154/22095 [3:29:18<23:52:08, 4.31s/it] {'loss': 0.4399, 'grad_norm': 0.7638742540229725, 'learning_rate': 9.881215501354025e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2155/22095 [3:29:28<32:33:43, 5.88s/it] {'loss': 0.5198, 'grad_norm': 0.6250697994456945, 'learning_rate': 9.881056640962524e-06, 'epoch': 0.1}
10%|▉ | 2156/22095 [3:29:31<28:37:43, 5.17s/it] {'loss': 0.4157, 'grad_norm': 0.7161150856416505, 'learning_rate': 9.880897675692105e-06, 'epoch': 0.1}
10%|▉ | 2157/22095 [3:29:35<26:31:19, 4.79s/it] {'loss': 0.3921, 'grad_norm': 0.7066523082148944, 'learning_rate': 9.880738605546186e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (45889 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79584 > 40960). Running this sequence through the model will result in indexing errors
10%|▉ | 2158/22095 [3:29:38<23:31:40, 4.25s/it] {'loss': 0.398, 'grad_norm': 0.6960762618579934, 'learning_rate': 9.880579430528183e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (50557 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106507 > 40960).
Running this sequence through the model will result in indexing errors
10%|▉ | 2159/22095 [3:29:41<21:28:45, 3.88s/it] {'loss': 0.3952, 'grad_norm': 0.6712418881697726, 'learning_rate': 9.880420150641519e-06, 'epoch': 0.1}
10%|▉ | 2160/22095 [3:29:45<21:45:39, 3.93s/it] {'loss': 0.4779, 'grad_norm': 0.8044310928745596, 'learning_rate': 9.880260765889615e-06, 'epoch': 0.1}
10%|▉ | 2161/22095 [3:29:48<20:14:23, 3.66s/it] {'loss': 0.4259, 'grad_norm': 0.7582313840845955, 'learning_rate': 9.880101276275896e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [328, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8430561 in VC:s3://internvl-moe-sft-data/. Exception: Image size [328, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 131984, 'image': 'vrdu_texteq/astro-ph.CO/448ae204-27bf-4acf-a000-682478267005.png', 'image_wh': [[328, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'We can obtain $r$ as follows:'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2162/22095 [3:29:58<30:57:15, 5.59s/it] {'loss': 0.5103, 'grad_norm': 0.411187499329373, 'learning_rate': 9.87994168180379e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [245, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8429662 in VC:s3://internvl-moe-sft-data/. Exception: Image size [245, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 94223, 'image': 'vrdu_texteq/astro-ph.CO/b4567203-afdc-4cdd-bdc5-e21ed4909c77.png', 'image_wh': [[245, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $\\Delta\\theta=\\theta_2-\\theta_1$.'}]}
10%|▉ | 2163/22095 [3:30:02<27:36:57, 4.99s/it] {'loss': 0.5148, 'grad_norm': 0.8352279353409846, 'learning_rate': 9.879781982476722e-06, 'epoch': 0.1}
10%|▉ | 2164/22095 [3:30:05<24:22:42, 4.40s/it] {'loss': 0.431, 'grad_norm': 0.6975707964456537, 'learning_rate': 9.879622178298128e-06, 'epoch': 0.1}
10%|▉ | 2165/22095 [3:30:09<23:50:43, 4.31s/it] {'loss': 0.4202, 'grad_norm': 0.7346753431411169, 'learning_rate': 9.879462269271439e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2166/22095 [3:30:18<32:26:41, 5.86s/it] {'loss': 0.5198, 'grad_norm': 0.347524141797627, 'learning_rate': 9.879302255400092e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (125259 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66622 > 40960).
Running this sequence through the model will result in indexing errors
10%|▉ | 2167/22095 [3:30:28<37:52:41, 6.84s/it] {'loss': 0.5421, 'grad_norm': 0.3586006925617358, 'learning_rate': 9.879142136687524e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 364, but got module 1
10%|▉ | 2168/22095 [3:30:31<31:55:38, 5.77s/it] {'loss': 0.4372, 'grad_norm': 0.9478318513259163, 'learning_rate': 9.878981913137178e-06, 'epoch': 0.1}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
10%|▉ | 2169/22095 [3:30:34<27:50:43, 5.03s/it] {'loss': 0.3997, 'grad_norm': 0.7835830213363405, 'learning_rate': 9.878821584752495e-06, 'epoch': 0.1}
10%|▉ | 2170/22095 [3:30:38<25:29:53, 4.61s/it] {'loss': 0.4473, 'grad_norm': 0.7127149457361394, 'learning_rate': 9.878661151536923e-06, 'epoch': 0.1}
10%|▉ | 2171/22095 [3:30:41<22:42:35, 4.10s/it] {'loss': 0.4679, 'grad_norm': 0.9555889260859909, 'learning_rate': 9.878500613493904e-06, 'epoch': 0.1}
10%|▉ | 2172/22095 [3:30:44<20:34:26, 3.72s/it] {'loss': 0.4414, 'grad_norm': 0.741480683152549, 'learning_rate': 9.87833997062689e-06, 'epoch': 0.1}
10%|▉ | 2173/22095 [3:30:47<19:42:37, 3.56s/it] {'loss': 0.4415, 'grad_norm': 0.7204742635936184, 'learning_rate': 9.878179222939333e-06, 'epoch': 0.1}
10%|▉ | 2174/22095 [3:30:51<20:11:59, 3.65s/it] {'loss': 0.4517, 'grad_norm': 0.898233981729686, 'learning_rate': 9.878018370434686e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2175/22095 [3:31:00<29:49:00, 5.39s/it] {'loss': 0.5291, 'grad_norm': 0.5640581299672587, 'learning_rate': 9.877857413116408e-06, 'epoch': 0.1}
10%|▉ | 2176/22095 [3:31:04<27:24:14, 4.95s/it] {'loss': 0.4235, 'grad_norm': 0.8242186289537619, 'learning_rate': 9.877696350987954e-06, 'epoch': 0.1}
10%|▉ | 2177/22095 [3:31:08<25:32:52, 4.62s/it] {'loss': 0.4414, 'grad_norm': 0.7710921868502916, 'learning_rate': 9.877535184052786e-06, 'epoch': 0.1}
10%|▉ | 2178/22095 [3:31:11<23:24:17, 4.23s/it] {'loss': 0.4113, 'grad_norm': 0.6927559731602532, 'learning_rate': 9.877373912314367e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2179/22095 [3:31:19<29:28:40, 5.33s/it] {'loss': 0.5251, 'grad_norm': 0.3404435555777899, 'learning_rate': 9.877212535776161e-06, 'epoch': 0.1}
10%|▉ | 2180/22095 [3:31:22<25:44:59, 4.65s/it] {'loss': 0.4138, 'grad_norm': 0.8667350818901997, 'learning_rate': 9.87705105444164e-06, 'epoch': 0.1}
10%|▉ | 2181/22095 [3:31:26<24:30:12, 4.43s/it] {'loss': 0.4152, 'grad_norm': 0.9045198118800067, 'learning_rate': 9.876889468314268e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (42502 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69483 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42508 > 40960) for 4 sample(s). Truncating to 1201 with 2 samples.
10%|▉ | 2182/22095 [3:31:29<22:08:19, 4.00s/it] {'loss': 0.4166, 'grad_norm': 0.6987822562694133, 'learning_rate': 9.876727777397522e-06, 'epoch': 0.1}
10%|▉ | 2183/22095 [3:31:32<21:00:59, 3.80s/it] {'loss': 0.4554, 'grad_norm': 0.7422912359370816, 'learning_rate': 9.876565981694871e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (69543 > 40960). Running this sequence through the model will result in indexing errors
10%|▉ | 2184/22095 [3:31:36<19:55:13, 3.60s/it] {'loss': 0.4713, 'grad_norm': 0.8370568389817287, 'learning_rate': 9.876404081209796e-06, 'epoch': 0.1}
10%|▉ | 2185/22095 [3:31:38<18:39:27, 3.37s/it] {'loss': 0.4493, 'grad_norm': 0.7619494348801985, 'learning_rate': 9.876242075945774e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (74739 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93400 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41139 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61047 > 40960).
Running this sequence through the model will result in indexing errors 10%|▉ | 2186/22095 [3:31:41<17:46:06, 3.21s/it] {'loss': 0.4599, 'grad_norm': 0.7099278623393065, 'learning_rate': 9.876079965906284e-06, 'epoch': 0.1} 10%|▉ | 2186/22095 [3:31:41<17:46:06, 3.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 10%|▉ | 2187/22095 [3:31:45<18:41:43, 3.38s/it] {'loss': 0.3861, 'grad_norm': 0.6536632383972745, 'learning_rate': 9.875917751094814e-06, 'epoch': 0.1} 10%|▉ | 2187/22095 [3:31:45<18:41:43, 3.38s/it] 10%|▉ | 2188/22095 [3:31:49<19:48:25, 3.58s/it] {'loss': 0.4468, 'grad_norm': 0.7126200240781412, 'learning_rate': 9.875755431514846e-06, 'epoch': 0.1} 10%|▉ | 2188/22095 [3:31:49<19:48:25, 3.58s/it] 10%|▉ | 2189/22095 [3:31:52<18:49:52, 3.41s/it] {'loss': 0.4283, 'grad_norm': 0.723300049733845, 'learning_rate': 9.875593007169868e-06, 'epoch': 0.1} 10%|▉ | 2189/22095 [3:31:52<18:49:52, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045957 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 10\nB. 8\nC. 7\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946046 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 69199, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 6\nB. 7.5\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 10%|▉ | 2190/22095 [3:32:01<27:42:20, 5.01s/it] {'loss': 0.5392, 'grad_norm': 1.9414330444935057, 'learning_rate': 9.87543047806337e-06, 'epoch': 0.1} 10%|▉ | 2190/22095 [3:32:01<27:42:20, 5.01s/it] 10%|▉ | 2191/22095 [3:32:05<25:56:35, 4.69s/it] {'loss': 0.4594, 'grad_norm': 0.7713129418149638, 'learning_rate': 9.875267844198846e-06, 'epoch': 0.1} 10%|▉ | 2191/22095 [3:32:05<25:56:35, 4.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 10%|▉ | 2192/22095 [3:32:15<34:30:35, 6.24s/it] {'loss': 0.5206, 'grad_norm': 0.36286998565742357, 'learning_rate': 9.875105105579789e-06, 'epoch': 0.1} 10%|▉ | 2192/22095 [3:32:15<34:30:35, 6.24s/it] 10%|▉ | 2193/22095 [3:32:18<30:10:41, 5.46s/it] {'loss': 0.4469, 'grad_norm': 0.7090515722591921, 'learning_rate': 9.874942262209695e-06, 'epoch': 0.1} 10%|▉ | 2193/22095 [3:32:18<30:10:41, 5.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 
does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|▉ | 2194/22095 [3:32:29<38:20:04, 6.93s/it] {'loss': 0.513, 'grad_norm': 0.35163704632743675, 'learning_rate': 9.874779314092065e-06, 'epoch': 0.1}
10%|▉ | 2195/22095 [3:32:33<34:02:25, 6.16s/it] {'loss': 0.4538, 'grad_norm': 0.7991550466891403, 'learning_rate': 9.874616261230398e-06, 'epoch': 0.1}
10%|▉ | 2196/22095 [3:32:36<28:47:04, 5.21s/it] {'loss': 0.4492, 'grad_norm': 0.7567345337215104, 'learning_rate': 9.874453103628201e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|▉ | 2197/22095 [3:32:39<26:04:16, 4.72s/it] {'loss': 0.419, 'grad_norm': 0.6729555115890912, 'learning_rate': 9.874289841288976e-06, 'epoch': 0.1}
10%|▉ | 2198/22095 [3:32:43<23:49:41, 4.31s/it] {'loss': 0.4344, 'grad_norm': 0.7729185223247828, 'learning_rate': 9.874126474216234e-06, 'epoch': 0.1}
10%|▉ | 2199/22095 [3:32:46<22:04:51, 4.00s/it] {'loss': 0.4201, 'grad_norm': 0.7355415324216681, 'learning_rate': 9.873963002413483e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_4.png 2025-08-27 19:30:44.884784 load time: 1044.49 ms
10%|▉ | 2200/22095 [3:32:49<20:09:52, 3.65s/it] {'loss': 0.4277, 'grad_norm': 0.6982893543482479, 'learning_rate': 9.873799425884235e-06, 'epoch': 0.1}
10%|▉ | 2201/22095 [3:32:52<19:27:43, 3.52s/it] {'loss': 0.4539, 'grad_norm': 0.773880454489427, 'learning_rate': 9.873635744632008e-06, 'epoch': 0.1}
10%|▉ | 2202/22095 [3:32:55<19:04:34, 3.45s/it] {'loss': 0.4192, 'grad_norm': 0.8494593639519243, 'learning_rate': 9.873471958660316e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|▉ | 2203/22095 [3:33:06<30:59:05, 5.61s/it] {'loss': 0.5519, 'grad_norm': 0.8580004122744093, 'learning_rate': 9.873308067972679e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_11/images/before_screenshot_54_id_325_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-27 19:31:04.878424 load time: 1033.11 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-27 19:31:05.866199 load time: 1103.5 ms
10%|▉ | 2204/22095 [3:33:10<27:45:38, 5.02s/it] {'loss': 0.4467, 'grad_norm': 0.7290001443780998, 'learning_rate': 9.87314407257262e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (57477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46629 > 40960).
Running this sequence through the model will result in indexing errors
10%|▉ | 2205/22095 [3:33:13<25:04:53, 4.54s/it] {'loss': 0.4354, 'grad_norm': 0.7988956837738996, 'learning_rate': 9.87297997246366e-06, 'epoch': 0.1}
10%|▉ | 2206/22095 [3:33:16<22:50:57, 4.14s/it] {'loss': 0.4403, 'grad_norm': 0.7358440002456154, 'learning_rate': 9.872815767649329e-06, 'epoch': 0.1}
10%|▉ | 2207/22095 [3:33:21<23:01:32, 4.17s/it] {'loss': 0.4011, 'grad_norm': 0.7204005245588492, 'learning_rate': 9.87265145813315e-06, 'epoch': 0.1}
10%|▉ | 2208/22095 [3:33:25<23:05:14, 4.18s/it] {'loss': 0.4647, 'grad_norm': 0.9151800895514478, 'learning_rate': 9.872487043918659e-06, 'epoch': 0.1}
10%|▉ | 2209/22095 [3:33:28<21:46:08, 3.94s/it] {'loss': 0.415, 'grad_norm': 0.6684901673506206, 'learning_rate': 9.872322525009383e-06, 'epoch': 0.1}
10%|█ | 2210/22095 [3:33:31<19:58:30, 3.62s/it] {'loss': 0.4549, 'grad_norm': 0.810784992270613, 'learning_rate': 9.872157901408863e-06, 'epoch': 0.1}
10%|█ | 2211/22095 [3:33:34<19:02:47, 3.45s/it] {'loss': 0.4739, 'grad_norm': 0.7561176597277325, 'learning_rate': 9.871993173120633e-06, 'epoch': 0.1}
10%|█ | 2212/22095 [3:33:37<18:11:56, 3.30s/it] {'loss': 0.3496, 'grad_norm': 0.7224618216199857, 'learning_rate': 9.871828340148232e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2213/22095 [3:33:47<29:01:19, 5.25s/it] {'loss': 0.537, 'grad_norm': 0.48863707696024233, 'learning_rate': 9.871663402495202e-06, 'epoch': 0.1}
10%|█ | 2214/22095 [3:33:51<26:55:59, 4.88s/it] {'loss': 0.43, 'grad_norm': 0.7221464520254421, 'learning_rate': 9.87149836016509e-06, 'epoch': 0.1}
10%|█ | 2215/22095 [3:33:54<23:31:04, 4.26s/it] {'loss': 0.436, 'grad_norm': 0.7771878759212629, 'learning_rate': 9.871333213161438e-06, 'epoch': 0.1}
10%|█ | 2216/22095 [3:33:57<22:00:03, 3.98s/it] {'loss': 0.4362, 'grad_norm': 0.7560332076204587, 'learning_rate': 9.871167961487798e-06, 'epoch': 0.1}
10%|█ | 2217/22095 [3:34:00<20:51:24, 3.78s/it] {'loss': 0.4112, 'grad_norm': 0.8018401387575208, 'learning_rate': 9.871002605147717e-06, 'epoch': 0.1}
10%|█ | 2218/22095 [3:34:04<20:20:26, 3.68s/it] {'loss': 0.4268, 'grad_norm': 0.6839784259457237, 'learning_rate': 9.870837144144752e-06, 'epoch': 0.1}
10%|█ | 2219/22095 [3:34:07<19:31:40, 3.54s/it] {'loss': 0.4214, 'grad_norm': 0.6933617275224555, 'learning_rate': 9.870671578482457e-06, 'epoch': 0.1}
10%|█ | 2220/22095 [3:34:10<19:18:36, 3.50s/it] {'loss': 0.4716, 'grad_norm': 0.7206012184599776, 'learning_rate': 9.870505908164386e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884876 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
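The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` is a pre-flight size guard in the dataset code. The 28-pixel floor is consistent with Qwen2.5-VL's vision tower, which cuts images into 14-pixel patches and merges them 2x2, so a side shorter than 28 cannot produce even one vision token. A minimal sketch of such a guard (hypothetical helper name; the actual check lives in `data_qwen_2.py`'s `_get_item`):

```python
# Hedged sketch of a minimum-size guard like the one raising the errors in
# this log. MIN_SIDE = 28 assumes Qwen2.5-VL's 14-px patches merged 2x2.
MIN_SIDE = 28

def check_image_size(size):
    """Reject images whose width or height is below MIN_SIDE.

    `size` mirrors the logged list, e.g. [164, 26, 100, 100]; the first two
    entries are the image's width and height.
    """
    width, height = size[0], size[1]
    if min(width, height) < MIN_SIDE:
        raise ValueError(f"Image size {size} is too small. Minimum size is {MIN_SIDE}.")
```

Running such a check when the sample is fetched, rather than inside the image processor, lets the loader skip or retry the sample instead of crashing mid-batch.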
Problematic sample: {'id': 8029, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 8\nB. 7\nC. 6\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
10%|█ | 2221/22095 [3:34:14<19:44:21, 3.58s/it] {'loss': 0.4395, 'grad_norm': 0.8075038548608849, 'learning_rate': 9.870340133194103e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (41780 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77715 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62531 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41632 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2222/22095 [3:34:18<20:29:55, 3.71s/it] {'loss': 0.4831, 'grad_norm': 0.7635458649375578, 'learning_rate': 9.870174253575169e-06, 'epoch': 0.1}
10%|█ | 2223/22095 [3:34:22<19:56:17, 3.61s/it] {'loss': 0.437, 'grad_norm': 0.8634190586113031, 'learning_rate': 9.870008269311148e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (59121 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41521 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49858 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2224/22095 [3:34:25<19:18:18, 3.50s/it] {'loss': 0.4045, 'grad_norm': 0.6343631893568048, 'learning_rate': 9.869842180405607e-06, 'epoch': 0.1}
10%|█ | 2225/22095 [3:34:29<19:41:24, 3.57s/it] {'loss': 0.4848, 'grad_norm': 0.7399088703990067, 'learning_rate': 9.869675986862113e-06, 'epoch': 0.1}
10%|█ | 2226/22095 [3:34:31<18:38:20, 3.38s/it] {'loss': 0.4651, 'grad_norm': 0.7663305947086005, 'learning_rate': 9.869509688684238e-06, 'epoch': 0.1}
10%|█ | 2227/22095 [3:34:34<17:49:59, 3.23s/it] {'loss': 0.4519, 'grad_norm': 0.7544720440825532, 'learning_rate': 9.869343285875556e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_49_id_131_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-27 19:32:34.335217 load time: 1222.5 ms
VC:s3://internvl2/datasets/IAM/image/g07-007a.png 2025-08-27 19:32:34.767303 load time: 1024.32 ms
10%|█ | 2228/22095 [3:34:39<20:03:54, 3.64s/it] {'loss': 0.4276, 'grad_norm': 0.7707733603624612, 'learning_rate': 9.869176778439641e-06, 'epoch': 0.1}
10%|█ | 2229/22095 [3:34:42<19:27:24, 3.53s/it] {'loss': 0.4214, 'grad_norm': 0.8454366073314977, 'learning_rate': 9.869010166380074e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected
module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2230/22095 [3:34:48<23:40:56, 4.29s/it] {'loss': 0.5228, 'grad_norm': 0.5461864636688677, 'learning_rate': 9.868843449700429e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954301 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5136, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 6\nB. 2\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
10%|█ | 2231/22095 [3:34:52<22:00:25, 3.99s/it] {'loss': 0.4261, 'grad_norm': 0.7294155882889573, 'learning_rate': 9.868676628404294e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54162 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73183 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66228 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2232/22095 [3:35:01<31:17:11, 5.67s/it] {'loss': 0.5219, 'grad_norm': 0.3168440817594752, 'learning_rate': 9.86850970249525e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2233/22095 [3:35:05<27:39:32, 5.01s/it] {'loss': 0.3979, 'grad_norm': 0.7579059657604005, 'learning_rate': 9.868342671976887e-06, 'epoch': 0.1}
10%|█ | 2234/22095 [3:35:09<26:05:31, 4.73s/it] {'loss': 0.4608, 'grad_norm': 0.7292905542346751, 'learning_rate': 9.86817553685279e-06, 'epoch': 0.1}
10%|█ | 2235/22095 [3:35:12<23:32:58, 4.27s/it] {'loss': 0.4371, 'grad_norm': 0.8458997911415539, 'learning_rate': 9.868008297126552e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (49433 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53746 > 40960).
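The `Token indices sequence length is longer than the specified maximum sequence length` lines are the standard Hugging Face tokenizer warning: an encoded sample exceeds the tokenizer's `model_max_length` (here 40960), and unless it is truncated or dropped before the forward pass it can trigger out-of-range position or index errors. A minimal guard (hypothetical helper names; how the training script actually handles over-long samples is not shown in this log):

```python
# Hedged sketch: filter or truncate encoded samples against the 40960-token
# limit that appears in the warnings above.
MODEL_MAX_LENGTH = 40960

def fits_context(input_ids, max_len=MODEL_MAX_LENGTH):
    """Return True if the encoded sample fits within the context window."""
    return len(input_ids) <= max_len

def truncate_to_context(input_ids, max_len=MODEL_MAX_LENGTH):
    # Naive tail truncation. For multimodal samples this must not cut
    # through an image-placeholder span, so treat it as a sketch only.
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]
```

Dropping such samples (or pre-filtering them offline by token count) is usually preferable to truncation for conversation data, since truncation can cut off the assistant turn that carries the training signal.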
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63138 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56872 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68188 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118855 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2236/22095 [3:35:15<21:49:32, 3.96s/it] {'loss': 0.3859, 'grad_norm': 0.824759911636903, 'learning_rate': 9.867840952801768e-06, 'epoch': 0.1}
10%|█ | 2237/22095 [3:35:18<20:16:59, 3.68s/it] {'loss': 0.44, 'grad_norm': 0.7304273464477413, 'learning_rate': 9.867673503882031e-06, 'epoch': 0.1}
10%|█ | 2238/22095 [3:35:22<20:08:22, 3.65s/it] {'loss': 0.4505, 'grad_norm': 0.7799304327722856, 'learning_rate': 9.867505950370942e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (68221 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2239/22095 [3:35:25<18:42:12, 3.39s/it] {'loss': 0.4801, 'grad_norm': 0.7306543877797002, 'learning_rate': 9.8673382922721e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2240/22095 [3:35:34<28:37:15, 5.19s/it] {'loss': 0.5353, 'grad_norm': 0.7435279790784834, 'learning_rate': 9.867170529589106e-06, 'epoch': 0.1}
10%|█ | 2241/22095 [3:35:38<26:08:34, 4.74s/it] {'loss': 0.4005, 'grad_norm': 0.6603844615030072, 'learning_rate': 9.867002662325564e-06, 'epoch': 0.1}
10%|█ | 2242/22095 [3:35:42<25:06:54, 4.55s/it] {'loss': 0.4415, 'grad_norm': 0.8630624848995856, 'learning_rate': 9.866834690485083e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_47_id_78_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-27 19:33:41.111482 load time: 1227.61 ms
10%|█ | 2243/22095 [3:35:45<22:31:48, 4.09s/it] {'loss': 0.4243, 'grad_norm': 0.9896646400129324, 'learning_rate': 9.866666614071274e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2244/22095 [3:35:55<32:42:00, 5.93s/it] {'loss': 0.5234, 'grad_norm': 0.41220129274846495, 'learning_rate': 9.866498433087745e-06, 'epoch': 0.1}
10%|█ | 2245/22095 [3:36:01<32:42:56, 5.93s/it] {'loss': 0.4474, 'grad_norm': 0.8362621005931979, 'learning_rate': 9.86633014753811e-06, 'epoch': 0.1}
10%|█ | 2246/22095 [3:36:04<27:56:18, 5.07s/it] {'loss': 0.4534, 'grad_norm': 0.7503981593504316, 'learning_rate': 9.866161757425988e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2247/22095 [3:36:14<36:00:11, 6.53s/it] {'loss': 0.5063, 'grad_norm': 0.41195840898795105, 'learning_rate': 9.865993262754993e-06, 'epoch': 0.1}
10%|█ | 2248/22095 [3:36:17<31:10:16, 5.65s/it] {'loss': 0.433, 'grad_norm': 0.7265194341332623, 'learning_rate': 9.86582466352875e-06, 'epoch': 0.1}
10%|█ | 2249/22095 [3:36:20<26:46:25, 4.86s/it] {'loss': 0.4292, 'grad_norm': 0.8409081526645884, 'learning_rate': 9.865655959750877e-06, 'epoch': 0.1}
10%|█ | 2250/22095 [3:36:24<25:17:46, 4.59s/it] {'loss': 0.4367, 'grad_norm': 3.0630792854097093, 'learning_rate': 9.865487151425003e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-27 19:34:23.228517 load time: 1055.06 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2251/22095 [3:36:28<22:53:07, 4.15s/it] {'loss': 0.4542, 'grad_norm': 0.8861233445538413, 'learning_rate': 9.865318238554754e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (90852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50596 > 40960).
Running this sequence through the model will result in indexing errors
10%|█ | 2252/22095 [3:36:31<21:18:05, 3.86s/it] {'loss': 0.418, 'grad_norm': 0.7822807066601043, 'learning_rate': 9.865149221143755e-06, 'epoch': 0.1}
10%|█ | 2253/22095 [3:36:34<20:26:10, 3.71s/it] {'loss': 0.4496, 'grad_norm': 0.7559943477806504, 'learning_rate': 9.864980099195644e-06, 'epoch': 0.1}
10%|█ | 2254/22095 [3:36:37<19:29:19, 3.54s/it] {'loss': 0.4134, 'grad_norm': 0.768806042496435, 'learning_rate': 9.864810872714053e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [292, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8437042 in VC:s3://internvl-moe-sft-data/. Exception: Image size [292, 25, 100, 100] is too small. Minimum size is 28.
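The `[Try #0] Failed to fetch sample … Exception: …` lines show that the loader catches per-sample failures, logs the offender, and retries rather than killing the 22k-step run. A common way to structure this (a sketch with hypothetical names and a simplified `_get_item`, not the actual `data_qwen_2.py` logic) is to wrap `__getitem__` in a bounded retry loop that falls back to a different index:

```python
import random

class RobustDataset:
    """Sketch: per-sample loading with logged retries on bad samples."""

    def __init__(self, samples, max_tries=10):
        self.samples = samples
        self.max_tries = max_tries

    def _get_item(self, i):
        # Stand-in for real decoding/validation; raises like the log above.
        sample = self.samples[i]
        if sample.get("bad"):
            raise ValueError(f"Image size {sample['image_wh']} is too small. Minimum size is 28.")
        return sample

    def __getitem__(self, i):
        for attempt in range(self.max_tries):
            try:
                return self._get_item(i)
            except Exception as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {i}. Exception: {exc}")
                i = random.randrange(len(self.samples))  # fall back to another index
        raise RuntimeError("Exceeded retry budget while fetching a sample")
```

Logging the problematic sample (as this run does) is the important part: it turns silent data skew into an auditable record of which records were dropped.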
Problematic sample: {'id': 52205, 'image': 'vrdu_texteq/astro-ph.CO/d9853c61-db8e-46ad-8f73-8d9b1530af42.png', 'image_wh': [[292, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where ${\\bf s}_{12} \\equiv{\\bf s}_1 - {\\bf s}_2$ and'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_4.png 2025-08-27 19:34:37.633423 load time: 1413.52 ms
10%|█ | 2255/22095 [3:36:41<19:12:19, 3.48s/it] {'loss': 0.4246, 'grad_norm': 0.7047306300749351, 'learning_rate': 9.864641541702616e-06, 'epoch': 0.1}
10%|█ | 2256/22095 [3:36:45<20:03:05, 3.64s/it] {'loss': 0.4247, 'grad_norm': 0.7028731886223434, 'learning_rate': 9.864472106164974e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (47819 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44567 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2257/22095 [3:36:48<19:10:56, 3.48s/it] {'loss': 0.4625, 'grad_norm': 0.7153130576512241, 'learning_rate': 9.864302566104764e-06, 'epoch': 0.1}
10%|█ | 2258/22095 [3:36:51<19:29:14, 3.54s/it] {'loss': 0.4191, 'grad_norm': 1.048917465822911, 'learning_rate': 9.864132921525633e-06, 'epoch': 0.1}
10%|█ | 2259/22095 [3:36:55<18:48:28, 3.41s/it] {'loss': 0.4477, 'grad_norm': 0.7899022592166589, 'learning_rate': 9.863963172431225e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2260/22095 [3:37:03<27:04:48, 4.91s/it] {'loss': 0.5383, 'grad_norm': 0.47204269426321754, 'learning_rate': 9.863793318825186e-06, 'epoch': 0.1}
10%|█ | 2261/22095 [3:37:07<25:12:43, 4.58s/it] {'loss': 0.4722, 'grad_norm': 0.6892225594080795, 'learning_rate': 9.863623360711167e-06, 'epoch': 0.1}
10%|█ | 2262/22095 [3:37:10<22:43:40, 4.13s/it] {'loss': 0.4466, 'grad_norm': 0.748046058634716, 'learning_rate': 9.86345329809282e-06, 'epoch': 0.1}
10%|█ | 2263/22095 [3:37:13<21:16:45, 3.86s/it] {'loss': 0.4145, 'grad_norm': 0.7130768776206144, 'learning_rate': 9.863283130973799e-06, 'epoch': 0.1}
10%|█ | 2264/22095 [3:37:16<19:48:36, 3.60s/it] {'loss': 0.4318, 'grad_norm': 0.8219802036337167, 'learning_rate': 9.86311285935776e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30442.png 2025-08-27 19:35:13.720469 load time: 1190.38 ms
10%|█ | 2265/22095 [3:37:26<29:33:07, 5.36s/it]
{'loss': 0.5065, 'grad_norm': 0.3198446179300901, 'learning_rate': 9.86294248324836e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2266/22095 [3:37:29<26:40:21, 4.84s/it] {'loss': 0.4529, 'grad_norm': 0.756389577087066, 'learning_rate': 9.862772002649261e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2267/22095 [3:37:32<23:49:11, 4.32s/it] {'loss': 0.4489, 'grad_norm': 0.7648477142665258, 'learning_rate': 9.862601417564128e-06, 'epoch': 0.1}
10%|█ | 2268/22095 [3:37:36<22:15:43, 4.04s/it] {'loss': 0.4219, 'grad_norm': 0.7699363576476402, 'learning_rate': 9.862430727996627e-06, 'epoch': 0.1}
10%|█ | 2269/22095 [3:37:39<21:34:57, 3.92s/it] {'loss': 0.407, 'grad_norm': 0.7729899239840131, 'learning_rate': 9.86225993395042e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (45389 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91151 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2270/22095 [3:37:42<19:52:32, 3.61s/it] {'loss': 0.3968, 'grad_norm': 0.7204098134653517, 'learning_rate': 9.86208903542918e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2271/22095 [3:37:52<29:42:48, 5.40s/it] {'loss': 0.5204, 'grad_norm': 0.35254404610092344, 'learning_rate': 9.861918032436582e-06, 'epoch': 0.1}
10%|█ | 2272/22095 [3:37:55<26:28:12, 4.81s/it] {'loss': 0.4407, 'grad_norm': 0.7655693019851239, 'learning_rate': 9.861746924976297e-06, 'epoch': 0.1}
10%|█ | 2273/22095 [3:37:58<23:49:58, 4.33s/it] {'loss': 0.444, 'grad_norm': 0.7358913387904376, 'learning_rate': 9.861575713052e-06, 'epoch': 0.1}
10%|█ | 2274/22095 [3:38:01<21:30:23, 3.91s/it] {'loss': 0.4331, 'grad_norm': 0.7648849776920894, 'learning_rate': 9.861404396667375e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2275/22095 [3:38:05<20:55:40, 3.80s/it] {'loss': 0.4632, 'grad_norm': 0.7137409472692013, 'learning_rate': 9.861232975826098e-06, 'epoch': 0.1}
10%|█ | 2276/22095 [3:38:08<19:54:34, 3.62s/it] {'loss': 0.4217, 'grad_norm': 0.8041533524954448, 'learning_rate': 9.861061450531857e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2277/22095 [3:38:11<19:30:24, 3.54s/it] {'loss': 0.3682, 'grad_norm': 0.6937125693526789, 'learning_rate': 9.860889820788333e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2:
expected module 1, but got module 364
10%|█ | 2278/22095 [3:38:19<26:16:21, 4.77s/it] {'loss': 0.5088, 'grad_norm': 0.3890112300684754, 'learning_rate': 9.860718086599217e-06, 'epoch': 0.1}
10%|█ | 2279/22095 [3:38:23<24:31:10, 4.45s/it] {'loss': 0.4567, 'grad_norm': 0.755158671146552, 'learning_rate': 9.860546247968196e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8384672 in VC:s3://internvl-moe-sft-data/. Exception: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 51472, 'image': 'vrdu_table_final_2/astro-ph.CO/9b1ef985-5a14-410a-9b9f-def389fc8e19.png', 'image_wh': [[231, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}Number of clusters\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2280/22095 [3:38:26<21:52:41, 3.97s/it] {'loss': 0.3703, 'grad_norm': 0.6964351594584257, 'learning_rate': 9.860374304898966e-06, 'epoch': 0.1}
10%|█ | 2281/22095 [3:38:29<21:21:09, 3.88s/it] {'loss': 0.4203, 'grad_norm': 0.6942034226642203, 'learning_rate': 9.86020225739522e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2282/22095 [3:38:33<21:11:26, 3.85s/it] {'loss': 0.4105, 'grad_norm': 0.7493731249129036, 'learning_rate': 9.860030105460655e-06, 'epoch': 0.1}
10%|█ | 2283/22095 [3:38:36<20:24:44, 3.71s/it] {'loss': 0.3853, 'grad_norm': 0.7875242177860882, 'learning_rate': 9.859857849098967e-06, 'epoch': 0.1}
10%|█ | 2284/22095 [3:38:41<21:19:45, 3.88s/it] {'loss': 0.4332, 'grad_norm': 0.6723055543092835, 'learning_rate': 9.859685488313861e-06, 'epoch': 0.1}
10%|█ | 2285/22095 [3:38:44<20:36:41, 3.75s/it] {'loss': 0.4391, 'grad_norm': 0.6560360537570137, 'learning_rate': 9.859513023109037e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (92788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91213 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2286/22095 [3:38:47<19:36:18, 3.56s/it] {'loss': 0.4383, 'grad_norm': 0.6849114895958732, 'learning_rate': 9.859340453488206e-06, 'epoch': 0.1}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_183722_4/images/before_screenshot_44_id_139_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-27 19:36:46.958102 load time: 1162.79 ms
10%|█ | 2287/22095 [3:38:50<18:33:04, 3.37s/it] {'loss': 0.4432, 'grad_norm': 0.7783040393773684, 'learning_rate': 9.859167779455072e-06, 'epoch': 0.1}
10%|█ | 2288/22095 [3:38:53<17:39:54, 3.21s/it] {'loss': 0.4384, 'grad_norm': 0.724288770259416, 'learning_rate': 9.858995001013347e-06, 'epoch': 0.1}
10%|█ | 2289/22095 [3:38:56<17:40:22, 3.21s/it] {'loss': 0.4456, 'grad_norm': 0.7111758444461136, 'learning_rate': 9.858822118166742e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308701 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2o7gEbDnI8KJjSszgXXc8ApXa_!!517868909.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text, no need to display the bounding box.'}, {'from': 'gpt', 'value': 'All words in the image:\n精美礼品相赠\n精美礼品&二选一\n或\n公仔钥匙扣\n送\n送\n香皂玫瑰花\n(仓库随机发放)\n小号不送礼物\n默认送钥匙扣,'}]}
10%|█ | 2290/22095 [3:39:00<18:07:06, 3.29s/it] {'loss': 0.4351, 'grad_norm': 0.6932478030538325, 'learning_rate': 9.85864913091897e-06, 'epoch': 0.1}
10%|█ | 2291/22095 [3:39:03<18:10:11, 3.30s/it] {'loss': 0.4304, 'grad_norm': 0.7579749515859955, 'learning_rate': 9.858476039273755e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (63292 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91534 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44050 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100473 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2292/22095 [3:39:06<17:18:33, 3.15s/it] {'loss': 0.4056, 'grad_norm': 0.7060927633532289, 'learning_rate': 9.85830284323481e-06, 'epoch': 0.1}
Token indices sequence length is longer than the specified maximum sequence length for this model (80137 > 40960). Running this sequence through the model will result in indexing errors
10%|█ | 2293/22095 [3:39:09<17:30:59, 3.18s/it] {'loss': 0.4651, 'grad_norm': 0.7485879479542586, 'learning_rate': 9.858129542805857e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2294/22095 [3:39:17<25:33:43, 4.65s/it] {'loss': 0.5417, 'grad_norm': 0.48112576797143786, 'learning_rate': 9.857956137990621e-06, 'epoch': 0.1}
10%|█ | 2295/22095 [3:39:21<23:23:50, 4.25s/it] {'loss': 0.4429, 'grad_norm': 0.7667656764960077, 'learning_rate': 9.857782628792826e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2296/22095 [3:39:24<22:11:53, 4.04s/it] {'loss': 0.436, 'grad_norm': 0.6954661751657719, 'learning_rate': 9.857609015216205e-06, 'epoch': 0.1}
10%|█ | 2297/22095 [3:39:28<21:30:41, 3.91s/it] {'loss': 0.4751, 'grad_norm': 0.6967576041244012, 'learning_rate': 9.857435297264484e-06, 'epoch': 0.1}
10%|█ | 2298/22095 [3:39:30<19:40:49, 3.58s/it] {'loss': 0.4228, 'grad_norm': 0.834588518369712, 'learning_rate': 9.857261474941397e-06, 'epoch': 0.1}
10%|█ | 2299/22095 [3:39:33<18:21:46, 3.34s/it] {'loss': 0.3802, 'grad_norm': 0.6721588650367362, 'learning_rate': 9.85708754825068e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
10%|█ | 2300/22095 [3:39:43<28:23:59, 5.16s/it] {'loss': 0.5238, 'grad_norm': 0.41026951413440077, 'learning_rate': 9.856913517196065e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0:
Fixed image tokens in the conversation 10%|█ | 2301/22095 [3:39:46<25:13:00, 4.59s/it] {'loss': 0.4336, 'grad_norm': 0.7035267949243749, 'learning_rate': 9.8567393817813e-06, 'epoch': 0.1} 10%|█ | 2301/22095 [3:39:46<25:13:00, 4.59s/it] 10%|█ | 2302/22095 [3:39:49<23:17:48, 4.24s/it] {'loss': 0.4439, 'grad_norm': 0.7585486917088173, 'learning_rate': 9.85656514201012e-06, 'epoch': 0.1} 10%|█ | 2302/22095 [3:39:49<23:17:48, 4.24s/it]VC:s3://internvl2/datasets/MMMUDataset/MMMU/Agriculture/test_79_image_1.png 2025-08-27 19:37:48.777353 load time: 1207.86 ms 10%|█ | 2303/22095 [3:39:53<22:44:30, 4.14s/it] {'loss': 0.403, 'grad_norm': 0.6649764632725609, 'learning_rate': 9.85639079788627e-06, 'epoch': 0.1} 10%|█ | 2303/22095 [3:39:53<22:44:30, 4.14s/it] 10%|█ | 2304/22095 [3:39:57<22:09:30, 4.03s/it] {'loss': 0.4479, 'grad_norm': 0.7385706952114729, 'learning_rate': 9.856216349413499e-06, 'epoch': 0.1} 10%|█ | 2304/22095 [3:39:57<22:09:30, 4.03s/it] 10%|█ | 2305/22095 [3:40:00<20:45:13, 3.78s/it] {'loss': 0.3824, 'grad_norm': 0.6905045514972701, 'learning_rate': 9.856041796595553e-06, 'epoch': 0.1} 10%|█ | 2305/22095 [3:40:00<20:45:13, 3.78s/it] 10%|█ | 2306/22095 [3:40:03<19:55:32, 3.62s/it] {'loss': 0.395, 'grad_norm': 0.6327526228856807, 'learning_rate': 9.855867139436182e-06, 'epoch': 0.1} 10%|█ | 2306/22095 [3:40:03<19:55:32, 3.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8941398 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 64551, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 7cm\nB. 5.4cm\nC. 6.4cm\nD. 6.8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 10%|█ | 2307/22095 [3:40:07<19:04:45, 3.47s/it] {'loss': 0.48, 'grad_norm': 0.7546079408741386, 'learning_rate': 9.85569237793914e-06, 'epoch': 0.1} 10%|█ | 2307/22095 [3:40:07<19:04:45, 3.47s/it]VC:s3://gui-agent/agentnet/win_mac_images/db1289e0-7ec3-4b02-aae6-fe698814b712.png 2025-08-27 19:38:05.352080 load time: 1204.42 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/vscode_1/images/step_0.png 2025-08-27 19:38:06.411681 load time: 1109.61 ms 10%|█ | 2308/22095 [3:40:10<18:34:36, 3.38s/it] {'loss': 0.4505, 'grad_norm': 0.737242030395461, 'learning_rate': 9.855517512108182e-06, 'epoch': 0.1} 10%|█ | 2308/22095 [3:40:10<18:34:36, 3.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8301566 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
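The "Image size ... is too small. Minimum size is 28" failures above all come from samples whose recorded image_wh has a side below 28 px (or is [0, 0] entirely). Rather than letting data_qwen_2.py raise mid-epoch, such records can be filtered out when the dataset manifest is built. A minimal pre-filtering sketch, assuming only that the 28 px minimum-side rule from the ValueError applies; the function names are illustrative, not the trainer's actual code:

```python
# Drop samples whose recorded (width, height) violates the 28px minimum side
# enforced by data_qwen_2.py ("Minimum size is 28"), so they fail at
# manifest-build time instead of raising ValueError during training.
MIN_SIDE = 28

def is_valid_sample(sample):
    """True if every recorded (width, height) pair meets the minimum side.
    Samples without an image_wh field pass through unchanged."""
    return all(w >= MIN_SIDE and h >= MIN_SIDE
               for w, h in sample.get("image_wh", []))

samples = [
    {"image": "a.jpg", "image_wh": [[0, 0]]},      # degenerate, dropped
    {"image": "b.png", "image_wh": [[166, 20]]},   # 20 < 28, dropped
    {"image": "c.png", "image_wh": [[231, 23]]},   # 23 < 28, dropped
    {"image": "d.png", "image_wh": [[640, 480]]},  # kept
]
kept = [s for s in samples if is_valid_sample(s)]
```

The example widths and heights mirror the problematic samples logged in this run.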
Problematic sample: {'image': 'TB1EWCIhBDH8KJjSspnXXbNAVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease process this image using OCR and provide me with the text.'}, {'from': 'gpt', 'value': 'All words in the image:\n仓库搬迁特价促销\n尺寸定做\n超强承重\n日常价\n日常价\n底价\n跌破\n年中感恩\n回馈到底\n清仓价'}]} 10%|█ | 2309/22095 [3:40:12<17:30:12, 3.18s/it] {'loss': 0.4319, 'grad_norm': 0.6670828689286896, 'learning_rate': 9.855342541947065e-06, 'epoch': 0.1} 10%|█ | 2309/22095 [3:40:12<17:30:12, 3.18s/it] 10%|█ | 2310/22095 [3:40:17<19:06:26, 3.48s/it] {'loss': 0.4621, 'grad_norm': 0.7111244403772158, 'learning_rate': 9.855167467459548e-06, 'epoch': 0.1} 10%|█ | 2310/22095 [3:40:17<19:06:26, 3.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047681 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 4\nB. 1\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 10%|█ | 2311/22095 [3:40:20<18:14:18, 3.32s/it] {'loss': 0.4183, 'grad_norm': 0.7428775133067482, 'learning_rate': 9.854992288649397e-06, 'epoch': 0.1} 10%|█ | 2311/22095 [3:40:20<18:14:18, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41644 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81704 > 40960). Running this sequence through the model will result in indexing errors 10%|█ | 2312/22095 [3:40:23<18:10:23, 3.31s/it] {'loss': 0.4512, 'grad_norm': 0.7683796554367252, 'learning_rate': 9.85481700552037e-06, 'epoch': 0.1} 10%|█ | 2312/22095 [3:40:23<18:10:23, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65765 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119128 > 40960). Running this sequence through the model will result in indexing errors 10%|█ | 2313/22095 [3:40:27<19:30:19, 3.55s/it] {'loss': 0.4567, 'grad_norm': 0.6426490380469407, 'learning_rate': 9.854641618076236e-06, 'epoch': 0.1} 10%|█ | 2313/22095 [3:40:27<19:30:19, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46854 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76051 > 40960). 
Running this sequence through the model will result in indexing errors
10%|█ | 2314/22095 [3:40:30<18:39:45, 3.40s/it] {'loss': 0.401, 'grad_norm': 0.6711259744328443, 'learning_rate': 9.854466126320763e-06, 'epoch': 0.1}
10%|█ | 2315/22095 [3:40:33<18:36:45, 3.39s/it] {'loss': 0.4774, 'grad_norm': 0.8455675550799173, 'learning_rate': 9.854290530257723e-06, 'epoch': 0.1}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047848 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
10%|█ | 2316/22095 [3:40:36<18:00:34, 3.28s/it] {'loss': 0.4238, 'grad_norm': 0.7467569882797702, 'learning_rate': 9.85411482989089e-06, 'epoch': 0.1}
10%|█ | 2317/22095 [3:40:40<18:01:17, 3.28s/it] {'loss': 0.4381, 'grad_norm': 0.734320352470716, 'learning_rate': 9.853939025224037e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2318/22095 [3:40:43<18:36:15, 3.39s/it] {'loss': 0.4068, 'grad_norm': 0.6341079183035991, 'learning_rate': 9.853763116260941e-06, 'epoch': 0.1}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
10%|█ | 2319/22095 [3:40:46<18:03:47, 3.29s/it] {'loss': 0.4489, 'grad_norm': 0.7800650722788848, 'learning_rate': 9.853587103005382e-06, 'epoch': 0.1}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-27 19:38:46.008209 load time: 1334.95 ms
11%|█ | 2320/22095 [3:40:56<28:58:43, 5.28s/it] {'loss': 0.5282, 'grad_norm': 0.451047452424555, 'learning_rate': 9.853410985461145e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (44492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52826 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61436 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2321/22095 [3:41:06<36:00:10, 6.55s/it] {'loss': 0.5363, 'grad_norm': 0.38852085304725875, 'learning_rate': 9.85323476363201e-06, 'epoch': 0.11} 11%|█ | 2321/22095 [3:41:06<36:00:10, 6.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047659 in VC:s3://multi-modal/UniGeo/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 2cm\nB. 5cm\nC. 4cm\nD. 
3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 11%|█ | 2322/22095 [3:41:15<40:54:30, 7.45s/it] {'loss': 0.5195, 'grad_norm': 0.31390542967558155, 'learning_rate': 9.853058437521768e-06, 'epoch': 0.11} 11%|█ | 2322/22095 [3:41:15<40:54:30, 7.45s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2323/22095 [3:41:19<35:21:37, 6.44s/it] {'loss': 0.434, 'grad_norm': 0.7989730108740146, 'learning_rate': 9.852882007134202e-06, 'epoch': 0.11} 11%|█ | 2323/22095 [3:41:19<35:21:37, 6.44s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/f9be7ed3-49aa-4f23-a176-7af6afdfae84/images/step_5.png 2025-08-27 19:39:19.963365 load time: 1098.06 ms 11%|█ | 2324/22095 [3:41:23<30:52:50, 5.62s/it] {'loss': 0.4473, 'grad_norm': 0.749871253521505, 'learning_rate': 9.852705472473108e-06, 'epoch': 0.11} 11%|█ | 2324/22095 [3:41:23<30:52:50, 5.62s/it] 11%|█ | 2325/22095 [3:41:27<27:27:25, 5.00s/it] {'loss': 0.4168, 'grad_norm': 0.7727191908261867, 'learning_rate': 9.852528833542278e-06, 'epoch': 0.11} 11%|█ | 2325/22095 [3:41:27<27:27:25, 5.00s/it] 11%|█ | 2326/22095 [3:41:30<24:59:06, 4.55s/it] {'loss': 0.4311, 'grad_norm': 0.7798115976467546, 'learning_rate': 9.852352090345504e-06, 'epoch': 0.11} 11%|█ | 2326/22095 [3:41:30<24:59:06, 4.55s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-27 19:39:29.471525 load time: 1231.74 ms 11%|█ | 2327/22095 [3:41:34<24:18:06, 4.43s/it] {'loss': 0.4638, 'grad_norm': 0.7268627450505267, 'learning_rate': 9.85217524288659e-06, 'epoch': 0.11} 11%|█ | 2327/22095 [3:41:34<24:18:06, 4.43s/it] 11%|█ | 2328/22095 [3:41:38<22:27:38, 4.09s/it] {'loss': 0.4049, 'grad_norm': 0.8698990762933937, 'learning_rate': 9.851998291169332e-06, 'epoch': 0.11} 11%|█ | 2328/22095 [3:41:38<22:27:38, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2329/22095 
[3:41:47<30:50:31, 5.62s/it] {'loss': 0.5142, 'grad_norm': 0.6206157334058049, 'learning_rate': 9.85182123519753e-06, 'epoch': 0.11} 11%|█ | 2329/22095 [3:41:47<30:50:31, 5.62s/it] 11%|█ | 2330/22095 [3:41:50<26:59:14, 4.92s/it] {'loss': 0.4425, 'grad_norm': 0.7495598486543176, 'learning_rate': 9.851644074974992e-06, 'epoch': 0.11} 11%|█ | 2330/22095 [3:41:50<26:59:14, 4.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2331/22095 [3:41:54<25:07:59, 4.58s/it] {'loss': 0.4056, 'grad_norm': 0.6881993019531567, 'learning_rate': 9.851466810505523e-06, 'epoch': 0.11} 11%|█ | 2331/22095 [3:41:54<25:07:59, 4.58s/it] 11%|█ | 2332/22095 [3:41:57<23:07:54, 4.21s/it] {'loss': 0.4254, 'grad_norm': 0.7274463239961425, 'learning_rate': 9.851289441792934e-06, 'epoch': 0.11} 11%|█ | 2332/22095 [3:41:57<23:07:54, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48532 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93290 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112520 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2333/22095 [3:42:00<21:07:07, 3.85s/it] {'loss': 0.4515, 'grad_norm': 0.7768959308161464, 'learning_rate': 9.851111968841033e-06, 'epoch': 0.11} 11%|█ | 2333/22095 [3:42:00<21:07:07, 3.85s/it] 11%|█ | 2334/22095 [3:42:04<20:23:43, 3.72s/it] {'loss': 0.3977, 'grad_norm': 0.7460630867250454, 'learning_rate': 9.850934391653636e-06, 'epoch': 0.11} 11%|█ | 2334/22095 [3:42:04<20:23:43, 3.72s/it] 11%|█ | 2335/22095 [3:42:08<20:37:27, 3.76s/it] {'loss': 0.3988, 'grad_norm': 0.7049755514403179, 'learning_rate': 9.850756710234557e-06, 'epoch': 0.11} 11%|█ | 2335/22095 [3:42:08<20:37:27, 3.76s/it] 11%|█ | 2336/22095 [3:42:10<18:58:52, 3.46s/it] {'loss': 0.4394, 'grad_norm': 0.8415832391023809, 'learning_rate': 9.850578924587614e-06, 'epoch': 0.11} 11%|█ | 2336/22095 [3:42:10<18:58:52, 3.46s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30356.png 2025-08-27 19:40:07.996679 load time: 1741.27 ms 11%|█ | 2337/22095 [3:42:14<18:49:52, 3.43s/it] {'loss': 0.3985, 'grad_norm': 0.9337415198625878, 'learning_rate': 9.850401034716629e-06, 'epoch': 0.11} 11%|█ | 2337/22095 [3:42:14<18:49:52, 3.43s/it] 11%|█ | 2338/22095 [3:42:17<18:18:12, 3.34s/it] {'loss': 0.4113, 'grad_norm': 0.7403823554645752, 'learning_rate': 9.85022304062542e-06, 'epoch': 0.11} 11%|█ | 2338/22095 [3:42:17<18:18:12, 3.34s/it] 11%|█ | 2339/22095 [3:42:20<17:57:38, 3.27s/it] {'loss': 0.4565, 'grad_norm': 0.7597975447386219, 'learning_rate': 9.850044942317814e-06, 'epoch': 0.11} 11%|█ | 2339/22095 [3:42:20<17:57:38, 3.27s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-27 19:40:19.494069 load time: 1170.56 ms 11%|█ | 2340/22095 [3:42:23<17:46:08, 3.24s/it] {'loss': 0.4617, 'grad_norm': 0.7379521994471612, 'learning_rate': 9.84986673979764e-06, 'epoch': 0.11} 11%|█ | 2340/22095 [3:42:23<17:46:08, 3.24s/it]Invalidate trace cache @ 
step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (81328 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2341/22095 [3:42:33<28:14:56, 5.15s/it] {'loss': 0.4947, 'grad_norm': 0.5879028793940111, 'learning_rate': 9.849688433068724e-06, 'epoch': 0.11} 11%|█ | 2341/22095 [3:42:33<28:14:56, 5.15s/it] 11%|█ | 2342/22095 [3:42:36<25:45:50, 4.70s/it] {'loss': 0.4749, 'grad_norm': 0.8458052413227429, 'learning_rate': 9.849510022134899e-06, 'epoch': 0.11} 11%|█ | 2342/22095 [3:42:36<25:45:50, 4.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2343/22095 [3:42:39<23:10:07, 4.22s/it] {'loss': 0.4197, 'grad_norm': 0.7548523560498813, 'learning_rate': 9.849331506999996e-06, 'epoch': 0.11} 11%|█ | 2343/22095 [3:42:39<23:10:07, 4.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2344/22095 [3:42:44<23:05:19, 4.21s/it] {'loss': 0.438, 'grad_norm': 0.7488304007811081, 'learning_rate': 9.849152887667855e-06, 'epoch': 0.11} 11%|█ | 2344/22095 [3:42:44<23:05:19, 4.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2345/22095 [3:42:47<22:03:09, 4.02s/it] {'loss': 0.4229, 'grad_norm': 0.7414720569112878, 'learning_rate': 9.848974164142309e-06, 'epoch': 0.11} 11%|█ | 2345/22095 [3:42:47<22:03:09, 4.02s/it] 11%|█ | 2346/22095 [3:42:51<21:03:26, 3.84s/it] {'loss': 0.4464, 'grad_norm': 0.7423339816887292, 'learning_rate': 9.848795336427202e-06, 'epoch': 0.11} 11%|█ | 2346/22095 [3:42:51<21:03:26, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83037 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50593 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43289 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2347/22095 [3:42:54<19:46:57, 3.61s/it] {'loss': 0.4756, 'grad_norm': 0.78079522511492, 'learning_rate': 9.848616404526374e-06, 'epoch': 0.11} 11%|█ | 2347/22095 [3:42:54<19:46:57, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (84804 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85358 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2348/22095 [3:42:57<18:45:56, 3.42s/it] {'loss': 0.4547, 'grad_norm': 0.8166039341128437, 'learning_rate': 9.848437368443672e-06, 'epoch': 0.11} 11%|█ | 2348/22095 [3:42:57<18:45:56, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2349/22095 [3:43:07<29:32:44, 5.39s/it] {'loss': 0.5249, 'grad_norm': 0.6081730989414673, 'learning_rate': 9.848258228182943e-06, 'epoch': 0.11} 11%|█ | 2349/22095 [3:43:07<29:32:44, 5.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69076 > 40960). 
Running this sequence through the model will result in indexing errors
11%|█ | 2350/22095 [3:43:10<26:46:31, 4.88s/it] {'loss': 0.3982, 'grad_norm': 0.7141425259010529, 'learning_rate': 9.848078983748032e-06, 'epoch': 0.11}
11%|█ | 2351/22095 [3:43:14<24:46:15, 4.52s/it] {'loss': 0.4029, 'grad_norm': 0.7046271198271215, 'learning_rate': 9.847899635142797e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (72655 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43546 > 40960). Running this sequence through the model will result in indexing errors
11%|█ | 2352/22095 [3:43:18<23:56:53, 4.37s/it] {'loss': 0.4417, 'grad_norm': 0.72872473152322, 'learning_rate': 9.847720182371086e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (43179 > 40960). Running this sequence through the model will result in indexing errors
11%|█ | 2353/22095 [3:43:21<22:03:53, 4.02s/it] {'loss': 0.4342, 'grad_norm': 0.7796132442433459, 'learning_rate': 9.847540625436756e-06, 'epoch': 0.11}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884030 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7183, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1.5cm\nB. 2cm\nC. 4cm\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
11%|█ | 2354/22095 [3:43:24<20:36:09, 3.76s/it] {'loss': 0.4482, 'grad_norm': 0.7273745717727493, 'learning_rate': 9.847360964343667e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (71128 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68269 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96140 > 40960).
Running this sequence through the model will result in indexing errors 11%|█ | 2355/22095 [3:43:34<29:57:53, 5.46s/it] {'loss': 0.4964, 'grad_norm': 0.4829803438786883, 'learning_rate': 9.84718119909568e-06, 'epoch': 0.11} 11%|█ | 2355/22095 [3:43:34<29:57:53, 5.46s/it] 11%|█ | 2356/22095 [3:43:43<36:14:42, 6.61s/it] {'loss': 0.5238, 'grad_norm': 0.436561756230908, 'learning_rate': 9.847001329696653e-06, 'epoch': 0.11} 11%|█ | 2356/22095 [3:43:43<36:14:42, 6.61s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2357/22095 [3:43:46<30:57:07, 5.65s/it] {'loss': 0.3898, 'grad_norm': 0.8452537247177253, 'learning_rate': 9.846821356150455e-06, 'epoch': 0.11} 11%|█ | 2357/22095 [3:43:46<30:57:07, 5.65s/it] 11%|█ | 2358/22095 [3:43:55<36:10:13, 6.60s/it] {'loss': 0.5405, 'grad_norm': 0.3834464300732562, 'learning_rate': 9.846641278460952e-06, 'epoch': 0.11} 11%|█ | 2358/22095 [3:43:55<36:10:13, 6.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (62346 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90191 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44061 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48037 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2359/22095 [3:43:59<30:57:09, 5.65s/it] {'loss': 0.4122, 'grad_norm': 0.7367450263313308, 'learning_rate': 9.846461096632014e-06, 'epoch': 0.11} 11%|█ | 2359/22095 [3:43:59<30:57:09, 5.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46368 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121729 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65962 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2360/22095 [3:44:08<37:20:26, 6.81s/it] {'loss': 0.5298, 'grad_norm': 0.40049087997983424, 'learning_rate': 9.846280810667512e-06, 'epoch': 0.11} 11%|█ | 2360/22095 [3:44:08<37:20:26, 6.81s/it] 11%|█ | 2361/22095 [3:44:17<39:58:21, 7.29s/it] {'loss': 0.4991, 'grad_norm': 0.35495512975334464, 'learning_rate': 9.846100420571319e-06, 'epoch': 0.11} 11%|█ | 2361/22095 [3:44:17<39:58:21, 7.29s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2362/22095 [3:44:20<33:10:44, 6.05s/it] {'loss': 0.4215, 'grad_norm': 0.8965978069317818, 'learning_rate': 9.84591992634731e-06, 'epoch': 0.11} 11%|█ | 2362/22095 [3:44:20<33:10:44, 6.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (84878 > 40960). 
Running this sequence through the model will result in indexing errors VC:s3://internvl-moe-sft-data/vrdu_table_final_2/astro-ph.CO/e2e4fc97-5c2c-414d-8a15-3b1f44db7654.png 2025-08-27 19:42:18.610207 load time: 1052.67 ms VC:s3://gui-agent/data_20250707/windows/images/excel/free_task_20250623_150746/images/20250623_150807_9.png 2025-08-27 19:42:18.610023 load time: 1043.84 ms 11%|█ | 2363/22095 [3:44:29<38:59:25, 7.11s/it] {'loss': 0.5186, 'grad_norm': 0.36426252834219985, 'learning_rate': 9.845739327999366e-06, 'epoch': 0.11} 11%|█ | 2363/22095 [3:44:29<38:59:25, 7.11s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2364/22095 [3:44:33<33:03:47, 6.03s/it] {'loss': 0.4627, 'grad_norm': 0.7992452990273541, 'learning_rate': 9.845558625531368e-06, 'epoch': 0.11} 11%|█ | 2364/22095 [3:44:33<33:03:47, 6.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43212 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73078 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80270 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2365/22095 [3:44:37<30:39:36, 5.59s/it] {'loss': 0.4202, 'grad_norm': 0.8018786875684687, 'learning_rate': 9.845377818947194e-06, 'epoch': 0.11} 11%|█ | 2365/22095 [3:44:38<30:39:36, 5.59s/it] 11%|█ | 2366/22095 [3:44:41<26:28:35, 4.83s/it] {'loss': 0.4563, 'grad_norm': 0.747643050655193, 'learning_rate': 9.845196908250737e-06, 'epoch': 0.11} 11%|█ | 2366/22095 [3:44:41<26:28:35, 4.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2367/22095 [3:44:51<36:16:27, 6.62s/it] {'loss': 0.5062, 'grad_norm': 0.43328647639779194, 'learning_rate': 9.845015893445874e-06, 'epoch': 0.11} 11%|█ | 2367/22095 [3:44:51<36:16:27, 6.62s/it] 11%|█ | 2368/22095 [3:45:01<41:49:26, 7.63s/it] {'loss': 0.4825, 'grad_norm': 0.40488746552796007, 'learning_rate': 9.844834774536503e-06, 'epoch': 0.11} 11%|█ | 2368/22095 [3:45:01<41:49:26, 7.63s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2369/22095 [3:45:05<35:45:58, 6.53s/it] {'loss': 0.4464, 'grad_norm': 0.9608323793728127, 'learning_rate': 9.84465355152651e-06, 'epoch': 0.11} 11%|█ | 2369/22095 [3:45:05<35:45:58, 6.53s/it] 11%|█ | 2370/22095 [3:45:13<37:39:50, 6.87s/it] {'loss': 0.522, 'grad_norm': 0.4088980125981813, 'learning_rate': 9.844472224419794e-06, 'epoch': 0.11} 11%|█ | 2370/22095 [3:45:13<37:39:50, 6.87s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2371/22095 [3:45:16<31:40:00, 5.78s/it] {'loss': 0.4272, 'grad_norm': 0.7098724808096835, 'learning_rate': 9.844290793220249e-06, 'epoch': 0.11} 11%|█ | 2371/22095 [3:45:16<31:40:00, 5.78s/it] 11%|█ | 2372/22095 [3:45:19<27:33:06, 5.03s/it] {'loss': 0.4372, 'grad_norm': 0.810840485749749, 'learning_rate': 9.84410925793177e-06, 'epoch': 0.11} 11%|█ | 2372/22095 [3:45:19<27:33:06, 5.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2373/22095 [3:45:30<36:24:15, 
6.65s/it] {'loss': 0.5439, 'grad_norm': 0.4924884016840848, 'learning_rate': 9.843927618558262e-06, 'epoch': 0.11} 11%|█ | 2373/22095 [3:45:30<36:24:15, 6.65s/it] 11%|█ | 2374/22095 [3:45:34<31:58:58, 5.84s/it] {'loss': 0.471, 'grad_norm': 0.8003323683533289, 'learning_rate': 9.843745875103628e-06, 'epoch': 0.11} 11%|█ | 2374/22095 [3:45:34<31:58:58, 5.84s/it] 11%|█ | 2375/22095 [3:45:37<27:53:42, 5.09s/it] {'loss': 0.4476, 'grad_norm': 0.7457532055280077, 'learning_rate': 9.84356402757177e-06, 'epoch': 0.11} 11%|█ | 2375/22095 [3:45:37<27:53:42, 5.09s/it] 11%|█ | 2376/22095 [3:45:40<24:17:45, 4.44s/it] {'loss': 0.4134, 'grad_norm': 0.7647485411939247, 'learning_rate': 9.843382075966596e-06, 'epoch': 0.11} 11%|█ | 2376/22095 [3:45:40<24:17:45, 4.44s/it] 11%|█ | 2377/22095 [3:45:43<21:56:04, 4.00s/it] {'loss': 0.3952, 'grad_norm': 0.8488959006361999, 'learning_rate': 9.843200020292017e-06, 'epoch': 0.11} 11%|█ | 2377/22095 [3:45:43<21:56:04, 4.00s/it] 11%|█ | 2378/22095 [3:45:46<20:12:10, 3.69s/it] {'loss': 0.4477, 'grad_norm': 0.7277350914827775, 'learning_rate': 9.843017860551946e-06, 'epoch': 0.11} 11%|█ | 2378/22095 [3:45:46<20:12:10, 3.69s/it] 11%|█ | 2379/22095 [3:45:50<20:02:28, 3.66s/it] {'loss': 0.3839, 'grad_norm': 0.7694876133740054, 'learning_rate': 9.842835596750292e-06, 'epoch': 0.11} 11%|█ | 2379/22095 [3:45:50<20:02:28, 3.66s/it] 11%|█ | 2380/22095 [3:45:53<18:53:26, 3.45s/it] {'loss': 0.4337, 'grad_norm': 0.9111084221791587, 'learning_rate': 9.842653228890979e-06, 'epoch': 0.11} 11%|█ | 2380/22095 [3:45:53<18:53:26, 3.45s/it] 11%|█ | 2381/22095 [3:45:56<19:05:08, 3.49s/it] {'loss': 0.3778, 'grad_norm': 0.6855619180024155, 'learning_rate': 9.84247075697792e-06, 'epoch': 0.11} 11%|█ | 2381/22095 [3:45:56<19:05:08, 3.49s/it] 11%|█ | 2382/22095 [3:45:59<18:11:12, 3.32s/it] {'loss': 0.3973, 'grad_norm': 0.6863604389228337, 'learning_rate': 9.842288181015035e-06, 'epoch': 0.11} 11%|█ | 2382/22095 [3:45:59<18:11:12, 3.32s/it]Invalidate trace cache @ step 
2: expected module 1, but got module 364 11%|█ | 2383/22095 [3:46:06<23:24:38, 4.28s/it] {'loss': 0.5356, 'grad_norm': 0.6123620312305323, 'learning_rate': 9.84210550100625e-06, 'epoch': 0.11} 11%|█ | 2383/22095 [3:46:06<23:24:38, 4.28s/it] 11%|█ | 2384/22095 [3:46:11<26:02:26, 4.76s/it] {'loss': 0.4588, 'grad_norm': 0.8038667464891291, 'learning_rate': 9.841922716955488e-06, 'epoch': 0.11} 11%|█ | 2384/22095 [3:46:11<26:02:26, 4.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81322 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46777 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (141071 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2385/22095 [3:46:16<24:55:14, 4.55s/it] {'loss': 0.4849, 'grad_norm': 0.8232987417202955, 'learning_rate': 9.84173982886668e-06, 'epoch': 0.11} 11%|█ | 2385/22095 [3:46:16<24:55:14, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (130479 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42834 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93245 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2386/22095 [3:46:19<23:04:05, 4.21s/it] {'loss': 0.4006, 'grad_norm': 0.7063766129624343, 'learning_rate': 9.841556836743752e-06, 'epoch': 0.11} 11%|█ | 2386/22095 [3:46:19<23:04:05, 4.21s/it] 11%|█ | 2387/22095 [3:46:23<22:44:54, 4.16s/it] {'loss': 0.4433, 'grad_norm': 0.8885217926078794, 'learning_rate': 9.841373740590638e-06, 'epoch': 0.11} 11%|█ | 2387/22095 [3:46:23<22:44:54, 4.16s/it] 11%|█ | 2388/22095 [3:46:26<20:56:03, 3.82s/it] {'loss': 0.4048, 'grad_norm': 0.7139772742218194, 'learning_rate': 9.84119054041127e-06, 'epoch': 0.11} 11%|█ | 2388/22095 [3:46:26<20:56:03, 3.82s/it] 11%|█ | 2389/22095 [3:46:30<20:52:07, 3.81s/it] {'loss': 0.4283, 'grad_norm': 0.7428126319567718, 'learning_rate': 9.841007236209588e-06, 'epoch': 0.11} 11%|█ | 2389/22095 [3:46:30<20:52:07, 3.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51833 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2390/22095 [3:46:34<20:42:31, 3.78s/it] {'loss': 0.438, 'grad_norm': 0.7118335211303995, 'learning_rate': 9.840823827989526e-06, 'epoch': 0.11} 11%|█ | 2390/22095 [3:46:34<20:42:31, 3.78s/it] 11%|█ | 2391/22095 [3:46:37<19:27:37, 3.56s/it] {'loss': 0.4164, 'grad_norm': 0.71790940397756, 'learning_rate': 9.84064031575503e-06, 'epoch': 0.11} 11%|█ | 2391/22095 [3:46:37<19:27:37, 3.56s/it] 11%|█ | 2392/22095 [3:46:40<19:05:04, 3.49s/it] {'loss': 0.4193, 'grad_norm': 0.7169223945081842, 'learning_rate': 9.840456699510038e-06, 'epoch': 0.11} 11%|█ | 2392/22095 [3:46:40<19:05:04, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2393/22095 [3:46:49<28:46:20, 5.26s/it] {'loss': 0.5264, 'grad_norm': 0.5818471073518655, 'learning_rate': 9.840272979258498e-06, 'epoch': 0.11} 11%|█ | 2393/22095 [3:46:49<28:46:20, 5.26s/it] 11%|█ | 2394/22095 [3:46:57<32:39:30, 5.97s/it] {'loss': 0.5038, 
'grad_norm': 0.44257271729553715, 'learning_rate': 9.84008915500436e-06, 'epoch': 0.11} 11%|█ | 2394/22095 [3:46:57<32:39:30, 5.97s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2395/22095 [3:47:00<28:31:44, 5.21s/it] {'loss': 0.409, 'grad_norm': 0.9142415732457422, 'learning_rate': 9.83990522675157e-06, 'epoch': 0.11} 11%|█ | 2395/22095 [3:47:00<28:31:44, 5.21s/it] 11%|█ | 2396/22095 [3:47:03<25:03:15, 4.58s/it] {'loss': 0.4275, 'grad_norm': 0.857612046602221, 'learning_rate': 9.83972119450408e-06, 'epoch': 0.11} 11%|█ | 2396/22095 [3:47:03<25:03:15, 4.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58588 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59820 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51104 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45649 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2397/22095 [3:47:07<24:02:54, 4.40s/it] {'loss': 0.4086, 'grad_norm': 0.8211196319920059, 'learning_rate': 9.839537058265847e-06, 'epoch': 0.11} 11%|█ | 2397/22095 [3:47:07<24:02:54, 4.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2398/22095 [3:47:16<31:03:29, 5.68s/it] {'loss': 0.5295, 'grad_norm': 0.6625393546539192, 'learning_rate': 9.839352818040825e-06, 'epoch': 0.11} 11%|█ | 2398/22095 [3:47:16<31:03:29, 5.68s/it] 11%|█ | 2399/22095 [3:47:20<27:42:43, 5.07s/it] {'loss': 0.4349, 'grad_norm': 0.8175415684719074, 'learning_rate': 9.839168473832975e-06, 'epoch': 0.11} 11%|█ | 2399/22095 [3:47:20<27:42:43, 5.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [1134, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8341802 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1134, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8447, 'image': 'vrdu_table_final_2/astro-ph.CO/9e316111-2deb-4a9d-83a8-2eab15a90490.png', 'image_wh': [[1134, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n9&10&11&12&13&14&15\n\\end{tabular}\n```"}]} 11%|█ | 2400/22095 [3:47:23<24:19:24, 4.45s/it] {'loss': 0.398, 'grad_norm': 0.6971690315490437, 'learning_rate': 9.838984025646257e-06, 'epoch': 0.11} 11%|█ | 2400/22095 [3:47:23<24:19:24, 4.45s/it] 11%|█ | 2401/22095 [3:47:26<21:34:33, 3.94s/it] {'loss': 0.4326, 'grad_norm': 0.7333102031011268, 'learning_rate': 9.838799473484633e-06, 'epoch': 0.11} 11%|█ | 2401/22095 [3:47:26<21:34:33, 3.94s/it] 11%|█ | 2402/22095 [3:47:29<20:17:35, 3.71s/it] {'loss': 0.4563, 'grad_norm': 0.7465017771426746, 'learning_rate': 9.83861481735207e-06, 'epoch': 0.11} 11%|█ | 2402/22095 [3:47:29<20:17:35, 3.71s/it] 11%|█ | 2403/22095 [3:47:32<19:53:08, 3.64s/it] {'loss': 0.4603, 'grad_norm': 0.7089569072849802, 'learning_rate': 9.838430057252537e-06, 'epoch': 0.11} 11%|█ | 2403/22095 [3:47:32<19:53:08, 3.64s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [359, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8531250 in VC:s3://internvl-moe-sft-data/. Exception: Image size [359, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 29570, 'image': 'vrdu_texteq/astro-ph.CO/55c9212a-5a05-4879-9bf5-cfdd2a00d5e7.png', 'image_wh': [[359, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'that is maximal when $a=a_*$.'}]} 11%|█ | 2404/22095 [3:47:35<19:24:43, 3.55s/it] {'loss': 0.4472, 'grad_norm': 0.9367161129821115, 'learning_rate': 9.838245193189999e-06, 'epoch': 0.11} 11%|█ | 2404/22095 [3:47:36<19:24:43, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2405/22095 [3:47:45<29:00:40, 5.30s/it] {'loss': 0.5022, 'grad_norm': 0.44700555725761754, 'learning_rate': 9.838060225168432e-06, 'epoch': 0.11} 11%|█ | 2405/22095 [3:47:45<29:00:40, 5.30s/it] 11%|█ | 2406/22095 [3:47:49<26:45:15, 4.89s/it] {'loss': 0.4389, 'grad_norm': 0.7906653966767495, 'learning_rate': 9.837875153191812e-06, 'epoch': 0.11} 11%|█ | 2406/22095 [3:47:49<26:45:15, 4.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (104442 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45123 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50743 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2407/22095 [3:47:53<25:42:02, 4.70s/it] {'loss': 0.4223, 'grad_norm': 0.7286294782716575, 'learning_rate': 9.837689977264111e-06, 'epoch': 0.11} 11%|█ | 2407/22095 [3:47:53<25:42:02, 4.70s/it] 11%|█ | 2408/22095 [3:47:56<23:00:07, 4.21s/it] {'loss': 0.4281, 'grad_norm': 2.253138918510198, 'learning_rate': 9.837504697389311e-06, 'epoch': 0.11} 11%|█ | 2408/22095 [3:47:56<23:00:07, 4.21s/it] 11%|█ | 2409/22095 [3:48:00<21:47:49, 3.99s/it] {'loss': 0.4392, 'grad_norm': 0.7661119819405651, 'learning_rate': 9.837319313571394e-06, 'epoch': 0.11} 11%|█ | 2409/22095 [3:48:00<21:47:49, 3.99s/it] 11%|█ | 2410/22095 [3:48:03<20:26:19, 3.74s/it] {'loss': 0.4698, 'grad_norm': 0.7407852422162088, 'learning_rate': 9.83713382581434e-06, 'epoch': 0.11} 11%|█ | 2410/22095 [3:48:03<20:26:19, 3.74s/it] 11%|█ | 2411/22095 [3:48:06<19:11:11, 3.51s/it] {'loss': 0.4589, 'grad_norm': 0.8474642676205407, 'learning_rate': 9.836948234122136e-06, 'epoch': 0.11} 11%|█ | 2411/22095 [3:48:06<19:11:11, 3.51s/it] 11%|█ | 2412/22095 [3:48:09<18:13:15, 3.33s/it] {'loss': 0.4159, 'grad_norm': 0.7070961319248004, 'learning_rate': 9.83676253849877e-06, 'epoch': 0.11} 11%|█ | 2412/22095 [3:48:09<18:13:15, 3.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102829 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68811 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43688 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2413/22095 [3:48:11<17:13:33, 3.15s/it] {'loss': 0.5101, 'grad_norm': 0.8516499038377042, 'learning_rate': 9.836576738948234e-06, 'epoch': 0.11} 11%|█ | 2413/22095 [3:48:11<17:13:33, 3.15s/it] 11%|█ | 2414/22095 [3:48:15<18:25:38, 3.37s/it] {'loss': 0.4545, 'grad_norm': 0.7205091158697706, 'learning_rate': 9.836390835474516e-06, 'epoch': 0.11} 11%|█ | 2414/22095 [3:48:15<18:25:38, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2415/22095 [3:48:22<24:15:39, 4.44s/it] {'loss': 0.489, 'grad_norm': 0.41820220247791706, 'learning_rate': 9.836204828081612e-06, 'epoch': 0.11} 11%|█ | 2415/22095 [3:48:22<24:15:39, 4.44s/it] 11%|█ | 2416/22095 [3:48:25<22:22:45, 4.09s/it] {'loss': 0.4397, 'grad_norm': 0.7746337605139181, 'learning_rate': 9.836018716773522e-06, 'epoch': 0.11} 11%|█ | 2416/22095 [3:48:25<22:22:45, 4.09s/it] 11%|█ | 2417/22095 [3:48:29<21:26:19, 3.92s/it] {'loss': 0.4459, 'grad_norm': 0.7362030193141964, 'learning_rate': 9.835832501554242e-06, 'epoch': 0.11} 11%|█ | 2417/22095 [3:48:29<21:26:19, 3.92s/it] 11%|█ | 2418/22095 [3:48:33<21:22:13, 3.91s/it] {'loss': 0.4436, 'grad_norm': 0.7433426511252804, 'learning_rate': 9.835646182427773e-06, 'epoch': 0.11} 11%|█ | 2418/22095 [3:48:33<21:22:13, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58554 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83791 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106500 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72625 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57117 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2419/22095 [3:48:36<19:26:21, 3.56s/it] {'loss': 0.4438, 'grad_norm': 0.8830194859078878, 'learning_rate': 9.835459759398118e-06, 'epoch': 0.11} 11%|█ | 2419/22095 [3:48:36<19:26:21, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81076 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66091 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42480 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2420/22095 [3:48:38<18:00:42, 3.30s/it] {'loss': 0.4386, 'grad_norm': 0.7277600436583608, 'learning_rate': 9.835273232469285e-06, 'epoch': 0.11} 11%|█ | 2420/22095 [3:48:38<18:00:42, 3.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2421/22095 [3:48:45<23:40:58, 4.33s/it] {'loss': 0.5036, 'grad_norm': 0.3795240972724703, 'learning_rate': 9.83508660164528e-06, 'epoch': 0.11} 11%|█ | 2421/22095 [3:48:45<23:40:58, 4.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2422/22095 [3:48:49<22:56:14, 4.20s/it] {'loss': 0.4306, 'grad_norm': 0.7872464533012712, 'learning_rate': 9.834899866930116e-06, 'epoch': 0.11} 11%|█ | 2422/22095 [3:48:49<22:56:14, 4.20s/it] 11%|█ | 2423/22095 [3:48:52<20:47:14, 3.80s/it] {'loss': 0.4909, 'grad_norm': 0.7764837247171241, 'learning_rate': 9.834713028327802e-06, 'epoch': 0.11} 11%|█ | 2423/22095 [3:48:52<20:47:14, 3.80s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1398, in _get_item sources = self.preprocess_conversation_format( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1189, in preprocess_conversation_format msg = format_grounding_internvl2qwenvl( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 399, in format_grounding_internvl2qwenvl new_message = find_bbox(ref_matches, message, 
new_image_size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 262, in find_bbox assert len(ref_matches) == len( AssertionError: ref_matches: ['{{ref-web|títol= Une «bombe météorologique» déclenche la panique au Canada et aux États-Unis | url= http://ec.gc.ca/meteo-weather/default.asp?lang=Fr&n=47397091-1 |lloc= [[Service météorologique du Canada]] |editor= [[Environnement Canada]] |consulta= 26 desembre 2013}}', '{{ref-publicació|cognom=Martín León|nom=F|títol=El concepto de ciclogénesis explosiva o “bomba meteorológica”|publicació=RAM-Revista del Aficionado a la Meteorología|data=28 octubre 2013|volum=octubre 2013|url=http://www.tiempo.com/ram/4070/el-concepto-de-ciclognesis-explosiva/|consulta=5 març 2014}}', "{{ref-notícia|cognom=Bernis|nom=M|títol=Qui s'ha inventat la ciclogènesi explosiva?|publicació=Diari Ara|url=http://www.ara.cat/societat/meteo/sha-inventat-ciclogenesi-explosiva_0_1054094656.html|consulta=5 març 2014|data=25 desembre 2013}}", '{{ref-publicació|cognom=Servei Meteorològic de Catalunya|títol=Resum mensual. 
Febrer 2010|publicació=Butlletins climàtics|data=Març 2010|url=http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|consulta=5 març 2014|arxiuurl=https://web.archive.org/web/20140305171041/http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|arxiudata=5 de març 2014}}', "{{ref-notícia|cognom=VilaWeb|títol=Tempesta 'Xynthia': destrosses i ferits a Catalunya Nord|publicació=VilaWeb|url=http://www.vilaweb.cat/noticia/3696118/20100301/tempesta-xynthia-causa-destroces-catalunya-nord.html|consulta=5 març 2014|data=1 març 2010}}", '{{ref-notícia|cognom=Portal informatiu de TV3|nom=3/24|títol=50\xa0morts i 9 desapareguts a França pel pas de la tempesta "Xynthia"|url=http://www.324.cat/noticia/548185/societat/50-morts-i-9-desapareguts-a-Franca-pel-pas-de-la-tempesta-Xynthia|consulta=5 març 2014|data=3/3/2010}}', '{{ref-notícia|cognom=Libération|títol=Xynthia, retour sur la tempête|publicació=Libération|url=http://www.liberation.fr/tempete-xynthia,99875|consulta=5 març 2014|data=febrer i març 2010}} {{Webarchive|url=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |date=5 de març 2014 }} {{Ref-web |url=http://www.liberation.fr/tempete-xynthia,99875 |títol=Còpia arxivada |consulta=2014-03-05 |arxiuurl=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |arxiudata=2014-03-05}}'], box_matches: ['[[Fitxer:BraerStorm1993.png|miniatura|El gener de 1993 es creà una tempesta que va arribar a un mínim històric de 913 [[mbar]]', '[[borrasca]]', '[[Service météorologique du Canada]]', '[[Environnement Canada]]', '[[Categoria:Meteorologia|Ciclogènesi explosiva]]'], message: [[Fitxer:BraerStorm1993.png|miniatura|El gener de 1993 es creà una tempesta que va arribar a un mínim històric de 913 [[mbar]] (hPa).]] Una 
'''ciclogènesi explosiva''' és una [[borrasca]] que s'aprofundeix molt ràpidament, amb una variació de més 24 hectopascals (hPa) en menys de 24 hores que pot generar forts vents amb velocitat fins a 140 kilòmetres per hora.{{ref-web|títol= Une «bombe météorologique» déclenche la panique au Canada et aux États-Unis | url= http://ec.gc.ca/meteo-weather/default.asp?lang=Fr&n=47397091-1 |lloc= [[Service météorologique du Canada]] |editor= [[Environnement Canada]] |consulta= 26 desembre 2013}} Per a les latituds on es troba Catalunya, aquesta definició es relaxa i engloba caigudes de pressió d'uns 20 hPa en 24 hores, o fins i tot submúltiples d'ella, per exemple 9-10 hPa en 12h.{{ref-publicació|cognom=Martín León|nom=F|títol=El concepto de ciclogénesis explosiva o “bomba meteorológica”|publicació=RAM-Revista del Aficionado a la Meteorología|data=28 octubre 2013|volum=octubre 2013|url=http://www.tiempo.com/ram/4070/el-concepto-de-ciclognesis-explosiva/|consulta=5 març 2014}} El concepte va ser proposat l'any 1980 pels investigadors americans Fred Sanders i John R. Gyakum, que van parlar de "meteorological bomb". L'ús a Catalunya va fer-se popular a partir del gener del 2009{{ref-notícia|cognom=Bernis|nom=M|títol=Qui s'ha inventat la ciclogènesi explosiva?|publicació=Diari Ara|url=http://www.ara.cat/societat/meteo/sha-inventat-ciclogenesi-explosiva_0_1054094656.html|consulta=5 març 2014|data=25 desembre 2013}} amb el pas del cicló Klaus. A finals de febrer de 2010 la depressió Xynthia,{{ref-publicació|cognom=Servei Meteorològic de Catalunya|títol=Resum mensual. 
Febrer 2010|publicació=Butlletins climàtics|data=Març 2010|url=http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|consulta=5 març 2014|arxiuurl=https://web.archive.org/web/20140305171041/http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|arxiudata=5 de març 2014}} fruit d'una nova ciclogènesi explosiva, va causar destrosses al seu pas per Catalunya{{ref-notícia|cognom=VilaWeb|títol=Tempesta 'Xynthia': destrosses i ferits a Catalunya Nord|publicació=VilaWeb|url=http://www.vilaweb.cat/noticia/3696118/20100301/tempesta-xynthia-causa-destroces-catalunya-nord.html|consulta=5 març 2014|data=1 març 2010}} i nombroses víctimes a França.{{ref-notícia|cognom=Portal informatiu de TV3|nom=3/24|títol=50 morts i 9 desapareguts a França pel pas de la tempesta "Xynthia"|url=http://www.324.cat/noticia/548185/societat/50-morts-i-9-desapareguts-a-Franca-pel-pas-de-la-tempesta-Xynthia|consulta=5 març 2014|data=3/3/2010}}{{ref-notícia|cognom=Libération|títol=Xynthia, retour sur la tempête|publicació=Libération|url=http://www.liberation.fr/tempete-xynthia,99875|consulta=5 març 2014|data=febrer i març 2010}} {{Webarchive|url=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |date=5 de març 2014 }} {{Ref-web |url=http://www.liberation.fr/tempete-xynthia,99875 |títol=Còpia arxivada |consulta=2014-03-05 |arxiuurl=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |arxiudata=2014-03-05}} == Referències == {{Referències}} {{Autoritat}} [[Categoria:Meteorologia|Ciclogènesi explosiva]] [Try #0] Failed to fetch sample 6887847 in VC:s3://gui/data_20250328/android/images/. 
Exception: AssertionError (ref_matches/box_matches message identical to the traceback above) Problematic sample: {'image': 'wiki/Cycle_0_Iter0_seq10/images/screenshot-37-1743222050.6585867-before.png', 'conversations': [{'from': 'human', 'value': '\nExtract and display the text located within the specified area: [[0, 121, 981, 924]]'}, {'from': 'gpt', 'value': '[[Fitxer:BraerStorm1993.png|miniatura|El gener de 1993 es creà una tempesta que va arribar a un mínim històric de 913 [[mbar]] (hPa).]]\nUna \'\'\'ciclogènesi explosiva\'\'\' és una [[borrasca]] que s\'aprofundeix molt ràpidament, amb una variació de més 24 hectopascals (hPa) en menys de 24 hores que pot generar forts vents amb velocitat fins a 140 kilòmetres per hora.{{ref-web|títol= Une «bombe météorologique» déclenche la panique au Canada et aux États-Unis | url= http://ec.gc.ca/meteo-weather/default.asp?lang=Fr&n=47397091-1 |lloc= [[Service météorologique du Canada]] |editor= [[Environnement Canada]] |consulta= 26 desembre 2013}} Per a les latituds on es troba Catalunya, aquesta definició es relaxa i engloba caigudes de pressió d\'uns 20 hPa en 24 hores, o fins i tot submúltiples d\'ella, per exemple 9-10 hPa en 12h.{{ref-publicació|cognom=Martín León|nom=F|títol=El concepto de ciclogénesis explosiva o “bomba meteorológica”|publicació=RAM-Revista del Aficionado a la Meteorología|data=28 octubre 2013|volum=octubre 2013|url=http://www.tiempo.com/ram/4070/el-concepto-de-ciclognesis-explosiva/|consulta=5 març 2014}}\n\nEl concepte va ser proposat l\'any 1980 pels investigadors americans Fred Sanders i John R. Gyakum, que van parlar de "meteorological bomb". 
L\'ús a Catalunya va fer-se popular a partir del gener del 2009{{ref-notícia|cognom=Bernis|nom=M|títol=Qui s\'ha inventat la ciclogènesi explosiva?|publicació=Diari Ara|url=http://www.ara.cat/societat/meteo/sha-inventat-ciclogenesi-explosiva_0_1054094656.html|consulta=5 març 2014|data=25 desembre 2013}} amb el pas del cicló Klaus.\n\nA finals de febrer de 2010 la depressió Xynthia,{{ref-publicació|cognom=Servei Meteorològic de Catalunya|títol=Resum mensual. Febrer 2010|publicació=Butlletins climàtics|data=Març 2010|url=http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|consulta=5 març 2014|arxiuurl=https://web.archive.org/web/20140305171041/http://www20.gencat.cat/docs/meteocat/Continguts/Climatologia/Butlletins%20i%20resums%20climatics/Butlletins%20mensuals/2010/pdf/ButlletiFebrer10.pdf|arxiudata=5 de març 2014}} fruit d\'una nova ciclogènesi explosiva, va causar destrosses al seu pas per Catalunya{{ref-notícia|cognom=VilaWeb|títol=Tempesta \'Xynthia\': destrosses i ferits a Catalunya Nord|publicació=VilaWeb|url=http://www.vilaweb.cat/noticia/3696118/20100301/tempesta-xynthia-causa-destroces-catalunya-nord.html|consulta=5 març 2014|data=1 març 2010}} i nombroses víctimes a França.{{ref-notícia|cognom=Portal informatiu de TV3|nom=3/24|títol=50\xa0morts i 9 desapareguts a França pel pas de la tempesta "Xynthia"|url=http://www.324.cat/noticia/548185/societat/50-morts-i-9-desapareguts-a-Franca-pel-pas-de-la-tempesta-Xynthia|consulta=5 març 2014|data=3/3/2010}}{{ref-notícia|cognom=Libération|títol=Xynthia, retour sur la tempête|publicació=Libération|url=http://www.liberation.fr/tempete-xynthia,99875|consulta=5 març 2014|data=febrer i març 2010}} {{Webarchive|url=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |date=5 de març 2014 }} {{Ref-web |url=http://www.liberation.fr/tempete-xynthia,99875 |títol=Còpia arxivada 
|consulta=2014-03-05 |arxiuurl=https://web.archive.org/web/20140305165824/http://www.liberation.fr/tempete-xynthia,99875 |arxiudata=2014-03-05}}\n\n== Referències ==\n{{Referències}}\n\n{{Autoritat}}\n\n[[Categoria:Meteorologia|Ciclogènesi explosiva]]'}], 'width': 1280, 'height': 2856} 11%|█ | 2424/22095 [3:48:56<21:39:15, 3.96s/it] {'loss': 0.3955, 'grad_norm': 0.6853382068771207, 'learning_rate': 9.834526085842352e-06, 'epoch': 0.11} 11%|█ | 2424/22095 [3:48:56<21:39:15, 3.96s/it] 11%|█ | 2425/22095 [3:49:00<22:13:22, 4.07s/it] {'loss': 0.4379, 'grad_norm': 0.7321908758615492, 'learning_rate': 9.834339039477787e-06, 'epoch': 0.11} 11%|█ | 2425/22095 [3:49:00<22:13:22, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (161358 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2426/22095 [3:49:04<22:03:26, 4.04s/it] {'loss': 0.4054, 'grad_norm': 0.7753512763451639, 'learning_rate': 9.834151889238121e-06, 'epoch': 0.11} 11%|█ | 2426/22095 [3:49:04<22:03:26, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2427/22095 [3:49:14<31:28:45, 5.76s/it] {'loss': 0.5238, 'grad_norm': 0.3675674524715627, 'learning_rate': 9.83396463512738e-06, 'epoch': 0.11} 11%|█ | 2427/22095 [3:49:14<31:28:45, 5.76s/it] 11%|█ | 2428/22095 [3:49:24<37:51:21, 6.93s/it] {'loss': 0.504, 'grad_norm': 0.3506600433838427, 'learning_rate': 9.833777277149585e-06, 'epoch': 0.11} 11%|█ | 2428/22095 [3:49:24<37:51:21, 6.93s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█ | 2429/22095 [3:49:28<32:43:11, 5.99s/it] {'loss': 0.5326, 'grad_norm': 0.91527237285889, 'learning_rate': 9.833589815308761e-06, 'epoch': 0.11} 11%|█ | 2429/22095 [3:49:28<32:43:11, 5.99s/it] 11%|█ | 2430/22095 [3:49:31<28:41:25, 5.25s/it] {'loss': 0.4735, 'grad_norm': 0.8309215990408595, 'learning_rate': 9.833402249608938e-06, 'epoch': 0.11} 11%|█ | 2430/22095 
[3:49:31<28:41:25, 5.25s/it] 11%|█ | 2431/22095 [3:49:36<27:13:05, 4.98s/it] {'loss': 0.4463, 'grad_norm': 0.6832150109748468, 'learning_rate': 9.833214580054145e-06, 'epoch': 0.11} 11%|█ | 2431/22095 [3:49:36<27:13:05, 4.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2432/22095 [3:49:38<23:46:54, 4.35s/it] {'loss': 0.3863, 'grad_norm': 0.8721708352761983, 'learning_rate': 9.833026806648415e-06, 'epoch': 0.11} 11%|█ | 2432/22095 [3:49:38<23:46:54, 4.35s/it] 11%|█ | 2433/22095 [3:49:43<23:20:21, 4.27s/it] {'loss': 0.4044, 'grad_norm': 0.7530372258702653, 'learning_rate': 9.832838929395782e-06, 'epoch': 0.11} 11%|█ | 2433/22095 [3:49:43<23:20:21, 4.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2434/22095 [3:49:50<28:30:52, 5.22s/it] {'loss': 0.5151, 'grad_norm': 0.4474166144373787, 'learning_rate': 9.832650948300284e-06, 'epoch': 0.11} 11%|█ | 2434/22095 [3:49:50<28:30:52, 5.22s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31254.png 2025-08-27 19:47:50.590173 load time: 1049.45 ms 11%|█ | 2435/22095 [3:49:54<26:13:50, 4.80s/it] {'loss': 0.4523, 'grad_norm': 0.9044477916013893, 'learning_rate': 9.832462863365959e-06, 'epoch': 0.11} 11%|█ | 2435/22095 [3:49:54<26:13:50, 4.80s/it] 11%|█ | 2436/22095 [3:49:57<23:54:01, 4.38s/it] {'loss': 0.4247, 'grad_norm': 0.8437894209564453, 'learning_rate': 9.83227467459685e-06, 'epoch': 0.11} 11%|█ | 2436/22095 [3:49:57<23:54:01, 4.38s/it] 11%|█ | 2437/22095 [3:50:19<52:28:58, 9.61s/it] {'loss': 0.3963, 'grad_norm': 0.7387425628334219, 'learning_rate': 9.832086381996997e-06, 'epoch': 0.11} 11%|█ | 2437/22095 [3:50:19<52:28:58, 9.61s/it] 11%|█ | 2438/22095 [3:50:22<41:12:12, 7.55s/it] {'loss': 0.4649, 'grad_norm': 0.7854114015827275, 'learning_rate': 9.83189798557045e-06, 'epoch': 0.11} 11%|█ | 2438/22095 [3:50:22<41:12:12, 7.55s/it] 11%|█ | 2439/22095 [3:50:25<34:23:46, 
6.30s/it] {'loss': 0.3967, 'grad_norm': 0.827093510582746, 'learning_rate': 9.831709485321255e-06, 'epoch': 0.11} 11%|█ | 2439/22095 [3:50:25<34:23:46, 6.30s/it] 11%|█ | 2440/22095 [3:50:29<30:42:27, 5.62s/it] {'loss': 0.4789, 'grad_norm': 0.7078503797630626, 'learning_rate': 9.831520881253462e-06, 'epoch': 0.11} 11%|█ | 2440/22095 [3:50:29<30:42:27, 5.62s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (100014728 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 11%|█ | 2441/22095 [3:50:52<58:12:32, 10.66s/it] {'loss': 0.4161, 'grad_norm': 0.6430101363692668, 'learning_rate': 9.831332173371125e-06, 'epoch': 0.11} 11%|█ | 2441/22095 [3:50:52<58:12:32, 10.66s/it] 11%|█ | 2442/22095 [3:50:55<46:56:20, 8.60s/it] {'loss': 0.4383, 'grad_norm': 0.685885874237446, 'learning_rate': 9.831143361678299e-06, 'epoch': 0.11} 11%|█ | 2442/22095 [3:50:55<46:56:20, 8.60s/it] 11%|█ | 2443/22095 [3:50:59<38:03:51, 6.97s/it] {'loss': 0.432, 'grad_norm': 0.7105514565656513, 'learning_rate': 9.830954446179037e-06, 'epoch': 0.11} 11%|█ | 2443/22095 [3:50:59<38:03:51, 6.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2444/22095 [3:51:47<105:26:01, 19.32s/it] {'loss': 0.5381, 'grad_norm': 0.48220270099095647, 'learning_rate': 9.830765426877404e-06, 'epoch': 0.11} 11%|█ | 2444/22095 [3:51:47<105:26:01, 19.32s/it] 11%|█ | 2445/22095 [3:52:12<115:47:39, 21.21s/it] {'loss': 0.4586, 'grad_norm': 0.6741979700234679, 'learning_rate': 9.830576303777456e-06, 'epoch': 0.11} 11%|█ | 2445/22095 [3:52:12<115:47:39, 21.21s/it] 11%|█ | 2446/22095 [3:52:34<116:20:08, 21.31s/it] {'loss': 0.4608, 'grad_norm': 0.7119060669901887, 'learning_rate': 9.83038707688326e-06, 'epoch': 0.11} 11%|█ | 2446/22095 [3:52:34<116:20:08, 21.31s/it] 11%|█ | 2447/22095 [3:54:09<238:01:21, 43.61s/it] {'loss': 0.4664, 
'grad_norm': 0.8096309822720701, 'learning_rate': 9.830197746198882e-06, 'epoch': 0.11} 11%|█ | 2447/22095 [3:54:10<238:01:21, 43.61s/it]
11%|█ | 2448/22095 [3:54:13<171:59:44, 31.52s/it] {'loss': 0.4452, 'grad_norm': 0.8196824978383348, 'learning_rate': 9.83000831172839e-06, 'epoch': 0.11} 11%|█ | 2448/22095 [3:54:13<171:59:44, 31.52s/it]
11%|█ | 2449/22095 [3:54:35<157:31:34, 28.87s/it] {'loss': 0.4527, 'grad_norm': 0.7556365021830586, 'learning_rate': 9.829818773475852e-06, 'epoch': 0.11} 11%|█ | 2449/22095 [3:54:35<157:31:34, 28.87s/it]
VC:s3://internvl2/datasets/ai2diagram/ai2d/abc_images/1306.png 2025-08-27 19:52:34.246805 load time: 1032.48 ms
11%|█ | 2450/22095 [3:55:53<236:59:21, 43.43s/it] {'loss': 0.4201, 'grad_norm': 0.8006816242158974, 'learning_rate': 9.829629131445342e-06, 'epoch': 0.11} 11%|█ | 2450/22095 [3:55:53<236:59:21, 43.43s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8411391 in VC:s3://internvl-moe-sft-data/. Exception: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
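The ValueError above is raised by `_get_item` in data_qwen_2.py when a sample's image (here 84×23 px) falls below the 28-px minimum side length. A minimal pre-filter sketch that would catch such samples before training, assuming each record carries an `image_wh` field like the dumped "Problematic sample" entries do (the helper name `has_valid_images` is hypothetical):

```python
# Sketch of a pre-filter for the "Image size ... is too small" failures above.
# Assumption: samples record sizes under 'image_wh', as the dumped entries do;
# the 28-px floor matches the error message in the log.
MIN_SIDE = 28

def has_valid_images(sample, min_side=MIN_SIDE):
    """Return True if every recorded image size meets the minimum side length."""
    sizes = sample.get("image_wh", [])
    # 'image_wh' may be a flat [w, h] pair or a list of [w, h] pairs.
    if sizes and isinstance(sizes[0], int):
        sizes = [sizes]
    return all(w >= min_side and h >= min_side for w, h in sizes)

samples = [
    {"id": 1, "image_wh": [[84, 23]]},    # too short: would raise at train time
    {"id": 2, "image_wh": [[253, 250]]},  # fine
]
kept = [s for s in samples if has_valid_images(s)]
print([s["id"] for s in kept])  # → [2]
```

Running such a sweep once over the manifest would avoid the retry loop (`[Try #0] Failed to fetch sample ...`) paying the fetch cost at training time.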
Problematic sample: {'id': 13597, 'image': 'vrdu_table_final_2/astro-ph.CO/9ecbcd87-61f8-4cb3-92d3-fc88a2ccfbc1.png', 'image_wh': [[84, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{@{}c@{}}ALMA\\\\ \\end{tabular}\n```"}]} 11%|█ | 2451/22095 [3:56:34<233:57:03, 42.87s/it] {'loss': 0.4824, 'grad_norm': 0.7554350680724521, 'learning_rate': 9.829439385640936e-06, 'epoch': 0.11} 11%|█ | 2451/22095 [3:56:34<233:57:03, 42.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (94448 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2452/22095 [3:57:51<289:30:35, 53.06s/it] {'loss': 0.3956, 'grad_norm': 0.664479601846548, 'learning_rate': 9.82924953606671e-06, 'epoch': 0.11} 11%|█ | 2452/22095 [3:57:51<289:30:35, 53.06s/it]VC:s3://gui-agent/data_20250624/ubuntu/images/libreoffice_impress/d76723b2-ff69-426d-84e7-f53cf5e2fc1f/images/step_1.png 2025-08-27 19:55:50.065135 load time: 1018.68 ms VC:s3://st2pj/20250222/images/sam-all/images/sa_164091.jpg 2025-08-27 19:55:50.065120 load time: 1045.8 ms 11%|█ | 2453/22095 [3:58:50<298:33:02, 54.72s/it] {'loss': 0.4114, 'grad_norm': 0.6857058477647321, 'learning_rate': 9.829059582726743e-06, 'epoch': 0.11} 11%|█ | 2453/22095 [3:58:50<298:33:02, 54.72s/it] 11%|█ | 2454/22095 [4:00:08<337:30:05, 61.86s/it] {'loss': 0.4213, 'grad_norm': 0.7722700935916896, 'learning_rate': 9.828869525625118e-06, 'epoch': 0.11} 11%|█ | 2454/22095 [4:00:08<337:30:05, 61.86s/it] 11%|█ | 2455/22095 [4:01:27<364:32:05, 66.82s/it] {'loss': 0.4329, 'grad_norm': 0.7138037254822772, 'learning_rate': 9.828679364765917e-06, 'epoch': 0.11} 11%|█ | 2455/22095 [4:01:27<364:32:05, 66.82s/it] 11%|█ | 2456/22095 [4:02:45<383:30:25, 70.30s/it] {'loss': 0.4003, 'grad_norm': 0.779255191989751, 
'learning_rate': 9.828489100153224e-06, 'epoch': 0.11} 11%|█ | 2456/22095 [4:02:45<383:30:25, 70.30s/it] 11%|█ | 2457/22095 [4:03:30<342:29:29, 62.78s/it] {'loss': 0.4325, 'grad_norm': 0.9137826989883181, 'learning_rate': 9.828298731791133e-06, 'epoch': 0.11} 11%|█ | 2457/22095 [4:03:30<342:29:29, 62.78s/it] 11%|█ | 2458/22095 [4:04:10<305:05:25, 55.93s/it] {'loss': 0.4099, 'grad_norm': 0.7227084991013153, 'learning_rate': 9.82810825968373e-06, 'epoch': 0.11} 11%|█ | 2458/22095 [4:04:10<305:05:25, 55.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55401 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52813 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46849 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2459/22095 [4:04:52<281:44:29, 51.65s/it] {'loss': 0.3982, 'grad_norm': 0.7269166807445072, 'learning_rate': 9.827917683835109e-06, 'epoch': 0.11} 11%|█ | 2459/22095 [4:04:52<281:44:29, 51.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2460/22095 [4:05:20<243:20:41, 44.62s/it] {'loss': 0.5181, 'grad_norm': 0.6259376766075867, 'learning_rate': 9.827727004249366e-06, 'epoch': 0.11} 11%|█ | 2460/22095 [4:05:20<243:20:41, 44.62s/it] 11%|█ | 2461/22095 [4:06:21<269:12:23, 49.36s/it] {'loss': 0.4308, 'grad_norm': 0.7294144278342447, 'learning_rate': 9.827536220930596e-06, 'epoch': 0.11} 11%|█ | 2461/22095 [4:06:21<269:12:23, 49.36s/it] 11%|█ | 2462/22095 [4:07:28<298:12:36, 54.68s/it] {'loss': 0.4454, 'grad_norm': 0.7849374307877185, 'learning_rate': 9.827345333882898e-06, 'epoch': 0.11} 11%|█ | 2462/22095 [4:07:28<298:12:36, 54.68s/it] 11%|█ | 2463/22095 [4:08:45<335:34:07, 61.53s/it] {'loss': 0.4257, 'grad_norm': 0.8153581247572056, 'learning_rate': 9.827154343110376e-06, 'epoch': 0.11} 11%|█ | 2463/22095 [4:08:45<335:34:07, 61.53s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2464/22095 [4:09:44<330:50:23, 60.67s/it] {'loss': 0.4247, 'grad_norm': 0.6611837576004533, 'learning_rate': 9.826963248617133e-06, 'epoch': 0.11} 11%|█ | 2464/22095 [4:09:44<330:50:23, 60.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62581 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59377 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74058 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56381 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2465/22095 [4:11:05<364:24:53, 66.83s/it] {'loss': 0.4558, 'grad_norm': 0.7848171203104205, 'learning_rate': 9.826772050407273e-06, 'epoch': 0.11} 11%|█ | 2465/22095 [4:11:05<364:24:53, 66.83s/it] 11%|█ | 2466/22095 [4:13:03<447:49:34, 82.13s/it] {'loss': 0.3769, 'grad_norm': 2.1110926208357843, 'learning_rate': 9.826580748484908e-06, 'epoch': 0.11} 11%|█ | 2466/22095 [4:13:03<447:49:34, 82.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_510033.png 2025-08-27 20:11:01.812002 load time: 1040.46 ms 11%|█ | 2467/22095 [4:14:01<407:34:53, 74.76s/it] {'loss': 0.5348, 'grad_norm': 0.8213480521010255, 'learning_rate': 9.826389342854146e-06, 'epoch': 0.11} 11%|█ | 2467/22095 [4:14:01<407:34:53, 74.76s/it]VC:s3://gui-agent/data_20250526/windows/images/spotify/20250515_111734_1/images/before_screenshot_1_id_136_function_0_crop_1_grounding_instructions_random.png 2025-08-27 20:11:59.339132 load time: 1034.32 ms VC:s3://gui-agent/agentnet/win_mac_images/8bcf4187-7589-48d0-8db4-7a8b0845b66c.png 2025-08-27 20:11:59.336997 load time: 1053.31 ms 11%|█ | 2468/22095 [4:14:44<355:43:56, 65.25s/it] {'loss': 0.4407, 'grad_norm': 0.7099766095535562, 'learning_rate': 9.8261978335191e-06, 'epoch': 0.11} 11%|█ | 2468/22095 [4:14:44<355:43:56, 65.25s/it] 
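The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings mean some conversations tokenize to far more than the model's 40960-token window. A minimal length-screening sketch; the whitespace "tokenizer" is a stand-in (a real run would pass the actual Qwen2.5-VL `tokenizer.encode`), and `flatten_text`/`within_budget` are hypothetical helpers, not functions from the training code:

```python
# Screen out over-length samples behind the repeated
# "Token indices sequence length is longer than ..." warnings.
# 40960 matches the limit printed in the log.
MAX_LEN = 40960

def flatten_text(sample):
    """Concatenate all conversation turns into one string for length checking."""
    return "\n".join(turn["value"] for turn in sample.get("conversations", []))

def within_budget(sample, encode, max_len=MAX_LEN):
    """True if the encoded conversation fits the model's context window."""
    return len(encode(flatten_text(sample))) <= max_len

# Stand-in "tokenizer": whitespace split. A real run would use something like
# AutoTokenizer.from_pretrained(...).encode instead (assumption, not shown here).
toy_encode = str.split

sample = {"conversations": [{"value": "word " * 50}]}
print(within_budget(sample, toy_encode, max_len=100))  # → True
print(within_budget(sample, toy_encode, max_len=10))   # → False
```

Note this only approximates the real budget: image placeholder tokens expand to many vision tokens at collate time, so the true per-sample cost is higher than the text-only count.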
11%|█ | 2469/22095 [4:16:21<408:13:24, 74.88s/it] {'loss': 0.425, 'grad_norm': 0.7630263984746236, 'learning_rate': 9.826006220483886e-06, 'epoch': 0.11} 11%|█ | 2469/22095 [4:16:21<408:13:24, 74.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87901 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63097 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64481 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66242 > 40960). Running this sequence through the model will result in indexing errors 11%|█ | 2470/22095 [4:17:28<396:07:55, 72.67s/it] {'loss': 0.4686, 'grad_norm': 0.6950195430682758, 'learning_rate': 9.825814503752618e-06, 'epoch': 0.11} 11%|█ | 2470/22095 [4:17:28<396:07:55, 72.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42083 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59484 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51687 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█ | 2471/22095 [4:17:51<314:49:56, 57.76s/it] {'loss': 0.4228, 'grad_norm': 0.7677560822279192, 'learning_rate': 9.825622683329419e-06, 'epoch': 0.11} 11%|█ | 2471/22095 [4:17:51<314:49:56, 57.76s/it] 11%|█ | 2472/22095 [4:18:49<314:42:51, 57.74s/it] {'loss': 0.4136, 'grad_norm': 0.7758707129168558, 'learning_rate': 9.82543075921841e-06, 'epoch': 0.11} 11%|█ | 2472/22095 [4:18:49<314:42:51, 57.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2473/22095 [4:18:59<235:50:12, 43.27s/it] {'loss': 0.501, 'grad_norm': 0.46941180467558136, 'learning_rate': 9.825238731423713e-06, 'epoch': 0.11} 11%|█ | 2473/22095 [4:18:59<235:50:12, 43.27s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2474/22095 [4:19:19<198:58:37, 36.51s/it] {'loss': 0.4119, 'grad_norm': 0.7360325669571167, 'learning_rate': 9.825046599949455e-06, 'epoch': 0.11} 11%|█ | 2474/22095 [4:19:19<198:58:37, 36.51s/it] 11%|█ | 2475/22095 [4:20:39<269:03:07, 49.37s/it] {'loss': 0.4081, 'grad_norm': 0.7760259047685877, 'learning_rate': 9.824854364799766e-06, 'epoch': 0.11} 11%|█ | 2475/22095 [4:20:39<269:03:07, 49.37s/it] 11%|█ | 2476/22095 [4:21:45<296:37:00, 54.43s/it] {'loss': 0.4486, 'grad_norm': 0.764283254991474, 'learning_rate': 9.824662025978774e-06, 'epoch': 0.11} 11%|█ | 2476/22095 [4:21:45<296:37:00, 54.43s/it] 11%|█ | 2477/22095 [4:22:32<285:02:24, 52.31s/it] {'loss': 0.4288, 'grad_norm': 0.702833413633441, 'learning_rate': 9.824469583490612e-06, 'epoch': 0.11} 11%|█ | 2477/22095 [4:22:32<285:02:24, 52.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█ | 2478/22095 [4:23:19<275:27:41, 50.55s/it] {'loss': 0.5195, 'grad_norm': 0.40113438821656633, 'learning_rate': 9.824277037339419e-06, 'epoch': 0.11} 11%|█ | 2478/22095 [4:23:19<275:27:41, 50.55s/it]Token indices sequence length is 
longer than the specified maximum sequence length for this model (89760 > 40960). Running this sequence through the model will result in indexing errors
11%|█ | 2479/22095 [4:23:28<207:58:48, 38.17s/it] {'loss': 0.5168, 'grad_norm': 0.38158292572294616, 'learning_rate': 9.824084387529326e-06, 'epoch': 0.11} 11%|█ | 2479/22095 [4:23:28<207:58:48, 38.17s/it]
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047793 in VC:s3://multi-modal/UniGeo/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 
8'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]} 11%|█ | 2480/22095 [4:23:31<150:54:48, 27.70s/it] {'loss': 0.4531, 'grad_norm': 0.956080719114455, 'learning_rate': 9.823891634064478e-06, 'epoch': 0.11} 11%|█ | 2480/22095 [4:23:31<150:54:48, 27.70s/it] 11%|█ | 2481/22095 [4:24:14<174:48:00, 32.08s/it] {'loss': 0.4813, 'grad_norm': 0.7235253068311136, 'learning_rate': 9.823698776949011e-06, 'epoch': 0.11} 11%|█ | 2481/22095 [4:24:14<174:48:00, 32.08s/it] 11%|█ | 2482/22095 [4:24:17<127:52:52, 23.47s/it] {'loss': 0.4315, 'grad_norm': 0.7522682244822607, 'learning_rate': 9.823505816187076e-06, 'epoch': 0.11} 11%|█ | 2482/22095 [4:24:17<127:52:52, 23.47s/it] 11%|█ | 2483/22095 [4:24:59<158:49:01, 29.15s/it] {'loss': 0.43, 'grad_norm': 0.8176435014549138, 'learning_rate': 9.823312751782812e-06, 'epoch': 0.11} 11%|█ | 2483/22095 [4:24:59<158:49:01, 29.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█ | 2484/22095 [4:25:59<208:17:50, 38.24s/it] {'loss': 0.4744, 'grad_norm': 0.718478291493913, 'learning_rate': 9.823119583740373e-06, 'epoch': 0.11} 11%|█ | 2484/22095 [4:25:59<208:17:50, 38.24s/it] 11%|█ | 2485/22095 [4:26:03<152:13:45, 27.95s/it] {'loss': 0.4179, 'grad_norm': 0.6502726205028461, 'learning_rate': 9.822926312063905e-06, 'epoch': 0.11} 11%|█ | 2485/22095 [4:26:03<152:13:45, 27.95s/it] 11%|█▏ | 2486/22095 [4:26:44<173:14:32, 31.81s/it] {'loss': 0.4687, 'grad_norm': 0.7305782785378717, 'learning_rate': 9.822732936757564e-06, 'epoch': 0.11} 11%|█▏ | 2486/22095 [4:26:44<173:14:32, 31.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (113273 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74219 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51320 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41439 > 40960). Running this sequence through the model will result in indexing errors 11%|█▏ | 2487/22095 [4:27:32<200:48:44, 36.87s/it] {'loss': 0.531, 'grad_norm': 0.6004507126829316, 'learning_rate': 9.822539457825505e-06, 'epoch': 0.11} 11%|█▏ | 2487/22095 [4:27:32<200:48:44, 36.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 11%|█▏ | 2488/22095 [4:27:36<146:38:19, 26.92s/it] {'loss': 0.4376, 'grad_norm': 0.7823041938016477, 'learning_rate': 9.822345875271884e-06, 'epoch': 0.11} 11%|█▏ | 2488/22095 [4:27:36<146:38:19, 26.92s/it] 11%|█▏ | 2489/22095 [4:27:39<108:09:41, 19.86s/it] {'loss': 0.4314, 'grad_norm': 0.7453882450020083, 'learning_rate': 9.82215218910086e-06, 'epoch': 0.11} 11%|█▏ | 2489/22095 [4:27:39<108:09:41, 19.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (112053 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69088 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94168 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91463 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56207 > 40960). Running this sequence through the model will result in indexing errors 11%|█▏ | 2490/22095 [4:27:43<81:17:56, 14.93s/it] {'loss': 0.3992, 'grad_norm': 0.6997393893692536, 'learning_rate': 9.821958399316595e-06, 'epoch': 0.11} 11%|█▏ | 2490/22095 [4:27:43<81:17:56, 14.93s/it] 11%|█▏ | 2491/22095 [4:28:07<96:52:03, 17.79s/it] {'loss': 0.4732, 'grad_norm': 0.8002430016187357, 'learning_rate': 9.821764505923257e-06, 'epoch': 0.11} 11%|█▏ | 2491/22095 [4:28:07<96:52:03, 17.79s/it] 11%|█▏ | 2492/22095 [4:28:48<135:06:26, 24.81s/it] {'loss': 0.4273, 'grad_norm': 0.7487575906480041, 'learning_rate': 9.821570508925005e-06, 'epoch': 0.11} 11%|█▏ | 2492/22095 [4:28:48<135:06:26, 24.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74482 > 40960). 
Running this sequence through the model will result in indexing errors 11%|█▏ | 2493/22095 [4:28:52<100:03:24, 18.38s/it] {'loss': 0.4093, 'grad_norm': 0.741383909506027, 'learning_rate': 9.821376408326013e-06, 'epoch': 0.11} 11%|█▏ | 2493/22095 [4:28:52<100:03:24, 18.38s/it] 11%|█▏ | 2494/22095 [4:28:56<76:02:38, 13.97s/it] {'loss': 0.4081, 'grad_norm': 0.8072793420604332, 'learning_rate': 9.821182204130448e-06, 'epoch': 0.11} 11%|█▏ | 2494/22095 [4:28:56<76:02:38, 13.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█▏ | 2495/22095 [4:29:05<69:19:01, 12.73s/it] {'loss': 0.515, 'grad_norm': 0.4724509747117798, 'learning_rate': 9.820987896342487e-06, 'epoch': 0.11} 11%|█▏ | 2495/22095 [4:29:05<69:19:01, 12.73s/it] 11%|█▏ | 2496/22095 [4:29:09<54:00:29, 9.92s/it] {'loss': 0.4594, 'grad_norm': 0.758529766742308, 'learning_rate': 9.8207934849663e-06, 'epoch': 0.11} 11%|█▏ | 2496/22095 [4:29:09<54:00:29, 9.92s/it] 11%|█▏ | 2497/22095 [4:29:12<43:12:54, 7.94s/it] {'loss': 0.4138, 'grad_norm': 0.7146459027236057, 'learning_rate': 9.820598970006068e-06, 'epoch': 0.11} 11%|█▏ | 2497/22095 [4:29:12<43:12:54, 7.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 11%|█▏ | 2498/22095 [4:29:39<74:36:08, 13.70s/it] {'loss': 0.5035, 'grad_norm': 0.34575479257833636, 'learning_rate': 9.82040435146597e-06, 'epoch': 0.11} 11%|█▏ | 2498/22095 [4:29:39<74:36:08, 13.70s/it] 11%|█▏ | 2499/22095 [4:29:46<63:47:16, 11.72s/it] {'loss': 0.4976, 'grad_norm': 0.3175300244710189, 'learning_rate': 9.820209629350189e-06, 'epoch': 0.11} 11%|█▏ | 2499/22095 [4:29:46<63:47:16, 11.72s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 11%|█▏ | 2500/22095 [4:29:50<50:12:53, 9.23s/it] {'loss': 0.4725, 'grad_norm': 0.9579112274642932, 'learning_rate': 9.820014803662905e-06, 'epoch': 0.11} 11%|█▏ | 2500/22095 [4:29:50<50:12:53, 9.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image 
Rank 0: Fixed image tokens in the conversation
 11%|█▏ | 2501/22095 [4:29:57<47:06:46, 8.66s/it] {'loss': 0.5118, 'grad_norm': 0.42437979117341523, 'learning_rate': 9.819819874408306e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 11%|█▏ | 2502/22095 [4:30:01<39:37:48, 7.28s/it] {'loss': 0.4211, 'grad_norm': 1.0258124340930102, 'learning_rate': 9.81962484159058e-06, 'epoch': 0.11}
 11%|█▏ | 2503/22095 [4:30:05<33:29:20, 6.15s/it] {'loss': 0.4202, 'grad_norm': 0.7774769252865491, 'learning_rate': 9.819429705213922e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (50563, 50141, 113421 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2504/22095 [4:30:08<28:44:36, 5.28s/it] {'loss': 0.4285, 'grad_norm': 0.7262055087963113, 'learning_rate': 9.819234465282518e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (107139 > 40960). Running this sequence through the model will result in indexing errors
 11%|█▏ | 2505/22095 [4:30:12<26:04:33, 4.79s/it] {'loss': 0.4395, 'grad_norm': 0.8528024190792679, 'learning_rate': 9.819039121800568e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (48855, 62611 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2506/22095 [4:30:15<23:26:38, 4.31s/it] {'loss': 0.4621, 'grad_norm': 0.723040071029397, 'learning_rate': 9.818843674772268e-06, 'epoch': 0.11}
 11%|█▏ | 2507/22095 [4:30:17<20:58:49, 3.86s/it] {'loss': 0.4238, 'grad_norm': 0.6986040317293818, 'learning_rate': 9.818648124201817e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (91819, 43487, 48910, 70099, 92495, 72126, 92815, 148444, 70354 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2508/22095 [4:30:21<20:35:49, 3.79s/it] {'loss': 0.463, 'grad_norm': 0.8031078796455979, 'learning_rate': 9.818452470093416e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (41227, 45580, 41008, 44405, 42724, 95154 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2509/22095 [4:30:24<19:48:16, 3.64s/it] {'loss': 0.4244, 'grad_norm': 0.7185941171883956, 'learning_rate': 9.818256712451272e-06, 'epoch': 0.11}
 11%|█▏ | 2510/22095 [4:30:29<20:50:20, 3.83s/it] {'loss': 0.422, 'grad_norm': 0.7064335790434774, 'learning_rate': 9.81806085127959e-06, 'epoch': 0.11}
 11%|█▏ | 2511/22095 [4:30:33<21:44:11, 4.00s/it] {'loss': 0.4074, 'grad_norm': 0.658313320908971, 'learning_rate': 9.817864886582575e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 11%|█▏ | 2512/22095 [4:30:43<30:53:23, 5.68s/it] {'loss': 0.5352, 'grad_norm': 0.5048450850733868, 'learning_rate': 9.817668818364441e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (60547 > 40960). Running this sequence through the model will result in indexing errors
 11%|█▏ | 2513/22095 [4:30:46<27:37:43, 5.08s/it] {'loss': 0.4123, 'grad_norm': 0.8615294240399992, 'learning_rate': 9.817472646629403e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8895849 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19002, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 11%|█▏ | 2514/22095 [4:30:56<35:17:30, 6.49s/it] {'loss': 0.4918, 'grad_norm': 0.3868215144673393, 'learning_rate': 9.817276371381671e-06, 'epoch': 0.11}
 11%|█▏ | 2515/22095 [4:31:00<30:20:49, 5.58s/it] {'loss': 0.4139, 'grad_norm': 0.72087500047661, 'learning_rate': 9.817079992625467e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45820 > 40960). Running this sequence through the model will result in indexing errors
 11%|█▏ | 2516/22095 [4:31:10<37:24:34, 6.88s/it] {'loss': 0.5143, 'grad_norm': 0.3519348388728298, 'learning_rate': 9.816883510365007e-06, 'epoch': 0.11}
 11%|█▏ | 2517/22095 [4:31:19<41:41:37, 7.67s/it] {'loss': 0.5142, 'grad_norm': 0.37694064143955736, 'learning_rate': 9.816686924604515e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 11%|█▏ | 2518/22095 [4:31:22<34:50:28, 6.41s/it] {'loss': 0.4321, 'grad_norm': 0.9441190145175116, 'learning_rate': 9.816490235348215e-06, 'epoch': 0.11}
 11%|█▏ | 2519/22095 [4:31:26<30:31:37, 5.61s/it] {'loss': 0.4461, 'grad_norm': 0.77127837366527, 'learning_rate': 9.816293442600331e-06, 'epoch': 0.11}
 11%|█▏ | 2520/22095 [4:31:30<27:14:26, 5.01s/it] {'loss': 0.3901, 'grad_norm': 0.8364582358922202, 'learning_rate': 9.816096546365094e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 11%|█▏ | 2521/22095 [4:31:41<36:33:38, 6.72s/it] {'loss': 0.5407, 'grad_norm': 0.500008347179003, 'learning_rate': 9.815899546646734e-06, 'epoch': 0.11}
 11%|█▏ | 2522/22095 [4:31:44<31:50:43, 5.86s/it] {'loss': 0.428, 'grad_norm': 0.8221660850937721, 'learning_rate': 9.815702443449482e-06, 'epoch': 0.11}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 11%|█▏ | 2523/22095 [4:31:49<29:54:49, 5.50s/it] {'loss': 0.4608, 'grad_norm': 0.869718534293267, 'learning_rate': 9.815505236777576e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924588 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47741, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 32cm'}, {'from': 'gpt', 'value': '∵点D是AC的中点,∴bc=\\frac{1}{2}AC,∵点E是BC的中点,∴CE=\\frac{1}{2}CB,∴DE=DC+CE=\\frac{1}{2}(AC+CB)=8cm,故选B'}]}
 11%|█▏ | 2524/22095 [4:31:59<36:39:56, 6.74s/it] {'loss': 0.5186, 'grad_norm': 0.38870622655972636, 'learning_rate': 9.815307926635252e-06, 'epoch': 0.11}
 11%|█▏ | 2525/22095 [4:32:03<31:52:57, 5.86s/it] {'loss': 0.3684, 'grad_norm': 0.8050537795361368, 'learning_rate': 9.815110513026749e-06, 'epoch': 0.11}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954496 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5331, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 12cm\nB. 6cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 11%|█▏ | 2526/22095 [4:32:06<28:42:41, 5.28s/it] {'loss': 0.4784, 'grad_norm': 0.9066359801731271, 'learning_rate': 9.814912995956311e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (43406, 60332 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2527/22095 [4:32:10<26:07:30, 4.81s/it] {'loss': 0.4326, 'grad_norm': 0.780943610306695, 'learning_rate': 9.814715375428181e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (102379 > 40960). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [337, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8504172 in VC:s3://internvl-moe-sft-data/. Exception: Image size [337, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39980, 'image': 'vrdu_texteq/astro-ph.CO/5f983087-133b-49c4-96af-28f6c061e3a6.png', 'image_wh': [[337, 23]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'Redshift slice $0.7 < z < 0.9$:'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047168 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 6cm\nB. 12cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 11%|█▏ | 2528/22095 [4:32:14<23:47:33, 4.38s/it] {'loss': 0.4624, 'grad_norm': 0.7916808022497258, 'learning_rate': 9.814517651446603e-06, 'epoch': 0.11}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [381, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8445556 in VC:s3://internvl-moe-sft-data/. Exception: Image size [381, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7297, 'image': 'vrdu_texteq/astro-ph.CO/958e713c-7b32-4d62-a228-b2e203e56487.png', 'image_wh': [[381, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'where the terms $\\Gamma_i$ are defined:'}]}
 11%|█▏ | 2529/22095 [4:32:16<21:26:18, 3.94s/it] {'loss': 0.4368, 'grad_norm': 0.803747106010139, 'learning_rate': 9.814319824015827e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (90708, 116284 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2530/22095 [4:32:20<20:34:26, 3.79s/it] {'loss': 0.4765, 'grad_norm': 0.764847693389444, 'learning_rate': 9.814121893140105e-06, 'epoch': 0.11}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 11%|█▏ | 2531/22095 [4:32:23<19:27:20, 3.58s/it] {'loss': 0.4087, 'grad_norm': 0.6383253482048825, 'learning_rate': 9.81392385882369e-06, 'epoch': 0.11}
Token indices sequence length is longer than the specified maximum sequence length for this model (63022, 75521 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2532/22095 [4:32:26<18:36:17, 3.42s/it] {'loss': 0.4072, 'grad_norm': 0.821489295199945, 'learning_rate': 9.813725721070834e-06, 'epoch': 0.11}
 11%|█▏ | 2533/22095 [4:32:29<17:40:51, 3.25s/it] {'loss': 0.4296, 'grad_norm': 0.8707012364160911, 'learning_rate': 9.813527479885799e-06, 'epoch': 0.11}
 11%|█▏ | 2534/22095 [4:32:33<18:55:51, 3.48s/it] {'loss': 0.4377, 'grad_norm': 1.3106924634509085, 'learning_rate': 9.813329135272841e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (64417, 43650, 46357, 128025 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2535/22095 [4:32:42<28:51:08, 5.31s/it] {'loss': 0.5184, 'grad_norm': 0.5267145937741655, 'learning_rate': 9.813130687236222e-06, 'epoch': 0.11}
 11%|█▏ | 2536/22095 [4:32:46<26:25:37, 4.86s/it] {'loss': 0.4098, 'grad_norm': 0.7352462805393354, 'learning_rate': 9.81293213578021e-06, 'epoch': 0.11}
 11%|█▏ | 2537/22095 [4:32:49<23:05:59, 4.25s/it] {'loss': 0.4135, 'grad_norm': 0.8631311864144396, 'learning_rate': 9.812733480909065e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 11%|█▏ | 2538/22095 [4:32:59<32:04:37, 5.90s/it] {'loss': 0.5288, 'grad_norm': 0.4587714355663383, 'learning_rate': 9.812534722627058e-06, 'epoch': 0.11}
 11%|█▏ | 2539/22095 [4:33:02<28:11:11, 5.19s/it] {'loss': 0.4404, 'grad_norm': 0.7901169569777019, 'learning_rate': 9.812335860938462e-06, 'epoch': 0.11}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55582, 63142, 119199, 76643, 42706, 75271, 83048 > 40960). Running these sequences through the model will result in indexing errors
 11%|█▏ | 2540/22095 [4:33:09<30:04:52, 5.54s/it] {'loss': 0.5265, 'grad_norm': 0.361868647097338, 'learning_rate': 9.812136895847548e-06, 'epoch': 0.11}
 12%|█▏ | 2541/22095 [4:33:12<26:40:40, 4.91s/it] {'loss': 0.4571, 'grad_norm': 0.7596945627994101, 'learning_rate': 9.811937827358592e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (43600 > 40960). Running this sequence through the model will result in indexing errors
 12%|█▏ | 2542/22095 [4:33:15<23:28:59, 4.32s/it] {'loss': 0.4267, 'grad_norm': 0.7906186891160457, 'learning_rate': 9.81173865547587e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53122, 42416, 81218, 85406, 41493 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2543/22095 [4:33:25<32:10:09, 5.92s/it] {'loss': 0.4849, 'grad_norm': 0.4117514950499734, 'learning_rate': 9.811539380203663e-06, 'epoch': 0.12}
 12%|█▏ | 2544/22095 [4:33:28<27:48:27, 5.12s/it] {'loss': 0.4434, 'grad_norm': 0.7657511338022506, 'learning_rate': 9.811340001546252e-06, 'epoch': 0.12}
 12%|█▏ | 2545/22095 [4:33:31<24:05:48, 4.44s/it] {'loss': 0.4494, 'grad_norm': 0.7789309960897834, 'learning_rate': 9.811140519507922e-06, 'epoch': 0.12}
 12%|█▏ | 2546/22095 [4:33:35<24:07:13, 4.44s/it] {'loss': 0.4671, 'grad_norm': 0.6871882671752764, 'learning_rate': 9.810940934092958e-06, 'epoch': 0.12}
 12%|█▏ | 2547/22095 [4:33:39<22:13:51, 4.09s/it] {'loss': 0.4138, 'grad_norm': 0.7825805615576867, 'learning_rate': 9.810741245305649e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877036 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 189, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 5cm\nB. 15cm\nC. 16cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 12%|█▏ | 2548/22095 [4:33:48<31:09:35, 5.74s/it] {'loss': 0.4827, 'grad_norm': 0.45342424459144637, 'learning_rate': 9.810541453150286e-06, 'epoch': 0.12}
 12%|█▏ | 2549/22095 [4:33:52<27:59:29, 5.16s/it] {'loss': 0.4261, 'grad_norm': 0.7997696049758634, 'learning_rate': 9.810341557631161e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 12%|█▏ | 2550/22095 [4:33:59<31:19:44, 5.77s/it] {'loss': 0.5186, 'grad_norm': 0.3354195057845609, 'learning_rate': 9.81014155875257e-06, 'epoch': 0.12}
 12%|█▏ | 2551/22095 [4:34:06<33:44:58, 6.22s/it] {'loss': 0.5159, 'grad_norm': 0.3518734423957318, 'learning_rate': 9.80994145651881e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954484 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5319, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 8cm\nB. 5cm\nC. 6cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 12%|█▏ | 2552/22095 [4:34:10<29:19:32, 5.40s/it] {'loss': 0.4127, 'grad_norm': 0.8518932818441349, 'learning_rate': 9.809741250934182e-06, 'epoch': 0.12}
 12%|█▏ | 2553/22095 [4:34:14<27:40:05, 5.10s/it] {'loss': 0.47, 'grad_norm': 0.8049676685657544, 'learning_rate': 9.809540942002984e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [331, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8457834 in VC:s3://internvl-moe-sft-data/. Exception: Image size [331, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 85801, 'image': 'vrdu_texteq/astro-ph.CO/020b7c8f-4a53-4aa4-9b9b-6e7ed5e010c3.png', 'image_wh': [[331, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $D$ is a new constant.'}]}
 12%|█▏ | 2554/22095 [4:34:18<25:00:33, 4.61s/it] {'loss': 0.4076, 'grad_norm': 1.1333619096212444, 'learning_rate': 9.809340529729523e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76539, 128393, 113141, 86634 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2555/22095 [4:34:25<28:59:53, 5.34s/it] {'loss': 0.4991, 'grad_norm': 0.4749602483194078, 'learning_rate': 9.809140014118106e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (48760, 113273, 109340 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2556/22095 [4:34:28<26:03:14, 4.80s/it] {'loss': 0.4566, 'grad_norm': 0.8929118199192368, 'learning_rate': 9.80893939517304e-06, 'epoch': 0.12}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 12%|█▏ | 2557/22095 [4:34:31<23:10:29, 4.27s/it] {'loss': 0.4217, 'grad_norm': 1.0534167992870758, 'learning_rate': 9.808738672898637e-06, 'epoch': 0.12}
 12%|█▏ | 2558/22095 [4:34:34<21:00:51, 3.87s/it] {'loss': 0.43, 'grad_norm': 0.9090595000677438, 'learning_rate': 9.808537847299206e-06, 'epoch': 0.12}
 12%|█▏ | 2559/22095 [4:34:38<19:47:40, 3.65s/it] {'loss': 0.4057, 'grad_norm': 0.7642568276547604, 'learning_rate': 9.808336918379068e-06, 'epoch': 0.12}
 12%|█▏ | 2560/22095 [4:34:41<19:16:28, 3.55s/it] {'loss': 0.4222, 'grad_norm': 0.752331667095118, 'learning_rate': 9.808135886142536e-06, 'epoch': 0.12}
 12%|█▏ | 2561/22095 [4:34:44<18:52:47, 3.48s/it] {'loss': 0.4255, 'grad_norm': 0.7131841249575264, 'learning_rate': 9.80793475059393e-06, 'epoch': 0.12}
 12%|█▏ | 2562/22095 [4:34:48<19:50:42, 3.66s/it] {'loss': 0.4236, 'grad_norm': 1.0519989552101878, 'learning_rate': 9.807733511737574e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 12%|█▏ | 2563/22095 [4:34:54<24:02:58, 4.43s/it] {'loss': 0.4961, 'grad_norm': 0.5128610138385369, 'learning_rate': 9.80753216957779e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (43365 > 40960). Running this sequence through the model will result in indexing errors
 12%|█▏ | 2564/22095 [4:35:04<32:11:10, 5.93s/it] {'loss': 0.5104, 'grad_norm': 0.45318445672664937, 'learning_rate': 9.807330724118906e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 12%|█▏ | 2565/22095 [4:35:07<28:14:00, 5.20s/it] {'loss': 0.4216, 'grad_norm': 0.9444688018175134, 'learning_rate': 9.807129175365248e-06, 'epoch': 0.12}
 12%|█▏ | 2566/22095 [4:35:11<26:11:03, 4.83s/it] {'loss': 0.4404, 'grad_norm': 0.7334542392066108, 'learning_rate': 9.806927523321148e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408175 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10367, 'image': 'vrdu_table_final_2/astro-ph.CO/7e030d42-dd5c-4679-9581-2aa63a5454e9.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
 12%|█▏ | 2567/22095 [4:35:14<23:03:01, 4.25s/it] {'loss': 0.3966, 'grad_norm': 0.667130718816056, 'learning_rate': 9.806725767990938e-06, 'epoch': 0.12}
 12%|█▏ | 2568/22095 [4:35:18<22:48:46, 4.21s/it] {'loss': 0.4384, 'grad_norm': 0.7872579252818356, 'learning_rate': 9.806523909378956e-06, 'epoch': 0.12}
 12%|█▏ | 2569/22095 [4:35:21<20:40:00, 3.81s/it] {'loss': 0.4756, 'grad_norm': 0.8402962544898573, 'learning_rate': 9.806321947489537e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [153, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8437728 in VC:s3://internvl-moe-sft-data/. Exception: Image size [153, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 130284, 'image': 'vrdu_texteq/astro-ph.CO/c4dc01b9-0a63-484a-b6bc-fbce49bf2f22.png', 'image_wh': [[153, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'within \\mbox{$\\sim 1\\sigma$}.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 12%|█▏ | 2570/22095 [4:35:25<20:11:27, 3.72s/it] {'loss': 0.448, 'grad_norm': 0.8202872754958558, 'learning_rate': 9.806119882327019e-06, 'epoch': 0.12}
 12%|█▏ | 2571/22095 [4:35:29<20:14:00, 3.73s/it] {'loss': 0.4321, 'grad_norm': 0.7107492495403155, 'learning_rate': 9.805917713895748e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49717, 78014 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2572/22095 [4:35:36<26:07:33, 4.82s/it] {'loss': 0.5144, 'grad_norm': 0.771093224734539, 'learning_rate': 9.805715442200065e-06, 'epoch': 0.12}
 12%|█▏ | 2573/22095 [4:35:40<25:28:24, 4.70s/it] {'loss': 0.4454, 'grad_norm': 0.9386984016833877, 'learning_rate': 9.805513067244316e-06, 'epoch': 0.12}
 12%|█▏ | 2574/22095 [4:35:43<22:50:35, 4.21s/it] {'loss': 0.4083, 'grad_norm': 0.8371963863890383, 'learning_rate': 9.80531058903285e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (41255, 45499, 51180 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2575/22095 [4:35:47<21:11:05, 3.91s/it] {'loss': 0.4178, 'grad_norm': 0.7628481457269319, 'learning_rate': 9.805108007570019e-06, 'epoch': 0.12}
 12%|█▏ | 2576/22095 [4:35:50<19:52:56, 3.67s/it] {'loss': 0.3906, 'grad_norm': 0.7629279641165384, 'learning_rate': 9.804905322860174e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (41582, 50399, 109552 > 40960). Running these sequences through the model will result in indexing errors
 12%|█▏ | 2577/22095 [4:35:53<18:38:23, 3.44s/it] {'loss': 0.4435, 'grad_norm': 0.7876862625935663, 'learning_rate': 9.80470253490767e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (103493 >
Running this sequence through the model will result in indexing errors 12%|█▏ | 2578/22095 [4:35:56<18:31:49, 3.42s/it] {'loss': 0.427, 'grad_norm': 0.8393703723647782, 'learning_rate': 9.804499643716866e-06, 'epoch': 0.12} 12%|█▏ | 2578/22095 [4:35:56<18:31:49, 3.42s/it] 12%|█▏ | 2579/22095 [4:35:59<18:01:14, 3.32s/it] {'loss': 0.4125, 'grad_norm': 0.8430332725251674, 'learning_rate': 9.804296649292119e-06, 'epoch': 0.12} 12%|█▏ | 2579/22095 [4:35:59<18:01:14, 3.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49131 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2580/22095 [4:36:08<26:41:01, 4.92s/it] {'loss': 0.5417, 'grad_norm': 0.5868369641884476, 'learning_rate': 9.804093551637794e-06, 'epoch': 0.12} 12%|█▏ | 2580/22095 [4:36:08<26:41:01, 4.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66533 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65399 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (41926 > 40960) for 4 sample(s). Truncating to 536 with 2 samples. 
12%|█▏ | 2581/22095 [4:36:11<24:12:26, 4.47s/it] {'loss': 0.4395, 'grad_norm': 0.7528081549316857, 'learning_rate': 9.803890350758253e-06, 'epoch': 0.12} 12%|█▏ | 2581/22095 [4:36:11<24:12:26, 4.47s/it] 12%|█▏ | 2582/22095 [4:36:15<22:34:19, 4.16s/it] {'loss': 0.4565, 'grad_norm': 0.812847278681468, 'learning_rate': 9.803687046657863e-06, 'epoch': 0.12} 12%|█▏ | 2582/22095 [4:36:15<22:34:19, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [128, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8377913 in VC:s3://internvl-moe-sft-data/. Exception: Image size [128, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 44696, 'image': 'vrdu_table_final_2/astro-ph.CO/b9bbb10b-fbbb-4989-b5cc-9fbfbb798344.png', 'image_wh': [[128, 23]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}}Resolution \\\\ \\Mpch\\end{tabular}\n```"}]} 12%|█▏ | 2583/22095 [4:36:24<31:35:45, 5.83s/it] {'loss': 0.5108, 'grad_norm': 0.3822252791723251, 'learning_rate': 9.80348363934099e-06, 'epoch': 0.12} 12%|█▏ | 2583/22095 [4:36:24<31:35:45, 5.83s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047143 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 4\nB. 5\nC. 6\nD. 
3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 12%|█▏ | 2584/22095 [4:36:27<27:19:38, 5.04s/it] {'loss': 0.4828, 'grad_norm': 0.81855950148487, 'learning_rate': 9.803280128812009e-06, 'epoch': 0.12} 12%|█▏ | 2584/22095 [4:36:28<27:19:38, 5.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2585/22095 [4:36:36<33:02:28, 6.10s/it] {'loss': 0.5013, 'grad_norm': 0.38534699136789974, 'learning_rate': 9.803076515075288e-06, 'epoch': 0.12} 12%|█▏ | 2585/22095 [4:36:36<33:02:28, 6.10s/it] 12%|█▏ | 2586/22095 [4:36:40<28:49:35, 5.32s/it] {'loss': 0.3777, 'grad_norm': 0.8995071789486414, 'learning_rate': 9.802872798135205e-06, 'epoch': 0.12} 12%|█▏ | 2586/22095 [4:36:40<28:49:35, 5.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54795 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2587/22095 [4:36:43<25:02:34, 4.62s/it] {'loss': 0.4325, 'grad_norm': 0.6654776493366726, 'learning_rate': 9.802668977996134e-06, 'epoch': 0.12} 12%|█▏ | 2587/22095 [4:36:43<25:02:34, 4.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2588/22095 [4:36:52<33:41:40, 6.22s/it] {'loss': 0.5202, 'grad_norm': 0.40645015656442596, 'learning_rate': 9.80246505466246e-06, 'epoch': 0.12} 12%|█▏ | 2588/22095 [4:36:53<33:41:40, 6.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2589/22095 [4:37:03<40:11:52, 7.42s/it] {'loss': 0.4921, 'grad_norm': 0.4328728192860413, 'learning_rate': 9.802261028138563e-06, 'epoch': 0.12} 12%|█▏ | 2589/22095 [4:37:03<40:11:52, 7.42s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 12%|█▏ | 2590/22095 [4:37:07<34:28:46, 6.36s/it] {'loss': 0.4322, 'grad_norm': 0.8520383668317281, 'learning_rate': 9.802056898428823e-06, 'epoch': 0.12} 12%|█▏ | 
2590/22095 [4:37:07<34:28:46, 6.36s/it] 12%|█▏ | 2591/22095 [4:37:11<30:29:21, 5.63s/it] {'loss': 0.4772, 'grad_norm': 0.8162455760932941, 'learning_rate': 9.801852665537628e-06, 'epoch': 0.12} 12%|█▏ | 2591/22095 [4:37:11<30:29:21, 5.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (95259 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2592/22095 [4:37:15<27:53:46, 5.15s/it] {'loss': 0.4096, 'grad_norm': 0.715639339968077, 'learning_rate': 9.801648329469368e-06, 'epoch': 0.12} 12%|█▏ | 2592/22095 [4:37:15<27:53:46, 5.15s/it] 12%|█▏ | 2593/22095 [4:37:18<25:07:04, 4.64s/it] {'loss': 0.4204, 'grad_norm': 0.7674946740030191, 'learning_rate': 9.801443890228433e-06, 'epoch': 0.12} 12%|█▏ | 2593/22095 [4:37:18<25:07:04, 4.64s/it] 12%|█▏ | 2594/22095 [4:37:21<22:40:42, 4.19s/it] {'loss': 0.4088, 'grad_norm': 0.8687384401260149, 'learning_rate': 9.801239347819213e-06, 'epoch': 0.12} 12%|█▏ | 2594/22095 [4:37:21<22:40:42, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41861 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2595/22095 [4:37:28<26:32:31, 4.90s/it] {'loss': 0.519, 'grad_norm': 0.776904357959414, 'learning_rate': 9.801034702246109e-06, 'epoch': 0.12} 12%|█▏ | 2595/22095 [4:37:28<26:32:31, 4.90s/it] 12%|█▏ | 2596/22095 [4:37:32<25:47:32, 4.76s/it] {'loss': 0.4869, 'grad_norm': 0.7081209701798946, 'learning_rate': 9.80082995351351e-06, 'epoch': 0.12} 12%|█▏ | 2596/22095 [4:37:32<25:47:32, 4.76s/it] 12%|█▏ | 2597/22095 [4:37:36<23:35:09, 4.35s/it] {'loss': 0.4077, 'grad_norm': 0.8995320362421774, 'learning_rate': 9.800625101625823e-06, 'epoch': 0.12} 12%|█▏ | 2597/22095 [4:37:36<23:35:09, 4.35s/it] 12%|█▏ | 2598/22095 [4:37:39<21:48:17, 4.03s/it] {'loss': 0.4211, 'grad_norm': 0.8018640403584185, 'learning_rate': 9.800420146587446e-06, 'epoch': 0.12} 12%|█▏ | 2598/22095 [4:37:39<21:48:17, 4.03s/it] 12%|█▏ | 2599/22095 [4:37:42<20:19:32, 3.75s/it] {'loss': 0.4222, 'grad_norm': 0.7980549659130014, 'learning_rate': 9.800215088402785e-06, 'epoch': 0.12} 12%|█▏ | 2599/22095 [4:37:42<20:19:32, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2600/22095 [4:37:51<29:45:20, 5.49s/it] {'loss': 0.5301, 'grad_norm': 0.4214767758524068, 'learning_rate': 9.800009927076242e-06, 'epoch': 0.12} 12%|█▏ | 2600/22095 [4:37:51<29:45:20, 5.49s/it] 12%|█▏ | 2601/22095 [4:37:55<25:57:22, 4.79s/it] {'loss': 0.4073, 'grad_norm': 0.8670626633745894, 'learning_rate': 9.79980466261223e-06, 'epoch': 0.12} 12%|█▏ | 2601/22095 [4:37:55<25:57:22, 4.79s/it] 12%|█▏ | 2602/22095 [4:37:58<22:51:16, 4.22s/it] {'loss': 0.4164, 'grad_norm': 0.7999794305874037, 'learning_rate': 9.799599295015154e-06, 'epoch': 0.12} 12%|█▏ | 2602/22095 [4:37:58<22:51:16, 4.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2603/22095 [4:38:07<31:15:04, 5.77s/it] 
{'loss': 0.4855, 'grad_norm': 0.38042470528204886, 'learning_rate': 9.799393824289432e-06, 'epoch': 0.12} 12%|█▏ | 2603/22095 [4:38:07<31:15:04, 5.77s/it] 12%|█▏ | 2604/22095 [4:38:10<27:12:20, 5.02s/it] {'loss': 0.4406, 'grad_norm': 0.7736755748456349, 'learning_rate': 9.799188250439477e-06, 'epoch': 0.12} 12%|█▏ | 2604/22095 [4:38:10<27:12:20, 5.02s/it] 12%|█▏ | 2605/22095 [4:38:14<24:30:42, 4.53s/it] {'loss': 0.4272, 'grad_norm': 0.9131934705355554, 'learning_rate': 9.798982573469706e-06, 'epoch': 0.12} 12%|█▏ | 2605/22095 [4:38:14<24:30:42, 4.53s/it] 12%|█▏ | 2606/22095 [4:38:17<22:21:44, 4.13s/it] {'loss': 0.4511, 'grad_norm': 0.7246504193592824, 'learning_rate': 9.79877679338454e-06, 'epoch': 0.12} 12%|█▏ | 2606/22095 [4:38:17<22:21:44, 4.13s/it] 12%|█▏ | 2607/22095 [4:38:20<20:53:12, 3.86s/it] {'loss': 0.4068, 'grad_norm': 0.7085435596280111, 'learning_rate': 9.798570910188396e-06, 'epoch': 0.12} 12%|█▏ | 2607/22095 [4:38:20<20:53:12, 3.86s/it] 12%|█▏ | 2608/22095 [4:38:23<19:27:13, 3.59s/it] {'loss': 0.4549, 'grad_norm': 0.7441094134118903, 'learning_rate': 9.798364923885703e-06, 'epoch': 0.12} 12%|█▏ | 2608/22095 [4:38:23<19:27:13, 3.59s/it] 12%|█▏ | 2609/22095 [4:38:26<19:13:28, 3.55s/it] {'loss': 0.4463, 'grad_norm': 0.7898290377759148, 'learning_rate': 9.798158834480883e-06, 'epoch': 0.12} 12%|█▏ | 2609/22095 [4:38:26<19:13:28, 3.55s/it] 12%|█▏ | 2610/22095 [4:38:29<18:26:55, 3.41s/it] {'loss': 0.3952, 'grad_norm': 0.6696170261673059, 'learning_rate': 9.797952641978368e-06, 'epoch': 0.12} 12%|█▏ | 2610/22095 [4:38:30<18:26:55, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (125248 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2611/22095 [4:38:33<18:19:37, 3.39s/it] {'loss': 0.5207, 'grad_norm': 0.84637099325106, 'learning_rate': 9.797746346382586e-06, 'epoch': 0.12} 12%|█▏ | 2611/22095 [4:38:33<18:19:37, 3.39s/it] 12%|█▏ | 2612/22095 [4:38:36<17:26:33, 3.22s/it] {'loss': 0.4498, 'grad_norm': 0.7675882546910711, 'learning_rate': 9.797539947697969e-06, 'epoch': 0.12} 12%|█▏ | 2612/22095 [4:38:36<17:26:33, 3.22s/it] 12%|█▏ | 2613/22095 [4:38:39<18:10:13, 3.36s/it] {'loss': 0.4515, 'grad_norm': 0.761430885948382, 'learning_rate': 9.797333445928954e-06, 'epoch': 0.12} 12%|█▏ | 2613/22095 [4:38:39<18:10:13, 3.36s/it] 12%|█▏ | 2614/22095 [4:38:43<17:54:34, 3.31s/it] {'loss': 0.4042, 'grad_norm': 0.7188207060892471, 'learning_rate': 9.797126841079979e-06, 'epoch': 0.12} 12%|█▏ | 2614/22095 [4:38:43<17:54:34, 3.31s/it] 12%|█▏ | 2615/22095 [4:38:45<16:54:59, 3.13s/it] {'loss': 0.4564, 'grad_norm': 0.7496719771480668, 'learning_rate': 9.796920133155479e-06, 'epoch': 0.12} 12%|█▏ | 2615/22095 [4:38:45<16:54:59, 3.13s/it] 12%|█▏ | 2616/22095 [4:38:48<16:45:52, 3.10s/it] {'loss': 0.4532, 'grad_norm': 1.0304731986930586, 'learning_rate': 9.796713322159897e-06, 'epoch': 0.12} 12%|█▏ | 2616/22095 [4:38:48<16:45:52, 3.10s/it] 12%|█▏ | 2617/22095 [4:38:51<16:12:52, 3.00s/it] {'loss': 0.4328, 'grad_norm': 0.9970062290722393, 'learning_rate': 9.796506408097679e-06, 'epoch': 0.12} 12%|█▏ | 2617/22095 [4:38:51<16:12:52, 3.00s/it] 12%|█▏ | 2618/22095 [4:38:54<16:34:01, 3.06s/it] {'loss': 0.469, 'grad_norm': 0.7185759231957992, 'learning_rate': 9.79629939097327e-06, 'epoch': 0.12} 12%|█▏ | 2618/22095 [4:38:54<16:34:01, 3.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2619/22095 [4:39:01<22:04:41, 4.08s/it] {'loss': 0.5458, 'grad_norm': 0.6205298876025276, 'learning_rate': 9.796092270791118e-06, 'epoch': 0.12} 12%|█▏ | 2619/22095 [4:39:01<22:04:41, 4.08s/it] 12%|█▏ | 2620/22095 
[4:39:04<20:54:22, 3.86s/it] {'loss': 0.4199, 'grad_norm': 0.9922965858266375, 'learning_rate': 9.795885047555673e-06, 'epoch': 0.12} 12%|█▏ | 2620/22095 [4:39:04<20:54:22, 3.86s/it] 12%|█▏ | 2621/22095 [4:39:08<21:05:39, 3.90s/it] {'loss': 0.4193, 'grad_norm': 0.7396088827374019, 'learning_rate': 9.795677721271388e-06, 'epoch': 0.12} 12%|█▏ | 2621/22095 [4:39:08<21:05:39, 3.90s/it] 12%|█▏ | 2622/22095 [4:39:11<19:44:25, 3.65s/it] {'loss': 0.4015, 'grad_norm': 0.6860612044528022, 'learning_rate': 9.795470291942717e-06, 'epoch': 0.12} 12%|█▏ | 2622/22095 [4:39:11<19:44:25, 3.65s/it] 12%|█▏ | 2623/22095 [4:39:14<18:28:33, 3.42s/it] {'loss': 0.4456, 'grad_norm': 0.9072366386992914, 'learning_rate': 9.795262759574117e-06, 'epoch': 0.12} 12%|█▏ | 2623/22095 [4:39:14<18:28:33, 3.42s/it] 12%|█▏ | 2624/22095 [4:39:18<19:39:01, 3.63s/it] {'loss': 0.4396, 'grad_norm': 0.8086548745386506, 'learning_rate': 9.795055124170047e-06, 'epoch': 0.12} 12%|█▏ | 2624/22095 [4:39:18<19:39:01, 3.63s/it] 12%|█▏ | 2625/22095 [4:39:22<19:20:18, 3.58s/it] {'loss': 0.4149, 'grad_norm': 0.8949857372498993, 'learning_rate': 9.79484738573497e-06, 'epoch': 0.12} 12%|█▏ | 2625/22095 [4:39:22<19:20:18, 3.58s/it] 12%|█▏ | 2626/22095 [4:39:25<19:55:15, 3.68s/it] {'loss': 0.4285, 'grad_norm': 0.7416475609651193, 'learning_rate': 9.794639544273352e-06, 'epoch': 0.12} 12%|█▏ | 2626/22095 [4:39:26<19:55:15, 3.68s/it] 12%|█▏ | 2627/22095 [4:39:29<20:04:55, 3.71s/it] {'loss': 0.4561, 'grad_norm': 0.7588753079782493, 'learning_rate': 9.794431599789653e-06, 'epoch': 0.12} 12%|█▏ | 2627/22095 [4:39:29<20:04:55, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (123684 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2628/22095 [4:39:32<18:36:50, 3.44s/it] {'loss': 0.4243, 'grad_norm': 0.790029553819229, 'learning_rate': 9.794223552288344e-06, 'epoch': 0.12} 12%|█▏ | 2628/22095 [4:39:32<18:36:50, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69675 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60170 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (122618 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2629/22095 [4:39:36<19:32:24, 3.61s/it] {'loss': 0.3943, 'grad_norm': 0.712822624737303, 'learning_rate': 9.794015401773896e-06, 'epoch': 0.12} 12%|█▏ | 2629/22095 [4:39:36<19:32:24, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2630/22095 [4:39:44<26:50:13, 4.96s/it] {'loss': 0.5127, 'grad_norm': 0.8630365063043336, 'learning_rate': 9.79380714825078e-06, 'epoch': 0.12} 12%|█▏ | 2630/22095 [4:39:44<26:50:13, 4.96s/it] 12%|█▏ | 2631/22095 [4:39:48<24:57:52, 4.62s/it] {'loss': 0.4413, 'grad_norm': 0.9709964026202484, 'learning_rate': 9.793598791723471e-06, 'epoch': 0.12} 12%|█▏ | 2631/22095 [4:39:48<24:57:52, 4.62s/it] 12%|█▏ | 2632/22095 [4:39:51<23:06:16, 4.27s/it] {'loss': 0.4456, 'grad_norm': 0.7388955468482264, 'learning_rate': 9.793390332196448e-06, 'epoch': 0.12} 12%|█▏ | 2632/22095 [4:39:52<23:06:16, 4.27s/it] 12%|█▏ | 2633/22095 [4:39:55<21:17:54, 3.94s/it] {'loss': 0.4058, 'grad_norm': 0.7321239683277707, 'learning_rate': 9.793181769674186e-06, 'epoch': 0.12} 12%|█▏ | 2633/22095 [4:39:55<21:17:54, 3.94s/it]Token indices sequence length is longer than the specified maximum sequence length for 
this model (69609 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49838 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96530 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2634/22095 [4:39:58<20:19:00, 3.76s/it] {'loss': 0.4178, 'grad_norm': 1.2012183009430955, 'learning_rate': 9.792973104161172e-06, 'epoch': 0.12} 12%|█▏ | 2634/22095 [4:39:58<20:19:00, 3.76s/it] 12%|█▏ | 2635/22095 [4:40:01<18:38:41, 3.45s/it] {'loss': 0.4096, 'grad_norm': 1.0114217801817902, 'learning_rate': 9.792764335661885e-06, 'epoch': 0.12} 12%|█▏ | 2635/22095 [4:40:01<18:38:41, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (70761 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2636/22095 [4:40:08<25:38:37, 4.74s/it] {'loss': 0.5141, 'grad_norm': 0.6756070247331848, 'learning_rate': 9.792555464180813e-06, 'epoch': 0.12} 12%|█▏ | 2636/22095 [4:40:08<25:38:37, 4.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2637/22095 [4:40:18<33:07:42, 6.13s/it] {'loss': 0.5274, 'grad_norm': 0.5561437651032958, 'learning_rate': 9.792346489722443e-06, 'epoch': 0.12} 12%|█▏ | 2637/22095 [4:40:18<33:07:42, 6.13s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 12%|█▏ | 2638/22095 [4:40:22<29:27:06, 5.45s/it] {'loss': 0.4768, 'grad_norm': 0.9687338809941708, 'learning_rate': 9.792137412291265e-06, 'epoch': 0.12} 12%|█▏ | 2638/22095 [4:40:22<29:27:06, 5.45s/it] 12%|█▏ | 2639/22095 [4:40:25<26:32:47, 4.91s/it] {'loss': 0.4102, 'grad_norm': 0.7698369750400488, 'learning_rate': 9.791928231891771e-06, 'epoch': 0.12} 12%|█▏ | 2639/22095 [4:40:25<26:32:47, 4.91s/it] 12%|█▏ | 2640/22095 [4:40:28<23:36:20, 4.37s/it] {'loss': 0.4375, 'grad_norm': 0.7731517828178811, 'learning_rate': 9.791718948528457e-06, 'epoch': 0.12} 12%|█▏ | 2640/22095 [4:40:28<23:36:20, 4.37s/it] 12%|█▏ | 2641/22095 [4:40:32<21:51:29, 4.04s/it] {'loss': 0.4676, 'grad_norm': 0.92130600172214, 'learning_rate': 9.79150956220582e-06, 'epoch': 0.12} 12%|█▏ | 2641/22095 [4:40:32<21:51:29, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2642/22095 [4:40:40<28:31:27, 5.28s/it] {'loss': 0.5164, 'grad_norm': 0.6881212305908873, 'learning_rate': 9.79130007292836e-06, 'epoch': 0.12} 12%|█▏ | 2642/22095 [4:40:40<28:31:27, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72150 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44868 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2643/22095 [4:40:43<25:15:34, 4.67s/it] {'loss': 0.4386, 'grad_norm': 0.7206591094332322, 'learning_rate': 9.791090480700575e-06, 'epoch': 0.12} 12%|█▏ | 2643/22095 [4:40:43<25:15:34, 4.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2644/22095 [4:40:50<28:52:07, 5.34s/it] {'loss': 0.5285, 'grad_norm': 0.4618535399153771, 'learning_rate': 9.790880785526971e-06, 'epoch': 0.12} 12%|█▏ | 2644/22095 [4:40:50<28:52:07, 5.34s/it] 12%|█▏ | 2645/22095 [4:40:54<25:59:06, 4.81s/it] {'loss': 0.4009, 'grad_norm': 0.746839294471897, 'learning_rate': 9.790670987412052e-06, 'epoch': 0.12} 12%|█▏ | 2645/22095 [4:40:54<25:59:06, 4.81s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8917247 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 40400, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 10\nB. 12\nC. 6\nD. 
8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 12%|█▏ | 2646/22095 [4:40:57<23:35:31, 4.37s/it] {'loss': 0.3695, 'grad_norm': 0.7155483593476287, 'learning_rate': 9.790461086360327e-06, 'epoch': 0.12} 12%|█▏ | 2646/22095 [4:40:57<23:35:31, 4.37s/it] 12%|█▏ | 2647/22095 [4:41:01<22:57:33, 4.25s/it] {'loss': 0.4128, 'grad_norm': 0.7112119393094498, 'learning_rate': 9.790251082376308e-06, 'epoch': 0.12} 12%|█▏ | 2647/22095 [4:41:01<22:57:33, 4.25s/it] 12%|█▏ | 2648/22095 [4:41:04<20:54:01, 3.87s/it] {'loss': 0.4237, 'grad_norm': 0.7763810167513379, 'learning_rate': 9.790040975464503e-06, 'epoch': 0.12} 12%|█▏ | 2648/22095 [4:41:04<20:54:01, 3.87s/it] 12%|█▏ | 2649/22095 [4:41:07<19:36:06, 3.63s/it] {'loss': 0.4416, 'grad_norm': 0.712552760136062, 'learning_rate': 9.78983076562943e-06, 'epoch': 0.12} 12%|█▏ | 2649/22095 [4:41:07<19:36:06, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61289 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2650/22095 [4:41:11<19:46:07, 3.66s/it] {'loss': 0.4608, 'grad_norm': 0.7682964438579254, 'learning_rate': 9.789620452875605e-06, 'epoch': 0.12} 12%|█▏ | 2650/22095 [4:41:11<19:46:07, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (46266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41178 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63432 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2651/22095 [4:41:21<30:38:00, 5.67s/it] {'loss': 0.5363, 'grad_norm': 0.87596034032508, 'learning_rate': 9.789410037207546e-06, 'epoch': 0.12} 12%|█▏ | 2651/22095 [4:41:21<30:38:00, 5.67s/it] 12%|█▏ | 2652/22095 [4:41:24<26:33:50, 4.92s/it] {'loss': 0.4195, 'grad_norm': 0.7109720872914763, 'learning_rate': 9.789199518629774e-06, 'epoch': 0.12} 12%|█▏ | 2652/22095 [4:41:24<26:33:50, 4.92s/it] 12%|█▏ | 2653/22095 [4:41:28<23:51:41, 4.42s/it] {'loss': 0.446, 'grad_norm': 0.7260005596256348, 'learning_rate': 9.788988897146814e-06, 'epoch': 0.12} 12%|█▏ | 2653/22095 [4:41:28<23:51:41, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92877 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48341 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2654/22095 [4:41:30<21:31:50, 3.99s/it] {'loss': 0.4292, 'grad_norm': 0.7833435501815497, 'learning_rate': 9.788778172763191e-06, 'epoch': 0.12} 12%|█▏ | 2654/22095 [4:41:31<21:31:50, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2655/22095 [4:41:38<27:15:42, 5.05s/it] {'loss': 0.5069, 'grad_norm': 0.3969348371479244, 'learning_rate': 9.788567345483434e-06, 'epoch': 0.12} 12%|█▏ | 2655/22095 [4:41:38<27:15:42, 5.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46497 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2656/22095 [4:41:41<24:19:30, 4.50s/it] {'loss': 0.4012, 'grad_norm': 0.7136665781135838, 'learning_rate': 9.78835641531207e-06, 'epoch': 0.12} 12%|█▏ | 2656/22095 [4:41:41<24:19:30, 4.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59427 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53788 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43882 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2657/22095 [4:41:44<21:27:09, 3.97s/it] {'loss': 0.4371, 'grad_norm': 0.715615027824156, 'learning_rate': 9.788145382253633e-06, 'epoch': 0.12} 12%|█▏ | 2657/22095 [4:41:44<21:27:09, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2658/22095 [4:41:54<30:27:47, 5.64s/it] {'loss': 0.5096, 'grad_norm': 0.507504666798424, 'learning_rate': 9.787934246312657e-06, 'epoch': 0.12} 12%|█▏ | 2658/22095 [4:41:54<30:27:47, 5.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2659/22095 [4:42:03<37:00:28, 6.85s/it] {'loss': 0.5422, 'grad_norm': 0.4527911516398708, 'learning_rate': 9.787723007493681e-06, 'epoch': 0.12} 12%|█▏ | 2659/22095 [4:42:03<37:00:28, 6.85s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 12%|█▏ | 2660/22095 [4:42:07<31:43:21, 5.88s/it] {'loss': 0.3803, 'grad_norm': 0.7400944870716484, 'learning_rate': 9.787511665801242e-06, 'epoch': 0.12} 12%|█▏ | 2660/22095 [4:42:07<31:43:21, 
5.88s/it] 12%|█▏ | 2661/22095 [4:42:10<27:28:20, 5.09s/it] {'loss': 0.4075, 'grad_norm': 0.6825565016927304, 'learning_rate': 9.78730022123988e-06, 'epoch': 0.12} 12%|█▏ | 2661/22095 [4:42:10<27:28:20, 5.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 12%|█▏ | 2662/22095 [4:42:17<31:15:47, 5.79s/it] {'loss': 0.491, 'grad_norm': 0.4016213987042365, 'learning_rate': 9.787088673814137e-06, 'epoch': 0.12} 12%|█▏ | 2662/22095 [4:42:17<31:15:47, 5.79s/it] 12%|█▏ | 2663/22095 [4:42:21<27:17:53, 5.06s/it] {'loss': 0.3884, 'grad_norm': 0.6816641941164435, 'learning_rate': 9.786877023528564e-06, 'epoch': 0.12} 12%|█▏ | 2663/22095 [4:42:21<27:17:53, 5.06s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8394333 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 61168, 'image': 'vrdu_table_final_2/astro-ph.EP/d5a435d5-0b43-4e3f-a649-85243597f608.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2664/22095 [4:42:30<34:31:01, 6.40s/it] {'loss': 0.5096, 'grad_norm': 0.49056929924742526, 'learning_rate': 9.786665270387706e-06, 'epoch': 0.12}
12%|█▏ | 2665/22095 [4:42:40<39:49:50, 7.38s/it] {'loss': 0.4994, 'grad_norm': 0.40521515766664484, 'learning_rate': 9.78645341439611e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 364, but got module 1
12%|█▏ | 2666/22095 [4:42:43<32:54:54, 6.10s/it] {'loss': 0.4777, 'grad_norm': 0.8613575720890704, 'learning_rate': 9.786241455558332e-06, 'epoch': 0.12}
12%|█▏ | 2667/22095 [4:42:47<29:23:48, 5.45s/it] {'loss': 0.4586, 'grad_norm': 0.7503037569130684, 'learning_rate': 9.786029393878925e-06, 'epoch': 0.12}
12%|█▏ | 2668/22095 [4:42:50<25:06:11, 4.65s/it] {'loss': 0.3881, 'grad_norm': 0.6666433288490968, 'learning_rate': 9.785817229362445e-06, 'epoch': 0.12}
12%|█▏ | 2669/22095 [4:42:53<22:46:27, 4.22s/it] {'loss': 0.399, 'grad_norm': 0.895421914416902, 'learning_rate': 9.78560496201345e-06, 'epoch': 0.12}
12%|█▏ | 2670/22095 [4:42:56<21:16:15, 3.94s/it] {'loss': 0.4544, 'grad_norm': 0.7827897135331325, 'learning_rate': 9.785392591836504e-06, 'epoch': 0.12}
12%|█▏ | 2671/22095 [4:43:00<20:13:04, 3.75s/it] {'loss': 0.455, 'grad_norm': 0.7347360025993976, 'learning_rate': 9.785180118836169e-06, 'epoch': 0.12}
12%|█▏ | 2672/22095 [4:43:03<18:55:32, 3.51s/it] {'loss': 0.4535, 'grad_norm': 0.828298118763274, 'learning_rate': 9.784967543017008e-06, 'epoch': 0.12}
12%|█▏ | 2673/22095 [4:43:05<17:49:01, 3.30s/it] {'loss': 0.3994, 'grad_norm': 0.7050553279671258, 'learning_rate': 9.784754864383593e-06, 'epoch': 0.12}
12%|█▏ | 2674/22095 [4:43:09<18:05:19, 3.35s/it] {'loss': 0.4549, 'grad_norm': 0.7759845857752202, 'learning_rate': 9.784542082940488e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (107615 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43329 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2675/22095 [4:43:13<18:33:48, 3.44s/it] {'loss': 0.4441, 'grad_norm': 0.9078020537580649, 'learning_rate': 9.784329198692269e-06, 'epoch': 0.12}
12%|█▏ | 2676/22095 [4:43:17<19:39:46, 3.65s/it] {'loss': 0.4718, 'grad_norm': 0.7967655768014362, 'learning_rate': 9.78411621164351e-06, 'epoch': 0.12}
12%|█▏ | 2677/22095 [4:43:20<19:03:58, 3.53s/it] {'loss': 0.436, 'grad_norm': 0.6867947079019111, 'learning_rate': 9.783903121798787e-06, 'epoch': 0.12}
12%|█▏ | 2678/22095 [4:43:24<19:30:15, 3.62s/it] {'loss': 0.4426, 'grad_norm': 0.7372737057900989, 'learning_rate': 9.783689929162679e-06, 'epoch': 0.12}
12%|█▏ | 2679/22095 [4:43:27<18:31:01, 3.43s/it] {'loss': 0.4679, 'grad_norm': 0.6993752887877634, 'learning_rate': 9.783476633739766e-06, 'epoch': 0.12}
12%|█▏ | 2680/22095 [4:43:30<18:29:57, 3.43s/it] {'loss': 0.4123, 'grad_norm': 0.6940593830233261, 'learning_rate': 9.783263235534632e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358066 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
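The repeated `ValueError: Image size ... is too small. Minimum size is 28.` failures above all come from samples whose `image_wh` metadata records at least one side below 28 px. A minimal pre-filter sketch (hypothetical helper names; the 28-px floor is taken from the error text itself, and `image_wh` follows the sample layout shown in the log):

```python
from typing import Any

# Minimum side length accepted by the vision preprocessor, per the
# "Minimum size is 28" ValueErrors in the log above.
MIN_IMAGE_SIDE = 28

def is_image_large_enough(sample: dict[str, Any], min_side: int = MIN_IMAGE_SIDE) -> bool:
    """True if every (width, height) pair in the sample meets the minimum."""
    return all(w >= min_side and h >= min_side for w, h in sample.get("image_wh", []))

def filter_samples(samples: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop samples that would raise 'Image size ... is too small' at load time."""
    return [s for s in samples if is_image_large_enough(s)]
```

Running such a filter once over the dataset manifest would avoid paying the fetch-retry cost at training time.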
Problematic sample: {'id': 24777, 'image': 'vrdu_table_final_2/astro-ph.CO/b9dfdbd2-68f7-4737-a7ff-ce08030fcd69.png', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\eftcamb basis\\end{tabular}\n```"}]}
12%|█▏ | 2681/22095 [4:43:33<17:51:20, 3.31s/it] {'loss': 0.398, 'grad_norm': 0.7090396183152797, 'learning_rate': 9.783049734551861e-06, 'epoch': 0.12}
12%|█▏ | 2682/22095 [4:43:36<17:07:22, 3.18s/it] {'loss': 0.4313, 'grad_norm': 0.6781535914506774, 'learning_rate': 9.78283613079604e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (90860 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92328 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68805 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54216 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100354 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102864 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2683/22095 [4:43:43<22:33:29, 4.18s/it] {'loss': 0.5487, 'grad_norm': 1.229670729375344, 'learning_rate': 9.782622424271761e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (44911 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49217 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57044 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72533 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70443 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106990 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2684/22095 [4:43:46<20:40:31, 3.83s/it] {'loss': 0.3965, 'grad_norm': 0.8239341407113727, 'learning_rate': 9.782408614983616e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (124584 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47028 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41330 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55414 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2685/22095 [4:43:49<19:57:18, 3.70s/it] {'loss': 0.4073, 'grad_norm': 0.7395028627363661, 'learning_rate': 9.782194702936198e-06, 'epoch': 0.12}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2686/22095 [4:43:53<19:44:39, 3.66s/it] {'loss': 0.4154, 'grad_norm': 0.6849031343236605, 'learning_rate': 9.781980688134102e-06, 'epoch': 0.12}
12%|█▏ | 2687/22095 [4:43:55<18:10:30, 3.37s/it] {'loss': 0.4413, 'grad_norm': 1.2292072746561793, 'learning_rate': 9.781766570581927e-06, 'epoch': 0.12}
12%|█▏ | 2688/22095 [4:43:59<18:47:46, 3.49s/it] {'loss': 0.4313, 'grad_norm': 0.7494350410309273, 'learning_rate': 9.781552350284275e-06, 'epoch': 0.12}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2689/22095 [4:44:02<17:34:00, 3.26s/it] {'loss': 0.4419, 'grad_norm': 0.7544210475047712, 'learning_rate': 9.78133802724575e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (51500 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74203 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96476 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2690/22095 [4:44:05<16:52:14, 3.13s/it] {'loss': 0.4334, 'grad_norm': 0.8333559848106804, 'learning_rate': 9.781123601470953e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [387, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8483873 in VC:s3://internvl-moe-sft-data/. Exception: Image size [387, 23, 100, 100] is too small. Minimum size is 28.
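The recurring pair `Number of image tokens 0 does not match number of images 1` / `Fixed image tokens in the conversation` above indicates the loader repairs conversations whose text carries fewer image placeholders than attached images. A hypothetical sketch of such a repair (the `<image>` placeholder string and prepend strategy are assumptions, not the repository's actual implementation):

```python
IMAGE_TOKEN = "<image>"

def fix_image_tokens(text: str, num_images: int) -> str:
    """If the conversation has fewer <image> placeholders than images,
    prepend the missing ones so downstream token expansion lines up."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing > 0:
        text = (IMAGE_TOKEN + "\n") * missing + text
    return text
```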
Problematic sample: {'id': 134459, 'image': 'vrdu_texteq/astro-ph.CO/8cd84773-9ffc-4edf-af02-4be683c490a9.png', 'image_wh': [[387, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'Notice that the $ G $ and $ I $ kernels'}]}
12%|█▏ | 2691/22095 [4:44:15<28:29:02, 5.28s/it] {'loss': 0.5363, 'grad_norm': 1.3548197867903617, 'learning_rate': 9.780909072964497e-06, 'epoch': 0.12}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_4/images/before_screenshot_21_id_110_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-27 20:42:15.387479 load time: 1062.5 ms
12%|█▏ | 2692/22095 [4:44:25<35:32:24, 6.59s/it] {'loss': 0.5295, 'grad_norm': 0.8618630036683134, 'learning_rate': 9.780694441730987e-06, 'epoch': 0.12}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30797.png 2025-08-27 20:42:23.754599 load time: 1885.14 ms
12%|█▏ | 2693/22095 [4:44:34<40:16:24, 7.47s/it] {'loss': 0.4854, 'grad_norm': 0.45892450903463605, 'learning_rate': 9.780479707775035e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8899858 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23011, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nA. 8cm\nB. 10cm\nC. 16cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2694/22095 [4:44:38<34:22:49, 6.38s/it] {'loss': 0.438, 'grad_norm': 0.7846049894036222, 'learning_rate': 9.780264871101256e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [292, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8462699 in VC:s3://internvl-moe-sft-data/. Exception: Image size [292, 23, 100, 100] is too small. Minimum size is 28.
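The `DecompressionBombWarning` above fires because one screenshot (108,952,500 pixels) exceeds PIL's default `MAX_IMAGE_PIXELS` limit of 89,478,485. A small sketch of the check PIL applies, useful for pre-flagging oversized training images without loading them (the constant's derivation `1024**3 // 4 // 3` matches PIL's default; raising `Image.MAX_IMAGE_PIXELS` is the usual remedy when the data is trusted):

```python
# PIL's default decompression-bomb threshold, as quoted in the warning above.
DEFAULT_MAX_PIXELS = 1024 ** 3 // 4 // 3  # 89478485

def exceeds_pil_limit(width: int, height: int, limit: int = DEFAULT_MAX_PIXELS) -> bool:
    """True if PIL would emit DecompressionBombWarning for this image size."""
    return width * height > limit
```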
Problematic sample: {'id': 73649, 'image': 'vrdu_texteq/astro-ph.CO/723c72bd-3c4b-42f7-acf2-92b6a2dc0eb2.png', 'image_wh': [[292, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where ${\\bf S}$ is defined to be'}]}
12%|█▏ | 2695/22095 [4:44:41<29:10:02, 5.41s/it] {'loss': 0.4153, 'grad_norm': 0.7605031550438223, 'learning_rate': 9.78004993171427e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (43649 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2696/22095 [4:44:45<26:32:02, 4.92s/it] {'loss': 0.4399, 'grad_norm': 0.7357793468390382, 'learning_rate': 9.77983488961869e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880207 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3360, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
12%|█▏ | 2697/22095 [4:44:49<24:48:22, 4.60s/it] {'loss': 0.4408, 'grad_norm': 0.7580043201789151, 'learning_rate': 9.779619744819136e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49259 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61545 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93569 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57116 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123882 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2698/22095 [4:44:59<33:32:39, 6.23s/it] {'loss': 0.5704, 'grad_norm': 2.103868880774699, 'learning_rate': 9.779404497320236e-06, 'epoch': 0.12}
12%|█▏ | 2699/22095 [4:45:03<29:39:40, 5.51s/it] {'loss': 0.3799, 'grad_norm': 0.7871003114405238, 'learning_rate': 9.77918914712661e-06, 'epoch': 0.12}
12%|█▏ | 2700/22095 [4:45:06<25:49:16, 4.79s/it] {'loss': 0.474, 'grad_norm': 0.7841364790364305, 'learning_rate': 9.778973694242888e-06, 'epoch': 0.12}
12%|█▏ | 2701/22095 [4:45:09<23:08:45, 4.30s/it] {'loss': 0.4336, 'grad_norm': 0.8486597643275121, 'learning_rate': 9.7787581386737e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (80166 > 40960). Running this sequence through the model will result in indexing errors
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-27 20:43:07.402816 load time: 1015.46 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (48361 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72894 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2702/22095 [4:45:12<21:37:55, 4.02s/it] {'loss': 0.4111, 'grad_norm': 0.7196137791219471, 'learning_rate': 9.778542480423677e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (42451 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2703/22095 [4:45:16<21:25:34, 3.98s/it] {'loss': 0.4042, 'grad_norm': 0.7347694160748849, 'learning_rate': 9.77832671949745e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2704/22095 [4:45:26<31:47:00, 5.90s/it] {'loss': 0.5408, 'grad_norm': 0.9636687858399887, 'learning_rate': 9.778110855899659e-06, 'epoch': 0.12}
12%|█▏ | 2705/22095 [4:45:30<27:23:33, 5.09s/it] {'loss': 0.4039, 'grad_norm': 0.8111963342960673, 'learning_rate': 9.777894889634939e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2706/22095 [4:45:39<34:25:00, 6.39s/it] {'loss': 0.5425, 'grad_norm': 0.6695899517416207, 'learning_rate': 9.777678820707932e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [317, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8417095 in VC:s3://internvl-moe-sft-data/. Exception: Image size [317, 25, 100, 100] is too small. Minimum size is 28.
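The many `Token indices sequence length is longer than the specified maximum sequence length` warnings above come from the tokenizer: it reports the overflow but does not truncate, so the 40960-token limit has to be enforced by the training code. A minimal sketch of a guard (hypothetical helper; whether to truncate or drop such samples is a design choice the log does not reveal):

```python
# Model context limit quoted in the tokenizer warnings above.
MAX_SEQ_LEN = 40960

def clip_token_ids(token_ids: list[int], max_len: int = MAX_SEQ_LEN) -> list[int]:
    """Truncate a token-id sequence to the model's maximum context length,
    preventing the out-of-range position indices the warning refers to."""
    return token_ids if len(token_ids) <= max_len else token_ids[:max_len]
```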
Problematic sample: {'id': 57324, 'image': 'vrdu_texteq/astro-ph.CO/195ce664-268c-4f22-ae54-d03cdab9e940.png', 'image_wh': [[317, 25]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'so that we can write $\\lambda_i$ as'}]}
12%|█▏ | 2707/22095 [4:45:43<30:27:06, 5.65s/it] {'loss': 0.3837, 'grad_norm': 0.7610453505809822, 'learning_rate': 9.777462649123281e-06, 'epoch': 0.12}
12%|█▏ | 2708/22095 [4:45:46<25:57:00, 4.82s/it] {'loss': 0.3737, 'grad_norm': 0.7326549088785459, 'learning_rate': 9.777246374885631e-06, 'epoch': 0.12}
12%|█▏ | 2709/22095 [4:45:49<23:48:11, 4.42s/it] {'loss': 0.3914, 'grad_norm': 0.6840927367574374, 'learning_rate': 9.77702999799963e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2710/22095 [4:45:59<31:49:47, 5.91s/it] {'loss': 0.5241, 'grad_norm': 0.9382804953482135, 'learning_rate': 9.776813518469924e-06, 'epoch': 0.12}
12%|█▏ | 2711/22095 [4:46:07<35:13:04, 6.54s/it] {'loss': 0.5221, 'grad_norm': 0.8627904070576863, 'learning_rate': 9.776596936301168e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 364, but got module 1
12%|█▏ | 2712/22095 [4:46:10<30:19:29, 5.63s/it] {'loss': 0.4575, 'grad_norm': 0.6784951736072385, 'learning_rate': 9.776380251498013e-06, 'epoch': 0.12}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2713/22095 [4:46:14<27:37:09, 5.13s/it] {'loss': 0.4268, 'grad_norm': 0.6935729007977068, 'learning_rate': 9.776163464065115e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [834, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8426610 in VC:s3://internvl-moe-sft-data/. Exception: Image size [834, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30209, 'image': 'vrdu_texteq/astro-ph.CO/c69f8d16-7221-42cc-a761-76eeaf2d566d.png', 'image_wh': [[834, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where the final term now involves a double summation over $\\bar{n}$ and $n$.'}]}
12%|█▏ | 2714/22095 [4:46:17<24:24:02, 4.53s/it] {'loss': 0.4375, 'grad_norm': 0.7215493944586941, 'learning_rate': 9.775946574007133e-06, 'epoch': 0.12}
12%|█▏ | 2715/22095 [4:46:21<23:11:53, 4.31s/it] {'loss': 0.4137, 'grad_norm': 0.791182864674943, 'learning_rate': 9.775729581328728e-06, 'epoch': 0.12}
12%|█▏ | 2716/22095 [4:46:25<22:06:07, 4.11s/it] {'loss': 0.4204, 'grad_norm': 0.69097511820002, 'learning_rate': 9.775512486034564e-06, 'epoch': 0.12}
12%|█▏ | 2717/22095 [4:46:28<19:54:37, 3.70s/it] {'loss': 0.4299, 'grad_norm': 0.7109609228395887, 'learning_rate': 9.775295288129301e-06, 'epoch': 0.12}
12%|█▏ | 2718/22095 [4:46:31<18:42:59, 3.48s/it] {'loss': 0.4669, 'grad_norm': 0.6922140625025477, 'learning_rate': 9.775077987617609e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954498 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5333, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵AN:MN=1:2,且AN=2,∴2:MN=1:2,∴MN=4cm,∴AM=6cm.∵M是线段AB的中点,∴AB=2AM,∴AB=12cm,故D答案正确.'}]}
12%|█▏ | 2719/22095 [4:46:33<17:54:29, 3.33s/it] {'loss': 0.4484, 'grad_norm': 0.7454249071139529, 'learning_rate': 9.774860584504156e-06, 'epoch': 0.12}
12%|█▏ | 2720/22095 [4:46:37<17:33:31, 3.26s/it] {'loss': 0.3958, 'grad_norm': 0.6929325920676569, 'learning_rate': 9.774643078793616e-06, 'epoch': 0.12}
12%|█▏ | 2721/22095 [4:46:41<18:53:15, 3.51s/it] {'loss': 0.4551, 'grad_norm': 0.6871179296450526, 'learning_rate': 9.774425470490657e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (48471 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2722/22095 [4:46:44<19:06:14, 3.55s/it] {'loss': 0.446, 'grad_norm': 0.6682448994056809, 'learning_rate': 9.774207759599961e-06, 'epoch': 0.12}
12%|█▏ | 2723/22095 [4:46:48<19:48:35, 3.68s/it] {'loss': 0.4169, 'grad_norm': 0.6988690432746627, 'learning_rate': 9.773989946126202e-06, 'epoch': 0.12}
12%|█▏ | 2724/22095 [4:46:51<18:55:17, 3.52s/it] {'loss': 0.3707, 'grad_norm': 0.6891060920685302, 'learning_rate': 9.773772030074062e-06, 'epoch': 0.12}
12%|█▏ | 2725/22095 [4:46:54<17:42:23, 3.29s/it] {'loss': 0.4012, 'grad_norm': 0.6759094580335833, 'learning_rate': 9.773554011448221e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (56368 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
12%|█▏ | 2726/22095 [4:46:58<17:47:43, 3.31s/it] {'loss': 0.3912, 'grad_norm': 0.6820553553816232, 'learning_rate': 9.773335890253367e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (43273 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58463 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45360 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2727/22095 [4:47:01<17:25:49, 3.24s/it] {'loss': 0.4546, 'grad_norm': 0.6928170927079416, 'learning_rate': 9.773117666494183e-06, 'epoch': 0.12}
12%|█▏ | 2728/22095 [4:47:04<17:42:34, 3.29s/it] {'loss': 0.4215, 'grad_norm': 0.7628843441717775, 'learning_rate': 9.772899340175362e-06, 'epoch': 0.12}
12%|█▏ | 2729/22095 [4:47:07<16:57:17, 3.15s/it] {'loss': 0.4508, 'grad_norm': 0.7729846245333306, 'learning_rate': 9.772680911301592e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-27 20:45:04.208642 load time: 1398.02 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (48531 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2730/22095 [4:47:15<25:25:23, 4.73s/it] {'loss': 0.5501, 'grad_norm': 1.8521663449637993, 'learning_rate': 9.772462379877566e-06, 'epoch': 0.12}
12%|█▏ | 2731/22095 [4:47:19<23:15:43, 4.32s/it] {'loss': 0.4294, 'grad_norm': 0.7956300766318737, 'learning_rate': 9.772243745907983e-06, 'epoch': 0.12}
12%|█▏ | 2732/22095 [4:47:22<21:17:03, 3.96s/it] {'loss': 0.427, 'grad_norm': 0.7989248544289832, 'learning_rate': 9.772025009397538e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2733/22095 [4:47:31<29:44:32, 5.53s/it] {'loss': 0.5273, 'grad_norm': 0.6440191845101102, 'learning_rate': 9.771806170350931e-06, 'epoch': 0.12}
Token indices sequence length is longer than the specified maximum sequence length for this model (53683 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65262 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45590 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87154 > 40960). Running this sequence through the model will result in indexing errors
12%|█▏ | 2734/22095 [4:47:34<25:57:49, 4.83s/it] {'loss': 0.3975, 'grad_norm': 0.8104647497365935, 'learning_rate': 9.771587228772866e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2735/22095 [4:47:44<33:23:56, 6.21s/it] {'loss': 0.5167, 'grad_norm': 0.7902673158685333, 'learning_rate': 9.771368184668046e-06, 'epoch': 0.12}
12%|█▏ | 2736/22095 [4:47:47<29:04:32, 5.41s/it] {'loss': 0.4532, 'grad_norm': 1.0776498695175314, 'learning_rate': 9.771149038041177e-06, 'epoch': 0.12}
12%|█▏ | 2737/22095 [4:47:51<26:41:45, 4.96s/it] {'loss': 0.4391, 'grad_norm': 0.7327511682649216, 'learning_rate': 9.77092978889697e-06, 'epoch': 0.12}
Invalidate trace cache @ step 2: expected module 1, but got module 364
12%|█▏ | 2738/22095 [4:48:01<33:56:57, 6.31s/it] {'loss': 0.5269, 'grad_norm': 1.005928049604739, 'learning_rate': 9.770710437240134e-06, 'epoch': 0.12}
12%|█▏ | 2739/22095 [4:48:05<30:23:50, 5.65s/it] {'loss': 0.457, 'grad_norm': 0.8360074267582877, 'learning_rate': 9.770490983075383e-06, 'epoch': 0.12}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6778116 in VC:s3://gui-agent/data_20250623/windows_augment/images.
Exception: unrecognized data stream contents when reading image file Problematic sample: {'image': 'autocad/20250508_132635_1/images/before_screenshot_1_id_89_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nClick on the 'Stretch' button in the Modify panel to stretch partially selected objects while moving fully selected ones."}, {'from': 'gpt', 'value': '\nclick(x=0.5150, y=0.6775)\n'}], 'width': 3600, 'height': 2338} 12%|█▏ | 2740/22095 [4:48:08<26:47:44, 4.98s/it] {'loss': 0.4054, 'grad_norm': 0.8991652457166528, 'learning_rate': 9.770271426407432e-06, 'epoch': 0.12} 12%|█▏ | 2740/22095 [4:48:08<26:47:44, 4.98s/it] 12%|█▏ | 2741/22095 [4:48:13<26:01:17, 4.84s/it] {'loss': 0.3936, 'grad_norm': 0.7664564769263479, 'learning_rate': 9.770051767241e-06, 'epoch': 0.12} 12%|█▏ | 2741/22095 [4:48:13<26:01:17, 4.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (95697 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80664 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85379 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2742/22095 [4:48:18<27:22:20, 5.09s/it] {'loss': 0.5359, 'grad_norm': 0.77113558498915, 'learning_rate': 9.769832005580804e-06, 'epoch': 0.12} 12%|█▏ | 2742/22095 [4:48:18<27:22:20, 5.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62798 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88830 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84820 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2743/22095 [4:48:23<26:32:32, 4.94s/it] {'loss': 0.4158, 'grad_norm': 0.8478107407330637, 'learning_rate': 9.769612141431568e-06, 'epoch': 0.12} 12%|█▏ | 2743/22095 [4:48:23<26:32:32, 4.94s/it] 12%|█▏ | 2744/22095 [4:48:26<23:56:43, 4.45s/it] {'loss': 0.3909, 'grad_norm': 0.795928760884458, 'learning_rate': 9.769392174798017e-06, 'epoch': 0.12} 12%|█▏ | 2744/22095 [4:48:26<23:56:43, 4.45s/it] 12%|█▏ | 2745/22095 [4:48:29<21:20:42, 3.97s/it] {'loss': 0.4278, 'grad_norm': 0.9940043746867019, 'learning_rate': 9.769172105684875e-06, 'epoch': 0.12} 12%|█▏ | 2745/22095 [4:48:29<21:20:42, 3.97s/it] 12%|█▏ | 2746/22095 [4:48:32<19:43:05, 3.67s/it] {'loss': 0.4406, 'grad_norm': 0.7112341670412834, 'learning_rate': 9.76895193409687e-06, 'epoch': 0.12} 12%|█▏ | 2746/22095 [4:48:32<19:43:05, 3.67s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-27 20:46:30.729504 load time: 1022.22 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-27 20:46:32.575460 load time: 1029.09 ms 12%|█▏ | 2747/22095 [4:48:36<20:58:53, 3.90s/it] {'loss': 0.4156, 'grad_norm': 0.7573968135300648, 'learning_rate': 9.768731660038737e-06, 'epoch': 0.12} 12%|█▏ | 2747/22095 [4:48:36<20:58:53, 3.90s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [814, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8472563 in VC:s3://internvl-moe-sft-data/. Exception: Image size [814, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 53233, 'image': 'vrdu_texteq/astro-ph.CO/cf80d87e-c6f5-4443-b7ff-4733bd1c7cea.png', 'image_wh': [[814, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': '\\ is assumed for the calculation of the bolometric and\nk$-$corrections.'}]} 12%|█▏ | 2748/22095 [4:48:40<21:08:17, 3.93s/it] {'loss': 0.4301, 'grad_norm': 1.113628090493707, 'learning_rate': 9.768511283515207e-06, 'epoch': 0.12} 12%|█▏ | 2748/22095 [4:48:40<21:08:17, 3.93s/it] 12%|█▏ | 2749/22095 [4:48:44<20:16:59, 3.77s/it] {'loss': 0.4271, 'grad_norm': 0.6786265356978682, 'learning_rate': 9.768290804531013e-06, 'epoch': 0.12} 12%|█▏ | 2749/22095 [4:48:44<20:16:59, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2750/22095 [4:48:51<26:33:30, 4.94s/it] {'loss': 0.5163, 'grad_norm': 0.7577079928644646, 'learning_rate': 9.768070223090896e-06, 'epoch': 0.12} 12%|█▏ | 2750/22095 [4:48:51<26:33:30, 4.94s/it] 12%|█▏ | 2751/22095 [4:49:01<34:02:48, 6.34s/it] {'loss': 0.5325, 'grad_norm': 0.5699591505059829, 'learning_rate': 9.767849539199594e-06, 'epoch': 0.12} 12%|█▏ | 2751/22095 [4:49:01<34:02:48, 6.34s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2752/22095 [4:49:05<29:29:31, 5.49s/it] {'loss': 0.4455, 'grad_norm': 1.6891404275229212, 'learning_rate': 
9.767628752861848e-06, 'epoch': 0.12} 12%|█▏ | 2752/22095 [4:49:05<29:29:31, 5.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53575 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75406 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42758 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2753/22095 [4:49:08<26:43:54, 4.98s/it] {'loss': 0.4223, 'grad_norm': 0.7397212184485615, 'learning_rate': 9.767407864082404e-06, 'epoch': 0.12} 12%|█▏ | 2753/22095 [4:49:08<26:43:54, 4.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101279 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82028 > 40960). 
Running this sequence through the model will result in indexing errors 12%|█▏ | 2754/22095 [4:49:12<25:15:44, 4.70s/it] {'loss': 0.3905, 'grad_norm': 0.7178882499970353, 'learning_rate': 9.767186872866004e-06, 'epoch': 0.12} 12%|█▏ | 2754/22095 [4:49:12<25:15:44, 4.70s/it] 12%|█▏ | 2755/22095 [4:49:17<24:16:54, 4.52s/it] {'loss': 0.4232, 'grad_norm': 0.7992228039565072, 'learning_rate': 9.766965779217401e-06, 'epoch': 0.12} 12%|█▏ | 2755/22095 [4:49:17<24:16:54, 4.52s/it] 12%|█▏ | 2756/22095 [4:49:20<23:09:10, 4.31s/it] {'loss': 0.4512, 'grad_norm': 0.7976996216847373, 'learning_rate': 9.766744583141345e-06, 'epoch': 0.12} 12%|█▏ | 2756/22095 [4:49:20<23:09:10, 4.31s/it] 12%|█▏ | 2757/22095 [4:49:23<20:52:50, 3.89s/it] {'loss': 0.4165, 'grad_norm': 0.6992639657248829, 'learning_rate': 9.766523284642588e-06, 'epoch': 0.12} 12%|█▏ | 2757/22095 [4:49:23<20:52:50, 3.89s/it] 12%|█▏ | 2758/22095 [4:49:26<19:42:31, 3.67s/it] {'loss': 0.4365, 'grad_norm': 0.6921757743659276, 'learning_rate': 9.766301883725884e-06, 'epoch': 0.12} 12%|█▏ | 2758/22095 [4:49:26<19:42:31, 3.67s/it] 12%|█▏ | 2759/22095 [4:49:30<19:14:21, 3.58s/it] {'loss': 0.4301, 'grad_norm': 0.7561530071180924, 'learning_rate': 9.76608038039599e-06, 'epoch': 0.12} 12%|█▏ | 2759/22095 [4:49:30<19:14:21, 3.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 12%|█▏ | 2760/22095 [4:49:33<18:34:37, 3.46s/it] {'loss': 0.4177, 'grad_norm': 0.8004041588160753, 'learning_rate': 9.765858774657669e-06, 'epoch': 0.12} 12%|█▏ | 2760/22095 [4:49:33<18:34:37, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (52294 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51839 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83603 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43036 > 40960). Running this sequence through the model will result in indexing errors 12%|█▏ | 2761/22095 [4:49:41<25:54:40, 4.82s/it] {'loss': 0.5095, 'grad_norm': 1.0769918767267206, 'learning_rate': 9.76563706651568e-06, 'epoch': 0.12} 12%|█▏ | 2761/22095 [4:49:41<25:54:40, 4.82s/it] 13%|█▎ | 2762/22095 [4:49:46<26:22:46, 4.91s/it] {'loss': 0.5023, 'grad_norm': 0.8997733531140661, 'learning_rate': 9.765415255974784e-06, 'epoch': 0.13} 13%|█▎ | 2762/22095 [4:49:46<26:22:46, 4.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44657 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74099 > 40960). 
Running this sequence through the model will result in indexing errors 13%|█▎ | 2763/22095 [4:49:54<31:53:06, 5.94s/it] {'loss': 0.5207, 'grad_norm': 0.433269329782108, 'learning_rate': 9.765193343039751e-06, 'epoch': 0.13} 13%|█▎ | 2763/22095 [4:49:54<31:53:06, 5.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 13%|█▎ | 2764/22095 [4:49:58<28:20:32, 5.28s/it] {'loss': 0.4668, 'grad_norm': 0.9853752511046728, 'learning_rate': 9.76497132771535e-06, 'epoch': 0.13} 13%|█▎ | 2764/22095 [4:49:58<28:20:32, 5.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 13%|█▎ | 2765/22095 [4:50:02<26:50:28, 5.00s/it] {'loss': 0.4301, 'grad_norm': 0.8148443072198143, 'learning_rate': 9.764749210006348e-06, 'epoch': 0.13} 13%|█▎ | 2765/22095 [4:50:02<26:50:28, 5.00s/it] 13%|█▎ | 2766/22095 [4:50:06<24:57:00, 4.65s/it] {'loss': 0.4973, 'grad_norm': 0.8802776513421972, 'learning_rate': 9.76452698991752e-06, 'epoch': 0.13} 13%|█▎ | 2766/22095 [4:50:06<24:57:00, 4.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 13%|█▎ | 2767/22095 [4:50:10<22:42:23, 4.23s/it] {'loss': 0.4407, 'grad_norm': 0.855492498623336, 'learning_rate': 9.76430466745364e-06, 'epoch': 0.13} 13%|█▎ | 2767/22095 [4:50:10<22:42:23, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44312 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50334 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60135 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48183 > 40960). Running this sequence through the model will result in indexing errors 13%|█▎ | 2768/22095 [4:50:12<20:28:00, 3.81s/it] {'loss': 0.3817, 'grad_norm': 0.8785273791521889, 'learning_rate': 9.764082242619485e-06, 'epoch': 0.13} 13%|█▎ | 2768/22095 [4:50:12<20:28:00, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2769/22095 [4:50:22<29:29:45, 5.49s/it] {'loss': 0.53, 'grad_norm': 1.557105939671637, 'learning_rate': 9.763859715419834e-06, 'epoch': 0.13} 13%|█▎ | 2769/22095 [4:50:22<29:29:45, 5.49s/it] 13%|█▎ | 2770/22095 [4:50:26<27:41:01, 5.16s/it] {'loss': 0.4435, 'grad_norm': 1.066750154692408, 'learning_rate': 9.76363708585947e-06, 'epoch': 0.13} 13%|█▎ | 2770/22095 [4:50:26<27:41:01, 5.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2771/22095 [4:50:37<37:04:48, 6.91s/it] {'loss': 0.5445, 'grad_norm': 1.162892927861453, 'learning_rate': 9.763414353943175e-06, 'epoch': 0.13} 13%|█▎ | 2771/22095 [4:50:37<37:04:48, 6.91s/it] 13%|█▎ | 2772/22095 [4:50:42<33:03:59, 6.16s/it] {'loss': 0.4512, 'grad_norm': 0.8886752219994062, 'learning_rate': 9.763191519675735e-06, 'epoch': 0.13} 13%|█▎ | 2772/22095 [4:50:42<33:03:59, 6.16s/it] 13%|█▎ | 2773/22095 [4:50:46<30:56:54, 5.77s/it] {'loss': 0.458, 'grad_norm': 0.8803921244404787, 'learning_rate': 9.762968583061938e-06, 'epoch': 0.13} 13%|█▎ | 2773/22095 [4:50:46<30:56:54, 5.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 13%|█▎ | 2774/22095 [4:50:49<26:24:45, 4.92s/it] {'loss': 0.4956, 'grad_norm': 0.9265125219905003, 'learning_rate': 9.762745544106576e-06, 'epoch': 0.13} 13%|█▎ | 2774/22095 [4:50:49<26:24:45, 4.92s/it] 13%|█▎ | 2775/22095 [4:50:53<24:26:17, 4.55s/it] {'loss': 0.47, 
'grad_norm': 1.0048813765796745, 'learning_rate': 9.762522402814438e-06, 'epoch': 0.13} 13%|█▎ | 2775/22095 [4:50:53<24:26:17, 4.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2776/22095 [4:51:00<28:40:35, 5.34s/it] {'loss': 0.5454, 'grad_norm': 1.0089599105936264, 'learning_rate': 9.762299159190322e-06, 'epoch': 0.13} 13%|█▎ | 2776/22095 [4:51:00<28:40:35, 5.34s/it] 13%|█▎ | 2777/22095 [4:51:04<25:27:46, 4.75s/it] {'loss': 0.4505, 'grad_norm': 0.7490852698867788, 'learning_rate': 9.762075813239022e-06, 'epoch': 0.13} 13%|█▎ | 2777/22095 [4:51:04<25:27:46, 4.75s/it] 13%|█▎ | 2778/22095 [4:51:07<22:29:00, 4.19s/it] {'loss': 0.4108, 'grad_norm': 0.7337323557104144, 'learning_rate': 9.761852364965339e-06, 'epoch': 0.13} 13%|█▎ | 2778/22095 [4:51:07<22:29:00, 4.19s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-27 20:49:05.893916 load time: 1102.39 ms 13%|█▎ | 2779/22095 [4:51:10<21:51:51, 4.07s/it] {'loss': 0.4365, 'grad_norm': 0.8230115914730641, 'learning_rate': 9.761628814374074e-06, 'epoch': 0.13} 13%|█▎ | 2779/22095 [4:51:10<21:51:51, 4.07s/it] 13%|█▎ | 2780/22095 [4:51:14<21:15:40, 3.96s/it] {'loss': 0.4069, 'grad_norm': 0.7377174974714752, 'learning_rate': 9.76140516147003e-06, 'epoch': 0.13} 13%|█▎ | 2780/22095 [4:51:14<21:15:40, 3.96s/it] 13%|█▎ | 2781/22095 [4:51:17<19:27:39, 3.63s/it] {'loss': 0.4056, 'grad_norm': 0.7771110398697952, 'learning_rate': 9.761181406258012e-06, 'epoch': 0.13} 13%|█▎ | 2781/22095 [4:51:17<19:27:39, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (119963 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44698 > 40960). 
Running this sequence through the model will result in indexing errors 13%|█▎ | 2782/22095 [4:51:20<19:05:47, 3.56s/it] {'loss': 0.4479, 'grad_norm': 0.8547150246842478, 'learning_rate': 9.760957548742828e-06, 'epoch': 0.13} 13%|█▎ | 2782/22095 [4:51:20<19:05:47, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92981 > 40960). Running this sequence through the model will result in indexing errors /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (99586880 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 13%|█▎ | 2783/22095 [4:51:24<18:42:39, 3.49s/it] {'loss': 0.3704, 'grad_norm': 0.7135937835367194, 'learning_rate': 9.760733588929289e-06, 'epoch': 0.13} 13%|█▎ | 2783/22095 [4:51:24<18:42:39, 3.49s/it] 13%|█▎ | 2784/22095 [4:51:27<18:22:58, 3.43s/it] {'loss': 0.4439, 'grad_norm': 0.733145252105959, 'learning_rate': 9.760509526822206e-06, 'epoch': 0.13} 13%|█▎ | 2784/22095 [4:51:27<18:22:58, 3.43s/it] 13%|█▎ | 2785/22095 [4:51:31<18:45:55, 3.50s/it] {'loss': 0.4677, 'grad_norm': 0.7259614720917196, 'learning_rate': 9.760285362426397e-06, 'epoch': 0.13} 13%|█▎ | 2785/22095 [4:51:31<18:45:55, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 13%|█▎ | 2786/22095 [4:51:34<19:17:01, 3.60s/it] {'loss': 0.422, 'grad_norm': 0.7895174836944507, 'learning_rate': 9.760061095746671e-06, 'epoch': 0.13} 13%|█▎ | 2786/22095 [4:51:34<19:17:01, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121298 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65494 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45828 > 40960). Running this sequence through the model will result in indexing errors 13%|█▎ | 2787/22095 [4:51:38<19:20:42, 3.61s/it] {'loss': 0.4348, 'grad_norm': 0.7242396310884901, 'learning_rate': 9.759836726787855e-06, 'epoch': 0.13} 13%|█▎ | 2787/22095 [4:51:38<19:20:42, 3.61s/it] 13%|█▎ | 2788/22095 [4:51:41<18:35:24, 3.47s/it] {'loss': 0.4439, 'grad_norm': 0.7492140269556657, 'learning_rate': 9.759612255554765e-06, 'epoch': 0.13} 13%|█▎ | 2788/22095 [4:51:41<18:35:24, 3.47s/it] 13%|█▎ | 2789/22095 [4:51:45<19:43:41, 3.68s/it] {'loss': 0.4383, 'grad_norm': 0.7253084120049096, 'learning_rate': 9.759387682052226e-06, 'epoch': 0.13} 13%|█▎ | 2789/22095 [4:51:45<19:43:41, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2790/22095 [4:51:55<29:07:54, 5.43s/it] {'loss': 0.5338, 'grad_norm': 1.1116143910529677, 'learning_rate': 9.759163006285064e-06, 'epoch': 0.13} 13%|█▎ | 2790/22095 [4:51:55<29:07:54, 5.43s/it] 13%|█▎ | 2791/22095 [4:51:59<26:31:28, 4.95s/it] {'loss': 0.4255, 'grad_norm': 0.8685718392395363, 'learning_rate': 9.758938228258103e-06, 'epoch': 0.13} 13%|█▎ | 2791/22095 [4:51:59<26:31:28, 4.95s/it] 13%|█▎ | 2792/22095 [4:52:01<23:08:20, 4.32s/it] {'loss': 0.4382, 'grad_norm': 0.7273351825438547, 'learning_rate': 9.758713347976179e-06, 'epoch': 0.13} 13%|█▎ | 2792/22095 [4:52:02<23:08:20, 4.32s/it] 13%|█▎ | 2793/22095 [4:52:05<21:34:14, 4.02s/it] {'loss': 0.4311, 'grad_norm': 0.7179864020192804, 'learning_rate': 9.758488365444117e-06, 'epoch': 0.13} 13%|█▎ | 2793/22095 [4:52:05<21:34:14, 4.02s/it] 13%|█▎ | 2794/22095 [4:52:08<20:48:52, 3.88s/it] {'loss': 0.4111, 'grad_norm': 0.6649989419467585, 'learning_rate': 9.758263280666757e-06, 'epoch': 0.13} 13%|█▎ | 2794/22095 [4:52:08<20:48:52, 3.88s/it]Invalidate trace cache @ step 2: expected module 
1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42481 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86479 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96913 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (123283 > 40960). Running this sequence through the model will result in indexing errors 13%|█▎ | 2795/22095 [4:52:18<29:32:37, 5.51s/it] {'loss': 0.5024, 'grad_norm': 0.4415612740759909, 'learning_rate': 9.758038093648931e-06, 'epoch': 0.13} 13%|█▎ | 2795/22095 [4:52:18<29:32:37, 5.51s/it] 13%|█▎ | 2796/22095 [4:52:22<27:11:20, 5.07s/it] {'loss': 0.4213, 'grad_norm': 0.8351532083948827, 'learning_rate': 9.757812804395482e-06, 'epoch': 0.13} 13%|█▎ | 2796/22095 [4:52:22<27:11:20, 5.07s/it] 13%|█▎ | 2797/22095 [4:52:25<24:19:53, 4.54s/it] {'loss': 0.4192, 'grad_norm': 1.0093853854686414, 'learning_rate': 9.757587412911247e-06, 'epoch': 0.13} 13%|█▎ | 2797/22095 [4:52:25<24:19:53, 4.54s/it] 13%|█▎ | 2798/22095 [4:52:28<21:50:59, 4.08s/it] {'loss': 0.4206, 'grad_norm': 0.6988444753162972, 'learning_rate': 9.75736191920107e-06, 'epoch': 0.13} 13%|█▎ | 2798/22095 [4:52:28<21:50:59, 4.08s/it] 13%|█▎ | 2799/22095 [4:52:32<22:02:28, 4.11s/it] {'loss': 0.4357, 'grad_norm': 0.7719411662818325, 'learning_rate': 9.757136323269798e-06, 'epoch': 0.13} 13%|█▎ | 2799/22095 [4:52:32<22:02:28, 4.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63727 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61167 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59568 > 40960). Running this sequence through the model will result in indexing errors 13%|█▎ | 2800/22095 [4:52:36<20:53:07, 3.90s/it] {'loss': 0.4749, 'grad_norm': 0.8921782142026669, 'learning_rate': 9.756910625122276e-06, 'epoch': 0.13} 13%|█▎ | 2800/22095 [4:52:36<20:53:07, 3.90s/it] 13%|█▎ | 2801/22095 [4:52:40<21:03:11, 3.93s/it] {'loss': 0.4751, 'grad_norm': 0.7741589039777517, 'learning_rate': 9.756684824763354e-06, 'epoch': 0.13} 13%|█▎ | 2801/22095 [4:52:40<21:03:11, 3.93s/it] 13%|█▎ | 2802/22095 [4:52:43<19:29:03, 3.64s/it] {'loss': 0.4489, 'grad_norm': 0.7571345424322213, 'learning_rate': 9.756458922197884e-06, 'epoch': 0.13} 13%|█▎ | 2802/22095 [4:52:43<19:29:03, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2803/22095 [4:52:50<24:58:53, 4.66s/it] {'loss': 0.5225, 'grad_norm': 0.8639275063462156, 'learning_rate': 9.756232917430719e-06, 'epoch': 0.13} 13%|█▎ | 2803/22095 [4:52:50<24:58:53, 4.66s/it] 13%|█▎ | 2804/22095 [4:52:53<22:48:24, 4.26s/it] {'loss': 0.4155, 'grad_norm': 0.8443327036951966, 'learning_rate': 9.756006810466719e-06, 'epoch': 0.13} 13%|█▎ | 2804/22095 [4:52:53<22:48:24, 4.26s/it] 13%|█▎ | 2805/22095 [4:52:56<20:37:33, 3.85s/it] {'loss': 0.4464, 'grad_norm': 0.8229293203051627, 'learning_rate': 9.755780601310738e-06, 'epoch': 0.13} 13%|█▎ | 2805/22095 [4:52:56<20:37:33, 3.85s/it] 13%|█▎ | 2806/22095 [4:53:00<20:47:23, 3.88s/it] {'loss': 0.4316, 'grad_norm': 0.8798994860945353, 'learning_rate': 9.755554289967638e-06, 'epoch': 0.13} 13%|█▎ | 2806/22095 [4:53:00<20:47:23, 3.88s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (43216 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60379 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71746 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64339 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51827 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95961 > 40960). Running this sequence through the model will result in indexing errors 13%|█▎ | 2807/22095 [4:53:03<19:27:10, 3.63s/it] {'loss': 0.4324, 'grad_norm': 0.7797064513535364, 'learning_rate': 9.755327876442282e-06, 'epoch': 0.13} 13%|█▎ | 2807/22095 [4:53:03<19:27:10, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 13%|█▎ | 2808/22095 [4:53:10<25:35:58, 4.78s/it] {'loss': 0.5179, 'grad_norm': 0.46667805555480824, 'learning_rate': 9.755101360739537e-06, 'epoch': 0.13} 13%|█▎ | 2808/22095 [4:53:10<25:35:58, 4.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121146 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41309 > 40960). 
Running this sequence through the model will result in indexing errors 13%|█▎ | 2809/22095 [4:53:14<23:21:21, 4.36s/it] {'loss': 0.4062, 'grad_norm': 1.1732456802108047, 'learning_rate': 9.754874742864264e-06, 'epoch': 0.13} 13%|█▎ | 2809/22095 [4:53:14<23:21:21, 4.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8940296 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63449, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nA. 2\nB. 3\nC. 4\nD. 
6'}, {'from': 'gpt', 'value': '【解答】解:根据题意,AC=12cm,CB=\\frac{2}{3}AC,所以CB=8cm,所以AB=AC+CB=20cm,又D、E分别为AC、AB的中点,所以DE=AE-AD=\\frac{1}{2}(AB-AC)=4cm.即DE=4cm.'}]} 13%|█▎ | 2810/22095 [4:53:17<20:55:37, 3.91s/it] {'loss': 0.4489, 'grad_norm': 0.7681936991538912, 'learning_rate': 9.754648022821339e-06, 'epoch': 0.13} 13%|█▎ | 2810/22095 [4:53:17<20:55:37, 3.91s/it] 13%|█▎ | 2811/22095 [4:53:20<19:48:36, 3.70s/it] {'loss': 0.4357, 'grad_norm': 0.8233301252300935, 'learning_rate': 9.754421200615629e-06, 'epoch': 0.13} 13%|█▎ | 2811/22095 [4:53:20<19:48:36, 3.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880127 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3280, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12'}]}
13%|█▎ | 2812/22095 [4:53:23<18:56:41, 3.54s/it] {'loss': 0.4545, 'grad_norm': 1.5611820942128543, 'learning_rate': 9.75419427625201e-06, 'epoch': 0.13}
13%|█▎ | 2813/22095 [4:53:26<18:32:02, 3.46s/it] {'loss': 0.4269, 'grad_norm': 0.7366516657356496, 'learning_rate': 9.753967249735359e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (62297 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70836 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (227182 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2814/22095 [4:53:29<18:15:33, 3.41s/it] {'loss': 0.4172, 'grad_norm': 1.197383763338776, 'learning_rate': 9.753740121070552e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2815/22095 [4:53:38<25:54:57, 4.84s/it] {'loss': 0.523, 'grad_norm': 0.5187181295570135, 'learning_rate': 9.753512890262468e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (51354 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60568 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42190 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70045 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2816/22095 [4:53:42<24:32:33, 4.58s/it] {'loss': 0.4342, 'grad_norm': 0.7402400139361618, 'learning_rate': 9.753285557315993e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2817/22095 [4:53:51<32:36:01, 6.09s/it] {'loss': 0.5042, 'grad_norm': 0.4179812876405542, 'learning_rate': 9.75305812223601e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (85038 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73244 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2818/22095 [4:53:54<27:55:57, 5.22s/it] {'loss': 0.4321, 'grad_norm': 0.956692225707494, 'learning_rate': 9.752830585027406e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (51737 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2819/22095 [4:53:58<24:40:42, 4.61s/it] {'loss': 0.4106, 'grad_norm': 0.7580316124124711, 'learning_rate': 9.752602945695068e-06, 'epoch': 0.13}
13%|█▎ | 2820/22095 [4:54:02<23:44:08, 4.43s/it] {'loss': 0.4584, 'grad_norm': 0.6779058714280451, 'learning_rate': 9.75237520424389e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (56097 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44293 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2821/22095 [4:54:05<22:19:01, 4.17s/it] {'loss': 0.4538, 'grad_norm': 0.7006360886542689, 'learning_rate': 9.752147360678767e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9189555 in VC:s3://internvl2/datasets/EduChat-Math/. Exception: Image size [25, 25, 100, 100] is too small. Minimum size is 28.
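The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` failures in this log come from `_get_item` in `data_qwen_2.py` rejecting images with a side shorter than 28 px. A minimal sketch of screening such records out of an annotation file before training, assuming only the `image_wh: [[width, height]]` layout visible in the "Problematic sample" dumps; the helper names and the standalone-filter approach are hypothetical, not part of the training code:

```python
# Screen out annotation records whose image has a side shorter than the
# trainer's minimum (28 px, per the ValueError messages in this log).
# Assumed record layout, taken from the "Problematic sample" dumps:
# record["image_wh"] is a list of [width, height] pairs.
MIN_SIDE = 28

def is_large_enough(record, min_side=MIN_SIDE):
    # Records without an "image_wh" field (e.g. text-only samples) pass.
    return all(w >= min_side and h >= min_side
               for w, h in record.get("image_wh", []))

def filter_records(records, min_side=MIN_SIDE):
    # Keep only records whose every image meets the minimum side length.
    return [r for r in records if is_large_enough(r, min_side)]
```

Under this sketch, the `[[163, 23]]` geoqa+ sample and the `[[25, 25]]` EduChat-Math sample above would both be dropped before the dataloader ever retries them.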
Problematic sample: {'id': 6003, 'image': '9361.jpg', 'image_wh': [[25, 25]], 'conversations': [{'from': 'human', 'value': '你的任务是回答以下问题。在回答之前,请逐步推理说明您的思路。当你准备好给出答案时,请使用以下格式:"答案: ..."\n\n问题:\n\n定义在 $\\mathrm{R}$ 上的函数 $\\mathrm{f}(\\mathrm{x})$ 满足: $\\mathrm{f}^{\\prime}(\\mathrm{x})>1-\\mathrm{f}(\\mathrm{x}), \\mathrm{f}(0)=6, \\mathrm{f}^{\\prime}(\\mathrm{x})$ 是 $\\mathrm{f}(\\mathrm{x})$ 的导函数, 则不等式 $e^{x} f(x)>e^{x}+5$ (其中 $e$ 为自然对数的底数) 的解集为 \\$ \\qquad \\$\n\n'}, {'from': 'gpt', 'value': '要解不等式 $e^{x} f(x) > e^{x} + 5$,我们先定义一个辅助函数 $g(x) = e^{x} f(x) - e^{x}$。\n\n### 步骤 1: 求导\n计算 $g(x)$ 的导数:\n\\[ g^{\\prime}(x) = e^{x} f(x) + e^{x} f^{\\prime}(x) - e^{x} = e^{x} (f(x) + f^{\\prime}(x) - 1) \\]\n\n### 步骤 2: 判断导数符号\n已知 $f^{\\prime}(x) > 1 - f(x)$,因此:\n\\[ f(x) + f^{\\prime}(x) - 1 > 0 \\]\n\n由此可以得出:\n\\[ g^{\\prime}(x) > 0 \\]\n\n这表明 $g(x)$ 是单调递增的函数。\n\n### 步骤 3: 初始条件\n计算 $g(0)$:\n\\[ g(0) = e^{0} f(0) - e^{0} = 6 - 1 = 5 \\]\n\n### 步骤 4: 解不等式\n由于 $g(x)$ 是单调递增的,所以对于 $x > 0$,有 $g(x) > g(0) = 5$。\n\n因此,不等式 $e^{x} f(x) > e^{x} + 5$ 的解集为 $(0, +\\infty)$。\n\n答案: $(0, +\\infty)$'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_1/images/step_0.png 2025-08-27 20:52:05.004286 load time: 1037.28 ms
13%|█▎ | 2822/22095 [4:54:15<31:59:17, 5.98s/it] {'loss': 0.5084, 'grad_norm': 0.4225543051078938, 'learning_rate': 9.75191941500459e-06, 'epoch': 0.13}
13%|█▎ | 2823/22095 [4:54:19<28:29:46, 5.32s/it] {'loss': 0.4234, 'grad_norm': 0.9870591208561422, 'learning_rate': 9.75169136722626e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2824/22095 [4:54:29<35:41:38, 6.67s/it] {'loss': 0.497, 'grad_norm': 0.38302841608917704, 'learning_rate': 9.751463217348675e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2825/22095 [4:54:33<31:27:56, 5.88s/it] {'loss': 0.4236, 'grad_norm': 0.7176736840275659, 'learning_rate': 9.75123496537674e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2826/22095 [4:54:42<37:11:51, 6.95s/it] {'loss': 0.5039, 'grad_norm': 0.34689025847761207, 'learning_rate': 9.751006611315357e-06, 'epoch': 0.13}
13%|█▎ | 2827/22095 [4:54:48<34:13:03, 6.39s/it] {'loss': 0.4995, 'grad_norm': 0.3500954668873423, 'learning_rate': 9.750778155169434e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 364, but got module 1
13%|█▎ | 2828/22095 [4:54:51<29:55:59, 5.59s/it] {'loss': 0.4577, 'grad_norm': 1.0915034728521975, 'learning_rate': 9.75054959694388e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2829/22095 [4:54:59<32:53:10, 6.15s/it] {'loss': 0.5616, 'grad_norm': 0.36993428839340675, 'learning_rate': 9.750320936643604e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877038 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 191, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 5cm'}, {'from': 'gpt', 'value': '【解答】解:∵点C是线段AB的中点,AB=20cm,∴BC=\\frac{1}{2}AB=\\frac{1}{2}×20cm=10cm,∵点D是线段BC的中点,∴BD=\\frac{1}{2}BC=\\frac{1}{2}×10cm=5cm,∴AD=AB-BD=20cm-5cm=15cm.'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2830/22095 [4:55:02<28:14:28, 5.28s/it] {'loss': 0.4061, 'grad_norm': 1.1635716200014408, 'learning_rate': 9.75009217427352e-06, 'epoch': 0.13}
13%|█▎ | 2831/22095 [4:55:05<24:54:19, 4.65s/it] {'loss': 0.4129, 'grad_norm': 0.7709762211162258, 'learning_rate': 9.749863309838545e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922572 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45725, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 9cm'}, {'from': 'gpt', 'value': '【解答】解:点D是AC的中点,如果CD=4cm,AC=2CD=2×4=8(cm),BC=AB-AC=13-8=5(cm).'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8362007 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 28741, 'image': 'vrdu_table_final_2/astro-ph.CO/1e4bca58-1d52-42f7-a67b-79c43d4381d9.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}#1@{}}{#2}\\\\{#3}\\end{tabular}\n```"}]}
13%|█▎ | 2832/22095 [4:55:08<22:21:36, 4.18s/it] {'loss': 0.4606, 'grad_norm': 0.8127009596810594, 'learning_rate': 9.749634343343598e-06, 'epoch': 0.13}
13%|█▎ | 2833/22095 [4:55:12<21:00:06, 3.93s/it] {'loss': 0.432, 'grad_norm': 0.7069452105591859, 'learning_rate': 9.749405274793592e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2834/22095 [4:55:15<20:23:02, 3.81s/it] {'loss': 0.4228, 'grad_norm': 0.7539224217869867, 'learning_rate': 9.749176104193456e-06, 'epoch': 0.13}
13%|█▎ | 2835/22095 [4:55:18<18:47:31, 3.51s/it] {'loss': 0.3924, 'grad_norm': 0.682126292506584, 'learning_rate': 9.748946831548111e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954481 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5316, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
13%|█▎ | 2836/22095 [4:55:21<18:48:39, 3.52s/it] {'loss': 0.4298, 'grad_norm': 0.8189363607131463, 'learning_rate': 9.748717456862484e-06, 'epoch': 0.13}
13%|█▎ | 2837/22095 [4:55:25<18:38:40, 3.49s/it] {'loss': 0.4387, 'grad_norm': 0.6526943554999189, 'learning_rate': 9.748487980141503e-06, 'epoch': 0.13}
13%|█▎ | 2838/22095 [4:55:28<18:16:12, 3.42s/it] {'loss': 0.4605, 'grad_norm': 0.7165030693701668, 'learning_rate': 9.748258401390099e-06, 'epoch': 0.13}
13%|█▎ | 2839/22095 [4:55:32<19:02:57, 3.56s/it] {'loss': 0.4589, 'grad_norm': 0.748171676264515, 'learning_rate': 9.748028720613206e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2840/22095 [4:55:36<20:14:01, 3.78s/it] {'loss': 0.4405, 'grad_norm': 0.6958245051079487, 'learning_rate': 9.747798937815756e-06, 'epoch': 0.13}
13%|█▎ | 2841/22095 [4:55:40<20:03:46, 3.75s/it] {'loss': 0.4166, 'grad_norm': 0.7010544328742045, 'learning_rate': 9.74756905300269e-06, 'epoch': 0.13}
13%|█▎ | 2842/22095 [4:55:43<19:20:04, 3.62s/it] {'loss': 0.3861, 'grad_norm': 0.6929601001488586, 'learning_rate': 9.747339066178947e-06, 'epoch': 0.13}
13%|█▎ | 2843/22095 [4:55:47<18:53:35, 3.53s/it] {'loss': 0.4512, 'grad_norm': 0.7691628549541619, 'learning_rate': 9.747108977349466e-06, 'epoch': 0.13}
13%|█▎ | 2844/22095 [4:55:50<18:17:20, 3.42s/it] {'loss': 0.4443, 'grad_norm': 1.5952758417935748, 'learning_rate': 9.746878786519195e-06, 'epoch': 0.13}
13%|█▎ | 2845/22095 [4:55:53<18:40:39, 3.49s/it] {'loss': 0.3726, 'grad_norm': 0.7012408391747204, 'learning_rate': 9.746648493693076e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (109727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47616 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53856 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2846/22095 [4:55:56<17:53:04, 3.34s/it] {'loss': 0.38, 'grad_norm': 0.7661600137981615, 'learning_rate': 9.74641809887606e-06, 'epoch': 0.13}
13%|█▎ | 2847/22095 [4:55:59<17:05:55, 3.20s/it] {'loss': 0.3985, 'grad_norm': 0.8000061170172621, 'learning_rate': 9.746187602073097e-06, 'epoch': 0.13}
13%|█▎ | 2848/22095 [4:56:03<17:57:03, 3.36s/it] {'loss': 0.4487, 'grad_norm': 0.7000898176833337, 'learning_rate': 9.745957003289138e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2849/22095 [4:56:07<18:08:49, 3.39s/it] {'loss': 0.4188, 'grad_norm': 0.717128506736437, 'learning_rate': 9.745726302529139e-06, 'epoch': 0.13}
13%|█▎ | 2850/22095 [4:56:09<17:12:06, 3.22s/it] {'loss': 0.3798, 'grad_norm': 0.7935962162575106, 'learning_rate': 9.745495499798058e-06, 'epoch': 0.13}
13%|█▎ | 2851/22095 [4:56:13<17:24:34, 3.26s/it] {'loss': 0.4351, 'grad_norm': 0.7923976032261633, 'learning_rate': 9.745264595100854e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2852/22095 [4:56:16<17:10:20, 3.21s/it] {'loss': 0.4304, 'grad_norm': 0.747883990023528, 'learning_rate': 9.745033588442487e-06, 'epoch': 0.13}
13%|█▎ | 2853/22095 [4:56:19<16:25:08, 3.07s/it] {'loss': 0.4162, 'grad_norm': 0.7459299202346427, 'learning_rate': 9.744802479827921e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2854/22095 [4:56:28<26:45:04, 5.01s/it] {'loss': 0.5119, 'grad_norm': 0.5887077338666191, 'learning_rate': 9.744571269262122e-06, 'epoch': 0.13}
13%|█▎ | 2855/22095 [4:56:32<25:16:52, 4.73s/it] {'loss': 0.4359, 'grad_norm': 0.6729072197590653, 'learning_rate': 9.74433995675006e-06, 'epoch': 0.13}
13%|█▎ | 2856/22095 [4:56:36<23:42:20, 4.44s/it] {'loss': 0.4771, 'grad_norm': 0.7777456505468983, 'learning_rate': 9.744108542296702e-06, 'epoch': 0.13}
13%|█▎ | 2857/22095 [4:56:39<22:18:51, 4.18s/it] {'loss': 0.3983, 'grad_norm': 0.7469741198993132, 'learning_rate': 9.743877025907023e-06, 'epoch': 0.13}
13%|█▎ | 2858/22095 [4:56:43<21:37:27, 4.05s/it] {'loss': 0.3637, 'grad_norm': 0.6858749117382782, 'learning_rate': 9.743645407585994e-06, 'epoch': 0.13}
13%|█▎ | 2859/22095 [4:56:47<20:38:34, 3.86s/it] {'loss': 0.4184, 'grad_norm': 0.7031646069476174, 'learning_rate': 9.743413687338596e-06, 'epoch': 0.13}
13%|█▎ | 2860/22095 [4:56:51<21:12:52, 3.97s/it] {'loss': 0.4131, 'grad_norm': 0.7255909443098081, 'learning_rate': 9.743181865169806e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2861/22095 [4:56:54<19:47:01, 3.70s/it] {'loss': 0.4202, 'grad_norm': 0.7322265572281352, 'learning_rate': 9.742949941084604e-06, 'epoch': 0.13}
13%|█▎ | 2862/22095 [4:56:57<19:01:35, 3.56s/it] {'loss': 0.4113, 'grad_norm': 0.6886194155596765, 'learning_rate': 9.742717915087978e-06, 'epoch': 0.13}
13%|█▎ | 2863/22095 [4:57:01<19:10:04, 3.59s/it] {'loss': 0.4339, 'grad_norm': 0.7652278487384526, 'learning_rate': 9.742485787184907e-06, 'epoch': 0.13}
13%|█▎ | 2864/22095 [4:57:04<18:41:52, 3.50s/it] {'loss': 0.4259, 'grad_norm': 0.67801745151185, 'learning_rate': 9.742253557380383e-06, 'epoch': 0.13}
13%|█▎ | 2865/22095 [4:57:07<17:31:08, 3.28s/it] {'loss': 0.4087, 'grad_norm': 0.6355102242351469, 'learning_rate': 9.742021225679394e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2866/22095 [4:57:10<17:27:28, 3.27s/it] {'loss': 0.502, 'grad_norm': 0.7260698872298267, 'learning_rate': 9.741788792086934e-06, 'epoch': 0.13}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30729.png 2025-08-27 20:55:08.920946 load time: 1461.35 ms
13%|█▎ | 2867/22095 [4:57:15<19:34:59, 3.67s/it] {'loss': 0.376, 'grad_norm': 1.0475091478809628, 'learning_rate': 9.741556256607996e-06, 'epoch': 0.13}
13%|█▎ | 2868/22095 [4:57:17<18:01:09, 3.37s/it] {'loss': 0.4064, 'grad_norm': 0.7174565795305505, 'learning_rate': 9.741323619247575e-06, 'epoch': 0.13}
13%|█▎ | 2869/22095 [4:57:20<17:14:01, 3.23s/it] {'loss': 0.4205, 'grad_norm': 0.6816845441724139, 'learning_rate': 9.741090880010674e-06, 'epoch': 0.13}
13%|█▎ | 2870/22095 [4:57:23<16:43:14, 3.13s/it] {'loss': 0.469, 'grad_norm': 0.7404934530715019, 'learning_rate': 9.74085803890229e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8553925 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24632, 'image': '942110439.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Mystery, Thriller & Suspense? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
13%|█▎ | 2871/22095 [4:57:27<17:48:11, 3.33s/it] {'loss': 0.4323, 'grad_norm': 0.7107227611827823, 'learning_rate': 9.740625095927428e-06, 'epoch': 0.13}
13%|█▎ | 2872/22095 [4:57:31<19:22:46, 3.63s/it] {'loss': 0.4506, 'grad_norm': 0.6435610487654178, 'learning_rate': 9.74039205109109e-06, 'epoch': 0.13}
13%|█▎ | 2873/22095 [4:57:34<18:25:16, 3.45s/it] {'loss': 0.4603, 'grad_norm': 0.6905701954641099, 'learning_rate': 9.740158904398286e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (84762 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2874/22095 [4:57:38<18:47:01, 3.52s/it] {'loss': 0.3842, 'grad_norm': 0.6490855543128955, 'learning_rate': 9.739925655854028e-06, 'epoch': 0.13}
13%|█▎ | 2875/22095 [4:57:41<17:53:26, 3.35s/it] {'loss': 0.4419, 'grad_norm': 0.7628737285578185, 'learning_rate': 9.739692305463324e-06, 'epoch': 0.13}
13%|█▎ | 2876/22095 [4:57:44<17:26:38, 3.27s/it] {'loss': 0.4759, 'grad_norm': 0.7171768424557958, 'learning_rate': 9.739458853231188e-06, 'epoch': 0.13}
13%|█▎ | 2877/22095 [4:57:48<17:42:03, 3.32s/it] {'loss': 0.4431, 'grad_norm': 0.6877385757644017, 'learning_rate': 9.739225299162638e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (67268 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2878/22095 [4:57:51<17:55:57, 3.36s/it] {'loss': 0.4724, 'grad_norm': 0.7360380510178575, 'learning_rate': 9.738991643262693e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (73819 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65597 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43523 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44119 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46034 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2879/22095 [4:57:54<17:53:55, 3.35s/it] {'loss': 0.4279, 'grad_norm': 0.7086556215160046, 'learning_rate': 9.738757885536371e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (72991 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48067 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2880/22095 [4:57:57<17:14:24, 3.23s/it] {'loss': 0.4422, 'grad_norm': 0.7367770757564954, 'learning_rate': 9.738524025988696e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (97069 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2881/22095 [4:58:01<18:03:50, 3.38s/it] {'loss': 0.4302, 'grad_norm': 0.7325494044645873, 'learning_rate': 9.738290064624694e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43264 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2882/22095 [4:58:06<21:19:31, 4.00s/it] {'loss': 0.5225, 'grad_norm': 0.7139844892563704, 'learning_rate': 9.73805600144939e-06, 'epoch': 0.13}
13%|█▎ | 2883/22095 [4:58:10<19:59:20, 3.75s/it] {'loss': 0.3924, 'grad_norm': 0.7369897055113472, 'learning_rate': 9.737821836467816e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [570, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8467858 in VC:s3://internvl-moe-sft-data/. Exception: Image size [570, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 13359, 'image': 'vrdu_texteq/astro-ph.CO/b59141fa-d4ea-4714-ab5a-296e12b4467e.png', 'image_wh': [[570, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'with $\\mbox{f}_i$ the Fermi--Dirac distribution\nwritten as'}]} 13%|█▎ | 2884/22095 [4:58:12<18:38:48, 3.49s/it] {'loss': 0.4555, 'grad_norm': 0.7422154925009801, 'learning_rate': 9.737587569685e-06, 'epoch': 0.13} 13%|█▎ | 2884/22095 [4:58:12<18:38:48, 3.49s/it] 13%|█▎ | 2885/22095 [4:58:15<17:33:44, 3.29s/it] {'loss': 0.3824, 'grad_norm': 0.6495721284288817, 'learning_rate': 9.737353201105978e-06, 'epoch': 0.13} 13%|█▎ | 2885/22095 [4:58:15<17:33:44, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2886/22095 [4:58:26<28:59:33, 5.43s/it] {'loss': 0.5372, 'grad_norm': 0.3972406270595581, 'learning_rate': 9.737118730735786e-06, 'epoch': 0.13} 13%|█▎ | 2886/22095 [4:58:26<28:59:33, 5.43s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_2/images/step_0.png 2025-08-27 20:56:25.737470 load time: 1060.11 ms 13%|█▎ | 2887/22095 [4:58:32<29:46:27, 5.58s/it] {'loss': 0.5196, 'grad_norm': 0.3759637227326978, 'learning_rate': 9.73688415857946e-06, 'epoch': 0.13} 13%|█▎ | 2887/22095 [4:58:32<29:46:27, 5.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89186 > 40960). 
Running this sequence through the model will result in indexing errors
13%|█▎ | 2888/22095 [4:58:39<33:08:06, 6.21s/it] {'loss': 0.5211, 'grad_norm': 0.38323779671317704, 'learning_rate': 9.736649484642044e-06, 'epoch': 0.13}
13%|█▎ | 2889/22095 [4:58:44<30:52:53, 5.79s/it] {'loss': 0.502, 'grad_norm': 0.3246091338161728, 'learning_rate': 9.736414708928576e-06, 'epoch': 0.13}
13%|█▎ | 2890/22095 [4:58:54<36:54:35, 6.92s/it] {'loss': 0.5038, 'grad_norm': 0.3367150725113058, 'learning_rate': 9.736179831444103e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 364, but got module 1
13%|█▎ | 2891/22095 [4:58:57<30:55:22, 5.80s/it] {'loss': 0.4215, 'grad_norm': 1.1456990720264577, 'learning_rate': 9.735944852193673e-06, 'epoch': 0.13}
13%|█▎ | 2892/22095 [4:59:01<27:47:52, 5.21s/it] {'loss': 0.4528, 'grad_norm': 0.7204268232990344, 'learning_rate': 9.735709771182331e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2893/22095 [4:59:04<24:21:15, 4.57s/it] {'loss': 0.4629, 'grad_norm': 0.8307652317369043, 'learning_rate': 9.735474588415132e-06, 'epoch': 0.13}
13%|█▎ | 2894/22095 [4:59:07<22:25:04, 4.20s/it] {'loss': 0.4201, 'grad_norm': 1.1668789311608647, 'learning_rate': 9.735239303897129e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    raise ValueError(
ValueError: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None
[Try #0] Failed to fetch sample 1863756 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. Exception: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None
Problematic sample: {'image': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png', 'conversations': [], 'image_id': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047231 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵AN:MN=1:2,且AN=2,∴2:MN=1:2,∴MN=4cm,∴AM=6cm.∵M是线段AB的中点,∴AB=2AM,∴AB=12cm,故D答案正确.'}]}
13%|█▎ | 2895/22095 [4:59:11<22:10:39, 4.16s/it] {'loss': 0.4314, 'grad_norm': 0.7384757066530285, 'learning_rate': 9.735003917633376e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (74472 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75149 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70097 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55278 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117007 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58930 > 40960).
Running this sequence through the model will result in indexing errors
13%|█▎ | 2896/22095 [4:59:21<31:56:09, 5.99s/it] {'loss': 0.5127, 'grad_norm': 0.6199463148996311, 'learning_rate': 9.73476842962893e-06, 'epoch': 0.13}
13%|█▎ | 2897/22095 [4:59:25<28:34:35, 5.36s/it] {'loss': 0.4345, 'grad_norm': 1.2150021320725093, 'learning_rate': 9.734532839888853e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8399445 in VC:s3://internvl-moe-sft-data/. Exception: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1599, 'image': 'vrdu_table_final_2/astro-ph.CO/42134518-2f00-4d96-9a0e-801eef3819d4.png', 'image_wh': [[109, 20]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha - \\alpha_{\\rm true}$\\end{tabular}\n```"}]}
13%|█▎ | 2898/22095 [4:59:29<25:39:50, 4.81s/it] {'loss': 0.4413, 'grad_norm': 0.9311562905072975, 'learning_rate': 9.734297148418205e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (46239 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124628 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49774 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2899/22095 [4:59:33<24:43:06, 4.64s/it] {'loss': 0.4008, 'grad_norm': 0.7231301939259253, 'learning_rate': 9.734061355222054e-06, 'epoch': 0.13}
13%|█▎ | 2900/22095 [4:59:36<22:17:26, 4.18s/it] {'loss': 0.3956, 'grad_norm': 0.8095404884163195, 'learning_rate': 9.733825460305462e-06, 'epoch': 0.13}
13%|█▎ | 2901/22095 [4:59:40<21:42:27, 4.07s/it] {'loss': 0.4109, 'grad_norm': 0.8873271958738239, 'learning_rate': 9.7335894636735e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (76890 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2902/22095 [4:59:43<20:38:40, 3.87s/it] {'loss': 0.4431, 'grad_norm': 0.8089331693710592, 'learning_rate': 9.73335336533124e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2903/22095 [4:59:47<20:33:53, 3.86s/it] {'loss': 0.4636, 'grad_norm': 0.7109221137009005, 'learning_rate': 9.733117165283753e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2904/22095 [4:59:58<31:31:12, 5.91s/it] {'loss': 0.5099, 'grad_norm': 0.5918246544216307, 'learning_rate': 9.732880863536114e-06, 'epoch': 0.13}
13%|█▎ | 2905/22095 [5:00:02<28:01:01, 5.26s/it] {'loss': 0.4276, 'grad_norm': 0.8772212127328848, 'learning_rate': 9.732644460093402e-06, 'epoch': 0.13}
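The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings mean individual samples tokenize to far more than the model's 40960-token context (up to 124628 tokens in this log), so they would overflow at indexing time rather than being silently truncated. A minimal, hypothetical pre-filter for such samples; only the 40960 limit comes from the log, the function names and batch shape are illustrative, not the actual data_qwen_2.py API:

```python
# Hedged sketch: guard against over-long tokenized samples.
# MODEL_MAX_LENGTH matches the limit printed in the warnings above;
# everything else here is an assumption for illustration.
MODEL_MAX_LENGTH = 40960

def overflow(token_ids, max_len=MODEL_MAX_LENGTH):
    """Return how many tokens a sample exceeds the context window by (0 if it fits)."""
    return max(0, len(token_ids) - max_len)

def split_by_length(batch, max_len=MODEL_MAX_LENGTH):
    """Partition tokenized samples into those that fit and those to drop or re-chunk."""
    kept, too_long = [], []
    for ids in batch:
        (too_long if overflow(ids, max_len) else kept).append(ids)
    return kept, too_long
```

Filtering (or chunking) before training avoids both the warning spam and the indexing errors the tokenizer predicts.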
13%|█▎ | 2906/22095 [5:00:05<24:54:58, 4.67s/it] {'loss': 0.4327, 'grad_norm': 0.8671021696476754, 'learning_rate': 9.732407954960695e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2907/22095 [5:00:15<33:21:23, 6.26s/it] {'loss': 0.5093, 'grad_norm': 0.3731968179795825, 'learning_rate': 9.732171348143076e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (55481 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42360 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2908/22095 [5:00:18<28:11:16, 5.29s/it] {'loss': 0.3628, 'grad_norm': 1.5183166337584588, 'learning_rate': 9.731934639645628e-06, 'epoch': 0.13}
13%|█▎ | 2909/22095 [5:00:21<24:12:14, 4.54s/it] {'loss': 0.4497, 'grad_norm': 0.7402805392392205, 'learning_rate': 9.731697829473438e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44225 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2910/22095 [5:00:30<30:51:58, 5.79s/it] {'loss': 0.5167, 'grad_norm': 0.4875191976551136, 'learning_rate': 9.731460917631594e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2911/22095 [5:00:33<27:07:20, 5.09s/it] {'loss': 0.4501, 'grad_norm': 0.8975361419222202, 'learning_rate': 9.731223904125186e-06, 'epoch': 0.13}
VC:s3://gui-agent/data_20250407/web/images/google_map/trajectory_48/img/step_1.png 2025-08-27 20:58:31.736642 load time: 1086.18 ms
13%|█▎ | 2912/22095 [5:00:37<25:34:25, 4.80s/it] {'loss': 0.4582, 'grad_norm': 0.7226982167131014, 'learning_rate': 9.730986788959308e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8382450 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
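The recurring "Image size [...] is too small. Minimum size is 28." failures all follow one pattern: a sample whose stored width or height is below 28 px (the first two numbers of the logged four-element list match each sample's `image_wh`; the meaning of the trailing `100, 100` is not recoverable from the log). A sketch of that kind of guard; the 28 floor is from the log, while the function name and signature are assumptions, not the actual data_qwen_2.py code:

```python
# Hedged sketch of a minimum-image-size check like the one raising above.
# MIN_SIZE matches the logged "Minimum size is 28"; names are illustrative.
MIN_SIZE = 28

def check_image_size(width, height, min_size=MIN_SIZE):
    """Reject images whose width or height is below the minimum side length."""
    if width < min_size or height < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_size}."
        )
    return width, height
```

The 28 px floor presumably reflects the vision encoder needing at least one full patch per side; either way, screening `image_wh` during dataset preparation would surface these samples before training instead of as retry noise.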
Problematic sample: {'id': 49244, 'image': 'vrdu_table_final_2/astro-ph.CO/cc1875ba-66a0-4dd6-b2d5-fd14e92bbfd1.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2913/22095 [5:00:40<22:32:59, 4.23s/it] {'loss': 0.3826, 'grad_norm': 0.7359214134878374, 'learning_rate': 9.730749572139054e-06, 'epoch': 0.13}
13%|█▎ | 2914/22095 [5:00:45<23:00:40, 4.32s/it] {'loss': 0.4567, 'grad_norm': 0.6907853883793874, 'learning_rate': 9.730512253669523e-06, 'epoch': 0.13}
13%|█▎ | 2915/22095 [5:00:48<22:19:13, 4.19s/it] {'loss': 0.4673, 'grad_norm': 0.7620674928519404, 'learning_rate': 9.730274833555809e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2916/22095 [5:01:02<36:58:05, 6.94s/it] {'loss': 0.4913, 'grad_norm': 0.4297711923387903, 'learning_rate': 9.730037311803017e-06, 'epoch': 0.13}
13%|█▎ | 2917/22095 [5:01:10<38:28:20, 7.22s/it] {'loss': 0.5311, 'grad_norm': 0.3635336031936697, 'learning_rate': 9.72979968841625e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (51326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59209 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2918/22095 [5:01:14<33:48:46, 6.35s/it] {'loss': 0.5029, 'grad_norm': 0.31921531659711294, 'learning_rate': 9.729561963400616e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250616/windows_paste/images/stata/20250520_101919_5/images/before_screenshot_34_id_59_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-27 20:59:12.716827 load time: 1007.76 ms
13%|█▎ | 2919/22095 [5:01:17<29:21:20, 5.51s/it] {'loss': 0.4328, 'grad_norm': 0.8854230801627538, 'learning_rate': 9.72932413676122e-06, 'epoch': 0.13}
13%|█▎ | 2920/22095 [5:01:21<26:26:53, 4.97s/it] {'loss': 0.4549, 'grad_norm': 0.7914974220186524, 'learning_rate': 9.729086208503174e-06, 'epoch': 0.13}
13%|█▎ | 2921/22095 [5:01:25<24:07:44, 4.53s/it] {'loss': 0.4069, 'grad_norm': 0.6883760552004766, 'learning_rate': 9.728848178631588e-06, 'epoch': 0.13}
13%|█▎ | 2922/22095 [5:01:28<21:43:42, 4.08s/it] {'loss': 0.402, 'grad_norm': 1.0327106749885167, 'learning_rate': 9.72861004715158e-06, 'epoch': 0.13}
13%|█▎ | 2923/22095 [5:01:31<20:00:06, 3.76s/it] {'loss': 0.3908, 'grad_norm': 0.7813827826117568, 'learning_rate': 9.728371814068265e-06, 'epoch': 0.13}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10359.png 2025-08-27 20:59:27.486244 load time: 1123.14 ms
13%|█▎ | 2924/22095 [5:01:34<18:48:06, 3.53s/it] {'loss': 0.4126, 'grad_norm': 0.6625608201070756, 'learning_rate': 9.728133479386763e-06, 'epoch': 0.13}
13%|█▎ | 2925/22095 [5:01:38<19:16:16, 3.62s/it] {'loss': 0.3966, 'grad_norm': 0.8204268002708216, 'learning_rate': 9.727895043112192e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (118724 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (40992 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92334 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2926/22095 [5:01:47<28:37:28, 5.38s/it] {'loss': 0.5155, 'grad_norm': 0.7096880863141186, 'learning_rate': 9.727656505249676e-06, 'epoch': 0.13}
13%|█▎ | 2927/22095 [5:01:50<25:02:24, 4.70s/it] {'loss': 0.4203, 'grad_norm': 0.8209178150784526, 'learning_rate': 9.727417865804343e-06, 'epoch': 0.13}
13%|█▎ | 2928/22095 [5:01:54<23:07:39, 4.34s/it] {'loss': 0.4437, 'grad_norm': 1.320863933337403, 'learning_rate': 9.72717912478132e-06, 'epoch': 0.13}
13%|█▎ | 2929/22095 [5:01:57<21:07:38, 3.97s/it] {'loss': 0.4092, 'grad_norm': 0.6550034192890875, 'learning_rate': 9.726940282185734e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (45249 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63256 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111663 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55237 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2930/22095 [5:02:00<20:21:09, 3.82s/it] {'loss': 0.4351, 'grad_norm': 0.7084129783400551, 'learning_rate': 9.726701338022722e-06, 'epoch': 0.13}
13%|█▎ | 2931/22095 [5:02:03<18:42:27, 3.51s/it] {'loss': 0.3808, 'grad_norm': 0.693353211799939, 'learning_rate': 9.726462292297411e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (88695 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49351 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115006 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2932/22095 [5:02:06<18:08:21, 3.41s/it] {'loss': 0.4215, 'grad_norm': 0.7701332472199786, 'learning_rate': 9.726223145014946e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (120706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109480 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2933/22095 [5:02:11<20:06:59, 3.78s/it] {'loss': 0.4252, 'grad_norm': 0.7094866856303067, 'learning_rate': 9.725983896180458e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2934/22095 [5:02:20<29:09:43, 5.48s/it] {'loss': 0.5054, 'grad_norm': 0.6232236818498672, 'learning_rate': 9.725744545799093e-06, 'epoch': 0.13}
13%|█▎ | 2935/22095 [5:02:24<25:54:57, 4.87s/it] {'loss': 0.3881, 'grad_norm': 0.9707367889864059, 'learning_rate': 9.72550509387599e-06, 'epoch': 0.13}
13%|█▎ | 2936/22095 [5:02:27<23:35:40, 4.43s/it] {'loss': 0.4109, 'grad_norm': 0.7360273696473658, 'learning_rate': 9.725265540416296e-06, 'epoch': 0.13}
13%|█▎ | 2937/22095 [5:02:30<21:30:40, 4.04s/it] {'loss': 0.4553, 'grad_norm': 0.7238757945473541, 'learning_rate': 9.725025885425159e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2938/22095 [5:02:39<29:29:52, 5.54s/it] {'loss': 0.5154, 'grad_norm': 0.471024966139833, 'learning_rate': 9.724786128907726e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8959458 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10293, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 5\nB. 2\nC. 3\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
13%|█▎ | 2939/22095 [5:02:43<26:18:49, 4.95s/it] {'loss': 0.4465, 'grad_norm': 0.7205417842353122, 'learning_rate': 9.724546270869152e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2940/22095 [5:02:50<30:33:52, 5.74s/it] {'loss': 0.5377, 'grad_norm': 0.40647518704260016, 'learning_rate': 9.724306311314589e-06, 'epoch': 0.13}
13%|█▎ | 2941/22095 [5:03:00<36:53:18, 6.93s/it] {'loss': 0.5163, 'grad_norm': 0.35852848738255716, 'learning_rate': 9.724066250249192e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2942/22095 [5:03:04<31:50:44, 5.99s/it] {'loss': 0.4381, 'grad_norm': 0.9552604293041855, 'learning_rate': 9.72382608767812e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2943/22095 [5:03:07<27:33:09, 5.18s/it] {'loss': 0.4132, 'grad_norm': 0.7275710516886414, 'learning_rate': 9.723585823606533e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2944/22095 [5:03:16<33:04:12, 6.22s/it] {'loss': 0.5094, 'grad_norm': 0.5386892240679365, 'learning_rate': 9.723345458039595e-06, 'epoch': 0.13}
13%|█▎ | 2945/22095 [5:03:20<30:11:08, 5.67s/it] {'loss': 0.4535, 'grad_norm': 0.9176692907509812, 'learning_rate': 9.723104990982469e-06, 'epoch': 0.13}
13%|█▎ | 2946/22095 [5:03:23<25:49:45, 4.86s/it] {'loss': 0.4384, 'grad_norm': 0.7598831526931533, 'learning_rate': 9.722864422440323e-06, 'epoch': 0.13}
13%|█▎ | 2947/22095 [5:03:27<23:36:15, 4.44s/it] {'loss': 0.4696, 'grad_norm': 0.6886217263953099, 'learning_rate': 9.722623752418327e-06, 'epoch': 0.13}
13%|█▎ | 2948/22095 [5:03:30<21:11:40, 3.98s/it] {'loss': 0.4127, 'grad_norm': 0.7236203182922575, 'learning_rate': 9.722382980921649e-06, 'epoch': 0.13}
13%|█▎ | 2949/22095 [5:03:33<19:56:23, 3.75s/it] {'loss': 0.4252, 'grad_norm': 0.7866624324006298, 'learning_rate': 9.722142107955465e-06, 'epoch': 0.13}
13%|█▎ | 2950/22095 [5:03:36<19:26:57, 3.66s/it] {'loss': 0.427, 'grad_norm': 0.7674053504076765, 'learning_rate': 9.721901133524951e-06, 'epoch': 0.13}
13%|█▎ | 2951/22095 [5:03:40<19:31:30, 3.67s/it] {'loss': 0.4098, 'grad_norm': 0.7596085549378919, 'learning_rate': 9.721660057635284e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8300723 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB17ggQcJHO8KJjSZFtXXchfXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n从图片中辨认文字。'}, {'from': 'gpt', 'value': '图中所有文字:\nE14大红\nE27尖泡\nE14中红\nE12中红\nE27圆泡\nE12透明\nB22圆泡\nE12红色\n钨丝款\n好家依生活馆\nhttps://shop108874005.taobao.com/'}]}
13%|█▎ | 2952/22095 [5:03:44<20:29:34, 3.85s/it] {'loss': 0.3913, 'grad_norm': 0.7883147091388137, 'learning_rate': 9.721418880291642e-06, 'epoch': 0.13}
13%|█▎ | 2953/22095 [5:03:49<21:05:13, 3.97s/it] {'loss': 0.384, 'grad_norm': 0.7018500754706035, 'learning_rate': 9.72117760149921e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (50326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108612 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2954/22095 [5:03:52<20:01:19, 3.77s/it] {'loss': 0.4403, 'grad_norm': 0.6791992191941137, 'learning_rate': 9.720936221263174e-06, 'epoch': 0.13}
13%|█▎ | 2955/22095 [5:03:55<19:46:37, 3.72s/it] {'loss': 0.4036, 'grad_norm': 0.7247397511156864, 'learning_rate': 9.720694739588714e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2956/22095 [5:03:58<18:31:43, 3.49s/it] {'loss': 0.4059, 'grad_norm': 0.7815257711538628, 'learning_rate': 9.720453156481023e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [131, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8509112 in VC:s3://internvl-moe-sft-data/. Exception: Image size [131, 25, 100, 100] is too small. Minimum size is 28.
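Each of these dataset failures is logged as "[Try #0] Failed to fetch sample ..." and the run keeps going, which implies `__getitem__` catches the exception from `_get_item` and retries with a substitute sample rather than crashing the job. A self-contained sketch of that pattern; the class, the deterministic next-index fallback, and the toy size check inside `_get_item` are all assumptions, not the actual data_qwen_2.py implementation:

```python
class RetryingDataset:
    """Hedged sketch of a fault-tolerant __getitem__: log a bad sample and
    fall back to another index instead of killing the training run."""

    def __init__(self, samples, max_tries=10):
        self.samples = samples
        self.max_tries = max_tries

    def _get_item(self, i):
        # Stand-in validation; the real loader checks image sizes, token
        # counts, etc. before returning a sample.
        sample = self.samples[i]
        w, h = sample.get("image_wh", (28, 28))
        if w < 28 or h < 28:
            raise ValueError(f"Image size [{w}, {h}] is too small. Minimum size is 28.")
        return sample

    def __getitem__(self, i):
        for attempt in range(self.max_tries):
            try:
                return self._get_item(i)
            except Exception as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {i}. Exception: {exc}")
                i = (i + 1) % len(self.samples)  # assumed fallback policy
        raise RuntimeError("exhausted retries; too many corrupt samples")
```

The trade-off is silent distribution shift: every skipped sample is replaced by a valid one, so persistent "[Try #0]" lines are worth fixing at the data-cleaning stage rather than leaving to the fallback.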
Problematic sample: {'id': 24628, 'image': 'vrdu_texteq/astro-ph.CO/16dd65ee-25f1-4fe5-8921-263776d8250a.png', 'image_wh': [[131, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'for $n_{v} < 1$.'}]}
13%|█▎ | 2957/22095 [5:04:01<17:35:05, 3.31s/it] {'loss': 0.4463, 'grad_norm': 0.7438749116316435, 'learning_rate': 9.720211471945293e-06, 'epoch': 0.13}
13%|█▎ | 2958/22095 [5:04:04<16:55:17, 3.18s/it] {'loss': 0.4146, 'grad_norm': 0.7237329330053868, 'learning_rate': 9.719969685986714e-06, 'epoch': 0.13}
13%|█▎ | 2959/22095 [5:04:07<16:27:27, 3.10s/it] {'loss': 0.442, 'grad_norm': 0.9573335901450759, 'learning_rate': 9.719727798610483e-06, 'epoch': 0.13}
13%|█▎ | 2960/22095 [5:04:11<17:12:34, 3.24s/it] {'loss': 0.4202, 'grad_norm': 0.9656716511929738, 'learning_rate': 9.719485809821799e-06, 'epoch': 0.13}
13%|█▎ | 2961/22095 [5:04:14<16:56:17, 3.19s/it] {'loss': 0.3783, 'grad_norm': 0.8338597559468536, 'learning_rate': 9.719243719625857e-06, 'epoch': 0.13}
Invalidate trace cache @ step 2: expected module 1, but got module 364
13%|█▎ | 2962/22095 [5:04:22<25:30:11, 4.80s/it] {'loss': 0.5274, 'grad_norm': 0.6863750264693458, 'learning_rate': 9.719001528027863e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2963/22095 [5:04:25<22:55:00, 4.31s/it] {'loss': 0.4137, 'grad_norm': 0.7985367660393604, 'learning_rate': 9.71875923503302e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (49012 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97780 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73560 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2964/22095 [5:04:29<21:04:25, 3.97s/it] {'loss': 0.4091, 'grad_norm': 0.9118946431924335, 'learning_rate': 9.718516840646533e-06, 'epoch': 0.13}
13%|█▎ | 2965/22095 [5:04:31<19:13:30, 3.62s/it] {'loss': 0.4263, 'grad_norm': 0.7934119150445078, 'learning_rate': 9.71827434487361e-06, 'epoch': 0.13}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8382448 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49242, 'image': 'vrdu_table_final_2/astro-ph.CO/363f11fe-4923-45ed-8293-3eb93ddacf80.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
13%|█▎ | 2966/22095 [5:04:35<19:57:06, 3.75s/it] {'loss': 0.4382, 'grad_norm': 0.7735298280481725, 'learning_rate': 9.718031747719465e-06, 'epoch': 0.13}
13%|█▎ | 2967/22095 [5:04:38<18:32:25, 3.49s/it] {'loss': 0.4435, 'grad_norm': 0.9513330875037518, 'learning_rate': 9.717789049189306e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (42017 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115319 > 40960). Running this sequence through the model will result in indexing errors
13%|█▎ | 2968/22095 [5:04:42<18:21:28, 3.46s/it] {'loss': 0.4049, 'grad_norm': 0.6909982648821564, 'learning_rate': 9.71754624928835e-06, 'epoch': 0.13}
Token indices sequence length is longer than the specified maximum sequence length for this model (68204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47059 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84248 > 40960).
Running this sequence through the model will result in indexing errors 13%|█▎ | 2969/22095 [5:04:45<18:11:53, 3.43s/it] {'loss': 0.395, 'grad_norm': 0.8679915662733991, 'learning_rate': 9.717303348021814e-06, 'epoch': 0.13} 13%|█▎ | 2969/22095 [5:04:45<18:11:53, 3.43s/it] 13%|█▎ | 2970/22095 [5:04:48<17:19:31, 3.26s/it] {'loss': 0.4044, 'grad_norm': 0.8847599037454432, 'learning_rate': 9.717060345394917e-06, 'epoch': 0.13} 13%|█▎ | 2970/22095 [5:04:48<17:19:31, 3.26s/it] 13%|█▎ | 2971/22095 [5:04:51<16:52:04, 3.18s/it] {'loss': 0.3957, 'grad_norm': 0.7118812380117908, 'learning_rate': 9.716817241412882e-06, 'epoch': 0.13} 13%|█▎ | 2971/22095 [5:04:51<16:52:04, 3.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 13%|█▎ | 2972/22095 [5:04:59<24:18:08, 4.58s/it] {'loss': 0.5091, 'grad_norm': 0.6306224960599156, 'learning_rate': 9.71657403608093e-06, 'epoch': 0.13} 13%|█▎ | 2972/22095 [5:04:59<24:18:08, 4.58s/it] 13%|█▎ | 2973/22095 [5:05:02<22:08:39, 4.17s/it] {'loss': 0.4114, 'grad_norm': 0.7622301601596669, 'learning_rate': 9.716330729404287e-06, 'epoch': 0.13} 13%|█▎ | 2973/22095 [5:05:02<22:08:39, 4.17s/it] 13%|█▎ | 2974/22095 [5:05:05<20:29:41, 3.86s/it] {'loss': 0.3871, 'grad_norm': 0.7484186601210657, 'learning_rate': 9.716087321388184e-06, 'epoch': 0.13} 13%|█▎ | 2974/22095 [5:05:05<20:29:41, 3.86s/it] 13%|█▎ | 2975/22095 [5:05:08<18:52:36, 3.55s/it] {'loss': 0.4587, 'grad_norm': 0.7625419668353308, 'learning_rate': 9.715843812037846e-06, 'epoch': 0.13} 13%|█▎ | 2975/22095 [5:05:08<18:52:36, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41384 > 40960). 
Running this sequence through the model will result in indexing errors 13%|█▎ | 2976/22095 [5:05:15<23:48:02, 4.48s/it] {'loss': 0.4977, 'grad_norm': 0.35951379933644734, 'learning_rate': 9.71560020135851e-06, 'epoch': 0.13} 13%|█▎ | 2976/22095 [5:05:15<23:48:02, 4.48s/it] 13%|█▎ | 2977/22095 [5:05:23<29:45:05, 5.60s/it] {'loss': 0.5211, 'grad_norm': 0.3442069853028463, 'learning_rate': 9.715356489355408e-06, 'epoch': 0.13} 13%|█▎ | 2977/22095 [5:05:23<29:45:05, 5.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 13%|█▎ | 2978/22095 [5:05:26<26:17:38, 4.95s/it] {'loss': 0.3668, 'grad_norm': 0.7327349225883127, 'learning_rate': 9.715112676033777e-06, 'epoch': 0.13} 13%|█▎ | 2978/22095 [5:05:26<26:17:38, 4.95s/it] 13%|█▎ | 2979/22095 [5:05:30<24:06:21, 4.54s/it] {'loss': 0.4397, 'grad_norm': 0.6963321159595294, 'learning_rate': 9.714868761398856e-06, 'epoch': 0.13} 13%|█▎ | 2979/22095 [5:05:30<24:06:21, 4.54s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [1078, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8461097 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1078, 25, 100, 100] is too small. Minimum size is 28. 
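The `ValueError: Image size [...] is too small. Minimum size is 28.` entries above come from a minimum-side check in `_get_item` of `data_qwen_2.py`. A minimal sketch of such a guard follows; the function names and the exact message layout are assumptions, not the actual finetune code, but the threshold of 28 matches the log (Qwen2.5-VL's vision encoder merges 14x14 patches 2x2, so a side below 28 pixels cannot produce even one visual token).

```python
# Hypothetical sketch of the image-size guard behind the logged ValueError.
# The real check lives in qwenvl/data/data_qwen_2.py; names here are assumed.

MIN_IMAGE_SIDE = 28  # smallest side the vision encoder can turn into a patch


def check_image_size(width: int, height: int, min_side: int = MIN_IMAGE_SIDE) -> None:
    """Raise if either side of the image is below the minimum patch size."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_side}."
        )


def is_trainable(width: int, height: int) -> bool:
    """True if the sample's image passes the size guard, False otherwise."""
    try:
        check_image_size(width, height)
        return True
    except ValueError:
        return False
```

Pre-filtering `image_wh` with a check like this before training would turn each retry/traceback above into a one-time dataset cleanup step.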
Problematic sample: {'id': 150547, 'image': 'vrdu_texteq/astro-ph.CO/7a43a68d-0414-4cbf-b4a0-9b67e8fb668e.png', 'image_wh': [[1078, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'Since $\\Lambda_2$ is constant in time we do not need to consider it when we introduce the $\\pi$ field.'}]}
13%|█▎ | 2980/22095 [5:05:33<22:05:46, 4.16s/it] {'loss': 0.4309, 'grad_norm': 0.7512751684699605, 'learning_rate': 9.714624745455885e-06, 'epoch': 0.13}
13%|█▎ | 2981/22095 [5:05:38<22:53:40, 4.31s/it] {'loss': 0.4231, 'grad_norm': 0.6640385044151759, 'learning_rate': 9.71438062821011e-06, 'epoch': 0.13}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
13%|█▎ | 2982/22095 [5:05:41<20:50:35, 3.93s/it] {'loss': 0.3837, 'grad_norm': 0.7347924559093464, 'learning_rate': 9.714136409666773e-06, 'epoch': 0.13}
14%|█▎ | 2983/22095 [5:05:44<20:21:13, 3.83s/it] {'loss': 0.3813, 'grad_norm': 0.7153839496940779, 'learning_rate': 9.713892089831122e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▎ | 2984/22095 [5:05:52<26:57:27, 5.08s/it] {'loss': 0.5242, 'grad_norm': 0.6314896443694339, 'learning_rate': 9.71364766870841e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (52445 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72197 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91702 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 2985/22095 [5:05:56<24:11:22, 4.56s/it] {'loss': 0.3912, 'grad_norm': 0.7938527436789027, 'learning_rate': 9.713403146303885e-06, 'epoch': 0.14}
14%|█▎ | 2986/22095 [5:05:59<22:32:57, 4.25s/it] {'loss': 0.4341, 'grad_norm': 0.7578285470729604, 'learning_rate': 9.713158522622804e-06, 'epoch': 0.14}
14%|█▎ | 2987/22095 [5:06:03<21:19:01, 4.02s/it] {'loss': 0.4769, 'grad_norm': 0.7256290252232681, 'learning_rate': 9.71291379767042e-06, 'epoch': 0.14}
14%|█▎ | 2988/22095 [5:06:06<20:07:33, 3.79s/it] {'loss': 0.4145, 'grad_norm': 0.705979909773582, 'learning_rate': 9.712668971451996e-06, 'epoch': 0.14}
14%|█▎ | 2989/22095 [5:06:10<20:13:55, 3.81s/it] {'loss': 0.4247, 'grad_norm': 0.7603561663218684, 'learning_rate': 9.712424043972786e-06, 'epoch': 0.14}
14%|█▎ | 2990/22095 [5:06:13<18:56:50, 3.57s/it] {'loss': 0.4629, 'grad_norm': 0.7209131814522719, 'learning_rate': 9.712179015238058e-06, 'epoch': 0.14}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
14%|█▎ | 2991/22095 [5:06:16<18:22:09, 3.46s/it] {'loss': 0.443, 'grad_norm': 0.685193750110945, 'learning_rate': 9.711933885253076e-06, 'epoch': 0.14}
14%|█▎ | 2992/22095 [5:06:20<19:33:09, 3.68s/it] {'loss': 0.4122, 'grad_norm': 0.6841066749635569, 'learning_rate': 9.711688654023105e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41330 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 2993/22095 [5:06:30<28:37:20, 5.39s/it] {'loss': 0.4883, 'grad_norm': 0.5869559059563596, 'learning_rate': 9.711443321553415e-06, 'epoch': 0.14}
14%|█▎ | 2994/22095 [5:06:34<26:31:54, 5.00s/it] {'loss': 0.4483, 'grad_norm': 0.7553318168260519, 'learning_rate': 9.71119788784928e-06, 'epoch': 0.14}
14%|█▎ | 2995/22095 [5:06:37<24:23:41, 4.60s/it] {'loss': 0.4336, 'grad_norm': 0.7287364896534111, 'learning_rate': 9.71095235291597e-06, 'epoch': 0.14}
14%|█▎ | 2996/22095 [5:06:41<22:58:55, 4.33s/it] {'loss': 0.4295, 'grad_norm': 0.7214137149091241, 'learning_rate': 9.710706716758765e-06, 'epoch': 0.14}
14%|█▎ | 2997/22095 [5:06:44<21:23:46, 4.03s/it] {'loss': 0.411, 'grad_norm': 0.7463034267148326, 'learning_rate': 9.710460979382938e-06, 'epoch': 0.14}
14%|█▎ | 2998/22095 [5:06:47<19:48:05, 3.73s/it] {'loss': 0.3998, 'grad_norm': 0.7686412440469463, 'learning_rate': 9.710215140793774e-06, 'epoch': 0.14}
14%|█▎ | 2999/22095 [5:06:50<18:19:29, 3.45s/it] {'loss': 0.3754, 'grad_norm': 0.9299126331208909, 'learning_rate': 9.709969200996551e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (76670 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (53407 > 40960) for 4 sample(s). Truncating to 29025 with 2 samples.
14%|█▎ | 3000/22095 [5:06:54<18:52:27, 3.56s/it] {'loss': 0.443, 'grad_norm': 0.7395976891382515, 'learning_rate': 9.709723159996556e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▎ | 3001/22095 [5:07:00<22:55:12, 4.32s/it] {'loss': 0.5077, 'grad_norm': 0.4746397495528787, 'learning_rate': 9.709477017799076e-06, 'epoch': 0.14}
14%|█▎ | 3002/22095 [5:07:03<21:07:04, 3.98s/it] {'loss': 0.4328, 'grad_norm': 0.7794278734782368, 'learning_rate': 9.709230774409397e-06, 'epoch': 0.14}
14%|█▎ | 3003/22095 [5:07:07<20:12:10, 3.81s/it] {'loss': 0.3932, 'grad_norm': 0.649400693345909, 'learning_rate': 9.708984429832815e-06, 'epoch': 0.14}
14%|█▎ | 3004/22095 [5:07:13<23:33:29, 4.44s/it] {'loss': 0.425, 'grad_norm': 0.7243280492242261, 'learning_rate': 9.708737984074616e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▎ | 3005/22095 [5:07:16<21:42:58, 4.10s/it] {'loss': 0.4414, 'grad_norm': 0.7567664983915044, 'learning_rate': 9.708491437140103e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (44456 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3006/22095 [5:07:20<21:31:36, 4.06s/it] {'loss': 0.438, 'grad_norm': 0.7045563573152411, 'learning_rate': 9.708244789034568e-06, 'epoch': 0.14}
14%|█▎ | 3007/22095 [5:07:24<20:44:10, 3.91s/it] {'loss': 0.4046, 'grad_norm': 0.6520952845285576, 'learning_rate': 9.707998039763315e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▎ | 3008/22095 [5:07:33<28:51:56, 5.44s/it] {'loss': 0.5201, 'grad_norm': 0.521269484352832, 'learning_rate': 9.707751189331642e-06, 'epoch': 0.14}
14%|█▎ | 3009/22095 [5:07:37<26:42:55, 5.04s/it] {'loss': 0.4759, 'grad_norm': 0.7179054327234664, 'learning_rate': 9.707504237744854e-06, 'epoch': 0.14}
14%|█▎ | 3010/22095 [5:07:40<23:31:08, 4.44s/it] {'loss': 0.4593, 'grad_norm': 0.6624657232936156, 'learning_rate': 9.707257185008259e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▎ | 3011/22095 [5:07:43<22:23:49, 4.22s/it] {'loss': 0.4301, 'grad_norm': 0.9881809471220743, 'learning_rate': 9.707010031127164e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8878404 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1557, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 6\nB. 2\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
14%|█▎ | 3012/22095 [5:07:48<22:10:56, 4.18s/it] {'loss': 0.4417, 'grad_norm': 0.7301209156664, 'learning_rate': 9.70676277610688e-06, 'epoch': 0.14}
14%|█▎ | 3013/22095 [5:07:51<20:49:14, 3.93s/it] {'loss': 0.4054, 'grad_norm': 0.9904205272743063, 'learning_rate': 9.70651541995272e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▎ | 3014/22095 [5:07:59<27:52:56, 5.26s/it] {'loss': 0.5211, 'grad_norm': 0.470084247855341, 'learning_rate': 9.706267962669999e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (47722 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45608 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3015/22095 [5:08:02<24:42:40, 4.66s/it] {'loss': 0.4135, 'grad_norm': 0.7041348282498282, 'learning_rate': 9.706020404264033e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (46910 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3016/22095 [5:08:06<22:50:26, 4.31s/it] {'loss': 0.4016, 'grad_norm': 0.6653132246514796, 'learning_rate': 9.705772744740142e-06, 'epoch': 0.14}
14%|█▎ | 3017/22095 [5:08:09<20:41:39, 3.90s/it] {'loss': 0.4183, 'grad_norm': 0.6466986946363746, 'learning_rate': 9.705524984103647e-06, 'epoch': 0.14}
14%|█▎ | 3018/22095 [5:08:12<19:46:12, 3.73s/it] {'loss': 0.4253, 'grad_norm': 0.6928442879758144, 'learning_rate': 9.705277122359871e-06, 'epoch': 0.14}
14%|█▎ | 3019/22095 [5:08:15<18:20:02, 3.46s/it] {'loss': 0.4113, 'grad_norm': 0.6640542362916704, 'learning_rate': 9.705029159514143e-06, 'epoch': 0.14}
14%|█▎ | 3020/22095 [5:08:18<17:51:48, 3.37s/it] {'loss': 0.4088, 'grad_norm': 0.6814633476218547, 'learning_rate': 9.704781095571788e-06, 'epoch': 0.14}
14%|█▎ | 3021/22095 [5:08:22<17:59:34, 3.40s/it] {'loss': 0.3909, 'grad_norm': 0.706195642860411, 'learning_rate': 9.704532930538137e-06, 'epoch': 0.14}
14%|█▎ | 3022/22095 [5:08:25<17:04:35, 3.22s/it] {'loss': 0.397, 'grad_norm': 0.8893079768742019, 'learning_rate': 9.704284664418521e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (97394 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71369 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3023/22095 [5:08:33<26:09:51, 4.94s/it] {'loss': 0.4932, 'grad_norm': 0.4988443481670754, 'learning_rate': 9.704036297218278e-06, 'epoch': 0.14}
14%|█▎ | 3024/22095 [5:08:43<33:03:04, 6.24s/it] {'loss': 0.5055, 'grad_norm': 0.3946010923966299, 'learning_rate': 9.70378782894274e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 364, but got module 1
14%|█▎ | 3025/22095 [5:08:46<28:15:42, 5.34s/it] {'loss': 0.438, 'grad_norm': 0.7658283547259195, 'learning_rate': 9.70353925959725e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▎ | 3026/22095 [5:08:51<27:10:06, 5.13s/it] {'loss': 0.3792, 'grad_norm': 0.7081223389869151, 'learning_rate': 9.703290589187146e-06, 'epoch': 0.14}
14%|█▎ | 3027/22095 [5:08:54<24:32:49, 4.63s/it] {'loss': 0.4005, 'grad_norm': 0.7429936786885113, 'learning_rate': 9.703041817717773e-06, 'epoch': 0.14}
14%|█▎ | 3028/22095 [5:08:58<23:36:20, 4.46s/it] {'loss': 0.4689, 'grad_norm': 0.9040521372243512, 'learning_rate': 9.702792945194475e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [234, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8522403 in VC:s3://internvl-moe-sft-data/. Exception: Image size [234, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 72278, 'image': 'vrdu_texteq/astro-ph.CO/94bdc6fc-e75e-4180-8960-718abb06a3cd.png', 'image_wh': [[234, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'As a result $\\bar H > {\\cal H}$.'}]}
14%|█▎ | 3029/22095 [5:09:01<21:00:26, 3.97s/it] {'loss': 0.4277, 'grad_norm': 0.7365739004941558, 'learning_rate': 9.7025439716226e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (80016 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87277 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122214 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3030/22095 [5:09:05<20:55:15, 3.95s/it] {'loss': 0.3984, 'grad_norm': 0.6711554139649331, 'learning_rate': 9.702294897007499e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▎ | 3031/22095 [5:09:14<29:43:08, 5.61s/it] {'loss': 0.5139, 'grad_norm': 0.7033004233347675, 'learning_rate': 9.702045721354521e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (47515 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3032/22095 [5:09:17<25:47:14, 4.87s/it] {'loss': 0.4476, 'grad_norm': 0.9480372201213745, 'learning_rate': 9.701796444669022e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▎ | 3033/22095 [5:09:21<23:45:26, 4.49s/it] {'loss': 0.4125, 'grad_norm': 0.7158231088813505, 'learning_rate': 9.701547066956359e-06, 'epoch': 0.14}
14%|█▎ | 3034/22095 [5:09:24<21:11:12, 4.00s/it] {'loss': 0.3784, 'grad_norm': 0.7652456510075898, 'learning_rate': 9.701297588221888e-06, 'epoch': 0.14}
14%|█▎ | 3035/22095 [5:09:27<19:21:07, 3.66s/it] {'loss': 0.4084, 'grad_norm': 0.9602938075922123, 'learning_rate': 9.701048008470972e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8344074 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
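The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines are the standard Hugging Face tokenizer warning: the tokenizer's `model_max_length` is 40960 here, and sequences beyond it will break the embedding lookup unless something downstream truncates them (the "Rank 0: ... Truncating to 29025" line shows this run does truncate later). A minimal sketch of a pre-batching guard follows; the function name is illustrative, not from the training code.

```python
# Hedged sketch of a length guard applied before batching, assuming the
# model_max_length of 40960 seen in the warnings above. Names are assumed.

MAX_LEN = 40960


def clip_to_max_len(input_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Truncate a token-id sequence so it never exceeds the model's context.

    The logged run warns and defers truncation; clipping up front avoids the
    "indexing errors" the warning refers to.
    """
    if len(input_ids) > max_len:
        return input_ids[:max_len]
    return input_ids
```

Filtering or clipping at dataset-build time would also silence the per-sample warnings that dominate this log.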
Problematic sample: {'id': 10726, 'image': 'vrdu_table_final_2/astro-ph.CO/f6d7151c-25da-4f3f-b639-0747e8f15f34.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
14%|█▎ | 3036/22095 [5:09:30<18:12:36, 3.44s/it] {'loss': 0.423, 'grad_norm': 0.7415008906843575, 'learning_rate': 9.700798327708972e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (42637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46675 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3037/22095 [5:09:33<18:41:51, 3.53s/it] {'loss': 0.43, 'grad_norm': 0.6903867395699205, 'learning_rate': 9.700548545941253e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76308 > 40960). Running this sequence through the model will result in indexing errors
14%|█▎ | 3038/22095 [5:09:43<27:58:49, 5.29s/it] {'loss': 0.4955, 'grad_norm': 0.6979826967977066, 'learning_rate': 9.700298663173183e-06, 'epoch': 0.14}
14%|█▍ | 3039/22095 [5:09:47<25:50:50, 4.88s/it] {'loss': 0.421, 'grad_norm': 0.7044161732381952, 'learning_rate': 9.70004867941013e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929819 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
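The paired "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" messages throughout this log indicate the loader counts image placeholders in a conversation and patches the text when the count disagrees with the number of attached images. A hedged sketch of that repair follows; the `<image>` placeholder convention is standard for Qwen-VL-style data, but the helper name and exact strategy are assumptions, not the actual `data_qwen_2.py` logic.

```python
# Hypothetical sketch of the "Fixed image tokens" repair: ensure the
# conversation carries one "<image>" placeholder per attached image.
# Names and placement strategy are assumed, not taken from the training code.

IMAGE_TOKEN = "<image>"


def fix_image_tokens(conversations: list[dict], num_images: int) -> list[dict]:
    """Return a copy of the conversation with missing placeholders prepended
    to the first human turn."""
    convs = [dict(turn) for turn in conversations]  # don't mutate the sample
    n_tokens = sum(turn["value"].count(IMAGE_TOKEN) for turn in convs)
    missing = num_images - n_tokens
    if missing > 0:
        for turn in convs:
            if turn["from"] == "human":
                turn["value"] = IMAGE_TOKEN * missing + "\n" + turn["value"]
                break
    return convs
```

Under this reading, the samples logged with a bare leading '\n' in the human turn are exactly the ones that trigger the "does not match" message and get patched.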
Problematic sample: {'id': 52972, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3040/22095 [5:09:55<30:21:46, 5.74s/it] {'loss': 0.5152, 'grad_norm': 0.37286787245742103, 'learning_rate': 9.699798594657464e-06, 'epoch': 0.14}
14%|█▍ | 3041/22095 [5:09:59<28:28:56, 5.38s/it] {'loss': 0.445, 'grad_norm': 0.9123876179707686, 'learning_rate': 9.699548408920563e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3042/22095 [5:10:05<29:18:59, 5.54s/it] {'loss': 0.5323, 'grad_norm': 0.6151008314597627, 'learning_rate': 9.6992981222048e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3043/22095 [5:10:09<26:11:51, 4.95s/it] {'loss': 0.4556, 'grad_norm': 0.8045341700582606, 'learning_rate': 9.699047734515554e-06, 'epoch': 0.14}
14%|█▍ | 3044/22095 [5:10:12<23:31:16, 4.44s/it] {'loss': 0.443, 'grad_norm': 0.8138702592007475, 'learning_rate': 9.698797245858202e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41971 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3045/22095 [5:10:22<32:28:06, 6.14s/it] {'loss': 0.4995, 'grad_norm': 0.5004662711445704, 'learning_rate': 9.69854665623813e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (63941 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3046/22095 [5:10:30<35:03:34, 6.63s/it] {'loss': 0.5132, 'grad_norm': 0.4342575012009011, 'learning_rate': 9.698295965660721e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 364, but got module 1
14%|█▍ | 3047/22095 [5:10:33<30:23:28, 5.74s/it] {'loss': 0.389, 'grad_norm': 0.8695634845187036, 'learning_rate': 9.69804517413136e-06, 'epoch': 0.14}
14%|█▍ | 3048/22095 [5:10:37<26:48:14, 5.07s/it] {'loss': 0.4401, 'grad_norm': 0.7739147195274749, 'learning_rate': 9.697794281655439e-06, 'epoch': 0.14}
14%|█▍ | 3049/22095 [5:10:41<25:22:56, 4.80s/it] {'loss': 0.3981, 'grad_norm': 0.7119861534065948, 'learning_rate': 9.697543288238345e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3050/22095 [5:10:44<22:16:38, 4.21s/it] {'loss': 0.4072, 'grad_norm': 0.6663963626978507, 'learning_rate': 9.697292193885475e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (79983 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59774 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59063 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3051/22095 [5:10:47<20:36:47, 3.90s/it] {'loss': 0.3964, 'grad_norm': 0.781265903071915, 'learning_rate': 9.69704099860222e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (113445 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63195 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3052/22095 [5:10:50<18:47:50, 3.55s/it] {'loss': 0.4457, 'grad_norm': 1.2176129605873554, 'learning_rate': 9.696789702393982e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3053/22095 [5:10:59<28:12:40, 5.33s/it] {'loss': 0.5298, 'grad_norm': 0.9365927829037233, 'learning_rate': 9.69653830526616e-06, 'epoch': 0.14}
14%|█▍ | 3054/22095 [5:11:06<29:45:36, 5.63s/it] {'loss': 0.4898, 'grad_norm': 0.5873324580483902, 'learning_rate': 9.696286807224151e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (63622 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3055/22095 [5:11:09<26:42:22, 5.05s/it] {'loss': 0.4818, 'grad_norm': 0.7170843817829563, 'learning_rate': 9.696035208273363e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (101670 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3056/22095 [5:11:13<24:15:35, 4.59s/it] {'loss': 0.4018, 'grad_norm': 0.7445503826919057, 'learning_rate': 9.6957835084192e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3057/22095 [5:11:17<23:37:47, 4.47s/it] {'loss': 0.4229, 'grad_norm': 0.6868032484108987, 'learning_rate': 9.695531707667073e-06, 'epoch': 0.14}
14%|█▍ | 3058/22095 [5:11:20<20:59:42, 3.97s/it] {'loss': 0.4223, 'grad_norm': 0.7126566217183055, 'learning_rate': 9.695279806022391e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (72794 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47425 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61203 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118207 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89506 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129099 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3059/22095 [5:11:30<31:19:04, 5.92s/it] {'loss': 0.5042, 'grad_norm': 1.6047999058425257, 'learning_rate': 9.695027803490565e-06, 'epoch': 0.14}
14%|█▍ | 3060/22095 [5:11:39<35:58:30, 6.80s/it] {'loss': 0.5175, 'grad_norm': 1.1171172770190672, 'learning_rate': 9.694775700077013e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 364, but got module 1
14%|█▍ | 3061/22095 [5:11:42<30:29:58, 5.77s/it] {'loss': 0.3975, 'grad_norm': 0.9032744039406752, 'learning_rate': 9.694523495787149e-06, 'epoch': 0.14}
14%|█▍ | 3062/22095 [5:11:47<27:57:42, 5.29s/it] {'loss': 0.4484, 'grad_norm': 0.7173000975750978, 'learning_rate': 9.694271190626393e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250421/Android/ebay/Cycle_0_Iter_40/images/screenshot-610-1745295899.2475102-before.png 2025-08-27 21:09:45.418728 load time: 1026.85 ms
14%|█▍ | 3063/22095 [5:11:54<30:35:04, 5.79s/it] {'loss': 0.5279, 'grad_norm': 1.0967957259191934, 'learning_rate': 9.694018784600166e-06, 'epoch': 0.14}
14%|█▍ | 3064/22095 [5:11:59<29:39:49, 5.61s/it] {'loss': 0.5089, 'grad_norm': 1.1085504083237756, 'learning_rate': 9.693766277713893e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 364, but got module 1
14%|█▍ | 3065/22095 [5:12:03<27:31:06, 5.21s/it] {'loss': 0.4073, 'grad_norm': 1.0817725333092245, 'learning_rate': 9.693513669972999e-06, 'epoch': 0.14}
14%|█▍ | 3066/22095 [5:12:07<24:56:47, 4.72s/it] {'loss': 0.382, 'grad_norm': 0.7458369430458078, 'learning_rate': 9.69326096138291e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3067/22095 [5:12:16<31:44:41, 6.01s/it] {'loss': 0.531, 'grad_norm': 0.6444122363483363, 'learning_rate': 9.693008151949058e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-27 21:10:15.517083 load time: 1231.64 ms
14%|█▍ | 3068/22095 [5:12:20<29:12:21, 5.53s/it] {'loss': 0.4223, 'grad_norm': 0.8259422450902474, 'learning_rate': 9.692755241676874e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (49213 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73846 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3069/22095 [5:12:23<24:57:57, 4.72s/it] {'loss': 0.4282, 'grad_norm': 0.8397951737371382, 'learning_rate': 9.692502230571792e-06, 'epoch': 0.14}
14%|█▍ | 3070/22095 [5:12:26<22:14:26, 4.21s/it] {'loss': 0.3981, 'grad_norm': 0.7945570794817268, 'learning_rate': 9.69224911863925e-06, 'epoch': 0.14}
14%|█▍ | 3071/22095 [5:12:30<21:24:19, 4.05s/it] {'loss': 0.4371, 'grad_norm': 0.7121357050394432, 'learning_rate': 9.691995905884684e-06, 'epoch': 0.14}
14%|█▍ | 3072/22095 [5:12:33<20:42:10, 3.92s/it] {'loss': 0.4514, 'grad_norm': 0.7504348353548871, 'learning_rate': 9.691742592313537e-06, 'epoch': 0.14}
14%|█▍ | 3073/22095 [5:12:36<18:54:55, 3.58s/it] {'loss': 0.419, 'grad_norm': 0.8473511032416616, 'learning_rate': 9.691489177931253e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (88423 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48821 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120361 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3074/22095 [5:12:40<19:45:36, 3.74s/it] {'loss': 0.4036, 'grad_norm': 0.7824499897745099, 'learning_rate': 9.691235662743273e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (57153 > 40960).
Running this sequence through the model will result in indexing errors 14%|█▍ | 3075/22095 [5:12:43<18:32:47, 3.51s/it] {'loss': 0.4274, 'grad_norm': 0.6989322866124807, 'learning_rate': 9.690982046755048e-06, 'epoch': 0.14} 14%|█▍ | 3075/22095 [5:12:43<18:32:47, 3.51s/it] 14%|█▍ | 3076/22095 [5:12:47<19:33:46, 3.70s/it] {'loss': 0.4167, 'grad_norm': 0.6853412184971148, 'learning_rate': 9.690728329972025e-06, 'epoch': 0.14} 14%|█▍ | 3076/22095 [5:12:47<19:33:46, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3077/22095 [5:12:53<22:53:00, 4.33s/it] {'loss': 0.5224, 'grad_norm': 1.4691170124006308, 'learning_rate': 9.690474512399658e-06, 'epoch': 0.14} 14%|█▍ | 3077/22095 [5:12:53<22:53:00, 4.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59204 > 40960). Running this sequence through the model will result in indexing errors 14%|█▍ | 3078/22095 [5:12:56<21:01:54, 3.98s/it] {'loss': 0.4327, 'grad_norm': 0.8608378530218217, 'learning_rate': 9.690220594043399e-06, 'epoch': 0.14} 14%|█▍ | 3078/22095 [5:12:56<21:01:54, 3.98s/it] 14%|█▍ | 3079/22095 [5:13:00<20:31:27, 3.89s/it] {'loss': 0.4128, 'grad_norm': 0.8612975549057106, 'learning_rate': 9.689966574908704e-06, 'epoch': 0.14} 14%|█▍ | 3079/22095 [5:13:00<20:31:27, 3.89s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_185849_3/images/before_screenshot_10_id_90_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-27 21:11:00.888346 load time: 1013.75 ms 14%|█▍ | 3080/22095 [5:13:04<20:11:04, 3.82s/it] {'loss': 0.3769, 'grad_norm': 1.1479647773489714, 'learning_rate': 9.689712455001031e-06, 'epoch': 0.14} 14%|█▍ | 3080/22095 [5:13:04<20:11:04, 3.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader return img.convert("RGB") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load raise _get_oserror(err_code, encoder=False) OSError: broken data stream when reading image file [Try #0] Failed to fetch sample 7323005 in VC:s3://gui-agent/data_20250623/windows_augment/images. 
Exception: broken data stream when reading image file Problematic sample: {'image': 'autocad/20250508_161646_1/images/before_screenshot_1_id_73_internvl_appearance_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nDetermine the bounding box from: A rectangular button with the text 'Field' displayed in white on a dark gray background. The button has a subtle raised effect and is part of the AutoCAD ribbon interface. It features a small icon above the text that appears to be related to field functionality."}, {'from': 'gpt', 'value': "A rectangular button with the text 'Field' displayed in white on a dark gray background. The button has a subtle raised effect and is part of the AutoCAD ribbon interface. It features a small icon above the text that appears to be related to field functionality.[[592, 410, 606, 451]]"}], 'width': 2704, 'height': 1756} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3081/22095 [5:13:06<18:41:58, 3.54s/it] {'loss': 0.4226, 'grad_norm': 0.7404215174065047, 'learning_rate': 9.689458234325842e-06, 'epoch': 0.14} 14%|█▍ | 3081/22095 [5:13:06<18:41:58, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42754 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45111 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (128750 > 40960). 
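The `OSError: broken data stream when reading image file` above is raised by PIL when a fetched image is corrupt or truncated; the log shows the loader retrying and reporting the failure (`[Try #0] Failed to fetch sample ...`) rather than crashing the run. A minimal sketch of such a retry-then-skip wrapper (the name `fetch_with_retry` is hypothetical, not from the training code):

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def fetch_with_retry(fetch: Callable[[], T], retries: int = 3) -> Optional[T]:
    """Call `fetch`; on OSError (e.g. PIL's 'broken data stream'),
    log a warning and retry. Return None after the last attempt so the
    caller can resample a different item instead of aborting training."""
    for attempt in range(retries):
        try:
            return fetch()
        except OSError as err:
            logging.warning("[Try #%d] image load failed: %s", attempt, err)
    return None
```

Returning `None` (instead of re-raising) matches the behavior visible in the log, where a failed sample is reported and training continues with the next step.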
Running this sequence through the model will result in indexing errors 14%|█▍ | 3082/22095 [5:13:16<27:41:50, 5.24s/it] {'loss': 0.5254, 'grad_norm': 0.7714942920613758, 'learning_rate': 9.689203912888597e-06, 'epoch': 0.14} 14%|█▍ | 3082/22095 [5:13:16<27:41:50, 5.24s/it] 14%|█▍ | 3083/22095 [5:13:19<25:14:29, 4.78s/it] {'loss': 0.3953, 'grad_norm': 0.8908715997971788, 'learning_rate': 9.688949490694762e-06, 'epoch': 0.14} 14%|█▍ | 3083/22095 [5:13:19<25:14:29, 4.78s/it] 14%|█▍ | 3084/22095 [5:13:22<22:19:09, 4.23s/it] {'loss': 0.4468, 'grad_norm': 0.8204552348832636, 'learning_rate': 9.688694967749804e-06, 'epoch': 0.14} 14%|█▍ | 3084/22095 [5:13:22<22:19:09, 4.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3085/22095 [5:13:25<20:37:19, 3.91s/it] {'loss': 0.4333, 'grad_norm': 0.7103680809500149, 'learning_rate': 9.68844034405919e-06, 'epoch': 0.14} 14%|█▍ | 3085/22095 [5:13:25<20:37:19, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (52240 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57443 > 40960). 
Running this sequence through the model will result in indexing errors 14%|█▍ | 3086/22095 [5:13:35<29:17:50, 5.55s/it] {'loss': 0.5296, 'grad_norm': 0.6542833279183922, 'learning_rate': 9.688185619628395e-06, 'epoch': 0.14} 14%|█▍ | 3086/22095 [5:13:35<29:17:50, 5.55s/it] 14%|█▍ | 3087/22095 [5:13:39<26:50:38, 5.08s/it] {'loss': 0.4291, 'grad_norm': 0.7607580376314098, 'learning_rate': 9.687930794462887e-06, 'epoch': 0.14} 14%|█▍ | 3087/22095 [5:13:39<26:50:38, 5.08s/it] 14%|█▍ | 3088/22095 [5:13:42<23:46:56, 4.50s/it] {'loss': 0.426, 'grad_norm': 0.731342029185371, 'learning_rate': 9.687675868568145e-06, 'epoch': 0.14} 14%|█▍ | 3088/22095 [5:13:42<23:46:56, 4.50s/it] 14%|█▍ | 3089/22095 [5:13:45<21:37:15, 4.10s/it] {'loss': 0.4535, 'grad_norm': 0.729612946517145, 'learning_rate': 9.687420841949646e-06, 'epoch': 0.14} 14%|█▍ | 3089/22095 [5:13:45<21:37:15, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3090/22095 [5:13:52<26:37:51, 5.04s/it] {'loss': 0.5057, 'grad_norm': 0.4080284721261495, 'learning_rate': 9.68716571461287e-06, 'epoch': 0.14} 14%|█▍ | 3090/22095 [5:13:52<26:37:51, 5.04s/it] 14%|█▍ | 3091/22095 [5:13:56<25:05:36, 4.75s/it] {'loss': 0.431, 'grad_norm': 0.6791576858736899, 'learning_rate': 9.686910486563297e-06, 'epoch': 0.14} 14%|█▍ | 3091/22095 [5:13:56<25:05:36, 4.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3092/22095 [5:14:00<22:31:49, 4.27s/it] {'loss': 0.4299, 'grad_norm': 0.6886144579501806, 'learning_rate': 9.686655157806412e-06, 'epoch': 0.14} 14%|█▍ | 3092/22095 [5:14:00<22:31:49, 4.27s/it] 14%|█▍ | 3093/22095 [5:14:04<22:03:13, 4.18s/it] {'loss': 0.418, 'grad_norm': 0.717296049557901, 'learning_rate': 9.686399728347704e-06, 'epoch': 0.14} 14%|█▍ | 3093/22095 [5:14:04<22:03:13, 4.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3094/22095 [5:14:13<30:03:57, 5.70s/it] {'loss': 
0.5322, 'grad_norm': 0.4554757341055788, 'learning_rate': 9.686144198192658e-06, 'epoch': 0.14} 14%|█▍ | 3094/22095 [5:14:13<30:03:57, 5.70s/it] 14%|█▍ | 3095/22095 [5:14:17<27:20:38, 5.18s/it] {'loss': 0.4378, 'grad_norm': 0.8239705370084248, 'learning_rate': 9.685888567346765e-06, 'epoch': 0.14} 14%|█▍ | 3095/22095 [5:14:17<27:20:38, 5.18s/it] 14%|█▍ | 3096/22095 [5:14:20<24:50:09, 4.71s/it] {'loss': 0.3947, 'grad_norm': 0.7019855219895466, 'learning_rate': 9.685632835815519e-06, 'epoch': 0.14} 14%|█▍ | 3096/22095 [5:14:20<24:50:09, 4.71s/it] 14%|█▍ | 3097/22095 [5:14:24<22:56:50, 4.35s/it] {'loss': 0.4222, 'grad_norm': 0.6641989821498935, 'learning_rate': 9.685377003604412e-06, 'epoch': 0.14} 14%|█▍ | 3097/22095 [5:14:24<22:56:50, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (117583 > 40960). Running this sequence through the model will result in indexing errors 14%|█▍ | 3098/22095 [5:14:28<22:00:40, 4.17s/it] {'loss': 0.418, 'grad_norm': 0.7060878628909356, 'learning_rate': 9.685121070718946e-06, 'epoch': 0.14} 14%|█▍ | 3098/22095 [5:14:28<22:00:40, 4.17s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [167, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8416948 in VC:s3://internvl-moe-sft-data/. Exception: Image size [167, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 111380, 'image': 'vrdu_texteq/astro-ph.CO/d47f1079-5a33-45f7-83e5-e4bb82b5cb2a.png', 'image_wh': [[167, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'for $r\\geq R$ and'}]} 14%|█▍ | 3099/22095 [5:14:32<21:48:36, 4.13s/it] {'loss': 0.4459, 'grad_norm': 0.7055236199159629, 'learning_rate': 9.684865037164616e-06, 'epoch': 0.14} 14%|█▍ | 3099/22095 [5:14:32<21:48:36, 4.13s/it] 14%|█▍ | 3100/22095 [5:14:35<20:10:11, 3.82s/it] {'loss': 0.3899, 'grad_norm': 0.7136009838012457, 'learning_rate': 9.684608902946926e-06, 'epoch': 0.14} 14%|█▍ | 3100/22095 [5:14:35<20:10:11, 3.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3101/22095 [5:14:38<19:22:17, 3.67s/it] {'loss': 0.417, 'grad_norm': 0.7280440062086467, 'learning_rate': 9.684352668071378e-06, 'epoch': 0.14} 14%|█▍ | 3101/22095 [5:14:38<19:22:17, 3.67s/it] 14%|█▍ | 3102/22095 [5:14:41<18:18:53, 3.47s/it] {'loss': 0.415, 'grad_norm': 0.7027541777851652, 'learning_rate': 9.684096332543477e-06, 'epoch': 0.14} 14%|█▍ | 3102/22095 [5:14:41<18:18:53, 3.47s/it] 14%|█▍ | 3103/22095 [5:14:45<18:42:11, 3.55s/it] {'loss': 0.4969, 'grad_norm': 0.686899495440445, 'learning_rate': 9.683839896368732e-06, 'epoch': 0.14} 14%|█▍ | 3103/22095 [5:14:45<18:42:11, 3.55s/it] 14%|█▍ | 3104/22095 [5:14:48<18:47:37, 3.56s/it] {'loss': 0.4609, 'grad_norm': 0.8245957668931218, 'learning_rate': 9.683583359552654e-06, 'epoch': 0.14} 14%|█▍ | 3104/22095 [5:14:48<18:47:37, 3.56s/it] 14%|█▍ | 3105/22095 [5:14:51<18:00:45, 3.41s/it] {'loss': 0.4388, 'grad_norm': 0.7556686305076334, 'learning_rate': 9.683326722100753e-06, 'epoch': 0.14} 14%|█▍ | 3105/22095 [5:14:51<18:00:45, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3106/22095 [5:15:01<27:20:15, 5.18s/it] {'loss': 0.4911, 'grad_norm': 
0.5379909049373236, 'learning_rate': 9.683069984018545e-06, 'epoch': 0.14} 14%|█▍ | 3106/22095 [5:15:01<27:20:15, 5.18s/it] 14%|█▍ | 3107/22095 [5:15:04<24:54:19, 4.72s/it] {'loss': 0.4007, 'grad_norm': 0.8009003920741268, 'learning_rate': 9.682813145311547e-06, 'epoch': 0.14} 14%|█▍ | 3107/22095 [5:15:04<24:54:19, 4.72s/it] 14%|█▍ | 3108/22095 [5:15:08<22:29:01, 4.26s/it] {'loss': 0.4309, 'grad_norm': 0.7335679247958816, 'learning_rate': 9.682556205985274e-06, 'epoch': 0.14} 14%|█▍ | 3108/22095 [5:15:08<22:29:01, 4.26s/it] 14%|█▍ | 3109/22095 [5:15:16<29:23:25, 5.57s/it] {'loss': 0.4242, 'grad_norm': 0.730711292524861, 'learning_rate': 9.682299166045252e-06, 'epoch': 0.14} 14%|█▍ | 3109/22095 [5:15:16<29:23:25, 5.57s/it] 14%|█▍ | 3110/22095 [5:15:21<27:23:00, 5.19s/it] {'loss': 0.3755, 'grad_norm': 0.7004543383866226, 'learning_rate': 9.682042025497001e-06, 'epoch': 0.14} 14%|█▍ | 3110/22095 [5:15:21<27:23:00, 5.19s/it] 14%|█▍ | 3111/22095 [5:15:24<24:11:00, 4.59s/it] {'loss': 0.4541, 'grad_norm': 1.519595889804459, 'learning_rate': 9.681784784346047e-06, 'epoch': 0.14} 14%|█▍ | 3111/22095 [5:15:24<24:11:00, 4.59s/it] 14%|█▍ | 3112/22095 [5:15:28<23:14:59, 4.41s/it] {'loss': 0.3909, 'grad_norm': 0.661638334038799, 'learning_rate': 9.681527442597916e-06, 'epoch': 0.14} 14%|█▍ | 3112/22095 [5:15:28<23:14:59, 4.41s/it] 14%|█▍ | 3113/22095 [5:15:31<21:23:52, 4.06s/it] {'loss': 0.447, 'grad_norm': 0.6977427321641013, 'learning_rate': 9.681270000258138e-06, 'epoch': 0.14} 14%|█▍ | 3113/22095 [5:15:31<21:23:52, 4.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3114/22095 [5:15:34<19:58:27, 3.79s/it] {'loss': 0.3864, 'grad_norm': 0.7069246418810469, 'learning_rate': 9.681012457332247e-06, 'epoch': 0.14} 14%|█▍ | 3114/22095 [5:15:34<19:58:27, 3.79s/it] 14%|█▍ | 3115/22095 [5:15:38<19:40:34, 3.73s/it] {'loss': 0.4387, 'grad_norm': 0.7266791455837109, 'learning_rate': 9.680754813825774e-06, 
'epoch': 0.14} 14%|█▍ | 3115/22095 [5:15:38<19:40:34, 3.73s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (112140 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45762 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96452 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79499 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3116/22095 [5:15:42<19:55:25, 3.78s/it] {'loss': 0.4425, 'grad_norm': 0.7571786820538708, 'learning_rate': 9.680497069744254e-06, 'epoch': 0.14} 14%|█▍ | 3116/22095 [5:15:42<19:55:25, 3.78s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (116384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47044 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97817 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3117/22095 [5:15:45<18:51:18, 3.58s/it] {'loss': 0.439, 'grad_norm': 0.7102976525155017, 'learning_rate': 9.68023922509323e-06, 'epoch': 0.14} 14%|█▍ | 3117/22095 [5:15:45<18:51:18, 3.58s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (86365 > 40960).
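The recurring "Token indices sequence length is longer than the specified maximum sequence length" message is emitted by the tokenizer when a conversation is encoded without truncation; it is harmless only if the sequence is clipped to the model window before the forward pass. A minimal sketch of that clip (40960 is the window reported by the warning itself; `truncate_ids` is a hypothetical helper, not taken from the training code):

```python
MAX_SEQ_LEN = 40960  # model window reported in the tokenizer warnings above

def truncate_ids(input_ids: list[int], max_len: int = MAX_SEQ_LEN) -> list[int]:
    """Clip token ids to the model window; feeding longer sequences
    would index past the position-embedding table, as the warning says."""
    return input_ids[:max_len]
```

With a Hugging Face-style tokenizer the same effect is usually achieved by passing `truncation=True, max_length=40960` at encode time instead of clipping afterwards.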
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94420 > 40960). Running this sequence through the model will result in indexing errors VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31176.png 2025-08-27 21:13:44.809926 load time: 1114.3 ms 14%|█▍ | 3118/22095 [5:15:49<19:50:40, 3.76s/it] {'loss': 0.4311, 'grad_norm': 0.6973062459504656, 'learning_rate': 9.67998127987824e-06, 'epoch': 0.14} 14%|█▍ | 3118/22095 [5:15:49<19:50:40, 3.76s/it] 14%|█▍ | 3119/22095 [5:15:53<19:35:20, 3.72s/it] {'loss': 0.4194, 'grad_norm': 0.6736480913668162, 'learning_rate': 9.679723234104822e-06, 'epoch': 0.14} 14%|█▍ | 3119/22095 [5:15:53<19:35:20, 3.72s/it] 14%|█▍ | 3120/22095 [5:15:56<19:54:20, 3.78s/it] {'loss': 0.3818, 'grad_norm': 0.6712826154773708, 'learning_rate': 9.679465087778526e-06, 'epoch': 0.14} 14%|█▍ | 3120/22095 [5:15:56<19:54:20, 3.78s/it] 14%|█▍ | 3121/22095 [5:16:00<18:52:22, 3.58s/it] {'loss': 0.4403, 'grad_norm': 0.7766965611884528, 'learning_rate': 9.679206840904898e-06, 'epoch': 0.14} 14%|█▍ | 3121/22095 [5:16:00<18:52:22, 3.58s/it] 14%|█▍ | 3122/22095 [5:16:03<18:12:20, 3.45s/it] {'loss': 0.4386, 'grad_norm': 0.7349713706275844, 'learning_rate': 9.678948493489485e-06, 'epoch': 0.14} 14%|█▍ | 3122/22095 [5:16:03<18:12:20, 3.45s/it] 14%|█▍ | 3123/22095 [5:16:06<17:47:54, 3.38s/it] {'loss': 0.4208, 'grad_norm': 0.6508673734652729, 'learning_rate': 9.67869004553784e-06, 'epoch': 0.14} 14%|█▍ | 3123/22095 [5:16:06<17:47:54, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3124/22095 [5:16:15<27:35:54, 5.24s/it] {'loss': 0.5208, 'grad_norm': 0.728103346866895, 'learning_rate': 9.678431497055515e-06, 'epoch': 0.14} 14%|█▍ | 3124/22095 [5:16:16<27:35:54, 5.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44841 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (124484 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83891 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41709 > 40960). Running this sequence through the model will result in indexing errors 14%|█▍ | 3125/22095 [5:16:21<28:40:34, 5.44s/it] {'loss': 0.5069, 'grad_norm': 0.45853081090677567, 'learning_rate': 9.678172848048067e-06, 'epoch': 0.14} 14%|█▍ | 3125/22095 [5:16:21<28:40:34, 5.44s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 14%|█▍ | 3126/22095 [5:16:25<25:45:32, 4.89s/it] {'loss': 0.3767, 'grad_norm': 0.804412565112675, 'learning_rate': 9.677914098521051e-06, 'epoch': 0.14} 14%|█▍ | 3126/22095 [5:16:25<25:45:32, 4.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79875 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42858 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54320 > 40960). 
Running this sequence through the model will result in indexing errors 14%|█▍ | 3127/22095 [5:16:29<23:57:43, 4.55s/it] {'loss': 0.42, 'grad_norm': 0.6728901882341966, 'learning_rate': 9.677655248480026e-06, 'epoch': 0.14} 14%|█▍ | 3127/22095 [5:16:29<23:57:43, 4.55s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/3f2b3961-0642-4fca-9848-82ff1d70c9af/images/step_0.png 2025-08-27 21:14:29.034636 load time: 1045.75 ms 14%|█▍ | 3128/22095 [5:16:32<21:31:35, 4.09s/it] {'loss': 0.3903, 'grad_norm': 0.7435907240723189, 'learning_rate': 9.67739629793056e-06, 'epoch': 0.14} 14%|█▍ | 3128/22095 [5:16:32<21:31:35, 4.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [45, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366682 in VC:s3://internvl-moe-sft-data/. Exception: Image size [45, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33428, 'image': 'vrdu_table_final_2/astro-ph.CO/be73fcf9-5dcc-45e7-8e0e-179e0f5a3bec.png', 'image_wh': [[45, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$Y_{tot}$\\end{tabular}\n```"}]} 14%|█▍ | 3129/22095 [5:16:35<20:44:10, 3.94s/it] {'loss': 0.4231, 'grad_norm': 0.7145965926168072, 'learning_rate': 9.677137246878212e-06, 'epoch': 0.14} 14%|█▍ | 3129/22095 [5:16:35<20:44:10, 3.94s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047834 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]} 14%|█▍ | 3130/22095 [5:16:39<19:58:05, 3.79s/it] {'loss': 0.4187, 'grad_norm': 0.6718267845844599, 'learning_rate': 9.676878095328547e-06, 'epoch': 0.14} 14%|█▍ | 3130/22095 [5:16:39<19:58:05, 3.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 14%|█▍ | 3131/22095 [5:16:42<18:35:31, 3.53s/it] {'loss': 0.4173, 'grad_norm': 0.7073341669717274, 'learning_rate': 9.67661884328714e-06, 'epoch': 0.14} 14%|█▍ | 3131/22095 [5:16:42<18:35:31, 3.53s/it] 14%|█▍ | 3132/22095 [5:16:45<17:39:27, 3.35s/it] {'loss': 0.4201, 'grad_norm': 0.7355489401608707, 'learning_rate': 9.676359490759554e-06, 'epoch': 0.14} 14%|█▍ | 3132/22095 [5:16:45<17:39:27, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3133/22095 [5:16:54<27:36:54, 5.24s/it] {'loss': 0.5338, 'grad_norm': 1.6407994213832153, 'learning_rate': 9.676100037751366e-06, 'epoch': 0.14} 14%|█▍ | 3133/22095 [5:16:54<27:36:54, 5.24s/it] 14%|█▍ | 3134/22095 [5:16:58<24:34:02, 4.66s/it] {'loss': 0.4435, 'grad_norm': 0.7249211925149227, 'learning_rate': 9.675840484268149e-06, 'epoch': 0.14} 14%|█▍ | 3134/22095 [5:16:58<24:34:02, 4.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3135/22095 [5:17:07<32:10:54, 6.11s/it] {'loss': 0.5272, 'grad_norm': 0.6720350118302825, 'learning_rate': 9.675580830315481e-06, 'epoch': 0.14} 14%|█▍ | 3135/22095 [5:17:07<32:10:54, 6.11s/it] 14%|█▍ | 3136/22095 [5:17:11<28:12:25, 5.36s/it] {'loss': 0.4302, 'grad_norm': 0.7299979256540658, 'learning_rate': 9.67532107589894e-06, 'epoch': 0.14} 14%|█▍ | 3136/22095 [5:17:11<28:12:25, 5.36s/it]Token indices sequence length is longer than the specified 
maximum sequence length for this model (48851 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47165 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45792 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47728 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3137/22095 [5:17:14<24:22:25, 4.63s/it] {'loss': 0.3973, 'grad_norm': 0.727738005887431, 'learning_rate': 9.67506122102411e-06, 'epoch': 0.14} 14%|█▍ | 3137/22095 [5:17:14<24:22:25, 4.63s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [459, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8437285 in VC:s3://internvl-moe-sft-data/. Exception: Image size [459, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 90930, 'image': 'vrdu_texteq/astro-ph.CO/fd696f86-6954-4179-89dd-dbdf42e4d395.png', 'image_wh': [[459, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'with a correlation coefficient $r=0.98$.'}]} 14%|█▍ | 3138/22095 [5:17:17<22:39:28, 4.30s/it] {'loss': 0.423, 'grad_norm': 0.8311652702626384, 'learning_rate': 9.674801265696572e-06, 'epoch': 0.14} 14%|█▍ | 3138/22095 [5:17:17<22:39:28, 4.30s/it] 14%|█▍ | 3139/22095 [5:17:20<20:18:09, 3.86s/it] {'loss': 0.3822, 'grad_norm': 0.6585685880370094, 'learning_rate': 9.674541209921913e-06, 'epoch': 0.14} 14%|█▍ | 3139/22095 [5:17:20<20:18:09, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56464 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43207 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111441 > 40960). Running this sequence through the model will result in indexing errors 14%|█▍ | 3140/22095 [5:17:24<21:09:27, 4.02s/it] {'loss': 0.4129, 'grad_norm': 0.7377137503636694, 'learning_rate': 9.674281053705719e-06, 'epoch': 0.14} 14%|█▍ | 3140/22095 [5:17:24<21:09:27, 4.02s/it] 14%|█▍ | 3141/22095 [5:17:28<21:15:10, 4.04s/it] {'loss': 0.4275, 'grad_norm': 0.7171166966408776, 'learning_rate': 9.67402079705358e-06, 'epoch': 0.14} 14%|█▍ | 3141/22095 [5:17:28<21:15:10, 4.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87360 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70698 > 40960). 
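Several samples in this stretch fail with `ValueError: Image size [...] is too small. Minimum size is 28.`: the vision patchifier requires both image edges to be at least 28 px, and narrow text crops (e.g. 167x25, 45x25, 204x22) violate that. A minimal pre-filter over the sample metadata (the width/height values correspond to the `image_wh` field in the dumps above; `is_usable_image` is a hypothetical helper, not from the training code):

```python
MIN_IMAGE_SIDE = 28  # minimum edge length accepted, per the ValueError above

def is_usable_image(width: int, height: int, min_side: int = MIN_IMAGE_SIDE) -> bool:
    """Return False for images too small to patchify, so such samples
    can be dropped (or their images upscaled) before training rather
    than raising mid-epoch."""
    return width >= min_side and height >= min_side
```

Running this check once over the dataset index would turn these repeated mid-training exceptions into a one-time filtering pass.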
Running this sequence through the model will result in indexing errors 14%|█▍ | 3142/22095 [5:17:31<19:20:12, 3.67s/it] {'loss': 0.4481, 'grad_norm': 0.7429829000995922, 'learning_rate': 9.673760439971091e-06, 'epoch': 0.14} 14%|█▍ | 3142/22095 [5:17:31<19:20:12, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (101297 > 40960). Running this sequence through the model will result in indexing errors 14%|█▍ | 3143/22095 [5:17:34<18:23:29, 3.49s/it] {'loss': 0.4229, 'grad_norm': 0.7112414502918661, 'learning_rate': 9.673499982463846e-06, 'epoch': 0.14} 14%|█▍ | 3143/22095 [5:17:34<18:23:29, 3.49s/it] 14%|█▍ | 3144/22095 [5:17:38<18:21:01, 3.49s/it] {'loss': 0.4215, 'grad_norm': 0.6799300213562083, 'learning_rate': 9.673239424537437e-06, 'epoch': 0.14} 14%|█▍ | 3144/22095 [5:17:38<18:21:01, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 14%|█▍ | 3145/22095 [5:17:47<27:38:54, 5.25s/it] {'loss': 0.5697, 'grad_norm': 2.4514558013874717, 'learning_rate': 9.672978766197468e-06, 'epoch': 0.14} 14%|█▍ | 3145/22095 [5:17:47<27:38:54, 5.25s/it] 14%|█▍ | 3146/22095 [5:17:51<24:54:34, 4.73s/it] {'loss': 0.3968, 'grad_norm': 0.8073367634530231, 'learning_rate': 9.672718007449535e-06, 'epoch': 0.14} 14%|█▍ | 3146/22095 [5:17:51<24:54:34, 4.73s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [617, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8521670 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [617, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 66205, 'image': 'vrdu_texteq/astro-ph.CO/35668618-0db0-4ad5-8ab9-56e586bc3d1c.png', 'image_wh': [[617, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': 'where $z_v$ is the redshift at the time of virialization.'}]}
14%|█▍ | 3147/22095 [5:17:54<22:11:01, 4.21s/it] {'loss': 0.3786, 'grad_norm': 0.7186657373222725, 'learning_rate': 9.672457148299245e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3148/22095 [5:18:03<29:42:56, 5.65s/it] {'loss': 0.5066, 'grad_norm': 0.7936502277208761, 'learning_rate': 9.672196188752201e-06, 'epoch': 0.14}
14%|█▍ | 3149/22095 [5:18:06<25:53:17, 4.92s/it] {'loss': 0.3956, 'grad_norm': 0.7367266440609374, 'learning_rate': 9.67193512881401e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3150/22095 [5:18:13<29:30:33, 5.61s/it] {'loss': 0.51, 'grad_norm': 0.86974007930535, 'learning_rate': 9.671673968490281e-06, 'epoch': 0.14}
14%|█▍ | 3151/22095 [5:18:17<27:00:45, 5.13s/it] {'loss': 0.3905, 'grad_norm': 0.7339821934903883, 'learning_rate': 9.671412707786628e-06, 'epoch': 0.14}
14%|█▍ | 3152/22095 [5:18:21<24:57:12, 4.74s/it] {'loss': 0.4156, 'grad_norm': 0.7741642555970408, 'learning_rate': 9.67115134670866e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3153/22095 [5:18:25<23:24:04, 4.45s/it] {'loss': 0.4323, 'grad_norm': 0.6771007640808087, 'learning_rate': 9.670889885262e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3154/22095 [5:18:30<24:25:28, 4.64s/it] {'loss': 0.5543, 'grad_norm': 1.4497561146456408, 'learning_rate': 9.670628323452259e-06, 'epoch': 0.14}
14%|█▍ | 3155/22095 [5:18:34<22:54:41, 4.35s/it] {'loss': 0.4272, 'grad_norm': 2.3808597659867896, 'learning_rate': 9.670366661285061e-06, 'epoch': 0.14}
14%|█▍ | 3156/22095 [5:18:37<21:01:25, 4.00s/it] {'loss': 0.389, 'grad_norm': 0.6572621235218521, 'learning_rate': 9.670104898766028e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3157/22095 [5:18:41<20:55:29, 3.98s/it] {'loss': 0.4043, 'grad_norm': 0.6716930824211316, 'learning_rate': 9.669843035900783e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3158/22095 [5:18:44<19:56:06, 3.79s/it] {'loss': 0.4865, 'grad_norm': 0.7747608617896153, 'learning_rate': 9.669581072694954e-06, 'epoch': 0.14}
14%|█▍ | 3159/22095 [5:18:48<20:23:15, 3.88s/it] {'loss': 0.4336, 'grad_norm': 0.7170303310627186, 'learning_rate': 9.669319009154169e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3160/22095 [5:18:51<19:37:01, 3.73s/it] {'loss': 0.4216, 'grad_norm': 0.7309134726102187, 'learning_rate': 9.66905684528406e-06, 'epoch': 0.14}
14%|█▍ | 3161/22095 [5:18:55<18:40:02, 3.55s/it] {'loss': 0.4163, 'grad_norm': 0.7802958711128894, 'learning_rate': 9.668794581090257e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (48297 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67108 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63863 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43319 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3162/22095 [5:18:58<18:21:51, 3.49s/it] {'loss': 0.4519, 'grad_norm': 0.8650002567846453, 'learning_rate': 9.6685322165784e-06, 'epoch': 0.14}
14%|█▍ | 3163/22095 [5:19:02<19:14:20, 3.66s/it] {'loss': 0.3696, 'grad_norm': 0.6585549158195979, 'learning_rate': 9.668269751754123e-06, 'epoch': 0.14}
14%|█▍ | 3164/22095 [5:19:05<18:15:23, 3.47s/it] {'loss': 0.3826, 'grad_norm': 0.7047364200382954, 'learning_rate': 9.668007186623068e-06, 'epoch': 0.14}
14%|█▍ | 3165/22095 [5:19:08<17:53:53, 3.40s/it] {'loss': 0.4091, 'grad_norm': 0.6752872962284534, 'learning_rate': 9.667744521190873e-06, 'epoch': 0.14}
14%|█▍ | 3166/22095 [5:19:11<17:16:00, 3.28s/it] {'loss': 0.3918, 'grad_norm': 0.7053220031969246, 'learning_rate': 9.667481755463183e-06, 'epoch': 0.14}
14%|█▍ | 3167/22095 [5:19:15<18:20:10, 3.49s/it] {'loss': 0.4624, 'grad_norm': 0.7531808398595325, 'learning_rate': 9.66721888944565e-06, 'epoch': 0.14}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_2/images/step_0.png 2025-08-27 21:17:15.032281 load time: 1007.35 ms
14%|█▍ | 3168/22095 [5:19:19<18:16:19, 3.48s/it] {'loss': 0.4121, 'grad_norm': 0.7065600576391988, 'learning_rate': 9.666955923143912e-06, 'epoch': 0.14}
14%|█▍ | 3169/22095 [5:19:23<18:56:00, 3.60s/it] {'loss': 0.4169, 'grad_norm': 0.7669130699470585, 'learning_rate': 9.666692856563628e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (62054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42121 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42034 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77429 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90655 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3170/22095 [5:19:32<28:12:10, 5.36s/it] {'loss': 0.5071, 'grad_norm': 1.1338249346765497, 'learning_rate': 9.666429689710447e-06, 'epoch': 0.14}
14%|█▍ | 3171/22095 [5:19:36<25:49:50, 4.91s/it] {'loss': 0.4471, 'grad_norm': 0.6905545961200613, 'learning_rate': 9.666166422590024e-06, 'epoch': 0.14}
14%|█▍ | 3172/22095 [5:19:39<23:29:42, 4.47s/it] {'loss': 0.4686, 'grad_norm': 0.6918574529763974, 'learning_rate': 9.665903055208013e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3173/22095 [5:19:43<21:31:20, 4.09s/it] {'loss': 0.4353, 'grad_norm': 0.725100808369826, 'learning_rate': 9.665639587570079e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (49778 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64629 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3174/22095 [5:19:46<20:07:15, 3.83s/it] {'loss': 0.4346, 'grad_norm': 0.7448320533413242, 'learning_rate': 9.665376019681876e-06, 'epoch': 0.14}
14%|█▍ | 3175/22095 [5:19:49<19:23:07, 3.69s/it] {'loss': 0.3947, 'grad_norm': 0.7240294172659404, 'learning_rate': 9.665112351549074e-06, 'epoch': 0.14}
14%|█▍ | 3176/22095 [5:19:52<18:24:48, 3.50s/it] {'loss': 0.4177, 'grad_norm': 0.7927000090096381, 'learning_rate': 9.664848583177335e-06, 'epoch': 0.14}
14%|█▍ | 3177/22095 [5:19:56<18:31:06, 3.52s/it] {'loss': 0.4181, 'grad_norm': 0.7320172279898588, 'learning_rate': 9.664584714572326e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3178/22095 [5:20:04<25:28:53, 4.85s/it] {'loss': 0.5135, 'grad_norm': 0.5698166164116315, 'learning_rate': 9.664320745739717e-06, 'epoch': 0.14}
14%|█▍ | 3179/22095 [5:20:08<24:02:43, 4.58s/it] {'loss': 0.4467, 'grad_norm': 0.7539947870246922, 'learning_rate': 9.664056676685183e-06, 'epoch': 0.14}
14%|█▍ | 3180/22095 [5:20:11<21:34:38, 4.11s/it] {'loss': 0.4245, 'grad_norm': 0.7213958856579106, 'learning_rate': 9.663792507414393e-06, 'epoch': 0.14}
14%|█▍ | 3181/22095 [5:20:14<19:58:29, 3.80s/it] {'loss': 0.4099, 'grad_norm': 0.6726300749608423, 'learning_rate': 9.663528237933027e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3182/22095 [5:20:23<29:04:35, 5.53s/it] {'loss': 0.5279, 'grad_norm': 0.4822560169478301, 'learning_rate': 9.663263868246762e-06, 'epoch': 0.14}
14%|█▍ | 3183/22095 [5:20:28<27:49:42, 5.30s/it] {'loss': 0.4152, 'grad_norm': 0.9085236049734731, 'learning_rate': 9.662999398361278e-06, 'epoch': 0.14}
14%|█▍ | 3184/22095 [5:20:32<25:30:19, 4.86s/it] {'loss': 0.4515, 'grad_norm': 0.7829003770567539, 'learning_rate': 9.662734828282258e-06, 'epoch': 0.14}
14%|█▍ | 3185/22095 [5:20:35<23:18:40, 4.44s/it] {'loss': 0.4002, 'grad_norm': 0.6575551419625164, 'learning_rate': 9.66247015801539e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (65607 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47753 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109379 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41639 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45455 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109656 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3186/22095 [5:20:38<21:04:34, 4.01s/it] {'loss': 0.4125, 'grad_norm': 0.7181233616587022, 'learning_rate': 9.662205387566355e-06, 'epoch': 0.14}
14%|█▍ | 3187/22095 [5:20:42<20:13:47, 3.85s/it] {'loss': 0.41, 'grad_norm': 0.6990078657681448, 'learning_rate': 9.661940516940846e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3188/22095 [5:20:52<29:26:52, 5.61s/it] {'loss': 0.4915, 'grad_norm': 0.4046658946943539, 'learning_rate': 9.661675546144553e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (86164 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3189/22095 [5:20:55<25:37:16, 4.88s/it] {'loss': 0.4238, 'grad_norm': 0.7822597863670415, 'learning_rate': 9.661410475183169e-06, 'epoch': 0.14}
14%|█▍ | 3190/22095 [5:20:58<23:10:42, 4.41s/it] {'loss': 0.3847, 'grad_norm': 0.691259239024519, 'learning_rate': 9.661145304062391e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8878401 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
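Several events above log "Rank 0: Number of image tokens 0 does not match number of images 1" immediately followed by "Rank 0: Fixed image tokens in the conversation". A minimal sketch of what such a repair step might look like — the function name, return convention, and placeholder placement are assumptions for illustration, not the actual data_qwen_2.py logic:

```python
# Hypothetical sketch: make the number of <image> placeholders in a
# conversation match the number of attached images, prepending any missing
# placeholders to the first human turn (as the "Fixed image tokens" log
# messages above suggest some repair step does). Names are assumed.

def fix_image_tokens(conversations, num_images, token="<image>"):
    """Return (conversations, changed) with `num_images` placeholders total."""
    found = sum(turn["value"].count(token) for turn in conversations)
    if found == num_images:
        return conversations, False  # already consistent, nothing to fix
    fixed = [dict(turn) for turn in conversations]
    missing = num_images - found
    if missing > 0:
        # Prepend the missing placeholders to the first human turn.
        for turn in fixed:
            if turn["from"] == "human":
                turn["value"] = token * missing + "\n" + turn["value"]
                break
    return fixed, True
```

A repair like this only patches the symptom; samples whose annotations lack placeholders entirely may still deserve auditing upstream.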
Problematic sample: {'id': 1554, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 2\nB. 3\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
14%|█▍ | 3191/22095 [5:21:01<21:10:05, 4.03s/it] {'loss': 0.4286, 'grad_norm': 0.7603748825768698, 'learning_rate': 9.660880032787917e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (47070 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57403 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51119 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3192/22095 [5:21:05<20:17:43, 3.87s/it] {'loss': 0.407, 'grad_norm': 0.7192307911925878, 'learning_rate': 9.660614661365446e-06, 'epoch': 0.14}
14%|█▍ | 3193/22095 [5:21:08<19:10:07, 3.65s/it] {'loss': 0.4582, 'grad_norm': 0.7306554858820006, 'learning_rate': 9.660349189800678e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3194/22095 [5:21:17<28:17:31, 5.39s/it] {'loss': 0.5332, 'grad_norm': 0.35736569073643326, 'learning_rate': 9.660083618099321e-06, 'epoch': 0.14}
14%|█▍ | 3195/22095 [5:21:22<26:25:02, 5.03s/it] {'loss': 0.4104, 'grad_norm': 0.7395746670458239, 'learning_rate': 9.659817946267079e-06, 'epoch': 0.14}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
14%|█▍ | 3196/22095 [5:21:25<24:36:40, 4.69s/it] {'loss': 0.4173, 'grad_norm': 0.7604286179825007, 'learning_rate': 9.65955217430966e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (108637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59755 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43568 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55041 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3197/22095 [5:21:35<32:27:33, 6.18s/it] {'loss': 0.4974, 'grad_norm': 0.34860121685691725, 'learning_rate': 9.659286302232776e-06, 'epoch': 0.14}
Token indices sequence length is longer than the specified maximum sequence length for this model (77522 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65970 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78054 > 40960). Running this sequence through the model will result in indexing errors
14%|█▍ | 3198/22095 [5:21:39<28:27:06, 5.42s/it] {'loss': 0.4527, 'grad_norm': 0.691904119059187, 'learning_rate': 9.659020330042139e-06, 'epoch': 0.14}
14%|█▍ | 3199/22095 [5:21:43<26:26:25, 5.04s/it] {'loss': 0.4309, 'grad_norm': 0.697285444271822, 'learning_rate': 9.658754257743465e-06, 'epoch': 0.14}
Invalidate trace cache @ step 2: expected module 1, but got module 364
14%|█▍ | 3200/22095 [5:21:50<29:53:34, 5.70s/it] {'loss': 0.5187, 'grad_norm': 0.3251444549614559, 'learning_rate': 9.65848808534247e-06, 'epoch': 0.14}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307405 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
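The recurring ValueError above ("Image size ... is too small. Minimum size is 28.") comes from a pre-tokenization size check in `_get_item`: Qwen2-VL-style vision preprocessing patchifies images in 28-pixel units, so degenerate entries such as `image_wh == [0, 0]` or a 22-pixel-tall formula crop cannot be encoded. A minimal sketch of such a check, under the assumption that each side must be at least 28 px (the function name and message format here are illustrative, not the actual implementation):

```python
# Hypothetical sketch of the minimum-image-size guard that produces the
# ValueErrors logged above. MIN_SIZE mirrors the "Minimum size is 28" text;
# the function name is an assumption, not taken from data_qwen_2.py.

MIN_SIZE = 28  # smallest side length the vision patchifier can accept

def check_image_size(image_wh, min_size=MIN_SIZE):
    """Raise ValueError for images with a side smaller than `min_size`."""
    w, h = image_wh
    if w < min_size or h < min_size:
        raise ValueError(
            f"Image size {image_wh} is too small. Minimum size is {min_size}."
        )
    return True
```

Running a filter like this over the dataset offline would surface all such samples (e.g. the `[[0, 0]]` MTWI entries, which look like failed downloads) before they cost retries at training time.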
Problematic sample: {'image': 'TB20pSjowvD8KJjy0FlXXagBFXa_!!2034041896.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPerform OCR on this picture, but do not output the coordinates.'}, {'from': 'gpt', 'value': 'All words in the image:\n满就送\n海洋球+双面爬行垫+故事机+收纳筐\n年货节\n大放送\n319\n推不倒双吸盘牢固安全'}]}
14%|█▍ | 3201/22095 [5:21:54<26:54:23, 5.13s/it] {'loss': 0.425, 'grad_norm': 0.6903935648469879, 'learning_rate': 9.658221812844872e-06, 'epoch': 0.14}
14%|█▍ | 3202/22095 [5:21:57<24:19:01, 4.63s/it] {'loss': 0.4405, 'grad_norm': 0.6953138537555082, 'learning_rate': 9.657955440256396e-06, 'epoch': 0.14}
14%|█▍ | 3203/22095 [5:22:01<22:11:04, 4.23s/it] {'loss': 0.4221, 'grad_norm': 0.726536205730736, 'learning_rate': 9.657688967582762e-06, 'epoch': 0.14}
15%|█▍ | 3204/22095 [5:22:04<20:05:11, 3.83s/it] {'loss': 0.4541, 'grad_norm': 0.7074917191163478, 'learning_rate': 9.657422394829697e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▍ | 3205/22095 [5:22:13<29:10:57, 5.56s/it] {'loss': 0.5251, 'grad_norm': 0.3587028380035552, 'learning_rate': 9.65715572200293e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047799 in VC:s3://multi-modal/UniGeo/.
Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 5'}, {'from': 'gpt', 'value': '【解答】解:(1)由点M、N分别是线段AC、BC的中点,得MC=\\frac{1}{2}AC=\\frac{1}{2}×4=2,NC=\\frac{1}{2}BC=\\frac{1}{2}×6=3.由线段的和差,得MN=MC+NC=2+3=5;'}]}
15%|█▍ | 3206/22095 [5:22:24<37:11:41, 7.09s/it] {'loss': 0.5109, 'grad_norm': 0.5153669271065352, 'learning_rate': 9.65688894910819e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
15%|█▍ | 3207/22095 [5:22:27<31:16:45, 5.96s/it] {'loss': 0.4372, 'grad_norm': 0.7331308798946778, 'learning_rate': 9.656622076151208e-06, 'epoch': 0.15}
15%|█▍ | 3208/22095 [5:22:32<29:36:48, 5.64s/it] {'loss': 0.437, 'grad_norm': 0.731716823903968, 'learning_rate': 9.65635510313772e-06, 'epoch': 0.15}
15%|█▍ | 3209/22095 [5:22:36<27:01:54, 5.15s/it] {'loss': 0.3799, 'grad_norm': 0.6501948418255162, 'learning_rate': 9.656088030073462e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (67236 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113517 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3210/22095 [5:22:40<24:32:33, 4.68s/it] {'loss': 0.4147, 'grad_norm': 0.6802252677586819, 'learning_rate': 9.655820856964171e-06, 'epoch': 0.15}
15%|█▍ | 3211/22095 [5:22:44<24:30:55, 4.67s/it] {'loss': 0.4373, 'grad_norm': 0.669078224193458, 'learning_rate': 9.65555358381559e-06, 'epoch': 0.15}
15%|█▍ | 3212/22095 [5:22:48<22:13:33, 4.24s/it] {'loss': 0.3962, 'grad_norm': 0.7840548343002391, 'learning_rate': 9.65528621063346e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (77506 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3213/22095 [5:22:51<21:19:21, 4.07s/it] {'loss': 0.4113, 'grad_norm': 0.6714606210120766, 'learning_rate': 9.655018737423529e-06, 'epoch': 0.15}
15%|█▍ | 3214/22095 [5:22:55<20:58:41, 4.00s/it] {'loss': 0.435, 'grad_norm': 0.764637062360135, 'learning_rate': 9.65475116419154e-06, 'epoch': 0.15}
15%|█▍ | 3215/22095 [5:22:59<20:16:06, 3.86s/it] {'loss': 0.3845, 'grad_norm': 0.6940196373497772, 'learning_rate': 9.654483490943245e-06, 'epoch': 0.15}
15%|█▍ | 3216/22095 [5:23:02<19:13:30, 3.67s/it] {'loss': 0.3827, 'grad_norm': 0.6979860285067682, 'learning_rate': 9.654215717684397e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950719 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1554, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 2\nB. 3\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
15%|█▍ | 3217/22095 [5:23:06<20:15:52, 3.86s/it] {'loss': 0.4263, 'grad_norm': 0.6450860678926025, 'learning_rate': 9.653947844420744e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▍ | 3218/22095 [5:23:16<30:31:14, 5.82s/it] {'loss': 0.4863, 'grad_norm': 0.4631569753271721, 'learning_rate': 9.653679871158048e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8335270 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1883, 'image': 'vrdu_table_final_2/astro-ph.CO/79e03ec1-8f9d-4318-8019-cc5d52f103fb.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]}
15%|█▍ | 3219/22095 [5:23:21<27:49:40, 5.31s/it] {'loss': 0.4481, 'grad_norm': 0.6504102546655056, 'learning_rate': 9.653411797902063e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (71853 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3220/22095 [5:23:24<24:49:01, 4.73s/it] {'loss': 0.4089, 'grad_norm': 0.8385292768004025, 'learning_rate': 9.65314362465855e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42414 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64273 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3221/22095 [5:23:34<33:28:53, 6.39s/it] {'loss': 0.5351, 'grad_norm': 0.3677785772048963, 'learning_rate': 9.652875351433272e-06, 'epoch': 0.15}
15%|█▍ | 3222/22095 [5:23:38<29:41:57, 5.67s/it] {'loss': 0.4031, 'grad_norm': 0.7357209315563944, 'learning_rate': 9.652606978231994e-06, 'epoch': 0.15}
15%|█▍ | 3223/22095 [5:23:41<25:42:29, 4.90s/it] {'loss': 0.4362, 'grad_norm': 0.9786863345150582, 'learning_rate': 9.65233850506048e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8300774 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB189sDazihSKJjy0FfXXbGzFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text is hidden in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n温度数字显示表\n铝合金不沾盘\n不锈钢机身\n台/立两用\n电饼铛\ns\n赠送全套工具,\n老师傅配方教程21世纪商贸8'}]}
15%|█▍ | 3224/22095 [5:23:45<23:06:04, 4.41s/it] {'loss': 0.4401, 'grad_norm': 0.7184510643808878, 'learning_rate': 9.6520699319245e-06, 'epoch': 0.15}
15%|█▍ | 3225/22095 [5:23:48<21:37:11, 4.12s/it] {'loss': 0.3851, 'grad_norm': 0.7416426760575285, 'learning_rate': 9.651801258829827e-06, 'epoch': 0.15}
15%|█▍ | 3226/22095 [5:23:52<21:33:23, 4.11s/it] {'loss': 0.4249, 'grad_norm': 0.6576989128660552, 'learning_rate': 9.651532485782231e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (98320 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48252 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92916 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41901 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3227/22095 [5:23:56<20:30:52, 3.91s/it] {'loss': 0.379, 'grad_norm': 0.7296653820837178, 'learning_rate': 9.651263612787487e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▍ | 3228/22095 [5:24:03<25:25:25, 4.85s/it] {'loss': 0.4942, 'grad_norm': 0.47286884291343695, 'learning_rate': 9.650994639851375e-06, 'epoch': 0.15}
15%|█▍ | 3229/22095 [5:24:13<33:40:29, 6.43s/it] {'loss': 0.5121, 'grad_norm': 0.41271948404533615, 'learning_rate': 9.650725566979671e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
15%|█▍ | 3230/22095 [5:24:16<28:59:36, 5.53s/it] {'loss': 0.442, 'grad_norm': 0.8261333798102216, 'learning_rate': 9.650456394178157e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (44556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44307 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42144 > 40960) for 4 sample(s). Truncating to 624 with 2 samples.
15%|█▍ | 3231/22095 [5:24:19<25:31:04, 4.87s/it] {'loss': 0.426, 'grad_norm': 0.6917499919499427, 'learning_rate': 9.65018712145262e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (60455 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96626 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (162151 > 40960). Running this sequence through the model will result in indexing errors
15%|█▍ | 3232/22095 [5:24:23<23:29:06, 4.48s/it] {'loss': 0.4061, 'grad_norm': 0.7170816180353979, 'learning_rate': 9.649917748808844e-06, 'epoch': 0.15}
15%|█▍ | 3233/22095 [5:24:27<21:51:26, 4.17s/it] {'loss': 0.4426, 'grad_norm': 0.6987359541063993, 'learning_rate': 9.649648276252614e-06, 'epoch': 0.15}
15%|█▍ | 3234/22095 [5:24:30<20:20:33, 3.88s/it] {'loss': 0.4252, 'grad_norm': 0.8070158387970268, 'learning_rate': 9.649378703789724e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▍ | 3235/22095 [5:24:40<29:42:13, 5.67s/it] {'loss': 0.4893, 'grad_norm': 0.6080123077930029, 'learning_rate': 9.649109031425968e-06, 'epoch': 0.15}
15%|█▍ | 3236/22095 [5:24:44<27:56:59, 5.34s/it] {'loss': 0.4406, 'grad_norm': 0.6925974737374918, 'learning_rate': 9.648839259167135e-06, 'epoch': 0.15}
15%|█▍ | 3237/22095 [5:24:47<24:21:42, 4.65s/it] {'loss': 0.3996, 'grad_norm': 0.7773373297030651, 'learning_rate': 9.648569387019025e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (65799 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45440 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3238/22095 [5:24:51<22:37:15, 4.32s/it] {'loss': 0.4379, 'grad_norm': 0.7564126557863732, 'learning_rate': 9.648299414987434e-06, 'epoch': 0.15} 15%|█▍ | 3238/22095 [5:24:51<22:37:15, 4.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 15%|█▍ | 3239/22095 [5:24:58<27:12:33, 5.19s/it] {'loss': 0.5252, 'grad_norm': 0.4150638456805394, 'learning_rate': 9.648029343078167e-06, 'epoch': 0.15} 15%|█▍ | 3239/22095 [5:24:58<27:12:33, 5.19s/it] 15%|█▍ | 3240/22095 [5:25:04<29:08:50, 5.57s/it] {'loss': 0.5406, 'grad_norm': 0.3622096334292757, 'learning_rate': 9.647759171297024e-06, 'epoch': 0.15} 15%|█▍ | 3240/22095 [5:25:04<29:08:50, 5.57s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (74514 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3241/22095 [5:25:09<26:57:40, 5.15s/it] {'loss': 0.3972, 'grad_norm': 0.6823350984312218, 'learning_rate': 9.64748889964981e-06, 'epoch': 0.15} 15%|█▍ | 3241/22095 [5:25:09<26:57:40, 5.15s/it] 15%|█▍ | 3242/22095 [5:25:12<24:48:05, 4.74s/it] {'loss': 0.4485, 'grad_norm': 0.7071456722892714, 'learning_rate': 9.647218528142333e-06, 'epoch': 0.15} 15%|█▍ | 3242/22095 [5:25:12<24:48:05, 4.74s/it] 15%|█▍ | 3243/22095 [5:25:15<21:40:27, 4.14s/it] {'loss': 0.4102, 'grad_norm': 0.6869486414541307, 'learning_rate': 9.646948056780403e-06, 'epoch': 0.15} 15%|█▍ | 3243/22095 [5:25:15<21:40:27, 4.14s/it] 15%|█▍ | 3244/22095 [5:25:18<20:07:20, 3.84s/it] {'loss': 0.3975, 'grad_norm': 0.6827200494044977, 'learning_rate': 9.646677485569834e-06, 'epoch': 0.15} 15%|█▍ | 3244/22095 [5:25:18<20:07:20, 3.84s/it] 15%|█▍ | 3245/22095 [5:25:21<18:27:28, 3.53s/it] {'loss': 0.4023, 'grad_norm': 0.6883900751807166, 'learning_rate': 9.646406814516434e-06, 'epoch': 0.15} 15%|█▍ | 3245/22095 [5:25:21<18:27:28, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (81283 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3246/22095 [5:25:30<27:35:34, 5.27s/it] {'loss': 0.4943, 'grad_norm': 0.5684018991219494, 'learning_rate': 9.646136043626023e-06, 'epoch': 0.15} 15%|█▍ | 3246/22095 [5:25:30<27:35:34, 5.27s/it] 15%|█▍ | 3247/22095 [5:25:34<24:15:43, 4.63s/it] {'loss': 0.4079, 'grad_norm': 0.6624837027292081, 'learning_rate': 9.645865172904418e-06, 'epoch': 0.15} 15%|█▍ | 3247/22095 [5:25:34<24:15:43, 4.63s/it] 15%|█▍ | 3248/22095 [5:25:37<21:45:33, 4.16s/it] {'loss': 0.4176, 'grad_norm': 0.6713274806582772, 'learning_rate': 9.645594202357438e-06, 'epoch': 0.15} 15%|█▍ | 3248/22095 [5:25:37<21:45:33, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50248 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54396 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3249/22095 [5:25:46<30:06:25, 5.75s/it] {'loss': 0.4997, 'grad_norm': 0.34889750817078513, 'learning_rate': 9.645323131990908e-06, 'epoch': 0.15} 15%|█▍ | 3249/22095 [5:25:46<30:06:25, 5.75s/it] 15%|█▍ | 3250/22095 [5:25:49<26:00:41, 4.97s/it] {'loss': 0.4072, 'grad_norm': 0.7653840453006011, 'learning_rate': 9.64505196181065e-06, 'epoch': 0.15} 15%|█▍ | 3250/22095 [5:25:49<26:00:41, 4.97s/it] 15%|█▍ | 3251/22095 [5:25:53<24:34:09, 4.69s/it] {'loss': 0.3849, 'grad_norm': 0.6156596928615148, 'learning_rate': 9.644780691822491e-06, 'epoch': 0.15} 15%|█▍ | 3251/22095 [5:25:53<24:34:09, 4.69s/it] 15%|█▍ | 3252/22095 [5:25:57<22:23:58, 4.28s/it] {'loss': 0.4479, 'grad_norm': 0.7840515047666305, 'learning_rate': 9.644509322032262e-06, 'epoch': 0.15} 15%|█▍ | 3252/22095 [5:25:57<22:23:58, 4.28s/it] 15%|█▍ | 3253/22095 [5:26:00<20:27:43, 3.91s/it] {'loss': 0.3878, 'grad_norm': 0.6974849515726319, 'learning_rate': 9.644237852445792e-06, 'epoch': 0.15} 15%|█▍ | 3253/22095 [5:26:00<20:27:43, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42852 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51417 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41515 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41148 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (142295 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3254/22095 [5:26:06<23:50:47, 4.56s/it] {'loss': 0.4938, 'grad_norm': 0.5227456682682667, 'learning_rate': 9.643966283068912e-06, 'epoch': 0.15} 15%|█▍ | 3254/22095 [5:26:06<23:50:47, 4.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▍ | 3255/22095 [5:26:10<23:21:40, 4.46s/it] {'loss': 0.4184, 'grad_norm': 0.7538320838930355, 'learning_rate': 9.643694613907461e-06, 'epoch': 0.15} 15%|█▍ | 3255/22095 [5:26:10<23:21:40, 4.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60133 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46135 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81904 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72095 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3256/22095 [5:26:13<21:55:30, 4.19s/it] {'loss': 0.4429, 'grad_norm': 0.6633093812499693, 'learning_rate': 9.643422844967274e-06, 'epoch': 0.15} 15%|█▍ | 3256/22095 [5:26:13<21:55:30, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 15%|█▍ | 3257/22095 [5:26:23<30:16:11, 5.78s/it] {'loss': 0.5149, 'grad_norm': 0.38889087430455327, 'learning_rate': 9.643150976254192e-06, 'epoch': 0.15} 15%|█▍ | 3257/22095 [5:26:23<30:16:11, 5.78s/it] 15%|█▍ | 3258/22095 [5:26:27<27:37:25, 5.28s/it] {'loss': 0.4621, 'grad_norm': 0.6689124157664896, 'learning_rate': 9.642879007774058e-06, 'epoch': 0.15} 15%|█▍ | 3258/22095 [5:26:27<27:37:25, 5.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▍ | 3259/22095 [5:26:30<24:39:05, 4.71s/it] {'loss': 0.3693, 'grad_norm': 0.6469967659217317, 'learning_rate': 9.64260693953271e-06, 'epoch': 0.15} 15%|█▍ | 3259/22095 [5:26:30<24:39:05, 4.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47120 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42698 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3260/22095 [5:26:34<22:10:27, 4.24s/it] {'loss': 0.461, 'grad_norm': 0.7225777422722433, 'learning_rate': 9.642334771536e-06, 'epoch': 0.15} 15%|█▍ | 3260/22095 [5:26:34<22:10:27, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66231 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88793 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3261/22095 [5:26:36<19:50:57, 3.79s/it] {'loss': 0.3842, 'grad_norm': 0.663154906716596, 'learning_rate': 9.642062503789772e-06, 'epoch': 0.15} 15%|█▍ | 3261/22095 [5:26:36<19:50:57, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▍ | 3262/22095 [5:26:46<28:46:07, 5.50s/it] {'loss': 0.5037, 'grad_norm': 0.4600695882779264, 'learning_rate': 9.641790136299877e-06, 'epoch': 0.15} 15%|█▍ | 3262/22095 [5:26:46<28:46:07, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71091 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66218 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3263/22095 [5:26:49<25:41:45, 4.91s/it] {'loss': 0.4413, 'grad_norm': 0.7247436825298432, 'learning_rate': 9.641517669072171e-06, 'epoch': 0.15} 15%|█▍ | 3263/22095 [5:26:49<25:41:45, 4.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 15%|█▍ | 3264/22095 [5:26:57<30:39:58, 5.86s/it] {'loss': 0.5163, 'grad_norm': 0.3658974333913344, 'learning_rate': 9.641245102112503e-06, 'epoch': 0.15} 15%|█▍ | 3264/22095 [5:26:57<30:39:58, 5.86s/it] 15%|█▍ | 3265/22095 [5:27:01<26:30:29, 5.07s/it] {'loss': 0.4003, 'grad_norm': 0.679325315229181, 'learning_rate': 9.640972435426734e-06, 'epoch': 0.15} 15%|█▍ | 3265/22095 [5:27:01<26:30:29, 5.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▍ | 3266/22095 [5:27:04<23:35:31, 4.51s/it] {'loss': 0.4247, 'grad_norm': 0.6799213139722677, 'learning_rate': 9.640699669020721e-06, 'epoch': 0.15} 15%|█▍ | 3266/22095 
[5:27:04<23:35:31, 4.51s/it]
 15%|█▍ | 3267/22095 [5:27:07<21:19:34, 4.08s/it] {'loss': 0.3924, 'grad_norm': 0.7157540734235226, 'learning_rate': 9.640426802900325e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 15%|█▍ | 3268/22095 [5:27:14<25:37:22, 4.90s/it] {'loss': 0.5094, 'grad_norm': 0.4449196343882221, 'learning_rate': 9.640153837071407e-06, 'epoch': 0.15}
 15%|█▍ | 3269/22095 [5:27:22<30:30:37, 5.83s/it] {'loss': 0.4969, 'grad_norm': 0.38711013174344855, 'learning_rate': 9.639880771539836e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304853 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1eH5QmP3z9KJjy0FmXXXiwXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you decode and provide me with the exact words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\nSKF\n进口圆柱滚子轴承\nSuper-precisionbearing\nSKF\nSKF\n原装正品\n质保两年\n假一罚十\n全国包邮'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 15%|█▍ | 3270/22095 [5:27:25<26:20:57, 5.04s/it] {'loss': 0.4226, 'grad_norm': 0.7658134774111022, 'learning_rate': 9.639607606311477e-06, 'epoch': 0.15}
 15%|█▍ | 3271/22095 [5:27:32<29:06:34, 5.57s/it] {'loss': 0.5179, 'grad_norm': 0.3612525389646945, 'learning_rate': 9.6393343413922e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (43591 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50105 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▍ | 3272/22095 [5:27:35<25:38:32, 4.90s/it] {'loss': 0.3766, 'grad_norm': 0.6473237930596188, 'learning_rate': 9.639060976787878e-06, 'epoch': 0.15}
 15%|█▍ | 3273/22095 [5:27:38<22:53:26, 4.38s/it] {'loss': 0.3788, 'grad_norm': 0.7181412797772644, 'learning_rate': 9.638787512504382e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (84013 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53340 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48802 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59906 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99039 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3274/22095 [5:27:41<20:52:35, 3.99s/it] {'loss': 0.3702, 'grad_norm': 0.6567385369804504, 'learning_rate': 9.63851394854759e-06, 'epoch': 0.15} 15%|█▍ | 3274/22095 [5:27:41<20:52:35, 3.99s/it] 15%|█▍ | 3275/22095 [5:27:45<20:19:09, 3.89s/it] {'loss': 0.445, 'grad_norm': 0.7640669550629611, 'learning_rate': 9.638240284923377e-06, 'epoch': 0.15} 15%|█▍ | 3275/22095 [5:27:45<20:19:09, 3.89s/it] 15%|█▍ | 3276/22095 [5:27:49<20:25:01, 3.91s/it] {'loss': 0.424, 'grad_norm': 0.6652970779894702, 'learning_rate': 9.637966521637628e-06, 'epoch': 0.15} 15%|█▍ | 3276/22095 [5:27:49<20:25:01, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53182 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41384 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79153 > 40960). 
Running this sequence through the model will result in indexing errors 15%|█▍ | 3277/22095 [5:27:52<18:55:37, 3.62s/it] {'loss': 0.404, 'grad_norm': 0.6624015894203762, 'learning_rate': 9.637692658696222e-06, 'epoch': 0.15} 15%|█▍ | 3277/22095 [5:27:52<18:55:37, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50327 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83146 > 40960). Running this sequence through the model will result in indexing errors 15%|█▍ | 3278/22095 [5:27:56<19:26:19, 3.72s/it] {'loss': 0.4228, 'grad_norm': 0.6559214372418471, 'learning_rate': 9.637418696105043e-06, 'epoch': 0.15} 15%|█▍ | 3278/22095 [5:27:56<19:26:19, 3.72s/it] 15%|█▍ | 3279/22095 [5:27:59<18:54:01, 3.62s/it] {'loss': 0.4156, 'grad_norm': 0.7299149369216319, 'learning_rate': 9.63714463386998e-06, 'epoch': 0.15} 15%|█▍ | 3279/22095 [5:27:59<18:54:01, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 15%|█▍ | 3280/22095 [5:28:09<28:37:12, 5.48s/it] {'loss': 0.4854, 'grad_norm': 0.4885908240246616, 'learning_rate': 9.636870471996923e-06, 'epoch': 0.15} 15%|█▍ | 3280/22095 [5:28:09<28:37:12, 5.48s/it] 15%|█▍ | 3281/22095 [5:28:12<24:47:55, 4.75s/it] {'loss': 0.386, 'grad_norm': 0.6781850874150899, 'learning_rate': 9.63659621049176e-06, 'epoch': 0.15} 15%|█▍ | 3281/22095 [5:28:12<24:47:55, 4.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 15%|█▍ | 3282/22095 [5:28:22<32:11:57, 6.16s/it] {'loss': 0.512, 'grad_norm': 0.3620606036338175, 'learning_rate': 9.636321849360382e-06, 'epoch': 0.15} 15%|█▍ | 3282/22095 [5:28:22<32:11:57, 6.16s/it] 15%|█▍ | 3283/22095 [5:28:25<27:19:38, 5.23s/it] {'loss': 0.3859, 'grad_norm': 0.7136500943838553, 'learning_rate': 9.63604738860869e-06, 'epoch': 0.15} 15%|█▍ | 3283/22095 [5:28:25<27:19:38, 5.23s/it] 
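The recurring "ValueError: Image size ... is too small. Minimum size is 28" failures in this log all trace to samples whose recorded image_wh is degenerate ([0, 0]) or has a side shorter than the 28-pixel minimum the dataset check in data_qwen_2.py enforces. A minimal, illustrative pre-filter over manifest records shaped like the "Problematic sample" dicts above (the helper names and the standalone check are assumptions, not code from the training repo):

```python
# Hypothetical pre-filter for manifest records like the "Problematic sample"
# dicts in this log. MIN_SIDE mirrors the "Minimum size is 28" check raised
# in _get_item; the function names are illustrative, not from data_qwen_2.py.

MIN_SIDE = 28  # smallest width/height the check accepts

def valid_image_size(sample: dict) -> bool:
    """Return True only if every recorded (width, height) pair is usable."""
    for w, h in sample.get("image_wh", []):
        if min(w, h) < MIN_SIDE:
            return False
    return True

def filter_samples(samples: list[dict]) -> list[dict]:
    """Drop samples that would raise ValueError inside the dataset."""
    return [s for s in samples if valid_image_size(s)]

if __name__ == "__main__":
    samples = [
        {"image": "a.jpg", "image_wh": [[0, 0]]},     # like sample 8300774
        {"image": "b.png", "image_wh": [[575, 25]]},  # like sample 8499080
        {"image": "c.png", "image_wh": [[640, 480]]}, # usable
    ]
    print([s["image"] for s in filter_samples(samples)])  # -> ['c.png']
```

Screening the manifest this way ahead of training would avoid the repeated "[Try #0] Failed to fetch sample ..." retry cycles seen throughout this run.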
 15%|█▍ | 3284/22095 [5:28:28<24:27:08, 4.68s/it] {'loss': 0.423, 'grad_norm': 0.7297957271328588, 'learning_rate': 9.635772828242575e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 15%|█▍ | 3285/22095 [5:28:37<31:59:51, 6.12s/it] {'loss': 0.4979, 'grad_norm': 0.3745810356349233, 'learning_rate': 9.63549816826794e-06, 'epoch': 0.15}
 15%|█▍ | 3286/22095 [5:28:44<33:09:30, 6.35s/it] {'loss': 0.5067, 'grad_norm': 0.356023359325862, 'learning_rate': 9.635223408690688e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 15%|█▍ | 3287/22095 [5:28:48<28:09:37, 5.39s/it] {'loss': 0.416, 'grad_norm': 0.6842405407097039, 'learning_rate': 9.63494854951672e-06, 'epoch': 0.15}
 15%|█▍ | 3288/22095 [5:28:52<26:06:38, 5.00s/it] {'loss': 0.4349, 'grad_norm': 0.7189774734887591, 'learning_rate': 9.634673590751944e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [575, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8499080 in VC:s3://internvl-moe-sft-data/. Exception: Image size [575, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 161921, 'image': 'vrdu_texteq/astro-ph.CO/eec806e9-4f02-424a-9c8d-3eda3a94e327.png', 'image_wh': [[575, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'and observed in simulations differ less than $5\\%$.'}]}
 15%|█▍ | 3289/22095 [5:28:56<24:31:43, 4.70s/it] {'loss': 0.4225, 'grad_norm': 0.7186667769161001, 'learning_rate': 9.634398532402264e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (42516 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50051 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109546 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▍ | 3290/22095 [5:28:59<22:15:33, 4.26s/it] {'loss': 0.3895, 'grad_norm': 0.7200939913489343, 'learning_rate': 9.634123374473596e-06, 'epoch': 0.15}
 15%|█▍ | 3291/22095 [5:29:03<22:16:02, 4.26s/it] {'loss': 0.3925, 'grad_norm': 0.6927317189414052, 'learning_rate': 9.633848116971849e-06, 'epoch': 0.15}
 15%|█▍ | 3292/22095 [5:29:06<20:42:22, 3.96s/it] {'loss': 0.4118, 'grad_norm': 1.166576042660753, 'learning_rate': 9.633572759902936e-06, 'epoch': 0.15}
 15%|█▍ | 3293/22095 [5:29:10<20:15:24, 3.88s/it] {'loss': 0.4184, 'grad_norm': 0.7492381420963506, 'learning_rate': 9.633297303272777e-06, 'epoch': 0.15}
 15%|█▍ | 3294/22095 [5:29:14<19:41:11, 3.77s/it] {'loss': 0.3735, 'grad_norm': 0.7223441248858142, 'learning_rate':
9.633021747087288e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8917246 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40399, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 8\nB. 10\nC. 12\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8899859 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23012, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nA. 10cm\nB. 16cm\nC. 4cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 15%|█▍ | 3295/22095 [5:29:17<18:54:06, 3.62s/it] {'loss': 0.4248, 'grad_norm': 0.6917643370279334, 'learning_rate': 9.632746091352393e-06, 'epoch': 0.15}
 15%|█▍ | 3296/22095 [5:29:21<20:32:17, 3.93s/it] {'loss': 0.3917, 'grad_norm': 0.7034104999065266, 'learning_rate': 9.632470336074009e-06, 'epoch': 0.15}
 15%|█▍ | 3297/22095 [5:29:25<19:57:13, 3.82s/it] {'loss': 0.4178, 'grad_norm': 0.6831009361510235, 'learning_rate': 9.632194481258069e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308687 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2nY7VgmfD8KJjSszhXXbIJFXa_!!2949876198.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nOCR。'}, {'from': 'gpt', 'value': '图中所有文字:\n张爆炸贴套装\n100\n首次购买送马克笔一支\n立\n2\n拍两\n元\n减\n件以上\n特\n價\n促销商品\n原價\n热卖中\n新品上市\n店长\n推介\n人気商品\n售价\n元\n超低價'}]} 15%|█▍ | 3298/22095 [5:29:31<23:03:32, 4.42s/it] {'loss': 0.5524, 'grad_norm': 0.5716097893116413, 'learning_rate': 9.631918526910493e-06, 'epoch': 0.15} 15%|█▍ | 3298/22095 [5:29:31<23:03:32, 4.42s/it] 15%|█▍ | 3299/22095 [5:29:35<21:54:46, 4.20s/it] {'loss': 0.3869, 'grad_norm': 0.6941509909598592, 'learning_rate': 9.631642473037216e-06, 'epoch': 0.15} 15%|█▍ | 3299/22095 [5:29:35<21:54:46, 4.20s/it] 15%|█▍ | 3300/22095 [5:29:39<22:48:48, 4.37s/it] {'loss': 0.4733, 'grad_norm': 0.7320256723128162, 'learning_rate': 9.631366319644167e-06, 'epoch': 0.15} 15%|█▍ | 3300/22095 [5:29:39<22:48:48, 4.37s/it] 15%|█▍ | 3301/22095 [5:29:43<21:13:04, 4.06s/it] {'loss': 0.4246, 'grad_norm': 0.7026666178027788, 'learning_rate': 9.631090066737278e-06, 'epoch': 0.15} 15%|█▍ | 3301/22095 [5:29:43<21:13:04, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45116 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44539 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59716 > 40960). 
Running this sequence through the model will result in indexing errors
 15%|█▍ | 3302/22095 [5:29:46<19:38:10, 3.76s/it] {'loss': 0.481, 'grad_norm': 1.0725104038960356, 'learning_rate': 9.630813714322488e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (72557 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▍ | 3303/22095 [5:29:49<18:34:09, 3.56s/it] {'loss': 0.3949, 'grad_norm': 0.6730109808516953, 'learning_rate': 9.630537262405735e-06, 'epoch': 0.15}
 15%|█▍ | 3304/22095 [5:29:52<17:51:14, 3.42s/it] {'loss': 0.4161, 'grad_norm': 0.7169901020211172, 'learning_rate': 9.630260710992956e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045995 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 5cm\nB. 无法确定\nC. 1cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 15%|█▍ | 3305/22095 [5:29:55<17:15:41, 3.31s/it] {'loss': 0.3782, 'grad_norm': 0.6953568784611611, 'learning_rate': 9.629984060090097e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (46788 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▍ | 3306/22095 [5:29:58<16:59:20, 3.26s/it] {'loss': 0.4203, 'grad_norm': 0.723483016156664, 'learning_rate': 9.629707309703099e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (49605 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47929 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▍ | 3307/22095 [5:30:01<16:38:40, 3.19s/it] {'loss': 0.3877, 'grad_norm': 0.6857172507609567, 'learning_rate': 9.629430459837909e-06, 'epoch': 0.15}
 15%|█▍ | 3308/22095 [5:30:04<16:12:58, 3.11s/it] {'loss': 0.4093, 'grad_norm': 0.700995997510525, 'learning_rate': 9.629153510500478e-06, 'epoch': 0.15}
 15%|█▍ | 3309/22095 [5:30:07<16:35:18, 3.18s/it] {'loss': 0.3871, 'grad_norm': 0.6784060071160138, 'learning_rate': 9.628876461696754e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 15%|█▍ | 3310/22095 [5:30:17<26:24:05, 5.06s/it] {'loss': 0.5218, 'grad_norm': 0.5473550274042821, 'learning_rate': 9.628599313432694e-06, 'epoch': 0.15}
 15%|█▍ | 3311/22095 [5:30:26<33:11:35, 6.36s/it]
{'loss': 0.5336, 'grad_norm': 0.42281332962853047, 'learning_rate': 9.628322065714248e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (138201600 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
15%|█▍ | 3312/22095 [5:30:30<28:27:59, 5.46s/it] {'loss': 0.4277, 'grad_norm': 0.7061183170557235, 'learning_rate': 9.628044718547379e-06, 'epoch': 0.15}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30369.png 2025-08-27 21:28:25.543188 load time: 1761.86 ms
15%|█▍ | 3313/22095 [5:30:34<26:10:13, 5.02s/it] {'loss': 0.4106, 'grad_norm': 0.7485224328430917, 'learning_rate': 9.62776727193804e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▍ | 3314/22095 [5:30:43<32:30:59, 6.23s/it] {'loss': 0.5306, 'grad_norm': 0.49668806668198767, 'learning_rate': 9.627489725892195e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3315/22095 [5:30:49<32:01:04, 6.14s/it] {'loss': 0.5247, 'grad_norm': 0.5014177021586415, 'learning_rate': 9.627212080415808e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
15%|█▌ | 3316/22095 [5:30:52<28:27:40, 5.46s/it] {'loss': 0.4317, 'grad_norm': 0.7862004631444977, 'learning_rate': 9.626934335514847e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8910448 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33601, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 8cm\nB. 10cm\nC. 16cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
15%|█▌ | 3317/22095 [5:30:56<25:02:46, 4.80s/it] {'loss': 0.4345, 'grad_norm': 0.7304331450450849, 'learning_rate': 9.626656491195277e-06, 'epoch': 0.15}
15%|█▌ | 3318/22095 [5:30:59<22:07:45, 4.24s/it] {'loss': 0.4388, 'grad_norm': 0.7099372085651383, 'learning_rate': 9.626378547463067e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3319/22095 [5:31:02<20:55:04, 4.01s/it] {'loss': 0.4032, 'grad_norm': 0.6846526401291254, 'learning_rate': 9.626100504324194e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (43502 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59805 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50552 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44740 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3320/22095 [5:31:05<19:19:26, 3.71s/it] {'loss': 0.4003, 'grad_norm': 0.8179284569944197, 'learning_rate': 9.625822361784626e-06, 'epoch': 0.15}
15%|█▌ | 3321/22095 [5:31:08<18:26:02, 3.53s/it] {'loss': 0.3899, 'grad_norm': 0.7223286737584297, 'learning_rate': 9.625544119850344e-06, 'epoch': 0.15}
15%|█▌ | 3322/22095 [5:31:12<18:30:06, 3.55s/it] {'loss': 0.3849, 'grad_norm': 0.6603367788079666, 'learning_rate': 9.625265778527325e-06, 'epoch': 0.15}
15%|█▌ | 3323/22095 [5:31:15<18:05:44, 3.47s/it] {'loss': 0.4401, 'grad_norm': 0.7607035253944717, 'learning_rate': 9.62498733782155e-06, 'epoch': 0.15}
15%|█▌ | 3324/22095 [5:31:19<19:20:38, 3.71s/it] {'loss': 0.4311, 'grad_norm': 0.7541225861262272, 'learning_rate': 9.624708797739002e-06, 'epoch': 0.15}
15%|█▌ | 3325/22095 [5:31:23<19:26:03, 3.73s/it] {'loss': 0.4407, 'grad_norm': 0.7014753507801132, 'learning_rate': 9.624430158285664e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (42530 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3326/22095 [5:31:26<18:04:23, 3.47s/it] {'loss': 0.3872, 'grad_norm': 0.6942835461363718, 'learning_rate': 9.624151419467527e-06, 'epoch': 0.15}
15%|█▌ | 3327/22095 [5:31:30<18:24:03, 3.53s/it] {'loss': 0.4314, 'grad_norm': 0.7138036553160272, 'learning_rate': 9.623872581290576e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3328/22095 [5:31:39<26:47:02, 5.14s/it] {'loss': 0.5235, 'grad_norm': 0.8268866607473049, 'learning_rate': 9.623593643760805e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3329/22095 [5:31:42<24:09:46, 4.64s/it] {'loss': 0.4128, 'grad_norm': 0.7084738223029766, 'learning_rate': 9.623314606884207e-06, 'epoch': 0.15}
15%|█▌ | 3330/22095 [5:31:45<21:35:47, 4.14s/it] {'loss': 0.443, 'grad_norm': 0.7136920204031075, 'learning_rate': 9.623035470666778e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (84191 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3331/22095 [5:31:49<20:58:45, 4.03s/it] {'loss': 0.423, 'grad_norm': 0.6631487881509494, 'learning_rate': 9.622756235114515e-06, 'epoch': 0.15}
15%|█▌ | 3332/22095 [5:31:52<19:13:14, 3.69s/it] {'loss': 0.4024, 'grad_norm': 0.7237223774625545, 'learning_rate': 9.622476900233417e-06, 'epoch': 0.15}
15%|█▌ | 3333/22095 [5:31:55<18:48:49, 3.61s/it] {'loss': 0.4373, 'grad_norm': 0.7136308557874039, 'learning_rate': 9.622197466029488e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3334/22095 [5:32:05<27:56:00, 5.36s/it] {'loss': 0.5152, 'grad_norm': 0.6607160453705881, 'learning_rate': 9.621917932508733e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (47220 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74248 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54493 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107045 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88480 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3335/22095 [5:32:08<25:28:24, 4.89s/it] {'loss': 0.4072, 'grad_norm': 0.6643946238580568, 'learning_rate': 9.621638299677157e-06, 'epoch': 0.15}
15%|█▌ | 3336/22095 [5:32:12<23:14:19, 4.46s/it] {'loss': 0.4121, 'grad_norm': 0.7875523683733319, 'learning_rate': 9.621358567540766e-06, 'epoch': 0.15}
15%|█▌ | 3337/22095 [5:32:15<20:47:31, 3.99s/it] {'loss': 0.3895, 'grad_norm': 0.6171204450407061, 'learning_rate': 9.621078736105573e-06, 'epoch': 0.15}
15%|█▌ | 3338/22095 [5:32:18<19:21:02, 3.71s/it] {'loss': 0.3854, 'grad_norm': 0.6658184861266957, 'learning_rate': 9.620798805377592e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3339/22095 [5:32:24<23:09:16, 4.44s/it] {'loss': 0.4777, 'grad_norm': 0.404686142066214, 'learning_rate': 9.620518775362835e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3340/22095 [5:32:28<22:12:11, 4.26s/it] {'loss': 0.4257, 'grad_norm': 0.7400084138254952, 'learning_rate': 9.620238646067322e-06, 'epoch': 0.15}
15%|█▌ | 3341/22095 [5:32:31<21:20:27, 4.10s/it] {'loss': 0.4365, 'grad_norm': 0.8422360462884343, 'learning_rate': 9.619958417497069e-06, 'epoch': 0.15}
15%|█▌ | 3342/22095 [5:32:36<21:36:08, 4.15s/it] {'loss': 0.4329, 'grad_norm': 0.9105430523186688, 'learning_rate': 9.619678089658097e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3343/22095 [5:32:45<29:21:09, 5.64s/it] {'loss': 0.5005, 'grad_norm': 0.41325749692787855, 'learning_rate': 9.619397662556434e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (42042 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41501 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53132 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87045 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3344/22095 [5:32:48<25:21:00, 4.87s/it] {'loss': 0.4102, 'grad_norm': 0.8607971073288683, 'learning_rate': 9.619117136198101e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (79734 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3345/22095 [5:32:51<22:32:07, 4.33s/it] {'loss': 0.4079, 'grad_norm': 0.7294942692583787, 'learning_rate': 9.61883651058913e-06, 'epoch': 0.15}
15%|█▌ | 3346/22095 [5:32:55<22:16:22, 4.28s/it] {'loss': 0.4413, 'grad_norm': 0.682562825235258, 'learning_rate': 9.618555785735546e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3347/22095 [5:32:58<20:51:52, 4.01s/it] {'loss': 0.3879, 'grad_norm': 0.6883105556066248, 'learning_rate': 9.618274961643384e-06, 'epoch': 0.15}
15%|█▌ | 3348/22095 [5:33:02<20:29:38, 3.94s/it] {'loss': 0.4119, 'grad_norm': 0.6450651111810509, 'learning_rate': 9.617994038318675e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (59674 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3349/22095 [5:33:06<20:39:22, 3.97s/it] {'loss': 0.3703, 'grad_norm': 0.7116436770471571, 'learning_rate': 9.617713015767457e-06, 'epoch': 0.15}
15%|█▌ | 3350/22095 [5:33:09<18:53:30, 3.63s/it] {'loss': 0.3761, 'grad_norm': 0.7377554108903471, 'learning_rate': 9.617431893995771e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914369 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37522, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
15%|█▌ | 3351/22095 [5:33:13<19:17:07, 3.70s/it] {'loss': 0.4193, 'grad_norm': 0.6646223645221694, 'learning_rate': 9.617150673009654e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (51665 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43651 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3352/22095 [5:33:22<28:15:13, 5.43s/it] {'loss': 0.5071, 'grad_norm': 0.45223626911500664, 'learning_rate': 9.61686935281515e-06, 'epoch': 0.15}
15%|█▌ | 3353/22095 [5:33:29<30:20:43, 5.83s/it] {'loss': 0.5095, 'grad_norm': 0.3798005018025008, 'learning_rate': 9.616587933418302e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882165 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5318, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 7cm\nB. 8cm\nC. 5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
15%|█▌ | 3354/22095 [5:33:32<26:11:33, 5.03s/it] {'loss': 0.4139, 'grad_norm': 0.789312374773116, 'learning_rate': 9.616306414825158e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (99857 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82481 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3355/22095 [5:33:36<23:57:38, 4.60s/it] {'loss': 0.4142, 'grad_norm': 0.6956206272305917, 'learning_rate': 9.616024797041769e-06, 'epoch': 0.15}
15%|█▌ | 3356/22095 [5:33:39<21:29:12, 4.13s/it] {'loss': 0.4318, 'grad_norm': 0.6648217865219, 'learning_rate': 9.615743080074183e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3357/22095 [5:33:46<25:22:34, 4.88s/it] {'loss': 0.5229, 'grad_norm': 0.5935656105926574, 'learning_rate': 9.615461263928454e-06, 'epoch': 0.15}
15%|█▌ | 3358/22095 [5:33:49<22:39:07, 4.35s/it] {'loss': 0.4273, 'grad_norm': 0.7476404386102188, 'learning_rate': 9.615179348610638e-06, 'epoch': 0.15}
15%|█▌ | 3359/22095 [5:33:52<21:34:50, 4.15s/it] {'loss': 0.4166, 'grad_norm': 0.7061949866038572, 'learning_rate': 9.614897334126791e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914671 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37824, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 2\nB. 4\nC. 8\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3360/22095 [5:33:59<26:02:43, 5.00s/it] {'loss': 0.5397, 'grad_norm': 0.40998979916945805, 'learning_rate': 9.614615220482976e-06, 'epoch': 0.15}
15%|█▌ | 3361/22095 [5:34:04<25:01:09, 4.81s/it] {'loss': 0.4159, 'grad_norm': 0.8891488917747642, 'learning_rate': 9.614333007685253e-06, 'epoch': 0.15}
15%|█▌ | 3362/22095 [5:34:07<23:02:22, 4.43s/it] {'loss': 0.3637, 'grad_norm': 0.7086492645793606, 'learning_rate': 9.614050695739683e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3363/22095 [5:34:17<30:31:38, 5.87s/it] {'loss': 0.4981, 'grad_norm': 0.41334132030452503, 'learning_rate': 9.613768284652336e-06, 'epoch': 0.15}
15%|█▌ | 3364/22095 [5:34:20<26:27:24, 5.08s/it] {'loss': 0.3899, 'grad_norm': 0.7341079457582734, 'learning_rate': 9.613485774429279e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (53923 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56065 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3365/22095 [5:34:23<23:24:29, 4.50s/it] {'loss': 0.4445, 'grad_norm': 0.7868870876356426, 'learning_rate': 9.61320316507658e-06, 'epoch': 0.15}
15%|█▌ | 3366/22095 [5:34:27<22:12:06, 4.27s/it] {'loss': 0.4189, 'grad_norm': 0.7311498728838661, 'learning_rate': 9.612920456600317e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (47213 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46535 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3367/22095 [5:34:31<22:08:36, 4.26s/it] {'loss': 0.3834, 'grad_norm': 0.7843591853548378, 'learning_rate': 9.612637649006557e-06, 'epoch': 0.15}
15%|█▌ | 3368/22095 [5:34:35<21:22:02, 4.11s/it] {'loss': 0.4292, 'grad_norm': 0.7826692283832626, 'learning_rate': 9.612354742301381e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (58410 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101572 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3369/22095 [5:34:38<20:25:01, 3.93s/it] {'loss': 0.3619, 'grad_norm': 0.6647315439646727, 'learning_rate': 9.61207173649087e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3370/22095 [5:34:42<19:58:28, 3.84s/it] {'loss': 0.3907, 'grad_norm': 0.6557323897622711, 'learning_rate': 9.6117886315811e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (62154 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3371/22095 [5:34:46<19:52:06, 3.82s/it] {'loss': 0.4113, 'grad_norm': 0.7221361141961096, 'learning_rate': 9.611505427578159e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (72706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45685 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122234 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3372/22095 [5:34:49<19:07:35, 3.68s/it] {'loss': 0.3876, 'grad_norm': 0.7099770327135274, 'learning_rate': 9.611222124488126e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3373/22095 [5:34:59<29:17:30, 5.63s/it] {'loss': 0.4772, 'grad_norm': 0.4333483735128783, 'learning_rate': 9.610938722317095e-06, 'epoch': 0.15}
15%|█▌ | 3374/22095 [5:35:02<25:42:18, 4.94s/it] {'loss': 0.4274, 'grad_norm': 0.7950935256983794, 'learning_rate': 9.61065522107115e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3375/22095 [5:35:12<32:54:18, 6.33s/it] {'loss': 0.5014, 'grad_norm': 0.3544158593732371, 'learning_rate': 9.610371620756385e-06, 'epoch': 0.15}
15%|█▌ | 3376/22095 [5:35:15<28:22:02, 5.46s/it] {'loss': 0.4031, 'grad_norm': 0.7494143798836961, 'learning_rate': 9.610087921378895e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (50063 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43066 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3377/22095 [5:35:25<34:37:29, 6.66s/it] {'loss': 0.5054, 'grad_norm': 0.32936531220215204, 'learning_rate': 9.609804122944774e-06, 'epoch': 0.15}
15%|█▌ | 3378/22095 [5:35:28<29:04:37, 5.59s/it] {'loss': 0.3913, 'grad_norm': 0.6970342828940654, 'learning_rate': 9.60952022546012e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3379/22095 [5:35:38<35:49:44, 6.89s/it] {'loss': 0.5125, 'grad_norm': 0.3542374522374869, 'learning_rate': 9.609236228931033e-06, 'epoch': 0.15}
15%|█▌ | 3380/22095 [5:35:48<41:16:36, 7.94s/it] {'loss': 0.5181, 'grad_norm': 0.37112645861116433, 'learning_rate': 9.608952133363616e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 364, but got module 1
15%|█▌ | 3381/22095 [5:35:53<36:16:40, 6.98s/it] {'loss': 0.3955, 'grad_norm': 0.7934526387588287, 'learning_rate': 9.608667938763974e-06, 'epoch': 0.15}
15%|█▌ | 3382/22095 [5:35:57<31:49:30, 6.12s/it] {'loss': 0.3943, 'grad_norm': 0.7638806885248461, 'learning_rate': 9.60838364513821e-06, 'epoch': 0.15}
15%|█▌ | 3383/22095 [5:36:01<28:11:51, 5.42s/it] {'loss': 0.4013, 'grad_norm': 0.6629446396210611, 'learning_rate': 9.608099252492437e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3384/22095 [5:36:04<24:13:09, 4.66s/it] {'loss': 0.3968, 'grad_norm': 0.8507887664141085, 'learning_rate': 9.607814760832764e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (105939 > 40960). Running this sequence through the model will result in indexing errors
15%|█▌ | 3385/22095 [5:36:13<30:26:59, 5.86s/it] {'loss': 0.5105, 'grad_norm': 0.4667875313224176, 'learning_rate': 9.607530170165302e-06, 'epoch': 0.15}
15%|█▌ | 3386/22095 [5:36:16<26:48:29, 5.16s/it] {'loss': 0.4417, 'grad_norm': 0.7202328958788586, 'learning_rate': 9.607245480496168e-06, 'epoch': 0.15}
15%|█▌ | 3387/22095 [5:36:19<23:20:05, 4.49s/it] {'loss': 0.4114, 'grad_norm': 0.7011725011282022, 'learning_rate': 9.60696069183148e-06, 'epoch': 0.15}
15%|█▌ | 3388/22095 [5:36:22<21:46:15, 4.19s/it] {'loss': 0.4384, 'grad_norm': 0.7338372700466135, 'learning_rate': 9.606675804177355e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (91827 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50468 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41343 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42940 > 40960).
Running this sequence through the model will result in indexing errors
15%|█▌ | 3389/22095 [5:36:27<21:52:05, 4.21s/it] {'loss': 0.451, 'grad_norm': 0.6826376956547067, 'learning_rate': 9.606390817539915e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8894383 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17536, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
15%|█▌ | 3390/22095 [5:36:31<21:23:04, 4.12s/it] {'loss': 0.436, 'grad_norm': 0.6496624803686839, 'learning_rate': 9.606105731925284e-06, 'epoch': 0.15}
15%|█▌ | 3391/22095 [5:36:34<19:29:34, 3.75s/it] {'loss': 0.4005, 'grad_norm': 0.7179188848683196, 'learning_rate': 9.605820547339585e-06, 'epoch': 0.15}
15%|█▌ | 3392/22095 [5:36:37<19:15:07, 3.71s/it] {'loss': 0.4075, 'grad_norm': 0.6822218854988548, 'learning_rate': 9.605535263788952e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3393/22095 [5:36:46<26:48:31, 5.16s/it]
{'loss': 0.53, 'grad_norm': 0.3730441185268565, 'learning_rate': 9.60524988127951e-06, 'epoch': 0.15}
15%|█▌ | 3394/22095 [5:36:49<24:17:15, 4.68s/it] {'loss': 0.4018, 'grad_norm': 0.6636620437666377, 'learning_rate': 9.604964399817392e-06, 'epoch': 0.15}
15%|█▌ | 3395/22095 [5:36:53<22:30:09, 4.33s/it] {'loss': 0.4387, 'grad_norm': 0.7268505905368877, 'learning_rate': 9.60467881940873e-06, 'epoch': 0.15}
15%|█▌ | 3396/22095 [5:36:57<22:19:48, 4.30s/it] {'loss': 0.3756, 'grad_norm': 0.6961618085585248, 'learning_rate': 9.604393140059666e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
15%|█▌ | 3397/22095 [5:37:06<30:25:52, 5.86s/it] {'loss': 0.4851, 'grad_norm': 0.3179645190143338, 'learning_rate': 9.604107361776331e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3398/22095 [5:37:10<26:30:50, 5.11s/it] {'loss': 0.3963, 'grad_norm': 0.662873080346535, 'learning_rate': 9.603821484564873e-06, 'epoch': 0.15}
15%|█▌ | 3399/22095 [5:37:13<23:45:52, 4.58s/it] {'loss': 0.3792, 'grad_norm': 0.6926916308137251, 'learning_rate': 9.603535508431428e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8895853 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19006, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
15%|█▌ | 3400/22095 [5:37:17<23:17:11, 4.48s/it] {'loss': 0.4637, 'grad_norm': 0.7226052047552491, 'learning_rate': 9.603249433382145e-06, 'epoch': 0.15}
15%|█▌ | 3401/22095 [5:37:21<21:27:52, 4.13s/it] {'loss': 0.4246, 'grad_norm': 0.7045172676686957, 'learning_rate': 9.602963259423168e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8377794 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 44577, 'image': 'vrdu_table_final_2/astro-ph.CO/58275bad-fdec-4510-ae36-fb287d77c85b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
15%|█▌ | 3402/22095 [5:37:26<22:36:27, 4.35s/it] {'loss': 0.4106, 'grad_norm': 0.7305812318197507, 'learning_rate': 9.602676986560649e-06, 'epoch': 0.15}
15%|█▌ | 3403/22095 [5:37:29<20:55:42, 4.03s/it] {'loss': 0.3772, 'grad_norm': 0.6517290131883109, 'learning_rate': 9.602390614800737e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047548 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD.
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 15%|█▌ | 3404/22095 [5:37:33<20:24:00, 3.93s/it] {'loss': 0.4714, 'grad_norm': 0.6246818631587848, 'learning_rate': 9.602104144149587e-06, 'epoch': 0.15} 15%|█▌ | 3404/22095 [5:37:33<20:24:00, 3.93s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▌ | 3405/22095 [5:37:36<19:40:04, 3.79s/it] {'loss': 0.4017, 'grad_norm': 0.6493350034733257, 'learning_rate': 9.601817574613352e-06, 'epoch': 0.15} 15%|█▌ | 3405/22095 [5:37:36<19:40:04, 3.79s/it] 15%|█▌ | 3406/22095 [5:37:39<18:12:48, 3.51s/it] {'loss': 0.4399, 'grad_norm': 0.69897511909871, 'learning_rate': 9.60153090619819e-06, 'epoch': 0.15} 15%|█▌ | 3406/22095 [5:37:39<18:12:48, 3.51s/it] 15%|█▌ | 3407/22095 [5:37:43<18:42:30, 3.60s/it] {'loss': 0.3845, 'grad_norm': 0.6498934389942151, 'learning_rate': 9.601244138910262e-06, 'epoch': 0.15} 15%|█▌ | 3407/22095 [5:37:43<18:42:30, 3.60s/it] 15%|█▌ | 3408/22095 [5:37:46<18:28:32, 3.56s/it] {'loss': 0.4543, 'grad_norm': 0.7266953270385536, 'learning_rate': 9.60095727275573e-06, 'epoch': 0.15} 15%|█▌ | 3408/22095 [5:37:46<18:28:32, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 15%|█▌ | 3409/22095 [5:37:51<21:10:32, 4.08s/it] {'loss': 0.5156, 'grad_norm': 0.4488563542322566, 'learning_rate': 9.600670307740755e-06, 'epoch': 0.15} 15%|█▌ | 3409/22095 [5:37:51<21:10:32, 4.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44333 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77855 > 40960). 
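The repeated `ValueError: Image size [...] is too small. Minimum size is 28` failures above are raised inside the dataset's `_get_item`, and each one is accompanied by a "Problematic sample" dump that includes the offending `image_wh`. A minimal pre-filter sketch — the helper names here are hypothetical, not taken from the training code — that would drop such samples before training rather than failing mid-epoch:

```python
# Hypothetical pre-filter for samples shaped like the "Problematic sample" dumps
# in the log above. MIN_IMAGE_SIZE matches the "Minimum size is 28" check.
MIN_IMAGE_SIZE = 28


def is_large_enough(width: int, height: int, min_size: int = MIN_IMAGE_SIZE) -> bool:
    """Return True if both image sides meet the minimum size."""
    return width >= min_size and height >= min_size


def filter_small_images(samples: list) -> list:
    """Keep only samples whose every (width, height) in 'image_wh' passes the check."""
    return [
        s for s in samples
        if all(is_large_enough(w, h) for w, h in s.get("image_wh", []))
    ]
```

For example, the sample above with `image_wh: [[129, 20]]` would be dropped, since its height of 20 px is below the 28 px minimum.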
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41671 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60830 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▌ | 3410/22095 [5:37:55<19:38:18, 3.78s/it] {'loss': 0.3851, 'grad_norm': 0.965869209891706, 'learning_rate': 9.600383243871508e-06, 'epoch': 0.15}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [145, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8397499 in VC:s3://internvl-moe-sft-data/. Exception: Image size [145, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64354, 'image': 'vrdu_table_final_2/astro-ph.EP/e366b371-f148-4db4-92e8-1dd79b5ff203.png', 'image_wh': [[145, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}} \\hspace{-1 cm} Instruments\\end{tabular}\n```"}]}
 15%|█▌ | 3411/22095 [5:37:58<18:22:27, 3.54s/it] {'loss': 0.4307, 'grad_norm': 0.7504707497764602, 'learning_rate': 9.600096081154151e-06, 'epoch': 0.15}
 15%|█▌ | 3412/22095 [5:38:01<18:22:54, 3.54s/it] {'loss': 0.3978, 'grad_norm': 0.6613917357422878, 'learning_rate': 9.59980881959486e-06, 'epoch': 0.15}
 15%|█▌ | 3413/22095 [5:38:04<17:16:56, 3.33s/it] {'loss': 0.4667, 'grad_norm': 0.7520780460741235, 'learning_rate': 9.599521459199803e-06, 'epoch': 0.15}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 15%|█▌ | 3414/22095 [5:38:08<19:05:31, 3.68s/it] {'loss': 0.4274, 'grad_norm': 0.8176268578974006, 'learning_rate': 9.599233999975156e-06, 'epoch': 0.15}
 15%|█▌ | 3415/22095 [5:38:12<19:14:25, 3.71s/it] {'loss': 0.4168, 'grad_norm': 0.6840665310353456, 'learning_rate': 9.598946441927097e-06, 'epoch': 0.15}
 15%|█▌ | 3416/22095 [5:38:16<19:33:44, 3.77s/it] {'loss': 0.4372, 'grad_norm': 0.7683252535016913, 'learning_rate': 9.598658785061803e-06, 'epoch': 0.15}
 15%|█▌ | 3417/22095 [5:38:20<19:10:06, 3.69s/it] {'loss': 0.3983, 'grad_norm': 0.6823793322842577, 'learning_rate': 9.598371029385455e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 15%|█▌ | 3418/22095 [5:38:29<28:11:00, 5.43s/it] {'loss': 0.4959, 'grad_norm': 0.44229167249114293, 'learning_rate': 9.598083174904235e-06, 'epoch': 0.15}
 15%|█▌ | 3419/22095 [5:38:32<24:34:35, 4.74s/it] {'loss': 0.4095, 'grad_norm': 0.7338618242173148, 'learning_rate': 9.597795221624334e-06, 'epoch': 0.15}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 15%|█▌ | 3420/22095 [5:38:39<27:55:37, 5.38s/it] {'loss': 0.5028, 'grad_norm': 0.32181814585633023, 'learning_rate': 9.59750716955193e-06, 'epoch': 0.15}
 15%|█▌ | 3421/22095 [5:38:42<24:45:56, 4.77s/it] {'loss': 0.4154, 'grad_norm': 0.7202648707296, 'learning_rate': 9.59721901869322e-06, 'epoch': 0.15}
Token indices sequence length is longer than the specified maximum sequence length for this model (44640 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66996 > 40960). Running this sequence through the model will result in indexing errors
 15%|█▌ | 3422/22095 [5:38:46<22:09:19, 4.27s/it] {'loss': 0.4178, 'grad_norm': 0.7033701337578779, 'learning_rate': 9.596930769054391e-06, 'epoch': 0.15}
 15%|█▌ | 3423/22095 [5:38:49<20:12:43, 3.90s/it] {'loss': 0.407, 'grad_norm': 0.9866248658193294, 'learning_rate': 9.59664242064164e-06, 'epoch': 0.15}
 15%|█▌ | 3424/22095 [5:38:52<19:12:57, 3.71s/it] {'loss': 0.3774, 'grad_norm': 0.7775425621832033, 'learning_rate': 9.59635397346116e-06, 'epoch': 0.15}
 16%|█▌ | 3425/22095 [5:38:55<18:43:25, 3.61s/it] {'loss': 0.4344, 'grad_norm': 0.6732563516972015, 'learning_rate': 9.596065427519149e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 16%|█▌ | 3426/22095 [5:38:59<19:03:27, 3.67s/it] {'loss': 0.421, 'grad_norm': 0.691794745822921, 'learning_rate': 9.595776782821807e-06, 'epoch': 0.16}
 16%|█▌ | 3427/22095 [5:39:03<19:54:53, 3.84s/it] {'loss': 0.4408, 'grad_norm': 0.6687992237129932, 'learning_rate': 9.595488039375338e-06, 'epoch': 0.16}
 16%|█▌ | 3428/22095 [5:39:07<19:48:37, 3.82s/it] {'loss': 0.4267, 'grad_norm': 0.7624800700828009, 'learning_rate': 9.595199197185944e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 16%|█▌ | 3429/22095 [5:39:11<19:40:18, 3.79s/it] {'loss': 0.4503, 'grad_norm': 0.7863688543009814, 'learning_rate': 9.594910256259834e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
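The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings come from the tokenizer: some samples tokenize to more than the model's 40960-token limit, and feeding them through unmodified would cause indexing errors. A minimal guard sketch — a hypothetical helper, not the repo's actual handling — that clamps over-long sequences:

```python
# Illustrative guard against the "Token indices sequence length is longer than
# the specified maximum sequence length" warnings in the log above.
MAX_SEQ_LEN = 40960  # model maximum reported in the warnings


def clamp_sequence(input_ids: list, max_len: int = MAX_SEQ_LEN) -> list:
    """Truncate over-long token id sequences to avoid indexing errors downstream.

    A real pipeline might instead skip the sample entirely or split it into
    chunks; truncation is just the simplest policy to sketch.
    """
    if len(input_ids) > max_len:
        return input_ids[:max_len]
    return input_ids
```

Note that blind truncation can cut a conversation mid-turn or drop the answer, so skipping or pre-splitting such samples is often the safer choice.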
 16%|█▌ | 3430/22095 [5:39:21<29:47:24, 5.75s/it] {'loss': 0.5111, 'grad_norm': 0.6196129348860103, 'learning_rate': 9.594621216603215e-06, 'epoch': 0.16}
 16%|█▌ | 3431/22095 [5:39:24<25:53:43, 4.99s/it] {'loss': 0.3929, 'grad_norm': 0.7464054819999745, 'learning_rate': 9.594332078222296e-06, 'epoch': 0.16}
 16%|█▌ | 3432/22095 [5:39:28<23:33:31, 4.54s/it] {'loss': 0.4165, 'grad_norm': 0.660553397731443, 'learning_rate': 9.594042841123291e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (120083 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55427 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44870 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70632 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3433/22095 [5:39:32<22:13:54, 4.29s/it] {'loss': 0.4312, 'grad_norm': 0.6912744419593565, 'learning_rate': 9.593753505312415e-06, 'epoch': 0.16}
 16%|█▌ | 3434/22095 [5:39:35<20:23:31, 3.93s/it] {'loss': 0.4068, 'grad_norm': 0.700543306891848, 'learning_rate': 9.593464070795887e-06, 'epoch': 0.16}
 16%|█▌ | 3435/22095 [5:39:39<20:53:20, 4.03s/it] {'loss': 0.3695, 'grad_norm': 0.6372973202578978, 'learning_rate': 9.593174537579921e-06, 'epoch': 0.16}
 16%|█▌ | 3436/22095 [5:39:43<20:29:07, 3.95s/it] {'loss': 0.3848, 'grad_norm': 0.7077294435391664, 'learning_rate': 9.592884905670742e-06, 'epoch': 0.16}
 16%|█▌ | 3437/22095 [5:39:45<18:39:07, 3.60s/it] {'loss': 0.3867, 'grad_norm': 2.966558214938584, 'learning_rate': 9.592595175074573e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3438/22095 [5:39:52<23:53:28, 4.61s/it] {'loss': 0.5055, 'grad_norm': 0.6559110571739295, 'learning_rate': 9.592305345797636e-06, 'epoch': 0.16}
 16%|█▌ | 3439/22095 [5:39:56<22:39:24, 4.37s/it] {'loss': 0.4145, 'grad_norm': 0.7035205428029994, 'learning_rate': 9.592015417846166e-06, 'epoch': 0.16}
 16%|█▌ | 3440/22095 [5:40:00<21:30:59, 4.15s/it] {'loss': 0.3844, 'grad_norm': 0.7079837650347159, 'learning_rate': 9.591725391226383e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3441/22095 [5:40:10<30:45:58, 5.94s/it] {'loss': 0.4867, 'grad_norm': 0.32253785644702726, 'learning_rate': 9.591435265944527e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (56713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60863 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3442/22095 [5:40:13<26:34:39, 5.13s/it] {'loss': 0.4029, 'grad_norm': 0.8604594320238467, 'learning_rate': 9.591145042006829e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (41743 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80836 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3443/22095 [5:40:17<25:00:20, 4.83s/it] {'loss': 0.4557, 'grad_norm': 0.8566153255616973, 'learning_rate': 9.590854719419522e-06, 'epoch': 0.16}
 16%|█▌ | 3444/22095 [5:40:20<22:23:22, 4.32s/it] {'loss': 0.4254, 'grad_norm': 0.6991748547257743, 'learning_rate': 9.59056429818885e-06, 'epoch': 0.16}
 16%|█▌ | 3445/22095 [5:40:25<21:58:46, 4.24s/it] {'loss': 0.3911, 'grad_norm': 0.6833290597384973, 'learning_rate': 9.590273778321048e-06, 'epoch': 0.16}
 16%|█▌ | 3446/22095 [5:40:28<21:34:59, 4.17s/it] {'loss': 0.4306, 'grad_norm': 0.7384819888520089, 'learning_rate': 9.58998315982236e-06, 'epoch': 0.16}
 16%|█▌ | 3447/22095 [5:40:32<20:51:21, 4.03s/it] {'loss': 0.4349, 'grad_norm': 0.7408386652212549, 'learning_rate': 9.589692442699033e-06, 'epoch': 0.16}
 16%|█▌ | 3448/22095 [5:40:36<20:34:06, 3.97s/it] {'loss': 0.4633, 'grad_norm': 0.7202578582855139, 'learning_rate': 9.589401626957309e-06, 'epoch': 0.16}
 16%|█▌ | 3449/22095 [5:40:39<19:37:40, 3.79s/it] {'loss': 0.4691, 'grad_norm': 0.73345358302742, 'learning_rate': 9.589110712603442e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3450/22095 [5:40:49<29:25:04, 5.68s/it] {'loss': 0.536, 'grad_norm': 0.6367108198509374, 'learning_rate': 9.588819699643677e-06, 'epoch': 0.16}
 16%|█▌ | 3451/22095 [5:41:00<36:22:27, 7.02s/it] {'loss': 0.5085, 'grad_norm': 0.4522720428613994, 'learning_rate': 9.588528588084272e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 16%|█▌ | 3452/22095 [5:41:04<32:52:35, 6.35s/it] {'loss': 0.439, 'grad_norm': 0.7553918774782068, 'learning_rate': 9.588237377931482e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (125559 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54500 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3453/22095 [5:41:09<29:22:21, 5.67s/it] {'loss': 0.4457, 'grad_norm': 0.7313305103265421, 'learning_rate': 9.587946069191561e-06, 'epoch': 0.16}
 16%|█▌ | 3454/22095 [5:41:12<26:03:43, 5.03s/it] {'loss': 0.4285, 'grad_norm': 0.6840203653682762, 'learning_rate': 9.58765466187077e-06, 'epoch': 0.16}
 16%|█▌ | 3455/22095 [5:41:16<24:23:33, 4.71s/it] {'loss': 0.3913, 'grad_norm': 0.7561467717478978, 'learning_rate': 9.587363155975367e-06, 'epoch': 0.16}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365239 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31980, 'image': 'vrdu_table_final_2/astro-ph.CO/7dfc7b60-8998-4766-b90d-1b7cd959b49b.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{@{}#1@{}}{#2}\\\\{#3}\\end{tabular}\n```"}]}
 16%|█▌ | 3456/22095 [5:41:21<25:05:11, 4.85s/it] {'loss': 0.4189, 'grad_norm': 0.8080672045005288, 'learning_rate': 9.587071551511621e-06, 'epoch': 0.16}
 16%|█▌ | 3457/22095 [5:41:24<22:33:39, 4.36s/it] {'loss': 0.3409, 'grad_norm': 0.7731559050053085, 'learning_rate': 9.586779848485797e-06, 'epoch': 0.16}
 16%|█▌ | 3458/22095 [5:41:28<21:54:14, 4.23s/it] {'loss': 0.4369, 'grad_norm': 0.6965717472553806, 'learning_rate': 9.58648804690416e-06, 'epoch': 0.16}
 16%|█▌ | 3459/22095 [5:41:32<21:03:32, 4.07s/it] {'loss': 0.4422, 'grad_norm': 0.8440351872536466, 'learning_rate': 9.586196146772982e-06, 'epoch': 0.16}
 16%|█▌ | 3460/22095 [5:41:36<20:38:49, 3.99s/it] {'loss': 0.4561, 'grad_norm': 0.7661592744550679, 'learning_rate': 9.585904148098532e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3461/22095 [5:41:45<29:28:19, 5.69s/it] {'loss': 0.4997, 'grad_norm': 1.3733560238638691, 'learning_rate': 9.58561205088709e-06, 'epoch': 0.16}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892997 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16150, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]}
 16%|█▌ | 3462/22095 [5:41:50<26:51:41, 5.19s/it] {'loss': 0.4139, 'grad_norm': 0.7280290741276498, 'learning_rate': 9.585319855144926e-06, 'epoch': 0.16}
 16%|█▌ | 3463/22095 [5:41:53<24:17:26, 4.69s/it] {'loss': 0.4493, 'grad_norm': 0.7531301358903874, 'learning_rate': 9.585027560878322e-06, 'epoch': 0.16}
 16%|█▌ | 3464/22095 [5:41:57<22:55:09, 4.43s/it] {'loss': 0.4061, 'grad_norm': 0.6604910087043134, 'learning_rate': 9.584735168093557e-06, 'epoch': 0.16}
 16%|█▌ | 3465/22095 [5:42:00<21:02:23, 4.07s/it] {'loss': 0.4581, 'grad_norm': 0.7322442914769552, 'learning_rate': 9.584442676796915e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 16%|█▌ | 3466/22095 [5:42:04<20:05:16, 3.88s/it] {'loss': 0.4086, 'grad_norm': 0.7073063990760776, 'learning_rate': 9.584150086994678e-06, 'epoch': 0.16}
 16%|█▌ | 3467/22095 [5:42:07<19:51:37, 3.84s/it] {'loss': 0.3896, 'grad_norm': 0.7190082010411192, 'learning_rate': 9.583857398693137e-06, 'epoch': 0.16}
 16%|█▌ | 3468/22095 [5:42:11<20:04:09, 3.88s/it] {'loss': 0.4143, 'grad_norm': 0.7584845386772335, 'learning_rate': 9.583564611898577e-06, 'epoch': 0.16}
 16%|█▌ | 3469/22095 [5:42:14<18:29:32, 3.57s/it] {'loss': 0.4264, 'grad_norm': 0.7108815510865111, 'learning_rate': 9.583271726617293e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (89707 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100816 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3470/22095 [5:42:17<17:59:07, 3.48s/it] {'loss': 0.4339, 'grad_norm': 0.7277968624062817, 'learning_rate': 9.582978742855575e-06, 'epoch': 0.16}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [412, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8527237 in VC:s3://internvl-moe-sft-data/. Exception: Image size [412, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25015, 'image': 'vrdu_texteq/astro-ph.CO/5fd5d6c5-21ba-4e3f-8dcf-cd0444ebe180.png', 'image_wh': [[412, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'and the one on $\\theta$ can be re-writen'}]}
 16%|█▌ | 3471/22095 [5:42:21<18:31:29, 3.58s/it] {'loss': 0.4065, 'grad_norm': 0.7515655120521509, 'learning_rate': 9.582685660619718e-06, 'epoch': 0.16}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047930 in VC:s3://multi-modal/UniGeo/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [495, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8493104 in VC:s3://internvl-moe-sft-data/. Exception: Image size [495, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 119010, 'image': 'vrdu_texteq/astro-ph.CO/a0beb3fe-c16e-47e1-bc1d-4a4559e6de4e.png', 'image_wh': [[495, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $b_{min}\\rightarrow 0$ and $b_{max}\\rightarrow \\infty$. And so:'}]}
 16%|█▌ | 3472/22095 [5:42:24<17:42:43, 3.42s/it] {'loss': 0.4211, 'grad_norm': 0.6949952504423228, 'learning_rate': 9.582392479916023e-06, 'epoch': 0.16}
 16%|█▌ | 3473/22095 [5:42:28<18:27:04, 3.57s/it] {'loss': 0.4322, 'grad_norm': 0.7550078700494964, 'learning_rate': 9.582099200750784e-06, 'epoch': 0.16}
 16%|█▌ | 3474/22095 [5:42:31<17:37:22, 3.41s/it] {'loss': 0.3894, 'grad_norm': 0.6658476068153182, 'learning_rate': 9.58180582313031e-06, 'epoch': 0.16}
 16%|█▌ | 3475/22095 [5:42:34<17:16:53, 3.34s/it] {'loss': 0.4394, 'grad_norm': 0.7200657052746892, 'learning_rate': 9.581512347060899e-06, 'epoch': 0.16}
 16%|█▌ | 3476/22095 [5:42:37<16:13:01, 3.14s/it] {'loss': 0.4125, 'grad_norm': 0.6551082332498498, 'learning_rate': 9.58121877254886e-06, 'epoch': 0.16}
 16%|█▌ | 3477/22095 [5:42:41<17:14:13, 3.33s/it] {'loss': 0.4245, 'grad_norm': 0.8696623317849659, 'learning_rate': 9.580925099600497e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3478/22095 [5:42:51<28:20:29, 5.48s/it] {'loss': 0.5327, 'grad_norm': 1.3728801852345183, 'learning_rate': 9.580631328222124e-06, 'epoch': 0.16}
 16%|█▌ | 3479/22095 [5:42:56<26:50:05, 5.19s/it] {'loss': 0.3812, 'grad_norm': 0.7613061786692783, 'learning_rate': 9.580337458420052e-06, 'epoch': 0.16}
 16%|█▌ | 3480/22095 [5:42:59<24:23:44, 4.72s/it] {'loss': 0.4271, 'grad_norm': 0.6733684603028954, 'learning_rate': 9.580043490200597e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 16%|█▌ | 3481/22095 [5:43:03<22:53:20, 4.43s/it] {'loss': 0.3765, 'grad_norm': 0.7118128617274064, 'learning_rate': 9.579749423570072e-06, 'epoch': 0.16}
 16%|█▌ | 3482/22095 [5:43:07<21:46:03, 4.21s/it] {'loss': 0.4328, 'grad_norm': 0.8008041863113271, 'learning_rate': 9.579455258534798e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (55632 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76286 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3483/22095 [5:43:10<20:43:00, 4.01s/it] {'loss': 0.4616, 'grad_norm': 0.7067609703059613, 'learning_rate': 9.579160995101095e-06, 'epoch': 0.16}
 16%|█▌ | 3484/22095 [5:43:14<20:08:22, 3.90s/it] {'loss': 0.4095, 'grad_norm': 0.6948951876700356, 'learning_rate': 9.578866633275289e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (97528 > 40960).
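The paired `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` messages indicate samples whose conversation text lacks an `<image>` placeholder for an attached image, which the loader repairs on the fly before tokenization. A sketch of that kind of repair — illustrative only; the actual fix in `data_qwen_2.py` may differ in placement and token format:

```python
# Illustrative repair for conversations missing <image> placeholders,
# mirroring the "Fixed image tokens in the conversation" messages above.
IMAGE_TOKEN = "<image>"


def fix_image_tokens(conversations: list, num_images: int) -> list:
    """Prepend missing <image> placeholders to the first human turn."""
    count = sum(
        turn["value"].count(IMAGE_TOKEN)
        for turn in conversations
        if turn["from"] == "human"
    )
    missing = num_images - count
    if missing > 0:
        first = next(t for t in conversations if t["from"] == "human")
        first["value"] = IMAGE_TOKEN * missing + "\n" + first["value"]
    return conversations
```

If the placeholder count already matches the image count, the conversation is returned unchanged.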
Running this sequence through the model will result in indexing errors
 16%|█▌ | 3485/22095 [5:43:22<26:37:09, 5.15s/it] {'loss': 0.5231, 'grad_norm': 1.030356858024951, 'learning_rate': 9.578572173063698e-06, 'epoch': 0.16}
 16%|█▌ | 3486/22095 [5:43:26<24:11:10, 4.68s/it] {'loss': 0.4501, 'grad_norm': 0.731783601918967, 'learning_rate': 9.578277614472655e-06, 'epoch': 0.16}
 16%|█▌ | 3487/22095 [5:43:29<21:39:17, 4.19s/it] {'loss': 0.3953, 'grad_norm': 0.6664453441880271, 'learning_rate': 9.577982957508488e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 16%|█▌ | 3488/22095 [5:43:32<19:51:48, 3.84s/it] {'loss': 0.4274, 'grad_norm': 0.6950749453384258, 'learning_rate': 9.577688202177525e-06, 'epoch': 0.16}
 16%|█▌ | 3489/22095 [5:43:36<19:54:33, 3.85s/it] {'loss': 0.4375, 'grad_norm': 1.0097198471065796, 'learning_rate': 9.577393348486104e-06, 'epoch': 0.16}
 16%|█▌ | 3490/22095 [5:43:39<19:45:28, 3.82s/it] {'loss': 0.4217, 'grad_norm': 0.7139937096631528, 'learning_rate': 9.577098396440557e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49315 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3491/22095 [5:43:47<25:05:11, 4.85s/it] {'loss': 0.5023, 'grad_norm': 0.5875788879196998, 'learning_rate': 9.576803346047223e-06, 'epoch': 0.16}
 16%|█▌ | 3492/22095 [5:43:51<24:20:39, 4.71s/it] {'loss': 0.4239, 'grad_norm': 0.6762842201383202, 'learning_rate': 9.576508197312441e-06, 'epoch': 0.16}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 16%|█▌ | 3493/22095 [5:44:01<32:54:13, 6.37s/it] {'loss': 0.4967, 'grad_norm': 0.37114250178621616, 'learning_rate': 9.576212950242554e-06, 'epoch': 0.16}
 16%|█▌ | 3494/22095 [5:44:05<28:32:09, 5.52s/it] {'loss': 0.443, 'grad_norm': 0.6747086799824057, 'learning_rate': 9.575917604843907e-06, 'epoch': 0.16}
 16%|█▌ | 3495/22095 [5:44:08<25:18:29, 4.90s/it] {'loss': 0.4158, 'grad_norm': 0.6714700623457361, 'learning_rate': 9.575622161122843e-06, 'epoch': 0.16}
Token indices sequence length is longer than the specified maximum sequence length for this model (47857 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65754 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90980 > 40960). Running this sequence through the model will result in indexing errors
 16%|█▌ | 3496/22095 [5:44:12<22:44:25, 4.40s/it] {'loss': 0.4436, 'grad_norm': 0.7040860890844103, 'learning_rate': 9.575326619085713e-06, 'epoch': 0.16}
 16%|█▌ | 3497/22095 [5:44:15<21:28:56, 4.16s/it] {'loss': 0.4327, 'grad_norm': 0.8520999125563331, 'learning_rate': 9.575030978738865e-06, 'epoch': 0.16}
 16%|█▌ | 3498/22095 [5:44:18<19:25:41, 3.76s/it] {'loss': 0.3907, 'grad_norm': 0.742115121650574, 'learning_rate': 9.574735240088652e-06, 'epoch': 0.16}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960204 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11039, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 
2'}, {'from': 'gpt', 'value': '【解答】解:∵AC=6,CB=3,∴AB=6+3=9,∵O是线段AB的中点,∴AO=9÷2=4.5,∴OC=AC-AO=6-4.5=1.5.'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [309, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8474697 in VC:s3://internvl-moe-sft-data/. Exception: Image size [309, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 115327, 'image': 'vrdu_texteq/astro-ph.CO/0d52879a-9cbd-4cf5-b341-b3dc85870f83.png', 'image_wh': [[309, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'which for $X$ we can write'}]} 16%|█▌ | 3499/22095 [5:44:25<24:30:32, 4.74s/it] {'loss': 0.4895, 'grad_norm': 0.6498680949772371, 'learning_rate': 9.574439403141431e-06, 'epoch': 0.16} 16%|█▌ | 3499/22095 [5:44:25<24:30:32, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43522 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (132779 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3500/22095 [5:44:28<22:05:20, 4.28s/it] {'loss': 0.3604, 'grad_norm': 0.7487791434196598, 'learning_rate': 9.574143467903554e-06, 'epoch': 0.16} 16%|█▌ | 3500/22095 [5:44:28<22:05:20, 4.28s/it] 16%|█▌ | 3501/22095 [5:44:32<20:52:22, 4.04s/it] {'loss': 0.4797, 'grad_norm': 0.7048832865090792, 'learning_rate': 9.573847434381382e-06, 'epoch': 0.16} 16%|█▌ | 3501/22095 [5:44:32<20:52:22, 4.04s/it] 16%|█▌ | 3502/22095 [5:44:35<19:18:23, 3.74s/it] {'loss': 0.4349, 'grad_norm': 0.6976048296528936, 'learning_rate': 9.573551302581279e-06, 'epoch': 0.16} 16%|█▌ | 3502/22095 [5:44:35<19:18:23, 3.74s/it] 16%|█▌ | 3503/22095 [5:44:38<18:18:59, 3.55s/it] {'loss': 0.3949, 'grad_norm': 0.7027842537035046, 'learning_rate': 9.573255072509604e-06, 'epoch': 0.16} 16%|█▌ | 3503/22095 [5:44:38<18:18:59, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3504/22095 [5:44:47<27:39:57, 5.36s/it] {'loss': 0.5011, 'grad_norm': 0.35742746636337064, 'learning_rate': 9.572958744172722e-06, 'epoch': 0.16} 16%|█▌ | 3504/22095 [5:44:47<27:39:57, 5.36s/it] 16%|█▌ | 3505/22095 [5:44:51<24:33:08, 4.75s/it] {'loss': 0.4505, 'grad_norm': 0.6404956434831562, 'learning_rate': 9.572662317577002e-06, 'epoch': 0.16} 16%|█▌ | 3505/22095 [5:44:51<24:33:08, 4.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3506/22095 [5:44:54<22:18:59, 4.32s/it] {'loss': 0.4209, 'grad_norm': 0.7493153738563227, 'learning_rate': 9.572365792728812e-06, 'epoch': 0.16} 16%|█▌ | 3506/22095 [5:44:54<22:18:59, 4.32s/it] 16%|█▌ | 3507/22095 [5:44:57<20:36:33, 3.99s/it] {'loss': 0.402, 'grad_norm': 0.6490770161540544, 'learning_rate': 9.572069169634526e-06, 'epoch': 0.16} 16%|█▌ | 3507/22095 [5:44:57<20:36:33, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3508/22095 [5:45:07<29:25:54, 5.70s/it] 
{'loss': 0.5375, 'grad_norm': 0.48412819221920733, 'learning_rate': 9.571772448300514e-06, 'epoch': 0.16} 16%|█▌ | 3508/22095 [5:45:07<29:25:54, 5.70s/it] 16%|█▌ | 3509/22095 [5:45:11<26:39:09, 5.16s/it] {'loss': 0.3852, 'grad_norm': 0.691642616032123, 'learning_rate': 9.571475628733153e-06, 'epoch': 0.16} 16%|█▌ | 3509/22095 [5:45:11<26:39:09, 5.16s/it] 16%|█▌ | 3510/22095 [5:45:15<24:24:02, 4.73s/it] {'loss': 0.421, 'grad_norm': 0.6706389167853233, 'learning_rate': 9.571178710938823e-06, 'epoch': 0.16} 16%|█▌ | 3510/22095 [5:45:15<24:24:02, 4.73s/it] 16%|█▌ | 3511/22095 [5:45:17<21:09:30, 4.10s/it] {'loss': 0.4211, 'grad_norm': 0.8253546949943985, 'learning_rate': 9.570881694923899e-06, 'epoch': 0.16} 16%|█▌ | 3511/22095 [5:45:17<21:09:30, 4.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79515 > 40960). Running this sequence through the model will result in indexing errors 16%|█▌ | 3512/22095 [5:45:20<19:16:47, 3.74s/it] {'loss': 0.4265, 'grad_norm': 0.6671583142880999, 'learning_rate': 9.570584580694768e-06, 'epoch': 0.16} 16%|█▌ | 3512/22095 [5:45:20<19:16:47, 3.74s/it] 16%|█▌ | 3513/22095 [5:45:23<18:05:47, 3.51s/it] {'loss': 0.4102, 'grad_norm': 0.9602296597346538, 'learning_rate': 9.570287368257811e-06, 'epoch': 0.16} 16%|█▌ | 3513/22095 [5:45:23<18:05:47, 3.51s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047937 in VC:s3://multi-modal/UniGeo/. 
Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5'}, {'from': 'gpt', 'value': '【解答】解:∵点D是AC的中点,如果CD=4,∴AC=2CD=8∵AB=14∴BC=AB-AC=6'}]} 16%|█▌ | 3514/22095 [5:45:26<17:26:16, 3.38s/it] {'loss': 0.4406, 'grad_norm': 0.7444960285872806, 'learning_rate': 9.569990057619414e-06, 'epoch': 0.16} 16%|█▌ | 3514/22095 [5:45:26<17:26:16, 3.38s/it] 16%|█▌ | 3515/22095 [5:45:29<16:55:12, 3.28s/it] {'loss': 0.4068, 'grad_norm': 0.6599282051949944, 'learning_rate': 9.569692648785967e-06, 'epoch': 0.16} 16%|█▌ | 3515/22095 [5:45:29<16:55:12, 3.28s/it] 16%|█▌ | 3516/22095 [5:45:34<18:56:11, 3.67s/it] {'loss': 0.415, 'grad_norm': 0.7741719356633847, 'learning_rate': 9.56939514176386e-06, 'epoch': 0.16} 16%|█▌ | 3516/22095 [5:45:34<18:56:11, 3.67s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8369953 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 36705, 'image': 'vrdu_table_final_2/astro-ph.CO/ab5abb96-9679-4f69-95e7-e7bad3b3bef6.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```'}]} 16%|█▌ | 3517/22095 [5:45:37<18:52:45, 3.66s/it] {'loss': 0.4139, 'grad_norm': 0.7258449526309476, 'learning_rate': 9.569097536559486e-06, 'epoch': 0.16} 16%|█▌ | 3517/22095 [5:45:37<18:52:45, 3.66s/it] 16%|█▌ | 3518/22095 [5:45:40<17:28:04, 3.39s/it] {'loss': 0.4041, 'grad_norm': 0.7239432327363277, 'learning_rate': 9.568799833179238e-06, 'epoch': 0.16} 16%|█▌ | 3518/22095 [5:45:40<17:28:04, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3519/22095 [5:45:51<28:52:07, 5.59s/it] {'loss': 0.5377, 'grad_norm': 0.56606812255434, 'learning_rate': 9.568502031629513e-06, 'epoch': 0.16} 16%|█▌ | 3519/22095 [5:45:51<28:52:07, 5.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59612 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90682 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68584 > 40960). Running this sequence through the model will result in indexing errors 16%|█▌ | 3520/22095 [5:45:54<25:26:13, 4.93s/it] {'loss': 0.3549, 'grad_norm': 0.6611254951118147, 'learning_rate': 9.568204131916712e-06, 'epoch': 0.16} 16%|█▌ | 3520/22095 [5:45:54<25:26:13, 4.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42032 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43877 > 40960). Running this sequence through the model will result in indexing errors 16%|█▌ | 3521/22095 [5:45:57<22:35:43, 4.38s/it] {'loss': 0.413, 'grad_norm': 0.7134987779635908, 'learning_rate': 9.567906134047233e-06, 'epoch': 0.16} 16%|█▌ | 3521/22095 [5:45:57<22:35:43, 4.38s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3522/22095 [5:46:00<20:40:10, 4.01s/it] {'loss': 0.4342, 'grad_norm': 0.6931576955652552, 'learning_rate': 9.567608038027481e-06, 'epoch': 0.16} 16%|█▌ | 3522/22095 [5:46:01<20:40:10, 4.01s/it] 16%|█▌ | 3523/22095 [5:46:03<18:50:39, 3.65s/it] {'loss': 0.3926, 'grad_norm': 0.7093232644238057, 'learning_rate': 9.567309843863862e-06, 'epoch': 0.16} 16%|█▌ | 3523/22095 [5:46:03<18:50:39, 3.65s/it] 16%|█▌ | 3524/22095 [5:46:07<18:20:39, 3.56s/it] {'loss': 0.476, 'grad_norm': 0.7141010617990805, 'learning_rate': 9.56701155156278e-06, 'epoch': 0.16} 16%|█▌ | 3524/22095 [5:46:07<18:20:39, 3.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3525/22095 [5:46:10<17:36:08, 3.41s/it] {'loss': 0.4099, 'grad_norm': 0.6637371625249113, 'learning_rate': 9.566713161130646e-06, 'epoch': 0.16} 16%|█▌ | 3525/22095 [5:46:10<17:36:08, 3.41s/it] 16%|█▌ | 3526/22095 [5:46:14<19:01:22, 3.69s/it] {'loss': 0.391, 'grad_norm': 0.7357520258636889, 'learning_rate': 9.566414672573873e-06, 'epoch': 0.16} 16%|█▌ | 3526/22095 [5:46:14<19:01:22, 3.69s/it] 16%|█▌ | 3527/22095 [5:46:17<17:49:12, 3.45s/it] {'loss': 0.4638, 'grad_norm': 0.7564330883531917, 'learning_rate': 9.566116085898872e-06, 'epoch': 0.16} 16%|█▌ | 3527/22095 [5:46:17<17:49:12, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3528/22095 
[5:46:25<25:32:28, 4.95s/it] {'loss': 0.5584, 'grad_norm': 0.49859478913625926, 'learning_rate': 9.565817401112061e-06, 'epoch': 0.16} 16%|█▌ | 3528/22095 [5:46:25<25:32:28, 4.95s/it] 16%|█▌ | 3529/22095 [5:46:35<32:22:17, 6.28s/it] {'loss': 0.4982, 'grad_norm': 0.3963057852996049, 'learning_rate': 9.565518618219857e-06, 'epoch': 0.16} 16%|█▌ | 3529/22095 [5:46:35<32:22:17, 6.28s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8304149 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1_pj8mSfD8KJjSszhXXbIJFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请分析这张图片,并提取其中的所有文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n真正发热12小时的暖贴\n满50片送暖手蛋1个\n萱草大号自发热暖贴\n草\n萱\n萱草熊熊系列暖贴\nHotPatch\n萱草\n百货专柜518'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957618 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8453, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm'}, {'from': 'gpt', 'value': '【解答】解:根据上图所示OB=5cm-OA,∵OA=(AB+BC)÷2=4cm,∴OB=1cm.'}]} 16%|█▌ | 3530/22095 [5:46:38<28:14:23, 5.48s/it] {'loss': 0.4088, 'grad_norm': 0.7049571051495215, 'learning_rate': 9.56521973722868e-06, 'epoch': 0.16} 16%|█▌ | 3530/22095 [5:46:38<28:14:23, 5.48s/it] 16%|█▌ | 3531/22095 [5:46:42<25:00:59, 4.85s/it] {'loss': 0.3681, 'grad_norm': 0.6898393168730423, 'learning_rate': 9.564920758144951e-06, 'epoch': 0.16} 16%|█▌ | 3531/22095 [5:46:42<25:00:59, 4.85s/it] 16%|█▌ | 3532/22095 [5:46:45<23:14:53, 4.51s/it] {'loss': 0.3881, 'grad_norm': 0.8196229163508992, 'learning_rate': 9.564621680975095e-06, 'epoch': 0.16} 16%|█▌ | 3532/22095 [5:46:46<23:14:53, 4.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47929 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44553 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3533/22095 [5:46:49<21:26:02, 4.16s/it] {'loss': 0.4025, 'grad_norm': 0.7760658972671267, 'learning_rate': 9.564322505725539e-06, 'epoch': 0.16} 16%|█▌ | 3533/22095 [5:46:49<21:26:02, 4.16s/it] 16%|█▌ | 3534/22095 [5:46:52<19:42:37, 3.82s/it] {'loss': 0.4105, 'grad_norm': 0.7553452613369879, 'learning_rate': 9.56402323240271e-06, 'epoch': 0.16} 16%|█▌ | 3534/22095 [5:46:52<19:42:37, 3.82s/it] 16%|█▌ | 3535/22095 [5:46:55<18:37:17, 3.61s/it] {'loss': 0.3949, 'grad_norm': 0.7444146630985331, 'learning_rate': 9.563723861013039e-06, 'epoch': 0.16} 16%|█▌ | 3535/22095 [5:46:55<18:37:17, 3.61s/it] 16%|█▌ | 3536/22095 [5:46:58<17:52:03, 3.47s/it] {'loss': 0.375, 'grad_norm': 0.634783786367562, 'learning_rate': 9.563424391562958e-06, 'epoch': 0.16} 16%|█▌ | 3536/22095 [5:46:58<17:52:03, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66199 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109804 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3537/22095 [5:47:02<18:36:53, 3.61s/it] {'loss': 0.4077, 'grad_norm': 0.6960440889965599, 'learning_rate': 9.563124824058905e-06, 'epoch': 0.16} 16%|█▌ | 3537/22095 [5:47:02<18:36:53, 3.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3538/22095 [5:47:06<18:21:02, 3.56s/it] {'loss': 0.4072, 'grad_norm': 0.6738085708575341, 'learning_rate': 9.562825158507311e-06, 'epoch': 0.16} 16%|█▌ | 3538/22095 [5:47:06<18:21:02, 3.56s/it] 16%|█▌ | 3539/22095 [5:47:09<18:44:34, 3.64s/it] {'loss': 0.4379, 'grad_norm': 0.730674294084897, 'learning_rate': 9.562525394914621e-06, 'epoch': 0.16} 16%|█▌ | 3539/22095 [5:47:09<18:44:34, 3.64s/it] 16%|█▌ | 3540/22095 [5:47:12<17:26:51, 3.39s/it] {'loss': 0.4621, 'grad_norm': 0.8099966068598785, 'learning_rate': 9.562225533287271e-06, 'epoch': 0.16} 16%|█▌ | 3540/22095 [5:47:12<17:26:51, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3541/22095 [5:47:22<27:02:08, 5.25s/it] {'loss': 0.5253, 'grad_norm': 0.8510329125183358, 'learning_rate': 9.561925573631706e-06, 'epoch': 0.16} 16%|█▌ | 3541/22095 [5:47:22<27:02:08, 5.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43534 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3542/22095 [5:47:25<24:07:59, 4.68s/it] {'loss': 0.3987, 'grad_norm': 0.7692233573042879, 'learning_rate': 9.561625515954372e-06, 'epoch': 0.16} 16%|█▌ | 3542/22095 [5:47:25<24:07:59, 4.68s/it] 16%|█▌ | 3543/22095 [5:47:29<22:14:36, 4.32s/it] {'loss': 0.4236, 'grad_norm': 0.7289333091370424, 'learning_rate': 9.561325360261714e-06, 'epoch': 0.16} 16%|█▌ | 3543/22095 [5:47:29<22:14:36, 4.32s/it] 16%|█▌ | 3544/22095 [5:47:33<22:17:21, 4.33s/it] {'loss': 0.39, 'grad_norm': 0.6737579331807066, 'learning_rate': 9.561025106560184e-06, 'epoch': 0.16} 16%|█▌ | 3544/22095 [5:47:33<22:17:21, 4.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3545/22095 [5:47:40<25:54:21, 5.03s/it] {'loss': 0.5043, 'grad_norm': 1.0999296662435172, 'learning_rate': 9.560724754856234e-06, 'epoch': 0.16} 16%|█▌ | 3545/22095 [5:47:40<25:54:21, 5.03s/it] 16%|█▌ | 3546/22095 [5:47:43<23:13:52, 4.51s/it] {'loss': 0.4529, 'grad_norm': 0.8174191146577231, 'learning_rate': 9.560424305156314e-06, 'epoch': 0.16} 16%|█▌ | 3546/22095 [5:47:43<23:13:52, 4.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307710 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2965Lb46I8KJjSszfXXaZVXXa_!!646445699.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请分析这张图片,并提取其中的所有文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n中华射箭户外\n不仅可以锻\n炼体\n练习弓箭\n质还可以\n增强实力哟!!!'}]} 16%|█▌ | 3547/22095 [5:47:46<21:31:00, 4.18s/it] {'loss': 0.4264, 'grad_norm': 0.7134271369346344, 'learning_rate': 9.560123757466885e-06, 'epoch': 0.16} 16%|█▌ | 3547/22095 [5:47:46<21:31:00, 4.18s/it] 16%|█▌ | 3548/22095 [5:47:50<20:37:17, 4.00s/it] {'loss': 0.4666, 'grad_norm': 0.7892566606627814, 'learning_rate': 9.5598231117944e-06, 'epoch': 0.16} 16%|█▌ | 3548/22095 [5:47:50<20:37:17, 4.00s/it] 16%|█▌ | 3549/22095 [5:47:53<19:41:08, 3.82s/it] {'loss': 0.4007, 'grad_norm': 0.6618307576773829, 'learning_rate': 9.559522368145319e-06, 'epoch': 0.16} 16%|█▌ | 3549/22095 [5:47:53<19:41:08, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3550/22095 [5:47:59<22:56:31, 4.45s/it] {'loss': 0.5258, 'grad_norm': 0.5561735482323752, 'learning_rate': 9.55922152652611e-06, 'epoch': 0.16} 16%|█▌ | 3550/22095 [5:47:59<22:56:31, 4.45s/it] 16%|█▌ | 3551/22095 [5:48:08<29:23:50, 5.71s/it] {'loss': 0.5249, 'grad_norm': 0.45960926418367365, 'learning_rate': 9.55892058694323e-06, 'epoch': 0.16} 16%|█▌ | 3551/22095 [5:48:08<29:23:50, 5.71s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 16%|█▌ | 3552/22095 [5:48:11<25:42:34, 4.99s/it] {'loss': 0.3931, 'grad_norm': 1.039650118403853, 'learning_rate': 9.558619549403148e-06, 'epoch': 0.16} 16%|█▌ | 3552/22095 [5:48:11<25:42:34, 4.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (137912 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3553/22095 [5:48:14<23:07:13, 4.49s/it] {'loss': 0.42, 'grad_norm': 0.7145501228608365, 'learning_rate': 9.558318413912333e-06, 'epoch': 0.16} 16%|█▌ | 3553/22095 [5:48:14<23:07:13, 4.49s/it] 16%|█▌ | 3554/22095 [5:48:18<21:15:28, 4.13s/it] {'loss': 0.4074, 'grad_norm': 0.7942813553983717, 'learning_rate': 9.558017180477256e-06, 'epoch': 0.16} 16%|█▌ | 3554/22095 [5:48:18<21:15:28, 4.13s/it] 16%|█▌ | 3555/22095 [5:48:22<20:54:42, 4.06s/it] {'loss': 0.3903, 'grad_norm': 1.2037104862401944, 'learning_rate': 9.557715849104388e-06, 'epoch': 0.16} 16%|█▌ | 3555/22095 [5:48:22<20:54:42, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114195 > 40960). Running this sequence through the model will result in indexing errors 16%|█▌ | 3556/22095 [5:48:25<19:32:50, 3.80s/it] {'loss': 0.3849, 'grad_norm': 0.6698408169445956, 'learning_rate': 9.557414419800204e-06, 'epoch': 0.16} 16%|█▌ | 3556/22095 [5:48:25<19:32:50, 3.80s/it] 16%|█▌ | 3557/22095 [5:48:28<18:17:18, 3.55s/it] {'loss': 0.4154, 'grad_norm': 0.6827869819031588, 'learning_rate': 9.557112892571182e-06, 'epoch': 0.16} 16%|█▌ | 3557/22095 [5:48:28<18:17:18, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56768 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3558/22095 [5:48:31<17:02:56, 3.31s/it] {'loss': 0.4312, 'grad_norm': 0.7589222223962308, 'learning_rate': 9.5568112674238e-06, 'epoch': 0.16} 16%|█▌ | 3558/22095 [5:48:31<17:02:56, 3.31s/it] 16%|█▌ | 3559/22095 [5:48:33<16:19:05, 3.17s/it] {'loss': 0.4274, 'grad_norm': 0.7177871268316959, 'learning_rate': 9.556509544364538e-06, 'epoch': 0.16} 16%|█▌ | 3559/22095 [5:48:33<16:19:05, 3.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3560/22095 [5:48:43<25:54:08, 5.03s/it] {'loss': 0.4941, 'grad_norm': 0.8385093075093688, 'learning_rate': 9.556207723399882e-06, 'epoch': 0.16} 16%|█▌ | 3560/22095 [5:48:43<25:54:08, 5.03s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8302256 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1KB4MLXXXXXblXpXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCan you help me analyze the text noted on this picture?'}, {'from': 'gpt', 'value': 'All words in the image:\n中原智能家控生活馆\nmm\n324000521\nQQ:\n9\n11\n15'}]} 16%|█▌ | 3561/22095 [5:48:46<23:14:58, 4.52s/it] {'loss': 0.4128, 'grad_norm': 0.7822093267945871, 'learning_rate': 9.555905804536315e-06, 'epoch': 0.16} 16%|█▌ | 3561/22095 [5:48:46<23:14:58, 4.52s/it] 16%|█▌ | 3562/22095 [5:48:49<20:49:34, 4.05s/it] {'loss': 0.4083, 'grad_norm': 0.8118716712352124, 'learning_rate': 9.555603787780321e-06, 'epoch': 0.16} 16%|█▌ | 3562/22095 [5:48:49<20:49:34, 4.05s/it] 16%|█▌ | 3563/22095 [5:48:52<18:55:12, 3.68s/it] {'loss': 0.4207, 'grad_norm': 0.709902451425543, 'learning_rate': 9.555301673138397e-06, 'epoch': 0.16} 16%|█▌ | 3563/22095 [5:48:52<18:55:12, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3564/22095 [5:49:01<27:43:16, 5.39s/it] {'loss': 0.5039, 'grad_norm': 0.4298801720977222, 'learning_rate': 9.55499946061703e-06, 'epoch': 0.16} 16%|█▌ | 3564/22095 [5:49:01<27:43:16, 5.39s/it] 16%|█▌ | 3565/22095 [5:49:04<24:23:52, 4.74s/it] {'loss': 0.4508, 'grad_norm': 0.7351410622515675, 'learning_rate': 9.554697150222713e-06, 'epoch': 0.16} 16%|█▌ | 3565/22095 [5:49:04<24:23:52, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3566/22095 [5:49:12<28:27:10, 5.53s/it] {'loss': 0.5276, 'grad_norm': 0.4190010950162947, 'learning_rate': 9.554394741961944e-06, 'epoch': 0.16} 16%|█▌ | 3566/22095 [5:49:12<28:27:10, 5.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72038 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▌ | 3567/22095 [5:49:15<25:11:49, 4.90s/it] {'loss': 0.4163, 'grad_norm': 0.7005793374248309, 'learning_rate': 9.554092235841219e-06, 'epoch': 0.16} 16%|█▌ | 3567/22095 [5:49:15<25:11:49, 4.90s/it] 16%|█▌ | 3568/22095 [5:49:19<22:53:48, 4.45s/it] {'loss': 0.4009, 'grad_norm': 0.7101329404583272, 'learning_rate': 9.553789631867039e-06, 'epoch': 0.16} 16%|█▌ | 3568/22095 [5:49:19<22:53:48, 4.45s/it] 16%|█▌ | 3569/22095 [5:49:22<21:04:02, 4.09s/it] {'loss': 0.4007, 'grad_norm': 0.671422559631563, 'learning_rate': 9.553486930045906e-06, 'epoch': 0.16} 16%|█▌ | 3569/22095 [5:49:22<21:04:02, 4.09s/it] 16%|█▌ | 3570/22095 [5:49:25<19:20:06, 3.76s/it] {'loss': 0.4549, 'grad_norm': 0.8008267532909126, 'learning_rate': 9.553184130384324e-06, 'epoch': 0.16} 16%|█▌ | 3570/22095 [5:49:25<19:20:06, 3.76s/it] 16%|█▌ | 3571/22095 [5:49:28<19:06:46, 3.71s/it] {'loss': 0.4455, 'grad_norm': 0.7376105220646921, 'learning_rate': 9.5528812328888e-06, 'epoch': 0.16} 16%|█▌ | 3571/22095 [5:49:28<19:06:46, 3.71s/it] 16%|█▌ | 3572/22095 [5:49:33<19:44:07, 3.84s/it] {'loss': 0.4112, 'grad_norm': 0.6704041835203152, 'learning_rate': 9.552578237565839e-06, 'epoch': 0.16} 16%|█▌ | 3572/22095 [5:49:33<19:44:07, 3.84s/it] 16%|█▌ | 3573/22095 [5:49:36<18:17:27, 3.56s/it] {'loss': 0.4417, 'grad_norm': 0.6929611158994938, 'learning_rate': 9.552275144421953e-06, 'epoch': 0.16} 16%|█▌ | 3573/22095 [5:49:36<18:17:27, 3.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3574/22095 [5:49:39<17:59:18, 3.50s/it] {'loss': 0.4215, 'grad_norm': 0.6915671452152358, 'learning_rate': 9.551971953463659e-06, 'epoch': 0.16} 16%|█▌ | 3574/22095 [5:49:39<17:59:18, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▌ | 3575/22095 [5:49:48<27:08:16, 5.28s/it] {'loss': 0.5025, 'grad_norm': 1.136766335665436, 'learning_rate': 
9.551668664697467e-06, 'epoch': 0.16} 16%|█▌ | 3575/22095 [5:49:48<27:08:16, 5.28s/it] 16%|█▌ | 3576/22095 [5:49:52<24:33:52, 4.78s/it] {'loss': 0.3585, 'grad_norm': 0.6674529201333661, 'learning_rate': 9.551365278129894e-06, 'epoch': 0.16} 16%|█▌ | 3576/22095 [5:49:52<24:33:52, 4.78s/it] 16%|█▌ | 3577/22095 [5:49:55<21:38:47, 4.21s/it] {'loss': 0.4171, 'grad_norm': 0.8954324093302272, 'learning_rate': 9.55106179376746e-06, 'epoch': 0.16} 16%|█▌ | 3577/22095 [5:49:55<21:38:47, 4.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3578/22095 [5:49:58<20:36:48, 4.01s/it] {'loss': 0.4321, 'grad_norm': 0.78892785286471, 'learning_rate': 9.550758211616684e-06, 'epoch': 0.16} 16%|█▌ | 3578/22095 [5:49:58<20:36:48, 4.01s/it] 16%|█▌ | 3579/22095 [5:50:02<20:11:39, 3.93s/it] {'loss': 0.3647, 'grad_norm': 0.6293828498610229, 'learning_rate': 9.550454531684092e-06, 'epoch': 0.16} 16%|█▌ | 3579/22095 [5:50:02<20:11:39, 3.93s/it] 16%|█▌ | 3580/22095 [5:50:05<18:56:47, 3.68s/it] {'loss': 0.4225, 'grad_norm': 0.7028121697581299, 'learning_rate': 9.550150753976209e-06, 'epoch': 0.16} 16%|█▌ | 3580/22095 [5:50:05<18:56:47, 3.68s/it] 16%|█▌ | 3581/22095 [5:50:08<17:29:49, 3.40s/it] {'loss': 0.4268, 'grad_norm': 0.6885948868606172, 'learning_rate': 9.54984687849956e-06, 'epoch': 0.16} 16%|█▌ | 3581/22095 [5:50:08<17:29:49, 3.40s/it] 16%|█▌ | 3582/22095 [5:50:12<17:48:02, 3.46s/it] {'loss': 0.4105, 'grad_norm': 0.6863217671163234, 'learning_rate': 9.549542905260674e-06, 'epoch': 0.16} 16%|█▌ | 3582/22095 [5:50:12<17:48:02, 3.46s/it] 16%|█▌ | 3583/22095 [5:50:15<17:45:15, 3.45s/it] {'loss': 0.3946, 'grad_norm': 0.7042879685578601, 'learning_rate': 9.549238834266086e-06, 'epoch': 0.16} 16%|█▌ | 3583/22095 [5:50:15<17:45:15, 3.45s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (93963 > 40960). Running this sequence through the model will result in indexing errors 16%|█▌ | 3584/22095 [5:50:18<17:41:02, 3.44s/it] {'loss': 0.3693, 'grad_norm': 0.6589248712522245, 'learning_rate': 9.548934665522325e-06, 'epoch': 0.16} 16%|█▌ | 3584/22095 [5:50:18<17:41:02, 3.44s/it] 16%|█▌ | 3585/22095 [5:50:21<17:12:11, 3.35s/it] {'loss': 0.3809, 'grad_norm': 0.7011875986352972, 'learning_rate': 9.548630399035931e-06, 'epoch': 0.16} 16%|█▌ | 3585/22095 [5:50:22<17:12:11, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3586/22095 [5:50:25<17:14:57, 3.36s/it] {'loss': 0.4031, 'grad_norm': 0.7806831482841735, 'learning_rate': 9.54832603481344e-06, 'epoch': 0.16} 16%|█▌ | 3586/22095 [5:50:25<17:14:57, 3.36s/it] 16%|█▌ | 3587/22095 [5:50:28<16:21:37, 3.18s/it] {'loss': 0.4171, 'grad_norm': 0.7346212570439461, 'learning_rate': 9.54802157286139e-06, 'epoch': 0.16} 16%|█▌ | 3587/22095 [5:50:28<16:21:37, 3.18s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047794 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 
5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 16%|█▌ | 3588/22095 [5:50:31<16:25:03, 3.19s/it] {'loss': 0.4172, 'grad_norm': 0.6962142519779313, 'learning_rate': 9.547717013186326e-06, 'epoch': 0.16} 16%|█▌ | 3588/22095 [5:50:31<16:25:03, 3.19s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▌ | 3589/22095 [5:50:35<17:07:32, 3.33s/it] {'loss': 0.4122, 'grad_norm': 0.6451915080977847, 'learning_rate': 9.547412355794789e-06, 'epoch': 0.16} 16%|█▌ | 3589/22095 [5:50:35<17:07:32, 3.33s/it] 16%|█▌ | 3590/22095 [5:50:38<17:21:31, 3.38s/it] {'loss': 0.4314, 'grad_norm': 0.6863066177902849, 'learning_rate': 9.547107600693328e-06, 'epoch': 0.16} 16%|█▌ | 3590/22095 [5:50:38<17:21:31, 3.38s/it] 16%|█▋ | 3591/22095 [5:50:42<17:48:34, 3.46s/it] {'loss': 0.4208, 'grad_norm': 0.6418986780368956, 'learning_rate': 9.54680274788849e-06, 'epoch': 0.16} 16%|█▋ | 3591/22095 [5:50:42<17:48:34, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3592/22095 [5:50:49<23:54:51, 4.65s/it] {'loss': 0.552, 'grad_norm': 0.9359545702071064, 'learning_rate': 9.546497797386824e-06, 'epoch': 0.16} 16%|█▋ | 3592/22095 [5:50:49<23:54:51, 4.65s/it] 16%|█▋ | 3593/22095 [5:50:52<21:46:11, 4.24s/it] {'loss': 0.4228, 'grad_norm': 0.7327979098491275, 'learning_rate': 9.546192749194885e-06, 'epoch': 0.16} 16%|█▋ | 3593/22095 [5:50:52<21:46:11, 4.24s/it] 16%|█▋ | 3594/22095 [5:50:56<21:09:10, 4.12s/it] {'loss': 0.4029, 'grad_norm': 0.8076526403856159, 'learning_rate': 9.545887603319228e-06, 'epoch': 0.16} 16%|█▋ | 3594/22095 [5:50:56<21:09:10, 4.12s/it] 16%|█▋ | 3595/22095 [5:50:59<19:26:41, 3.78s/it] {'loss': 0.3929, 'grad_norm': 0.7663084724651145, 'learning_rate': 9.545582359766405e-06, 'epoch': 0.16} 16%|█▋ | 3595/22095 [5:50:59<19:26:41, 3.78s/it] 16%|█▋ | 3596/22095 [5:51:03<19:39:42, 3.83s/it] {'loss': 0.3769, 'grad_norm': 
0.6985073830450215, 'learning_rate': 9.54527701854298e-06, 'epoch': 0.16} 16%|█▋ | 3596/22095 [5:51:03<19:39:42, 3.83s/it] 16%|█▋ | 3597/22095 [5:51:07<19:47:57, 3.85s/it] {'loss': 0.4669, 'grad_norm': 0.7097929957097808, 'learning_rate': 9.544971579655512e-06, 'epoch': 0.16} 16%|█▋ | 3597/22095 [5:51:07<19:47:57, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48701 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53744 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57369 > 40960). Running this sequence through the model will result in indexing errors 16%|█▋ | 3598/22095 [5:51:10<19:02:54, 3.71s/it] {'loss': 0.4041, 'grad_norm': 0.6590480282146434, 'learning_rate': 9.544666043110562e-06, 'epoch': 0.16} 16%|█▋ | 3598/22095 [5:51:10<19:02:54, 3.71s/it] 16%|█▋ | 3599/22095 [5:51:15<20:06:14, 3.91s/it] {'loss': 0.4011, 'grad_norm': 0.6845449427550792, 'learning_rate': 9.544360408914696e-06, 'epoch': 0.16} 16%|█▋ | 3599/22095 [5:51:15<20:06:14, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99575 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41489 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▋ | 3600/22095 [5:51:18<18:44:23, 3.65s/it] {'loss': 0.4056, 'grad_norm': 0.7318456776258486, 'learning_rate': 9.544054677074483e-06, 'epoch': 0.16} 16%|█▋ | 3600/22095 [5:51:18<18:44:23, 3.65s/it] 16%|█▋ | 3601/22095 [5:51:22<19:36:55, 3.82s/it] {'loss': 0.4206, 'grad_norm': 0.7210936459362473, 'learning_rate': 9.543748847596491e-06, 'epoch': 0.16} 16%|█▋ | 3601/22095 [5:51:22<19:36:55, 3.82s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▋ | 3602/22095 [5:51:26<19:14:16, 3.75s/it] {'loss': 0.3923, 'grad_norm': 0.7099193412366627, 'learning_rate': 9.543442920487291e-06, 'epoch': 0.16} 16%|█▋ | 3602/22095 [5:51:26<19:14:16, 3.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▋ | 3603/22095 [5:51:29<17:55:24, 3.49s/it] {'loss': 0.4666, 'grad_norm': 0.7374127538432773, 'learning_rate': 9.543136895753458e-06, 'epoch': 0.16} 16%|█▋ | 3603/22095 [5:51:29<17:55:24, 3.49s/it] 16%|█▋ | 3604/22095 [5:51:32<17:25:43, 3.39s/it] {'loss': 0.3662, 'grad_norm': 0.6380376806590465, 'learning_rate': 9.542830773401564e-06, 'epoch': 0.16} 16%|█▋ | 3604/22095 [5:51:32<17:25:43, 3.39s/it] 16%|█▋ | 3605/22095 [5:51:35<16:44:56, 3.26s/it] {'loss': 0.4414, 'grad_norm': 0.7137586537082149, 'learning_rate': 9.54252455343819e-06, 'epoch': 0.16} 16%|█▋ | 3605/22095 [5:51:35<16:44:56, 3.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3606/22095 [5:51:45<27:00:04, 5.26s/it] {'loss': 0.4986, 'grad_norm': 0.9055787596951604, 'learning_rate': 9.542218235869915e-06, 'epoch': 0.16} 16%|█▋ | 3606/22095 [5:51:45<27:00:04, 5.26s/it] 16%|█▋ | 3607/22095 [5:51:48<24:50:56, 4.84s/it] {'loss': 0.4135, 'grad_norm': 0.8021776941432398, 'learning_rate': 9.54191182070332e-06, 'epoch': 0.16} 16%|█▋ | 3607/22095 [5:51:48<24:50:56, 4.84s/it]Invalidate 
trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3608/22095 [5:51:56<29:11:08, 5.68s/it] {'loss': 0.5387, 'grad_norm': 0.5627123166001644, 'learning_rate': 9.54160530794499e-06, 'epoch': 0.16} 16%|█▋ | 3608/22095 [5:51:56<29:11:08, 5.68s/it] 16%|█▋ | 3609/22095 [5:52:00<26:05:18, 5.08s/it] {'loss': 0.4124, 'grad_norm': 0.7214389595222919, 'learning_rate': 9.541298697601508e-06, 'epoch': 0.16} 16%|█▋ | 3609/22095 [5:52:00<26:05:18, 5.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41496 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110507 > 40960). Running this sequence through the model will result in indexing errors 16%|█▋ | 3610/22095 [5:52:03<23:36:37, 4.60s/it] {'loss': 0.4447, 'grad_norm': 0.7582642791339812, 'learning_rate': 9.540991989679468e-06, 'epoch': 0.16} 16%|█▋ | 3610/22095 [5:52:03<23:36:37, 4.60s/it] 16%|█▋ | 3611/22095 [5:52:07<22:51:18, 4.45s/it] {'loss': 0.4116, 'grad_norm': 0.7113602407039508, 'learning_rate': 9.540685184185455e-06, 'epoch': 0.16} 16%|█▋ | 3611/22095 [5:52:07<22:51:18, 4.45s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▋ | 3612/22095 [5:52:11<21:19:28, 4.15s/it] {'loss': 0.4665, 'grad_norm': 0.7172389497873176, 'learning_rate': 9.540378281126064e-06, 'epoch': 0.16} 16%|█▋ | 3612/22095 [5:52:11<21:19:28, 4.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70132 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67378 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76270 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47623 > 40960). Running this sequence through the model will result in indexing errors 16%|█▋ | 3613/22095 [5:52:14<19:56:01, 3.88s/it] {'loss': 0.4648, 'grad_norm': 0.7156189181787637, 'learning_rate': 9.540071280507887e-06, 'epoch': 0.16} 16%|█▋ | 3613/22095 [5:52:14<19:56:01, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60439 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60396 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53272 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▋ | 3614/22095 [5:52:17<18:52:37, 3.68s/it] {'loss': 0.3967, 'grad_norm': 0.7428520280649665, 'learning_rate': 9.539764182337523e-06, 'epoch': 0.16} 16%|█▋ | 3614/22095 [5:52:17<18:52:37, 3.68s/it] 16%|█▋ | 3615/22095 [5:52:20<17:29:01, 3.41s/it] {'loss': 0.4227, 'grad_norm': 0.6564775779238287, 'learning_rate': 9.539456986621568e-06, 'epoch': 0.16} 16%|█▋ | 3615/22095 [5:52:20<17:29:01, 3.41s/it] 16%|█▋ | 3616/22095 [5:52:23<17:02:37, 3.32s/it] {'loss': 0.4373, 'grad_norm': 0.5938727255118851, 'learning_rate': 9.539149693366628e-06, 'epoch': 0.16} 16%|█▋ | 3616/22095 [5:52:23<17:02:37, 3.32s/it] 16%|█▋ | 3617/22095 [5:52:26<16:37:47, 3.24s/it] {'loss': 0.3769, 'grad_norm': 0.8369196659255341, 'learning_rate': 9.538842302579299e-06, 'epoch': 0.16} 16%|█▋ | 3617/22095 [5:52:26<16:37:47, 3.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (83958 > 40960). 
Running this sequence through the model will result in indexing errors 16%|█▋ | 3618/22095 [5:52:35<25:59:05, 5.06s/it] {'loss': 0.5097, 'grad_norm': 1.165080626912052, 'learning_rate': 9.538534814266187e-06, 'epoch': 0.16} 16%|█▋ | 3618/22095 [5:52:36<25:59:05, 5.06s/it] 16%|█▋ | 3619/22095 [5:52:39<23:14:05, 4.53s/it] {'loss': 0.4292, 'grad_norm': 0.8593480539402252, 'learning_rate': 9.538227228433905e-06, 'epoch': 0.16} 16%|█▋ | 3619/22095 [5:52:39<23:14:05, 4.53s/it] 16%|█▋ | 3620/22095 [5:52:42<21:01:27, 4.10s/it] {'loss': 0.3933, 'grad_norm': 0.7962035527894968, 'learning_rate': 9.537919545089057e-06, 'epoch': 0.16} 16%|█▋ | 3620/22095 [5:52:42<21:01:27, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3621/22095 [5:52:51<28:37:40, 5.58s/it] {'loss': 0.5197, 'grad_norm': 0.6133916320357296, 'learning_rate': 9.537611764238253e-06, 'epoch': 0.16} 16%|█▋ | 3621/22095 [5:52:51<28:37:40, 5.58s/it] 16%|█▋ | 3622/22095 [5:53:00<34:37:10, 6.75s/it] {'loss': 0.4907, 'grad_norm': 0.45833468804418676, 'learning_rate': 9.53730388588811e-06, 'epoch': 0.16} 16%|█▋ | 3622/22095 [5:53:00<34:37:10, 6.75s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 16%|█▋ | 3623/22095 [5:53:05<30:48:38, 6.00s/it] {'loss': 0.3854, 'grad_norm': 0.6799233073262908, 'learning_rate': 9.536995910045241e-06, 'epoch': 0.16} 16%|█▋ | 3623/22095 [5:53:05<30:48:38, 6.00s/it] 16%|█▋ | 3624/22095 [5:53:08<27:10:20, 5.30s/it] {'loss': 0.4124, 'grad_norm': 0.7132031385597974, 'learning_rate': 9.536687836716265e-06, 'epoch': 0.16} 16%|█▋ | 3624/22095 [5:53:08<27:10:20, 5.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise 
ValueError( ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8408362 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10555, 'image': 'vrdu_table_final_2/astro-ph.CO/e492e590-7572-4321-a76f-d1078da71805.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]} 16%|█▋ | 3625/22095 [5:53:11<23:27:43, 4.57s/it] {'loss': 0.4351, 'grad_norm': 1.1479222305246017, 'learning_rate': 9.536379665907801e-06, 'epoch': 0.16} 16%|█▋ | 3625/22095 [5:53:11<23:27:43, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3626/22095 [5:53:19<28:08:20, 5.48s/it] {'loss': 0.5495, 'grad_norm': 0.8723725440454708, 'learning_rate': 9.53607139762647e-06, 'epoch': 0.16} 16%|█▋ | 3626/22095 [5:53:19<28:08:20, 5.48s/it] 16%|█▋ | 3627/22095 [5:53:22<24:40:04, 4.81s/it] {'loss': 0.4412, 'grad_norm': 0.7988608216779318, 'learning_rate': 9.535763031878895e-06, 'epoch': 0.16} 16%|█▋ | 3627/22095 [5:53:22<24:40:04, 4.81s/it] 16%|█▋ | 3628/22095 [5:53:26<23:25:58, 4.57s/it] {'loss': 0.4246, 'grad_norm': 0.6804044503637056, 'learning_rate': 9.535454568671705e-06, 'epoch': 0.16} 16%|█▋ | 3628/22095 [5:53:26<23:25:58, 4.57s/it] 16%|█▋ | 3629/22095 [5:53:30<22:40:51, 4.42s/it] {'loss': 0.4234, 'grad_norm': 0.7159799420828592, 'learning_rate': 9.535146008011524e-06, 'epoch': 0.16} 16%|█▋ | 3629/22095 [5:53:30<22:40:51, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55993 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (149383 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (54632 > 40960) for 4 sample(s). Truncating to 24152 with 1 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (80390 > 40960). Running this sequence through the model will result in indexing errors 16%|█▋ | 3630/22095 [5:53:33<21:02:37, 4.10s/it] {'loss': 0.3846, 'grad_norm': 0.8517435555100842, 'learning_rate': 9.534837349904986e-06, 'epoch': 0.16} 16%|█▋ | 3630/22095 [5:53:33<21:02:37, 4.10s/it] 16%|█▋ | 3631/22095 [5:53:36<19:19:07, 3.77s/it] {'loss': 0.4253, 'grad_norm': 0.7481983359140346, 'learning_rate': 9.534528594358718e-06, 'epoch': 0.16} 16%|█▋ | 3631/22095 [5:53:36<19:19:07, 3.77s/it] 16%|█▋ | 3632/22095 [5:53:39<17:57:40, 3.50s/it] {'loss': 0.4291, 'grad_norm': 0.746705182354197, 'learning_rate': 9.53421974137936e-06, 'epoch': 0.16} 16%|█▋ | 3632/22095 [5:53:39<17:57:40, 3.50s/it] 16%|█▋ | 3633/22095 [5:53:43<17:40:41, 3.45s/it] {'loss': 0.42, 'grad_norm': 0.656268854264488, 'learning_rate': 9.533910790973545e-06, 'epoch': 0.16} 16%|█▋ | 3633/22095 [5:53:43<17:40:41, 3.45s/it] 16%|█▋ | 3634/22095 [5:53:46<16:59:38, 3.31s/it] {'loss': 0.454, 'grad_norm': 0.715318601638388, 'learning_rate': 9.533601743147911e-06, 'epoch': 0.16} 16%|█▋ | 3634/22095 [5:53:46<16:59:38, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86804 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76684 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72548 > 40960). Running this sequence through the model will result in indexing errors 16%|█▋ | 3635/22095 [5:53:49<17:26:32, 3.40s/it] {'loss': 0.4316, 'grad_norm': 0.6353354533937493, 'learning_rate': 9.533292597909101e-06, 'epoch': 0.16} 16%|█▋ | 3635/22095 [5:53:49<17:26:32, 3.40s/it] 16%|█▋ | 3636/22095 [5:53:53<17:21:15, 3.38s/it] {'loss': 0.4179, 'grad_norm': 0.6869695424669242, 'learning_rate': 9.532983355263753e-06, 'epoch': 0.16} 16%|█▋ | 3636/22095 [5:53:53<17:21:15, 3.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8344685 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 11337, 'image': 'vrdu_table_final_2/astro-ph.CO/d77dc4e3-2f67-40c2-addd-2992078a076b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 16%|█▋ | 3637/22095 [5:53:56<17:34:00, 3.43s/it] {'loss': 0.3999, 'grad_norm': 0.6966617156005382, 'learning_rate': 9.532674015218519e-06, 'epoch': 0.16} 16%|█▋ | 3637/22095 [5:53:56<17:34:00, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3638/22095 [5:54:07<28:28:03, 5.55s/it] {'loss': 0.5192, 'grad_norm': 0.738112774923969, 'learning_rate': 9.532364577780039e-06, 'epoch': 0.16} 16%|█▋ | 3638/22095 [5:54:07<28:28:03, 5.55s/it] 16%|█▋ | 3639/22095 [5:54:10<25:41:46, 5.01s/it] {'loss': 0.4315, 'grad_norm': 0.7197831342194553, 'learning_rate': 9.532055042954964e-06, 'epoch': 0.16} 16%|█▋ | 3639/22095 [5:54:10<25:41:46, 5.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 16%|█▋ | 3640/22095 [5:54:14<22:54:54, 4.47s/it] {'loss': 0.432, 'grad_norm': 0.6691610827792658, 'learning_rate': 9.531745410749946e-06, 'epoch': 0.16} 16%|█▋ | 3640/22095 [5:54:14<22:54:54, 4.47s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 7797924 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. 
Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': '19264', 'image': '28739.jpg', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\nI am providing an answer to the question below based on the image: \nCan you describe the background and setting of the image in detail? \nHere is my response: \nThe image presents a diagram of a curved path labeled \\( CD \\), with waypoints \\( P \\) and \\( Q \\) marked along the curve.\n\n1. **Curved Path \\( CD \\)**: The primary feature in the image is a curved path extending from point \\( C \\) on the left to point \\( D \\) on the right.\n2. **Waypoints \\( P \\) and \\( Q \\)**: Waypoints \\( P \\) and \\( Q \\) are positioned along the curve between \\( C \\) and \\( D \\). Specifically, \\( P \\) is situated closer to \\( C \\) while \\( Q \\) is nearer to \\( D \\).\n\nThe arrangement of the waypoints indicates that they are part of the curved path and share its trajectory. Therefore, \\( P \\) and \\( Q \\) divide the whole curved path \\( CD \\) into three segments: \\( CP \\), \\( PQ \\), and \\( QD \\).\n\n### Analysis\n\nTo analyze this further, let's understand potential questions and answers that can be derived from this image:\n\n1. **What is the relationship between the waypoints?**\n\n - **Positioning and Division of the Curved Path**: Waypoint \\( P \\) lies between \\( C \\) and \\( Q \\), while waypoint \\( Q \\) lies between \\( P \\) and \\( D \\). This creates a sequential relationship: \\( C \\rightarrow P \\rightarrow Q \\rightarrow D \\).\n\n2. **Properties and Measurements**:\n\n - **Lengths**: If we consider the lengths of the segments \\( CP \\), \\( PQ \\), and \\( QD \\), they would all add up to the length of \\( CD \\) (i.e., \\( CD = CP + PQ + QD \\)). 
However, the image does not provide specific numerical values or ratios for these segments.\n \n - **Curve Path Geometry**: Since \\( P \\) and \\( Q \\) lie on the curved path \\( CD \\), they contribute to the curve's shape and geometry. Without specific measurements, the segments' curvatures and their relationship to the total curve cannot be quantified.\n\n3. **Potential Geometrical Relations**:\n\n - **Division Points and Ratios**: If \\( P \\) and \\( Q \\) were division points or other specific fractional points on the curve, this would imply certain proportional relationships. For instance, if \\( P \\) is a division point that splits the curve in half, then \\( CP = PD \\). Similarly, if \\( Q \\) divides \\( CD \\) in the ratio 1:3, then \\( CQ = 1/4 CD \\) and \\( QD = 3/4 CD \\).\n\n4. **Possible Theorems**:\n\n - **Path Division Principle**: We can apply the Path Division Principle here. It states that if a point is between two other points on a curved path, the sum of the lengths of the segments created by that point equals the total length of the original path (i.e., \\( CD = CP + PQ + QD \\)).\n\n### Conclusion\n\nThe image captures a curved path with waypoints \\( P \\) and \\( Q \\) dividing the path \\( CD \\). Such a configuration can be foundational for understanding more complex concepts in geometry such as curve path division, curve geometry, and the properties of curved segments. Without specific measurements or additional context, the simplest conclusion is recognizing the division of path \\( CD \\) into smaller segments \\( CP \\), \\( PQ \\), and \\( QD \\). \nPlease revise my answer using the image, and if there are any errors, kindly explain why it's incorrect."}, {'from': 'gpt', 'value': 'Your description accurately adapts to the shift from a curved path to a straight line segment, maintaining a structured format that allows for clear and concise analysis. 
However, there are a few subtle adjustments that could enhance clarity and precision:\n\n1. **Consistency in Symbol Usage**: Ensure consistency in the use of symbols for points and segments. In the original description, you used backslashes before the symbols (e.g., "\\\\( CD \\\\)"). In the adapted description, this format is not maintained. While not incorrect, keeping the notation consistent throughout your descriptions (either with or without backslashes) improves readability.\n\n2. **Accuracy in Describing the Image**: The adapted description accurately reflects the changed image from a curved path to a straight line segment. However, the initial part of the adapted description still starts with a reference to the "image presents a diagram of a curved path." This should be revised to immediately reflect the straight line segment to avoid confusion.\n\n3. **Logical Flow in Analysis**: In the analysis section, there is a logical and smooth transition between describing the relationship between the points and discussing properties and measurements. The division into subsections for potential questions, geometrical relations, and possible theorems helps organize the analysis effectively.\n\n4. **Specificity in Theorems and Principles**: When discussing the Segment Addition Postulate, you correctly state that "if a point is between two other points on a line segment, the sum of the lengths of the segments created by that point equals the total length of the original segment." This is clear and correct.\n\n5. **Conclusion**: The conclusion appropriately summarizes the main points discussed and the implications of the geometric configuration. It reinforces the foundational nature of the described segment division in geometry without delving into unnecessary details that aren\'t present in the image.\n\nOverall, your adapted description is thorough and largely correct. 
Minor adjustments in consistency and initial description accuracy will refine it further.'}]} 16%|█▋ | 3641/22095 [5:54:17<21:04:01, 4.11s/it] {'loss': 0.4584, 'grad_norm': 0.7336041070909316, 'learning_rate': 9.531435681171637e-06, 'epoch': 0.16} 16%|█▋ | 3641/22095 [5:54:17<21:04:01, 4.11s/it] 16%|█▋ | 3642/22095 [5:54:20<19:19:20, 3.77s/it] {'loss': 0.4358, 'grad_norm': 0.6534578008595666, 'learning_rate': 9.531125854226692e-06, 'epoch': 0.16} 16%|█▋ | 3642/22095 [5:54:20<19:19:20, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 16%|█▋ | 3643/22095 [5:54:29<27:41:58, 5.40s/it] {'loss': 0.5232, 'grad_norm': 0.5270481010940409, 'learning_rate': 9.53081592992177e-06, 'epoch': 0.16} 16%|█▋ | 3643/22095 [5:54:29<27:41:58, 5.40s/it] 16%|█▋ | 3644/22095 [5:54:33<25:35:00, 4.99s/it] {'loss': 0.3747, 'grad_norm': 0.832431322687446, 'learning_rate': 9.530505908263528e-06, 'epoch': 0.16} 16%|█▋ | 3644/22095 [5:54:33<25:35:00, 4.99s/it] 16%|█▋ | 3645/22095 [5:54:37<23:51:28, 4.66s/it] {'loss': 0.4118, 'grad_norm': 0.7072464593601516, 'learning_rate': 9.53019578925863e-06, 'epoch': 0.16} 16%|█▋ | 3645/22095 [5:54:37<23:51:28, 4.66s/it] 17%|█▋ | 3646/22095 [5:54:40<21:29:12, 4.19s/it] {'loss': 0.4015, 'grad_norm': 0.6158071480238312, 'learning_rate': 9.529885572913735e-06, 'epoch': 0.17} 17%|█▋ | 3646/22095 [5:54:40<21:29:12, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86356 > 40960). 
Running this sequence through the model will result in indexing errors 17%|█▋ | 3647/22095 [5:54:44<20:53:08, 4.08s/it] {'loss': 0.4451, 'grad_norm': 0.7465380445759865, 'learning_rate': 9.529575259235514e-06, 'epoch': 0.17} 17%|█▋ | 3647/22095 [5:54:44<20:53:08, 4.08s/it] 17%|█▋ | 3648/22095 [5:54:48<20:13:18, 3.95s/it] {'loss': 0.3958, 'grad_norm': 0.684587534515466, 'learning_rate': 9.52926484823063e-06, 'epoch': 0.17} 17%|█▋ | 3648/22095 [5:54:48<20:13:18, 3.95s/it] 17%|█▋ | 3649/22095 [5:54:51<19:10:27, 3.74s/it] {'loss': 0.404, 'grad_norm': 0.636071964727268, 'learning_rate': 9.528954339905759e-06, 'epoch': 0.17} 17%|█▋ | 3649/22095 [5:54:51<19:10:27, 3.74s/it] 17%|█▋ | 3650/22095 [5:54:54<18:10:03, 3.55s/it] {'loss': 0.4247, 'grad_norm': 0.6693368657078425, 'learning_rate': 9.528643734267564e-06, 'epoch': 0.17} 17%|█▋ | 3650/22095 [5:54:54<18:10:03, 3.55s/it] 17%|█▋ | 3651/22095 [5:54:57<17:08:37, 3.35s/it] {'loss': 0.4013, 'grad_norm': 0.8203539046854412, 'learning_rate': 9.528333031322728e-06, 'epoch': 0.17} 17%|█▋ | 3651/22095 [5:54:57<17:08:37, 3.35s/it] 17%|█▋ | 3652/22095 [5:55:01<17:51:55, 3.49s/it] {'loss': 0.4628, 'grad_norm': 0.6957575535631207, 'learning_rate': 9.528022231077921e-06, 'epoch': 0.17} 17%|█▋ | 3652/22095 [5:55:01<17:51:55, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44902 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116176 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (130407 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42589 > 40960). 
Running this sequence through the model will result in indexing errors
17%|█▋ | 3653/22095 [5:55:03<16:52:14, 3.29s/it] {'loss': 0.4181, 'grad_norm': 0.6883990276108745, 'learning_rate': 9.527711333539821e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (145556 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3654/22095 [5:55:08<18:10:34, 3.55s/it] {'loss': 0.4216, 'grad_norm': 0.6621880635568219, 'learning_rate': 9.527400338715112e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41604 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90328 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3655/22095 [5:55:15<23:38:22, 4.62s/it] {'loss': 0.5367, 'grad_norm': 0.5905985298350221, 'learning_rate': 9.527089246610475e-06, 'epoch': 0.17}
17%|█▋ | 3656/22095 [5:55:18<22:13:20, 4.34s/it] {'loss': 0.4306, 'grad_norm': 0.66555066514175, 'learning_rate': 9.526778057232595e-06, 'epoch': 0.17}
17%|█▋ | 3657/22095 [5:55:21<20:10:49, 3.94s/it] {'loss': 0.3738, 'grad_norm': 0.6431260588652664, 'learning_rate': 9.526466770588156e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (64182 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65240 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73315 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67500 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129930 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3658/22095 [5:55:24<18:54:22, 3.69s/it] {'loss': 0.4557, 'grad_norm': 0.7130443154083126, 'learning_rate': 9.526155386683848e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (91327 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3659/22095 [5:55:27<17:36:18, 3.44s/it] {'loss': 0.397, 'grad_norm': 0.6961582015719266, 'learning_rate': 9.525843905526361e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3660/22095 [5:55:50<47:23:13, 9.25s/it] {'loss': 0.4043, 'grad_norm': 0.623855766689215, 'learning_rate': 9.525532327122391e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3661/22095 [5:56:00<48:07:48, 9.40s/it] {'loss': 0.505, 'grad_norm': 0.4348115038481914, 'learning_rate': 9.525220651478628e-06, 'epoch': 0.17}
17%|█▋ | 3662/22095 [5:56:03<38:35:53, 7.54s/it] {'loss': 0.4289, 'grad_norm': 0.7124951405659162, 'learning_rate': 9.524908878601773e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3663/22095 [5:56:13<41:32:47, 8.11s/it] {'loss': 0.4992, 'grad_norm': 0.3185914241502027, 'learning_rate': 9.524597008498522e-06, 'epoch': 0.17}
17%|█▋ | 3664/22095 [5:56:16<34:36:35, 6.76s/it] {'loss': 0.4552, 'grad_norm': 0.7086305378925704, 'learning_rate': 9.524285041175578e-06, 'epoch': 0.17}
17%|█▋ | 3665/22095 [5:56:19<28:36:10, 5.59s/it] {'loss': 0.3899, 'grad_norm': 0.8279029973754335, 'learning_rate': 9.523972976639645e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3666/22095 [5:56:29<34:41:17, 6.78s/it] {'loss': 0.4951, 'grad_norm': 0.3352919478515927, 'learning_rate': 9.523660814897426e-06, 'epoch': 0.17}
17%|█▋ | 3667/22095 [5:56:32<30:01:18, 5.86s/it] {'loss': 0.4349, 'grad_norm': 0.727907832487834, 'learning_rate': 9.52334855595563e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [417, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8482545 in VC:s3://internvl-moe-sft-data/. Exception: Image size [417, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69332, 'image': 'vrdu_texteq/astro-ph.CO/97495f00-f5ad-4cc7-95ba-4bf5239bf519.png', 'image_wh': [[417, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $r_{\\rm F}$ denotes the Fresnel scale'}]}
17%|█▋ | 3668/22095 [5:56:36<26:21:17, 5.15s/it] {'loss': 0.377, 'grad_norm': 0.6549909987507452, 'learning_rate': 9.523036199820964e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922960 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46113, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 无法确定\nB. 1cm\nC. 4cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
17%|█▋ | 3669/22095 [5:56:39<23:24:56, 4.57s/it] {'loss': 0.4197, 'grad_norm': 0.626733270750146, 'learning_rate': 9.522723746500144e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3670/22095 [5:56:43<21:49:06, 4.26s/it] {'loss': 0.4233, 'grad_norm': 0.7155877450445944, 'learning_rate': 9.522411195999879e-06, 'epoch': 0.17}
17%|█▋ | 3671/22095 [5:56:46<20:40:33, 4.04s/it] {'loss': 0.4286, 'grad_norm': 0.6985169309325515, 'learning_rate': 9.522098548326888e-06, 'epoch': 0.17}
17%|█▋ | 3672/22095 [5:57:25<74:31:18, 14.56s/it] {'loss': 0.4138, 'grad_norm': 0.7018155538709772, 'learning_rate': 9.521785803487888e-06, 'epoch': 0.17}
17%|█▋ | 3673/22095 [5:57:46<84:10:10, 16.45s/it] {'loss': 0.3932, 'grad_norm': 0.6868291461365117, 'learning_rate': 9.5214729614896e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (51288 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89516 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3674/22095 [5:58:07<90:46:13, 17.74s/it] {'loss': 0.405, 'grad_norm': 0.770746088836217, 'learning_rate': 9.521160022338742e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (54439 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (128417 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64041 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3675/22095 [5:58:10<69:10:13, 13.52s/it] {'loss': 0.3715, 'grad_norm': 0.7083686368080435, 'learning_rate': 9.520846986042043e-06, 'epoch': 0.17}
17%|█▋ | 3676/22095 [5:58:33<83:20:28, 16.29s/it] {'loss': 0.4242, 'grad_norm': 0.7256127301256134, 'learning_rate': 9.520533852606226e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3677/22095 [5:58:36<63:17:12, 12.37s/it] {'loss': 0.4022, 'grad_norm': 0.7874565138391614, 'learning_rate': 9.520220622038019e-06, 'epoch': 0.17}
17%|█▋ | 3678/22095 [5:58:40<49:28:59, 9.67s/it] {'loss': 0.4053, 'grad_norm': 0.663211610255665, 'learning_rate': 9.519907294344155e-06, 'epoch': 0.17}
VC:s3://gui-agent/data_20250612/android/images/Total_data_windows_0612_hard_data2_device_1_Broccoli/RecipeDeleteMultipleRecipesWithConstraint/images/013_click_1749446032580.png 2025-08-27 21:56:38.576339 load time: 1029.25 ms
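The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings above mean some samples exceed the model's 40960-token window, so running them through unmodified would index past the position range. The following is a minimal sketch of a pre-collation guard; `clip_token_ids` and its head-truncation policy are illustrative assumptions, not the trainer's actual code.

```python
# Minimal sketch: guard against over-length samples before they reach the model.
# `max_len` mirrors the 40960-token limit reported in the warnings above.
# NOTE: this helper is an assumption for illustration; a real pipeline might
# instead drop the sample or split it into windows.
def clip_token_ids(token_ids: list, max_len: int = 40960) -> list:
    """Truncate a token-id sequence so position/embedding lookups
    cannot go out of range."""
    if len(token_ids) > max_len:
        # Keep the head of the sequence, discard the overflow.
        return token_ids[:max_len]
    return token_ids
```

For example, a 145556-token sample (the largest count logged above) would be cut back to exactly 40960 tokens, while in-range samples pass through untouched.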
VC:s3://gui/aguvis/aguvis-stage2/android_control/images/18943/screenshot_2.png 2025-08-27 21:56:38.576197 load time: 1051.04 ms
17%|█▋ | 3679/22095 [5:59:21<97:34:36, 19.07s/it] {'loss': 0.4084, 'grad_norm': 0.7512651019964488, 'learning_rate': 9.519593869531366e-06, 'epoch': 0.17}
VC:s3://gui/data_20250328/android/images/wiki/Cycle_3_Iter_4/images/screenshot-101-1743230895.226662-before.png 2025-08-27 21:57:19.585843 load time: 1032.9 ms
VC:s3://gui-agent/data_20250612/android/images/Total_data_windows_0612_easy_data_device1_Markor/MarkorDeleteNewestNote/images/home_screen.png 2025-08-27 21:57:19.583583 load time: 1065.15 ms
17%|█▋ | 3680/22095 [5:59:43<102:09:41, 19.97s/it] {'loss': 0.4363, 'grad_norm': 0.6856705100857937, 'learning_rate': 9.519280347606383e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387773 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54585, 'image': 'vrdu_table_final_2/astro-ph.CO/5f0c0d9f-8d28-4d78-9da1-001d5e541595.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
17%|█▋ | 3681/22095 [6:00:05<105:10:45, 20.56s/it] {'loss': 0.4354, 'grad_norm': 0.820504135650576, 'learning_rate': 9.518966728575947e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3682/22095 [6:00:33<116:34:19, 22.79s/it] {'loss': 0.4859, 'grad_norm': 0.5359567082935988, 'learning_rate': 9.518653012446794e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (51157 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65427 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73328 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3683/22095 [6:01:22<156:39:19, 30.63s/it] {'loss': 0.4734, 'grad_norm': 0.42768864998571654, 'learning_rate': 9.518339199225668e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 364, but got module 1
17%|█▋ | 3684/22095 [6:01:25<114:47:41, 22.45s/it] {'loss': 0.4493, 'grad_norm': 0.6977386931322633, 'learning_rate': 9.518025288919307e-06, 'epoch': 0.17}
17%|█▋ | 3685/22095 [6:02:08<145:34:51, 28.47s/it] {'loss': 0.4551, 'grad_norm': 0.7312921685664348, 'learning_rate': 9.51771128153446e-06, 'epoch': 0.17}
17%|█▋ | 3686/22095 [6:03:12<200:08:24, 39.14s/it] {'loss': 0.3496, 'grad_norm': 0.8006426810646926, 'learning_rate': 9.517397177077874e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (44336 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85099 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90742 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81573 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3687/22095 [6:03:34<173:45:16, 33.98s/it] {'loss': 0.4105, 'grad_norm': 0.6467847843733492, 'learning_rate': 9.517082975556294e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (41572 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3688/22095 [6:03:57<157:37:01, 30.83s/it] {'loss': 0.3715, 'grad_norm': 0.7434533319298859, 'learning_rate': 9.516768676976476e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (47185 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51998 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3689/22095 [6:04:00<115:01:17, 22.50s/it] {'loss': 0.4101, 'grad_norm': 0.7565063603778319, 'learning_rate': 9.51645428134517e-06, 'epoch': 0.17}
17%|█▋ | 3690/22095 [6:04:25<118:26:26, 23.17s/it] {'loss': 0.4008, 'grad_norm': 0.6583406605473527, 'learning_rate': 9.516139788669133e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53163 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53826 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55175 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48902 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3691/22095 [6:04:34<97:39:40, 19.10s/it] {'loss': 0.5161, 'grad_norm': 0.9488362943271945, 'learning_rate': 9.515825198955122e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (144482 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3692/22095 [6:04:57<102:57:49, 20.14s/it] {'loss': 0.4167, 'grad_norm': 0.74207752543715, 'learning_rate': 9.515510512209898e-06, 'epoch': 0.17}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_527112.png 2025-08-27 22:02:55.802163 load time: 1035.31 ms
VC:s3://gui-agent/data_20250526/windows/images/spotify/20250515_115852_1/images/before_screenshot_14_id_50_function_0_crop_0_grounding_instructions_random.png 2025-08-27 22:02:55.800305 load time: 1070.75 ms
17%|█▋ | 3693/22095 [6:05:38<135:38:54, 26.54s/it] {'loss': 0.421, 'grad_norm': 0.688311061847171, 'learning_rate': 9.515195728440221e-06, 'epoch': 0.17}
17%|█▋ | 3694/22095 [6:06:36<183:25:05, 35.88s/it] {'loss': 0.4215, 'grad_norm': 0.7076735124909065, 'learning_rate': 9.514880847652855e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3695/22095 [6:07:20<194:53:58, 38.13s/it] {'loss': 0.5269, 'grad_norm': 0.4649065122917939, 'learning_rate': 9.514565869854566e-06, 'epoch': 0.17}
17%|█▋ | 3696/22095 [6:07:29<150:55:56, 29.53s/it] {'loss': 0.4916, 'grad_norm': 0.37744810277788543, 'learning_rate': 9.51425079505212e-06, 'epoch': 0.17}
17%|█▋ | 3697/22095 [6:07:57<149:11:38, 29.19s/it] {'loss': 0.4882, 'grad_norm': 0.35435769341967316, 'learning_rate': 9.513935623252292e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 364, but got module 1
17%|█▋ | 3698/22095 [6:08:01<109:36:31, 21.45s/it] {'loss': 0.3775, 'grad_norm': 0.8444424239920386, 'learning_rate': 9.51362035446185e-06, 'epoch': 0.17}
17%|█▋ | 3699/22095 [6:08:29<119:37:30, 23.41s/it] {'loss': 0.5053, 'grad_norm': 0.49136085657274564, 'learning_rate': 9.513304988687568e-06, 'epoch': 0.17}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/AndroidUI/20240327/20240327_filtered/wangyiyunyingyue/screen_00000307.jpg 2025-08-27 22:06:27.563300 load time: 1034.44 ms
17%|█▋ | 3700/22095 [6:08:56<126:08:47, 24.69s/it] {'loss': 0.5036, 'grad_norm': 0.48488017714940024, 'learning_rate': 9.512989525936223e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 364, but got module 1
17%|█▋ | 3701/22095 [6:09:19<123:05:12, 24.09s/it] {'loss': 0.4554, 'grad_norm': 0.7566271474139192, 'learning_rate': 9.512673966214597e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3702/22095 [6:10:19<177:41:28, 34.78s/it] {'loss': 0.3939, 'grad_norm': 0.8793616289280668, 'learning_rate': 9.512358309529463e-06, 'epoch': 0.17}
VC:s3://gui/aguvis/aguvis-stage2/android_control/images/17877/screenshot_1.png 2025-08-27 22:08:17.650835 load time: 1049.62 ms
17%|█▋ | 3703/22095 [6:11:38<245:27:06, 48.04s/it] {'loss': 0.3886, 'grad_norm': 0.6267765901631913, 'learning_rate': 9.51204255588761e-06, 'epoch': 0.17}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_112131.png 2025-08-27 22:09:36.643813 load time: 1049.41 ms
17%|█▋ | 3704/22095 [6:12:18<232:56:33, 45.60s/it] {'loss': 0.42, 'grad_norm': 0.7177662264806949, 'learning_rate': 9.51172670529582e-06, 'epoch': 0.17}
17%|█▋ | 3705/22095 [6:13:21<260:04:08, 50.91s/it] {'loss': 0.4347, 'grad_norm': 0.7540731952293713, 'learning_rate': 9.511410757760878e-06, 'epoch': 0.17}
17%|█▋ | 3706/22095 [6:14:42<305:52:26, 59.88s/it] {'loss': 0.4344, 'grad_norm': 1.175057370829924, 'learning_rate': 9.511094713289575e-06, 'epoch': 0.17}
17%|█▋ | 3707/22095 [6:16:01<335:53:23, 65.76s/it] {'loss': 0.3923, 'grad_norm': 0.6673807601229402, 'learning_rate': 9.510778571888704e-06, 'epoch': 0.17}
17%|█▋ | 3708/22095 [6:17:02<328:09:03, 64.25s/it] {'loss': 0.4127, 'grad_norm': 0.6639737685542729, 'learning_rate': 9.510462333565052e-06, 'epoch': 0.17}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-0_3972407-split-0.jpg 2025-08-27 22:15:00.854442 load time: 1044.87 ms
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/47818.jpg 2025-08-27 22:15:00.854565 load time: 1051.88 ms
VC:s3://gui-agent/data_20250505/android/images/settings/Cycle_1_Iter_0/images/screenshot-13-1746179545.754323-before.png 2025-08-27 22:15:00.855157 load time: 1042.1 ms
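The `ValueError: Image size [...] is too small. Minimum size is 28` tracebacks raised by `_get_item` reject any sample whose logged `image_wh` has a side under 28 px (consistent with Qwen2.5-VL's 14-px vision patches merged 2x2). A minimal sketch of filtering such samples up front instead of letting them raise mid-epoch; `is_trainable_image` is a hypothetical helper, not the repo's code.

```python
# Minimal sketch of the size check behind the ValueError above.
# Assumption: 28 px = 14-px patch size x 2x2 spatial merge; the helper
# below is illustrative, not the dataset class's actual implementation.
MIN_SIDE = 28

def is_trainable_image(width: int, height: int, min_side: int = MIN_SIDE) -> bool:
    """Return True when both sides meet the minimum, so undersized
    samples (e.g. image_wh [17, 28] or [6, 14]) can be dropped during
    dataset construction rather than raising inside __getitem__."""
    return width >= min_side and height >= min_side
```

Pre-filtering this way turns the retry loop visible in the log (`[Try #0] Failed to fetch sample ...`) into a one-time skip at data-loading setup.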
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/3ef3a71caec8f631e704d08b2b27f1a3dc41e30e86ed23c85c8272b6e96a0464.png 2025-08-27 22:15:00.854741 load time: 1049.61 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250502_111053_1/images/before_screenshot_1_id_86_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-27 22:15:00.854744 load time: 1301.14 ms
17%|█▋ | 3709/22095 [6:18:00<317:55:30, 62.25s/it] {'loss': 0.3768, 'grad_norm': 0.6623937707899048, 'learning_rate': 9.510145998325419e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3710/22095 [6:18:59<312:42:39, 61.23s/it] {'loss': 0.4426, 'grad_norm': 0.7189384687642437, 'learning_rate': 9.509829566176601e-06, 'epoch': 0.17}
VC:s3://internvl2/datasets/mmtab/IID_train_image/TABMWP_19474.jpg 2025-08-27 22:16:57.297495 load time: 1032.97 ms
VC:s3://gui/aguvis/aguvis-stage2/guiact-web-single/images/d635bab9-3b27-40f5-87c1-75886f6cbb6a.jpg 2025-08-27 22:16:57.299734 load time: 1048.4 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [981, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8505791 in VC:s3://internvl-moe-sft-data/. Exception: Image size [981, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19604, 'image': 'vrdu_texteq/astro-ph.CO/e9f5210e-dfa3-4155-ae82-b157e6780f7f.png', 'image_wh': [[981, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'with $M_*$ the characteristic nonlinear mass at redshift $z$ and $M_\\mathrm{vir}$ the \nvirial mass.'}]}
17%|█▋ | 3711/22095 [6:19:57<308:48:52, 60.47s/it] {'loss': 0.4116, 'grad_norm': 0.7553610441482141, 'learning_rate': 9.509513037125395e-06, 'epoch': 0.17}
17%|█▋ | 3712/22095 [6:20:57<307:38:22, 60.25s/it] {'loss': 0.4017, 'grad_norm': 0.6739738444460805, 'learning_rate': 9.509196411178605e-06, 'epoch': 0.17}
17%|█▋ | 3713/22095 [6:22:18<339:18:05, 66.45s/it] {'loss': 0.4114, 'grad_norm': 0.8404052913139409, 'learning_rate': 9.508879688343033e-06, 'epoch': 0.17}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_443926.png 2025-08-27 22:20:16.641342 load time: 1036.0 ms
VC:s3://gui/aguvis/aguvis-stage2/amex/images/6e0178de355448a5bb9a63171878c9bcstep21.png 2025-08-27 22:20:16.640039 load time: 1047.11 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39015.png 2025-08-27 22:20:17.469743 load time: 1002.99 ms
17%|█▋ | 3714/22095 [6:22:57<298:04:42, 58.38s/it] {'loss': 0.4285, 'grad_norm': 0.7005051307524329, 'learning_rate': 9.508562868625484e-06, 'epoch': 0.17}
17%|█▋ | 3715/22095 [6:24:17<330:47:55, 64.79s/it] {'loss': 0.3782, 'grad_norm': 0.6861061710619839, 'learning_rate': 9.508245952032765e-06, 'epoch': 0.17}
17%|█▋ | 3716/22095 [6:24:39<264:35:25, 51.83s/it] {'loss': 0.4325, 'grad_norm': 0.7290703589658392, 'learning_rate': 9.507928938571689e-06, 'epoch': 0.17}
17%|█▋ | 3717/22095 [6:25:01<219:32:10, 43.00s/it] {'loss': 0.3857, 'grad_norm': 0.6613589739454301, 'learning_rate': 9.507611828249062e-06, 'epoch': 0.17}
17%|█▋ | 3718/22095 [6:25:24<188:30:11, 36.93s/it] {'loss': 0.4095, 'grad_norm': 0.675217399005107, 'learning_rate': 9.507294621071702e-06, 'epoch': 0.17}
17%|█▋ | 3719/22095 [6:26:44<254:17:42, 49.82s/it] {'loss': 0.4322, 'grad_norm': 0.7494715356484618, 'learning_rate': 9.506977317046424e-06, 'epoch': 0.17}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_505692.png 2025-08-27 22:24:42.581324 load time: 1025.21 ms
VC:s3://st2pj/20250222/images/sam-all/images/sa_452284.jpg 2025-08-27 22:24:42.582794 load time: 1032.55 ms
17%|█▋ | 3720/22095 [6:27:08<215:35:49, 42.24s/it] {'loss': 0.4673, 'grad_norm': 0.6606116250864447, 'learning_rate': 9.506659916180046e-06, 'epoch': 0.17}
17%|█▋ | 3721/22095 [6:27:30<183:43:02, 36.00s/it] {'loss': 0.4361, 'grad_norm': 0.7163806350258065, 'learning_rate': 9.506342418479388e-06, 'epoch': 0.17}
17%|█▋ | 3722/22095 [6:27:33<133:05:42, 26.08s/it] {'loss': 0.4085, 'grad_norm': 0.626221536747177, 'learning_rate': 9.50602482395127e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (49687 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50281 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3723/22095 [6:28:14<156:12:45, 30.61s/it] {'loss': 0.4388, 'grad_norm': 0.7311666290265683, 'learning_rate': 9.50570713260252e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [262, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8524765 in VC:s3://internvl-moe-sft-data/. Exception: Image size [262, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 50022, 'image': 'vrdu_texteq/astro-ph.CO/0f679d89-99c9-45b2-b1f0-ebd90f0a6762.png', 'image_wh': [[262, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'where $N_{t} = N_{c} + N_{n}$.'}]}
17%|█▋ | 3724/22095 [6:30:12<290:44:20, 56.97s/it] {'loss': 0.3982, 'grad_norm': 0.6904737034485492, 'learning_rate': 9.50538934443996e-06, 'epoch': 0.17}
17%|█▋ | 3725/22095 [6:31:14<297:37:20, 58.33s/it] {'loss': 0.4169, 'grad_norm': 0.6646357804123364, 'learning_rate': 9.50507145947042e-06, 'epoch': 0.17}
17%|█▋ | 3726/22095 [6:31:56<272:32:49, 53.41s/it] {'loss': 0.4108, 'grad_norm': 0.6716345872070512, 'learning_rate': 9.504753477700731e-06, 'epoch': 0.17}
17%|█▋ | 3727/22095 [6:32:19<226:10:54, 44.33s/it] {'loss': 0.41, 'grad_norm': 0.6807055677303921, 'learning_rate': 9.504435399137726e-06, 'epoch': 0.17}
17%|█▋ | 3728/22095 [6:33:20<252:00:00, 49.39s/it] {'loss': 0.3943, 'grad_norm': 0.7083670984942221, 'learning_rate': 9.504117223788238e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/app_store_ios/16120807211.png 2025-08-27 22:31:18.957288 load time: 1013.47 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_754288.png 2025-08-27 22:31:18.957079 load time: 1029.29 ms
17%|█▋ | 3729/22095 [6:33:42<209:32:31, 41.07s/it] {'loss': 0.4139, 'grad_norm': 0.6568251151354582, 'learning_rate': 9.503798951659104e-06, 'epoch': 0.17}
17%|█▋ | 3730/22095 [6:33:46<152:44:01, 29.94s/it] {'loss': 0.4098, 'grad_norm': 0.6810666065376982, 'learning_rate': 9.503480582757163e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3731/22095 [6:34:16<152:40:10, 29.93s/it] {'loss': 0.518, 'grad_norm': 0.987821729528041, 'learning_rate': 9.503162117089256e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (53421 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3732/22095 [6:34:38<141:06:54, 27.67s/it] {'loss': 0.4107, 'grad_norm': 0.6956385502760074, 'learning_rate': 9.502843554662225e-06, 'epoch': 0.17}
17%|█▋ | 3733/22095 [6:35:41<195:37:22, 38.35s/it] {'loss': 0.3878, 'grad_norm': 0.7239870623317979, 'learning_rate': 9.502524895482917e-06, 'epoch': 0.17}
17%|█▋ | 3734/22095 [6:36:04<171:10:07, 33.56s/it] {'loss': 0.4095, 'grad_norm': 0.6494967848122583, 'learning_rate': 9.502206139558175e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8400528 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 2689, 'image': 'vrdu_table_final_2/astro-ph.CO/ec0fe774-3ac1-44ea-a803-03066e15ad5a.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]}
17%|█▋ | 3735/22095 [6:36:26<153:32:25, 30.11s/it] {'loss': 0.405, 'grad_norm': 0.7057674434549391, 'learning_rate': 9.501887286894852e-06, 'epoch': 0.17}
17%|█▋ | 3736/22095 [6:36:47<140:25:36, 27.54s/it] {'loss': 0.4078, 'grad_norm': 0.7003729702583934, 'learning_rate': 9.501568337499798e-06, 'epoch': 0.17}
VC:s3://gui/visual_inputs/multi_modal/agent_data/rico/dataset/image/1963.jpg 2025-08-27 22:34:46.110882 load time: 1027.96 ms
17%|█▋ | 3737/22095 [6:36:50<102:46:08, 20.15s/it] {'loss': 0.393, 'grad_norm': 0.6829548547414105, 'learning_rate': 9.501249291379865e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (52883 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3738/22095 [6:36:53<76:24:47, 14.99s/it] {'loss': 0.4413, 'grad_norm': 0.74769762989029, 'learning_rate': 9.50093014854191e-06, 'epoch': 0.17}
17%|█▋ | 3739/22095 [6:36:57<59:16:03, 11.62s/it] {'loss': 0.4369, 'grad_norm': 0.7410832274031145, 'learning_rate': 9.500610908992788e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3740/22095 [6:37:04<52:55:04, 10.38s/it] {'loss': 0.5246, 'grad_norm': 0.9351541665911204, 'learning_rate': 9.500291572739362e-06, 'epoch': 0.17}
17%|█▋ | 3741/22095 [6:37:14<52:19:45, 10.26s/it] {'loss': 0.5083, 'grad_norm': 0.7186639929072621, 'learning_rate': 9.49997213978849e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (51082 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66766 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45003 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56512 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3742/22095 [6:37:36<69:38:06, 13.66s/it] {'loss': 0.43, 'grad_norm': 0.7107391517242865, 'learning_rate': 9.49965261014704e-06, 'epoch': 0.17}
17%|█▋ | 3743/22095 [6:37:39<53:53:05, 10.57s/it] {'loss': 0.4075, 'grad_norm': 0.6242561143083231, 'learning_rate': 9.499332983821873e-06, 'epoch': 0.17}
17%|█▋ | 3744/22095 [6:37:42<42:11:49, 8.28s/it] {'loss': 0.4052, 'grad_norm': 0.6358146911217676, 'learning_rate': 9.49901326081986e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (50630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45658 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3745/22095 [6:37:45<34:08:15, 6.70s/it] {'loss': 0.4071, 'grad_norm': 0.7088530729448421, 'learning_rate': 9.498693441147868e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (52322 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3746/22095 [6:37:54<37:06:05, 7.28s/it] {'loss': 0.5321, 'grad_norm': 1.0440527962553254, 'learning_rate': 9.498373524812771e-06, 'epoch': 0.17}
17%|█▋ | 3747/22095 [6:37:57<31:15:41, 6.13s/it] {'loss': 0.3925, 'grad_norm': 0.7200570581024353, 'learning_rate': 9.498053511821445e-06, 'epoch': 0.17}
17%|█▋ | 3748/22095 [6:38:02<28:11:00, 5.53s/it] {'loss': 0.4265, 'grad_norm': 0.6562807866803275, 'learning_rate': 9.497733402180761e-06, 'epoch': 0.17}
17%|█▋ | 3749/22095 [6:38:05<24:36:09, 4.83s/it] {'loss': 0.4132, 'grad_norm': 0.6703611503814182, 'learning_rate': 9.497413195897601e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3750/22095 [6:38:14<31:56:11, 6.27s/it] {'loss': 0.518, 'grad_norm': 0.6892556857897737, 'learning_rate': 9.497092892978844e-06, 'epoch': 0.17}
17%|█▋ | 3751/22095 [6:38:18<28:05:04, 5.51s/it] {'loss': 0.4001, 'grad_norm': 0.6611965128166901, 'learning_rate': 9.496772493431373e-06, 'epoch': 0.17}
17%|█▋ | 3752/22095 [6:38:23<26:31:26, 5.21s/it] {'loss': 0.4128, 'grad_norm': 0.7793165941706258, 'learning_rate': 9.496451997262071e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3753/22095 [6:38:34<35:35:02, 6.98s/it] {'loss': 0.4902, 'grad_norm': 0.47410875781920114, 'learning_rate': 9.496131404477826e-06,
'epoch': 0.17} 17%|█▋ | 3753/22095 [6:38:34<35:35:02, 6.98s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 17%|█▋ | 3754/22095 [6:38:38<31:31:49, 6.19s/it] {'loss': 0.393, 'grad_norm': 0.6863007768927061, 'learning_rate': 9.495810715085526e-06, 'epoch': 0.17} 17%|█▋ | 3754/22095 [6:38:38<31:31:49, 6.19s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3755/22095 [6:38:42<27:26:58, 5.39s/it] {'loss': 0.3979, 'grad_norm': 0.7225011596239164, 'learning_rate': 9.495489929092062e-06, 'epoch': 0.17} 17%|█▋ | 3755/22095 [6:38:42<27:26:58, 5.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46757 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117623 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58566 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52908 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77599 > 40960). 
Running this sequence through the model will result in indexing errors 17%|█▋ | 3756/22095 [6:38:45<24:43:24, 4.85s/it] {'loss': 0.391, 'grad_norm': 0.7322895998589937, 'learning_rate': 9.495169046504325e-06, 'epoch': 0.17} 17%|█▋ | 3756/22095 [6:38:45<24:43:24, 4.85s/it] 17%|█▋ | 3757/22095 [6:38:49<22:42:47, 4.46s/it] {'loss': 0.3956, 'grad_norm': 0.6541370202035405, 'learning_rate': 9.494848067329211e-06, 'epoch': 0.17} 17%|█▋ | 3757/22095 [6:38:49<22:42:47, 4.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (136253 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3758/22095 [6:38:52<20:12:10, 3.97s/it] {'loss': 0.3263, 'grad_norm': 0.643158696186771, 'learning_rate': 9.494526991573619e-06, 'epoch': 0.17} 17%|█▋ | 3758/22095 [6:38:52<20:12:10, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45133 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118019 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3759/22095 [6:38:55<19:57:51, 3.92s/it] {'loss': 0.4137, 'grad_norm': 0.6596094479839292, 'learning_rate': 9.494205819244444e-06, 'epoch': 0.17} 17%|█▋ | 3759/22095 [6:38:55<19:57:51, 3.92s/it] 17%|█▋ | 3760/22095 [6:38:59<19:05:46, 3.75s/it] {'loss': 0.4406, 'grad_norm': 0.6836657205327861, 'learning_rate': 9.493884550348589e-06, 'epoch': 0.17} 17%|█▋ | 3760/22095 [6:38:59<19:05:46, 3.75s/it] 17%|█▋ | 3761/22095 [6:39:01<17:29:38, 3.44s/it] {'loss': 0.4254, 'grad_norm': 0.6867349047506497, 'learning_rate': 9.493563184892958e-06, 'epoch': 0.17} 17%|█▋ | 3761/22095 [6:39:01<17:29:38, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85135 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91689 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62089 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54138 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3762/22095 [6:39:05<17:58:26, 3.53s/it] {'loss': 0.4609, 'grad_norm': 0.6815726649750058, 'learning_rate': 9.493241722884454e-06, 'epoch': 0.17} 17%|█▋ | 3762/22095 [6:39:05<17:58:26, 3.53s/it] 17%|█▋ | 3763/22095 [6:39:08<17:23:24, 3.42s/it] {'loss': 0.3385, 'grad_norm': 0.6885133337543684, 'learning_rate': 9.492920164329985e-06, 'epoch': 0.17} 17%|█▋ | 3763/22095 [6:39:08<17:23:24, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3764/22095 [6:39:18<27:25:58, 5.39s/it] {'loss': 0.5036, 'grad_norm': 1.0212362444151246, 'learning_rate': 9.492598509236461e-06, 'epoch': 0.17} 17%|█▋ | 3764/22095 [6:39:18<27:25:58, 5.39s/it] 17%|█▋ | 3765/22095 [6:39:22<24:28:54, 4.81s/it] {'loss': 0.3948, 'grad_norm': 0.774041865443583, 'learning_rate': 9.492276757610795e-06, 'epoch': 0.17} 17%|█▋ | 3765/22095 [6:39:22<24:28:54, 4.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3766/22095 [6:39:31<31:30:37, 6.19s/it] {'loss': 0.4832, 'grad_norm': 0.6134765089458971, 'learning_rate': 9.491954909459895e-06, 'epoch': 0.17} 17%|█▋ | 3766/22095 [6:39:31<31:30:37, 6.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model 
(89044 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42797 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3767/22095 [6:39:41<36:27:19, 7.16s/it] {'loss': 0.5457, 'grad_norm': 0.36207745015444576, 'learning_rate': 9.491632964790683e-06, 'epoch': 0.17} 17%|█▋ | 3767/22095 [6:39:41<36:27:19, 7.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 17%|█▋ | 3768/22095 [6:39:44<30:32:54, 6.00s/it] {'loss': 0.419, 'grad_norm': 0.7729859301025481, 'learning_rate': 9.491310923610071e-06, 'epoch': 0.17} 17%|█▋ | 3768/22095 [6:39:44<30:32:54, 6.00s/it] 17%|█▋ | 3769/22095 [6:39:48<27:06:36, 5.33s/it] {'loss': 0.4085, 'grad_norm': 0.6830124066237396, 'learning_rate': 9.490988785924983e-06, 'epoch': 0.17} 17%|█▋ | 3769/22095 [6:39:48<27:06:36, 5.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3770/22095 [6:39:57<33:26:52, 6.57s/it] {'loss': 0.4953, 'grad_norm': 0.7352583131801282, 'learning_rate': 9.490666551742338e-06, 'epoch': 0.17} 17%|█▋ | 3770/22095 [6:39:57<33:26:52, 6.57s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3771/22095 [6:40:00<28:11:00, 5.54s/it] {'loss': 0.3845, 'grad_norm': 0.6727503824450569, 'learning_rate': 9.490344221069062e-06, 'epoch': 0.17} 17%|█▋ | 3771/22095 [6:40:00<28:11:00, 5.54s/it] 17%|█▋ | 3772/22095 [6:40:03<24:21:08, 4.78s/it] {'loss': 0.4165, 'grad_norm': 0.7477892470274518, 'learning_rate': 9.490021793912079e-06, 'epoch': 0.17} 17%|█▋ | 3772/22095 [6:40:03<24:21:08, 4.78s/it] 17%|█▋ | 3773/22095 [6:40:06<21:37:26, 4.25s/it] {'loss': 0.3909, 'grad_norm': 0.7280098699277786, 'learning_rate': 9.489699270278316e-06, 'epoch': 0.17} 17%|█▋ | 3773/22095 [6:40:06<21:37:26, 4.25s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8405744 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7931, 'image': 'vrdu_table_final_2/astro-ph.CO/0bc6ad13-46ad-4969-b9e8-fa04fae33931.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 17%|█▋ | 3774/22095 [6:40:10<20:15:32, 3.98s/it] {'loss': 0.4372, 'grad_norm': 0.662921243277628, 'learning_rate': 9.489376650174708e-06, 'epoch': 0.17} 17%|█▋ | 3774/22095 [6:40:10<20:15:32, 3.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3775/22095 [6:40:16<24:21:59, 4.79s/it] {'loss': 0.5428, 'grad_norm': 0.9470274488208026, 'learning_rate': 9.489053933608182e-06, 'epoch': 0.17} 17%|█▋ | 3775/22095 [6:40:16<24:21:59, 4.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3776/22095 [6:40:20<22:53:26, 4.50s/it] {'loss': 0.4145, 'grad_norm': 0.6684584842570336, 'learning_rate': 9.488731120585675e-06, 'epoch': 0.17} 17%|█▋ | 3776/22095 [6:40:20<22:53:26, 4.50s/it] 17%|█▋ | 3777/22095 [6:40:23<20:32:48, 4.04s/it] {'loss': 0.3872, 'grad_norm': 0.7104320385344154, 'learning_rate': 9.488408211114121e-06, 'epoch': 0.17} 17%|█▋ | 3777/22095 [6:40:23<20:32:48, 4.04s/it]Token indices sequence length is longer 
than the specified maximum sequence length for this model (63285 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92815 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3778/22095 [6:40:26<19:33:02, 3.84s/it] {'loss': 0.3681, 'grad_norm': 0.6115588397093513, 'learning_rate': 9.48808520520046e-06, 'epoch': 0.17} 17%|█▋ | 3778/22095 [6:40:26<19:33:02, 3.84s/it] 17%|█▋ | 3779/22095 [6:40:29<18:00:40, 3.54s/it] {'loss': 0.4817, 'grad_norm': 0.7587750055088698, 'learning_rate': 9.487762102851631e-06, 'epoch': 0.17} 17%|█▋ | 3779/22095 [6:40:29<18:00:40, 3.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3780/22095 [6:40:33<18:22:59, 3.61s/it] {'loss': 0.3939, 'grad_norm': 0.6573454769461418, 'learning_rate': 9.487438904074581e-06, 'epoch': 0.17} 17%|█▋ | 3780/22095 [6:40:33<18:22:59, 3.61s/it] 17%|█▋ | 3781/22095 [6:40:37<19:03:55, 3.75s/it] {'loss': 0.3964, 'grad_norm': 0.6807681638602197, 'learning_rate': 9.48711560887625e-06, 'epoch': 0.17} 17%|█▋ | 3781/22095 [6:40:37<19:03:55, 3.75s/it] 17%|█▋ | 3782/22095 [6:40:41<18:59:04, 3.73s/it] {'loss': 0.4169, 'grad_norm': 0.6528819830811584, 'learning_rate': 9.486792217263584e-06, 'epoch': 0.17} 17%|█▋ | 3782/22095 [6:40:41<18:59:04, 3.73s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3783/22095 [6:40:44<17:42:37, 3.48s/it] {'loss': 0.3778, 'grad_norm': 0.6456444671915634, 'learning_rate': 9.486468729243533e-06, 'epoch': 0.17} 17%|█▋ | 3783/22095 [6:40:44<17:42:37, 3.48s/it] 17%|█▋ | 3784/22095 [6:40:47<17:56:50, 3.53s/it] {'loss': 0.4021, 'grad_norm': 0.685917141382002, 'learning_rate': 9.48614514482305e-06, 'epoch': 0.17} 17%|█▋ | 3784/22095 [6:40:47<17:56:50, 3.53s/it] 17%|█▋ | 3785/22095 
[6:40:50<17:15:43, 3.39s/it] {'loss': 0.3866, 'grad_norm': 0.7035389881359554, 'learning_rate': 9.485821464009084e-06, 'epoch': 0.17} 17%|█▋ | 3785/22095 [6:40:50<17:15:43, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50189 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3786/22095 [6:40:59<24:49:32, 4.88s/it] {'loss': 0.5125, 'grad_norm': 0.6293192536212908, 'learning_rate': 9.485497686808594e-06, 'epoch': 0.17} 17%|█▋ | 3786/22095 [6:40:59<24:49:32, 4.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79560 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89973 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111985 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3787/22095 [6:41:02<22:49:13, 4.49s/it] {'loss': 0.4342, 'grad_norm': 0.6996237899447874, 'learning_rate': 9.485173813228535e-06, 'epoch': 0.17} 17%|█▋ | 3787/22095 [6:41:02<22:49:13, 4.49s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 17%|█▋ | 3788/22095 [6:41:05<20:31:07, 4.03s/it] {'loss': 0.4426, 'grad_norm': 0.7228737991665823, 'learning_rate': 9.484849843275863e-06, 'epoch': 0.17} 17%|█▋ | 3788/22095 [6:41:05<20:31:07, 4.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48753 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46497 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3789/22095 [6:41:09<20:12:02, 3.97s/it] {'loss': 0.4401, 'grad_norm': 0.6784925368979711, 'learning_rate': 9.484525776957544e-06, 'epoch': 0.17} 17%|█▋ | 3789/22095 [6:41:09<20:12:02, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50527 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82551 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3790/22095 [6:41:12<18:32:06, 3.65s/it] {'loss': 0.4187, 'grad_norm': 0.6955890630411466, 'learning_rate': 9.484201614280539e-06, 'epoch': 0.17} 17%|█▋ | 3790/22095 [6:41:12<18:32:06, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (95302 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47311 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3791/22095 [6:41:15<17:20:41, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (108667 > 40960). 
Running this sequence through the model will result in indexing errors {'loss': 0.3606, 'grad_norm': 0.720595862616324, 'learning_rate': 9.483877355251814e-06, 'epoch': 0.17} 17%|█▋ | 3791/22095 [6:41:15<17:20:41, 3.41s/it] 17%|█▋ | 3792/22095 [6:41:19<18:16:15, 3.59s/it] {'loss': 0.4037, 'grad_norm': 0.6916800542587127, 'learning_rate': 9.483552999878335e-06, 'epoch': 0.17} 17%|█▋ | 3792/22095 [6:41:19<18:16:15, 3.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (98471 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86717 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60457 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102864 > 40960). 
Running this sequence through the model will result in indexing errors 17%|█▋ | 3793/22095 [6:41:23<18:31:20, 3.64s/it] {'loss': 0.4423, 'grad_norm': 0.694954500461271, 'learning_rate': 9.483228548167075e-06, 'epoch': 0.17} 17%|█▋ | 3793/22095 [6:41:23<18:31:20, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3794/22095 [6:41:31<26:21:35, 5.19s/it] {'loss': 0.5163, 'grad_norm': 0.3969421838470267, 'learning_rate': 9.482904000124998e-06, 'epoch': 0.17} 17%|█▋ | 3794/22095 [6:41:31<26:21:35, 5.19s/it] 17%|█▋ | 3795/22095 [6:41:40<31:09:26, 6.13s/it] {'loss': 0.5129, 'grad_norm': 0.33452167181396353, 'learning_rate': 9.482579355759085e-06, 'epoch': 0.17} 17%|█▋ | 3795/22095 [6:41:40<31:09:26, 6.13s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 17%|█▋ | 3796/22095 [6:41:44<28:44:10, 5.65s/it] {'loss': 0.455, 'grad_norm': 0.6967667263051072, 'learning_rate': 9.482254615076307e-06, 'epoch': 0.17} 17%|█▋ | 3796/22095 [6:41:44<28:44:10, 5.65s/it] 17%|█▋ | 3797/22095 [6:41:50<28:50:23, 5.67s/it] {'loss': 0.5311, 'grad_norm': 0.3339040070467812, 'learning_rate': 9.481929778083646e-06, 'epoch': 0.17} 17%|█▋ | 3797/22095 [6:41:50<28:50:23, 5.67s/it] 17%|█▋ | 3798/22095 [6:41:57<30:43:51, 6.05s/it] {'loss': 0.5197, 'grad_norm': 0.36808453193924556, 'learning_rate': 9.481604844788078e-06, 'epoch': 0.17} 17%|█▋ | 3798/22095 [6:41:57<30:43:51, 6.05s/it] 17%|█▋ | 3799/22095 [6:42:06<35:55:32, 7.07s/it] {'loss': 0.4952, 'grad_norm': 0.35561700516571404, 'learning_rate': 9.481279815196587e-06, 'epoch': 0.17} 17%|█▋ | 3799/22095 [6:42:06<35:55:32, 7.07s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 17%|█▋ | 3800/22095 [6:42:10<30:22:08, 5.98s/it] {'loss': 0.4168, 'grad_norm': 1.0893008139643745, 'learning_rate': 9.480954689316155e-06, 'epoch': 0.17} 17%|█▋ | 3800/22095 [6:42:10<30:22:08, 5.98s/it] 17%|█▋ | 3801/22095 [6:42:19<35:33:56, 7.00s/it] {'loss': 0.5244, 'grad_norm': 
0.4053512638244079, 'learning_rate': 9.480629467153768e-06, 'epoch': 0.17} 17%|█▋ | 3801/22095 [6:42:19<35:33:56, 7.00s/it] 17%|█▋ | 3802/22095 [6:42:26<34:47:05, 6.85s/it] {'loss': 0.4998, 'grad_norm': 0.40523098581740213, 'learning_rate': 9.480304148716418e-06, 'epoch': 0.17} 17%|█▋ | 3802/22095 [6:42:26<34:47:05, 6.85s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 17%|█▋ | 3803/22095 [6:42:29<29:19:11, 5.77s/it] {'loss': 0.4028, 'grad_norm': 0.8014392539434086, 'learning_rate': 9.479978734011089e-06, 'epoch': 0.17} 17%|█▋ | 3803/22095 [6:42:29<29:19:11, 5.77s/it] 17%|█▋ | 3804/22095 [6:42:32<25:42:32, 5.06s/it] {'loss': 0.4377, 'grad_norm': 0.7256290606944289, 'learning_rate': 9.479653223044776e-06, 'epoch': 0.17} 17%|█▋ | 3804/22095 [6:42:32<25:42:32, 5.06s/it] 17%|█▋ | 3805/22095 [6:42:36<22:53:11, 4.50s/it] {'loss': 0.359, 'grad_norm': 0.6517262142803637, 'learning_rate': 9.479327615824476e-06, 'epoch': 0.17} 17%|█▋ | 3805/22095 [6:42:36<22:53:11, 4.50s/it] 17%|█▋ | 3806/22095 [6:42:39<20:32:25, 4.04s/it] {'loss': 0.4047, 'grad_norm': 0.7758678317801353, 'learning_rate': 9.479001912357181e-06, 'epoch': 0.17} 17%|█▋ | 3806/22095 [6:42:39<20:32:25, 4.04s/it] 17%|█▋ | 3807/22095 [6:42:42<18:56:40, 3.73s/it] {'loss': 0.4003, 'grad_norm': 0.6734929141109003, 'learning_rate': 9.478676112649892e-06, 'epoch': 0.17} 17%|█▋ | 3807/22095 [6:42:42<18:56:40, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (112718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47500 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78447 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47929 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44788 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3808/22095 [6:42:45<17:46:12, 3.50s/it] {'loss': 0.4209, 'grad_norm': 0.6460118289975646, 'learning_rate': 9.478350216709609e-06, 'epoch': 0.17} 17%|█▋ | 3808/22095 [6:42:45<17:46:12, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908006 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 31159, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nA. 2\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 17%|█▋ | 3809/22095 [6:42:47<16:46:37, 3.30s/it] {'loss': 0.4035, 'grad_norm': 0.7400410273732939, 'learning_rate': 9.478024224543332e-06, 'epoch': 0.17} 17%|█▋ | 3809/22095 [6:42:47<16:46:37, 3.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60710 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49172 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68887 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3810/22095 [6:42:51<16:30:06, 3.25s/it] {'loss': 0.4006, 'grad_norm': 0.6620153485178493, 'learning_rate': 9.477698136158068e-06, 'epoch': 0.17} 17%|█▋ | 3810/22095 [6:42:51<16:30:06, 3.25s/it] 17%|█▋ | 3811/22095 [6:42:54<17:27:51, 3.44s/it] {'loss': 0.3948, 'grad_norm': 0.674849489646353, 'learning_rate': 9.477371951560825e-06, 'epoch': 0.17} 17%|█▋ | 3811/22095 [6:42:54<17:27:51, 3.44s/it] 17%|█▋ | 3812/22095 [6:42:57<16:46:15, 3.30s/it] {'loss': 0.3879, 'grad_norm': 0.6830113481329078, 'learning_rate': 9.477045670758609e-06, 'epoch': 0.17} 17%|█▋ | 3812/22095 [6:42:57<16:46:15, 3.30s/it] 17%|█▋ | 3813/22095 [6:43:00<16:06:31, 3.17s/it] {'loss': 0.3828, 'grad_norm': 0.6779094831298704, 'learning_rate': 9.476719293758431e-06, 'epoch': 0.17} 17%|█▋ | 3813/22095 [6:43:00<16:06:31, 3.17s/it] 17%|█▋ | 3814/22095 [6:43:03<15:50:28, 3.12s/it] {'loss': 0.399, 'grad_norm': 0.66722862120204, 'learning_rate': 9.476392820567306e-06, 'epoch': 0.17} 17%|█▋ | 3814/22095 [6:43:03<15:50:28, 3.12s/it] 17%|█▋ | 3815/22095 [6:43:06<15:17:30, 3.01s/it] {'loss': 0.3661, 'grad_norm': 0.6684781940195049, 'learning_rate': 9.476066251192248e-06, 'epoch': 0.17} 17%|█▋ | 3815/22095 [6:43:06<15:17:30, 3.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3816/22095 [6:43:17<26:55:04, 5.30s/it] {'loss': 0.4783, 'grad_norm': 0.49516822571517094, 'learning_rate': 9.475739585640272e-06, 'epoch': 0.17} 17%|█▋ | 3816/22095 [6:43:17<26:55:04, 5.30s/it] 17%|█▋ | 3817/22095 [6:43:20<24:30:08, 4.83s/it] {'loss': 0.3734, 
'grad_norm': 0.624620814684528, 'learning_rate': 9.475412823918398e-06, 'epoch': 0.17} 17%|█▋ | 3817/22095 [6:43:20<24:30:08, 4.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 17%|█▋ | 3818/22095 [6:43:30<31:48:44, 6.27s/it] {'loss': 0.4856, 'grad_norm': 0.3758624282498899, 'learning_rate': 9.475085966033649e-06, 'epoch': 0.17} 17%|█▋ | 3818/22095 [6:43:30<31:48:44, 6.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99014 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58778 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92568 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3819/22095 [6:43:34<27:53:20, 5.49s/it] {'loss': 0.4685, 'grad_norm': 0.7495092258007775, 'learning_rate': 9.474759011993045e-06, 'epoch': 0.17} 17%|█▋ | 3819/22095 [6:43:34<27:53:20, 5.49s/it] 17%|█▋ | 3820/22095 [6:43:37<25:14:16, 4.97s/it] {'loss': 0.4345, 'grad_norm': 0.7028451617869309, 'learning_rate': 9.474431961803615e-06, 'epoch': 0.17} 17%|█▋ | 3820/22095 [6:43:37<25:14:16, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96571 > 40960). Running this sequence through the model will result in indexing errors 17%|█▋ | 3821/22095 [6:43:40<22:13:19, 4.38s/it] {'loss': 0.3717, 'grad_norm': 0.6838612365680022, 'learning_rate': 9.474104815472382e-06, 'epoch': 0.17} 17%|█▋ | 3821/22095 [6:43:40<22:13:19, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69896 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70270 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62186 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3822/22095 [6:43:43<19:57:03, 3.93s/it] {'loss': 0.4126, 'grad_norm': 0.6690480173586114, 'learning_rate': 9.47377757300638e-06, 'epoch': 0.17}
17%|█▋ | 3823/22095 [6:43:46<18:43:18, 3.69s/it] {'loss': 0.3475, 'grad_norm': 0.6459824004506101, 'learning_rate': 9.473450234412638e-06, 'epoch': 0.17}
17%|█▋ | 3824/22095 [6:43:49<17:27:49, 3.44s/it] {'loss': 0.4021, 'grad_norm': 0.6605687360932971, 'learning_rate': 9.473122799698189e-06, 'epoch': 0.17}
17%|█▋ | 3825/22095 [6:43:52<16:22:39, 3.23s/it] {'loss': 0.3697, 'grad_norm': 0.7336982972593437, 'learning_rate': 9.472795268870068e-06, 'epoch': 0.17}
17%|█▋ | 3826/22095 [6:43:55<16:05:11, 3.17s/it] {'loss': 0.4196, 'grad_norm': 0.6807569738448235, 'learning_rate': 9.472467641935314e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3827/22095 [6:44:04<24:44:37, 4.88s/it] {'loss': 0.5147, 'grad_norm': 0.6598634075011642, 'learning_rate': 9.472139918900969e-06, 'epoch': 0.17}
17%|█▋ | 3828/22095 [6:44:08<23:29:46, 4.63s/it] {'loss': 0.4068, 'grad_norm': 0.6445631249211847, 'learning_rate': 9.47181209977407e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3829/22095 [6:44:15<27:41:19, 5.46s/it] {'loss': 0.5171, 'grad_norm': 0.4750175521736982, 'learning_rate': 9.471484184561664e-06, 'epoch': 0.17}
17%|█▋ | 3830/22095 [6:44:25<33:56:00, 6.69s/it] {'loss': 0.4917, 'grad_norm': 0.3584003360435236, 'learning_rate': 9.471156173270796e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 364, but got module 1
17%|█▋ | 3831/22095 [6:44:28<29:09:50, 5.75s/it] {'loss': 0.3914, 'grad_norm': 0.7073047056713918, 'learning_rate': 9.470828065908512e-06, 'epoch': 0.17}
17%|█▋ | 3832/22095 [6:44:32<25:49:41, 5.09s/it] {'loss': 0.4363, 'grad_norm': 0.7026671316830553, 'learning_rate': 9.470499862481867e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (43659 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43711 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3833/22095 [6:44:35<22:23:29, 4.41s/it] {'loss': 0.3664, 'grad_norm': 0.6446855918509418, 'learning_rate': 9.470171562997908e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (112803 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3834/22095 [6:44:38<20:00:40, 3.95s/it] {'loss': 0.3864, 'grad_norm': 0.6653694374505091, 'learning_rate': 9.469843167463692e-06, 'epoch': 0.17}
17%|█▋ | 3835/22095 [6:44:40<18:08:01, 3.58s/it] {'loss': 0.399, 'grad_norm': 0.6814010028261566, 'learning_rate': 9.469514675886276e-06, 'epoch': 0.17}
17%|█▋ | 3836/22095 [6:44:44<18:47:41, 3.71s/it] {'loss': 0.3474, 'grad_norm': 0.7220147662032761, 'learning_rate': 9.469186088272714e-06, 'epoch': 0.17}
17%|█▋ | 3837/22095 [6:44:49<20:28:23, 4.04s/it] {'loss': 0.4211, 'grad_norm': 0.6498031930000514, 'learning_rate': 9.468857404630069e-06, 'epoch': 0.17}
17%|█▋ | 3838/22095 [6:44:52<19:03:38, 3.76s/it] {'loss': 0.4029, 'grad_norm': 0.6605078113760698, 'learning_rate': 9.468528624965406e-06, 'epoch': 0.17}
17%|█▋ | 3839/22095 [6:44:55<17:51:39, 3.52s/it] {'loss': 0.3802, 'grad_norm': 0.686696816542692, 'learning_rate': 9.468199749285785e-06, 'epoch': 0.17}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8346933 in VC:s3://internvl-moe-sft-data/. Exception: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 13597, 'image': 'vrdu_table_final_2/astro-ph.CO/9ecbcd87-61f8-4cb3-92d3-fc88a2ccfbc1.png', 'image_wh': [[84, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{@{}c@{}}ALMA\\\\ \\end{tabular}\n```"}]}
17%|█▋ | 3840/22095 [6:44:58<17:04:45, 3.37s/it] {'loss': 0.4076, 'grad_norm': 0.6838184982820498, 'learning_rate': 9.467870777598274e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3841/22095 [6:45:08<26:13:15, 5.17s/it] {'loss': 0.5383, 'grad_norm': 1.603580102321151, 'learning_rate': 9.467541709909942e-06, 'epoch': 0.17}
17%|█▋ | 3842/22095 [6:45:11<23:18:11, 4.60s/it] {'loss': 0.4162, 'grad_norm': 0.7090022793113754, 'learning_rate': 9.46721254622786e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3843/22095 [6:45:14<20:30:11, 4.04s/it] {'loss': 0.4254, 'grad_norm': 0.6742737656798072, 'learning_rate': 9.466883286559102e-06, 'epoch': 0.17}
17%|█▋ | 3844/22095 [6:45:17<19:28:52, 3.84s/it] {'loss': 0.3856, 'grad_norm': 0.7230508206818501, 'learning_rate': 9.46655393091074e-06, 'epoch': 0.17}
17%|█▋ | 3845/22095 [6:45:20<18:17:45, 3.61s/it] {'loss': 0.4029, 'grad_norm': 0.812199433596557, 'learning_rate': 9.466224479289851e-06, 'epoch': 0.17}
17%|█▋ | 3846/22095 [6:45:23<17:47:32, 3.51s/it] {'loss': 0.3809, 'grad_norm': 0.706257234178393, 'learning_rate': 9.465894931703517e-06, 'epoch': 0.17}
17%|█▋ | 3847/22095 [6:45:27<18:30:10, 3.65s/it] {'loss': 0.3825, 'grad_norm': 0.7939136340507305, 'learning_rate': 9.465565288158815e-06, 'epoch': 0.17}
17%|█▋ | 3848/22095 [6:45:31<18:01:36, 3.56s/it] {'loss': 0.3804, 'grad_norm': 0.7038949961513123, 'learning_rate': 9.46523554866283e-06, 'epoch': 0.17}
17%|█▋ | 3849/22095 [6:45:34<17:12:19, 3.39s/it] {'loss': 0.3839, 'grad_norm': 0.6547818451610412, 'learning_rate': 9.464905713222648e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3850/22095 [6:45:37<16:33:42, 3.27s/it] {'loss': 0.3985, 'grad_norm': 0.7689604401981761, 'learning_rate': 9.464575781845355e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3851/22095 [6:45:40<16:05:20, 3.17s/it] {'loss': 0.437, 'grad_norm': 0.6932882122137568, 'learning_rate': 9.46424575453804e-06, 'epoch': 0.17}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
17%|█▋ | 3852/22095 [6:45:43<15:39:36, 3.09s/it] {'loss': 0.3566, 'grad_norm': 0.6653382774110156, 'learning_rate': 9.463915631307795e-06, 'epoch': 0.17}
17%|█▋ | 3853/22095 [6:45:47<17:57:42, 3.54s/it] {'loss': 0.4094, 'grad_norm': 0.7126119345383991, 'learning_rate': 9.463585412161712e-06, 'epoch': 0.17}
17%|█▋ | 3854/22095 [6:45:51<17:56:51, 3.54s/it] {'loss': 0.4516, 'grad_norm': 0.7299262826317278, 'learning_rate': 9.463255097106888e-06, 'epoch': 0.17}
17%|█▋ | 3855/22095 [6:45:54<16:47:09, 3.31s/it] {'loss': 0.4126, 'grad_norm': 0.6814962311461914, 'learning_rate': 9.462924686150419e-06, 'epoch': 0.17}
17%|█▋ | 3856/22095 [6:45:57<17:13:28, 3.40s/it] {'loss': 0.3564, 'grad_norm': 0.6083842476390211, 'learning_rate': 9.462594179299408e-06, 'epoch': 0.17}
17%|█▋ | 3857/22095 [6:46:00<16:52:25, 3.33s/it] {'loss': 0.3934, 'grad_norm': 0.6648655332817587, 'learning_rate': 9.462263576560951e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (58747 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44794 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57769 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43432 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46288 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3858/22095 [6:46:03<15:57:16, 3.15s/it] {'loss': 0.4531, 'grad_norm': 0.6540476263210379, 'learning_rate': 9.461932877942154e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (70120 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49051 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3859/22095 [6:46:06<15:29:06, 3.06s/it] {'loss': 0.4127, 'grad_norm': 0.6841250761641118, 'learning_rate': 9.461602083450126e-06, 'epoch': 0.17}
17%|█▋ | 3860/22095 [6:46:10<17:14:18, 3.40s/it] {'loss': 0.4099, 'grad_norm': 0.6339962178288016, 'learning_rate': 9.461271193091971e-06, 'epoch': 0.17}
17%|█▋ | 3861/22095 [6:46:14<18:17:46, 3.61s/it] {'loss': 0.4309, 'grad_norm': 0.7159915950097685, 'learning_rate': 9.4609402068748e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (43459 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68737 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82492 > 40960). Running this sequence through the model will result in indexing errors
17%|█▋ | 3862/22095 [6:46:18<18:31:15, 3.66s/it] {'loss': 0.4109, 'grad_norm': 0.7058571671469768, 'learning_rate': 9.460609124805724e-06, 'epoch': 0.17}
17%|█▋ | 3863/22095 [6:46:22<18:45:14, 3.70s/it] {'loss': 0.3917, 'grad_norm': 0.6230601643244665, 'learning_rate': 9.460277946891859e-06, 'epoch': 0.17}
17%|█▋ | 3864/22095 [6:46:25<17:26:52, 3.45s/it] {'loss': 0.3789, 'grad_norm': 0.6702894866068349, 'learning_rate': 9.459946673140317e-06, 'epoch': 0.17}
17%|█▋ | 3865/22095 [6:46:28<18:01:22, 3.56s/it] {'loss': 0.4154, 'grad_norm': 0.6890944216862027, 'learning_rate': 9.45961530355822e-06, 'epoch': 0.17}
Invalidate trace cache @ step 2: expected module 1, but got module 364
17%|█▋ | 3866/22095 [6:46:40<29:29:11, 5.82s/it] {'loss': 0.5401, 'grad_norm': 1.3068604598743, 'learning_rate': 9.459283838152686e-06, 'epoch': 0.17}
Token indices sequence length is longer than the specified maximum sequence length for this model (72206 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67573 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116412 > 40960).
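Note: the repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings above mean some samples tokenize to more than the model's 40960-token context; the tokenizer only warns, and the overlong sequences are truncated or flagged later in the pipeline. A minimal pre-filter sketch (hypothetical helper, not the project's code; `tokenize` stands in for the real Qwen2.5-VL tokenizer and `samples` for the real conversation format):

```python
MAX_LEN = 40960  # the model's maximum sequence length, per the warnings above


def filter_overlong(samples, tokenize, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by tokenized length.

    `samples` is a list of dicts with a 'text' field; `tokenize` maps
    text to a list of token ids.
    """
    kept, dropped = [], []
    for sample in samples:
        n_tokens = len(tokenize(sample["text"]))
        (kept if n_tokens <= max_len else dropped).append(sample)
    return kept, dropped
```

Running such a filter once over the dataset manifest would surface these samples ahead of time instead of re-warning on every epoch.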
Running this sequence through the model will result in indexing errors
18%|█▊ | 3867/22095 [6:46:45<29:22:31, 5.80s/it] {'loss': 0.518, 'grad_norm': 1.1368248257270164, 'learning_rate': 9.45895227693084e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
18%|█▊ | 3868/22095 [6:46:50<27:46:00, 5.48s/it] {'loss': 0.4324, 'grad_norm': 0.7575971192564259, 'learning_rate': 9.458620619899803e-06, 'epoch': 0.18}
18%|█▊ | 3869/22095 [6:46:54<25:44:10, 5.08s/it] {'loss': 0.4319, 'grad_norm': 0.7079932207926797, 'learning_rate': 9.458288867066702e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3870/22095 [6:47:00<26:51:13, 5.30s/it] {'loss': 0.5161, 'grad_norm': 0.7920654492018414, 'learning_rate': 9.457957018438668e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (59429 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3871/22095 [6:47:04<24:27:10, 4.83s/it] {'loss': 0.4424, 'grad_norm': 0.7178653393777417, 'learning_rate': 9.457625074022827e-06, 'epoch': 0.18}
18%|█▊ | 3872/22095 [6:47:09<25:00:50, 4.94s/it] {'loss': 0.4109, 'grad_norm': 0.6877880291711488, 'learning_rate': 9.457293033826314e-06, 'epoch': 0.18}
18%|█▊ | 3873/22095 [6:47:12<22:41:52, 4.48s/it] {'loss': 0.3978, 'grad_norm': 0.6855375864869679, 'learning_rate': 9.456960897856264e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (54797 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48377 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42053 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3874/22095 [6:47:16<20:46:06, 4.10s/it] {'loss': 0.4077, 'grad_norm': 0.6933572812811226, 'learning_rate': 9.456628666119812e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3875/22095 [6:47:26<29:53:53, 5.91s/it] {'loss': 0.5352, 'grad_norm': 1.5072854612067466, 'learning_rate': 9.456296338624098e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3876/22095 [6:47:29<26:31:34, 5.24s/it] {'loss': 0.4084, 'grad_norm': 0.6863337744350352, 'learning_rate': 9.455963915376262e-06, 'epoch': 0.18}
18%|█▊ | 3877/22095 [6:47:33<24:19:49, 4.81s/it] {'loss': 0.3907, 'grad_norm': 0.8083693280407686, 'learning_rate': 9.455631396383446e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3878/22095 [6:47:41<28:41:52, 5.67s/it] {'loss': 0.5165, 'grad_norm': 1.1066296499903405, 'learning_rate': 9.455298781652797e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3879/22095 [6:47:51<35:27:23, 7.01s/it] {'loss': 0.4957, 'grad_norm': 0.8388078483952227, 'learning_rate': 9.454966071191461e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (68627 > 40960).
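Note: the paired lines "Number of image tokens 0 does not match number of images 1" and "Fixed image tokens in the conversation" indicate the data loader repairs conversations whose `<image>` placeholder count disagrees with the number of attached images. A hypothetical reconstruction of such a repair (the actual fix lives in `data_qwen_2.py` and is not shown in this log; `<image>` is the assumed placeholder string):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; the real token is defined by the data pipeline


def fix_image_tokens(conversation, num_images):
    """Adjust the first human turn so its placeholder count matches num_images."""
    turn = conversation[0]  # first human message of the conversation
    count = turn["value"].count(IMAGE_TOKEN)
    if count < num_images:
        # prepend the missing placeholders
        turn["value"] = IMAGE_TOKEN * (num_images - count) + "\n" + turn["value"].lstrip("\n")
    elif count > num_images:
        # drop surplus placeholders, leftmost first
        for _ in range(count - num_images):
            turn["value"] = turn["value"].replace(IMAGE_TOKEN, "", 1)
    return conversation
```

The "0 does not match 1" case above would take the first branch: one placeholder is prepended so the processor can splice the image features in.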
Running this sequence through the model will result in indexing errors
18%|█▊ | 3880/22095 [6:47:54<29:26:42, 5.82s/it] {'loss': 0.4132, 'grad_norm': 0.7459585566669298, 'learning_rate': 9.454633265006585e-06, 'epoch': 0.18}
18%|█▊ | 3881/22095 [6:47:57<25:50:25, 5.11s/it] {'loss': 0.4443, 'grad_norm': 0.7000355787238284, 'learning_rate': 9.454300363105323e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3882/22095 [6:48:07<32:28:46, 6.42s/it] {'loss': 0.4895, 'grad_norm': 0.6348413063134372, 'learning_rate': 9.453967365494824e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (69360 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88532 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50235 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101957 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45674 > 40960) for 4 sample(s). Truncating to 448 with 1 samples.
18%|█▊ | 3883/22095 [6:48:10<28:05:10, 5.55s/it] {'loss': 0.4248, 'grad_norm': 0.7044484596842852, 'learning_rate': 9.453634272182249e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (71357 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43828 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88637 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3884/22095 [6:48:14<24:19:28, 4.81s/it] {'loss': 0.4257, 'grad_norm': 0.7131839130118517, 'learning_rate': 9.45330108317475e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (115326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79715 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129493 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108380 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3885/22095 [6:48:18<23:14:21, 4.59s/it] {'loss': 0.4329, 'grad_norm': 0.6633999287098569, 'learning_rate': 9.45296779847949e-06, 'epoch': 0.18}
18%|█▊ | 3886/22095 [6:48:21<21:07:17, 4.18s/it] {'loss': 0.4161, 'grad_norm': 0.6678411587728268, 'learning_rate': 9.452634418103626e-06, 'epoch': 0.18}
18%|█▊ | 3887/22095 [6:48:24<19:27:17, 3.85s/it] {'loss': 0.4185, 'grad_norm': 0.6953785685795281, 'learning_rate': 9.452300942054324e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3888/22095 [6:48:32<25:36:20, 5.06s/it] {'loss': 0.5453, 'grad_norm': 1.179121220652229, 'learning_rate': 9.451967370338747e-06, 'epoch': 0.18}
18%|█▊ | 3889/22095 [6:48:35<22:47:30, 4.51s/it] {'loss': 0.4168, 'grad_norm': 0.7069774255329982, 'learning_rate': 9.451633702964067e-06, 'epoch': 0.18}
18%|█▊ | 3890/22095 [6:48:38<21:01:56, 4.16s/it] {'loss': 0.3923, 'grad_norm': 0.7127050767647245, 'learning_rate': 9.45129993993745e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3891/22095 [6:48:48<29:32:55, 5.84s/it] {'loss': 0.4977, 'grad_norm': 0.9695002630974516, 'learning_rate': 9.450966081266069e-06, 'epoch': 0.18}
18%|█▊ | 3892/22095 [6:48:56<33:10:26, 6.56s/it] {'loss': 0.526, 'grad_norm': 0.8269460584043274, 'learning_rate': 9.450632126957098e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3893/22095 [6:49:06<37:42:55, 7.46s/it] {'loss': 0.5192, 'grad_norm': 0.5904239742667041, 'learning_rate': 9.45029807701771e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 364, but got module 1
18%|█▊ | 3894/22095 [6:49:09<31:30:17, 6.23s/it] {'loss': 0.3919, 'grad_norm': 0.7518798449772747, 'learning_rate': 9.449963931455084e-06, 'epoch': 0.18}
18%|█▊ | 3895/22095 [6:49:13<27:59:52, 5.54s/it] {'loss': 0.4009, 'grad_norm': 0.7435713399453958, 'learning_rate': 9.449629690276401e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (55906 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44304 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (147581 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3896/22095 [6:49:17<25:21:01, 5.01s/it] {'loss': 0.3547, 'grad_norm': 0.5816424605119339, 'learning_rate': 9.44929535348884e-06, 'epoch': 0.18}
18%|█▊ | 3897/22095 [6:49:20<22:25:48, 4.44s/it] {'loss': 0.367, 'grad_norm': 0.821270546116312, 'learning_rate': 9.44896092109959e-06, 'epoch': 0.18}
18%|█▊ | 3898/22095 [6:49:23<20:19:41, 4.02s/it] {'loss': 0.4123, 'grad_norm': 0.9931850336613028, 'learning_rate': 9.448626393115833e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (86080 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53402 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3899/22095 [6:49:26<18:18:53, 3.62s/it] {'loss': 0.3862, 'grad_norm': 0.6533003091582168, 'learning_rate': 9.448291769544758e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (44870 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3900/22095 [6:49:29<18:15:36, 3.61s/it] {'loss': 0.4037, 'grad_norm': 0.6816255014141562, 'learning_rate': 9.447957050393552e-06, 'epoch': 0.18}
18%|█▊ | 3901/22095 [6:49:32<17:14:53, 3.41s/it] {'loss': 0.4018, 'grad_norm': 0.7414261462013415, 'learning_rate': 9.447622235669412e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [184, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8447981 in VC:s3://internvl-moe-sft-data/. Exception: Image size [184, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 146196, 'image': 'vrdu_texteq/astro-ph.CO/3720258f-6362-48f7-8227-0f3007fc24f3.png', 'image_wh': [[184, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': '\\ a $2 \\times 2$ matrix:'}]}
18%|█▊ | 3902/22095 [6:49:36<17:53:39, 3.54s/it] {'loss': 0.3837, 'grad_norm': 0.7064509154674706, 'learning_rate': 9.44728732537953e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (73364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43882 > 40960).
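Note: the recurring `ValueError: Image size [...] is too small. Minimum size is 28.` tracebacks come from `_get_item` rejecting images whose shorter edge is under 28 px (e.g. the 184x23 text crop in the preceding sample), likely because the vision encoder's 14-px patches are merged 2x2, giving a 28-px minimum edge. An approximate, hypothetical reconstruction of that check (the real one is in `data_qwen_2.py` and is not shown in this log):

```python
MIN_SIZE = 28  # minimum edge length in pixels, per the error messages above


def check_image_size(width, height, min_size=MIN_SIZE):
    """Raise ValueError for images too small to patchify, mirroring _get_item."""
    if min(width, height) < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_size}."
        )
    return True
```

Samples such as the narrow table and equation crops logged above fail this check on their height, which is why the loader retries with a different sample.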
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41405 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52937 > 40960). Running this sequence through the model will result in indexing errors 18%|█▊ | 3903/22095 [6:49:40<17:39:22, 3.49s/it] {'loss': 0.3788, 'grad_norm': 0.6743859732965065, 'learning_rate': 9.446952319531102e-06, 'epoch': 0.18} 18%|█▊ | 3903/22095 [6:49:40<17:39:22, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 18%|█▊ | 3904/22095 [6:49:49<26:38:51, 5.27s/it] {'loss': 0.5672, 'grad_norm': 2.4314369942165257, 'learning_rate': 9.446617218131326e-06, 'epoch': 0.18} 18%|█▊ | 3904/22095 [6:49:49<26:38:51, 5.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118419 > 40960). Running this sequence through the model will result in indexing errors 18%|█▊ | 3905/22095 [6:49:52<23:28:53, 4.65s/it] {'loss': 0.4037, 'grad_norm': 0.7027033907421777, 'learning_rate': 9.446282021187403e-06, 'epoch': 0.18} 18%|█▊ | 3905/22095 [6:49:52<23:28:53, 4.65s/it] 18%|█▊ | 3906/22095 [6:49:55<21:10:27, 4.19s/it] {'loss': 0.4069, 'grad_norm': 0.6764659734044738, 'learning_rate': 9.445946728706535e-06, 'epoch': 0.18} 18%|█▊ | 3906/22095 [6:49:55<21:10:27, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 18%|█▊ | 3907/22095 [6:50:05<29:04:01, 5.75s/it] {'loss': 0.5385, 'grad_norm': 1.2756689798521164, 'learning_rate': 9.445611340695926e-06, 'epoch': 0.18} 18%|█▊ | 3907/22095 [6:50:05<29:04:01, 5.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96511 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96048 > 40960). Running this sequence through the model will result in indexing errors 18%|█▊ | 3908/22095 [6:50:08<24:57:02, 4.94s/it] {'loss': 0.3839, 'grad_norm': 2.5327412146331674, 'learning_rate': 9.445275857162784e-06, 'epoch': 0.18} 18%|█▊ | 3908/22095 [6:50:08<24:57:02, 4.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111915 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8881048 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4201, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 4cm\nB. 5cm\nC. 无法确定\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 18%|█▊ | 3909/22095 [6:50:11<21:55:23, 4.34s/it] {'loss': 0.406, 'grad_norm': 0.6686919910303133, 'learning_rate': 9.444940278114316e-06, 'epoch': 0.18} 18%|█▊ | 3909/22095 [6:50:11<21:55:23, 4.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 18%|█▊ | 3910/22095 [6:50:15<21:21:34, 4.23s/it] {'loss': 0.3656, 'grad_norm': 0.8168236937312541, 'learning_rate': 9.444604603557733e-06, 'epoch': 0.18} 18%|█▊ | 3910/22095 [6:50:15<21:21:34, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 18%|█▊ | 3911/22095 [6:50:24<29:39:31, 5.87s/it] {'loss': 0.5121, 'grad_norm': 0.5818948523437871, 'learning_rate': 9.444268833500247e-06, 'epoch': 0.18} 18%|█▊ | 3911/22095 [6:50:24<29:39:31, 5.87s/it] 18%|█▊ | 3912/22095 [6:50:28<25:39:35, 5.08s/it] {'loss': 0.4032, 'grad_norm': 0.6705505609317879, 'learning_rate': 9.443932967949074e-06, 'epoch': 0.18} 18%|█▊ | 3912/22095 [6:50:28<25:39:35, 5.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 18%|█▊ | 3913/22095 [6:50:37<32:19:26, 6.40s/it] {'loss': 0.5175, 'grad_norm': 0.756973477548209, 'learning_rate': 9.443597006911432e-06, 'epoch': 0.18} 18%|█▊ | 3913/22095 [6:50:37<32:19:26, 6.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [659, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8509610 in VC:s3://internvl-moe-sft-data/. Exception: Image size [659, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 4499, 'image': 'vrdu_texteq/astro-ph.CO/f6df908e-f8f7-496b-9c56-1666296f342d.png', 'image_wh': [[659, 23]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'The absence of a correlation is excluded at $\\sim 9 \\sigma$ level.'}]}
18%|█▊ | 3914/22095 [6:50:41<28:05:03, 5.56s/it] {'loss': 0.4031, 'grad_norm': 0.720833225924833, 'learning_rate': 9.443260950394535e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3915/22095 [6:50:45<25:34:20, 5.06s/it] {'loss': 0.3792, 'grad_norm': 0.7253115963736714, 'learning_rate': 9.442924798405605e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3916/22095 [6:50:54<32:02:27, 6.35s/it] {'loss': 0.5194, 'grad_norm': 0.7770778836748186, 'learning_rate': 9.44258855095187e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045958 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 8\nB. 7\nC. 6\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
18%|█▊ | 3917/22095 [6:50:58<28:05:36, 5.56s/it] {'loss': 0.3729, 'grad_norm': 0.7620354620331995, 'learning_rate': 9.442252208040551e-06, 'epoch': 0.18}
18%|█▊ | 3918/22095 [6:51:01<24:06:17, 4.77s/it] {'loss': 0.4391, 'grad_norm': 0.7620836046469401, 'learning_rate': 9.441915769678874e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047899 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 13cm\nB. 7cm\nC. 8cm\nD. 1lcm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
18%|█▊ | 3919/22095 [6:51:05<22:58:46, 4.55s/it] {'loss': 0.4095, 'grad_norm': 0.6579862798048236, 'learning_rate': 9.44157923587407e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (83311 > 40960).
Running this sequence through the model will result in indexing errors
18%|█▊ | 3920/22095 [6:51:08<20:40:28, 4.10s/it] {'loss': 0.4239, 'grad_norm': 0.8267059990269631, 'learning_rate': 9.441242606633369e-06, 'epoch': 0.18}
18%|█▊ | 3921/22095 [6:51:11<18:54:15, 3.74s/it] {'loss': 0.3951, 'grad_norm': 0.7353518313708115, 'learning_rate': 9.440905881964007e-06, 'epoch': 0.18}
18%|█▊ | 3922/22095 [6:51:14<18:25:46, 3.65s/it] {'loss': 0.3894, 'grad_norm': 0.6659406803623616, 'learning_rate': 9.440569061873213e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3923/22095 [6:51:18<19:33:42, 3.88s/it] {'loss': 0.3994, 'grad_norm': 0.7242839163193702, 'learning_rate': 9.44023214636823e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46959 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119296 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131590 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3924/22095 [6:51:24<22:37:39, 4.48s/it] {'loss': 0.535, 'grad_norm': 0.6445084964739258, 'learning_rate': 9.439895135456297e-06, 'epoch': 0.18}
18%|█▊ | 3925/22095 [6:51:28<21:20:23, 4.23s/it] {'loss': 0.4296, 'grad_norm': 0.7747303062389351, 'learning_rate': 9.43955802914465e-06, 'epoch': 0.18}
18%|█▊ | 3926/22095 [6:51:31<19:56:41, 3.95s/it] {'loss': 0.421, 'grad_norm': 0.6792800098985597, 'learning_rate': 9.439220827440539e-06, 'epoch': 0.18}
18%|█▊ | 3927/22095 [6:51:35<19:51:59, 3.94s/it] {'loss': 0.432, 'grad_norm': 0.7261120584253042, 'learning_rate': 9.438883530351207e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76544 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76038 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57535 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98517 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41629 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3928/22095 [6:51:41<22:21:46, 4.43s/it] {'loss': 0.5249, 'grad_norm': 0.39904041842961024, 'learning_rate': 9.438546137883898e-06, 'epoch': 0.18}
18%|█▊ | 3929/22095 [6:51:44<20:32:39, 4.07s/it] {'loss': 0.3769, 'grad_norm': 0.8107503680750474, 'learning_rate': 9.438208650045866e-06, 'epoch': 0.18}
18%|█▊ | 3930/22095 [6:51:47<18:48:38, 3.73s/it] {'loss': 0.4014, 'grad_norm': 0.7203559014405553, 'learning_rate': 9.43787106684436e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3931/22095 [6:51:51<19:04:24, 3.78s/it] {'loss': 0.3732, 'grad_norm': 0.962397168004711, 'learning_rate': 9.437533388286635e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_423960.png 2025-08-27 22:49:49.611287 load time: 1064.33 ms
18%|█▊ | 3932/22095 [6:52:02<30:01:54, 5.95s/it] {'loss': 0.5248, 'grad_norm': 0.45413655885885273, 'learning_rate': 9.437195614379947e-06, 'epoch': 0.18}
18%|█▊ | 3933/22095 [6:52:05<26:26:41, 5.24s/it] {'loss': 0.3741, 'grad_norm': 0.817919637821282, 'learning_rate': 9.436857745131553e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (51058 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74307 > 40960).
Running this sequence through the model will result in indexing errors
18%|█▊ | 3934/22095 [6:52:08<22:39:40, 4.49s/it] {'loss': 0.4316, 'grad_norm': 0.682119639232124, 'learning_rate': 9.436519780548712e-06, 'epoch': 0.18}
18%|█▊ | 3935/22095 [6:52:11<20:06:13, 3.99s/it] {'loss': 0.3644, 'grad_norm': 0.6600377596146887, 'learning_rate': 9.436181720638688e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3936/22095 [6:52:20<28:18:53, 5.61s/it] {'loss': 0.4959, 'grad_norm': 0.4001893548138264, 'learning_rate': 9.435843565408742e-06, 'epoch': 0.18}
18%|█▊ | 3937/22095 [6:52:24<25:40:04, 5.09s/it] {'loss': 0.4095, 'grad_norm': 0.8448226633705256, 'learning_rate': 9.435505314866143e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3938/22095 [6:52:28<23:41:37, 4.70s/it] {'loss': 0.4406, 'grad_norm': 0.7752472854802783, 'learning_rate': 9.435166969018158e-06, 'epoch': 0.18}
18%|█▊ | 3939/22095 [6:52:31<21:23:25, 4.24s/it] {'loss': 0.422, 'grad_norm': 0.6748292726243518, 'learning_rate': 9.434828527872052e-06, 'epoch': 0.18}
18%|█▊ | 3940/22095 [6:52:34<19:22:57, 3.84s/it] {'loss': 0.4333, 'grad_norm': 0.7678223720734237, 'learning_rate': 9.434489991435106e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3941/22095 [6:52:37<17:44:03, 3.52s/it] {'loss': 0.3947, 'grad_norm': 0.8010333175192554, 'learning_rate': 9.434151359714587e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3942/22095 [6:52:46<26:38:41, 5.28s/it] {'loss': 0.5114, 'grad_norm': 0.4352994652107656, 'learning_rate': 9.433812632717776e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3943/22095 [6:52:50<23:53:39, 4.74s/it] {'loss': 0.4461, 'grad_norm': 0.7307693833726469, 'learning_rate': 9.433473810451947e-06, 'epoch': 0.18}
18%|█▊ | 3944/22095 [6:52:53<21:40:44, 4.30s/it] {'loss': 0.3852, 'grad_norm': 0.6569377227188848, 'learning_rate': 9.433134892924383e-06, 'epoch': 0.18}
18%|█▊ | 3945/22095 [6:52:57<20:32:53, 4.08s/it] {'loss': 0.4646, 'grad_norm': 0.7588640467049291, 'learning_rate': 9.432795880142366e-06, 'epoch': 0.18}
18%|█▊ | 3946/22095 [6:53:00<19:05:59, 3.79s/it] {'loss': 0.3859, 'grad_norm': 0.7047461181700752, 'learning_rate': 9.432456772113179e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [556, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8495390 in VC:s3://internvl-moe-sft-data/. Exception: Image size [556, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 127346, 'image': 'vrdu_texteq/astro-ph.CO/7ba0b7af-b691-4d80-89c2-beb8bcfc9cb5.png', 'image_wh': [[556, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $\\hat{z}^{\\mu}$ in the unit vector in the $z$ direction.'}]}
18%|█▊ | 3947/22095 [6:53:03<18:53:08, 3.75s/it] {'loss': 0.4089, 'grad_norm': 0.6265443772082004, 'learning_rate': 9.43211756884411e-06, 'epoch': 0.18}
18%|█▊ | 3948/22095 [6:53:07<18:04:40, 3.59s/it] {'loss': 0.4275, 'grad_norm': 0.709027381706952, 'learning_rate': 9.431778270342447e-06, 'epoch': 0.18}
18%|█▊ | 3949/22095 [6:53:10<17:38:59, 3.50s/it] {'loss': 0.4154, 'grad_norm': 0.7120230598819207, 'learning_rate': 9.431438876615478e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3950/22095 [6:53:13<17:47:50, 3.53s/it] {'loss': 0.4286, 'grad_norm': 0.6887782725922001, 'learning_rate': 9.4310993876705e-06, 'epoch': 0.18}
18%|█▊ | 3951/22095 [6:53:17<17:48:45, 3.53s/it] {'loss': 0.4382, 'grad_norm': 0.710693416079211, 'learning_rate': 9.430759803514802e-06, 'epoch': 0.18}
18%|█▊ | 3952/22095 [6:53:20<16:55:29, 3.36s/it] {'loss': 0.3749, 'grad_norm': 0.6188621557116822, 'learning_rate': 9.430420124155687e-06, 'epoch': 0.18}
18%|█▊ | 3953/22095 [6:53:24<17:41:19, 3.51s/it] {'loss': 0.436, 'grad_norm': 0.6738178241809274, 'learning_rate': 9.43008034960045e-06, 'epoch': 0.18}
18%|█▊ | 3954/22095 [6:53:27<17:51:14, 3.54s/it] {'loss': 0.4149, 'grad_norm': 0.6825365111947984, 'learning_rate': 9.42974047985639e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (112162 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45491 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3955/22095 [6:53:31<17:13:22, 3.42s/it] {'loss': 0.4023, 'grad_norm': 0.6605285605975528, 'learning_rate': 9.429400514930815e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (110682 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3956/22095 [6:53:33<16:20:57, 3.24s/it] {'loss': 0.4472, 'grad_norm': 0.7096638896075035, 'learning_rate': 9.429060454831026e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3957/22095 [6:53:40<21:11:57, 4.21s/it] {'loss': 0.5404, 'grad_norm': 0.415451298677631, 'learning_rate': 9.42872029956433e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (60388 > 40960).
Running this sequence through the model will result in indexing errors
18%|█▊ | 3958/22095 [6:53:43<19:30:27, 3.87s/it] {'loss': 0.4114, 'grad_norm': 0.7667486909261592, 'learning_rate': 9.428380049138038e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359934 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26655, 'image': 'vrdu_table_final_2/astro-ph.CO/c0fa5ec1-7d91-40d5-a2f5-90a58df74edb.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
18%|█▊ | 3959/22095 [6:53:46<18:45:42, 3.72s/it] {'loss': 0.4269, 'grad_norm': 0.6663120328599494, 'learning_rate': 9.428039703559458e-06, 'epoch': 0.18}
18%|█▊ | 3960/22095 [6:53:49<17:28:31, 3.47s/it] {'loss': 0.4392, 'grad_norm': 0.7142005965525264, 'learning_rate': 9.427699262835904e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3961/22095 [6:53:53<17:15:22, 3.43s/it] {'loss': 0.4144, 'grad_norm': 0.8063258085038403, 'learning_rate': 9.427358726974693e-06, 'epoch': 0.18}
18%|█▊ | 3962/22095 [6:53:56<16:50:48, 3.34s/it] {'loss': 0.3895, 'grad_norm': 0.6875786825371241, 'learning_rate': 9.42701809598314e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3963/22095 [6:54:05<25:58:35, 5.16s/it] {'loss': 0.4787, 'grad_norm': 0.38949801700994485, 'learning_rate': 9.426677369868564e-06, 'epoch': 0.18}
18%|█▊ | 3964/22095 [6:54:09<23:45:52, 4.72s/it] {'loss': 0.3878, 'grad_norm': 0.7008172907799324, 'learning_rate': 9.426336548638287e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884031 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7184, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 2cm\nB. 4cm\nC. 1cm\nD. 1.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
18%|█▊ | 3965/22095 [6:54:12<20:58:36, 4.17s/it] {'loss': 0.4116, 'grad_norm': 0.6497017033257975, 'learning_rate': 9.425995632299631e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3966/22095 [6:54:16<20:56:35, 4.16s/it] {'loss': 0.5049, 'grad_norm': 0.3271747753890301, 'learning_rate': 9.425654620859923e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (58900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51358 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3967/22095 [6:54:20<20:44:25, 4.12s/it] {'loss': 0.3762, 'grad_norm': 0.7075592309298688, 'learning_rate': 9.425313514326491e-06, 'epoch': 0.18}
18%|█▊ | 3968/22095 [6:54:23<19:39:20, 3.90s/it] {'loss': 0.4145, 'grad_norm': 1.1093788268170846, 'learning_rate': 9.424972312706663e-06, 'epoch': 0.18}
18%|█▊ | 3969/22095 [6:54:27<18:49:09, 3.74s/it] {'loss': 0.4049, 'grad_norm': 0.69564273713801, 'learning_rate': 9.424631016007768e-06, 'epoch': 0.18}
18%|█▊ | 3970/22095 [6:54:30<18:19:27, 3.64s/it] {'loss': 0.357, 'grad_norm': 0.6730428805768747, 'learning_rate': 9.424289624237143e-06, 'epoch': 0.18}
18%|█▊ | 3971/22095 [6:54:34<18:53:37, 3.75s/it] {'loss': 0.4044, 'grad_norm': 0.753946831911797, 'learning_rate': 9.423948137402123e-06, 'epoch': 0.18}
18%|█▊ | 3972/22095 [6:54:37<18:19:29, 3.64s/it] {'loss': 0.3898, 'grad_norm': 0.6356412800721248, 'learning_rate': 9.423606555510043e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8909862 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33015, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nA. 16\nB. 9\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
18%|█▊ | 3973/22095 [6:54:41<17:48:29, 3.54s/it] {'loss': 0.3978, 'grad_norm': 0.6725822799634429, 'learning_rate': 9.423264878568246e-06, 'epoch': 0.18}
18%|█▊ | 3974/22095 [6:54:45<18:15:18, 3.63s/it] {'loss': 0.4201, 'grad_norm': 0.7231602834064391, 'learning_rate': 9.42292310658407e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3975/22095 [6:54:55<28:47:06, 5.72s/it] {'loss': 0.5007, 'grad_norm': 0.3432254737661334, 'learning_rate': 9.422581239564861e-06, 'epoch': 0.18}
18%|█▊ | 3976/22095 [6:55:00<26:58:19, 5.36s/it] {'loss': 0.4049, 'grad_norm': 0.7604507555915985, 'learning_rate': 9.422239277517964e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [12, 3, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369925 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 3, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36677, 'image': 'vrdu_table_final_2/astro-ph.CO/7bee2787-c2d2-41c3-8384-f853e2061c7c.png', 'image_wh': [[12, 3]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}l@{}} - \\\\ $\\,$ \\end{tabular}\n```"}]}
18%|█▊ | 3977/22095 [6:55:04<25:28:40, 5.06s/it] {'loss': 0.3911, 'grad_norm': 0.6473006192279481, 'learning_rate': 9.421897220450728e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 3978/22095 [6:55:07<22:47:36, 4.53s/it] {'loss': 0.3843, 'grad_norm': 0.6406786856957288, 'learning_rate': 9.4215550683705e-06, 'epoch': 0.18}
18%|█▊ | 3979/22095 [6:55:10<20:39:30, 4.11s/it] {'loss': 0.4037, 'grad_norm': 0.7426731979212589, 'learning_rate': 9.421212821284633e-06, 'epoch': 0.18}
18%|█▊ | 3980/22095 [6:55:13<18:51:42, 3.75s/it] {'loss': 0.4113, 'grad_norm': 0.6165518680048144, 'learning_rate': 9.420870479200483e-06, 'epoch': 0.18}
18%|█▊ | 3981/22095 [6:55:17<18:16:54, 3.63s/it] {'loss': 0.3845, 'grad_norm': 0.6055004238082176, 'learning_rate': 9.420528042125404e-06, 'epoch': 0.18}
18%|█▊ | 3982/22095 [6:55:20<17:50:50, 3.55s/it] {'loss': 0.4331, 'grad_norm': 0.6829154371077608, 'learning_rate': 9.420185510066753e-06, 'epoch': 0.18}
18%|█▊ | 3983/22095 [6:55:23<16:45:20, 3.33s/it] {'loss': 0.4057, 'grad_norm': 0.701267887540177, 'learning_rate': 9.41984288303189e-06, 'epoch': 0.18}
18%|█▊ | 3984/22095 [6:55:27<17:22:24, 3.45s/it] {'loss': 0.4022, 'grad_norm': 0.7412461761260488, 'learning_rate': 9.419500161028178e-06, 'epoch': 0.18}
18%|█▊ | 3985/22095 [6:55:30<17:19:35, 3.44s/it] {'loss': 0.3746, 'grad_norm': 0.6478422974417011, 'learning_rate': 9.419157344062984e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (93695 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59870 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3986/22095 [6:55:34<18:37:57, 3.70s/it] {'loss': 0.4228, 'grad_norm': 0.6463703026899874, 'learning_rate': 9.418814432143669e-06, 'epoch': 0.18}
18%|█▊ | 3987/22095 [6:55:38<19:06:46, 3.80s/it] {'loss': 0.4108, 'grad_norm': 0.6316234750070395, 'learning_rate': 9.418471425277603e-06, 'epoch': 0.18}
18%|█▊ | 3988/22095 [6:55:41<17:27:30, 3.47s/it] {'loss': 0.4254, 'grad_norm': 0.7962127763617338, 'learning_rate': 9.418128323472157e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047901 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm'}, {'from': 'gpt', 'value': '【解答】解:∵CB=3cm,DB=5cm,∴CD=5-3=2cm,∵D是AC的中点,∴AC=2CD=4cm,∴AB=AC+CB=4+3=7cm.'}]}
18%|█▊ | 3989/22095 [6:55:44<16:42:25, 3.32s/it] {'loss': 0.4222, 'grad_norm': 0.6952711404179807, 'learning_rate': 9.417785126734701e-06, 'epoch': 0.18}
18%|█▊ | 3990/22095 [6:55:48<18:01:48, 3.59s/it] {'loss': 0.412, 'grad_norm': 0.7285155806695047, 'learning_rate': 9.417441835072615e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (83592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103217 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83387 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3991/22095 [6:55:52<17:49:43, 3.55s/it] {'loss': 0.3712, 'grad_norm': 0.6387050480157371, 'learning_rate': 9.417098448493267e-06, 'epoch': 0.18}
18%|█▊ | 3992/22095 [6:55:55<17:51:54, 3.55s/it] {'loss': 0.3723, 'grad_norm': 0.6248158896534401, 'learning_rate': 9.41675496700404e-06, 'epoch': 0.18}
18%|█▊ | 3993/22095 [6:55:59<17:32:45, 3.49s/it] {'loss': 0.3747, 'grad_norm': 0.6519533699295533, 'learning_rate': 9.416411390612315e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 3994/22095 [6:56:09<27:53:52, 5.55s/it] {'loss': 0.5126, 'grad_norm': 0.49737740981951944, 'learning_rate': 9.416067719325472e-06, 'epoch': 0.18}
18%|█▊ | 3995/22095 [6:56:12<24:48:03, 4.93s/it] {'loss': 0.4085, 'grad_norm': 0.6608173062896084, 'learning_rate': 9.415723953150897e-06, 'epoch': 0.18}
18%|█▊ | 3996/22095 [6:56:17<23:40:18, 4.71s/it] {'loss': 0.3878, 'grad_norm': 0.6267983817821665, 'learning_rate': 9.415380092095976e-06, 'epoch': 0.18}
18%|█▊ | 3997/22095 [6:56:19<20:53:34, 4.16s/it] {'loss': 0.4488, 'grad_norm': 0.7368257783544285, 'learning_rate': 9.415036136168099e-06, 'epoch': 0.18}
18%|█▊ | 3998/22095 [6:56:23<20:16:39, 4.03s/it] {'loss': 0.4199, 'grad_norm': 0.6199378468837977, 'learning_rate': 9.414692085374654e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (125845 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60448 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44099 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 3999/22095 [6:56:31<25:10:59, 5.01s/it] {'loss': 0.517, 'grad_norm': 0.5008602627173249, 'learning_rate': 9.414347939723033e-06, 'epoch': 0.18}
18%|█▊ | 4000/22095 [6:56:34<22:29:26, 4.47s/it] {'loss': 0.4177, 'grad_norm': 0.7481791294382593, 'learning_rate': 9.414003699220636e-06, 'epoch': 0.18}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
18%|█▊ | 4001/22095 [6:57:17<81:41:41, 16.25s/it] {'loss': 0.3521, 'grad_norm': 0.6333229028807124, 'learning_rate': 9.413659363874855e-06, 'epoch': 0.18}
18%|█▊ | 4002/22095 [6:57:21<61:56:05, 12.32s/it] {'loss': 0.3738, 'grad_norm': 0.6906285797805105, 'learning_rate': 9.413314933693088e-06, 'epoch': 0.18}
VC:s3://multi-modal/playground/data/geoqa+/images/12038.png 2025-08-27 22:55:19.429318 load time: 1032.85 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 4003/22095 [6:57:24<48:29:05, 9.65s/it] {'loss': 0.4365, 'grad_norm': 0.6255253306601154, 'learning_rate': 9.41297040868274e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (56067 > 40960). Running this sequence through the model will result in indexing errors
18%|█▊ | 4004/22095 [6:57:27<38:36:26, 7.68s/it] {'loss': 0.382, 'grad_norm': 0.6497671002311871, 'learning_rate': 9.412625788851208e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
18%|█▊ | 4005/22095 [6:57:31<32:45:10, 6.52s/it] {'loss': 0.4666, 'grad_norm': 0.734449128993359, 'learning_rate': 9.412281074205903e-06, 'epoch': 0.18}
18%|█▊ | 4006/22095 [6:57:34<27:28:58, 5.47s/it] {'loss': 0.4051, 'grad_norm': 0.6881308403070234, 'learning_rate': 9.41193626475423e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
18%|█▊ | 4007/22095 [6:57:43<33:31:42, 6.67s/it] {'loss': 0.5014, 'grad_norm': 0.4046944363293297, 'learning_rate': 9.411591360503594e-06, 'epoch': 0.18}
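The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` lines are the standard Hugging Face tokenizer warning: the tokenizer itself does not truncate, it only warns that the downstream model would index past its context window. A minimal sketch of a guard that clips over-length samples before collation; `MAX_LEN` and `truncate_sample` are illustrative names under the assumption that the 40960 limit in the warnings is the model's context size, and a real pipeline might instead drop or re-chunk such samples:

```python
# Sketch of a pre-collation length guard, assuming the 40960-token
# limit reported by the tokenizer warnings above. Clipping is the
# simplest policy; skipping the sample entirely is another option.

MAX_LEN = 40960  # context limit from the warnings in this log

def truncate_sample(input_ids, max_len=MAX_LEN):
    """Clip a token-id sequence so it cannot index past the context window."""
    if len(input_ids) > max_len:
        return input_ids[:max_len]
    return input_ids
```

Note that naive clipping can cut a multimodal sample mid-image-span, so a production loader would need to truncate at a safe boundary; this sketch only illustrates the length check itself.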
 18%|█▊ | 4008/22095 [6:57:47<28:13:51, 5.62s/it] {'loss': 0.3746, 'grad_norm': 0.6909008023747709, 'learning_rate': 9.41124636146141e-06, 'epoch': 0.18}
 18%|█▊ | 4009/22095 [6:57:50<24:51:18, 4.95s/it] {'loss': 0.4352, 'grad_norm': 0.7304086747129781, 'learning_rate': 9.41090126763509e-06, 'epoch': 0.18}
 18%|█▊ | 4010/22095 [6:57:53<22:34:34, 4.49s/it] {'loss': 0.4304, 'grad_norm': 0.6541757729998198, 'learning_rate': 9.410556079032049e-06, 'epoch': 0.18}
 18%|█▊ | 4011/22095 [6:57:57<20:53:13, 4.16s/it] {'loss': 0.4321, 'grad_norm': 0.6872384761920306, 'learning_rate': 9.410210795659702e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4012/22095 [6:58:03<24:24:16, 4.86s/it] {'loss': 0.4897, 'grad_norm': 0.3936706884541984, 'learning_rate': 9.409865417525473e-06, 'epoch': 0.18}
 18%|█▊ | 4013/22095 [6:58:08<23:58:59, 4.77s/it] {'loss': 0.3871, 'grad_norm': 0.6558205643441305, 'learning_rate': 9.409519944636778e-06, 'epoch': 0.18}
 18%|█▊ | 4014/22095 [6:58:12<22:30:09, 4.48s/it] {'loss': 0.3961, 'grad_norm': 0.6706625783570828, 'learning_rate': 9.409174377001043e-06, 'epoch': 0.18}
 18%|█▊ | 4015/22095 [6:58:15<20:30:38, 4.08s/it] {'loss': 0.4043, 'grad_norm': 0.6638586577515181, 'learning_rate': 9.40882871462569e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (75827 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114085 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89892 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4016/22095 [6:58:19<20:12:31, 4.02s/it] {'loss': 0.4002, 'grad_norm': 0.6821026362875763, 'learning_rate': 9.408482957518152e-06, 'epoch': 0.18}
 18%|█▊ | 4017/22095 [6:58:22<19:08:06, 3.81s/it] {'loss': 0.4185, 'grad_norm': 0.6699639269128657, 'learning_rate': 9.408137105685853e-06, 'epoch': 0.18}
 18%|█▊ | 4018/22095 [6:58:25<17:47:33, 3.54s/it] {'loss': 0.4125, 'grad_norm': 0.6801453596286959, 'learning_rate': 9.407791159136226e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8391998 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 58824, 'image': 'vrdu_table_final_2/astro-ph.EP/d7829734-0dee-41db-a760-167a12e00c87.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
 18%|█▊ | 4019/22095 [6:58:28<16:50:02, 3.35s/it] {'loss': 0.4349, 'grad_norm': 0.7345921944079861, 'learning_rate': 9.407445117876705e-06, 'epoch': 0.18}
 18%|█▊ | 4020/22095 [6:58:31<16:09:38, 3.22s/it] {'loss': 0.4387, 'grad_norm': 0.6975348399576882, 'learning_rate': 9.407098981914726e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (43285 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4021/22095 [6:58:34<16:23:33, 3.27s/it] {'loss': 0.4169, 'grad_norm': 0.653769033304466, 'learning_rate': 9.406752751257724e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4022/22095 [6:58:37<16:23:19, 3.26s/it] {'loss': 0.3745, 'grad_norm': 0.6274532282943137, 'learning_rate': 9.40640642591314e-06, 'epoch': 0.18}
 18%|█▊ | 4023/22095 [6:58:41<17:09:55, 3.42s/it] {'loss': 0.4375, 'grad_norm': 0.7483261826815685, 'learning_rate': 9.406060005888414e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4024/22095 [6:58:48<22:39:31, 4.51s/it] {'loss': 0.5221, 'grad_norm': 0.6041073913009501, 'learning_rate': 9.405713491190992e-06, 'epoch': 0.18}
 18%|█▊ | 4025/22095 [6:58:52<21:01:27, 4.19s/it] {'loss': 0.4088, 'grad_norm': 0.669360438322126, 'learning_rate': 9.405366881828317e-06, 'epoch': 0.18}
 18%|█▊ | 4026/22095 [6:58:55<19:19:11, 3.85s/it] {'loss': 0.4042, 'grad_norm': 0.6639839913216719, 'learning_rate': 9.40502017780784e-06, 'epoch': 0.18}
 18%|█▊ | 4027/22095 [6:58:58<17:47:08, 3.54s/it] {'loss': 0.3955, 'grad_norm': 0.7186389028518378, 'learning_rate': 9.404673379137007e-06, 'epoch': 0.18}
 18%|█▊ | 4028/22095 [6:59:02<18:41:27, 3.72s/it] {'loss': 0.4038, 'grad_norm': 0.7107843378506711, 'learning_rate': 9.40432648582327e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4029/22095 [6:59:06<20:07:44, 4.01s/it] {'loss': 0.5146, 'grad_norm': 0.38811158141052376, 'learning_rate': 9.403979497874085e-06, 'epoch': 0.18}
 18%|█▊ | 4030/22095 [6:59:10<19:05:24, 3.80s/it] {'loss': 0.4025, 'grad_norm': 0.639213961994482, 'learning_rate': 9.403632415296907e-06, 'epoch': 0.18}
 18%|█▊ | 4031/22095 [6:59:13<17:40:15, 3.52s/it] {'loss': 0.4333, 'grad_norm': 0.7539547800235694, 'learning_rate': 9.403285238099192e-06, 'epoch': 0.18}
 18%|█▊ | 4032/22095 [6:59:16<17:53:23, 3.57s/it] {'loss': 0.3992, 'grad_norm': 0.6599444003297105, 'learning_rate': 9.402937966288402e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4033/22095 [6:59:20<18:02:13, 3.60s/it] {'loss': 0.3805, 'grad_norm': 0.7890073548455298, 'learning_rate': 9.402590599871994e-06, 'epoch': 0.18}
 18%|█▊ | 4034/22095 [6:59:24<18:15:12, 3.64s/it] {'loss': 0.4398, 'grad_norm': 0.6702074136415639, 'learning_rate': 9.402243138857439e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (65789 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63505 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47327 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78714 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4035/22095 [6:59:27<17:14:47, 3.44s/it] {'loss': 0.4137, 'grad_norm': 0.7131108641193631, 'learning_rate': 9.401895583252198e-06, 'epoch': 0.18}
 18%|█▊ | 4036/22095 [6:59:30<16:49:44, 3.35s/it] {'loss': 0.432, 'grad_norm': 0.6868999795018752, 'learning_rate': 9.40154793306374e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396112 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 62954, 'image': 'vrdu_table_final_2/astro-ph.EP/76c8db22-59ba-4af8-ab02-0ac403e2d61f.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4037/22095 [6:59:34<17:24:54, 3.47s/it] {'loss': 0.4423, 'grad_norm': 1.1847528882430005, 'learning_rate': 9.401200188299538e-06, 'epoch': 0.18}
 18%|█▊ | 4038/22095 [6:59:37<17:04:58, 3.41s/it] {'loss': 0.4007, 'grad_norm': 0.6741045662787118, 'learning_rate': 9.40085234896706e-06, 'epoch': 0.18}
 18%|█▊ | 4039/22095 [6:59:40<16:25:52, 3.28s/it] {'loss': 0.4238, 'grad_norm': 0.7622032017541412, 'learning_rate': 9.400504415073781e-06, 'epoch': 0.18}
 18%|█▊ | 4040/22095 [6:59:43<15:48:41, 3.15s/it] {'loss': 0.3761, 'grad_norm': 0.6220923109649558, 'learning_rate': 9.400156386627177e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (116592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57741 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56233 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73297 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4041/22095 [6:59:46<16:26:50, 3.28s/it] {'loss': 0.4273, 'grad_norm': 0.8052312403575104, 'learning_rate': 9.399808263634725e-06, 'epoch': 0.18}
 18%|█▊ | 4042/22095 [6:59:49<15:51:45, 3.16s/it] {'loss': 0.4268, 'grad_norm': 0.684150835282518, 'learning_rate': 9.399460046103908e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4043/22095 [6:59:59<25:19:03, 5.05s/it] {'loss': 0.488, 'grad_norm': 0.4843734116041665, 'learning_rate': 9.399111734042206e-06, 'epoch': 0.18}
 18%|█▊ | 4044/22095 [7:00:02<23:11:05, 4.62s/it] {'loss': 0.3692, 'grad_norm': 0.7435265409800369, 'learning_rate': 9.398763327457104e-06, 'epoch': 0.18}
 18%|█▊ | 4045/22095 [7:00:05<21:06:17, 4.21s/it] {'loss': 0.3993, 'grad_norm': 0.6868701918588811, 'learning_rate': 9.398414826356088e-06, 'epoch': 0.18}
 18%|█▊ | 4046/22095 [7:00:10<21:00:53, 4.19s/it] {'loss': 0.3478, 'grad_norm': 0.9011937083746742, 'learning_rate': 9.398066230746645e-06, 'epoch': 0.18}
 18%|█▊ | 4047/22095 [7:00:12<18:57:14, 3.78s/it] {'loss': 0.3849, 'grad_norm': 0.714107170767041, 'learning_rate': 9.397717540636268e-06, 'epoch': 0.18}
 18%|█▊ | 4048/22095 [7:00:15<17:51:07, 3.56s/it] {'loss': 0.3682, 'grad_norm': 0.6470444070084024, 'learning_rate': 9.397368756032445e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41050 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49595 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58318 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79690 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4049/22095 [7:00:25<26:40:55, 5.32s/it] {'loss': 0.5435, 'grad_norm': 0.4175690544272301, 'learning_rate': 9.397019876942675e-06, 'epoch': 0.18}
 18%|█▊ | 4050/22095 [7:00:28<23:55:32, 4.77s/it] {'loss': 0.4174, 'grad_norm': 0.7044682525903221, 'learning_rate': 9.396670903374452e-06, 'epoch': 0.18}
 18%|█▊ | 4051/22095 [7:00:31<21:20:37, 4.26s/it] {'loss': 0.4232, 'grad_norm': 0.7265792303979158, 'learning_rate': 9.396321835335274e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (128855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45874 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4052/22095 [7:00:34<19:35:23, 3.91s/it] {'loss': 0.4212, 'grad_norm': 0.6683492011615667, 'learning_rate': 9.395972672832642e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4053/22095 [7:00:38<19:20:43, 3.86s/it] {'loss': 0.4166, 'grad_norm': 0.8017287999678097, 'learning_rate': 9.39562341587406e-06, 'epoch': 0.18}
 18%|█▊ | 4054/22095 [7:00:41<18:17:52, 3.65s/it] {'loss': 0.4377, 'grad_norm': 0.7425769217227569, 'learning_rate': 9.39527406446703e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8301606 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1EjJ_a4k98KJjSZFoXXXS6pXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is the all word in the image?'}, {'from': 'gpt', 'value': 'All words in the image:\n奥立德\n好评\n如潮\n100%实物拍摄,盗图必究\n厂家直销\n随意定做\n正品保证'}]}
 18%|█▊ | 4055/22095 [7:00:47<20:41:23, 4.13s/it] {'loss': 0.4994, 'grad_norm': 0.32643590782763204, 'learning_rate': 9.394924618619059e-06, 'epoch': 0.18}
 18%|█▊ | 4056/22095 [7:00:50<19:38:59, 3.92s/it] {'loss': 0.4105, 'grad_norm': 0.6471436602045438, 'learning_rate': 9.394575078337657e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4057/22095 [7:01:00<28:02:24, 5.60s/it] {'loss': 0.4927, 'grad_norm': 0.4833246761409749, 'learning_rate': 9.394225443630332e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (51912 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93960 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4058/22095 [7:01:03<24:34:08, 4.90s/it] {'loss': 0.3934, 'grad_norm': 0.6848428054800386, 'learning_rate': 9.393875714504598e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (120164 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86974 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109685 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56584 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43491 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4059/22095 [7:01:06<21:19:11, 4.26s/it] {'loss': 0.4049, 'grad_norm': 0.655694304334304, 'learning_rate': 9.393525890967971e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4060/22095 [7:01:15<29:07:24, 5.81s/it] {'loss': 0.4795, 'grad_norm': 0.3594581111319001, 'learning_rate': 9.393175973027967e-06, 'epoch': 0.18}
 18%|█▊ | 4061/22095 [7:01:19<26:10:48, 5.23s/it] {'loss': 0.3944, 'grad_norm': 0.7635421681342424, 'learning_rate': 9.392825960692103e-06, 'epoch': 0.18}
 18%|█▊ | 4062/22095 [7:01:23<24:13:50, 4.84s/it] {'loss': 0.4417, 'grad_norm': 0.6727442618433395, 'learning_rate': 9.3924758539679e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [284, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8426393 in VC:s3://internvl-moe-sft-data/. Exception: Image size [284, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20750, 'image': 'vrdu_texteq/astro-ph.CO/b3f892ef-e46e-4704-8e9c-c6f00e8dd980.png', 'image_wh': [[284, 23]], 'conversations': [{'from': 'human', 'value': '\nCan optical character recognition (OCR) be used to extract the text from the image?'}, {'from': 'gpt', 'value': '$\\bullet$ {\\bf Parametrization II}'}]}
 18%|█▊ | 4063/22095 [7:01:26<21:56:43, 4.38s/it] {'loss': 0.3782, 'grad_norm': 0.6499968211176823, 'learning_rate': 9.392125652862881e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918291 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41444, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, point C lies on segment AB and D is the midpoint of AC. If CD=4 and AB=14, the length of BC is ()\nA. 6\nB. 6.5\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 18%|█▊ | 4064/22095 [7:01:29<20:02:55, 4.00s/it] {'loss': 0.4408, 'grad_norm': 0.6338590652010262, 'learning_rate': 9.391775357384571e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882177 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5330, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, M is the midpoint of segment AB and N is the midpoint of segment AM, with AN:NM=1:2. If AN=2cm, then AB = ()\nA. 10cm\nB. 12cm\nC. 6cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4065/22095 [7:01:32<18:29:37, 3.69s/it] {'loss': 0.3867, 'grad_norm': 0.6456796294022185, 'learning_rate': 9.3914249675405e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4066/22095 [7:01:40<24:44:28, 4.94s/it] {'loss': 0.5238, 'grad_norm': 0.4560740638541108, 'learning_rate': 9.39107448333819e-06, 'epoch': 0.18}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 18%|█▊ | 4067/22095 [7:01:50<32:54:39, 6.57s/it] {'loss': 0.5169, 'grad_norm': 0.3738339016612427, 'learning_rate': 9.390723904785178e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 18%|█▊ | 4068/22095 [7:01:54<27:43:16, 5.54s/it] {'loss': 0.4241, 'grad_norm': 0.8041291215714814, 'learning_rate': 9.390373231888991e-06, 'epoch': 0.18}
 18%|█▊ | 4069/22095 [7:01:58<26:15:54, 5.25s/it] {'loss': 0.4115, 'grad_norm': 0.6659286018838321, 'learning_rate': 9.39002246465717e-06, 'epoch': 0.18}
 18%|█▊ | 4070/22095 [7:02:02<24:00:14, 4.79s/it] {'loss': 0.4536, 'grad_norm': 0.6382235578815533, 'learning_rate': 9.389671603097248e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (137916 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57041 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68224 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56484 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4071/22095 [7:02:05<22:02:20, 4.40s/it] {'loss': 0.3774, 'grad_norm': 0.7681460436470686, 'learning_rate': 9.389320647216767e-06, 'epoch': 0.18}
 18%|█▊ | 4072/22095 [7:02:09<20:12:54, 4.04s/it] {'loss': 0.399, 'grad_norm': 0.7688218133575312, 'learning_rate': 9.388969597023265e-06, 'epoch': 0.18}
 18%|█▊ | 4073/22095 [7:02:11<18:20:09, 3.66s/it] {'loss': 0.3793, 'grad_norm': 0.6452861606689477, 'learning_rate': 9.388618452524285e-06, 'epoch': 0.18}
 18%|█▊ | 4074/22095 [7:02:14<17:11:49, 3.44s/it] {'loss': 0.4177, 'grad_norm': 0.703849544538944, 'learning_rate': 9.388267213727373e-06, 'epoch': 0.18}
 18%|█▊ | 4075/22095 [7:02:17<16:25:05, 3.28s/it] {'loss': 0.4469, 'grad_norm': 0.6851811677788411, 'learning_rate': 9.387915880640077e-06, 'epoch': 0.18}
 18%|█▊ | 4076/22095 [7:02:20<15:42:08, 3.14s/it] {'loss': 0.4019, 'grad_norm': 0.6926954407735016, 'learning_rate': 9.387564453269945e-06, 'epoch': 0.18}
 18%|█▊ | 4077/22095 [7:02:23<15:26:18, 3.08s/it] {'loss': 0.4304, 'grad_norm': 0.6532267417161022, 'learning_rate': 9.38721293162453e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 18%|█▊ | 4078/22095 [7:02:31<22:23:17, 4.47s/it] {'loss': 0.5237, 'grad_norm': 0.7015512523650694, 'learning_rate': 9.386861315711382e-06, 'epoch': 0.18}
 18%|█▊ | 4079/22095 [7:02:37<25:30:52, 5.10s/it] {'loss': 0.5247, 'grad_norm': 0.5024324484532633, 'learning_rate': 9.386509605538057e-06, 'epoch': 0.18}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 18%|█▊ | 4080/22095 [7:02:41<22:55:43, 4.58s/it] {'loss': 0.4297, 'grad_norm': 0.8009254479377098, 'learning_rate': 9.386157801112112e-06, 'epoch': 0.18}
 18%|█▊ | 4081/22095 [7:02:45<22:45:20, 4.55s/it] {'loss': 0.4324, 'grad_norm': 0.7143862676068782, 'learning_rate': 9.385805902441109e-06, 'epoch': 0.18}
 18%|█▊ | 4082/22095 [7:02:49<21:06:21, 4.22s/it] {'loss': 0.3778, 'grad_norm': 0.7113426206348091, 'learning_rate': 9.385453909532606e-06, 'epoch': 0.18}
 18%|█▊ | 4083/22095 [7:02:51<19:13:03, 3.84s/it] {'loss': 0.3976, 'grad_norm': 0.7426225523157818, 'learning_rate': 9.385101822394167e-06, 'epoch': 0.18}
 18%|█▊ | 4084/22095 [7:02:55<19:27:29, 3.89s/it] {'loss': 0.3931, 'grad_norm': 0.7347993942730485, 'learning_rate': 9.384749641033358e-06, 'epoch': 0.18}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333748 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 357, 'image': 'vrdu_table_final_2/astro-ph.CO/9d033032-7835-4ddf-af82-c9ab4df1c359.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
 18%|█▊ | 4085/22095 [7:03:00<20:00:00, 4.00s/it] {'loss': 0.4235, 'grad_norm': 0.7249770199361875, 'learning_rate': 9.384397365457747e-06, 'epoch': 0.18}
Token indices sequence length is longer than the specified maximum sequence length for this model (128942 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62578 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79582 > 40960). Running this sequence through the model will result in indexing errors
 18%|█▊ | 4086/22095 [7:03:03<18:45:03, 3.75s/it] {'loss': 0.3643, 'grad_norm': 0.7628881619671343, 'learning_rate': 9.3840449956749e-06, 'epoch': 0.18}
 18%|█▊ | 4087/22095 [7:03:06<17:39:58, 3.53s/it] {'loss': 0.4394, 'grad_norm': 0.6992424817628683, 'learning_rate': 9.383692531692392e-06, 'epoch': 0.18}
 19%|█▊ | 4088/22095 [7:03:09<16:53:08, 3.38s/it] {'loss': 0.4066, 'grad_norm': 0.6819527112357563, 'learning_rate': 9.383339973517796e-06, 'epoch': 0.19}
 19%|█▊ | 4089/22095 [7:03:12<16:31:14, 3.30s/it] {'loss': 0.42, 'grad_norm': 0.7248904254009867, 'learning_rate': 9.382987321158686e-06, 'epoch': 0.19}
 19%|█▊ | 4090/22095 [7:03:15<15:57:49, 3.19s/it] {'loss': 0.4014, 'grad_norm': 1.0207624752501492, 'learning_rate': 9.382634574622637e-06, 'epoch': 0.19}
 19%|█▊ | 4091/22095 [7:03:19<17:01:01, 3.40s/it] {'loss': 0.4049, 'grad_norm': 0.6421552433584352, 'learning_rate': 9.382281733917235e-06, 'epoch': 0.19}
 19%|█▊ | 4092/22095 [7:03:22<16:36:52, 3.32s/it] {'loss': 0.3951, 'grad_norm': 0.6737615415897059, 'learning_rate': 9.381928799050054e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (50045 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43485 > 40960). Running this sequence through the model will result in indexing errors
 19%|█▊ | 4093/22095 [7:03:25<16:17:51, 3.26s/it] {'loss': 0.3851, 'grad_norm': 0.6744702758862204, 'learning_rate': 9.381575770028684e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 19%|█▊ | 4094/22095 [7:03:34<25:22:12, 5.07s/it] {'loss': 0.5285, 'grad_norm': 1.5943439842998557, 'learning_rate': 9.381222646860708e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 19%|█▊ | 4095/22095 [7:03:38<23:18:13, 4.66s/it] {'loss': 0.4299, 'grad_norm': 0.7502753234602005, 'learning_rate': 9.380869429553712e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (42255 > 40960). Running this sequence through the model will result in indexing errors
 19%|█▊ | 4096/22095 [7:03:42<21:47:00, 4.36s/it] {'loss': 0.399, 'grad_norm': 0.6712999824888282, 'learning_rate': 9.380516118115287e-06, 'epoch': 0.19}
 19%|█▊ | 4097/22095 [7:03:45<20:14:20, 4.05s/it] {'loss': 0.3814, 'grad_norm': 0.6842794981481997, 'learning_rate': 9.380162712553024e-06, 'epoch': 0.19}
 19%|█▊ | 4098/22095 [7:03:48<18:29:24, 3.70s/it] {'loss': 0.3983, 'grad_norm': 0.7185421908678674, 'learning_rate': 9.379809212874517e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (73494 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46361 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59790 > 40960). Running this sequence through the model will result in indexing errors
 19%|█▊ | 4099/22095 [7:03:52<18:20:56, 3.67s/it] {'loss': 0.3495, 'grad_norm': 0.6614546240803848, 'learning_rate': 9.379455619087361e-06, 'epoch': 0.19}
 19%|█▊ | 4100/22095 [7:03:54<17:05:02, 3.42s/it] {'loss': 0.4067, 'grad_norm': 0.6541861454625707, 'learning_rate': 9.379101931199154e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 19%|█▊ | 4101/22095 [7:04:04<25:58:28, 5.20s/it] {'loss': 0.5088, 'grad_norm': 1.0781073128655911, 'learning_rate': 9.378748149217498e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (58950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60468 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71111 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42515 >
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55923 > 40960). Running this sequence through the model will result in indexing errors 19%|█▊ | 4102/22095 [7:04:07<22:57:49, 4.59s/it] {'loss': 0.3794, 'grad_norm': 0.690712285204404, 'learning_rate': 9.378394273149992e-06, 'epoch': 0.19} 19%|█▊ | 4102/22095 [7:04:07<22:57:49, 4.59s/it] 19%|█▊ | 4103/22095 [7:04:10<20:47:13, 4.16s/it] {'loss': 0.3863, 'grad_norm': 0.6801659428571151, 'learning_rate': 9.37804030300424e-06, 'epoch': 0.19} 19%|█▊ | 4103/22095 [7:04:10<20:47:13, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▊ | 4104/22095 [7:04:20<28:45:29, 5.75s/it] {'loss': 0.5028, 'grad_norm': 0.6462483308953831, 'learning_rate': 9.377686238787848e-06, 'epoch': 0.19} 19%|█▊ | 4104/22095 [7:04:20<28:45:29, 5.75s/it] 19%|█▊ | 4105/22095 [7:04:23<24:50:44, 4.97s/it] {'loss': 0.3842, 'grad_norm': 0.6980783750964051, 'learning_rate': 9.377332080508423e-06, 'epoch': 0.19} 19%|█▊ | 4105/22095 [7:04:23<24:50:44, 4.97s/it] 19%|█▊ | 4106/22095 [7:04:26<22:18:32, 4.46s/it] {'loss': 0.4278, 'grad_norm': 0.6791753952826917, 'learning_rate': 9.376977828173576e-06, 'epoch': 0.19} 19%|█▊ | 4106/22095 [7:04:26<22:18:32, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▊ | 4107/22095 [7:04:36<31:08:59, 6.23s/it] {'loss': 0.4883, 'grad_norm': 0.5805388440208535, 'learning_rate': 9.376623481790918e-06, 'epoch': 0.19} 19%|█▊ | 4107/22095 [7:04:36<31:08:59, 6.23s/it] 19%|█▊ | 4108/22095 [7:04:40<27:54:25, 5.59s/it] {'loss': 0.3824, 'grad_norm': 0.6774348967610799, 'learning_rate': 9.376269041368063e-06, 'epoch': 0.19} 19%|█▊ | 4108/22095 [7:04:40<27:54:25, 5.59s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8591280 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 12574, 'image': '782118577.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a sci-fi book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a youngster related book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 19%|█▊ | 4109/22095 [7:04:44<24:56:16, 4.99s/it] {'loss': 0.4184, 'grad_norm': 0.7046508039451562, 'learning_rate': 9.375914506912628e-06, 'epoch': 0.19} 19%|█▊ | 4109/22095 [7:04:44<24:56:16, 4.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▊ | 4110/22095 [7:04:54<32:55:40, 6.59s/it] {'loss': 0.5237, 'grad_norm': 0.6349773226277379, 'learning_rate': 9.37555987843223e-06, 'epoch': 0.19} 19%|█▊ | 4110/22095 [7:04:54<32:55:40, 6.59s/it] 19%|█▊ | 4111/22095 [7:04:59<29:17:17, 5.86s/it] {'loss': 0.3783, 'grad_norm': 0.7162257440503814, 'learning_rate': 9.375205155934488e-06, 'epoch': 0.19} 19%|█▊ | 4111/22095 [7:04:59<29:17:17, 5.86s/it] 19%|█▊ | 4112/22095 [7:05:02<26:06:26, 5.23s/it] {'loss': 0.4266, 'grad_norm': 0.7165412489731983, 'learning_rate': 9.374850339427024e-06, 'epoch': 0.19} 19%|█▊ | 4112/22095 [7:05:02<26:06:26, 5.23s/it] 19%|█▊ | 4113/22095 [7:05:06<24:01:16, 4.81s/it] {'loss': 0.3875, 'grad_norm': 0.6954870860047865, 'learning_rate': 9.374495428917463e-06, 'epoch': 0.19} 19%|█▊ | 4113/22095 [7:05:06<24:01:16, 4.81s/it] 19%|█▊ | 4114/22095 [7:05:10<22:56:12, 4.59s/it] {'loss': 0.3425, 'grad_norm': 0.6981956064505662, 'learning_rate': 9.37414042441343e-06, 'epoch': 0.19} 19%|█▊ | 4114/22095 [7:05:10<22:56:12, 4.59s/it] 19%|█▊ | 4115/22095 [7:05:14<21:32:39, 4.31s/it] {'loss': 0.4829, 'grad_norm': 0.7453770808853691, 'learning_rate': 9.373785325922556e-06, 'epoch': 0.19} 19%|█▊ | 4115/22095 [7:05:14<21:32:39, 4.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▊ | 4116/22095 [7:05:18<21:35:02, 4.32s/it] {'loss': 0.4009, 'grad_norm': 0.6556482744053577, 'learning_rate': 9.373430133452466e-06, 'epoch': 0.19} 19%|█▊ | 4116/22095 [7:05:18<21:35:02, 4.32s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8390429 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57248, 'image': 'vrdu_table_final_2/astro-ph.EP/5876f095-8ba8-4084-84a4-69376ab893db.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]} 19%|█▊ | 4117/22095 [7:05:21<19:57:43, 4.00s/it] {'loss': 0.4074, 'grad_norm': 0.6973709463168327, 'learning_rate': 9.373074847010795e-06, 'epoch': 0.19} 19%|█▊ | 4117/22095 [7:05:22<19:57:43, 4.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86479 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45074 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58233 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42145 > 40960). Running this sequence through the model will result in indexing errors 19%|█▊ | 4118/22095 [7:05:26<20:38:06, 4.13s/it] {'loss': 0.459, 'grad_norm': 0.7096935290207299, 'learning_rate': 9.372719466605176e-06, 'epoch': 0.19} 19%|█▊ | 4118/22095 [7:05:26<20:38:06, 4.13s/it] 19%|█▊ | 4119/22095 [7:05:29<18:58:41, 3.80s/it] {'loss': 0.3997, 'grad_norm': 0.626924754136172, 'learning_rate': 9.372363992243245e-06, 'epoch': 0.19} 19%|█▊ | 4119/22095 [7:05:29<18:58:41, 3.80s/it] 19%|█▊ | 4120/22095 [7:05:33<18:43:14, 3.75s/it] {'loss': 0.4543, 'grad_norm': 0.691742254232117, 'learning_rate': 9.37200842393264e-06, 'epoch': 0.19} 19%|█▊ | 4120/22095 [7:05:33<18:43:14, 3.75s/it] 19%|█▊ | 4121/22095 [7:05:36<18:44:34, 3.75s/it] {'loss': 0.4108, 'grad_norm': 0.7763991797792152, 'learning_rate': 9.371652761681006e-06, 'epoch': 0.19} 19%|█▊ | 4121/22095 [7:05:36<18:44:34, 3.75s/it] 19%|█▊ | 4122/22095 [7:05:41<19:23:59, 3.89s/it] {'loss': 0.3566, 'grad_norm': 0.6590271573620654, 'learning_rate': 9.371297005495976e-06, 'epoch': 0.19} 19%|█▊ | 4122/22095 [7:05:41<19:23:59, 3.89s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [406, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8479265 in VC:s3://internvl-moe-sft-data/. Exception: Image size [406, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 9648, 'image': 'vrdu_texteq/astro-ph.CO/acf5b07f-8043-4ab8-9abb-cadcde80e474.png', 'image_wh': [[406, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'where $\\tilde{n}$ is a noise realization and'}]} 19%|█▊ | 4123/22095 [7:05:45<19:41:56, 3.95s/it] {'loss': 0.3836, 'grad_norm': 0.7172524152091183, 'learning_rate': 9.3709411553852e-06, 'epoch': 0.19} 19%|█▊ | 4123/22095 [7:05:45<19:41:56, 3.95s/it] 19%|█▊ | 4124/22095 [7:05:48<18:31:16, 3.71s/it] {'loss': 0.3759, 'grad_norm': 0.6864398739450693, 'learning_rate': 9.370585211356323e-06, 'epoch': 0.19} 19%|█▊ | 4124/22095 [7:05:48<18:31:16, 3.71s/it] 19%|█▊ | 4125/22095 [7:05:52<18:51:09, 3.78s/it] {'loss': 0.378, 'grad_norm': 0.645730820024419, 'learning_rate': 9.370229173416994e-06, 'epoch': 0.19} 19%|█▊ | 4125/22095 [7:05:52<18:51:09, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▊ | 4126/22095 [7:06:01<27:20:21, 5.48s/it] {'loss': 0.5183, 'grad_norm': 0.6075348044191042, 'learning_rate': 9.36987304157486e-06, 'epoch': 0.19} 19%|█▊ | 4126/22095 [7:06:01<27:20:21, 5.48s/it] 19%|█▊ | 4127/22095 [7:06:04<24:04:34, 4.82s/it] {'loss': 0.3947, 'grad_norm': 0.78877271214592, 'learning_rate': 9.369516815837579e-06, 'epoch': 0.19} 19%|█▊ | 4127/22095 [7:06:04<24:04:34, 4.82s/it] 19%|█▊ | 4128/22095 [7:06:07<20:55:55, 4.19s/it] {'loss': 0.4193, 'grad_norm': 0.7040157005782529, 'learning_rate': 9.369160496212797e-06, 'epoch': 0.19} 19%|█▊ | 4128/22095 [7:06:07<20:55:55, 4.19s/it] 19%|█▊ | 4129/22095 [7:06:10<18:57:42, 3.80s/it] {'loss': 0.4055, 'grad_norm': 0.7747630187627644, 'learning_rate': 9.368804082708178e-06, 'epoch': 0.19} 19%|█▊ | 4129/22095 [7:06:10<18:57:42, 3.80s/it] 19%|█▊ | 4130/22095 [7:06:13<18:00:04, 3.61s/it] {'loss': 0.4087, 'grad_norm': 0.6983420726951024, 'learning_rate': 9.368447575331376e-06, 'epoch': 0.19} 19%|█▊ | 4130/22095 [7:06:13<18:00:04, 3.61s/it]Traceback (most recent call 
last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [27, 54, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8800124 in VC:s3://internvl2/datasets/chrome_writting/images/. Exception: Image size [27, 54, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3855, 'image': '0003855.jpg', 'image_wh': [[27, 54]], 'conversations': [{'from': 'human', 'value': '\nPlease write out the expression of the formula in the image using LaTeX format.\nPlease give the latex expression directly without any extra output.'}, {'from': 'gpt', 'value': '$1$'}]} 19%|█▊ | 4131/22095 [7:06:16<16:49:01, 3.37s/it] {'loss': 0.3831, 'grad_norm': 0.6971366034220329, 'learning_rate': 9.368090974090053e-06, 'epoch': 0.19} 19%|█▊ | 4131/22095 [7:06:16<16:49:01, 3.37s/it] 19%|█▊ | 4132/22095 [7:06:20<17:19:28, 3.47s/it] {'loss': 0.4186, 'grad_norm': 0.6732722620717891, 'learning_rate': 9.36773427899187e-06, 'epoch': 0.19} 19%|█▊ | 4132/22095 [7:06:20<17:19:28, 3.47s/it] 19%|█▊ | 4133/22095 [7:06:23<16:48:32, 3.37s/it] {'loss': 0.4112, 'grad_norm': 0.7045716634227874, 'learning_rate': 9.367377490044491e-06, 'epoch': 0.19} 19%|█▊ | 4133/22095 [7:06:23<16:48:32, 3.37s/it] 19%|█▊ | 4134/22095 [7:06:26<16:09:40, 3.24s/it] {'loss': 0.4044, 'grad_norm': 0.7178724111065302, 'learning_rate': 9.367020607255584e-06, 'epoch': 0.19} 19%|█▊ | 4134/22095 [7:06:26<16:09:40, 3.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43094 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47612 > 40960). Running this sequence through the model will result in indexing errors 19%|█▊ | 4135/22095 [7:06:29<15:51:02, 3.18s/it] {'loss': 0.4038, 'grad_norm': 0.7229009173713574, 'learning_rate': 9.366663630632817e-06, 'epoch': 0.19} 19%|█▊ | 4135/22095 [7:06:29<15:51:02, 3.18s/it] 19%|█▊ | 4136/22095 [7:06:32<15:52:21, 3.18s/it] {'loss': 0.3759, 'grad_norm': 0.6270133182764555, 'learning_rate': 9.36630656018386e-06, 'epoch': 0.19} 19%|█▊ | 4136/22095 [7:06:32<15:52:21, 3.18s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▊ | 4137/22095 [7:06:35<16:06:38, 3.23s/it] {'loss': 0.3723, 'grad_norm': 0.7126974681377124, 'learning_rate': 9.365949395916383e-06, 'epoch': 0.19} 19%|█▊ | 4137/22095 [7:06:35<16:06:38, 3.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▊ | 4138/22095 [7:06:39<16:57:09, 3.40s/it] {'loss': 0.4281, 'grad_norm': 0.7413695555778791, 'learning_rate': 9.365592137838063e-06, 'epoch': 0.19} 19%|█▊ | 4138/22095 [7:06:39<16:57:09, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43016 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57364 > 40960). Running this sequence through the model will result in indexing errors 19%|█▊ | 4139/22095 [7:06:42<16:13:57, 3.25s/it] {'loss': 0.4052, 'grad_norm': 1.0872267695040065, 'learning_rate': 9.365234785956575e-06, 'epoch': 0.19} 19%|█▊ | 4139/22095 [7:06:42<16:13:57, 3.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43717 > 40960). 
Running this sequence through the model will result in indexing errors 19%|█▊ | 4140/22095 [7:06:47<18:09:32, 3.64s/it] {'loss': 0.395, 'grad_norm': 0.6618657909759026, 'learning_rate': 9.3648773402796e-06, 'epoch': 0.19} 19%|█▊ | 4140/22095 [7:06:47<18:09:32, 3.64s/it] 19%|█▊ | 4141/22095 [7:06:50<17:04:52, 3.43s/it] {'loss': 0.3977, 'grad_norm': 0.756847238761708, 'learning_rate': 9.364519800814818e-06, 'epoch': 0.19} 19%|█▊ | 4141/22095 [7:06:50<17:04:52, 3.43s/it] 19%|█▊ | 4142/22095 [7:06:53<17:15:36, 3.46s/it] {'loss': 0.4612, 'grad_norm': 0.7104506052937238, 'learning_rate': 9.364162167569907e-06, 'epoch': 0.19} 19%|█▊ | 4142/22095 [7:06:53<17:15:36, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4143/22095 [7:07:02<25:32:43, 5.12s/it] {'loss': 0.5194, 'grad_norm': 0.622825098297183, 'learning_rate': 9.363804440552557e-06, 'epoch': 0.19} 19%|█▉ | 4143/22095 [7:07:02<25:32:43, 5.12s/it] 19%|█▉ | 4144/22095 [7:07:06<23:36:45, 4.74s/it] {'loss': 0.3511, 'grad_norm': 0.8854425327165626, 'learning_rate': 9.363446619770452e-06, 'epoch': 0.19} 19%|█▉ | 4144/22095 [7:07:06<23:36:45, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47005 > 40960). Running this sequence through the model will result in indexing errors 19%|█▉ | 4145/22095 [7:07:09<21:14:39, 4.26s/it] {'loss': 0.4249, 'grad_norm': 0.7319115033311788, 'learning_rate': 9.363088705231277e-06, 'epoch': 0.19} 19%|█▉ | 4145/22095 [7:07:09<21:14:39, 4.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [325, 25, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8506088 in VC:s3://internvl-moe-sft-data/. Exception: Image size [325, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 142984, 'image': 'vrdu_texteq/astro-ph.CO/95f35bca-34b6-45cc-9ca0-18825e581b17.png', 'image_wh': [[325, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'with $\\Delta{Y}=Y_n-Y_{n-1}$ and'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047657 in VC:s3://multi-modal/UniGeo/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 4cm\nB. 3cm\nC. 2cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4146/22095 [7:07:13<20:46:21, 4.17s/it] {'loss': 0.3504, 'grad_norm': 0.6450997911633352, 'learning_rate': 9.36273069694273e-06, 'epoch': 0.19} 19%|█▉ | 4146/22095 [7:07:13<20:46:21, 4.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4147/22095 [7:07:19<23:49:20, 4.78s/it] {'loss': 0.535, 'grad_norm': 0.46984863114291675, 'learning_rate': 9.362372594912498e-06, 'epoch': 0.19} 19%|█▉ | 4147/22095 [7:07:19<23:49:20, 4.78s/it] 19%|█▉ | 4148/22095 [7:07:27<28:15:17, 5.67s/it] {'loss': 0.5204, 'grad_norm': 0.4128089269608341, 'learning_rate': 9.362014399148275e-06, 'epoch': 0.19} 19%|█▉ | 4148/22095 [7:07:27<28:15:17, 5.67s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (54159 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59193 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47800 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68533 > 40960). 
Running this sequence through the model will result in indexing errors 19%|█▉ | 4149/22095 [7:07:31<25:09:13, 5.05s/it] {'loss': 0.397, 'grad_norm': 0.8256883317191542, 'learning_rate': 9.361656109657761e-06, 'epoch': 0.19} 19%|█▉ | 4149/22095 [7:07:31<25:09:13, 5.05s/it] 19%|█▉ | 4150/22095 [7:07:34<23:11:12, 4.65s/it] {'loss': 0.4696, 'grad_norm': 0.808810125346021, 'learning_rate': 9.361297726448656e-06, 'epoch': 0.19} 19%|█▉ | 4150/22095 [7:07:34<23:11:12, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4151/22095 [7:07:40<25:23:56, 5.10s/it] {'loss': 0.5191, 'grad_norm': 0.3406958381061442, 'learning_rate': 9.360939249528653e-06, 'epoch': 0.19} 19%|█▉ | 4151/22095 [7:07:40<25:23:56, 5.10s/it] 19%|█▉ | 4152/22095 [7:07:44<23:50:38, 4.78s/it] {'loss': 0.3934, 'grad_norm': 0.7482057961528514, 'learning_rate': 9.360580678905462e-06, 'epoch': 0.19} 19%|█▉ | 4152/22095 [7:07:45<23:50:38, 4.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4153/22095 [7:07:54<31:17:56, 6.28s/it] {'loss': 0.5091, 'grad_norm': 0.3900188665176955, 'learning_rate': 9.360222014586782e-06, 'epoch': 0.19} 19%|█▉ | 4153/22095 [7:07:54<31:17:56, 6.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4154/22095 [7:07:58<27:37:43, 5.54s/it] {'loss': 0.4003, 'grad_norm': 0.7373474726783783, 'learning_rate': 9.359863256580326e-06, 'epoch': 0.19} 19%|█▉ | 4154/22095 [7:07:58<27:37:43, 5.54s/it] 19%|█▉ | 4155/22095 [7:08:01<23:33:18, 4.73s/it] {'loss': 0.3899, 'grad_norm': 0.7019847965214174, 'learning_rate': 9.359504404893795e-06, 'epoch': 0.19} 19%|█▉ | 4155/22095 [7:08:01<23:33:18, 4.73s/it] 19%|█▉ | 4156/22095 [7:08:04<20:55:16, 4.20s/it] {'loss': 0.38, 'grad_norm': 0.6653724063594627, 'learning_rate': 9.359145459534906e-06, 'epoch': 0.19} 19%|█▉ | 4156/22095 [7:08:04<20:55:16, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but 
got module 364 19%|█▉ | 4157/22095 [7:08:13<28:20:44, 5.69s/it] {'loss': 0.5017, 'grad_norm': 0.3733988868528731, 'learning_rate': 9.35878642051137e-06, 'epoch': 0.19} 19%|█▉ | 4157/22095 [7:08:13<28:20:44, 5.69s/it] 19%|█▉ | 4158/22095 [7:08:16<24:46:21, 4.97s/it] {'loss': 0.4396, 'grad_norm': 0.8251591439561985, 'learning_rate': 9.358427287830898e-06, 'epoch': 0.19} 19%|█▉ | 4158/22095 [7:08:16<24:46:21, 4.97s/it] 19%|█▉ | 4159/22095 [7:08:19<22:02:59, 4.43s/it] {'loss': 0.4259, 'grad_norm': 0.7482630429833076, 'learning_rate': 9.358068061501211e-06, 'epoch': 0.19} 19%|█▉ | 4159/22095 [7:08:20<22:02:59, 4.43s/it] 19%|█▉ | 4160/22095 [7:08:23<20:34:50, 4.13s/it] {'loss': 0.3608, 'grad_norm': 0.6534549270921789, 'learning_rate': 9.357708741530025e-06, 'epoch': 0.19} 19%|█▉ | 4160/22095 [7:08:23<20:34:50, 4.13s/it] 19%|█▉ | 4161/22095 [7:08:27<20:03:39, 4.03s/it] {'loss': 0.3773, 'grad_norm': 0.6465604289725814, 'learning_rate': 9.357349327925063e-06, 'epoch': 0.19} 19%|█▉ | 4161/22095 [7:08:27<20:03:39, 4.03s/it] 19%|█▉ | 4162/22095 [7:08:31<19:48:04, 3.98s/it] {'loss': 0.3589, 'grad_norm': 0.7427467905232471, 'learning_rate': 9.356989820694046e-06, 'epoch': 0.19} 19%|█▉ | 4162/22095 [7:08:31<19:48:04, 3.98s/it] 19%|█▉ | 4163/22095 [7:08:34<18:14:05, 3.66s/it] {'loss': 0.4453, 'grad_norm': 0.7248522411623955, 'learning_rate': 9.3566302198447e-06, 'epoch': 0.19} 19%|█▉ | 4163/22095 [7:08:34<18:14:05, 3.66s/it] 19%|█▉ | 4164/22095 [7:08:37<18:29:26, 3.71s/it] {'loss': 0.3759, 'grad_norm': 0.6468436050912464, 'learning_rate': 9.356270525384749e-06, 'epoch': 0.19} 19%|█▉ | 4164/22095 [7:08:37<18:29:26, 3.71s/it] 19%|█▉ | 4165/22095 [7:08:40<17:02:56, 3.42s/it] {'loss': 0.3866, 'grad_norm': 0.6606287578369291, 'learning_rate': 9.355910737321927e-06, 'epoch': 0.19} 19%|█▉ | 4165/22095 [7:08:40<17:02:56, 3.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4166/22095 [7:08:43<16:14:56, 
3.26s/it] {'loss': 0.426, 'grad_norm': 0.874469386000399, 'learning_rate': 9.35555085566396e-06, 'epoch': 0.19} 19%|█▉ | 4166/22095 [7:08:43<16:14:56, 3.26s/it] 19%|█▉ | 4167/22095 [7:08:47<17:08:02, 3.44s/it] {'loss': 0.4193, 'grad_norm': 1.379091788268698, 'learning_rate': 9.35519088041858e-06, 'epoch': 0.19} 19%|█▉ | 4167/22095 [7:08:47<17:08:02, 3.44s/it] 19%|█▉ | 4168/22095 [7:08:50<17:00:13, 3.41s/it] {'loss': 0.3589, 'grad_norm': 0.6644904529186746, 'learning_rate': 9.354830811593527e-06, 'epoch': 0.19} 19%|█▉ | 4168/22095 [7:08:50<17:00:13, 3.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4169/22095 [7:08:54<18:14:06, 3.66s/it] {'loss': 0.3771, 'grad_norm': 0.6252804901163832, 'learning_rate': 9.354470649196532e-06, 'epoch': 0.19} 19%|█▉ | 4169/22095 [7:08:54<18:14:06, 3.66s/it] 19%|█▉ | 4170/22095 [7:08:59<19:08:52, 3.85s/it] {'loss': 0.3911, 'grad_norm': 0.6371878142364097, 'learning_rate': 9.354110393235339e-06, 'epoch': 0.19} 19%|█▉ | 4170/22095 [7:08:59<19:08:52, 3.85s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8918290 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 41443, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC的中点,如果Cd=4,AB=14,则BC长度为()\nA. 
5\nB. 6\nC. 6.5\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
19%|█▉ | 4171/22095 [7:09:03<19:23:24, 3.89s/it] {'loss': 0.435, 'grad_norm': 0.7310086362067233, 'learning_rate': 9.353750043717685e-06, 'epoch': 0.19}
19%|█▉ | 4172/22095 [7:09:07<20:23:24, 4.10s/it] {'loss': 0.3899, 'grad_norm': 0.6213844537992925, 'learning_rate': 9.353389600651313e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
19%|█▉ | 4173/22095 [7:09:11<20:02:04, 4.02s/it] {'loss': 0.4159, 'grad_norm': 0.6700371913451663, 'learning_rate': 9.35302906404397e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (41435 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4174/22095 [7:09:15<19:20:36, 3.89s/it] {'loss': 0.3734, 'grad_norm': 0.7052752239619765, 'learning_rate': 9.352668433903402e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
19%|█▉ | 4175/22095 [7:09:18<18:53:00, 3.79s/it] {'loss': 0.4252, 'grad_norm': 0.6566206805556328, 'learning_rate': 9.352307710237358e-06, 'epoch': 0.19}
19%|█▉ | 4176/22095 [7:09:21<17:24:32, 3.50s/it] {'loss': 0.4134, 'grad_norm': 0.7189694032411736, 'learning_rate': 9.351946893053587e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (134665 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4177/22095 [7:09:24<16:33:50, 3.33s/it] {'loss': 0.4327, 'grad_norm': 0.714903760039513, 'learning_rate': 9.351585982359845e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (62532 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4178/22095 [7:09:27<16:37:19, 3.34s/it] {'loss': 0.3823, 'grad_norm': 0.9170516480758911, 'learning_rate': 9.351224978163885e-06, 'epoch': 0.19}
19%|█▉ | 4179/22095 [7:09:31<16:57:06, 3.41s/it] {'loss': 0.4025, 'grad_norm': 0.6625988385123889, 'learning_rate': 9.350863880473462e-06, 'epoch': 0.19}
19%|█▉ | 4180/22095 [7:09:35<17:16:58, 3.47s/it] {'loss': 0.4382, 'grad_norm': 0.6808033178232126, 'learning_rate': 9.350502689296337e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8343904 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10555, 'image': 'vrdu_table_final_2/astro-ph.CO/e492e590-7572-4321-a76f-d1078da71805.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]}
19%|█▉ | 4181/22095 [7:09:38<16:54:20, 3.40s/it] {'loss': 0.3599, 'grad_norm': 0.6377075302123607, 'learning_rate': 9.350141404640273e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (110274 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113013 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92951 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4182/22095 [7:09:45<22:52:47, 4.60s/it] {'loss': 0.5079, 'grad_norm': 0.4455416067005901, 'learning_rate': 9.34978002651303e-06, 'epoch': 0.19}
19%|█▉ | 4183/22095 [7:09:49<21:03:59, 4.23s/it] {'loss': 0.3626, 'grad_norm': 0.9095683713471634, 'learning_rate': 9.349418554922371e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (42812 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4184/22095 [7:09:53<21:04:55, 4.24s/it] {'loss': 0.3844, 'grad_norm': 0.7259307961524236, 'learning_rate': 9.349056989876068e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4185/22095 [7:10:00<24:46:16, 4.98s/it] {'loss': 0.5089, 'grad_norm': 0.3346786654337558, 'learning_rate': 9.348695331381887e-06, 'epoch': 0.19}
19%|█▉ | 4186/22095 [7:10:08<30:07:13, 6.05s/it] {'loss': 0.4931, 'grad_norm': 0.3344837591888438, 'learning_rate': 9.3483335794476e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 364, but got module 1
19%|█▉ | 4187/22095 [7:10:12<26:24:41, 5.31s/it] {'loss': 0.3988, 'grad_norm': 0.8388219787853163, 'learning_rate': 9.347971734080978e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8355016 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 21706, 'image': 'vrdu_table_final_2/astro-ph.CO/67f32e08-6a15-422f-97b2-83fff540f38c.png', 'image_wh': [[14, 56]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #1\\\\#2\n \\end{tabular}\n```"}]}
19%|█▉ | 4188/22095 [7:10:15<23:16:40, 4.68s/it] {'loss': 0.413, 'grad_norm': 0.7570586941523583, 'learning_rate': 9.347609795289798e-06, 'epoch': 0.19}
19%|█▉ | 4189/22095 [7:10:18<20:32:53, 4.13s/it] {'loss': 0.3826, 'grad_norm': 0.8178611564571792, 'learning_rate': 9.347247763081834e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4190/22095 [7:10:27<28:33:48, 5.74s/it] {'loss': 0.5218, 'grad_norm': 0.628098793520388, 'learning_rate': 9.346885637464871e-06, 'epoch': 0.19}
19%|█▉ | 4191/22095 [7:10:31<25:01:42, 5.03s/it] {'loss': 0.3973, 'grad_norm': 0.7271969659510159, 'learning_rate': 9.346523418446682e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (119117 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94438 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4192/22095 [7:10:40<31:25:14, 6.32s/it] {'loss': 0.5036, 'grad_norm': 0.39111820880276893, 'learning_rate': 9.346161106035056e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (89280 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4193/22095 [7:10:43<26:59:46, 5.43s/it] {'loss': 0.4423, 'grad_norm': 1.1236366782758218, 'learning_rate': 9.345798700237778e-06, 'epoch': 0.19}
19%|█▉ | 4194/22095 [7:10:46<23:24:31, 4.71s/it] {'loss': 0.3715, 'grad_norm': 0.6378386630155638, 'learning_rate': 9.34543620106263e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8401748 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3912, 'image': 'vrdu_table_final_2/astro-ph.CO/3407a9cc-1e77-47ca-a061-a64fd7199e17.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
19%|█▉ | 4195/22095 [7:10:50<21:43:55, 4.37s/it] {'loss': 0.3934, 'grad_norm': 0.7933074202178645, 'learning_rate': 9.345073608517405e-06, 'epoch': 0.19}
19%|█▉ | 4196/22095 [7:10:53<19:28:12, 3.92s/it] {'loss': 0.3943, 'grad_norm': 0.6613598448529215, 'learning_rate': 9.344710922609893e-06, 'epoch': 0.19}
19%|█▉ | 4197/22095 [7:10:57<19:47:07, 3.98s/it] {'loss': 0.4128, 'grad_norm': 0.7896961749002904, 'learning_rate': 9.344348143347888e-06, 'epoch': 0.19}
19%|█▉ | 4198/22095 [7:11:00<18:50:32, 3.79s/it] {'loss': 0.3779, 'grad_norm': 0.6664564006056659, 'learning_rate': 9.343985270739184e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (46696 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78373 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92922 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4199/22095 [7:11:04<19:00:41, 3.82s/it] {'loss': 0.3793, 'grad_norm': 0.6422339067166803, 'learning_rate': 9.343622304791577e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4200/22095 [7:11:12<24:36:06, 4.95s/it] {'loss': 0.4923, 'grad_norm': 0.446876644874073, 'learning_rate': 9.343259245512866e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8903164 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26317, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. \\frac{9}{2}cm\nB. 5cm\nC. \\frac{11}{2}cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
19%|█▉ | 4201/22095 [7:11:21<31:15:01, 6.29s/it] {'loss': 0.5146, 'grad_norm': 0.38858522128843226, 'learning_rate': 9.342896092910857e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 364, but got module 1
19%|█▉ | 4202/22095 [7:11:25<27:29:13, 5.53s/it] {'loss': 0.4266, 'grad_norm': 0.7559868937443666, 'learning_rate': 9.342532846993345e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (77522 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91072 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4203/22095 [7:11:29<25:08:10, 5.06s/it] {'loss': 0.4009, 'grad_norm': 0.6361014808828712, 'learning_rate': 9.342169507768143e-06, 'epoch': 0.19}
19%|█▉ | 4204/22095 [7:11:32<21:59:00, 4.42s/it] {'loss': 0.3937, 'grad_norm': 0.6703834528070357, 'learning_rate': 9.341806075243049e-06, 'epoch': 0.19}
19%|█▉ | 4205/22095 [7:11:35<20:57:40, 4.22s/it] {'loss': 0.4345, 'grad_norm': 0.8034159210953544, 'learning_rate': 9.341442549425882e-06, 'epoch': 0.19}
19%|█▉ | 4206/22095 [7:11:39<19:26:43, 3.91s/it] {'loss': 0.4077, 'grad_norm': 0.70894004080529, 'learning_rate': 9.341078930324446e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4207/22095 [7:11:46<23:50:03, 4.80s/it] {'loss': 0.505, 'grad_norm': 0.6265618012794626, 'learning_rate': 9.340715217946557e-06, 'epoch': 0.19}
19%|█▉ | 4208/22095 [7:11:49<21:12:36, 4.27s/it] {'loss': 0.4117, 'grad_norm': 0.6867976221003556, 'learning_rate': 9.34035141230003e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880128 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3281, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由题意得,EC+FD=EF-CD=8-4=4,∵E是AC的中点,F是BD的中点,∴AE+FB=EC+FD=4,∴AB=AE+FB+EF=4+8=12.'}]}
19%|█▉ | 4209/22095 [7:11:51<19:02:42, 3.83s/it] {'loss': 0.3745, 'grad_norm': 0.6815725037318351, 'learning_rate': 9.339987513392681e-06, 'epoch': 0.19}
19%|█▉ | 4210/22095 [7:11:54<17:46:46, 3.58s/it] {'loss': 0.4132, 'grad_norm': 0.9252713915472213, 'learning_rate': 9.33962352123233e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (133147 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4211/22095 [7:12:04<27:08:10, 5.46s/it] {'loss': 0.4936, 'grad_norm': 0.35903187846953316, 'learning_rate': 9.339259435826798e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952525 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3360, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
19%|█▉ | 4212/22095 [7:12:07<23:38:36, 4.76s/it] {'loss': 0.3868, 'grad_norm': 0.7351904438219221, 'learning_rate': 9.338895257183907e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (44871 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100496 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49548 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41454 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4213/22095 [7:12:10<20:48:09, 4.19s/it] {'loss': 0.427, 'grad_norm': 0.666741204958213, 'learning_rate': 9.338530985311483e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
19%|█▉ | 4214/22095 [7:12:14<19:56:46, 4.02s/it] {'loss': 0.3899, 'grad_norm': 0.6864012420285249, 'learning_rate': 9.338166620217353e-06, 'epoch': 0.19}
19%|█▉ | 4215/22095 [7:12:17<18:34:18, 3.74s/it] {'loss': 0.4032, 'grad_norm': 0.6406318140484143, 'learning_rate': 9.337802161909344e-06, 'epoch': 0.19}
19%|█▉ | 4216/22095 [7:12:20<17:41:30, 3.56s/it] {'loss': 0.4443, 'grad_norm': 0.6779079194311202, 'learning_rate': 9.337437610395292e-06, 'epoch': 0.19}
19%|█▉ | 4217/22095 [7:12:23<16:54:11, 3.40s/it] {'loss': 0.4001, 'grad_norm': 0.8085914383393745, 'learning_rate': 9.337072965683026e-06, 'epoch': 0.19}
19%|█▉ | 4218/22095 [7:12:26<16:42:01, 3.36s/it] {'loss': 0.4139, 'grad_norm': 0.8418371319390886, 'learning_rate': 9.336708227780382e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4219/22095 [7:12:34<22:32:47, 4.54s/it] {'loss': 0.4937, 'grad_norm': 0.5312634645258777, 'learning_rate': 9.336343396695197e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (56993 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4220/22095 [7:12:38<21:38:48, 4.36s/it] {'loss': 0.3672, 'grad_norm': 0.8147529416185956, 'learning_rate': 9.335978472435311e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398206 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 357, 'image': 'vrdu_table_final_2/astro-ph.CO/9d033032-7835-4ddf-af82-c9ab4df1c359.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
19%|█▉ | 4221/22095 [7:12:48<30:49:35, 6.21s/it] {'loss': 0.499, 'grad_norm': 0.36155580688435995, 'learning_rate': 9.335613455008565e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (55459 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57215 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80657 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4222/22095 [7:12:52<27:49:43, 5.61s/it] {'loss': 0.4269, 'grad_norm': 0.8797910704411804, 'learning_rate': 9.335248344422803e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882164 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5317, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 6cm\nB. 7cm\nC. 8cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [342, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8462434 in VC:s3://internvl-moe-sft-data/. Exception: Image size [342, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 154462, 'image': 'vrdu_texteq/astro-ph.CO/dc0dedaf-df58-4ba1-a35e-c577f40214b5.png', 'image_wh': [[342, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'We then calculate the $V_T$ as'}]}
19%|█▉ | 4223/22095 [7:12:56<25:17:25, 5.09s/it] {'loss': 0.4253, 'grad_norm': 0.6906031892517221, 'learning_rate': 9.334883140685867e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (51281 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93606 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4224/22095 [7:13:00<22:48:01, 4.59s/it] {'loss': 0.4356, 'grad_norm': 0.7501075268027417, 'learning_rate': 9.334517843805606e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [523, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8481547 in VC:s3://internvl-moe-sft-data/. Exception: Image size [523, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 147069, 'image': 'vrdu_texteq/astro-ph.CO/60ca3f73-b44b-4b29-b70a-13b12d5a995c.png', 'image_wh': [[523, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'Where $H_{int}$ is the interaction Hamiltonian.'}]}
19%|█▉ | 4225/22095 [7:13:04<21:41:23, 4.37s/it] {'loss': 0.4198, 'grad_norm': 0.8377097360755146, 'learning_rate': 9.334152453789868e-06, 'epoch': 0.19}
19%|█▉ | 4226/22095 [7:13:07<20:56:26, 4.22s/it] {'loss': 0.4341, 'grad_norm': 0.6989885826530646, 'learning_rate': 9.333786970646507e-06, 'epoch': 0.19}
19%|█▉ | 4227/22095 [7:13:11<20:05:40, 4.05s/it] {'loss': 0.3878, 'grad_norm': 0.6613263318372491, 'learning_rate': 9.333421394383374e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [98, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8420253 in VC:s3://internvl-moe-sft-data/. Exception: Image size [98, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65210, 'image': 'vrdu_texteq/astro-ph.CO/e385e1b6-4272-42b4-8493-dddcb3a172d3.png', 'image_wh': [[98, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'and \\mbox{$\\Omega_\\Lambda$}.'}]}
19%|█▉ | 4228/22095 [7:13:20<26:43:34, 5.39s/it] {'loss': 0.5035, 'grad_norm': 0.5685047682663741, 'learning_rate': 9.333055725008323e-06, 'epoch': 0.19}
19%|█▉ | 4229/22095 [7:13:24<25:43:21, 5.18s/it] {'loss': 0.4241, 'grad_norm': 0.7589492207434562, 'learning_rate': 9.332689962529213e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4230/22095 [7:13:32<29:47:35, 6.00s/it] {'loss': 0.5119, 'grad_norm': 0.4250628070188152, 'learning_rate': 9.332324106953903e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (41681 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96357 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4231/22095 [7:13:36<26:22:57, 5.32s/it] {'loss': 0.4065, 'grad_norm': 0.7021573180456467, 'learning_rate': 9.331958158290253e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4232/22095 [7:13:42<27:45:44, 5.60s/it] {'loss': 0.4987, 'grad_norm': 0.382512596580531, 'learning_rate': 9.331592116546128e-06, 'epoch': 0.19}
19%|█▉ | 4233/22095 [7:13:46<25:02:00, 5.05s/it] {'loss': 0.4129, 'grad_norm': 0.7330051980181821, 'learning_rate': 9.33122598172939e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (111288 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4234/22095 [7:13:49<21:46:40, 4.39s/it] {'loss': 0.4418, 'grad_norm': 0.689069453015296, 'learning_rate': 9.33085975384791e-06, 'epoch': 0.19}
19%|█▉ | 4235/22095 [7:13:53<21:15:05, 4.28s/it] {'loss': 0.3966, 'grad_norm': 0.6607414917334931, 'learning_rate': 9.330493432909553e-06, 'epoch': 0.19}
19%|█▉ | 4236/22095 [7:13:56<19:28:35, 3.93s/it] {'loss': 0.3827, 'grad_norm': 0.6062373121085459, 'learning_rate': 9.330127018922195e-06, 'epoch': 0.19}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359158 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25878, 'image': 'vrdu_table_final_2/astro-ph.CO/f4933772-b40f-4733-a9e5-5798add85ea0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
19%|█▉ | 4237/22095 [7:13:59<18:57:41, 3.82s/it] {'loss': 0.4258, 'grad_norm': 0.6657482893388922, 'learning_rate': 9.329760511893703e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4238/22095 [7:14:08<25:48:20, 5.20s/it] {'loss': 0.5121, 'grad_norm': 0.76336256720969, 'learning_rate': 9.329393911831957e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
19%|█▉ | 4239/22095 [7:14:16<30:10:23, 6.08s/it] {'loss': 0.4937, 'grad_norm': 0.570098781814426, 'learning_rate': 9.329027218744833e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (46816 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52389 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4240/22095 [7:14:19<26:01:25, 5.25s/it] {'loss': 0.4024, 'grad_norm': 0.6741787772987584, 'learning_rate': 9.328660432640211e-06, 'epoch': 0.19}
19%|█▉ | 4241/22095 [7:14:23<23:43:39, 4.78s/it] {'loss': 0.4197, 'grad_norm': 0.6141085288648717, 'learning_rate': 9.32829355352597e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4242/22095 [7:14:29<25:32:34, 5.15s/it] {'loss': 0.5256, 'grad_norm': 0.535364704619855, 'learning_rate': 9.327926581409992e-06, 'epoch': 0.19}
19%|█▉ | 4243/22095 [7:14:33<23:46:13, 4.79s/it] {'loss': 0.4067, 'grad_norm': 0.6452703911522668, 'learning_rate': 9.327559516300164e-06, 'epoch': 0.19}
19%|█▉ | 4244/22095 [7:14:36<21:43:32, 4.38s/it] {'loss': 0.3749, 'grad_norm': 0.6459242711060674, 'learning_rate': 9.327192358204374e-06, 'epoch': 0.19}
19%|█▉ | 4245/22095 [7:14:40<20:23:29, 4.11s/it] {'loss': 0.397, 'grad_norm': 0.6934224689498821, 'learning_rate': 9.32682510713051e-06, 'epoch': 0.19}
19%|█▉ | 4246/22095 [7:14:43<19:01:17, 3.84s/it] {'loss': 0.3713, 'grad_norm': 0.7035646271268499, 'learning_rate': 9.326457763086463e-06, 'epoch': 0.19}
19%|█▉ | 4247/22095 [7:14:48<19:54:10, 4.01s/it] {'loss': 0.412, 'grad_norm': 0.6359014330987115, 'learning_rate': 9.326090326080129e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
19%|█▉ | 4248/22095 [7:14:58<29:09:44, 5.88s/it] {'loss': 0.5247, 'grad_norm': 0.8562684223334093, 'learning_rate': 9.325722796119396e-06, 'epoch': 0.19}
19%|█▉ | 4249/22095 [7:15:02<26:05:27, 5.26s/it] {'loss': 0.3677, 'grad_norm': 0.668983090986784, 'learning_rate': 9.325355173212169e-06, 'epoch': 0.19}
19%|█▉ | 4250/22095 [7:15:05<23:46:00, 4.79s/it] {'loss': 0.4335, 'grad_norm': 0.6735918825748062, 'learning_rate': 9.324987457366342e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (85520 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4251/22095 [7:15:09<22:28:05, 4.53s/it] {'loss': 0.3478, 'grad_norm': 0.6111348754854273, 'learning_rate': 9.324619648589818e-06, 'epoch': 0.19}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [373, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8416184 in VC:s3://internvl-moe-sft-data/. Exception: Image size [373, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 74941, 'image': 'vrdu_texteq/astro-ph.CO/df8c50ae-2948-46fb-975c-a36620c657f4.png', 'image_wh': [[373, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': '$\\square$ Focus Area 1: Lunar farside'}]}
19%|█▉ | 4252/22095 [7:15:12<20:20:39, 4.10s/it] {'loss': 0.3914, 'grad_norm': 0.6529173510325869, 'learning_rate': 9.324251746890501e-06, 'epoch': 0.19}
19%|█▉ | 4253/22095 [7:15:16<19:01:35, 3.84s/it] {'loss': 0.4417, 'grad_norm': 0.7176116943040441, 'learning_rate': 9.323883752276294e-06, 'epoch': 0.19}
19%|█▉ | 4254/22095 [7:15:19<18:05:54, 3.65s/it] {'loss': 0.3822, 'grad_norm': 0.8261099452942321, 'learning_rate': 9.323515664755105e-06, 'epoch': 0.19}
19%|█▉ | 4255/22095 [7:15:23<19:05:24, 3.85s/it] {'loss': 0.4208, 'grad_norm': 0.7602870493102301, 'learning_rate': 9.323147484334843e-06, 'epoch': 0.19}
19%|█▉ | 4256/22095 [7:15:26<17:49:50, 3.60s/it] {'loss': 0.431, 'grad_norm': 0.6966871082756921, 'learning_rate': 9.322779211023418e-06, 'epoch': 0.19}
Token indices sequence length is longer than the specified maximum sequence length for this model (81904 > 40960). Running this sequence through the model will result in indexing errors
19%|█▉ | 4257/22095 [7:15:29<16:41:37, 3.37s/it] {'loss': 0.4047, 'grad_norm': 0.6646502179431876, 'learning_rate': 9.322410844828747e-06, 'epoch': 0.19}
19%|█▉ | 4258/22095 [7:15:32<16:54:16, 3.41s/it] {'loss': 0.4006, 'grad_norm': 0.7620936660471155, 'learning_rate': 9.322042385758738e-06, 'epoch': 0.19}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047673 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 6\nB. 5\nC. 4\nD.
3'}, {'from': 'gpt', 'value': '【解答】解:∵AM:BN=3:1,而点M是线段AB的中点,且AB=12,∴AM=BM=6,BN=2而MN=BM-BN=6-2=4'}]} 19%|█▉ | 4259/22095 [7:15:41<25:11:51, 5.09s/it] {'loss': 0.4854, 'grad_norm': 0.5508551135263602, 'learning_rate': 9.321673833821316e-06, 'epoch': 0.19} 19%|█▉ | 4259/22095 [7:15:41<25:11:51, 5.09s/it] 19%|█▉ | 4260/22095 [7:15:45<23:01:03, 4.65s/it] {'loss': 0.4336, 'grad_norm': 0.7260689341128518, 'learning_rate': 9.321305189024395e-06, 'epoch': 0.19} 19%|█▉ | 4260/22095 [7:15:45<23:01:03, 4.65s/it] 19%|█▉ | 4261/22095 [7:15:49<21:52:34, 4.42s/it] {'loss': 0.3674, 'grad_norm': 0.6686325587042482, 'learning_rate': 9.320936451375896e-06, 'epoch': 0.19} 19%|█▉ | 4261/22095 [7:15:49<21:52:34, 4.42s/it] 19%|█▉ | 4262/22095 [7:15:53<21:16:33, 4.30s/it] {'loss': 0.3821, 'grad_norm': 0.6248650776883086, 'learning_rate': 9.320567620883746e-06, 'epoch': 0.19} 19%|█▉ | 4262/22095 [7:15:53<21:16:33, 4.30s/it] 19%|█▉ | 4263/22095 [7:15:56<19:29:42, 3.94s/it] {'loss': 0.407, 'grad_norm': 0.708121072026063, 'learning_rate': 9.320198697555866e-06, 'epoch': 0.19} 19%|█▉ | 4263/22095 [7:15:56<19:29:42, 3.94s/it] 19%|█▉ | 4264/22095 [7:15:59<18:42:47, 3.78s/it] {'loss': 0.3846, 'grad_norm': 0.6231187055806207, 'learning_rate': 9.319829681400185e-06, 'epoch': 0.19} 19%|█▉ | 4264/22095 [7:15:59<18:42:47, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4265/22095 [7:16:06<22:11:39, 4.48s/it] {'loss': 0.5126, 'grad_norm': 0.43480382071748097, 'learning_rate': 9.319460572424632e-06, 'epoch': 0.19} 19%|█▉ | 4265/22095 [7:16:06<22:11:39, 4.48s/it] 19%|█▉ | 4266/22095 [7:16:09<20:52:22, 4.21s/it] {'loss': 0.4143, 'grad_norm': 0.658036330731171, 'learning_rate': 9.319091370637136e-06, 'epoch': 0.19} 19%|█▉ | 4266/22095 [7:16:09<20:52:22, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53733 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47733 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41850 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77091 > 40960). Running this sequence through the model will result in indexing errors 19%|█▉ | 4267/22095 [7:16:13<20:09:02, 4.07s/it] {'loss': 0.3835, 'grad_norm': 0.6675050586561079, 'learning_rate': 9.318722076045632e-06, 'epoch': 0.19} 19%|█▉ | 4267/22095 [7:16:13<20:09:02, 4.07s/it] 19%|█▉ | 4268/22095 [7:16:16<19:06:58, 3.86s/it] {'loss': 0.389, 'grad_norm': 0.6878269185310447, 'learning_rate': 9.318352688658055e-06, 'epoch': 0.19} 19%|█▉ | 4268/22095 [7:16:16<19:06:58, 3.86s/it] 19%|█▉ | 4269/22095 [7:16:19<18:09:50, 3.67s/it] {'loss': 0.4138, 'grad_norm': 0.6417992974753269, 'learning_rate': 9.317983208482342e-06, 'epoch': 0.19} 19%|█▉ | 4269/22095 [7:16:19<18:09:50, 3.67s/it] 19%|█▉ | 4270/22095 [7:16:23<17:22:44, 3.51s/it] {'loss': 0.4442, 'grad_norm': 0.7400557604489254, 'learning_rate': 9.317613635526431e-06, 'epoch': 0.19} 19%|█▉ | 4270/22095 [7:16:23<17:22:44, 3.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047798 in VC:s3://multi-modal/UniGeo/. 
Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]} 19%|█▉ | 4271/22095 [7:16:26<17:30:02, 3.53s/it] {'loss': 0.3417, 'grad_norm': 0.6493969617728693, 'learning_rate': 9.317243969798263e-06, 'epoch': 0.19} 19%|█▉ | 4271/22095 [7:16:26<17:30:02, 3.53s/it] 19%|█▉ | 4272/22095 [7:16:29<16:48:50, 3.40s/it] {'loss': 0.4165, 'grad_norm': 0.7276072755450002, 'learning_rate': 9.31687421130578e-06, 'epoch': 0.19} 19%|█▉ | 4272/22095 [7:16:29<16:48:50, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4273/22095 [7:16:32<16:00:59, 3.24s/it] {'loss': 0.4105, 'grad_norm': 0.7372973315741538, 'learning_rate': 9.31650436005693e-06, 'epoch': 0.19} 19%|█▉ | 4273/22095 [7:16:32<16:00:59, 3.24s/it] 19%|█▉ | 4274/22095 [7:16:36<16:47:07, 3.39s/it] {'loss': 0.4357, 'grad_norm': 0.7257643318303959, 'learning_rate': 9.31613441605966e-06, 'epoch': 0.19} 19%|█▉ | 4274/22095 [7:16:36<16:47:07, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4275/22095 [7:16:45<25:47:54, 5.21s/it] {'loss': 0.4984, 'grad_norm': 0.4424464950211227, 'learning_rate': 9.315764379321916e-06, 'epoch': 0.19} 19%|█▉ | 4275/22095 [7:16:45<25:47:54, 5.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4276/22095 [7:16:49<22:55:59, 4.63s/it] {'loss': 0.3993, 'grad_norm': 0.6614602225854239, 'learning_rate': 9.31539424985165e-06, 'epoch': 0.19} 19%|█▉ | 4276/22095 [7:16:49<22:55:59, 4.63s/it]Traceback (most recent call 
last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader return img.convert("RGB") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load raise _get_oserror(err_code, encoder=False) OSError: broken data stream when reading image file [Try #0] Failed to fetch sample 7323025 in VC:s3://gui-agent/data_20250623/windows_augment/images. 
Exception: broken data stream when reading image file Problematic sample: {'image': 'autocad/20250509_125727_1/images/before_screenshot_1_id_121_internvl_position_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nOnly respond with the coordinates: The 'CUSTOMIZE' text is located in the left side of the application window, within the Tool Palettes section. It's positioned at the top of what appears to be a customization panel, directly above the helper text that reads 'Press F1 for more help'."}, {'from': 'gpt', 'value': "The 'CUSTOMIZE' text is located in the left side of the application window, within the Tool Palettes section. It's positioned at the top of what appears to be a customization panel, directly above the helper text that reads 'Press F1 for more help'.[[495, 550, 515, 557]]"}], 'width': 3600, 'height': 2338} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885300 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8453, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 
4cm'}, {'from': 'gpt', 'value': '【解答】解:根据上图所示OB=5cm-OA,∵OA=(AB+BC)÷2=4cm,∴OB=1cm.'}]} 19%|█▉ | 4277/22095 [7:16:52<21:38:24, 4.37s/it] {'loss': 0.3853, 'grad_norm': 0.7541761185746563, 'learning_rate': 9.315024027656815e-06, 'epoch': 0.19} 19%|█▉ | 4277/22095 [7:16:52<21:38:24, 4.37s/it] 19%|█▉ | 4278/22095 [7:16:56<20:20:24, 4.11s/it] {'loss': 0.3879, 'grad_norm': 0.660166374597361, 'learning_rate': 9.314653712745368e-06, 'epoch': 0.19} 19%|█▉ | 4278/22095 [7:16:56<20:20:24, 4.11s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [650, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8426191 in VC:s3://internvl-moe-sft-data/. Exception: Image size [650, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 41227, 'image': 'vrdu_texteq/astro-ph.CO/f2246a75-ad5a-4586-9df6-7c7dc9608eae.png', 'image_wh': [[650, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'with $\\theta_b$ the standard deviation of the Gaussian beam.'}]} 19%|█▉ | 4279/22095 [7:16:59<19:35:54, 3.96s/it] {'loss': 0.4038, 'grad_norm': 0.6435859897251333, 'learning_rate': 9.314283305125262e-06, 'epoch': 0.19} 19%|█▉ | 4279/22095 [7:17:00<19:35:54, 3.96s/it] 19%|█▉ | 4280/22095 [7:17:03<18:59:39, 3.84s/it] {'loss': 0.4066, 'grad_norm': 0.7788972612870466, 'learning_rate': 9.313912804804459e-06, 'epoch': 0.19} 19%|█▉ | 4280/22095 [7:17:03<18:59:39, 3.84s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8960770 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11605, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 4cm\nB. 3cm\nC. 2cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 19%|█▉ | 4281/22095 [7:17:07<18:46:19, 3.79s/it] {'loss': 0.4215, 'grad_norm': 0.6496745811186846, 'learning_rate': 9.31354221179092e-06, 'epoch': 0.19} 19%|█▉ | 4281/22095 [7:17:07<18:46:19, 3.79s/it] 19%|█▉ | 4282/22095 [7:17:11<19:55:36, 4.03s/it] {'loss': 0.422, 'grad_norm': 0.6848040494516953, 'learning_rate': 9.313171526092606e-06, 'epoch': 0.19} 19%|█▉ | 4282/22095 [7:17:11<19:55:36, 4.03s/it] 19%|█▉ | 4283/22095 [7:17:14<18:03:24, 3.65s/it] {'loss': 0.3954, 'grad_norm': 0.8014840865761476, 'learning_rate': 9.312800747717484e-06, 'epoch': 0.19} 19%|█▉ | 4283/22095 [7:17:14<18:03:24, 3.65s/it] 19%|█▉ | 4284/22095 [7:17:18<18:01:20, 3.64s/it] {'loss': 0.3774, 'grad_norm': 0.6019739424529075, 'learning_rate': 9.312429876673517e-06, 'epoch': 0.19} 19%|█▉ | 4284/22095 [7:17:18<18:01:20, 3.64s/it] 19%|█▉ | 4285/22095 [7:17:21<17:34:51, 3.55s/it] {'loss': 0.3709, 'grad_norm': 0.6340649489666178, 'learning_rate': 9.312058912968679e-06, 'epoch': 0.19} 19%|█▉ | 4285/22095 [7:17:21<17:34:51, 3.55s/it] 19%|█▉ | 4286/22095 [7:17:24<16:52:55, 3.41s/it] {'loss': 0.4114, 'grad_norm': 0.6679937885689637, 'learning_rate': 9.311687856610939e-06, 'epoch': 0.19} 19%|█▉ | 4286/22095 [7:17:24<16:52:55, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58653 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91134 > 40960). 
Running this sequence through the model will result in indexing errors 19%|█▉ | 4287/22095 [7:17:28<16:57:03, 3.43s/it] {'loss': 0.3748, 'grad_norm': 0.6207236883394622, 'learning_rate': 9.311316707608267e-06, 'epoch': 0.19} 19%|█▉ | 4287/22095 [7:17:28<16:57:03, 3.43s/it] 19%|█▉ | 4288/22095 [7:17:31<16:22:58, 3.31s/it] {'loss': 0.3919, 'grad_norm': 0.580243437551749, 'learning_rate': 9.31094546596864e-06, 'epoch': 0.19} 19%|█▉ | 4288/22095 [7:17:31<16:22:58, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49537 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79740 > 40960). Running this sequence through the model will result in indexing errors 19%|█▉ | 4289/22095 [7:17:34<15:48:40, 3.20s/it] {'loss': 0.396, 'grad_norm': 0.7394004485489974, 'learning_rate': 9.310574131700036e-06, 'epoch': 0.19} 19%|█▉ | 4289/22095 [7:17:34<15:48:40, 3.20s/it] 19%|█▉ | 4290/22095 [7:17:37<15:31:43, 3.14s/it] {'loss': 0.4577, 'grad_norm': 0.7232307625440043, 'learning_rate': 9.310202704810433e-06, 'epoch': 0.19} 19%|█▉ | 4290/22095 [7:17:37<15:31:43, 3.14s/it] 19%|█▉ | 4291/22095 [7:17:40<16:05:43, 3.25s/it] {'loss': 0.381, 'grad_norm': 0.6854297919590081, 'learning_rate': 9.309831185307812e-06, 'epoch': 0.19} 19%|█▉ | 4291/22095 [7:17:40<16:05:43, 3.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72963 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45523 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97612 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (137012 > 40960). Running this sequence through the model will result in indexing errors 19%|█▉ | 4292/22095 [7:17:43<16:16:49, 3.29s/it] {'loss': 0.4015, 'grad_norm': 0.6196581479239693, 'learning_rate': 9.309459573200154e-06, 'epoch': 0.19} 19%|█▉ | 4292/22095 [7:17:43<16:16:49, 3.29s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4293/22095 [7:17:47<16:03:12, 3.25s/it] {'loss': 0.3935, 'grad_norm': 0.6615913159694972, 'learning_rate': 9.309087868495447e-06, 'epoch': 0.19} 19%|█▉ | 4293/22095 [7:17:47<16:03:12, 3.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366674 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33420, 'image': 'vrdu_table_final_2/astro-ph.CO/7820f674-7ccb-4924-8e56-125cfdf73778.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]} 19%|█▉ | 4294/22095 [7:17:49<15:24:54, 3.12s/it] {'loss': 0.4008, 'grad_norm': 0.6393318696287117, 'learning_rate': 9.308716071201676e-06, 'epoch': 0.19} 19%|█▉ | 4294/22095 [7:17:49<15:24:54, 3.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4295/22095 [7:17:55<19:39:30, 3.98s/it] {'loss': 0.4989, 'grad_norm': 0.546204787393143, 'learning_rate': 9.308344181326829e-06, 'epoch': 0.19} 19%|█▉ | 4295/22095 [7:17:55<19:39:30, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75482 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75654 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43769 > 40960). 
Running this sequence through the model will result in indexing errors 19%|█▉ | 4296/22095 [7:17:59<18:32:38, 3.75s/it] {'loss': 0.3923, 'grad_norm': 0.6960738962155074, 'learning_rate': 9.307972198878897e-06, 'epoch': 0.19} 19%|█▉ | 4296/22095 [7:17:59<18:32:38, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 19%|█▉ | 4297/22095 [7:18:04<21:22:16, 4.32s/it] {'loss': 0.4821, 'grad_norm': 0.33892437302645556, 'learning_rate': 9.307600123865874e-06, 'epoch': 0.19} 19%|█▉ | 4297/22095 [7:18:04<21:22:16, 4.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43974 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42720 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81543 > 40960). 
Running this sequence through the model will result in indexing errors 19%|█▉ | 4298/22095 [7:18:07<19:32:47, 3.95s/it] {'loss': 0.3633, 'grad_norm': 0.6390442836312251, 'learning_rate': 9.307227956295754e-06, 'epoch': 0.19} 19%|█▉ | 4298/22095 [7:18:07<19:32:47, 3.95s/it] 19%|█▉ | 4299/22095 [7:18:10<17:48:05, 3.60s/it] {'loss': 0.3651, 'grad_norm': 0.6792186733996218, 'learning_rate': 9.306855696176536e-06, 'epoch': 0.19} 19%|█▉ | 4299/22095 [7:18:10<17:48:05, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38916.png 2025-08-27 23:16:11.134542 load time: 1113.53 ms 19%|█▉ | 4300/22095 [7:18:20<26:43:15, 5.41s/it] {'loss': 0.4678, 'grad_norm': 0.4289816986536164, 'learning_rate': 9.306483343516212e-06, 'epoch': 0.19} 19%|█▉ | 4300/22095 [7:18:20<26:43:15, 5.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8369587 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 36339, 'image': 'vrdu_table_final_2/astro-ph.CO/890c3965-f010-4847-a98d-e4ba4aae74b5.png', 'image_wh': [[17, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\gamma$\\end{tabular}\n```"}]} 19%|█▉ | 4301/22095 [7:18:23<23:23:45, 4.73s/it] {'loss': 0.4272, 'grad_norm': 0.7078556709325038, 'learning_rate': 9.30611089832279e-06, 'epoch': 0.19} 19%|█▉ | 4301/22095 [7:18:23<23:23:45, 4.73s/it] 19%|█▉ | 4302/22095 [7:18:26<21:25:19, 4.33s/it] {'loss': 0.3604, 'grad_norm': 0.670063578620746, 'learning_rate': 9.30573836060427e-06, 'epoch': 0.19} 19%|█▉ | 4302/22095 [7:18:26<21:25:19, 4.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4303/22095 [7:18:30<19:54:21, 4.03s/it] {'loss': 0.343, 'grad_norm': 0.6125312638812249, 'learning_rate': 9.305365730368658e-06, 'epoch': 0.19} 19%|█▉ | 4303/22095 [7:18:30<19:54:21, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250630/windows_augment/images/autocad/handmade_annotation_1/images/4_id_11_internvl_element-caption_crop_0_grounding_instructions_random_paste.png 2025-08-27 23:16:27.296265 load time: 1332.57 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 19%|█▉ | 4304/22095 [7:18:35<22:34:43, 4.57s/it] {'loss': 0.5143, 'grad_norm': 0.43965317617828314, 'learning_rate': 9.304993007623958e-06, 'epoch': 0.19} 19%|█▉ | 4304/22095 [7:18:35<22:34:43, 4.57s/it] 19%|█▉ | 4305/22095 [7:18:39<20:45:08, 4.20s/it] {'loss': 0.4482, 'grad_norm': 0.6869474154118801, 'learning_rate': 9.30462019237818e-06, 'epoch': 0.19} 19%|█▉ | 4305/22095 [7:18:39<20:45:08, 4.20s/it]Invalidate trace cache @ step 2: expected module 
1, but got module 364 VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/images/os_ubuntu_2/handmade_annotation_14/images/paste_Screenshot from 2025-07-09 15-31-44_id_7_internvl_position_crop_1_grounding_instructions_random.png 2025-08-27 23:16:38.894039 load time: 1053.29 ms 19%|█▉ | 4306/22095 [7:18:45<24:01:08, 4.86s/it] {'loss': 0.5008, 'grad_norm': 0.37417953819718464, 'learning_rate': 9.304247284639335e-06, 'epoch': 0.19} 19%|█▉ | 4306/22095 [7:18:45<24:01:08, 4.86s/it] 19%|█▉ | 4307/22095 [7:18:48<21:37:22, 4.38s/it] {'loss': 0.3748, 'grad_norm': 0.6682325190494395, 'learning_rate': 9.303874284415435e-06, 'epoch': 0.19} 19%|█▉ | 4307/22095 [7:18:48<21:37:22, 4.38s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31409.png 2025-08-27 23:16:45.636549 load time: 1449.95 ms 19%|█▉ | 4308/22095 [7:18:52<20:21:13, 4.12s/it] {'loss': 0.4181, 'grad_norm': 0.6462237565900745, 'learning_rate': 9.303501191714494e-06, 'epoch': 0.19} 19%|█▉ | 4308/22095 [7:18:52<20:21:13, 4.12s/it] 20%|█▉ | 4309/22095 [7:18:55<18:36:37, 3.77s/it] {'loss': 0.3434, 'grad_norm': 0.644397907031416, 'learning_rate': 9.303128006544531e-06, 'epoch': 0.2} 20%|█▉ | 4309/22095 [7:18:55<18:36:37, 3.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49214 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4310/22095 [7:18:58<18:17:44, 3.70s/it] {'loss': 0.3615, 'grad_norm': 0.6405159433598613, 'learning_rate': 9.302754728913563e-06, 'epoch': 0.2} 20%|█▉ | 4310/22095 [7:18:58<18:17:44, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78917 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4311/22095 [7:19:01<17:15:43, 3.49s/it] {'loss': 0.4064, 'grad_norm': 0.7136582827059487, 'learning_rate': 9.302381358829612e-06, 'epoch': 0.2} 20%|█▉ | 4311/22095 [7:19:02<17:15:43, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (143346 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4312/22095 [7:19:04<16:27:42, 3.33s/it] {'loss': 0.4185, 'grad_norm': 0.6703286149054923, 'learning_rate': 9.302007896300697e-06, 'epoch': 0.2} 20%|█▉ | 4312/22095 [7:19:04<16:27:42, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4313/22095 [7:19:07<15:47:54, 3.20s/it] {'loss': 0.3992, 'grad_norm': 0.67624507234566, 'learning_rate': 9.301634341334846e-06, 'epoch': 0.2} 20%|█▉ | 4313/22095 [7:19:07<15:47:54, 3.20s/it] 20%|█▉ | 4314/22095 [7:19:10<15:08:51, 3.07s/it] {'loss': 0.3774, 'grad_norm': 0.6507390990718162, 'learning_rate': 9.301260693940084e-06, 'epoch': 0.2} 20%|█▉ | 4314/22095 [7:19:10<15:08:51, 3.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71786 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62088 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72257 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4315/22095 [7:19:13<14:56:09, 3.02s/it] {'loss': 0.391, 'grad_norm': 0.635224165327857, 'learning_rate': 9.300886954124442e-06, 'epoch': 0.2} 20%|█▉ | 4315/22095 [7:19:13<14:56:09, 3.02s/it] 20%|█▉ | 4316/22095 [7:19:17<17:05:16, 3.46s/it] {'loss': 0.4185, 'grad_norm': 0.820758878812254, 'learning_rate': 9.300513121895946e-06, 'epoch': 0.2} 20%|█▉ | 4316/22095 [7:19:18<17:05:16, 3.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4317/22095 [7:19:21<17:24:12, 3.52s/it] {'loss': 0.3871, 'grad_norm': 0.6567914034471967, 'learning_rate': 9.300139197262633e-06, 'epoch': 0.2} 20%|█▉ | 4317/22095 [7:19:21<17:24:12, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4318/22095 [7:19:30<24:33:55, 4.97s/it] {'loss': 0.4992, 'grad_norm': 0.7745445471527284, 'learning_rate': 9.299765180232534e-06, 'epoch': 0.2} 20%|█▉ | 4318/22095 [7:19:30<24:33:55, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52094 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96819 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104470 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94254 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4319/22095 [7:19:40<32:46:40, 6.64s/it] {'loss': 0.5268, 'grad_norm': 0.5686358701093436, 'learning_rate': 9.299391070813687e-06, 'epoch': 0.2} 20%|█▉ | 4319/22095 [7:19:40<32:46:40, 6.64s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 20%|█▉ | 4320/22095 [7:19:44<29:14:43, 5.92s/it] {'loss': 0.3943, 'grad_norm': 0.7189558651821492, 'learning_rate': 9.29901686901413e-06, 'epoch': 0.2} 20%|█▉ | 4320/22095 [7:19:44<29:14:43, 5.92s/it] 20%|█▉ | 4321/22095 [7:19:48<26:07:56, 5.29s/it] {'loss': 0.3723, 'grad_norm': 0.6913062360595764, 'learning_rate': 9.298642574841906e-06, 'epoch': 0.2} 20%|█▉ | 4321/22095 [7:19:48<26:07:56, 5.29s/it] 20%|█▉ | 4322/22095 [7:19:51<22:41:41, 4.60s/it] {'loss': 0.373, 'grad_norm': 0.7396020547936617, 'learning_rate': 9.298268188305054e-06, 'epoch': 0.2} 20%|█▉ | 4322/22095 [7:19:51<22:41:41, 4.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4323/22095 [7:20:01<29:54:18, 6.06s/it] {'loss': 0.4975, 'grad_norm': 0.7529357435564967, 'learning_rate': 9.29789370941162e-06, 'epoch': 0.2} 20%|█▉ | 4323/22095 [7:20:01<29:54:18, 6.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76755 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51089 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106912 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55846 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53772 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4324/22095 [7:20:04<25:57:14, 5.26s/it] {'loss': 0.3901, 'grad_norm': 0.7697627354205817, 'learning_rate': 9.29751913816965e-06, 'epoch': 0.2} 20%|█▉ | 4324/22095 [7:20:04<25:57:14, 5.26s/it] 20%|█▉ | 4325/22095 [7:20:07<23:01:35, 4.66s/it] {'loss': 0.4247, 'grad_norm': 0.7005308781484924, 'learning_rate': 9.297144474587193e-06, 'epoch': 0.2} 20%|█▉ | 4325/22095 [7:20:07<23:01:35, 4.66s/it] 20%|█▉ | 4326/22095 [7:20:10<20:27:47, 4.15s/it] {'loss': 0.4269, 'grad_norm': 0.6823939930425462, 'learning_rate': 9.296769718672298e-06, 'epoch': 0.2} 20%|█▉ | 4326/22095 [7:20:10<20:27:47, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4327/22095 [7:20:20<28:12:21, 5.71s/it] {'loss': 0.5073, 'grad_norm': 0.614401669857915, 'learning_rate': 9.296394870433018e-06, 'epoch': 0.2} 20%|█▉ | 4327/22095 [7:20:20<28:12:21, 5.71s/it] 20%|█▉ | 4328/22095 [7:20:24<25:52:19, 5.24s/it] {'loss': 0.4299, 'grad_norm': 0.7837169926571875, 'learning_rate': 9.29601992987741e-06, 'epoch': 0.2} 20%|█▉ | 4328/22095 [7:20:24<25:52:19, 5.24s/it] 20%|█▉ | 4329/22095 [7:20:27<23:03:51, 4.67s/it] {'loss': 0.4003, 'grad_norm': 0.7365197541628816, 'learning_rate': 9.295644897013526e-06, 'epoch': 0.2} 20%|█▉ | 4329/22095 [7:20:27<23:03:51, 4.67s/it] 20%|█▉ | 4330/22095 [7:20:30<20:57:36, 4.25s/it] {'loss': 0.4143, 'grad_norm': 0.6291839507518706, 'learning_rate': 9.295269771849426e-06, 'epoch': 0.2} 20%|█▉ | 4330/22095 [7:20:30<20:57:36, 4.25s/it] 20%|█▉ | 4331/22095 [7:20:34<20:48:35, 4.22s/it] {'loss': 0.4568, 'grad_norm': 0.7709732047355847, 'learning_rate': 9.294894554393172e-06, 'epoch': 0.2} 20%|█▉ | 4331/22095 [7:20:34<20:48:35, 4.22s/it]Invalidate trace cache @ step 2: expected module 1, but got 
module 364 20%|█▉ | 4332/22095 [7:20:42<25:08:12, 5.09s/it] {'loss': 0.5193, 'grad_norm': 0.4735295845056872, 'learning_rate': 9.294519244652825e-06, 'epoch': 0.2} 20%|█▉ | 4332/22095 [7:20:42<25:08:12, 5.09s/it] 20%|█▉ | 4333/22095 [7:20:45<23:05:06, 4.68s/it] {'loss': 0.4162, 'grad_norm': 0.7441985176769523, 'learning_rate': 9.294143842636447e-06, 'epoch': 0.2} 20%|█▉ | 4333/22095 [7:20:45<23:05:06, 4.68s/it] 20%|█▉ | 4334/22095 [7:20:48<20:36:02, 4.18s/it] {'loss': 0.3498, 'grad_norm': 0.6201583591024671, 'learning_rate': 9.293768348352106e-06, 'epoch': 0.2} 20%|█▉ | 4334/22095 [7:20:48<20:36:02, 4.18s/it] 20%|█▉ | 4335/22095 [7:20:51<19:08:22, 3.88s/it] {'loss': 0.3856, 'grad_norm': 1.1910528546319035, 'learning_rate': 9.293392761807873e-06, 'epoch': 0.2} 20%|█▉ | 4335/22095 [7:20:51<19:08:22, 3.88s/it] 20%|█▉ | 4336/22095 [7:20:55<18:07:37, 3.67s/it] {'loss': 0.3971, 'grad_norm': 0.7391466697999081, 'learning_rate': 9.293017083011814e-06, 'epoch': 0.2} 20%|█▉ | 4336/22095 [7:20:55<18:07:37, 3.67s/it] 20%|█▉ | 4337/22095 [7:20:58<17:37:55, 3.57s/it] {'loss': 0.3835, 'grad_norm': 0.679637583536768, 'learning_rate': 9.292641311972004e-06, 'epoch': 0.2} 20%|█▉ | 4337/22095 [7:20:58<17:37:55, 3.57s/it] 20%|█▉ | 4338/22095 [7:21:02<17:42:57, 3.59s/it] {'loss': 0.4116, 'grad_norm': 0.6965119342754589, 'learning_rate': 9.292265448696515e-06, 'epoch': 0.2} 20%|█▉ | 4338/22095 [7:21:02<17:42:57, 3.59s/it] 20%|█▉ | 4339/22095 [7:21:05<16:46:36, 3.40s/it] {'loss': 0.3934, 'grad_norm': 0.6969239676501272, 'learning_rate': 9.291889493193424e-06, 'epoch': 0.2} 20%|█▉ | 4339/22095 [7:21:05<16:46:36, 3.40s/it] 20%|█▉ | 4340/22095 [7:21:08<16:15:02, 3.29s/it] {'loss': 0.4031, 'grad_norm': 0.6554684903091974, 'learning_rate': 9.29151344547081e-06, 'epoch': 0.2} 20%|█▉ | 4340/22095 [7:21:08<16:15:02, 3.29s/it] 20%|█▉ | 4341/22095 [7:21:11<15:44:28, 3.19s/it] {'loss': 0.3987, 'grad_norm': 0.6461420773694276, 'learning_rate': 9.291137305536752e-06, 'epoch': 0.2} 20%|█▉ | 4341/22095 
[7:21:11<15:44:28, 3.19s/it] 20%|█▉ | 4342/22095 [7:21:15<16:54:16, 3.43s/it] {'loss': 0.3913, 'grad_norm': 0.6893853761388993, 'learning_rate': 9.290761073399333e-06, 'epoch': 0.2} 20%|█▉ | 4342/22095 [7:21:15<16:54:16, 3.43s/it] 20%|█▉ | 4343/22095 [7:21:18<17:29:35, 3.55s/it] {'loss': 0.4473, 'grad_norm': 0.6926547278011074, 'learning_rate': 9.290384749066636e-06, 'epoch': 0.2} 20%|█▉ | 4343/22095 [7:21:18<17:29:35, 3.55s/it] 20%|█▉ | 4344/22095 [7:21:22<17:14:09, 3.50s/it] {'loss': 0.3774, 'grad_norm': 0.7730563566336075, 'learning_rate': 9.290008332546749e-06, 'epoch': 0.2} 20%|█▉ | 4344/22095 [7:21:22<17:14:09, 3.50s/it] 20%|█▉ | 4345/22095 [7:21:25<16:32:12, 3.35s/it] {'loss': 0.3956, 'grad_norm': 0.761834350061357, 'learning_rate': 9.289631823847758e-06, 'epoch': 0.2} 20%|█▉ | 4345/22095 [7:21:25<16:32:12, 3.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8491657 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 40546, 'image': 'vrdu_texteq/astro-ph.CO/d5c5ba47-64ad-4e99-a277-fc8ad7d8c0ac.png', 'image_wh': [[14, 20]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': '$g$'}]} 20%|█▉ | 4346/22095 [7:21:28<16:09:43, 3.28s/it] {'loss': 0.4139, 'grad_norm': 0.6493671717411257, 'learning_rate': 9.289255222977754e-06, 'epoch': 0.2} 20%|█▉ | 4346/22095 [7:21:28<16:09:43, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8382444 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28. 
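The `ValueError: Image size … is too small. Minimum size is 28` failures above recur for samples whose `image_wh` has a side below 28 pixels (14x20 here, 17x28 just below). A minimal pre-filter sketch, assuming the sample dicts carry their sizes in `image_wh` as logged and that 28 px is the minimum side length stated in the error; the helper names are hypothetical, not part of the training code:

```python
# Hypothetical screen for samples like the "Problematic sample" dicts in this
# log. Assumes each sample stores sizes as sample["image_wh"] = [[w, h], ...]
# and that 28 px is the minimum side length, per the ValueError message.
MIN_SIDE = 28

def is_large_enough(sample: dict, min_side: int = MIN_SIDE) -> bool:
    """Return True if every image in the sample meets the minimum side length."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def filter_samples(samples: list[dict], min_side: int = MIN_SIDE) -> list[dict]:
    """Keep only samples whose images are at least min_side px on each side."""
    return [s for s in samples if is_large_enough(s, min_side)]
```

Applied at dataset-build time, such a filter would drop the 14x20 px `vrdu_texteq` sample above instead of letting `__getitem__` raise mid-epoch.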
Problematic sample: {'id': 49238, 'image': 'vrdu_table_final_2/astro-ph.CO/a3555283-c78e-45b0-aa3f-27a446011f19.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]} 20%|█▉ | 4347/22095 [7:21:37<24:13:00, 4.91s/it] {'loss': 0.4967, 'grad_norm': 0.4625826341249837, 'learning_rate': 9.288878529944827e-06, 'epoch': 0.2} 20%|█▉ | 4347/22095 [7:21:37<24:13:00, 4.91s/it] 20%|█▉ | 4348/22095 [7:21:40<22:01:42, 4.47s/it] {'loss': 0.4225, 'grad_norm': 0.7436204158165354, 'learning_rate': 9.288501744757073e-06, 'epoch': 0.2} 20%|█▉ | 4348/22095 [7:21:40<22:01:42, 4.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52365 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4349/22095 [7:21:44<20:37:40, 4.18s/it] {'loss': 0.4244, 'grad_norm': 0.709135533044813, 'learning_rate': 9.28812486742259e-06, 'epoch': 0.2} 20%|█▉ | 4349/22095 [7:21:44<20:37:40, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69930 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56217 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4350/22095 [7:21:46<18:21:22, 3.72s/it] {'loss': 0.415, 'grad_norm': 0.6641857934741487, 'learning_rate': 9.287747897949471e-06, 'epoch': 0.2} 20%|█▉ | 4350/22095 [7:21:46<18:21:22, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87432 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4351/22095 [7:21:49<17:33:30, 3.56s/it] {'loss': 0.4266, 'grad_norm': 0.714834663860521, 'learning_rate': 9.287370836345819e-06, 'epoch': 0.2} 20%|█▉ | 4351/22095 [7:21:49<17:33:30, 3.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4352/22095 [7:21:54<18:29:45, 3.75s/it] {'loss': 0.356, 'grad_norm': 0.648691599092435, 'learning_rate': 9.286993682619736e-06, 'epoch': 0.2} 20%|█▉ | 4352/22095 [7:21:54<18:29:45, 3.75s/it] 20%|█▉ | 4353/22095 [7:21:57<18:41:02, 3.79s/it] {'loss': 0.3934, 'grad_norm': 0.6686053436740277, 'learning_rate': 9.286616436779326e-06, 'epoch': 0.2} 20%|█▉ | 4353/22095 [7:21:58<18:41:02, 3.79s/it] 20%|█▉ | 4354/22095 [7:22:02<19:38:18, 3.99s/it] {'loss': 0.4217, 'grad_norm': 0.6551766852685967, 'learning_rate': 9.286239098832693e-06, 'epoch': 0.2} 20%|█▉ | 4354/22095 [7:22:02<19:38:18, 3.99s/it] 20%|█▉ | 4355/22095 [7:22:05<18:02:39, 3.66s/it] {'loss': 0.4014, 'grad_norm': 0.6901906038206445, 'learning_rate': 9.285861668787947e-06, 'epoch': 0.2} 20%|█▉ | 4355/22095 [7:22:05<18:02:39, 3.66s/it] 20%|█▉ | 4356/22095 [7:22:08<17:51:37, 3.62s/it] {'loss': 0.429, 'grad_norm': 0.6851944238808891, 'learning_rate': 9.285484146653195e-06, 'epoch': 0.2} 20%|█▉ | 4356/22095 [7:22:08<17:51:37, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4357/22095 [7:22:13<19:37:48, 3.98s/it] {'loss': 0.5184, 'grad_norm': 0.5409067984664292, 'learning_rate': 9.285106532436552e-06, 'epoch': 0.2} 20%|█▉ | 4357/22095 [7:22:13<19:37:48, 3.98s/it] 20%|█▉ | 4358/22095 [7:22:21<25:31:44, 5.18s/it] {'loss': 0.5058, 'grad_norm': 0.4302281353429989, 'learning_rate': 9.28472882614613e-06, 'epoch': 0.2} 20%|█▉ | 4358/22095 [7:22:21<25:31:44, 5.18s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 20%|█▉ | 4359/22095 [7:22:25<23:02:35, 4.68s/it] {'loss': 
0.3741, 'grad_norm': 0.7081951904028086, 'learning_rate': 9.284351027790044e-06, 'epoch': 0.2} 20%|█▉ | 4359/22095 [7:22:25<23:02:35, 4.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83199 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4360/22095 [7:22:29<22:18:50, 4.53s/it] {'loss': 0.4113, 'grad_norm': 0.6824997425810786, 'learning_rate': 9.283973137376414e-06, 'epoch': 0.2} 20%|█▉ | 4360/22095 [7:22:29<22:18:50, 4.53s/it] 20%|█▉ | 4361/22095 [7:22:33<20:59:55, 4.26s/it] {'loss': 0.3592, 'grad_norm': 0.7060524282807865, 'learning_rate': 9.283595154913358e-06, 'epoch': 0.2} 20%|█▉ | 4361/22095 [7:22:33<20:59:55, 4.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8882168 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5321, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 
8cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BD=4cm,∴AD=AB-BD=10-4=6(cm),∵点C是AD中点,∴CD=\\frac{1}{2}AD=3cm,则BC=CD+BD=7cm,'}]} 20%|█▉ | 4362/22095 [7:22:36<19:31:27, 3.96s/it] {'loss': 0.4269, 'grad_norm': 0.7613516912992961, 'learning_rate': 9.283217080409e-06, 'epoch': 0.2} 20%|█▉ | 4362/22095 [7:22:36<19:31:27, 3.96s/it] 20%|█▉ | 4363/22095 [7:22:40<19:45:15, 4.01s/it] {'loss': 0.4193, 'grad_norm': 0.7603432579957048, 'learning_rate': 9.28283891387146e-06, 'epoch': 0.2} 20%|█▉ | 4363/22095 [7:22:40<19:45:15, 4.01s/it] 20%|█▉ | 4364/22095 [7:22:44<19:52:26, 4.04s/it] {'loss': 0.4444, 'grad_norm': 0.6901066095588091, 'learning_rate': 9.282460655308864e-06, 'epoch': 0.2} 20%|█▉ | 4364/22095 [7:22:44<19:52:26, 4.04s/it] 20%|█▉ | 4365/22095 [7:22:47<18:36:51, 3.78s/it] {'loss': 0.3687, 'grad_norm': 0.584544309983049, 'learning_rate': 9.282082304729343e-06, 'epoch': 0.2} 20%|█▉ | 4365/22095 [7:22:47<18:36:51, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46165 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4366/22095 [7:22:51<19:00:54, 3.86s/it] {'loss': 0.3928, 'grad_norm': 0.7097341226175008, 'learning_rate': 9.281703862141024e-06, 'epoch': 0.2} 20%|█▉ | 4366/22095 [7:22:51<19:00:54, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45087 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70106 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44448 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43774 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4367/22095 [7:22:55<19:38:07, 3.99s/it] {'loss': 0.3784, 'grad_norm': 0.704967973293008, 'learning_rate': 9.28132532755204e-06, 'epoch': 0.2} 20%|█▉ | 4367/22095 [7:22:56<19:38:07, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4368/22095 [7:23:02<22:57:54, 4.66s/it] {'loss': 0.4982, 'grad_norm': 0.9059692746686824, 'learning_rate': 9.280946700970524e-06, 'epoch': 0.2} 20%|█▉ | 4368/22095 [7:23:02<22:57:54, 4.66s/it] 20%|█▉ | 4369/22095 [7:23:05<20:41:47, 4.20s/it] {'loss': 0.3921, 'grad_norm': 0.6941175981179759, 'learning_rate': 9.280567982404611e-06, 'epoch': 0.2} 20%|█▉ | 4369/22095 [7:23:05<20:41:47, 4.20s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [370, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8489529 in VC:s3://internvl-moe-sft-data/. Exception: Image size [370, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 6706, 'image': 'vrdu_texteq/astro-ph.CO/0f4b3c92-1b55-4a90-b88a-5554d7484693.png', 'image_wh': [[370, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': "where $G$ is Newton's constant."}]} 20%|█▉ | 4370/22095 [7:23:08<18:48:59, 3.82s/it] {'loss': 0.361, 'grad_norm': 0.7020988815556046, 'learning_rate': 9.280189171862439e-06, 'epoch': 0.2} 20%|█▉ | 4370/22095 [7:23:08<18:48:59, 3.82s/it] 20%|█▉ | 4371/22095 [7:23:12<19:02:44, 3.87s/it] {'loss': 0.4315, 'grad_norm': 0.733757185054716, 'learning_rate': 9.279810269352147e-06, 'epoch': 0.2} 20%|█▉ | 4371/22095 [7:23:12<19:02:44, 3.87s/it] 20%|█▉ | 4372/22095 [7:23:16<19:15:19, 3.91s/it] {'loss': 0.3648, 'grad_norm': 0.9030378119274879, 'learning_rate': 9.279431274881876e-06, 'epoch': 0.2} 20%|█▉ | 4372/22095 [7:23:16<19:15:19, 3.91s/it] 20%|█▉ | 4373/22095 [7:23:19<17:30:00, 3.55s/it] {'loss': 0.4028, 'grad_norm': 0.6822015900189836, 'learning_rate': 9.279052188459772e-06, 'epoch': 0.2} 20%|█▉ | 4373/22095 [7:23:19<17:30:00, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4374/22095 [7:23:27<24:44:38, 5.03s/it] {'loss': 0.5142, 'grad_norm': 0.6302617832804666, 'learning_rate': 9.278673010093977e-06, 'epoch': 0.2} 20%|█▉ | 4374/22095 [7:23:27<24:44:38, 5.03s/it] 20%|█▉ | 4375/22095 [7:23:35<28:32:36, 5.80s/it] {'loss': 0.5014, 'grad_norm': 0.496785804689605, 'learning_rate': 9.278293739792642e-06, 'epoch': 0.2} 20%|█▉ | 4375/22095 [7:23:35<28:32:36, 5.80s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 20%|█▉ | 4376/22095 [7:23:39<26:16:04, 5.34s/it] {'loss': 0.4004, 'grad_norm': 0.7594395569593272, 'learning_rate': 9.277914377563911e-06, 'epoch': 0.2} 20%|█▉ | 4376/22095 [7:23:39<26:16:04, 5.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation 20%|█▉ | 4377/22095 [7:23:42<23:19:51, 4.74s/it] {'loss': 0.3856, 'grad_norm': 0.6961484073026286, 'learning_rate': 9.277534923415941e-06, 'epoch': 0.2} 20%|█▉ | 4377/22095 [7:23:42<23:19:51, 4.74s/it] 20%|█▉ | 4378/22095 [7:23:45<20:42:48, 4.21s/it] {'loss': 0.4135, 'grad_norm': 0.752079978554707, 'learning_rate': 9.277155377356881e-06, 'epoch': 0.2} 20%|█▉ | 4378/22095 [7:23:45<20:42:48, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47773 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115214 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4379/22095 [7:23:48<19:25:59, 3.95s/it] {'loss': 0.3979, 'grad_norm': 0.7544369391544148, 'learning_rate': 9.27677573939489e-06, 'epoch': 0.2} 20%|█▉ | 4379/22095 [7:23:49<19:25:59, 3.95s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [145, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8349606 in VC:s3://internvl-moe-sft-data/. Exception: Image size [145, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 16278, 'image': 'vrdu_table_final_2/astro-ph.CO/d5bd712e-bf2a-450d-86b0-f647c76fdc90.png', 'image_wh': [[145, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{ccc}\n& GAUSSIAN&\\\\\n\n\\end{tabular}\n```"}]} 20%|█▉ | 4380/22095 [7:23:52<18:04:00, 3.67s/it] {'loss': 0.4056, 'grad_norm': 0.730897043381073, 'learning_rate': 9.276396009538122e-06, 'epoch': 0.2} 20%|█▉ | 4380/22095 [7:23:52<18:04:00, 3.67s/it] 20%|█▉ | 4381/22095 [7:23:55<18:24:37, 3.74s/it] {'loss': 0.4194, 'grad_norm': 0.6920481467125367, 'learning_rate': 9.276016187794739e-06, 'epoch': 0.2} 20%|█▉ | 4381/22095 [7:23:55<18:24:37, 3.74s/it] 20%|█▉ | 4382/22095 [7:23:59<17:38:45, 3.59s/it] {'loss': 0.4046, 'grad_norm': 0.6528896991545643, 'learning_rate': 9.275636274172901e-06, 'epoch': 0.2} 20%|█▉ | 4382/22095 [7:23:59<17:38:45, 3.59s/it] 20%|█▉ | 4383/22095 [7:24:04<20:35:54, 4.19s/it] {'loss': 0.3838, 'grad_norm': 0.7660841197828047, 'learning_rate': 9.27525626868077e-06, 'epoch': 0.2} 20%|█▉ | 4383/22095 [7:24:04<20:35:54, 4.19s/it] 20%|█▉ | 4384/22095 [7:24:09<20:43:12, 4.21s/it] {'loss': 0.3808, 'grad_norm': 0.7133729442628481, 'learning_rate': 9.274876171326514e-06, 'epoch': 0.2} 20%|█▉ | 4384/22095 [7:24:09<20:43:12, 4.21s/it] 20%|█▉ | 4385/22095 [7:24:12<19:11:59, 3.90s/it] {'loss': 0.3779, 'grad_norm': 0.6404607940509356, 'learning_rate': 9.274495982118297e-06, 'epoch': 0.2} 20%|█▉ | 4385/22095 [7:24:12<19:11:59, 3.90s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise 
ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8338404 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5031, 'image': 'vrdu_table_final_2/astro-ph.CO/1daab5d7-c135-4764-8810-034956f0a661.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 20%|█▉ | 4386/22095 [7:24:15<18:20:06, 3.73s/it] {'loss': 0.4167, 'grad_norm': 0.7199080919720658, 'learning_rate': 9.27411570106429e-06, 'epoch': 0.2} 20%|█▉ | 4386/22095 [7:24:15<18:20:06, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047786 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 20%|█▉ | 4387/22095 [7:24:24<25:40:08, 5.22s/it] {'loss': 0.5265, 'grad_norm': 1.2733519021150366, 'learning_rate': 9.273735328172664e-06, 'epoch': 0.2} 20%|█▉ | 4387/22095 [7:24:24<25:40:08, 5.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047929 in VC:s3://multi-modal/UniGeo/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 11cm\nB. 12cm\nC. 15cm\nD. 
13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 20%|█▉ | 4388/22095 [7:24:27<22:37:02, 4.60s/it] {'loss': 0.3537, 'grad_norm': 0.7103108166886162, 'learning_rate': 9.273354863451589e-06, 'epoch': 0.2} 20%|█▉ | 4388/22095 [7:24:27<22:37:02, 4.60s/it] 20%|█▉ | 4389/22095 [7:24:31<21:59:46, 4.47s/it] {'loss': 0.3739, 'grad_norm': 0.7246408967666235, 'learning_rate': 9.272974306909246e-06, 'epoch': 0.2} 20%|█▉ | 4389/22095 [7:24:31<21:59:46, 4.47s/it] 20%|█▉ | 4390/22095 [7:24:34<19:55:28, 4.05s/it] {'loss': 0.354, 'grad_norm': 0.6757107097574218, 'learning_rate': 9.272593658553806e-06, 'epoch': 0.2} 20%|█▉ | 4390/22095 [7:24:34<19:55:28, 4.05s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8333765 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 374, 'image': 'vrdu_table_final_2/astro-ph.CO/0fea7510-b803-41b9-94b7-e34373412534.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]} 20%|█▉ | 4391/22095 [7:24:37<18:40:02, 3.80s/it] {'loss': 0.4137, 'grad_norm': 0.7067224266353317, 'learning_rate': 9.272212918393452e-06, 'epoch': 0.2} 20%|█▉ | 4391/22095 [7:24:37<18:40:02, 3.80s/it] 20%|█▉ | 4392/22095 [7:24:40<17:31:44, 3.56s/it] {'loss': 0.3582, 'grad_norm': 0.6155517958991198, 'learning_rate': 9.271832086436364e-06, 'epoch': 0.2} 20%|█▉ | 4392/22095 [7:24:40<17:31:44, 3.56s/it] 20%|█▉ | 4393/22095 [7:24:43<16:29:00, 3.35s/it] {'loss': 0.3961, 'grad_norm': 0.6805250960188165, 'learning_rate': 9.271451162690723e-06, 'epoch': 0.2} 20%|█▉ | 4393/22095 [7:24:43<16:29:00, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (112053 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46514 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47342 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87092 > 40960). 
Running this sequence through the model will result in indexing errors 20%|█▉ | 4394/22095 [7:24:46<16:00:44, 3.26s/it] {'loss': 0.375, 'grad_norm': 0.6844985043600663, 'learning_rate': 9.271070147164715e-06, 'epoch': 0.2} 20%|█▉ | 4394/22095 [7:24:46<16:00:44, 3.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46621 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74017 > 40960). Running this sequence through the model will result in indexing errors 20%|█▉ | 4395/22095 [7:24:49<15:51:23, 3.23s/it] {'loss': 0.4425, 'grad_norm': 0.7043381339443586, 'learning_rate': 9.270689039866528e-06, 'epoch': 0.2} 20%|█▉ | 4395/22095 [7:24:49<15:51:23, 3.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4396/22095 [7:24:52<15:14:53, 3.10s/it] {'loss': 0.3539, 'grad_norm': 0.6478549868387321, 'learning_rate': 9.270307840804349e-06, 'epoch': 0.2} 20%|█▉ | 4396/22095 [7:24:52<15:14:53, 3.10s/it] 20%|█▉ | 4397/22095 [7:24:55<14:44:18, 3.00s/it] {'loss': 0.3831, 'grad_norm': 0.6498360959963834, 'learning_rate': 9.26992654998637e-06, 'epoch': 0.2} 20%|█▉ | 4397/22095 [7:24:55<14:44:18, 3.00s/it] 20%|█▉ | 4398/22095 [7:24:58<15:29:23, 3.15s/it] {'loss': 0.4313, 'grad_norm': 0.674985630028175, 'learning_rate': 9.269545167420786e-06, 'epoch': 0.2} 20%|█▉ | 4398/22095 [7:24:58<15:29:23, 3.15s/it] 20%|█▉ | 4399/22095 [7:25:03<17:33:23, 3.57s/it] {'loss': 0.4219, 'grad_norm': 0.6890540926953495, 'learning_rate': 9.269163693115786e-06, 'epoch': 0.2} 20%|█▉ | 4399/22095 [7:25:03<17:33:23, 3.57s/it] 20%|█▉ | 4400/22095 [7:25:06<17:21:24, 3.53s/it] {'loss': 0.3732, 'grad_norm': 0.6898930046935953, 'learning_rate': 9.268782127079571e-06, 'epoch': 0.2} 20%|█▉ | 4400/22095 [7:25:06<17:21:24, 3.53s/it] 20%|█▉ | 4401/22095 
[7:25:10<17:12:58, 3.50s/it] {'loss': 0.3636, 'grad_norm': 0.6177411270268293, 'learning_rate': 9.26840046932034e-06, 'epoch': 0.2} 20%|█▉ | 4401/22095 [7:25:10<17:12:58, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|█▉ | 4402/22095 [7:25:19<26:04:44, 5.31s/it] {'loss': 0.5663, 'grad_norm': 1.3825448268802494, 'learning_rate': 9.26801871984629e-06, 'epoch': 0.2} 20%|█▉ | 4402/22095 [7:25:19<26:04:44, 5.31s/it] 20%|█▉ | 4403/22095 [7:25:27<30:01:48, 6.11s/it] {'loss': 0.5335, 'grad_norm': 0.9528669719885208, 'learning_rate': 9.267636878665629e-06, 'epoch': 0.2} 20%|█▉ | 4403/22095 [7:25:27<30:01:48, 6.11s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 20%|█▉ | 4404/22095 [7:25:31<26:36:51, 5.42s/it] {'loss': 0.3852, 'grad_norm': 0.6940354386270224, 'learning_rate': 9.267254945786556e-06, 'epoch': 0.2} 20%|█▉ | 4404/22095 [7:25:31<26:36:51, 5.42s/it] 20%|█▉ | 4405/22095 [7:25:35<23:34:38, 4.80s/it] {'loss': 0.3684, 'grad_norm': 0.7044042232798097, 'learning_rate': 9.26687292121728e-06, 'epoch': 0.2} 20%|█▉ | 4405/22095 [7:25:35<23:34:38, 4.80s/it] 20%|█▉ | 4406/22095 [7:25:38<20:57:45, 4.27s/it] {'loss': 0.4547, 'grad_norm': 0.8272582311689283, 'learning_rate': 9.26649080496601e-06, 'epoch': 0.2} 20%|█▉ | 4406/22095 [7:25:38<20:57:45, 4.27s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [489, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8525882 in VC:s3://internvl-moe-sft-data/. Exception: Image size [489, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 159366, 'image': 'vrdu_texteq/astro-ph.CO/e2706a4c-696e-4659-bc15-8e73c31271c7.png', 'image_wh': [[489, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'The field was observed for for $3\\times 1800$s.'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|█▉ | 4407/22095 [7:25:40<18:54:36, 3.85s/it] {'loss': 0.4182, 'grad_norm': 0.8991881377986967, 'learning_rate': 9.266108597040957e-06, 'epoch': 0.2} 20%|█▉ | 4407/22095 [7:25:40<18:54:36, 3.85s/it] 20%|█▉ | 4408/22095 [7:25:44<18:35:46, 3.79s/it] {'loss': 0.4388, 'grad_norm': 0.8684215744351997, 'learning_rate': 9.265726297450332e-06, 'epoch': 0.2} 20%|█▉ | 4408/22095 [7:25:44<18:35:46, 3.79s/it] 20%|█▉ | 4409/22095 [7:25:47<17:43:17, 3.61s/it] {'loss': 0.376, 'grad_norm': 0.671845166331204, 'learning_rate': 9.265343906202351e-06, 'epoch': 0.2} 20%|█▉ | 4409/22095 [7:25:47<17:43:17, 3.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [781, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8484932 in VC:s3://internvl-moe-sft-data/. Exception: Image size [781, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 73582, 'image': 'vrdu_texteq/astro-ph.CO/6efd57e4-676d-448b-93a9-7269e9e2c7d9.png', 'image_wh': [[781, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'We can form a linear estimator for the $\\Delta$ terms in $\\mathbf{x}$ of the form'}]}
20%|█▉ | 4410/22095 [7:25:51<18:04:53, 3.68s/it] {'loss': 0.4265, 'grad_norm': 0.7218212880590468, 'learning_rate': 9.264961423305229e-06, 'epoch': 0.2}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [67, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390147 in VC:s3://internvl-moe-sft-data/. Exception: Image size [67, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56966, 'image': 'vrdu_table_final_2/astro-ph.EP/82fee656-58fc-479a-b82b-2cbcdf93037e.png', 'image_wh': [[67, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} Value \\\\ \\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
20%|█▉ | 4411/22095 [7:25:55<18:25:44, 3.75s/it] {'loss': 0.3955, 'grad_norm': 0.7227038515063788, 'learning_rate': 9.264578848767184e-06, 'epoch': 0.2}
20%|█▉ | 4412/22095 [7:25:58<17:52:36, 3.64s/it] {'loss': 0.4277, 'grad_norm': 0.7151889788085047, 'learning_rate': 9.264196182596438e-06, 'epoch': 0.2}
20%|█▉ | 4413/22095 [7:26:01<16:52:53, 3.44s/it] {'loss': 0.3914, 'grad_norm': 0.8204253178429187, 'learning_rate': 9.26381342480121e-06, 'epoch': 0.2}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8339020 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
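The recurring `ValueError: Image size [...] is too small. Minimum size is 28` failures above are raised in `data_qwen_2.py` for tiny rendered equations and one-cell tables (heights of 14–25 px). The 28 px floor is consistent with Qwen2.5-VL's vision encoder, which merges 2×2 patches of 14 px, so any side under 28 px cannot yield a single vision token. A minimal sketch of such a guard, assuming that rationale; the function names are hypothetical and the real error message carries two extra fields (`[w, h, 100, 100]`) omitted here:

```python
# Hedged sketch: reject images too small for the model's patch grid.
# check_min_image_size / is_sample_usable are illustrative names, not the repo's code.
MIN_IMAGE_SIZE = 28  # 2x2 merged patches of 14 px => each side must be >= 28

def check_min_image_size(width, height, min_size=MIN_IMAGE_SIZE):
    """Raise ValueError, as the training log does, for undersized images."""
    if width < min_size or height < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_size}."
        )

def is_sample_usable(sample, min_size=MIN_IMAGE_SIZE):
    """True if every (width, height) pair in the sample meets the minimum."""
    return all(
        w >= min_size and h >= min_size
        for w, h in sample.get("image_wh", [])
    )
```

Filtering such samples out of the manifest ahead of time (rather than catching the exception per fetch, as the `[Try #0] Failed to fetch sample ...` retries above do) would avoid the repeated dataloader round-trips.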
Problematic sample: {'id': 5651, 'image': 'vrdu_table_final_2/astro-ph.CO/4da3c934-8c0a-4727-912b-2703e92489cb.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
20%|█▉ | 4414/22095 [7:26:05<17:14:22, 3.51s/it] {'loss': 0.3781, 'grad_norm': 0.7712295837039834, 'learning_rate': 9.26343057538973e-06, 'epoch': 0.2}
20%|█▉ | 4415/22095 [7:26:09<18:02:31, 3.67s/it] {'loss': 0.3883, 'grad_norm': 0.7006062972466253, 'learning_rate': 9.263047634370221e-06, 'epoch': 0.2}
20%|█▉ | 4416/22095 [7:26:13<18:10:44, 3.70s/it] {'loss': 0.4038, 'grad_norm': 0.7042611085484666, 'learning_rate': 9.26266460175091e-06, 'epoch': 0.2}
20%|█▉ | 4417/22095 [7:26:17<18:06:45, 3.69s/it] {'loss': 0.4182, 'grad_norm': 0.7092910007919806, 'learning_rate': 9.262281477540029e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (68804 > 40960). Running this sequence through the model will result in indexing errors
20%|█▉ | 4418/22095 [7:26:20<17:09:15, 3.49s/it] {'loss': 0.4053, 'grad_norm': 0.6352626827612116, 'learning_rate': 9.26189826174581e-06, 'epoch': 0.2}
20%|██ | 4419/22095 [7:26:23<16:40:30, 3.40s/it] {'loss': 0.4037, 'grad_norm': 0.6908034332242652, 'learning_rate': 9.261514954376487e-06, 'epoch': 0.2}
20%|██ | 4420/22095 [7:26:26<16:09:45, 3.29s/it] {'loss': 0.4285, 'grad_norm': 0.6393519206087547, 'learning_rate': 9.261131555440295e-06, 'epoch': 0.2}
20%|██ | 4421/22095 [7:26:29<15:23:12, 3.13s/it] {'loss': 0.3904, 'grad_norm': 0.7543033606632701, 'learning_rate': 9.260748064945473e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (46319 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104627 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4422/22095 [7:26:32<16:11:18, 3.30s/it] {'loss': 0.4251, 'grad_norm': 0.8441036963952208, 'learning_rate': 9.26036448290026e-06, 'epoch': 0.2}
20%|██ | 4423/22095 [7:26:35<15:36:08, 3.18s/it] {'loss': 0.3955, 'grad_norm': 0.7412155904408332, 'learning_rate': 9.259980809312901e-06, 'epoch': 0.2}
20%|██ | 4424/22095 [7:26:38<15:28:42, 3.15s/it] {'loss': 0.3939, 'grad_norm': 0.7085212725378105, 'learning_rate': 9.259597044191635e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4425/22095 [7:26:46<22:34:57, 4.60s/it] {'loss': 0.6369, 'grad_norm': 3.2353281721630096, 'learning_rate': 9.259213187544714e-06, 'epoch': 0.2}
20%|██ | 4426/22095 [7:26:54<26:39:45, 5.43s/it] {'loss': 0.5755, 'grad_norm': 1.81014429072996, 'learning_rate': 9.25882923938038e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 364, but got module 1
20%|██ | 4427/22095 [7:26:57<23:37:16, 4.81s/it] {'loss': 0.4165, 'grad_norm': 0.9136144605035063, 'learning_rate': 9.25844519970689e-06, 'epoch': 0.2}
20%|██ | 4428/22095 [7:27:00<21:25:05, 4.36s/it] {'loss': 0.4422, 'grad_norm': 1.02823154637771, 'learning_rate': 9.258061068532487e-06, 'epoch': 0.2}
20%|██ | 4429/22095 [7:27:04<20:43:14, 4.22s/it] {'loss': 0.4443, 'grad_norm': 1.1601240780768014, 'learning_rate': 9.257676845865431e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4430/22095 [7:27:14<28:31:38, 5.81s/it] {'loss': 0.5761, 'grad_norm': 2.571160437651429, 'learning_rate': 9.257292531713977e-06, 'epoch': 0.2}
20%|██ | 4431/22095 [7:27:17<25:19:10, 5.16s/it] {'loss': 0.4099, 'grad_norm': 0.8266987002197626, 'learning_rate': 9.25690812608638e-06, 'epoch': 0.2}
20%|██ | 4432/22095 [7:27:21<22:40:41, 4.62s/it] {'loss': 0.3916, 'grad_norm': 0.7510003168521713, 'learning_rate': 9.256523628990903e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (44088 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42025 > 40960) for 4 sample(s). Truncating to 38746 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (88515 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4433/22095 [7:27:24<20:19:49, 4.14s/it] {'loss': 0.4012, 'grad_norm': 0.7427283363249146, 'learning_rate': 9.256139040435806e-06, 'epoch': 0.2}
20%|██ | 4434/22095 [7:27:27<19:27:29, 3.97s/it] {'loss': 0.3643, 'grad_norm': 0.8078106575136649, 'learning_rate': 9.255754360429353e-06, 'epoch': 0.2}
20%|██ | 4435/22095 [7:27:30<17:51:05, 3.64s/it] {'loss': 0.3855, 'grad_norm': 0.8547619551004447, 'learning_rate': 9.255369588979806e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (56467 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52836 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78506 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93217 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57800 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4436/22095 [7:27:33<16:50:09, 3.43s/it] {'loss': 0.3987, 'grad_norm': 0.7763954730463974, 'learning_rate': 9.25498472609544e-06, 'epoch': 0.2}
20%|██ | 4437/22095 [7:27:37<16:52:07, 3.44s/it] {'loss': 0.4286, 'grad_norm': 0.8427741534604029, 'learning_rate': 9.254599771784519e-06, 'epoch': 0.2}
20%|██ | 4438/22095 [7:27:40<17:12:59, 3.51s/it] {'loss': 0.3951, 'grad_norm': 0.7836133705709749, 'learning_rate': 9.254214726055314e-06, 'epoch': 0.2}
20%|██ | 4439/22095 [7:27:43<16:18:27, 3.33s/it] {'loss': 0.4004, 'grad_norm': 0.7298023584383054, 'learning_rate': 9.253829588916103e-06, 'epoch': 0.2}
20%|██ | 4440/22095 [7:27:46<16:21:14, 3.33s/it] {'loss': 0.417, 'grad_norm': 0.665970725982351, 'learning_rate': 9.253444360375157e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4441/22095 [7:27:56<25:19:17, 5.16s/it] {'loss': 0.6065, 'grad_norm': 1.531114314437859, 'learning_rate': 9.253059040440757e-06, 'epoch': 0.2}
20%|██ | 4442/22095 [7:28:00<23:05:55, 4.71s/it] {'loss': 0.4067, 'grad_norm': 0.7515967029741009, 'learning_rate': 9.25267362912118e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (51500 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76403 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4443/22095 [7:28:03<20:51:30, 4.25s/it] {'loss': 0.3985, 'grad_norm': 0.6829143827997587, 'learning_rate': 9.252288126424707e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4444/22095 [7:28:12<28:52:04, 5.89s/it] {'loss': 0.5895, 'grad_norm': 1.1010757231316262, 'learning_rate': 9.251902532359622e-06, 'epoch': 0.2}
20%|██ | 4445/22095 [7:28:16<25:32:49, 5.21s/it] {'loss': 0.3961, 'grad_norm': 0.7326225536696734, 'learning_rate': 9.25151684693421e-06, 'epoch': 0.2}
20%|██ | 4446/22095 [7:28:20<23:27:33, 4.79s/it] {'loss': 0.4333, 'grad_norm': 0.7448227710158162, 'learning_rate': 9.251131070156761e-06, 'epoch': 0.2}
20%|██ | 4447/22095 [7:28:23<20:48:49, 4.25s/it] {'loss': 0.4081, 'grad_norm': 0.734666576540432, 'learning_rate': 9.250745202035558e-06, 'epoch': 0.2}
20%|██ | 4448/22095 [7:28:26<18:38:16, 3.80s/it] {'loss': 0.3982, 'grad_norm': 0.6936139236742962, 'learning_rate': 9.250359242578898e-06, 'epoch': 0.2}
20%|██ | 4449/22095 [7:28:29<17:57:14, 3.66s/it] {'loss': 0.3906, 'grad_norm': 0.7448189175784459, 'learning_rate': 9.249973191795072e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (60199 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65944 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83391 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79742 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96257 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130766 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4450/22095 [7:28:32<17:34:00, 3.58s/it] {'loss': 0.436, 'grad_norm': 0.7389504242537271, 'learning_rate': 9.249587049692375e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (64416 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69809 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109007 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94419 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77902 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4451/22095 [7:28:37<19:40:41, 4.02s/it] {'loss': 0.5595, 'grad_norm': 0.8361161553792722, 'learning_rate': 9.249200816279103e-06, 'epoch': 0.2}
20%|██ | 4452/22095 [7:28:41<19:46:18, 4.03s/it] {'loss': 0.3676, 'grad_norm': 0.7152862153144932, 'learning_rate': 9.248814491563555e-06, 'epoch': 0.2}
20%|██ | 4453/22095 [7:28:45<18:46:52, 3.83s/it] {'loss': 0.3462, 'grad_norm': 0.6533846117754827, 'learning_rate': 9.248428075554034e-06, 'epoch': 0.2}
20%|██ | 4454/22095 [7:28:48<18:19:16, 3.74s/it] {'loss': 0.4081, 'grad_norm': 0.8677187836764106, 'learning_rate': 9.248041568258843e-06, 'epoch': 0.2}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
20%|██ | 4455/22095 [7:28:52<18:14:07, 3.72s/it] {'loss': 0.4266, 'grad_norm': 0.6707462106808738, 'learning_rate': 9.247654969686283e-06, 'epoch': 0.2}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396977 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63830, 'image': 'vrdu_table_final_2/astro-ph.EP/c8c49818-fe3d-4125-9529-7f5c89de1737.png', 'image_wh': [[17, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$\\omega$\\end{tabular}\n```"}]}
20%|██ | 4456/22095 [7:28:55<17:01:48, 3.48s/it] {'loss': 0.41, 'grad_norm': 0.672407091659003, 'learning_rate': 9.247268279844666e-06, 'epoch': 0.2}
20%|██ | 4457/22095 [7:28:58<17:00:47, 3.47s/it] {'loss': 0.4167, 'grad_norm': 0.6815229751795202, 'learning_rate': 9.246881498742296e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4458/22095 [7:29:08<25:46:58, 5.26s/it] {'loss': 0.5093, 'grad_norm': 0.5479951855336056, 'learning_rate': 9.246494626387487e-06, 'epoch': 0.2}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
20%|██ | 4459/22095 [7:29:17<31:53:08, 6.51s/it] {'loss': 0.5098, 'grad_norm': 0.5161973718916856, 'learning_rate': 9.24610766278855e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882175 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5328, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
20%|██ | 4460/22095 [7:29:21<27:18:15, 5.57s/it] {'loss': 0.3833, 'grad_norm': 0.850925537761303, 'learning_rate': 9.245720607953802e-06, 'epoch': 0.2}
20%|██ | 4461/22095 [7:29:24<24:43:42, 5.05s/it] {'loss': 0.3858, 'grad_norm': 0.6542363235875374, 'learning_rate': 9.245333461891555e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (50413 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90861 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42205 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4462/22095 [7:29:27<21:43:52, 4.44s/it] {'loss': 0.373, 'grad_norm': 0.6973756715606783, 'learning_rate': 9.244946224610132e-06, 'epoch': 0.2}
20%|██ | 4463/22095 [7:29:31<21:01:40, 4.29s/it] {'loss': 0.4349, 'grad_norm': 0.8079341217942448, 'learning_rate': 9.244558896117852e-06, 'epoch': 0.2}
20%|██ | 4464/22095 [7:29:34<19:08:00, 3.91s/it] {'loss': 0.4003, 'grad_norm': 0.681137135301652, 'learning_rate': 9.244171476423037e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (54771 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60032 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4465/22095 [7:29:37<17:41:06, 3.61s/it] {'loss': 0.4466, 'grad_norm': 0.7047389289061264, 'learning_rate': 9.243783965534012e-06, 'epoch': 0.2}
20%|██ | 4466/22095 [7:29:40<16:25:00, 3.35s/it] {'loss': 0.3839, 'grad_norm': 0.7248341836247567, 'learning_rate': 9.243396363459104e-06, 'epoch': 0.2}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
20%|██ | 4467/22095 [7:29:44<16:45:51, 3.42s/it] {'loss': 0.4275, 'grad_norm': 0.6816766720407302, 'learning_rate': 9.24300867020664e-06, 'epoch': 0.2}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
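The many `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` lines are the standard Hugging Face tokenizer warning: they fire at tokenization time, and the `Rank 0: ... Truncating to ... with N samples.` messages show the trainer then dropping whole samples from the packed sequence until it fits the 40960-token window. A hedged sketch of that packing guard, consistent with the truncation messages; `truncate_packed_samples` is an illustrative name, not the repo's actual function:

```python
# Hedged sketch: given per-sample token counts in a packed batch, drop
# trailing samples until the total fits the context window. Illustrative only.
MAX_SEQ_LEN = 40960  # model_max_length seen in the log

def truncate_packed_samples(sample_lengths, max_len=MAX_SEQ_LEN):
    """Return the longest prefix of samples whose total token count fits max_len."""
    kept = list(sample_lengths)
    while kept and sum(kept) > max_len:
        kept.pop()  # drop the last sample of the pack
    return kept
```

Mirroring the log's `(42025 > 40960) for 4 sample(s). Truncating to 38746 with 3 samples.`, a pack slightly over the limit loses one sample and the rest train normally; the per-sequence tokenizer warnings themselves are harmless as long as this guard runs before the batch reaches the model.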
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
20%|██ | 4468/22095 [7:29:47<16:29:33, 3.37s/it] {'loss': 0.3764, 'grad_norm': 0.6565503559461989, 'learning_rate': 9.242620885784952e-06, 'epoch': 0.2}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398601 in VC:s3://internvl-moe-sft-data/. Exception: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 753, 'image': 'vrdu_table_final_2/astro-ph.CO/a220553f-87b5-4ed1-9057-0a100f41724d.png', 'image_wh': [[75, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{l}3C286\\end{tabular}\n```"}]}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (106300000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
20%|██ | 4469/22095 [7:29:50<15:42:06, 3.21s/it] {'loss': 0.3909, 'grad_norm': 0.7332199448787484, 'learning_rate': 9.242233010202371e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/37949.png 2025-08-27 23:27:45.707156 load time: 1794.88 ms
20%|██ | 4470/22095 [7:29:59<24:49:27, 5.07s/it] {'loss': 0.5072, 'grad_norm': 0.6763839124000454, 'learning_rate': 9.241845043467232e-06, 'epoch': 0.2}
20%|██ | 4471/22095 [7:30:03<22:42:10, 4.64s/it] {'loss': 0.3975, 'grad_norm': 0.691110792809285, 'learning_rate': 9.241456985587868e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (48164 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68276 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63514 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4472/22095 [7:30:06<20:57:41, 4.28s/it] {'loss': 0.4609, 'grad_norm': 0.6932678792221445, 'learning_rate': 9.241068836572623e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4473/22095 [7:30:14<26:06:07, 5.33s/it] {'loss': 0.495, 'grad_norm': 0.44079329807617135, 'learning_rate': 9.240680596429833e-06, 'epoch': 0.2}
20%|██ | 4474/22095 [7:30:17<22:57:24, 4.69s/it] {'loss': 0.4043, 'grad_norm': 0.655790544577791, 'learning_rate': 9.240292265167843e-06, 'epoch': 0.2}
20%|██ | 4475/22095 [7:30:21<21:12:19, 4.33s/it] {'loss': 0.361, 'grad_norm': 0.7597126537185416, 'learning_rate': 9.239903842794995e-06, 'epoch': 0.2}
20%|██ | 4476/22095 [7:30:24<19:44:36, 4.03s/it] {'loss': 0.4059, 'grad_norm': 0.6928027687811528, 'learning_rate': 9.239515329319633e-06, 'epoch': 0.2}
20%|██ | 4477/22095 [7:30:28<19:01:48, 3.89s/it] {'loss': 0.3775, 'grad_norm': 0.6265742356763842, 'learning_rate': 9.23912672475011e-06, 'epoch': 0.2}
20%|██ | 4478/22095 [7:30:31<18:33:27, 3.79s/it] {'loss': 0.3929, 'grad_norm': 0.6250366573033741, 'learning_rate': 9.238738029094771e-06, 'epoch': 0.2}
20%|██ | 4479/22095 [7:30:34<17:50:04, 3.64s/it] {'loss': 0.3946, 'grad_norm': 0.6239194232238193, 'learning_rate': 9.238349242361971e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (83451 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91624 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4480/22095 [7:30:44<26:48:51, 5.48s/it] {'loss': 0.5188, 'grad_norm': 0.49157297510570386, 'learning_rate': 9.237960364560063e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (50913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61755 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (139797 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45025 > 40960) for 4 sample(s). Truncating to 40559 with 2 samples.
20%|██ | 4481/22095 [7:30:48<24:26:06, 4.99s/it] {'loss': 0.3835, 'grad_norm': 0.7051363283191563, 'learning_rate': 9.237571395697403e-06, 'epoch': 0.2}
20%|██ | 4482/22095 [7:30:52<23:10:13, 4.74s/it] {'loss': 0.4022, 'grad_norm': 0.6660464023096989, 'learning_rate': 9.237182335782347e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (58854 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4483/22095 [7:30:56<21:09:36, 4.33s/it] {'loss': 0.397, 'grad_norm': 0.6843714863796969, 'learning_rate': 9.236793184823257e-06, 'epoch': 0.2}
20%|██ | 4484/22095 [7:30:58<19:01:36, 3.89s/it] {'loss': 0.3815, 'grad_norm': 0.6855307725350448, 'learning_rate': 9.236403942828494e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (53761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58880 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4485/22095 [7:31:01<17:40:58, 3.61s/it] {'loss': 0.3846, 'grad_norm': 0.7079043018016902, 'learning_rate': 9.236014609806421e-06, 'epoch': 0.2}
20%|██ | 4486/22095 [7:31:05<17:56:09, 3.67s/it] {'loss': 0.3855, 'grad_norm': 0.6504384598720968, 'learning_rate': 9.235625185765403e-06, 'epoch': 0.2}
20%|██ | 4487/22095 [7:31:08<16:41:59, 3.41s/it] {'loss': 0.3779, 'grad_norm': 0.6534159175098676, 'learning_rate': 9.235235670713808e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
20%|██ | 4488/22095 [7:31:18<25:32:29, 5.22s/it] {'loss': 0.5256, 'grad_norm': 0.41340276621194005, 'learning_rate': 9.23484606466001e-06, 'epoch': 0.2}
20%|██ | 4489/22095 [7:31:27<31:38:17, 6.47s/it] {'loss': 0.5193, 'grad_norm': 0.3317961618649806, 'learning_rate': 9.234456367612373e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 364, but got module 1
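The `DecompressionBombWarning` entries are Pillow flagging images whose pixel count exceeds its default `Image.MAX_IMAGE_PIXELS` limit of 89478485; above twice that limit Pillow raises a `DecompressionBombError` instead of warning. If the oversized scans here (~96M and ~106M pixels) are trusted training data, raising `PIL.Image.MAX_IMAGE_PIXELS` silences the warning. A small pure-Python sketch of the threshold logic (the helper name `bomb_status` is illustrative; only the 89478485 default and the 2x error threshold are Pillow's documented behavior):

```python
# Hedged sketch of Pillow's decompression-bomb thresholds. Illustrative only;
# in real code you would set PIL.Image.MAX_IMAGE_PIXELS instead.
PIL_DEFAULT_MAX_PIXELS = 89_478_485  # the limit quoted in the log

def bomb_status(width, height, limit=PIL_DEFAULT_MAX_PIXELS):
    """Classify an image as 'ok', 'warning' (over limit), or 'error' (over 2x limit)."""
    pixels = width * height
    if pixels > 2 * limit:
        return "error"
    if pixels > limit:
        return "warning"
    return "ok"
```

Both log entries (96630000 and 106300000 pixels) fall in the warn-only band, so loading proceeded; the images would have had to exceed roughly 179M pixels before Pillow refused them outright.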
20%|██ | 4490/22095 [7:31:30<27:19:47, 5.59s/it] {'loss': 0.396, 'grad_norm': 0.7802740604366346, 'learning_rate': 9.234066579579274e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (72809 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113321 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4491/22095 [7:31:37<29:27:52, 6.03s/it] {'loss': 0.465, 'grad_norm': 0.3090938209094079, 'learning_rate': 9.23367670056909e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 364, but got module 1
20%|██ | 4492/22095 [7:31:42<26:40:38, 5.46s/it] {'loss': 0.4396, 'grad_norm': 0.7517899148301332, 'learning_rate': 9.233286730590195e-06, 'epoch': 0.2}
20%|██ | 4493/22095 [7:31:49<29:58:01, 6.13s/it] {'loss': 0.5304, 'grad_norm': 0.3312876729542182, 'learning_rate': 9.23289666965097e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 364, but got module 1
20%|██ | 4494/22095 [7:31:53<26:43:45, 5.47s/it] {'loss': 0.3931, 'grad_norm': 0.7427229107966618, 'learning_rate': 9.232506517759797e-06, 'epoch': 0.2}
20%|██ | 4495/22095 [7:31:57<24:32:08, 5.02s/it] {'loss': 0.4469, 'grad_norm': 0.6789139552402076, 'learning_rate': 9.232116274925056e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (45139 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88810 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4496/22095 [7:32:01<22:58:43, 4.70s/it] {'loss': 0.4291, 'grad_norm': 0.7699554131664681, 'learning_rate': 9.231725941155133e-06, 'epoch': 0.2}
20%|██ | 4497/22095 [7:32:04<20:17:35, 4.15s/it] {'loss': 0.3907, 'grad_norm': 0.6974959924396135, 'learning_rate': 9.231335516458419e-06, 'epoch': 0.2}
Token indices sequence length is longer than the specified maximum sequence length for this model (61709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43490 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47558 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90171 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63889 > 40960). Running this sequence through the model will result in indexing errors
20%|██ | 4498/22095 [7:32:07<18:44:26, 3.83s/it] {'loss': 0.445, 'grad_norm': 0.6737647687437592, 'learning_rate': 9.2309450008433e-06, 'epoch': 0.2}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8916666 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39819, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nA. 2\nB. 0.5\nC. 1\nD. 
1.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 20%|██ | 4499/22095 [7:32:12<20:38:28, 4.22s/it] {'loss': 0.4961, 'grad_norm': 0.4044279854670894, 'learning_rate': 9.230554394318167e-06, 'epoch': 0.2} 20%|██ | 4499/22095 [7:32:12<20:38:28, 4.22s/it] 20%|██ | 4500/22095 [7:32:16<19:53:27, 4.07s/it] {'loss': 0.3862, 'grad_norm': 0.844669490129284, 'learning_rate': 9.230163696891415e-06, 'epoch': 0.2} 20%|██ | 4500/22095 [7:32:16<19:53:27, 4.07s/it] 20%|██ | 4501/22095 [7:32:20<19:21:27, 3.96s/it] {'loss': 0.3812, 'grad_norm': 0.7059403482053921, 'learning_rate': 9.229772908571435e-06, 'epoch': 0.2} 20%|██ | 4501/22095 [7:32:20<19:21:27, 3.96s/it] 20%|██ | 4502/22095 [7:32:23<18:20:07, 3.75s/it] {'loss': 0.382, 'grad_norm': 0.6133336353944165, 'learning_rate': 9.229382029366625e-06, 'epoch': 0.2} 20%|██ | 4502/22095 [7:32:23<18:20:07, 3.75s/it] 20%|██ | 4503/22095 [7:32:27<18:49:59, 3.85s/it] {'loss': 0.3902, 'grad_norm': 0.6774167500015552, 'learning_rate': 9.228991059285387e-06, 'epoch': 0.2} 20%|██ | 4503/22095 [7:32:27<18:49:59, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|██ | 4504/22095 [7:32:34<23:47:28, 4.87s/it] {'loss': 0.5226, 'grad_norm': 0.32660098914735397, 'learning_rate': 9.228599998336119e-06, 'epoch': 0.2} 20%|██ | 4504/22095 [7:32:34<23:47:28, 4.87s/it] 20%|██ | 4505/22095 [7:32:38<22:16:53, 4.56s/it] {'loss': 0.4481, 'grad_norm': 0.8621953064122951, 'learning_rate': 9.228208846527222e-06, 'epoch': 0.2} 20%|██ | 4505/22095 [7:32:38<22:16:53, 4.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|██ | 4506/22095 [7:32:47<29:21:11, 6.01s/it] {'loss': 0.521, 'grad_norm': 0.31618272426187605, 'learning_rate': 9.227817603867106e-06, 'epoch': 0.2} 20%|██ | 4506/22095 [7:32:47<29:21:11, 6.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50124 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46342 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54803 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72538 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101063 > 40960). Running this sequence through the model will result in indexing errors 20%|██ | 4507/22095 [7:32:52<27:06:34, 5.55s/it] {'loss': 0.408, 'grad_norm': 0.6973370488241651, 'learning_rate': 9.227426270364172e-06, 'epoch': 0.2} 20%|██ | 4507/22095 [7:32:52<27:06:34, 5.55s/it] 20%|██ | 4508/22095 [7:32:56<25:13:43, 5.16s/it] {'loss': 0.4008, 'grad_norm': 0.7224976157253408, 'learning_rate': 9.227034846026833e-06, 'epoch': 0.2} 20%|██ | 4508/22095 [7:32:56<25:13:43, 5.16s/it] 20%|██ | 4509/22095 [7:32:59<22:20:32, 4.57s/it] {'loss': 0.4123, 'grad_norm': 0.860177658259118, 'learning_rate': 9.226643330863497e-06, 'epoch': 0.2} 20%|██ | 4509/22095 [7:32:59<22:20:32, 4.57s/it] 20%|██ | 4510/22095 [7:33:03<21:35:51, 4.42s/it] {'loss': 0.4213, 'grad_norm': 0.9113439419672582, 'learning_rate': 9.226251724882576e-06, 'epoch': 0.2} 20%|██ | 4510/22095 [7:33:04<21:35:51, 4.42s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39078.png 2025-08-27 23:30:59.258370 load time: 2073.87 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4511/22095 [7:33:06<19:09:30, 3.92s/it] {'loss': 0.3931, 'grad_norm': 0.6339986185921339, 'learning_rate': 
9.225860028092486e-06, 'epoch': 0.2} 20%|██ | 4511/22095 [7:33:06<19:09:30, 3.92s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item raise ValueError( ValueError: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None [Try #0] Failed to fetch sample 1057917 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. Exception: Number of image tokens ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'] does not match number of images None Problematic sample: {'image': ['63eeb12752a6426abeb129b8049d5bddstep20.png', '63eeb12752a6426abeb129b8049d5bddstep21.png'], 'conversations': [{'from': 'human', 'value': "\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nI want to book a hotel in london, prize should be less than $600, guest rating is 8+, 4 star rating, breakfast included\n\nPrevious operations:\nStep 1: Tap on the Chrome app to start searching for hotels in London.\nStep 2: Type 'Chrome' in the search bar to open the Chrome browser.\nStep 3: Tap on the Hotels.com app icon to begin searching for a hotel.\nStep 4: Wait for the Hotels.com app to finish loading to proceed with the hotel search.\nStep 5: Tap on the 'Going to' field to enter London as the destination.\nStep 6: Tap on 'London' from Recent Searches to select it as the destination for the hotel search.\nStep 7: Tap on 'London, England, United Kingdom' to select it as the destination for the hotel search.\nStep 8: Tap on the 'Search' button to view available hotels in London.\nStep 9: Tap on the filter icon or option to 
apply filters to the hotel search results.\nStep 10: Swipe up on the screen to reveal more filter options, such as guest rating and hotel stars.\nStep 11: Swipe up to reveal more filter options, specifically guest ratings and hotel stars.\nStep 12: Tap on the 4-star rating box to select it.\nStep 13: Tap 'Done' to confirm the current filter selections and proceed.\nStep 14: Swipe up on the screen to locate and select the 'breakfast included' filter option.\nStep 15: Tap on 'Breakfast included' to select it.\nStep 16: Tap on the 'Done' button to apply the filters and view available hotel options.\nStep 17: Tap on the topmost hotel card to view its details.\nStep 18: Tap on the 'Select a room' button to check available room options and their prices.\nStep 19: Tap on 'Select and customize' to proceed with the room selection and customization options.\nStep 20: Tap on 'Breakfast buffet' to select it, then tap on 'Reserve' to book the room."}, {'from': 'gpt', 'value': "\nWait for the loading to complete to confirm the booking status.\n\n\nterminate(status='success')\n"}]} 20%|██ | 4512/22095 [7:33:09<17:48:17, 3.65s/it] {'loss': 0.3616, 'grad_norm': 0.6893893577078529, 'learning_rate': 9.225468240501643e-06, 'epoch': 0.2} 20%|██ | 4512/22095 [7:33:09<17:48:17, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4513/22095 [7:33:12<16:31:51, 3.38s/it] {'loss': 0.4093, 'grad_norm': 0.7505527891331154, 'learning_rate': 9.225076362118464e-06, 'epoch': 0.2} 20%|██ | 4513/22095 [7:33:12<16:31:51, 3.38s/it] 20%|██ | 4514/22095 [7:33:15<15:46:04, 3.23s/it] {'loss': 0.3789, 'grad_norm': 0.6566878964422919, 'learning_rate': 9.22468439295137e-06, 'epoch': 0.2} 20%|██ | 4514/22095 [7:33:15<15:46:04, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 20%|██ | 4515/22095 [7:33:26<26:45:01, 5.48s/it] {'loss': 0.5064, 'grad_norm': 0.4184513712004044, 'learning_rate': 
9.224292333008785e-06, 'epoch': 0.2} 20%|██ | 4515/22095 [7:33:26<26:45:01, 5.48s/it] 20%|██ | 4516/22095 [7:33:29<23:42:07, 4.85s/it] {'loss': 0.3671, 'grad_norm': 0.6435411751541094, 'learning_rate': 9.223900182299132e-06, 'epoch': 0.2} 20%|██ | 4516/22095 [7:33:29<23:42:07, 4.85s/it] 20%|██ | 4517/22095 [7:33:32<20:42:57, 4.24s/it] {'loss': 0.3805, 'grad_norm': 0.7608819934526175, 'learning_rate': 9.223507940830836e-06, 'epoch': 0.2} 20%|██ | 4517/22095 [7:33:32<20:42:57, 4.24s/it] 20%|██ | 4518/22095 [7:33:36<20:34:09, 4.21s/it] {'loss': 0.3975, 'grad_norm': 0.7234395427513599, 'learning_rate': 9.223115608612325e-06, 'epoch': 0.2} 20%|██ | 4518/22095 [7:33:36<20:34:09, 4.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8949352 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 187, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 16cm\nB. 10cm\nC. 5cm\nD. 
15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 20%|██ | 4519/22095 [7:33:39<19:03:26, 3.90s/it] {'loss': 0.3955, 'grad_norm': 0.6773361232917884, 'learning_rate': 9.222723185652031e-06, 'epoch': 0.2} 20%|██ | 4519/22095 [7:33:39<19:03:26, 3.90s/it] 20%|██ | 4520/22095 [7:33:43<18:23:03, 3.77s/it] {'loss': 0.3876, 'grad_norm': 0.7176545433153804, 'learning_rate': 9.222330671958385e-06, 'epoch': 0.2} 20%|██ | 4520/22095 [7:33:43<18:23:03, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4521/22095 [7:33:47<18:45:35, 3.84s/it] {'loss': 0.4109, 'grad_norm': 0.6506277456232169, 'learning_rate': 9.22193806753982e-06, 'epoch': 0.2} 20%|██ | 4521/22095 [7:33:47<18:45:35, 3.84s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4522/22095 [7:33:51<19:02:09, 3.90s/it] {'loss': 0.3546, 'grad_norm': 0.6114119716823015, 'learning_rate': 9.221545372404774e-06, 'epoch': 0.2} 20%|██ | 4522/22095 [7:33:51<19:02:09, 3.90s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4523/22095 [7:33:54<17:35:15, 3.60s/it] {'loss': 0.384, 'grad_norm': 0.6235199644185562, 'learning_rate': 9.22115258656168e-06, 'epoch': 0.2} 20%|██ | 4523/22095 [7:33:54<17:35:15, 3.60s/it] 20%|██ | 4524/22095 [7:33:58<18:07:40, 3.71s/it] {'loss': 0.3897, 'grad_norm': 0.7221677461499507, 'learning_rate': 9.220759710018984e-06, 'epoch': 0.2} 20%|██ | 4524/22095 [7:33:58<18:07:40, 3.71s/it] 20%|██ | 4525/22095 [7:34:00<16:49:41, 3.45s/it] {'loss': 0.4044, 'grad_norm': 0.6620027763210206, 'learning_rate': 9.220366742785126e-06, 'epoch': 0.2} 20%|██ | 4525/22095 [7:34:00<16:49:41, 3.45s/it]Token indices sequence length 
is longer than the specified maximum sequence length for this model (54773 > 40960). Running this sequence through the model will result in indexing errors 20%|██ | 4526/22095 [7:34:03<16:08:24, 3.31s/it] {'loss': 0.387, 'grad_norm': 0.6396484049106496, 'learning_rate': 9.219973684868546e-06, 'epoch': 0.2} 20%|██ | 4526/22095 [7:34:03<16:08:24, 3.31s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [487, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8450138 in VC:s3://internvl-moe-sft-data/. Exception: Image size [487, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4962, 'image': 'vrdu_texteq/astro-ph.CO/382b87cf-cf6d-4c68-917f-ed98293e7a7f.png', 'image_wh': [[487, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'while for values smaller than $I_c$ we have'}]} 20%|██ | 4527/22095 [7:34:07<16:03:21, 3.29s/it] {'loss': 0.3711, 'grad_norm': 0.6500113781338169, 'learning_rate': 9.219580536277693e-06, 'epoch': 0.2} 20%|██ | 4527/22095 [7:34:07<16:03:21, 3.29s/it] 20%|██ | 4528/22095 [7:34:10<16:20:25, 3.35s/it] {'loss': 0.3726, 'grad_norm': 0.8404289788973495, 'learning_rate': 9.219187297021015e-06, 'epoch': 0.2} 20%|██ | 4528/22095 [7:34:10<16:20:25, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 20%|██ | 4529/22095 [7:34:13<15:42:52, 3.22s/it] {'loss': 0.4157, 'grad_norm': 0.6387752010088015, 'learning_rate': 9.218793967106959e-06, 'epoch': 0.2} 20%|██ | 4529/22095 
[7:34:13<15:42:52, 3.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41616 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47855 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42403 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79681 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4530/22095 [7:34:16<15:05:40, 3.09s/it] {'loss': 0.366, 'grad_norm': 0.630614010902661, 'learning_rate': 9.218400546543977e-06, 'epoch': 0.21} 21%|██ | 4530/22095 [7:34:16<15:05:40, 3.09s/it] 21%|██ | 4531/22095 [7:34:19<14:52:52, 3.05s/it] {'loss': 0.4018, 'grad_norm': 0.6171563864543054, 'learning_rate': 9.218007035340525e-06, 'epoch': 0.21} 21%|██ | 4531/22095 [7:34:19<14:52:52, 3.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99305 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4532/22095 [7:34:22<14:42:10, 3.01s/it] {'loss': 0.415, 'grad_norm': 0.6403875839699732, 'learning_rate': 9.217613433505056e-06, 'epoch': 0.21} 21%|██ | 4532/22095 [7:34:22<14:42:10, 3.01s/it] 21%|██ | 4533/22095 [7:34:25<14:55:30, 3.06s/it] {'loss': 0.4413, 'grad_norm': 0.650257811916877, 'learning_rate': 9.217219741046026e-06, 'epoch': 0.21} 21%|██ | 4533/22095 [7:34:25<14:55:30, 3.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42345 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47625 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4534/22095 [7:34:31<19:29:16, 4.00s/it] {'loss': 0.4794, 'grad_norm': 0.38375700790017325, 'learning_rate': 9.216825957971898e-06, 'epoch': 0.21} 21%|██ | 4534/22095 [7:34:31<19:29:16, 4.00s/it] 21%|██ | 4535/22095 [7:34:34<18:31:47, 3.80s/it] {'loss': 0.3696, 'grad_norm': 0.6478140522618808, 'learning_rate': 9.21643208429113e-06, 'epoch': 0.21} 21%|██ | 4535/22095 [7:34:34<18:31:47, 3.80s/it] 21%|██ | 4536/22095 [7:34:37<17:10:57, 3.52s/it] {'loss': 0.4065, 'grad_norm': 0.6876090924588174, 'learning_rate': 9.216038120012187e-06, 'epoch': 0.21} 21%|██ | 4536/22095 [7:34:37<17:10:57, 3.52s/it] 21%|██ | 4537/22095 [7:34:40<16:15:38, 3.33s/it] {'loss': 0.4241, 'grad_norm': 0.7690946772442716, 'learning_rate': 9.215644065143533e-06, 'epoch': 0.21} 21%|██ | 4537/22095 [7:34:40<16:15:38, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4538/22095 [7:34:50<25:14:35, 5.18s/it] {'loss': 0.4742, 'grad_norm': 0.3022901550826388, 'learning_rate': 9.215249919693634e-06, 'epoch': 0.21} 21%|██ | 4538/22095 [7:34:50<25:14:35, 5.18s/it] 21%|██ | 4539/22095 [7:34:53<23:01:30, 4.72s/it] {'loss': 0.4444, 'grad_norm': 0.7201790266563611, 'learning_rate': 9.214855683670962e-06, 'epoch': 0.21} 21%|██ | 4539/22095 [7:34:53<23:01:30, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4540/22095 [7:35:02<28:26:01, 5.83s/it] {'loss': 0.5113, 'grad_norm': 0.3224356216160571, 'learning_rate': 9.214461357083986e-06, 'epoch': 0.21} 21%|██ | 4540/22095 [7:35:02<28:26:01, 5.83s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (54519 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44991 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4541/22095 [7:35:06<26:13:43, 5.38s/it] {'loss': 0.4504, 'grad_norm': 0.6949901984886445, 'learning_rate': 9.21406693994118e-06, 'epoch': 0.21} 21%|██ | 4541/22095 [7:35:06<26:13:43, 5.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4542/22095 [7:35:14<29:53:18, 6.13s/it] {'loss': 0.5201, 'grad_norm': 0.2908178295108323, 'learning_rate': 9.213672432251016e-06, 'epoch': 0.21} 21%|██ | 4542/22095 [7:35:14<29:53:18, 6.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65181 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81913 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49836 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4543/22095 [7:35:17<25:31:25, 5.24s/it] {'loss': 0.3505, 'grad_norm': 0.6020446632402668, 'learning_rate': 9.213277834021975e-06, 'epoch': 0.21} 21%|██ | 4543/22095 [7:35:17<25:31:25, 5.24s/it] 21%|██ | 4544/22095 [7:35:21<23:47:27, 4.88s/it] {'loss': 0.4143, 'grad_norm': 0.6681811347078651, 'learning_rate': 9.212883145262532e-06, 'epoch': 0.21} 21%|██ | 4544/22095 [7:35:21<23:47:27, 4.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48969 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57731 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4545/22095 [7:35:25<21:54:05, 4.49s/it] {'loss': 0.3621, 'grad_norm': 0.5824281917974231, 'learning_rate': 9.212488365981169e-06, 'epoch': 0.21} 21%|██ | 4545/22095 [7:35:25<21:54:05, 4.49s/it] 21%|██ | 4546/22095 [7:35:29<21:15:49, 4.36s/it] {'loss': 0.408, 'grad_norm': 0.6887703737480376, 'learning_rate': 9.21209349618637e-06, 'epoch': 0.21} 21%|██ | 4546/22095 [7:35:29<21:15:49, 4.36s/it] 21%|██ | 4547/22095 [7:35:32<19:23:24, 3.98s/it] {'loss': 0.4061, 'grad_norm': 0.6825994800894678, 'learning_rate': 9.211698535886617e-06, 'epoch': 0.21} 21%|██ | 4547/22095 [7:35:32<19:23:24, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57036 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55584 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54286 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43123 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106866 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4548/22095 [7:35:35<18:47:18, 3.85s/it] {'loss': 0.3746, 'grad_norm': 0.6205959778563953, 'learning_rate': 9.211303485090396e-06, 'epoch': 0.21} 21%|██ | 4548/22095 [7:35:35<18:47:18, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (52997 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4549/22095 [7:35:44<26:17:21, 5.39s/it] {'loss': 0.5068, 'grad_norm': 0.3568690622234325, 'learning_rate': 9.210908343806201e-06, 'epoch': 0.21} 21%|██ | 4549/22095 [7:35:44<26:17:21, 5.39s/it] 21%|██ | 4550/22095 [7:35:48<23:48:24, 4.88s/it] {'loss': 0.4337, 'grad_norm': 0.7072278527888455, 'learning_rate': 9.210513112042516e-06, 'epoch': 0.21} 21%|██ | 4550/22095 [7:35:48<23:48:24, 4.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4551/22095 [7:35:58<31:20:54, 6.43s/it] {'loss': 0.4811, 'grad_norm': 0.2905103607140547, 'learning_rate': 9.210117789807837e-06, 'epoch': 0.21} 21%|██ | 4551/22095 [7:35:58<31:20:54, 6.43s/it] 21%|██ | 4552/22095 [7:36:02<27:35:21, 5.66s/it] {'loss': 0.4229, 'grad_norm': 0.7029007626023053, 'learning_rate': 9.209722377110657e-06, 'epoch': 0.21} 21%|██ | 4552/22095 [7:36:02<27:35:21, 5.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4553/22095 [7:36:11<33:04:57, 6.79s/it] {'loss': 0.5031, 'grad_norm': 0.292195895010017, 'learning_rate': 9.20932687395947e-06, 'epoch': 0.21} 21%|██ | 4553/22095 [7:36:11<33:04:57, 6.79s/it] 21%|██ | 4554/22095 [7:36:15<28:31:07, 5.85s/it] {'loss': 0.3548, 'grad_norm': 0.6145259280329646, 'learning_rate': 9.20893128036278e-06, 'epoch': 0.21} 21%|██ | 4554/22095 [7:36:15<28:31:07, 5.85s/it] 21%|██ | 4555/22095 [7:36:19<26:15:07, 5.39s/it] {'loss': 0.4202, 'grad_norm': 0.7336553661516277, 
'learning_rate': 9.208535596329082e-06, 'epoch': 0.21} 21%|██ | 4555/22095 [7:36:19<26:15:07, 5.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4556/22095 [7:36:29<32:54:51, 6.76s/it] {'loss': 0.4885, 'grad_norm': 0.35952433690955826, 'learning_rate': 9.20813982186688e-06, 'epoch': 0.21} 21%|██ | 4556/22095 [7:36:29<32:54:51, 6.76s/it] 21%|██ | 4557/22095 [7:36:35<31:54:55, 6.55s/it] {'loss': 0.4909, 'grad_norm': 0.3178685454909754, 'learning_rate': 9.207743956984676e-06, 'epoch': 0.21} 21%|██ | 4557/22095 [7:36:35<31:54:55, 6.55s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 21%|██ | 4558/22095 [7:36:39<28:08:28, 5.78s/it] {'loss': 0.4113, 'grad_norm': 0.7560864491204569, 'learning_rate': 9.20734800169098e-06, 'epoch': 0.21} 21%|██ | 4558/22095 [7:36:39<28:08:28, 5.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4559/22095 [7:36:43<25:21:06, 5.20s/it] {'loss': 0.4012, 'grad_norm': 0.6815854069387833, 'learning_rate': 9.206951955994294e-06, 'epoch': 0.21} 21%|██ | 4559/22095 [7:36:43<25:21:06, 5.20s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922570 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 45723, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nA. 9cm\nB. 4cm\nC. 5cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 21%|██ | 4560/22095 [7:36:47<23:37:39, 4.85s/it] {'loss': 0.3886, 'grad_norm': 0.6708607127704782, 'learning_rate': 9.206555819903132e-06, 'epoch': 0.21} 21%|██ | 4560/22095 [7:36:47<23:37:39, 4.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43768 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56800 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108491 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4561/22095 [7:36:50<20:44:52, 4.26s/it] {'loss': 0.3947, 'grad_norm': 0.6822575874755423, 'learning_rate': 9.206159593426005e-06, 'epoch': 0.21} 21%|██ | 4561/22095 [7:36:50<20:44:52, 4.26s/it] 21%|██ | 4562/22095 [7:36:55<21:15:04, 4.36s/it] {'loss': 0.3646, 'grad_norm': 0.6443314679032428, 'learning_rate': 9.205763276571429e-06, 'epoch': 0.21} 21%|██ | 4562/22095 [7:36:55<21:15:04, 4.36s/it] 21%|██ | 4563/22095 [7:36:59<20:58:29, 4.31s/it] {'loss': 0.4032, 'grad_norm': 1.2434698621002092, 'learning_rate': 9.205366869347915e-06, 'epoch': 0.21} 21%|██ | 4563/22095 [7:36:59<20:58:29, 4.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45137 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62625 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41586 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89231 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4564/22095 [7:37:02<18:56:30, 3.89s/it] {'loss': 0.354, 'grad_norm': 0.5846776817337689, 'learning_rate': 9.204970371763984e-06, 'epoch': 0.21} 21%|██ | 4564/22095 [7:37:02<18:56:30, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (145043 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4565/22095 [7:37:05<18:05:42, 3.72s/it] {'loss': 0.3752, 'grad_norm': 0.6663693027142774, 'learning_rate': 9.204573783828153e-06, 'epoch': 0.21} 21%|██ | 4565/22095 [7:37:05<18:05:42, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118394 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4566/22095 [7:37:08<16:51:40, 3.46s/it] {'loss': 0.3578, 'grad_norm': 0.6392985106616054, 'learning_rate': 9.204177105548946e-06, 'epoch': 0.21} 21%|██ | 4566/22095 [7:37:08<16:51:40, 3.46s/it] 21%|██ | 4567/22095 [7:37:11<16:31:06, 3.39s/it] {'loss': 0.378, 'grad_norm': 0.6708860689350385, 'learning_rate': 9.203780336934885e-06, 'epoch': 0.21} 21%|██ | 4567/22095 [7:37:11<16:31:06, 3.39s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4568/22095 [7:37:15<16:36:00, 3.41s/it] {'loss': 0.3772, 'grad_norm': 0.6624608867324993, 'learning_rate': 9.203383477994495e-06, 'epoch': 0.21} 21%|██ | 4568/22095 [7:37:15<16:36:00, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4569/22095 [7:37:24<25:42:26, 5.28s/it] {'loss': 0.4944, 'grad_norm': 0.443250526541554, 'learning_rate': 9.202986528736302e-06, 'epoch': 0.21} 21%|██ | 4569/22095 [7:37:24<25:42:26, 5.28s/it] 21%|██ | 4570/22095 [7:37:28<23:57:50, 4.92s/it] {'loss': 0.3936, 'grad_norm': 0.7065139455060182, 'learning_rate': 9.20258948916884e-06, 'epoch': 0.21} 21%|██ | 4570/22095 [7:37:28<23:57:50, 4.92s/it] 21%|██ | 4571/22095 [7:37:32<22:06:54, 4.54s/it] {'loss': 0.385, 'grad_norm': 0.6886726217442054, 'learning_rate': 9.202192359300635e-06, 'epoch': 0.21} 21%|██ | 4571/22095 [7:37:32<22:06:54, 4.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4572/22095 [7:37:42<29:23:01, 6.04s/it] {'loss': 0.4913, 'grad_norm': 0.30440484021814024, 'learning_rate': 9.201795139140224e-06, 'epoch': 0.21} 21%|██ | 4572/22095 [7:37:42<29:23:01, 6.04s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8390430 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57249, 'image': 'vrdu_table_final_2/astro-ph.EP/1ed2c441-8818-405f-9269-d29b3fd15c46.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]} 21%|██ | 4573/22095 [7:37:45<25:41:57, 5.28s/it] {'loss': 0.4234, 'grad_norm': 0.6461373151052034, 'learning_rate': 9.201397828696139e-06, 'epoch': 0.21} 21%|██ | 4573/22095 [7:37:45<25:41:57, 5.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4574/22095 [7:37:55<32:26:45, 6.67s/it] {'loss': 0.4926, 'grad_norm': 0.29313820021793546, 'learning_rate': 9.201000427976917e-06, 'epoch': 0.21} 21%|██ | 4574/22095 [7:37:55<32:26:45, 6.67s/it] 21%|██ | 4575/22095 [7:37:59<28:18:05, 5.82s/it] {'loss': 0.4272, 'grad_norm': 0.7446593774804321, 'learning_rate': 9.2006029369911e-06, 'epoch': 0.21} 21%|██ | 4575/22095 [7:37:59<28:18:05, 5.82s/it] 21%|██ | 4576/22095 [7:38:03<25:58:21, 5.34s/it] {'loss': 0.41, 'grad_norm': 0.634239966775946, 'learning_rate': 9.200205355747228e-06, 'epoch': 0.21} 21%|██ | 4576/22095 [7:38:03<25:58:21, 5.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398220 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 371, 'image': 'vrdu_table_final_2/astro-ph.CO/9299fc94-b352-4306-8ece-a4ad1bd19435.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]} 21%|██ | 4577/22095 [7:38:06<22:32:10, 4.63s/it] {'loss': 0.4049, 'grad_norm': 0.6269753464584988, 'learning_rate': 9.199807684253842e-06, 'epoch': 0.21} 21%|██ | 4577/22095 [7:38:06<22:32:10, 4.63s/it] 21%|██ | 4578/22095 [7:38:10<21:06:27, 4.34s/it] {'loss': 0.4133, 'grad_norm': 0.6626327081044174, 'learning_rate': 9.199409922519487e-06, 'epoch': 0.21} 21%|██ | 4578/22095 [7:38:10<21:06:27, 4.34s/it] 21%|██ | 4579/22095 [7:38:12<18:46:06, 3.86s/it] {'loss': 0.4049, 'grad_norm': 0.6810410460327005, 'learning_rate': 9.19901207055271e-06, 'epoch': 0.21} 21%|██ | 4579/22095 [7:38:12<18:46:06, 3.86s/it] 21%|██ | 4580/22095 [7:38:16<18:09:25, 3.73s/it] {'loss': 0.4485, 'grad_norm': 0.677169394674404, 'learning_rate': 9.198614128362062e-06, 'epoch': 0.21} 21%|██ | 4580/22095 [7:38:16<18:09:25, 3.73s/it] 21%|██ | 4581/22095 [7:38:19<17:00:18, 3.50s/it] {'loss': 0.399, 'grad_norm': 0.6882687805510503, 'learning_rate': 9.19821609595609e-06, 'epoch': 0.21} 21%|██ | 4581/22095 [7:38:19<17:00:18, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 
Rank 0: Fixed image tokens in the conversation 21%|██ | 4582/22095 [7:38:22<16:27:00, 3.38s/it] {'loss': 0.3734, 'grad_norm': 0.7153669943119992, 'learning_rate': 9.197817973343347e-06, 'epoch': 0.21} 21%|██ | 4582/22095 [7:38:22<16:27:00, 3.38s/it] 21%|██ | 4583/22095 [7:38:25<16:35:48, 3.41s/it] {'loss': 0.4033, 'grad_norm': 0.659508194441196, 'learning_rate': 9.197419760532389e-06, 'epoch': 0.21} 21%|██ | 4583/22095 [7:38:25<16:35:48, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81663 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4584/22095 [7:38:28<15:41:09, 3.22s/it] {'loss': 0.3872, 'grad_norm': 0.6520433998599509, 'learning_rate': 9.197021457531771e-06, 'epoch': 0.21} 21%|██ | 4584/22095 [7:38:28<15:41:09, 3.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79597 > 40960). Running this sequence through the model will result in indexing errors Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [267, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8499556 in VC:s3://internvl-moe-sft-data/. Exception: Image size [267, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 146996, 'image': 'vrdu_texteq/astro-ph.CO/1f9c2292-26b4-46a7-9c1f-383595838a15.png', 'image_wh': [[267, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'In the limit of small $x$'}]} 21%|██ | 4585/22095 [7:38:31<15:31:46, 3.19s/it] {'loss': 0.3936, 'grad_norm': 0.6629252034178089, 'learning_rate': 9.196623064350054e-06, 'epoch': 0.21} 21%|██ | 4585/22095 [7:38:31<15:31:46, 3.19s/it] 21%|██ | 4586/22095 [7:38:34<15:22:08, 3.16s/it] {'loss': 0.3918, 'grad_norm': 0.8014334519954863, 'learning_rate': 9.196224580995796e-06, 'epoch': 0.21} 21%|██ | 4586/22095 [7:38:34<15:22:08, 3.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4587/22095 [7:38:38<16:14:01, 3.34s/it] {'loss': 0.3992, 'grad_norm': 0.6381738015711796, 'learning_rate': 9.19582600747756e-06, 'epoch': 0.21} 21%|██ | 4587/22095 [7:38:38<16:14:01, 3.34s/it] 21%|██ | 4588/22095 [7:38:42<16:59:22, 3.49s/it] {'loss': 0.4203, 'grad_norm': 0.6575473018865028, 'learning_rate': 9.195427343803906e-06, 'epoch': 0.21} 21%|██ | 4588/22095 [7:38:42<16:59:22, 3.49s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8303986 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1ZtXwLXXXXXXkaXXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n提取这张照片中的文本信息,但不要输出坐标。'}, {'from': 'gpt', 'value': '图中所有文字:\n旺旺拐家拿钥匙\n旺旺\n拐家拿钥匙\n:'}]} 21%|██ | 4589/22095 [7:38:46<18:10:07, 3.74s/it] {'loss': 0.4196, 'grad_norm': 0.680756389919026, 'learning_rate': 9.195028589983407e-06, 'epoch': 0.21} 21%|██ | 4589/22095 [7:38:46<18:10:07, 3.74s/it] 21%|██ | 4590/22095 [7:38:50<17:38:08, 3.63s/it] {'loss': 0.3954, 'grad_norm': 0.7390360892869532, 'learning_rate': 9.194629746024627e-06, 'epoch': 0.21} 21%|██ | 4590/22095 [7:38:50<17:38:08, 3.63s/it] 21%|██ | 4591/22095 [7:38:53<16:43:58, 3.44s/it] {'loss': 0.4134, 'grad_norm': 0.6774172823682195, 'learning_rate': 9.194230811936135e-06, 'epoch': 0.21} 21%|██ | 4591/22095 [7:38:53<16:43:58, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4592/22095 [7:39:02<25:59:30, 5.35s/it] {'loss': 0.518, 'grad_norm': 0.4994134842411283, 'learning_rate': 9.193831787726507e-06, 'epoch': 0.21} 21%|██ | 4592/22095 [7:39:03<25:59:30, 5.35s/it] 21%|██ | 4593/22095 [7:39:06<23:18:52, 4.80s/it] {'loss': 0.3993, 'grad_norm': 0.8287166767605357, 'learning_rate': 9.193432673404312e-06, 'epoch': 0.21} 21%|██ | 4593/22095 [7:39:06<23:18:52, 4.80s/it] 21%|██ | 4594/22095 [7:39:10<21:53:41, 4.50s/it] {'loss': 0.41, 'grad_norm': 0.751905987339554, 'learning_rate': 9.19303346897813e-06, 'epoch': 0.21} 21%|██ | 4594/22095 [7:39:10<21:53:41, 4.50s/it] 21%|██ | 4595/22095 [7:39:13<20:36:49, 4.24s/it] {'loss': 0.3207, 'grad_norm': 0.7410296417952482, 'learning_rate': 9.192634174456536e-06, 'epoch': 0.21} 21%|██ | 4595/22095 [7:39:13<20:36:49, 4.24s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908198 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 31351, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nA. 2.5\nB. 4.5\nC. 7\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 21%|██ | 4596/22095 [7:39:16<18:45:46, 3.86s/it] {'loss': 0.3675, 'grad_norm': 0.6774001605097975, 'learning_rate': 9.19223478984811e-06, 'epoch': 0.21} 21%|██ | 4596/22095 [7:39:16<18:45:46, 3.86s/it] 21%|██ | 4597/22095 [7:39:20<17:58:05, 3.70s/it] {'loss': 0.4113, 'grad_norm': 0.6846876993441788, 'learning_rate': 9.191835315161432e-06, 'epoch': 0.21} 21%|██ | 4597/22095 [7:39:20<17:58:05, 3.70s/it] 21%|██ | 4598/22095 [7:39:23<17:41:53, 3.64s/it] {'loss': 0.3714, 'grad_norm': 0.6541964859436085, 'learning_rate': 9.191435750405091e-06, 'epoch': 0.21} 21%|██ | 4598/22095 [7:39:23<17:41:53, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62963 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50083 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41362 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57141 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4599/22095 [7:39:26<16:36:57, 3.42s/it] {'loss': 0.4332, 'grad_norm': 0.6553196982674202, 'learning_rate': 9.191036095587667e-06, 'epoch': 0.21} 21%|██ | 4599/22095 [7:39:26<16:36:57, 3.42s/it] 21%|██ | 4600/22095 [7:39:29<16:24:23, 3.38s/it] {'loss': 0.3774, 'grad_norm': 0.69992746418274, 'learning_rate': 9.190636350717747e-06, 'epoch': 0.21} 21%|██ | 4600/22095 [7:39:29<16:24:23, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50601 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47625 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43717 > 40960) for 4 sample(s). Truncating to 585 with 2 samples. 21%|██ | 4601/22095 [7:39:33<16:35:14, 3.41s/it] {'loss': 0.3841, 'grad_norm': 0.685563902486613, 'learning_rate': 9.190236515803926e-06, 'epoch': 0.21} 21%|██ | 4601/22095 [7:39:33<16:35:14, 3.41s/it] 21%|██ | 4602/22095 [7:39:36<15:51:09, 3.26s/it] {'loss': 0.3941, 'grad_norm': 0.7595242844336855, 'learning_rate': 9.18983659085479e-06, 'epoch': 0.21} 21%|██ | 4602/22095 [7:39:36<15:51:09, 3.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64686 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110885 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81992 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4603/22095 [7:39:39<15:58:12, 3.29s/it] {'loss': 0.3761, 'grad_norm': 0.6384986197619993, 'learning_rate': 9.189436575878933e-06, 'epoch': 0.21} 21%|██ | 4603/22095 [7:39:39<15:58:12, 3.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59763 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49899 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4604/22095 [7:39:43<16:17:47, 3.35s/it] {'loss': 0.3927, 'grad_norm': 0.7432963933397082, 'learning_rate': 9.189036470884951e-06, 'epoch': 0.21} 21%|██ | 4604/22095 [7:39:43<16:17:47, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49604 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44752 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50830 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90793 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56358 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4605/22095 [7:39:46<15:39:55, 3.22s/it] {'loss': 0.3661, 'grad_norm': 0.6954626477052497, 'learning_rate': 9.188636275881442e-06, 'epoch': 0.21} 21%|██ | 4605/22095 [7:39:46<15:39:55, 3.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8914676 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37829, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 2\nB. 4\nC. 8\nD. 
16'}, {'from': 'gpt', 'value': '【解答】解:由点D是线段AB的中点,得AD=\\frac{1}{2}AB=\\frac{1}{2}×16=8cm,由C是线段AD的中点,得CD=\\frac{1}{2}AD=\\frac{1}{2}×8=4cm.'}]} 21%|██ | 4606/22095 [7:39:49<15:27:11, 3.18s/it] {'loss': 0.4073, 'grad_norm': 0.7741203657135156, 'learning_rate': 9.188235990877004e-06, 'epoch': 0.21} 21%|██ | 4606/22095 [7:39:49<15:27:11, 3.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4607/22095 [7:39:56<21:52:25, 4.50s/it] {'loss': 0.5383, 'grad_norm': 0.4754471604353748, 'learning_rate': 9.187835615880235e-06, 'epoch': 0.21} 21%|██ | 4607/22095 [7:39:56<21:52:25, 4.50s/it] 21%|██ | 4608/22095 [7:40:00<20:06:16, 4.14s/it] {'loss': 0.4326, 'grad_norm': 0.800689655749288, 'learning_rate': 9.187435150899743e-06, 'epoch': 0.21} 21%|██ | 4608/22095 [7:40:00<20:06:16, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4609/22095 [7:40:10<28:57:10, 5.96s/it] {'loss': 0.4918, 'grad_norm': 0.3767292595817881, 'learning_rate': 9.187034595944131e-06, 'epoch': 0.21} 21%|██ | 4609/22095 [7:40:10<28:57:10, 5.96s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4610/22095 [7:40:13<25:18:59, 5.21s/it] {'loss': 0.3991, 'grad_norm': 0.6318101316229977, 'learning_rate': 9.186633951022005e-06, 'epoch': 0.21} 21%|██ | 4610/22095 [7:40:13<25:18:59, 5.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4611/22095 [7:40:23<32:22:44, 6.67s/it] {'loss': 0.4824, 'grad_norm': 0.2964277166418596, 'learning_rate': 9.186233216141972e-06, 'epoch': 0.21} 21%|██ | 4611/22095 [7:40:23<32:22:44, 6.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80652 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92580 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4612/22095 [7:40:30<32:35:18, 6.71s/it] {'loss': 0.4822, 'grad_norm': 0.32570830424870334, 'learning_rate': 9.185832391312644e-06, 'epoch': 0.21} 21%|██ | 4612/22095 [7:40:30<32:35:18, 6.71s/it] 21%|██ | 4613/22095 [7:40:40<36:59:08, 7.62s/it] {'loss': 0.4878, 'grad_norm': 0.2974554326662702, 'learning_rate': 9.185431476542635e-06, 'epoch': 0.21} 21%|██ | 4613/22095 [7:40:40<36:59:08, 7.62s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4614/22095 [7:40:44<31:16:12, 6.44s/it] {'loss': 0.3986, 'grad_norm': 0.7701402097321183, 'learning_rate': 9.185030471840557e-06, 'epoch': 0.21} 21%|██ | 4614/22095 [7:40:44<31:16:12, 6.44s/it] 21%|██ | 4615/22095 [7:40:47<27:02:26, 5.57s/it] {'loss': 0.4542, 'grad_norm': 0.7491767662695776, 'learning_rate': 9.184629377215028e-06, 'epoch': 0.21} 21%|██ | 4615/22095 [7:40:47<27:02:26, 5.57s/it] 21%|██ | 4616/22095 [7:40:50<23:17:55, 4.80s/it] {'loss': 0.3708, 'grad_norm': 0.7697111484129529, 'learning_rate': 9.184228192674667e-06, 'epoch': 0.21} 21%|██ | 4616/22095 [7:40:50<23:17:55, 4.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4617/22095 [7:40:54<21:44:41, 4.48s/it] {'loss': 0.3901, 'grad_norm': 0.6412765775554661, 'learning_rate': 9.183826918228092e-06, 'epoch': 0.21} 21%|██ | 4617/22095 [7:40:54<21:44:41, 4.48s/it] 21%|██ | 4618/22095 [7:40:57<19:29:43, 4.02s/it] {'loss': 0.3966, 'grad_norm': 0.719052047311315, 'learning_rate': 9.183425553883925e-06, 'epoch': 0.21} 21%|██ | 4618/22095 [7:40:57<19:29:43, 4.02s/it] 21%|██ | 4619/22095 [7:41:00<18:27:22, 3.80s/it] {'loss': 0.3682, 'grad_norm': 0.6533657663372489, 'learning_rate': 9.183024099650793e-06, 'epoch': 0.21} 21%|██ | 4619/22095 [7:41:00<18:27:22, 
3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4620/22095 [7:41:07<23:24:30, 4.82s/it] {'loss': 0.5193, 'grad_norm': 0.4063216099091764, 'learning_rate': 9.18262255553732e-06, 'epoch': 0.21} 21%|██ | 4620/22095 [7:41:07<23:24:30, 4.82s/it] 21%|██ | 4621/22095 [7:41:11<21:47:55, 4.49s/it] {'loss': 0.4351, 'grad_norm': 0.716403564936248, 'learning_rate': 9.182220921552132e-06, 'epoch': 0.21} 21%|██ | 4621/22095 [7:41:11<21:47:55, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42694 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4622/22095 [7:41:14<19:39:48, 4.05s/it] {'loss': 0.3863, 'grad_norm': 0.6644833222370995, 'learning_rate': 9.181819197703864e-06, 'epoch': 0.21} 21%|██ | 4622/22095 [7:41:14<19:39:48, 4.05s/it] 21%|██ | 4623/22095 [7:41:18<19:54:14, 4.10s/it] {'loss': 0.3436, 'grad_norm': 0.7599099253584791, 'learning_rate': 9.181417384001143e-06, 'epoch': 0.21} 21%|██ | 4623/22095 [7:41:18<19:54:14, 4.10s/it] 21%|██ | 4624/22095 [7:41:22<18:52:06, 3.89s/it] {'loss': 0.3701, 'grad_norm': 0.6960306482533338, 'learning_rate': 9.181015480452607e-06, 'epoch': 0.21} 21%|██ | 4624/22095 [7:41:22<18:52:06, 3.89s/it] 21%|██ | 4625/22095 [7:41:25<17:32:32, 3.61s/it] {'loss': 0.4453, 'grad_norm': 0.6677901326076892, 'learning_rate': 9.180613487066888e-06, 'epoch': 0.21} 21%|██ | 4625/22095 [7:41:25<17:32:32, 3.61s/it] 21%|██ | 4626/22095 [7:41:28<16:31:31, 3.41s/it] {'loss': 0.4158, 'grad_norm': 0.7052036480574205, 'learning_rate': 9.180211403852623e-06, 'epoch': 0.21} 21%|██ | 4626/22095 [7:41:28<16:31:31, 3.41s/it] 21%|██ | 4627/22095 [7:41:30<15:54:28, 3.28s/it] {'loss': 0.3774, 'grad_norm': 0.6767244474943633, 'learning_rate': 9.179809230818458e-06, 'epoch': 0.21} 21%|██ | 4627/22095 [7:41:31<15:54:28, 3.28s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [675, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8442861 in VC:s3://internvl-moe-sft-data/. Exception: Image size [675, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 62733, 'image': 'vrdu_texteq/astro-ph.CO/7d9850ce-6507-4e93-9e72-2d3fab116a98.png', 'image_wh': [[675, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': '$\\omega $ \ndetermines the rotation of our Earth around the Sun.'}]} 21%|██ | 4628/22095 [7:41:34<16:47:34, 3.46s/it] {'loss': 0.423, 'grad_norm': 0.6896840117858403, 'learning_rate': 9.179406967973025e-06, 'epoch': 0.21} 21%|██ | 4628/22095 [7:41:34<16:47:34, 3.46s/it] 21%|██ | 4629/22095 [7:41:37<16:05:10, 3.32s/it] {'loss': 0.3883, 'grad_norm': 1.0927703091373575, 'learning_rate': 9.179004615324976e-06, 'epoch': 0.21} 21%|██ | 4629/22095 [7:41:37<16:05:10, 3.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4630/22095 [7:41:47<24:42:39, 5.09s/it] {'loss': 0.5055, 'grad_norm': 0.3893396568527757, 'learning_rate': 9.178602172882951e-06, 'epoch': 0.21} 21%|██ | 4630/22095 [7:41:47<24:42:39, 5.09s/it] 21%|██ | 4631/22095 [7:41:50<22:04:15, 4.55s/it] {'loss': 0.3971, 'grad_norm': 0.734134054975077, 'learning_rate': 9.178199640655598e-06, 'epoch': 0.21} 21%|██ | 4631/22095 [7:41:50<22:04:15, 4.55s/it] 21%|██ | 4632/22095 [7:41:53<20:26:42, 4.21s/it] {'loss': 0.4527, 'grad_norm': 0.6996391227153452, 'learning_rate': 9.177797018651568e-06, 'epoch': 0.21} 21%|██ | 4632/22095 [7:41:53<20:26:42, 4.21s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8341353 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7998, 'image': 'vrdu_table_final_2/astro-ph.CO/7f8e454c-420d-4553-a133-c25a5223faab.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 21%|██ | 4633/22095 [7:41:57<19:22:04, 3.99s/it] {'loss': 0.4148, 'grad_norm': 0.6747446956371578, 'learning_rate': 9.177394306879513e-06, 'epoch': 0.21} 21%|██ | 4633/22095 [7:41:57<19:22:04, 3.99s/it] 21%|██ | 4634/22095 [7:42:01<19:22:53, 4.00s/it] {'loss': 0.4469, 'grad_norm': 0.6681883466966753, 'learning_rate': 9.176991505348082e-06, 'epoch': 0.21} 21%|██ | 4634/22095 [7:42:01<19:22:53, 4.00s/it] 21%|██ | 4635/22095 [7:42:04<18:09:08, 3.74s/it] {'loss': 0.3342, 'grad_norm': 0.6171146969041853, 'learning_rate': 9.176588614065934e-06, 'epoch': 0.21} 21%|██ | 4635/22095 [7:42:04<18:09:08, 3.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54413 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58011 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4636/22095 [7:42:07<16:46:59, 3.46s/it] {'loss': 0.414, 'grad_norm': 0.6505458312196792, 'learning_rate': 9.17618563304172e-06, 'epoch': 0.21} 21%|██ | 4636/22095 [7:42:07<16:46:59, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (82973 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54646 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4637/22095 [7:42:10<16:57:07, 3.50s/it] {'loss': 0.409, 'grad_norm': 0.6281754410034931, 'learning_rate': 9.175782562284108e-06, 'epoch': 0.21} 21%|██ | 4637/22095 [7:42:10<16:57:07, 3.50s/it] 21%|██ | 4638/22095 [7:42:15<18:04:05, 3.73s/it] {'loss': 0.4315, 'grad_norm': 0.6888797123187864, 'learning_rate': 9.175379401801752e-06, 'epoch': 0.21} 21%|██ | 4638/22095 [7:42:15<18:04:05, 3.73s/it] 21%|██ | 4639/22095 [7:42:17<16:50:25, 3.47s/it] {'loss': 0.4113, 'grad_norm': 0.7493278171281029, 'learning_rate': 9.174976151603314e-06, 'epoch': 0.21} 21%|██ | 4639/22095 [7:42:17<16:50:25, 3.47s/it] 21%|██ | 4640/22095 [7:42:21<16:21:56, 3.38s/it] {'loss': 0.3607, 'grad_norm': 0.8294347486423541, 'learning_rate': 9.174572811697464e-06, 'epoch': 0.21} 21%|██ | 4640/22095 [7:42:21<16:21:56, 3.38s/it] 21%|██ | 4641/22095 [7:42:23<15:24:34, 3.18s/it] {'loss': 0.3661, 'grad_norm': 0.7000019570919632, 'learning_rate': 9.174169382092864e-06, 'epoch': 0.21} 21%|██ | 4641/22095 [7:42:23<15:24:34, 3.18s/it] 21%|██ | 4642/22095 [7:42:26<15:21:33, 3.17s/it] {'loss': 0.4516, 'grad_norm': 0.6980852810654846, 'learning_rate': 9.173765862798185e-06, 'epoch': 0.21} 
21%|██ | 4642/22095 [7:42:26<15:21:33, 3.17s/it] 21%|██ | 4643/22095 [7:42:30<16:07:48, 3.33s/it] {'loss': 0.3795, 'grad_norm': 0.7211418317715546, 'learning_rate': 9.173362253822095e-06, 'epoch': 0.21} 21%|██ | 4643/22095 [7:42:30<16:07:48, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4644/22095 [7:42:39<24:36:03, 5.08s/it] {'loss': 0.4853, 'grad_norm': 0.39470052963079344, 'learning_rate': 9.172958555173268e-06, 'epoch': 0.21} 21%|██ | 4644/22095 [7:42:39<24:36:03, 5.08s/it] 21%|██ | 4645/22095 [7:42:46<26:26:44, 5.46s/it] {'loss': 0.497, 'grad_norm': 0.33231990173857073, 'learning_rate': 9.17255476686038e-06, 'epoch': 0.21} 21%|██ | 4645/22095 [7:42:46<26:26:44, 5.46s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 21%|██ | 4646/22095 [7:42:49<23:41:40, 4.89s/it] {'loss': 0.3966, 'grad_norm': 0.6787839011727957, 'learning_rate': 9.172150888892102e-06, 'epoch': 0.21} 21%|██ | 4646/22095 [7:42:49<23:41:40, 4.89s/it] 21%|██ | 4647/22095 [7:42:53<22:17:12, 4.60s/it] {'loss': 0.4028, 'grad_norm': 0.7294657149087491, 'learning_rate': 9.171746921277116e-06, 'epoch': 0.21} 21%|██ | 4647/22095 [7:42:53<22:17:12, 4.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4648/22095 [7:42:56<20:25:26, 4.21s/it] {'loss': 0.3603, 'grad_norm': 0.7001236222785713, 'learning_rate': 9.171342864024103e-06, 'epoch': 0.21} 21%|██ | 4648/22095 [7:42:56<20:25:26, 4.21s/it] 21%|██ | 4649/22095 [7:43:00<19:51:03, 4.10s/it] {'loss': 0.4124, 'grad_norm': 0.9831069171644347, 'learning_rate': 9.17093871714174e-06, 'epoch': 0.21} 21%|██ | 4649/22095 [7:43:00<19:51:03, 4.10s/it] 21%|██ | 4650/22095 [7:43:03<18:30:56, 3.82s/it] {'loss': 0.3952, 'grad_norm': 0.6921596856617678, 'learning_rate': 9.170534480638718e-06, 'epoch': 0.21} 21%|██ | 4650/22095 [7:43:03<18:30:56, 3.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: 
Fixed image tokens in the conversation 21%|██ | 4651/22095 [7:43:07<18:32:01, 3.82s/it] {'loss': 0.3858, 'grad_norm': 0.8188302904529361, 'learning_rate': 9.170130154523715e-06, 'epoch': 0.21} 21%|██ | 4651/22095 [7:43:07<18:32:01, 3.82s/it] 21%|██ | 4652/22095 [7:43:11<18:12:52, 3.76s/it] {'loss': 0.399, 'grad_norm': 0.6255134043307441, 'learning_rate': 9.169725738805425e-06, 'epoch': 0.21} 21%|██ | 4652/22095 [7:43:11<18:12:52, 3.76s/it] 21%|██ | 4653/22095 [7:43:14<17:47:14, 3.67s/it] {'loss': 0.3949, 'grad_norm': 0.6553509538956197, 'learning_rate': 9.169321233492534e-06, 'epoch': 0.21} 21%|██ | 4653/22095 [7:43:14<17:47:14, 3.67s/it] 21%|██ | 4654/22095 [7:43:18<18:11:53, 3.76s/it] {'loss': 0.3631, 'grad_norm': 0.6746561449928478, 'learning_rate': 9.168916638593736e-06, 'epoch': 0.21} 21%|██ | 4654/22095 [7:43:18<18:11:53, 3.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4655/22095 [7:43:21<17:00:15, 3.51s/it] {'loss': 0.3969, 'grad_norm': 0.6714508923031717, 'learning_rate': 9.168511954117723e-06, 'epoch': 0.21} 21%|██ | 4655/22095 [7:43:21<17:00:15, 3.51s/it] 21%|██ | 4656/22095 [7:43:26<18:08:21, 3.74s/it] {'loss': 0.4342, 'grad_norm': 0.6108944498419144, 'learning_rate': 9.16810718007319e-06, 'epoch': 0.21} 21%|██ | 4656/22095 [7:43:26<18:08:21, 3.74s/it] 21%|██ | 4657/22095 [7:43:29<16:58:14, 3.50s/it] {'loss': 0.4071, 'grad_norm': 0.6910327522428834, 'learning_rate': 9.167702316468835e-06, 'epoch': 0.21} 21%|██ | 4657/22095 [7:43:29<16:58:14, 3.50s/it] 21%|██ | 4658/22095 [7:43:32<16:23:45, 3.39s/it] {'loss': 0.4079, 'grad_norm': 0.6512363183886198, 'learning_rate': 9.167297363313357e-06, 'epoch': 0.21} 21%|██ | 4658/22095 [7:43:32<16:23:45, 3.39s/it] 21%|██ | 4659/22095 [7:43:35<17:00:43, 3.51s/it] {'loss': 0.371, 'grad_norm': 0.6883357018712101, 'learning_rate': 9.166892320615459e-06, 'epoch': 0.21} 21%|██ | 4659/22095 [7:43:35<17:00:43, 3.51s/it] 21%|██ | 4660/22095 
[7:43:38<16:07:57, 3.33s/it] {'loss': 0.4051, 'grad_norm': 0.6252312961373009, 'learning_rate': 9.166487188383841e-06, 'epoch': 0.21} 21%|██ | 4660/22095 [7:43:38<16:07:57, 3.33s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [364, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8474794 in VC:s3://internvl-moe-sft-data/. Exception: Image size [364, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 121409, 'image': 'vrdu_texteq/astro-ph.CO/2f48252d-50f3-4719-934a-b5309c25690a.png', 'image_wh': [[364, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'with the covariance matrix $C$:'}]}
21%|██ | 4661/22095 [7:43:41<15:51:50, 3.28s/it] {'loss': 0.4222, 'grad_norm': 0.6589575854416991, 'learning_rate': 9.166081966627211e-06, 'epoch': 0.21} 21%|██ | 4661/22095 [7:43:41<15:51:50, 3.28s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [628, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8510900 in VC:s3://internvl-moe-sft-data/. Exception: Image size [628, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56413, 'image': 'vrdu_texteq/astro-ph.CO/25732353-2910-4f69-9f13-6734233b34aa.png', 'image_wh': [[628, 23]], 'conversations': [{'from': 'human', 'value': '\nCan optical character recognition (OCR) be used to extract the text from the image?'}, {'from': 'gpt', 'value': 'This resulted in an $H$ to \emph{F160W} transformation of:'}]}
21%|██ | 4662/22095 [7:43:46<17:22:08, 3.59s/it] {'loss': 0.4831, 'grad_norm': 0.6861814844204853, 'learning_rate': 9.165676655354274e-06, 'epoch': 0.21} 21%|██ | 4662/22095 [7:43:46<17:22:08, 3.59s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (43713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51347 > 40960). Running this sequence through the model will result in indexing errors
21%|██ | 4663/22095 [7:43:49<16:16:23, 3.36s/it] {'loss': 0.3438, 'grad_norm': 0.6707217311626233, 'learning_rate': 9.16527125457374e-06, 'epoch': 0.21} 21%|██ | 4663/22095 [7:43:49<16:16:23, 3.36s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (90701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76522 > 40960).
Running this sequence through the model will result in indexing errors 21%|██ | 4664/22095 [7:43:53<17:51:05, 3.69s/it] {'loss': 0.3499, 'grad_norm': 0.7936046441370954, 'learning_rate': 9.16486576429432e-06, 'epoch': 0.21} 21%|██ | 4664/22095 [7:43:53<17:51:05, 3.69s/it] 21%|██ | 4665/22095 [7:43:57<18:31:11, 3.83s/it] {'loss': 0.3742, 'grad_norm': 0.7234342589495827, 'learning_rate': 9.164460184524726e-06, 'epoch': 0.21} 21%|██ | 4665/22095 [7:43:57<18:31:11, 3.83s/it] 21%|██ | 4666/22095 [7:44:01<18:10:40, 3.75s/it] {'loss': 0.3686, 'grad_norm': 0.6772999050647835, 'learning_rate': 9.16405451527367e-06, 'epoch': 0.21} 21%|██ | 4666/22095 [7:44:01<18:10:40, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4667/22095 [7:44:07<21:37:30, 4.47s/it] {'loss': 0.4921, 'grad_norm': 0.6520754696457214, 'learning_rate': 9.163648756549875e-06, 'epoch': 0.21} 21%|██ | 4667/22095 [7:44:07<21:37:30, 4.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4668/22095 [7:44:11<20:37:49, 4.26s/it] {'loss': 0.3969, 'grad_norm': 0.7762013029665157, 'learning_rate': 9.163242908362053e-06, 'epoch': 0.21} 21%|██ | 4668/22095 [7:44:11<20:37:49, 4.26s/it] 21%|██ | 4669/22095 [7:44:14<18:51:41, 3.90s/it] {'loss': 0.4119, 'grad_norm': 0.6829559452971228, 'learning_rate': 9.16283697071893e-06, 'epoch': 0.21} 21%|██ | 4669/22095 [7:44:14<18:51:41, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47927 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43119 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62819 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92190 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4670/22095 [7:44:17<17:45:30, 3.67s/it] {'loss': 0.4044, 'grad_norm': 0.6784654757287741, 'learning_rate': 9.162430943629224e-06, 'epoch': 0.21} 21%|██ | 4670/22095 [7:44:17<17:45:30, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██ | 4671/22095 [7:44:27<26:23:03, 5.45s/it] {'loss': 0.4802, 'grad_norm': 0.336867182270296, 'learning_rate': 9.162024827101663e-06, 'epoch': 0.21} 21%|██ | 4671/22095 [7:44:27<26:23:03, 5.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (84243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58350 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68394 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89688 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4672/22095 [7:44:33<28:14:29, 5.84s/it] {'loss': 0.5177, 'grad_norm': 0.3422647615733906, 'learning_rate': 9.161618621144967e-06, 'epoch': 0.21} 21%|██ | 4672/22095 [7:44:33<28:14:29, 5.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 21%|██ | 4673/22095 [7:44:37<24:34:59, 5.08s/it] {'loss': 0.4127, 'grad_norm': 1.1008475237506572, 'learning_rate': 9.161212325767873e-06, 'epoch': 0.21} 21%|██ | 4673/22095 [7:44:37<24:34:59, 5.08s/it] 21%|██ | 4674/22095 [7:44:40<22:08:09, 4.57s/it] {'loss': 0.38, 'grad_norm': 0.6685866425309338, 'learning_rate': 9.160805940979104e-06, 'epoch': 0.21} 21%|██ | 4674/22095 [7:44:40<22:08:09, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42508 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4675/22095 [7:44:43<19:41:06, 4.07s/it] {'loss': 0.4012, 'grad_norm': 0.6905136705212983, 'learning_rate': 9.160399466787392e-06, 'epoch': 0.21} 21%|██ | 4675/22095 [7:44:43<19:41:06, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78955 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50632 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4676/22095 [7:44:46<17:51:41, 3.69s/it] {'loss': 0.3776, 'grad_norm': 0.9911104484428561, 'learning_rate': 9.159992903201478e-06, 'epoch': 0.21} 21%|██ | 4676/22095 [7:44:46<17:51:41, 3.69s/it] 21%|██ | 4677/22095 [7:44:49<17:20:12, 3.58s/it] {'loss': 0.4013, 'grad_norm': 0.7258200921880575, 'learning_rate': 9.15958625023009e-06, 'epoch': 0.21} 21%|██ | 4677/22095 [7:44:49<17:20:12, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50780 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49654 > 40960). Running this sequence through the model will result in indexing errors 21%|██ | 4678/22095 [7:44:59<26:46:20, 5.53s/it] {'loss': 0.4963, 'grad_norm': 0.40608231174106374, 'learning_rate': 9.15917950788197e-06, 'epoch': 0.21} 21%|██ | 4678/22095 [7:44:59<26:46:20, 5.53s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4679/22095 [7:45:02<23:28:33, 4.85s/it] {'loss': 0.3785, 'grad_norm': 0.753776177198563, 'learning_rate': 9.158772676165854e-06, 'epoch': 0.21} 21%|██ | 4679/22095 [7:45:02<23:28:33, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44660 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45730 > 40960). 
Running this sequence through the model will result in indexing errors
21%|██ | 4680/22095 [7:45:08<24:48:53, 5.13s/it] {'loss': 0.5132, 'grad_norm': 0.4500270088526661, 'learning_rate': 9.158365755090488e-06, 'epoch': 0.21} 21%|██ | 4680/22095 [7:45:08<24:48:53, 5.13s/it]
21%|██ | 4681/22095 [7:45:11<21:52:14, 4.52s/it] {'loss': 0.4071, 'grad_norm': 1.1467954943239418, 'learning_rate': 9.157958744664612e-06, 'epoch': 0.21} 21%|██ | 4681/22095 [7:45:11<21:52:14, 4.52s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
21%|██ | 4682/22095 [7:45:15<20:50:05, 4.31s/it] {'loss': 0.4126, 'grad_norm': 0.6675630555278697, 'learning_rate': 9.157551644896974e-06, 'epoch': 0.21} 21%|██ | 4682/22095 [7:45:15<20:50:05, 4.31s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954497 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5332, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, point M is the midpoint of segment AB and point N is the midpoint of segment AM, with AN:MN = 1:2. If AN = 2cm, then segment AB = ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12cm'}]}
21%|██ | 4683/22095 [7:45:19<20:49:21, 4.31s/it] {'loss': 0.4719, 'grad_norm': 0.29502147340878954, 'learning_rate': 9.15714445579632e-06, 'epoch': 0.21} 21%|██ | 4683/22095 [7:45:19<20:49:21, 4.31s/it]
VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_145555.png 2025-08-27 23:43:18.104445 load time: 1044.73 ms
VC:s3://gui-agent/data_20250421/web/images/wa_wiki/trajectory_17/img/step_1.png 2025-08-27 23:43:18.106172 load time: 1036.85 ms
21%|██ | 4684/22095 [7:45:29<28:58:03, 5.99s/it] {'loss': 0.5063, 'grad_norm': 0.3252216219097457, 'learning_rate': 9.156737177371399e-06, 'epoch': 0.21} 21%|██ | 4684/22095 [7:45:29<28:58:03, 5.99s/it]
Invalidate trace cache @ step 2: expected module 364, but got module 1
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn( 21%|██ | 4685/22095 [7:45:33<25:35:48, 5.29s/it] {'loss': 0.4284, 'grad_norm': 1.8135010055214982, 'learning_rate': 9.156329809630962e-06, 'epoch': 0.21} 21%|██ | 4685/22095 [7:45:33<25:35:48, 5.29s/it] 21%|██ | 4686/22095 [7:45:42<31:05:59, 6.43s/it] {'loss': 0.4703, 'grad_norm': 0.38930115148456806, 'learning_rate': 9.155922352583763e-06, 'epoch': 0.21} 21%|██ | 4686/22095 [7:45:42<31:05:59, 6.43s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 21%|██ | 4687/22095 [7:45:45<26:31:09, 5.48s/it] {'loss': 0.3811, 'grad_norm': 1.280235310925464, 'learning_rate': 9.155514806238557e-06, 'epoch': 0.21} 21%|██ | 4687/22095 [7:45:45<26:31:09, 5.48s/it] 21%|██ | 4688/22095 [7:45:53<29:52:09, 6.18s/it] {'loss': 0.5066, 'grad_norm': 0.31157176205214804, 'learning_rate': 9.1551071706041e-06, 'epoch': 0.21} 21%|██ | 4688/22095 [7:45:53<29:52:09, 6.18s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (120512 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54144 > 40960). 
Running this sequence through the model will result in indexing errors 21%|██ | 4689/22095 [7:45:57<26:40:57, 5.52s/it] {'loss': 0.3751, 'grad_norm': 1.170161611611983, 'learning_rate': 9.154699445689151e-06, 'epoch': 0.21} 21%|██ | 4689/22095 [7:45:57<26:40:57, 5.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██ | 4690/22095 [7:46:02<25:19:50, 5.24s/it] {'loss': 0.4575, 'grad_norm': 0.6990135258357858, 'learning_rate': 9.154291631502471e-06, 'epoch': 0.21} 21%|██ | 4690/22095 [7:46:02<25:19:50, 5.24s/it] 21%|██ | 4691/22095 [7:46:05<23:18:07, 4.82s/it] {'loss': 0.3965, 'grad_norm': 0.7945124398062862, 'learning_rate': 9.153883728052824e-06, 'epoch': 0.21} 21%|██ | 4691/22095 [7:46:05<23:18:07, 4.82s/it] 21%|██ | 4692/22095 [7:46:08<20:35:58, 4.26s/it] {'loss': 0.3956, 'grad_norm': 1.098263554852461, 'learning_rate': 9.153475735348973e-06, 'epoch': 0.21} 21%|██ | 4692/22095 [7:46:08<20:35:58, 4.26s/it] 21%|██ | 4693/22095 [7:46:12<20:18:38, 4.20s/it] {'loss': 0.4391, 'grad_norm': 0.7792909497721552, 'learning_rate': 9.153067653399684e-06, 'epoch': 0.21} 21%|██ | 4693/22095 [7:46:13<20:18:38, 4.20s/it] 21%|██ | 4694/22095 [7:46:16<19:33:45, 4.05s/it] {'loss': 0.3678, 'grad_norm': 0.7769896070381618, 'learning_rate': 9.152659482213727e-06, 'epoch': 0.21} 21%|██ | 4694/22095 [7:46:16<19:33:45, 4.05s/it] 21%|██ | 4695/22095 [7:46:20<19:21:58, 4.01s/it] {'loss': 0.3668, 'grad_norm': 0.7248722333731626, 'learning_rate': 9.152251221799871e-06, 'epoch': 0.21} 21%|██ | 4695/22095 [7:46:20<19:21:58, 4.01s/it] 21%|██▏ | 4696/22095 [7:46:23<17:43:52, 3.67s/it] {'loss': 0.3748, 'grad_norm': 0.7951512445368836, 'learning_rate': 9.15184287216689e-06, 'epoch': 0.21} 21%|██▏ | 4696/22095 [7:46:23<17:43:52, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██▏ | 4697/22095 [7:46:32<25:40:15, 5.31s/it] {'loss': 0.5111, 'grad_norm': 0.46089270965008194, 'learning_rate': 
9.151434433323556e-06, 'epoch': 0.21} 21%|██▏ | 4697/22095 [7:46:32<25:40:15, 5.31s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047550 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\nGiven: as shown in the figure, point C is the midpoint of segment AB and point D is the midpoint of segment BC. If AB = 20cm, then segment AD equals ()\nA. 10cm\nB. 5cm\nC. 15cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
21%|██▏ | 4698/22095 [7:46:36<24:10:51, 5.00s/it] {'loss': 0.3889, 'grad_norm': 0.7608710232314421, 'learning_rate': 9.151025905278647e-06, 'epoch': 0.21} 21%|██▏ | 4698/22095 [7:46:36<24:10:51, 5.00s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8344073 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10725, 'image': 'vrdu_table_final_2/astro-ph.CO/c257a434-1cc5-4bd4-8243-815d868b8dcc.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\begin{tabular}{c}$\beta$\end{tabular}\n```"}]}
21%|██▏ | 4699/22095 [7:46:40<21:52:27, 4.53s/it] {'loss': 0.3943, 'grad_norm': 0.6912646554784957, 'learning_rate': 9.15061728804094e-06, 'epoch': 0.21} 21%|██▏ | 4699/22095 [7:46:40<21:52:27, 4.53s/it]
21%|██▏ | 4700/22095 [7:46:43<19:12:33, 3.98s/it] {'loss': 0.3742, 'grad_norm': 0.6628035744904157, 'learning_rate': 9.150208581619215e-06, 'epoch': 0.21} 21%|██▏ | 4700/22095 [7:46:43<19:12:33, 3.98s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (132307 > 40960). Running this sequence through the model will result in indexing errors
21%|██▏ | 4701/22095 [7:46:46<17:58:22, 3.72s/it] {'loss': 0.3574, 'grad_norm': 0.8216102198117472, 'learning_rate': 9.149799786022256e-06, 'epoch': 0.21} 21%|██▏ | 4701/22095 [7:46:46<17:58:22, 3.72s/it]
21%|██▏ | 4702/22095 [7:46:49<17:03:36, 3.53s/it] {'loss': 0.3828, 'grad_norm': 0.7410885559035563, 'learning_rate': 9.149390901258841e-06, 'epoch': 0.21} 21%|██▏ | 4702/22095 [7:46:49<17:03:36, 3.53s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
21%|██▏ | 4703/22095 [7:46:52<16:40:46, 3.45s/it] {'loss': 0.3664, 'grad_norm': 0.6536436140256802, 'learning_rate': 9.14898192733776e-06, 'epoch': 0.21} 21%|██▏ | 4703/22095 [7:46:52<16:40:46, 3.45s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
21%|██▏ | 4704/22095 [7:46:56<16:57:44, 3.51s/it] {'loss': 0.429, 'grad_norm': 0.6977789389528802, 'learning_rate': 9.148572864267804e-06, 'epoch': 0.21}
21%|██▏ | 4704/22095 [7:46:56<16:57:44, 3.51s/it]
21%|██▏ | 4705/22095 [7:46:58<15:58:33, 3.31s/it] {'loss': 0.4213, 'grad_norm': 0.7408643479307443, 'learning_rate': 9.148163712057755e-06, 'epoch': 0.21} 21%|██▏ | 4705/22095 [7:46:58<15:58:33, 3.31s/it]
21%|██▏ | 4706/22095 [7:47:02<15:45:57, 3.26s/it] {'loss': 0.3908, 'grad_norm': 0.6933655118606328, 'learning_rate': 9.147754470716407e-06, 'epoch': 0.21} 21%|██▏ | 4706/22095 [7:47:02<15:45:57, 3.26s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11306078 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1535, 'image': 'taobao/1537.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is this?'}, {'from': 'gpt', 'value': 'This is a Chinese promotional image. Its contents are as follows:\n\n1. The text on the left side of the image reads "Tap the blue text in the upper-left corner to star the 粤卫平台 official account", i.e. it suggests the user tap the blue text in the upper-left corner of the image to star a WeChat official account named "粤卫平台".\n\n2. The right side of the image shows a small animal (possibly a panda) in a spacesuit riding a flying saucer. The animal looks very cute in its spacesuit, and the cartoon-style saucer is decorated with purple and blue stripes.\n\nOverall, this is a cartoon-style image promoting a WeChat official account, guiding users to follow the "粤卫平台" account by tapping the blue text in the upper-left corner.'}]}
21%|██▏ | 4707/22095 [7:47:05<15:20:31, 3.18s/it] {'loss': 0.3323, 'grad_norm': 0.6693690607323451, 'learning_rate': 9.147345140252557e-06, 'epoch': 0.21} 21%|██▏ | 4707/22095 [7:47:05<15:20:31, 3.18s/it]
21%|██▏ | 4708/22095 [7:47:09<16:26:19, 3.40s/it] {'loss': 0.3634, 'grad_norm': 0.6021530915845484, 'learning_rate': 9.146935720674996e-06, 'epoch': 0.21} 21%|██▏ | 4708/22095 [7:47:09<16:26:19, 3.40s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
21%|██▏ | 4709/22095 [7:47:20<27:53:11, 5.77s/it] {'loss': 0.4996, 'grad_norm': 0.37323411086081587, 'learning_rate': 9.146526211992523e-06, 'epoch': 0.21} 21%|██▏ | 4709/22095 [7:47:20<27:53:11, 5.77s/it]
21%|██▏ | 4710/22095 [7:47:24<25:48:08, 5.34s/it] {'loss': 0.3723, 'grad_norm': 0.6934432204208978, 'learning_rate': 9.146116614213938e-06, 'epoch': 0.21} 21%|██▏ | 4710/22095 [7:47:24<25:48:08, 5.34s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
21%|██▏ | 4711/22095 [7:47:28<23:24:51, 4.85s/it] {'loss': 0.4022, 'grad_norm': 0.684048900725944, 'learning_rate': 9.14570692734804e-06, 'epoch': 0.21} 21%|██▏ | 4711/22095 [7:47:28<23:24:51, 4.85s/it]
21%|██▏ | 4712/22095 [7:47:31<20:30:48, 4.25s/it] {'loss': 0.3528, 'grad_norm': 0.6876883349474486, 'learning_rate': 9.145297151403631e-06, 'epoch': 0.21} 21%|██▏ | 4712/22095 [7:47:31<20:30:48, 4.25s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
21%|██▏ | 4713/22095 [7:47:34<18:38:21, 3.86s/it] {'loss': 0.3581, 'grad_norm': 0.6900448103049897, 'learning_rate': 9.14488728638952e-06, 'epoch': 0.21} 21%|██▏ | 4713/22095 [7:47:34<18:38:21, 3.86s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (41378 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48836 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85483 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42735 > 40960). Running this sequence through the model will result in indexing errors
21%|██▏ | 4714/22095 [7:47:37<17:27:04, 3.61s/it] {'loss': 0.3596, 'grad_norm': 0.6654430713269235, 'learning_rate': 9.144477332314509e-06, 'epoch': 0.21} 21%|██▏ | 4714/22095 [7:47:37<17:27:04, 3.61s/it]
21%|██▏ | 4715/22095 [7:47:40<16:34:42, 3.43s/it] {'loss': 0.3447, 'grad_norm': 0.7570070927296473, 'learning_rate': 9.14406728918741e-06, 'epoch': 0.21} 21%|██▏ | 4715/22095 [7:47:40<16:34:42, 3.43s/it]
21%|██▏ | 4716/22095 [7:47:44<17:20:16, 3.59s/it] {'loss': 0.4206, 'grad_norm': 0.7467852955198542, 'learning_rate': 9.143657157017034e-06, 'epoch': 0.21} 21%|██▏ | 4716/22095 [7:47:44<17:20:16, 3.59s/it]
21%|██▏ | 4717/22095 [7:47:47<16:21:53, 3.39s/it] {'loss': 0.3856, 'grad_norm': 0.6630878162500251, 'learning_rate': 9.14324693581219e-06, 'epoch': 0.21} 21%|██▏ | 4717/22095 [7:47:47<16:21:53, 3.39s/it]
21%|██▏ | 4718/22095 [7:47:51<17:06:43, 3.55s/it] {'loss': 0.3687, 'grad_norm': 0.582274695283196, 'learning_rate': 9.142836625581694e-06, 'epoch': 0.21} 21%|██▏ | 4718/22095 [7:47:51<17:06:43, 3.55s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [142, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7806338 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [142, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '27678', 'image': '51983.jpg', 'image_wh': [[142, 25]], 'conversations': [{'from': 'human', 'value': '\nI am responding to the following question using the image as a reference: \nWhat small or subtle details can you identify in the image that might be easy to overlook? \nHere is my answer: \nThe image showcases a mathematical equation. The formula in the image is interpreted as "Val1 * Val2".\n\n### Image Description:\n1. **Object:** The mathematical expression "Val1 * Val2".\n2. **Symbols and Variables:**\n - **Val1 and Val2:** These are variables or placeholders denoting specific values or quantities. Given the absence of context within the image, they might symbolize anything from straightforward numerical values to intricate mathematical entities.\n - **Multiplication Sign (*):** This operator indicates that the values represented by Val1 and Val2 should be multiplied together.\n\n### Analysis:\nThe image consists exclusively of a mathematical expression featuring two variables and a multiplication operator. There is no extra context or extraneous visual data.\n\n### Relevant Knowledge:\nIn mathematics and associated domains, variables such as "Val1" and "Val2" are utilized to denote unknown or generalized quantities.
The multiplication operator (*) is among the primary operations, denoting the product of two quantities.\n\n**Example of Possible Interpretations and Applications:**\n- In arithmetic, if Val1 = 2 and Val2 = 3, then Val1 * Val2 = 6.\n- If these variables represent dimensions in a physics problem, they could be lengths, where Val1 could be the length of an object and Val2 its width.\n\n### Sample Question and Answer:\n**Question:** If Val1 = 4 and Val2 = 3, what is Val1 * Val2?\n**Answer (CoT):**\n1. Recognize the provided values:\n - Val1 = 4\n - Val2 = 3\n2. Implement the multiplication operator:\n - 4 * 3\n3. Determine the product:\n - 4 * 3 = 12\nHence, Val1 * Val2 = 12.\n\n**Elaboration:**\nWithout detailed context, the comprehension of how to multiply these variables holds universal relevance. Regardless of whether Val1 and Val2 are scalars, matrices, or more elaborate entities, multiplication plays a pivotal role in mathematics, implying that we multiply corresponding elements within the structure delineated by the variables. This facilitates extensive applications across various fields, from fundamental arithmetic to sophisticated sectors like computational engineering where these variables could represent elements in complex systems or equations. \nKindly correct my answer based on the image, and if it\'s wrong, please explain the reason.'}, {'from': 'gpt', 'value': 'Your description is detailed and well-structured, demonstrating a clear understanding of how to interpret and analyze mathematical expressions. However, there\'s a discrepancy between the description you\'ve provided and the image you mentioned later. Initially, you described an equation with a multiplication operation ("Val1 * Val2"), but later, you mentioned an addition operation ("Vox1 + Vox2"). This inconsistency suggests that there might have been a mix-up between two different images or scenarios.\n\nTo maintain accuracy, ensure that the description matches the image you are analyzing. 
If the image indeed shows an addition operation, the initial part of your description would need to be adjusted to reflect this correctly. Here are a few adjustments you might consider:\n\n- Change "Val1 * Val2" to "Vox1 + Vox2" in the first part of the description.\n- Adjust the examples and sample question and answer to match the addition operation.\n- Ensure that any further explanations or elaborations align with the operation depicted in the image.\n\nThis adjustment will help maintain consistency and accuracy in your description.'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [70, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390158 in VC:s3://internvl-moe-sft-data/. Exception: Image size [70, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56977, 'image': 'vrdu_table_final_2/astro-ph.EP/552644f4-80b5-415a-8874-feef63657ce2.png', 'image_wh': [[70, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} Name \\\\ \\end{tabular}\n```"}]} 21%|██▏ | 4719/22095 [7:47:54<16:45:42, 3.47s/it] {'loss': 0.3833, 'grad_norm': 0.6545244065786855, 'learning_rate': 9.142426226334365e-06, 'epoch': 0.21} 21%|██▏ | 4719/22095 [7:47:54<16:45:42, 3.47s/it] 21%|██▏ | 4720/22095 [7:47:58<17:15:11, 3.57s/it] {'loss': 0.4312, 'grad_norm': 0.6883635673957185, 'learning_rate': 9.142015738079017e-06, 'epoch': 0.21} 21%|██▏ | 4720/22095 [7:47:58<17:15:11, 3.57s/it] 21%|██▏ | 4721/22095 [7:48:01<16:15:49, 3.37s/it] {'loss': 0.3916, 'grad_norm': 0.6751076178839724, 'learning_rate': 9.141605160824473e-06, 'epoch': 0.21} 21%|██▏ | 4721/22095 [7:48:01<16:15:49, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47108 > 40960). 
Running this sequence through the model will result in indexing errors
21%|██▏ | 4722/22095 [7:48:10<24:59:34, 5.18s/it] {'loss': 0.511, 'grad_norm': 0.5309631556880734, 'learning_rate': 9.141194494579553e-06, 'epoch': 0.21} 21%|██▏ | 4722/22095 [7:48:10<24:59:34, 5.18s/it]
21%|██▏ | 4723/22095 [7:48:14<22:51:54, 4.74s/it] {'loss': 0.3938, 'grad_norm': 0.7167569087582294, 'learning_rate': 9.140783739353083e-06, 'epoch': 0.21} 21%|██▏ | 4723/22095 [7:48:14<22:51:54, 4.74s/it]
21%|██▏ | 4724/22095 [7:48:17<20:14:55, 4.20s/it] {'loss': 0.4021, 'grad_norm': 0.662087956027807, 'learning_rate': 9.140372895153887e-06, 'epoch': 0.21} 21%|██▏ | 4724/22095 [7:48:17<20:14:55, 4.20s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8374107 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40882, 'image': 'vrdu_table_final_2/astro-ph.CO/5c85a751-0197-4cee-b4c3-58cbd21dfc0f.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 21%|██▏ | 4725/22095 [7:48:20<18:36:06, 3.86s/it] {'loss': 0.4173, 'grad_norm': 0.6237946620606332, 'learning_rate': 9.139961961990796e-06, 'epoch': 0.21} 21%|██▏ | 4725/22095 [7:48:20<18:36:06, 3.86s/it] 21%|██▏ | 4726/22095 [7:48:23<17:23:51, 3.61s/it] {'loss': 0.4065, 'grad_norm': 0.6856754692559144, 'learning_rate': 9.139550939872635e-06, 'epoch': 0.21} 21%|██▏ | 4726/22095 [7:48:23<17:23:51, 3.61s/it] 21%|██▏ | 4727/22095 [7:48:26<16:27:52, 3.41s/it] {'loss': 0.4086, 'grad_norm': 0.7084939389228818, 'learning_rate': 9.139139828808238e-06, 'epoch': 0.21} 21%|██▏ | 4727/22095 [7:48:26<16:27:52, 3.41s/it] 21%|██▏ | 4728/22095 [7:48:29<16:25:12, 3.40s/it] {'loss': 0.3686, 'grad_norm': 0.6144425495088535, 'learning_rate': 9.13872862880644e-06, 'epoch': 0.21} 21%|██▏ | 4728/22095 [7:48:29<16:25:12, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██▏ | 4729/22095 [7:48:32<16:14:54, 3.37s/it] {'loss': 0.3772, 'grad_norm': 0.6861154793751022, 'learning_rate': 9.138317339876073e-06, 'epoch': 0.21} 21%|██▏ | 4729/22095 [7:48:32<16:14:54, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 21%|██▏ | 4730/22095 [7:48:37<18:38:13, 3.86s/it] {'loss': 0.5363, 'grad_norm': 0.5890343111376076, 'learning_rate': 9.137905962025977e-06, 'epoch': 0.21} 21%|██▏ | 4730/22095 [7:48:37<18:38:13, 3.86s/it] 21%|██▏ | 4731/22095 [7:48:41<18:02:33, 3.74s/it] {'loss': 0.4486, 'grad_norm': 0.7217914460843785, 'learning_rate': 9.13749449526499e-06, 'epoch': 0.21} 21%|██▏ | 
4731/22095 [7:48:41<18:02:33, 3.74s/it] 21%|██▏ | 4732/22095 [7:48:44<17:12:06, 3.57s/it] {'loss': 0.4231, 'grad_norm': 0.6471609436769172, 'learning_rate': 9.137082939601953e-06, 'epoch': 0.21} 21%|██▏ | 4732/22095 [7:48:44<17:12:06, 3.57s/it] 21%|██▏ | 4733/22095 [7:48:47<16:11:25, 3.36s/it] {'loss': 0.4308, 'grad_norm': 0.6431682681393758, 'learning_rate': 9.136671295045713e-06, 'epoch': 0.21} 21%|██▏ | 4733/22095 [7:48:47<16:11:25, 3.36s/it] 21%|██▏ | 4734/22095 [7:48:50<16:08:45, 3.35s/it] {'loss': 0.3905, 'grad_norm': 0.6473156110408935, 'learning_rate': 9.13625956160511e-06, 'epoch': 0.21} 21%|██▏ | 4734/22095 [7:48:50<16:08:45, 3.35s/it] 21%|██▏ | 4735/22095 [7:48:54<16:39:59, 3.46s/it] {'loss': 0.4215, 'grad_norm': 0.6914681606814352, 'learning_rate': 9.135847739288991e-06, 'epoch': 0.21} 21%|██▏ | 4735/22095 [7:48:54<16:39:59, 3.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██▏ | 4736/22095 [7:48:58<17:20:56, 3.60s/it] {'loss': 0.3643, 'grad_norm': 0.6284487613866135, 'learning_rate': 9.135435828106208e-06, 'epoch': 0.21} 21%|██▏ | 4736/22095 [7:48:58<17:20:56, 3.60s/it] 21%|██▏ | 4737/22095 [7:49:01<16:32:47, 3.43s/it] {'loss': 0.3776, 'grad_norm': 0.6258062829449805, 'learning_rate': 9.135023828065609e-06, 'epoch': 0.21} 21%|██▏ | 4737/22095 [7:49:01<16:32:47, 3.43s/it] 21%|██▏ | 4738/22095 [7:49:04<15:50:36, 3.29s/it] {'loss': 0.3803, 'grad_norm': 0.6517393869689241, 'learning_rate': 9.13461173917605e-06, 'epoch': 0.21} 21%|██▏ | 4738/22095 [7:49:04<15:50:36, 3.29s/it] 21%|██▏ | 4739/22095 [7:49:07<15:49:03, 3.28s/it] {'loss': 0.4224, 'grad_norm': 0.6343432447683893, 'learning_rate': 9.134199561446379e-06, 'epoch': 0.21} 21%|██▏ | 4739/22095 [7:49:07<15:49:03, 3.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 21%|██▏ | 4740/22095 [7:49:11<16:34:47, 3.44s/it] {'loss': 0.3879, 'grad_norm': 
0.7060100945070159, 'learning_rate': 9.13378729488546e-06, 'epoch': 0.21} 21%|██▏ | 4740/22095 [7:49:11<16:34:47, 3.44s/it] 21%|██▏ | 4741/22095 [7:49:14<15:56:09, 3.31s/it] {'loss': 0.3826, 'grad_norm': 0.6590280271404857, 'learning_rate': 9.133374939502147e-06, 'epoch': 0.21} 21%|██▏ | 4741/22095 [7:49:14<15:56:09, 3.31s/it] 21%|██▏ | 4742/22095 [7:49:18<17:08:47, 3.56s/it] {'loss': 0.373, 'grad_norm': 0.6401579046669076, 'learning_rate': 9.132962495305302e-06, 'epoch': 0.21} 21%|██▏ | 4742/22095 [7:49:18<17:08:47, 3.56s/it] 21%|██▏ | 4743/22095 [7:49:21<16:33:13, 3.43s/it] {'loss': 0.4087, 'grad_norm': 0.8626848325193524, 'learning_rate': 9.132549962303786e-06, 'epoch': 0.21} 21%|██▏ | 4743/22095 [7:49:21<16:33:13, 3.43s/it] 21%|██▏ | 4744/22095 [7:49:25<17:25:51, 3.62s/it] {'loss': 0.3839, 'grad_norm': 0.6145110012458083, 'learning_rate': 9.132137340506464e-06, 'epoch': 0.21} 21%|██▏ | 4744/22095 [7:49:26<17:25:51, 3.62s/it] 21%|██▏ | 4745/22095 [7:49:30<18:50:46, 3.91s/it] {'loss': 0.3955, 'grad_norm': 0.6269187719056039, 'learning_rate': 9.131724629922199e-06, 'epoch': 0.21} 21%|██▏ | 4745/22095 [7:49:30<18:50:46, 3.91s/it] 21%|██▏ | 4746/22095 [7:49:33<18:31:19, 3.84s/it] {'loss': 0.3995, 'grad_norm': 0.6731846589000745, 'learning_rate': 9.131311830559864e-06, 'epoch': 0.21} 21%|██▏ | 4746/22095 [7:49:33<18:31:19, 3.84s/it] 21%|██▏ | 4747/22095 [7:49:37<18:04:58, 3.75s/it] {'loss': 0.3406, 'grad_norm': 0.628819585985676, 'learning_rate': 9.130898942428326e-06, 'epoch': 0.21} 21%|██▏ | 4747/22095 [7:49:37<18:04:58, 3.75s/it] 21%|██▏ | 4748/22095 [7:49:41<17:54:59, 3.72s/it] {'loss': 0.4098, 'grad_norm': 0.6643307393653383, 'learning_rate': 9.130485965536455e-06, 'epoch': 0.21} 21%|██▏ | 4748/22095 [7:49:41<17:54:59, 3.72s/it] 21%|██▏ | 4749/22095 [7:49:44<17:59:05, 3.73s/it] {'loss': 0.4016, 'grad_norm': 0.6360776379996099, 'learning_rate': 9.130072899893127e-06, 'epoch': 0.21} 21%|██▏ | 4749/22095 [7:49:44<17:59:05, 3.73s/it] 21%|██▏ | 4750/22095 
[7:49:49<18:43:22, 3.89s/it] {'loss': 0.405, 'grad_norm': 0.6744085126225986, 'learning_rate': 9.129659745507219e-06, 'epoch': 0.21} 21%|██▏ | 4750/22095 [7:49:49<18:43:22, 3.89s/it] 22%|██▏ | 4751/22095 [7:49:53<19:34:32, 4.06s/it] {'loss': 0.3999, 'grad_norm': 0.6673408539577744, 'learning_rate': 9.129246502387602e-06, 'epoch': 0.22} 22%|██▏ | 4751/22095 [7:49:53<19:34:32, 4.06s/it] 22%|██▏ | 4752/22095 [7:49:56<18:27:58, 3.83s/it] {'loss': 0.4242, 'grad_norm': 0.7208579860734767, 'learning_rate': 9.128833170543164e-06, 'epoch': 0.22} 22%|██▏ | 4752/22095 [7:49:56<18:27:58, 3.83s/it] 22%|██▏ | 4753/22095 [7:50:00<17:40:56, 3.67s/it] {'loss': 0.3537, 'grad_norm': 0.5792654836484264, 'learning_rate': 9.12841974998278e-06, 'epoch': 0.22} 22%|██▏ | 4753/22095 [7:50:00<17:40:56, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92877 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100903 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4754/22095 [7:50:03<16:54:35, 3.51s/it] {'loss': 0.3769, 'grad_norm': 0.6245147101677967, 'learning_rate': 9.128006240715335e-06, 'epoch': 0.22} 22%|██▏ | 4754/22095 [7:50:03<16:54:35, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (108416 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70073 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62543 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4755/22095 [7:50:07<17:22:14, 3.61s/it] {'loss': 0.4466, 'grad_norm': 0.7417289049341894, 'learning_rate': 9.127592642749714e-06, 'epoch': 0.22} 22%|██▏ | 4755/22095 [7:50:07<17:22:14, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92202 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49484 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4756/22095 [7:50:10<17:00:21, 3.53s/it] {'loss': 0.4151, 'grad_norm': 0.7202075800051113, 'learning_rate': 9.127178956094805e-06, 'epoch': 0.22} 22%|██▏ | 4756/22095 [7:50:10<17:00:21, 3.53s/it] 22%|██▏ | 4757/22095 [7:50:13<16:02:32, 3.33s/it] {'loss': 0.4143, 'grad_norm': 0.6338443835750316, 'learning_rate': 9.126765180759495e-06, 'epoch': 0.22} 22%|██▏ | 4757/22095 [7:50:13<16:02:32, 3.33s/it] 22%|██▏ | 4758/22095 [7:50:16<15:38:10, 3.25s/it] {'loss': 0.3949, 'grad_norm': 0.7038352222725998, 'learning_rate': 9.126351316752677e-06, 'epoch': 0.22} 22%|██▏ | 4758/22095 [7:50:16<15:38:10, 3.25s/it] 22%|██▏ | 4759/22095 [7:50:19<15:06:05, 3.14s/it] {'loss': 0.383, 'grad_norm': 0.7621821057855113, 'learning_rate': 9.125937364083241e-06, 'epoch': 0.22} 22%|██▏ | 4759/22095 [7:50:19<15:06:05, 3.14s/it] 22%|██▏ | 4760/22095 [7:50:24<17:32:37, 3.64s/it] {'loss': 0.3882, 'grad_norm': 0.6183332780913333, 'learning_rate': 9.125523322760084e-06, 'epoch': 0.22} 22%|██▏ | 4760/22095 [7:50:24<17:32:37, 3.64s/it] 22%|██▏ | 4761/22095 [7:50:27<16:29:51, 3.43s/it] {'loss': 0.3748, 'grad_norm': 0.7084547271369006, 'learning_rate': 9.1251091927921e-06, 
'epoch': 0.22} 22%|██▏ | 4761/22095 [7:50:27<16:29:51, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4762/22095 [7:50:36<24:56:18, 5.18s/it] {'loss': 0.4986, 'grad_norm': 0.6274635436111439, 'learning_rate': 9.124694974188188e-06, 'epoch': 0.22} 22%|██▏ | 4762/22095 [7:50:36<24:56:18, 5.18s/it] 22%|██▏ | 4763/22095 [7:50:46<31:38:03, 6.57s/it] {'loss': 0.4994, 'grad_norm': 0.3987272632360564, 'learning_rate': 9.124280666957251e-06, 'epoch': 0.22} 22%|██▏ | 4763/22095 [7:50:46<31:38:03, 6.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68723 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48343 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51132 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68394 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4764/22095 [7:50:52<31:08:06, 6.47s/it] {'loss': 0.4929, 'grad_norm': 0.3045904274637565, 'learning_rate': 9.123866271108188e-06, 'epoch': 0.22} 22%|██▏ | 4764/22095 [7:50:52<31:08:06, 6.47s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 22%|██▏ | 4765/22095 [7:50:55<26:35:25, 5.52s/it] {'loss': 0.3927, 'grad_norm': 0.7009826919682061, 'learning_rate': 9.123451786649906e-06, 'epoch': 0.22} 22%|██▏ | 4765/22095 [7:50:55<26:35:25, 5.52s/it] 22%|██▏ | 4766/22095 [7:50:59<23:54:09, 4.97s/it] {'loss': 0.4443, 'grad_norm': 0.6765166384139596, 'learning_rate': 9.123037213591308e-06, 'epoch': 0.22} 22%|██▏ | 4766/22095 [7:50:59<23:54:09, 4.97s/it] 22%|██▏ | 4767/22095 [7:51:02<20:42:36, 4.30s/it] {'loss': 0.4134, 'grad_norm': 0.6274547084962255, 'learning_rate': 9.122622551941303e-06, 'epoch': 0.22} 22%|██▏ | 4767/22095 [7:51:02<20:42:36, 4.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67665 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49929 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4768/22095 [7:51:05<19:43:19, 4.10s/it] {'loss': 0.4027, 'grad_norm': 0.6800427324859651, 'learning_rate': 9.122207801708802e-06, 'epoch': 0.22} 22%|██▏ | 4768/22095 [7:51:05<19:43:19, 4.10s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [106, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8336709 in VC:s3://internvl-moe-sft-data/. Exception: Image size [106, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3329, 'image': 'vrdu_table_final_2/astro-ph.CO/9a39d18f-09be-46a3-baca-71519a81fddb.png', 'image_wh': [[106, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}} Remarks \\end{tabular}\n```"}]} 22%|██▏ | 4769/22095 [7:51:08<17:57:02, 3.73s/it] {'loss': 0.3804, 'grad_norm': 0.6689189439956374, 'learning_rate': 9.121792962902715e-06, 'epoch': 0.22} 22%|██▏ | 4769/22095 [7:51:08<17:57:02, 3.73s/it] 22%|██▏ | 4770/22095 [7:51:12<18:15:32, 3.79s/it] {'loss': 0.3983, 'grad_norm': 1.0224311576834435, 'learning_rate': 9.121378035531957e-06, 'epoch': 0.22} 22%|██▏ | 4770/22095 [7:51:12<18:15:32, 3.79s/it] 22%|██▏ | 4771/22095 [7:51:15<16:51:52, 3.50s/it] {'loss': 0.3779, 'grad_norm': 0.6625989732933459, 'learning_rate': 9.120963019605442e-06, 'epoch': 0.22} 22%|██▏ | 4771/22095 [7:51:15<16:51:52, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50195 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56727 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4772/22095 [7:51:25<25:53:03, 5.38s/it] {'loss': 0.5067, 'grad_norm': 0.9989802934708129, 'learning_rate': 9.12054791513209e-06, 'epoch': 0.22} 22%|██▏ | 4772/22095 [7:51:25<25:53:03, 5.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365277 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 32018, 'image': 'vrdu_table_final_2/astro-ph.CO/bfb4eb44-6722-46e0-8fdf-c0b9de11a30f.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}0.05\\end{tabular}\n```"}]} 22%|██▏ | 4773/22095 [7:51:28<23:21:36, 4.85s/it] {'loss': 0.3762, 'grad_norm': 0.6473775166086109, 'learning_rate': 9.120132722120817e-06, 'epoch': 0.22} 22%|██▏ | 4773/22095 [7:51:28<23:21:36, 4.85s/it] 22%|██▏ | 4774/22095 [7:51:31<20:41:46, 4.30s/it] {'loss': 0.4046, 'grad_norm': 0.6851233182293817, 'learning_rate': 9.119717440580547e-06, 'epoch': 0.22} 22%|██▏ | 4774/22095 [7:51:31<20:41:46, 4.30s/it] 22%|██▏ | 4775/22095 [7:51:34<19:02:13, 3.96s/it] {'loss': 0.3725, 'grad_norm': 0.6633570528725057, 'learning_rate': 9.1193020705202e-06, 'epoch': 0.22} 22%|██▏ | 4775/22095 [7:51:34<19:02:13, 3.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_234948.png 2025-08-27 23:49:33.195630 load time: 1031.67 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_044849_before_screenshot.png 2025-08-27 23:49:33.197099 load time: 1042.44 ms 22%|██▏ | 4776/22095 [7:51:44<26:47:58, 5.57s/it] {'loss': 0.4849, 'grad_norm': 0.38037237653395206, 'learning_rate': 9.118886611948704e-06, 'epoch': 0.22} 22%|██▏ | 4776/22095 [7:51:44<26:47:58, 5.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908009 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 31162, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 22%|██▏ | 4777/22095 [7:51:48<25:04:48, 5.21s/it] {'loss': 0.3842, 'grad_norm': 0.7112060405305283, 'learning_rate': 9.118471064874985e-06, 'epoch': 0.22} 22%|██▏ | 4777/22095 [7:51:48<25:04:48, 5.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4778/22095 [7:51:57<30:16:54, 6.30s/it] {'loss': 0.4991, 'grad_norm': 0.5085666920474634, 'learning_rate': 9.118055429307972e-06, 'epoch': 0.22} 22%|██▏ | 4778/22095 [7:51:57<30:16:54, 6.30s/it] 22%|██▏ | 4779/22095 [7:52:00<26:15:55, 5.46s/it] {'loss': 0.4218, 'grad_norm': 0.7313166012275039, 'learning_rate': 9.117639705256595e-06, 'epoch': 0.22} 22%|██▏ | 4779/22095 [7:52:00<26:15:55, 5.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4780/22095 [7:52:06<25:44:11, 5.35s/it] {'loss': 0.5184, 'grad_norm': 0.5094894776139752, 'learning_rate': 9.117223892729788e-06, 'epoch': 0.22} 22%|██▏ | 4780/22095 [7:52:06<25:44:11, 5.35s/it] 22%|██▏ | 4781/22095 [7:52:09<22:42:28, 4.72s/it] {'loss': 0.4278, 'grad_norm': 0.8138243525194827, 'learning_rate': 9.116807991736483e-06, 'epoch': 0.22} 22%|██▏ | 4781/22095 [7:52:09<22:42:28, 4.72s/it] 22%|██▏ | 4782/22095 [7:52:13<21:15:10, 4.42s/it] {'loss': 0.3727, 'grad_norm': 0.7507256150215487, 'learning_rate': 9.11639200228562e-06, 'epoch': 0.22} 22%|██▏ | 4782/22095 [7:52:13<21:15:10, 4.42s/it] 22%|██▏ | 4783/22095 [7:52:16<19:30:35, 4.06s/it] {'loss': 0.4084, 'grad_norm': 0.7920366485254184, 'learning_rate': 9.115975924386133e-06, 'epoch': 0.22} 22%|██▏ | 4783/22095 [7:52:16<19:30:35, 4.06s/it] 22%|██▏ | 4784/22095 [7:52:19<17:44:29, 3.69s/it] {'loss': 0.3937, 'grad_norm': 0.6485474045722099, 'learning_rate': 9.115559758046967e-06, 'epoch': 0.22} 
22%|██▏ | 4784/22095 [7:52:19<17:44:29, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78866 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4785/22095 [7:52:23<18:19:03, 3.81s/it] {'loss': 0.4281, 'grad_norm': 0.935641414641697, 'learning_rate': 9.115143503277061e-06, 'epoch': 0.22} 22%|██▏ | 4785/22095 [7:52:23<18:19:03, 3.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (103585 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4786/22095 [7:52:27<18:31:48, 3.85s/it] {'loss': 0.4214, 'grad_norm': 0.690370433047125, 'learning_rate': 9.11472716008536e-06, 'epoch': 0.22} 22%|██▏ | 4786/22095 [7:52:27<18:31:48, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83120 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4787/22095 [7:52:30<17:59:12, 3.74s/it] {'loss': 0.4211, 'grad_norm': 0.6883280151352387, 'learning_rate': 9.114310728480809e-06, 'epoch': 0.22} 22%|██▏ | 4787/22095 [7:52:30<17:59:12, 3.74s/it] 22%|██▏ | 4788/22095 [7:52:33<16:43:23, 3.48s/it] {'loss': 0.4069, 'grad_norm': 0.6955993473765785, 'learning_rate': 9.113894208472357e-06, 'epoch': 0.22} 22%|██▏ | 4788/22095 [7:52:33<16:43:23, 3.48s/it] 22%|██▏ | 4789/22095 [7:52:36<15:46:40, 3.28s/it] {'loss': 0.3835, 'grad_norm': 0.9616140275679443, 'learning_rate': 9.113477600068954e-06, 'epoch': 0.22} 22%|██▏ | 4789/22095 [7:52:36<15:46:40, 3.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4790/22095 [7:52:39<15:40:17, 3.26s/it] {'loss': 0.4239, 'grad_norm': 0.6925944169611911, 'learning_rate': 9.11306090327955e-06, 'epoch': 0.22} 22%|██▏ | 4790/22095 [7:52:39<15:40:17, 3.26s/it] 22%|██▏ | 4791/22095 [7:52:42<15:41:30, 3.26s/it] 
{'loss': 0.4307, 'grad_norm': 0.6841971425614416, 'learning_rate': 9.112644118113098e-06, 'epoch': 0.22} 22%|██▏ | 4791/22095 [7:52:42<15:41:30, 3.26s/it] 22%|██▏ | 4792/22095 [7:52:45<15:08:32, 3.15s/it] {'loss': 0.3921, 'grad_norm': 0.6420255859979275, 'learning_rate': 9.112227244578557e-06, 'epoch': 0.22} 22%|██▏ | 4792/22095 [7:52:45<15:08:32, 3.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (83573 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55476 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4793/22095 [7:52:54<23:57:18, 4.98s/it] {'loss': 0.4979, 'grad_norm': 1.0857826233798953, 'learning_rate': 9.111810282684883e-06, 'epoch': 0.22} 22%|██▏ | 4793/22095 [7:52:54<23:57:18, 4.98s/it] 22%|██▏ | 4794/22095 [7:53:03<28:47:20, 5.99s/it] {'loss': 0.5308, 'grad_norm': 0.7103692532900913, 'learning_rate': 9.111393232441033e-06, 'epoch': 0.22} 22%|██▏ | 4794/22095 [7:53:03<28:47:20, 5.99s/it] 22%|██▏ | 4795/22095 [7:53:11<31:23:25, 6.53s/it] {'loss': 0.515, 'grad_norm': 0.31048946305929215, 'learning_rate': 9.11097609385597e-06, 'epoch': 0.22} 22%|██▏ | 4795/22095 [7:53:11<31:23:25, 6.53s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (49133 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84643 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46892 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48843 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4796/22095 [7:53:14<26:33:26, 5.53s/it] {'loss': 0.3976, 'grad_norm': 0.7646353820957875, 'learning_rate': 9.110558866938657e-06, 'epoch': 0.22} 22%|██▏ | 4796/22095 [7:53:14<26:33:26, 5.53s/it] 22%|██▏ | 4797/22095 [7:53:17<23:44:03, 4.94s/it] {'loss': 0.4052, 'grad_norm': 0.847772148923698, 'learning_rate': 9.110141551698058e-06, 'epoch': 0.22} 22%|██▏ | 4797/22095 [7:53:17<23:44:03, 4.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41170 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4798/22095 [7:53:21<21:31:20, 4.48s/it] {'loss': 0.3784, 'grad_norm': 0.781170161994958, 'learning_rate': 9.10972414814314e-06, 'epoch': 0.22} 22%|██▏ | 4798/22095 [7:53:21<21:31:20, 4.48s/it] 22%|██▏ | 4799/22095 [7:53:24<20:30:39, 4.27s/it] {'loss': 0.4047, 'grad_norm': 0.7387783443504868, 'learning_rate': 9.109306656282873e-06, 'epoch': 0.22} 22%|██▏ | 4799/22095 [7:53:24<20:30:39, 4.27s/it] 22%|██▏ | 4800/22095 [7:53:28<19:07:57, 3.98s/it] {'loss': 0.415, 'grad_norm': 0.7776877608382816, 'learning_rate': 9.108889076126226e-06, 'epoch': 0.22} 22%|██▏ | 4800/22095 [7:53:28<19:07:57, 3.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4801/22095 [7:53:37<26:53:35, 5.60s/it] {'loss': 0.5213, 'grad_norm': 1.8493601277062992, 'learning_rate': 9.108471407682173e-06, 'epoch': 0.22} 22%|██▏ | 4801/22095 [7:53:37<26:53:35, 5.60s/it] 22%|██▏ | 4802/22095 [7:53:40<23:33:50, 4.91s/it] {'loss': 0.401, 'grad_norm': 0.7543258013996105, 'learning_rate': 9.108053650959687e-06, 'epoch': 0.22} 22%|██▏ | 4802/22095 [7:53:40<23:33:50, 4.91s/it] 22%|██▏ | 4803/22095 [7:53:44<20:58:21, 4.37s/it] 
{'loss': 0.4269, 'grad_norm': 0.6386692161119694, 'learning_rate': 9.107635805967746e-06, 'epoch': 0.22} 22%|██▏ | 4803/22095 [7:53:44<20:58:21, 4.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4804/22095 [7:53:51<24:49:54, 5.17s/it] {'loss': 0.5302, 'grad_norm': 0.7614681871635838, 'learning_rate': 9.107217872715326e-06, 'epoch': 0.22} 22%|██▏ | 4804/22095 [7:53:51<24:49:54, 5.17s/it] 22%|██▏ | 4805/22095 [7:53:55<23:18:39, 4.85s/it] {'loss': 0.4, 'grad_norm': 0.7008794665046963, 'learning_rate': 9.10679985121141e-06, 'epoch': 0.22} 22%|██▏ | 4805/22095 [7:53:55<23:18:39, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (66725 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4806/22095 [7:54:02<26:54:36, 5.60s/it] {'loss': 0.5107, 'grad_norm': 0.6508911059006347, 'learning_rate': 9.106381741464976e-06, 'epoch': 0.22} 22%|██▏ | 4806/22095 [7:54:02<26:54:36, 5.60s/it] 22%|██▏ | 4807/22095 [7:54:05<23:32:22, 4.90s/it] {'loss': 0.3736, 'grad_norm': 0.7100459278013462, 'learning_rate': 9.105963543485012e-06, 'epoch': 0.22} 22%|██▏ | 4807/22095 [7:54:05<23:32:22, 4.90s/it] 22%|██▏ | 4808/22095 [7:54:08<20:35:46, 4.29s/it] {'loss': 0.4255, 'grad_norm': 0.7069088764353862, 'learning_rate': 9.105545257280502e-06, 'epoch': 0.22} 22%|██▏ | 4808/22095 [7:54:08<20:35:46, 4.29s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8306787 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1vLFzLXXXXXbpXVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCan you identify all the words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n真維斯\nNAT\nURE\nEST.1972\nwa315.\ntaobao.\ncom\n专柜正品\n送运费险'}]} 22%|██▏ | 4809/22095 [7:54:12<19:20:32, 4.03s/it] {'loss': 0.4482, 'grad_norm': 0.7355101644306209, 'learning_rate': 9.105126882860431e-06, 'epoch': 0.22} 22%|██▏ | 4809/22095 [7:54:12<19:20:32, 4.03s/it] 22%|██▏ | 4810/22095 [7:54:14<17:37:18, 3.67s/it] {'loss': 0.3661, 'grad_norm': 0.6537061814493169, 'learning_rate': 9.104708420233794e-06, 'epoch': 0.22} 22%|██▏ | 4810/22095 [7:54:14<17:37:18, 3.67s/it] 22%|██▏ | 4811/22095 [7:54:18<16:58:55, 3.54s/it] {'loss': 0.3673, 'grad_norm': 0.7340349633672104, 'learning_rate': 9.104289869409577e-06, 'epoch': 0.22} 22%|██▏ | 4811/22095 [7:54:18<16:58:55, 3.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53645 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51018 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101470 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4812/22095 [7:54:21<16:10:49, 3.37s/it] {'loss': 0.3988, 'grad_norm': 0.6970942224495815, 'learning_rate': 9.103871230396778e-06, 'epoch': 0.22} 22%|██▏ | 4812/22095 [7:54:21<16:10:49, 3.37s/it] 22%|██▏ | 4813/22095 [7:54:23<15:22:20, 3.20s/it] {'loss': 0.3794, 'grad_norm': 0.6889946495799093, 'learning_rate': 9.10345250320439e-06, 'epoch': 0.22} 22%|██▏ | 4813/22095 [7:54:23<15:22:20, 3.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4814/22095 [7:54:33<25:01:17, 5.21s/it] {'loss': 0.4945, 'grad_norm': 1.2788229066156596, 'learning_rate': 9.103033687841412e-06, 'epoch': 0.22} 22%|██▏ | 4814/22095 [7:54:33<25:01:17, 5.21s/it] 22%|██▏ | 4815/22095 [7:54:38<23:41:21, 4.94s/it] {'loss': 0.3967, 'grad_norm': 0.6827859617043349, 'learning_rate': 9.10261478431684e-06, 'epoch': 0.22} 22%|██▏ | 4815/22095 [7:54:38<23:41:21, 4.94s/it] 22%|██▏ | 4816/22095 [7:54:41<20:53:03, 4.35s/it] {'loss': 0.4055, 'grad_norm': 0.6939430838461157, 'learning_rate': 9.102195792639677e-06, 'epoch': 0.22} 22%|██▏ | 4816/22095 [7:54:41<20:53:03, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46857 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4817/22095 [7:54:44<19:22:47, 4.04s/it] {'loss': 0.4031, 'grad_norm': 0.6831299091964597, 'learning_rate': 9.101776712818924e-06, 'epoch': 0.22} 22%|██▏ | 4817/22095 [7:54:44<19:22:47, 4.04s/it] 22%|██▏ | 4818/22095 [7:54:47<18:04:04, 3.76s/it] {'loss': 0.398, 'grad_norm': 0.6629483845316078, 'learning_rate': 9.101357544863589e-06, 'epoch': 0.22} 22%|██▏ | 4818/22095 [7:54:47<18:04:04, 3.76s/it] 22%|██▏ | 4819/22095 [7:54:50<17:00:08, 3.54s/it] {'loss': 0.4025, 'grad_norm': 0.7171445450976456, 'learning_rate': 9.100938288782675e-06, 'epoch': 0.22} 22%|██▏ | 4819/22095 [7:54:50<17:00:08, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [239, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8422193 in VC:s3://internvl-moe-sft-data/. Exception: Image size [239, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 101014, 'image': 'vrdu_texteq/astro-ph.CO/5626f396-aa10-49c1-9c98-57f4ebb4ae1c.png', 'image_wh': [[239, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'and a skewness $\\delta$ as'}]} 22%|██▏ | 4820/22095 [7:55:02<28:35:18, 5.96s/it] {'loss': 0.4927, 'grad_norm': 0.5089509713601021, 'learning_rate': 9.100518944585194e-06, 'epoch': 0.22} 22%|██▏ | 4820/22095 [7:55:02<28:35:18, 5.96s/it] 22%|██▏ | 4821/22095 [7:55:05<24:44:26, 5.16s/it] {'loss': 0.4026, 'grad_norm': 0.7030141717004029, 'learning_rate': 9.100099512280155e-06, 'epoch': 0.22} 22%|██▏ | 4821/22095 [7:55:05<24:44:26, 5.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4822/22095 [7:55:08<21:51:51, 4.56s/it] {'loss': 0.3793, 'grad_norm': 0.668643383248003, 'learning_rate': 9.099679991876567e-06, 'epoch': 0.22} 22%|██▏ | 4822/22095 [7:55:08<21:51:51, 4.56s/it] 22%|██▏ | 4823/22095 [7:55:11<19:32:19, 4.07s/it] {'loss': 0.3817, 'grad_norm': 0.6782888733885742, 'learning_rate': 9.09926038338345e-06, 'epoch': 0.22} 22%|██▏ | 4823/22095 [7:55:11<19:32:19, 4.07s/it] 22%|██▏ | 4824/22095 [7:55:14<17:57:22, 3.74s/it] {'loss': 0.354, 'grad_norm': 0.6695192386556748, 'learning_rate': 9.098840686809816e-06, 'epoch': 0.22} 22%|██▏ | 4824/22095 [7:55:14<17:57:22, 3.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4825/22095 [7:55:22<23:40:13, 4.93s/it] {'loss': 0.5142, 'grad_norm': 0.6071035795399119, 'learning_rate': 9.098420902164684e-06, 'epoch': 0.22} 22%|██▏ | 4825/22095 [7:55:22<23:40:13, 4.93s/it] 22%|██▏ | 4826/22095 [7:55:25<21:28:06, 4.48s/it] {'loss': 0.4157, 'grad_norm': 0.6713769264047075, 'learning_rate': 9.098001029457074e-06, 'epoch': 0.22} 22%|██▏ | 4826/22095 [7:55:25<21:28:06, 4.48s/it] 22%|██▏ | 4827/22095 [7:55:29<20:35:06, 4.29s/it] {'loss': 0.3699, 'grad_norm': 
0.6264425918657583, 'learning_rate': 9.097581068696009e-06, 'epoch': 0.22} 22%|██▏ | 4827/22095 [7:55:29<20:35:06, 4.29s/it] 22%|██▏ | 4828/22095 [7:55:33<19:47:04, 4.12s/it] {'loss': 0.4022, 'grad_norm': 0.6349502047551863, 'learning_rate': 9.09716101989051e-06, 'epoch': 0.22} 22%|██▏ | 4828/22095 [7:55:33<19:47:04, 4.12s/it] 22%|██▏ | 4829/22095 [7:55:36<18:55:36, 3.95s/it] {'loss': 0.3828, 'grad_norm': 0.6473338464170852, 'learning_rate': 9.096740883049606e-06, 'epoch': 0.22} 22%|██▏ | 4829/22095 [7:55:36<18:55:36, 3.95s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [339, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8429572 in VC:s3://internvl-moe-sft-data/. Exception: Image size [339, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 89316, 'image': 'vrdu_texteq/astro-ph.CO/12b069c3-a64e-4e3a-bdf9-d6518a856868.png', 'image_wh': [[339, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where $z_l$ is the lens redshift.'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8358153 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24864, 'image': 'vrdu_table_final_2/astro-ph.CO/a33f4b32-86c2-4cd9-8967-ef5e9990e93b.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{c}$S_{1}$\\end{tabular}\n```"}]}
22%|██▏ | 4830/22095 [7:55:39<17:26:49, 3.64s/it] {'loss': 0.3805, 'grad_norm': 0.6804309986334778, 'learning_rate': 9.096320658182323e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4831/22095 [7:55:47<22:45:42, 4.75s/it] {'loss': 0.5005, 'grad_norm': 0.4044106437016609, 'learning_rate': 9.095900345297688e-06, 'epoch': 0.22}
22%|██▏ | 4832/22095 [7:55:50<21:33:23, 4.50s/it] {'loss': 0.3727, 'grad_norm': 0.6281173047090177, 'learning_rate': 9.095479944404735e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047750 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1.5'}]}
22%|██▏ | 4833/22095 [7:55:54<19:46:50, 4.13s/it] {'loss': 0.4261, 'grad_norm': 0.6724742320266274, 'learning_rate': 9.095059455512496e-06, 'epoch': 0.22}
22%|██▏ | 4834/22095 [7:55:57<18:50:43, 3.93s/it] {'loss': 0.3509, 'grad_norm': 0.6443652166595981, 'learning_rate': 9.094638878630007e-06, 'epoch': 0.22}
22%|██▏ | 4835/22095 [7:56:00<17:25:18, 3.63s/it] {'loss': 0.3848, 'grad_norm': 1.0013061561516774, 'learning_rate': 9.094218213766304e-06, 'epoch': 0.22}
22%|██▏ | 4836/22095 [7:56:04<17:09:58, 3.58s/it] {'loss': 0.4139, 'grad_norm': 0.957768121562034, 'learning_rate': 9.093797460930426e-06, 'epoch': 0.22}
22%|██▏ | 4837/22095 [7:56:07<16:43:10, 3.49s/it] {'loss': 0.4067, 'grad_norm': 0.7983684200974046, 'learning_rate': 9.093376620131414e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [578, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8504957 in VC:s3://internvl-moe-sft-data/. Exception: Image size [578, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 90672, 'image': 'vrdu_texteq/astro-ph.CO/35364009-bfe7-4103-9761-eef9d61e13bb.png', 'image_wh': [[578, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where $N_b$ is the total number of radial bins and'}]}
22%|██▏ | 4838/22095 [7:56:10<16:14:10, 3.39s/it] {'loss': 0.4392, 'grad_norm': 0.648737570449002, 'learning_rate': 9.09295569137831e-06, 'epoch': 0.22}
22%|██▏ | 4839/22095 [7:56:13<15:35:20, 3.25s/it] {'loss': 0.3965, 'grad_norm': 0.7519313642500751, 'learning_rate': 9.092534674680158e-06, 'epoch': 0.22}
22%|██▏ | 4840/22095 [7:56:16<15:46:48, 3.29s/it] {'loss': 0.397, 'grad_norm': 0.6612125002526383, 'learning_rate': 9.092113570046005e-06, 'epoch': 0.22}
22%|██▏ | 4841/22095 [7:56:20<15:38:16, 3.26s/it] {'loss': 0.3922, 'grad_norm': 0.6496470297316416, 'learning_rate': 9.0916923774849e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [81, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398000 in VC:s3://internvl-moe-sft-data/. Exception: Image size [81, 23, 100, 100] is too small. Minimum size is 28.
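The recurring ValueError in these tracebacks is a dataset sanity check: samples whose image width or height falls below a minimum (28 px in this run) are rejected before preprocessing, and the loader logs the offending sample and moves on. A minimal sketch of that kind of guard — the function names and sample handling are hypothetical, not the actual `data_qwen_2.py` code:

```python
MIN_SIDE = 28  # minimum width/height accepted in this run, per the errors above

def validate_image_size(width, height, min_side=MIN_SIDE):
    """Raise ValueError for images too small to process (illustrative check)."""
    if width < min_side or height < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_side}."
        )

def is_usable(sample, min_side=MIN_SIDE):
    """Return False instead of raising, so a loader can skip bad samples."""
    width, height = sample["image_wh"][0]
    try:
        validate_image_size(width, height, min_side)
    except ValueError:
        return False
    return True
```

Note that the check is per-dimension: the [81, 23] table crop above fails because its height (23) is below 28, even though its width is fine.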
Problematic sample: {'id': 150, 'image': 'vrdu_table_final_2/astro-ph.CO/4fdbdc30-953e-4dc1-9c7c-fda218db5285.png', 'image_wh': [[81, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{l}MMF1\\end{tabular}\n```'}]}
22%|██▏ | 4842/22095 [7:56:22<15:00:02, 3.13s/it] {'loss': 0.3593, 'grad_norm': 0.6049859360584547, 'learning_rate': 9.091271097005894e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (47323 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51463 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82357 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (153465 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80430 > 40960). Running this sequence through the model will result in indexing errors
22%|██▏ | 4843/22095 [7:56:25<14:45:31, 3.08s/it] {'loss': 0.3995, 'grad_norm': 0.66596102844228, 'learning_rate': 9.090849728618034e-06, 'epoch': 0.22}
22%|██▏ | 4844/22095 [7:56:29<15:44:56, 3.29s/it] {'loss': 0.397, 'grad_norm': 0.658302161673842, 'learning_rate': 9.090428272330381e-06, 'epoch': 0.22}
22%|██▏ | 4845/22095 [7:56:33<16:13:58, 3.39s/it] {'loss': 0.4294, 'grad_norm': 0.7329682214731195, 'learning_rate': 9.090006728151986e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (47518 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106441 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110260 > 40960). Running this sequence through the model will result in indexing errors
22%|██▏ | 4846/22095 [7:56:36<16:22:54, 3.42s/it] {'loss': 0.4163, 'grad_norm': 0.7542852602708906, 'learning_rate': 9.089585096091906e-06, 'epoch': 0.22}
22%|██▏ | 4847/22095 [7:56:39<16:05:57, 3.36s/it] {'loss': 0.4238, 'grad_norm': 0.7274274023986838, 'learning_rate': 9.089163376159205e-06, 'epoch': 0.22}
22%|██▏ | 4848/22095 [7:56:43<16:02:25, 3.35s/it] {'loss': 0.4075, 'grad_norm': 0.6175911757542972, 'learning_rate': 9.08874156836294e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (120641 > 40960).
Running this sequence through the model will result in indexing errors
22%|██▏ | 4849/22095 [7:56:46<16:06:10, 3.36s/it] {'loss': 0.4295, 'grad_norm': 0.6682924426530112, 'learning_rate': 9.088319672712179e-06, 'epoch': 0.22}
22%|██▏ | 4850/22095 [7:56:50<16:16:57, 3.40s/it] {'loss': 0.4059, 'grad_norm': 1.1939628574369072, 'learning_rate': 9.087897689215983e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047103 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 32cm\nB. 4cm\nC. 8cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
22%|██▏ | 4851/22095 [7:56:59<24:57:35, 5.21s/it] {'loss': 0.5302, 'grad_norm': 0.43468805882083733, 'learning_rate': 9.087475617883419e-06, 'epoch': 0.22}
22%|██▏ | 4852/22095 [7:57:03<22:41:47, 4.74s/it] {'loss': 0.3914, 'grad_norm': 0.6512979730787424, 'learning_rate': 9.08705345872356e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4853/22095 [7:57:06<20:11:16, 4.22s/it] {'loss': 0.3487, 'grad_norm': 0.6743507359541027, 'learning_rate': 9.086631211745474e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4854/22095 [7:57:15<27:36:53, 5.77s/it] {'loss': 0.5257, 'grad_norm': 0.35648851900559125, 'learning_rate': 9.086208876958233e-06, 'epoch': 0.22}
22%|██▏ | 4855/22095 [7:57:19<24:14:35, 5.06s/it] {'loss': 0.3392, 'grad_norm': 0.6652147485374682, 'learning_rate': 9.085786454370915e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (45944 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121254 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64597 > 40960).
Running this sequence through the model will result in indexing errors
22%|██▏ | 4856/22095 [7:57:21<21:00:44, 4.39s/it] {'loss': 0.4144, 'grad_norm': 0.7496692106275002, 'learning_rate': 9.085363943992593e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4857/22095 [7:57:29<25:29:58, 5.33s/it] {'loss': 0.4836, 'grad_norm': 0.3034966816973855, 'learning_rate': 9.084941345832348e-06, 'epoch': 0.22}
22%|██▏ | 4858/22095 [7:57:33<23:12:36, 4.85s/it] {'loss': 0.3607, 'grad_norm': 0.6818992193702953, 'learning_rate': 9.08451865989926e-06, 'epoch': 0.22}
22%|██▏ | 4859/22095 [7:57:36<20:38:11, 4.31s/it] {'loss': 0.4213, 'grad_norm': 0.734131224447571, 'learning_rate': 9.08409588620241e-06, 'epoch': 0.22}
22%|██▏ | 4860/22095 [7:57:39<19:58:20, 4.17s/it] {'loss': 0.3966, 'grad_norm': 0.7196822706743875, 'learning_rate': 9.083673024750882e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4861/22095 [7:57:43<19:36:12, 4.09s/it] {'loss': 0.3849, 'grad_norm': 0.7125834370384616, 'learning_rate': 9.083250075553765e-06, 'epoch': 0.22}
22%|██▏ | 4862/22095 [7:57:47<18:58:51, 3.97s/it] {'loss': 0.4505, 'grad_norm': 0.7844549672485713, 'learning_rate': 9.082827038620143e-06, 'epoch': 0.22}
22%|██▏ | 4863/22095 [7:57:50<18:08:31, 3.79s/it] {'loss': 0.3821, 'grad_norm': 0.6848019492200854, 'learning_rate': 9.082403913959109e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4864/22095 [7:57:55<19:43:31, 4.12s/it] {'loss': 0.517, 'grad_norm': 0.34865430290147625, 'learning_rate': 9.08198070157975e-06, 'epoch': 0.22}
22%|██▏ | 4865/22095 [7:57:59<18:33:56, 3.88s/it] {'loss': 0.3561, 'grad_norm': 0.5992614487673076, 'learning_rate': 9.081557401491164e-06, 'epoch': 0.22}
22%|██▏ | 4866/22095 [7:58:02<17:58:42, 3.76s/it] {'loss': 0.3714, 'grad_norm': 0.6710248014391402, 'learning_rate': 9.081134013702447e-06, 'epoch': 0.22}
22%|██▏ | 4867/22095 [7:58:06<17:51:02, 3.73s/it] {'loss': 0.4516, 'grad_norm': 0.6860968982296224, 'learning_rate': 9.080710538222692e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914852 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38005, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 4\nB. 5\nC. 6\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
22%|██▏ | 4868/22095 [7:58:09<17:36:38, 3.68s/it] {'loss': 0.4373, 'grad_norm': 0.7244537982890571, 'learning_rate': 9.080286975061e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957196 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8031, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 6\nB. 10\nC. 8\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
22%|██▏ | 4869/22095 [7:58:12<16:35:31, 3.47s/it] {'loss': 0.4273, 'grad_norm': 0.6788330842807545, 'learning_rate': 9.079863324226473e-06, 'epoch': 0.22}
22%|██▏ | 4870/22095 [7:58:16<17:34:03, 3.67s/it] {'loss': 0.3791, 'grad_norm': 0.6624675454264172, 'learning_rate': 9.079439585728214e-06, 'epoch': 0.22}
22%|██▏ | 4871/22095 [7:58:20<16:42:44, 3.49s/it] {'loss': 0.3956, 'grad_norm': 0.6730079736173539, 'learning_rate': 9.079015759575327e-06, 'epoch': 0.22}
22%|██▏ | 4872/22095 [7:58:23<17:03:12, 3.56s/it] {'loss': 0.3508, 'grad_norm': 0.699365885683548, 'learning_rate': 9.078591845776921e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (52784 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70716 > 40960).
Running this sequence through the model will result in indexing errors
22%|██▏ | 4873/22095 [7:58:27<17:36:35, 3.68s/it] {'loss': 0.4171, 'grad_norm': 0.6799562739889206, 'learning_rate': 9.0781678443421e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4874/22095 [7:58:30<16:28:47, 3.45s/it] {'loss': 0.3761, 'grad_norm': 0.6385646491210312, 'learning_rate': 9.077743755279977e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4875/22095 [7:58:33<15:43:34, 3.29s/it] {'loss': 0.3867, 'grad_norm': 0.6774749292746666, 'learning_rate': 9.077319578599667e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (128632 > 40960).
Running this sequence through the model will result in indexing errors
22%|██▏ | 4876/22095 [7:58:36<15:13:49, 3.18s/it] {'loss': 0.3534, 'grad_norm': 0.6287712017883627, 'learning_rate': 9.076895314310282e-06, 'epoch': 0.22}
22%|██▏ | 4877/22095 [7:58:40<16:02:51, 3.36s/it] {'loss': 0.3711, 'grad_norm': 0.6665113821873541, 'learning_rate': 9.076470962420935e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4878/22095 [7:58:43<15:21:54, 3.21s/it] {'loss': 0.3757, 'grad_norm': 0.6834998561502025, 'learning_rate': 9.076046522940749e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4879/22095 [7:58:47<16:27:47, 3.44s/it] {'loss': 0.4163, 'grad_norm': 0.7744574841169455, 'learning_rate': 9.075621995878841e-06, 'epoch': 0.22}
22%|██▏ | 4880/22095 [7:58:50<16:19:33, 3.41s/it] {'loss': 0.407, 'grad_norm': 0.6814692976919767, 'learning_rate': 9.075197381244333e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4881/22095 [7:58:56<19:54:30, 4.16s/it] {'loss': 0.5034, 'grad_norm': 0.3696973213244952, 'learning_rate': 9.074772679046351e-06, 'epoch': 0.22}
22%|██▏ | 4882/22095 [7:58:59<19:07:11, 4.00s/it] {'loss': 0.3985, 'grad_norm': 0.6658175315998246, 'learning_rate': 9.074347889294017e-06, 'epoch': 0.22}
22%|██▏ | 4883/22095 [7:59:22<45:16:42, 9.47s/it] {'loss': 0.424, 'grad_norm': 0.7359226606161277, 'learning_rate': 9.073923011996462e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4884/22095 [7:59:27<39:14:43, 8.21s/it] {'loss': 0.496, 'grad_norm': 0.3223135860462851, 'learning_rate': 9.073498047162813e-06, 'epoch': 0.22}
22%|██▏ | 4885/22095 [7:59:30<32:12:57, 6.74s/it] {'loss': 0.4055, 'grad_norm': 0.6504920645333246, 'learning_rate': 9.073072994802202e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (41173 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55080 > 40960). Running this sequence through the model will result in indexing errors
22%|██▏ | 4886/22095 [7:59:34<28:00:32, 5.86s/it] {'loss': 0.3905, 'grad_norm': 0.7523501341301879, 'learning_rate': 9.072647854923763e-06, 'epoch': 0.22}
22%|██▏ | 4887/22095 [7:59:37<23:41:32, 4.96s/it] {'loss': 0.3838, 'grad_norm': 0.7486959836693489, 'learning_rate': 9.072222627536627e-06, 'epoch': 0.22}
22%|██▏ | 4888/22095 [7:59:40<20:39:23, 4.32s/it] {'loss': 0.3548, 'grad_norm': 0.6528488225448029, 'learning_rate': 9.071797312649934e-06, 'epoch': 0.22}
22%|██▏ | 4889/22095 [8:00:01<45:33:10, 9.53s/it] {'loss': 0.3931, 'grad_norm': 0.6901906657633865, 'learning_rate': 9.071371910272823e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4890/22095 [8:00:23<62:37:41, 13.10s/it] {'loss': 0.3866, 'grad_norm': 0.7001577349443593, 'learning_rate': 9.070946420414435e-06, 'epoch': 0.22}
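The "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines are the Hugging Face tokenizer's warning that a sample tokenizes past the model's context window; unless such sequences are truncated or dropped before batching, position lookups can index out of range. A hedged sketch of a pre-filter — the 40960 limit is taken from the warnings above, but the function itself is illustrative, not the training code:

```python
MODEL_MAX_LENGTH = 40960  # context limit reported in the warnings above

def fit_to_context(token_ids, max_length=MODEL_MAX_LENGTH):
    """Truncate an over-long token sequence instead of letting it reach
    the model, where it would cause indexing errors. Returns the
    (possibly shortened) ids and whether truncation happened."""
    if len(token_ids) <= max_length:
        return token_ids, False
    return token_ids[:max_length], True
```

Under this scheme the 153465-token sample logged earlier would be cut to 40960 tokens; a stricter pipeline might skip such samples entirely rather than truncate mid-conversation.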
22%|██▏ | 4891/22095 [8:00:26<48:18:38, 10.11s/it] {'loss': 0.392, 'grad_norm': 0.801922623369581, 'learning_rate': 9.07052084308391e-06, 'epoch': 0.22}
22%|██▏ | 4892/22095 [8:01:10<96:59:03, 20.30s/it] {'loss': 0.414, 'grad_norm': 0.7677849535633429, 'learning_rate': 9.070095178290394e-06, 'epoch': 0.22}
22%|██▏ | 4893/22095 [8:01:33<100:05:24, 20.95s/it] {'loss': 0.3772, 'grad_norm': 0.6805848241897616, 'learning_rate': 9.069669426043033e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8940292 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63445, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nA. 3\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
22%|██▏ | 4894/22095 [8:01:56<104:16:56, 21.83s/it] {'loss': 0.5201, 'grad_norm': 0.4350282351307276, 'learning_rate': 9.069243586350976e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://st2pj/20250222/images/multi_modal_2024/agent_data/OS-Atlas/androidworld/b801389f-72be-4f53-9249-82436718b848.png 2025-08-27 23:59:55.220270 load time: 1043.8 ms
22%|██▏ | 4895/22095 [8:02:00<78:08:23, 16.35s/it] {'loss': 0.3992, 'grad_norm': 0.7355164443478349, 'learning_rate': 9.068817659223371e-06, 'epoch': 0.22}
22%|██▏ | 4896/22095 [8:02:21<85:12:06, 17.83s/it] {'loss': 0.406, 'grad_norm': 0.7023395182277847, 'learning_rate': 9.068391644669371e-06, 'epoch': 0.22}
22%|██▏ | 4897/22095 [8:02:25<65:18:30, 13.67s/it] {'loss': 0.41, 'grad_norm': 0.7063160229165169, 'learning_rate': 9.067965542698129e-06, 'epoch': 0.22}
22%|██▏ | 4898/22095 [8:03:06<104:01:27, 21.78s/it] {'loss': 0.4181, 'grad_norm': 0.6950126592196099, 'learning_rate': 9.067539353318804e-06, 'epoch': 0.22}
22%|██▏ | 4899/22095 [8:04:09<162:31:36, 34.03s/it] {'loss': 0.4187, 'grad_norm': 0.6869503147100258, 'learning_rate': 9.067113076540547e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (68499 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88819 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94674 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111480 > 40960). Running this sequence through the model will result in indexing errors
22%|██▏ | 4900/22095 [8:04:48<169:52:20, 35.57s/it] {'loss': 0.3908, 'grad_norm': 0.6871055919686695, 'learning_rate': 9.066686712372524e-06, 'epoch': 0.22}
22%|██▏ | 4901/22095 [8:04:51<123:35:18, 25.88s/it] {'loss': 0.369, 'grad_norm': 0.6954917745019584, 'learning_rate': 9.066260260823893e-06, 'epoch': 0.22}
22%|██▏ | 4902/22095 [8:05:12<116:40:57, 24.43s/it] {'loss': 0.4043, 'grad_norm': 0.6749196877672734, 'learning_rate': 9.065833721903817e-06, 'epoch': 0.22}
22%|██▏ | 4903/22095 [8:05:53<140:41:29, 29.46s/it] {'loss': 0.4153, 'grad_norm': 0.6561073305341862, 'learning_rate': 9.065407095621462e-06, 'epoch': 0.22}
22%|██▏ | 4904/22095 [8:05:56<102:57:01, 21.56s/it] {'loss': 0.3959, 'grad_norm': 0.6712022464959736, 'learning_rate': 9.064980381985993e-06, 'epoch': 0.22}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [381, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8441975 in VC:s3://internvl-moe-sft-data/. Exception: Image size [381, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 70390, 'image': 'vrdu_texteq/astro-ph.CO/596e60c3-8bdb-4cfc-ae4d-b664eadeec22.png', 'image_wh': [[381, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where $J$ is the Jacobian matrix'}]}
22%|██▏ | 4905/22095 [8:06:38<132:11:19, 27.68s/it] {'loss': 0.3698, 'grad_norm': 0.8511922636720965, 'learning_rate': 9.064553581006583e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (62053 > 40960). Running this sequence through the model will result in indexing errors
22%|██▏ | 4906/22095 [8:06:42<97:42:59, 20.47s/it] {'loss': 0.4295, 'grad_norm': 0.6086633175646591, 'learning_rate': 9.064126692692397e-06, 'epoch': 0.22}
22%|██▏ | 4907/22095 [8:07:40<151:16:55, 31.69s/it] {'loss': 0.373, 'grad_norm': 0.5905928955912909, 'learning_rate': 9.063699717052612e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
22%|██▏ | 4908/22095 [8:07:50<119:51:23, 25.11s/it] {'loss': 0.5065, 'grad_norm': 0.4037618393026883, 'learning_rate': 9.0632726540964e-06, 'epoch': 0.22}
22%|██▏ | 4909/22095 [8:07:54<90:23:43, 18.94s/it] {'loss': 0.4814, 'grad_norm': 0.7951152684635662, 'learning_rate': 9.06284550383294e-06, 'epoch': 0.22}
22%|██▏ | 4910/22095 [8:08:15<93:00:22, 19.48s/it] {'loss': 0.3257, 'grad_norm': 0.7227826564261085, 'learning_rate': 9.062418266271406e-06, 'epoch': 0.22}
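The "Rank 0: Number of image tokens 0 does not match number of images 1 / Fixed image tokens in the conversation" pairs scattered through this log indicate the loader counts image placeholder tokens in the prompt and patches the conversation when the count disagrees with the number of attached images. A minimal sketch of such a repair — the `<image>` placeholder string and the function itself are assumptions about the pipeline, not the actual implementation:

```python
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversations, num_images):
    """Prepend missing image placeholders to the first human turn so the
    placeholder count matches the number of attached images (hypothetical)."""
    first = dict(conversations[0])
    missing = num_images - first["value"].count(IMAGE_TOKEN)
    if missing > 0:
        first["value"] = (IMAGE_TOKEN + "\n") * missing + first["value"]
    return [first] + conversations[1:]
```

Making the repair idempotent (a second pass finds nothing missing and changes nothing) matters here, since the loader may revisit a sample after a fetch retry.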
22%|██▏ | 4911/22095 [8:08:37<97:03:03, 20.33s/it] {'loss': 0.3548, 'grad_norm': 0.6299762399568244, 'learning_rate': 9.06199094142098e-06, 'epoch': 0.22}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
22%|██▏ | 4912/22095 [8:08:59<99:50:10, 20.92s/it] {'loss': 0.3968, 'grad_norm': 0.6391259383273397, 'learning_rate': 9.061563529290845e-06, 'epoch': 0.22}
22%|██▏ | 4913/22095 [8:09:22<101:58:18, 21.37s/it] {'loss': 0.4148, 'grad_norm': 0.6254815763024072, 'learning_rate': 9.061136029890186e-06, 'epoch': 0.22}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906521 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29674, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
22%|██▏ | 4914/22095 [8:10:09<138:26:42, 29.01s/it] {'loss': 0.512, 'grad_norm': 0.2979564299577659, 'learning_rate': 9.060708443228184e-06, 'epoch': 0.22}
22%|██▏ | 4915/22095 [8:10:31<128:30:16, 26.93s/it] {'loss': 0.4223, 'grad_norm': 0.7084697271068684, 'learning_rate': 9.060280769314028e-06, 'epoch': 0.22}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_563979.png 2025-08-28 00:08:29.600513 load time: 1021.42 ms
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/windows_10/F17F.png 2025-08-28 00:08:29.600339 load time: 1027.28 ms
VC:s3://gui-agent/data_20250630/windows/images/PS/handmade_annotation_12/images/PS1_id_15_internvl_appearance_crop_0_grounding_instructions_random.png 2025-08-28 00:08:29.598391 load time: 1030.4 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_642722.png 2025-08-28 00:08:29.598515 load time: 1031.6 ms
22%|██▏ | 4916/22095 [8:11:09<145:04:27, 30.40s/it] {'loss': 0.3844, 'grad_norm': 0.6349471347363206, 'learning_rate': 9.05985300815691e-06, 'epoch': 0.22}
Token indices sequence length is longer than the specified maximum sequence length for this model (72882 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (227190 > 40960).
Running this sequence through the model will result in indexing errors 22%|██▏ | 4917/22095 [8:11:51<160:38:59, 33.67s/it] {'loss': 0.3938, 'grad_norm': 2.6816014194505997, 'learning_rate': 9.05942515976602e-06, 'epoch': 0.22} 22%|██▏ | 4917/22095 [8:11:51<160:38:59, 33.67s/it] 22%|██▏ | 4918/22095 [8:12:13<144:11:00, 30.22s/it] {'loss': 0.3803, 'grad_norm': 0.6663388143533746, 'learning_rate': 9.05899722415055e-06, 'epoch': 0.22} 22%|██▏ | 4918/22095 [8:12:13<144:11:00, 30.22s/it]VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/9743423304549335_9.png 2025-08-28 00:10:11.546458 load time: 1074.78 ms 22%|██▏ | 4919/22095 [8:13:31<212:51:02, 44.61s/it] {'loss': 0.374, 'grad_norm': 0.720519900506803, 'learning_rate': 9.058569201319696e-06, 'epoch': 0.22} 22%|██▏ | 4919/22095 [8:13:31<212:51:02, 44.61s/it] 22%|██▏ | 4920/22095 [8:14:12<207:30:03, 43.49s/it] {'loss': 0.3908, 'grad_norm': 0.6702356758327594, 'learning_rate': 9.058141091282656e-06, 'epoch': 0.22} 22%|██▏ | 4920/22095 [8:14:12<207:30:03, 43.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59220 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55296 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47520 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79742 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4921/22095 [8:15:31<258:15:08, 54.13s/it] {'loss': 0.3944, 'grad_norm': 0.6841496388893396, 'learning_rate': 9.057712894048627e-06, 'epoch': 0.22} 22%|██▏ | 4921/22095 [8:15:31<258:15:08, 54.13s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30797.png 2025-08-28 00:13:31.282869 load time: 1429.3 ms 22%|██▏ | 4922/22095 [8:16:13<241:18:44, 50.59s/it] {'loss': 0.3823, 'grad_norm': 0.6330740612123831, 'learning_rate': 9.05728460962681e-06, 'epoch': 0.22} 22%|██▏ | 4922/22095 [8:16:13<241:18:44, 50.59s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4923/22095 [8:17:49<305:24:25, 64.03s/it] {'loss': 0.4255, 'grad_norm': 0.6340769606214909, 'learning_rate': 9.056856238026408e-06, 'epoch': 0.22} 22%|██▏ | 4923/22095 [8:17:49<305:24:25, 64.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4924/22095 [8:19:06<324:03:06, 67.94s/it] {'loss': 0.3963, 'grad_norm': 0.6726079933198121, 'learning_rate': 9.056427779256624e-06, 'epoch': 0.22} 22%|██▏ | 4924/22095 [8:19:06<324:03:06, 67.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4925/22095 [8:19:15<240:23:31, 50.40s/it] {'loss': 0.5117, 'grad_norm': 0.3608419911007155, 'learning_rate': 9.055999233326667e-06, 'epoch': 0.22} 22%|██▏ | 4925/22095 [8:19:15<240:23:31, 50.40s/it] 22%|██▏ | 4926/22095 [8:19:37<200:07:27, 41.96s/it] {'loss': 0.3622, 'grad_norm': 1.0467144881169608, 'learning_rate': 9.055570600245744e-06, 'epoch': 0.22} 22%|██▏ | 4926/22095 [8:19:37<200:07:27, 41.96s/it]VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240828_202339_before_screenshot.png 2025-08-28 00:17:36.107061 load time: 1238.14 ms 22%|██▏ | 4927/22095 [8:20:40<230:10:18, 48.27s/it] {'loss': 0.3882, 'grad_norm': 0.7188640558240326, 
'learning_rate': 9.055141880023062e-06, 'epoch': 0.22} 22%|██▏ | 4927/22095 [8:20:40<230:10:18, 48.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54619 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4928/22095 [8:21:25<224:23:13, 47.06s/it] {'loss': 0.4102, 'grad_norm': 0.7274003485447285, 'learning_rate': 9.054713072667838e-06, 'epoch': 0.22} 22%|██▏ | 4928/22095 [8:21:25<224:23:13, 47.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4929/22095 [8:21:50<193:40:34, 40.62s/it] {'loss': 0.488, 'grad_norm': 0.35090527102288355, 'learning_rate': 9.054284178189281e-06, 'epoch': 0.22} 22%|██▏ | 4929/22095 [8:21:50<193:40:34, 40.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11234219 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8710, 'image': 'MMMU/Pharmacy/test_112_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. 
You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: In a blast furnace, iron(III) oxide is used to produce iron by the following (unbalanced) reaction: ,How many moles of each product are formed\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': "### Introduction to the Reaction: Iron(III) Oxide and Carbon Monoxide\n\nThe chemical equation provided is a classic example of a reduction-oxidation (redox) reaction used in metallurgy, particularly in the extraction of iron from its ores in a blast furnace. The reaction involves the reduction of iron(III) oxide (\\( \\text{Fe}_2\\text{O}_3 \\)) and the oxidation of carbon monoxide (\\( \\text{CO} \\)).\n\n#### Key Chemical Compounds:\n\n1. 
**Iron(III) Oxide (\\( \\text{Fe}_2\\text{O}_3 \\))**\n\n - **Concepts and Properties:**\n - Iron(III) oxide, also known as ferric oxide, is a compound composed of iron in its +3 oxidation state.\n - It appears as a reddish-brown powder and is a significant component of rust.\n - It has the chemical formula \\( \\text{Fe}_2\\text{O}_3 \\), indicating two iron ions (+3 charge each) bonded to three oxide ions (O\\(^{2-}\\)).\n\n - **Applications:**\n - Its primary use is in steelmaking, serving as an iron ore.\n - Used in pigments, ceramics, and magnetic recording.\n - Plays a role in thermite reactions, a process employed to weld railway tracks.\n\n - **Related Examples:**\n - Rust formation: Iron (\\( \\text{Fe} \\)) reacts with oxygen and water to form \\( \\text{Fe}_2\\text{O}_3 \\cdot n\\text{H}_2\\text{O} \\).\n - Hematite (ore) processing involves reduction in a blast furnace to extract pure iron.\n\n2. **Carbon Monoxide (\\( \\text{CO} \\))**\n\n - **Concepts and Properties:**\n - Carbon monoxide is a colorless, odorless gas composed of one carbon atom triple-bonded to an oxygen atom.\n - It is highly toxic and binds with hemoglobin in blood, restricting oxygen transport.\n - Acts as a reducing agent in metallurgical processes.\n\n - **Applications:**\n - Primarily used in the reduction of metal ores.\n - Plays a role in the synthesis of hydrocarbons via the Fischer-Tropsch process.\n - Used in gas manufacturing and purification processes.\n\n - **Related Examples:**\n - Incomplete combustion of carbon-based fuels resulting in \\( \\text{CO} \\) production.\n - Conversion of dangerous tailpipe emissions in vehicles by catalytic converters.\n\n#### The Reaction Process: Redox in Metallurgy\n\n1. 
**Redox Reaction Basics:**\n\n - **Oxidation-Reduction (Redox) Reactions:**\n - Involve the transfer of electrons between atoms or molecules.\n - Oxidation refers to the loss of electrons, whereas reduction refers to the gain of electrons.\n - In the given reaction, \\( \\text{Fe}_2\\text{O}_3 \\) is reduced to \\( \\text{Fe} \\), while \\( \\text{CO} \\) is oxidized to \\( \\text{CO}_2 \\).\n\n - **Balancing the Redox Reaction:**\n - Balancing redox reactions involves ensuring that both mass and charge are conserved.\n - Let's balance the equation: \\( \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\).\n - This balanced equation indicates that each mole of \\( \\text{Fe}_2\\text{O}_3 \\) produces two moles of \\( \\text{Fe} \\) and three moles of \\( \\text{CO}_2 \\).\n\n2. **Reaction Mechanism within a Blast Furnace:**\n\n - **Blast Furnace Operation:**\n - A blast furnace is used to extract metallic iron from iron ores like hematite (\\( \\text{Fe}_2\\text{O}_3 \\)) and magnetite (\\( \\text{Fe}_3\\text{O}_4 \\)).\n - The furnace layers include coke (carbon source), limestone (flux), and iron ore.\n - Hot air (\\(\\sim 2000^\\circ \\)C) is blown into the furnace, igniting the coke to produce \\( \\text{CO} \\) and \\( \\text{CO}_2 \\).\n - The ascending gases reduce iron oxides to molten iron.\n\n - **Chemical Reactions Involved:**\n - \\( \\text{C}(s) + \\text{O}_2(g) \\rightarrow \\text{CO}_2(g) \\)\n - \\( \\text{CO}_2(g) + \\text{C}(s) \\rightarrow 2\\text{CO}(g) \\)\n - Iron oxide reduction: \\( \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\)\n\n3. 
**Analysis of Reaction Products:**\n\n - **Iron (\\( \\text{Fe} \\)):**\n - The primary product of reducing \\( \\text{Fe}_2\\text{O}_3 \\).\n - Collects as molten iron beneath the lighter slag layer in a blast furnace, later solidified into pig iron.\n\n - **Carbon Dioxide (\\( \\text{CO}_2 \\)):**\n - A gaseous byproduct eventually released from the furnace.\n - Its formation underscores the complete oxidation of carbon monoxide.\n\n#### Related Concepts and Applications\n\n1. **Stoichiometry in Redox Reactions:**\n\n - **Concept:**\n - Stoichiometry involves calculating the quantitative relationships of reactants and products in chemical reactions.\n\n - **Application:**\n - Used to compute quantities in chemical syntheses and industrial processes.\n - Crucial for determining reactant amounts needed and predicting theoretical yield.\n\n2. **Thermal and Chemical Efficiency:**\n\n - **Concept:**\n - Efficiency reflects the performance of industrial processes like iron extraction, measured in terms of both energy use and product yield.\n\n - **Application:**\n - Improve process design to optimize fuel consumption and reduce waste.\n - Considerations include reaction kinetics, heat exchange, and material utilization.\n\n3. **Environmental Impact:**\n\n - **Concept:**\n - The environmental footprint involves emissions and waste management issues arising from industrial processes.\n\n - **Application:**\n - Innovations in waste management and emissions reduction in steelmaking.\n - Development of cleaner alternatives and recyclability pathways.\n\n4. 
**Metallurgical Advances:**\n\n - **Concept:**\n - Metallurgy explores methods of metal extraction and alloy production.\n\n - **Application:**\n - New techniques such as electric arc furnaces and direct reduced iron (DRI) processes.\n - Yield better-quality steel with lower energy consumption and reduced emissions.\n\nIn sum, the intertwined concepts of redox chemistry, reaction balancing, stoichiometry, and metallurgical principles provide a comprehensive understanding of the reaction's role in industry and its broader implications. The operation of a blast furnace exemplifies how these chemical reactions are harnessed for large-scale metal production, impacting engineering, environmental practices, and global industries."}]} 22%|██▏ | 4930/22095 [8:22:49<219:14:40, 45.98s/it] {'loss': 0.3465, 'grad_norm': 0.643082305355415, 'learning_rate': 9.05385519659661e-06, 'epoch': 0.22} 22%|██▏ | 4930/22095 [8:22:49<219:14:40, 45.98s/it] 22%|██▏ | 4931/22095 [8:24:24<289:49:06, 60.79s/it] {'loss': 0.4201, 'grad_norm': 0.6488098190014425, 'learning_rate': 9.05342612789904e-06, 'epoch': 0.22} 22%|██▏ | 4931/22095 [8:24:24<289:49:06, 60.79s/it] 22%|██▏ | 4932/22095 [8:24:46<234:22:01, 49.16s/it] {'loss': 0.3536, 'grad_norm': 0.6559485495898078, 'learning_rate': 9.052996972105794e-06, 'epoch': 0.22} 22%|██▏ | 4932/22095 [8:24:46<234:22:01, 49.16s/it] 22%|██▏ | 4933/22095 [8:25:46<249:44:32, 52.39s/it] {'loss': 0.3803, 'grad_norm': 0.6305430257580593, 'learning_rate': 9.052567729226089e-06, 'epoch': 0.22} 22%|██▏ | 4933/22095 [8:25:46<249:44:32, 52.39s/it] 22%|██▏ | 4934/22095 [8:26:27<233:06:28, 48.90s/it] {'loss': 0.404, 'grad_norm': 0.693319002377718, 'learning_rate': 9.052138399269153e-06, 'epoch': 0.22} 22%|██▏ | 4934/22095 [8:26:27<233:06:28, 48.90s/it] 22%|██▏ | 4935/22095 [8:26:48<194:18:30, 40.76s/it] {'loss': 0.4099, 'grad_norm': 0.7031246161246214, 'learning_rate': 9.051708982244205e-06, 'epoch': 0.22} 22%|██▏ | 4935/22095 [8:26:48<194:18:30, 40.76s/it]Invalidate 
trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47435 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4936/22095 [8:27:55<230:36:34, 48.38s/it] {'loss': 0.5132, 'grad_norm': 0.3962595002461269, 'learning_rate': 9.051279478160475e-06, 'epoch': 0.22} 22%|██▏ | 4936/22095 [8:27:55<230:36:34, 48.38s/it] 22%|██▏ | 4937/22095 [8:28:37<221:30:18, 46.48s/it] {'loss': 0.3888, 'grad_norm': 0.6743678198292562, 'learning_rate': 9.050849887027192e-06, 'epoch': 0.22} 22%|██▏ | 4937/22095 [8:28:37<221:30:18, 46.48s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4938/22095 [8:29:41<247:24:31, 51.91s/it] {'loss': 0.3712, 'grad_norm': 0.7553552357137391, 'learning_rate': 9.050420208853587e-06, 'epoch': 0.22} 22%|██▏ | 4938/22095 [8:29:41<247:24:31, 51.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68452 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44231 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4939/22095 [8:30:22<230:51:04, 48.44s/it] {'loss': 0.3895, 'grad_norm': 0.6468895793062895, 'learning_rate': 9.04999044364889e-06, 'epoch': 0.22} 22%|██▏ | 4939/22095 [8:30:22<230:51:04, 48.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53511 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4940/22095 [8:31:21<246:05:34, 51.64s/it] {'loss': 0.3454, 'grad_norm': 0.6049019690618703, 'learning_rate': 9.049560591422339e-06, 'epoch': 0.22} 22%|██▏ | 4940/22095 [8:31:21<246:05:34, 51.64s/it] 22%|██▏ | 4941/22095 [8:32:22<260:07:50, 54.59s/it] {'loss': 0.3908, 'grad_norm': 0.6421990664350431, 'learning_rate': 9.049130652183167e-06, 'epoch': 0.22} 22%|██▏ | 4941/22095 [8:32:22<260:07:50, 54.59s/it] 22%|██▏ | 4942/22095 [8:33:22<267:23:34, 56.12s/it] {'loss': 0.3668, 'grad_norm': 0.6610407945684632, 'learning_rate': 9.048700625940613e-06, 'epoch': 0.22} 22%|██▏ | 4942/22095 [8:33:22<267:23:34, 56.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4943/22095 [8:33:49<226:00:51, 47.44s/it] {'loss': 0.5241, 'grad_norm': 0.3797001342865075, 'learning_rate': 9.048270512703917e-06, 'epoch': 0.22} 22%|██▏ | 4943/22095 [8:33:49<226:00:51, 47.44s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/ui2json_app_d20240822_v1/collect/360qinglidashi_screen_00000025.jpg 2025-08-28 00:31:47.805960 load time: 1039.3 ms Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8339016 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
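The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" warnings above come from the tokenizer; if such sequences reach the forward pass unchanged, they produce the indexing errors the warning predicts. A hedged sketch of a pre-collation guard follows; the 40960 limit mirrors the log, and `clamp_input_ids` is an illustrative name, not part of the training code.

```python
# Sketch: truncate over-long token-id sequences before they reach the
# model, avoiding the indexing errors the tokenizer warning predicts.
# MAX_LEN mirrors the 40960 limit in the log; the helper is illustrative.
MAX_LEN = 40960

def clamp_input_ids(input_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Return input_ids truncated to at most max_len tokens."""
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]

# One of the offending lengths from the log:
ids = list(range(42437))
clamped = clamp_input_ids(ids)
```

Note that for multimodal samples naive truncation can cut an image-token span in half, so rejecting or re-chunking over-long samples may be preferable to clamping.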
Problematic sample: {'id': 5647, 'image': 'vrdu_table_final_2/astro-ph.CO/74faceb1-f946-4f9a-8165-b4070a453ad0.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 22%|██▏ | 4944/22095 [8:33:58<171:12:36, 35.94s/it] {'loss': 0.5158, 'grad_norm': 0.3268726665548248, 'learning_rate': 9.04784031248232e-06, 'epoch': 0.22} 22%|██▏ | 4944/22095 [8:33:58<171:12:36, 35.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (114595 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4945/22095 [8:34:02<125:19:59, 26.31s/it] {'loss': 0.4637, 'grad_norm': 0.7192392891866869, 'learning_rate': 9.04741002528507e-06, 'epoch': 0.22} 22%|██▏ | 4945/22095 [8:34:02<125:19:59, 26.31s/it] 22%|██▏ | 4946/22095 [8:34:43<146:51:32, 30.83s/it] {'loss': 0.3877, 'grad_norm': 0.6991235003684834, 'learning_rate': 9.046979651121407e-06, 'epoch': 0.22} 22%|██▏ | 4946/22095 [8:34:43<146:51:32, 30.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4947/22095 [8:34:53<116:22:47, 24.43s/it] {'loss': 0.5109, 'grad_norm': 0.39804851670087743, 'learning_rate': 9.04654919000058e-06, 'epoch': 0.22} 22%|██▏ | 4947/22095 [8:34:53<116:22:47, 24.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is 
too small. Minimum size is 28. [Try #0] Failed to fetch sample 8960771 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11606, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 3cm\nB. 2cm\nC. 5cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 22%|██▏ | 4948/22095 [8:35:02<94:57:19, 19.94s/it] {'loss': 0.471, 'grad_norm': 0.37682713032256965, 'learning_rate': 9.046118641931841e-06, 'epoch': 0.22} 22%|██▏ | 4948/22095 [8:35:02<94:57:19, 19.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 22%|██▏ | 4949/22095 [8:35:25<98:14:56, 20.63s/it] {'loss': 0.4281, 'grad_norm': 0.7482524081049696, 'learning_rate': 9.045688006924438e-06, 'epoch': 0.22} 22%|██▏ | 4949/22095 [8:35:25<98:14:56, 20.63s/it] 22%|██▏ | 4950/22095 [8:35:28<74:04:28, 15.55s/it] {'loss': 0.3682, 'grad_norm': 0.6714930277958379, 'learning_rate': 9.045257284987625e-06, 'epoch': 0.22} 22%|██▏ | 4950/22095 [8:35:28<74:04:28, 15.55s/it] 22%|██▏ | 4951/22095 [8:35:50<82:20:07, 17.29s/it] {'loss': 0.3829, 'grad_norm': 0.668180324072035, 'learning_rate': 9.044826476130657e-06, 'epoch': 0.22} 22%|██▏ | 4951/22095 [8:35:50<82:20:07, 17.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55009 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63105 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4952/22095 [8:35:53<63:00:04, 13.23s/it] {'loss': 0.3952, 'grad_norm': 0.6081873599786615, 'learning_rate': 9.04439558036279e-06, 'epoch': 0.22} 22%|██▏ | 4952/22095 [8:35:53<63:00:04, 13.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4953/22095 [8:36:19<81:25:38, 17.10s/it] {'loss': 0.5081, 'grad_norm': 0.46132916924040257, 'learning_rate': 9.043964597693285e-06, 'epoch': 0.22} 22%|██▏ | 4953/22095 [8:36:20<81:25:38, 17.10s/it] 22%|██▏ | 4954/22095 [8:36:23<61:34:39, 12.93s/it] {'loss': 0.3934, 'grad_norm': 0.7123279512728615, 'learning_rate': 9.043533528131401e-06, 'epoch': 0.22} 22%|██▏ | 4954/22095 [8:36:23<61:34:39, 12.93s/it] 22%|██▏ | 4955/22095 [8:36:45<75:30:57, 15.86s/it] {'loss': 0.412, 'grad_norm': 0.6788394561902364, 'learning_rate': 9.0431023716864e-06, 'epoch': 0.22} 22%|██▏ | 4955/22095 [8:36:45<75:30:57, 15.86s/it]VC:s3://mm-dataset/ocrvqa/images/902675052.jpg 2025-08-28 00:34:44.169058 load time: 1022.27 ms 22%|██▏ | 4956/22095 [8:37:10<87:52:14, 18.46s/it] {'loss': 0.4584, 'grad_norm': 0.6516290357865095, 'learning_rate': 9.042671128367545e-06, 'epoch': 0.22} 22%|██▏ | 4956/22095 [8:37:10<87:52:14, 18.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 22%|██▏ | 4957/22095 [8:37:13<66:07:22, 13.89s/it] {'loss': 0.3739, 'grad_norm': 0.694717466352686, 'learning_rate': 9.042239798184104e-06, 'epoch': 0.22} 22%|██▏ | 4957/22095 [8:37:13<66:07:22, 13.89s/it] 22%|██▏ | 4958/22095 [8:37:35<77:28:25, 16.28s/it] {'loss': 0.426, 'grad_norm': 0.6566955026673125, 'learning_rate': 9.041808381145345e-06, 'epoch': 0.22} 22%|██▏ | 4958/22095 [8:37:35<77:28:25, 16.28s/it] 22%|██▏ | 4959/22095 [8:38:34<139:10:40, 29.24s/it] {'loss': 0.3974, 'grad_norm': 0.7044272633799166, 'learning_rate': 9.041376877260537e-06, 'epoch': 0.22} 22%|██▏ | 4959/22095 [8:38:34<139:10:40, 
29.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4960/22095 [8:38:44<110:53:05, 23.30s/it] {'loss': 0.483, 'grad_norm': 0.36168351486043804, 'learning_rate': 9.040945286538954e-06, 'epoch': 0.22} 22%|██▏ | 4960/22095 [8:38:44<110:53:05, 23.30s/it] 22%|██▏ | 4961/22095 [8:39:07<110:10:10, 23.15s/it] {'loss': 0.4604, 'grad_norm': 0.7206470073148731, 'learning_rate': 9.040513608989865e-06, 'epoch': 0.22} 22%|██▏ | 4961/22095 [8:39:07<110:10:10, 23.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (113946 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (140994 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118741 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42874 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46438 > 40960). 
Running this sequence through the model will result in indexing errors 22%|██▏ | 4962/22095 [8:39:10<82:22:59, 17.31s/it] {'loss': 0.3785, 'grad_norm': 0.6872488577391577, 'learning_rate': 9.040081844622549e-06, 'epoch': 0.22} 22%|██▏ | 4962/22095 [8:39:10<82:22:59, 17.31s/it]VC:s3://gui-agent/data_20250609/pc_agent_e/images/screenshot/b75b_c6e7ced5_6.png 2025-08-28 00:37:09.160906 load time: 1031.35 ms 22%|██▏ | 4963/22095 [8:39:13<61:57:55, 13.02s/it] {'loss': 0.3796, 'grad_norm': 0.6441005996153664, 'learning_rate': 9.039649993446282e-06, 'epoch': 0.22} 22%|██▏ | 4963/22095 [8:39:13<61:57:55, 13.02s/it] 22%|██▏ | 4964/22095 [8:39:17<48:02:59, 10.10s/it] {'loss': 0.3969, 'grad_norm': 0.6648495871965038, 'learning_rate': 9.039218055470345e-06, 'epoch': 0.22} 22%|██▏ | 4964/22095 [8:39:17<48:02:59, 10.10s/it] 22%|██▏ | 4965/22095 [8:39:41<67:46:37, 14.24s/it] {'loss': 0.4097, 'grad_norm': 0.6206659315926869, 'learning_rate': 9.038786030704015e-06, 'epoch': 0.22} 22%|██▏ | 4965/22095 [8:39:41<67:46:37, 14.24s/it]VC:s3://gui-agent/data_20250407/web/images/wolframalpha_com/trajectory_3/img/step_7.png 2025-08-28 00:37:39.371467 load time: 1241.98 ms 22%|██▏ | 4966/22095 [8:39:44<51:39:56, 10.86s/it] {'loss': 0.3693, 'grad_norm': 0.572166293045714, 'learning_rate': 9.038353919156579e-06, 'epoch': 0.22} 22%|██▏ | 4966/22095 [8:39:44<51:39:56, 10.86s/it] 22%|██▏ | 4967/22095 [8:39:47<40:46:25, 8.57s/it] {'loss': 0.393, 'grad_norm': 0.688238215927942, 'learning_rate': 9.03792172083732e-06, 'epoch': 0.22} 22%|██▏ | 4967/22095 [8:39:47<40:46:25, 8.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69935 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56068 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42390 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77388 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75255 > 40960). Running this sequence through the model will result in indexing errors 22%|██▏ | 4968/22095 [8:39:49<32:25:07, 6.81s/it] {'loss': 0.385, 'grad_norm': 0.6426199823718562, 'learning_rate': 9.037489435755525e-06, 'epoch': 0.22} 22%|██▏ | 4968/22095 [8:39:50<32:25:07, 6.81s/it] 22%|██▏ | 4969/22095 [8:39:53<27:29:22, 5.78s/it] {'loss': 0.4335, 'grad_norm': 0.6863085194919177, 'learning_rate': 9.037057063920482e-06, 'epoch': 0.22} 22%|██▏ | 4969/22095 [8:39:53<27:29:22, 5.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 22%|██▏ | 4970/22095 [8:40:03<33:39:43, 7.08s/it] {'loss': 0.4737, 'grad_norm': 0.3480317198597673, 'learning_rate': 9.03662460534148e-06, 'epoch': 0.22} 22%|██▏ | 4970/22095 [8:40:03<33:39:43, 7.08s/it] 22%|██▏ | 4971/22095 [8:40:06<28:07:56, 5.91s/it] {'loss': 0.4075, 'grad_norm': 0.7441531712838492, 'learning_rate': 9.036192060027815e-06, 'epoch': 0.22} 22%|██▏ | 4971/22095 [8:40:06<28:07:56, 5.91s/it] 23%|██▎ | 4972/22095 [8:40:09<23:48:06, 5.00s/it] {'loss': 0.3916, 'grad_norm': 0.6849480310304413, 'learning_rate': 9.035759427988779e-06, 'epoch': 0.23} 23%|██▎ | 4972/22095 [8:40:09<23:48:06, 5.00s/it] 23%|██▎ | 4973/22095 [8:40:13<21:45:34, 4.58s/it] {'loss': 0.3677, 'grad_norm': 0.6509099795779941, 'learning_rate': 9.035326709233666e-06, 'epoch': 0.23} 23%|██▎ | 4973/22095 [8:40:13<21:45:34, 4.58s/it] 23%|██▎ | 4974/22095 [8:40:16<20:04:46, 4.22s/it] {'loss': 0.3982, 
'grad_norm': 0.6816022275651715, 'learning_rate': 9.034893903771776e-06, 'epoch': 0.23} 23%|██▎ | 4974/22095 [8:40:16<20:04:46, 4.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47514 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50804 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42494 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 4975/22095 [8:40:19<18:30:12, 3.89s/it] {'loss': 0.3834, 'grad_norm': 0.656449812683951, 'learning_rate': 9.034461011612408e-06, 'epoch': 0.23} 23%|██▎ | 4975/22095 [8:40:19<18:30:12, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 4976/22095 [8:40:30<27:52:40, 5.86s/it] {'loss': 0.5213, 'grad_norm': 0.37343104904151614, 'learning_rate': 9.034028032764866e-06, 'epoch': 0.23} 23%|██▎ | 4976/22095 [8:40:30<27:52:40, 5.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72637 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45830 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100650 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 4977/22095 [8:40:34<25:45:19, 5.42s/it] {'loss': 0.3551, 'grad_norm': 0.6434372858287496, 'learning_rate': 9.033594967238449e-06, 'epoch': 0.23} 23%|██▎ | 4977/22095 [8:40:34<25:45:19, 5.42s/it] 23%|██▎ | 4978/22095 [8:40:38<23:38:07, 4.97s/it] {'loss': 0.385, 'grad_norm': 0.635442460720965, 'learning_rate': 9.033161815042465e-06, 'epoch': 0.23} 23%|██▎ | 4978/22095 [8:40:38<23:38:07, 4.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 4979/22095 [8:40:47<28:51:59, 6.07s/it] {'loss': 0.5183, 'grad_norm': 0.31351963290651574, 'learning_rate': 9.032728576186221e-06, 'epoch': 0.23} 23%|██▎ | 4979/22095 [8:40:47<28:51:59, 6.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76874 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 4980/22095 [8:40:50<25:25:29, 5.35s/it] {'loss': 0.4107, 'grad_norm': 0.7047869382640914, 'learning_rate': 9.032295250679024e-06, 'epoch': 0.23} 23%|██▎ | 4980/22095 [8:40:50<25:25:29, 5.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 4981/22095 [8:40:54<22:49:15, 4.80s/it] {'loss': 0.3933, 'grad_norm': 0.802688035345281, 'learning_rate': 9.031861838530187e-06, 'epoch': 0.23} 23%|██▎ | 4981/22095 [8:40:54<22:49:15, 4.80s/it] 23%|██▎ | 4982/22095 [8:40:57<21:12:16, 4.46s/it] {'loss': 0.3806, 'grad_norm': 0.6748141649156711, 'learning_rate': 9.031428339749023e-06, 'epoch': 0.23} 23%|██▎ | 4982/22095 [8:40:57<21:12:16, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 4983/22095 [8:41:07<28:31:35, 6.00s/it] {'loss': 0.4662, 'grad_norm': 0.31771599393442523, 'learning_rate': 9.030994754344845e-06, 'epoch': 0.23} 23%|██▎ | 4983/22095 [8:41:07<28:31:35, 6.00s/it] 23%|██▎ | 4984/22095 [8:41:10<24:41:14, 5.19s/it] 
{'loss': 0.3726, 'grad_norm': 0.6717370322459778, 'learning_rate': 9.03056108232697e-06, 'epoch': 0.23} 23%|██▎ | 4984/22095 [8:41:10<24:41:14, 5.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 4985/22095 [8:41:20<30:51:24, 6.49s/it] {'loss': 0.5291, 'grad_norm': 0.3275985126479388, 'learning_rate': 9.030127323704716e-06, 'epoch': 0.23} 23%|██▎ | 4985/22095 [8:41:20<30:51:24, 6.49s/it] 23%|██▎ | 4986/22095 [8:41:24<26:51:35, 5.65s/it] {'loss': 0.4068, 'grad_norm': 0.6814558422571404, 'learning_rate': 9.029693478487403e-06, 'epoch': 0.23} 23%|██▎ | 4986/22095 [8:41:24<26:51:35, 5.65s/it] 23%|██▎ | 4987/22095 [8:41:26<22:54:26, 4.82s/it] {'loss': 0.3794, 'grad_norm': 0.7062955257138119, 'learning_rate': 9.029259546684352e-06, 'epoch': 0.23} 23%|██▎ | 4987/22095 [8:41:26<22:54:26, 4.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 4988/22095 [8:41:36<30:13:56, 6.36s/it] {'loss': 0.4816, 'grad_norm': 0.29081694014049086, 'learning_rate': 9.028825528304892e-06, 'epoch': 0.23} 23%|██▎ | 4988/22095 [8:41:36<30:13:56, 6.36s/it] 23%|██▎ | 4989/22095 [8:41:40<26:25:12, 5.56s/it] {'loss': 0.3616, 'grad_norm': 0.7861575746631435, 'learning_rate': 9.028391423358343e-06, 'epoch': 0.23} 23%|██▎ | 4989/22095 [8:41:40<26:25:12, 5.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [192, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8504803 in VC:s3://internvl-moe-sft-data/. Exception: Image size [192, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 76388, 'image': 'vrdu_texteq/astro-ph.CO/9f6a21d0-3e16-432b-98f5-5937d9721c94.png', 'image_wh': [[192, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease use OCR to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'with variable $\\alpha$.'}]}
 23%|██▎ | 4990/22095 [8:41:44<24:12:28, 5.09s/it] {'loss': 0.3572, 'grad_norm': 0.644040995539964, 'learning_rate': 9.027957231854034e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [409, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8498719 in VC:s3://internvl-moe-sft-data/. Exception: Image size [409, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 92371, 'image': 'vrdu_texteq/astro-ph.CO/40bb8fce-b231-4667-8bc9-cc38f556caa7.png', 'image_wh': [[409, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image'}, {'from': 'gpt', 'value': 'This can be solved for $H$ to write'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8393029 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 59860, 'image': 'vrdu_table_final_2/astro-ph.EP/7869eca3-d700-44a9-aa59-68acb6a4decc.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
 23%|██▎ | 4991/22095 [8:41:48<22:02:33, 4.64s/it] {'loss': 0.3785, 'grad_norm': 0.6394027967714493, 'learning_rate': 9.027522953801296e-06, 'epoch': 0.23}
 23%|██▎ | 4992/22095 [8:41:51<19:33:01, 4.12s/it] {'loss': 0.3816, 'grad_norm': 0.662027116266261, 'learning_rate': 9.027088589209458e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (228881206 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7925534 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (228881206 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/10611.png', 'image_wh': [[8002, 28603]], 'conversations': [{'from': 'human', 'value': '\nWhat percentage of people are putting more money into savings amid Covid-19? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'The percentage of people putting more money into savings amid Covid-19 is 34%.\nThis information is stated in the text "MX surveyed over 1,000 people to understand their spending and saving behavior during COVID-19." It goes on to mention that "2/3 have changed their view on saving money in the last two weeks" and "34% said they were now putting more money into savings." This suggests that the pandemic has caused a shift in many people\'s financial priorities, with a significant portion prioritizing saving money.'}]}
 23%|██▎ | 4993/22095 [8:41:54<18:35:03, 3.91s/it] {'loss': 0.4121, 'grad_norm': 0.6663392960488509, 'learning_rate': 9.026654138087857e-06, 'epoch': 0.23}
 23%|██▎ | 4994/22095 [8:41:57<17:34:03, 3.70s/it] {'loss': 0.3979, 'grad_norm': 0.6182570656575896, 'learning_rate': 9.026219600445824e-06, 'epoch': 0.23}
 23%|██▎ | 4995/22095 [8:42:01<17:33:38, 3.70s/it] {'loss': 0.3589, 'grad_norm': 0.6965154917042926, 'learning_rate': 9.025784976292698e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 4996/22095 [8:42:04<16:15:59, 3.42s/it] {'loss': 0.379, 'grad_norm': 0.6926703656472599, 'learning_rate': 9.025350265637816e-06, 'epoch': 0.23}
 23%|██▎ | 4997/22095 [8:42:08<17:16:37, 3.64s/it] {'loss': 0.3825, 'grad_norm': 0.6645148067448459, 'learning_rate': 9.02491546849052e-06, 'epoch': 0.23}
 23%|██▎ | 4998/22095 [8:42:11<17:05:03, 3.60s/it] {'loss': 0.4086, 'grad_norm': 0.6664176952365746, 'learning_rate': 9.024480584860151e-06, 'epoch': 0.23}
 23%|██▎ | 4999/22095 [8:42:14<16:06:12, 3.39s/it] {'loss': 0.405, 'grad_norm': 0.6365534795201367, 'learning_rate': 9.024045614756056e-06, 'epoch': 0.23}
 23%|██▎ | 5000/22095 [8:42:17<15:55:33, 3.35s/it] {'loss': 0.389, 'grad_norm': 0.6339182513959327, 'learning_rate': 9.02361055818758e-06, 'epoch': 0.23}
 23%|██▎ | 5001/22095 [8:42:21<16:14:20, 3.42s/it] {'loss': 0.3989, 'grad_norm': 0.6353193403763745, 'learning_rate': 9.02317541516407e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5002/22095 [8:42:25<16:28:27, 3.47s/it] {'loss': 0.4203, 'grad_norm': 0.6190937243125736, 'learning_rate': 9.022740185694877e-06, 'epoch': 0.23}
 23%|██▎ | 5003/22095 [8:42:28<15:39:12, 3.30s/it] {'loss': 0.3938, 'grad_norm': 0.6636631009871302, 'learning_rate': 9.022304869789352e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5004/22095 [8:42:32<16:55:15, 3.56s/it] {'loss': 0.3902, 'grad_norm': 0.7262606727914647, 'learning_rate': 9.02186946745685e-06, 'epoch': 0.23}
 23%|██▎ | 5005/22095 [8:42:35<16:44:29, 3.53s/it] {'loss': 0.3833, 'grad_norm': 0.6552484435543318, 'learning_rate': 9.021433978706724e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (77665 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (81408 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (95224 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (65608 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5006/22095 [8:42:38<16:05:53, 3.39s/it] {'loss': 0.38, 'grad_norm': 0.6619672606846077, 'learning_rate': 9.020998403548333e-06, 'epoch': 0.23}
 23%|██▎ | 5007/22095 [8:42:41<15:32:37, 3.27s/it] {'loss': 0.3952, 'grad_norm': 0.6452943746949554, 'learning_rate': 9.020562741991035e-06, 'epoch': 0.23}
 23%|██▎ | 5008/22095 [8:42:44<15:16:29, 3.22s/it] {'loss': 0.4396, 'grad_norm': 0.7229966603330219, 'learning_rate': 9.020126994044194e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (61925 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (44153 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (79499 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (50158 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5009/22095 [8:42:48<16:05:26, 3.39s/it] {'loss': 0.4145, 'grad_norm': 0.6631422008112223, 'learning_rate': 9.01969115971717e-06, 'epoch': 0.23}
 23%|██▎ | 5010/22095 [8:42:52<16:41:02, 3.52s/it] {'loss': 0.3982, 'grad_norm': 0.6025960255581672, 'learning_rate': 9.019255239019327e-06, 'epoch': 0.23}
 23%|██▎ | 5011/22095 [8:42:55<16:44:31, 3.53s/it] {'loss': 0.4105, 'grad_norm': 0.640514584070971, 'learning_rate': 9.018819231960035e-06, 'epoch': 0.23}
 23%|██▎ | 5012/22095 [8:42:59<17:10:43, 3.62s/it] {'loss': 0.4185, 'grad_norm': 0.6926676693423613, 'learning_rate': 9.01838313854866e-06, 'epoch': 0.23}
 23%|██▎ | 5013/22095 [8:43:02<16:18:11, 3.44s/it] {'loss': 0.3833, 'grad_norm': 0.646092239513566, 'learning_rate': 9.017946958794572e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5014/22095 [8:43:13<26:08:13, 5.51s/it] {'loss': 0.4814, 'grad_norm': 0.43804579728241944, 'learning_rate': 9.017510692707144e-06, 'epoch': 0.23}
 23%|██▎ | 5015/22095 [8:43:17<23:51:08, 5.03s/it] {'loss': 0.4079, 'grad_norm': 0.6746575132691192, 'learning_rate': 9.01707434029575e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8399727 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1882, 'image': 'vrdu_table_final_2/astro-ph.CO/2f0d8460-67b1-4173-94c8-28ebace550c6.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]}
 23%|██▎ | 5016/22095 [8:43:20<22:14:46, 4.69s/it] {'loss': 0.3984, 'grad_norm': 0.6754684713002371, 'learning_rate': 9.016637901569767e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5017/22095 [8:43:30<29:06:01, 6.13s/it] {'loss': 0.5116, 'grad_norm': 0.33722037950031575, 'learning_rate': 9.01620137653857e-06, 'epoch': 0.23}
 23%|██▎ | 5018/22095 [8:43:33<25:09:38, 5.30s/it] {'loss': 0.3737, 'grad_norm': 0.6355902009523129, 'learning_rate': 9.015764765211542e-06, 'epoch': 0.23}
 23%|██▎ | 5019/22095 [8:43:37<23:07:47, 4.88s/it] {'loss': 0.455, 'grad_norm': 0.7013672992936948, 'learning_rate': 9.015328067598064e-06, 'epoch': 0.23}
 23%|██▎ | 5020/22095 [8:43:41<21:03:52, 4.44s/it] {'loss': 0.3807, 'grad_norm': 0.636650952638715, 'learning_rate': 9.014891283707517e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (71266 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (77976 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (59271 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (51722 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5021/22095 [8:43:43<18:36:15, 3.92s/it] {'loss': 0.4363, 'grad_norm': 0.6591969812812456, 'learning_rate': 9.014454413549285e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (50798 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (41554 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5022/22095 [8:43:46<17:22:35, 3.66s/it] {'loss': 0.3622, 'grad_norm': 0.6480320041512702, 'learning_rate': 9.014017457132759e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8897248 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
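The recurring tokenizer warnings above (sequence lengths such as 76874, 95224, or 50798 against a 40960 maximum) mean those samples would index past the model's position range if run unchecked. A minimal length guard, sketched under the assumption that samples are tokenized before batching; `fits_context` and `truncate_to_context` are hypothetical helpers, not part of the training code:

```python
# Model maximum reported by the tokenizer warnings in this log.
MAX_SEQ_LEN = 40960


def fits_context(input_ids: list, max_len: int = MAX_SEQ_LEN) -> bool:
    """True if the tokenized sample fits the model's maximum sequence length."""
    return len(input_ids) <= max_len


def truncate_to_context(input_ids: list, max_len: int = MAX_SEQ_LEN) -> list:
    # Simple tail truncation. For multimodal data, dropping the sample may be
    # safer than truncating, since a cut can land inside a run of image
    # placeholder tokens and break the image/token alignment check seen above.
    return input_ids[:max_len]
```

Filtering with `fits_context` during preprocessing would silence these warnings at the cost of discarding the over-long samples.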
Problematic sample: {'id': 20401, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, D is the midpoint of segment CB, CD = 3, AB = 11. What is the length of AC? ()\nA. 5\nB. 6\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 23%|██▎ | 5023/22095 [8:43:49<16:03:54, 3.39s/it] {'loss': 0.395, 'grad_norm': 0.7046158006976412, 'learning_rate': 9.013580414467324e-06, 'epoch': 0.23}
 23%|██▎ | 5024/22095 [8:43:53<16:22:25, 3.45s/it] {'loss': 0.4135, 'grad_norm': 0.7412498466774871, 'learning_rate': 9.013143285562375e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5025/22095 [8:43:56<15:49:55, 3.34s/it] {'loss': 0.3934, 'grad_norm': 0.6430333473293939, 'learning_rate': 9.012706070427302e-06, 'epoch': 0.23}
 23%|██▎ | 5026/22095 [8:43:59<15:50:15, 3.34s/it] {'loss': 0.3622, 'grad_norm': 0.9565897525460444, 'learning_rate': 9.012268769071499e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (41778 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (61617 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5027/22095 [8:44:02<15:25:41, 3.25s/it] {'loss': 0.3784, 'grad_norm': 0.7943779811391722, 'learning_rate': 9.011831381504362e-06, 'epoch': 0.23}
 23%|██▎ | 5028/22095 [8:44:06<16:27:52, 3.47s/it] {'loss': 0.3891, 'grad_norm': 0.6853892780934475, 'learning_rate': 9.011393907735291e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5029/22095 [8:44:14<22:03:50, 4.65s/it] {'loss': 0.4946, 'grad_norm': 0.5847454644149267, 'learning_rate': 9.010956347773685e-06, 'epoch': 0.23}
 23%|██▎ | 5030/22095 [8:44:22<27:35:30, 5.82s/it] {'loss': 0.5247, 'grad_norm': 0.43884271089363497, 'learning_rate': 9.010518701628946e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 23%|██▎ | 5031/22095 [8:44:26<24:08:25, 5.09s/it] {'loss': 0.427, 'grad_norm': 0.7332460232839678, 'learning_rate': 9.010080969310477e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5032/22095 [8:44:29<22:00:44, 4.64s/it] {'loss': 0.3768, 'grad_norm': 0.666964101425838, 'learning_rate': 9.009643150827683e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (56260 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5033/22095 [8:44:33<20:50:19, 4.40s/it] {'loss': 0.4198, 'grad_norm': 0.6490034288028559, 'learning_rate': 9.009205246189974e-06, 'epoch': 0.23}
 23%|██▎ | 5034/22095 [8:44:36<18:46:11, 3.96s/it] {'loss': 0.3847, 'grad_norm': 0.6593400548042422, 'learning_rate': 9.008767255406757e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49262 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (114079 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (106128 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (81285 > 40960). Running this sequence through the model will result in indexing errors.
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924295 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nIn which state is the Tomb of Abraham Lincoln? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Illinois\nThe text does not explicitly mention the location of the Tomb of Abraham Lincoln, but it can be inferred that it is in Illinois as the list includes several states of the United States and one of them is Illinois. This suggests that Abraham Lincoln, who was the 16th President of the United States and famously associated with Illinois, may have had his tomb located in this state. This is confirmed by doing a quick search, which yields that the Lincoln Tomb is indeed situated in Oak Ridge Cemetery in Springfield, Illinois.'}]}
 23%|██▎ | 5035/22095 [8:44:46<27:16:23, 5.76s/it] {'loss': 0.5144, 'grad_norm': 0.6084761452473841, 'learning_rate': 9.008329178487442e-06, 'epoch': 0.23}
 23%|██▎ | 5036/22095 [8:44:49<23:31:14, 4.96s/it] {'loss': 0.3904, 'grad_norm': 0.6718615799842731, 'learning_rate': 9.007891015441447e-06, 'epoch': 0.23}
 23%|██▎ | 5037/22095 [8:44:53<22:36:38, 4.77s/it] {'loss': 0.3686, 'grad_norm': 0.6738115602196058, 'learning_rate': 9.007452766278181e-06, 'epoch': 0.23}
 23%|██▎ | 5038/22095 [8:44:56<20:03:38, 4.23s/it] {'loss': 0.4093, 'grad_norm': 0.6424073143560038, 'learning_rate': 9.007014431007064e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (61367 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5039/22095 [8:44:59<18:15:25, 3.85s/it] {'loss': 0.4291, 'grad_norm': 0.7084602860135315, 'learning_rate': 9.006576009637513e-06, 'epoch': 0.23}
 23%|██▎ | 5040/22095 [8:45:02<16:54:15, 3.57s/it] {'loss': 0.3985, 'grad_norm': 0.6434498431158302, 'learning_rate': 9.00613750217895e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5041/22095 [8:45:09<21:56:37, 4.63s/it] {'loss': 0.5242, 'grad_norm': 0.40294794849350446, 'learning_rate': 9.005698908640795e-06, 'epoch': 0.23}
 23%|██▎ | 5042/22095 [8:45:13<20:07:59, 4.25s/it] {'loss': 0.3963, 'grad_norm': 0.6032961522009523, 'learning_rate': 9.005260229032471e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5043/22095 [8:45:22<28:01:41, 5.92s/it] {'loss': 0.475, 'grad_norm': 0.3356321513424042, 'learning_rate': 9.004821463363409e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (110706 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (81225 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5044/22095 [8:45:26<24:19:53, 5.14s/it] {'loss': 0.3356, 'grad_norm': 0.5938021240632243, 'learning_rate': 9.004382611643032e-06, 'epoch': 0.23}
 23%|██▎ | 5045/22095 [8:45:29<21:06:59, 4.46s/it] {'loss': 0.4112, 'grad_norm': 0.7223163154013754, 'learning_rate': 9.003943673880771e-06, 'epoch': 0.23}
 23%|██▎ | 5046/22095 [8:45:32<18:57:14, 4.00s/it] {'loss': 0.3885, 'grad_norm': 0.6531894900267514, 'learning_rate': 9.00350465008606e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5047/22095 [8:45:39<23:09:03, 4.89s/it] {'loss': 0.5096, 'grad_norm': 0.3875028933102217, 'learning_rate': 9.003065540268328e-06, 'epoch': 0.23}
 23%|██▎ | 5048/22095 [8:45:43<22:04:12, 4.66s/it] {'loss': 0.393, 'grad_norm': 0.6800193885461148, 'learning_rate': 9.00262634443701e-06, 'epoch': 0.23}
 23%|██▎ | 5049/22095 [8:45:46<20:30:09, 4.33s/it] {'loss': 0.3392, 'grad_norm': 0.6674094552885022, 'learning_rate': 9.002187062601548e-06, 'epoch': 0.23}
 23%|██▎ | 5050/22095 [8:45:50<19:31:43, 4.12s/it] {'loss': 0.3492, 'grad_norm': 0.6999580178301209, 'learning_rate': 9.001747694771378e-06, 'epoch': 0.23}
 23%|██▎ | 5051/22095 [8:45:53<18:50:19, 3.98s/it] {'loss': 0.3999, 'grad_norm': 0.6571573244248741, 'learning_rate': 9.00130824095594e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5052/22095 [8:46:00<22:17:33, 4.71s/it] {'loss': 0.4805, 'grad_norm': 0.3742644643532557, 'learning_rate': 9.000868701164676e-06, 'epoch': 0.23}
 23%|██▎ | 5053/22095 [8:46:04<21:13:42, 4.48s/it] {'loss': 0.4351, 'grad_norm': 0.7135463099458239, 'learning_rate': 9.00042907540703e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (76686 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5054/22095 [8:46:08<20:29:36, 4.33s/it] {'loss': 0.3836, 'grad_norm': 0.6325496509989086, 'learning_rate': 8.999989363692453e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5055/22095 [8:46:11<18:39:54, 3.94s/it] {'loss': 0.4416, 'grad_norm': 0.7088140449106423, 'learning_rate': 8.999549566030389e-06, 'epoch': 0.23}
 23%|██▎ | 5056/22095 [8:46:14<16:59:43, 3.59s/it] {'loss': 0.3555, 'grad_norm': 0.6212518022240358, 'learning_rate': 8.999109682430288e-06, 'epoch': 0.23}
 23%|██▎ | 5057/22095 [8:46:17<16:09:39, 3.41s/it] {'loss': 0.3844, 'grad_norm': 0.7263540895949513, 'learning_rate': 8.9986697129016e-06, 'epoch': 0.23}
 23%|██▎ | 5058/22095 [8:46:20<15:25:35, 3.26s/it] {'loss': 0.359, 'grad_norm': 0.6386208062962684, 'learning_rate': 8.998229657453783e-06, 'epoch': 0.23}
 23%|██▎ | 5059/22095 [8:46:22<14:50:00, 3.13s/it] {'loss': 0.3813, 'grad_norm': 0.7299702917808321, 'learning_rate': 8.99778951609629e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 23%|██▎ | 5060/22095 [8:46:26<15:44:12, 3.33s/it] {'loss': 0.3686, 'grad_norm': 0.6187123146757414, 'learning_rate': 8.997349288838579e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (57370 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5061/22095 [8:46:29<15:00:06, 3.17s/it] {'loss': 0.361, 'grad_norm': 0.6462565768504316, 'learning_rate': 8.996908975690107e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (98090 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5062/22095 [8:46:38<23:48:10, 5.03s/it] {'loss': 0.4917, 'grad_norm': 0.43439600221257585, 'learning_rate': 8.996468576660337e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (42753 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (43510 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5063/22095 [8:46:42<21:10:36, 4.48s/it] {'loss': 0.3568, 'grad_norm': 0.6981144837914366, 'learning_rate': 8.996028091758733e-06, 'epoch': 0.23}
 23%|██▎ | 5064/22095 [8:46:44<18:48:30, 3.98s/it] {'loss': 0.3985, 'grad_norm': 0.665355672986008, 'learning_rate': 8.995587520994757e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (65212 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (49512 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5065/22095 [8:46:48<18:09:56, 3.84s/it] {'loss': 0.3997, 'grad_norm': 0.6531402191226752, 'learning_rate': 8.995146864377877e-06, 'epoch': 0.23}
 23%|██▎ | 5066/22095 [8:46:51<17:44:21, 3.75s/it] {'loss': 0.3675, 'grad_norm': 0.6692184657619775, 'learning_rate': 8.994706121917562e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (42578 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5067/22095 [8:46:55<17:50:39, 3.77s/it] {'loss': 0.3545, 'grad_norm': 0.748061770522374, 'learning_rate': 8.99426529362328e-06, 'epoch': 0.23}
 23%|██▎ | 5068/22095 [8:46:58<17:02:02, 3.60s/it] {'loss': 0.3949, 'grad_norm': 0.6629816145696559, 'learning_rate': 8.993824379504505e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5069/22095 [8:47:06<22:58:11, 4.86s/it] {'loss': 0.4942, 'grad_norm': 0.36058234225704106, 'learning_rate': 8.99338337957071e-06, 'epoch': 0.23}
 23%|██▎ | 5070/22095 [8:47:10<21:07:27, 4.47s/it] {'loss': 0.4098, 'grad_norm': 0.6403442029161902, 'learning_rate': 8.99294229383137e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 23%|██▎ | 5071/22095 [8:47:19<28:09:11, 5.95s/it] {'loss': 0.5023, 'grad_norm': 0.30217042809740324, 'learning_rate': 8.992501122295964e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (51230 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (49132 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5072/22095 [8:47:24<26:15:47, 5.55s/it] {'loss': 0.392, 'grad_norm': 0.7060732314966579, 'learning_rate': 8.992059864973972e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (83282 > 40960). Running this sequence through the model will result in indexing errors.
 23%|██▎ | 5073/22095 [8:47:34<33:14:25, 7.03s/it] {'loss': 0.5342, 'grad_norm': 0.31473324252979823, 'learning_rate': 8.991618521874874e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (57677 > 40960). Running this sequence through the model will result in indexing errors.
Token indices sequence length is longer than the specified maximum sequence length for this model (45273 > 40960).
Running this sequence through the model will result in indexing errors 23%|██▎ | 5074/22095 [8:47:38<28:23:43, 6.01s/it] {'loss': 0.4034, 'grad_norm': 0.7440077572148602, 'learning_rate': 8.991177093008153e-06, 'epoch': 0.23} 23%|██▎ | 5074/22095 [8:47:38<28:23:43, 6.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5075/22095 [8:47:45<30:29:07, 6.45s/it] {'loss': 0.5276, 'grad_norm': 0.3344787423121618, 'learning_rate': 8.990735578383295e-06, 'epoch': 0.23} 23%|██▎ | 5075/22095 [8:47:45<30:29:07, 6.45s/it] 23%|██▎ | 5076/22095 [8:47:49<26:04:49, 5.52s/it] {'loss': 0.3846, 'grad_norm': 0.7355397133446866, 'learning_rate': 8.990293978009782e-06, 'epoch': 0.23} 23%|██▎ | 5076/22095 [8:47:49<26:04:49, 5.52s/it] 23%|██▎ | 5077/22095 [8:47:52<22:42:29, 4.80s/it] {'loss': 0.3438, 'grad_norm': 0.6111272041129948, 'learning_rate': 8.98985229189711e-06, 'epoch': 0.23} 23%|██▎ | 5077/22095 [8:47:52<22:42:29, 4.80s/it] 23%|██▎ | 5078/22095 [8:47:55<20:13:28, 4.28s/it] {'loss': 0.3601, 'grad_norm': 0.6582935033341474, 'learning_rate': 8.989410520054767e-06, 'epoch': 0.23} 23%|██▎ | 5078/22095 [8:47:55<20:13:28, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45136 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69059 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49803 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5079/22095 [8:47:59<19:32:44, 4.14s/it] {'loss': 0.4188, 'grad_norm': 0.6543794262424822, 'learning_rate': 8.988968662492243e-06, 'epoch': 0.23} 23%|██▎ | 5079/22095 [8:47:59<19:32:44, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5080/22095 [8:48:08<26:57:07, 5.70s/it] {'loss': 0.5115, 'grad_norm': 0.3491993514780067, 'learning_rate': 8.988526719219035e-06, 'epoch': 0.23} 23%|██▎ | 5080/22095 [8:48:08<26:57:07, 5.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 5081/22095 [8:48:15<29:10:16, 6.17s/it] {'loss': 0.5075, 'grad_norm': 0.3329227948894098, 'learning_rate': 8.988084690244636e-06, 'epoch': 0.23} 23%|██▎ | 5081/22095 [8:48:15<29:10:16, 6.17s/it] 23%|██▎ | 5082/22095 [8:48:21<28:45:41, 6.09s/it] {'loss': 0.4829, 'grad_norm': 0.3059273323766298, 'learning_rate': 8.987642575578546e-06, 'epoch': 0.23} 23%|██▎ | 5082/22095 [8:48:21<28:45:41, 6.09s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 23%|██▎ | 5083/22095 [8:48:25<25:53:26, 5.48s/it] {'loss': 0.4061, 'grad_norm': 0.6749932682170735, 'learning_rate': 8.987200375230262e-06, 'epoch': 0.23} 23%|██▎ | 5083/22095 [8:48:25<25:53:26, 5.48s/it] 23%|██▎ | 5084/22095 [8:48:29<22:45:14, 4.82s/it] {'loss': 0.4166, 'grad_norm': 0.667571209794923, 'learning_rate': 8.986758089209292e-06, 'epoch': 0.23} 23%|██▎ | 5084/22095 [8:48:29<22:45:14, 4.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [234, 25, 100, 100] is too small. 
Minimum size is 28. [Try #0] Failed to fetch sample 8482817 in VC:s3://internvl-moe-sft-data/. Exception: Image size [234, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 42917, 'image': 'vrdu_texteq/astro-ph.CO/94e478fa-f9a9-41f7-9a61-63bce09dd4e8.png', 'image_wh': [[234, 25]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'with $T_o = 0.39\\,{\\rm mK}$.'}]} 23%|██▎ | 5085/22095 [8:48:32<20:24:10, 4.32s/it] {'loss': 0.4102, 'grad_norm': 0.6994861621747892, 'learning_rate': 8.986315717525132e-06, 'epoch': 0.23} 23%|██▎ | 5085/22095 [8:48:32<20:24:10, 4.32s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_2/images/step_0.png 2025-08-28 00:46:32.733154 load time: 1009.69 ms 23%|██▎ | 5086/22095 [8:48:35<19:34:27, 4.14s/it] {'loss': 0.4032, 'grad_norm': 0.6364880522630176, 'learning_rate': 8.98587326018729e-06, 'epoch': 0.23} 23%|██▎ | 5086/22095 [8:48:35<19:34:27, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5087/22095 [8:48:46<28:46:19, 6.09s/it] {'loss': 0.4834, 'grad_norm': 0.40828602820192145, 'learning_rate': 8.985430717205276e-06, 'epoch': 0.23} 23%|██▎ | 5087/22095 [8:48:46<28:46:19, 6.09s/it] 23%|██▎ | 5088/22095 [8:48:50<25:48:11, 5.46s/it] {'loss': 0.4167, 'grad_norm': 0.6472024336771361, 'learning_rate': 8.984988088588594e-06, 'epoch': 0.23} 23%|██▎ | 5088/22095 [8:48:50<25:48:11, 5.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42153 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87725 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5089/22095 [8:48:54<23:53:10, 5.06s/it] {'loss': 0.3806, 'grad_norm': 0.6382904284079052, 'learning_rate': 8.984545374346758e-06, 'epoch': 0.23} 23%|██▎ | 5089/22095 [8:48:54<23:53:10, 5.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75328 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 5090/22095 [8:48:59<23:21:24, 4.94s/it] {'loss': 0.4345, 'grad_norm': 0.7405927409143391, 'learning_rate': 8.98410257448928e-06, 'epoch': 0.23} 23%|██▎ | 5090/22095 [8:48:59<23:21:24, 4.94s/it] 23%|██▎ | 5091/22095 [8:49:02<20:31:50, 4.35s/it] {'loss': 0.3811, 'grad_norm': 0.6151881033617659, 'learning_rate': 8.983659689025673e-06, 'epoch': 0.23} 23%|██▎ | 5091/22095 [8:49:02<20:31:50, 4.35s/it] 23%|██▎ | 5092/22095 [8:49:05<19:18:20, 4.09s/it] {'loss': 0.3562, 'grad_norm': 0.6586186202326227, 'learning_rate': 8.983216717965453e-06, 'epoch': 0.23} 23%|██▎ | 5092/22095 [8:49:05<19:18:20, 4.09s/it] 23%|██▎ | 5093/22095 [8:49:09<18:25:54, 3.90s/it] {'loss': 0.3732, 'grad_norm': 0.6552286587750954, 'learning_rate': 8.98277366131814e-06, 'epoch': 0.23} 23%|██▎ | 5093/22095 [8:49:09<18:25:54, 3.90s/it] 23%|██▎ | 5094/22095 [8:49:12<17:36:56, 3.73s/it] {'loss': 0.3634, 'grad_norm': 0.6744611940595147, 'learning_rate': 8.982330519093255e-06, 'epoch': 0.23} 23%|██▎ | 5094/22095 [8:49:12<17:36:56, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45648 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77049 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97567 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5095/22095 [8:49:15<16:49:16, 3.56s/it] {'loss': 0.3916, 'grad_norm': 0.6468474579989157, 'learning_rate': 8.981887291300315e-06, 'epoch': 0.23} 23%|██▎ | 5095/22095 [8:49:15<16:49:16, 3.56s/it] 23%|██▎ | 5096/22095 [8:49:18<15:49:40, 3.35s/it] {'loss': 0.4041, 'grad_norm': 0.6413550963762416, 'learning_rate': 8.981443977948848e-06, 'epoch': 0.23} 23%|██▎ | 5096/22095 [8:49:18<15:49:40, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 23%|██▎ | 5097/22095 [8:49:26<22:55:55, 4.86s/it] {'loss': 0.5196, 'grad_norm': 0.3875886722541602, 'learning_rate': 8.98100057904838e-06, 'epoch': 0.23} 23%|██▎ | 5097/22095 [8:49:27<22:55:55, 4.86s/it] 23%|██▎ | 5098/22095 [8:49:35<27:29:09, 5.82s/it] {'loss': 0.4977, 'grad_norm': 0.3240288028641131, 'learning_rate': 8.980557094608433e-06, 'epoch': 0.23} 23%|██▎ | 5098/22095 [8:49:35<27:29:09, 5.82s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 23%|██▎ | 5099/22095 [8:49:38<24:13:00, 5.13s/it] {'loss': 0.3847, 'grad_norm': 0.6977386054479082, 'learning_rate': 8.980113524638541e-06, 'epoch': 0.23} 23%|██▎ | 5099/22095 [8:49:38<24:13:00, 5.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 5100/22095 [8:49:42<22:11:11, 4.70s/it] {'loss': 0.3908, 'grad_norm': 0.65294049746612, 'learning_rate': 8.979669869148234e-06, 'epoch': 0.23} 23%|██▎ | 5100/22095 [8:49:42<22:11:11, 4.70s/it] 23%|██▎ | 5101/22095 [8:49:45<20:16:42, 4.30s/it] {'loss': 0.3793, 'grad_norm': 0.6586498592271675, 'learning_rate': 8.979226128147043e-06, 'epoch': 0.23} 23%|██▎ | 5101/22095 [8:49:45<20:16:42, 4.30s/it] 
23%|██▎ | 5102/22095 [8:49:49<20:15:53, 4.29s/it] {'loss': 0.4071, 'grad_norm': 0.624180963723518, 'learning_rate': 8.978782301644503e-06, 'epoch': 0.23} 23%|██▎ | 5102/22095 [8:49:49<20:15:53, 4.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5103/22095 [8:49:59<28:22:32, 6.01s/it] {'loss': 0.4873, 'grad_norm': 0.3947402995617219, 'learning_rate': 8.978338389650152e-06, 'epoch': 0.23} 23%|██▎ | 5103/22095 [8:49:59<28:22:32, 6.01s/it] 23%|██▎ | 5104/22095 [8:50:04<25:58:07, 5.50s/it] {'loss': 0.4046, 'grad_norm': 0.7039306333910293, 'learning_rate': 8.977894392173527e-06, 'epoch': 0.23} 23%|██▎ | 5104/22095 [8:50:04<25:58:07, 5.50s/it] 23%|██▎ | 5105/22095 [8:50:07<22:39:46, 4.80s/it] {'loss': 0.4122, 'grad_norm': 0.6445887028036205, 'learning_rate': 8.97745030922417e-06, 'epoch': 0.23} 23%|██▎ | 5105/22095 [8:50:07<22:39:46, 4.80s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946833 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 69986, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C和D是AB段上的两点,Cd=3c m,M是AC的中点,N是DB的中点,AB=9.8cm,则Mn段的长度等于()\nA. 6.8cm\nB. 7cm\nC. 5.4cm\nD. 
6.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 23%|██▎ | 5106/22095 [8:50:11<22:02:57, 4.67s/it] {'loss': 0.3884, 'grad_norm': 0.7006501371896489, 'learning_rate': 8.977006140811621e-06, 'epoch': 0.23} 23%|██▎ | 5106/22095 [8:50:11<22:02:57, 4.67s/it] 23%|██▎ | 5107/22095 [8:50:14<19:32:46, 4.14s/it] {'loss': 0.3607, 'grad_norm': 0.6703032827813812, 'learning_rate': 8.976561886945426e-06, 'epoch': 0.23} 23%|██▎ | 5107/22095 [8:50:14<19:32:46, 4.14s/it] 23%|██▎ | 5108/22095 [8:50:18<19:22:08, 4.10s/it] {'loss': 0.4294, 'grad_norm': 0.7015094869729309, 'learning_rate': 8.976117547635125e-06, 'epoch': 0.23} 23%|██▎ | 5108/22095 [8:50:18<19:22:08, 4.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53613 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73002 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 5109/22095 [8:50:23<19:57:59, 4.23s/it] {'loss': 0.4084, 'grad_norm': 0.6317103668825891, 'learning_rate': 8.975673122890273e-06, 'epoch': 0.23} 23%|██▎ | 5109/22095 [8:50:23<19:57:59, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68535 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42947 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57498 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5110/22095 [8:50:27<19:37:42, 4.16s/it] {'loss': 0.3692, 'grad_norm': 0.6348862463002435, 'learning_rate': 8.975228612720415e-06, 'epoch': 0.23} 23%|██▎ | 5110/22095 [8:50:27<19:37:42, 4.16s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8359349 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 26069, 'image': 'vrdu_table_final_2/astro-ph.CO/fd453eab-b265-46d8-8cc3-8ff992e62dad.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 23%|██▎ | 5111/22095 [8:50:31<19:09:15, 4.06s/it] {'loss': 0.4034, 'grad_norm': 0.7192469572737816, 'learning_rate': 8.974784017135104e-06, 'epoch': 0.23} 23%|██▎ | 5111/22095 [8:50:31<19:09:15, 4.06s/it] 23%|██▎ | 5112/22095 [8:50:34<18:19:53, 3.89s/it] {'loss': 0.3641, 'grad_norm': 0.70426941566466, 'learning_rate': 8.974339336143892e-06, 'epoch': 0.23} 23%|██▎ | 5112/22095 [8:50:34<18:19:53, 3.89s/it] 23%|██▎ | 5113/22095 [8:50:37<17:37:55, 3.74s/it] {'loss': 0.4074, 'grad_norm': 0.7329488639684603, 'learning_rate': 8.973894569756333e-06, 'epoch': 0.23} 23%|██▎ | 5113/22095 [8:50:37<17:37:55, 3.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image 
tokens in the conversation 23%|██▎ | 5114/22095 [8:50:41<17:46:04, 3.77s/it] {'loss': 0.3797, 'grad_norm': 0.678743773114706, 'learning_rate': 8.973449717981984e-06, 'epoch': 0.23} 23%|██▎ | 5114/22095 [8:50:41<17:46:04, 3.77s/it] 23%|██▎ | 5115/22095 [8:50:45<17:55:28, 3.80s/it] {'loss': 0.4082, 'grad_norm': 0.6801450151035214, 'learning_rate': 8.973004780830405e-06, 'epoch': 0.23} 23%|██▎ | 5115/22095 [8:50:45<17:55:28, 3.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 5116/22095 [8:50:48<17:15:24, 3.66s/it] {'loss': 0.3891, 'grad_norm': 0.8518430751724988, 'learning_rate': 8.972559758311156e-06, 'epoch': 0.23} 23%|██▎ | 5116/22095 [8:50:48<17:15:24, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50519 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88639 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45414 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44176 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5117/22095 [8:50:58<25:44:31, 5.46s/it] {'loss': 0.5036, 'grad_norm': 0.4006010921762384, 'learning_rate': 8.972114650433798e-06, 'epoch': 0.23} 23%|██▎ | 5117/22095 [8:50:58<25:44:31, 5.46s/it] 23%|██▎ | 5118/22095 [8:51:08<32:36:02, 6.91s/it] {'loss': 0.5028, 'grad_norm': 0.33678025565893854, 'learning_rate': 8.971669457207896e-06, 'epoch': 0.23} 23%|██▎ | 5118/22095 [8:51:08<32:36:02, 6.91s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 5119/22095 [8:51:12<28:15:30, 5.99s/it] {'loss': 0.3729, 'grad_norm': 0.7913353825927536, 'learning_rate': 8.971224178643015e-06, 'epoch': 0.23} 23%|██▎ | 5119/22095 [8:51:12<28:15:30, 5.99s/it] 23%|██▎ | 5120/22095 [8:51:16<25:05:38, 5.32s/it] {'loss': 0.3754, 'grad_norm': 0.7398032625351002, 'learning_rate': 8.970778814748722e-06, 'epoch': 0.23} 23%|██▎ | 5120/22095 [8:51:16<25:05:38, 5.32s/it] 23%|██▎ | 5121/22095 [8:51:19<22:09:11, 4.70s/it] {'loss': 0.396, 'grad_norm': 0.6682354188081827, 'learning_rate': 8.97033336553459e-06, 'epoch': 0.23} 23%|██▎ | 5121/22095 [8:51:19<22:09:11, 4.70s/it] 23%|██▎ | 5122/22095 [8:51:23<20:04:56, 4.26s/it] {'loss': 0.4829, 'grad_norm': 0.8144947277111944, 'learning_rate': 8.969887831010185e-06, 'epoch': 0.23} 23%|██▎ | 5122/22095 [8:51:23<20:04:56, 4.26s/it] 23%|██▎ | 5123/22095 [8:51:25<18:04:31, 3.83s/it] {'loss': 0.388, 'grad_norm': 0.7451888612956965, 'learning_rate': 8.969442211185086e-06, 'epoch': 0.23} 23%|██▎ | 5123/22095 [8:51:25<18:04:31, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52114 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56029 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5124/22095 [8:51:28<17:02:20, 3.61s/it] {'loss': 0.3659, 'grad_norm': 0.7696054579132323, 'learning_rate': 8.968996506068863e-06, 'epoch': 0.23} 23%|██▎ | 5124/22095 [8:51:28<17:02:20, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5125/22095 [8:51:38<25:28:45, 5.41s/it] {'loss': 0.5083, 'grad_norm': 0.47226834278168806, 'learning_rate': 8.968550715671096e-06, 'epoch': 0.23} 23%|██▎ | 5125/22095 [8:51:38<25:28:45, 5.41s/it] 23%|██▎ | 5126/22095 [8:51:42<22:47:16, 4.83s/it] {'loss': 0.363, 'grad_norm': 0.8043182628259528, 'learning_rate': 8.968104840001362e-06, 'epoch': 0.23} 23%|██▎ | 5126/22095 [8:51:42<22:47:16, 4.83s/it] 23%|██▎ | 5127/22095 [8:51:45<20:17:36, 4.31s/it] {'loss': 0.3646, 'grad_norm': 0.8018927387330322, 'learning_rate': 8.967658879069243e-06, 'epoch': 0.23} 23%|██▎ | 5127/22095 [8:51:45<20:17:36, 4.31s/it] 23%|██▎ | 5128/22095 [8:51:48<18:18:33, 3.88s/it] {'loss': 0.3807, 'grad_norm': 0.6455595712351934, 'learning_rate': 8.96721283288432e-06, 'epoch': 0.23} 23%|██▎ | 5128/22095 [8:51:48<18:18:33, 3.88s/it] 23%|██▎ | 5129/22095 [8:51:51<18:20:25, 3.89s/it] {'loss': 0.4122, 'grad_norm': 0.6741490222011743, 'learning_rate': 8.966766701456177e-06, 'epoch': 0.23} 23%|██▎ | 5129/22095 [8:51:51<18:20:25, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73067 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59577 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5130/22095 [8:51:55<17:32:39, 3.72s/it] {'loss': 0.4552, 'grad_norm': 0.7687792391535636, 'learning_rate': 8.9663204847944e-06, 'epoch': 0.23} 23%|██▎ | 5130/22095 [8:51:55<17:32:39, 3.72s/it] 23%|██▎ | 5131/22095 [8:52:00<19:15:27, 4.09s/it] {'loss': 0.3792, 'grad_norm': 0.6669012351077497, 'learning_rate': 8.965874182908578e-06, 'epoch': 0.23} 23%|██▎ | 5131/22095 [8:52:00<19:15:27, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5132/22095 [8:52:10<27:23:07, 5.81s/it] {'loss': 0.4974, 'grad_norm': 0.3517548575118009, 'learning_rate': 8.9654277958083e-06, 'epoch': 0.23} 23%|██▎ | 5132/22095 [8:52:10<27:23:07, 5.81s/it] 23%|██▎ | 5133/22095 [8:52:14<25:21:02, 5.38s/it] {'loss': 0.3722, 'grad_norm': 0.6987967216935925, 'learning_rate': 8.96498132350316e-06, 'epoch': 0.23} 23%|██▎ | 5133/22095 [8:52:14<25:21:02, 5.38s/it] 23%|██▎ | 5134/22095 [8:52:17<21:52:48, 4.64s/it] {'loss': 0.3949, 'grad_norm': 2.2598695936336903, 'learning_rate': 8.964534766002747e-06, 'epoch': 0.23} 23%|██▎ | 5134/22095 [8:52:17<21:52:48, 4.64s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30998.png 2025-08-28 00:50:13.273659 load time: 1824.14 ms Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047931 in VC:s3://multi-modal/UniGeo/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 11cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=18cm,BC=6cm,∴AC=AB-BC=12cm又∵D为BC的中点,∴CD=\\frac{1}{2}BC=3于是AD=AC+CD=12+3=15'}]} 23%|██▎ | 5135/22095 [8:52:20<19:39:28, 4.17s/it] {'loss': 0.4195, 'grad_norm': 0.6806879214900802, 'learning_rate': 8.964088123316657e-06, 'epoch': 0.23} 23%|██▎ | 5135/22095 [8:52:20<19:39:28, 4.17s/it] 23%|██▎ | 5136/22095 [8:52:23<18:47:56, 3.99s/it] {'loss': 0.3989, 'grad_norm': 0.6509213495136762, 'learning_rate': 8.96364139545449e-06, 'epoch': 0.23} 23%|██▎ | 5136/22095 [8:52:23<18:47:56, 3.99s/it] 23%|██▎ | 5137/22095 [8:52:27<18:45:12, 3.98s/it] {'loss': 0.398, 'grad_norm': 0.6993968892851343, 'learning_rate': 8.96319458242584e-06, 'epoch': 0.23} 23%|██▎ | 5137/22095 [8:52:27<18:45:12, 3.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5138/22095 [8:52:35<23:47:12, 5.05s/it] {'loss': 0.5247, 'grad_norm': 0.3881002933457079, 'learning_rate': 8.962747684240313e-06, 'epoch': 0.23} 23%|██▎ | 5138/22095 [8:52:35<23:47:12, 5.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67288 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46540 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5139/22095 [8:52:39<22:45:40, 4.83s/it] {'loss': 0.4302, 'grad_norm': 0.7525172989755955, 'learning_rate': 8.962300700907508e-06, 'epoch': 0.23} 23%|██▎ | 5139/22095 [8:52:39<22:45:40, 4.83s/it] 23%|██▎ | 5140/22095 [8:52:43<21:48:39, 4.63s/it] {'loss': 0.3945, 'grad_norm': 0.6624381151083798, 'learning_rate': 8.96185363243703e-06, 'epoch': 0.23} 23%|██▎ | 5140/22095 [8:52:43<21:48:39, 4.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 23%|██▎ | 5141/22095 [8:52:47<20:46:33, 4.41s/it] {'loss': 0.3918, 'grad_norm': 0.6056783431125236, 'learning_rate': 8.961406478838486e-06, 'epoch': 0.23} 23%|██▎ | 5141/22095 [8:52:47<20:46:33, 4.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8952443 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3278, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 12\nB. 16\nC. 9\nD. 
10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 23%|██▎ | 5142/22095 [8:52:51<19:32:06, 4.15s/it] {'loss': 0.4018, 'grad_norm': 0.6902131262621901, 'learning_rate': 8.960959240121483e-06, 'epoch': 0.23} 23%|██▎ | 5142/22095 [8:52:51<19:32:06, 4.15s/it] 23%|██▎ | 5143/22095 [8:52:54<18:22:47, 3.90s/it] {'loss': 0.433, 'grad_norm': 0.7433333943110101, 'learning_rate': 8.96051191629563e-06, 'epoch': 0.23} 23%|██▎ | 5143/22095 [8:52:54<18:22:47, 3.90s/it] 23%|██▎ | 5144/22095 [8:52:57<17:04:07, 3.63s/it] {'loss': 0.3489, 'grad_norm': 0.6448248656475403, 'learning_rate': 8.96006450737054e-06, 'epoch': 0.23} 23%|██▎ | 5144/22095 [8:52:57<17:04:07, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85499 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 5145/22095 [8:53:01<17:19:33, 3.68s/it] {'loss': 0.4211, 'grad_norm': 0.6608729931513597, 'learning_rate': 8.959617013355829e-06, 'epoch': 0.23} 23%|██▎ | 5145/22095 [8:53:01<17:19:33, 3.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44523 > 40960). 
Running this sequence through the model will result in indexing errors 23%|██▎ | 5146/22095 [8:53:05<17:48:38, 3.78s/it] {'loss': 0.404, 'grad_norm': 0.714836914340452, 'learning_rate': 8.959169434261106e-06, 'epoch': 0.23} 23%|██▎ | 5146/22095 [8:53:05<17:48:38, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 23%|██▎ | 5147/22095 [8:53:13<23:38:27, 5.02s/it] {'loss': 0.4861, 'grad_norm': 0.4241072094567102, 'learning_rate': 8.958721770095993e-06, 'epoch': 0.23} 23%|██▎ | 5147/22095 [8:53:13<23:38:27, 5.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8902705 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 25858, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]} 23%|██▎ | 5148/22095 [8:53:17<21:54:38, 4.65s/it] {'loss': 0.3767, 'grad_norm': 0.7033691745901531, 'learning_rate': 8.958274020870107e-06, 'epoch': 0.23} 23%|██▎ | 5148/22095 [8:53:17<21:54:38, 4.65s/it] 23%|██▎ | 5149/22095 [8:53:20<20:17:58, 4.31s/it] {'loss': 0.3817, 'grad_norm': 0.6697782259656342, 'learning_rate': 8.95782618659307e-06, 'epoch': 0.23} 23%|██▎ | 5149/22095 [8:53:20<20:17:58, 4.31s/it] 23%|██▎ | 5150/22095 [8:53:23<18:26:33, 3.92s/it] {'loss': 0.3734, 'grad_norm': 0.684941030312248, 'learning_rate': 8.957378267274502e-06, 'epoch': 0.23} 23%|██▎ | 5150/22095 [8:53:23<18:26:33, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50490 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71884 > 40960). Running this sequence through the model will result in indexing errors 23%|██▎ | 5151/22095 [8:53:26<16:56:00, 3.60s/it] {'loss': 0.3462, 'grad_norm': 0.6454346323086292, 'learning_rate': 8.95693026292403e-06, 'epoch': 0.23} 23%|██▎ | 5151/22095 [8:53:26<16:56:00, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111391 > 40960). 
Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8361119 in VC:s3://internvl-moe-sft-data/. Exception: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27847, 'image': 'vrdu_table_final_2/astro-ph.CO/47ad0219-91e4-4eee-ab51-a94579c890c9.png', 'image_wh': [[214, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\small #1 \\today\n\\end{tabular}\n```"}]}
23%|██▎ | 5152/22095 [8:53:31<18:12:24, 3.87s/it] {'loss': 0.3936, 'grad_norm': 0.6743028983951925, 'learning_rate': 8.956482173551281e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44391 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110419 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122652 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5153/22095 [8:53:37<22:12:27, 4.72s/it] {'loss': 0.5015, 'grad_norm': 0.35510312218255524, 'learning_rate': 8.956033999165881e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (44343 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5154/22095 [8:53:47<29:27:08, 6.26s/it] {'loss': 0.4976, 'grad_norm': 0.33903672826457865, 'learning_rate': 8.95558573977746e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 364, but got module 1
23%|██▎ | 5155/22095 [8:53:51<25:22:18, 5.39s/it] {'loss': 0.4227, 'grad_norm': 0.6732348430980337, 'learning_rate': 8.955137395395649e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (93466 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5156/22095 [8:53:54<22:48:08, 4.85s/it] {'loss': 0.3716, 'grad_norm': 0.6513263774892051, 'learning_rate': 8.954688966030083e-06, 'epoch': 0.23}
23%|██▎ | 5157/22095 [8:53:57<20:08:41, 4.28s/it] {'loss': 0.3805, 'grad_norm': 0.6796153436099222, 'learning_rate': 8.954240451690396e-06, 'epoch': 0.23}
23%|██▎ | 5158/22095 [8:54:00<18:06:32, 3.85s/it] {'loss': 0.4376, 'grad_norm': 0.7066157020242287, 'learning_rate': 8.953791852386229e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
23%|██▎ | 5159/22095 [8:54:07<22:33:28, 4.80s/it] {'loss': 0.5046, 'grad_norm': 0.4003704868791629, 'learning_rate': 8.953343168127218e-06, 'epoch': 0.23}
23%|██▎ | 5160/22095 [8:54:11<20:53:26, 4.44s/it] {'loss': 0.3873, 'grad_norm': 0.7553986056398494, 'learning_rate': 8.952894398923003e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
23%|██▎ | 5161/22095 [8:54:19<26:31:18, 5.64s/it] {'loss': 0.4796, 'grad_norm': 0.3605697912438177, 'learning_rate': 8.952445544783227e-06, 'epoch': 0.23}
23%|██▎ | 5162/22095 [8:54:28<30:57:02, 6.58s/it] {'loss': 0.5052, 'grad_norm': 0.33431313155846065, 'learning_rate': 8.951996605717537e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 364, but got module 1
23%|██▎ | 5163/22095 [8:54:31<26:29:07, 5.63s/it] {'loss': 0.3645, 'grad_norm': 0.6935848168831603, 'learning_rate': 8.951547581735576e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
23%|██▎ | 5164/22095 [8:54:41<31:55:49, 6.79s/it] {'loss': 0.5062, 'grad_norm': 0.28657580245470116, 'learning_rate': 8.951098472846994e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 364, but got module 1
23%|██▎ | 5165/22095 [8:54:44<26:55:21, 5.72s/it] {'loss': 0.3496, 'grad_norm': 0.6478693357949638, 'learning_rate': 8.950649279061441e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [137, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8394420 in VC:s3://internvl-moe-sft-data/. Exception: Image size [137, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61255, 'image': 'vrdu_table_final_2/astro-ph.EP/1caee1a3-eb07-4248-a823-b2955a893544.png', 'image_wh': [[137, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}} \\hspace{-1cm}Continuum \\\\ \\\\ \\end{tabular}\n```"}]}
23%|██▎ | 5166/22095 [8:54:47<23:28:54, 4.99s/it] {'loss': 0.4445, 'grad_norm': 0.6352113123421139, 'learning_rate': 8.950200000388569e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (64415 > 40960).
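The recurring `Rank 0: Number of image tokens 0 does not match number of images 1` messages mean a conversation arrived with an image attached but no image placeholder in its text, which the loader then patches in place. A sketch of the underlying consistency check (the `<image>` placeholder string and the `image_token_mismatch` helper are assumptions for illustration; the actual placeholder used by the training code may differ):

```python
IMAGE_PLACEHOLDER = "<image>"  # assumed placeholder token; the real one may differ


def image_token_mismatch(conversations: list[dict], num_images: int) -> bool:
    """True when the number of placeholders across all turns differs from
    the number of images attached to the sample (the condition logged above)."""
    found = sum(
        turn.get("value", "").count(IMAGE_PLACEHOLDER) for turn in conversations
    )
    return found != num_images


# One image attached but no placeholder in the text -> mismatch, as logged.
convs = [{"from": "human", "value": "\nDescribe the figure."}]
print(image_token_mismatch(convs, num_images=1))  # True
```

The loader recovers ("Fixed image tokens in the conversation"), but a one-off audit of the annotation files with a check like this would show how widespread the malformed samples are.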
Running this sequence through the model will result in indexing errors
23%|██▎ | 5167/22095 [8:54:51<21:47:59, 4.64s/it] {'loss': 0.4252, 'grad_norm': 0.6870906658498175, 'learning_rate': 8.94975063683803e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (41320 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47742 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86765 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5168/22095 [8:54:55<20:30:46, 4.36s/it] {'loss': 0.353, 'grad_norm': 0.6279771064277256, 'learning_rate': 8.949301188419481e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (132791 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5169/22095 [8:54:58<18:36:32, 3.96s/it] {'loss': 0.3868, 'grad_norm': 0.6866365440285913, 'learning_rate': 8.948851655142579e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396974 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63827, 'image': 'vrdu_table_final_2/astro-ph.EP/af2e510f-bdf9-43b0-862b-3c4df7f2cb9a.png', 'image_wh': [[17, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{l}$\\omega$\\end{tabular}\n```"}]}
23%|██▎ | 5170/22095 [8:55:01<17:54:45, 3.81s/it] {'loss': 0.4156, 'grad_norm': 0.634897252379839, 'learning_rate': 8.948402037016984e-06, 'epoch': 0.23}
23%|██▎ | 5171/22095 [8:55:04<17:07:58, 3.64s/it] {'loss': 0.384, 'grad_norm': 0.6840741049902902, 'learning_rate': 8.947952334052354e-06, 'epoch': 0.23}
23%|██▎ | 5172/22095 [8:55:07<16:00:46, 3.41s/it] {'loss': 0.3184, 'grad_norm': 0.6709720523253192, 'learning_rate': 8.947502546258354e-06, 'epoch': 0.23}
23%|██▎ | 5173/22095 [8:55:11<16:18:20, 3.47s/it] {'loss': 0.3699, 'grad_norm': 0.6546340406137499, 'learning_rate': 8.947052673644649e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (43423 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42180 > 40960).
Running this sequence through the model will result in indexing errors
23%|██▎ | 5174/22095 [8:55:14<15:56:27, 3.39s/it] {'loss': 0.3687, 'grad_norm': 1.1575843601066382, 'learning_rate': 8.946602716220903e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
23%|██▎ | 5175/22095 [8:55:22<21:51:52, 4.65s/it] {'loss': 0.5038, 'grad_norm': 0.5592146591684121, 'learning_rate': 8.946152673996786e-06, 'epoch': 0.23}
23%|██▎ | 5176/22095 [8:55:31<28:31:28, 6.07s/it] {'loss': 0.4751, 'grad_norm': 0.438025934009356, 'learning_rate': 8.94570254698197e-06, 'epoch': 0.23}
23%|██▎ | 5177/22095 [8:55:40<33:05:40, 7.04s/it] {'loss': 0.5063, 'grad_norm': 0.3331041402234504, 'learning_rate': 8.94525233518612e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (195698800 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7925640 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (195698800 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/10446.png', 'image_wh': [[15800, 12386]], 'conversations': [{'from': 'human', 'value': '\nWhat percent of confirmed COVID-19 cases survived? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': '- Simple phrase answer: 97.7%\n- According to the text, there were 72,314 unique COVID-19 cases diagnosed in the first 40 days in Hubei province, with 44,672 confirmed healthcare deaths (2.3%). This means that the remaining percentage of cases, which is 97.7%, survived.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
23%|██▎ | 5178/22095 [8:55:44<28:35:26, 6.08s/it] {'loss': 0.3478, 'grad_norm': 0.7216492408264652, 'learning_rate': 8.944802038618919e-06, 'epoch': 0.23}
23%|██▎ | 5179/22095 [8:55:48<24:58:40, 5.32s/it] {'loss': 0.3644, 'grad_norm': 0.6858494288023315, 'learning_rate': 8.944351657290037e-06, 'epoch': 0.23}
23%|██▎ | 5180/22095 [8:55:51<21:37:17, 4.60s/it] {'loss': 0.3867, 'grad_norm': 0.6451322581270235, 'learning_rate': 8.94390119120915e-06, 'epoch': 0.23}
23%|██▎ | 5181/22095 [8:55:54<19:14:10, 4.09s/it] {'loss': 0.4206, 'grad_norm': 0.7447336454417398, 'learning_rate': 8.94345064038594e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54724 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41001 > 40960).
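The `DecompressionBombError` above is Pillow's decompression-bomb safeguard: `Image.open` warns once an image exceeds `Image.MAX_IMAGE_PIXELS` (89,478,485 by default) and raises above twice that value, which is exactly the 178,956,970-pixel limit quoted in the log; the 15800x12386 InfographicsVQA page comes to 195,698,800 pixels. A sketch of that arithmetic (the `would_raise_bomb_error` helper is hypothetical; for trusted corpora the usual workaround is raising `PIL.Image.MAX_IMAGE_PIXELS`, at the cost of the DoS protection it provides):

```python
# Pillow defaults: warn above MAX_IMAGE_PIXELS, raise above twice that value.
DEFAULT_MAX_IMAGE_PIXELS = 89_478_485


def would_raise_bomb_error(width: int, height: int,
                           max_pixels: int = DEFAULT_MAX_IMAGE_PIXELS) -> bool:
    """True when Image.open would raise PIL.Image.DecompressionBombError
    for an image of this size (pixel count above 2 * max_pixels)."""
    return width * height > 2 * max_pixels


# The InfographicsVQA page logged above: 15800 * 12386 = 195,698,800 pixels,
# which is above the 2 * 89,478,485 = 178,956,970 raise threshold.
print(would_raise_bomb_error(15800, 12386))  # True
```

Since this is the rare failure (one oversized page among many undersized ones), skipping it via the same pre-filter pass is likely cheaper than loosening the limit globally.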
Running this sequence through the model will result in indexing errors
23%|██▎ | 5182/22095 [8:56:03<26:49:05, 5.71s/it] {'loss': 0.514, 'grad_norm': 0.7668578669225723, 'learning_rate': 8.943000004830087e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881988 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5141, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 6\nB. 2\nC. 8\nD. 4'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]}
23%|██▎ | 5183/22095 [8:56:07<24:01:15, 5.11s/it] {'loss': 0.4438, 'grad_norm': 0.718206994552857, 'learning_rate': 8.942549284551274e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
23%|██▎ | 5184/22095 [8:56:10<21:46:46, 4.64s/it] {'loss': 0.4234, 'grad_norm': 0.7203731751657185, 'learning_rate': 8.942098479559185e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
23%|██▎ | 5185/22095 [8:56:14<20:38:04, 4.39s/it] {'loss': 0.4002, 'grad_norm': 0.6292088498449315, 'learning_rate': 8.941647589863507e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (62200 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91109 > 40960).
Running this sequence through the model will result in indexing errors
23%|██▎ | 5186/22095 [8:56:17<18:58:23, 4.04s/it] {'loss': 0.4002, 'grad_norm': 0.6600783332319208, 'learning_rate': 8.941196615473929e-06, 'epoch': 0.23}
23%|██▎ | 5187/22095 [8:56:20<17:29:01, 3.72s/it] {'loss': 0.4058, 'grad_norm': 0.7434646236905292, 'learning_rate': 8.94074555640014e-06, 'epoch': 0.23}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308023 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2I6rudAGj11JjSZFMXXXnRVXa_!!3434792875.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to read and retrieve all the words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n风靡韩国\n正品态度\n同样低价\n30000台\n礼品丰厚\n免费开票\n送上楼\n钜惠价\n135\n全店满就减进行中!\n进店送豪礼'}]}
23%|██▎ | 5188/22095 [8:56:24<16:41:58, 3.56s/it] {'loss': 0.3731, 'grad_norm': 0.6471649173552345, 'learning_rate': 8.940294412651831e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (64990 > 40960).
Running this sequence through the model will result in indexing errors
23%|██▎ | 5189/22095 [8:56:28<17:38:30, 3.76s/it] {'loss': 0.4365, 'grad_norm': 0.7267012818438411, 'learning_rate': 8.939843184238698e-06, 'epoch': 0.23}
Invalidate trace cache @ step 2: expected module 1, but got module 364
23%|██▎ | 5190/22095 [8:56:37<25:42:45, 5.48s/it] {'loss': 0.4843, 'grad_norm': 0.42194579918599756, 'learning_rate': 8.939391871170435e-06, 'epoch': 0.23}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8452252 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 9551, 'image': 'vrdu_texteq/astro-ph.CO/4273f1e9-5dee-44fe-996f-129c9c0255c1.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': '$\\Downarrow$'}]}
23%|██▎ | 5191/22095 [8:56:41<23:04:22, 4.91s/it] {'loss': 0.4126, 'grad_norm': 0.6951856656124707, 'learning_rate': 8.93894047345674e-06, 'epoch': 0.23}
Token indices sequence length is longer than the specified maximum sequence length for this model (44635 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72374 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75406 > 40960). Running this sequence through the model will result in indexing errors
23%|██▎ | 5192/22095 [8:56:44<20:11:00, 4.30s/it] {'loss': 0.391, 'grad_norm': 0.6568313845426516, 'learning_rate': 8.93848899110731e-06, 'epoch': 0.23}
24%|██▎ | 5193/22095 [8:56:47<18:07:11, 3.86s/it] {'loss': 0.3812, 'grad_norm': 0.7239464873852787, 'learning_rate': 8.93803742413185e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [709, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8507683 in VC:s3://internvl-moe-sft-data/. Exception: Image size [709, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32133, 'image': 'vrdu_texteq/astro-ph.CO/173eda26-bf67-4a8a-9477-07ed10b0f2ea.png', 'image_wh': [[709, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'The use of the virial theorem to derive $M_{BH}$ based on the'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▎ | 5194/22095 [8:56:54<23:21:34, 4.98s/it] {'loss': 0.5347, 'grad_norm': 0.3504256078341399, 'learning_rate': 8.937585772540058e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
24%|██▎ | 5195/22095 [8:56:58<22:28:15, 4.79s/it] {'loss': 0.3973, 'grad_norm': 0.6827160249138898, 'learning_rate': 8.937134036341643e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (43294 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63937 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91412 > 40960). Running this sequence through the model will result in indexing errors
24%|██▎ | 5196/22095 [8:57:01<19:57:33, 4.25s/it] {'loss': 0.4111, 'grad_norm': 0.6703648183205427, 'learning_rate': 8.93668221554631e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (86770 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92050 > 40960). Running this sequence through the model will result in indexing errors
24%|██▎ | 5197/22095 [8:57:12<28:30:46, 6.07s/it] {'loss': 0.5168, 'grad_norm': 0.31976622494354495, 'learning_rate': 8.936230310163765e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
24%|██▎ | 5198/22095 [8:57:15<24:41:11, 5.26s/it] {'loss': 0.3728, 'grad_norm': 0.6581618322094055, 'learning_rate': 8.935778320203721e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (68323 > 40960). Running this sequence through the model will result in indexing errors
24%|██▎ | 5199/22095 [8:57:19<22:43:33, 4.84s/it] {'loss': 0.3508, 'grad_norm': 0.6170063595280307, 'learning_rate': 8.935326245675887e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▎ | 5200/22095 [8:57:28<28:58:58, 6.18s/it] {'loss': 0.4836, 'grad_norm': 0.33336389010807627, 'learning_rate': 8.934874086589981e-06, 'epoch': 0.24}
24%|██▎ | 5201/22095 [8:57:32<25:24:10, 5.41s/it] {'loss': 0.376, 'grad_norm': 0.9567379336319621, 'learning_rate': 8.934421842955715e-06, 'epoch': 0.24}
24%|██▎ | 5202/22095 [8:57:35<22:29:38, 4.79s/it] {'loss': 0.35, 'grad_norm': 0.6638392906292916, 'learning_rate': 8.933969514782808e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (48988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73910 > 40960). Running this sequence through the model will result in indexing errors
24%|██▎ | 5203/22095 [8:57:38<19:57:17, 4.25s/it] {'loss': 0.3838, 'grad_norm': 0.65336301854799, 'learning_rate': 8.933517102080977e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
24%|██▎ | 5204/22095 [8:57:48<27:28:42, 5.86s/it] {'loss': 0.4925, 'grad_norm': 0.3227094570153876, 'learning_rate': 8.933064604859945e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (44182 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48092 > 40960). Running this sequence through the model will result in indexing errors
24%|██▎ | 5205/22095 [8:57:55<28:55:56, 6.17s/it] {'loss': 0.5056, 'grad_norm': 0.31246372138751155, 'learning_rate': 8.932612023129433e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
24%|██▎ | 5206/22095 [8:57:58<25:09:21, 5.36s/it] {'loss': 0.3625, 'grad_norm': 0.7701575727730373, 'learning_rate': 8.932159356899169e-06, 'epoch': 0.24}
24%|██▎ | 5207/22095 [8:58:08<30:58:56, 6.60s/it] {'loss': 0.4982, 'grad_norm': 0.2880103831704138, 'learning_rate': 8.931706606178874e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
24%|██▎ | 5208/22095 [8:58:12<27:41:33, 5.90s/it] {'loss': 0.3792, 'grad_norm': 0.7208448019428256, 'learning_rate': 8.931253770978281e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (77815 > 40960).
Running this sequence through the model will result in indexing errors
24%|██▎ | 5209/22095 [8:58:16<25:05:05, 5.35s/it] {'loss': 0.39, 'grad_norm': 0.6607662590967202, 'learning_rate': 8.93080085130712e-06, 'epoch': 0.24}
24%|██▎ | 5210/22095 [8:58:20<22:24:09, 4.78s/it] {'loss': 0.3755, 'grad_norm': 0.6524379021573505, 'learning_rate': 8.930347847175118e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▎ | 5211/22095 [8:58:30<30:42:59, 6.55s/it] {'loss': 0.4992, 'grad_norm': 0.3524805216490473, 'learning_rate': 8.929894758592016e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918154 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41307, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nA. 32cm\nB. 4cm\nC. 8cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
24%|██▎ | 5212/22095 [8:58:35<28:05:28, 5.99s/it] {'loss': 0.4145, 'grad_norm': 0.6890393087617782, 'learning_rate': 8.929441585567543e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▎ | 5213/22095 [8:58:45<34:19:40, 7.32s/it] {'loss': 0.4905, 'grad_norm': 0.35273439977535703, 'learning_rate': 8.928988328111437e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8451073 in VC:s3://internvl-moe-sft-data/. Exception: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 74702, 'image': 'vrdu_texteq/astro-ph.CO/07892f11-0041-4772-b790-eb5022e1ebeb.png', 'image_wh': [[387, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where $\\sigma^{2}$ and $\\omega^{2}$ are the scalars'}]} 24%|██▎ | 5214/22095 [8:58:50<30:28:33, 6.50s/it] {'loss': 0.3921, 'grad_norm': 0.6956924315211259, 'learning_rate': 8.928534986233441e-06, 'epoch': 0.24} 24%|██▎ | 5214/22095 [8:58:50<30:28:33, 6.50s/it] 24%|██▎ | 5215/22095 [8:58:54<26:48:04, 5.72s/it] {'loss': 0.3345, 'grad_norm': 0.6630715659033836, 'learning_rate': 8.928081559943293e-06, 'epoch': 0.24} 24%|██▎ | 5215/22095 [8:58:54<26:48:04, 5.72s/it] 24%|██▎ | 5216/22095 [8:58:57<23:22:09, 4.98s/it] {'loss': 0.36, 'grad_norm': 0.6177416326477722, 'learning_rate': 8.927628049250736e-06, 'epoch': 0.24} 24%|██▎ | 5216/22095 [8:58:57<23:22:09, 4.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49330 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96473 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▎ | 5217/22095 [8:59:04<26:10:46, 5.58s/it] {'loss': 0.4947, 'grad_norm': 0.4239754000254461, 'learning_rate': 8.927174454165518e-06, 'epoch': 0.24} 24%|██▎ | 5217/22095 [8:59:04<26:10:46, 5.58s/it] 24%|██▎ | 5218/22095 [8:59:07<22:54:13, 4.89s/it] {'loss': 0.3808, 'grad_norm': 0.7221304351259809, 'learning_rate': 8.926720774697379e-06, 'epoch': 0.24} 24%|██▎ | 5218/22095 [8:59:07<22:54:13, 4.89s/it] 24%|██▎ | 5219/22095 [8:59:10<20:02:28, 4.28s/it] {'loss': 0.4445, 'grad_norm': 0.7340501642183168, 'learning_rate': 8.926267010856072e-06, 'epoch': 0.24} 24%|██▎ | 5219/22095 [8:59:10<20:02:28, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63338 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50684 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72561 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41522 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▎ | 5220/22095 [8:59:14<19:13:10, 4.10s/it] {'loss': 0.402, 'grad_norm': 0.6257379899081139, 'learning_rate': 8.925813162651345e-06, 'epoch': 0.24} 24%|██▎ | 5220/22095 [8:59:14<19:13:10, 4.10s/it] 24%|██▎ | 5221/22095 [8:59:17<17:44:00, 3.78s/it] {'loss': 0.4159, 'grad_norm': 0.6189493871758575, 'learning_rate': 8.92535923009295e-06, 'epoch': 0.24} 24%|██▎ | 5221/22095 [8:59:17<17:44:00, 3.78s/it] 24%|██▎ | 5222/22095 [8:59:20<16:26:38, 3.51s/it] {'loss': 0.3664, 'grad_norm': 0.6375101034399829, 'learning_rate': 8.924905213190641e-06, 'epoch': 0.24} 24%|██▎ | 5222/22095 [8:59:20<16:26:38, 3.51s/it] 24%|██▎ | 5223/22095 [8:59:23<15:42:38, 3.35s/it] {'loss': 0.4071, 'grad_norm': 0.659813540503722, 'learning_rate': 8.924451111954173e-06, 'epoch': 0.24} 24%|██▎ | 5223/22095 [8:59:23<15:42:38, 3.35s/it] 24%|██▎ | 5224/22095 [8:59:26<15:21:14, 3.28s/it] {'loss': 0.3749, 'grad_norm': 0.6552180360974372, 'learning_rate': 8.923996926393306e-06, 'epoch': 0.24} 24%|██▎ | 5224/22095 [8:59:26<15:21:14, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▎ | 5225/22095 [8:59:35<24:12:23, 5.17s/it] {'loss': 0.5057, 'grad_norm': 0.3810805951151939, 'learning_rate': 8.923542656517795e-06, 'epoch': 0.24} 24%|██▎ | 5225/22095 [8:59:35<24:12:23, 5.17s/it] 24%|██▎ | 5226/22095 [8:59:39<21:28:54, 4.58s/it] {'loss': 0.3789, 'grad_norm': 0.6787402212616696, 'learning_rate': 8.923088302337402e-06, 'epoch': 0.24} 24%|██▎ | 5226/22095 [8:59:39<21:28:54, 4.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▎ | 5227/22095 [8:59:43<20:50:29, 4.45s/it] {'loss': 0.4037, 'grad_norm': 0.6387130171561558, 'learning_rate': 8.922633863861891e-06, 'epoch': 0.24} 24%|██▎ | 5227/22095 [8:59:43<20:50:29, 4.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is 
longer than the specified maximum sequence length for this model (63317 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55830 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64865 > 40960). Running this sequence through the model will result in indexing errors 24%|██▎ | 5228/22095 [8:59:52<28:01:35, 5.98s/it] {'loss': 0.4942, 'grad_norm': 0.3172792652610559, 'learning_rate': 8.922179341101027e-06, 'epoch': 0.24} 24%|██▎ | 5228/22095 [8:59:52<28:01:35, 5.98s/it] 24%|██▎ | 5229/22095 [8:59:56<24:53:07, 5.31s/it] {'loss': 0.3778, 'grad_norm': 0.6719534269724785, 'learning_rate': 8.921724734064573e-06, 'epoch': 0.24} 24%|██▎ | 5229/22095 [8:59:56<24:53:07, 5.31s/it] 24%|██▎ | 5230/22095 [8:59:59<21:32:04, 4.60s/it] {'loss': 0.4318, 'grad_norm': 0.7536957548197958, 'learning_rate': 8.9212700427623e-06, 'epoch': 0.24} 24%|██▎ | 5230/22095 [8:59:59<21:32:04, 4.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72362 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▎ | 5231/22095 [9:00:02<19:15:51, 4.11s/it] {'loss': 0.3631, 'grad_norm': 0.7039894799263119, 'learning_rate': 8.920815267203977e-06, 'epoch': 0.24} 24%|██▎ | 5231/22095 [9:00:02<19:15:51, 4.11s/it] 24%|██▎ | 5232/22095 [9:00:05<18:15:23, 3.90s/it] {'loss': 0.4254, 'grad_norm': 0.6832250244855517, 'learning_rate': 8.920360407399375e-06, 'epoch': 0.24} 24%|██▎ | 5232/22095 [9:00:05<18:15:23, 3.90s/it] 24%|██▎ | 5233/22095 [9:00:09<17:32:42, 3.75s/it] {'loss': 0.3406, 'grad_norm': 0.697516368031495, 'learning_rate': 8.919905463358269e-06, 'epoch': 0.24} 24%|██▎ | 5233/22095 [9:00:09<17:32:42, 3.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8940295 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63448, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]} 24%|██▎ | 5234/22095 [9:00:13<17:32:31, 3.75s/it] {'loss': 0.4, 'grad_norm': 0.6938707188368235, 'learning_rate': 8.919450435090433e-06, 'epoch': 0.24} 24%|██▎ | 5234/22095 [9:00:13<17:32:31, 3.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43775 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▎ | 5235/22095 [9:00:16<17:05:56, 3.65s/it] {'loss': 0.4226, 'grad_norm': 0.6964208971518877, 'learning_rate': 8.918995322605646e-06, 'epoch': 0.24} 24%|██▎ | 5235/22095 [9:00:16<17:05:56, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8916665 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 39818, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nA. 1.5\nB. 2\nC. 0.5\nD. 
1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▎ | 5236/22095 [9:00:24<22:36:52, 4.83s/it] {'loss': 0.492, 'grad_norm': 0.41458513233775623, 'learning_rate': 8.918540125913686e-06, 'epoch': 0.24} 24%|██▎ | 5236/22095 [9:00:24<22:36:52, 4.83s/it] 24%|██▎ | 5237/22095 [9:00:27<20:12:29, 4.32s/it] {'loss': 0.3861, 'grad_norm': 0.6790128431432206, 'learning_rate': 8.918084845024334e-06, 'epoch': 0.24} 24%|██▎ | 5237/22095 [9:00:27<20:12:29, 4.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
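The `DecompressionBombWarning` above (and the `DecompressionBombError` traceback later in this log) both stem from Pillow's pixel-count guard: it warns past `Image.MAX_IMAGE_PIXELS` and raises past twice that. The sketch below replicates the check in plain Python so the two thresholds in the log line up; the constant is Pillow's documented default, the helper name is ours:

```python
# Sketch of Pillow's decompression-bomb check: warn when an image exceeds
# MAX_IMAGE_PIXELS, raise when it exceeds twice that. The constant is
# Pillow's default (1 GiB / 4 bytes-per-pixel / 3, i.e. 89478485); the image
# sizes are the ones reported in this log.
MAX_IMAGE_PIXELS = 89478485

def bomb_check(width, height, limit=MAX_IMAGE_PIXELS):
    pixels = width * height
    if pixels > 2 * limit:
        return "error"    # Pillow raises DecompressionBombError here
    if pixels > limit:
        return "warning"  # Pillow emits DecompressionBombWarning here
    return "ok"

print(bomb_check(19362, 13207))  # 255713934 px, the failing page -> "error"
print(bomb_check(12600, 10200))  # 128520000 px, the warned case  -> "warning"
```

For a trusted corpus, setting `Image.MAX_IMAGE_PIXELS = None` (or a larger value) before loading disables or relaxes the check; that trades away the DoS protection, so it is only appropriate when the dataset source is controlled.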
warnings.warn( 24%|██▎ | 5238/22095 [9:00:30<18:23:18, 3.93s/it] {'loss': 0.3855, 'grad_norm': 0.6650948718946313, 'learning_rate': 8.917629479947369e-06, 'epoch': 0.24} 24%|██▎ | 5238/22095 [9:00:30<18:23:18, 3.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▎ | 5239/22095 [9:00:39<26:28:18, 5.65s/it] {'loss': 0.5137, 'grad_norm': 0.32944873763671007, 'learning_rate': 8.917174030692582e-06, 'epoch': 0.24} 24%|██▎ | 5239/22095 [9:00:39<26:28:18, 5.65s/it] 24%|██▎ | 5240/22095 [9:00:44<24:47:48, 5.30s/it] {'loss': 0.4035, 'grad_norm': 0.7527930233644158, 'learning_rate': 8.916718497269755e-06, 'epoch': 0.24} 24%|██▎ | 5240/22095 [9:00:44<24:47:48, 5.30s/it] 24%|██▎ | 5241/22095 [9:00:47<22:16:36, 4.76s/it] {'loss': 0.3802, 'grad_norm': 0.7248777445942971, 'learning_rate': 8.916262879688674e-06, 'epoch': 0.24} 24%|██▎ | 5241/22095 [9:00:47<22:16:36, 4.76s/it] 24%|██▎ | 5242/22095 [9:00:51<21:16:30, 4.54s/it] {'loss': 0.3757, 'grad_norm': 0.6412358156653725, 'learning_rate': 8.915807177959133e-06, 'epoch': 0.24} 24%|██▎ | 5242/22095 [9:00:51<21:16:30, 4.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▎ | 5243/22095 [9:01:01<27:54:33, 5.96s/it] {'loss': 0.4974, 'grad_norm': 0.387452448857917, 'learning_rate': 8.915351392090925e-06, 'epoch': 0.24} 24%|██▎ | 5243/22095 [9:01:01<27:54:33, 5.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49688 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81273 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50253 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▎ | 5244/22095 [9:01:04<24:41:57, 5.28s/it] {'loss': 0.3416, 'grad_norm': 0.7509110448440282, 'learning_rate': 8.914895522093839e-06, 'epoch': 0.24} 24%|██▎ | 5244/22095 [9:01:04<24:41:57, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77963 > 40960). Running this sequence through the model will result in indexing errors 24%|██▎ | 5245/22095 [9:01:07<21:19:10, 4.55s/it] {'loss': 0.3728, 'grad_norm': 0.6916588684495765, 'learning_rate': 8.91443956797767e-06, 'epoch': 0.24} 24%|██▎ | 5245/22095 [9:01:07<21:19:10, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54633 > 40960). Running this sequence through the model will result in indexing errors 24%|██▎ | 5246/22095 [9:01:11<19:45:58, 4.22s/it] {'loss': 0.3884, 'grad_norm': 0.6397813659500448, 'learning_rate': 8.91398352975222e-06, 'epoch': 0.24} 24%|██▎ | 5246/22095 [9:01:11<19:45:58, 4.22s/it] 24%|██▎ | 5247/22095 [9:01:13<17:47:34, 3.80s/it] {'loss': 0.3725, 'grad_norm': 1.3669528665831592, 'learning_rate': 8.913527407427282e-06, 'epoch': 0.24} 24%|██▎ | 5247/22095 [9:01:13<17:47:34, 3.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61597 > 40960). 
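The repeated `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings mean the encoded sample exceeds the model's context window, so positions past the limit would index beyond the position embeddings. A minimal guard, assuming the 40960 limit from the log (the helper name is hypothetical, not part of the training code):

```python
# Sketch: clamp over-long token-id sequences to the model's context window
# (40960 in this run) instead of letting them cause indexing errors.
MODEL_MAX_LENGTH = 40960

def clamp_token_ids(token_ids, max_length=MODEL_MAX_LENGTH):
    """Truncate a token-id sequence so it fits the model's context window."""
    return token_ids[:max_length] if len(token_ids) > max_length else token_ids

ids = list(range(61597))          # one of the over-long sequences in the log
print(len(clamp_token_ids(ids)))  # -> 40960
```

In practice one would truncate (or drop) such samples during preprocessing rather than at collation time, since blind truncation can cut an answer mid-sequence.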
Running this sequence through the model will result in indexing errors 24%|██▍ | 5248/22095 [9:01:16<16:26:42, 3.51s/it] {'loss': 0.3635, 'grad_norm': 0.714301416019482, 'learning_rate': 8.91307120101266e-06, 'epoch': 0.24} 24%|██▍ | 5248/22095 [9:01:16<16:26:42, 3.51s/it] 24%|██▍ | 5249/22095 [9:01:20<16:52:42, 3.61s/it] {'loss': 0.3996, 'grad_norm': 0.6345896928698095, 'learning_rate': 8.912614910518158e-06, 'epoch': 0.24} 24%|██▍ | 5249/22095 [9:01:20<16:52:42, 3.61s/it] 24%|██▍ | 5250/22095 [9:01:25<18:20:40, 3.92s/it] {'loss': 0.3743, 'grad_norm': 0.6514324533755185, 'learning_rate': 8.912158535953576e-06, 'epoch': 0.24} 24%|██▍ | 5250/22095 [9:01:25<18:20:40, 3.92s/it] 24%|██▍ | 5251/22095 [9:01:28<17:07:44, 3.66s/it] {'loss': 0.3851, 'grad_norm': 0.6343997575674019, 'learning_rate': 8.911702077328723e-06, 'epoch': 0.24} 24%|██▍ | 5251/22095 [9:01:28<17:07:44, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▍ | 5252/22095 [9:01:35<21:50:17, 4.67s/it] {'loss': 0.5126, 'grad_norm': 0.3922728341327845, 'learning_rate': 8.911245534653409e-06, 'epoch': 0.24} 24%|██▍ | 5252/22095 [9:01:35<21:50:17, 4.67s/it] 24%|██▍ | 5253/22095 [9:01:38<19:42:43, 4.21s/it] {'loss': 0.3682, 'grad_norm': 1.0385948182811826, 'learning_rate': 8.910788907937437e-06, 'epoch': 0.24} 24%|██▍ | 5253/22095 [9:01:38<19:42:43, 4.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (119065 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (132597 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▍ | 5254/22095 [9:01:48<27:27:16, 5.87s/it] {'loss': 0.5111, 'grad_norm': 0.3969089405014628, 'learning_rate': 8.910332197190623e-06, 'epoch': 0.24} 24%|██▍ | 5254/22095 [9:01:48<27:27:16, 5.87s/it] 24%|██▍ | 5255/22095 [9:01:51<24:26:51, 5.23s/it] {'loss': 0.3867, 'grad_norm': 0.7374498506816389, 'learning_rate': 8.90987540242278e-06, 'epoch': 0.24} 24%|██▍ | 5255/22095 [9:01:51<24:26:51, 5.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▍ | 5256/22095 [9:01:58<25:55:16, 5.54s/it] {'loss': 0.4951, 'grad_norm': 0.3126931159539168, 'learning_rate': 8.909418523643724e-06, 'epoch': 0.24} 24%|██▍ | 5256/22095 [9:01:58<25:55:16, 5.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60211 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78612 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49102 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▍ | 5257/22095 [9:02:02<23:28:15, 5.02s/it] {'loss': 0.3913, 'grad_norm': 0.6573278021852665, 'learning_rate': 8.908961560863271e-06, 'epoch': 0.24} 24%|██▍ | 5257/22095 [9:02:02<23:28:15, 5.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▍ | 5258/22095 [9:02:05<21:40:50, 4.64s/it] {'loss': 0.4083, 'grad_norm': 0.6754692628404956, 'learning_rate': 8.908504514091239e-06, 'epoch': 0.24} 24%|██▍ | 5258/22095 [9:02:05<21:40:50, 4.64s/it] 24%|██▍ | 5259/22095 [9:02:09<20:37:31, 4.41s/it] {'loss': 0.3772, 'grad_norm': 0.7009135373593584, 'learning_rate': 8.908047383337447e-06, 'epoch': 0.24} 24%|██▍ | 5259/22095 [9:02:09<20:37:31, 4.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) 
File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader img = Image.open(buff) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open im = _open_core(fp, filename, prefix, formats) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core _decompression_bomb_check(im.size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check raise DecompressionBombError(msg) PIL.Image.DecompressionBombError: Image size (255713934 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. [Try #0] Failed to fetch sample 7926319 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (255713934 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/38454.png', 'image_wh': [[19362, 13207]], 'conversations': [{'from': 'human', 'value': '\nhow many countries participated in the summit Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': '193 countries participated in the summit.\nThe text states that there were "193" countries involved in the "Global Compact" that was created for "safe, orderly and regular migration" as well as in the "New York Declaration" on refugees. 
This means that all 193 member states of the United Nations were involved in the summit for refugees and migrants.'}]} 24%|██▍ | 5260/22095 [9:02:12<18:32:50, 3.97s/it] {'loss': 0.4036, 'grad_norm': 0.6307270990180422, 'learning_rate': 8.907590168611724e-06, 'epoch': 0.24} 24%|██▍ | 5260/22095 [9:02:12<18:32:50, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▍ | 5261/22095 [9:02:22<26:37:51, 5.70s/it] {'loss': 0.4853, 'grad_norm': 0.3787399442226855, 'learning_rate': 8.907132869923886e-06, 'epoch': 0.24} 24%|██▍ | 5261/22095 [9:02:22<26:37:51, 5.70s/it] 24%|██▍ | 5262/22095 [9:02:31<32:11:32, 6.88s/it] {'loss': 0.4977, 'grad_norm': 0.3359140168820819, 'learning_rate': 8.906675487283764e-06, 'epoch': 0.24} 24%|██▍ | 5262/22095 [9:02:31<32:11:32, 6.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 24%|██▍ | 5263/22095 [9:02:35<27:59:28, 5.99s/it] {'loss': 0.405, 'grad_norm': 0.7146020697874832, 'learning_rate': 8.906218020701182e-06, 'epoch': 0.24} 24%|██▍ | 5263/22095 [9:02:35<27:59:28, 5.99s/it] 24%|██▍ | 5264/22095 [9:02:39<24:06:21, 5.16s/it] {'loss': 0.3726, 'grad_norm': 0.6773098535047575, 'learning_rate': 8.905760470185974e-06, 'epoch': 0.24} 24%|██▍ | 5264/22095 [9:02:39<24:06:21, 5.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 24%|██▍ | 5265/22095 [9:02:48<30:25:38, 6.51s/it] {'loss': 0.4924, 'grad_norm': 0.31021528458286485, 'learning_rate': 8.90530283574797e-06, 'epoch': 0.24} 24%|██▍ | 5265/22095 [9:02:48<30:25:38, 6.51s/it] 24%|██▍ | 5266/22095 [9:02:52<26:43:30, 5.72s/it] {'loss': 0.3884, 'grad_norm': 0.6946956337630187, 'learning_rate': 8.904845117397e-06, 'epoch': 0.24} 24%|██▍ | 5266/22095 [9:02:52<26:43:30, 5.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▍ | 
5267/22095 [9:02:55<23:20:17, 4.99s/it] {'loss': 0.3681, 'grad_norm': 0.6487113983731831, 'learning_rate': 8.904387315142901e-06, 'epoch': 0.24} 24%|██▍ | 5267/22095 [9:02:55<23:20:17, 4.99s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 24%|██▍ | 5268/22095 [9:02:59<21:19:52, 4.56s/it] {'loss': 0.3657, 'grad_norm': 0.7148333414265486, 'learning_rate': 8.903929428995512e-06, 'epoch': 0.24} 24%|██▍ | 5268/22095 [9:02:59<21:19:52, 4.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50554 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43602 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43008 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51598 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▍ | 5269/22095 [9:03:02<19:06:35, 4.09s/it] {'loss': 0.3237, 'grad_norm': 0.6329638363612822, 'learning_rate': 8.903471458964668e-06, 'epoch': 0.24} 24%|██▍ | 5269/22095 [9:03:02<19:06:35, 4.09s/it] 24%|██▍ | 5270/22095 [9:03:05<17:25:12, 3.73s/it] {'loss': 0.3561, 'grad_norm': 0.6737207547648599, 'learning_rate': 8.903013405060212e-06, 'epoch': 0.24} 24%|██▍ | 5270/22095 [9:03:05<17:25:12, 3.73s/it] 24%|██▍ | 5271/22095 [9:03:08<16:11:00, 3.46s/it] {'loss': 0.3589, 'grad_norm': 0.7075110234059288, 'learning_rate': 8.902555267291984e-06, 'epoch': 0.24} 24%|██▍ | 5271/22095 [9:03:08<16:11:00, 3.46s/it] 24%|██▍ | 5272/22095 [9:03:11<16:05:51, 3.44s/it] {'loss': 0.3779, 'grad_norm': 0.6371642429692355, 'learning_rate': 8.90209704566983e-06, 'epoch': 0.24} 24%|██▍ | 5272/22095 [9:03:11<16:05:51, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52088 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115315 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48484 > 40960). Running this sequence through the model will result in indexing errors 24%|██▍ | 5273/22095 [9:03:14<16:00:07, 3.42s/it] {'loss': 0.3782, 'grad_norm': 0.6740952463066865, 'learning_rate': 8.901638740203594e-06, 'epoch': 0.24} 24%|██▍ | 5273/22095 [9:03:14<16:00:07, 3.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68377 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (136873 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82360 > 40960). Running this sequence through the model will result in indexing errors 24%|██▍ | 5274/22095 [9:03:18<16:31:32, 3.54s/it] {'loss': 0.3964, 'grad_norm': 0.6595696256173088, 'learning_rate': 8.901180350903125e-06, 'epoch': 0.24} 24%|██▍ | 5274/22095 [9:03:18<16:31:32, 3.54s/it] 24%|██▍ | 5275/22095 [9:03:21<15:32:48, 3.33s/it] {'loss': 0.3956, 'grad_norm': 0.6521331171437978, 'learning_rate': 8.900721877778271e-06, 'epoch': 0.24} 24%|██▍ | 5275/22095 [9:03:21<15:32:48, 3.33s/it] 24%|██▍ | 5276/22095 [9:03:24<15:23:12, 3.29s/it] {'loss': 0.357, 'grad_norm': 0.6925461727004567, 'learning_rate': 8.900263320838886e-06, 'epoch': 0.24} 24%|██▍ | 5276/22095 [9:03:24<15:23:12, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (91457 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73528 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109060 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50295 > 40960). 
Running this sequence through the model will result in indexing errors 24%|██▍ | 5277/22095 [9:03:34<24:00:35, 5.14s/it] {'loss': 0.525, 'grad_norm': 0.5489727830840708, 'learning_rate': 8.899804680094818e-06, 'epoch': 0.24} 24%|██▍ | 5277/22095 [9:03:34<24:00:35, 5.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41560 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50107 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41641 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49909 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109605 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (143203 > 40960). 
Running this sequence through the model will result in indexing errors
 24%|██▍ | 5278/22095 [9:03:38<22:32:42, 4.83s/it] {'loss': 0.4357, 'grad_norm': 0.6763946770947068, 'learning_rate': 8.899345955555928e-06, 'epoch': 0.24}
 24%|██▍ | 5279/22095 [9:03:41<20:07:03, 4.31s/it] {'loss': 0.429, 'grad_norm': 0.7201089468189605, 'learning_rate': 8.898887147232066e-06, 'epoch': 0.24}
 24%|██▍ | 5280/22095 [9:03:45<19:36:25, 4.20s/it] {'loss': 0.3797, 'grad_norm': 0.8181534036902893, 'learning_rate': 8.898428255133098e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5281/22095 [9:03:54<26:53:52, 5.76s/it] {'loss': 0.5104, 'grad_norm': 0.38237795007697833, 'learning_rate': 8.897969279268877e-06, 'epoch': 0.24}
 24%|██▍ | 5282/22095 [9:04:04<32:16:30, 6.91s/it] {'loss': 0.4722, 'grad_norm': 0.3744096064923189, 'learning_rate': 8.897510219649268e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 24%|██▍ | 5283/22095 [9:04:08<28:40:53, 6.14s/it] {'loss': 0.3431, 'grad_norm': 0.6605872245547142, 'learning_rate': 8.897051076284135e-06, 'epoch': 0.24}
 24%|██▍ | 5284/22095 [9:04:12<25:31:50, 5.47s/it] {'loss': 0.3862, 'grad_norm': 0.7269263863684868, 'learning_rate': 8.896591849183343e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5285/22095 [9:04:19<27:44:28, 5.94s/it] {'loss': 0.4864, 'grad_norm': 0.3298538520006822, 'learning_rate': 8.89613253835676e-06, 'epoch': 0.24}
 24%|██▍ | 5286/22095 [9:04:26<29:36:02, 6.34s/it] {'loss': 0.481, 'grad_norm': 0.32610787321322837, 'learning_rate': 8.895673143814254e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (47766 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5287/22095 [9:04:31<27:25:37, 5.87s/it] {'loss': 0.4213, 'grad_norm': 0.692626113129391, 'learning_rate': 8.895213665565698e-06, 'epoch': 0.24}
 24%|██▍ | 5288/22095 [9:04:35<24:21:30, 5.22s/it] {'loss': 0.3849, 'grad_norm': 0.6756453499928873, 'learning_rate': 8.894754103620963e-06, 'epoch': 0.24}
 24%|██▍ | 5289/22095 [9:04:38<21:28:44, 4.60s/it] {'loss': 0.399, 'grad_norm': 0.6977591454885193, 'learning_rate': 8.894294457989924e-06, 'epoch': 0.24}
 24%|██▍ | 5290/22095 [9:04:41<19:20:17, 4.14s/it] {'loss': 0.4267, 'grad_norm': 0.7247753730009222, 'learning_rate': 8.893834728682459e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8932782 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55935, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为直线段AB的上点,P点为AC的中点,Q点为BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 12cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 24%|██▍ | 5291/22095 [9:04:44<17:59:37, 3.85s/it] {'loss': 0.3309, 'grad_norm': 0.6429940559942522, 'learning_rate': 8.893374915708443e-06, 'epoch': 0.24}
 24%|██▍ | 5292/22095 [9:04:47<16:45:24, 3.59s/it] {'loss': 0.3736, 'grad_norm': 0.6512035834490856, 'learning_rate': 8.892915019077757e-06, 'epoch': 0.24}
 24%|██▍ | 5293/22095 [9:04:51<16:17:31, 3.49s/it] {'loss': 0.3783, 'grad_norm': 0.6223161302422051, 'learning_rate': 8.892455038800286e-06, 'epoch': 0.24}
 24%|██▍ | 5294/22095 [9:04:54<15:38:35, 3.35s/it] {'loss': 0.4299, 'grad_norm': 0.6208666737007078, 'learning_rate': 8.891994974885909e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5295/22095 [9:05:00<19:59:04, 4.28s/it] {'loss': 0.5051, 'grad_norm': 0.5913234393655277, 'learning_rate': 8.891534827344514e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5296/22095 [9:05:04<19:05:22, 4.09s/it] {'loss': 0.4066, 'grad_norm': 0.6841295356040774, 'learning_rate': 8.891074596185987e-06, 'epoch': 0.24}
 24%|██▍ | 5297/22095 [9:05:07<18:06:50, 3.88s/it] {'loss': 0.4207, 'grad_norm': 0.7106840922605392, 'learning_rate': 8.890614281420218e-06, 'epoch': 0.24}
 24%|██▍ | 5298/22095 [9:05:10<16:36:37, 3.56s/it] {'loss': 0.3652, 'grad_norm': 0.6294873742101729, 'learning_rate': 8.890153883057097e-06, 'epoch': 0.24}
 24%|██▍ | 5299/22095 [9:05:13<16:01:17, 3.43s/it] {'loss': 0.3965, 'grad_norm': 0.6989552091439057, 'learning_rate': 8.889693401106516e-06, 'epoch': 0.24}
 24%|██▍ | 5300/22095 [9:05:16<15:35:57, 3.34s/it] {'loss': 0.3384, 'grad_norm': 0.6747147530686606, 'learning_rate': 8.889232835578372e-06, 'epoch': 0.24}
 24%|██▍ | 5301/22095 [9:05:20<16:01:52, 3.44s/it] {'loss': 0.3815, 'grad_norm': 0.6658941217429191, 'learning_rate': 8.888772186482557e-06, 'epoch': 0.24}
 24%|██▍ | 5302/22095 [9:05:23<16:00:11, 3.43s/it] {'loss': 0.397, 'grad_norm': 0.664905784458334, 'learning_rate': 8.888311453828973e-06, 'epoch': 0.24}
 24%|██▍ | 5303/22095 [9:05:26<15:14:05, 3.27s/it] {'loss': 0.356, 'grad_norm': 0.7081115510513791, 'learning_rate': 8.887850637627517e-06, 'epoch': 0.24}
 24%|██▍ | 5304/22095 [9:05:29<14:24:55, 3.09s/it] {'loss': 0.3899, 'grad_norm': 0.7100845586062496, 'learning_rate': 8.88738973788809e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5305/22095 [9:05:32<14:33:31, 3.12s/it] {'loss': 0.4217, 'grad_norm': 0.7060600319820818, 'learning_rate': 8.8869287546206e-06, 'epoch': 0.24}
 24%|██▍ | 5306/22095 [9:05:36<15:36:58, 3.35s/it] {'loss': 0.3582, 'grad_norm': 0.6256192045316736, 'learning_rate': 8.886467687834946e-06, 'epoch': 0.24}
 24%|██▍ | 5307/22095 [9:05:39<15:03:24, 3.23s/it] {'loss': 0.3692, 'grad_norm': 0.8902326378633588, 'learning_rate': 8.88600653754104e-06, 'epoch': 0.24}
 24%|██▍ | 5308/22095 [9:05:42<15:14:38, 3.27s/it] {'loss': 0.4017, 'grad_norm': 0.6399337314616411, 'learning_rate': 8.885545303748786e-06, 'epoch': 0.24}
 24%|██▍ | 5309/22095 [9:05:46<15:45:23, 3.38s/it] {'loss': 0.3831, 'grad_norm': 0.6870566412315947, 'learning_rate': 8.8850839864681e-06, 'epoch': 0.24}
 24%|██▍ | 5310/22095 [9:05:49<15:34:23, 3.34s/it] {'loss': 0.3703, 'grad_norm': 0.6238546104441927, 'learning_rate': 8.884622585708888e-06, 'epoch': 0.24}
 24%|██▍ | 5311/22095 [9:05:53<15:59:38, 3.43s/it] {'loss': 0.4248, 'grad_norm': 0.99261545468166, 'learning_rate': 8.88416110148107e-06, 'epoch': 0.24}
 24%|██▍ | 5312/22095 [9:05:56<16:06:01, 3.45s/it] {'loss': 0.4143, 'grad_norm': 0.6688459208895308, 'learning_rate': 8.883699533794558e-06, 'epoch': 0.24}
 24%|██▍ | 5313/22095 [9:06:00<16:31:09, 3.54s/it] {'loss': 0.3451, 'grad_norm': 0.649799512025244, 'learning_rate': 8.883237882659271e-06, 'epoch': 0.24}
 24%|██▍ | 5314/22095 [9:06:03<16:18:36, 3.50s/it] {'loss': 0.3589, 'grad_norm': 0.6311487774639363, 'learning_rate': 8.882776148085129e-06, 'epoch': 0.24}
 24%|██▍ | 5315/22095 [9:06:08<17:25:53, 3.74s/it] {'loss': 0.3873, 'grad_norm': 0.6287242122377914, 'learning_rate': 8.882314330082051e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5316/22095 [9:06:11<16:42:56, 3.59s/it] {'loss': 0.3887, 'grad_norm': 0.6422213213204758, 'learning_rate': 8.881852428659963e-06, 'epoch': 0.24}
 24%|██▍ | 5317/22095 [9:06:15<17:40:00, 3.79s/it] {'loss': 0.3831, 'grad_norm': 0.7059140395496692, 'learning_rate': 8.881390443828788e-06, 'epoch': 0.24}
 24%|██▍ | 5318/22095 [9:06:19<18:09:20, 3.90s/it] {'loss': 0.4216, 'grad_norm': 0.6526053796106218, 'learning_rate': 8.880928375598453e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (54563 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99622 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5319/22095 [9:06:22<16:44:33, 3.59s/it] {'loss': 0.4068, 'grad_norm': 0.6717680282483763, 'learning_rate': 8.880466223978887e-06, 'epoch': 0.24}
 24%|██▍ | 5320/22095 [9:06:26<16:58:41, 3.64s/it] {'loss': 0.3809, 'grad_norm': 0.6617612520838584, 'learning_rate': 8.880003988980019e-06, 'epoch': 0.24}
 24%|██▍ | 5321/22095 [9:06:30<17:03:34, 3.66s/it] {'loss': 0.3892, 'grad_norm': 0.6311219890329217, 'learning_rate': 8.879541670611784e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (95652 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5322/22095 [9:06:34<17:53:33, 3.84s/it] {'loss': 0.369, 'grad_norm': 0.7371992835369472, 'learning_rate': 8.879079268884113e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5323/22095 [9:06:37<16:36:10, 3.56s/it] {'loss': 0.3762, 'grad_norm': 0.8539629154715503, 'learning_rate': 8.878616783806939e-06, 'epoch': 0.24}
 24%|██▍ | 5324/22095 [9:06:40<16:32:34, 3.55s/it] {'loss': 0.3884, 'grad_norm': 0.6722749149455938, 'learning_rate': 8.878154215390204e-06, 'epoch': 0.24}
 24%|██▍ | 5325/22095 [9:06:43<15:50:38, 3.40s/it] {'loss': 0.3862, 'grad_norm': 0.6719106851436162, 'learning_rate': 8.877691563643848e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (44080 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5326/22095 [9:06:48<16:56:47, 3.64s/it] {'loss': 0.3606, 'grad_norm': 0.6312868275883254, 'learning_rate': 8.877228828577809e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [262, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8497020 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [262, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 130239, 'image': 'vrdu_texteq/astro-ph.CO/da4f8581-c979-41f0-85e4-694a17b6d2d9.png', 'image_wh': [[262, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'and we take $\\Delta = 200$.'}]}
 24%|██▍ | 5327/22095 [9:06:57<25:10:58, 5.41s/it] {'loss': 0.5013, 'grad_norm': 0.47262316317293573, 'learning_rate': 8.876766010202029e-06, 'epoch': 0.24}
 24%|██▍ | 5328/22095 [9:07:00<22:12:17, 4.77s/it] {'loss': 0.4296, 'grad_norm': 0.7079327142025273, 'learning_rate': 8.876303108526455e-06, 'epoch': 0.24}
 24%|██▍ | 5329/22095 [9:07:04<20:53:34, 4.49s/it] {'loss': 0.3946, 'grad_norm': 0.658648471499382, 'learning_rate': 8.875840123561033e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5330/22095 [9:07:14<27:46:39, 5.96s/it] {'loss': 0.4739, 'grad_norm': 0.30867312660550816, 'learning_rate': 8.875377055315709e-06, 'epoch': 0.24}
 24%|██▍ | 5331/22095 [9:07:17<24:35:36, 5.28s/it] {'loss': 0.3642, 'grad_norm': 0.6468211564058911, 'learning_rate': 8.874913903800436e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (57782 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60820 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5332/22095 [9:07:21<21:44:19, 4.67s/it] {'loss': 0.415, 'grad_norm': 0.6775896448599275, 'learning_rate': 8.874450669025161e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (68207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42509 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5333/22095 [9:07:24<19:53:37, 4.27s/it] {'loss': 0.3654, 'grad_norm': 0.6495529703738707, 'learning_rate': 8.873987350999843e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5334/22095 [9:07:34<28:03:20, 6.03s/it] {'loss': 0.5283, 'grad_norm': 0.4410663750016654, 'learning_rate': 8.873523949734435e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047921 in VC:s3://multi-modal/UniGeo/.
Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 4\nB. 8\nC. 16\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 24%|██▍ | 5335/22095 [9:07:38<25:19:21, 5.44s/it] {'loss': 0.3393, 'grad_norm': 0.6581032625586402, 'learning_rate': 8.873060465238894e-06, 'epoch': 0.24}
 24%|██▍ | 5336/22095 [9:07:42<23:20:00, 5.01s/it] {'loss': 0.3877, 'grad_norm': 0.6763820592845609, 'learning_rate': 8.872596897523178e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (41897 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43986 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5337/22095 [9:07:46<21:34:18, 4.63s/it] {'loss': 0.3722, 'grad_norm': 0.741517536605685, 'learning_rate': 8.872133246597247e-06, 'epoch': 0.24}
 24%|██▍ | 5338/22095 [9:07:49<19:18:17, 4.15s/it] {'loss': 0.3671, 'grad_norm': 0.6843360209148616, 'learning_rate': 8.871669512471068e-06, 'epoch': 0.24}
 24%|██▍ | 5339/22095 [9:07:52<17:30:59, 3.76s/it] {'loss': 0.3787, 'grad_norm': 0.6594839446020356, 'learning_rate': 8.871205695154601e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55558 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127573 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76716 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97449 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5340/22095 [9:07:58<21:36:28, 4.64s/it] {'loss': 0.4727, 'grad_norm': 0.37323150087019386, 'learning_rate': 8.870741794657814e-06, 'epoch': 0.24}
 24%|██▍ | 5341/22095 [9:08:05<24:43:24, 5.31s/it] {'loss': 0.5142, 'grad_norm': 0.3654118177626498, 'learning_rate': 8.870277810990671e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (124960 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110053 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5342/22095 [9:08:09<23:02:16, 4.95s/it] {'loss': 0.4065, 'grad_norm': 0.6965027222944363, 'learning_rate': 8.869813744163147e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (44933 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103421 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5343/22095 [9:08:13<20:38:55, 4.44s/it] {'loss': 0.3781, 'grad_norm': 0.7002338027510239, 'learning_rate': 8.86934959418521e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (78661 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90304 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5344/22095 [9:08:16<18:59:29, 4.08s/it] {'loss': 0.3628, 'grad_norm': 0.6496249565713254, 'learning_rate': 8.868885361066835e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5345/22095 [9:08:26<27:10:44, 5.84s/it] {'loss': 0.4999, 'grad_norm': 0.4174009112938387, 'learning_rate': 8.868421044817994e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (52937 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53201 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55564 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5346/22095 [9:08:29<23:41:35, 5.09s/it] {'loss': 0.4531, 'grad_norm': 0.7319192637518557, 'learning_rate': 8.867956645448667e-06, 'epoch': 0.24}
 24%|██▍ | 5347/22095 [9:08:33<21:47:05, 4.68s/it] {'loss': 0.3535, 'grad_norm': 0.6793857881569461, 'learning_rate': 8.86749216296883e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48428 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5348/22095 [9:08:44<30:28:17, 6.55s/it] {'loss': 0.4857, 'grad_norm': 0.3200373606682935, 'learning_rate': 8.867027597388467e-06, 'epoch': 0.24}
 24%|██▍ | 5349/22095 [9:08:53<34:15:34, 7.36s/it] {'loss': 0.4753, 'grad_norm': 0.319293383103753, 'learning_rate': 8.866562948717555e-06, 'epoch': 0.24}
 24%|██▍ | 5350/22095 [9:09:03<37:48:02, 8.13s/it] {'loss': 0.5124, 'grad_norm': 0.30674679923048015, 'learning_rate': 8.866098216966081e-06, 'epoch': 0.24}
 24%|██▍ | 5351/22095 [9:09:09<34:30:28, 7.42s/it] {'loss': 0.5116, 'grad_norm': 0.30221657868269836, 'learning_rate': 8.865633402144032e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5352/22095 [9:09:19<37:48:10, 8.13s/it] {'loss': 0.4946, 'grad_norm': 0.29530341560010703, 'learning_rate': 8.865168504261392e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 24%|██▍ | 5353/22095 [9:09:23<32:28:56, 6.98s/it] {'loss': 0.396, 'grad_norm': 0.7944404800721792, 'learning_rate': 8.864703523328153e-06, 'epoch': 0.24}
 24%|██▍ | 5354/22095 [9:09:31<34:19:58, 7.38s/it] {'loss': 0.4723, 'grad_norm': 0.33803840764758725, 'learning_rate': 8.864238459354303e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 24%|██▍ | 5355/22095 [9:09:35<29:01:52, 6.24s/it] {'loss': 0.3695, 'grad_norm': 0.6518971223670713, 'learning_rate': 8.863773312349838e-06, 'epoch': 0.24}
 24%|██▍ | 5356/22095 [9:09:43<32:14:28, 6.93s/it] {'loss': 0.5039, 'grad_norm': 0.3380621018492585, 'learning_rate': 8.86330808232475e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (43538 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104101 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5357/22095 [9:09:47<27:30:12, 5.92s/it] {'loss': 0.3771, 'grad_norm': 0.7447560157517059, 'learning_rate': 8.862842769289037e-06, 'epoch': 0.24}
 24%|██▍ | 5358/22095 [9:09:51<25:29:47, 5.48s/it] {'loss': 0.3699, 'grad_norm': 0.7501764157418583, 'learning_rate': 8.862377373252697e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (67038 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93553 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53767 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5359/22095 [9:09:55<23:32:08, 5.06s/it] {'loss': 0.4108, 'grad_norm': 0.6265046919283168, 'learning_rate': 8.86191189422573e-06, 'epoch': 0.24}
 24%|██▍ | 5360/22095 [9:09:59<21:43:03, 4.67s/it] {'loss': 0.37, 'grad_norm': 0.6313737707557908, 'learning_rate': 8.861446332218138e-06, 'epoch': 0.24}
 24%|██▍ | 5361/22095 [9:10:03<21:02:53, 4.53s/it] {'loss': 0.4077, 'grad_norm': 0.7014318487147055, 'learning_rate': 8.860980687239922e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8934542 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57695, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 8'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]}
 24%|██▍ | 5362/22095 [9:10:07<19:56:25, 4.29s/it] {'loss': 0.3813, 'grad_norm': 0.7094592637276969, 'learning_rate': 8.86051495930109e-06, 'epoch': 0.24}
 24%|██▍ | 5363/22095 [9:10:11<19:34:26, 4.21s/it] {'loss': 0.3697, 'grad_norm': 0.6135837069346777, 'learning_rate': 8.860049148411649e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (70337 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91998 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5364/22095 [9:10:15<19:13:14, 4.14s/it] {'loss': 0.4045, 'grad_norm': 0.6548772644362388, 'learning_rate': 8.859583254581604e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5365/22095 [9:10:23<24:42:12, 5.32s/it] {'loss': 0.5107, 'grad_norm': 0.4681752414364971, 'learning_rate': 8.859117277820972e-06, 'epoch': 0.24}
 24%|██▍ | 5366/22095 [9:10:26<21:45:58, 4.68s/it] {'loss': 0.4012, 'grad_norm': 0.7547141701141868, 'learning_rate': 8.85865121813976e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 24%|██▍ | 5367/22095 [9:10:36<28:53:47, 6.22s/it] {'loss': 0.4845, 'grad_norm': 0.6690380482512771, 'learning_rate': 8.858185075547987e-06, 'epoch': 0.24}
 24%|██▍ | 5368/22095 [9:10:39<24:48:24, 5.34s/it] {'loss': 0.3698, 'grad_norm': 0.6493813874743187, 'learning_rate': 8.857718850055663e-06, 'epoch': 0.24}
 24%|██▍ | 5369/22095 [9:10:42<21:32:32, 4.64s/it] {'loss': 0.429, 'grad_norm': 0.7556859630562037, 'learning_rate': 8.857252541672812e-06, 'epoch': 0.24}
 24%|██▍ | 5370/22095 [9:10:45<19:02:30, 4.10s/it] {'loss': 0.3613, 'grad_norm': 0.6888913998842668, 'learning_rate': 8.856786150409448e-06, 'epoch': 0.24}
 24%|██▍ | 5371/22095 [9:10:49<18:11:12, 3.91s/it] {'loss': 0.3691, 'grad_norm': 0.648604761289055, 'learning_rate': 8.856319676275595e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8953365 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4200, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 24%|██▍ | 5372/22095 [9:10:57<23:35:15, 5.08s/it] {'loss': 0.4927, 'grad_norm': 0.5213434269140345, 'learning_rate': 8.855853119281278e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (55432 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62901 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78481 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59653 > 40960). Running this sequence through the model will result in indexing errors
 24%|██▍ | 5373/22095 [9:11:00<21:33:28, 4.64s/it] {'loss': 0.4262, 'grad_norm': 0.6643186693105698, 'learning_rate': 8.855386479436518e-06, 'epoch': 0.24}
 24%|██▍ | 5374/22095 [9:11:04<20:07:17, 4.33s/it] {'loss': 0.3478, 'grad_norm': 0.6483221393724724, 'learning_rate': 8.854919756751343e-06, 'epoch': 0.24}
 24%|██▍ | 5375/22095 [9:11:07<18:01:35, 3.88s/it] {'loss': 0.351, 'grad_norm': 0.611895884992864, 'learning_rate': 8.854452951235784e-06, 'epoch': 0.24}
 24%|██▍ | 5376/22095 [9:11:10<17:01:25, 3.67s/it] {'loss': 0.4008, 'grad_norm': 0.6942629669349091, 'learning_rate': 8.853986062899869e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5377/22095 [9:11:13<16:55:33, 3.64s/it] {'loss': 0.3447, 'grad_norm': 0.6256217162597607, 'learning_rate': 8.853519091753629e-06, 'epoch': 0.24}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8370123 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36875, 'image': 'vrdu_table_final_2/astro-ph.CO/299605ed-bf83-4424-8e8a-fcf19d76e34b.png', 'image_wh': [[231, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}Number of clusters\\end{tabular}\n```"}]}
 24%|██▍ | 5378/22095 [9:11:16<16:04:11, 3.46s/it] {'loss': 0.3695, 'grad_norm': 0.6993569125923726, 'learning_rate': 8.853052037807099e-06, 'epoch': 0.24}
 24%|██▍ | 5379/22095 [9:11:20<16:05:02, 3.46s/it] {'loss': 0.4004, 'grad_norm': 0.6163882470276689, 'learning_rate': 8.852584901070314e-06, 'epoch': 0.24}
 24%|██▍ | 5380/22095 [9:11:24<16:49:58, 3.63s/it] {'loss': 0.3811, 'grad_norm': 0.6280681657569366, 'learning_rate': 8.852117681553312e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 24%|██▍ | 5381/22095 [9:11:32<22:54:06, 4.93s/it] {'loss': 0.5072, 'grad_norm': 0.44492594792209034, 'learning_rate': 8.851650379266133e-06, 'epoch': 0.24}
 24%|██▍ | 5382/22095 [9:11:35<20:20:43, 4.38s/it] {'loss': 0.3498, 'grad_norm': 0.7206872723939073, 'learning_rate': 8.851182994218815e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887142 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10295, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 5'}, {'from': 'gpt', 'value': '【解答】解:(1)由点M、N分别是线段AC、BC的中点,得MC=\\frac{1}{2}AC=\\frac{1}{2}×4=2,NC=\\frac{1}{2}BC=\\frac{1}{2}×6=3.由线段的和差,得MN=MC+NC=2+3=5;'}]}
 24%|██▍ | 5383/22095 [9:11:38<18:31:29, 3.99s/it] {'loss': 0.4066, 'grad_norm': 0.6696865366923967, 'learning_rate': 8.850715526421404e-06, 'epoch': 0.24}
 24%|██▍ | 5384/22095 [9:11:42<18:37:36, 4.01s/it] {'loss': 0.3876, 'grad_norm': 0.6626802638484196, 'learning_rate': 8.850247975883942e-06, 'epoch': 0.24}
 24%|██▍ | 5385/22095 [9:11:45<17:08:26, 3.69s/it] {'loss': 0.3734, 'grad_norm': 0.6755263223808509, 'learning_rate': 8.849780342616477e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (52435 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44422 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91313 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110766 > 40960).
Running this sequence through the model will result in indexing errors 24%|██▍ | 5386/22095 [9:11:49<17:35:19, 3.79s/it] {'loss': 0.3842, 'grad_norm': 0.6319556903000817, 'learning_rate': 8.849312626629055e-06, 'epoch': 0.24} 24%|██▍ | 5386/22095 [9:11:49<17:35:19, 3.79s/it] 24%|██▍ | 5387/22095 [9:11:52<16:34:52, 3.57s/it] {'loss': 0.3761, 'grad_norm': 0.6609492411166165, 'learning_rate': 8.848844827931727e-06, 'epoch': 0.24} 24%|██▍ | 5387/22095 [9:11:52<16:34:52, 3.57s/it] 24%|██▍ | 5388/22095 [9:11:56<16:23:00, 3.53s/it] {'loss': 0.3792, 'grad_norm': 0.6241567642176671, 'learning_rate': 8.848376946534545e-06, 'epoch': 0.24} 24%|██▍ | 5388/22095 [9:11:56<16:23:00, 3.53s/it] 24%|██▍ | 5389/22095 [9:11:59<16:21:40, 3.53s/it] {'loss': 0.4036, 'grad_norm': 0.7183087410306984, 'learning_rate': 8.847908982447561e-06, 'epoch': 0.24} 24%|██▍ | 5389/22095 [9:11:59<16:21:40, 3.53s/it] 24%|██▍ | 5390/22095 [9:12:02<15:26:07, 3.33s/it] {'loss': 0.3661, 'grad_norm': 0.7218942848451008, 'learning_rate': 8.847440935680833e-06, 'epoch': 0.24} 24%|██▍ | 5390/22095 [9:12:02<15:26:07, 3.33s/it] 24%|██▍ | 5391/22095 [9:12:06<15:55:03, 3.43s/it] {'loss': 0.3447, 'grad_norm': 0.6694338164281567, 'learning_rate': 8.846972806244415e-06, 'epoch': 0.24} 24%|██▍ | 5391/22095 [9:12:06<15:55:03, 3.43s/it] 24%|██▍ | 5392/22095 [9:12:09<15:49:37, 3.41s/it] {'loss': 0.3893, 'grad_norm': 0.6908215455706206, 'learning_rate': 8.846504594148366e-06, 'epoch': 0.24} 24%|██▍ | 5392/22095 [9:12:09<15:49:37, 3.41s/it] 24%|██▍ | 5393/22095 [9:12:13<16:05:33, 3.47s/it] {'loss': 0.4021, 'grad_norm': 0.6561908456944305, 'learning_rate': 8.846036299402747e-06, 'epoch': 0.24} 24%|██▍ | 5393/22095 [9:12:13<16:05:33, 3.47s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (57139 > 40960). 
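The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries all originate from the same per-sample dimension check in `data_qwen_2.py` (line 1335), raised at fetch time for samples whose recorded `image_wh` falls below the 28-pixel floor. A dataset-side pre-filter can drop such samples before training starts instead of paying a failed fetch and retry per occurrence. The sketch below is an editor's illustration, not the repository's code: the 28-pixel minimum and the `image_wh` field shape are taken from the log, everything else is assumed.

```python
MIN_SIZE = 28  # minimum side length, per the log's "Minimum size is 28"

def is_valid_sample(sample):
    """Reject samples whose recorded width or height is below the minimum."""
    sizes = sample.get("image_wh", [])
    return all(w >= MIN_SIZE and h >= MIN_SIZE for w, h in sizes)

# Shapes mirror "Problematic sample" entries seen in the log.
samples = [
    {"id": 36875, "image_wh": [[231, 23]]},  # 23 px tall -> rejected
    {"id": 1, "image_wh": [[640, 480]]},     # large enough -> kept
    {"id": 2, "image_wh": [[0, 0]]},         # unreadable image -> rejected
]
kept = [s for s in samples if is_valid_sample(s)]  # only id 1 survives
```

Running such a filter once over the manifest would also surface the `[0, 0]` entries, which indicate images whose dimensions could not be read at all.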
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42677 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48188 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72989 > 40960). Running this sequence through the model will result in indexing errors
24%|██▍ | 5394/22095 [9:12:24<26:50:48, 5.79s/it] {'loss': 0.5165, 'grad_norm': 0.4597583708167697, 'learning_rate': 8.84556792201762e-06, 'epoch': 0.24}
24%|██▍ | 5395/22095 [9:12:27<23:51:30, 5.14s/it] {'loss': 0.4006, 'grad_norm': 0.6821535747642689, 'learning_rate': 8.845099462003049e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (85781 > 40960). Running this sequence through the model will result in indexing errors
24%|██▍ | 5396/22095 [9:12:36<28:29:51, 6.14s/it] {'loss': 0.4962, 'grad_norm': 0.3192590633854796, 'learning_rate': 8.844630919369099e-06, 'epoch': 0.24}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
24%|██▍ | 5397/22095 [9:12:40<25:39:13, 5.53s/it] {'loss': 0.3918, 'grad_norm': 0.7205417774604912, 'learning_rate': 8.84416229412584e-06, 'epoch': 0.24}
24%|██▍ | 5398/22095 [9:12:44<23:18:41, 5.03s/it] {'loss': 0.3968, 'grad_norm': 0.647692178832509, 'learning_rate': 8.84369358628334e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (222326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54175 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54601 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79244 > 40960). Running this sequence through the model will result in indexing errors
24%|██▍ | 5399/22095 [9:12:48<21:50:32, 4.71s/it] {'loss': 0.4078, 'grad_norm': 0.6842297749923265, 'learning_rate': 8.843224795851668e-06, 'epoch': 0.24}
24%|██▍ | 5400/22095 [9:12:51<19:42:06, 4.25s/it] {'loss': 0.3973, 'grad_norm': 0.7569759066971131, 'learning_rate': 8.8427559228409e-06, 'epoch': 0.24}
24%|██▍ | 5401/22095 [9:12:55<18:53:35, 4.07s/it] {'loss': 0.3699, 'grad_norm': 0.6414284504701626, 'learning_rate': 8.842286967261109e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▍ | 5402/22095 [9:13:04<26:31:20, 5.72s/it] {'loss': 0.4885, 'grad_norm': 0.44428672016307075, 'learning_rate': 8.841817929122373e-06, 'epoch': 0.24}
24%|██▍ | 5403/22095 [9:13:08<23:19:41, 5.03s/it] {'loss': 0.3672, 'grad_norm': 0.7171617724470963, 'learning_rate': 8.841348808434766e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▍ | 5404/22095 [9:13:18<30:37:18, 6.60s/it] {'loss': 0.4672, 'grad_norm': 0.37253477989149236, 'learning_rate': 8.840879605208374e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (53635 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86415 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85319 > 40960).
Running this sequence through the model will result in indexing errors
24%|██▍ | 5405/22095 [9:13:22<26:29:33, 5.71s/it] {'loss': 0.4388, 'grad_norm': 0.7108694071388104, 'learning_rate': 8.840410319453274e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (41626 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68017 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96787 > 40960). Running this sequence through the model will result in indexing errors
24%|██▍ | 5406/22095 [9:13:26<24:46:39, 5.34s/it] {'loss': 0.3855, 'grad_norm': 0.6828806879996595, 'learning_rate': 8.839940951179552e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▍ | 5407/22095 [9:13:36<31:38:26, 6.83s/it] {'loss': 0.4777, 'grad_norm': 0.3537449598703036, 'learning_rate': 8.839471500397292e-06, 'epoch': 0.24}
24%|██▍ | 5408/22095 [9:13:40<26:58:04, 5.82s/it] {'loss': 0.3823, 'grad_norm': 0.7020097644297246, 'learning_rate': 8.83900196711658e-06, 'epoch': 0.24}
24%|██▍ | 5409/22095 [9:13:43<23:54:50, 5.16s/it] {'loss': 0.4012, 'grad_norm': 0.6807381310828038, 'learning_rate': 8.838532351347509e-06, 'epoch': 0.24}
Token indices sequence length is longer than the specified maximum sequence length for this model (113705 > 40960). Running this sequence through the model will result in indexing errors
24%|██▍ | 5410/22095 [9:13:46<20:41:58, 4.47s/it] {'loss': 0.4034, 'grad_norm': 0.7073324268304318, 'learning_rate': 8.838062653100165e-06, 'epoch': 0.24}
24%|██▍ | 5411/22095 [9:13:50<19:51:42, 4.29s/it] {'loss': 0.3713, 'grad_norm': 0.7850841150864362, 'learning_rate': 8.837592872384643e-06, 'epoch': 0.24}
24%|██▍ | 5412/22095 [9:13:53<18:12:47, 3.93s/it] {'loss': 0.4047, 'grad_norm': 0.6864920531829904, 'learning_rate': 8.837123009211038e-06, 'epoch': 0.24}
Invalidate trace cache @ step 2: expected module 1, but got module 364
24%|██▍ | 5413/22095 [9:14:03<25:55:14, 5.59s/it] {'loss': 0.4991, 'grad_norm': 0.48113930173789576, 'learning_rate': 8.836653063589443e-06, 'epoch': 0.24}
25%|██▍ | 5414/22095 [9:14:12<31:16:22, 6.75s/it] {'loss': 0.4979, 'grad_norm': 0.4411436163117178, 'learning_rate': 8.836183035529954e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884034 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7187, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm'}, {'from': 'gpt', 'value': '【解答】解:根据上图所示OB=5cm-OA,∵OA=(AB+BC)÷2=4cm,∴OB=1cm.'}]}
25%|██▍ | 5415/22095 [9:14:16<27:34:04, 5.95s/it] {'loss': 0.3974, 'grad_norm': 0.7466742119661637, 'learning_rate': 8.835712925042678e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (76149 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49670 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5416/22095 [9:14:20<23:48:21, 5.14s/it] {'loss': 0.3743, 'grad_norm': 0.7179041559362122, 'learning_rate': 8.83524273213771e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (59480 > 40960).
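The repeated `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings come from the tokenizer: some samples tokenize well past the model's 40960-token context window (one reaches 222326 tokens), and feeding them through unchanged would cause the indexing errors the warning predicts. A pre-tokenization length gate can skip or truncate such samples up front. This is a minimal editor's sketch under stated assumptions: the 40960 limit is taken from the log, while the function names and the decision to hard-truncate rather than skip are illustrative choices, not the repository's actual policy.

```python
MAX_LEN = 40960  # model context length, per the warnings (e.g. "52435 > 40960")

def fits_context(token_ids, max_len=MAX_LEN):
    """True if a tokenized sample fits the model's context window."""
    return len(token_ids) <= max_len

def clip_to_context(token_ids, max_len=MAX_LEN):
    """Hard-truncate an overlong sample rather than risk indexing errors downstream."""
    return token_ids[:max_len]

ids = list(range(52435))      # same length as one offending sequence in the log
overlong = not fits_context(ids)
clipped = clip_to_context(ids)  # now exactly MAX_LEN tokens
```

Whether truncating or dropping is appropriate depends on the data: truncating a packed multi-turn conversation can cut a `<image>` placeholder away from its image, so dropping is often the safer default.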
Running this sequence through the model will result in indexing errors
25%|██▍ | 5417/22095 [9:14:23<20:49:03, 4.49s/it] {'loss': 0.4023, 'grad_norm': 0.6572146908802492, 'learning_rate': 8.834772456825155e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5418/22095 [9:14:32<27:42:17, 5.98s/it] {'loss': 0.5026, 'grad_norm': 0.5038071350955399, 'learning_rate': 8.834302099115118e-06, 'epoch': 0.25}
25%|██▍ | 5419/22095 [9:14:35<24:13:33, 5.23s/it] {'loss': 0.3645, 'grad_norm': 0.8540148243050669, 'learning_rate': 8.833831659017703e-06, 'epoch': 0.25}
25%|██▍ | 5420/22095 [9:14:39<21:44:58, 4.70s/it] {'loss': 0.3766, 'grad_norm': 0.6457518436451026, 'learning_rate': 8.833361136543021e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (51175 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78957 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5421/22095 [9:14:42<20:03:52, 4.33s/it] {'loss': 0.4062, 'grad_norm': 0.7338945386325528, 'learning_rate': 8.832890531701184e-06, 'epoch': 0.25}
25%|██▍ | 5422/22095 [9:14:45<18:20:00, 3.96s/it] {'loss': 0.4127, 'grad_norm': 0.7518020196047538, 'learning_rate': 8.832419844502298e-06, 'epoch': 0.25}
25%|██▍ | 5423/22095 [9:14:48<16:39:45, 3.60s/it] {'loss': 0.3488, 'grad_norm': 0.6440486004965358, 'learning_rate': 8.831949074956483e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5424/22095 [9:14:57<24:32:22, 5.30s/it] {'loss': 0.4871, 'grad_norm': 0.3900943962261443, 'learning_rate': 8.831478223073848e-06, 'epoch': 0.25}
25%|██▍ | 5425/22095 [9:15:05<27:52:00, 6.02s/it] {'loss': 0.4751, 'grad_norm': 0.36365756981959374, 'learning_rate': 8.831007288864517e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
25%|██▍ | 5426/22095 [9:15:08<24:05:58, 5.20s/it] {'loss': 0.3965, 'grad_norm': 0.9342732459976021, 'learning_rate': 8.830536272338602e-06, 'epoch': 0.25}
25%|██▍ | 5427/22095 [9:15:12<21:34:53, 4.66s/it] {'loss': 0.4072, 'grad_norm': 0.7180496226403557, 'learning_rate': 8.830065173506229e-06, 'epoch': 0.25}
25%|██▍ | 5428/22095 [9:15:16<20:23:30, 4.40s/it] {'loss': 0.3607, 'grad_norm': 0.6947647126181522, 'learning_rate': 8.829593992377518e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53628 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42732 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54311 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61419 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (139946 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5429/22095 [9:15:25<27:18:46, 5.90s/it] {'loss': 0.4792, 'grad_norm': 0.4581266564149199, 'learning_rate': 8.829122728962594e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (44553 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46643 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (139628 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71194 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5430/22095 [9:15:28<23:43:31, 5.13s/it] {'loss': 0.4022, 'grad_norm': 0.837846180542243, 'learning_rate': 8.828651383271582e-06, 'epoch': 0.25}
25%|██▍ | 5431/22095 [9:15:31<20:48:48, 4.50s/it] {'loss': 0.3761, 'grad_norm': 0.8475559381295793, 'learning_rate': 8.828179955314612e-06, 'epoch': 0.25}
25%|██▍ | 5432/22095 [9:15:34<18:40:05, 4.03s/it] {'loss': 0.3274, 'grad_norm': 0.679435079467592, 'learning_rate': 8.827708445101813e-06, 'epoch': 0.25}
25%|██▍ | 5433/22095 [9:15:37<16:57:46, 3.66s/it] {'loss': 0.3658, 'grad_norm': 0.7471578999986938, 'learning_rate': 8.827236852643313e-06, 'epoch': 0.25}
25%|██▍ | 5434/22095 [9:15:41<16:34:11, 3.58s/it] {'loss': 0.3832, 'grad_norm': 0.7937943646003452, 'learning_rate': 8.826765177949248e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8305680 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1knXJLXXXXXcnXpXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to read and retrieve all the words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\nMARX1818\n8.9\n友谊▪思想▪事业\n折\n原价\n¥899\n¥800\nRAS\nKOMMUNISTISCMI\nMANINLST\nMARX1818\nManifesto\nLemberger\n2009\nNOBLEWINE'}]}
25%|██▍ | 5435/22095 [9:15:44<16:17:49, 3.52s/it] {'loss': 0.3874, 'grad_norm': 0.7468722171561877, 'learning_rate': 8.826293421029754e-06, 'epoch': 0.25}
25%|██▍ | 5436/22095 [9:15:48<16:51:17, 3.64s/it] {'loss': 0.4022, 'grad_norm': 0.6400369840427028, 'learning_rate': 8.825821581894964e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5437/22095 [9:15:57<24:52:46, 5.38s/it] {'loss': 0.4951, 'grad_norm': 0.40295831861440534, 'learning_rate': 8.82534966055502e-06, 'epoch': 0.25}
25%|██▍ | 5438/22095 [9:16:07<30:28:53, 6.59s/it] {'loss': 0.4945, 'grad_norm': 0.401916187918165, 'learning_rate': 8.824877657020058e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
25%|██▍ | 5439/22095 [9:16:10<25:51:47, 5.59s/it] {'loss': 0.3804, 'grad_norm': 0.7838023003559151, 'learning_rate': 8.824405571300225e-06, 'epoch': 0.25}
25%|██▍ | 5440/22095 [9:16:14<23:31:45, 5.09s/it] {'loss': 0.3876, 'grad_norm': 0.9976284832408775, 'learning_rate': 8.82393340340566e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8931431 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54584, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nA. 10\nB. 12\nC. 6\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
25%|██▍ | 5441/22095 [9:16:17<20:32:56, 4.44s/it] {'loss': 0.3865, 'grad_norm': 0.6993269738534612, 'learning_rate': 8.823461153346512e-06, 'epoch': 0.25}
25%|██▍ | 5442/22095 [9:16:20<18:25:33, 3.98s/it] {'loss': 0.3761, 'grad_norm': 0.6974540450082418, 'learning_rate': 8.822988821132925e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5443/22095 [9:16:23<18:06:18, 3.91s/it] {'loss': 0.3712, 'grad_norm': 0.6574267966627634, 'learning_rate': 8.822516406775051e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5444/22095 [9:16:27<17:32:15, 3.79s/it] {'loss': 0.3694, 'grad_norm': 0.7192051932707151, 'learning_rate': 8.822043910283041e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308154 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2NDN_tC0jpuFjy0FlXXc0bpXa_!!3000650640.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text, no need to display the bounding box.'}, {'from': 'gpt', 'value': 'All words in the image:\n居医贴就这么简单\n进入观看视频\n只要15分钟\n堵漏修复水管\n居医止漏专家\n包邮\n400-000-5192'}]}
25%|██▍ | 5445/22095 [9:16:30<16:05:51, 3.48s/it] {'loss': 0.356, 'grad_norm': 0.6941915904680251, 'learning_rate': 8.821571331667043e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5446/22095 [9:16:34<16:39:06, 3.60s/it] {'loss': 0.4146, 'grad_norm': 0.6784074255903936, 'learning_rate': 8.821098670937215e-06, 'epoch': 0.25}
25%|██▍ | 5447/22095 [9:16:37<15:49:38, 3.42s/it] {'loss': 0.3657, 'grad_norm': 0.6372613000327357, 'learning_rate': 8.820625928103712e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5448/22095 [9:16:42<19:10:51, 4.15s/it] {'loss': 0.4783, 'grad_norm': 0.565621513237359, 'learning_rate': 8.820153103176692e-06, 'epoch': 0.25}
25%|██▍ | 5449/22095 [9:16:46<18:30:31, 4.00s/it] {'loss': 0.405, 'grad_norm': 0.7421946650728314, 'learning_rate': 8.819680196166315e-06, 'epoch': 0.25}
25%|██▍ | 5450/22095 [9:16:49<17:14:51, 3.73s/it] {'loss': 0.3889, 'grad_norm': 0.7230679706345529, 'learning_rate': 8.819207207082741e-06, 'epoch': 0.25}
25%|██▍ | 5451/22095 [9:16:52<16:30:50, 3.57s/it] {'loss': 0.3559, 'grad_norm': 0.6871674592758005, 'learning_rate': 8.818734135936136e-06, 'epoch': 0.25}
25%|██▍ | 5452/22095 [9:16:56<16:34:55, 3.59s/it] {'loss': 0.389, 'grad_norm': 0.704175915763451, 'learning_rate': 8.818260982736662e-06, 'epoch': 0.25}
25%|██▍ | 5453/22095 [9:16:59<15:49:11, 3.42s/it] {'loss': 0.4332, 'grad_norm': 0.6566117862980755, 'learning_rate': 8.817787747494484e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (43877 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5454/22095 [9:17:03<15:56:02, 3.45s/it] {'loss': 0.3669, 'grad_norm': 1.9586366774313648, 'learning_rate': 8.817314430219775e-06, 'epoch': 0.25}
25%|██▍ | 5455/22095 [9:17:06<15:29:44, 3.35s/it] {'loss': 0.3951, 'grad_norm': 0.6883090670651413, 'learning_rate': 8.816841030922702e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (76899 > 40960).
Running this sequence through the model will result in indexing errors 25%|██▍ | 5456/22095 [9:17:10<16:05:47, 3.48s/it] {'loss': 0.3708, 'grad_norm': 0.6684753360093686, 'learning_rate': 8.816367549613439e-06, 'epoch': 0.25} 25%|██▍ | 5456/22095 [9:17:10<16:05:47, 3.48s/it] 25%|██▍ | 5457/22095 [9:17:13<16:12:33, 3.51s/it] {'loss': 0.36, 'grad_norm': 0.6699780649262084, 'learning_rate': 8.815893986302158e-06, 'epoch': 0.25} 25%|██▍ | 5457/22095 [9:17:13<16:12:33, 3.51s/it] 25%|██▍ | 5458/22095 [9:17:16<15:25:29, 3.34s/it] {'loss': 0.3692, 'grad_norm': 0.6751583780687819, 'learning_rate': 8.815420340999034e-06, 'epoch': 0.25} 25%|██▍ | 5458/22095 [9:17:16<15:25:29, 3.34s/it] 25%|██▍ | 5459/22095 [9:17:19<15:07:07, 3.27s/it] {'loss': 0.3599, 'grad_norm': 0.8641158746125323, 'learning_rate': 8.814946613714244e-06, 'epoch': 0.25} 25%|██▍ | 5459/22095 [9:17:19<15:07:07, 3.27s/it] 25%|██▍ | 5460/22095 [9:17:22<14:53:04, 3.22s/it] {'loss': 0.428, 'grad_norm': 0.6746092328932488, 'learning_rate': 8.81447280445797e-06, 'epoch': 0.25} 25%|██▍ | 5460/22095 [9:17:22<14:53:04, 3.22s/it] 25%|██▍ | 5461/22095 [9:17:25<14:21:10, 3.11s/it] {'loss': 0.3568, 'grad_norm': 0.8619884401255935, 'learning_rate': 8.81399891324039e-06, 'epoch': 0.25} 25%|██▍ | 5461/22095 [9:17:25<14:21:10, 3.11s/it]Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8355460 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 22151, 'image': 'vrdu_table_final_2/astro-ph.CO/a86718f0-8da3-40ba-9d4a-417c3f2c3479.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{#1}#2\\end{tabular}\n```"}]} 25%|██▍ | 5462/22095 [9:17:35<23:06:42, 5.00s/it] {'loss': 0.5229, 'grad_norm': 0.5537299656020676, 'learning_rate': 8.813524940071687e-06, 'epoch': 0.25} 25%|██▍ | 5462/22095 [9:17:35<23:06:42, 5.00s/it] 25%|██▍ | 5463/22095 [9:17:38<20:43:06, 4.48s/it] {'loss': 0.3674, 'grad_norm': 0.7466826870291752, 'learning_rate': 8.813050884962046e-06, 'epoch': 0.25} 25%|██▍ | 5463/22095 [9:17:38<20:43:06, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49396 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81393 > 40960). Running this sequence through the model will result in indexing errors 25%|██▍ | 5464/22095 [9:17:47<27:46:23, 6.01s/it] {'loss': 0.4921, 'grad_norm': 0.32719231858791137, 'learning_rate': 8.812576747921653e-06, 'epoch': 0.25} 25%|██▍ | 5464/22095 [9:17:47<27:46:23, 6.01s/it] 25%|██▍ | 5465/22095 [9:17:51<23:48:43, 5.15s/it] {'loss': 0.3852, 'grad_norm': 0.7554589756926923, 'learning_rate': 8.812102528960693e-06, 'epoch': 0.25} 25%|██▍ | 5465/22095 [9:17:51<23:48:43, 5.15s/it] 25%|██▍ | 5466/22095 [9:17:53<20:39:52, 4.47s/it] {'loss': 0.4032, 'grad_norm': 0.7777119738955086, 'learning_rate': 8.81162822808936e-06, 'epoch': 0.25} 25%|██▍ | 5466/22095 [9:17:53<20:39:52, 4.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45984 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52716 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5467/22095 [9:17:56<18:37:07, 4.03s/it] {'loss': 0.3426, 'grad_norm': 0.5932440156262864, 'learning_rate': 8.811153845317842e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5468/22095 [9:17:59<16:52:13, 3.65s/it] {'loss': 0.3747, 'grad_norm': 0.7747935373734771, 'learning_rate': 8.810679380656331e-06, 'epoch': 0.25}
25%|██▍ | 5469/22095 [9:18:02<15:51:01, 3.43s/it] {'loss': 0.3553, 'grad_norm': 0.834579211230089, 'learning_rate': 8.810204834115026e-06, 'epoch': 0.25}
25%|██▍ | 5470/22095 [9:18:05<15:39:36, 3.39s/it] {'loss': 0.3669, 'grad_norm': 0.7751947065367353, 'learning_rate': 8.80973020570412e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5471/22095 [9:18:08<14:50:15, 3.21s/it] {'loss': 0.3715, 'grad_norm': 0.6760212197334325, 'learning_rate': 8.809255495433814e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5472/22095 [9:18:14<19:05:43, 4.14s/it] {'loss': 0.4749, 'grad_norm': 0.7701164610513445, 'learning_rate': 8.808780703314305e-06, 'epoch': 0.25}
25%|██▍ | 5473/22095 [9:18:18<17:44:47, 3.84s/it] {'loss': 0.3791, 'grad_norm': 0.7626850010355772, 'learning_rate': 8.808305829355797e-06, 'epoch': 0.25}
25%|██▍ | 5474/22095 [9:18:21<17:13:08, 3.73s/it] {'loss': 0.4131, 'grad_norm': 0.7405751108830647, 'learning_rate': 8.807830873568493e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5475/22095 [9:18:30<24:52:15, 5.39s/it] {'loss': 0.4998, 'grad_norm': 0.39037150438658097, 'learning_rate': 8.8073558359626e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5476/22095 [9:18:34<22:17:40, 4.83s/it] {'loss': 0.4044, 'grad_norm': 0.6539788911826464, 'learning_rate': 8.806880716548322e-06, 'epoch': 0.25}
25%|██▍ | 5477/22095 [9:18:37<19:51:59, 4.30s/it] {'loss': 0.3887, 'grad_norm': 0.6535225496446154, 'learning_rate': 8.80640551533587e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5478/22095 [9:18:45<24:38:33, 5.34s/it] {'loss': 0.5045, 'grad_norm': 0.4697297560970744, 'learning_rate': 8.805930232335454e-06, 'epoch': 0.25}
25%|██▍ | 5479/22095 [9:18:48<21:40:25, 4.70s/it] {'loss': 0.3496, 'grad_norm': 0.7089333794869764, 'learning_rate': 8.805454867557284e-06, 'epoch': 0.25}
25%|██▍ | 5480/22095 [9:18:52<20:56:22, 4.54s/it] {'loss': 0.4134, 'grad_norm': 0.6986426965189364, 'learning_rate': 8.804979421011579e-06, 'epoch': 0.25}
25%|██▍ | 5481/22095 [9:18:55<19:11:19, 4.16s/it] {'loss': 0.3891, 'grad_norm': 0.6432027825709875, 'learning_rate': 8.804503892708552e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [456, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8507366 in VC:s3://internvl-moe-sft-data/. Exception: Image size [456, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 126310, 'image': 'vrdu_texteq/astro-ph.CO/5cfc87fe-c1fb-49a3-9ead-4c1d90361352.png', 'image_wh': [[456, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and the mass terms $\\Omega^2$ are defined as'}]}
25%|██▍ | 5482/22095 [9:18:59<18:52:16, 4.09s/it] {'loss': 0.3936, 'grad_norm': 0.6516825549121646, 'learning_rate': 8.80402828265842e-06, 'epoch': 0.25}
25%|██▍ | 5483/22095 [9:19:03<18:28:15, 4.00s/it] {'loss': 0.3776, 'grad_norm': 0.6905744064567494, 'learning_rate': 8.803552590871406e-06, 'epoch': 0.25}
25%|██▍ | 5484/22095 [9:19:07<17:46:31, 3.85s/it] {'loss': 0.4089, 'grad_norm': 0.770332771693627, 'learning_rate': 8.803076817357725e-06, 'epoch': 0.25}
25%|██▍ | 5485/22095 [9:19:10<17:24:47, 3.77s/it] {'loss': 0.4047, 'grad_norm': 0.7445899391534694, 'learning_rate': 8.802600962127606e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5486/22095 [9:19:13<16:09:15, 3.50s/it] {'loss': 0.4052, 'grad_norm': 0.610507084099802, 'learning_rate': 8.802125025191268e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (79295 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91003 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86528 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5487/22095 [9:19:17<16:32:21, 3.59s/it] {'loss': 0.4266, 'grad_norm': 0.6567358608756095, 'learning_rate': 8.801649006558943e-06, 'epoch': 0.25}
25%|██▍ | 5488/22095 [9:19:20<15:47:29, 3.42s/it] {'loss': 0.4, 'grad_norm': 0.6596038921834942, 'learning_rate': 8.801172906240857e-06, 'epoch': 0.25}
25%|██▍ | 5489/22095 [9:19:23<15:02:12, 3.26s/it] {'loss': 0.3684, 'grad_norm': 0.590396728859398, 'learning_rate': 8.800696724247239e-06, 'epoch': 0.25}
25%|██▍ | 5490/22095 [9:19:27<16:00:53, 3.47s/it] {'loss': 0.3714, 'grad_norm': 0.6386192813750432, 'learning_rate': 8.800220460588321e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (42773 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63808 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60582 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43204 > 40960) for 4 sample(s). Truncating to 1428 with 1 samples.
25%|██▍ | 5491/22095 [9:19:30<15:30:47, 3.36s/it] {'loss': 0.4147, 'grad_norm': 0.6997118465528872, 'learning_rate': 8.799744115274339e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (42756 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44443 > 40960). Running this sequence through the model will result in indexing errors
25%|██▍ | 5492/22095 [9:19:34<16:09:38, 3.50s/it] {'loss': 0.3748, 'grad_norm': 0.6565255682296418, 'learning_rate': 8.799267688315523e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5493/22095 [9:19:37<15:40:29, 3.40s/it] {'loss': 0.3724, 'grad_norm': 0.6225869450633494, 'learning_rate': 8.798791179722114e-06, 'epoch': 0.25}
25%|██▍ | 5494/22095 [9:19:40<15:18:21, 3.32s/it] {'loss': 0.3847, 'grad_norm': 0.7199023735140465, 'learning_rate': 8.798314589504348e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5495/22095 [9:19:49<23:39:12, 5.13s/it] {'loss': 0.5017, 'grad_norm': 0.6212669132572823, 'learning_rate': 8.79783791767247e-06, 'epoch': 0.25}
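The "Truncating to 1428 with 1 samples" notice above shows the trainer cutting over-long sequences down to the model maximum rather than letting indexing fail. A minimal sketch of that safeguard, assuming a flat list of token ids and the 40960 limit reported by the warnings (the real code also repacks multimodal spans, which is omitted here):

```python
MAX_SEQ_LEN = 40960  # model maximum quoted in the "Token indices sequence length" warnings

def truncate_token_ids(token_ids: list[int], max_len: int = MAX_SEQ_LEN) -> list[int]:
    """Drop tokens beyond the model maximum so position indexing cannot overflow.

    Simplified sketch of the truncation the log reports; a production version
    must also keep image placeholder tokens and their features consistent.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len]
```

For example, a 49396-token sample like the one warned about earlier would be cut to exactly 40960 tokens before the forward pass.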
25%|██▍ | 5496/22095 [9:19:53<22:06:43, 4.80s/it] {'loss': 0.3892, 'grad_norm': 0.6840190455449334, 'learning_rate': 8.797361164236717e-06, 'epoch': 0.25}
25%|██▍ | 5497/22095 [9:19:56<19:18:39, 4.19s/it] {'loss': 0.3901, 'grad_norm': 0.673935866571196, 'learning_rate': 8.796884329207337e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047179 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. \\frac{9}{2}cm\nB. 5cm\nC. \\frac{11}{2}cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
25%|██▍ | 5498/22095 [9:19:59<17:47:16, 3.86s/it] {'loss': 0.3867, 'grad_norm': 0.6989693763930492, 'learning_rate': 8.796407412594573e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5499/22095 [9:20:09<26:28:50, 5.74s/it] {'loss': 0.4909, 'grad_norm': 0.3414127366149348, 'learning_rate': 8.795930414408676e-06, 'epoch': 0.25}
25%|██▍ | 5500/22095 [9:20:18<30:17:41, 6.57s/it] {'loss': 0.4988, 'grad_norm': 0.3350419170199567, 'learning_rate': 8.795453334659889e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
25%|██▍ | 5501/22095 [9:20:21<25:33:39, 5.55s/it] {'loss': 0.3886, 'grad_norm': 0.7526309166356293, 'learning_rate': 8.79497617335847e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (64290 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55789 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44768 > 40960).
Running this sequence through the model will result in indexing errors
25%|██▍ | 5502/22095 [9:20:24<22:19:23, 4.84s/it] {'loss': 0.3453, 'grad_norm': 0.6392150511382405, 'learning_rate': 8.794498930514666e-06, 'epoch': 0.25}
25%|██▍ | 5503/22095 [9:20:27<19:38:55, 4.26s/it] {'loss': 0.3432, 'grad_norm': 0.6385867732761876, 'learning_rate': 8.794021606138734e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5504/22095 [9:20:30<17:53:51, 3.88s/it] {'loss': 0.3906, 'grad_norm': 0.6785794290773549, 'learning_rate': 8.793544200240932e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (83284 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46098 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51552 > 40960).
Running this sequence through the model will result in indexing errors
25%|██▍ | 5505/22095 [9:20:33<16:45:49, 3.64s/it] {'loss': 0.3868, 'grad_norm': 0.6534436391608166, 'learning_rate': 8.793066712831515e-06, 'epoch': 0.25}
25%|██▍ | 5506/22095 [9:20:37<16:27:54, 3.57s/it] {'loss': 0.3912, 'grad_norm': 0.6416942590392011, 'learning_rate': 8.792589143920743e-06, 'epoch': 0.25}
25%|██▍ | 5507/22095 [9:20:40<16:38:17, 3.61s/it] {'loss': 0.3795, 'grad_norm': 0.6457455294679602, 'learning_rate': 8.792111493518878e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (83589 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52802 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112312 > 40960).
Running this sequence through the model will result in indexing errors
25%|██▍ | 5508/22095 [9:20:44<16:07:39, 3.50s/it] {'loss': 0.3617, 'grad_norm': 0.6324241805057175, 'learning_rate': 8.791633761636186e-06, 'epoch': 0.25}
25%|██▍ | 5509/22095 [9:20:47<16:35:04, 3.60s/it] {'loss': 0.4277, 'grad_norm': 0.7451581373182293, 'learning_rate': 8.791155948282927e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▍ | 5510/22095 [9:20:57<24:27:37, 5.31s/it] {'loss': 0.4655, 'grad_norm': 0.6652436681096543, 'learning_rate': 8.790678053469372e-06, 'epoch': 0.25}
25%|██▍ | 5511/22095 [9:21:00<22:02:25, 4.78s/it] {'loss': 0.4131, 'grad_norm': 0.6862925638345763, 'learning_rate': 8.790200077205789e-06, 'epoch': 0.25}
25%|██▍ | 5512/22095 [9:21:04<20:13:47, 4.39s/it] {'loss': 0.415, 'grad_norm': 0.6767907349063768, 'learning_rate': 8.789722019502444e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31409.png 2025-08-28 01:19:03.781183 load time: 1001.84 ms
25%|██▍ | 5513/22095 [9:21:13<27:23:02, 5.95s/it] {'loss': 0.4734, 'grad_norm': 0.3808039754062271, 'learning_rate': 8.789243880369613e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (82363 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81420 > 40960).
Running this sequence through the model will result in indexing errors
25%|██▍ | 5514/22095 [9:21:17<24:25:34, 5.30s/it] {'loss': 0.3865, 'grad_norm': 0.6477071323204457, 'learning_rate': 8.78876565981757e-06, 'epoch': 0.25}
25%|██▍ | 5515/22095 [9:21:20<21:12:16, 4.60s/it] {'loss': 0.3678, 'grad_norm': 0.7227539653597835, 'learning_rate': 8.788287357856588e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5516/22095 [9:21:29<27:44:06, 6.02s/it] {'loss': 0.5286, 'grad_norm': 0.3728619700056104, 'learning_rate': 8.787808974496946e-06, 'epoch': 0.25}
25%|██▍ | 5517/22095 [9:21:33<24:23:22, 5.30s/it] {'loss': 0.3696, 'grad_norm': 0.8103874044499024, 'learning_rate': 8.787330509748924e-06, 'epoch': 0.25}
25%|██▍ | 5518/22095 [9:21:36<21:31:34, 4.67s/it] {'loss': 0.4202, 'grad_norm': 0.6919952359581814, 'learning_rate': 8.786851963622799e-06, 'epoch': 0.25}
25%|██▍ | 5519/22095 [9:21:40<19:49:43, 4.31s/it] {'loss': 0.399, 'grad_norm': 0.6528924734294044, 'learning_rate': 8.786373336128858e-06, 'epoch': 0.25}
25%|██▍ | 5520/22095 [9:21:43<18:37:06, 4.04s/it] {'loss': 0.3805, 'grad_norm': 0.7227289627808617, 'learning_rate': 8.78589462727738e-06, 'epoch': 0.25}
25%|██▍ | 5521/22095 [9:21:46<17:46:36, 3.86s/it] {'loss': 0.3814, 'grad_norm': 0.6116874693101342, 'learning_rate': 8.785415837078655e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▍ | 5522/22095 [9:21:56<25:25:49, 5.52s/it] {'loss': 0.5113, 'grad_norm': 0.47326084506519833, 'learning_rate': 8.78493696554297e-06, 'epoch': 0.25}
25%|██▍ | 5523/22095 [9:22:00<23:01:05, 5.00s/it] {'loss': 0.3818, 'grad_norm': 0.6477654979674229, 'learning_rate': 8.784458012680614e-06, 'epoch': 0.25}
25%|██▌ | 5524/22095 [9:22:04<21:26:49, 4.66s/it] {'loss': 0.379, 'grad_norm': 0.6210811401562978, 'learning_rate': 8.783978978501879e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (43491 > 40960). Running this sequence through the model will result in indexing errors
25%|██▌ | 5525/22095 [9:22:07<19:14:03, 4.18s/it] {'loss': 0.3561, 'grad_norm': 0.6830342901861352, 'learning_rate': 8.783499863017057e-06, 'epoch': 0.25}
25%|██▌ | 5526/22095 [9:22:10<17:57:12, 3.90s/it] {'loss': 0.4218, 'grad_norm': 0.6731567670022601, 'learning_rate': 8.783020666236443e-06, 'epoch': 0.25}
25%|██▌ | 5527/22095 [9:22:13<17:27:14, 3.79s/it] {'loss': 0.3543, 'grad_norm': 0.65807887584648, 'learning_rate': 8.782541388170334e-06, 'epoch': 0.25}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
25%|██▌ | 5528/22095 [9:22:17<16:40:36, 3.62s/it] {'loss': 0.3845, 'grad_norm': 0.8695506353899523, 'learning_rate': 8.782062028829028e-06, 'epoch': 0.25}
25%|██▌ | 5529/22095 [9:22:20<16:25:37, 3.57s/it] {'loss': 0.3622, 'grad_norm': 0.6392487323412227, 'learning_rate': 8.781582588222823e-06, 'epoch': 0.25}
25%|██▌ | 5530/22095 [9:22:24<16:22:29, 3.56s/it] {'loss': 0.3361, 'grad_norm': 0.5861804047598004, 'learning_rate': 8.781103066362024e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930858 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54011, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BC=4cm,∴AC=6cm,∵D是线段AC的中点,∴AD=3cm.'}]}
25%|██▌ | 5531/22095 [9:22:27<15:31:20, 3.37s/it] {'loss': 0.353, 'grad_norm': 0.6529690974582265, 'learning_rate': 8.780623463256932e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924583 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47736, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 32cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
25%|██▌ | 5532/22095 [9:22:29<14:53:45, 3.24s/it] {'loss': 0.3629, 'grad_norm': 0.6388205223624392, 'learning_rate': 8.780143778917853e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7798475 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '19815', 'image': '28869.jpg', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\nI am responding to the following question using the image as a reference: \nPlease explain what you think is happening or what story the image is telling. \nHere is my answer: \nThe image portrays a graphical representation showcasing a continuous arc. The curved segment is adorned with four significant milestones denoted as **X**, **W**, **V**, and **Y** sequentially. These milestones demarcate the arc into distinguishable portions.\n\nCommencing from the left, milestone **X** is positioned at the extreme left of the arc. Progressing rightward, there\'s a section leading to milestone **W**, succeeded by another segment that steers to milestone **V**, culminating with a portion that connects to milestone **Y** at the extreme right of the arc.\n\nThe denominations imply a cyclical arrangement of the milestones, with **X** preceding **W** and **V**, while **Y** follows **V**. The arc can be fractioned into three sections: **XW** (spanning milestones **X** to **W**), **WV** (between milestones **W** and **V**), and **VY** (encompassing milestones **V** to **Y**).\n\nTo delve into the circular characteristics based on traditional geometric interpretations:\n- **Section XW** represents the angular distance from milestone **X** to **W**.\n- **Section WV** denotes the angular distance from milestone **W** to **V**.\n- **Section VY** indicates the angular distance from milestone **V** to **Y**.\n\nGiven that the milestones are named in alphabetical order and are laid out in a cyclical pattern without additional geometric shapes or intersections, it can be deduced that their disposition is sequential.\n\nHere is a conceivable markdown representation of the arc and marked milestones:\n```\nX ──── W ──── V ──── Y\n```\n\nIn the absence of precise measurements or data points concerning the lengths of the sections, it is plausible to assume that this figure symbolizes a fundamental circular arrangement of discrete milestones on an arc. Such an illustration is typically utilized to explicate concepts in geometry, including arcs, angles, or circularity.\n\nIf the inquiry were, "What is the sequence of milestones on this arc segment?" The CoT (Chain of Thought) rationale would encompass validating the succession visually and understanding it sequentially:\n1. Scrutinizing the milestones and their notations.\n2. Realizing that the order advances from left to right as designated.\n3. Deducing that the sequence is **X**, **W**, **V**, **Y**.\n\nHence, the sequence of milestones on the arc segment is **X, W, V, Y**. \nKindly correct my answer based on the image, and if it\'s wrong, please explain the reason.'}, {'from': 'gpt', 'value': 'Your description aligns well with the principles of interpreting geometric figures, showing a clear understanding of how to identify and explain the sequence of points or milestones in a given visual. However, there are a few subtle errors and inconsistencies in your response:\n\n1. **Milestone Naming Error**: In the initial part of your answer, you describe milestones as being denoted by **X**, **W**, **V**, and **Y**. However, in the correct image description, the points are labeled **A**, **C**, **D**, and **B**. This discrepancy suggests a misunderstanding or misinterpretation of the provided image.\n\n2. **Cyclical vs. Linear Confusion**: Your initial interpretation assumes a cyclical or circular arrangement of milestones, but the description of the correct image correctly identifies a linear arrangement. The confusion between circular and linear properties indicates a need to revisit the geometric figure for clarification.\n\n3. **Terminology Inconsistency**: In the circular representation, you use terms like "arc" and "angular distance," which are not applicable to the linear arrangement described in the correct image. The correct image involves "line segment" and "distance" rather than "arc" and "angular distance."\n\n4. **Sequential Notation Mismatch**: In the circular representation, you mention the sequence as **X, W, V, Y**, which matches the alphabetical order but does not correspond to the correct image\'s sequential notation of **A, C, D, B**. This confusion might arise from not accurately following the sequence in the provided image.\n\nThese issues stem from a potential misunderstanding of the provided image. It\'s crucial to carefully observe and describe the given visual accurately, especially when distinguishing between linear and circular properties.'}]}
25%|██▌ | 5533/22095 [9:22:38<21:49:52, 4.75s/it] {'loss': 0.5079, 'grad_norm': 0.5135197990140405, 'learning_rate': 8.779664013355095e-06, 'epoch': 0.25}
25%|██▌ | 5534/22095 [9:22:42<20:39:53, 4.49s/it] {'loss': 0.3658, 'grad_norm': 0.6759508366790495, 'learning_rate': 8.779184166578965e-06, 'epoch': 0.25}
25%|██▌ | 5535/22095 [9:22:45<19:21:06, 4.21s/it] {'loss': 0.3724, 'grad_norm': 0.67004678301159, 'learning_rate': 8.778704238599775e-06, 'epoch': 0.25}
25%|██▌ | 5536/22095 [9:22:48<17:32:32, 3.81s/it] {'loss': 0.3948, 'grad_norm': 0.7186727224626106, 'learning_rate': 8.778224229427836e-06, 'epoch': 0.25}
25%|██▌ | 5537/22095 [9:22:51<16:35:45, 3.61s/it] {'loss': 0.397, 'grad_norm': 0.6462913809859987, 'learning_rate': 8.777744139073461e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (47551 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55625 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52625 > 40960).
Running this sequence through the model will result in indexing errors
25%|██▌ | 5538/22095 [9:22:55<17:22:24, 3.78s/it] {'loss': 0.4191, 'grad_norm': 0.6317619249321907, 'learning_rate': 8.777263967546969e-06, 'epoch': 0.25}
25%|██▌ | 5539/22095 [9:22:59<16:41:27, 3.63s/it] {'loss': 0.3998, 'grad_norm': 0.6514663160067772, 'learning_rate': 8.776783714858672e-06, 'epoch': 0.25}
25%|██▌ | 5540/22095 [9:23:02<16:12:19, 3.52s/it] {'loss': 0.3773, 'grad_norm': 0.6606566810098499, 'learning_rate': 8.776303381018895e-06, 'epoch': 0.25}
25%|██▌ | 5541/22095 [9:23:06<16:36:43, 3.61s/it] {'loss': 0.3711, 'grad_norm': 0.696273850276573, 'learning_rate': 8.775822966037956e-06, 'epoch': 0.25}
25%|██▌ | 5542/22095 [9:23:09<16:27:07, 3.58s/it] {'loss': 0.3843, 'grad_norm': 0.6246857391708531, 'learning_rate': 8.775342469926178e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (67801 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79007 > 40960). Running this sequence through the model will result in indexing errors
25%|██▌ | 5543/22095 [9:23:12<15:29:45, 3.37s/it] {'loss': 0.3556, 'grad_norm': 0.6057970609141705, 'learning_rate': 8.774861892693886e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (104803 > 40960).
Running this sequence through the model will result in indexing errors 25%|██▌ | 5544/22095 [9:23:17<17:41:33, 3.85s/it] {'loss': 0.3758, 'grad_norm': 0.6359022529488361, 'learning_rate': 8.774381234351406e-06, 'epoch': 0.25} 25%|██▌ | 5544/22095 [9:23:17<17:41:33, 3.85s/it] 25%|██▌ | 5545/22095 [9:23:20<16:19:18, 3.55s/it] {'loss': 0.3607, 'grad_norm': 0.6749517080642095, 'learning_rate': 8.773900494909065e-06, 'epoch': 0.25} 25%|██▌ | 5545/22095 [9:23:20<16:19:18, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5546/22095 [9:23:26<19:17:56, 4.20s/it] {'loss': 0.5107, 'grad_norm': 0.45212094632264954, 'learning_rate': 8.77341967437719e-06, 'epoch': 0.25} 25%|██▌ | 5546/22095 [9:23:26<19:17:56, 4.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5547/22095 [9:23:29<18:27:28, 4.02s/it] {'loss': 0.4256, 'grad_norm': 0.6890815621422842, 'learning_rate': 8.77293877276612e-06, 'epoch': 0.25} 25%|██▌ | 5547/22095 [9:23:29<18:27:28, 4.02s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [678, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8528931 in VC:s3://internvl-moe-sft-data/. Exception: Image size [678, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 119930, 'image': 'vrdu_texteq/astro-ph.CO/a8b08fb5-aea6-47a2-b525-e098084bd653.png', 'image_wh': [[678, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where the derivative is taken at the mean value $X_1=1$.'}]} 25%|██▌ | 5548/22095 [9:23:33<17:58:03, 3.91s/it] {'loss': 0.3561, 'grad_norm': 0.6796933014357068, 'learning_rate': 8.77245779008618e-06, 'epoch': 0.25} 25%|██▌ | 5548/22095 [9:23:33<17:58:03, 3.91s/it] 25%|██▌ | 5549/22095 [9:23:36<16:29:29, 3.59s/it] {'loss': 0.3897, 'grad_norm': 0.7136438532760775, 'learning_rate': 8.77197672634771e-06, 'epoch': 0.25} 25%|██▌ | 5549/22095 [9:23:36<16:29:29, 3.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [1134, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8406260 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1134, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8447, 'image': 'vrdu_table_final_2/astro-ph.CO/9e316111-2deb-4a9d-83a8-2eab15a90490.png', 'image_wh': [[1134, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n9&10&11&12&13&14&15\n\\end{tabular}\n```"}]} 25%|██▌ | 5550/22095 [9:23:40<17:17:14, 3.76s/it] {'loss': 0.4194, 'grad_norm': 0.6531877591429376, 'learning_rate': 8.771495581561043e-06, 'epoch': 0.25} 25%|██▌ | 5550/22095 [9:23:40<17:17:14, 3.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5551/22095 [9:23:49<25:09:53, 5.48s/it] {'loss': 0.5091, 'grad_norm': 0.3472284463790447, 'learning_rate': 8.77101435573652e-06, 'epoch': 0.25} 25%|██▌ | 5551/22095 [9:23:49<25:09:53, 5.48s/it] 25%|██▌ | 5552/22095 [9:23:53<22:31:38, 4.90s/it] {'loss': 0.3905, 'grad_norm': 0.6351845727041716, 'learning_rate': 8.770533048884483e-06, 'epoch': 0.25} 25%|██▌ | 5552/22095 [9:23:53<22:31:38, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58823 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5553/22095 [9:23:56<19:51:35, 4.32s/it] {'loss': 0.439, 'grad_norm': 0.6696146617160268, 'learning_rate': 8.77005166101527e-06, 'epoch': 0.25} 25%|██▌ | 5553/22095 [9:23:56<19:51:35, 4.32s/it] 25%|██▌ | 5554/22095 [9:23:59<18:47:47, 4.09s/it] {'loss': 0.3722, 'grad_norm': 0.6208498844865883, 'learning_rate': 8.769570192139224e-06, 'epoch': 0.25} 25%|██▌ | 5554/22095 [9:23:59<18:47:47, 4.09s/it] 25%|██▌ | 5555/22095 [9:24:03<18:23:11, 4.00s/it] {'loss': 0.3978, 'grad_norm': 0.7578644508445018, 'learning_rate': 8.76908864226669e-06, 'epoch': 0.25} 25%|██▌ | 5555/22095 [9:24:03<18:23:11, 4.00s/it] 25%|██▌ | 5556/22095 [9:24:06<17:18:38, 3.77s/it] {'loss': 0.3843, 'grad_norm': 0.6320177265238178, 'learning_rate': 8.768607011408021e-06, 'epoch': 0.25} 25%|██▌ | 5556/22095 [9:24:06<17:18:38, 3.77s/it] 25%|██▌ | 5557/22095 [9:24:10<17:33:58, 3.82s/it] {'loss': 0.3584, 'grad_norm': 0.5742711543775584, 'learning_rate': 8.76812529957356e-06, 'epoch': 0.25} 25%|██▌ | 5557/22095 [9:24:10<17:33:58, 3.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5558/22095 [9:24:14<17:46:49, 3.87s/it] {'loss': 0.3568, 'grad_norm': 0.6223855477017578, 'learning_rate': 8.76764350677366e-06, 'epoch': 0.25} 25%|██▌ | 5558/22095 [9:24:14<17:46:49, 3.87s/it] 25%|██▌ | 5559/22095 [9:24:18<17:43:50, 3.86s/it] {'loss': 0.3652, 'grad_norm': 0.5993825762871401, 'learning_rate': 8.76716163301867e-06, 'epoch': 0.25} 25%|██▌ | 5559/22095 [9:24:18<17:43:50, 3.86s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5560/22095 [9:24:21<16:16:11, 3.54s/it] {'loss': 0.3735, 'grad_norm': 0.8458811737927945, 'learning_rate': 8.76667967831895e-06, 'epoch': 0.25} 25%|██▌ | 5560/22095 [9:24:21<16:16:11, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
25%|██▌ | 5561/22095 [9:24:30<24:18:01, 5.29s/it] {'loss': 0.492, 'grad_norm': 0.40384503794198034, 'learning_rate': 8.76619764268485e-06, 'epoch': 0.25} 25%|██▌ | 5561/22095 [9:24:30<24:18:01, 5.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71333 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62996 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55274 > 40960). Running this sequence through the model will result in indexing errors 25%|██▌ | 5562/22095 [9:24:34<22:10:16, 4.83s/it] {'loss': 0.4102, 'grad_norm': 0.6648333028328689, 'learning_rate': 8.76571552612673e-06, 'epoch': 0.25} 25%|██▌ | 5562/22095 [9:24:34<22:10:16, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77265 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5563/22095 [9:24:38<20:41:24, 4.51s/it] {'loss': 0.3926, 'grad_norm': 0.6475418274382717, 'learning_rate': 8.765233328654949e-06, 'epoch': 0.25} 25%|██▌ | 5563/22095 [9:24:38<20:41:24, 4.51s/it] 25%|██▌ | 5564/22095 [9:24:42<19:42:09, 4.29s/it] {'loss': 0.3489, 'grad_norm': 0.6015760283334478, 'learning_rate': 8.764751050279868e-06, 'epoch': 0.25} 25%|██▌ | 5564/22095 [9:24:42<19:42:09, 4.29s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5565/22095 [9:24:45<18:53:34, 4.11s/it] {'loss': 0.4221, 'grad_norm': 0.6921602856411545, 'learning_rate': 8.764268691011851e-06, 'epoch': 0.25} 25%|██▌ | 5565/22095 [9:24:45<18:53:34, 4.11s/it] 25%|██▌ | 5566/22095 [9:24:49<18:22:17, 4.00s/it] {'loss': 0.4028, 'grad_norm': 0.7213082623671482, 'learning_rate': 8.763786250861258e-06, 'epoch': 0.25} 25%|██▌ | 5566/22095 [9:24:49<18:22:17, 4.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54312 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41322 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46508 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (137951 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48167 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5567/22095 [9:24:53<18:07:18, 3.95s/it] {'loss': 0.3781, 'grad_norm': 0.6666064268052584, 'learning_rate': 8.76330372983846e-06, 'epoch': 0.25} 25%|██▌ | 5567/22095 [9:24:53<18:07:18, 3.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5568/22095 [9:24:56<16:34:29, 3.61s/it] {'loss': 0.356, 'grad_norm': 0.6081436800667175, 'learning_rate': 8.762821127953821e-06, 'epoch': 0.25} 25%|██▌ | 5568/22095 [9:24:56<16:34:29, 3.61s/it] 25%|██▌ | 5569/22095 [9:24:59<15:56:07, 3.47s/it] {'loss': 0.3625, 'grad_norm': 0.6309678293578441, 'learning_rate': 8.762338445217713e-06, 'epoch': 0.25} 25%|██▌ | 5569/22095 [9:24:59<15:56:07, 3.47s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44165 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41473 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64236 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5570/22095 [9:25:08<24:14:49, 5.28s/it] {'loss': 0.4888, 'grad_norm': 0.38533445695115726, 'learning_rate': 8.761855681640508e-06, 'epoch': 0.25} 25%|██▌ | 5570/22095 [9:25:08<24:14:49, 5.28s/it] 25%|██▌ | 5571/22095 [9:25:12<21:13:54, 4.63s/it] {'loss': 0.3547, 'grad_norm': 0.6644202585973386, 'learning_rate': 8.761372837232578e-06, 'epoch': 0.25} 25%|██▌ | 5571/22095 [9:25:12<21:13:54, 4.63s/it] 25%|██▌ | 5572/22095 [9:25:15<19:35:18, 4.27s/it] {'loss': 0.3561, 'grad_norm': 0.6490414722219586, 'learning_rate': 8.760889912004297e-06, 'epoch': 0.25} 25%|██▌ | 5572/22095 [9:25:15<19:35:18, 4.27s/it] 25%|██▌ | 5573/22095 [9:25:19<18:34:34, 4.05s/it] {'loss': 0.3941, 'grad_norm': 0.721093897991297, 'learning_rate': 8.760406905966045e-06, 'epoch': 0.25} 25%|██▌ | 5573/22095 [9:25:19<18:34:34, 4.05s/it] 25%|██▌ | 5574/22095 [9:25:22<17:06:27, 3.73s/it] {'loss': 0.3836, 'grad_norm': 0.7217691975495019, 'learning_rate': 8.759923819128196e-06, 'epoch': 0.25} 25%|██▌ | 5574/22095 [9:25:22<17:06:27, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5575/22095 [9:25:31<24:40:20, 5.38s/it] {'loss': 0.5095, 'grad_norm': 0.3325969229718621, 'learning_rate': 8.759440651501131e-06, 'epoch': 0.25} 25%|██▌ | 5575/22095 [9:25:31<24:40:20, 5.38s/it] 25%|██▌ | 5576/22095 [9:25:34<22:13:56, 4.85s/it] {'loss': 0.4595, 'grad_norm': 0.7479140074598437, 'learning_rate': 8.758957403095234e-06, 'epoch': 0.25} 25%|██▌ | 5576/22095 [9:25:34<22:13:56, 4.85s/it] 25%|██▌ | 5577/22095 [9:25:37<19:30:33, 4.25s/it] {'loss': 0.3628, 'grad_norm': 0.7376226797049706, 'learning_rate': 8.758474073920887e-06, 'epoch': 0.25} 25%|██▌ | 5577/22095 [9:25:37<19:30:33, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5578/22095 [9:25:47<26:44:47, 5.83s/it] {'loss': 0.5044, 'grad_norm': 0.32568130987292926, 'learning_rate': 8.757990663988474e-06, 
'epoch': 0.25} 25%|██▌ | 5578/22095 [9:25:47<26:44:47, 5.83s/it] 25%|██▌ | 5579/22095 [9:25:50<23:10:56, 5.05s/it] {'loss': 0.3442, 'grad_norm': 0.7091443937278222, 'learning_rate': 8.757507173308385e-06, 'epoch': 0.25} 25%|██▌ | 5579/22095 [9:25:50<23:10:56, 5.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5580/22095 [9:25:53<20:19:36, 4.43s/it] {'loss': 0.3829, 'grad_norm': 0.6641702917856819, 'learning_rate': 8.757023601891006e-06, 'epoch': 0.25} 25%|██▌ | 5580/22095 [9:25:53<20:19:36, 4.43s/it] 25%|██▌ | 5581/22095 [9:25:56<18:19:18, 3.99s/it] {'loss': 0.3979, 'grad_norm': 0.7033983142051135, 'learning_rate': 8.756539949746729e-06, 'epoch': 0.25} 25%|██▌ | 5581/22095 [9:25:56<18:19:18, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (60345 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41810 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65028 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49431 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88464 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5582/22095 [9:26:02<21:19:40, 4.65s/it] {'loss': 0.5033, 'grad_norm': 0.32804607849724715, 'learning_rate': 8.756056216885946e-06, 'epoch': 0.25} 25%|██▌ | 5582/22095 [9:26:02<21:19:40, 4.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 25%|██▌ | 5583/22095 [9:26:07<21:17:15, 4.64s/it] {'loss': 0.3892, 'grad_norm': 0.7125925445274838, 'learning_rate': 8.755572403319052e-06, 'epoch': 0.25} 25%|██▌ | 5583/22095 [9:26:07<21:17:15, 4.64s/it] 25%|██▌ | 5584/22095 [9:26:10<19:22:58, 4.23s/it] {'loss': 0.3807, 'grad_norm': 0.6783287676002182, 'learning_rate': 8.75508850905644e-06, 'epoch': 0.25} 25%|██▌ | 5584/22095 [9:26:10<19:22:58, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59535 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66109 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144473 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83571 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50964 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5585/22095 [9:26:14<19:08:07, 4.17s/it] {'loss': 0.3759, 'grad_norm': 0.6702317252289326, 'learning_rate': 8.754604534108509e-06, 'epoch': 0.25} 25%|██▌ | 5585/22095 [9:26:14<19:08:07, 4.17s/it] 25%|██▌ | 5586/22095 [9:26:18<18:46:55, 4.10s/it] {'loss': 0.4367, 'grad_norm': 0.6867608478804084, 'learning_rate': 8.754120478485659e-06, 'epoch': 0.25} 25%|██▌ | 5586/22095 [9:26:18<18:46:55, 4.10s/it] 25%|██▌ | 5587/22095 [9:26:21<17:38:57, 3.85s/it] {'loss': 0.3638, 'grad_norm': 0.6694843817164187, 'learning_rate': 8.753636342198289e-06, 'epoch': 0.25} 25%|██▌ | 5587/22095 [9:26:21<17:38:57, 3.85s/it] 25%|██▌ | 5588/22095 [9:26:25<17:54:12, 3.90s/it] {'loss': 0.3795, 'grad_norm': 0.6358098373776894, 'learning_rate': 8.753152125256801e-06, 'epoch': 0.25} 25%|██▌ | 5588/22095 [9:26:25<17:54:12, 3.90s/it] 25%|██▌ | 5589/22095 [9:26:29<18:10:16, 3.96s/it] {'loss': 0.3908, 'grad_norm': 0.6631921437034257, 'learning_rate': 8.752667827671602e-06, 'epoch': 0.25} 25%|██▌ | 5589/22095 [9:26:29<18:10:16, 3.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5590/22095 [9:26:39<26:24:33, 5.76s/it] {'loss': 0.4857, 'grad_norm': 0.3710637208753156, 'learning_rate': 8.752183449453098e-06, 'epoch': 0.25} 25%|██▌ | 5590/22095 [9:26:39<26:24:33, 5.76s/it] 25%|██▌ | 5591/22095 [9:26:42<22:52:08, 4.99s/it] {'loss': 0.4029, 'grad_norm': 0.7082325129752405, 'learning_rate': 8.751698990611694e-06, 'epoch': 0.25} 25%|██▌ | 5591/22095 [9:26:42<22:52:08, 4.99s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too 
small. Minimum size is 28. [Try #0] Failed to fetch sample 8959460 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10295, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 5'}, {'from': 'gpt', 'value': '【解答】解:(1)由点M、N分别是线段AC、BC的中点,得MC=\\frac{1}{2}AC=\\frac{1}{2}×4=2,NC=\\frac{1}{2}BC=\\frac{1}{2}×6=3.由线段的和差,得MN=MC+NC=2+3=5;'}]} 25%|██▌ | 5592/22095 [9:26:46<20:44:21, 4.52s/it] {'loss': 0.3426, 'grad_norm': 0.7012540179855176, 'learning_rate': 8.751214451157802e-06, 'epoch': 0.25} 25%|██▌ | 5592/22095 [9:26:46<20:44:21, 4.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5593/22095 [9:26:55<27:19:39, 5.96s/it] {'loss': 0.4895, 'grad_norm': 0.30426007358344753, 'learning_rate': 8.750729831101831e-06, 'epoch': 0.25} 25%|██▌ | 5593/22095 [9:26:55<27:19:39, 5.96s/it] 25%|██▌ | 5594/22095 [9:26:59<23:43:22, 5.18s/it] {'loss': 0.3505, 'grad_norm': 0.613238452659256, 'learning_rate': 8.750245130454197e-06, 'epoch': 0.25} 25%|██▌ | 5594/22095 [9:26:59<23:43:22, 5.18s/it] 25%|██▌ | 5595/22095 [9:27:02<21:59:19, 4.80s/it] {'loss': 0.4148, 'grad_norm': 0.7036978272830566, 'learning_rate': 8.749760349225312e-06, 'epoch': 0.25} 25%|██▌ | 5595/22095 [9:27:03<21:59:19, 4.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5596/22095 [9:27:10<25:45:56, 5.62s/it] {'loss': 0.5372, 'grad_norm': 0.34027021169803484, 'learning_rate': 8.749275487425595e-06, 'epoch': 0.25} 25%|██▌ | 5596/22095 [9:27:10<25:45:56, 5.62s/it] 25%|██▌ | 5597/22095 [9:27:14<22:55:02, 5.00s/it] {'loss': 0.3648, 'grad_norm': 0.6430519959314669, 'learning_rate': 8.748790545065462e-06, 'epoch': 0.25} 25%|██▌ | 5597/22095 [9:27:14<22:55:02, 5.00s/it] 25%|██▌ | 5598/22095 [9:27:17<20:16:57, 4.43s/it] {'loss': 0.3924, 
'grad_norm': 0.6481716182969333, 'learning_rate': 8.748305522155333e-06, 'epoch': 0.25} 25%|██▌ | 5598/22095 [9:27:17<20:16:57, 4.43s/it] 25%|██▌ | 5599/22095 [9:27:20<18:11:37, 3.97s/it] {'loss': 0.3477, 'grad_norm': 0.7850438631381669, 'learning_rate': 8.747820418705632e-06, 'epoch': 0.25} 25%|██▌ | 5599/22095 [9:27:20<18:11:37, 3.97s/it] 25%|██▌ | 5600/22095 [9:27:23<16:58:53, 3.71s/it] {'loss': 0.3393, 'grad_norm': 0.6475872459409273, 'learning_rate': 8.74733523472678e-06, 'epoch': 0.25} 25%|██▌ | 5600/22095 [9:27:23<16:58:53, 3.71s/it] 25%|██▌ | 5601/22095 [9:27:26<17:06:44, 3.73s/it] {'loss': 0.3912, 'grad_norm': 0.6892199701583623, 'learning_rate': 8.746849970229202e-06, 'epoch': 0.25} 25%|██▌ | 5601/22095 [9:27:26<17:06:44, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50788 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57389 > 40960). Running this sequence through the model will result in indexing errors 25%|██▌ | 5602/22095 [9:27:29<16:05:52, 3.51s/it] {'loss': 0.3977, 'grad_norm': 0.652841793709212, 'learning_rate': 8.746364625223326e-06, 'epoch': 0.25} 25%|██▌ | 5602/22095 [9:27:29<16:05:52, 3.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307859 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2CUDNcyqAXuNjy1XdXXaYcVXa_!!2947162503.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read out and tell me what is written on this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n洗澡+厨卫\n一机两用\n冷热\n标准漏电保护器\n3\n价\n特\n安全ϟ保障\n3\n秒\n活动挂座\n专用增压花洒\n02E\n2\n加长\n米\n铜芯伸缩管\n原装出水嘴\n主机3000瓦\n超高性价比'}]} 25%|██▌ | 5603/22095 [9:27:33<15:37:03, 3.41s/it] {'loss': 0.4435, 'grad_norm': 0.7133437435431227, 'learning_rate': 8.74587919971958e-06, 'epoch': 0.25} 25%|██▌ | 5603/22095 [9:27:33<15:37:03, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45715 > 40960). Running this sequence through the model will result in indexing errors 25%|██▌ | 5604/22095 [9:27:36<15:20:51, 3.35s/it] {'loss': 0.3763, 'grad_norm': 0.6371564505576985, 'learning_rate': 8.745393693728395e-06, 'epoch': 0.25} 25%|██▌ | 5604/22095 [9:27:36<15:20:51, 3.35s/it] 25%|██▌ | 5605/22095 [9:27:39<15:31:40, 3.39s/it] {'loss': 0.3753, 'grad_norm': 0.5867002877282491, 'learning_rate': 8.744908107260204e-06, 'epoch': 0.25} 25%|██▌ | 5605/22095 [9:27:39<15:31:40, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77975 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107695 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49807 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57264 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43247 > 40960). Running this sequence through the model will result in indexing errors 25%|██▌ | 5606/22095 [9:27:43<16:08:09, 3.52s/it] {'loss': 0.3785, 'grad_norm': 0.6591111743754317, 'learning_rate': 8.744422440325437e-06, 'epoch': 0.25} 25%|██▌ | 5606/22095 [9:27:43<16:08:09, 3.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49288 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76407 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5607/22095 [9:27:47<16:17:54, 3.56s/it] {'loss': 0.4149, 'grad_norm': 0.702145490185904, 'learning_rate': 8.743936692934533e-06, 'epoch': 0.25} 25%|██▌ | 5607/22095 [9:27:47<16:17:54, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5608/22095 [9:27:51<17:26:06, 3.81s/it] {'loss': 0.5093, 'grad_norm': 0.3819564845362773, 'learning_rate': 8.743450865097929e-06, 'epoch': 0.25} 25%|██▌ | 5608/22095 [9:27:51<17:26:06, 3.81s/it] 25%|██▌ | 5609/22095 [9:27:55<16:47:34, 3.67s/it] {'loss': 0.3885, 'grad_norm': 0.6355224732404193, 'learning_rate': 8.742964956826063e-06, 'epoch': 0.25} 25%|██▌ | 5609/22095 [9:27:55<16:47:34, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8387407 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 54219, 'image': 'vrdu_table_final_2/astro-ph.CO/89e5072c-61af-4674-99d6-35f532624672.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 25%|██▌ | 5610/22095 [9:28:03<23:47:52, 5.20s/it] {'loss': 0.4812, 'grad_norm': 0.3119610838803885, 'learning_rate': 8.742478968129375e-06, 'epoch': 0.25} 25%|██▌ | 5610/22095 [9:28:03<23:47:52, 5.20s/it] 25%|██▌ | 5611/22095 [9:28:07<21:31:45, 4.70s/it] {'loss': 0.3808, 'grad_norm': 0.6582709352358617, 'learning_rate': 8.741992899018307e-06, 'epoch': 0.25} 25%|██▌ | 5611/22095 [9:28:07<21:31:45, 4.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 25%|██▌ | 5612/22095 [9:28:12<22:00:35, 4.81s/it] {'loss': 0.4791, 'grad_norm': 0.3003522544833689, 'learning_rate': 8.741506749503306e-06, 'epoch': 0.25} 25%|██▌ | 5612/22095 [9:28:12<22:00:35, 4.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83922 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50903 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121947 > 40960). 
Running this sequence through the model will result in indexing errors 25%|██▌ | 5613/22095 [9:28:15<20:08:29, 4.40s/it] {'loss': 0.4024, 'grad_norm': 0.6558557492776759, 'learning_rate': 8.741020519594816e-06, 'epoch': 0.25} 25%|██▌ | 5613/22095 [9:28:15<20:08:29, 4.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (94434 > 40960). Running this sequence through the model will result in indexing errors 25%|██▌ | 5614/22095 [9:28:25<27:02:37, 5.91s/it] {'loss': 0.5212, 'grad_norm': 0.37483177884061825, 'learning_rate': 8.740534209303285e-06, 'epoch': 0.25} 25%|██▌ | 5614/22095 [9:28:25<27:02:37, 5.91s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (139814040 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 25%|██▌ | 5615/22095 [9:28:34<31:38:38, 6.91s/it] {'loss': 0.5019, 'grad_norm': 0.3089870798161669, 'learning_rate': 8.74004781863916e-06, 'epoch': 0.25} 25%|██▌ | 5615/22095 [9:28:34<31:38:38, 6.91s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 25%|██▌ | 5616/22095 [9:28:37<26:36:51, 5.81s/it] {'loss': 0.3541, 'grad_norm': 0.6220929272548339, 'learning_rate': 8.739561347612894e-06, 'epoch': 0.25} 25%|██▌ | 5616/22095 [9:28:37<26:36:51, 5.81s/it] 25%|██▌ | 5617/22095 [9:28:41<23:56:26, 5.23s/it] {'loss': 0.3822, 'grad_norm': 0.6973460496890179, 'learning_rate': 8.739074796234943e-06, 'epoch': 0.25} 25%|██▌ | 5617/22095 [9:28:41<23:56:26, 5.23s/it] 25%|██▌ | 5618/22095 [9:28:44<20:59:45, 4.59s/it] {'loss': 0.3875, 'grad_norm': 0.6709010629334192, 'learning_rate': 8.738588164515755e-06, 'epoch': 0.25} 25%|██▌ | 5618/22095 [9:28:44<20:59:45, 4.59s/it] 25%|██▌ | 5619/22095 [9:28:48<19:43:47, 4.31s/it] {'loss': 0.3827, 'grad_norm': 0.6981970747897914, 
'learning_rate': 8.738101452465793e-06, 'epoch': 0.25}
25%|██▌ | 5619/22095 [9:28:48<19:43:47, 4.31s/it]
25%|██▌ | 5620/22095 [9:28:51<18:03:59, 3.95s/it] {'loss': 0.3956, 'grad_norm': 0.6752351421153933, 'learning_rate': 8.737614660095507e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▌ | 5621/22095 [9:29:01<26:27:39, 5.78s/it] {'loss': 0.4983, 'grad_norm': 0.5064682169153885, 'learning_rate': 8.737127787415365e-06, 'epoch': 0.25}
25%|██▌ | 5622/22095 [9:29:11<31:51:30, 6.96s/it] {'loss': 0.5241, 'grad_norm': 0.4210484452205898, 'learning_rate': 8.736640834435824e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (54392 > 40960). Running this sequence through the model will result in indexing errors
25%|██▌ | 5623/22095 [9:29:15<27:39:11, 6.04s/it] {'loss': 0.374, 'grad_norm': 0.9187963479616968, 'learning_rate': 8.736153801167346e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (128935 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83927 > 40960). Running this sequence through the model will result in indexing errors
25%|██▌ | 5624/22095 [9:29:19<25:06:50, 5.49s/it] {'loss': 0.3771, 'grad_norm': 0.6036483418713718, 'learning_rate': 8.735666687620398e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8334987 in VC:s3://internvl-moe-sft-data/. Exception: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1599, 'image': 'vrdu_table_final_2/astro-ph.CO/42134518-2f00-4d96-9a0e-801eef3819d4.png', 'image_wh': [[109, 20]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha - \\alpha_{\\rm true}$\\end{tabular}\n```"}]}
25%|██▌ | 5625/22095 [9:29:22<22:30:28, 4.92s/it] {'loss': 0.3879, 'grad_norm': 0.6672026035990911, 'learning_rate': 8.735179493805446e-06, 'epoch': 0.25}
25%|██▌ | 5626/22095 [9:29:26<20:49:03, 4.55s/it] {'loss': 0.3721, 'grad_norm': 0.6955777937848375, 'learning_rate': 8.73469221973296e-06, 'epoch': 0.25}
25%|██▌ | 5627/22095 [9:29:30<19:38:22, 4.29s/it] {'loss': 0.365, 'grad_norm': 0.6532425045514988, 'learning_rate': 8.734204865413407e-06, 'epoch': 0.25}
25%|██▌ | 5628/22095 [9:29:33<18:30:39, 4.05s/it] {'loss': 0.394, 'grad_norm': 0.6804037652907705, 'learning_rate': 8.73371743085726e-06, 'epoch': 0.25}
25%|██▌ | 5629/22095 [9:29:36<17:17:26, 3.78s/it] {'loss': 0.4164, 'grad_norm': 0.6691062206288424, 'learning_rate': 8.733229916074995e-06, 'epoch': 0.25}
25%|██▌ | 5630/22095 [9:29:41<17:47:15, 3.89s/it] {'loss': 0.3914, 'grad_norm': 0.6919494707655087, 'learning_rate': 8.732742321077082e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 1, but got module 364
25%|██▌ | 5631/22095 [9:29:49<24:35:32, 5.38s/it] {'loss': 0.5069, 'grad_norm': 0.6994508346956777, 'learning_rate': 8.732254645874002e-06, 'epoch': 0.25}
25%|██▌ | 5632/22095 [9:29:59<30:06:23, 6.58s/it] {'loss': 0.4924, 'grad_norm': 0.48480809580622497, 'learning_rate': 8.731766890476232e-06, 'epoch': 0.25}
Invalidate trace cache @ step 2: expected module 364, but got module 1
25%|██▌ | 5633/22095 [9:30:02<25:39:22, 5.61s/it] {'loss': 0.3927, 'grad_norm': 0.7333716987396993, 'learning_rate': 8.731279054894254e-06, 'epoch': 0.25}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307604 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB26MTjvrtlpuFjSspoXXbcDpXa_!!3332219368.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nLook at this picture and tell me what is written on it without outputting text coordinates.'}, {'from': 'gpt', 'value': 'All words in the image:\n绚丽舞台灯\n‘浪漫’舞动\n加厚\n30MM带绚丽舞台灯\n不影响楼下邻居——更舒适'}]}
25%|██▌ | 5634/22095 [9:30:06<23:02:39, 5.04s/it] {'loss': 0.3644, 'grad_norm': 0.6965699173224933, 'learning_rate': 8.730791139138546e-06, 'epoch': 0.25}
Token indices sequence length is longer than the specified maximum sequence length for this model (44096 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64341 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103025 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5635/22095 [9:30:09<20:13:36, 4.42s/it] {'loss': 0.3773, 'grad_norm': 0.8260440813243546, 'learning_rate': 8.730303143219597e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880123 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3276, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
26%|██▌ | 5636/22095 [9:30:12<18:58:11, 4.15s/it] {'loss': 0.383, 'grad_norm': 0.6660711629700041, 'learning_rate': 8.729815067147888e-06, 'epoch': 0.26}
26%|██▌ | 5637/22095 [9:30:16<18:01:26, 3.94s/it] {'loss': 0.3716, 'grad_norm': 0.736406331574392, 'learning_rate': 8.729326910933911e-06, 'epoch': 0.26}
26%|██▌ | 5638/22095 [9:30:19<17:32:58, 3.84s/it] {'loss': 0.3965, 'grad_norm': 0.709712953761439, 'learning_rate': 8.728838674588151e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5639/22095 [9:30:23<16:51:37, 3.69s/it] {'loss': 0.3537, 'grad_norm': 0.6271166899069629, 'learning_rate': 8.728350358121101e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (64678 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51652 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5640/22095 [9:30:26<15:48:31, 3.46s/it] {'loss': 0.3755, 'grad_norm': 0.6697285911368892, 'learning_rate': 8.727861961543253e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (45441 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5641/22095 [9:30:30<16:32:51, 3.62s/it] {'loss': 0.4126, 'grad_norm': 0.6809780054179385, 'learning_rate': 8.7273734848651e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302864 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1PQYqggoQMeJjy0FnXXb8gFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nIdentify all the text content in the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n2.1米加固加大+2球+打气筒\nBASNCTOALL\n底座加大,\n篮板加大\n管加粗'}]}
26%|██▌ | 5642/22095 [9:30:33<16:18:42, 3.57s/it] {'loss': 0.3632, 'grad_norm': 0.7967430252315837, 'learning_rate': 8.726884928097138e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5643/22095 [9:30:43<24:49:31, 5.43s/it] {'loss': 0.5181, 'grad_norm': 1.381867297092667, 'learning_rate': 8.726396291249866e-06, 'epoch': 0.26}
26%|██▌ | 5644/22095 [9:30:49<25:35:51, 5.60s/it] {'loss': 0.3891, 'grad_norm': 0.7415298912300912, 'learning_rate': 8.725907574333783e-06, 'epoch': 0.26}
26%|██▌ | 5645/22095 [9:30:52<22:26:33, 4.91s/it] {'loss': 0.4035, 'grad_norm': 0.7454140763091222, 'learning_rate': 8.725418777359389e-06, 'epoch': 0.26}
26%|██▌ | 5646/22095 [9:30:55<19:28:14, 4.26s/it] {'loss': 0.3777, 'grad_norm': 0.7945699973069527, 'learning_rate': 8.724929900337186e-06, 'epoch': 0.26}
26%|██▌ | 5647/22095 [9:30:59<18:34:11, 4.06s/it] {'loss': 0.3635, 'grad_norm': 0.7909815852970442, 'learning_rate': 8.724440943277681e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5648/22095 [9:31:08<26:04:40, 5.71s/it] {'loss': 0.4747, 'grad_norm': 0.6718420734648589, 'learning_rate': 8.723951906191377e-06, 'epoch': 0.26}
26%|██▌ | 5649/22095 [9:31:17<29:58:00, 6.56s/it] {'loss': 0.5177, 'grad_norm': 0.7038507433457524, 'learning_rate': 8.723462789088785e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 364, but got module 1
26%|██▌ | 5650/22095 [9:31:20<25:30:00, 5.58s/it] {'loss': 0.3444, 'grad_norm': 0.6407362587147867, 'learning_rate': 8.722973591980414e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5651/22095 [9:31:24<23:26:55, 5.13s/it] {'loss': 0.3938, 'grad_norm': 0.7535583737045262, 'learning_rate': 8.722484314876776e-06, 'epoch': 0.26}
26%|██▌ | 5652/22095 [9:31:28<22:04:11, 4.83s/it] {'loss': 0.345, 'grad_norm': 0.6430207864370134, 'learning_rate': 8.72199495778838e-06, 'epoch': 0.26}
26%|██▌ | 5653/22095 [9:31:31<19:24:51, 4.25s/it] {'loss': 0.3648, 'grad_norm': 0.6251897210396002, 'learning_rate': 8.721505520725745e-06, 'epoch': 0.26}
26%|██▌ | 5654/22095 [9:31:35<18:14:34, 3.99s/it] {'loss': 0.3697, 'grad_norm': 0.6296987375458352, 'learning_rate': 8.721016003699385e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307556 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB25PgKirsTMeJjSszgXXacpFXa-1106900306.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the photo.'}, {'from': 'gpt', 'value': 'All words in the image:\n幼师推荐\nshi\n士\ntian\ng\nri\n工\nm\nTwenty-one\n日\nya\n日子\nThree\n牙\n21\n3\n牙齿\n乘法运算\n3x7=21\n3像耳朵\n1x3=3\n24\n乘法运算\n3x8=24\n买1送3\n送沙画+收纳袋+故事书'}]}
26%|██▌ | 5655/22095 [9:31:37<16:46:55, 3.67s/it] {'loss': 0.3498, 'grad_norm': 0.6776477921529509, 'learning_rate': 8.72052640671982e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5656/22095 [9:31:42<17:42:26, 3.88s/it] {'loss': 0.3576, 'grad_norm': 0.5898537695546245, 'learning_rate': 8.72003672979757e-06, 'epoch': 0.26}
26%|██▌ | 5657/22095 [9:31:45<17:23:16, 3.81s/it] {'loss': 0.3831, 'grad_norm': 0.6294189740361547, 'learning_rate': 8.719546972943156e-06, 'epoch': 0.26}
26%|██▌ | 5658/22095 [9:31:49<17:15:33, 3.78s/it] {'loss': 0.3972, 'grad_norm': 0.6665099049361219, 'learning_rate': 8.719057136167099e-06, 'epoch': 0.26}
26%|██▌ | 5659/22095 [9:31:52<16:00:42, 3.51s/it] {'loss': 0.3794, 'grad_norm': 0.660593059607384, 'learning_rate': 8.71856721947993e-06, 'epoch': 0.26}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (100560000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
26%|██▌ | 5660/22095 [9:31:56<16:17:11, 3.57s/it] {'loss': 0.4129, 'grad_norm': 0.7154169972792961, 'learning_rate': 8.718077222892169e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (52969 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110418 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100332 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5661/22095 [9:31:59<15:29:23, 3.39s/it] {'loss': 0.4019, 'grad_norm': 0.6645475490795022, 'learning_rate': 8.717587146414348e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Token indices sequence length is longer than the specified maximum sequence length for this model (83471 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5662/22095 [9:32:03<16:30:40, 3.62s/it] {'loss': 0.3973, 'grad_norm': 0.6548301565215484, 'learning_rate': 8.717096990056999e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (63468 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5663/22095 [9:32:06<16:07:20, 3.53s/it] {'loss': 0.4008, 'grad_norm': 0.779726610983311, 'learning_rate': 8.71660675383065e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (140754 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5664/22095 [9:32:10<16:23:28, 3.59s/it] {'loss': 0.4211, 'grad_norm': 0.6523134947123276, 'learning_rate': 8.716116437745836e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (53951 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100855 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5665/22095 [9:32:14<16:33:49, 3.63s/it] {'loss': 0.4106, 'grad_norm': 0.6608811987550522, 'learning_rate': 8.715626041813095e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (71969 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50813 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5666/22095 [9:32:17<15:31:53, 3.40s/it] {'loss': 0.382, 'grad_norm': 0.6898017659880898, 'learning_rate': 8.71513556604296e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5667/22095 [9:32:20<15:09:08, 3.32s/it] {'loss': 0.4066, 'grad_norm': 0.723160907659878, 'learning_rate': 8.714645010445974e-06, 'epoch': 0.26}
26%|██▌ | 5668/22095 [9:32:23<15:49:41, 3.47s/it] {'loss': 0.4093, 'grad_norm': 0.6429431624225951, 'learning_rate': 8.714154375032675e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5669/22095 [9:32:31<21:45:20, 4.77s/it] {'loss': 0.5187, 'grad_norm': 1.2113190533946177, 'learning_rate': 8.713663659813605e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5670/22095 [9:32:35<20:29:28, 4.49s/it] {'loss': 0.371, 'grad_norm': 0.6354341373225793, 'learning_rate': 8.713172864799309e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5671/22095 [9:32:42<23:38:47, 5.18s/it] {'loss': 0.5104, 'grad_norm': 0.45960048313751684, 'learning_rate': 8.712681990000332e-06, 'epoch': 0.26}
26%|██▌ | 5672/22095 [9:32:45<21:16:04, 4.66s/it] {'loss': 0.3442, 'grad_norm': 0.6761932077538576, 'learning_rate': 8.71219103542722e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5673/22095 [9:32:54<26:15:30, 5.76s/it] {'loss': 0.4999, 'grad_norm': 0.5932332779997048, 'learning_rate': 8.711700001090524e-06, 'epoch': 0.26}
26%|██▌ | 5674/22095 [9:33:03<31:25:19, 6.89s/it] {'loss': 0.4969, 'grad_norm': 0.6854025592794466, 'learning_rate': 8.711208887000797e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 364, but got module 1
26%|██▌ | 5675/22095 [9:33:07<27:34:54, 6.05s/it] {'loss': 0.3926, 'grad_norm': 0.7360965312204293, 'learning_rate': 8.710717693168588e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8303851 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1YiIdcJqUQKJjSZFIXXcOkFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nTell me all text in the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n包邮\n自动出油\n收货即盖\n光敏印章\n100%\n品质保证\n当天发货\n6点前订单'}]}
26%|██▌ | 5676/22095 [9:33:11<24:06:56, 5.29s/it] {'loss': 0.405, 'grad_norm': 0.7146450211894528, 'learning_rate': 8.710226419604453e-06, 'epoch': 0.26}
26%|██▌ | 5677/22095 [9:33:15<22:40:18, 4.97s/it] {'loss': 0.4143, 'grad_norm': 0.6804543109515563, 'learning_rate': 8.709735066318946e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5678/22095 [9:33:18<20:09:27, 4.42s/it] {'loss': 0.3447, 'grad_norm': 0.6413194057286674, 'learning_rate': 8.709243633322627e-06, 'epoch': 0.26}
26%|██▌ | 5679/22095 [9:33:21<17:56:34, 3.93s/it] {'loss': 0.3746, 'grad_norm': 0.6783568892967364, 'learning_rate': 8.708752120626054e-06, 'epoch': 0.26}
26%|██▌ | 5680/22095 [9:33:25<18:02:17, 3.96s/it] {'loss': 0.4018, 'grad_norm': 0.6505544536816305, 'learning_rate': 8.708260528239788e-06, 'epoch': 0.26}
26%|██▌ | 5681/22095 [9:33:28<16:59:54, 3.73s/it] {'loss': 0.3693, 'grad_norm': 0.742783726637506, 'learning_rate': 8.707768856174393e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5682/22095 [9:33:31<16:20:00, 3.58s/it] {'loss': 0.3636, 'grad_norm': 0.6665288886536155, 'learning_rate': 8.707277104440432e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5683/22095 [9:33:35<15:58:50, 3.51s/it] {'loss': 0.3495, 'grad_norm': 0.6467823550649642, 'learning_rate': 8.706785273048475e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (95211 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97628 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5684/22095 [9:33:38<15:25:47, 3.38s/it] {'loss': 0.3749, 'grad_norm': 0.605064740924202, 'learning_rate': 8.706293362009084e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (76018 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5685/22095 [9:33:41<15:07:41, 3.32s/it] {'loss': 0.3755, 'grad_norm': 0.7717223661878522, 'learning_rate': 8.705801371332832e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (100123 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5686/22095 [9:33:44<15:10:55, 3.33s/it] {'loss': 0.3562, 'grad_norm': 0.6282435336830396, 'learning_rate': 8.70530930103029e-06, 'epoch': 0.26}
26%|██▌ | 5687/22095 [9:33:48<16:13:48, 3.56s/it] {'loss': 0.3891, 'grad_norm': 0.7577896145589404, 'learning_rate': 8.704817151112033e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5688/22095 [9:33:52<16:05:20, 3.53s/it] {'loss': 0.4212, 'grad_norm': 0.7057241116379769, 'learning_rate': 8.704324921588631e-06, 'epoch': 0.26}
26%|██▌ | 5689/22095 [9:33:55<15:27:39, 3.39s/it] {'loss': 0.3825, 'grad_norm': 0.7065461234902365, 'learning_rate': 8.703832612470665e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [114, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8435646 in VC:s3://internvl-moe-sft-data/. Exception: Image size [114, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 123990, 'image': 'vrdu_texteq/astro-ph.CO/9f297134-6df2-4538-be21-a2aef7aaac5b.png', 'image_wh': [[114, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': '$k_1$ and $k_2$'}]}
26%|██▌ | 5690/22095 [9:33:59<15:42:44, 3.45s/it] {'loss': 0.3642, 'grad_norm': 0.7101016138777637, 'learning_rate': 8.703340223768713e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    raise ValueError(
ValueError: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None
[Try #0] Failed to fetch sample 1865783 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. Exception: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None
Problematic sample: {'image': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png', 'conversations': [], 'image_id': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'}
26%|██▌ | 5691/22095 [9:34:03<16:52:07, 3.70s/it] {'loss': 0.3474, 'grad_norm': 0.6020716234533079, 'learning_rate': 8.70284775549335e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (68772 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105059 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50949 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118960 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65713 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5692/22095 [9:34:12<24:42:40, 5.42s/it] {'loss': 0.5115, 'grad_norm': 0.7963449065128408, 'learning_rate': 8.702355207655164e-06, 'epoch': 0.26}
26%|██▌ | 5693/22095 [9:34:16<22:06:14, 4.85s/it] {'loss': 0.3854, 'grad_norm': 0.6741311830333333, 'learning_rate': 8.701862580264735e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48056 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118905 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71824 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5694/22095 [9:34:25<28:40:21, 6.29s/it] {'loss': 0.4829, 'grad_norm': 0.45541735171368786, 'learning_rate': 8.701369873332647e-06, 'epoch': 0.26}
26%|██▌ | 5695/22095 [9:34:29<24:47:50, 5.44s/it] {'loss': 0.4208, 'grad_norm': 0.6808063988576847, 'learning_rate': 8.70087708686949e-06, 'epoch': 0.26}
26%|██▌ | 5696/22095 [9:34:32<21:05:26, 4.63s/it] {'loss': 0.3507, 'grad_norm': 1.0002550438489555, 'learning_rate': 8.700384220885852e-06, 'epoch': 0.26}
26%|██▌ | 5697/22095 [9:34:36<20:18:23, 4.46s/it] {'loss': 0.4122, 'grad_norm': 0.6578641214454441, 'learning_rate': 8.699891275392319e-06, 'epoch': 0.26}
26%|██▌ | 5698/22095 [9:34:39<19:17:21, 4.24s/it] {'loss': 0.3911, 'grad_norm': 0.6321553880200829, 'learning_rate': 8.699398250399486e-06, 'epoch': 0.26}
26%|██▌ | 5699/22095 [9:34:42<17:33:23, 3.85s/it] {'loss': 0.3443, 'grad_norm': 0.6306184982690165, 'learning_rate': 8.698905145917948e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5700/22095 [9:34:52<25:54:24, 5.69s/it] {'loss': 0.4861, 'grad_norm': 0.9831445650283764, 'learning_rate': 8.6984119619583e-06, 'epoch': 0.26}
26%|██▌ | 5701/22095 [9:34:56<22:47:26, 5.00s/it] {'loss': 0.3793, 'grad_norm': 0.6193016097001647, 'learning_rate': 8.697918698531135e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [156, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8531976 in VC:s3://internvl-moe-sft-data/. Exception: Image size [156, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30130, 'image': 'vrdu_texteq/astro-ph.CO/3f2148a0-6826-402b-bbda-d94708ad25c2.png', 'image_wh': [[156, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'with $\\varepsilon_i \\in \\mathbb{Z}_2$.'}]}
26%|██▌ | 5702/22095 [9:35:06<29:25:30, 6.46s/it] {'loss': 0.4789, 'grad_norm': 0.5713133255025882, 'learning_rate': 8.697425355647055e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5703/22095 [9:35:09<25:33:59, 5.61s/it] {'loss': 0.3762, 'grad_norm': 0.6176027961816791, 'learning_rate': 8.696931933316661e-06, 'epoch': 0.26}
26%|██▌ | 5704/22095 [9:35:13<23:03:23, 5.06s/it] {'loss': 0.3791, 'grad_norm': 0.7087957945916381, 'learning_rate': 8.696438431550553e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (88283 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108412 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5705/22095 [9:35:16<20:17:10, 4.46s/it] {'loss': 0.3673, 'grad_norm': 0.6353374786867481, 'learning_rate': 8.695944850359337e-06, 'epoch': 0.26}
26%|██▌ | 5706/22095 [9:35:20<19:22:48, 4.26s/it] {'loss': 0.4057, 'grad_norm': 0.6955272929655746, 'learning_rate': 8.695451189753616e-06, 'epoch': 0.26}
26%|██▌ | 5707/22095 [9:35:23<18:14:09, 4.01s/it] {'loss': 0.3476, 'grad_norm': 0.6275498827078999, 'learning_rate': 8.694957449744e-06, 'epoch': 0.26}
26%|██▌ | 5708/22095 [9:35:26<16:55:00, 3.72s/it] {'loss': 0.3953, 'grad_norm': 0.6950507771068638, 'learning_rate': 8.694463630341094e-06, 'epoch': 0.26}
26%|██▌ | 5709/22095 [9:35:29<15:59:33, 3.51s/it] {'loss': 0.3723, 'grad_norm': 0.7994838251226956, 'learning_rate': 8.693969731555514e-06, 'epoch': 0.26}
26%|██▌ | 5710/22095 [9:35:32<15:25:46, 3.39s/it] {'loss': 0.373, 'grad_norm': 0.6387292929473521, 'learning_rate': 8.693475753397869e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047182 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]}
26%|██▌ | 5711/22095 [9:35:35<14:42:12, 3.23s/it] {'loss': 0.385, 'grad_norm': 0.6301470128141162, 'learning_rate': 8.692981695878772e-06, 'epoch': 0.26}
26%|██▌ | 5712/22095 [9:35:39<15:45:41, 3.46s/it] {'loss': 0.363, 'grad_norm': 0.5937941248355948, 'learning_rate': 8.692487559008843e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8301026 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1AYXGmh6I8KJjy0FgXXXXzVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n你能给我图片中所有的文本内容吗?'}, {'from': 'gpt', 'value': '图中所有文字:\nNEW\n进口材质\n精品标签\n高清打印强粘覆膜\n进口精品不伤机器\n强粘覆膜\n精品贴\n6\nTTe精品标签色带\nmm\n白底黑字\nTAPE\nTTe-211\n百佳办公\n直销旗舰店\nHttp://bjbg118.taobao.com'}]}
26%|██▌ | 5713/22095 [9:35:49<23:57:19, 5.26s/it] {'loss': 0.5158, 'grad_norm': 1.192506158297031, 'learning_rate': 8.691993342798698e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (55989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56931 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108422 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48158 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62968 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131536 > 40960).
Running this sequence through the model will result in indexing errors 26%|██▌ | 5714/22095 [9:35:52<21:10:20, 4.65s/it] {'loss': 0.3817, 'grad_norm': 0.6708971470570072, 'learning_rate': 8.691499047258952e-06, 'epoch': 0.26} 26%|██▌ | 5714/22095 [9:35:52<21:10:20, 4.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8450164 in VC:s3://internvl-moe-sft-data/. Exception: Image size [225, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 72739, 'image': 'vrdu_texteq/astro-ph.CO/7ee652a1-0a19-4582-bfcd-be1a9662ea77.png', 'image_wh': [[225, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'We consider $s=3$.'}]} 26%|██▌ | 5715/22095 [9:35:55<19:31:45, 4.29s/it] {'loss': 0.3691, 'grad_norm': 0.6282540225233022, 'learning_rate': 8.69100467240023e-06, 'epoch': 0.26} 26%|██▌ | 5715/22095 [9:35:55<19:31:45, 4.29s/it] 26%|██▌ | 5716/22095 [9:35:59<18:49:36, 4.14s/it] {'loss': 0.4013, 'grad_norm': 0.6998716434175581, 'learning_rate': 8.690510218233153e-06, 'epoch': 0.26} 26%|██▌ | 5716/22095 [9:35:59<18:49:36, 4.14s/it] 26%|██▌ | 5717/22095 [9:36:02<17:02:08, 3.74s/it] {'loss': 0.3527, 'grad_norm': 0.6701690489887676, 'learning_rate': 8.690015684768347e-06, 'epoch': 0.26} 26%|██▌ | 5717/22095 [9:36:02<17:02:08, 3.74s/it] 26%|██▌ | 5718/22095 [9:36:06<17:32:54, 3.86s/it] {'loss': 0.4015, 'grad_norm': 0.6245332757292525, 'learning_rate': 8.689521072016436e-06, 'epoch': 0.26} 26%|██▌ | 5718/22095 [9:36:06<17:32:54, 3.86s/it] 26%|██▌ | 5719/22095 [9:36:09<16:07:27, 
3.54s/it] {'loss': 0.393, 'grad_norm': 0.6928148401172459, 'learning_rate': 8.68902637998805e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5720/22095 [9:36:13<16:08:21, 3.55s/it] {'loss': 0.361, 'grad_norm': 0.6114044470861477, 'learning_rate': 8.688531608693817e-06, 'epoch': 0.26}
26%|██▌ | 5721/22095 [9:36:16<16:25:13, 3.61s/it] {'loss': 0.3705, 'grad_norm': 0.6257657653929006, 'learning_rate': 8.688036758144367e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5722/22095 [9:36:26<24:35:08, 5.41s/it] {'loss': 0.5148, 'grad_norm': 0.6996865989535014, 'learning_rate': 8.687541828350334e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (49950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42231 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85242 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5723/22095 [9:36:29<22:02:33, 4.85s/it] {'loss': 0.4001, 'grad_norm': 0.7563925845604034, 'learning_rate': 8.687046819322353e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408535 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10729, 'image': 'vrdu_table_final_2/astro-ph.CO/f64e855e-754d-4d26-aee4-d55d1154a9f1.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
26%|██▌ | 5724/22095 [9:36:33<19:56:45, 4.39s/it] {'loss': 0.412, 'grad_norm': 0.6674974207955788, 'learning_rate': 8.68655173107106e-06, 'epoch': 0.26}
26%|██▌ | 5725/22095 [9:36:37<19:06:00, 4.20s/it] {'loss': 0.3571, 'grad_norm': 0.6705895639170104, 'learning_rate': 8.686056563607093e-06, 'epoch': 0.26}
26%|██▌ | 5726/22095 [9:36:40<17:58:11, 3.95s/it] {'loss': 0.3844, 'grad_norm': 0.7363012843803127, 'learning_rate': 8.685561316941091e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5727/22095 [9:36:43<16:54:45, 3.72s/it] {'loss': 0.3841, 'grad_norm': 0.6487455970498142, 'learning_rate': 8.685065991083695e-06, 'epoch': 0.26}
26%|██▌ | 5728/22095 [9:36:46<16:18:26, 3.59s/it] {'loss': 0.3753, 'grad_norm': 0.6936695497920785, 'learning_rate': 8.68457058604555e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5729/22095 [9:36:56<24:20:00, 5.35s/it] {'loss': 0.4773, 'grad_norm': 0.4468251185731199, 'learning_rate': 8.684075101837298e-06, 'epoch': 0.26}
26%|██▌ | 5730/22095 [9:36:59<21:25:14, 4.71s/it] {'loss': 0.3659, 'grad_norm': 0.6467029106064753, 'learning_rate': 8.683579538469587e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5731/22095 [9:37:05<22:34:30, 4.97s/it] {'loss': 0.4814, 'grad_norm': 0.38998358102716185, 'learning_rate': 8.683083895953066e-06, 'epoch': 0.26}
26%|██▌ | 5732/22095 [9:37:08<20:25:00, 4.49s/it] {'loss': 0.4224, 'grad_norm': 0.7080368874591275, 'learning_rate': 8.682588174298384e-06, 'epoch': 0.26}
26%|██▌ | 5733/22095 [9:37:11<18:38:22, 4.10s/it] {'loss': 0.3714, 'grad_norm': 0.6507574628339656, 'learning_rate': 8.68209237351619e-06, 'epoch': 0.26}
26%|██▌ | 5734/22095 [9:37:15<18:15:44, 4.02s/it] {'loss': 0.3706, 'grad_norm': 0.6255841463576619, 'learning_rate': 8.681596493617141e-06, 'epoch': 0.26}
26%|██▌ | 5735/22095 [9:37:18<17:10:06, 3.78s/it] {'loss': 0.4071, 'grad_norm': 0.7205900485067189, 'learning_rate': 8.681100534611891e-06, 'epoch': 0.26}
26%|██▌ | 5735/22095
[9:37:18<17:10:06, 3.78s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8942809 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65962, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知段AB=12,则将段AB延伸至点C,使BC=\\ frac{1}{2}AB,点D为段AC的中点,段BD的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3'}]}
26%|██▌ | 5736/22095 [9:37:22<16:35:06, 3.65s/it] {'loss': 0.375, 'grad_norm': 0.6450927790544251, 'learning_rate': 8.680604496511095e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 87, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350008 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 87, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16681, 'image': 'vrdu_table_final_2/astro-ph.CO/19a14f0f-82aa-4a39-acd0-931e29e97694.png', 'image_wh': [[14, 87]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}\n1\\tabularnewline\n1\\tabularnewline\n1\\tabularnewline\n\\end{tabular}\n```"}]}
26%|██▌ | 5737/22095 [9:37:25<15:37:52, 3.44s/it] {'loss': 0.3737, 'grad_norm': 0.6590018160806728, 'learning_rate': 8.680108379325413e-06, 'epoch': 0.26}
26%|██▌ | 5738/22095 [9:37:28<15:46:36, 3.47s/it] {'loss': 0.3884, 'grad_norm': 0.6418568583398381, 'learning_rate': 8.679612183065506e-06, 'epoch': 0.26}
26%|██▌ | 5739/22095 [9:37:31<14:57:11, 3.29s/it] {'loss': 0.3564, 'grad_norm': 0.6454949426628106, 'learning_rate': 8.679115907742032e-06, 'epoch': 0.26}
26%|██▌ | 5740/22095 [9:37:35<15:24:18, 3.39s/it] {'loss': 0.3742, 'grad_norm': 0.6739832666361274, 'learning_rate': 8.67861955336566e-06, 'epoch': 0.26}
26%|██▌ | 5741/22095 [9:37:37<14:32:46, 3.20s/it] {'loss': 0.3584, 'grad_norm': 0.6683636845230098, 'learning_rate': 8.678123119947049e-06, 'epoch': 0.26}
26%|██▌ | 5742/22095 [9:37:41<15:14:28, 3.36s/it] {'loss': 0.3648, 'grad_norm': 0.6559644343457217, 'learning_rate': 8.677626607496869e-06, 'epoch': 0.26}
26%|██▌ | 5743/22095 [9:37:44<14:35:52, 3.21s/it] {'loss': 0.3556, 'grad_norm': 0.6809757711271472, 'learning_rate': 8.677130016025788e-06, 'epoch': 0.26}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333771 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 380, 'image': 'vrdu_table_final_2/astro-ph.CO/36b81f93-b1b0-40b9-8a51-794740f362cf.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8394636 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61471, 'image': 'vrdu_table_final_2/astro-ph.EP/505f85fd-89c8-4bb3-8989-00340e7bb457.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
26%|██▌ | 5744/22095 [9:37:47<14:44:20, 3.25s/it] {'loss': 0.3886, 'grad_norm': 0.6666856861528969, 'learning_rate': 8.676633345544476e-06, 'epoch': 0.26}
26%|██▌ | 5745/22095 [9:37:51<15:21:28, 3.38s/it] {'loss': 0.3589, 'grad_norm': 0.6393280406948814, 'learning_rate': 8.676136596063607e-06, 'epoch': 0.26}
26%|██▌ | 5746/22095 [9:37:55<16:10:27, 3.56s/it] {'loss': 0.3631, 'grad_norm': 0.678176855017373, 'learning_rate': 8.675639767593851e-06, 'epoch': 0.26}
26%|██▌ | 5747/22095 [9:37:58<15:55:08, 3.51s/it] {'loss': 0.3838, 'grad_norm': 0.6610446597238993, 'learning_rate': 8.675142860145887e-06, 'epoch': 0.26}
26%|██▌ | 5748/22095 [9:38:01<15:10:41, 3.34s/it] {'loss': 0.3889, 'grad_norm': 0.6162186768778396, 'learning_rate': 8.67464587373039e-06, 'epoch': 0.26}
26%|██▌ | 5749/22095 [9:38:05<15:12:55, 3.35s/it] {'loss': 0.4023, 'grad_norm': 0.6431669043412701, 'learning_rate': 8.674148808358038e-06, 'epoch': 0.26}
26%|██▌ | 5750/22095 [9:38:08<15:25:38, 3.40s/it] {'loss': 0.352, 'grad_norm': 0.6334447577441437, 'learning_rate': 8.673651664039513e-06, 'epoch': 0.26}
26%|██▌ | 5751/22095 [9:38:12<15:49:22, 3.49s/it] {'loss': 0.4121, 'grad_norm': 0.629109002626069, 'learning_rate': 8.673154440785496e-06, 'epoch': 0.26}
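The recurring "Image size [...] is too small. Minimum size is 28." errors above all come from the same validation in `_get_item`: samples whose shorter image side is below 28 px are rejected before preprocessing. A minimal sketch of such a guard, assuming a `[width, height]` pair as logged in `image_wh` (this is an illustration, not the actual `data_qwen_2.py` code; `check_image_size` and `MIN_SIDE` are hypothetical names):

```python
# Hypothetical guard mirroring the logged errors: reject images whose
# shorter side is below the 28-px floor (one ViT patch side).
MIN_SIDE = 28

def check_image_size(image_wh):
    w, h = image_wh
    if min(w, h) < MIN_SIDE:
        # Matches the wording of the logged ValueError.
        raise ValueError(
            f"Image size {[w, h]} is too small. Minimum size is {MIN_SIDE}."
        )
    return True
```

In the log, the dataloader catches this error, reports the problematic sample, and retries with another index ("[Try #0] Failed to fetch sample ..."), so a single bad sample does not kill the run.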
26%|██▌ | 5752/22095 [9:38:16<16:21:26, 3.60s/it] {'loss': 0.403, 'grad_norm': 0.6882025240837115, 'learning_rate': 8.672657138606672e-06, 'epoch': 0.26}
26%|██▌ | 5753/22095 [9:38:19<16:08:50, 3.56s/it] {'loss': 0.369, 'grad_norm': 0.6620543132588419, 'learning_rate': 8.672159757513726e-06, 'epoch': 0.26}
26%|██▌ | 5754/22095 [9:38:23<16:59:03, 3.74s/it] {'loss': 0.4314, 'grad_norm': 0.6516401962062613, 'learning_rate': 8.671662297517344e-06, 'epoch': 0.26}
26%|██▌ | 5755/22095 [9:38:27<17:07:55, 3.77s/it] {'loss': 0.3493, 'grad_norm': 0.6667136130475712, 'learning_rate': 8.671164758628216e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (125335 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59720 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5756/22095 [9:38:30<15:47:49, 3.48s/it] {'loss': 0.3769, 'grad_norm': 0.6643676696578114, 'learning_rate': 8.670667140857034e-06, 'epoch': 0.26}
26%|██▌ | 5757/22095 [9:38:33<14:53:29, 3.28s/it] {'loss': 0.3735, 'grad_norm': 0.632758644728174, 'learning_rate': 8.670169444214487e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5758/22095 [9:38:37<15:39:37, 3.45s/it] {'loss': 0.3773, 'grad_norm': 0.630402510364806, 'learning_rate': 8.669671668711272e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (49394 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45789 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5759/22095 [9:38:40<15:51:56, 3.50s/it] {'loss': 0.3849, 'grad_norm': 0.6562395073335897, 'learning_rate': 8.669173814358082e-06, 'epoch': 0.26}
26%|██▌ | 5760/22095 [9:38:43<15:30:40, 3.42s/it] {'loss': 0.3595, 'grad_norm': 0.6379547262212517, 'learning_rate': 8.668675881165616e-06, 'epoch': 0.26}
26%|██▌ | 5761/22095 [9:38:47<15:51:04, 3.49s/it] {'loss': 0.3756, 'grad_norm': 0.6794966206353229, 'learning_rate': 8.668177869144574e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5762/22095 [9:38:50<15:14:38, 3.36s/it] {'loss': 0.3591, 'grad_norm': 0.6470626273490792, 'learning_rate': 8.667679778305654e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5763/22095 [9:38:53<14:31:18, 3.20s/it] {'loss': 0.3728, 'grad_norm': 0.6400645187021914, 'learning_rate': 8.66718160865956e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5764/22095 [9:39:02<22:10:28, 4.89s/it] {'loss': 0.4782, 'grad_norm': 0.7782560423349998, 'learning_rate': 8.666683360216998e-06, 'epoch': 0.26}
26%|██▌ | 5765/22095 [9:39:06<20:47:16, 4.58s/it] {'loss': 0.3924, 'grad_norm': 0.6331256763361964, 'learning_rate': 8.66618503298867e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (53223 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61278 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5766/22095 [9:39:09<18:29:46, 4.08s/it] {'loss': 0.4098, 'grad_norm': 0.6983330718539844, 'learning_rate': 8.665686626985286e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5767/22095 [9:39:18<25:23:55, 5.60s/it] {'loss': 0.4693, 'grad_norm': 0.347891864990072, 'learning_rate': 8.665188142217555e-06, 'epoch': 0.26}
26%|██▌ | 5768/22095 [9:39:21<22:42:40, 5.01s/it] {'loss': 0.3672, 'grad_norm': 0.7149931938736016, 'learning_rate': 8.664689578696188e-06, 'epoch': 0.26}
26%|██▌ | 5769/22095 [9:39:25<20:10:45, 4.45s/it] {'loss': 0.3695, 'grad_norm': 0.6324010359903846, 'learning_rate': 8.664190936431896e-06, 'epoch': 0.26}
26%|██▌ | 5770/22095 [9:39:28<18:30:23, 4.08s/it] {'loss': 0.3741, 'grad_norm': 0.6574088753449782, 'learning_rate': 8.663692215435396e-06, 'epoch': 0.26}
26%|██▌ | 5771/22095 [9:39:31<16:40:19, 3.68s/it] {'loss': 0.3801, 'grad_norm': 0.6924097149046233, 'learning_rate': 8.663193415717402e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5772/22095 [9:39:39<22:47:38, 5.03s/it] {'loss': 0.5228, 'grad_norm': 0.6212884502110326, 'learning_rate': 8.662694537288632e-06, 'epoch': 0.26}
{'loss': 0.3978, 'grad_norm': 0.7482647862220366, 'learning_rate': 8.662195580159804e-06, 'epoch': 0.26}
26%|██▌ |
5773/22095 [9:39:42<20:21:13, 4.49s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5774/22095 [9:39:47<20:52:00, 4.60s/it] {'loss': 0.4931, 'grad_norm': 0.5149044333270514, 'learning_rate': 8.661696544341642e-06, 'epoch': 0.26}
26%|██▌ | 5775/22095 [9:39:50<19:22:47, 4.27s/it] {'loss': 0.3388, 'grad_norm': 0.653471749140414, 'learning_rate': 8.661197429844868e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (49902 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5776/22095 [9:39:54<18:09:13, 4.00s/it] {'loss': 0.3696, 'grad_norm': 0.6625056719664779, 'learning_rate': 8.660698236680205e-06, 'epoch': 0.26}
26%|██▌ | 5777/22095 [9:39:57<17:13:05, 3.80s/it] {'loss': 0.4087, 'grad_norm': 0.6471294672905499, 'learning_rate': 8.66019896485838e-06, 'epoch': 0.26}
26%|██▌ | 5778/22095 [9:40:00<16:37:13, 3.67s/it] {'loss': 0.3858, 'grad_norm': 0.7361480408869571, 'learning_rate': 8.65969961439012e-06, 'epoch': 0.26}
26%|██▌ | 5779/22095 [9:40:04<17:13:08, 3.80s/it] {'loss': 0.3477, 'grad_norm': 0.649115216250859, 'learning_rate': 8.659200185286157e-06, 'epoch': 0.26}
26%|██▌ | 5780/22095 [9:40:07<16:05:40, 3.55s/it] {'loss': 0.4159, 'grad_norm': 0.6561882849580633, 'learning_rate': 8.658700677557217e-06, 'epoch': 0.26}
26%|██▌ | 5781/22095 [9:40:11<15:33:17, 3.43s/it] {'loss': 0.4287, 'grad_norm': 0.6753162434144271, 'learning_rate': 8.658201091214038e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (59543 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81220 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53989 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5782/22095 [9:40:14<15:08:40, 3.34s/it]
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44558 > 40960) for 4 sample(s). Truncating to 2951 with 2 samples.
{'loss': 0.4304, 'grad_norm': 0.6725502703479815, 'learning_rate': 8.657701426267355e-06, 'epoch': 0.26}
26%|██▌ | 5783/22095 [9:40:17<15:24:54, 3.40s/it] {'loss': 0.3641, 'grad_norm': 0.6815944476324566, 'learning_rate': 8.657201682727898e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5784/22095 [9:40:26<22:29:52, 4.97s/it] {'loss': 0.491, 'grad_norm': 0.5161456444441221, 'learning_rate': 8.656701860606412e-06, 'epoch': 0.26}
26%|██▌ | 5785/22095 [9:40:29<20:01:07, 4.42s/it] {'loss': 0.3502, 'grad_norm': 0.6652180348295492, 'learning_rate': 8.656201959913635e-06, 'epoch': 0.26}
26%|██▌ | 5786/22095 [9:40:33<19:03:19, 4.21s/it] {'loss': 0.3635, 'grad_norm': 0.612731572351153, 'learning_rate': 8.655701980660305e-06, 'epoch': 0.26}
26%|██▌ | 5787/22095 [9:40:36<17:25:29, 3.85s/it] {'loss': 0.357, 'grad_norm': 0.6252464211687456, 'learning_rate': 8.655201922857166e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5788/22095 [9:40:39<17:03:29, 3.77s/it] {'loss': 0.3842, 'grad_norm': 0.6850715997532323, 'learning_rate': 8.654701786514965e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (45943 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5789/22095 [9:40:42<15:56:26, 3.52s/it] {'loss': 0.3718, 'grad_norm': 0.6445923933478774, 'learning_rate': 8.654201571644447e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5790/22095 [9:40:52<24:00:55, 5.30s/it] {'loss': 0.4881, 'grad_norm': 0.4093785123197519, 'learning_rate': 8.653701278256362e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5791/22095 [9:40:55<21:08:59, 4.67s/it] {'loss': 0.3419, 'grad_norm': 0.6996517734015768, 'learning_rate': 8.653200906361454e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▌ | 5792/22095 [9:40:58<19:11:54, 4.24s/it] {'loss': 0.3778, 'grad_norm': 0.6759619747376816, 'learning_rate': 8.652700455970483e-06, 'epoch': 0.26}
26%|██▌ | 5793/22095 [9:41:02<18:37:30, 4.11s/it] {'loss': 0.3885, 'grad_norm': 0.6963571140515151, 'learning_rate': 8.652199927094194e-06, 'epoch': 0.26}
Token indices sequence length is
longer than the specified maximum sequence length for this model (49980 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5794/22095 [9:41:05<16:58:56, 3.75s/it] {'loss': 0.3478, 'grad_norm': 0.6076035876970179, 'learning_rate': 8.651699319743348e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (88004 > 40960). Running this sequence through the model will result in indexing errors
26%|██▌ | 5795/22095 [9:41:08<15:37:27, 3.45s/it] {'loss': 0.3873, 'grad_norm': 0.6776206762542706, 'learning_rate': 8.651198633928696e-06, 'epoch': 0.26}
26%|██▌ | 5796/22095 [9:41:11<15:38:14, 3.45s/it] {'loss': 0.3991, 'grad_norm': 0.6661315977828, 'learning_rate': 8.650697869661002e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▌ | 5797/22095 [9:41:21<23:51:04, 5.27s/it] {'loss': 0.4936, 'grad_norm': 0.46358353209235925, 'learning_rate': 8.650197026951022e-06, 'epoch': 0.26}
26%|██▌ | 5798/22095 [9:41:25<22:13:31, 4.91s/it] {'loss': 0.4183, 'grad_norm': 0.6883901449463596, 'learning_rate': 8.649696105809518e-06, 'epoch': 0.26}
26%|██▌ | 5799/22095 [9:41:28<20:27:42, 4.52s/it] {'loss': 0.378, 'grad_norm': 0.5777003885885577, 'learning_rate': 8.649195106247256e-06, 'epoch': 0.26}
26%|██▋ | 5800/22095 [9:41:31<18:12:28, 4.02s/it] {'loss': 0.417, 'grad_norm': 0.6298595377891067, 'learning_rate': 8.648694028274998e-06, 'epoch': 0.26}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
26%|██▋ | 5801/22095 [9:41:34<17:09:12, 3.79s/it] {'loss': 0.4013, 'grad_norm': 0.6152758639766999, 'learning_rate': 8.64819287190351e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▋ | 5802/22095 [9:41:43<23:31:48, 5.20s/it] {'loss': 0.4892, 'grad_norm': 0.3157038254738025, 'learning_rate': 8.647691637143562e-06, 'epoch': 0.26}
26%|██▋ | 5803/22095 [9:41:52<29:08:48, 6.44s/it] {'loss': 0.488, 'grad_norm': 0.28796182851067176, 'learning_rate': 8.647190324005925e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (42118 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108307 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59319 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43070 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77463 > 40960). Running this sequence through the model will result in indexing errors
26%|██▋ | 5804/22095 [9:41:56<25:25:49, 5.62s/it] {'loss': 0.3802, 'grad_norm': 0.6537209460403814, 'learning_rate': 8.646688932501369e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (58195 > 40960). Running this sequence through the model will result in indexing errors
26%|██▋ | 5805/22095 [9:42:00<23:54:17, 5.28s/it] {'loss': 0.4316, 'grad_norm': 0.6960618753773958, 'learning_rate': 8.646187462640668e-06, 'epoch': 0.26}
26%|██▋ | 5806/22095 [9:42:04<22:13:47, 4.91s/it] {'loss': 0.4048, 'grad_norm': 0.6628038199492589, 'learning_rate': 8.645685914434596e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (43426 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67922 > 40960). Running this sequence through the model will result in indexing errors
26%|██▋ | 5807/22095 [9:42:08<19:47:41, 4.38s/it] {'loss': 0.3942, 'grad_norm': 0.701227250896986, 'learning_rate': 8.64518428789393e-06, 'epoch': 0.26}
26%|██▋ | 5808/22095 [9:42:12<19:10:27, 4.24s/it] {'loss': 0.3832, 'grad_norm': 0.6371205125596449, 'learning_rate': 8.644682583029452e-06, 'epoch': 0.26}
26%|██▋ | 5809/22095 [9:42:15<17:52:25, 3.95s/it] {'loss': 0.3707, 'grad_norm': 0.6289246242511318, 'learning_rate': 8.644180799851936e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (50105 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66934 > 40960). Running this sequence through the model will result in indexing errors
26%|██▋ | 5810/22095 [9:42:18<16:18:14, 3.60s/it] {'loss': 0.2983, 'grad_norm': 0.6210970208658598, 'learning_rate': 8.643678938372167e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (101535 > 40960). Running this sequence through the model will result in indexing errors
26%|██▋ | 5811/22095 [9:42:21<15:58:09, 3.53s/it] {'loss': 0.3708, 'grad_norm': 0.6778788158675642, 'learning_rate': 8.643176998600931e-06, 'epoch': 0.26}
Invalidate trace cache @ step 2: expected module 1, but got module 364
26%|██▋ | 5812/22095 [9:42:27<19:18:48, 4.27s/it] {'loss': 0.4846, 'grad_norm': 0.4103471001964465, 'learning_rate': 8.642674980549008e-06, 'epoch': 0.26}
Token indices sequence length is longer than the specified maximum sequence length for this model (66873 > 40960).
Running this sequence through the model will result in indexing errors 26%|██▋ | 5813/22095 [9:42:30<17:38:46, 3.90s/it] {'loss': 0.3729, 'grad_norm': 0.6314054288667231, 'learning_rate': 8.642172884227187e-06, 'epoch': 0.26} 26%|██▋ | 5813/22095 [9:42:30<17:38:46, 3.90s/it] 26%|██▋ | 5814/22095 [9:42:33<16:12:50, 3.59s/it] {'loss': 0.3556, 'grad_norm': 0.6871376003891854, 'learning_rate': 8.641670709646258e-06, 'epoch': 0.26} 26%|██▋ | 5814/22095 [9:42:33<16:12:50, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5815/22095 [9:42:41<22:39:17, 5.01s/it] {'loss': 0.4937, 'grad_norm': 0.32402773719708344, 'learning_rate': 8.64116845681701e-06, 'epoch': 0.26} 26%|██▋ | 5815/22095 [9:42:41<22:39:17, 5.01s/it] 26%|██▋ | 5816/22095 [9:42:45<21:22:18, 4.73s/it] {'loss': 0.3971, 'grad_norm': 0.6534754943829143, 'learning_rate': 8.640666125750234e-06, 'epoch': 0.26} 26%|██▋ | 5816/22095 [9:42:45<21:22:18, 4.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78769 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45380 > 40960). 
Running this sequence through the model will result in indexing errors 26%|██▋ | 5817/22095 [9:42:49<19:51:34, 4.39s/it] {'loss': 0.4009, 'grad_norm': 0.7017365191775312, 'learning_rate': 8.640163716456726e-06, 'epoch': 0.26} 26%|██▋ | 5817/22095 [9:42:49<19:51:34, 4.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5818/22095 [9:42:56<23:44:30, 5.25s/it] {'loss': 0.5117, 'grad_norm': 0.30537086162876564, 'learning_rate': 8.639661228947278e-06, 'epoch': 0.26} 26%|██▋ | 5818/22095 [9:42:56<23:44:30, 5.25s/it] 26%|██▋ | 5819/22095 [9:43:00<21:37:07, 4.78s/it] {'loss': 0.4687, 'grad_norm': 0.7499047532831177, 'learning_rate': 8.63915866323269e-06, 'epoch': 0.26} 26%|██▋ | 5819/22095 [9:43:00<21:37:07, 4.78s/it] 26%|██▋ | 5820/22095 [9:43:03<18:52:25, 4.17s/it] {'loss': 0.3724, 'grad_norm': 0.6707874277156979, 'learning_rate': 8.638656019323758e-06, 'epoch': 0.26} 26%|██▋ | 5820/22095 [9:43:03<18:52:25, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5821/22095 [9:43:12<26:23:07, 5.84s/it] {'loss': 0.4871, 'grad_norm': 0.3199935029667527, 'learning_rate': 8.638153297231282e-06, 'epoch': 0.26} 26%|██▋ | 5821/22095 [9:43:12<26:23:07, 5.84s/it] 26%|██▋ | 5822/22095 [9:43:15<22:48:24, 5.05s/it] {'loss': 0.35, 'grad_norm': 0.6652319046497731, 'learning_rate': 8.637650496966069e-06, 'epoch': 0.26} 26%|██▋ | 5822/22095 [9:43:15<22:48:24, 5.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44102 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97751 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86621 > 40960). 
Running this sequence through the model will result in indexing errors 26%|██▋ | 5823/22095 [9:43:19<21:03:12, 4.66s/it] {'loss': 0.4248, 'grad_norm': 0.6489623223735191, 'learning_rate': 8.637147618538918e-06, 'epoch': 0.26} 26%|██▋ | 5823/22095 [9:43:19<21:03:12, 4.66s/it] 26%|██▋ | 5824/22095 [9:43:22<18:56:53, 4.19s/it] {'loss': 0.3737, 'grad_norm': 0.6338461835366701, 'learning_rate': 8.636644661960634e-06, 'epoch': 0.26} 26%|██▋ | 5824/22095 [9:43:22<18:56:53, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5825/22095 [9:43:32<26:10:07, 5.79s/it] {'loss': 0.4618, 'grad_norm': 0.2895408176277085, 'learning_rate': 8.636141627242025e-06, 'epoch': 0.26} 26%|██▋ | 5825/22095 [9:43:32<26:10:07, 5.79s/it] 26%|██▋ | 5826/22095 [9:43:35<22:31:59, 4.99s/it] {'loss': 0.372, 'grad_norm': 0.6836727451279606, 'learning_rate': 8.6356385143939e-06, 'epoch': 0.26} 26%|██▋ | 5826/22095 [9:43:35<22:31:59, 4.99s/it] 26%|██▋ | 5827/22095 [9:43:38<20:21:07, 4.50s/it] {'loss': 0.4041, 'grad_norm': 0.6853694082506848, 'learning_rate': 8.635135323427072e-06, 'epoch': 0.26} 26%|██▋ | 5827/22095 [9:43:38<20:21:07, 4.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 26%|██▋ | 5828/22095 [9:43:41<18:26:41, 4.08s/it] {'loss': 0.3827, 'grad_norm': 0.5989493808594828, 'learning_rate': 8.634632054352347e-06, 'epoch': 0.26} 26%|██▋ | 5828/22095 [9:43:41<18:26:41, 4.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53939 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72437 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60097 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108993 > 40960). Running this sequence through the model will result in indexing errors 26%|██▋ | 5829/22095 [9:43:45<17:50:12, 3.95s/it] {'loss': 0.3936, 'grad_norm': 0.6143907957530291, 'learning_rate': 8.634128707180544e-06, 'epoch': 0.26} 26%|██▋ | 5829/22095 [9:43:45<17:50:12, 3.95s/it] 26%|██▋ | 5830/22095 [9:43:49<17:17:52, 3.83s/it] {'loss': 0.3784, 'grad_norm': 0.6213073360715494, 'learning_rate': 8.633625281922477e-06, 'epoch': 0.26} 26%|██▋ | 5830/22095 [9:43:49<17:17:52, 3.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (137538 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64772 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48318 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89596 > 40960). 
Running this sequence through the model will result in indexing errors 26%|██▋ | 5831/22095 [9:43:58<24:44:50, 5.48s/it] {'loss': 0.4935, 'grad_norm': 0.30410771789820507, 'learning_rate': 8.63312177858896e-06, 'epoch': 0.26} 26%|██▋ | 5831/22095 [9:43:58<24:44:50, 5.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366681 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33427, 'image': 'vrdu_table_final_2/astro-ph.CO/8102a342-7b09-4d04-b30e-94e05aedbfca.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```'}]} 26%|██▋ | 5832/22095 [9:44:01<22:01:07, 4.87s/it] {'loss': 0.4466, 'grad_norm': 0.6590420669823309, 'learning_rate': 8.632618197190817e-06, 'epoch': 0.26} 26%|██▋ | 5832/22095 [9:44:01<22:01:07, 4.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45220 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84959 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59783 > 40960). 
Running this sequence through the model will result in indexing errors 26%|██▋ | 5833/22095 [9:44:04<19:13:26, 4.26s/it] {'loss': 0.3941, 'grad_norm': 0.6639909004609018, 'learning_rate': 8.632114537738865e-06, 'epoch': 0.26} 26%|██▋ | 5833/22095 [9:44:04<19:13:26, 4.26s/it] 26%|██▋ | 5834/22095 [9:44:08<18:48:30, 4.16s/it] {'loss': 0.4041, 'grad_norm': 0.629068927113239, 'learning_rate': 8.631610800243926e-06, 'epoch': 0.26} 26%|██▋ | 5834/22095 [9:44:08<18:48:30, 4.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 26%|██▋ | 5835/22095 [9:44:12<17:52:17, 3.96s/it] {'loss': 0.3643, 'grad_norm': 0.6548830543697949, 'learning_rate': 8.631106984716824e-06, 'epoch': 0.26} 26%|██▋ | 5835/22095 [9:44:12<17:52:17, 3.96s/it] 26%|██▋ | 5836/22095 [9:44:15<16:43:01, 3.70s/it] {'loss': 0.3975, 'grad_norm': 0.6267647987383094, 'learning_rate': 8.630603091168385e-06, 'epoch': 0.26} 26%|██▋ | 5836/22095 [9:44:15<16:43:01, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5837/22095 [9:44:22<21:41:14, 4.80s/it] {'loss': 0.4905, 'grad_norm': 0.3366458561699845, 'learning_rate': 8.630099119609439e-06, 'epoch': 0.26} 26%|██▋ | 5837/22095 [9:44:22<21:41:14, 4.80s/it] 26%|██▋ | 5838/22095 [9:44:26<20:56:26, 4.64s/it] {'loss': 0.3584, 'grad_norm': 0.6485695922987192, 'learning_rate': 8.62959507005081e-06, 'epoch': 0.26} 26%|██▋ | 5838/22095 [9:44:26<20:56:26, 4.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5839/22095 [9:44:36<27:27:10, 6.08s/it] {'loss': 0.4906, 'grad_norm': 0.29037235688732815, 'learning_rate': 8.62909094250333e-06, 'epoch': 0.26} 26%|██▋ | 5839/22095 [9:44:36<27:27:10, 6.08s/it] 26%|██▋ | 5840/22095 [9:44:40<24:26:13, 5.41s/it] {'loss': 0.3736, 'grad_norm': 0.6792032775967267, 'learning_rate': 8.62858673697783e-06, 'epoch': 0.26} 26%|██▋ | 5840/22095 [9:44:40<24:26:13, 5.41s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 26%|██▋ | 5841/22095 [9:44:50<30:30:10, 6.76s/it] {'loss': 0.5212, 'grad_norm': 0.3204503110222696, 'learning_rate': 8.628082453485149e-06, 'epoch': 0.26} 26%|██▋ | 5841/22095 [9:44:50<30:30:10, 6.76s/it] 26%|██▋ | 5842/22095 [9:44:53<25:38:01, 5.68s/it] {'loss': 0.3985, 'grad_norm': 0.7177482170374119, 'learning_rate': 8.627578092036117e-06, 'epoch': 0.26} 26%|██▋ | 5842/22095 [9:44:53<25:38:01, 5.68s/it] 26%|██▋ | 5843/22095 [9:44:56<22:24:08, 4.96s/it] {'loss': 0.3961, 'grad_norm': 0.8012517579137678, 'learning_rate': 8.627073652641573e-06, 'epoch': 0.26} 26%|██▋ | 5843/22095 [9:44:56<22:24:08, 4.96s/it] 26%|██▋ | 5844/22095 [9:45:00<21:12:46, 4.70s/it] {'loss': 0.3912, 'grad_norm': 0.6032837985123936, 'learning_rate': 8.626569135312354e-06, 'epoch': 0.26} 26%|██▋ | 5844/22095 [9:45:00<21:12:46, 4.70s/it] 26%|██▋ | 5845/22095 [9:45:03<19:18:46, 4.28s/it] {'loss': 0.3977, 'grad_norm': 0.6511910379174815, 'learning_rate': 8.626064540059305e-06, 'epoch': 0.26} 26%|██▋ | 5845/22095 [9:45:03<19:18:46, 4.28s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8888456 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11609, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 
2cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,M是AB中点,∴BM=\\frac{1}{2}AB=5cm,又∵NB=2cm,∴MN=BM-BN=5-2=3cm.'}]} 26%|██▋ | 5846/22095 [9:45:07<17:46:50, 3.94s/it] {'loss': 0.3606, 'grad_norm': 0.6740604527846302, 'learning_rate': 8.625559866893265e-06, 'epoch': 0.26} 26%|██▋ | 5846/22095 [9:45:07<17:46:50, 3.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42096 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48099 > 40960). Running this sequence through the model will result in indexing errors 26%|██▋ | 5847/22095 [9:45:10<17:17:13, 3.83s/it] {'loss': 0.3666, 'grad_norm': 0.6388403733226904, 'learning_rate': 8.625055115825078e-06, 'epoch': 0.26} 26%|██▋ | 5847/22095 [9:45:10<17:17:13, 3.83s/it] 26%|██▋ | 5848/22095 [9:45:13<15:58:44, 3.54s/it] {'loss': 0.3362, 'grad_norm': 0.6245931039340854, 'learning_rate': 8.624550286865592e-06, 'epoch': 0.26} 26%|██▋ | 5848/22095 [9:45:13<15:58:44, 3.54s/it] 26%|██▋ | 5849/22095 [9:45:16<15:45:51, 3.49s/it] {'loss': 0.4211, 'grad_norm': 0.6749344299546478, 'learning_rate': 8.62404538002565e-06, 'epoch': 0.26} 26%|██▋ | 5849/22095 [9:45:16<15:45:51, 3.49s/it] 26%|██▋ | 5850/22095 [9:45:19<14:49:18, 3.28s/it] {'loss': 0.3206, 'grad_norm': 0.6031113265169693, 'learning_rate': 8.623540395316105e-06, 'epoch': 0.26} 26%|██▋ | 5850/22095 [9:45:19<14:49:18, 3.28s/it] 26%|██▋ | 5851/22095 [9:45:23<15:04:34, 3.34s/it] {'loss': 0.4087, 'grad_norm': 0.7118892079516408, 'learning_rate': 8.623035332747804e-06, 'epoch': 0.26} 26%|██▋ | 5851/22095 [9:45:23<15:04:34, 3.34s/it] 26%|██▋ | 5852/22095 [9:45:25<14:20:51, 3.18s/it] {'loss': 0.3765, 'grad_norm': 0.7077213249212451, 'learning_rate': 8.622530192331602e-06, 'epoch': 0.26} 26%|██▋ | 5852/22095 [9:45:25<14:20:51, 3.18s/it] 26%|██▋ | 5853/22095 [9:45:29<14:20:28, 3.18s/it] {'loss': 0.3482, 'grad_norm': 
0.7198794438308793, 'learning_rate': 8.622024974078354e-06, 'epoch': 0.26} 26%|██▋ | 5853/22095 [9:45:29<14:20:28, 3.18s/it] 26%|██▋ | 5854/22095 [9:45:32<15:17:32, 3.39s/it] {'loss': 0.3665, 'grad_norm': 0.778162946379442, 'learning_rate': 8.62151967799891e-06, 'epoch': 0.26} 26%|██▋ | 5854/22095 [9:45:33<15:17:32, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 26%|██▋ | 5855/22095 [9:45:40<20:19:18, 4.50s/it] {'loss': 0.4961, 'grad_norm': 0.533309837016235, 'learning_rate': 8.621014304104131e-06, 'epoch': 0.26} 26%|██▋ | 5855/22095 [9:45:40<20:19:18, 4.50s/it] 27%|██▋ | 5856/22095 [9:45:43<19:02:20, 4.22s/it] {'loss': 0.3803, 'grad_norm': 0.7134140281191794, 'learning_rate': 8.620508852404878e-06, 'epoch': 0.27} 27%|██▋ | 5856/22095 [9:45:43<19:02:20, 4.22s/it] 27%|██▋ | 5857/22095 [9:45:48<19:41:12, 4.36s/it] {'loss': 0.4084, 'grad_norm': 0.6597382720345764, 'learning_rate': 8.620003322912008e-06, 'epoch': 0.27} 27%|██▋ | 5857/22095 [9:45:48<19:41:12, 4.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45613 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65023 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52554 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46563 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. 
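The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings above, together with the rank-0 message "Truncating to 40960 with 1 samples", indicate that over-length tokenized conversations are being clipped to the 40960-token limit before reaching the model. A minimal illustrative sketch of that clipping step (not the actual `data_qwen_2.py` code; the function name and structure here are hypothetical):

```python
# Hedged sketch: clip over-length token-id sequences to the model's
# maximum sequence length, mirroring the "Truncating to 40960" behaviour
# reported in the log. Illustrative only; not the training code itself.

MAX_SEQ_LEN = 40960  # limit reported in the tokenizer warnings

def truncate_over_length(samples, max_len=MAX_SEQ_LEN):
    """Clip each token-id list to max_len; return clipped lists and a count."""
    clipped = 0
    out = []
    for ids in samples:
        if len(ids) > max_len:
            clipped += 1
            ids = ids[:max_len]
        out.append(ids)
    return out, clipped

# Toy lengths standing in for real tokenized conversations:
batch = [list(range(5)), list(range(8))]
trimmed, n_clipped = truncate_over_length(batch, max_len=6)
# trimmed[1] now holds 6 ids; n_clipped == 1
```

Running sequences past the limit without such truncation would produce the indexing errors the warnings threaten, since position embeddings only cover the first 40960 positions.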
27%|██▋ | 5858/22095 [9:45:51<18:21:42, 4.07s/it] {'loss': 0.381, 'grad_norm': 0.6589962606418719, 'learning_rate': 8.619497715636385e-06, 'epoch': 0.27} 27%|██▋ | 5858/22095 [9:45:51<18:21:42, 4.07s/it] 27%|██▋ | 5859/22095 [9:45:55<17:32:45, 3.89s/it] {'loss': 0.4202, 'grad_norm': 0.6724882253908735, 'learning_rate': 8.618992030588872e-06, 'epoch': 0.27} 27%|██▋ | 5859/22095 [9:45:55<17:32:45, 3.89s/it] 27%|██▋ | 5860/22095 [9:45:57<16:03:48, 3.56s/it] {'loss': 0.3905, 'grad_norm': 0.708051466390435, 'learning_rate': 8.618486267780334e-06, 'epoch': 0.27} 27%|██▋ | 5860/22095 [9:45:58<16:03:48, 3.56s/it] 27%|██▋ | 5861/22095 [9:46:01<15:58:53, 3.54s/it] {'loss': 0.3767, 'grad_norm': 0.6241104554817613, 'learning_rate': 8.617980427221641e-06, 'epoch': 0.27} 27%|██▋ | 5861/22095 [9:46:01<15:58:53, 3.54s/it] 27%|██▋ | 5862/22095 [9:46:05<16:17:44, 3.61s/it] {'loss': 0.3898, 'grad_norm': 0.6367150687469564, 'learning_rate': 8.617474508923662e-06, 'epoch': 0.27} 27%|██▋ | 5862/22095 [9:46:05<16:17:44, 3.61s/it] 27%|██▋ | 5863/22095 [9:46:08<15:56:57, 3.54s/it] {'loss': 0.4204, 'grad_norm': 0.6543179077247595, 'learning_rate': 8.616968512897264e-06, 'epoch': 0.27} 27%|██▋ | 5863/22095 [9:46:08<15:56:57, 3.54s/it] 27%|██▋ | 5864/22095 [9:46:12<16:37:55, 3.69s/it] {'loss': 0.4221, 'grad_norm': 0.6934113699895663, 'learning_rate': 8.61646243915332e-06, 'epoch': 0.27} 27%|██▋ | 5864/22095 [9:46:12<16:37:55, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5865/22095 [9:46:18<19:02:27, 4.22s/it] {'loss': 0.4663, 'grad_norm': 0.4148202704133109, 'learning_rate': 8.615956287702708e-06, 'epoch': 0.27} 27%|██▋ | 5865/22095 [9:46:18<19:02:27, 4.22s/it] 27%|██▋ | 5866/22095 [9:46:27<26:01:17, 5.77s/it] {'loss': 0.4917, 'grad_norm': 0.35037573430565927, 'learning_rate': 8.615450058556301e-06, 'epoch': 0.27} 27%|██▋ | 5866/22095 [9:46:27<26:01:17, 5.77s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 27%|██▋ | 
5867/22095 [9:46:30<22:49:00, 5.06s/it] {'loss': 0.4249, 'grad_norm': 0.7073144750498311, 'learning_rate': 8.614943751724973e-06, 'epoch': 0.27} 27%|██▋ | 5867/22095 [9:46:30<22:49:00, 5.06s/it] 27%|██▋ | 5868/22095 [9:46:34<20:22:06, 4.52s/it] {'loss': 0.3834, 'grad_norm': 0.7428661153855681, 'learning_rate': 8.614437367219609e-06, 'epoch': 0.27} 27%|██▋ | 5868/22095 [9:46:34<20:22:06, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48636 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54431 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5869/22095 [9:46:37<18:18:34, 4.06s/it] {'loss': 0.3994, 'grad_norm': 0.6768255011074285, 'learning_rate': 8.613930905051087e-06, 'epoch': 0.27} 27%|██▋ | 5869/22095 [9:46:37<18:18:34, 4.06s/it] 27%|██▋ | 5870/22095 [9:46:40<17:03:12, 3.78s/it] {'loss': 0.3717, 'grad_norm': 0.7575444906896478, 'learning_rate': 8.613424365230287e-06, 'epoch': 0.27} 27%|██▋ | 5870/22095 [9:46:40<17:03:12, 3.78s/it] 27%|██▋ | 5871/22095 [9:46:43<15:58:17, 3.54s/it] {'loss': 0.3623, 'grad_norm': 0.7548909935508007, 'learning_rate': 8.612917747768097e-06, 'epoch': 0.27} 27%|██▋ | 5871/22095 [9:46:43<15:58:17, 3.54s/it] 27%|██▋ | 5872/22095 [9:46:46<14:56:51, 3.32s/it] {'loss': 0.343, 'grad_norm': 0.6016701214142821, 'learning_rate': 8.6124110526754e-06, 'epoch': 0.27} 27%|██▋ | 5872/22095 [9:46:46<14:56:51, 3.32s/it] 27%|██▋ | 5873/22095 [9:46:49<15:09:50, 3.37s/it] {'loss': 0.3906, 'grad_norm': 0.679660680308673, 'learning_rate': 8.611904279963085e-06, 'epoch': 0.27} 27%|██▋ | 5873/22095 [9:46:49<15:09:50, 3.37s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = 
self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [106, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8401167 in VC:s3://internvl-moe-sft-data/. Exception: Image size [106, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3329, 'image': 'vrdu_table_final_2/astro-ph.CO/9a39d18f-09be-46a3-baca-71519a81fddb.png', 'image_wh': [[106, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}} Remarks \\end{tabular}\n```"}]} 27%|██▋ | 5874/22095 [9:46:53<15:31:55, 3.45s/it] {'loss': 0.4084, 'grad_norm': 0.6360930168064564, 'learning_rate': 8.61139742964204e-06, 'epoch': 0.27} 27%|██▋ | 5874/22095 [9:46:53<15:31:55, 3.45s/it] 27%|██▋ | 5875/22095 [9:46:57<16:00:53, 3.55s/it] {'loss': 0.4094, 'grad_norm': 0.6615358533938903, 'learning_rate': 8.610890501723155e-06, 'epoch': 0.27} 27%|██▋ | 5875/22095 [9:46:57<16:00:53, 3.55s/it] 27%|██▋ | 5876/22095 [9:47:00<15:31:31, 3.45s/it] {'loss': 0.3736, 'grad_norm': 0.7109395219743451, 'learning_rate': 8.610383496217323e-06, 'epoch': 0.27} 27%|██▋ | 5876/22095 [9:47:00<15:31:31, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5877/22095 [9:47:09<23:37:40, 5.24s/it] {'loss': 0.4766, 'grad_norm': 0.6837406970708032, 'learning_rate': 8.609876413135439e-06, 'epoch': 0.27} 27%|██▋ | 5877/22095 [9:47:09<23:37:40, 5.24s/it] 27%|██▋ | 5878/22095 [9:47:12<20:55:17, 4.64s/it] {'loss': 0.4239, 'grad_norm': 0.7336459536199746, 'learning_rate': 8.609369252488398e-06, 'epoch': 0.27} 27%|██▋ | 5878/22095 [9:47:12<20:55:17, 4.64s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047671 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 3\nB. 6\nC. 5\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 27%|██▋ | 5879/22095 [9:47:16<20:02:21, 4.45s/it] {'loss': 0.3776, 'grad_norm': 0.6437473058572298, 'learning_rate': 8.608862014287095e-06, 'epoch': 0.27} 27%|██▋ | 5879/22095 [9:47:16<20:02:21, 4.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5880/22095 [9:47:23<22:59:46, 5.11s/it] {'loss': 0.5048, 'grad_norm': 0.32222665867547856, 'learning_rate': 8.608354698542433e-06, 'epoch': 0.27} 27%|██▋ | 5880/22095 [9:47:23<22:59:46, 5.11s/it] 27%|██▋ | 5881/22095 [9:47:27<21:50:25, 4.85s/it] {'loss': 0.3886, 'grad_norm': 0.6907684265874435, 'learning_rate': 8.607847305265312e-06, 'epoch': 0.27} 27%|██▋ | 5881/22095 [9:47:27<21:50:25, 4.85s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 27%|██▋ | 5882/22095 [9:47:31<19:39:55, 4.37s/it] {'loss': 0.3937, 'grad_norm': 0.648226465706051, 'learning_rate': 8.607339834466632e-06, 'epoch': 0.27} 27%|██▋ | 5882/22095 [9:47:31<19:39:55, 4.37s/it] 27%|██▋ | 5883/22095 [9:47:34<17:58:36, 3.99s/it] {'loss': 0.3769, 'grad_norm': 0.6804105185275151, 'learning_rate': 
8.606832286157296e-06, 'epoch': 0.27} 27%|██▋ | 5883/22095 [9:47:34<17:58:36, 3.99s/it] 27%|██▋ | 5884/22095 [9:47:37<16:41:45, 3.71s/it] {'loss': 0.3824, 'grad_norm': 0.6633112138977015, 'learning_rate': 8.606324660348214e-06, 'epoch': 0.27} 27%|██▋ | 5884/22095 [9:47:37<16:41:45, 3.71s/it] 27%|██▋ | 5885/22095 [9:47:41<16:51:15, 3.74s/it] {'loss': 0.3202, 'grad_norm': 0.7789029630758616, 'learning_rate': 8.605816957050291e-06, 'epoch': 0.27} 27%|██▋ | 5885/22095 [9:47:41<16:51:15, 3.74s/it] 27%|██▋ | 5886/22095 [9:47:44<15:53:20, 3.53s/it] {'loss': 0.3525, 'grad_norm': 0.7116149857547136, 'learning_rate': 8.605309176274434e-06, 'epoch': 0.27} 27%|██▋ | 5886/22095 [9:47:44<15:53:20, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54879 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80446 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (142916 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5887/22095 [9:47:47<15:35:14, 3.46s/it] {'loss': 0.385, 'grad_norm': 0.7987998989658341, 'learning_rate': 8.604801318031556e-06, 'epoch': 0.27} 27%|██▋ | 5887/22095 [9:47:47<15:35:14, 3.46s/it] 27%|██▋ | 5888/22095 [9:47:50<15:20:26, 3.41s/it] {'loss': 0.3733, 'grad_norm': 0.6836694369333408, 'learning_rate': 8.604293382332572e-06, 'epoch': 0.27} 27%|██▋ | 5888/22095 [9:47:50<15:20:26, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41659 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54011 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53414 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57200 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5889/22095 [9:47:53<15:15:15, 3.39s/it] {'loss': 0.3588, 'grad_norm': 0.6679344237964411, 'learning_rate': 8.60378536918839e-06, 'epoch': 0.27} 27%|██▋ | 5889/22095 [9:47:53<15:15:15, 3.39s/it] 27%|██▋ | 5890/22095 [9:47:58<16:13:25, 3.60s/it] {'loss': 0.3992, 'grad_norm': 0.6560760582437417, 'learning_rate': 8.60327727860993e-06, 'epoch': 0.27} 27%|██▋ | 5890/22095 [9:47:58<16:13:25, 3.60s/it] 27%|██▋ | 5891/22095 [9:48:01<15:54:43, 3.54s/it] {'loss': 0.3676, 'grad_norm': 0.7049765744315053, 'learning_rate': 8.602769110608107e-06, 'epoch': 0.27} 27%|██▋ | 5891/22095 [9:48:01<15:54:43, 3.54s/it] 27%|██▋ | 5892/22095 [9:48:05<15:56:28, 3.54s/it] {'loss': 0.3812, 'grad_norm': 0.6276088466610943, 'learning_rate': 8.602260865193841e-06, 'epoch': 0.27} 27%|██▋ | 5892/22095 [9:48:05<15:56:28, 3.54s/it] 27%|██▋ | 5893/22095 [9:48:08<15:15:27, 3.39s/it] {'loss': 0.3714, 'grad_norm': 0.6116868774203824, 'learning_rate': 8.601752542378052e-06, 'epoch': 0.27} 27%|██▋ | 5893/22095 [9:48:08<15:15:27, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68886 > 40960). 
Running this sequence through the model will result in indexing errors 27%|██▋ | 5894/22095 [9:48:10<14:37:25, 3.25s/it] {'loss': 0.3809, 'grad_norm': 0.6801239626111731, 'learning_rate': 8.601244142171665e-06, 'epoch': 0.27} 27%|██▋ | 5894/22095 [9:48:10<14:37:25, 3.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5895/22095 [9:48:20<23:11:42, 5.15s/it] {'loss': 0.5027, 'grad_norm': 0.821498545744599, 'learning_rate': 8.6007356645856e-06, 'epoch': 0.27} 27%|██▋ | 5895/22095 [9:48:20<23:11:42, 5.15s/it] 27%|██▋ | 5896/22095 [9:48:24<20:54:58, 4.65s/it] {'loss': 0.3888, 'grad_norm': 0.6657517193075106, 'learning_rate': 8.600227109630785e-06, 'epoch': 0.27} 27%|██▋ | 5896/22095 [9:48:24<20:54:58, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5897/22095 [9:48:33<27:31:02, 6.12s/it] {'loss': 0.4869, 'grad_norm': 0.4859325881457738, 'learning_rate': 8.599718477318146e-06, 'epoch': 0.27} 27%|██▋ | 5897/22095 [9:48:33<27:31:02, 6.12s/it] 27%|██▋ | 5898/22095 [9:48:36<23:43:07, 5.27s/it] {'loss': 0.3747, 'grad_norm': 0.7389180211567423, 'learning_rate': 8.599209767658613e-06, 'epoch': 0.27} 27%|██▋ | 5898/22095 [9:48:36<23:43:07, 5.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58262 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109868 > 40960). 
Running this sequence through the model will result in indexing errors 27%|██▋ | 5899/22095 [9:48:39<20:35:53, 4.58s/it] {'loss': 0.3458, 'grad_norm': 0.6422154576545112, 'learning_rate': 8.598700980663116e-06, 'epoch': 0.27} 27%|██▋ | 5899/22095 [9:48:39<20:35:53, 4.58s/it] 27%|██▋ | 5900/22095 [9:48:42<18:38:16, 4.14s/it] {'loss': 0.38, 'grad_norm': 0.6237373016006862, 'learning_rate': 8.598192116342587e-06, 'epoch': 0.27} 27%|██▋ | 5900/22095 [9:48:42<18:38:16, 4.14s/it] 27%|██▋ | 5901/22095 [9:48:46<18:12:18, 4.05s/it] {'loss': 0.3297, 'grad_norm': 0.6284487296234135, 'learning_rate': 8.597683174707961e-06, 'epoch': 0.27} 27%|██▋ | 5901/22095 [9:48:46<18:12:18, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5902/22095 [9:48:54<23:20:03, 5.19s/it] {'loss': 0.5076, 'grad_norm': 0.8446235665355496, 'learning_rate': 8.597174155770174e-06, 'epoch': 0.27} 27%|██▋ | 5902/22095 [9:48:54<23:20:03, 5.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63695 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5903/22095 [9:48:58<21:20:41, 4.75s/it] {'loss': 0.4103, 'grad_norm': 0.7014310270863416, 'learning_rate': 8.596665059540161e-06, 'epoch': 0.27} 27%|██▋ | 5903/22095 [9:48:58<21:20:41, 4.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 27%|██▋ | 5904/22095 [9:49:01<18:43:11, 4.16s/it] {'loss': 0.4106, 'grad_norm': 0.698389379913561, 'learning_rate': 8.596155886028863e-06, 'epoch': 0.27} 27%|██▋ | 5904/22095 [9:49:01<18:43:11, 4.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (104755 > 40960). 
Running this sequence through the model will result in indexing errors
27%|██▋ | 5905/22095 [9:49:04<18:12:28, 4.05s/it] {'loss': 0.3526, 'grad_norm': 0.6508038142607682, 'learning_rate': 8.59564663524722e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914672 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37825, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 4\nB. 8\nC. 16\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
27%|██▋ | 5906/22095 [9:49:08<17:44:36, 3.95s/it] {'loss': 0.3979, 'grad_norm': 0.6590762924057729, 'learning_rate': 8.595137307206171e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8386822 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 53632, 'image': 'vrdu_table_final_2/astro-ph.CO/69202317-920e-4d7c-a56d-fa846b5310b6.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
27%|██▋ | 5907/22095 [9:49:18<25:40:06, 5.71s/it] {'loss': 0.486, 'grad_norm': 0.5143468902112852, 'learning_rate': 8.594627901916667e-06, 'epoch': 0.27}
27%|██▋ | 5908/22095 [9:49:22<22:56:22, 5.10s/it] {'loss': 0.3568, 'grad_norm': 0.6668621637376194, 'learning_rate': 8.594118419389648e-06, 'epoch': 0.27}
27%|██▋ | 5909/22095 [9:49:25<20:52:20, 4.64s/it] {'loss': 0.3406, 'grad_norm': 0.6586502935608585, 'learning_rate': 8.593608859636063e-06, 'epoch': 0.27}
27%|██▋ | 5910/22095 [9:49:29<19:06:44, 4.25s/it] {'loss': 0.4176, 'grad_norm': 0.7020550504466091, 'learning_rate': 8.593099222666859e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 5911/22095 [9:49:39<26:48:57, 5.97s/it] {'loss': 0.4711, 'grad_norm': 0.33072277187840926, 'learning_rate': 8.592589508492989e-06, 'epoch': 0.27}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 5912/22095 [9:49:42<24:07:14, 5.37s/it] {'loss': 0.3952, 'grad_norm': 0.7552124560845879, 'learning_rate': 8.592079717125403e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ |
5913/22095 [9:49:51<27:41:55, 6.16s/it] {'loss': 0.4977, 'grad_norm': 0.4440559086132741, 'learning_rate': 8.591569848575058e-06, 'epoch': 0.27} 27%|██▋ | 5913/22095 [9:49:51<27:41:55, 6.16s/it] 27%|██▋ | 5914/22095 [9:49:54<24:19:43, 5.41s/it] {'loss': 0.4298, 'grad_norm': 0.6809038852834273, 'learning_rate': 8.591059902852907e-06, 'epoch': 0.27} 27%|██▋ | 5914/22095 [9:49:54<24:19:43, 5.41s/it] 27%|██▋ | 5915/22095 [9:49:58<21:58:53, 4.89s/it] {'loss': 0.3721, 'grad_norm': 0.6532468455961231, 'learning_rate': 8.590549879969907e-06, 'epoch': 0.27} 27%|██▋ | 5915/22095 [9:49:58<21:58:53, 4.89s/it] 27%|██▋ | 5916/22095 [9:50:01<20:00:47, 4.45s/it] {'loss': 0.3939, 'grad_norm': 0.6528557835064469, 'learning_rate': 8.590039779937019e-06, 'epoch': 0.27} 27%|██▋ | 5916/22095 [9:50:01<20:00:47, 4.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62939 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70855 > 40960). 
Running this sequence through the model will result in indexing errors 27%|██▋ | 5917/22095 [9:50:05<18:26:34, 4.10s/it] {'loss': 0.3531, 'grad_norm': 0.6577332862162955, 'learning_rate': 8.5895296027652e-06, 'epoch': 0.27} 27%|██▋ | 5917/22095 [9:50:05<18:26:34, 4.10s/it] 27%|██▋ | 5918/22095 [9:50:08<17:13:18, 3.83s/it] {'loss': 0.377, 'grad_norm': 0.6707199520170976, 'learning_rate': 8.589019348465416e-06, 'epoch': 0.27} 27%|██▋ | 5918/22095 [9:50:08<17:13:18, 3.83s/it] 27%|██▋ | 5919/22095 [9:50:11<16:50:30, 3.75s/it] {'loss': 0.385, 'grad_norm': 0.6661069685645835, 'learning_rate': 8.588509017048629e-06, 'epoch': 0.27} 27%|██▋ | 5919/22095 [9:50:11<16:50:30, 3.75s/it] 27%|██▋ | 5920/22095 [9:50:15<17:04:46, 3.80s/it] {'loss': 0.3797, 'grad_norm': 0.656977906811802, 'learning_rate': 8.587998608525806e-06, 'epoch': 0.27} 27%|██▋ | 5920/22095 [9:50:15<17:04:46, 3.80s/it] 27%|██▋ | 5921/22095 [9:50:19<17:32:30, 3.90s/it] {'loss': 0.3865, 'grad_norm': 0.6457510799865884, 'learning_rate': 8.58748812290791e-06, 'epoch': 0.27} 27%|██▋ | 5921/22095 [9:50:19<17:32:30, 3.90s/it] 27%|██▋ | 5922/22095 [9:50:23<17:41:43, 3.94s/it] {'loss': 0.3889, 'grad_norm': 0.6716714434913296, 'learning_rate': 8.586977560205914e-06, 'epoch': 0.27} 27%|██▋ | 5922/22095 [9:50:23<17:41:43, 3.94s/it] 27%|██▋ | 5923/22095 [9:50:27<17:02:28, 3.79s/it] {'loss': 0.4133, 'grad_norm': 0.6985273498781074, 'learning_rate': 8.586466920430785e-06, 'epoch': 0.27} 27%|██▋ | 5923/22095 [9:50:27<17:02:28, 3.79s/it] 27%|██▋ | 5924/22095 [9:50:30<16:20:28, 3.64s/it] {'loss': 0.3627, 'grad_norm': 0.6292758691095204, 'learning_rate': 8.585956203593497e-06, 'epoch': 0.27} 27%|██▋ | 5924/22095 [9:50:30<16:20:28, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5925/22095 [9:50:40<24:20:59, 5.42s/it] {'loss': 0.5169, 'grad_norm': 0.5394491971196063, 'learning_rate': 8.585445409705026e-06, 'epoch': 0.27} 27%|██▋ | 5925/22095 [9:50:40<24:20:59, 5.42s/it] 27%|██▋ | 
5926/22095 [9:50:44<22:34:19, 5.03s/it] {'loss': 0.3383, 'grad_norm': 0.6232604641935396, 'learning_rate': 8.584934538776342e-06, 'epoch': 0.27} 27%|██▋ | 5926/22095 [9:50:44<22:34:19, 5.03s/it] 27%|██▋ | 5927/22095 [9:50:47<20:09:13, 4.49s/it] {'loss': 0.3532, 'grad_norm': 0.6635149021073501, 'learning_rate': 8.584423590818427e-06, 'epoch': 0.27} 27%|██▋ | 5927/22095 [9:50:47<20:09:13, 4.49s/it] 27%|██▋ | 5928/22095 [9:50:51<19:02:05, 4.24s/it] {'loss': 0.3869, 'grad_norm': 0.9092934282618736, 'learning_rate': 8.583912565842258e-06, 'epoch': 0.27} 27%|██▋ | 5928/22095 [9:50:51<19:02:05, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62625 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80495 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42404 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71380 > 40960). 
Running this sequence through the model will result in indexing errors 27%|██▋ | 5929/22095 [9:50:55<18:29:20, 4.12s/it] {'loss': 0.3775, 'grad_norm': 0.6085165446418936, 'learning_rate': 8.583401463858814e-06, 'epoch': 0.27} 27%|██▋ | 5929/22095 [9:50:55<18:29:20, 4.12s/it] 27%|██▋ | 5930/22095 [9:50:57<16:46:35, 3.74s/it] {'loss': 0.3621, 'grad_norm': 0.6810081434074116, 'learning_rate': 8.582890284879077e-06, 'epoch': 0.27} 27%|██▋ | 5930/22095 [9:50:57<16:46:35, 3.74s/it] 27%|██▋ | 5931/22095 [9:51:02<17:37:42, 3.93s/it] {'loss': 0.4089, 'grad_norm': 0.6595823794825919, 'learning_rate': 8.582379028914034e-06, 'epoch': 0.27} 27%|██▋ | 5931/22095 [9:51:02<17:37:42, 3.93s/it] 27%|██▋ | 5932/22095 [9:51:05<16:44:00, 3.73s/it] {'loss': 0.4088, 'grad_norm': 0.7243005246090832, 'learning_rate': 8.581867695974667e-06, 'epoch': 0.27} 27%|██▋ | 5932/22095 [9:51:05<16:44:00, 3.73s/it] 27%|██▋ | 5933/22095 [9:51:08<15:33:40, 3.47s/it] {'loss': 0.3949, 'grad_norm': 0.6679784436223112, 'learning_rate': 8.581356286071964e-06, 'epoch': 0.27} 27%|██▋ | 5933/22095 [9:51:08<15:33:40, 3.47s/it] 27%|██▋ | 5934/22095 [9:51:11<15:19:40, 3.41s/it] {'loss': 0.375, 'grad_norm': 0.623973476384692, 'learning_rate': 8.580844799216914e-06, 'epoch': 0.27} 27%|██▋ | 5934/22095 [9:51:11<15:19:40, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (133632 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65084 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42737 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (124693 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50421 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5935/22095 [9:51:15<16:14:01, 3.62s/it] {'loss': 0.3691, 'grad_norm': 0.7071319896024497, 'learning_rate': 8.580333235420509e-06, 'epoch': 0.27} 27%|██▋ | 5935/22095 [9:51:15<16:14:01, 3.62s/it] 27%|██▋ | 5936/22095 [9:51:18<15:11:59, 3.39s/it] {'loss': 0.345, 'grad_norm': 0.6381944332927595, 'learning_rate': 8.579821594693736e-06, 'epoch': 0.27} 27%|██▋ | 5936/22095 [9:51:18<15:11:59, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5937/22095 [9:51:28<23:27:40, 5.23s/it] {'loss': 0.5103, 'grad_norm': 0.5723820412634509, 'learning_rate': 8.579309877047593e-06, 'epoch': 0.27} 27%|██▋ | 5937/22095 [9:51:28<23:27:40, 5.23s/it] 27%|██▋ | 5938/22095 [9:51:32<21:51:33, 4.87s/it] {'loss': 0.3309, 'grad_norm': 0.6204819925409865, 'learning_rate': 8.578798082493074e-06, 'epoch': 0.27} 27%|██▋ | 5938/22095 [9:51:32<21:51:33, 4.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 27%|██▋ | 5939/22095 [9:51:35<19:19:08, 4.30s/it] {'loss': 0.3869, 'grad_norm': 0.7162615258973855, 'learning_rate': 8.578286211041173e-06, 'epoch': 0.27} 27%|██▋ | 5939/22095 [9:51:35<19:19:08, 4.30s/it] 27%|██▋ | 5940/22095 [9:51:38<18:35:02, 4.14s/it] {'loss': 0.3934, 'grad_norm': 0.6812068319674739, 'learning_rate': 8.577774262702894e-06, 'epoch': 0.27} 27%|██▋ | 5940/22095 [9:51:38<18:35:02, 4.14s/it] 27%|██▋ | 5941/22095 [9:51:42<18:09:02, 4.04s/it] {'loss': 0.4178, 'grad_norm': 0.8801846873382326, 'learning_rate': 8.577262237489234e-06, 'epoch': 0.27} 27%|██▋ | 5941/22095 [9:51:42<18:09:02, 4.04s/it] 27%|██▋ | 5942/22095 [9:51:45<17:02:06, 3.80s/it] {'loss': 0.4163, 'grad_norm': 0.7154494503050958, 'learning_rate': 
8.576750135411194e-06, 'epoch': 0.27} 27%|██▋ | 5942/22095 [9:51:45<17:02:06, 3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5943/22095 [9:51:55<25:25:58, 5.67s/it] {'loss': 0.5022, 'grad_norm': 0.41689872782674725, 'learning_rate': 8.57623795647978e-06, 'epoch': 0.27} 27%|██▋ | 5943/22095 [9:51:55<25:25:58, 5.67s/it] 27%|██▋ | 5944/22095 [9:52:00<23:56:29, 5.34s/it] {'loss': 0.3541, 'grad_norm': 0.6760071606654183, 'learning_rate': 8.575725700705995e-06, 'epoch': 0.27} 27%|██▋ | 5944/22095 [9:52:00<23:56:29, 5.34s/it] 27%|██▋ | 5945/22095 [9:52:04<21:41:04, 4.83s/it] {'loss': 0.3645, 'grad_norm': 0.5954749930016797, 'learning_rate': 8.575213368100847e-06, 'epoch': 0.27} 27%|██▋ | 5945/22095 [9:52:04<21:41:04, 4.83s/it] 27%|██▋ | 5946/22095 [9:52:07<19:31:32, 4.35s/it] {'loss': 0.333, 'grad_norm': 0.6367933833228567, 'learning_rate': 8.574700958675345e-06, 'epoch': 0.27} 27%|██▋ | 5946/22095 [9:52:07<19:31:32, 4.35s/it] 27%|██▋ | 5947/22095 [9:52:11<18:43:12, 4.17s/it] {'loss': 0.3649, 'grad_norm': 0.7950089422983774, 'learning_rate': 8.574188472440497e-06, 'epoch': 0.27} 27%|██▋ | 5947/22095 [9:52:11<18:43:12, 4.17s/it] 27%|██▋ | 5948/22095 [9:52:14<18:02:48, 4.02s/it] {'loss': 0.3815, 'grad_norm': 0.7098041224453934, 'learning_rate': 8.573675909407316e-06, 'epoch': 0.27} 27%|██▋ | 5948/22095 [9:52:14<18:02:48, 4.02s/it] 27%|██▋ | 5949/22095 [9:52:18<17:39:40, 3.94s/it] {'loss': 0.3499, 'grad_norm': 0.6194752583100062, 'learning_rate': 8.573163269586818e-06, 'epoch': 0.27} 27%|██▋ | 5949/22095 [9:52:18<17:39:40, 3.94s/it] 27%|██▋ | 5950/22095 [9:52:21<16:36:07, 3.70s/it] {'loss': 0.3677, 'grad_norm': 0.613211866231783, 'learning_rate': 8.572650552990012e-06, 'epoch': 0.27} 27%|██▋ | 5950/22095 [9:52:21<16:36:07, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47459 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42400 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46461 > 40960) for 4 sample(s). Truncating to 4450 with 2 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (54113 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83020 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5951/22095 [9:52:24<15:23:40, 3.43s/it] {'loss': 0.3674, 'grad_norm': 0.7044084698902522, 'learning_rate': 8.572137759627919e-06, 'epoch': 0.27} 27%|██▋ | 5951/22095 [9:52:24<15:23:40, 3.43s/it] 27%|██▋ | 5952/22095 [9:52:28<15:26:19, 3.44s/it] {'loss': 0.3938, 'grad_norm': 0.6962715657146951, 'learning_rate': 8.571624889511558e-06, 'epoch': 0.27} 27%|██▋ | 5952/22095 [9:52:28<15:26:19, 3.44s/it] 27%|██▋ | 5953/22095 [9:52:31<15:50:34, 3.53s/it] {'loss': 0.4078, 'grad_norm': 0.637278621803258, 'learning_rate': 8.571111942651945e-06, 'epoch': 0.27} 27%|██▋ | 5953/22095 [9:52:31<15:50:34, 3.53s/it] 27%|██▋ | 5954/22095 [9:52:35<16:13:23, 3.62s/it] {'loss': 0.3697, 'grad_norm': 0.6978075288266782, 'learning_rate': 8.570598919060108e-06, 'epoch': 0.27} 27%|██▋ | 5954/22095 [9:52:35<16:13:23, 3.62s/it] 27%|██▋ | 5955/22095 [9:52:38<15:41:32, 3.50s/it] {'loss': 0.3471, 'grad_norm': 0.6322435094797664, 'learning_rate': 8.570085818747063e-06, 'epoch': 0.27} 27%|██▋ | 5955/22095 [9:52:38<15:41:32, 3.50s/it] 27%|██▋ | 5956/22095 [9:52:43<17:14:04, 3.84s/it] {'loss': 0.3857, 'grad_norm': 0.6994153624100978, 'learning_rate': 8.56957264172384e-06, 'epoch': 0.27} 27%|██▋ | 5956/22095 
[9:52:43<17:14:04, 3.84s/it] 27%|██▋ | 5957/22095 [9:52:47<17:26:26, 3.89s/it] {'loss': 0.3976, 'grad_norm': 0.6860845640803551, 'learning_rate': 8.569059388001463e-06, 'epoch': 0.27} 27%|██▋ | 5957/22095 [9:52:47<17:26:26, 3.89s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (139814040 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 27%|██▋ | 5958/22095 [9:52:50<16:16:25, 3.63s/it] {'loss': 0.3963, 'grad_norm': 0.776993539496546, 'learning_rate': 8.568546057590963e-06, 'epoch': 0.27} 27%|██▋ | 5958/22095 [9:52:50<16:16:25, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5959/22095 [9:53:00<24:12:13, 5.40s/it] {'loss': 0.5327, 'grad_norm': 0.41075113977310934, 'learning_rate': 8.568032650503366e-06, 'epoch': 0.27} 27%|██▋ | 5959/22095 [9:53:00<24:12:13, 5.40s/it] 27%|██▋ | 5960/22095 [9:53:03<21:19:58, 4.76s/it] {'loss': 0.3683, 'grad_norm': 0.6551096579982019, 'learning_rate': 8.567519166749707e-06, 'epoch': 0.27} 27%|██▋ | 5960/22095 [9:53:03<21:19:58, 4.76s/it] 27%|██▋ | 5961/22095 [9:53:06<18:58:47, 4.23s/it] {'loss': 0.3517, 'grad_norm': 0.9991521072753345, 'learning_rate': 8.567005606341019e-06, 'epoch': 0.27} 27%|██▋ | 5961/22095 [9:53:06<18:58:47, 4.23s/it] 27%|██▋ | 5962/22095 [9:53:09<17:00:43, 3.80s/it] {'loss': 0.4392, 'grad_norm': 0.6873912875023508, 'learning_rate': 8.566491969288333e-06, 'epoch': 0.27} 27%|██▋ | 5962/22095 [9:53:09<17:00:43, 3.80s/it] 27%|██▋ | 5963/22095 [9:53:12<16:09:06, 3.60s/it] {'loss': 0.3718, 'grad_norm': 0.6267614200958163, 'learning_rate': 8.565978255602692e-06, 'epoch': 0.27} 27%|██▋ | 5963/22095 [9:53:12<16:09:06, 3.60s/it] 27%|██▋ | 5964/22095 [9:53:16<16:47:01, 3.75s/it] {'loss': 0.442, 'grad_norm': 0.755594663681423, 'learning_rate': 8.565464465295128e-06, 'epoch': 0.27} 27%|██▋ | 5964/22095 [9:53:16<16:47:01, 
3.75s/it] 27%|██▋ | 5965/22095 [9:53:19<16:21:03, 3.65s/it] {'loss': 0.3679, 'grad_norm': 0.6180654968091859, 'learning_rate': 8.564950598376683e-06, 'epoch': 0.27} 27%|██▋ | 5965/22095 [9:53:19<16:21:03, 3.65s/it] 27%|██▋ | 5966/22095 [9:53:23<16:52:14, 3.77s/it] {'loss': 0.4397, 'grad_norm': 0.809800571361141, 'learning_rate': 8.5644366548584e-06, 'epoch': 0.27} 27%|██▋ | 5966/22095 [9:53:23<16:52:14, 3.77s/it] 27%|██▋ | 5967/22095 [9:53:26<15:57:51, 3.56s/it] {'loss': 0.3701, 'grad_norm': 0.6092547414226015, 'learning_rate': 8.563922634751318e-06, 'epoch': 0.27} 27%|██▋ | 5967/22095 [9:53:26<15:57:51, 3.56s/it] 27%|██▋ | 5968/22095 [9:53:29<14:49:02, 3.31s/it] {'loss': 0.392, 'grad_norm': 0.6731909504435205, 'learning_rate': 8.563408538066486e-06, 'epoch': 0.27} 27%|██▋ | 5968/22095 [9:53:29<14:49:02, 3.31s/it] 27%|██▋ | 5969/22095 [9:53:33<15:08:31, 3.38s/it] {'loss': 0.3732, 'grad_norm': 0.6775432593263921, 'learning_rate': 8.562894364814948e-06, 'epoch': 0.27} 27%|██▋ | 5969/22095 [9:53:33<15:08:31, 3.38s/it] 27%|██▋ | 5970/22095 [9:53:37<15:59:44, 3.57s/it] {'loss': 0.376, 'grad_norm': 0.6727938342148715, 'learning_rate': 8.562380115007753e-06, 'epoch': 0.27} 27%|██▋ | 5970/22095 [9:53:37<15:59:44, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922259 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 45412, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12cm'}]} 27%|██▋ | 5971/22095 [9:53:40<15:08:52, 3.38s/it] {'loss': 0.3576, 'grad_norm': 0.6533268203426031, 'learning_rate': 8.561865788655951e-06, 'epoch': 0.27} 27%|██▋ | 5971/22095 [9:53:40<15:08:52, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47617 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65146 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5972/22095 [9:53:43<15:52:21, 3.54s/it] {'loss': 0.34, 'grad_norm': 0.6259228660706317, 'learning_rate': 8.561351385770592e-06, 'epoch': 0.27} 27%|██▋ | 5972/22095 [9:53:43<15:52:21, 3.54s/it] 27%|██▋ | 5973/22095 [9:53:48<16:43:11, 3.73s/it] {'loss': 0.3913, 'grad_norm': 0.7074757571539375, 'learning_rate': 8.560836906362731e-06, 'epoch': 0.27} 27%|██▋ | 5973/22095 [9:53:48<16:43:11, 3.73s/it] 27%|██▋ | 5974/22095 [9:53:52<17:10:22, 3.83s/it] {'loss': 0.3668, 'grad_norm': 0.7318314444482122, 'learning_rate': 8.56032235044342e-06, 'epoch': 0.27} 27%|██▋ | 5974/22095 [9:53:52<17:10:22, 3.83s/it] 27%|██▋ | 5975/22095 [9:53:55<16:00:50, 3.58s/it] {'loss': 0.3639, 'grad_norm': 0.6757028464267878, 'learning_rate': 8.559807718023715e-06, 'epoch': 0.27} 27%|██▋ | 5975/22095 [9:53:55<16:00:50, 3.58s/it] 27%|██▋ | 5976/22095 [9:53:59<16:44:48, 3.74s/it] {'loss': 0.4166, 'grad_norm': 0.6430596171748171, 'learning_rate': 8.559293009114678e-06, 'epoch': 0.27} 27%|██▋ | 5976/22095 [9:53:59<16:44:48, 3.74s/it] 27%|██▋ | 5977/22095 [9:54:02<15:57:26, 3.56s/it] {'loss': 0.372, 'grad_norm': 0.6852484751594174, 'learning_rate': 
8.558778223727363e-06, 'epoch': 0.27} 27%|██▋ | 5977/22095 [9:54:02<15:57:26, 3.56s/it] 27%|██▋ | 5978/22095 [9:54:06<16:01:25, 3.58s/it] {'loss': 0.4165, 'grad_norm': 0.6088024575326358, 'learning_rate': 8.558263361872836e-06, 'epoch': 0.27} 27%|██▋ | 5978/22095 [9:54:06<16:01:25, 3.58s/it] 27%|██▋ | 5979/22095 [9:54:09<15:13:05, 3.40s/it] {'loss': 0.3354, 'grad_norm': 0.6452025606498345, 'learning_rate': 8.557748423562157e-06, 'epoch': 0.27} 27%|██▋ | 5979/22095 [9:54:09<15:13:05, 3.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880212 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3365, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 
8'}, {'from': 'gpt', 'value': '【解答】解:∵D为线段CB的中点,CD=3,∴BC=2CD=6,∴AC=AB-BC=5.'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 27%|██▋ | 5980/22095 [9:54:18<23:12:40, 5.19s/it] {'loss': 0.4937, 'grad_norm': 0.4629092068761617, 'learning_rate': 8.55723340880639e-06, 'epoch': 0.27} 27%|██▋ | 5980/22095 [9:54:18<23:12:40, 5.19s/it] 27%|██▋ | 5981/22095 [9:54:21<20:40:11, 4.62s/it] {'loss': 0.364, 'grad_norm': 0.650889937361659, 'learning_rate': 8.556718317616603e-06, 'epoch': 0.27} 27%|██▋ | 5981/22095 [9:54:21<20:40:11, 4.62s/it] 27%|██▋ | 5982/22095 [9:54:25<19:03:16, 4.26s/it] {'loss': 0.3798, 'grad_norm': 0.5910922166754542, 'learning_rate': 8.556203150003863e-06, 'epoch': 0.27} 27%|██▋ | 5982/22095 [9:54:25<19:03:16, 4.26s/it] 27%|██▋ | 5983/22095 [9:54:28<18:05:44, 4.04s/it] {'loss': 0.4274, 'grad_norm': 0.6773198923876816, 'learning_rate': 8.55568790597924e-06, 'epoch': 0.27} 27%|██▋ | 5983/22095 [9:54:28<18:05:44, 4.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45388 > 40960). Running this sequence through the model will result in indexing errors 27%|██▋ | 5984/22095 [9:54:31<16:37:15, 3.71s/it] {'loss': 0.3604, 'grad_norm': 0.6389749204912827, 'learning_rate': 8.555172585553804e-06, 'epoch': 0.27} 27%|██▋ | 5984/22095 [9:54:31<16:37:15, 3.71s/it] 27%|██▋ | 5985/22095 [9:54:34<15:38:11, 3.49s/it] {'loss': 0.4338, 'grad_norm': 0.6088735184267581, 'learning_rate': 8.55465718873863e-06, 'epoch': 0.27} 27%|██▋ | 5985/22095 [9:54:34<15:38:11, 3.49s/it] 27%|██▋ | 5986/22095 [9:54:37<15:27:58, 3.46s/it] {'loss': 0.3711, 'grad_norm': 0.6186470148336451, 'learning_rate': 8.554141715544788e-06, 'epoch': 0.27} 27%|██▋ | 5986/22095 [9:54:37<15:27:58, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51064 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52279 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43596 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 5987/22095 [9:54:41<14:56:34, 3.34s/it] {'loss': 0.3897, 'grad_norm': 0.6727668532608004, 'learning_rate': 8.553626165983355e-06, 'epoch': 0.27}
27%|██▋ | 5988/22095 [9:54:45<16:00:43, 3.58s/it] {'loss': 0.3587, 'grad_norm': 0.616134976164528, 'learning_rate': 8.553110540065412e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (51931 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94792 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105324 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55725 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52789 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72335 > 40960).
Running this sequence through the model will result in indexing errors
27%|██▋ | 5989/22095 [9:54:53<22:55:22, 5.12s/it] {'loss': 0.476, 'grad_norm': 0.42615089223040237, 'learning_rate': 8.552594837802035e-06, 'epoch': 0.27}
27%|██▋ | 5990/22095 [9:54:57<20:31:46, 4.59s/it] {'loss': 0.3496, 'grad_norm': 0.6261527251182383, 'learning_rate': 8.552079059204306e-06, 'epoch': 0.27}
27%|██▋ | 5991/22095 [9:55:00<18:46:31, 4.20s/it] {'loss': 0.3852, 'grad_norm': 1.1080483705521476, 'learning_rate': 8.551563204283308e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 5992/22095 [9:55:10<26:45:41, 5.98s/it] {'loss': 0.4988, 'grad_norm': 0.33901070493735247, 'learning_rate': 8.551047273050126e-06, 'epoch': 0.27}
27%|██▋ | 5993/22095 [9:55:14<23:21:07, 5.22s/it] {'loss': 0.3912, 'grad_norm': 0.6859078191706226, 'learning_rate': 8.550531265515842e-06, 'epoch': 0.27}
27%|██▋ | 5994/22095 [9:55:17<20:34:49, 4.60s/it] {'loss': 0.3891, 'grad_norm': 0.673173504588645, 'learning_rate': 8.550015181691546e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 5995/22095 [9:55:26<27:25:35, 6.13s/it] {'loss': 0.4939, 'grad_norm': 0.3069681794950536, 'learning_rate': 8.549499021588328e-06, 'epoch': 0.27}
27%|██▋ | 5996/22095 [9:55:31<24:47:42, 5.54s/it] {'loss': 0.3418, 'grad_norm': 0.6517678642102981, 'learning_rate': 8.548982785217277e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 5997/22095 [9:55:40<30:03:36, 6.72s/it] {'loss': 0.5133, 'grad_norm': 0.3253512792732969, 'learning_rate': 8.548466472589485e-06, 'epoch': 0.27}
27%|██▋ | 5998/22095 [9:55:43<25:28:35, 5.70s/it] {'loss': 0.3375, 'grad_norm': 0.6915729271927266, 'learning_rate': 8.547950083716047e-06, 'epoch': 0.27}
27%|██▋ | 5999/22095 [9:55:47<22:08:17, 4.95s/it] {'loss': 0.3902, 'grad_norm': 0.6576267492188453, 'learning_rate': 8.547433618608059e-06, 'epoch': 0.27}
27%|██▋ | 6000/22095 [9:55:50<19:49:33, 4.43s/it] {'loss': 0.3917, 'grad_norm': 0.6279246314213415, 'learning_rate': 8.546917077276618e-06, 'epoch': 0.27}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
27%|██▋ | 6001/22095 [9:56:24<60:14:31, 13.48s/it] {'loss': 0.401, 'grad_norm': 0.6221555956642787, 'learning_rate': 8.54640045973282e-06, 'epoch': 0.27}
27%|██▋ | 6002/22095 [9:56:28<46:21:15, 10.37s/it] {'loss': 0.3645, 'grad_norm': 0.6306321739741833, 'learning_rate': 8.54588376598777e-06, 'epoch': 0.27}
27%|██▋ | 6003/22095 [9:56:31<37:25:36, 8.37s/it] {'loss': 0.3908, 'grad_norm': 0.6262000288703357, 'learning_rate': 8.545366996052568e-06, 'epoch': 0.27}
27%|██▋ | 6004/22095 [9:56:35<30:41:59, 6.87s/it] {'loss': 0.4311, 'grad_norm': 0.666632379653395, 'learning_rate': 8.54485014993832e-06, 'epoch': 0.27}
27%|██▋ | 6005/22095 [9:56:38<25:29:10, 5.70s/it] {'loss': 0.3649, 'grad_norm': 0.6618339991231594, 'learning_rate': 8.544333227656126e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [570, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465423 in VC:s3://internvl-moe-sft-data/. Exception: Image size [570, 23, 100, 100] is too small. Minimum size is 28.
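The same ValueError recurs throughout this run: samples whose annotated image has a side shorter than 28 px are rejected inside `_get_item`, and the loader retries with another index. Those entries could instead be dropped once, up front. A minimal sketch, assuming the `image_wh` layout shown in the Problematic sample dumps; the helper names are hypothetical, not from `data_qwen_2.py`:

```python
# Hedged sketch, not the repo's code: pre-filter samples whose annotated
# image size would trip "Image size ... is too small. Minimum size is 28."
# `image_wh` mirrors the sample dicts dumped in this log; MIN_SIDE = 28
# matches the minimum the loader reports.
MIN_SIDE = 28

def is_large_enough(sample: dict) -> bool:
    """True if every annotated (width, height) pair is at least MIN_SIDE."""
    return all(min(w, h) >= MIN_SIDE for w, h in sample.get("image_wh", []))

def filter_samples(samples: list[dict]) -> list[dict]:
    """Drop samples that the dataset would reject at fetch time."""
    return [s for s in samples if is_large_enough(s)]
```

Run once over the annotation list before training; anything it removes is exactly what would otherwise surface as a `[Try #0] Failed to fetch sample ...` retry.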
Problematic sample: {'id': 151259, 'image': 'vrdu_texteq/astro-ph.CO/b2e5fd5c-f79f-40e9-9f40-843a918b6b46.png', 'image_wh': [[570, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'The $DR$ distribution can then be calculated as'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 6006/22095 [9:56:41<22:20:13, 5.00s/it] {'loss': 0.362, 'grad_norm': 0.6665292193157383, 'learning_rate': 8.543816229217099e-06, 'epoch': 0.27}
27%|██▋ | 6007/22095 [9:56:44<20:13:06, 4.52s/it] {'loss': 0.4236, 'grad_norm': 0.6743160090461385, 'learning_rate': 8.543299154632343e-06, 'epoch': 0.27}
27%|██▋ | 6008/22095 [9:56:48<18:36:13, 4.16s/it] {'loss': 0.3738, 'grad_norm': 0.6499545296448126, 'learning_rate': 8.542782003912973e-06, 'epoch': 0.27}
27%|██▋ | 6009/22095 [9:56:51<17:28:42, 3.91s/it] {'loss': 0.3653, 'grad_norm': 0.6244358131955613, 'learning_rate': 8.542264777070097e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (123300 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55460 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6010/22095 [9:56:54<16:08:09, 3.61s/it] {'loss': 0.3828, 'grad_norm': 0.6584235185675267, 'learning_rate': 8.54174747411483e-06, 'epoch': 0.27}
27%|██▋ | 6011/22095 [9:56:57<15:41:42, 3.51s/it] {'loss': 0.3765, 'grad_norm': 0.6079854139604739, 'learning_rate': 8.541230095058289e-06, 'epoch': 0.27}
27%|██▋ | 6012/22095 [9:57:00<15:04:38, 3.37s/it] {'loss': 0.3562, 'grad_norm': 0.6601104823424143, 'learning_rate': 8.540712639911588e-06, 'epoch': 0.27}
27%|██▋ | 6013/22095 [9:57:03<14:23:23, 3.22s/it] {'loss': 0.3878, 'grad_norm': 0.6408304883319524, 'learning_rate': 8.540195108685846e-06, 'epoch': 0.27}
27%|██▋ | 6014/22095 [9:57:07<14:56:49, 3.35s/it] {'loss': 0.3665, 'grad_norm': 0.6128296670508544, 'learning_rate': 8.539677501392187e-06, 'epoch': 0.27}
27%|██▋ | 6015/22095 [9:57:11<16:12:28, 3.63s/it] {'loss': 0.4219, 'grad_norm': 0.6746159565674552, 'learning_rate': 8.539159818041727e-06, 'epoch': 0.27}
27%|██▋ | 6016/22095 [9:57:14<15:51:56, 3.55s/it] {'loss': 0.3991, 'grad_norm': 0.6738103055188247, 'learning_rate': 8.538642058645595e-06, 'epoch': 0.27}
27%|██▋ | 6017/22095 [9:57:18<16:18:55, 3.65s/it] {'loss': 0.3372, 'grad_norm': 0.5836860257061833, 'learning_rate': 8.538124223214909e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 6018/22095 [9:57:29<25:56:37, 5.81s/it] {'loss': 0.4964, 'grad_norm': 0.49335442087411124, 'learning_rate': 8.537606311760804e-06, 'epoch': 0.27}
27%|██▋ | 6019/22095 [9:57:32<22:31:09, 5.04s/it] {'loss': 0.3229, 'grad_norm': 0.6693697621695167, 'learning_rate': 8.537088324294403e-06, 'epoch': 0.27}
27%|██▋ | 6020/22095 [9:57:38<22:35:05, 5.06s/it] {'loss': 0.371, 'grad_norm': 0.6527074234229424, 'learning_rate': 8.536570260826837e-06, 'epoch': 0.27}
27%|██▋ | 6021/22095 [9:57:42<21:12:44, 4.75s/it] {'loss': 0.3695, 'grad_norm': 0.6517147449836838, 'learning_rate': 8.536052121369238e-06, 'epoch': 0.27}
27%|██▋ | 6022/22095 [9:57:45<18:49:28, 4.22s/it] {'loss': 0.381, 'grad_norm': 0.7243834845664229, 'learning_rate': 8.535533905932739e-06, 'epoch': 0.27}
27%|██▋ | 6023/22095 [9:57:47<17:02:47, 3.82s/it] {'loss': 0.3708, 'grad_norm': 0.651690902641303, 'learning_rate': 8.535015614528475e-06, 'epoch': 0.27}
27%|██▋ | 6024/22095 [9:57:51<16:05:30, 3.60s/it] {'loss': 0.3404, 'grad_norm': 0.6353993471501393, 'learning_rate': 8.534497247167581e-06, 'epoch': 0.27}
27%|██▋ | 6025/22095 [9:57:54<15:22:33, 3.44s/it] {'loss': 0.4128, 'grad_norm': 0.6883967390953978, 'learning_rate': 8.533978803861199e-06, 'epoch': 0.27}
27%|██▋ | 6026/22095 [9:57:57<14:50:15, 3.32s/it] {'loss': 0.3335, 'grad_norm': 0.7210181559318217, 'learning_rate': 8.533460284620464e-06, 'epoch': 0.27}
27%|██▋ | 6027/22095 [9:58:00<15:28:50, 3.47s/it] {'loss': 0.3843, 'grad_norm': 0.6455031954425401, 'learning_rate': 8.532941689456521e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (56847 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6028/22095 [9:58:03<14:37:27, 3.28s/it] {'loss': 0.3658, 'grad_norm': 0.6338284754087283, 'learning_rate': 8.532423018380511e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [45, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358152 in VC:s3://internvl-moe-sft-data/. Exception: Image size [45, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24863, 'image': 'vrdu_table_final_2/astro-ph.CO/e5dabe45-336c-4a3f-9c29-a0782384d150.png', 'image_wh': [[45, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}$Y_{tot}$\\end{tabular}\n```"}]}
27%|██▋ | 6029/22095 [9:58:09<18:11:28, 4.08s/it] {'loss': 0.5052, 'grad_norm': 0.5224069999647922, 'learning_rate': 8.53190427140358e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (53552 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119252 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45398 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6030/22095 [9:58:13<17:12:28, 3.86s/it] {'loss': 0.347, 'grad_norm': 0.6258685222511478, 'learning_rate': 8.531385448536875e-06, 'epoch': 0.27}
27%|██▋ | 6031/22095 [9:58:15<15:54:15, 3.56s/it] {'loss': 0.4118, 'grad_norm': 0.6682531378852933, 'learning_rate': 8.53086654979154e-06, 'epoch': 0.27}
27%|██▋ | 6032/22095 [9:58:19<15:58:19, 3.58s/it] {'loss': 0.3551, 'grad_norm': 0.6144327014064551, 'learning_rate': 8.530347575178728e-06, 'epoch': 0.27}
27%|██▋ | 6033/22095 [9:58:22<15:01:52, 3.37s/it] {'loss': 0.3418, 'grad_norm': 0.6289075987124088, 'learning_rate': 8.52982852470959e-06, 'epoch': 0.27}
27%|██▋ | 6034/22095 [9:58:26<15:30:57, 3.48s/it] {'loss': 0.3926, 'grad_norm': 0.645673471979479, 'learning_rate': 8.529309398395275e-06, 'epoch': 0.27}
27%|██▋ | 6035/22095 [9:58:30<16:22:56, 3.67s/it] {'loss': 0.4096, 'grad_norm': 0.690713840540433, 'learning_rate': 8.528790196246944e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 6036/22095 [9:58:39<23:51:54, 5.35s/it] {'loss': 0.4982, 'grad_norm': 0.48055901693607467, 'learning_rate': 8.528270918275749e-06, 'epoch': 0.27}
27%|██▋ | 6037/22095 [9:58:42<20:47:25, 4.66s/it] {'loss': 0.3755, 'grad_norm': 0.6483839043555008, 'learning_rate': 8.527751564492847e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950720 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1555, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 3\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 6038/22095 [9:58:48<22:02:58, 4.94s/it] {'loss': 0.4978, 'grad_norm': 0.33294770241593863, 'learning_rate': 8.527232134909398e-06, 'epoch': 0.27}
27%|██▋ | 6039/22095 [9:58:51<20:16:00, 4.54s/it] {'loss': 0.3873, 'grad_norm': 0.6955806026366336, 'learning_rate': 8.526712629536566e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8942806 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65959, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知段AB=12,则将段AB延伸至点C,使BC=\\ frac{1}{2}AB,点D为段AC的中点,段BD的长度为()\nA. 4\nB. 5\nC. 6\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
27%|██▋ | 6040/22095 [9:58:54<18:21:49, 4.12s/it] {'loss': 0.3398, 'grad_norm': 0.7046916383301215, 'learning_rate': 8.52619304838551e-06, 'epoch': 0.27}
27%|██▋ | 6041/22095 [9:58:58<17:43:31, 3.97s/it] {'loss': 0.3614, 'grad_norm': 0.6306151655838789, 'learning_rate': 8.525673391467395e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (47631 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58450 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6042/22095 [9:59:01<16:20:47, 3.67s/it] {'loss': 0.4099, 'grad_norm': 0.6994791751162374, 'learning_rate': 8.525153658793386e-06, 'epoch': 0.27}
27%|██▋ | 6043/22095 [9:59:04<16:02:05, 3.60s/it] {'loss': 0.3621, 'grad_norm': 0.6159324117192329, 'learning_rate': 8.524633850374653e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (49875 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97613 > 40960).
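The tokenizer warnings above fire when a packed conversation tokenizes to more than the model's 40960-token context; as the message says, feeding such a sequence through unclipped would cause indexing errors. A minimal guard, sketched under the assumption that token ids arrive as plain Python lists (`truncate_ids` is a hypothetical helper, not part of the training code):

```python
# Hedged sketch: clip over-long token-id sequences to the context limit
# reported in the warnings (40960). A hard clip can cut an answer mid-turn,
# so dropping or re-chunking the sample may be preferable in practice.
MAX_LEN = 40960

def truncate_ids(input_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Return input_ids unchanged if short enough, else its first max_len ids."""
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]
```

With a Hugging Face tokenizer the same effect is available at encode time via `truncation=True` and `max_length`.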
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64744 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76001 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6044/22095 [9:59:07<15:11:03, 3.41s/it] {'loss': 0.3456, 'grad_norm': 0.6079616116262638, 'learning_rate': 8.524113966222363e-06, 'epoch': 0.27}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 6045/22095 [9:59:11<15:36:13, 3.50s/it] {'loss': 0.4221, 'grad_norm': 0.6898106765207705, 'learning_rate': 8.523594006347686e-06, 'epoch': 0.27}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
27%|██▋ | 6046/22095 [9:59:15<16:14:55, 3.64s/it] {'loss': 0.3603, 'grad_norm': 0.5991907495206867, 'learning_rate': 8.523073970761799e-06, 'epoch': 0.27}
27%|██▋ | 6047/22095 [9:59:18<15:37:40, 3.51s/it] {'loss': 0.4052, 'grad_norm': 0.6934617136625089, 'learning_rate': 8.52255385947587e-06, 'epoch': 0.27}
27%|██▋ | 6048/22095 [9:59:22<15:18:39, 3.43s/it] {'loss': 0.372, 'grad_norm': 0.6112395249078179, 'learning_rate': 8.52203367250108e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906519 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29672, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nA. 10cm\nB. 5cm\nC. 15cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
27%|██▋ | 6049/22095 [9:59:26<16:21:44, 3.67s/it] {'loss': 0.3435, 'grad_norm': 0.6502628865263396, 'learning_rate': 8.521513409848601e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 6050/22095 [9:59:36<24:52:06, 5.58s/it] {'loss': 0.4928, 'grad_norm': 0.6236769646363846, 'learning_rate': 8.520993071529614e-06, 'epoch': 0.27}
27%|██▋ | 6051/22095 [9:59:39<21:57:06, 4.93s/it] {'loss': 0.3707, 'grad_norm': 0.6919958170062448, 'learning_rate': 8.520472657555301e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (91236 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81703 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6052/22095 [9:59:47<26:02:07, 5.84s/it] {'loss': 0.4819, 'grad_norm': 0.34575276990715864, 'learning_rate': 8.519952167936842e-06, 'epoch': 0.27}
27%|██▋ | 6053/22095 [9:59:51<22:59:14, 5.16s/it] {'loss': 0.3679, 'grad_norm': 0.6692288837623679, 'learning_rate': 8.519431602685423e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893389 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16542, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
27%|██▋ | 6054/22095 [9:59:54<20:05:32, 4.51s/it] {'loss': 0.3517, 'grad_norm': 0.6513651971217073, 'learning_rate': 8.518910961812229e-06, 'epoch': 0.27}
27%|██▋ | 6055/22095 [9:59:57<18:15:50, 4.10s/it] {'loss': 0.3163, 'grad_norm': 0.6701945415254389, 'learning_rate': 8.518390245328444e-06, 'epoch': 0.27}
27%|██▋ | 6056/22095 [10:00:00<17:10:41, 3.86s/it] {'loss': 0.3771, 'grad_norm': 0.6611267699545874, 'learning_rate': 8.517869453245257e-06, 'epoch': 0.27}
27%|██▋ | 6057/22095 [10:00:03<15:55:40, 3.58s/it] {'loss': 0.3549, 'grad_norm': 0.6839169716811373, 'learning_rate': 8.517348585573862e-06, 'epoch': 0.27}
27%|██▋ | 6058/22095 [10:00:06<14:57:35, 3.36s/it] {'loss': 0.3923, 'grad_norm': 0.6835259767360924, 'learning_rate': 8.516827642325447e-06, 'epoch': 0.27}
27%|██▋ | 6059/22095 [10:00:09<14:48:55, 3.33s/it] {'loss': 0.3652, 'grad_norm': 0.597517933827256, 'learning_rate': 8.51630662351121e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
27%|██▋ | 6060/22095 [10:00:18<21:32:58, 4.84s/it] {'loss': 0.5372, 'grad_norm': 0.8775982858679074, 'learning_rate': 8.515785529142339e-06, 'epoch': 0.27}
27%|██▋ | 6061/22095 [10:00:21<19:20:57, 4.34s/it] {'loss': 0.3742, 'grad_norm': 0.686589275683136, 'learning_rate': 8.515264359230038e-06, 'epoch': 0.27}
27%|██▋ | 6062/22095 [10:00:24<17:25:25, 3.91s/it] {'loss': 0.3295, 'grad_norm': 0.647086052466487, 'learning_rate': 8.514743113785501e-06, 'epoch': 0.27}
27%|██▋ | 6063/22095 [10:00:27<16:20:23, 3.67s/it] {'loss': 0.3412, 'grad_norm': 0.640801493246443, 'learning_rate': 8.51422179281993e-06, 'epoch': 0.27}
27%|██▋ | 6064/22095 [10:00:30<15:23:13, 3.46s/it] {'loss': 0.3725, 'grad_norm': 0.6425145743473898, 'learning_rate': 8.513700396344527e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8923676 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46829, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 6\nB. 5\nC. 4\nD. 3'}, {'from': 'gpt', 'value': '【解答】解:∵AM:BN=3:1,而点M是线段AB的中点,且AB=12,∴AM=BM=6,BN=2而MN=BM-BN=6-2=4'}]}
27%|██▋ | 6065/22095 [10:00:33<15:39:32, 3.52s/it] {'loss': 0.3484, 'grad_norm': 0.7084102432303667, 'learning_rate': 8.51317892437049e-06, 'epoch': 0.27}
27%|██▋ | 6066/22095 [10:00:37<16:05:23, 3.61s/it] {'loss': 0.3933, 'grad_norm': 0.6634924837539934, 'learning_rate': 8.512657376909031e-06, 'epoch': 0.27}
27%|██▋ | 6067/22095 [10:00:41<15:44:32, 3.54s/it] {'loss': 0.3514, 'grad_norm': 0.6744386287510311, 'learning_rate': 8.512135753971353e-06, 'epoch': 0.27}
27%|██▋ | 6068/22095 [10:00:44<15:44:15, 3.53s/it] {'loss': 0.3501, 'grad_norm': 0.6249090418967712, 'learning_rate': 8.511614055568665e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (61853 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50098 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77504 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92790 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6069/22095 [10:00:47<15:07:41, 3.40s/it] {'loss': 0.4163, 'grad_norm': 0.6350782641412113, 'learning_rate': 8.511092281712174e-06, 'epoch': 0.27}
27%|██▋ | 6070/22095 [10:00:50<14:53:41, 3.35s/it] {'loss': 0.3859, 'grad_norm': 0.6298461195689672, 'learning_rate': 8.510570432413095e-06, 'epoch': 0.27}
27%|██▋ | 6071/22095 [10:00:53<14:30:12, 3.26s/it] {'loss': 0.3542, 'grad_norm': 0.6447723188705166, 'learning_rate': 8.510048507682637e-06, 'epoch': 0.27}
27%|██▋ | 6072/22095 [10:00:57<14:20:46, 3.22s/it] {'loss': 0.3221, 'grad_norm': 0.6436348002464336, 'learning_rate': 8.50952650753202e-06, 'epoch': 0.27}
Token indices sequence length is longer than the specified maximum sequence length for this model (45073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113949 > 40960). Running this sequence through the model will result in indexing errors
27%|██▋ | 6073/22095 [10:01:00<14:24:07, 3.24s/it] {'loss': 0.3524, 'grad_norm': 0.6204655375985614, 'learning_rate': 8.509004431972455e-06, 'epoch': 0.27}
27%|██▋ | 6074/22095 [10:01:03<14:13:26, 3.20s/it] {'loss': 0.3919, 'grad_norm': 0.6687700098555398, 'learning_rate': 8.508482281015163e-06, 'epoch': 0.27}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [159, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8445671 in VC:s3://internvl-moe-sft-data/. Exception: Image size [159, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27682, 'image': 'vrdu_texteq/astro-ph.CO/14dcea9f-687f-4059-8730-8d80b7d41118.png', 'image_wh': [[159, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'whrere $i\\geq 1$.'}]}
27%|██▋ | 6075/22095 [10:01:06<14:07:47, 3.18s/it] {'loss': 0.4072, 'grad_norm': 0.6433898373052561, 'learning_rate': 8.50796005467136e-06, 'epoch': 0.27}
27%|██▋ | 6076/22095 [10:01:10<14:30:46, 3.26s/it] {'loss': 0.4101, 'grad_norm': 0.6452541013286588, 'learning_rate': 8.507437752952271e-06, 'epoch': 0.27}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6077/22095 [10:01:19<22:42:20, 5.10s/it] {'loss': 0.4915, 'grad_norm': 0.6893631663895471, 'learning_rate': 8.506915375869118e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (51193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95264 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54074 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6078/22095 [10:01:23<20:42:58, 4.66s/it] {'loss': 0.3923, 'grad_norm': 0.692999676441218, 'learning_rate': 8.506392923433124e-06, 'epoch': 0.28}
28%|██▊ | 6079/22095 [10:01:26<18:26:30, 4.15s/it] {'loss': 0.4131, 'grad_norm': 0.6724569502775599, 'learning_rate': 8.505870395655512e-06, 'epoch': 0.28}
28%|██▊ | 6080/22095 [10:01:29<17:22:42, 3.91s/it] {'loss': 0.3892, 'grad_norm': 0.6804157172735723, 'learning_rate': 8.505347792547516e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6081/22095 [10:01:38<23:55:19, 5.38s/it] {'loss': 0.5007, 'grad_norm': 0.4057416708339337, 'learning_rate': 8.504825114120361e-06, 'epoch': 0.28}
28%|██▊ | 6082/22095 [10:01:41<21:07:07, 4.75s/it] {'loss': 0.3408, 'grad_norm': 0.7245855816415157, 'learning_rate': 8.504302360385276e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6083/22095 [10:01:44<18:58:32, 4.27s/it] {'loss': 0.3677, 'grad_norm': 0.6641071295508348, 'learning_rate': 8.5037795313535e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6084/22095 [10:01:47<17:14:40, 3.88s/it] {'loss': 0.399, 'grad_norm': 0.6976861804856807, 'learning_rate': 8.50325662703626e-06, 'epoch': 0.28}
28%|██▊ | 6085/22095 [10:01:50<16:32:44, 3.72s/it] {'loss': 0.3714, 'grad_norm': 0.662407202013515, 'learning_rate': 8.502733647444796e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6086/22095 [10:02:00<24:06:30, 5.42s/it] {'loss': 0.4847, 'grad_norm': 0.5031182360900243, 'learning_rate': 8.502210592590344e-06, 'epoch': 0.28}
28%|██▊ | 6087/22095 [10:02:04<21:44:50, 4.89s/it] {'loss': 0.3875, 'grad_norm': 0.8223407065276434, 'learning_rate': 8.501687462484141e-06, 'epoch': 0.28}
28%|██▊ | 6088/22095 [10:02:07<19:52:27, 4.47s/it] {'loss': 0.4031, 'grad_norm': 0.7308641110498155, 'learning_rate': 8.501164257137431e-06, 'epoch': 0.28}
28%|██▊ | 6089/22095 [10:02:10<18:26:44, 4.15s/it] {'loss': 0.3834, 'grad_norm': 0.6205462016558165, 'learning_rate': 8.500640976561453e-06, 'epoch': 0.28}
28%|██▊ | 6090/22095 [10:02:14<17:16:05, 3.88s/it] {'loss': 0.3513, 'grad_norm': 0.6532663048504899, 'learning_rate': 8.500117620767452e-06, 'epoch': 0.28}
28%|██▊ | 6091/22095 [10:02:18<17:31:53, 3.94s/it] {'loss': 0.4097, 'grad_norm': 0.6343766244654432, 'learning_rate': 8.499594189766674e-06, 'epoch': 0.28}
28%|██▊ | 6092/22095 [10:02:21<16:34:48, 3.73s/it] {'loss': 0.4342, 'grad_norm': 0.6452984743606108, 'learning_rate': 8.499070683570363e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6093/22095 [10:02:24<15:47:47, 3.55s/it] {'loss': 0.401, 'grad_norm': 0.670940591277106, 'learning_rate': 8.49854710218977e-06, 'epoch': 0.28}
28%|██▊ | 6094/22095 [10:02:28<15:50:52, 3.57s/it] {'loss': 0.367, 'grad_norm': 0.6861574396517808, 'learning_rate': 8.498023445636145e-06, 'epoch': 0.28}
28%|██▊ | 6095/22095 [10:02:31<15:41:23, 3.53s/it] {'loss': 0.3854, 'grad_norm': 0.6537017546764119, 'learning_rate': 8.49749971392074e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6096/22095 [10:02:34<15:24:06, 3.47s/it] {'loss': 0.3728, 'grad_norm': 0.6706788741382547, 'learning_rate': 8.496975907054808e-06, 'epoch': 0.28}
28%|██▊ | 6097/22095 [10:02:38<15:04:29, 3.39s/it] {'loss': 0.386, 'grad_norm': 0.6577647196836957, 'learning_rate': 8.496452025049605e-06, 'epoch': 0.28}
28%|██▊ | 6098/22095 [10:02:41<14:32:39, 3.27s/it] {'loss': 0.3405, 'grad_norm': 0.687328696884683, 'learning_rate': 8.495928067916383e-06, 'epoch': 0.28}
28%|██▊ | 6099/22095 [10:02:44<14:11:42, 3.19s/it] {'loss': 0.4077, 'grad_norm': 0.6355701599063429, 'learning_rate': 8.495404035666409e-06, 'epoch': 0.28}
28%|██▊ | 6100/22095 [10:02:47<13:56:22, 3.14s/it] {'loss': 0.3735, 'grad_norm': 0.6651890064834014, 'learning_rate': 8.494879928310934e-06, 'epoch': 0.28}
28%|██▊ | 6101/22095 [10:03:13<44:51:37, 10.10s/it] {'loss': 0.4075, 'grad_norm': 0.6690293513244945, 'learning_rate': 8.494355745861223e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6102/22095 [10:03:16<35:35:50, 8.01s/it] {'loss': 0.3712, 'grad_norm': 0.6397491482824402, 'learning_rate': 8.49383148832854e-06, 'epoch': 0.28}
28%|██▊ | 6103/22095 [10:03:20<29:42:41, 6.69s/it] {'loss': 0.3548, 'grad_norm': 0.6254108917060476, 'learning_rate': 8.493307155724147e-06, 'epoch': 0.28}
28%|██▊ | 6104/22095 [10:03:23<25:43:57, 5.79s/it] {'loss': 0.3484, 'grad_norm': 0.6253800648922403, 'learning_rate': 8.492782748059314e-06, 'epoch': 0.28}
28%|██▊ | 6105/22095 [10:03:27<22:58:49, 5.17s/it] {'loss': 0.3745, 'grad_norm': 1.127881443978114, 'learning_rate': 8.492258265345307e-06, 'epoch': 0.28}
28%|██▊ | 6106/22095 [10:03:50<47:03:41, 10.60s/it] {'loss': 0.3739, 'grad_norm': 0.645476286737807, 'learning_rate': 8.491733707593395e-06, 'epoch': 0.28}
28%|██▊ | 6107/22095 [10:03:54<37:51:50, 8.53s/it] {'loss': 0.3677, 'grad_norm': 0.6708347456649015, 'learning_rate': 8.49120907481485e-06, 'epoch': 0.28}
28%|██▊ | 6108/22095 [10:03:58<31:05:53, 7.00s/it] {'loss': 0.3921, 'grad_norm': 0.6620974851891143, 'learning_rate': 8.490684367020944e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [917, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8498063 in VC:s3://internvl-moe-sft-data/. Exception: Image size [917, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 119677, 'image': 'vrdu_texteq/astro-ph.CO/74547141-f102-4690-879c-726b3f7f3638.png', 'image_wh': [[917, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'The Fourier transform of the time-series data from the $I$-th GW detector is'}]}
28%|██▊ | 6109/22095 [10:04:01<26:06:47, 5.88s/it] {'loss': 0.4095, 'grad_norm': 0.6977561862853785, 'learning_rate': 8.490159584222952e-06, 'epoch': 0.28}
28%|██▊ | 6110/22095 [10:04:23<47:24:52, 10.68s/it] {'loss': 0.3874, 'grad_norm': 0.5846728205583535, 'learning_rate': 8.48963472643215e-06, 'epoch': 0.28}
28%|██▊ | 6111/22095 [10:04:27<38:29:59, 8.67s/it] {'loss': 0.3824, 'grad_norm': 0.6304765120857315, 'learning_rate': 8.489109793659815e-06, 'epoch': 0.28}
28%|██▊ | 6112/22095 [10:05:07<80:17:49, 18.09s/it] {'loss': 0.4003, 'grad_norm': 0.69646468269515, 'learning_rate': 8.488584785917226e-06, 'epoch': 0.28}
28%|██▊ | 6113/22095 [10:05:29<86:16:22, 19.43s/it] {'loss': 0.3358, 'grad_norm': 0.669757027053765, 'learning_rate': 8.488059703215666e-06, 'epoch': 0.28}
28%|██▊ | 6114/22095 [10:05:50<88:26:10, 19.92s/it] {'loss': 0.3456, 'grad_norm': 0.6960057329359819, 'learning_rate': 8.487534545566414e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6115/22095 [10:05:53<65:41:45, 14.80s/it] {'loss': 0.4085, 'grad_norm': 0.6425349594788039, 'learning_rate': 8.487009312980756e-06, 'epoch': 0.28}
28%|██▊ | 6116/22095 [10:06:19<80:30:17, 18.14s/it] {'loss': 0.386, 'grad_norm': 0.7200906547318653, 'learning_rate': 8.486484005469977e-06, 'epoch': 0.28}
28%|██▊ | 6117/22095 [10:06:41<85:33:52, 19.28s/it] {'loss': 0.3665, 'grad_norm': 0.5778607950964567, 'learning_rate': 8.485958623045365e-06, 'epoch': 0.28}
28%|██▊ | 6118/22095 [10:06:44<63:33:34, 14.32s/it] {'loss': 0.3692, 'grad_norm': 0.6721297193179439, 'learning_rate': 8.48543316571821e-06, 'epoch': 0.28}
28%|██▊ | 6119/22095 [10:06:47<48:32:26, 10.94s/it] {'loss': 0.3829, 'grad_norm': 0.7550680623075885, 'learning_rate': 8.484907633499798e-06, 'epoch': 0.28}
28%|██▊ | 6120/22095 [10:07:08<62:35:56, 14.11s/it] {'loss': 0.3975, 'grad_norm': 0.6393433200351009, 'learning_rate': 8.484382026401428e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6121/22095 [10:07:29<71:09:18, 16.04s/it] {'loss': 0.3644, 'grad_norm': 0.6772103434747855, 'learning_rate': 8.483856344434388e-06, 'epoch': 0.28}
28%|██▊ | 6122/22095 [10:07:50<77:52:29, 17.55s/it] {'loss': 0.3698, 'grad_norm': 0.6192545725413289, 'learning_rate': 8.483330587609975e-06, 'epoch': 0.28}
28%|██▊ | 6123/22095 [10:08:50<133:59:31, 30.20s/it] {'loss': 0.3637, 'grad_norm': 0.6638295813572664, 'learning_rate': 8.482804755939484e-06, 'epoch': 0.28}
28%|██▊ | 6124/22095 [10:09:12<122:59:01, 27.72s/it] {'loss': 0.4055, 'grad_norm': 0.6374373395694628, 'learning_rate': 8.482278849434218e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930854 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54007, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 3cm\nB. 4cm\nC. 6cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
VC:s3://internvl2/datasets/screen2words/images/0012732.jpg 2025-08-28 02:07:10.501549 load time: 1040.2 ms
28%|██▊ | 6125/22095 [10:09:34<116:02:31, 26.16s/it] {'loss': 0.3655, 'grad_norm': 0.7399956950117406, 'learning_rate': 8.481752868105473e-06, 'epoch': 0.28}
VC:s3://multi-modal/playground/data/geoqa+/images/10031.png 2025-08-28 02:07:33.012328 load time: 1024.22 ms
28%|██▊ | 6126/22095 [10:09:38<86:04:01, 19.40s/it] {'loss': 0.367, 'grad_norm': 0.6252133276431657, 'learning_rate': 8.481226811964552e-06, 'epoch': 0.28}
VC:s3://gui-agent/data_20250526/windows/images/spotify/20250515_153414_1/images/before_screenshot_12_id_104_function_1_crop_1_grounding_instructions_random.png 2025-08-28 02:07:36.649482 load time: 1081.15 ms
28%|██▊ | 6127/22095 [10:10:19<115:15:10, 25.98s/it] {'loss': 0.399, 'grad_norm': 0.6040986055476337, 'learning_rate': 8.48070068102276e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (71724 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6128/22095 [10:10:41<110:08:48, 24.83s/it] {'loss': 0.386, 'grad_norm': 0.6680655356074402, 'learning_rate': 8.480174475291401e-06, 'epoch': 0.28}
28%|██▊ | 6129/22095 [10:11:04<106:53:51, 24.10s/it] {'loss': 0.3413, 'grad_norm': 0.6679662578379792, 'learning_rate': 8.47964819478178e-06, 'epoch': 0.28}
28%|██▊ | 6130/22095 [10:11:45<129:58:17, 29.31s/it] {'loss': 0.4141, 'grad_norm': 0.6402503560072257, 'learning_rate': 8.479121839505205e-06, 'epoch': 0.28}
28%|██▊ | 6131/22095 [10:11:48<95:01:20, 21.43s/it] {'loss': 0.363, 'grad_norm': 0.6748241095133092, 'learning_rate': 8.478595409472988e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (57547 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6132/22095 [10:13:05<168:41:55, 38.05s/it] {'loss': 0.4045, 'grad_norm': 0.735162342819606, 'learning_rate': 8.47806890469644e-06, 'epoch': 0.28}
28%|██▊ | 6133/22095 [10:13:30<151:21:15, 34.14s/it] {'loss': 0.3892, 'grad_norm': 0.6468589030966592, 'learning_rate': 8.477542325186873e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (68832 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42313 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53980 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99879 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46799 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6134/22095 [10:14:10<159:39:14, 36.01s/it] {'loss': 0.3388, 'grad_norm': 0.6205833487739102, 'learning_rate': 8.4770156709556e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (54988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69018 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6135/22095 [10:15:09<189:29:22, 42.74s/it] {'loss': 0.3953, 'grad_norm': 0.6116203070506987, 'learning_rate': 8.476488942013941e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/agentnet/win_mac_images/f041aa19-e142-47b5-961f-a58298ca5e85.png 2025-08-28 02:13:07.715113 load time: 1048.33 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308715 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2on6TdTwKL1JjSZFgXXb6aVXa_!!1857411596.png.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n只提取文字,坐标框不用显示。'}, {'from': 'gpt', 'value': '图中所有文字:\nTM\n萌\n宠\n俺\n的\n睿格\n购\n疯狂\n价\n特价\n5\n超值\n斤装\n睿格\nRUIGE\nNATURAL\n包邮\n26.9'}]}
28%|██▊ | 6136/22095 [10:15:18<145:13:08, 32.76s/it] {'loss': 0.4961, 'grad_norm': 0.41637885767207033, 'learning_rate': 8.475962138373212e-06, 'epoch': 0.28}
28%|██▊ | 6137/22095 [10:15:22<106:25:32, 24.01s/it] {'loss': 0.3943, 'grad_norm': 0.6635986927809545, 'learning_rate': 8.475435260044732e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390149 in VC:s3://internvl-moe-sft-data/. Exception: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56968, 'image': 'vrdu_table_final_2/astro-ph.EP/bc9ddb95-376a-4d51-91ac-1cb8a41aee7f.png', 'image_wh': [[117, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} Reference\\\\ \\end{tabular}\n```"}]}
28%|██▊ | 6138/22095 [10:16:03<128:26:20, 28.98s/it] {'loss': 0.3992, 'grad_norm': 0.6525112217425668, 'learning_rate': 8.474908307039822e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (47794 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6139/22095 [10:16:43<144:19:14, 32.56s/it] {'loss': 0.3926, 'grad_norm': 0.6005514128270779, 'learning_rate': 8.474381279369804e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6140/22095 [10:17:49<187:43:50, 42.36s/it] {'loss': 0.479, 'grad_norm': 0.30909771445739137, 'learning_rate': 8.473854177046004e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (71997 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71406 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125906 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52820 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6141/22095 [10:17:53<136:29:38, 30.80s/it] {'loss': 0.3605, 'grad_norm': 0.6494619158079863, 'learning_rate': 8.473327000079748e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6142/22095 [10:18:03<108:55:29, 24.58s/it] {'loss': 0.4826, 'grad_norm': 0.3128065828485614, 'learning_rate': 8.472799748482361e-06, 'epoch': 0.28}
28%|██▊ | 6143/22095 [10:18:30<112:07:09, 25.30s/it] {'loss': 0.4897, 'grad_norm': 0.28853261982474304, 'learning_rate': 8.472272422265172e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
28%|██▊ | 6144/22095 [10:18:52<108:01:54, 24.38s/it] {'loss': 0.3932, 'grad_norm': 0.6229091371671683, 'learning_rate': 8.471745021439516e-06, 'epoch': 0.28}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_498748.png 2025-08-28 02:16:50.589133 load time: 1064.94 ms
28%|██▊ | 6145/22095 [10:19:17<108:53:35, 24.58s/it] {'loss': 0.4207, 'grad_norm': 0.6575726509092661, 'learning_rate': 8.47121754601672e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (86085 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6146/22095 [10:21:11<227:26:03, 51.34s/it] {'loss': 0.3768, 'grad_norm': 0.7373976535519031, 'learning_rate': 8.47068999600812e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48802 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57750 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6147/22095 [10:22:13<241:42:17, 54.56s/it] {'loss': 0.4912, 'grad_norm': 0.3628628811080446, 'learning_rate': 8.470162371425052e-06, 'epoch': 0.28}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-0_3100367-split-1.jpg 2025-08-28 02:20:11.480710 load time: 1029.17 ms
VC:s3://internvl-moe-sft-data/vrdu_texteq/astro-ph.CO/d297b913-090d-42e2-9324-6758994edb07.png 2025-08-28 02:20:11.480785 load time: 1048.71 ms
VC:s3://gui-agent/data_20250714/web/images/20250716/667e2346-6d85-44bc-b04c-86ce73af4d18/images/step_36.png 2025-08-28 02:20:11.482551 load time: 1028.42 ms
28%|██▊ | 6148/22095 [10:22:35<198:46:30, 44.87s/it] {'loss': 0.3541, 'grad_norm': 0.6637909505966786, 'learning_rate': 8.469634672278853e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (104501 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6149/22095 [10:23:43<229:17:36, 51.77s/it] {'loss': 0.4806, 'grad_norm': 0.311668781451325, 'learning_rate': 8.46910689858086e-06, 'epoch': 0.28}
VC:s3://gui/visual_inputs/multi_modal/agent_data/AndroidUI/20240321/20240321_filtered/BOSSzhiping/screen_00000028.jpg 2025-08-28 02:21:41.597214 load time: 1038.67 ms
VC:s3://st2pj/20250222/images/gui-share/agent_data/agent_data/ui-detect/label_data/0802_099_batch_4_num_2800/20240731134002885/main_89_after.png 2025-08-28 02:21:41.597596 load time: 1038.2 ms
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 02:21:41.596593 load time: 1064.55 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6150/22095 [10:23:47<165:29:55, 37.37s/it] {'loss': 0.3922, 'grad_norm': 0.683883187993135, 'learning_rate': 8.468579050342414e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (73615 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6151/22095 [10:25:44<271:34:22, 61.32s/it] {'loss': 0.4114, 'grad_norm': 0.6586497915069213, 'learning_rate': 8.468051127574858e-06, 'epoch': 0.28}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/2b12c54542ac0ddaf4fdf1ad4cf4916b7cfc2082a7c218735fa57f9abd37227b.png 2025-08-28 02:23:42.576954 load time: 1018.79 ms
28%|██▊ | 6152/22095 [10:26:06<219:00:32, 49.45s/it] {'loss': 0.383, 'grad_norm': 0.6524905680374529, 'learning_rate': 8.467523130289535e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (53594 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51972 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6153/22095 [10:27:06<233:21:57, 52.70s/it] {'loss': 0.345, 'grad_norm': 0.6940460177411515, 'learning_rate': 8.466995058497788e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage2/guiact-web-multi-v2/images/uid_record_04551_step_04.png 2025-08-28 02:25:04.613264 load time: 1055.0 ms
VC:s3://gui-agent/data_20250421/web/images/wa_shopping_admin_admin/trajectory_52/img/step_10.png 2025-08-28 02:25:04.613111 load time: 1035.34 ms
28%|██▊ | 6154/22095 [10:27:11<170:41:18, 38.55s/it] {'loss': 0.47, 'grad_norm': 0.4365224565945977, 'learning_rate': 8.466466912210967e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (137094 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6155/22095 [10:27:40<157:23:52, 35.55s/it] {'loss': 0.5013, 'grad_norm': 0.38480078277144353, 'learning_rate': 8.465938691440417e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
28%|██▊ | 6156/22095 [10:28:21<165:05:17, 37.29s/it] {'loss': 0.3663, 'grad_norm': 0.6727143832648759, 'learning_rate': 8.46541039619749e-06, 'epoch': 0.28}
28%|██▊ | 6157/22095 [10:29:46<227:59:28, 51.50s/it] {'loss': 0.4784, 'grad_norm': 0.3096778956946426, 'learning_rate': 8.464882026493537e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (41608 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57735 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98818 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6158/22095 [10:29:49<164:01:01, 37.05s/it] {'loss': 0.3632, 'grad_norm': 0.6339465451087098, 'learning_rate': 8.464353582339911e-06, 'epoch': 0.28}
VC:s3://gui-agent/data_20250328/windows/chrome/20250326_204534_1/images/before_screenshot_252.png 2025-08-28 02:27:48.041374 load time: 1026.28 ms
28%|██▊ | 6159/22095 [10:30:34<173:42:32, 39.24s/it] {'loss': 0.4866, 'grad_norm': 0.3841815298069502, 'learning_rate': 8.463825063747966e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/e249859f6a36838037cb0bdb0e27649c.png 2025-08-28 02:28:32.380548 load time: 1041.14 ms
28%|██▊ | 6160/22095 [10:31:16<177:42:13, 40.15s/it] {'loss': 0.3913, 'grad_norm': 0.6775125423465649, 'learning_rate': 8.463296470729058e-06, 'epoch': 0.28}
VC:s3://gui/visual_inputs/multi_modal_2024/gui_data/ui_data/OpenApp/image/50140.jpg 2025-08-28 02:29:14.647631 load time: 1031.94 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_1419.png 2025-08-28 02:29:14.647752 load time: 1033.19 ms
VC:s3://gui-agent/jedi/images/component_library_snap_icon_data/component_library_snap_icon_data_extracted/images_pure_color_background/snap_icons/teletext.png 2025-08-28 02:29:14.645861 load time: 1042.58 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_695508.png 2025-08-28 02:29:14.645781 load time: 1039.7 ms
VC:s3://gui-agent/data_20250623/windows/images/autocad/20250508_132635_1/images/before_screenshot_1.png 2025-08-28 02:29:14.647511 load time: 1044.24 ms
VC:s3://gui-agent/data_20250421/web/images/wa_forum/trajectory_266/img/step_5.png 2025-08-28 02:29:14.646127 load time: 1041.87 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_273616.png 2025-08-28 02:29:14.645708 load time: 1035.53 ms
28%|██▊ | 6161/22095 [10:31:25<136:57:09, 30.94s/it] {'loss': 0.475, 'grad_norm': 0.35137476460460587, 'learning_rate': 8.462767803294547e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
28%|██▊ | 6162/22095 [10:31:48<125:44:36, 28.41s/it] {'loss': 0.3446, 'grad_norm': 0.6326437997259854, 'learning_rate': 8.462239061455791e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (49895 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69246 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6163/22095 [10:32:36<151:43:18, 34.28s/it] {'loss': 0.4814, 'grad_norm': 0.2746024871363939, 'learning_rate': 8.461710245224149e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
28%|██▊ | 6164/22095 [10:33:17<160:34:33, 36.29s/it] {'loss': 0.3749, 'grad_norm': 0.6396593577582315, 'learning_rate': 8.461181354610988e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8363547 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30285, 'image': 'vrdu_table_final_2/astro-ph.CO/4817eeb1-418d-4d91-9d05-314af17f7851.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} VC:s3://gui-agent/jedi/images/component_library_snap_icon_data/component_library_snap_icon_data_extracted/images_grounded/component_library_icons/material-design-icons/src/social/18_up_rating/materialiconstwotone/24px.png 2025-08-28 02:31:15.553700 load time: 1014.56 ms VC:s3://gui-agent/data_20250714/windows/images/adobe_illustrator/free_task_20250714_162001/images/20250714_162133_51.png 2025-08-28 02:31:15.555095 load time: 1040.06 ms 28%|██▊ | 6165/22095 [10:33:58<166:59:46, 37.74s/it] {'loss': 0.4009, 'grad_norm': 0.6563268547409656, 'learning_rate': 8.460652389627668e-06, 'epoch': 0.28} 28%|██▊ | 6165/22095 [10:33:58<166:59:46, 37.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 28%|██▊ | 6166/22095 [10:34:24<151:00:39, 34.13s/it] {'loss': 0.5349, 'grad_norm': 0.4048207096311486, 'learning_rate': 8.46012335028556e-06, 'epoch': 0.28} 28%|██▊ | 6166/22095 [10:34:24<151:00:39, 34.13s/it]VC:s3://gui-agent/data_20250421/Android/tencentmap/Cycle_0_Iter_2/images/screenshot-41-1745021701.8972528-before.png 2025-08-28 02:32:22.396429 load time: 1026.42 ms VC:s3://gui-agent/data_20250612/android/images/Total_data_windows_0612_hard_data2_device_1_Simple_Gallery_Pro/RecipeAddFirstRecipesFromImage/images/003_click_1749314294821.png 2025-08-28 02:32:22.396399 load time: 1056.17 ms 28%|██▊ | 6167/22095 [10:35:37<203:21:09, 45.96s/it] {'loss': 0.3856, 'grad_norm': 0.6814293888375414, 'learning_rate': 8.459594236596024e-06, 'epoch': 0.28} 28%|██▊ | 6167/22095 [10:35:37<203:21:09, 45.96s/it]VC:s3://multi-modal/UniGeo/calculation_images/8624.png 
2025-08-28 02:33:35.959376 load time: 1015.85 ms VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/AppleMusicAssets/macOS-Lossless-KO_Normal@2x.png 2025-08-28 02:33:35.959495 load time: 1027.13 ms VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/vision/test_869_image.png 2025-08-28 02:33:35.959492 load time: 1053.25 ms Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item raise ValueError( ValueError: Number of image tokens ['data/table/other_screenshot/original/ModernInteractiveTable_1739993892.4959967.png'] does not match number of images None [Try #0] Failed to fetch sample 1871596 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. 
Exception: Number of image tokens ['data/table/other_screenshot/original/ModernInteractiveTable_1739993892.4959967.png'] does not match number of images None
Problematic sample: {'image': 'data/table/other_screenshot/original/ModernInteractiveTable_1739993892.4959967.png', 'conversations': [], 'image_id': 'data/table/other_screenshot/original/ModernInteractiveTable_1739993892.4959967.png'}
28%|██▊ | 6168/22095 [10:36:36<220:16:02, 49.79s/it] {'loss': 0.4051, 'grad_norm': 0.7480971357179941, 'learning_rate': 8.459065048570434e-06, 'epoch': 0.28}
28%|██▊ | 6169/22095 [10:37:34<231:28:49, 52.33s/it] {'loss': 0.3512, 'grad_norm': 0.6274966551812439, 'learning_rate': 8.45853578622016e-06, 'epoch': 0.28}
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/chrome-icons/Earbuds Battery.png 2025-08-28 02:35:32.919540 load time: 1044.17 ms
28%|██▊ | 6170/22095 [10:37:57<192:55:24, 43.61s/it] {'loss': 0.3576, 'grad_norm': 0.6308842647074984, 'learning_rate': 8.458006449556576e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6171/22095 [10:38:27<174:35:31, 39.47s/it] {'loss': 0.4928, 'grad_norm': 0.32163095555358034, 'learning_rate': 8.457477038591054e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (48109 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6172/22095 [10:39:10<179:23:43, 40.56s/it] {'loss': 0.3557, 'grad_norm': 0.6627694350400074, 'learning_rate': 8.456947553334966e-06, 'epoch': 0.28}
28%|██▊ | 6173/22095 [10:39:14<130:49:10, 29.58s/it] {'loss': 0.4173, 'grad_norm': 0.6565527403152172, 'learning_rate': 8.456417993799695e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6174/22095 [10:40:20<179:21:44, 40.56s/it] {'loss': 0.5071, 'grad_norm': 0.2907819307107183, 'learning_rate': 8.455888359996616e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (88428 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80553 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6175/22095 [10:40:46<158:47:41, 35.91s/it] {'loss': 0.3688, 'grad_norm': 0.8058453445617328, 'learning_rate': 8.455358651937111e-06, 'epoch': 0.28}
28%|██▊ | 6176/22095 [10:41:26<164:24:55, 37.18s/it] {'loss': 0.3641, 'grad_norm': 0.6762267533514817, 'learning_rate': 8.45482886963256e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [639, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8455854 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [639, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 119971, 'image': 'vrdu_texteq/astro-ph.CO/d0021ef2-b7b6-4f6f-aa64-05fa64ae7605.png', 'image_wh': [[639, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $a$ is the scale factor and $H$ is the Hubble rate.'}]}
28%|██▊ | 6177/22095 [10:41:48<144:05:46, 32.59s/it] {'loss': 0.3531, 'grad_norm': 0.6706516377549697, 'learning_rate': 8.454299013094347e-06, 'epoch': 0.28}
28%|██▊ | 6178/22095 [10:41:50<104:34:19, 23.65s/it] {'loss': 0.3657, 'grad_norm': 0.6571031966771673, 'learning_rate': 8.453769082333858e-06, 'epoch': 0.28}
28%|██▊ | 6179/22095 [10:42:31<127:32:11, 28.85s/it] {'loss': 0.3127, 'grad_norm': 0.682765081302516, 'learning_rate': 8.453239077362478e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6180/22095 [10:42:41<101:34:57, 22.98s/it] {'loss': 0.4783, 'grad_norm': 0.3404651269271434, 'learning_rate': 8.452708998191597e-06, 'epoch': 0.28}
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/yaru/tweaks-app.png 2025-08-28 02:40:39.377738 load time: 1018.42 ms
28%|██▊ | 6181/22095 [10:43:02<98:55:28, 22.38s/it] {'loss': 0.3785, 'grad_norm': 0.6757919228891912, 'learning_rate': 8.452178844832603e-06, 'epoch': 0.28}
28%|██▊ | 6182/22095 [10:43:24<98:41:02, 22.33s/it] {'loss': 0.3378, 'grad_norm': 0.8022516686986559, 'learning_rate': 8.451648617296889e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/android/images/android_lab_data_Clock/clock_4/images/003_click_1749609058784.png 2025-08-28 02:41:22.567398 load time: 1094.23 ms
28%|██▊ | 6183/22095 [10:43:51<105:42:17, 23.92s/it] {'loss': 0.489, 'grad_norm': 0.317920038886496, 'learning_rate': 8.451118315595847e-06, 'epoch': 0.28}
VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/0926031847730750_0.png 2025-08-28 02:41:50.182166 load time: 1042.47 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6184/22095 [10:43:55<78:11:35, 17.69s/it] {'loss': 0.4417, 'grad_norm': 0.6975404891441597, 'learning_rate': 8.45058793974087e-06, 'epoch': 0.28}
VC:s3://gui/aguvis/aguvis-stage2/aitw-v1/images/install_9862022874798258162_0.jpg 2025-08-28 02:41:53.353822 load time: 1018.45 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_269589.png 2025-08-28 02:41:53.353785 load time: 1030.12 ms
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/6805.jpg 2025-08-28 02:41:53.351837 load time: 1045.65 ms
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240827_040114_before_screenshot_sub3.png 2025-08-28 02:41:53.351866 load time: 1048.25 ms
28%|██▊ | 6185/22095 [10:43:58<58:37:29, 13.27s/it] {'loss': 0.4497, 'grad_norm': 0.69970890978507, 'learning_rate': 8.450057489743359e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (92917 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77029 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6186/22095 [10:44:22<73:00:33, 16.52s/it] {'loss': 0.3826, 'grad_norm': 0.7103773384893085, 'learning_rate': 8.449526965614708e-06, 'epoch': 0.28}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250505_001007_1/images/before_screenshot_1_id_45_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 02:42:21.122821 load time: 1027.18 ms
28%|██▊ | 6187/22095 [10:44:45<81:38:43, 18.48s/it] {'loss': 0.3461, 'grad_norm': 0.6669420693822625, 'learning_rate': 8.448996367366313e-06, 'epoch': 0.28}
28%|██▊ | 6188/22095 [10:44:48<61:26:20, 13.90s/it] {'loss': 0.3839, 'grad_norm': 0.7245749365890447, 'learning_rate': 8.448465695009583e-06, 'epoch': 0.28}
28%|██▊ | 6189/22095 [10:44:51<47:11:51, 10.68s/it] {'loss': 0.3906, 'grad_norm': 0.6974101733284066, 'learning_rate': 8.447934948555915e-06, 'epoch': 0.28}
28%|██▊ | 6190/22095 [10:44:54<36:47:39, 8.33s/it] {'loss': 0.338, 'grad_norm': 0.6700526606181927, 'learning_rate': 8.447404128016715e-06, 'epoch': 0.28}
28%|██▊ | 6191/22095 [10:44:58<30:48:13, 6.97s/it] {'loss': 0.4185, 'grad_norm': 0.6485693661815121, 'learning_rate': 8.446873233403388e-06, 'epoch': 0.28}
28%|██▊ | 6192/22095 [10:45:01<26:07:09, 5.91s/it] {'loss': 0.3867, 'grad_norm': 0.6388113321491804, 'learning_rate': 8.446342264727341e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (58631 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63262 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6193/22095 [10:45:25<50:09:37, 11.36s/it] {'loss': 0.4226, 'grad_norm': 0.6613139018209925, 'learning_rate': 8.445811221999983e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6194/22095 [10:45:33<45:16:50, 10.25s/it] {'loss': 0.4739, 'grad_norm': 0.514249841919849, 'learning_rate': 8.445280105232724e-06, 'epoch': 0.28}
28%|██▊ | 6195/22095 [10:45:37<37:35:08, 8.51s/it] {'loss': 0.3916, 'grad_norm': 0.6705199570351592, 'learning_rate': 8.44474891443698e-06, 'epoch': 0.28}
28%|██▊ | 6196/22095 [10:46:03<60:58:44, 13.81s/it] {'loss': 0.3719, 'grad_norm': 0.6287426122637085, 'learning_rate': 8.44421764962416e-06, 'epoch': 0.28}
28%|██▊ | 6197/22095 [10:46:07<47:41:50, 10.80s/it] {'loss': 0.3873, 'grad_norm': 0.6513169442120156, 'learning_rate': 8.443686310805679e-06, 'epoch': 0.28}
28%|██▊ | 6198/22095 [10:46:48<87:56:53, 19.92s/it] {'loss': 0.3384, 'grad_norm': 0.6885897435735384, 'learning_rate': 8.443154897992958e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6199/22095 [10:46:51<65:22:08, 14.80s/it] {'loss': 0.413, 'grad_norm': 0.6483482609359074, 'learning_rate': 8.442623411197412e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (90449 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58434 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51299 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6200/22095 [10:46:55<51:01:39, 11.56s/it] {'loss': 0.3639, 'grad_norm': 0.63645573302067, 'learning_rate': 8.442091850430463e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (76281 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6201/22095 [10:47:03<46:10:29, 10.46s/it] {'loss': 0.5057, 'grad_norm': 0.48556986376096983, 'learning_rate': 8.441560215703531e-06, 'epoch': 0.28}
28%|██▊ | 6202/22095 [10:47:07<36:55:04, 8.36s/it] {'loss': 0.4118, 'grad_norm': 0.6492821333841186, 'learning_rate': 8.441028507028041e-06, 'epoch': 0.28}
28%|██▊ | 6203/22095 [10:47:10<29:59:10, 6.79s/it] {'loss': 0.3803, 'grad_norm': 0.6545075018939425, 'learning_rate': 8.440496724415415e-06, 'epoch': 0.28}
28%|██▊ | 6204/22095 [10:47:14<25:53:04, 5.86s/it] {'loss': 0.3693, 'grad_norm': 0.6451701396785834, 'learning_rate': 8.439964867877082e-06, 'epoch': 0.28}
28%|██▊ | 6205/22095 [10:47:17<22:11:47, 5.03s/it] {'loss': 0.3748, 'grad_norm': 0.7229587230904689, 'learning_rate': 8.439432937424468e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (58995 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6206/22095 [10:47:20<19:36:51, 4.44s/it] {'loss': 0.445, 'grad_norm': 0.6800637781863547, 'learning_rate': 8.438900933069006e-06, 'epoch': 0.28}
28%|██▊ | 6207/22095 [10:47:23<17:32:23, 3.97s/it] {'loss': 0.3841, 'grad_norm': 0.6669011748349529, 'learning_rate': 8.438368854822123e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/7747.jpg 2025-08-28 02:45:21.336127 load time: 1030.47 ms
VC:s3://gui-agent/data_20250328/web_25k/images/Process5/google_com_hk/trajectory_112/img/step_3.png 2025-08-28 02:45:21.337251 load time: 1051.45 ms
28%|██▊ | 6208/22095 [10:47:32<24:32:21, 5.56s/it] {'loss': 0.4638, 'grad_norm': 0.42745797731326585, 'learning_rate': 8.437836702695253e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (44775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51488 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46451 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6209/22095 [10:47:41<29:35:00, 6.70s/it] {'loss': 0.4973, 'grad_norm': 0.33492718032317736, 'learning_rate': 8.437304476699833e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 364, but got module 1
28%|██▊ | 6210/22095 [10:47:45<25:31:38, 5.79s/it] {'loss': 0.3549, 'grad_norm': 0.6915936016022975, 'learning_rate': 8.436772176847295e-06, 'epoch': 0.28}
28%|██▊ | 6211/22095 [10:47:48<22:16:40, 5.05s/it] {'loss': 0.3504, 'grad_norm': 0.6375801191470075, 'learning_rate': 8.436239803149077e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [562, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8513028 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [562, 23, 100, 100] is too small. Minimum size is 28.
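Every "Image size ... is too small" failure in this log comes from the same guard in `_get_item`: one side of the image (here 23-25 px tall single-line text crops) is below the 28-px minimum. A rough sketch of that kind of guard, using the sample's `image_wh` field — the helper name is an assumption, not the trainer's actual code:

```python
MIN_SIDE = 28  # the log reports "Minimum size is 28"

def check_min_size(image_wh, min_side=MIN_SIDE):
    # image_wh is (width, height); reject images with any side under min_side.
    width, height = image_wh
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_side}."
        )
```

A 639x23 crop fails because 23 < 28; pre-filtering the dataset on `image_wh` would avoid these retries during training.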
Problematic sample: {'id': 78463, 'image': 'vrdu_texteq/astro-ph.CO/dad31249-5213-43e8-a10b-5f5978187db5.png', 'image_wh': [[562, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'and the time evolution functions to $n$-th order'}]}
28%|██▊ | 6212/22095 [10:47:56<26:04:10, 5.91s/it] {'loss': 0.4799, 'grad_norm': 0.3721610081077562, 'learning_rate': 8.43570735561662e-06, 'epoch': 0.28}
28%|██▊ | 6213/22095 [10:47:59<22:35:56, 5.12s/it] {'loss': 0.4056, 'grad_norm': 0.7265660941567311, 'learning_rate': 8.435174834261365e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6214/22095 [10:48:07<25:58:22, 5.89s/it] {'loss': 0.469, 'grad_norm': 0.35685129525632925, 'learning_rate': 8.434642239094752e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (45214 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82356 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6215/22095 [10:48:10<22:23:57, 5.08s/it] {'loss': 0.3291, 'grad_norm': 0.7138835831888775, 'learning_rate': 8.434109570128228e-06, 'epoch': 0.28}
28%|██▊ | 6216/22095 [10:48:15<21:23:51, 4.85s/it] {'loss': 0.3764, 'grad_norm': 0.7130084286744158, 'learning_rate': 8.433576827373234e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (57115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117869 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80329 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43699 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73859 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6217/22095 [10:48:21<23:19:20, 5.29s/it] {'loss': 0.5127, 'grad_norm': 0.3521422973869999, 'learning_rate': 8.433044010841221e-06, 'epoch': 0.28}
28%|██▊ | 6218/22095 [10:48:24<21:08:02, 4.79s/it] {'loss': 0.4221, 'grad_norm': 0.8142269849959118, 'learning_rate': 8.432511120543633e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358143 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
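The repeated "Token indices sequence length is longer than the specified maximum" warnings above come from the tokenizer: samples of up to ~118k tokens are being encoded against the model's 40960-token limit, and the warning says indexing errors will follow if they reach the model. A minimal sketch of clipping such sequences ahead of time — `clip_to_max` is a hypothetical helper, and 40960 simply mirrors the limit reported in the log:

```python
MAX_LEN = 40960  # model maximum sequence length reported in the log

def clip_to_max(token_ids, max_len=MAX_LEN):
    # Truncate over-long token sequences so they cannot cause
    # out-of-range position indexing in the model.
    if len(token_ids) > max_len:
        return token_ids[:max_len]
    return list(token_ids)
```

In practice one might instead drop or split such samples during dataset preparation, since blind truncation can cut a conversation mid-turn.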
Problematic sample: {'id': 24854, 'image': 'vrdu_table_final_2/astro-ph.CO/267581c2-a796-480a-b190-db07ce7465c2.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{2}$\\end{tabular}\n```"}]}
28%|██▊ | 6219/22095 [10:48:27<18:44:15, 4.25s/it] {'loss': 0.3707, 'grad_norm': 0.652441030531594, 'learning_rate': 8.431978156491927e-06, 'epoch': 0.28}
28%|██▊ | 6220/22095 [10:48:31<18:12:25, 4.13s/it] {'loss': 0.3895, 'grad_norm': 0.7365615620283028, 'learning_rate': 8.43144511869755e-06, 'epoch': 0.28}
28%|██▊ | 6221/22095 [10:48:35<17:19:44, 3.93s/it] {'loss': 0.3803, 'grad_norm': 0.6056462961860908, 'learning_rate': 8.430912007171957e-06, 'epoch': 0.28}
28%|██▊ | 6222/22095 [10:48:38<15:55:10, 3.61s/it] {'loss': 0.4075, 'grad_norm': 0.6091945279871832, 'learning_rate': 8.430378821926599e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6223/22095 [10:48:47<23:16:37, 5.28s/it] {'loss': 0.5117, 'grad_norm': 0.3147920289891761, 'learning_rate': 8.429845562972939e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6224/22095 [10:48:50<20:32:29, 4.66s/it] {'loss': 0.3297, 'grad_norm': 0.6431671482916711, 'learning_rate': 8.429312230322431e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6225/22095 [10:49:00<27:00:05, 6.13s/it] {'loss': 0.4789, 'grad_norm': 0.2877762484850333, 'learning_rate': 8.428778823986534e-06, 'epoch': 0.28}
28%|██▊ | 6226/22095 [10:49:03<23:41:12, 5.37s/it] {'loss': 0.3275, 'grad_norm': 0.6581219748598858, 'learning_rate': 8.42824534397671e-06, 'epoch': 0.28}
28%|██▊ | 6227/22095 [10:49:06<20:53:38, 4.74s/it] {'loss': 0.367, 'grad_norm': 0.6211291035800235, 'learning_rate': 8.427711790304426e-06, 'epoch': 0.28}
28%|██▊ | 6228/22095 [10:49:10<19:16:11, 4.37s/it] {'loss': 0.3692, 'grad_norm': 0.7104653182176582, 'learning_rate': 8.427178162981141e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6229/22095 [10:49:13<18:05:51, 4.11s/it] {'loss': 0.4087, 'grad_norm': 0.6858916103546259, 'learning_rate': 8.426644462018323e-06, 'epoch': 0.28}
28%|██▊ | 6230/22095 [10:49:18<18:01:56, 4.09s/it] {'loss': 0.3305, 'grad_norm': 0.630715733986845, 'learning_rate': 8.42611068742744e-06, 'epoch': 0.28}
28%|██▊ | 6231/22095 [10:49:20<16:29:54, 3.74s/it] {'loss': 0.3584, 'grad_norm': 0.6855694797106894, 'learning_rate': 8.425576839219962e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6232/22095 [10:49:24<16:21:22, 3.71s/it] {'loss': 0.3821, 'grad_norm': 0.6616610678604273, 'learning_rate': 8.425042917407358e-06, 'epoch': 0.28}
28%|██▊ | 6233/22095 [10:49:28<15:57:31, 3.62s/it] {'loss': 0.3835, 'grad_norm': 0.6621237669737059, 'learning_rate': 8.4245089220011e-06, 'epoch': 0.28}
28%|██▊ | 6234/22095 [10:49:31<15:24:04, 3.50s/it] {'loss': 0.386, 'grad_norm': 0.7034082290073037, 'learning_rate': 8.423974853012663e-06, 'epoch': 0.28}
28%|██▊ | 6235/22095 [10:49:35<15:47:42, 3.59s/it] {'loss': 0.3895, 'grad_norm': 0.7146277465228836, 'learning_rate': 8.423440710453524e-06, 'epoch': 0.28}
28%|██▊ | 6236/22095 [10:49:38<16:05:51, 3.65s/it] {'loss': 0.3778, 'grad_norm': 0.6941430005620083, 'learning_rate': 8.422906494335155e-06, 'epoch': 0.28}
28%|██▊ | 6237/22095 [10:49:42<15:59:24, 3.63s/it] {'loss': 0.349, 'grad_norm': 0.6174910742970482, 'learning_rate': 8.42237220466904e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [603, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8466006 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [603, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55432, 'image': 'vrdu_texteq/astro-ph.CO/ca9e0612-06a7-49e3-bd0e-77ccd28d96df.png', 'image_wh': [[603, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and extract any recognizable text it contains.'}, {'from': 'gpt', 'value': 'where $\\mathbf{e_r}$ is the unit vector of the radial direction.'}]}
28%|██▊ | 6238/22095 [10:49:45<15:21:47, 3.49s/it] {'loss': 0.3166, 'grad_norm': 0.6327827603546239, 'learning_rate': 8.421837841466657e-06, 'epoch': 0.28}
28%|██▊ | 6239/22095 [10:49:49<15:31:40, 3.53s/it] {'loss': 0.35, 'grad_norm': 0.7934117873176336, 'learning_rate': 8.42130340473949e-06, 'epoch': 0.28}
28%|██▊ | 6240/22095 [10:49:52<15:01:28, 3.41s/it] {'loss': 0.416, 'grad_norm': 0.6753158892052803, 'learning_rate': 8.420768894499018e-06, 'epoch': 0.28}
28%|██▊ | 6241/22095 [10:49:57<17:06:18, 3.88s/it] {'loss': 0.3795, 'grad_norm': 0.6114585858457859, 'learning_rate': 8.420234310756731e-06, 'epoch': 0.28}
28%|██▊ | 6242/22095 [10:50:00<16:35:00, 3.77s/it] {'loss': 0.3335, 'grad_norm': 0.6503678006643415, 'learning_rate': 8.419699653524112e-06, 'epoch': 0.28}
28%|██▊ | 6243/22095 [10:50:04<16:49:15, 3.82s/it] {'loss': 0.3887, 'grad_norm': 0.6703813746328721, 'learning_rate': 8.41916492281265e-06, 'epoch': 0.28}
28%|██▊ | 6244/22095 [10:50:07<15:33:37, 3.53s/it] {'loss': 0.3731, 'grad_norm': 0.6616322211727921, 'learning_rate': 8.418630118633835e-06, 'epoch': 0.28}
28%|██▊ | 6245/22095 [10:50:10<14:53:39, 3.38s/it] {'loss': 0.3616, 'grad_norm': 0.679884118274133, 'learning_rate': 8.418095240999157e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (69847 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6246/22095 [10:50:14<15:00:29, 3.41s/it] {'loss': 0.3553, 'grad_norm': 0.6420201373911756, 'learning_rate': 8.417560289920112e-06, 'epoch': 0.28}
28%|██▊ | 6247/22095 [10:50:17<14:26:13, 3.28s/it] {'loss': 0.3675, 'grad_norm': 0.6892038150466372, 'learning_rate': 8.417025265408192e-06, 'epoch': 0.28}
28%|██▊ | 6248/22095 [10:50:20<14:06:01, 3.20s/it] {'loss': 0.433, 'grad_norm': 0.6629311105686149, 'learning_rate': 8.416490167474894e-06, 'epoch': 0.28}
28%|██▊ | 6249/22095 [10:50:23<13:52:11, 3.15s/it] {'loss': 0.332, 'grad_norm': 0.753104654935313, 'learning_rate': 8.415954996131715e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6250/22095 [10:50:32<21:51:19, 4.97s/it] {'loss': 0.4924, 'grad_norm': 0.4750248779522945, 'learning_rate': 8.415419751390155e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6251/22095 [10:50:36<20:32:27, 4.67s/it] {'loss': 0.4092, 'grad_norm': 0.6649641453862327, 'learning_rate': 8.414884433261712e-06, 'epoch': 0.28}
28%|██▊ | 6252/22095 [10:50:39<19:10:25, 4.36s/it] {'loss': 0.3647, 'grad_norm': 0.7043728882105145, 'learning_rate': 8.414349041757895e-06, 'epoch': 0.28}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [934, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8436769 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [934, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 76188, 'image': 'vrdu_texteq/astro-ph.CO/1eab6926-ee8c-4b74-9956-b3c2e5202ffa.png', 'image_wh': [[934, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'Confidence limits on $a_{\\rm direct}$ and $b_{\\rm direct}$ themselves can\nbe obtained as follows.'}]}
28%|██▊ | 6253/22095 [10:50:42<17:09:30, 3.90s/it] {'loss': 0.3976, 'grad_norm': 0.7906797670385073, 'learning_rate': 8.4138135768902e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
28%|██▊ | 6254/22095 [10:50:50<22:44:22, 5.17s/it] {'loss': 0.4822, 'grad_norm': 0.33427530855867776, 'learning_rate': 8.413278038670137e-06, 'epoch': 0.28}
28%|██▊ | 6255/22095 [10:50:54<20:33:52, 4.67s/it] {'loss': 0.3494, 'grad_norm': 0.6971907768403794, 'learning_rate': 8.412742427109211e-06, 'epoch': 0.28}
28%|██▊ | 6256/22095 [10:50:57<18:35:38, 4.23s/it] {'loss': 0.3777, 'grad_norm': 0.6501227087395217, 'learning_rate': 8.41220674221893e-06, 'epoch': 0.28}
28%|██▊ | 6257/22095 [10:51:00<17:02:26, 3.87s/it] {'loss': 0.3844, 'grad_norm': 0.711428274830069, 'learning_rate': 8.41167098401081e-06, 'epoch': 0.28}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
28%|██▊ | 6258/22095 [10:51:03<15:46:21, 3.59s/it] {'loss': 0.3792, 'grad_norm': 0.6728050850151719, 'learning_rate': 8.411135152496357e-06, 'epoch': 0.28}
28%|██▊ | 6259/22095 [10:51:06<15:07:08, 3.44s/it] {'loss': 0.3719, 'grad_norm': 0.6833250711380504, 'learning_rate': 8.410599247687085e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (57373 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62048 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6260/22095 [10:51:09<14:47:18, 3.36s/it] {'loss': 0.3496, 'grad_norm': 0.6669147987584385, 'learning_rate': 8.41006326959451e-06, 'epoch': 0.28}
28%|██▊ | 6261/22095 [10:51:13<14:43:03, 3.35s/it] {'loss': 0.3865, 'grad_norm': 0.6839235194108939, 'learning_rate': 8.409527218230152e-06, 'epoch': 0.28}
Token indices sequence length is longer than the specified maximum sequence length for this model (50261 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43153 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65506 > 40960). Running this sequence through the model will result in indexing errors
28%|██▊ | 6262/22095 [10:51:16<14:40:18, 3.34s/it] {'loss': 0.3195, 'grad_norm': 0.6959276699222522, 'learning_rate': 8.408991093605524e-06, 'epoch': 0.28}
28%|██▊ | 6263/22095 [10:51:20<15:53:26, 3.61s/it] {'loss': 0.3427, 'grad_norm': 0.6382203905796654, 'learning_rate': 8.408454895732146e-06, 'epoch': 0.28}
28%|██▊ | 6264/22095 [10:51:23<15:20:01, 3.49s/it] {'loss': 0.3376, 'grad_norm': 0.6391687745761867, 'learning_rate': 8.40791862462154e-06, 'epoch': 0.28}
28%|██▊ | 6265/22095 [10:51:27<16:04:42, 3.66s/it] {'loss': 0.388, 'grad_norm': 0.6564862085170919, 'learning_rate': 8.407382280285231e-06, 'epoch': 0.28}
28%|██▊ | 6266/22095 [10:51:31<15:23:55, 3.50s/it] {'loss': 0.3607, 'grad_norm': 0.6727667578398724, 'learning_rate': 8.406845862734741e-06, 'epoch': 0.28}
28%|██▊ | 6267/22095 [10:51:33<14:31:00, 3.30s/it] {'loss': 0.3528, 'grad_norm': 0.6915184834363046, 'learning_rate': 8.406309371981597e-06, 'epoch': 0.28}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (72688 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86525 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92223 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108002 > 40960). Running this sequence through the model will result in indexing errors 28%|██▊ | 6268/22095 [10:51:42<21:26:13, 4.88s/it] {'loss': 0.4959, 'grad_norm': 0.6171206668536895, 'learning_rate': 8.405772808037326e-06, 'epoch': 0.28} 28%|██▊ | 6268/22095 [10:51:42<21:26:13, 4.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 28%|██▊ | 6269/22095 [10:51:46<19:57:28, 4.54s/it] {'loss': 0.3933, 'grad_norm': 0.6642918766150323, 'learning_rate': 8.405236170913458e-06, 'epoch': 0.28} 28%|██▊ | 6269/22095 [10:51:46<19:57:28, 4.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38976.png 2025-08-28 02:49:45.513118 load time: 1559.68 ms 28%|██▊ | 6270/22095 [10:51:54<25:23:08, 5.77s/it] {'loss': 0.5106, 'grad_norm': 0.44063677262443796, 'learning_rate': 8.404699460621523e-06, 'epoch': 0.28} 28%|██▊ | 6270/22095 [10:51:54<25:23:08, 5.77s/it] 28%|██▊ | 6271/22095 [10:51:58<22:03:55, 5.02s/it] {'loss': 0.3706, 'grad_norm': 0.6525737155916631, 'learning_rate': 8.404162677173052e-06, 'epoch': 0.28} 28%|██▊ | 6271/22095 [10:51:58<22:03:55, 5.02s/it] 28%|██▊ | 6272/22095 [10:52:01<19:54:23, 4.53s/it] {'loss': 0.3579, 'grad_norm': 0.6395490185811372, 'learning_rate': 8.403625820579582e-06, 'epoch': 0.28} 28%|██▊ | 6272/22095 [10:52:01<19:54:23, 4.53s/it] 28%|██▊ | 6273/22095 [10:52:04<17:50:06, 4.06s/it] {'loss': 0.3629, 'grad_norm': 0.6688331079105485, 'learning_rate': 8.403088890852646e-06, 'epoch': 0.28} 28%|██▊ | 6273/22095 [10:52:04<17:50:06, 4.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 28%|██▊ | 6274/22095 [10:52:07<16:23:48, 3.73s/it] {'loss': 0.3402, 'grad_norm': 0.6238772693835692, 'learning_rate': 8.402551888003781e-06, 'epoch': 0.28} 28%|██▊ | 6274/22095 [10:52:07<16:23:48, 3.73s/it] 28%|██▊ | 6275/22095 [10:52:10<15:45:02, 3.58s/it] {'loss': 0.4083, 'grad_norm': 0.6460567022078382, 'learning_rate': 8.402014812044525e-06, 'epoch': 0.28} 28%|██▊ | 6275/22095 [10:52:10<15:45:02, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is 
too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047147 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 3\nB. 4\nC. 5\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AB=12,且BC=\\frac{1}{2}AB∴BC=6,AC=18而点D是线段AC的中点,∴AD=\\frac{1}{2}AC=\\frac{1}{2}×18=9而BD=AB-AD=12-9=3'}]} 28%|██▊ | 6276/22095 [10:52:20<24:33:52, 5.59s/it] {'loss': 0.4873, 'grad_norm': 0.6865228006752404, 'learning_rate': 8.401477662986421e-06, 'epoch': 0.28} 28%|██▊ | 6276/22095 [10:52:20<24:33:52, 5.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86394 > 40960). Running this sequence through the model will result in indexing errors 28%|██▊ | 6277/22095 [10:52:24<21:47:45, 4.96s/it] {'loss': 0.3645, 'grad_norm': 0.7456566077740172, 'learning_rate': 8.400940440841008e-06, 'epoch': 0.28} 28%|██▊ | 6277/22095 [10:52:24<21:47:45, 4.96s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_0.png 2025-08-28 02:50:23.338164 load time: 1118.27 ms 28%|██▊ | 6278/22095 [10:52:29<21:16:53, 4.84s/it] {'loss': 0.411, 'grad_norm': 0.6988846647845834, 'learning_rate': 8.40040314561983e-06, 'epoch': 0.28} 28%|██▊ | 6278/22095 [10:52:29<21:16:53, 4.84s/it] 28%|██▊ | 6279/22095 [10:52:33<20:16:46, 4.62s/it] {'loss': 0.3513, 'grad_norm': 0.9361045391555585, 'learning_rate': 8.399865777334435e-06, 'epoch': 0.28} 28%|██▊ | 6279/22095 [10:52:33<20:16:46, 4.62s/it] 28%|██▊ | 6280/22095 [10:52:36<18:23:26, 4.19s/it] {'loss': 0.3572, 'grad_norm': 0.751607817915409, 'learning_rate': 8.399328335996362e-06, 'epoch': 0.28} 28%|██▊ | 6280/22095 [10:52:36<18:23:26, 4.19s/it] 28%|██▊ | 6281/22095 [10:52:39<16:31:26, 3.76s/it] {'loss': 0.3846, 'grad_norm': 
0.6749108112927008, 'learning_rate': 8.398790821617166e-06, 'epoch': 0.28} 28%|██▊ | 6281/22095 [10:52:39<16:31:26, 3.76s/it] 28%|██▊ | 6282/22095 [10:52:42<16:38:28, 3.79s/it] {'loss': 0.3705, 'grad_norm': 0.7128668589052166, 'learning_rate': 8.398253234208391e-06, 'epoch': 0.28} 28%|██▊ | 6282/22095 [10:52:42<16:38:28, 3.79s/it] 28%|██▊ | 6283/22095 [10:52:45<15:25:35, 3.51s/it] {'loss': 0.3688, 'grad_norm': 0.6235231061493672, 'learning_rate': 8.397715573781596e-06, 'epoch': 0.28} 28%|██▊ | 6283/22095 [10:52:45<15:25:35, 3.51s/it] 28%|██▊ | 6284/22095 [10:52:48<14:41:33, 3.35s/it] {'loss': 0.3399, 'grad_norm': 0.7008898344484392, 'learning_rate': 8.397177840348323e-06, 'epoch': 0.28} 28%|██▊ | 6284/22095 [10:52:48<14:41:33, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885296 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8449, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1.5cm\nB. 2cm\nC. 4cm\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 28%|██▊ | 6285/22095 [10:52:55<19:49:19, 4.51s/it] {'loss': 0.4725, 'grad_norm': 0.43211295832271546, 'learning_rate': 8.396640033920135e-06, 'epoch': 0.28} 28%|██▊ | 6285/22095 [10:52:56<19:49:19, 4.51s/it] 28%|██▊ | 6286/22095 [10:52:59<18:34:32, 4.23s/it] {'loss': 0.3979, 'grad_norm': 0.6773046329502455, 'learning_rate': 8.396102154508584e-06, 'epoch': 0.28} 28%|██▊ | 6286/22095 [10:52:59<18:34:32, 4.23s/it] 28%|██▊ | 6287/22095 [10:53:02<16:55:16, 3.85s/it] {'loss': 0.4042, 'grad_norm': 0.6202231828662841, 'learning_rate': 8.395564202125229e-06, 'epoch': 0.28} 28%|██▊ | 6287/22095 [10:53:02<16:55:16, 3.85s/it] 28%|██▊ | 6288/22095 [10:53:05<16:01:34, 3.65s/it] {'loss': 0.3194, 'grad_norm': 0.6785238184624034, 'learning_rate': 8.395026176781627e-06, 'epoch': 0.28} 28%|██▊ | 6288/22095 [10:53:05<16:01:34, 3.65s/it] 28%|██▊ | 6289/22095 [10:53:08<15:07:45, 3.45s/it] {'loss': 0.3632, 'grad_norm': 0.663363406948093, 'learning_rate': 8.394488078489339e-06, 'epoch': 0.28} 28%|██▊ | 6289/22095 [10:53:08<15:07:45, 3.45s/it] 28%|██▊ | 6290/22095 [10:53:12<15:26:08, 3.52s/it] {'loss': 0.3984, 'grad_norm': 0.656510124986166, 'learning_rate': 8.393949907259927e-06, 'epoch': 0.28} 28%|██▊ | 6290/22095 [10:53:12<15:26:08, 3.52s/it]VC:s3://gui-agent/data_20250630/windows_augment/images/UE/handmade_annotation_3/images/UE_2_id_1_internvl_appearance_crop_1_grounding_instructions_random_paste.png 2025-08-28 02:51:12.060370 load time: 1030.9 ms 28%|██▊ | 6291/22095 [10:53:16<15:57:00, 3.63s/it] {'loss': 0.3573, 'grad_norm': 0.6625844048352743, 'learning_rate': 8.393411663104957e-06, 'epoch': 0.28} 28%|██▊ | 6291/22095 [10:53:16<15:57:00, 3.63s/it] 28%|██▊ | 6292/22095 [10:53:19<15:11:42, 3.46s/it] {'loss': 0.3627, 'grad_norm': 0.6682820561545953, 'learning_rate': 8.392873346035992e-06, 'epoch': 0.28} 28%|██▊ | 6292/22095 [10:53:19<15:11:42, 3.46s/it] 28%|██▊ | 6293/22095 
[10:53:22<14:48:44, 3.37s/it] {'loss': 0.3614, 'grad_norm': 0.6039770807957126, 'learning_rate': 8.392334956064598e-06, 'epoch': 0.28} 28%|██▊ | 6293/22095 [10:53:22<14:48:44, 3.37s/it] 28%|██▊ | 6294/22095 [10:53:26<15:53:38, 3.62s/it] {'loss': 0.3419, 'grad_norm': 0.6601147390295486, 'learning_rate': 8.391796493202346e-06, 'epoch': 0.28} 28%|██▊ | 6294/22095 [10:53:26<15:53:38, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46594 > 40960). Running this sequence through the model will result in indexing errors 28%|██▊ | 6295/22095 [10:53:30<15:58:05, 3.64s/it] {'loss': 0.377, 'grad_norm': 0.6409892616716775, 'learning_rate': 8.391257957460803e-06, 'epoch': 0.28} 28%|██▊ | 6295/22095 [10:53:30<15:58:05, 3.64s/it] 28%|██▊ | 6296/22095 [10:53:33<15:28:48, 3.53s/it] {'loss': 0.3615, 'grad_norm': 0.6090673283201767, 'learning_rate': 8.390719348851544e-06, 'epoch': 0.28} 28%|██▊ | 6296/22095 [10:53:33<15:28:48, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73340 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41154 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41410 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75982 > 40960). 
Running this sequence through the model will result in indexing errors 28%|██▊ | 6297/22095 [10:53:36<14:58:01, 3.41s/it] {'loss': 0.3521, 'grad_norm': 0.6125879242381763, 'learning_rate': 8.390180667386138e-06, 'epoch': 0.28} 28%|██▊ | 6297/22095 [10:53:36<14:58:01, 3.41s/it] 29%|██▊ | 6298/22095 [10:53:39<14:31:07, 3.31s/it] {'loss': 0.3305, 'grad_norm': 0.7012367365291511, 'learning_rate': 8.389641913076163e-06, 'epoch': 0.29} 29%|██▊ | 6298/22095 [10:53:39<14:31:07, 3.31s/it] 29%|██▊ | 6299/22095 [10:53:43<14:45:10, 3.36s/it] {'loss': 0.3889, 'grad_norm': 0.6452780407011524, 'learning_rate': 8.389103085933192e-06, 'epoch': 0.29} 29%|██▊ | 6299/22095 [10:53:43<14:45:10, 3.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (81744 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92799 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110458 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48824 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99958 > 40960). 
Running this sequence through the model will result in indexing errors 29%|██▊ | 6300/22095 [10:53:49<19:02:15, 4.34s/it] {'loss': 0.5188, 'grad_norm': 0.4827058990269452, 'learning_rate': 8.388564185968805e-06, 'epoch': 0.29} 29%|██▊ | 6300/22095 [10:53:49<19:02:15, 4.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42632 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43267 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43967 > 40960). Running this sequence through the model will result in indexing errors 29%|██▊ | 6301/22095 [10:53:53<18:06:31, 4.13s/it] {'loss': 0.368, 'grad_norm': 0.6965575404225468, 'learning_rate': 8.388025213194585e-06, 'epoch': 0.29} 29%|██▊ | 6301/22095 [10:53:53<18:06:31, 4.13s/it] 29%|██▊ | 6302/22095 [10:53:59<21:00:11, 4.79s/it] {'loss': 0.4024, 'grad_norm': 0.7127863519099163, 'learning_rate': 8.387486167622103e-06, 'epoch': 0.29} 29%|██▊ | 6302/22095 [10:53:59<21:00:11, 4.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 29%|██▊ | 6303/22095 [10:54:10<28:14:36, 6.44s/it] {'loss': 0.5249, 'grad_norm': 0.3250995844318269, 'learning_rate': 8.38694704926295e-06, 'epoch': 0.29} 29%|██▊ | 6303/22095 [10:54:10<28:14:36, 6.44s/it] 29%|██▊ | 6304/22095 [10:54:13<24:05:24, 5.49s/it] {'loss': 0.3605, 'grad_norm': 0.6735640437482957, 'learning_rate': 8.386407858128707e-06, 'epoch': 0.29} 29%|██▊ | 6304/22095 [10:54:13<24:05:24, 5.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 29%|██▊ | 6305/22095 [10:54:23<29:29:18, 6.72s/it] {'loss': 0.5164, 'grad_norm': 0.3165298941877711, 'learning_rate': 8.385868594230958e-06, 'epoch': 0.29} 29%|██▊ | 6305/22095 [10:54:23<29:29:18, 
6.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107006 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42999 > 40960) for 4 sample(s). Truncating to 2039 with 3 samples. 29%|██▊ | 6306/22095 [10:54:27<26:06:00, 5.95s/it] {'loss': 0.3368, 'grad_norm': 0.6564690706710307, 'learning_rate': 8.385329257581295e-06, 'epoch': 0.29} 29%|██▊ | 6306/22095 [10:54:27<26:06:00, 5.95s/it] 29%|██▊ | 6307/22095 [10:54:30<22:23:48, 5.11s/it] {'loss': 0.3394, 'grad_norm': 0.6489326868070335, 'learning_rate': 8.3847898481913e-06, 'epoch': 0.29} 29%|██▊ | 6307/22095 [10:54:30<22:23:48, 5.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 29%|██▊ | 6308/22095 [10:54:38<26:29:39, 6.04s/it] {'loss': 0.5119, 'grad_norm': 0.40721945360601186, 'learning_rate': 8.384250366072568e-06, 'epoch': 0.29} 29%|██▊ | 6308/22095 [10:54:38<26:29:39, 6.04s/it] 29%|██▊ | 6309/22095 [10:54:46<29:11:05, 6.66s/it] {'loss': 0.4886, 'grad_norm': 0.33191010690961714, 'learning_rate': 8.38371081123669e-06, 'epoch': 0.29} 29%|██▊ | 6309/22095 [10:54:46<29:11:05, 6.66s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 29%|██▊ | 6310/22095 [10:54:50<25:52:55, 5.90s/it] {'loss': 0.3459, 'grad_norm': 0.7339684742337894, 'learning_rate': 8.383171183695258e-06, 'epoch': 0.29} 29%|██▊ | 6310/22095 [10:54:50<25:52:55, 5.90s/it] 29%|██▊ | 6311/22095 [10:54:54<22:26:52, 5.12s/it] {'loss': 0.4392, 'grad_norm': 0.7711528963712407, 'learning_rate': 8.382631483459869e-06, 'epoch': 0.29} 29%|██▊ | 6311/22095 [10:54:54<22:26:52, 5.12s/it] 29%|██▊ | 6312/22095 [10:54:57<19:32:57, 4.46s/it] {'loss': 0.3706, 'grad_norm': 0.6589848254857321, 'learning_rate': 8.382091710542118e-06, 'epoch': 0.29} 29%|██▊ | 6312/22095 [10:54:57<19:32:57, 4.46s/it]Rank 0: Number of image tokens 0 does not match number of 
images 1 Rank 0: Fixed image tokens in the conversation 29%|██▊ | 6313/22095 [10:55:01<19:07:58, 4.36s/it] {'loss': 0.3859, 'grad_norm': 0.6757671235326732, 'learning_rate': 8.381551864953603e-06, 'epoch': 0.29} 29%|██▊ | 6313/22095 [10:55:01<19:07:58, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 29%|██▊ | 6314/22095 [10:55:07<21:37:44, 4.93s/it] {'loss': 0.5095, 'grad_norm': 0.449617795828226, 'learning_rate': 8.381011946705926e-06, 'epoch': 0.29} 29%|██▊ | 6314/22095 [10:55:07<21:37:44, 4.93s/it] 29%|██▊ | 6315/22095 [10:55:10<19:22:50, 4.42s/it] {'loss': 0.3896, 'grad_norm': 0.7213903453427956, 'learning_rate': 8.380471955810685e-06, 'epoch': 0.29} 29%|██▊ | 6315/22095 [10:55:10<19:22:50, 4.42s/it] 29%|██▊ | 6316/22095 [10:55:14<19:11:00, 4.38s/it] {'loss': 0.3962, 'grad_norm': 0.6514003171057873, 'learning_rate': 8.379931892279483e-06, 'epoch': 0.29} 29%|██▊ | 6316/22095 [10:55:14<19:11:00, 4.38s/it] 29%|██▊ | 6317/22095 [10:55:18<18:10:55, 4.15s/it] {'loss': 0.366, 'grad_norm': 0.7085896477424691, 'learning_rate': 8.379391756123927e-06, 'epoch': 0.29} 29%|██▊ | 6317/22095 [10:55:18<18:10:55, 4.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59717 > 40960). 
Running this sequence through the model will result in indexing errors 29%|██▊ | 6318/22095 [10:55:21<16:38:59, 3.80s/it] {'loss': 0.3723, 'grad_norm': 0.6803649542757192, 'learning_rate': 8.37885154735562e-06, 'epoch': 0.29} 29%|██▊ | 6318/22095 [10:55:21<16:38:59, 3.80s/it] 29%|██▊ | 6319/22095 [10:55:24<15:51:43, 3.62s/it] {'loss': 0.373, 'grad_norm': 1.2716815933415464, 'learning_rate': 8.37831126598617e-06, 'epoch': 0.29} 29%|██▊ | 6319/22095 [10:55:24<15:51:43, 3.62s/it] 29%|██▊ | 6320/22095 [10:55:28<15:29:17, 3.53s/it] {'loss': 0.4031, 'grad_norm': 0.6554974108933356, 'learning_rate': 8.377770912027187e-06, 'epoch': 0.29} 29%|██▊ | 6320/22095 [10:55:28<15:29:17, 3.53s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 29%|██▊ | 6321/22095 [10:55:31<14:59:13, 3.42s/it] {'loss': 0.3675, 'grad_norm': 0.7453757586179238, 'learning_rate': 8.377230485490282e-06, 'epoch': 0.29} 29%|██▊ | 6321/22095 [10:55:31<14:59:13, 3.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (114587 > 40960). 
Running this sequence through the model will result in indexing errors 29%|██▊ | 6322/22095 [10:55:34<14:13:00, 3.24s/it] {'loss': 0.3803, 'grad_norm': 0.7501007963538981, 'learning_rate': 8.376689986387066e-06, 'epoch': 0.29} 29%|██▊ | 6322/22095 [10:55:34<14:13:00, 3.24s/it] 29%|██▊ | 6323/22095 [10:55:37<13:50:26, 3.16s/it] {'loss': 0.3433, 'grad_norm': 0.6447615198903361, 'learning_rate': 8.376149414729154e-06, 'epoch': 0.29} 29%|██▊ | 6323/22095 [10:55:37<13:50:26, 3.16s/it] 29%|██▊ | 6324/22095 [10:55:40<13:45:49, 3.14s/it] {'loss': 0.3758, 'grad_norm': 0.7109482396314323, 'learning_rate': 8.375608770528157e-06, 'epoch': 0.29} 29%|██▊ | 6324/22095 [10:55:40<13:45:49, 3.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8358151 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 24862, 'image': 'vrdu_table_final_2/astro-ph.CO/2c76e536-b440-4079-a13e-73875ed15621.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398263 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 414, 'image': 'vrdu_table_final_2/astro-ph.CO/6df719ab-f533-45f6-9e4b-f88d3e842108.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 29%|██▊ | 6325/22095 [10:55:48<20:41:40, 4.72s/it] {'loss': 0.4917, 'grad_norm': 0.3596396620662271, 'learning_rate': 8.375068053795697e-06, 'epoch': 0.29} 29%|██▊ | 6325/22095 [10:55:48<20:41:40, 4.72s/it] 29%|██▊ | 6326/22095 [10:55:54<21:52:09, 4.99s/it] {'loss': 0.4738, 'grad_norm': 0.3406801314012656, 'learning_rate': 8.37452726454339e-06, 'epoch': 0.29} 29%|██▊ | 6326/22095 [10:55:54<21:52:09, 4.99s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (108992 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44079 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72173 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44170 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75961 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41812 > 40960). 
Running this sequence through the model will result in indexing errors 29%|██▊ | 6327/22095 [10:55:57<20:09:19, 4.60s/it] {'loss': 0.3971, 'grad_norm': 0.6914537614782104, 'learning_rate': 8.373986402782857e-06, 'epoch': 0.29} 29%|██▊ | 6327/22095 [10:55:57<20:09:19, 4.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8942807 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 65960, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知段AB=12,则将段AB延伸至点C,使BC=\\ frac{1}{2}AB,点D为段AC的中点,段BD的长度为()\nA. 5\nB. 6\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8893505 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 16658, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 6cm\nB. 12cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 29%|██▊ | 6328/22095 [10:56:00<18:12:50, 4.16s/it] {'loss': 0.3774, 'grad_norm': 0.6957523570372517, 'learning_rate': 8.373445468525719e-06, 'epoch': 0.29} 29%|██▊ | 6328/22095 [10:56:00<18:12:50, 4.16s/it] 29%|██▊ | 6329/22095 [10:56:04<17:37:42, 4.03s/it] {'loss': 0.3812, 'grad_norm': 0.6690897788009853, 'learning_rate': 8.372904461783596e-06, 'epoch': 0.29} 29%|██▊ | 6329/22095 [10:56:04<17:37:42, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 29%|██▊ | 6330/22095 [10:56:11<21:21:09, 4.88s/it] {'loss': 0.5283, 'grad_norm': 0.3872804952218078, 'learning_rate': 8.372363382568116e-06, 'epoch': 0.29} 29%|██▊ | 6330/22095 [10:56:11<21:21:09, 4.88s/it] 29%|██▊ | 6331/22095 [10:56:15<20:00:38, 4.57s/it] {'loss': 0.3606, 'grad_norm': 0.7496446064848421, 'learning_rate': 8.371822230890905e-06, 'epoch': 0.29} 29%|██▊ | 6331/22095 [10:56:15<20:00:38, 4.57s/it] 29%|██▊ | 6332/22095 [10:56:18<18:10:21, 4.15s/it] {'loss': 0.4128, 'grad_norm': 0.754423666023919, 'learning_rate': 8.371281006763589e-06, 'epoch': 0.29} 29%|██▊ | 6332/22095 [10:56:18<18:10:21, 4.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 29%|██▊ | 6333/22095 [10:56:22<17:15:50, 3.94s/it] {'loss': 0.3817, 'grad_norm': 0.6736836893255465, 'learning_rate': 8.3707397101978e-06, 'epoch': 0.29} 29%|██▊ | 6333/22095 [10:56:22<17:15:50, 3.94s/it] 29%|██▊ | 6334/22095 [10:56:25<16:40:42, 3.81s/it] {'loss': 0.3441, 'grad_norm': 0.6353507976696714, 'learning_rate': 8.370198341205167e-06, 'epoch': 0.29} 29%|██▊ | 6334/22095 [10:56:25<16:40:42, 3.81s/it]Invalidate trace cache @ step 2: expected 
module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▊ | 6335/22095 [10:56:32<20:35:33, 4.70s/it] {'loss': 0.5078, 'grad_norm': 0.40956676546522564, 'learning_rate': 8.36965689979732e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (75772 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47149 > 40960). Running this sequence through the model will result in indexing errors
29%|██▊ | 6336/22095 [10:56:36<19:31:50, 4.46s/it] {'loss': 0.3762, 'grad_norm': 0.740988659745703, 'learning_rate': 8.369115385985897e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (55538 > 40960). Running this sequence through the model will result in indexing errors
29%|██▊ | 6337/22095 [10:56:39<18:28:23, 4.22s/it] {'loss': 0.4027, 'grad_norm': 0.7073140874733087, 'learning_rate': 8.368573799782533e-06, 'epoch': 0.29}
29%|██▊ | 6338/22095 [10:56:42<16:55:01, 3.87s/it] {'loss': 0.3663, 'grad_norm': 0.6412217710564962, 'learning_rate': 8.368032141198864e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▊ | 6339/22095 [10:56:46<16:32:11, 3.78s/it] {'loss': 0.3905, 'grad_norm': 0.6852482683651501, 'learning_rate': 8.367490410246525e-06, 'epoch': 0.29}
29%|██▊ | 6340/22095 [10:56:49<16:01:27, 3.66s/it] {'loss': 0.3932, 'grad_norm': 0.6830362986429956, 'learning_rate': 8.366948606937161e-06, 'epoch': 0.29}
29%|██▊ | 6341/22095 [10:56:53<15:20:29, 3.51s/it] {'loss': 0.3571, 'grad_norm': 0.6531315592813901, 'learning_rate': 8.366406731282415e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (65252 > 40960). Running this sequence through the model will result in indexing errors
29%|██▊ | 6342/22095 [10:56:56<15:53:05, 3.63s/it] {'loss': 0.3852, 'grad_norm': 0.6840745659773954, 'learning_rate': 8.365864783293925e-06, 'epoch': 0.29}
29%|██▊ | 6343/22095 [10:56:59<14:56:35, 3.42s/it] {'loss': 0.3673, 'grad_norm': 0.7243044316163573, 'learning_rate': 8.36532276298334e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▊ | 6344/22095 [10:57:09<22:47:16, 5.21s/it] {'loss': 0.4866, 'grad_norm': 0.3596143601853733, 'learning_rate': 8.364780670362302e-06, 'epoch': 0.29}
29%|██▊ | 6345/22095 [10:57:18<28:20:58, 6.48s/it] {'loss': 0.4754, 'grad_norm': 0.3369326536107606, 'learning_rate': 8.364238505442462e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▊ | 6346/22095 [10:57:21<23:50:10, 5.45s/it] {'loss': 0.4005, 'grad_norm': 0.7430363453770691, 'learning_rate': 8.36369626823547e-06, 'epoch': 0.29}
29%|██▊ | 6347/22095 [10:57:25<21:37:38, 4.94s/it] {'loss': 0.382, 'grad_norm': 0.6433857629537567, 'learning_rate': 8.363153958752976e-06, 'epoch': 0.29}
29%|██▊ | 6348/22095 [10:57:29<19:50:15, 4.54s/it] {'loss': 0.3565, 'grad_norm': 0.6350133465496026, 'learning_rate': 8.362611577006632e-06, 'epoch': 0.29}
29%|██▊ | 6349/22095 [10:57:31<17:34:48, 4.02s/it] {'loss': 0.3779, 'grad_norm': 0.6169981326762816, 'learning_rate': 8.362069123008092e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▊ | 6350/22095 [10:57:39<22:02:12, 5.04s/it] {'loss': 0.4995, 'grad_norm': 0.45227784655374936, 'learning_rate': 8.361526596769013e-06, 'epoch': 0.29}
29%|██▊ | 6351/22095 [10:57:42<19:31:22, 4.46s/it] {'loss': 0.3873, 'grad_norm': 0.9147592151032556, 'learning_rate': 8.360983998301053e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▊ | 6352/22095 [10:57:45<17:28:19, 4.00s/it] {'loss': 0.3806, 'grad_norm': 0.6547204163805698, 'learning_rate': 8.360441327615868e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6353/22095 [10:57:50<18:35:05, 4.25s/it] {'loss': 0.3651, 'grad_norm': 0.7098358675605005, 'learning_rate': 8.35989858472512e-06, 'epoch': 0.29}
29%|██▉ | 6354/22095 [10:57:52<16:38:24, 3.81s/it] {'loss': 0.3561, 'grad_norm': 0.7335097713307398, 'learning_rate': 8.359355769640472e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (73946 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51413 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6355/22095 [10:57:56<15:43:50, 3.60s/it] {'loss': 0.3531, 'grad_norm': 0.7783977190478998, 'learning_rate': 8.358812882373584e-06, 'epoch': 0.29}
29%|██▉ | 6356/22095 [10:57:59<14:55:49, 3.42s/it] {'loss': 0.3537, 'grad_norm': 0.6820434180148108, 'learning_rate': 8.358269922936121e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (62554 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6357/22095 [10:58:03<16:26:34, 3.76s/it] {'loss': 0.3764, 'grad_norm': 0.6367159751994629, 'learning_rate': 8.357726891339756e-06, 'epoch': 0.29}
29%|██▉ | 6358/22095 [10:58:06<15:40:55, 3.59s/it] {'loss': 0.3354, 'grad_norm': 0.7413979855286208, 'learning_rate': 8.357183787596151e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6359/22095 [10:58:09<14:46:37, 3.38s/it] {'loss': 0.3576, 'grad_norm': 0.7893411212943653, 'learning_rate': 8.356640611716976e-06, 'epoch': 0.29}
29%|██▉ | 6360/22095 [10:58:13<14:46:12, 3.38s/it] {'loss': 0.3546, 'grad_norm': 0.6973211159466143, 'learning_rate': 8.356097363713904e-06, 'epoch': 0.29}
29%|██▉ | 6361/22095 [10:58:16<14:47:05, 3.38s/it] {'loss': 0.3581, 'grad_norm': 0.6283123119027827, 'learning_rate': 8.355554043598608e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6362/22095 [10:58:23<19:51:48, 4.55s/it] {'loss': 0.4921, 'grad_norm': 0.3994029621696445, 'learning_rate': 8.35501065138276e-06, 'epoch': 0.29}
29%|██▉ | 6363/22095 [10:58:30<23:05:45, 5.29s/it] {'loss': 0.5089, 'grad_norm': 0.35484264290434336, 'learning_rate': 8.354467187078037e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6364/22095 [10:58:34<21:08:53, 4.84s/it] {'loss': 0.3948, 'grad_norm': 0.8314561846376063, 'learning_rate': 8.353923650696119e-06, 'epoch': 0.29}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
29%|██▉ | 6365/22095 [10:58:43<26:57:48, 6.17s/it] {'loss': 0.5093, 'grad_norm': 0.33738649343966226, 'learning_rate': 8.35338004224868e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6366/22095 [10:58:47<23:38:59, 5.41s/it] {'loss': 0.4174, 'grad_norm': 0.6726203777236593, 'learning_rate': 8.352836361747403e-06, 'epoch': 0.29}
29%|██▉ | 6367/22095 [10:58:52<23:03:29, 5.28s/it] {'loss': 0.3607, 'grad_norm': 0.6518077292485198, 'learning_rate': 8.352292609203973e-06, 'epoch': 0.29}
29%|██▉ | 6368/22095 [10:58:55<19:44:16, 4.52s/it] {'loss': 0.3921, 'grad_norm': 0.6762160616346438, 'learning_rate': 8.351748784630068e-06, 'epoch': 0.29}
29%|██▉ | 6369/22095 [10:58:58<18:34:01, 4.25s/it] {'loss': 0.4079, 'grad_norm': 0.6930960439817764, 'learning_rate': 8.351204888037377e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6370/22095 [10:59:02<18:05:03, 4.14s/it] {'loss': 0.32, 'grad_norm': 0.6448827108519407, 'learning_rate': 8.350660919437585e-06, 'epoch': 0.29}
29%|██▉ | 6371/22095 [10:59:06<17:21:52, 3.98s/it] {'loss': 0.3314, 'grad_norm': 0.6114651259864918, 'learning_rate': 8.350116878842379e-06, 'epoch': 0.29}
29%|██▉ | 6372/22095 [10:59:09<16:09:58, 3.70s/it] {'loss': 0.3434, 'grad_norm': 0.6361346844899176, 'learning_rate': 8.349572766263452e-06, 'epoch': 0.29}
29%|██▉ | 6373/22095 [10:59:12<15:08:29, 3.47s/it] {'loss': 0.3732, 'grad_norm': 0.6258771935514704, 'learning_rate': 8.349028581712493e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6374/22095 [10:59:16<15:55:04, 3.65s/it] {'loss': 0.3555, 'grad_norm': 0.6344175862930747, 'learning_rate': 8.348484325201196e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8529434 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 88259, 'image': 'vrdu_texteq/astro-ph.CO/164858c4-5590-4cb4-91ae-c8a699d3d16f.png', 'image_wh': [[350, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'with $2\\times 2$ inverse covariance'}]}
29%|██▉ | 6375/22095 [10:59:19<15:46:58, 3.61s/it] {'loss': 0.3597, 'grad_norm': 0.6512433095639879, 'learning_rate': 8.347939996741255e-06, 'epoch': 0.29}
29%|██▉ | 6376/22095 [10:59:23<15:10:59, 3.48s/it] {'loss': 0.3672, 'grad_norm': 0.6347448345003477, 'learning_rate': 8.347395596344365e-06, 'epoch': 0.29}
29%|██▉ | 6377/22095 [10:59:26<15:37:40, 3.58s/it] {'loss': 0.3458, 'grad_norm': 0.7352586053277483, 'learning_rate': 8.346851124022226e-06, 'epoch': 0.29}
29%|██▉ | 6378/22095 [10:59:30<15:31:01, 3.55s/it] {'loss': 0.3825, 'grad_norm': 0.6173551112383586, 'learning_rate': 8.346306579786536e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6379/22095 [10:59:33<14:34:58, 3.34s/it] {'loss': 0.3867, 'grad_norm': 0.7178480662247867, 'learning_rate': 8.345761963648993e-06, 'epoch': 0.29}
29%|██▉ | 6380/22095 [10:59:37<15:34:01, 3.57s/it] {'loss': 0.3606, 'grad_norm': 0.6351496353698858, 'learning_rate': 8.345217275621303e-06, 'epoch': 0.29}
29%|██▉ | 6381/22095 [10:59:41<16:30:03, 3.78s/it] {'loss': 0.3461, 'grad_norm': 0.6624551672993362, 'learning_rate': 8.344672515715165e-06, 'epoch': 0.29}
29%|██▉ | 6382/22095 [10:59:45<16:18:38, 3.74s/it] {'loss': 0.4002, 'grad_norm': 0.650563289492609, 'learning_rate': 8.344127683942289e-06, 'epoch': 0.29}
29%|██▉ | 6383/22095 [10:59:48<15:38:35, 3.58s/it] {'loss': 0.3572, 'grad_norm': 0.6078861028555593, 'learning_rate': 8.34358278031438e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (61207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42103 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69129 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55168 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91479 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6384/22095 [10:59:52<16:02:55, 3.68s/it] {'loss': 0.3589, 'grad_norm': 0.6120498255653158, 'learning_rate': 8.343037804843143e-06, 'epoch': 0.29}
29%|██▉ | 6385/22095 [10:59:55<14:54:38, 3.42s/it] {'loss': 0.3579, 'grad_norm': 0.6481854290974546, 'learning_rate': 8.342492757540294e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6386/22095 [11:00:06<24:54:05, 5.71s/it] {'loss': 0.4952, 'grad_norm': 0.5468373204491022, 'learning_rate': 8.34194763841754e-06, 'epoch': 0.29}
29%|██▉ | 6387/22095 [11:00:10<22:56:09, 5.26s/it] {'loss': 0.4117, 'grad_norm': 0.7190267617607462, 'learning_rate': 8.341402447486598e-06, 'epoch': 0.29}
29%|██▉ | 6388/22095 [11:00:14<21:03:20, 4.83s/it] {'loss': 0.3373, 'grad_norm': 0.6336260112629661, 'learning_rate': 8.340857184759178e-06, 'epoch': 0.29}
29%|██▉ | 6389/22095 [11:00:17<19:39:10, 4.50s/it] {'loss': 0.3619, 'grad_norm': 0.6630374397436601, 'learning_rate': 8.340311850246996e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6390/22095 [11:00:22<19:06:35, 4.38s/it] {'loss': 0.3494, 'grad_norm': 0.7371034281394628, 'learning_rate': 8.339766443961772e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (66049 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43046 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72821 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6391/22095 [11:00:25<18:15:27, 4.19s/it] {'loss': 0.3795, 'grad_norm': 0.6538041235259925, 'learning_rate': 8.339220965915227e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6392/22095 [11:00:35<25:30:33, 5.85s/it] {'loss': 0.4796, 'grad_norm': 0.38119731417128494, 'learning_rate': 8.338675416119076e-06, 'epoch': 0.29}
29%|██▉ | 6393/22095 [11:00:39<22:33:33, 5.17s/it] {'loss': 0.3564, 'grad_norm': 0.726785165671723, 'learning_rate': 8.338129794585047e-06, 'epoch': 0.29}
29%|██▉ | 6394/22095 [11:00:42<20:32:23, 4.71s/it] {'loss': 0.3779, 'grad_norm': 0.6533301961414332, 'learning_rate': 8.337584101324859e-06, 'epoch': 0.29}
29%|██▉ | 6395/22095 [11:00:46<18:51:09, 4.32s/it] {'loss': 0.3447, 'grad_norm': 0.6282052343937943, 'learning_rate': 8.337038336350238e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (52093 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44646 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84779 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6396/22095 [11:00:49<17:21:58, 3.98s/it] {'loss': 0.3495, 'grad_norm': 0.6293806472535389, 'learning_rate': 8.336492499672915e-06, 'epoch': 0.29}
29%|██▉ | 6397/22095 [11:00:52<16:44:01, 3.84s/it] {'loss': 0.3557, 'grad_norm': 0.5981108898546699, 'learning_rate': 8.335946591304614e-06, 'epoch': 0.29}
29%|██▉ | 6398/22095 [11:00:56<16:00:03, 3.67s/it] {'loss': 0.4098, 'grad_norm': 0.6777571348756583, 'learning_rate': 8.335400611257067e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (65750 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49261 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52093 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6399/22095 [11:01:04<22:13:39, 5.10s/it] {'loss': 0.5415, 'grad_norm': 0.3942290080698544, 'learning_rate': 8.334854559542004e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (136937 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6400/22095 [11:01:07<19:38:16, 4.50s/it] {'loss': 0.4219, 'grad_norm': 0.6961096511585597, 'learning_rate': 8.334308436171159e-06, 'epoch': 0.29}
29%|██▉ | 6401/22095 [11:01:11<18:22:04, 4.21s/it] {'loss': 0.3602, 'grad_norm': 0.6754681406841953, 'learning_rate': 8.333762241156268e-06, 'epoch': 0.29}
29%|██▉ | 6402/22095 [11:01:14<17:05:34, 3.92s/it] {'loss': 0.3734, 'grad_norm': 0.7790640942022196, 'learning_rate': 8.33321597450906e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6403/22095 [11:01:23<23:22:22, 5.36s/it] {'loss': 0.4947, 'grad_norm': 0.31179722449465863, 'learning_rate': 8.332669636241284e-06, 'epoch': 0.29}
29%|██▉ | 6404/22095 [11:01:31<26:44:24, 6.13s/it] {'loss': 0.5166, 'grad_norm': 0.32547128381188095, 'learning_rate': 8.33212322636467e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6405/22095 [11:01:34<23:44:20, 5.45s/it] {'loss': 0.4074, 'grad_norm': 0.6859148528664963, 'learning_rate': 8.331576744890963e-06, 'epoch': 0.29}
29%|██▉ | 6406/22095 [11:01:38<21:48:45, 5.01s/it] {'loss': 0.3552, 'grad_norm': 0.7084680498510506, 'learning_rate': 8.331030191831904e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (85688 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6407/22095 [11:01:50<29:54:39, 6.86s/it] {'loss': 0.4949, 'grad_norm': 0.29201909746466553, 'learning_rate': 8.330483567199234e-06, 'epoch': 0.29}
29%|██▉ | 6408/22095 [11:02:00<35:00:49, 8.04s/it] {'loss': 0.4922, 'grad_norm': 0.30213810850562034, 'learning_rate': 8.329936871004703e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (45186 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6409/22095 [11:02:05<30:18:01, 6.95s/it] {'loss': 0.3748, 'grad_norm': 0.6708127394690835, 'learning_rate': 8.329390103260057e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893507 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16660, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6cm'}]}
29%|██▉ | 6410/22095 [11:02:09<26:17:36, 6.03s/it] {'loss': 0.3585, 'grad_norm': 0.7416775574171741, 'learning_rate': 8.32884326397704e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6411/22095 [11:02:12<22:42:02, 5.21s/it] {'loss': 0.3609, 'grad_norm': 0.5993879488501103, 'learning_rate': 8.328296353167408e-06, 'epoch': 0.29}
29%|██▉ | 6412/22095 [11:02:16<21:37:55, 4.97s/it] {'loss': 0.3863, 'grad_norm': 0.6371880564391219, 'learning_rate': 8.327749370842909e-06, 'epoch': 0.29}
29%|██▉ | 6413/22095 [11:02:19<18:49:26, 4.32s/it] {'loss': 0.353, 'grad_norm': 0.618451111883719, 'learning_rate': 8.327202317015295e-06, 'epoch': 0.29}
29%|██▉ | 6414/22095 [11:02:23<18:37:35, 4.28s/it] {'loss': 0.4184, 'grad_norm': 0.6249599044660648, 'learning_rate': 8.326655191696322e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6415/22095 [11:02:32<24:41:22, 5.67s/it] {'loss': 0.4931, 'grad_norm': 0.34375589339921275, 'learning_rate': 8.326107994897748e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6416/22095 [11:02:36<21:50:07, 5.01s/it] {'loss': 0.3652, 'grad_norm': 0.6602548060548115, 'learning_rate': 8.325560726631325e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6417/22095 [11:02:46<28:05:45, 6.45s/it] {'loss': 0.4795, 'grad_norm': 0.33527913002483645, 'learning_rate': 8.325013386908817e-06, 'epoch': 0.29}
29%|██▉ | 6418/22095 [11:02:49<23:53:02, 5.48s/it] {'loss': 0.3442, 'grad_norm': 0.6547073505960216, 'learning_rate': 8.324465975741986e-06, 'epoch': 0.29}
29%|██▉ | 6419/22095 [11:02:52<20:56:47, 4.81s/it] {'loss': 0.3898, 'grad_norm': 0.6652996596938084, 'learning_rate': 8.323918493142588e-06, 'epoch': 0.29}
29%|██▉ | 6420/22095 [11:02:56<19:18:46, 4.44s/it] {'loss': 0.3706, 'grad_norm': 0.64028964625868, 'learning_rate': 8.323370939122393e-06, 'epoch': 0.29}
29%|██▉ | 6421/22095 [11:02:59<17:29:43, 4.02s/it] {'loss': 0.3617, 'grad_norm': 0.680854532340238, 'learning_rate': 8.322823313693162e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (115868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41373 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51379 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110242 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6422/22095 [11:03:01<15:48:55, 3.63s/it] {'loss': 0.4334, 'grad_norm': 0.6684202593248905, 'learning_rate': 8.322275616866663e-06, 'epoch': 0.29}
29%|██▉ | 6423/22095 [11:03:05<15:10:35, 3.49s/it] {'loss': 0.3495, 'grad_norm': 0.5962055859753047, 'learning_rate': 8.321727848654666e-06, 'epoch': 0.29}
29%|██▉ | 6424/22095 [11:03:08<15:00:41, 3.45s/it] {'loss': 0.4093, 'grad_norm': 0.6619552340244278, 'learning_rate': 8.321180009068937e-06, 'epoch': 0.29}
29%|██▉ | 6425/22095 [11:03:11<14:09:50, 3.25s/it] {'loss': 0.3796, 'grad_norm': 0.7488039229952417, 'learning_rate': 8.320632098121253e-06, 'epoch': 0.29}
29%|██▉ | 6426/22095 [11:03:14<14:35:44, 3.35s/it] {'loss': 0.3798, 'grad_norm': 0.6722660190015518, 'learning_rate': 8.320084115823382e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (46879 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6427/22095 [11:03:18<15:06:06, 3.47s/it] {'loss': 0.3934, 'grad_norm': 0.6643333815944632, 'learning_rate': 8.3195360621871e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (52300 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6428/22095 [11:03:21<14:57:11, 3.44s/it] {'loss': 0.3645, 'grad_norm': 0.5847662005979445, 'learning_rate': 8.318987937224183e-06, 'epoch': 0.29}
29%|██▉ | 6429/22095 [11:03:25<15:35:39, 3.58s/it] {'loss': 0.4105, 'grad_norm': 0.7669998563625238, 'learning_rate': 8.318439740946409e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (96971 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92648 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6430/22095 [11:03:34<22:09:03, 5.09s/it] {'loss': 0.4751, 'grad_norm': 0.5310405439597425, 'learning_rate': 8.317891473365558e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (60183 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (142709 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6431/22095 [11:03:37<19:33:12, 4.49s/it] {'loss': 0.332, 'grad_norm': 0.6261449752730862, 'learning_rate': 8.317343134493408e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (55303 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6432/22095 [11:03:41<18:18:48, 4.21s/it] {'loss': 0.3862, 'grad_norm': 0.6987428389371922, 'learning_rate': 8.316794724341743e-06, 'epoch': 0.29}
29%|██▉ | 6433/22095 [11:03:44<16:46:12, 3.85s/it] {'loss': 0.3647, 'grad_norm': 0.6655812586635537, 'learning_rate': 8.316246242922345e-06, 'epoch': 0.29}
29%|██▉ | 6434/22095 [11:03:47<15:56:46, 3.67s/it] {'loss': 0.3789, 'grad_norm': 0.6048388635137149, 'learning_rate': 8.315697690247002e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://internvl2/datasets/IAM/image/e06-010.png 2025-08-28 03:01:47.301630 load time: 1123.11 ms
29%|██▉ | 6435/22095 [11:03:56<23:23:08, 5.38s/it] {'loss': 0.5103, 'grad_norm': 0.41803567170471573, 'learning_rate': 8.315149066327498e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [456, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8524481 in VC:s3://internvl-moe-sft-data/. Exception: Image size [456, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30748, 'image': 'vrdu_texteq/astro-ph.CO/eef3e70a-a829-4b53-9138-129871896724.png', 'image_wh': [[456, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'is the Hubble constant at time $t$. But'}]}
29%|██▉ | 6436/22095 [11:04:06<28:43:44, 6.60s/it] {'loss': 0.5231, 'grad_norm': 0.40434775094018255, 'learning_rate': 8.314600371175623e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6437/22095 [11:04:09<24:41:12, 5.68s/it] {'loss': 0.3729, 'grad_norm': 0.6869531963848231, 'learning_rate': 8.314051604803164e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (44293 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61098 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61982 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6438/22095 [11:04:19<29:30:28, 6.78s/it] {'loss': 0.4862, 'grad_norm': 0.29570610897397354, 'learning_rate': 8.313502767221916e-06, 'epoch': 0.29}
29%|██▉ | 6439/22095 [11:04:25<28:34:27, 6.57s/it] {'loss': 0.4805, 'grad_norm': 0.31598816349430026, 'learning_rate': 8.312953858443672e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6440/22095 [11:04:28<24:25:44, 5.62s/it] {'loss': 0.3466, 'grad_norm': 0.80407284756795, 'learning_rate': 8.312404878480222e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6441/22095 [11:04:32<22:19:36, 5.13s/it] {'loss': 0.4, 'grad_norm': 0.6493079758709581, 'learning_rate': 8.311855827343364e-06, 'epoch': 0.29}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
29%|██▉ | 6442/22095 [11:04:35<20:00:52, 4.60s/it] {'loss': 0.4026, 'grad_norm': 0.6575077859486658, 'learning_rate': 8.311306705044898e-06, 'epoch': 0.29}
29%|██▉ | 6443/22095 [11:04:38<17:48:08, 4.09s/it] {'loss': 0.3636, 'grad_norm': 0.6641962626839024, 'learning_rate': 8.31075751159662e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11226296 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 776, 'image': ['MMMU/Architecture_and_Engineering/test_108_image_1.png', 'MMMU/Architecture_and_Engineering/test_108_image_2.png', 'MMMU/Architecture_and_Engineering/test_108_image_3.png', 'MMMU/Architecture_and_Engineering/test_108_image_4.png', 'MMMU/Architecture_and_Engineering/test_108_image_5.png', 'MMMU/Architecture_and_Engineering/test_108_image_6.png'], 'image_wh': [[174, 169], [188, 172], [197, 21], [194, 51], [194, 42], [193, 45]], 'conversations': [{'from': 'human', 'value': "Image-1: \nImage-2: \nImage-3: \nImage-4: \nImage-5: \nImage-6: \nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words.Note that you do not need to give a specific answer to the question.Here are the content and options of this question: Match List I with List II and select the correct answer using the codes given below the lists:\nOptions: ['', '', '', '']"}, {'from': 'gpt', 'value': 'To thoroughly understand the nature of the question and effectively match the given objects and symbols, it is essential to delve into the specific characteristics of each fencing type and the general principles of representation using symbols. This exploration not only involves understanding the physical characteristics of each item but also interpreting how these can be visually symbolized. Below are detailed introductions to each concept and its applications.\n\n### 1. Hedge\n\n**Concept**: \nA hedge is a dense row of shrubs or low trees that forms a boundary or screen. Traditionally, hedges have been used for privacy, windbreaks, or to define property lines. They are living plants that can be shaped and maintained through regular trimming and care.\n\n**Applications**:\n- **Privacy and Security**: Hedges provide a natural barrier against intruders – both human and animal – and offer privacy by creating a visual screen.\n- **Aesthetic Appeal**: They enhance the landscape architecture due to their lush green appearance and ability to be sculpted into various shapes and designs.\n- **Environmental Benefits**: Hedges contribute to biodiversity by providing habitat for wildlife, reducing pollution by trapping particulates, and aiding in air purification.\n\n**Example**: A common example of a hedge is a row of well-maintained boxwood shrubs or privet bushes along the boundary of a garden.\n\n### 2. Wire Fencing\n\n**Concept**: \nWire fencing consists of strands of wire arranged in various configurations, often held up by posts.
It is designed to keep animals in or out and to mark boundaries.\n\n**Types of Wire Fencing**:\n- **Barbed Wire**: Used mainly for livestock containment; it has sharp edges to deter animals.\n- **Woven Wire**: Also known as field fencing, suitable for large animals, with woven grids to provide strength.\n- **Chain Link Fencing**: Often used in urban areas for its durability and low cost.\n\n**Applications**:\n- **Agricultural**: Secures fields and pens for the safety of livestock, protecting them against predators.\n- **Industrial and Urban Use**: Provides security around buildings, parks, and construction sites.\n\n**Example**: Barbed wire fencing is often seen around ranches, protecting cattle and other livestock.\n\n### 3. Pipe Fencing\n\n**Concept**: \nPipe fencing uses metal pipes as posts and rails to construct a sturdy, long-lasting fence. It is highly durable and resistant to damage from animals and weather conditions.\n\n**Applications**:\n- **Ranch and Farm Usage**: Ideal for containing horses and cattle due to its strength and minimal maintenance requirements.\n- **Industrial Use**: Used in facilities that demand high security and safety standards.\n\n**Example**: Pipe fencing is commonly employed in horse stables and large farms where robust containment is crucial.\n\n### 4. Wood Fencing\n\n**Concept**: \nWood fencing is a traditional form of fencing using timber. 
It can be constructed in various styles such as picket, panel, or post and rail.\n\n**Types of Wood Fencing**:\n- **Picket Fence**: Iconic style often used in suburban homes.\n- **Privacy Fence**: Tall and solid to block visibility, commonly used around gardens.\n- **Split Rail**: Rustic style typically found in rural settings.\n\n**Applications**:\n- **Residential**: Provides privacy, defines boundaries, and enhances aesthetic appeal.\n- **Decorative**: Frequently used for purely ornamental reasons in landscaping.\n\n**Example**: A white picket fence is a traditional American symbol of the quintessential suburban lifestyle.\n\n### Symbol Representation in Fencing\n\n**Concept**: \nThe use of symbols to represent physical objects like fencing types is a method employed in mapping, technical drawings, and schematic designs to convey information in a simplified visual format.\n\n**Principles**:\n- **Recognizability**: Symbols should be easily recognizable and associated with the object they represent.\n- **Simplicity**: Symbols are typically simplified versions, capturing the essence rather than detailing every characteristic.\n- **Standardization**: In many fields, symbols are standardized to facilitate universal understanding and interpretation.\n\n**Applications**:\n- **Maps and Diagrams**: Symbols allow for quick reference without overwhelming with details.\n- **Technical Plans**: In construction or engineering, symbols efficiently communicate specifications.\n\n**Example**: On a garden landscape design, a series of “arches” might symbolize a hedge, while “crossed lines” could indicate wire fencing.\n\n### Matching Symbols to Objects\n\n**Process**:\n- **Identify Characteristics**: Assess the characteristics of the object and find the visual element in the symbol that best represents these.\n- **Logical Association**: Consider what aspects of the object are most significant in context and how they relate to the intended symbol.\n- **Contextual Relevance**: The 
match should make sense in the context in which the symbols are being used, reflecting the practical application of each object.\n\n**Example**: A symbol with circular repetitions or continuous flow might align with the linear continuity of a pipe fence, while scattered dots could suggest the irregularity found in wire fencing patterns.\n\nBy understanding these concepts, one can accurately match objects like fences with their symbolic representations, which is a fundamental skill in fields such as landscaping, architecture, and cartography.'}]}
29%|██▉ | 6444/22095 [11:04:42<16:44:41, 3.85s/it] {'loss': 0.3959, 'grad_norm': 0.66221788099013, 'learning_rate': 8.310208247010331e-06, 'epoch': 0.29}
29%|██▉ | 6445/22095 [11:04:44<15:20:59, 3.53s/it] {'loss': 0.3597, 'grad_norm': 0.6939365180392555, 'learning_rate': 8.309658911297833e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6446/22095 [11:04:49<17:03:22, 3.92s/it] {'loss': 0.4985, 'grad_norm': 0.4297312805988011, 'learning_rate': 8.309109504470932e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (44451 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85103 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52767 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55429 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107415 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6447/22095 [11:04:53<16:46:05, 3.86s/it] {'loss': 0.3852, 'grad_norm': 0.6415743882962217, 'learning_rate': 8.308560026541428e-06, 'epoch': 0.29}
29%|██▉ | 6448/22095 [11:04:56<16:21:23, 3.76s/it] {'loss': 0.3674, 'grad_norm': 0.6544798763209472, 'learning_rate': 8.30801047752113e-06, 'epoch': 0.29}
29%|██▉ | 6449/22095 [11:04:59<15:15:31, 3.51s/it] {'loss': 0.3697, 'grad_norm': 0.6687801130362593, 'learning_rate': 8.307460857421849e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6450/22095 [11:05:03<14:59:39, 3.45s/it] {'loss': 0.3401, 'grad_norm': 0.6429403857604996, 'learning_rate': 8.306911166255392e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (70728 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74280 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6451/22095 [11:05:06<14:53:53, 3.43s/it] {'loss': 0.3139, 'grad_norm': 0.5829187355129206, 'learning_rate': 8.306361404033571e-06, 'epoch': 0.29}
29%|██▉ | 6452/22095 [11:05:11<16:15:52, 3.74s/it] {'loss': 0.3806, 'grad_norm': 0.6256295801055626, 'learning_rate': 8.305811570768196e-06, 'epoch': 0.29}
29%|██▉ | 6453/22095 [11:05:13<15:10:44, 3.49s/it] {'loss': 0.3926, 'grad_norm': 0.656436203896976, 'learning_rate': 8.305261666471085e-06, 'epoch': 0.29}
29%|██▉ | 6454/22095 [11:05:17<15:32:07, 3.58s/it] {'loss': 0.3953, 'grad_norm': 0.6134172488872648, 'learning_rate': 8.304711691154052e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (60006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127683 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76068 > 40960).
Running this sequence through the model will result in indexing errors
29%|██▉ | 6455/22095 [11:05:28<24:52:11, 5.72s/it] {'loss': 0.5043, 'grad_norm': 0.38117255138407263, 'learning_rate': 8.304161644828913e-06, 'epoch': 0.29}
29%|██▉ | 6456/22095 [11:05:32<22:58:06, 5.29s/it] {'loss': 0.4098, 'grad_norm': 0.6713297462223095, 'learning_rate': 8.30361152750749e-06, 'epoch': 0.29}
29%|██▉ | 6457/22095 [11:05:36<20:37:26, 4.75s/it] {'loss': 0.3802, 'grad_norm': 0.6994613343471084, 'learning_rate': 8.303061339201601e-06, 'epoch': 0.29}
29%|██▉ | 6458/22095 [11:05:39<18:59:35, 4.37s/it] {'loss': 0.3764, 'grad_norm': 0.6373226817333996, 'learning_rate': 8.302511079923068e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6459/22095 [11:05:50<27:39:21, 6.37s/it] {'loss': 0.5137, 'grad_norm': 0.31101672311753653, 'learning_rate': 8.301960749683715e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045980 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 7cm'}, {'from': 'gpt', 'value': '【解答】解:∵M是AC的中点,N是DB的中点,CD=3cm,AB=9.8cm,∴MC+DN=\\frac{1}{2}(AB-CD)=3.4cm,∴MN=MC+DN+CD=3.4+3=6.4cm.'}]}
29%|██▉ | 6460/22095 [11:05:54<24:01:49, 5.53s/it] {'loss': 0.3826, 'grad_norm': 0.6661226652935572, 'learning_rate': 8.301410348495366e-06, 'epoch': 0.29}
29%|██▉ | 6461/22095 [11:05:58<21:42:41, 5.00s/it] {'loss': 0.3283, 'grad_norm': 0.681465527925491, 'learning_rate': 8.300859876369849e-06, 'epoch': 0.29}
29%|██▉ | 6462/22095 [11:06:01<20:14:09, 4.66s/it] {'loss': 0.3807, 'grad_norm': 0.6503583509986406, 'learning_rate': 8.300309333318992e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [48, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369586 in VC:s3://internvl-moe-sft-data/. Exception: Image size [48, 25, 100, 100] is too small. Minimum size is 28.
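The repeated `ValueError: Image size [...] is too small. Minimum size is 28.` failures in this log all come from a minimum-size check in `data_qwen_2.py` (line 1335). A minimal sketch of what such a pre-filter could look like; the function name, signature, and the idea of filtering before `__getitem__` are assumptions for illustration, only the threshold of 28 pixels comes from the log:

```python
# Hypothetical pre-filter mirroring the "Minimum size is 28" check that
# data_qwen_2.py raises ValueError for. Names here are illustrative only.
MIN_IMAGE_SIZE = 28  # smallest side length accepted, per the log messages


def has_valid_image_sizes(image_wh, min_size=MIN_IMAGE_SIZE):
    """Return True only if every (width, height) pair meets the minimum side length."""
    return all(w >= min_size and h >= min_size for w, h in image_wh)


# The UniGeo sample above, with image_wh [[166, 20]], would be rejected
# up front instead of raising mid-epoch.
```

Filtering such records out of the manifest once, offline, would avoid the repeated `[Try #0] Failed to fetch sample ...` retries seen above.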
Problematic sample: {'id': 36338, 'image': 'vrdu_table_final_2/astro-ph.CO/6dc94e70-7777-454c-b080-735d7749f118.png', 'image_wh': [[48, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\Delta \\theta_s$\\end{tabular}\n```"}]}
29%|██▉ | 6463/22095 [11:06:10<25:28:58, 5.87s/it] {'loss': 0.5103, 'grad_norm': 0.35637099222950214, 'learning_rate': 8.299758719354621e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6464/22095 [11:06:13<21:57:56, 5.06s/it] {'loss': 0.4173, 'grad_norm': 0.9007426007639495, 'learning_rate': 8.299208034488571e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8351816 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 18495, 'image': 'vrdu_table_final_2/astro-ph.CO/b8be4eee-e088-4224-9ec6-2d9ac2e6e344.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
29%|██▉ | 6465/22095 [11:06:17<20:13:56, 4.66s/it] {'loss': 0.4247, 'grad_norm': 0.8978949257967975, 'learning_rate': 8.298657278732673e-06, 'epoch': 0.29}
29%|██▉ | 6466/22095 [11:06:20<17:57:59, 4.14s/it] {'loss': 0.3561, 'grad_norm': 0.7437470268845902, 'learning_rate': 8.298106452098761e-06, 'epoch': 0.29}
29%|██▉ | 6467/22095 [11:06:24<17:46:01, 4.09s/it] {'loss': 0.4265, 'grad_norm': 0.674750975286095, 'learning_rate': 8.297555554598671e-06, 'epoch': 0.29}
29%|██▉ | 6468/22095 [11:06:27<16:30:33, 3.80s/it] {'loss': 0.3432, 'grad_norm': 0.6389898615422173, 'learning_rate': 8.29700458624424e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6469/22095 [11:06:31<16:29:13, 3.80s/it] {'loss': 0.3565, 'grad_norm': 0.6119289376751083, 'learning_rate': 8.296453547047305e-06, 'epoch': 0.29}
29%|██▉ | 6470/22095 [11:06:35<16:29:45, 3.80s/it] {'loss': 0.3678, 'grad_norm': 0.6467003422732659, 'learning_rate': 8.295902437019709e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (57378 > 40960).
Running this sequence through the model will result in indexing errors
29%|██▉ | 6471/22095 [11:06:38<15:21:03, 3.54s/it] {'loss': 0.3772, 'grad_norm': 0.6275114622388187, 'learning_rate': 8.295351256173292e-06, 'epoch': 0.29}
29%|██▉ | 6472/22095 [11:06:42<15:56:56, 3.68s/it] {'loss': 0.3339, 'grad_norm': 0.6070339394501282, 'learning_rate': 8.294800004519895e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (98308 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42140 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50494 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121949 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6473/22095 [11:06:44<14:50:38, 3.42s/it] {'loss': 0.3849, 'grad_norm': 0.6790702159687559, 'learning_rate': 8.294248682071369e-06, 'epoch': 0.29}
29%|██▉ | 6474/22095 [11:06:48<14:32:00, 3.35s/it] {'loss': 0.3665, 'grad_norm': 0.6806848558424293, 'learning_rate': 8.293697288839555e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (53845 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6475/22095 [11:06:51<14:05:48, 3.25s/it] {'loss': 0.392, 'grad_norm': 0.6489635136154929, 'learning_rate': 8.293145824836302e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6476/22095 [11:07:01<22:48:24, 5.26s/it] {'loss': 0.5182, 'grad_norm': 0.37044885389524723, 'learning_rate': 8.29259429007346e-06, 'epoch': 0.29}
29%|██▉ | 6477/22095 [11:07:04<20:07:17, 4.64s/it] {'loss': 0.357, 'grad_norm': 0.655501620363715, 'learning_rate': 8.292042684562878e-06, 'epoch': 0.29}
29%|██▉ | 6478/22095 [11:07:07<18:22:21, 4.24s/it] {'loss': 0.3688, 'grad_norm': 0.9507463591648303, 'learning_rate': 8.291491008316409e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6479/22095 [11:07:15<22:50:28, 5.27s/it] {'loss': 0.4761, 'grad_norm': 0.32278269268452087, 'learning_rate': 8.290939261345908e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (55592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81208 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90772 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6480/22095 [11:07:18<20:04:00, 4.63s/it] {'loss': 0.3683, 'grad_norm': 0.6977751345184396, 'learning_rate': 8.29038744366323e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6780323 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'inventor/20250511_134843_1/images/before_screenshot_1_id_149_function_1_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': '\nUse the Split tool to cut your model along selected faces or planes.'}, {'from': 'gpt', 'value': '\nclick(x=0.6880, y=0.5158)\n'}], 'width': 3024, 'height': 1964}
29%|██▉ | 6481/22095 [11:07:26<24:33:06, 5.66s/it] {'loss': 0.4853, 'grad_norm': 0.30990544000558706, 'learning_rate': 8.28983555528023e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (98768 > 40960).
Running this sequence through the model will result in indexing errors
29%|██▉ | 6482/22095 [11:07:29<21:18:16, 4.91s/it] {'loss': 0.3804, 'grad_norm': 0.704930034295891, 'learning_rate': 8.289283596208769e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6483/22095 [11:07:36<24:28:31, 5.64s/it] {'loss': 0.4689, 'grad_norm': 0.3095176020113561, 'learning_rate': 8.288731566460706e-06, 'epoch': 0.29}
29%|██▉ | 6484/22095 [11:07:40<21:22:18, 4.93s/it] {'loss': 0.3578, 'grad_norm': 0.6767804618120248, 'learning_rate': 8.288179466047903e-06, 'epoch': 0.29}
29%|██▉ | 6485/22095 [11:07:43<18:43:56, 4.32s/it] {'loss': 0.385, 'grad_norm': 0.6605536627738009, 'learning_rate': 8.28762729498222e-06, 'epoch': 0.29}
29%|██▉ | 6486/22095 [11:07:46<17:19:12, 3.99s/it] {'loss': 0.3542, 'grad_norm': 0.6499909633330283, 'learning_rate': 8.287075053275527e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6487/22095 [11:07:49<15:50:39, 3.65s/it] {'loss': 0.3944, 'grad_norm': 0.6419155300510001, 'learning_rate': 8.286522740939682e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6488/22095 [11:07:54<18:27:43, 4.26s/it] {'loss': 0.4885, 'grad_norm': 0.41197650198755686, 'learning_rate': 8.285970357986559e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8388535 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55352, 'image': 'vrdu_table_final_2/astro-ph.CO/d1309212-46ee-40b3-a45f-e42e2d636436.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]}
29%|██▉ | 6489/22095 [11:07:58<17:25:41, 4.02s/it] {'loss': 0.3763, 'grad_norm': 0.6783231634374803, 'learning_rate': 8.285417904428025e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6490/22095 [11:08:05<21:31:08, 4.96s/it] {'loss': 0.5039, 'grad_norm': 0.3380209730570563, 'learning_rate': 8.284865380275953e-06, 'epoch': 0.29}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [56, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8421598 in VC:s3://internvl-moe-sft-data/. Exception: Image size [56, 25, 100, 100] is too small. Minimum size is 28.
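The recurring tokenizer warning `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` throughout this log means some samples tokenize past the model's 40960-token limit and would trigger indexing errors if fed through unmodified. A hedged sketch of a guard that truncates such sequences before collation; the function name and the truncate-rather-than-drop policy are assumptions for illustration, only the 40960 limit comes from the log:

```python
# Hypothetical guard for the "(N > 40960)" tokenizer warnings in this log.
MAX_SEQ_LEN = 40960  # model maximum reported in the warnings above


def clip_token_ids(input_ids, max_len=MAX_SEQ_LEN):
    """Truncate an over-long token id sequence so downstream indexing cannot fail.
    Returns the sequence unchanged when it already fits."""
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]
```

Whether truncating is acceptable depends on the data: for the multi-image samples above, dropping or re-chunking the sample may be safer than cutting its labels mid-answer.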
Problematic sample: {'id': 154452, 'image': 'vrdu_texteq/astro-ph.CO/6793f715-94bc-4b18-83ca-bc1773c55110.png', 'image_wh': [[56, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': '\\(k_{n}x\\)\\,.'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 29%|██▉ | 6491/22095 [11:08:09<20:12:25, 4.66s/it] {'loss': 0.3606, 'grad_norm': 0.6434836225109783, 'learning_rate': 8.28431278554221e-06, 'epoch': 0.29} 29%|██▉ | 6491/22095 [11:08:09<20:12:25, 4.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80476 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62072 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75252 > 40960). Running this sequence through the model will result in indexing errors 29%|██▉ | 6492/22095 [11:08:13<19:27:08, 4.49s/it] {'loss': 0.3692, 'grad_norm': 0.6712871013442894, 'learning_rate': 8.283760120238672e-06, 'epoch': 0.29} 29%|██▉ | 6492/22095 [11:08:13<19:27:08, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49049 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68767 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51912 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111486 > 40960). Running this sequence through the model will result in indexing errors 29%|██▉ | 6493/22095 [11:08:16<18:08:47, 4.19s/it] {'loss': 0.3614, 'grad_norm': 0.7502628512683776, 'learning_rate': 8.283207384377217e-06, 'epoch': 0.29} 29%|██▉ | 6493/22095 [11:08:16<18:08:47, 4.19s/it] 29%|██▉ | 6494/22095 [11:08:20<16:43:32, 3.86s/it] {'loss': 0.3889, 'grad_norm': 0.6574921802281626, 'learning_rate': 8.282654577969715e-06, 'epoch': 0.29} 29%|██▉ | 6494/22095 [11:08:20<16:43:32, 3.86s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 29%|██▉ | 6495/22095 [11:08:23<16:47:25, 3.87s/it] {'loss': 0.3759, 'grad_norm': 0.6436678017455617, 'learning_rate': 8.282101701028051e-06, 'epoch': 0.29} 29%|██▉ | 6495/22095 [11:08:23<16:47:25, 3.87s/it] 29%|██▉ | 6496/22095 [11:08:27<16:21:47, 3.78s/it] {'loss': 0.3794, 'grad_norm': 0.6331727763394492, 'learning_rate': 8.281548753564101e-06, 'epoch': 0.29} 29%|██▉ | 6496/22095 [11:08:27<16:21:47, 3.78s/it] 29%|██▉ | 6497/22095 [11:08:30<15:31:14, 3.58s/it] {'loss': 0.3326, 'grad_norm': 3.064791132428098, 'learning_rate': 8.280995735589748e-06, 'epoch': 0.29} 29%|██▉ | 6497/22095 [11:08:30<15:31:14, 3.58s/it] 29%|██▉ | 6498/22095 [11:08:34<16:21:09, 3.77s/it] {'loss': 0.3618, 'grad_norm': 0.7158885307535165, 'learning_rate': 8.28044264711687e-06, 'epoch': 0.29} 29%|██▉ | 6498/22095 [11:08:34<16:21:09, 3.77s/it] 29%|██▉ | 6499/22095 [11:08:38<16:08:20, 3.73s/it] {'loss': 0.3625, 'grad_norm': 0.633887155405091, 'learning_rate': 8.279889488157358e-06, 'epoch': 0.29} 29%|██▉ | 6499/22095 [11:08:38<16:08:20, 
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6500/22095 [11:08:47<23:36:21, 5.45s/it] {'loss': 0.4881, 'grad_norm': 0.44873262101556294, 'learning_rate': 8.279336258723092e-06, 'epoch': 0.29}
29%|██▉ | 6501/22095 [11:08:57<28:49:52, 6.66s/it] {'loss': 0.4763, 'grad_norm': 0.42868033624798035, 'learning_rate': 8.278782958825963e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 364, but got module 1
29%|██▉ | 6502/22095 [11:09:01<25:43:55, 5.94s/it] {'loss': 0.3633, 'grad_norm': 0.6926362327266504, 'learning_rate': 8.278229588477857e-06, 'epoch': 0.29}
29%|██▉ | 6503/22095 [11:09:05<22:37:44, 5.22s/it] {'loss': 0.3704, 'grad_norm': 0.7159833505501458, 'learning_rate': 8.277676147690667e-06, 'epoch': 0.29}
29%|██▉ | 6504/22095 [11:09:08<20:05:52, 4.64s/it] {'loss': 0.361, 'grad_norm': 0.6048515450661027, 'learning_rate': 8.277122636476284e-06, 'epoch': 0.29}
29%|██▉ | 6505/22095 [11:09:12<19:15:42, 4.45s/it] {'loss': 0.4029, 'grad_norm': 0.6926580589479077, 'learning_rate': 8.276569054846598e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (82692 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77419 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6506/22095 [11:09:15<17:43:21, 4.09s/it] {'loss': 0.3709, 'grad_norm': 0.7087743982571297, 'learning_rate': 8.276015402813507e-06, 'epoch': 0.29}
29%|██▉ | 6507/22095 [11:09:19<17:23:44, 4.02s/it] {'loss': 0.3808, 'grad_norm': 0.6699221368445163, 'learning_rate': 8.275461680388907e-06, 'epoch': 0.29}
29%|██▉ | 6508/22095 [11:09:23<17:25:58, 4.03s/it] {'loss': 0.3394, 'grad_norm': 0.6510226287700173, 'learning_rate': 8.274907887584695e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6509/22095 [11:09:31<22:24:14, 5.17s/it] {'loss': 0.506, 'grad_norm': 0.7599440272738529, 'learning_rate': 8.274354024412771e-06, 'epoch': 0.29}
29%|██▉ | 6510/22095 [11:09:34<20:06:03, 4.64s/it] {'loss': 0.3876, 'grad_norm': 0.6533955878761483, 'learning_rate': 8.273800090885033e-06, 'epoch': 0.29}
29%|██▉ | 6511/22095 [11:09:38<18:18:05, 4.23s/it] {'loss': 0.3375, 'grad_norm': 0.6330339667687448, 'learning_rate': 8.273246087013389e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6512/22095 [11:09:41<17:02:31, 3.94s/it] {'loss': 0.3953, 'grad_norm': 0.6825683608101373, 'learning_rate': 8.27269201280974e-06, 'epoch': 0.29}
29%|██▉ | 6513/22095 [11:09:44<16:13:44, 3.75s/it] {'loss': 0.3741, 'grad_norm': 0.6742266479272396, 'learning_rate': 8.272137868285988e-06, 'epoch': 0.29}
Invalidate trace cache @ step 2: expected module 1, but got module 364
29%|██▉ | 6514/22095 [11:09:52<21:14:35, 4.91s/it] {'loss': 0.4807, 'grad_norm': 0.3509569724005049, 'learning_rate': 8.271583653454046e-06, 'epoch': 0.29}
29%|██▉ | 6515/22095 [11:09:56<19:43:33, 4.56s/it] {'loss': 0.4001, 'grad_norm': 0.6964791974984916, 'learning_rate': 8.271029368325816e-06, 'epoch': 0.29}
Token indices sequence length is longer than the specified maximum sequence length for this model (44244 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53407 > 40960). Running this sequence through the model will result in indexing errors
29%|██▉ | 6516/22095 [11:09:59<17:53:49, 4.14s/it] {'loss': 0.3423, 'grad_norm': 0.7441171021360583, 'learning_rate': 8.270475012913212e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
29%|██▉ | 6517/22095 [11:10:02<17:08:28, 3.96s/it] {'loss': 0.3606, 'grad_norm': 0.6282383938135476, 'learning_rate': 8.269920587228145e-06, 'epoch': 0.29}
29%|██▉ | 6518/22095 [11:10:06<17:15:14, 3.99s/it] {'loss': 0.4197, 'grad_norm': 0.6679807888966616, 'learning_rate': 8.269366091282526e-06, 'epoch': 0.29}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|██▉ | 6519/22095 [11:10:09<15:51:40, 3.67s/it] {'loss': 0.3808, 'grad_norm': 0.6600489477472857, 'learning_rate': 8.268811525088273e-06, 'epoch': 0.3}
30%|██▉ | 6520/22095 [11:10:14<16:56:30, 3.92s/it] {'loss': 0.3092, 'grad_norm': 0.5760869889668241, 'learning_rate': 8.2682568886573e-06, 'epoch': 0.3}
30%|██▉ | 6521/22095 [11:10:18<17:09:13, 3.97s/it] {'loss': 0.3618, 'grad_norm': 0.660281851940936, 'learning_rate': 8.267702182001521e-06, 'epoch': 0.3}
30%|██▉ | 6522/22095 [11:10:21<16:40:50, 3.86s/it] {'loss': 0.3764, 'grad_norm': 0.668073722385169, 'learning_rate': 8.26714740513286e-06, 'epoch': 0.3}
30%|██▉ | 6523/22095 [11:10:25<16:20:03, 3.78s/it] {'loss': 0.3695, 'grad_norm': 0.6880108887912482, 'learning_rate': 8.266592558063235e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (92911 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6524/22095 [11:10:31<19:18:26, 4.46s/it] {'loss': 0.4912, 'grad_norm': 0.3869646595533447, 'learning_rate': 8.26603764080457e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: broken data stream when reading image file
[Try #0] Failed to fetch sample 6779130 in VC:s3://gui-agent/data_20250623/windows_augment/images.
Exception: broken data stream when reading image file
Problematic sample: {'image': 'inventor/20250512_140254_1/images/before_screenshot_1_id_153_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nClick the 'Home' button in the top toolbar to access primary modeling tools and commands"}, {'from': 'gpt', 'value': '\nclick(x=0.5742, y=0.4556)\n'}], 'width': 2560, 'height': 1440}
30%|██▉ | 6525/22095 [11:10:36<19:43:31, 4.56s/it] {'loss': 0.3755, 'grad_norm': 0.6738971146084021, 'learning_rate': 8.265482653368786e-06, 'epoch': 0.3}
30%|██▉ | 6526/22095 [11:10:39<18:22:54, 4.25s/it] {'loss': 0.3406, 'grad_norm': 0.9514622783496854, 'learning_rate': 8.264927595767808e-06, 'epoch': 0.3}
30%|██▉ | 6527/22095 [11:10:43<18:00:04, 4.16s/it] {'loss': 0.3592, 'grad_norm': 0.6555285194634614, 'learning_rate': 8.264372468013566e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (56932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85716 > 40960). Running this sequence through the model will result in indexing errors
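The `OSError: broken data stream when reading image file` above is only raised once `img.convert("RGB")` forces a full decode of the truncated bytes, so a lazy `Image.open` alone will not catch it. A defensive loader can force the decode up front and skip the bad sample instead of failing inside the training step; a minimal sketch using Pillow (the `is_decodable` helper is hypothetical, not part of the training code):

```python
import io

from PIL import Image  # assumes Pillow is installed


def is_decodable(raw_bytes):
    """Return True only if the image bytes decode fully to RGB.

    convert("RGB") forces a complete decode, so a truncated stream fails
    here (Pillow raises OSError or SyntaxError for broken files) rather
    than later inside the training step.
    """
    try:
        Image.open(io.BytesIO(raw_bytes)).convert("RGB")
        return True
    except (OSError, SyntaxError):
        return False


# Round-trip a tiny PNG, then cut it in half to simulate a broken stream.
buf = io.BytesIO()
Image.new("RGB", (8, 8), "red").save(buf, format="PNG")
good = buf.getvalue()
bad = good[: len(good) // 2]
```

A dataset-side sweep with this check would turn the mid-epoch crash into a list of bad keys that can be pruned before training.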
30%|██▉ | 6528/22095 [11:10:48<18:39:34, 4.32s/it] {'loss': 0.4001, 'grad_norm': 0.6941521013634866, 'learning_rate': 8.263817270117984e-06, 'epoch': 0.3}
30%|██▉ | 6529/22095 [11:10:52<18:15:39, 4.22s/it] {'loss': 0.3549, 'grad_norm': 0.6522125610450555, 'learning_rate': 8.263262002092992e-06, 'epoch': 0.3}
30%|██▉ | 6530/22095 [11:10:56<17:16:04, 3.99s/it] {'loss': 0.387, 'grad_norm': 0.7423122524992386, 'learning_rate': 8.262706663950522e-06, 'epoch': 0.3}
30%|██▉ | 6531/22095 [11:10:59<16:58:38, 3.93s/it] {'loss': 0.3873, 'grad_norm': 0.6941356725705105, 'learning_rate': 8.262151255702506e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8885567 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8720, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 10\nB. 5\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
30%|██▉ | 6532/22095 [11:11:07<22:04:30, 5.11s/it] {'loss': 0.4844, 'grad_norm': 0.40196026244367045, 'learning_rate': 8.261595777360881e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (82188 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121790 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6533/22095 [11:11:11<20:15:15, 4.69s/it] {'loss': 0.3722, 'grad_norm': 0.6558786177137559, 'learning_rate': 8.261040228937578e-06, 'epoch': 0.3}
30%|██▉ | 6534/22095 [11:11:14<18:46:23, 4.34s/it] {'loss': 0.3947, 'grad_norm': 0.6546132783831952, 'learning_rate': 8.260484610444537e-06, 'epoch': 0.3}
30%|██▉ | 6535/22095 [11:11:18<18:07:35, 4.19s/it] {'loss': 0.3635, 'grad_norm': 0.638845136439574, 'learning_rate': 8.259928921893694e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_3/images/step_3.png 2025-08-28 03:09:18.182029 load time: 1074.69 ms
30%|██▉ | 6536/22095 [11:11:26<22:41:59, 5.25s/it] {'loss': 0.4919, 'grad_norm': 0.3215929274096865, 'learning_rate': 8.259373163296992e-06, 'epoch': 0.3}
30%|██▉ | 6537/22095 [11:11:30<20:41:28, 4.79s/it] {'loss': 0.3585, 'grad_norm': 0.6560801022010452, 'learning_rate': 8.258817334666371e-06, 'epoch': 0.3}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_2/images/step_0.png 2025-08-28 03:09:29.484127 load time: 1272.71 ms
30%|██▉ | 6538/22095 [11:11:34<19:34:08, 4.53s/it] {'loss': 0.3529, 'grad_norm': 0.6445716599426061, 'learning_rate': 8.258261436013774e-06, 'epoch': 0.3}
30%|██▉ | 6539/22095 [11:11:37<17:49:38, 4.13s/it] {'loss': 0.3152, 'grad_norm': 0.617007318890825, 'learning_rate': 8.257705467351144e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|██▉ | 6540/22095 [11:11:41<17:50:44, 4.13s/it] {'loss': 0.3922, 'grad_norm': 0.6083874381432658, 'learning_rate': 8.257149428690432e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6541/22095 [11:11:50<24:40:10, 5.71s/it] {'loss': 0.487, 'grad_norm': 0.3776880342089537, 'learning_rate': 8.256593320043582e-06, 'epoch': 0.3}
30%|██▉ | 6542/22095 [11:11:54<21:21:24, 4.94s/it] {'loss': 0.3876, 'grad_norm': 0.704918022131064, 'learning_rate': 8.25603714142254e-06, 'epoch': 0.3}
30%|██▉ | 6543/22095 [11:11:57<18:55:48, 4.38s/it] {'loss': 0.3627, 'grad_norm': 0.8063942549240285, 'learning_rate': 8.255480892839262e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (42055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81776 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74984 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67456 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81551 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6544/22095 [11:12:01<18:31:48, 4.29s/it] {'loss': 0.3473, 'grad_norm': 0.5742802128032614, 'learning_rate': 8.254924574305698e-06, 'epoch': 0.3}
30%|██▉ | 6545/22095 [11:12:04<17:46:59, 4.12s/it] {'loss': 0.3822, 'grad_norm': 0.6964485588806103, 'learning_rate': 8.254368185833803e-06, 'epoch': 0.3}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 03:10:04.196989 load time: 1459.66 ms
30%|██▉ | 6546/22095 [11:12:07<16:03:11, 3.72s/it] {'loss': 0.356, 'grad_norm': 0.6321601650487939, 'learning_rate': 8.25381172743553e-06, 'epoch': 0.3}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38491.png 2025-08-28 03:10:04.801673 load time: 1538.22 ms
30%|██▉ | 6547/22095 [11:12:11<16:18:24, 3.78s/it] {'loss': 0.4008, 'grad_norm': 1.0443208662838888, 'learning_rate': 8.253255199122834e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [639, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8528701 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [639, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 115627, 'image': 'vrdu_texteq/astro-ph.CO/58c24a49-0eca-409c-b8d2-104aefebc20e.png', 'image_wh': [[639, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'We can write $N_{ke}$ in the form $N_{ke}=N_{k}-N_{e}$ where'}]}
30%|██▉ | 6548/22095 [11:12:15<16:04:45, 3.72s/it] {'loss': 0.3482, 'grad_norm': 0.6582380851157981, 'learning_rate': 8.252698600907678e-06, 'epoch': 0.3}
30%|██▉ | 6549/22095 [11:12:18<15:20:10, 3.55s/it] {'loss': 0.3846, 'grad_norm': 0.6970926427572819, 'learning_rate': 8.252141932802018e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6550/22095 [11:12:24<18:45:54, 4.35s/it] {'loss': 0.5017, 'grad_norm': 0.4938941833933624, 'learning_rate': 8.251585194817816e-06, 'epoch': 0.3}
30%|██▉ | 6551/22095 [11:12:28<18:18:28, 4.24s/it] {'loss': 0.3657, 'grad_norm': 0.6856811251346372, 'learning_rate': 8.251028386967035e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6552/22095 [11:12:38<25:37:59, 5.94s/it] {'loss': 0.4859, 'grad_norm': 0.31996299943306283, 'learning_rate': 8.25047150926164e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (49941 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73951 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6553/22095 [11:12:42<22:47:54, 5.28s/it] {'loss': 0.3839, 'grad_norm': 0.6998424919733743, 'learning_rate': 8.249914561713592e-06, 'epoch': 0.3}
30%|██▉ | 6554/22095 [11:12:45<20:05:35, 4.65s/it] {'loss': 0.3736, 'grad_norm': 0.773242550385443, 'learning_rate': 8.249357544334865e-06, 'epoch': 0.3}
30%|██▉ | 6555/22095 [11:12:49<18:57:51, 4.39s/it] {'loss': 0.388, 'grad_norm': 0.6725820236276115, 'learning_rate': 8.248800457137422e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (78409 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51766 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74113 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6556/22095 [11:12:51<16:47:38, 3.89s/it] {'loss': 0.4145, 'grad_norm': 0.6893640866870768, 'learning_rate': 8.248243300133236e-06, 'epoch': 0.3}
30%|██▉ | 6557/22095 [11:12:55<16:37:46, 3.85s/it] {'loss': 0.3928, 'grad_norm': 0.7584177846682806, 'learning_rate': 8.247686073334277e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6558/22095 [11:13:05<24:14:23, 5.62s/it] {'loss': 0.4747, 'grad_norm': 0.6678868748025847, 'learning_rate': 8.247128776752517e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [245, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8486322 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [245, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6924, 'image': 'vrdu_texteq/astro-ph.CO/98f9faf8-06cc-4bba-ac14-d6ad9a4eed08.png', 'image_wh': [[245, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'For $ \\lambda = 2 $ we obtain'}]}
30%|██▉ | 6559/22095 [11:13:09<22:49:26, 5.29s/it] {'loss': 0.3582, 'grad_norm': 0.6610574223754907, 'learning_rate': 8.246571410399935e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|██▉ | 6560/22095 [11:13:14<21:24:26, 4.96s/it] {'loss': 0.359, 'grad_norm': 0.6115168177738242, 'learning_rate': 8.246013974288505e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6561/22095 [11:13:23<27:48:44, 6.45s/it] {'loss': 0.4911, 'grad_norm': 0.33510638899390455, 'learning_rate': 8.245456468430201e-06, 'epoch': 0.3}
30%|██▉ | 6562/22095 [11:13:27<24:02:32, 5.57s/it] {'loss': 0.3633, 'grad_norm': 0.6297514724178221, 'learning_rate': 8.244898892837009e-06, 'epoch': 0.3}
30%|██▉ | 6563/22095 [11:13:31<21:31:36, 4.99s/it] {'loss': 0.3601, 'grad_norm': 0.6217919692620721, 'learning_rate': 8.244341247520903e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6564/22095 [11:13:40<27:21:22, 6.34s/it] {'loss': 0.4649, 'grad_norm': 0.4667666793389384, 'learning_rate': 8.243783532493868e-06, 'epoch': 0.3}
30%|██▉ | 6565/22095 [11:13:44<23:42:32, 5.50s/it] {'loss': 0.3581, 'grad_norm': 0.6434234825245986, 'learning_rate': 8.243225747767888e-06, 'epoch': 0.3}
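The recurring pair `Number of image tokens 0 does not match number of images 1` / `Fixed image tokens in the conversation` indicates samples whose human turn has an attached image but no `<image>` placeholder, which the loader repairs on the fly. The repair can be sketched as a small normalization pass; the helper name and the prepend policy below are assumptions for illustration, not the training code's actual logic:

```python
IMAGE_TOKEN = "<image>"


def fix_image_tokens(text, num_images):
    """Prepend missing <image> placeholders so counts match num_images."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing > 0:
        # Assumed policy: put all missing placeholders at the start of the turn.
        text = IMAGE_TOKEN * missing + "\n" + text
    return text


fixed = fix_image_tokens(
    "Kindly identify and transcribe the text present in the image.", 1
)
```

Running such a pass once over the dataset would silence these per-step fixups; leaving the repair in the loader works but hides annotation bugs.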
30%|██▉ | 6566/22095 [11:13:47<21:06:30, 4.89s/it] {'loss': 0.3694, 'grad_norm': 0.6626520810564629, 'learning_rate': 8.242667893354948e-06, 'epoch': 0.3}
30%|██▉ | 6567/22095 [11:13:50<18:57:58, 4.40s/it] {'loss': 0.3276, 'grad_norm': 0.6603571749083783, 'learning_rate': 8.242109969267033e-06, 'epoch': 0.3}
30%|██▉ | 6568/22095 [11:13:54<18:04:40, 4.19s/it] {'loss': 0.396, 'grad_norm': 0.6109779360936114, 'learning_rate': 8.241551975516133e-06, 'epoch': 0.3}
30%|██▉ | 6569/22095 [11:13:57<16:52:59, 3.91s/it] {'loss': 0.3578, 'grad_norm': 0.651519606462089, 'learning_rate': 8.240993912114236e-06, 'epoch': 0.3}
30%|██▉ | 6570/22095 [11:14:01<16:48:20, 3.90s/it] {'loss': 0.36, 'grad_norm': 0.6371769305932496, 'learning_rate': 8.240435779073336e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (67330 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76105 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6571/22095 [11:14:08<20:45:37, 4.81s/it] {'loss': 0.4808, 'grad_norm': 0.42362648372322165, 'learning_rate': 8.23987757640542e-06, 'epoch': 0.3}
30%|██▉ | 6572/22095 [11:14:12<19:21:32, 4.49s/it] {'loss': 0.3618, 'grad_norm': 0.6340758942769098, 'learning_rate': 8.239319304122488e-06, 'epoch': 0.3}
30%|██▉ | 6573/22095 [11:14:15<17:45:03, 4.12s/it] {'loss': 0.3885, 'grad_norm': 0.7264695251353803, 'learning_rate': 8.238760962236532e-06, 'epoch': 0.3}
30%|██▉ | 6574/22095 [11:14:20<19:14:06, 4.46s/it] {'loss': 0.4151, 'grad_norm': 0.6927123830509735, 'learning_rate': 8.23820255075955e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (75226 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60132 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50050 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6575/22095 [11:14:24<18:39:28, 4.33s/it] {'loss': 0.3743, 'grad_norm': 0.7849148012762388, 'learning_rate': 8.23764406970354e-06, 'epoch': 0.3}
30%|██▉ | 6576/22095 [11:14:28<17:42:34, 4.11s/it] {'loss': 0.375, 'grad_norm': 0.654854279717437, 'learning_rate': 8.237085519080503e-06, 'epoch': 0.3}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20420.png 2025-08-28 03:12:25.403941 load time: 1496.62 ms
30%|██▉ | 6577/22095 [11:14:32<17:11:07, 3.99s/it] {'loss': 0.388, 'grad_norm': 0.7159203992116845, 'learning_rate': 8.236526898902439e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41858 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75359 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119125 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6578/22095 [11:14:35<16:10:33, 3.75s/it] {'loss': 0.3411, 'grad_norm': 0.656283121300721, 'learning_rate': 8.235968209181355e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6579/22095 [11:14:45<24:10:25, 5.61s/it] {'loss': 0.4888, 'grad_norm': 0.4616172562386764, 'learning_rate': 8.23540944992925e-06, 'epoch': 0.3}
30%|██▉ | 6580/22095 [11:14:48<21:19:45, 4.95s/it] {'loss': 0.3647, 'grad_norm': 0.664265125141684, 'learning_rate': 8.234850621158135e-06, 'epoch': 0.3}
30%|██▉ | 6581/22095 [11:14:51<18:54:50, 4.39s/it] {'loss': 0.3735, 'grad_norm': 0.6515710215820063, 'learning_rate': 8.234291722880015e-06, 'epoch': 0.3}
30%|██▉ | 6582/22095 [11:14:55<17:33:19, 4.07s/it] {'loss': 0.3846, 'grad_norm': 0.670688081563422, 'learning_rate': 8.233732755106897e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|██▉ | 6583/22095 [11:15:04<24:23:35, 5.66s/it] {'loss': 0.4626, 'grad_norm': 0.3096613537636291, 'learning_rate': 8.233173717850796e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (57060 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6584/22095 [11:15:07<21:10:46, 4.92s/it] {'loss': 0.3603, 'grad_norm': 0.670924612725373, 'learning_rate': 8.232614611123719e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (44470 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6585/22095 [11:15:10<19:01:10, 4.41s/it] {'loss': 0.3543, 'grad_norm': 0.7092134926955636, 'learning_rate': 8.232055434937685e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (98909 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6586/22095 [11:15:21<27:00:00, 6.27s/it] {'loss': 0.4722, 'grad_norm': 0.32778289159729157, 'learning_rate': 8.231496189304704e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (79007 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74172 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6587/22095 [11:15:24<22:57:24, 5.33s/it] {'loss': 0.4089, 'grad_norm': 0.701945960013496, 'learning_rate': 8.230936874236797e-06, 'epoch': 0.3}
30%|██▉ | 6588/22095 [11:15:28<20:22:55, 4.73s/it] {'loss': 0.3806, 'grad_norm': 0.6969133591371268, 'learning_rate': 8.230377489745979e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965312 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16147, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
30%|██▉ | 6589/22095 [11:15:30<17:59:32, 4.18s/it] {'loss': 0.4209, 'grad_norm': 0.6535133332299422, 'learning_rate': 8.229818035844269e-06, 'epoch': 0.3}
30%|██▉ | 6590/22095 [11:15:35<18:25:51, 4.28s/it] {'loss': 0.3675, 'grad_norm': 0.6780849897711396, 'learning_rate': 8.22925851254369e-06, 'epoch': 0.3}
30%|██▉ | 6591/22095 [11:15:40<19:43:25, 4.58s/it] {'loss': 0.3453, 'grad_norm': 0.7267884367024268, 'learning_rate': 8.228698919856264e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|██▉ | 6592/22095 [11:15:49<25:32:12, 5.93s/it] {'loss': 0.4841, 'grad_norm': 0.39652816926380524, 'learning_rate': 8.228139257794012e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (42908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57645 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95944 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89280 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6593/22095 [11:15:59<30:32:19, 7.09s/it] {'loss': 0.4818, 'grad_norm': 0.30304362311260974, 'learning_rate': 8.227579526368965e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (77564 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59921 > 40960). Running this sequence through the model will result in indexing errors
30%|██▉ | 6594/22095 [11:16:03<25:52:07, 6.01s/it] {'loss': 0.3391, 'grad_norm': 0.6851067942591184, 'learning_rate': 8.227019725593144e-06, 'epoch': 0.3}
30%|██▉ | 6595/22095 [11:16:09<26:36:25, 6.18s/it] {'loss': 0.4676, 'grad_norm': 0.34227352067508154, 'learning_rate': 8.226459855478582e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 364, but got module 1
30%|██▉ | 6596/22095 [11:16:13<23:51:43, 5.54s/it] {'loss': 0.403, 'grad_norm': 0.6581569083787847, 'learning_rate': 8.225899916037305e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [185, 26,
100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8917245 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 40398, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 6\nB. 8\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 30%|██▉ | 6597/22095 [11:16:17<21:49:41, 5.07s/it] {'loss': 0.401, 'grad_norm': 0.6284583434239028, 'learning_rate': 8.22533990728135e-06, 'epoch': 0.3} 30%|██▉ | 6597/22095 [11:16:17<21:49:41, 5.07s/it] 30%|██▉ | 6598/22095 [11:16:20<18:45:27, 4.36s/it] {'loss': 0.3938, 'grad_norm': 0.7456078983710108, 'learning_rate': 8.224779829222742e-06, 'epoch': 0.3} 30%|██▉ | 6598/22095 [11:16:20<18:45:27, 4.36s/it] 30%|██▉ | 6599/22095 [11:16:23<16:59:56, 3.95s/it] {'loss': 0.3549, 'grad_norm': 0.6615636059791379, 'learning_rate': 8.224219681873522e-06, 'epoch': 0.3} 30%|██▉ | 6599/22095 [11:16:23<16:59:56, 3.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68452 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72951 > 40960). 
Running this sequence through the model will result in indexing errors 30%|██▉ | 6600/22095 [11:16:26<16:12:22, 3.77s/it] {'loss': 0.3297, 'grad_norm': 0.6578472104200105, 'learning_rate': 8.223659465245723e-06, 'epoch': 0.3} 30%|██▉ | 6600/22095 [11:16:26<16:12:22, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 30%|██▉ | 6601/22095 [11:16:33<20:29:19, 4.76s/it] {'loss': 0.4747, 'grad_norm': 0.327107354885252, 'learning_rate': 8.223099179351383e-06, 'epoch': 0.3} 30%|██▉ | 6601/22095 [11:16:33<20:29:19, 4.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (132451 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43523 > 40960). Running this sequence through the model will result in indexing errors 30%|██▉ | 6602/22095 [11:16:43<26:30:45, 6.16s/it] {'loss': 0.4718, 'grad_norm': 0.31622473719570926, 'learning_rate': 8.22253882420254e-06, 'epoch': 0.3} 30%|██▉ | 6602/22095 [11:16:43<26:30:45, 6.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 30%|██▉ | 6603/22095 [11:16:46<22:47:20, 5.30s/it] {'loss': 0.3992, 'grad_norm': 0.6994810720919846, 'learning_rate': 8.221978399811237e-06, 'epoch': 0.3} 30%|██▉ | 6603/22095 [11:16:46<22:47:20, 5.30s/it] 30%|██▉ | 6604/22095 [11:16:49<20:18:00, 4.72s/it] {'loss': 0.3884, 'grad_norm': 0.6706589677639165, 'learning_rate': 8.22141790618951e-06, 'epoch': 0.3} 30%|██▉ | 6604/22095 [11:16:49<20:18:00, 4.72s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047542 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 9cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 30%|██▉ | 6605/22095 [11:16:53<19:16:58, 4.48s/it] {'loss': 0.3376, 'grad_norm': 0.7094897895214411, 'learning_rate': 8.220857343349408e-06, 'epoch': 0.3} 30%|██▉ | 6605/22095 [11:16:53<19:16:58, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|██▉ | 6606/22095 [11:16:58<19:27:06, 4.52s/it] {'loss': 0.5088, 'grad_norm': 0.32556023080515356, 'learning_rate': 8.220296711302976e-06, 'epoch': 0.3} 30%|██▉ | 6606/22095 [11:16:58<19:27:06, 4.52s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8402911 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5080, 'image': 'vrdu_table_final_2/astro-ph.CO/965a126e-ca38-4ca6-a7e0-97062eb7e90b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 30%|██▉ | 6607/22095 [11:17:01<17:48:42, 4.14s/it] {'loss': 0.3605, 'grad_norm': 0.7184867337191, 'learning_rate': 8.219736010062255e-06, 'epoch': 0.3} 30%|██▉ | 6607/22095 [11:17:01<17:48:42, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|██▉ | 6608/22095 [11:17:10<23:25:59, 5.45s/it] {'loss': 0.4804, 'grad_norm': 0.3098575866940877, 'learning_rate': 8.219175239639296e-06, 'epoch': 0.3} 30%|██▉ | 6608/22095 [11:17:10<23:25:59, 5.45s/it] 30%|██▉ | 6609/22095 [11:17:14<21:29:19, 5.00s/it] {'loss': 0.3574, 'grad_norm': 0.6400206664582679, 'learning_rate': 8.21861440004615e-06, 'epoch': 0.3} 30%|██▉ | 6609/22095 [11:17:14<21:29:19, 5.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (54525 > 40960). 
Running this sequence through the model will result in indexing errors 30%|██▉ | 6610/22095 [11:17:21<24:37:49, 5.73s/it] {'loss': 0.5005, 'grad_norm': 0.3010626374463441, 'learning_rate': 8.218053491294864e-06, 'epoch': 0.3} 30%|██▉ | 6610/22095 [11:17:21<24:37:49, 5.73s/it] 30%|██▉ | 6611/22095 [11:17:25<21:46:59, 5.06s/it] {'loss': 0.3662, 'grad_norm': 0.6855028942826946, 'learning_rate': 8.217492513397493e-06, 'epoch': 0.3} 30%|██▉ | 6611/22095 [11:17:25<21:46:59, 5.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 30%|██▉ | 6612/22095 [11:17:28<19:35:43, 4.56s/it] {'loss': 0.3976, 'grad_norm': 0.7101719022351363, 'learning_rate': 8.216931466366089e-06, 'epoch': 0.3} 30%|██▉ | 6612/22095 [11:17:28<19:35:43, 4.56s/it] 30%|██▉ | 6613/22095 [11:17:32<18:41:43, 4.35s/it] {'loss': 0.3337, 'grad_norm': 0.5527238979079008, 'learning_rate': 8.216370350212709e-06, 'epoch': 0.3} 30%|██▉ | 6613/22095 [11:17:32<18:41:43, 4.35s/it] 30%|██▉ | 6614/22095 [11:17:35<17:15:32, 4.01s/it] {'loss': 0.352, 'grad_norm': 0.6503365969415894, 'learning_rate': 8.215809164949407e-06, 'epoch': 0.3} 30%|██▉ | 6614/22095 [11:17:35<17:15:32, 4.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48331 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55710 > 40960). 
Running this sequence through the model will result in indexing errors 30%|██▉ | 6615/22095 [11:17:38<15:41:12, 3.65s/it] {'loss': 0.4055, 'grad_norm': 0.6456987454478041, 'learning_rate': 8.215247910588242e-06, 'epoch': 0.3} 30%|██▉ | 6615/22095 [11:17:38<15:41:12, 3.65s/it] 30%|██▉ | 6616/22095 [11:17:41<14:39:01, 3.41s/it] {'loss': 0.4042, 'grad_norm': 0.765375799768525, 'learning_rate': 8.214686587141277e-06, 'epoch': 0.3} 30%|██▉ | 6616/22095 [11:17:41<14:39:01, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|██▉ | 6617/22095 [11:17:50<22:29:12, 5.23s/it] {'loss': 0.5028, 'grad_norm': 0.3333814206779106, 'learning_rate': 8.21412519462057e-06, 'epoch': 0.3} 30%|██▉ | 6617/22095 [11:17:50<22:29:12, 5.23s/it] 30%|██▉ | 6618/22095 [11:18:00<28:38:43, 6.66s/it] {'loss': 0.4787, 'grad_norm': 0.33084472661619535, 'learning_rate': 8.213563733038182e-06, 'epoch': 0.3} 30%|██▉ | 6618/22095 [11:18:00<28:38:43, 6.66s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 30%|██▉ | 6619/22095 [11:18:04<25:16:31, 5.88s/it] {'loss': 0.3875, 'grad_norm': 0.6637174994749759, 'learning_rate': 8.21300220240618e-06, 'epoch': 0.3} 30%|██▉ | 6619/22095 [11:18:04<25:16:31, 5.88s/it] 30%|██▉ | 6620/22095 [11:18:08<23:01:13, 5.36s/it] {'loss': 0.3809, 'grad_norm': 0.6912141263850212, 'learning_rate': 8.212440602736628e-06, 'epoch': 0.3} 30%|██▉ | 6620/22095 [11:18:08<23:01:13, 5.36s/it] 30%|██▉ | 6621/22095 [11:18:12<20:28:31, 4.76s/it] {'loss': 0.3658, 'grad_norm': 0.6936592964069401, 'learning_rate': 8.211878934041595e-06, 'epoch': 0.3} 30%|██▉ | 6621/22095 [11:18:12<20:28:31, 4.76s/it] 30%|██▉ | 6622/22095 [11:18:16<19:11:39, 4.47s/it] {'loss': 0.3622, 'grad_norm': 0.678807367422059, 'learning_rate': 8.211317196333149e-06, 'epoch': 0.3} 30%|██▉ | 6622/22095 [11:18:16<19:11:39, 4.47s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|██▉ | 6623/22095 [11:18:22<21:22:27, 4.97s/it] {'loss': 0.46, 
'grad_norm': 0.38560937624830255, 'learning_rate': 8.210755389623356e-06, 'epoch': 0.3} 30%|██▉ | 6623/22095 [11:18:22<21:22:27, 4.97s/it] 30%|██▉ | 6624/22095 [11:18:25<19:24:30, 4.52s/it] {'loss': 0.3718, 'grad_norm': 0.779311131617998, 'learning_rate': 8.210193513924294e-06, 'epoch': 0.3} 30%|██▉ | 6624/22095 [11:18:25<19:24:30, 4.52s/it] 30%|██▉ | 6625/22095 [11:18:29<17:59:17, 4.19s/it] {'loss': 0.3685, 'grad_norm': 0.6420801291643528, 'learning_rate': 8.209631569248031e-06, 'epoch': 0.3} 30%|██▉ | 6625/22095 [11:18:29<17:59:17, 4.19s/it] 30%|██▉ | 6626/22095 [11:18:33<18:18:33, 4.26s/it] {'loss': 0.3472, 'grad_norm': 0.6773260279806149, 'learning_rate': 8.209069555606643e-06, 'epoch': 0.3} 30%|██▉ | 6626/22095 [11:18:33<18:18:33, 4.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42375 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131831 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112825 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (128984 > 40960). 
Running this sequence through the model will result in indexing errors 30%|██▉ | 6627/22095 [11:18:37<17:51:49, 4.16s/it] {'loss': 0.4212, 'grad_norm': 0.7706893310245987, 'learning_rate': 8.208507473012207e-06, 'epoch': 0.3} 30%|██▉ | 6627/22095 [11:18:37<17:51:49, 4.16s/it] 30%|██▉ | 6628/22095 [11:18:40<16:59:46, 3.96s/it] {'loss': 0.3402, 'grad_norm': 0.638140601624294, 'learning_rate': 8.2079453214768e-06, 'epoch': 0.3} 30%|██▉ | 6628/22095 [11:18:40<16:59:46, 3.96s/it] 30%|███ | 6629/22095 [11:18:44<16:17:22, 3.79s/it] {'loss': 0.3185, 'grad_norm': 0.5940795490485952, 'learning_rate': 8.2073831010125e-06, 'epoch': 0.3} 30%|███ | 6629/22095 [11:18:44<16:17:22, 3.79s/it] 30%|███ | 6630/22095 [11:18:47<16:02:17, 3.73s/it] {'loss': 0.3228, 'grad_norm': 0.6635770683032601, 'learning_rate': 8.206820811631387e-06, 'epoch': 0.3} 30%|███ | 6630/22095 [11:18:47<16:02:17, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52009 > 40960). Running this sequence through the model will result in indexing errors 30%|███ | 6631/22095 [11:18:50<14:49:25, 3.45s/it] {'loss': 0.384, 'grad_norm': 0.6899477337576465, 'learning_rate': 8.206258453345543e-06, 'epoch': 0.3} 30%|███ | 6631/22095 [11:18:50<14:49:25, 3.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047789 in VC:s3://multi-modal/UniGeo/. 
Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 6\nB. 7.5\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 30%|███ | 6632/22095 [11:18:54<15:08:31, 3.53s/it] {'loss': 0.3475, 'grad_norm': 0.6668924568154347, 'learning_rate': 8.205696026167054e-06, 'epoch': 0.3} 30%|███ | 6632/22095 [11:18:54<15:08:31, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48629 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83659 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73493 > 40960). 
Running this sequence through the model will result in indexing errors 30%|███ | 6633/22095 [11:18:57<14:53:59, 3.47s/it] {'loss': 0.393, 'grad_norm': 0.6413029895755245, 'learning_rate': 8.205133530108003e-06, 'epoch': 0.3} 30%|███ | 6633/22095 [11:18:57<14:53:59, 3.47s/it] 30%|███ | 6634/22095 [11:19:00<14:37:38, 3.41s/it] {'loss': 0.3834, 'grad_norm': 0.6730984902221498, 'learning_rate': 8.204570965180476e-06, 'epoch': 0.3} 30%|███ | 6634/22095 [11:19:00<14:37:38, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|███ | 6635/22095 [11:19:08<19:31:12, 4.55s/it] {'loss': 0.5068, 'grad_norm': 0.371876236667934, 'learning_rate': 8.204008331396562e-06, 'epoch': 0.3} 30%|███ | 6635/22095 [11:19:08<19:31:12, 4.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047682 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 1\nB. 2\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 30%|███ | 6636/22095 [11:19:11<18:23:30, 4.28s/it] {'loss': 0.3631, 'grad_norm': 0.6556362419726928, 'learning_rate': 8.203445628768347e-06, 'epoch': 0.3} 30%|███ | 6636/22095 [11:19:11<18:23:30, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76740 > 40960). 
Running this sequence through the model will result in indexing errors 30%|███ | 6637/22095 [11:19:14<16:26:36, 3.83s/it] {'loss': 0.3779, 'grad_norm': 0.6359940940564239, 'learning_rate': 8.202882857307926e-06, 'epoch': 0.3} 30%|███ | 6637/22095 [11:19:14<16:26:36, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55972 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102020 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118264 > 40960). Running this sequence through the model will result in indexing errors 30%|███ | 6638/22095 [11:19:18<15:59:37, 3.73s/it] {'loss': 0.3984, 'grad_norm': 0.6386188623759247, 'learning_rate': 8.202320017027387e-06, 'epoch': 0.3} 30%|███ | 6638/22095 [11:19:18<15:59:37, 3.73s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|███ | 6639/22095 [11:19:25<20:49:23, 4.85s/it] {'loss': 0.5145, 'grad_norm': 0.32621834677771844, 'learning_rate': 8.201757107938829e-06, 'epoch': 0.3} 30%|███ | 6639/22095 [11:19:25<20:49:23, 4.85s/it] 30%|███ | 6640/22095 [11:19:28<18:42:22, 4.36s/it] {'loss': 0.3325, 'grad_norm': 0.5928303322514519, 'learning_rate': 8.201194130054342e-06, 'epoch': 0.3} 30%|███ | 6640/22095 [11:19:28<18:42:22, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|███ | 6641/22095 [11:19:38<25:14:13, 5.88s/it] {'loss': 0.4907, 'grad_norm': 0.3094692390285682, 'learning_rate': 8.200631083386025e-06, 'epoch': 0.3} 30%|███ | 6641/22095 [11:19:38<25:14:13, 5.88s/it] 30%|███ | 6642/22095 [11:19:42<22:47:12, 5.31s/it] {'loss': 0.3741, 'grad_norm': 
0.6144956649085465, 'learning_rate': 8.200067967945977e-06, 'epoch': 0.3} 30%|███ | 6642/22095 [11:19:42<22:47:12, 5.31s/it] 30%|███ | 6643/22095 [11:19:45<20:02:44, 4.67s/it] {'loss': 0.3798, 'grad_norm': 0.695521701070394, 'learning_rate': 8.199504783746297e-06, 'epoch': 0.3} 30%|███ | 6643/22095 [11:19:45<20:02:44, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45695 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88208 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119808 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43575 > 40960). Running this sequence through the model will result in indexing errors 30%|███ | 6644/22095 [11:19:48<18:15:47, 4.26s/it] {'loss': 0.3339, 'grad_norm': 0.7236023029614718, 'learning_rate': 8.198941530799084e-06, 'epoch': 0.3} 30%|███ | 6644/22095 [11:19:48<18:15:47, 4.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (57560 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47805 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101508 > 40960). 
Running this sequence through the model will result in indexing errors 30%|███ | 6645/22095 [11:19:57<24:03:02, 5.60s/it] {'loss': 0.4678, 'grad_norm': 0.2931514319730529, 'learning_rate': 8.198378209116444e-06, 'epoch': 0.3} 30%|███ | 6645/22095 [11:19:57<24:03:02, 5.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43381 > 40960). Running this sequence through the model will result in indexing errors 30%|███ | 6646/22095 [11:20:02<23:10:06, 5.40s/it] {'loss': 0.4783, 'grad_norm': 0.3172098987084482, 'learning_rate': 8.19781481871048e-06, 'epoch': 0.3} 30%|███ | 6646/22095 [11:20:02<23:10:06, 5.40s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 30%|███ | 6647/22095 [11:20:05<20:42:59, 4.83s/it] {'loss': 0.3924, 'grad_norm': 0.6300258938616354, 'learning_rate': 8.197251359593294e-06, 'epoch': 0.3} 30%|███ | 6647/22095 [11:20:05<20:42:59, 4.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 30%|███ | 6648/22095 [11:20:09<19:04:26, 4.45s/it] {'loss': 0.3308, 'grad_norm': 0.6117147132966283, 'learning_rate': 8.196687831776998e-06, 'epoch': 0.3} 30%|███ | 6648/22095 [11:20:09<19:04:26, 4.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43559 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50396 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (125385 > 40960). 
Running this sequence through the model will result in indexing errors 30%|███ | 6649/22095 [11:20:12<16:52:50, 3.93s/it] {'loss': 0.4117, 'grad_norm': 0.6846165713761543, 'learning_rate': 8.196124235273698e-06, 'epoch': 0.3} 30%|███ | 6649/22095 [11:20:12<16:52:50, 3.93s/it] 30%|███ | 6650/22095 [11:20:15<15:57:00, 3.72s/it] {'loss': 0.3732, 'grad_norm': 0.6475285881169686, 'learning_rate': 8.195560570095504e-06, 'epoch': 0.3} 30%|███ | 6650/22095 [11:20:15<15:57:00, 3.72s/it] 30%|███ | 6651/22095 [11:20:18<15:43:07, 3.66s/it] {'loss': 0.3692, 'grad_norm': 0.6562117371282999, 'learning_rate': 8.194996836254527e-06, 'epoch': 0.3} 30%|███ | 6651/22095 [11:20:18<15:43:07, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|███ | 6652/22095 [11:20:29<25:15:34, 5.89s/it] {'loss': 0.4973, 'grad_norm': 0.3879940700042671, 'learning_rate': 8.194433033762882e-06, 'epoch': 0.3} 30%|███ | 6652/22095 [11:20:29<25:15:34, 5.89s/it] 30%|███ | 6653/22095 [11:20:40<31:23:34, 7.32s/it] {'loss': 0.4715, 'grad_norm': 0.3532760568138731, 'learning_rate': 8.193869162632682e-06, 'epoch': 0.3} 30%|███ | 6653/22095 [11:20:40<31:23:34, 7.32s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8452775 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 64370, 'image': 'vrdu_texteq/astro-ph.CO/829ad266-1a57-44aa-a1b2-6f073d7bdd3b.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': '$\\approx$29'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8921710 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44863, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 8\nB. 7\nC. 6\nD. 
10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 30%|███ | 6654/22095 [11:20:50<34:41:27, 8.09s/it] {'loss': 0.5229, 'grad_norm': 0.33387439482702236, 'learning_rate': 8.193305222876043e-06, 'epoch': 0.3} 30%|███ | 6654/22095 [11:20:50<34:41:27, 8.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 30%|███ | 6655/22095 [11:20:54<29:09:31, 6.80s/it] {'loss': 0.3942, 'grad_norm': 0.6660220599660495, 'learning_rate': 8.19274121450508e-06, 'epoch': 0.3} 30%|███ | 6655/22095 [11:20:54<29:09:31, 6.80s/it] 30%|███ | 6656/22095 [11:20:58<25:38:01, 5.98s/it] {'loss': 0.3359, 'grad_norm': 0.8598578165148109, 'learning_rate': 8.192177137531916e-06, 'epoch': 0.3} 30%|███ | 6656/22095 [11:20:58<25:38:01, 5.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 30%|███ | 6657/22095 [11:21:07<30:07:24, 7.02s/it] {'loss': 0.4614, 'grad_norm': 0.4416717118687196, 'learning_rate': 8.19161299196867e-06, 'epoch': 0.3} 30%|███ | 6657/22095 [11:21:07<30:07:24, 7.02s/it] 30%|███ | 6658/22095 [11:21:12<26:30:32, 6.18s/it] {'loss': 0.3596, 'grad_norm': 0.8191984846210192, 'learning_rate': 8.191048777827462e-06, 'epoch': 0.3} 30%|███ | 6658/22095 [11:21:12<26:30:32, 6.18s/it] 30%|███ | 6659/22095 [11:21:15<23:20:12, 5.44s/it] {'loss': 0.3474, 'grad_norm': 0.7090058012522736, 'learning_rate': 8.190484495120416e-06, 'epoch': 0.3} 30%|███ | 6659/22095 [11:21:15<23:20:12, 5.44s/it] 30%|███ | 6660/22095 [11:21:19<20:48:40, 4.85s/it] {'loss': 0.356, 'grad_norm': 0.6600972546014798, 'learning_rate': 8.189920143859658e-06, 'epoch': 0.3} 30%|███ | 6660/22095 [11:21:19<20:48:40, 4.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46887 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43002 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51623 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49780 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51801 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73920 > 40960). Running this sequence through the model will result in indexing errors 30%|███ | 6661/22095 [11:21:22<18:30:58, 4.32s/it] {'loss': 0.3822, 'grad_norm': 0.758035548634254, 'learning_rate': 8.189355724057313e-06, 'epoch': 0.3} 30%|███ | 6661/22095 [11:21:22<18:30:58, 4.32s/it] 30%|███ | 6662/22095 [11:21:25<17:20:45, 4.05s/it] {'loss': 0.39, 'grad_norm': 0.6585617124518213, 'learning_rate': 8.188791235725509e-06, 'epoch': 0.3} 30%|███ | 6662/22095 [11:21:25<17:20:45, 4.05s/it] 30%|███ | 6663/22095 [11:21:28<15:52:14, 3.70s/it] {'loss': 0.3525, 'grad_norm': 0.6584263455818427, 'learning_rate': 8.188226678876374e-06, 'epoch': 0.3} 30%|███ | 6663/22095 [11:21:28<15:52:14, 3.70s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( 
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884032 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7185, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 4cm\nB. 1cm\nC. 1.5cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
30%|███ | 6664/22095 [11:21:31<14:36:01, 3.41s/it] {'loss': 0.364, 'grad_norm': 0.6284333579476523, 'learning_rate': 8.187662053522039e-06, 'epoch': 0.3}
30%|███ | 6665/22095 [11:21:34<13:46:07, 3.21s/it] {'loss': 0.3824, 'grad_norm': 0.6941954463209911, 'learning_rate': 8.187097359674638e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41222 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68718 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41671 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115816 > 40960).
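The ValueError above fires because this geoqa+ figure is only 23 px tall, below the loader's 28 px minimum side (Qwen2.5-VL's vision tower patches images in 28 px units). One possible remedy, instead of dropping the sample, is to upscale the short side before preprocessing; a sketch under the assumption that aspect-preserving upscaling is acceptable for these line diagrams (the helper is hypothetical, not the loader's actual logic):

```python
MIN_SIDE = 28  # minimum side length enforced by the dataset loader

def upscale_to_min_side(width: int, height: int, min_side: int = MIN_SIDE):
    """Return (width, height) with the shorter side brought up to min_side,
    preserving aspect ratio; unchanged if the image is already large enough."""
    short = min(width, height)
    if short >= min_side:
        return width, height
    # integer ceiling of side * min_side / short, avoiding float rounding
    return (
        (width * min_side + short - 1) // short,
        (height * min_side + short - 1) // short,
    )

# the failing sample above reports image_wh [[223, 23]]
new_wh = upscale_to_min_side(223, 23)
```

The returned size would then be passed to an actual resize (e.g. PIL's `Image.resize`) before the sample is handed to the processor.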
Running this sequence through the model will result in indexing errors
30%|███ | 6666/22095 [11:21:37<14:38:08, 3.41s/it] {'loss': 0.3797, 'grad_norm': 0.6193892417113361, 'learning_rate': 8.186532597346304e-06, 'epoch': 0.3}
30%|███ | 6667/22095 [11:21:40<14:07:04, 3.29s/it] {'loss': 0.3267, 'grad_norm': 0.661626608730419, 'learning_rate': 8.18596776654917e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (90401 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61234 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130621 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6668/22095 [11:21:45<15:09:55, 3.54s/it] {'loss': 0.3986, 'grad_norm': 0.6513566152865766, 'learning_rate': 8.185402867295373e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6669/22095 [11:21:54<22:52:13, 5.34s/it] {'loss': 0.4989, 'grad_norm': 0.46167306904485933, 'learning_rate': 8.184837899597054e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (74047 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6670/22095 [11:21:57<20:17:38, 4.74s/it] {'loss': 0.3587, 'grad_norm': 0.6812849953088046, 'learning_rate': 8.184272863466348e-06, 'epoch': 0.3}
30%|███ | 6671/22095 [11:22:01<19:08:12, 4.47s/it] {'loss': 0.3866, 'grad_norm': 0.6372181036074359, 'learning_rate': 8.183707758915398e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (40978 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6672/22095 [11:22:05<18:26:55, 4.31s/it] {'loss': 0.382, 'grad_norm': 0.6663877518321638, 'learning_rate': 8.183142585956347e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41156 > 40960).
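The paired messages "Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" above report samples whose conversation text lacks an image placeholder, which the loader patches on the fly. A sketch of such a repair — the `<image>` placeholder string and the prepend strategy are assumptions; the actual fix inside `data_qwen_2.py` may differ:

```python
IMAGE_TOKEN = "<image>"  # placeholder name is an assumption

def fix_image_tokens(conversations, num_images, image_token=IMAGE_TOKEN):
    """Prepend any missing image placeholders to the first turn so the
    placeholder count matches the number of images attached to the sample."""
    found = sum(t["value"].count(image_token) for t in conversations)
    missing = num_images - found
    if missing > 0:
        conversations[0]["value"] = image_token * missing + "\n" + conversations[0]["value"]
    return conversations

conv = [{"from": "human", "value": "What is shown?"},
        {"from": "gpt", "value": "A diagram."}]
fixed = fix_image_tokens(conv, num_images=1)
```

Logging which sample ids needed this repair would make it possible to clean the offending records at the source rather than on every epoch.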
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45358 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41874 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6673/22095 [11:22:09<18:00:43, 4.20s/it] {'loss': 0.3833, 'grad_norm': 0.6383630283033406, 'learning_rate': 8.182577344601337e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6674/22095 [11:22:19<24:39:40, 5.76s/it] {'loss': 0.4855, 'grad_norm': 0.36237644474402525, 'learning_rate': 8.182012034862514e-06, 'epoch': 0.3}
30%|███ | 6675/22095 [11:22:22<21:42:41, 5.07s/it] {'loss': 0.4167, 'grad_norm': 0.7009030759694248, 'learning_rate': 8.181446656752027e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965313 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16148, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 4cm\nB. 6cm\nC. 1cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
30%|███ | 6676/22095 [11:22:31<26:05:45, 6.09s/it] {'loss': 0.4902, 'grad_norm': 0.3188822744200764, 'learning_rate': 8.18088121028202e-06, 'epoch': 0.3}
30%|███ | 6677/22095 [11:22:34<22:52:40, 5.34s/it] {'loss': 0.395, 'grad_norm': 0.7033049015038353, 'learning_rate': 8.18031569546465e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6678/22095 [11:22:42<26:05:26, 6.09s/it] {'loss': 0.4924, 'grad_norm': 0.2996947562744208, 'learning_rate': 8.179750112312058e-06, 'epoch': 0.3}
30%|███ | 6679/22095 [11:22:46<22:57:12, 5.36s/it] {'loss': 0.4296, 'grad_norm': 0.7107215020779526, 'learning_rate': 8.179184460836404e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6680/22095 [11:22:53<26:10:53, 6.11s/it] {'loss': 0.4754, 'grad_norm': 0.36882862048673637, 'learning_rate': 8.178618741049841e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (70778 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6681/22095 [11:23:03<30:48:12, 7.19s/it] {'loss': 0.4703, 'grad_norm': 0.35130336150032226, 'learning_rate': 8.178052952964523e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (41664 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6682/22095 [11:23:07<26:04:42, 6.09s/it] {'loss': 0.3988, 'grad_norm': 0.7141293551304086, 'learning_rate': 8.177487096592607e-06, 'epoch': 0.3}
30%|███ | 6683/22095 [11:23:10<22:46:35, 5.32s/it] {'loss': 0.3606, 'grad_norm': 0.7030862967298622, 'learning_rate': 8.176921171946252e-06, 'epoch': 0.3}
30%|███ | 6684/22095 [11:23:14<20:44:19, 4.84s/it] {'loss': 0.4181, 'grad_norm': 0.6769924190293563, 'learning_rate': 8.176355179037619e-06, 'epoch': 0.3}
30%|███ | 6685/22095 [11:23:17<18:50:41, 4.40s/it] {'loss': 0.4089, 'grad_norm': 0.6831775720861758, 'learning_rate': 8.17578911787887e-06, 'epoch': 0.3}
30%|███ | 6686/22095 [11:23:21<17:40:29, 4.13s/it] {'loss': 0.3295, 'grad_norm': 0.6606531283871524, 'learning_rate': 8.175222988482163e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (50506 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131034 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6687/22095 [11:23:24<16:11:50, 3.78s/it] {'loss': 0.3567, 'grad_norm': 0.6762522492476349, 'learning_rate': 8.174656790859668e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (45166 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58447 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67038 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6688/22095 [11:23:27<15:00:35, 3.51s/it] {'loss': 0.4107, 'grad_norm': 0.6795353803369522, 'learning_rate': 8.17409052502355e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6689/22095 [11:23:34<20:22:27, 4.76s/it] {'loss': 0.4827, 'grad_norm': 0.538399507273518, 'learning_rate': 8.173524190985973e-06, 'epoch': 0.3}
30%|███ | 6690/22095 [11:23:40<21:51:49, 5.11s/it] {'loss': 0.4884, 'grad_norm': 0.40540498512979073, 'learning_rate': 8.172957788759109e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6691/22095 [11:23:44<19:34:02, 4.57s/it] {'loss': 0.3521, 'grad_norm': 0.7645143001189432, 'learning_rate': 8.172391318355126e-06, 'epoch': 0.3}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047683 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 2\nB. 3\nC. 4\nD. 1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
30%|███ | 6692/22095 [11:23:47<18:02:57, 4.22s/it] {'loss': 0.383, 'grad_norm': 0.7116150900076627, 'learning_rate': 8.171824779786198e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6693/22095 [11:23:50<16:55:15, 3.96s/it] {'loss': 0.3792, 'grad_norm': 0.6885266895160488, 'learning_rate': 8.171258173064497e-06, 'epoch': 0.3}
30%|███ | 6694/22095 [11:23:54<16:05:37, 3.76s/it] {'loss': 0.3836, 'grad_norm': 0.6154377255141567, 'learning_rate': 8.170691498202196e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59790 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120993 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42182 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63143 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75522 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6695/22095 [11:24:03<23:48:24, 5.57s/it] {'loss': 0.4875, 'grad_norm': 0.6892015186835359, 'learning_rate': 8.170124755211475e-06, 'epoch': 0.3}
30%|███ | 6696/22095 [11:24:13<28:27:51, 6.65s/it] {'loss': 0.518, 'grad_norm': 0.5078678332810831, 'learning_rate': 8.16955794410451e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047941 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 4\nB. 6\nC. 2\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
30%|███ | 6697/22095 [11:24:16<23:54:30, 5.59s/it] {'loss': 0.4067, 'grad_norm': 0.6996318079313077, 'learning_rate': 8.168991064893476e-06, 'epoch': 0.3}
30%|███ | 6698/22095 [11:24:19<20:54:02, 4.89s/it] {'loss': 0.364, 'grad_norm': 0.6701047881657107, 'learning_rate': 8.168424117590559e-06, 'epoch': 0.3}
30%|███ | 6699/22095 [11:24:22<19:10:50, 4.48s/it] {'loss': 0.3914, 'grad_norm': 0.6267231223583077, 'learning_rate': 8.167857102207936e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (98991 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6700/22095 [11:24:26<17:45:13, 4.15s/it] {'loss': 0.3677, 'grad_norm': 0.6488152116391062, 'learning_rate': 8.167290018757797e-06, 'epoch': 0.3}
30%|███ | 6701/22095 [11:24:29<16:52:08, 3.94s/it] {'loss': 0.3588, 'grad_norm': 0.657644421376092, 'learning_rate': 8.166722867252321e-06, 'epoch': 0.3}
30%|███ | 6702/22095 [11:24:33<16:07:16, 3.77s/it] {'loss': 0.3919, 'grad_norm': 0.6977592438122392, 'learning_rate': 8.166155647703698e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6703/22095 [11:24:43<23:52:15, 5.58s/it] {'loss': 0.5246, 'grad_norm': 1.0770726070383594, 'learning_rate': 8.165588360124112e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6704/22095 [11:24:46<21:34:28, 5.05s/it] {'loss': 0.3728, 'grad_norm': 0.6771285868253755, 'learning_rate': 8.165021004525758e-06, 'epoch': 0.3}
30%|███ | 6705/22095 [11:24:50<20:09:53, 4.72s/it] {'loss': 0.3463, 'grad_norm': 0.6544659774166018, 'learning_rate': 8.164453580920819e-06, 'epoch': 0.3}
30%|███ | 6706/22095 [11:24:54<19:06:20, 4.47s/it] {'loss': 0.4144, 'grad_norm': 0.6545512816480353, 'learning_rate': 8.163886089321493e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (54383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60202 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106096 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6707/22095 [11:24:58<18:05:33, 4.23s/it] {'loss': 0.349, 'grad_norm': 0.7171692882504428, 'learning_rate': 8.163318529739971e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6708/22095 [11:25:05<21:23:44, 5.01s/it] {'loss': 0.4737, 'grad_norm': 0.5006054067208444, 'learning_rate': 8.162750902188452e-06, 'epoch': 0.3}
30%|███ | 6709/22095 [11:25:09<20:04:42, 4.70s/it] {'loss': 0.3905, 'grad_norm': 0.6516365427753947, 'learning_rate': 8.162183206679129e-06, 'epoch': 0.3}
30%|███ | 6710/22095 [11:25:13<19:02:19, 4.45s/it] {'loss': 0.346, 'grad_norm': 0.6767068672930011, 'learning_rate': 8.1616154432242e-06, 'epoch': 0.3}
30%|███ | 6711/22095 [11:25:15<16:54:03, 3.96s/it] {'loss': 0.3751, 'grad_norm': 0.6397724806662128, 'learning_rate': 8.161047611835866e-06, 'epoch': 0.3}
30%|███ | 6712/22095 [11:25:18<15:42:41, 3.68s/it] {'loss': 0.3977, 'grad_norm': 0.8168005161205404, 'learning_rate': 8.160479712526326e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6713/22095 [11:25:21<14:59:56, 3.51s/it] {'loss': 0.4237, 'grad_norm': 0.7369886205778866, 'learning_rate': 8.159911745307785e-06, 'epoch': 0.3}
30%|███ | 6714/22095 [11:25:25<15:04:41, 3.53s/it] {'loss': 0.389, 'grad_norm': 0.6740165118906998, 'learning_rate': 8.159343710192445e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8902701 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25854, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 9cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
30%|███ | 6715/22095 [11:25:34<22:39:20, 5.30s/it] {'loss': 0.4844, 'grad_norm': 0.5087143101780832, 'learning_rate': 8.158775607192511e-06, 'epoch': 0.3}
30%|███ | 6716/22095 [11:25:38<20:06:48, 4.71s/it] {'loss': 0.3281, 'grad_norm': 0.6533814867752841, 'learning_rate': 8.158207436320192e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small.
Minimum size is 28.
[Try #0] Failed to fetch sample 8878403 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1556, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 4\nB. 6\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
30%|███ | 6717/22095 [11:25:41<18:46:20, 4.39s/it] {'loss': 0.3611, 'grad_norm': 0.6203679406879485, 'learning_rate': 8.157639197587694e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (43817 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45946 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112198 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6718/22095 [11:25:46<18:23:00, 4.30s/it] {'loss': 0.357, 'grad_norm': 0.6193842426608959, 'learning_rate': 8.157070891007227e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6719/22095 [11:25:48<16:33:22, 3.88s/it] {'loss': 0.4038, 'grad_norm': 0.6689713449805593, 'learning_rate': 8.156502516591005e-06, 'epoch': 0.3}
30%|███ | 6720/22095 [11:25:52<15:40:48, 3.67s/it] {'loss': 0.3695, 'grad_norm': 0.6649185774901039, 'learning_rate': 8.155934074351236e-06, 'epoch': 0.3}
30%|███ | 6721/22095 [11:25:55<14:55:05, 3.49s/it] {'loss': 0.3875, 'grad_norm': 0.7085354849748461, 'learning_rate': 8.155365564300137e-06, 'epoch': 0.3}
30%|███ | 6722/22095 [11:25:57<13:58:50, 3.27s/it] {'loss': 0.3678, 'grad_norm': 0.7011208137126376, 'learning_rate': 8.154796986449925e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127357 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52937 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6723/22095 [11:26:01<13:43:10, 3.21s/it] {'loss': 0.3564, 'grad_norm': 0.6183098806515898, 'learning_rate': 8.154228340812812e-06, 'epoch': 0.3}
30%|███ | 6724/22095 [11:26:04<13:48:15, 3.23s/it] {'loss': 0.4572, 'grad_norm': 0.6849206059636519, 'learning_rate': 8.15365962740102e-06, 'epoch': 0.3}
30%|███ | 6725/22095 [11:26:07<13:22:55, 3.13s/it] {'loss': 0.3435, 'grad_norm': 0.667232617976701, 'learning_rate': 8.15309084622677e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6726/22095 [11:26:16<21:49:56, 5.11s/it] {'loss': 0.486, 'grad_norm': 0.5710042272132126, 'learning_rate': 8.15252199730228e-06, 'epoch': 0.3}
30%|███ | 6727/22095 [11:26:21<20:38:11, 4.83s/it] {'loss': 0.4118, 'grad_norm': 0.6301191405750226, 'learning_rate': 8.151953080639777e-06, 'epoch': 0.3}
30%|███ | 6728/22095 [11:26:24<18:27:18, 4.32s/it] {'loss': 0.3569, 'grad_norm': 0.6708578885522815, 'learning_rate': 8.15138409625148e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6729/22095 [11:26:27<17:43:25, 4.15s/it] {'loss': 0.3496, 'grad_norm': 0.6846127030532159, 'learning_rate': 8.15081504414962e-06, 'epoch': 0.3}
30%|███ | 6730/22095 [11:26:31<16:18:44, 3.82s/it] {'loss': 0.3675, 'grad_norm': 0.6200505350083688, 'learning_rate': 8.15024592434642e-06, 'epoch': 0.3}
Invalidate trace cache @ step 2: expected module 1, but got module 364
30%|███ | 6731/22095 [11:26:40<22:54:51, 5.37s/it] {'loss': 0.5014, 'grad_norm': 0.5283697356772571, 'learning_rate': 8.14967673685411e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (49030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97891 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6732/22095 [11:26:43<20:28:47, 4.80s/it] {'loss': 0.3767, 'grad_norm': 0.9863932733707426, 'learning_rate': 8.149107481684922e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41151 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54704 > 40960). Running this sequence through the model will result in indexing errors
30%|███ | 6733/22095 [11:26:47<19:26:59, 4.56s/it] {'loss': 0.3789, 'grad_norm': 0.6713443061113822, 'learning_rate': 8.148538158851084e-06, 'epoch': 0.3}
30%|███ | 6734/22095 [11:26:50<17:16:33, 4.05s/it] {'loss': 0.3227, 'grad_norm': 0.5988285870373714, 'learning_rate': 8.147968768364833e-06, 'epoch': 0.3}
Token indices sequence length is longer than the specified maximum sequence length for this model (41858 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59863 > 40960).
Running this sequence through the model will result in indexing errors
30%|███ | 6735/22095 [11:26:53<15:57:33, 3.74s/it] {'loss': 0.3709, 'grad_norm': 0.6679839451442557, 'learning_rate': 8.1473993102384e-06, 'epoch': 0.3}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
30%|███ | 6736/22095 [11:26:56<14:58:30, 3.51s/it] {'loss': 0.3692, 'grad_norm': 0.719002401863315, 'learning_rate': 8.146829784484024e-06, 'epoch': 0.3}
30%|███ | 6737/22095 [11:26:59<14:32:59, 3.41s/it] {'loss': 0.408, 'grad_norm': 0.7384804649111679, 'learning_rate': 8.146260191113937e-06, 'epoch': 0.3}
30%|███ | 6738/22095 [11:27:03<14:44:03, 3.45s/it] {'loss': 0.3467, 'grad_norm': 0.5806880149958714, 'learning_rate': 8.145690530140385e-06, 'epoch': 0.3}
31%|███ | 6739/22095 [11:27:06<14:38:08, 3.43s/it] {'loss': 0.3329, 'grad_norm': 0.7988457184655746, 'learning_rate': 8.145120801575603e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6740/22095 [11:27:15<21:10:35, 4.96s/it] {'loss': 0.5055, 'grad_norm': 0.45441104152888795, 'learning_rate': 8.144551005431835e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (68703 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48123 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55236 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6741/22095 [11:27:21<23:22:48, 5.48s/it] {'loss': 0.5051, 'grad_norm': 0.3480895237820813, 'learning_rate': 8.143981141721324e-06, 'epoch': 0.31} 31%|███ | 6741/22095 [11:27:21<23:22:48, 5.48s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 31%|███ | 6742/22095 [11:27:26<21:59:02, 5.15s/it] {'loss': 0.3572, 'grad_norm': 0.6786100980441568, 'learning_rate': 8.143411210456314e-06, 'epoch': 0.31} 31%|███ | 6742/22095 [11:27:26<21:59:02, 5.15s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957613 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8448, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 
4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 31%|███ | 6743/22095 [11:27:35<27:08:45, 6.37s/it] {'loss': 0.4946, 'grad_norm': 0.3247065808236292, 'learning_rate': 8.142841211649052e-06, 'epoch': 0.31} 31%|███ | 6743/22095 [11:27:35<27:08:45, 6.37s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 31%|███ | 6744/22095 [11:27:39<23:50:27, 5.59s/it] {'loss': 0.3885, 'grad_norm': 0.7602586822007469, 'learning_rate': 8.142271145311784e-06, 'epoch': 0.31} 31%|███ | 6744/22095 [11:27:39<23:50:27, 5.59s/it] 31%|███ | 6745/22095 [11:27:42<21:11:19, 4.97s/it] {'loss': 0.4101, 'grad_norm': 0.6827903832763981, 'learning_rate': 8.141701011456759e-06, 'epoch': 0.31} 31%|███ | 6745/22095 [11:27:42<21:11:19, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107408 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41314 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66530 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6746/22095 [11:27:45<18:59:34, 4.45s/it] {'loss': 0.3494, 'grad_norm': 0.6811187790476072, 'learning_rate': 8.14113081009623e-06, 'epoch': 0.31} 31%|███ | 6746/22095 [11:27:45<18:59:34, 4.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (97104 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74374 > 40960). 
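The recurring `Image size [...] is too small. Minimum size is 28` failures above are raised by the dataset's `_get_item` when a sample's image (here `image_wh` [[223, 23]]) has a side shorter than 28 px, the smallest edge the vision patchifier can handle. A minimal sketch of such a guard (hypothetical helper, not the actual `data_qwen_2.py` code):

```python
MIN_IMAGE_EDGE = 28  # assumed minimum side length, matching the log message


def validate_image_size(image_wh, min_edge=MIN_IMAGE_EDGE):
    """Reject samples whose image is too small on either side.

    `image_wh` is a [width, height] pair as stored in the sample metadata.
    Raises ValueError in the same spirit as the errors seen in the log.
    """
    w, h = image_wh
    if w < min_edge or h < min_edge:
        raise ValueError(
            f"Image size {image_wh} is too small. Minimum size is {min_edge}."
        )


# The geoqa+ sample above has image_wh [223, 23]: width 223 is fine,
# but height 23 < 28, so the sample is rejected and retried/skipped.
validate_image_size([224, 224])  # passes
```

The dataloader wraps this in a retry loop ("[Try #0] Failed to fetch sample ..."), falling back to another sample rather than crashing the run.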
Running this sequence through the model will result in indexing errors
31%|███ | 6747/22095 [11:27:48<16:52:57, 3.96s/it] {'loss': 0.3774, 'grad_norm': 0.7048085893836293, 'learning_rate': 8.140560541242446e-06, 'epoch': 0.31}
31%|███ | 6748/22095 [11:27:51<15:31:08, 3.64s/it] {'loss': 0.3666, 'grad_norm': 0.6491314330558875, 'learning_rate': 8.139990204907662e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6749/22095 [11:28:00<21:50:58, 5.13s/it] {'loss': 0.5054, 'grad_norm': 0.4712696305154235, 'learning_rate': 8.139419801104133e-06, 'epoch': 0.31}
31%|███ | 6750/22095 [11:28:03<19:58:14, 4.69s/it] {'loss': 0.3466, 'grad_norm': 0.6124385047064537, 'learning_rate': 8.138849329844115e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6751/22095 [11:28:07<18:08:24, 4.26s/it] {'loss': 0.3485, 'grad_norm': 0.6026356203624581, 'learning_rate': 8.138278791139863e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (79720 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84482 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6752/22095 [11:28:11<17:56:54, 4.21s/it] {'loss': 0.3676, 'grad_norm': 0.6271212950544153, 'learning_rate': 8.13770818500364e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6753/22095 [11:28:14<16:55:14, 3.97s/it] {'loss': 0.3909, 'grad_norm': 0.6357535646778157, 'learning_rate': 8.137137511447702e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (91301 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45325 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43909 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83845 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41731 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6754/22095 [11:28:24<24:37:18, 5.78s/it] {'loss': 0.489, 'grad_norm': 0.37425033364294313, 'learning_rate': 8.136566770484316e-06, 'epoch': 0.31}
31%|███ | 6755/22095 [11:28:35<30:36:43, 7.18s/it] {'loss': 0.4781, 'grad_norm': 0.3013803252222591, 'learning_rate': 8.135995962125744e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 364, but got module 1
31%|███ | 6756/22095 [11:28:38<25:40:53, 6.03s/it] {'loss': 0.3599, 'grad_norm': 0.6793897256341992, 'learning_rate': 8.135425086384249e-06, 'epoch': 0.31}
31%|███ | 6757/22095 [11:28:41<22:21:02, 5.25s/it] {'loss': 0.3353, 'grad_norm': 0.5991893062045853, 'learning_rate': 8.1348541432721e-06, 'epoch': 0.31}
31%|███ | 6758/22095 [11:28:45<20:48:23, 4.88s/it] {'loss': 0.3843, 'grad_norm': 0.6641871745828792, 'learning_rate': 8.134283132801562e-06, 'epoch': 0.31}
31%|███ | 6759/22095 [11:28:49<19:31:01, 4.58s/it] {'loss': 0.3622, 'grad_norm': 0.615570207173398, 'learning_rate': 8.133712054984906e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6760/22095 [11:28:59<25:46:51, 6.05s/it] {'loss': 0.5002, 'grad_norm': 0.44958620298115815, 'learning_rate': 8.133140909834402e-06, 'epoch': 0.31}
31%|███ | 6761/22095 [11:29:02<22:27:19, 5.27s/it] {'loss': 0.3919, 'grad_norm': 0.738740954931246, 'learning_rate': 8.132569697362323e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6762/22095 [11:29:12<27:44:00, 6.51s/it] {'loss': 0.5022, 'grad_norm': 0.3629080785737851, 'learning_rate': 8.131998417580942e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6763/22095 [11:29:15<23:27:55, 5.51s/it] {'loss': 0.3578, 'grad_norm': 0.6430334797680526, 'learning_rate': 8.131427070502535e-06, 'epoch': 0.31}
31%|███ | 6764/22095 [11:29:18<20:14:27, 4.75s/it] {'loss': 0.3389, 'grad_norm': 0.6998391775286003, 'learning_rate': 8.130855656139375e-06, 'epoch': 0.31}
31%|███ | 6765/22095 [11:29:20<17:45:11, 4.17s/it] {'loss': 0.3888, 'grad_norm': 0.7018326564652131, 'learning_rate': 8.130284174503746e-06, 'epoch': 0.31}
31%|███ | 6766/22095 [11:29:24<16:27:14, 3.86s/it] {'loss': 0.351, 'grad_norm': 0.6413065879688464, 'learning_rate': 8.129712625607924e-06, 'epoch': 0.31}
31%|███ | 6767/22095 [11:29:27<15:19:50, 3.60s/it] {'loss': 0.3305, 'grad_norm': 0.6484535268441277, 'learning_rate': 8.129141009464187e-06, 'epoch': 0.31}
31%|███ | 6768/22095 [11:29:29<14:22:58, 3.38s/it] {'loss': 0.3459, 'grad_norm': 0.6539147847237137, 'learning_rate': 8.128569326084824e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (80674 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96979 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6769/22095 [11:29:33<14:37:52, 3.44s/it] {'loss': 0.4109, 'grad_norm': 0.6790155149435549, 'learning_rate': 8.127997575482112e-06, 'epoch': 0.31}
31%|███ | 6770/22095 [11:29:36<13:55:11, 3.27s/it] {'loss': 0.3752, 'grad_norm': 0.6114908532644882, 'learning_rate': 8.127425757668338e-06, 'epoch': 0.31}
31%|███ | 6771/22095 [11:29:40<15:19:36, 3.60s/it] {'loss': 0.4123, 'grad_norm': 0.6098158272610537, 'learning_rate': 8.12685387265579e-06, 'epoch': 0.31}
31%|███ | 6772/22095 [11:29:44<15:30:31, 3.64s/it] {'loss': 0.4011, 'grad_norm': 0.6409207978303622, 'learning_rate': 8.126281920456758e-06, 'epoch': 0.31}
31%|███ | 6773/22095 [11:29:48<15:24:40, 3.62s/it] {'loss': 0.4185, 'grad_norm': 0.7401398533726881, 'learning_rate': 8.12570990108353e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (89480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66506 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60546 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6774/22095 [11:29:52<16:14:35, 3.82s/it] {'loss': 0.356, 'grad_norm': 0.6356749159664686, 'learning_rate': 8.125137814548394e-06, 'epoch': 0.31}
31%|███ | 6775/22095 [11:29:55<15:27:11, 3.63s/it] {'loss': 0.3638, 'grad_norm': 0.6529145959466799, 'learning_rate': 8.124565660863643e-06, 'epoch': 0.31}
31%|███ | 6776/22095 [11:29:59<15:42:51, 3.69s/it] {'loss': 0.3563, 'grad_norm': 0.6608892458641886, 'learning_rate': 8.123993440041576e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6777/22095 [11:30:06<19:56:48, 4.69s/it] {'loss': 0.5073, 'grad_norm': 0.5521789874530064, 'learning_rate': 8.123421152094481e-06, 'epoch': 0.31}
31%|███ | 6778/22095 [11:30:09<18:01:50, 4.24s/it] {'loss': 0.3584, 'grad_norm': 0.6905730622680517, 'learning_rate': 8.12284879703466e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (47815 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60566 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44349 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44426 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45010 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (55874 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94316 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6779/22095 [11:30:13<17:40:34, 4.15s/it] {'loss': 0.3673, 'grad_norm': 0.6080164401091869, 'learning_rate': 8.12227637487441e-06, 'epoch': 0.31}
31%|███ | 6780/22095 [11:30:17<17:32:51, 4.12s/it] {'loss': 0.3932, 'grad_norm': 0.6349719454277032, 'learning_rate': 8.121703885626029e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (53426 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51272 > 40960).
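The "Token indices sequence length is longer than the specified maximum sequence length" lines are tokenizer warnings; the "Rank 0: ... Truncating to 40960" line shows the data pipeline then clipping overlong samples to the model's 40960-token context. A minimal sketch of that clipping step (hypothetical helper names, assuming `input_ids` and `labels` are kept aligned):

```python
MAX_SEQ_LEN = 40960  # maximum context length reported in the warnings


def truncate_input_ids(input_ids, labels, max_len=MAX_SEQ_LEN):
    """Clip a tokenized sample to the model's maximum sequence length.

    Sketch of the behaviour implied by the log line
    'Truncating to 40960 with 1 samples' (not the repo's actual code).
    `labels` is truncated in lockstep so the loss mask stays aligned.
    """
    if len(input_ids) <= max_len:
        return input_ids, labels
    return input_ids[:max_len], labels[:max_len]


# An overlong sample like the (45010 > 40960) case in the log:
ids, lbls = truncate_input_ids(list(range(45010)), list(range(45010)))
assert len(ids) == 40960 and len(lbls) == 40960
```

Note that blind tail truncation can cut an answer off mid-turn; filtering such samples out at preprocessing time avoids both the warning spam and the wasted compute.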
Running this sequence through the model will result in indexing errors
31%|███ | 6781/22095 [11:30:20<16:12:44, 3.81s/it] {'loss': 0.3911, 'grad_norm': 0.6416433540100211, 'learning_rate': 8.12113132930182e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6782/22095 [11:30:30<23:21:30, 5.49s/it] {'loss': 0.4932, 'grad_norm': 0.3495737986624057, 'learning_rate': 8.120558705914083e-06, 'epoch': 0.31}
31%|███ | 6783/22095 [11:30:34<21:44:53, 5.11s/it] {'loss': 0.3361, 'grad_norm': 0.5923057827572341, 'learning_rate': 8.119986015475126e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6784/22095 [11:30:37<19:33:02, 4.60s/it] {'loss': 0.3557, 'grad_norm': 0.707659390930438, 'learning_rate': 8.11941325799725e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (52436 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6785/22095 [11:30:40<17:28:53, 4.11s/it] {'loss': 0.3898, 'grad_norm': 0.7240388985619817, 'learning_rate': 8.118840433492764e-06, 'epoch': 0.31}
31%|███ | 6786/22095 [11:30:43<16:04:05, 3.78s/it] {'loss': 0.3738, 'grad_norm': 0.6417285146209729, 'learning_rate': 8.118267541973975e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6787/22095 [11:30:53<23:26:42, 5.51s/it] {'loss': 0.4988, 'grad_norm': 0.3553061010073323, 'learning_rate': 8.117694583453195e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6788/22095 [11:30:56<20:43:46, 4.88s/it] {'loss': 0.3736, 'grad_norm': 0.6635298379528045, 'learning_rate': 8.117121557942733e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6789/22095 [11:31:05<26:14:29, 6.17s/it] {'loss': 0.4785, 'grad_norm': 0.312431761561904, 'learning_rate': 8.116548465454902e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (78788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109724 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59193 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6790/22095 [11:31:08<22:20:19, 5.25s/it] {'loss': 0.3714, 'grad_norm': 0.741654489937296, 'learning_rate': 8.115975306002018e-06, 'epoch': 0.31}
31%|███ | 6791/22095 [11:31:12<20:32:21, 4.83s/it] {'loss': 0.3504, 'grad_norm': 0.6059806753467073, 'learning_rate': 8.115402079596392e-06, 'epoch': 0.31}
31%|███ | 6792/22095 [11:31:15<18:07:19, 4.26s/it] {'loss': 0.3692, 'grad_norm': 0.6754900935493422, 'learning_rate': 8.114828786250345e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6793/22095 [11:31:22<20:44:40, 4.88s/it] {'loss': 0.4703, 'grad_norm': 0.379407826248663, 'learning_rate': 8.114255425976193e-06, 'epoch': 0.31}
31%|███ | 6794/22095 [11:31:25<19:22:00, 4.56s/it] {'loss': 0.3734, 'grad_norm': 0.6066107353561802, 'learning_rate': 8.113681998786257e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6795/22095 [11:31:29<17:37:02, 4.15s/it] {'loss': 0.3494, 'grad_norm': 0.6140946118613567, 'learning_rate': 8.113108504692858e-06, 'epoch': 0.31}
31%|███ | 6796/22095 [11:31:33<17:22:01, 4.09s/it] {'loss': 0.4195, 'grad_norm': 0.6478485310329225, 'learning_rate': 8.11253494370832e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (45347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68050 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51066 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94231 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47842 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (70619 > 40960) for 4 sample(s). Truncating to 262 with 1 samples.
31%|███ | 6797/22095 [11:31:37<17:22:08, 4.09s/it] {'loss': 0.3788, 'grad_norm': 0.6222099995071896, 'learning_rate': 8.111961315844964e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6798/22095 [11:31:46<24:21:54, 5.73s/it] {'loss': 0.4689, 'grad_norm': 0.32387049358651526, 'learning_rate': 8.111387621115116e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (89909 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6799/22095 [11:31:49<20:57:27, 4.93s/it] {'loss': 0.3445, 'grad_norm': 0.7076263821704921, 'learning_rate': 8.110813859531104e-06, 'epoch': 0.31}
31%|███ | 6800/22095 [11:31:52<18:27:11, 4.34s/it] {'loss': 0.3686, 'grad_norm': 0.6373148724099768, 'learning_rate': 8.110240031105257e-06, 'epoch': 0.31}
31%|███ | 6801/22095 [11:31:56<17:13:08, 4.05s/it] {'loss': 0.3462, 'grad_norm': 0.7596219343419965, 'learning_rate': 8.109666135849905e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6802/22095 [11:31:59<16:26:02, 3.87s/it] {'loss': 0.3602, 'grad_norm': 0.6406151701818559, 'learning_rate': 8.109092173777376e-06, 'epoch': 0.31}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    raise ValueError(
ValueError: Number of image tokens ['data/bottom-navigation/other_screenshot/original/ProductivityBottomNavigation_1739983263.5379376.png'] does not match number of images None
[Try #0] Failed to fetch sample 1868482 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. Exception: Number of image tokens ['data/bottom-navigation/other_screenshot/original/ProductivityBottomNavigation_1739983263.5379376.png'] does not match number of images None
Problematic sample: {'image': 'data/bottom-navigation/other_screenshot/original/ProductivityBottomNavigation_1739983263.5379376.png', 'conversations': [], 'image_id': 'data/bottom-navigation/other_screenshot/original/ProductivityBottomNavigation_1739983263.5379376.png'}
31%|███ | 6803/22095 [11:32:02<14:57:50, 3.52s/it] {'loss': 0.3321, 'grad_norm': 0.6168811765740875, 'learning_rate': 8.108518144900007e-06, 'epoch': 0.31}
31%|███ | 6804/22095 [11:32:05<14:41:35, 3.46s/it] {'loss': 0.3178, 'grad_norm': 0.6369882372627966, 'learning_rate': 8.10794404923013e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45473 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41049 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87611 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6805/22095 [11:32:15<22:25:46, 5.28s/it] {'loss': 0.4625, 'grad_norm': 0.3820706526356615, 'learning_rate': 8.107369886780082e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (101018 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75919 > 40960).
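Two related failure modes appear here: the recoverable "Rank 0: Number of image tokens 0 does not match number of images 1 / Fixed image tokens in the conversation" pair, where a missing image placeholder is repaired in place, and the fatal `ValueError` above, where a sample with an empty `conversations` list cannot be repaired and is refetched. A minimal sketch of such a repair pass (hypothetical helper; the placeholder string and repair strategy are assumptions, not the repo's actual code):

```python
IMAGE_TOKEN = "<image>"  # placeholder assumed by Qwen-VL-style chat templates


def fix_image_tokens(conversations, num_images):
    """Ensure the conversation carries one placeholder per image.

    Hypothetical repair pass mirroring the 'Fixed image tokens in the
    conversation' log line: count placeholders and, if too few, prepend
    the missing ones to the first turn. An empty conversation (like the
    problematic sample above) cannot be repaired and is left unchanged.
    """
    n_tokens = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    missing = num_images - n_tokens
    if missing > 0 and conversations:
        conversations[0]["value"] = (
            IMAGE_TOKEN * missing + "\n" + conversations[0]["value"]
        )
    return conversations


conv = [{"from": "human", "value": "Describe the screenshot."}]
fixed = fix_image_tokens(conv, num_images=1)
assert fixed[0]["value"].count(IMAGE_TOKEN) == 1
```

Logging the repair (as this run does) rather than failing silently makes it possible to audit how often the source data is malformed.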
Running this sequence through the model will result in indexing errors 31%|███ | 6806/22095 [11:32:18<20:14:58, 4.77s/it] {'loss': 0.331, 'grad_norm': 0.6049111052281383, 'learning_rate': 8.106795657562197e-06, 'epoch': 0.31} 31%|███ | 6806/22095 [11:32:18<20:14:58, 4.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55843 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6807/22095 [11:32:28<26:25:03, 6.22s/it] {'loss': 0.5085, 'grad_norm': 0.3170775102658955, 'learning_rate': 8.106221361588814e-06, 'epoch': 0.31} 31%|███ | 6807/22095 [11:32:28<26:25:03, 6.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6808/22095 [11:32:31<22:43:48, 5.35s/it] {'loss': 0.3939, 'grad_norm': 0.6396101195957737, 'learning_rate': 8.105646998872275e-06, 'epoch': 0.31} 31%|███ | 6808/22095 [11:32:31<22:43:48, 5.35s/it] 31%|███ | 6809/22095 [11:32:34<19:59:10, 4.71s/it] {'loss': 0.3392, 'grad_norm': 0.6137977068626062, 'learning_rate': 8.10507256942492e-06, 'epoch': 0.31} 31%|███ | 6809/22095 [11:32:34<19:59:10, 4.71s/it] 31%|███ | 6810/22095 [11:32:38<19:14:25, 4.53s/it] {'loss': 0.338, 'grad_norm': 0.6733315999273725, 'learning_rate': 8.104498073259093e-06, 'epoch': 0.31} 31%|███ | 6810/22095 [11:32:38<19:14:25, 4.53s/it] 31%|███ | 6811/22095 [11:32:42<17:31:12, 4.13s/it] {'loss': 0.3999, 'grad_norm': 0.6554373555685616, 'learning_rate': 8.103923510387138e-06, 'epoch': 0.31} 31%|███ | 6811/22095 [11:32:42<17:31:12, 4.13s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047734 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 31%|███ | 6812/22095 [11:32:45<16:37:33, 3.92s/it] {'loss': 0.3468, 'grad_norm': 0.6453238347422188, 'learning_rate': 8.103348880821399e-06, 'epoch': 0.31} 31%|███ | 6812/22095 [11:32:45<16:37:33, 3.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (79394 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44960 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81500 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110153 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100495 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6813/22095 [11:32:55<23:42:46, 5.59s/it] {'loss': 0.5147, 'grad_norm': 0.4037572749428178, 'learning_rate': 8.10277418457422e-06, 'epoch': 0.31} 31%|███ | 6813/22095 [11:32:55<23:42:46, 5.59s/it] 31%|███ | 6814/22095 [11:33:03<27:28:59, 6.47s/it] {'loss': 0.509, 'grad_norm': 0.3979089005469083, 'learning_rate': 8.102199421657957e-06, 'epoch': 0.31} 31%|███ | 6814/22095 [11:33:03<27:28:59, 6.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308587 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2ijr1mrYI8KJjy0FaXXbAiVXa_!!1067331405.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请读出图片内的文字,不要输出文字的具体位置。'}, {'from': 'gpt', 'value': '图中所有文字:\n厂家直销\n疯抢中!\n葡式蛋挞\n葡式蛋撻\n新鲜出炉,现做现\nDelicious\n葡式蛋挞FAMOUSEGGTART\n2\n健康\n美味\n同行\n50\n一件\n个\n9.9\n全国包邮\n元\n疯狂价\n¥'}]} 31%|███ | 6815/22095 [11:33:07<23:39:43, 5.57s/it] {'loss': 0.3963, 'grad_norm': 0.6515789874392688, 'learning_rate': 8.101624592084956e-06, 'epoch': 0.31} 31%|███ | 6815/22095 [11:33:07<23:39:43, 5.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70703 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70881 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59197 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6816/22095 [11:33:10<20:31:40, 4.84s/it] {'loss': 0.3619, 'grad_norm': 0.7141706623863724, 'learning_rate': 8.101049695867566e-06, 'epoch': 0.31} 31%|███ | 6816/22095 [11:33:10<20:31:40, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47795 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55822 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6817/22095 [11:33:13<18:46:31, 4.42s/it] {'loss': 0.3569, 'grad_norm': 0.6568144635043696, 'learning_rate': 8.100474733018145e-06, 'epoch': 0.31} 31%|███ | 6817/22095 [11:33:13<18:46:31, 4.42s/it] 31%|███ | 6818/22095 [11:33:17<17:51:25, 4.21s/it] {'loss': 0.3557, 'grad_norm': 0.687926850848241, 'learning_rate': 8.099899703549043e-06, 'epoch': 0.31} 31%|███ | 6818/22095 [11:33:17<17:51:25, 4.21s/it] 31%|███ | 6819/22095 [11:33:21<17:12:55, 4.06s/it] {'loss': 0.3915, 'grad_norm': 0.7585055645699293, 'learning_rate': 8.099324607472619e-06, 'epoch': 0.31} 31%|███ | 6819/22095 [11:33:21<17:12:55, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67475 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66239 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58192 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44681 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103255 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6820/22095 [11:33:24<16:08:45, 3.81s/it] {'loss': 0.3558, 'grad_norm': 0.6493471634676304, 'learning_rate': 8.098749444801226e-06, 'epoch': 0.31} 31%|███ | 6820/22095 [11:33:24<16:08:45, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/terminal/4883f6e6-c658-4d61-9cf9-e32c2b812a80/images/step_5.png 2025-08-28 03:31:22.990860 load time: 1224.3 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 03:31:23.026051 load time: 1151.84 ms 31%|███ | 6821/22095 [11:33:33<23:19:24, 5.50s/it] {'loss': 0.4819, 'grad_norm': 0.5066541089205197, 'learning_rate': 8.098174215547224e-06, 'epoch': 0.31} 31%|███ | 6821/22095 [11:33:33<23:19:24, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72170 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6822/22095 [11:33:37<21:23:30, 5.04s/it] {'loss': 0.3227, 'grad_norm': 0.6462727041437436, 'learning_rate': 8.097598919722975e-06, 'epoch': 0.31} 31%|███ | 6822/22095 [11:33:37<21:23:30, 5.04s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 03:31:35.950838 load time: 1363.51 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 03:31:37.199231 load time: 1049.99 ms 31%|███ | 6823/22095 [11:33:40<18:37:28, 4.39s/it] {'loss': 0.3285, 'grad_norm': 0.636904943838202, 'learning_rate': 8.097023557340837e-06, 'epoch': 0.31} 31%|███ | 6823/22095 [11:33:40<18:37:28, 4.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6824/22095 [11:33:49<24:07:39, 5.69s/it] {'loss': 0.494, 'grad_norm': 0.3604508863174562, 'learning_rate': 8.096448128413177e-06, 'epoch': 0.31} 31%|███ | 6824/22095 [11:33:49<24:07:39, 5.69s/it] 31%|███ | 6825/22095 [11:33:59<30:00:52, 7.08s/it] {'loss': 0.5074, 'grad_norm': 0.3033044941525956, 'learning_rate': 8.095872632952354e-06, 'epoch': 0.31} 31%|███ | 6825/22095 [11:33:59<30:00:52, 7.08s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6826/22095 [11:34:03<26:16:10, 6.19s/it] {'loss': 0.4, 'grad_norm': 0.6798141188672809, 'learning_rate': 8.095297070970738e-06, 'epoch': 0.31} 31%|███ | 6826/22095 [11:34:03<26:16:10, 6.19s/it] 31%|███ | 6827/22095 [11:34:07<22:47:59, 5.38s/it] {'loss': 0.3725, 'grad_norm': 0.6561679938459225, 'learning_rate': 8.094721442480696e-06, 'epoch': 0.31} 31%|███ | 6827/22095 
[11:34:07<22:47:59, 5.38s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6828/22095 [11:34:17<28:39:45, 6.76s/it] {'loss': 0.4766, 'grad_norm': 0.4073413244614808, 'learning_rate': 8.094145747494591e-06, 'epoch': 0.31}
31%|███ | 6829/22095 [11:34:21<25:21:42, 5.98s/it] {'loss': 0.3615, 'grad_norm': 0.7145571037258346, 'learning_rate': 8.093569986024798e-06, 'epoch': 0.31}
Invalidate trace cache @ step 2: expected module 1, but got module 364
31%|███ | 6830/22095 [11:34:31<30:19:54, 7.15s/it] {'loss': 0.5162, 'grad_norm': 0.42268248652396634, 'learning_rate': 8.092994158083689e-06, 'epoch': 0.31}
31%|███ | 6831/22095 [11:34:35<26:21:51, 6.22s/it] {'loss': 0.4114, 'grad_norm': 0.6628200350098077, 'learning_rate': 8.092418263683635e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 03:32:34.754648 load time: 1244.25 ms
31%|███ | 6832/22095 [11:34:38<22:18:34, 5.26s/it] {'loss': 0.3592, 'grad_norm': 0.6047039574344523, 'learning_rate': 8.091842302837009e-06, 'epoch': 0.31}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365272 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32013, 'image': 'vrdu_table_final_2/astro-ph.CO/eeb10d3b-0539-429c-ab71-773235725a87.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}${\\bf f_X}$\\end{tabular}\n```"}]}
31%|███ | 6833/22095 [11:34:41<19:50:15, 4.68s/it] {'loss': 0.4155, 'grad_norm': 0.6438819114591449, 'learning_rate': 8.091266275556188e-06, 'epoch': 0.31}
31%|███ | 6834/22095 [11:34:44<17:39:55, 4.17s/it] {'loss': 0.3785, 'grad_norm': 0.6437391796856079, 'learning_rate': 8.090690181853548e-06, 'epoch': 0.31}
31%|███ | 6835/22095 [11:34:47<16:17:10, 3.84s/it] {'loss': 0.3633, 'grad_norm': 0.6030900916072861, 'learning_rate': 8.09011402174147e-06, 'epoch': 0.31}
VC:s3://gui-agent/data_20250612/mac/images/calculator/03624071-2760-416c-80f5-d94aa8dfebce/images/step_0.png 2025-08-28 03:32:45.422128 load time: 1080.94 ms
31%|███ | 6836/22095 [11:34:52<17:01:41, 4.02s/it] {'loss': 0.3587, 'grad_norm': 0.7355170778499127, 'learning_rate': 8.089537795232331e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (90379 > 40960).
Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924294 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nWhat is the north to south extent of USA? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Not mentioned.\nThe texts do not provide information about the north to south extent of the USA.'}]}
31%|███ | 6837/22095 [11:34:55<15:47:34, 3.73s/it] {'loss': 0.3725, 'grad_norm': 0.6698188931352116, 'learning_rate': 8.088961502338514e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6838/22095 [11:34:59<16:09:02, 3.81s/it] {'loss': 0.3586, 'grad_norm': 0.7400131968976658, 'learning_rate': 8.088385143072402e-06, 'epoch': 0.31}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
31%|███ | 6839/22095 [11:35:02<15:31:43, 3.66s/it] {'loss': 0.3694, 'grad_norm': 0.6718189379267882, 'learning_rate': 8.087808717446377e-06, 'epoch': 0.31}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922255 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45408, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
31%|███ | 6840/22095 [11:35:06<16:23:40, 3.87s/it] {'loss': 0.3561, 'grad_norm': 0.6473017686964793, 'learning_rate': 8.087232225472827e-06, 'epoch': 0.31}
31%|███ | 6841/22095 [11:35:10<15:44:52, 3.72s/it] {'loss': 0.3661, 'grad_norm': 1.0674858541583925, 'learning_rate': 8.086655667164137e-06, 'epoch': 0.31}
31%|███ | 6842/22095 [11:35:14<16:35:43, 3.92s/it] {'loss': 0.3864, 'grad_norm': 0.5969628411282236, 'learning_rate': 8.086079042532699e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (51719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89987 > 40960). Running this sequence through the model will result in indexing errors
31%|███ | 6843/22095 [11:35:18<16:12:43, 3.83s/it] {'loss': 0.3402, 'grad_norm': 0.5847828086110796, 'learning_rate': 8.0855023515909e-06, 'epoch': 0.31}
Token indices sequence length is longer than the specified maximum sequence length for this model (62222 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48756 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49005 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78538 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6844/22095 [11:35:22<16:18:44, 3.85s/it] {'loss': 0.3885, 'grad_norm': 0.6408510947564642, 'learning_rate': 8.08492559435113e-06, 'epoch': 0.31} 31%|███ | 6844/22095 [11:35:22<16:18:44, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6845/22095 [11:35:31<23:27:12, 5.54s/it] {'loss': 0.4956, 'grad_norm': 0.6446133618228684, 'learning_rate': 8.084348770825785e-06, 'epoch': 0.31} 31%|███ | 6845/22095 [11:35:31<23:27:12, 5.54s/it] 31%|███ | 6846/22095 [11:35:35<21:11:42, 5.00s/it] {'loss': 0.4082, 'grad_norm': 0.6549341012350445, 'learning_rate': 8.083771881027259e-06, 'epoch': 0.31} 31%|███ | 6846/22095 [11:35:35<21:11:42, 5.00s/it] 31%|███ | 6847/22095 [11:35:39<20:05:46, 4.74s/it] {'loss': 0.4099, 'grad_norm': 0.7068487841425369, 'learning_rate': 8.083194924967943e-06, 'epoch': 0.31} 31%|███ | 6847/22095 [11:35:39<20:05:46, 4.74s/it] 31%|███ | 6848/22095 [11:35:42<18:17:35, 4.32s/it] {'loss': 0.3243, 'grad_norm': 0.7007918864113356, 'learning_rate': 8.08261790266024e-06, 'epoch': 0.31} 31%|███ | 6848/22095 [11:35:42<18:17:35, 4.32s/it] 31%|███ | 6849/22095 [11:35:46<16:57:25, 4.00s/it] {'loss': 0.3626, 'grad_norm': 0.629604268051431, 'learning_rate': 8.082040814116545e-06, 'epoch': 0.31} 31%|███ | 6849/22095 [11:35:46<16:57:25, 4.00s/it] 31%|███ | 6850/22095 [11:35:50<17:36:11, 4.16s/it] {'loss': 
0.3375, 'grad_norm': 0.6278545282083574, 'learning_rate': 8.081463659349258e-06, 'epoch': 0.31} 31%|███ | 6850/22095 [11:35:50<17:36:11, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_2/images/step_0.png 2025-08-28 03:33:49.718308 load time: 1049.14 ms 31%|███ | 6851/22095 [11:36:00<25:24:25, 6.00s/it] {'loss': 0.5128, 'grad_norm': 0.4734338413743656, 'learning_rate': 8.080886438370781e-06, 'epoch': 0.31} 31%|███ | 6851/22095 [11:36:00<25:24:25, 6.00s/it] 31%|███ | 6852/22095 [11:36:10<30:22:42, 7.17s/it] {'loss': 0.4826, 'grad_norm': 0.4099808770615001, 'learning_rate': 8.080309151193517e-06, 'epoch': 0.31} 31%|███ | 6852/22095 [11:36:10<30:22:42, 7.17s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (54376 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6853/22095 [11:36:14<25:54:08, 6.12s/it] {'loss': 0.3824, 'grad_norm': 0.6989913052154422, 'learning_rate': 8.07973179782987e-06, 'epoch': 0.31} 31%|███ | 6853/22095 [11:36:14<25:54:08, 6.12s/it] 31%|███ | 6854/22095 [11:36:17<22:25:54, 5.30s/it] {'loss': 0.3345, 'grad_norm': 0.6248720257681796, 'learning_rate': 8.079154378292246e-06, 'epoch': 0.31} 31%|███ | 6854/22095 [11:36:17<22:25:54, 5.30s/it] 31%|███ | 6855/22095 [11:36:21<20:43:25, 4.90s/it] {'loss': 0.363, 'grad_norm': 0.6139729843377981, 'learning_rate': 8.07857689259305e-06, 'epoch': 0.31} 31%|███ | 6855/22095 [11:36:21<20:43:25, 4.90s/it] 31%|███ | 6856/22095 [11:36:24<18:24:26, 4.35s/it] {'loss': 0.3383, 'grad_norm': 0.6465785975930611, 'learning_rate': 8.077999340744694e-06, 'epoch': 0.31} 31%|███ | 6856/22095 [11:36:24<18:24:26, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58143 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74687 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6857/22095 [11:36:28<17:31:51, 4.14s/it] {'loss': 0.3197, 'grad_norm': 0.654275213477745, 'learning_rate': 8.077421722759584e-06, 'epoch': 0.31} 31%|███ | 6857/22095 [11:36:28<17:31:51, 4.14s/it] 31%|███ | 6858/22095 [11:36:31<16:01:10, 3.78s/it] {'loss': 0.3761, 'grad_norm': 0.7020318044378301, 'learning_rate': 8.076844038650133e-06, 'epoch': 0.31} 31%|███ | 6858/22095 [11:36:31<16:01:10, 3.78s/it] 31%|███ | 6859/22095 [11:36:35<15:56:18, 3.77s/it] {'loss': 0.3691, 'grad_norm': 0.6495770931717897, 'learning_rate': 8.076266288428753e-06, 'epoch': 0.31} 31%|███ | 6859/22095 [11:36:35<15:56:18, 3.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51064 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6860/22095 [11:36:38<14:53:40, 3.52s/it] {'loss': 0.3725, 'grad_norm': 0.8143884827910683, 'learning_rate': 8.075688472107859e-06, 'epoch': 0.31} 31%|███ | 6860/22095 [11:36:38<14:53:40, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6861/22095 [11:36:44<18:59:01, 4.49s/it] {'loss': 0.5127, 'grad_norm': 0.8094018848341005, 'learning_rate': 8.075110589699866e-06, 'epoch': 0.31} 31%|███ | 6861/22095 [11:36:44<18:59:01, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57116 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58688 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6862/22095 [11:36:48<18:10:20, 4.29s/it] {'loss': 0.4156, 'grad_norm': 0.6774066803171471, 'learning_rate': 8.07453264121719e-06, 'epoch': 0.31} 31%|███ | 6862/22095 [11:36:48<18:10:20, 4.29s/it] 31%|███ | 6863/22095 [11:36:51<16:16:37, 3.85s/it] {'loss': 0.3314, 'grad_norm': 0.6939323935945736, 'learning_rate': 8.07395462667225e-06, 'epoch': 0.31} 31%|███ | 6863/22095 [11:36:51<16:16:37, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41585 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80010 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127276 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6864/22095 [11:36:55<16:27:13, 3.89s/it] {'loss': 0.3835, 'grad_norm': 0.6902658938875413, 'learning_rate': 8.073376546077468e-06, 'epoch': 0.31} 31%|███ | 6864/22095 [11:36:55<16:27:13, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (119302 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6865/22095 [11:36:59<16:30:16, 3.90s/it] {'loss': 0.4922, 'grad_norm': 0.34823630616883156, 'learning_rate': 8.07279839944526e-06, 'epoch': 0.31} 31%|███ | 6865/22095 [11:36:59<16:30:16, 3.90s/it] 31%|███ | 6866/22095 [11:37:02<15:51:40, 3.75s/it] {'loss': 0.3656, 'grad_norm': 0.7711677796759714, 'learning_rate': 8.072220186788056e-06, 'epoch': 0.31} 31%|███ | 6866/22095 [11:37:02<15:51:40, 3.75s/it] 31%|███ | 6867/22095 [11:37:06<15:12:00, 3.59s/it] {'loss': 0.3609, 'grad_norm': 0.6111540013077865, 'learning_rate': 8.071641908118273e-06, 'epoch': 0.31} 31%|███ | 6867/22095 [11:37:06<15:12:00, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (53161 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77733 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6868/22095 [11:37:13<19:51:24, 4.69s/it] {'loss': 0.512, 'grad_norm': 0.4677566680030358, 'learning_rate': 8.071063563448341e-06, 'epoch': 0.31} 31%|███ | 6868/22095 [11:37:13<19:51:24, 4.69s/it] 31%|███ | 6869/22095 [11:37:16<18:13:28, 4.31s/it] {'loss': 0.3427, 'grad_norm': 0.6404382698128002, 'learning_rate': 8.070485152790684e-06, 'epoch': 0.31} 31%|███ | 6869/22095 [11:37:16<18:13:28, 4.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6870/22095 [11:37:26<24:40:18, 5.83s/it] {'loss': 0.4937, 'grad_norm': 0.3686604494675354, 'learning_rate': 8.06990667615773e-06, 'epoch': 0.31} 31%|███ | 6870/22095 [11:37:26<24:40:18, 5.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6871/22095 [11:37:29<21:48:14, 5.16s/it] {'loss': 0.3958, 'grad_norm': 0.6986737941715713, 'learning_rate': 8.069328133561911e-06, 'epoch': 0.31} 31%|███ | 6871/22095 [11:37:29<21:48:14, 5.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_1/images/step_7.png 2025-08-28 03:35:29.183890 load time: 1003.29 ms 31%|███ | 6872/22095 [11:37:39<27:21:36, 6.47s/it] {'loss': 0.4915, 'grad_norm': 0.3162165790029801, 'learning_rate': 8.068749525015658e-06, 'epoch': 0.31} 31%|███ | 6872/22095 [11:37:39<27:21:36, 6.47s/it]VC:s3://gui-agent/data_20250407/web/images/douyin_com/trajectory_11/img/step_0.png 2025-08-28 03:35:37.502622 load time: 1249.32 ms 31%|███ | 6873/22095 [11:37:42<23:15:02, 5.50s/it] {'loss': 0.3809, 'grad_norm': 0.8038232490379009, 'learning_rate': 8.068170850531401e-06, 'epoch': 0.31} 31%|███ | 6873/22095 [11:37:42<23:15:02, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (94883 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44540 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49377 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6874/22095 [11:37:45<20:00:06, 4.73s/it] {'loss': 0.3954, 'grad_norm': 0.6650645932764687, 'learning_rate': 8.067592110121576e-06, 'epoch': 0.31} 31%|███ | 6874/22095 [11:37:45<20:00:06, 4.73s/it] 31%|███ | 6875/22095 [11:37:48<17:42:06, 4.19s/it] {'loss': 0.3501, 'grad_norm': 0.6527969870447125, 'learning_rate': 8.06701330379862e-06, 'epoch': 0.31} 31%|███ | 6875/22095 [11:37:48<17:42:06, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6876/22095 [11:37:57<24:12:57, 5.73s/it] {'loss': 0.4975, 'grad_norm': 0.5320663380971439, 'learning_rate': 8.066434431574965e-06, 'epoch': 0.31} 31%|███ | 6876/22095 [11:37:57<24:12:57, 5.73s/it] 31%|███ | 6877/22095 [11:38:01<21:35:07, 5.11s/it] {'loss': 0.3703, 'grad_norm': 0.7166134358312011, 'learning_rate': 8.065855493463055e-06, 'epoch': 0.31} 31%|███ | 6877/22095 [11:38:01<21:35:07, 5.11s/it] 31%|███ | 6878/22095 [11:38:05<20:13:42, 4.79s/it] {'loss': 0.3682, 'grad_norm': 0.6843795237430064, 'learning_rate': 8.065276489475324e-06, 'epoch': 0.31} 31%|███ | 6878/22095 [11:38:05<20:13:42, 4.79s/it] 31%|███ | 6879/22095 [11:38:08<17:45:00, 4.20s/it] {'loss': 0.3842, 'grad_norm': 0.628539631506995, 'learning_rate': 8.064697419624216e-06, 'epoch': 0.31} 31%|███ | 6879/22095 [11:38:08<17:45:00, 4.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44990 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76098 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77099 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50974 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43745 > 40960). Running this sequence through the model will result in indexing errors 31%|███ | 6880/22095 [11:38:11<16:52:10, 3.99s/it] {'loss': 0.3472, 'grad_norm': 0.63461679915827, 'learning_rate': 8.064118283922173e-06, 'epoch': 0.31} 31%|███ | 6880/22095 [11:38:11<16:52:10, 3.99s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250502_111053_3/images/before_screenshot_46_id_70_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 03:36:10.420118 load time: 1220.33 ms 31%|███ | 6881/22095 [11:38:14<15:42:29, 3.72s/it] {'loss': 0.3982, 'grad_norm': 0.5858346718837931, 'learning_rate': 8.06353908238164e-06, 'epoch': 0.31} 31%|███ | 6881/22095 [11:38:14<15:42:29, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6882/22095 [11:38:24<23:34:44, 5.58s/it] {'loss': 0.4678, 'grad_norm': 0.3290559586116341, 'learning_rate': 8.06295981501506e-06, 'epoch': 0.31} 31%|███ | 6882/22095 [11:38:24<23:34:44, 5.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6883/22095 [11:38:28<20:45:51, 4.91s/it] {'loss': 0.3743, 'grad_norm': 0.6397350366827538, 'learning_rate': 8.062380481834881e-06, 'epoch': 0.31} 31%|███ | 
6883/22095 [11:38:28<20:45:51, 4.91s/it] 31%|███ | 6884/22095 [11:38:31<18:47:24, 4.45s/it] {'loss': 0.3446, 'grad_norm': 0.6291797265410776, 'learning_rate': 8.061801082853548e-06, 'epoch': 0.31} 31%|███ | 6884/22095 [11:38:31<18:47:24, 4.45s/it] 31%|███ | 6885/22095 [11:38:34<17:30:30, 4.14s/it] {'loss': 0.3873, 'grad_norm': 0.6852533920605484, 'learning_rate': 8.061221618083519e-06, 'epoch': 0.31} 31%|███ | 6885/22095 [11:38:34<17:30:30, 4.14s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6886/22095 [11:38:37<16:13:47, 3.84s/it] {'loss': 0.3438, 'grad_norm': 0.7471568005288752, 'learning_rate': 8.060642087537233e-06, 'epoch': 0.31} 31%|███ | 6886/22095 [11:38:37<16:13:47, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54984 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78034 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41854 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42507 > 40960) for 4 sample(s). Truncating to 792 with 1 samples. 
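The "Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42507 > 40960) ... Truncating" message above shows the trainer clamping over-long samples to the model's 40960-token limit instead of letting them hit indexing errors. A minimal sketch of that clamping, with hypothetical names (the trainer's actual truncation logic lives in data_qwen_2.py and is not shown in this log):

```python
# Hypothetical sketch, not the trainer's real API: clamp a token id sequence
# to the model maximum so indexing past the limit cannot occur.
MODEL_MAX_LEN = 40960  # the limit reported in the warnings above

def clamp_token_ids(token_ids, max_len=MODEL_MAX_LEN):
    """Truncate a token id list to at most max_len entries."""
    if len(token_ids) <= max_len:
        return token_ids
    # Keep the head of the sequence; a real multimodal pipeline might instead
    # drop the sample or truncate carefully around image placeholder tokens.
    return token_ids[:max_len]

# One of the offending lengths reported above: 42507 tokens.
clamped = clamp_token_ids(list(range(42507)))
```

A pipeline that truncates this way trades lost tail context for a guaranteed in-bounds sequence, which is why the log also reports how many samples were affected.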
31%|███ | 6887/22095 [11:38:41<15:15:48, 3.61s/it] {'loss': 0.3268, 'grad_norm': 0.5769928531316513, 'learning_rate': 8.060062491227154e-06, 'epoch': 0.31} 31%|███ | 6887/22095 [11:38:41<15:15:48, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6888/22095 [11:38:51<23:25:39, 5.55s/it] {'loss': 0.4775, 'grad_norm': 0.39457544398209493, 'learning_rate': 8.059482829165728e-06, 'epoch': 0.31} 31%|███ | 6888/22095 [11:38:51<23:25:39, 5.55s/it] 31%|███ | 6889/22095 [11:38:54<20:49:54, 4.93s/it] {'loss': 0.3274, 'grad_norm': 0.6477673888723783, 'learning_rate': 8.058903101365412e-06, 'epoch': 0.31} 31%|███ | 6889/22095 [11:38:54<20:49:54, 4.93s/it] 31%|███ | 6890/22095 [11:38:58<19:02:30, 4.51s/it] {'loss': 0.3386, 'grad_norm': 0.6592795719191157, 'learning_rate': 8.058323307838665e-06, 'epoch': 0.31} 31%|███ | 6890/22095 [11:38:58<19:02:30, 4.51s/it] 31%|███ | 6891/22095 [11:39:01<17:56:25, 4.25s/it] {'loss': 0.3585, 'grad_norm': 0.7579655825665975, 'learning_rate': 8.05774344859794e-06, 'epoch': 0.31} 31%|███ | 6891/22095 [11:39:01<17:56:25, 4.25s/it] 31%|███ | 6892/22095 [11:39:05<16:46:36, 3.97s/it] {'loss': 0.3319, 'grad_norm': 0.6553535146501523, 'learning_rate': 8.057163523655702e-06, 'epoch': 0.31} 31%|███ | 6892/22095 [11:39:05<16:46:36, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50480 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55790 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51201 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6893/22095 [11:39:08<16:35:11, 3.93s/it] {'loss': 0.4117, 'grad_norm': 0.722228702460221, 'learning_rate': 8.056583533024408e-06, 'epoch': 0.31} 31%|███ | 6893/22095 [11:39:08<16:35:11, 3.93s/it] 31%|███ | 6894/22095 [11:39:11<15:19:29, 3.63s/it] {'loss': 0.3519, 'grad_norm': 0.62822723167227, 'learning_rate': 8.056003476716521e-06, 'epoch': 0.31} 31%|███ | 6894/22095 [11:39:11<15:19:29, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6895/22095 [11:39:21<22:37:32, 5.36s/it] {'loss': 0.4896, 'grad_norm': 0.34302529356673794, 'learning_rate': 8.055423354744507e-06, 'epoch': 0.31} 31%|███ | 6895/22095 [11:39:21<22:37:32, 5.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50393 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144196 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███ | 6896/22095 [11:39:24<19:45:23, 4.68s/it] {'loss': 0.3516, 'grad_norm': 0.8038315586479057, 'learning_rate': 8.054843167120827e-06, 'epoch': 0.31} 31%|███ | 6896/22095 [11:39:24<19:45:23, 4.68s/it] 31%|███ | 6897/22095 [11:39:27<18:15:35, 4.33s/it] {'loss': 0.3913, 'grad_norm': 0.7075627538296707, 'learning_rate': 8.054262913857951e-06, 'epoch': 0.31} 31%|███ | 6897/22095 [11:39:27<18:15:35, 4.33s/it] 31%|███ | 6898/22095 [11:39:30<16:44:22, 3.97s/it] {'loss': 0.3721, 'grad_norm': 0.6365784262746258, 'learning_rate': 8.053682594968346e-06, 'epoch': 0.31} 31%|███ | 6898/22095 [11:39:30<16:44:22, 3.97s/it] 31%|███ | 6899/22095 [11:39:34<16:00:24, 3.79s/it] {'loss': 0.3466, 'grad_norm': 0.6020970936461443, 'learning_rate': 8.053102210464478e-06, 'epoch': 0.31} 31%|███ | 6899/22095 [11:39:34<16:00:24, 3.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███ | 6900/22095 [11:39:37<14:56:20, 3.54s/it] {'loss': 0.3847, 'grad_norm': 0.6240839936563485, 'learning_rate': 8.052521760358822e-06, 'epoch': 0.31} 31%|███ | 6900/22095 [11:39:37<14:56:20, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6901/22095 [11:39:45<20:50:22, 4.94s/it] {'loss': 0.4701, 'grad_norm': 0.30425922617942247, 'learning_rate': 8.05194124466385e-06, 'epoch': 0.31} 31%|███ | 6901/22095 [11:39:45<20:50:22, 4.94s/it] 31%|███ | 6902/22095 [11:39:49<19:11:46, 4.55s/it] {'loss': 0.3752, 'grad_norm': 0.6497985170793312, 'learning_rate': 8.051360663392031e-06, 'epoch': 0.31} 31%|███ | 6902/22095 [11:39:49<19:11:46, 4.55s/it] 31%|███ | 6903/22095 [11:39:53<18:40:36, 4.43s/it] {'loss': 0.4101, 'grad_norm': 0.6580678715176217, 'learning_rate': 8.050780016555846e-06, 'epoch': 0.31} 31%|███ | 6903/22095 [11:39:53<18:40:36, 4.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image 
tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███ | 6904/22095 [11:40:05<28:54:05, 6.85s/it] {'loss': 0.4843, 'grad_norm': 0.2985922174224097, 'learning_rate': 8.050199304167766e-06, 'epoch': 0.31} 31%|███ | 6904/22095 [11:40:05<28:54:05, 6.85s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [87, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8521771 in VC:s3://internvl-moe-sft-data/. Exception: Image size [87, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 82750, 'image': 'vrdu_texteq/astro-ph.CO/6671412b-1a06-4038-ac7e-ba1332817612.png', 'image_wh': [[87, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': '$\\Lambda$CDM'}]} 31%|███▏ | 6905/22095 [11:40:09<25:06:40, 5.95s/it] {'loss': 0.3404, 'grad_norm': 0.6295805144586434, 'learning_rate': 8.04961852624027e-06, 'epoch': 0.31} 31%|███▏ | 6905/22095 [11:40:09<25:06:40, 5.95s/it] 31%|███▏ | 6906/22095 [11:40:12<21:49:11, 5.17s/it] {'loss': 0.352, 'grad_norm': 0.6372763963072887, 'learning_rate': 8.04903768278584e-06, 'epoch': 0.31} 31%|███▏ | 6906/22095 [11:40:12<21:49:11, 5.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███▏ | 6907/22095 [11:40:16<20:12:37, 4.79s/it] {'loss': 0.3757, 'grad_norm': 0.6481452415926576, 'learning_rate': 8.048456773816955e-06, 'epoch': 0.31} 31%|███▏ | 6907/22095 [11:40:16<20:12:37, 4.79s/it] 31%|███▏ | 6908/22095 [11:40:20<18:19:43, 4.34s/it] {'loss': 0.3824, 'grad_norm': 0.7300271667682606, 
'learning_rate': 8.047875799346096e-06, 'epoch': 0.31} 31%|███▏ | 6908/22095 [11:40:20<18:19:43, 4.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71229 > 40960). Running this sequence through the model will result in indexing errors 31%|███▏ | 6909/22095 [11:40:22<16:21:57, 3.88s/it] {'loss': 0.3255, 'grad_norm': 0.6156677324812093, 'learning_rate': 8.047294759385746e-06, 'epoch': 0.31} 31%|███▏ | 6909/22095 [11:40:22<16:21:57, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46752 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███▏ | 6910/22095 [11:40:25<15:08:19, 3.59s/it] {'loss': 0.3683, 'grad_norm': 0.67457907378465, 'learning_rate': 8.046713653948393e-06, 'epoch': 0.31} 31%|███▏ | 6910/22095 [11:40:25<15:08:19, 3.59s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_1/images/step_0.png 2025-08-28 03:38:23.860158 load time: 1474.35 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 03:38:24.175897 load time: 1779.29 ms 31%|███▏ | 6911/22095 [11:40:29<15:15:27, 3.62s/it] {'loss': 0.3724, 'grad_norm': 0.6480105249444751, 'learning_rate': 8.046132483046518e-06, 'epoch': 0.31} 31%|███▏ | 6911/22095 [11:40:29<15:15:27, 3.62s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_0.png 2025-08-28 03:38:28.798917 load time: 1174.92 ms 31%|███▏ | 6912/22095 [11:40:33<15:21:33, 3.64s/it] {'loss': 0.3749, 'grad_norm': 0.6655706607509766, 'learning_rate': 8.045551246692612e-06, 'epoch': 0.31} 31%|███▏ | 6912/22095 [11:40:33<15:21:33, 
3.64s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (111125556 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10005.png 2025-08-28 03:38:31.557973 load time: 1068.2 ms 31%|███▏ | 6913/22095 [11:40:36<14:34:04, 3.45s/it] {'loss': 0.4009, 'grad_norm': 0.696785351305484, 'learning_rate': 8.044969944899165e-06, 'epoch': 0.31} 31%|███▏ | 6913/22095 [11:40:36<14:34:04, 3.45s/it] 31%|███▏ | 6914/22095 [11:40:40<15:07:16, 3.59s/it] {'loss': 0.4151, 'grad_norm': 0.7094320509766985, 'learning_rate': 8.044388577678666e-06, 'epoch': 0.31} 31%|███▏ | 6914/22095 [11:40:40<15:07:16, 3.59s/it] 31%|███▏ | 6915/22095 [11:40:43<14:41:40, 3.48s/it] {'loss': 0.332, 'grad_norm': 0.8859580873747761, 'learning_rate': 8.043807145043604e-06, 'epoch': 0.31} 31%|███▏ | 6915/22095 [11:40:43<14:41:40, 3.48s/it] 31%|███▏ | 6916/22095 [11:40:47<15:42:31, 3.73s/it] {'loss': 0.3363, 'grad_norm': 0.7092949835152653, 'learning_rate': 8.043225647006475e-06, 'epoch': 0.31} 31%|███▏ | 6916/22095 [11:40:47<15:42:31, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███▏ | 6917/22095 [11:40:54<20:07:57, 4.78s/it] {'loss': 0.5173, 'grad_norm': 0.4288553491654917, 'learning_rate': 8.042644083579775e-06, 'epoch': 0.31} 31%|███▏ | 6917/22095 [11:40:54<20:07:57, 4.78s/it] 31%|███▏ | 6918/22095 [11:40:59<19:41:03, 4.67s/it] {'loss': 0.3481, 'grad_norm': 0.6636966437293738, 'learning_rate': 8.042062454775999e-06, 'epoch': 0.31} 31%|███▏ | 6918/22095 [11:40:59<19:41:03, 4.67s/it] 31%|███▏ | 6919/22095 [11:41:02<18:13:02, 4.32s/it] {'loss': 0.3481, 'grad_norm': 0.700343789222212, 'learning_rate': 8.041480760607642e-06, 'epoch': 0.31} 31%|███▏ | 6919/22095 [11:41:02<18:13:02, 4.32s/it]Token indices sequence length is longer than the 
specified maximum sequence length for this model (56694 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60890 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52374 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43282 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77694 > 40960). Running this sequence through the model will result in indexing errors 31%|███▏ | 6920/22095 [11:41:06<16:57:58, 4.02s/it] {'loss': 0.4055, 'grad_norm': 0.6413586661482176, 'learning_rate': 8.040899001087206e-06, 'epoch': 0.31} 31%|███▏ | 6920/22095 [11:41:06<16:57:58, 4.02s/it] 31%|███▏ | 6921/22095 [11:41:09<15:51:10, 3.76s/it] {'loss': 0.3947, 'grad_norm': 0.7054811245614201, 'learning_rate': 8.04031717622719e-06, 'epoch': 0.31} 31%|███▏ | 6921/22095 [11:41:09<15:51:10, 3.76s/it] 31%|███▏ | 6922/22095 [11:41:12<15:02:55, 3.57s/it] {'loss': 0.3392, 'grad_norm': 0.6748577078036545, 'learning_rate': 8.039735286040095e-06, 'epoch': 0.31} 31%|███▏ | 6922/22095 [11:41:12<15:02:55, 3.57s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/vscode_1/images/step_0.png 2025-08-28 03:39:11.242311 load time: 1740.98 ms 31%|███▏ | 6923/22095 [11:41:15<14:35:47, 3.46s/it] {'loss': 0.3515, 'grad_norm': 0.6208383147855693, 'learning_rate': 8.039153330538423e-06, 'epoch': 0.31} 31%|███▏ | 6923/22095 [11:41:15<14:35:47, 3.46s/it] 31%|███▏ | 6924/22095 [11:41:19<14:32:36, 3.45s/it] {'loss': 0.3985, 'grad_norm': 0.6376714796607698, 'learning_rate': 
8.038571309734682e-06, 'epoch': 0.31} 31%|███▏ | 6924/22095 [11:41:19<14:32:36, 3.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8373194 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 39967, 'image': 'vrdu_table_final_2/astro-ph.CO/b5a85ce8-aa92-4fc7-ad99-4a953ed8535a.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 31%|███▏ | 6925/22095 [11:41:22<14:27:20, 3.43s/it] {'loss': 0.3348, 'grad_norm': 0.6995289970919804, 'learning_rate': 8.037989223641375e-06, 'epoch': 0.31} 31%|███▏ | 6925/22095 [11:41:22<14:27:20, 3.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47068 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███▏ | 6926/22095 [11:41:25<13:58:22, 3.32s/it] {'loss': 0.3474, 'grad_norm': 0.6844964516030871, 'learning_rate': 8.03740707227101e-06, 'epoch': 0.31} 31%|███▏ | 6926/22095 [11:41:25<13:58:22, 3.32s/it] 31%|███▏ | 6927/22095 [11:41:28<13:33:45, 3.22s/it] {'loss': 0.3519, 'grad_norm': 0.6413951552444123, 'learning_rate': 8.036824855636096e-06, 'epoch': 0.31} 31%|███▏ | 6927/22095 [11:41:28<13:33:45, 3.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███▏ | 6928/22095 [11:41:36<19:51:59, 4.72s/it] {'loss': 0.4797, 'grad_norm': 0.4902546346685851, 'learning_rate': 8.036242573749142e-06, 'epoch': 0.31} 31%|███▏ | 6928/22095 [11:41:36<19:51:59, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███▏ | 6929/22095 [11:41:39<17:50:50, 4.24s/it] {'loss': 0.3783, 'grad_norm': 0.6548546213508429, 'learning_rate': 8.035660226622661e-06, 'epoch': 0.31} 31%|███▏ | 6929/22095 [11:41:39<17:50:50, 4.24s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 31%|███▏ | 6930/22095 [11:41:42<16:17:51, 3.87s/it] {'loss': 0.3656, 'grad_norm': 0.6675440569763577, 'learning_rate': 8.035077814269165e-06, 'epoch': 0.31} 31%|███▏ | 6930/22095 [11:41:42<16:17:51, 3.87s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38976.png 2025-08-28 03:39:38.155288 load time: 1299.71 ms 31%|███▏ | 6931/22095 [11:41:45<14:56:36, 3.55s/it] {'loss': 0.3737, 'grad_norm': 0.6934360166198267, 'learning_rate': 8.034495336701169e-06, 'epoch': 0.31} 31%|███▏ | 6931/22095 [11:41:45<14:56:36, 3.55s/it] 31%|███▏ | 6932/22095 [11:41:49<14:59:33, 3.56s/it] {'loss': 0.3313, 'grad_norm': 0.6674105315224877, 'learning_rate': 8.033912793931187e-06, 'epoch': 0.31} 31%|███▏ | 6932/22095 [11:41:49<14:59:33, 3.56s/it] 31%|███▏ | 6933/22095 [11:41:54<16:31:33, 3.92s/it] {'loss': 0.3777, 'grad_norm': 0.6779911169606194, 'learning_rate': 8.033330185971737e-06, 'epoch': 0.31} 31%|███▏ | 6933/22095 [11:41:54<16:31:33, 3.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███▏ | 6934/22095 [11:42:04<24:42:01, 5.87s/it] {'loss': 0.5, 'grad_norm': 0.3547796842907606, 'learning_rate': 8.032747512835338e-06, 'epoch': 0.31} 31%|███▏ | 6934/22095 [11:42:04<24:42:01, 5.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███▏ | 6935/22095 [11:42:08<22:26:17, 5.33s/it] {'loss': 0.3504, 'grad_norm': 0.616740468431483, 'learning_rate': 8.03216477453451e-06, 'epoch': 0.31} 31%|███▏ | 6935/22095 [11:42:08<22:26:17, 5.33s/it] 31%|███▏ | 6936/22095 [11:42:12<20:40:18, 4.91s/it] {'loss': 0.4083, 'grad_norm': 0.636731133727304, 'learning_rate': 8.03158197108177e-06, 'epoch': 0.31} 31%|███▏ | 6936/22095 [11:42:12<20:40:18, 4.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44133 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108706 > 40960). Running this sequence through the model will result in indexing errors 31%|███▏ | 6937/22095 [11:42:15<18:54:18, 4.49s/it] {'loss': 0.376, 'grad_norm': 0.6420405558207202, 'learning_rate': 8.030999102489649e-06, 'epoch': 0.31} 31%|███▏ | 6937/22095 [11:42:15<18:54:18, 4.49s/it] 31%|███▏ | 6938/22095 [11:42:19<17:23:14, 4.13s/it] {'loss': 0.3452, 'grad_norm': 0.6260731233892689, 'learning_rate': 8.030416168770663e-06, 'epoch': 0.31} 31%|███▏ | 6938/22095 [11:42:19<17:23:14, 4.13s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/af851dfd-b7ce-4e95-95cf-c0fce6b8bb15/images/step_2.png 2025-08-28 03:40:17.526715 load time: 1019.97 ms 31%|███▏ | 6939/22095 [11:42:22<16:36:15, 3.94s/it] {'loss': 0.359, 'grad_norm': 0.6135576943507863, 'learning_rate': 8.029833169937343e-06, 'epoch': 0.31} 31%|███▏ | 6939/22095 [11:42:22<16:36:15, 3.94s/it] 31%|███▏ | 6940/22095 [11:42:25<15:36:58, 3.71s/it] {'loss': 0.3575, 'grad_norm': 0.7184287456717344, 'learning_rate': 8.029250106002212e-06, 'epoch': 0.31} 31%|███▏ | 6940/22095 [11:42:25<15:36:58, 3.71s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 03:40:23.590403 load time: 1247.74 ms 31%|███▏ | 6941/22095 [11:42:29<14:53:38, 3.54s/it] {'loss': 0.3642, 'grad_norm': 0.6092310549635934, 'learning_rate': 8.0286669769778e-06, 'epoch': 0.31} 31%|███▏ | 6941/22095 [11:42:29<14:53:38, 3.54s/it] 31%|███▏ | 6942/22095 [11:42:32<14:11:14, 3.37s/it] {'loss': 0.3899, 'grad_norm': 0.6550276525946266, 'learning_rate': 8.028083782876636e-06, 'epoch': 0.31} 31%|███▏ | 6942/22095 [11:42:32<14:11:14, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43328 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45490 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54039 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102826 > 40960). Running this sequence through the model will result in indexing errors 31%|███▏ | 6943/22095 [11:42:35<13:47:25, 3.28s/it] {'loss': 0.3467, 'grad_norm': 0.6539315006997505, 'learning_rate': 8.027500523711253e-06, 'epoch': 0.31} 31%|███▏ | 6943/22095 [11:42:35<13:47:25, 3.28s/it] 31%|███▏ | 6944/22095 [11:42:39<14:45:14, 3.51s/it] {'loss': 0.3618, 'grad_norm': 0.6699656463779345, 'learning_rate': 8.026917199494181e-06, 'epoch': 0.31} 31%|███▏ | 6944/22095 [11:42:39<14:45:14, 3.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [92, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8351062 in VC:s3://internvl-moe-sft-data/. Exception: Image size [92, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 17738, 'image': 'vrdu_table_final_2/astro-ph.CO/28b84f7e-d063-46d8-b0b9-d54380fa585b.png', 'image_wh': [[92, 23]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}Finalist\\end{tabular}\n```"}]} VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 03:40:39.797799 load time: 1319.99 ms 31%|███▏ | 6945/22095 [11:42:48<22:38:14, 5.38s/it] {'loss': 0.4938, 'grad_norm': 0.36221957851989256, 'learning_rate': 8.026333810237956e-06, 'epoch': 0.31} 31%|███▏ | 6945/22095 [11:42:48<22:38:14, 5.38s/it] 31%|███▏ | 6946/22095 [11:42:52<20:29:04, 4.87s/it] {'loss': 0.3669, 'grad_norm': 0.7061374105862012, 'learning_rate': 8.025750355955112e-06, 'epoch': 0.31} 31%|███▏ | 6946/22095 [11:42:52<20:29:04, 4.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (105679 > 40960). Running this sequence through the model will result in indexing errors 31%|███▏ | 6947/22095 [11:42:56<19:08:37, 4.55s/it] {'loss': 0.3171, 'grad_norm': 0.6008363417200151, 'learning_rate': 8.025166836658185e-06, 'epoch': 0.31} 31%|███▏ | 6947/22095 [11:42:56<19:08:37, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70562 > 40960). 
Running this sequence through the model will result in indexing errors 31%|███▏ | 6948/22095 [11:42:59<16:49:19, 4.00s/it] {'loss': 0.3622, 'grad_norm': 0.6153450850043569, 'learning_rate': 8.024583252359714e-06, 'epoch': 0.31} 31%|███▏ | 6948/22095 [11:42:59<16:49:19, 4.00s/it] 31%|███▏ | 6949/22095 [11:43:02<15:32:12, 3.69s/it] {'loss': 0.3492, 'grad_norm': 0.6484699123239448, 'learning_rate': 8.023999603072236e-06, 'epoch': 0.31} 31%|███▏ | 6949/22095 [11:43:02<15:32:12, 3.69s/it] 31%|███▏ | 6950/22095 [11:43:04<14:30:25, 3.45s/it] {'loss': 0.3531, 'grad_norm': 0.6947514250003248, 'learning_rate': 8.023415888808297e-06, 'epoch': 0.31} 31%|███▏ | 6950/22095 [11:43:04<14:30:25, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███▏ | 6951/22095 [11:43:13<21:17:13, 5.06s/it] {'loss': 0.4838, 'grad_norm': 0.38000138347421075, 'learning_rate': 8.022832109580437e-06, 'epoch': 0.31} 31%|███▏ | 6951/22095 [11:43:13<21:17:13, 5.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 31%|███▏ | 6952/22095 [11:43:16<18:49:34, 4.48s/it] {'loss': 0.3564, 'grad_norm': 0.6763144529594775, 'learning_rate': 8.022248265401196e-06, 'epoch': 0.31} 31%|███▏ | 6952/22095 [11:43:16<18:49:34, 4.48s/it] 31%|███▏ | 6953/22095 [11:43:20<17:14:59, 4.10s/it] {'loss': 0.3684, 'grad_norm': 0.6601351368993845, 'learning_rate': 8.021664356283123e-06, 'epoch': 0.31} 31%|███▏ | 6953/22095 [11:43:20<17:14:59, 4.10s/it] 31%|███▏ | 6954/22095 [11:43:23<16:22:13, 3.89s/it] {'loss': 0.3753, 'grad_norm': 0.7040570624830684, 'learning_rate': 8.021080382238763e-06, 'epoch': 0.31} 31%|███▏ | 6954/22095 [11:43:23<16:22:13, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 31%|███▏ | 6955/22095 [11:43:29<18:51:54, 4.49s/it] {'loss': 0.4805, 'grad_norm': 0.2880395263967866, 'learning_rate': 8.020496343280664e-06, 'epoch': 0.31} 31%|███▏ | 6955/22095 [11:43:29<18:51:54, 4.49s/it] 
31%|███▏ | 6956/22095 [11:43:35<20:20:11, 4.84s/it] {'loss': 0.4913, 'grad_norm': 0.3021069983303145, 'learning_rate': 8.019912239421376e-06, 'epoch': 0.31} 31%|███▏ | 6956/22095 [11:43:35<20:20:11, 4.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 31%|███▏ | 6957/22095 [11:43:38<18:57:22, 4.51s/it] {'loss': 0.3898, 'grad_norm': 0.8756622334383337, 'learning_rate': 8.019328070673449e-06, 'epoch': 0.31} 31%|███▏ | 6957/22095 [11:43:38<18:57:22, 4.51s/it] 31%|███▏ | 6958/22095 [11:43:42<18:15:51, 4.34s/it] {'loss': 0.3788, 'grad_norm': 0.6654109828266398, 'learning_rate': 8.018743837049433e-06, 'epoch': 0.31} 31%|███▏ | 6958/22095 [11:43:42<18:15:51, 4.34s/it] 31%|███▏ | 6959/22095 [11:43:46<18:03:00, 4.29s/it] {'loss': 0.3458, 'grad_norm': 0.7690824808772482, 'learning_rate': 8.018159538561888e-06, 'epoch': 0.31} 31%|███▏ | 6959/22095 [11:43:46<18:03:00, 4.29s/it] 32%|███▏ | 6960/22095 [11:43:51<17:48:15, 4.23s/it] {'loss': 0.3943, 'grad_norm': 0.790969564695417, 'learning_rate': 8.01757517522336e-06, 'epoch': 0.32} 32%|███▏ | 6960/22095 [11:43:51<17:48:15, 4.23s/it] 32%|███▏ | 6961/22095 [11:43:55<17:45:32, 4.22s/it] {'loss': 0.4222, 'grad_norm': 0.6888825820126708, 'learning_rate': 8.01699074704641e-06, 'epoch': 0.32} 32%|███▏ | 6961/22095 [11:43:55<17:45:32, 4.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41424 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 6962/22095 [11:43:58<16:26:34, 3.91s/it] {'loss': 0.3557, 'grad_norm': 0.9953360923545216, 'learning_rate': 8.016406254043595e-06, 'epoch': 0.32} 32%|███▏ | 6962/22095 [11:43:58<16:26:34, 3.91s/it] 32%|███▏ | 6963/22095 [11:44:01<15:35:53, 3.71s/it] {'loss': 0.3606, 'grad_norm': 0.7163182122643703, 'learning_rate': 8.015821696227475e-06, 'epoch': 0.32} 32%|███▏ | 6963/22095 [11:44:01<15:35:53, 3.71s/it] 32%|███▏ | 6964/22095 [11:44:05<15:54:49, 3.79s/it] {'loss': 0.3787, 'grad_norm': 0.6413564293083378, 'learning_rate': 8.015237073610607e-06, 'epoch': 0.32} 32%|███▏ | 6964/22095 [11:44:05<15:54:49, 3.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78144 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76527 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101732 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52276 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 6965/22095 [11:44:08<14:54:39, 3.55s/it] {'loss': 0.4123, 'grad_norm': 0.6576486928547289, 'learning_rate': 8.014652386205557e-06, 'epoch': 0.32} 32%|███▏ | 6965/22095 [11:44:08<14:54:39, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121943 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42509 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81213 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 6966/22095 [11:44:11<14:20:55, 3.41s/it] {'loss': 0.3701, 'grad_norm': 0.6047092559852032, 'learning_rate': 8.014067634024884e-06, 'epoch': 0.32} 32%|███▏ | 6966/22095 [11:44:11<14:20:55, 3.41s/it] 32%|███▏ | 6967/22095 [11:44:14<13:31:17, 3.22s/it] {'loss': 0.3856, 'grad_norm': 0.6571742648341652, 'learning_rate': 8.013482817081157e-06, 'epoch': 0.32} 32%|███▏ | 6967/22095 [11:44:14<13:31:17, 3.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [12, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396925 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 14, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 63778, 'image': 'vrdu_table_final_2/astro-ph.EP/428443fd-7b4a-44c9-9168-a5ef23250490.png', 'image_wh': [[12, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}z\\end{tabular}\n```"}]} 32%|███▏ | 6968/22095 [11:44:17<13:04:33, 3.11s/it] {'loss': 0.3592, 'grad_norm': 0.6707972533911652, 'learning_rate': 8.012897935386938e-06, 'epoch': 0.32} 32%|███▏ | 6968/22095 [11:44:17<13:04:33, 3.11s/it] 32%|███▏ | 6969/22095 [11:44:20<13:07:29, 3.12s/it] {'loss': 0.3869, 'grad_norm': 0.6202903980423257, 'learning_rate': 8.012312988954795e-06, 'epoch': 0.32} 32%|███▏ | 6969/22095 [11:44:20<13:07:29, 3.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 6970/22095 [11:44:26<16:34:13, 3.94s/it] {'loss': 0.4932, 'grad_norm': 0.4156628400616371, 'learning_rate': 8.0117279777973e-06, 'epoch': 0.32} 32%|███▏ | 6970/22095 [11:44:26<16:34:13, 3.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (94530 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 6971/22095 [11:44:29<16:02:47, 3.82s/it] {'loss': 0.3479, 'grad_norm': 0.660814934434108, 'learning_rate': 8.011142901927018e-06, 'epoch': 0.32} 32%|███▏ | 6971/22095 [11:44:29<16:02:47, 3.82s/it] 32%|███▏ | 6972/22095 [11:44:32<15:12:24, 3.62s/it] {'loss': 0.4147, 'grad_norm': 0.6830413012597334, 'learning_rate': 8.010557761356523e-06, 'epoch': 0.32} 32%|███▏ | 6972/22095 [11:44:33<15:12:24, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 6973/22095 [11:44:43<23:34:46, 5.61s/it] {'loss': 0.4674, 'grad_norm': 0.30869750524274253, 'learning_rate': 8.009972556098388e-06, 'epoch': 0.32} 32%|███▏ | 6973/22095 [11:44:43<23:34:46, 5.61s/it]VC:s3://gui-agent/data_20250612/mac/images/finder/8af7889e-fbfc-443f-8629-e5b6b0484c7d/images/step_2.png 2025-08-28 03:42:43.373904 load time: 1195.94 ms 32%|███▏ | 6974/22095 [11:44:52<28:17:31, 6.74s/it] {'loss': 0.501, 'grad_norm': 0.3013701412214667, 'learning_rate': 8.009387286165188e-06, 'epoch': 0.32} 32%|███▏ | 6974/22095 [11:44:52<28:17:31, 6.74s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (90859 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90050 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47028 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92336 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 6975/22095 [11:44:56<24:21:57, 5.80s/it] {'loss': 0.3552, 'grad_norm': 0.7170094699174588, 'learning_rate': 8.008801951569501e-06, 'epoch': 0.32} 32%|███▏ | 6975/22095 [11:44:56<24:21:57, 5.80s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885295 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8448, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 
4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 32%|███▏ | 6976/22095 [11:44:59<21:00:37, 5.00s/it] {'loss': 0.3914, 'grad_norm': 0.7032893337212256, 'learning_rate': 8.008216552323896e-06, 'epoch': 0.32} 32%|███▏ | 6976/22095 [11:44:59<21:00:37, 5.00s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/vscode_1/images/step_0.png 2025-08-28 03:42:58.186118 load time: 1694.96 ms 32%|███▏ | 6977/22095 [11:45:03<19:22:21, 4.61s/it] {'loss': 0.3883, 'grad_norm': 0.6187411787630647, 'learning_rate': 8.007631088440959e-06, 'epoch': 0.32} 32%|███▏ | 6977/22095 [11:45:03<19:22:21, 4.61s/it] 32%|███▏ | 6978/22095 [11:45:06<17:31:48, 4.17s/it] {'loss': 0.3521, 'grad_norm': 0.6929869219292941, 'learning_rate': 8.007045559933265e-06, 'epoch': 0.32} 32%|███▏ | 6978/22095 [11:45:06<17:31:48, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8362614 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 29350, 'image': 'vrdu_table_final_2/astro-ph.CO/e1293863-d7ac-4040-bd8a-90dedd705980.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{@{}c@{}}4 \\\\ 2\\end{tabular}\n```"}]} 32%|███▏ | 6979/22095 [11:45:16<24:36:02, 5.86s/it] {'loss': 0.518, 'grad_norm': 0.41659022905236215, 'learning_rate': 8.006459966813399e-06, 'epoch': 0.32} 32%|███▏ | 6979/22095 [11:45:16<24:36:02, 5.86s/it] 32%|███▏ | 6980/22095 [11:45:19<21:16:30, 5.07s/it] {'loss': 0.3382, 'grad_norm': 0.6864769287896622, 'learning_rate': 8.005874309093942e-06, 'epoch': 0.32} 32%|███▏ | 6980/22095 [11:45:19<21:16:30, 5.07s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38833.png 2025-08-28 03:43:17.533574 load time: 1390.47 ms 32%|███▏ | 6981/22095 [11:45:22<18:37:40, 4.44s/it] {'loss': 0.3954, 'grad_norm': 0.8409165007357672, 'learning_rate': 8.005288586787477e-06, 'epoch': 0.32} 32%|███▏ | 6981/22095 [11:45:22<18:37:40, 4.44s/it] 32%|███▏ | 6982/22095 [11:45:25<17:44:14, 4.23s/it] {'loss': 0.4228, 'grad_norm': 0.6951795654243249, 'learning_rate': 8.00470279990659e-06, 'epoch': 0.32} 32%|███▏ | 6982/22095 [11:45:25<17:44:14, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 6983/22095 [11:45:35<24:20:53, 5.80s/it] {'loss': 0.4808, 'grad_norm': 0.3213139159543221, 'learning_rate': 8.00411694846387e-06, 'epoch': 0.32} 32%|███▏ | 6983/22095 [11:45:35<24:20:53, 5.80s/it] 32%|███▏ | 6984/22095 [11:45:38<21:05:14, 5.02s/it] {'loss': 0.3913, 'grad_norm': 0.7266194383038589, 'learning_rate': 8.003531032471901e-06, 'epoch': 0.32} 32%|███▏ | 6984/22095 [11:45:38<21:05:14, 5.02s/it] 32%|███▏ | 6985/22095 [11:45:41<18:21:29, 4.37s/it] {'loss': 0.3398, 'grad_norm': 0.9611211426506843, 
'learning_rate': 8.002945051943276e-06, 'epoch': 0.32} 32%|███▏ | 6985/22095 [11:45:41<18:21:29, 4.37s/it] 32%|███▏ | 6986/22095 [11:45:45<18:05:41, 4.31s/it] {'loss': 0.3788, 'grad_norm': 0.6052295188548436, 'learning_rate': 8.002359006890585e-06, 'epoch': 0.32} 32%|███▏ | 6986/22095 [11:45:45<18:05:41, 4.31s/it] 32%|███▏ | 6987/22095 [11:45:49<17:15:16, 4.11s/it] {'loss': 0.3707, 'grad_norm': 0.7664924172248415, 'learning_rate': 8.001772897326418e-06, 'epoch': 0.32} 32%|███▏ | 6987/22095 [11:45:49<17:15:16, 4.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42661 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 6988/22095 [11:45:52<16:11:10, 3.86s/it] {'loss': 0.3593, 'grad_norm': 0.8040540740269324, 'learning_rate': 8.001186723263374e-06, 'epoch': 0.32} 32%|███▏ | 6988/22095 [11:45:52<16:11:10, 3.86s/it] 32%|███▏ | 6989/22095 [11:45:55<15:35:32, 3.72s/it] {'loss': 0.4082, 'grad_norm': 0.6847053006400641, 'learning_rate': 8.000600484714043e-06, 'epoch': 0.32} 32%|███▏ | 6989/22095 [11:45:55<15:35:32, 3.72s/it] 32%|███▏ | 6990/22095 [11:45:59<14:59:47, 3.57s/it] {'loss': 0.3716, 'grad_norm': 0.692378851693665, 'learning_rate': 8.000014181691023e-06, 'epoch': 0.32} 32%|███▏ | 6990/22095 [11:45:59<14:59:47, 3.57s/it] 32%|███▏ | 6991/22095 [11:46:02<14:27:22, 3.45s/it] {'loss': 0.3648, 'grad_norm': 0.6925287113795098, 'learning_rate': 7.999427814206911e-06, 'epoch': 0.32} 32%|███▏ | 6991/22095 [11:46:02<14:27:22, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50816 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65417 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 6992/22095 [11:46:05<14:26:14, 3.44s/it] {'loss': 0.3365, 'grad_norm': 0.6704389383276854, 'learning_rate': 7.99884138227431e-06, 'epoch': 0.32} 32%|███▏ | 6992/22095 [11:46:05<14:26:14, 3.44s/it] 32%|███▏ | 6993/22095 [11:46:09<14:19:22, 3.41s/it] {'loss': 0.3452, 'grad_norm': 0.6874645585931815, 'learning_rate': 7.998254885905817e-06, 'epoch': 0.32} 32%|███▏ | 6993/22095 [11:46:09<14:19:22, 3.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 6994/22095 [11:46:12<14:25:51, 3.44s/it] {'loss': 0.3476, 'grad_norm': 0.6286062996410474, 'learning_rate': 7.997668325114033e-06, 'epoch': 0.32} 32%|███▏ | 6994/22095 [11:46:12<14:25:51, 3.44s/it] 32%|███▏ | 6995/22095 [11:46:16<14:39:47, 3.50s/it] {'loss': 0.36, 'grad_norm': 0.6251766923475587, 'learning_rate': 7.997081699911566e-06, 'epoch': 0.32} 32%|███▏ | 6995/22095 [11:46:16<14:39:47, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 6996/22095 [11:46:25<22:18:44, 5.32s/it] {'loss': 0.4625, 'grad_norm': 0.4019324945117982, 'learning_rate': 7.996495010311017e-06, 'epoch': 0.32} 32%|███▏ | 6996/22095 [11:46:25<22:18:44, 5.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 6997/22095 [11:46:29<20:50:37, 4.97s/it] {'loss': 0.3416, 'grad_norm': 0.7655651498493894, 'learning_rate': 7.995908256324992e-06, 'epoch': 0.32} 32%|███▏ | 6997/22095 [11:46:29<20:50:37, 4.97s/it] 32%|███▏ | 6998/22095 [11:46:33<18:50:36, 4.49s/it] {'loss': 0.4238, 'grad_norm': 0.6785538399318121, 'learning_rate': 7.995321437966102e-06, 'epoch': 0.32} 32%|███▏ | 6998/22095 [11:46:33<18:50:36, 4.49s/it] 32%|███▏ | 6999/22095 [11:46:37<17:46:58, 4.24s/it] {'loss': 0.39, 'grad_norm': 0.6357356159438607, 'learning_rate': 7.99473455524695e-06, 'epoch': 0.32} 32%|███▏ | 
6999/22095 [11:46:37<17:46:58, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 7000/22095 [11:46:46<24:15:27, 5.79s/it] {'loss': 0.4999, 'grad_norm': 0.33904987132475095, 'learning_rate': 7.994147608180153e-06, 'epoch': 0.32} 32%|███▏ | 7000/22095 [11:46:46<24:15:27, 5.79s/it] 32%|███▏ | 7001/22095 [11:46:50<21:52:19, 5.22s/it] {'loss': 0.3879, 'grad_norm': 0.7621136973532631, 'learning_rate': 7.993560596778321e-06, 'epoch': 0.32} 32%|███▏ | 7001/22095 [11:46:50<21:52:19, 5.22s/it] 32%|███▏ | 7002/22095 [11:46:53<19:07:15, 4.56s/it] {'loss': 0.3871, 'grad_norm': 0.7132686937912542, 'learning_rate': 7.992973521054063e-06, 'epoch': 0.32} 32%|███▏ | 7002/22095 [11:46:53<19:07:15, 4.56s/it] 32%|███▏ | 7003/22095 [11:46:56<16:51:51, 4.02s/it] {'loss': 0.3945, 'grad_norm': 0.6690670498675246, 'learning_rate': 7.992386381019999e-06, 'epoch': 0.32} 32%|███▏ | 7003/22095 [11:46:56<16:51:51, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74289 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41526 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7004/22095 [11:46:59<15:29:45, 3.70s/it] {'loss': 0.3553, 'grad_norm': 0.7285347897571842, 'learning_rate': 7.99179917668874e-06, 'epoch': 0.32} 32%|███▏ | 7004/22095 [11:46:59<15:29:45, 3.70s/it] 32%|███▏ | 7005/22095 [11:47:03<16:08:04, 3.85s/it] {'loss': 0.3883, 'grad_norm': 0.6793097106943258, 'learning_rate': 7.991211908072905e-06, 'epoch': 0.32} 32%|███▏ | 7005/22095 [11:47:03<16:08:04, 3.85s/it] 32%|███▏ | 7006/22095 [11:47:06<14:57:01, 3.57s/it] {'loss': 0.3684, 'grad_norm': 0.6018551606250381, 'learning_rate': 7.990624575185116e-06, 'epoch': 0.32} 32%|███▏ | 7006/22095 [11:47:06<14:57:01, 3.57s/it] 32%|███▏ | 7007/22095 [11:47:09<15:12:29, 3.63s/it] {'loss': 0.4062, 'grad_norm': 0.6756261068762829, 'learning_rate': 7.990037178037987e-06, 'epoch': 0.32} 32%|███▏ | 7007/22095 [11:47:09<15:12:29, 3.63s/it] 32%|███▏ | 7008/22095 [11:47:13<15:17:40, 3.65s/it] {'loss': 0.3658, 'grad_norm': 0.6133815435263604, 'learning_rate': 7.989449716644142e-06, 'epoch': 0.32} 32%|███▏ | 7008/22095 [11:47:13<15:17:40, 3.65s/it] 32%|███▏ | 7009/22095 [11:47:16<14:33:36, 3.47s/it] {'loss': 0.362, 'grad_norm': 0.619225307675842, 'learning_rate': 7.988862191016204e-06, 'epoch': 0.32} 32%|███▏ | 7009/22095 [11:47:16<14:33:36, 3.47s/it] 32%|███▏ | 7010/22095 [11:47:19<14:06:24, 3.37s/it] {'loss': 0.3724, 'grad_norm': 0.6478285255297425, 'learning_rate': 7.9882746011668e-06, 'epoch': 0.32} 32%|███▏ | 7010/22095 [11:47:19<14:06:24, 3.37s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 03:45:17.292177 load time: 1041.48 ms 32%|███▏ | 7011/22095 [11:47:24<15:15:57, 3.64s/it] {'loss': 0.3836, 'grad_norm': 0.6368209266259487, 'learning_rate': 7.98768694710855e-06, 'epoch': 0.32} 32%|███▏ | 7011/22095 [11:47:24<15:15:57, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 
0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7012/22095 [11:47:29<17:18:47, 4.13s/it] {'loss': 0.4922, 'grad_norm': 0.4413932798745625, 'learning_rate': 7.987099228854083e-06, 'epoch': 0.32} 32%|███▏ | 7012/22095 [11:47:29<17:18:47, 4.13s/it] 32%|███▏ | 7013/22095 [11:47:32<16:12:44, 3.87s/it] {'loss': 0.3324, 'grad_norm': 0.6229862211243823, 'learning_rate': 7.986511446416029e-06, 'epoch': 0.32} 32%|███▏ | 7013/22095 [11:47:32<16:12:44, 3.87s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_6/images/before_screenshot_59_id_139_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 03:45:31.922733 load time: 1263.03 ms 32%|███▏ | 7014/22095 [11:47:35<15:20:23, 3.66s/it] {'loss': 0.4385, 'grad_norm': 0.6977332581462345, 'learning_rate': 7.985923599807017e-06, 'epoch': 0.32} 32%|███▏ | 7014/22095 [11:47:35<15:20:23, 3.66s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250504_150246_5/images/before_screenshot_52_id_107_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 03:45:33.491780 load time: 1053.09 ms VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_5.png 2025-08-28 03:45:34.060079 load time: 1003.8 ms 32%|███▏ | 7015/22095 [11:47:39<15:07:15, 3.61s/it] {'loss': 0.3629, 'grad_norm': 0.7046571732350977, 'learning_rate': 7.985335689039675e-06, 'epoch': 0.32} 32%|███▏ | 7015/22095 [11:47:39<15:07:15, 3.61s/it] 32%|███▏ | 7016/22095 [11:47:42<14:42:55, 3.51s/it] {'loss': 0.3785, 'grad_norm': 0.6477763537029326, 'learning_rate': 7.984747714126639e-06, 'epoch': 0.32} 32%|███▏ | 7016/22095 [11:47:42<14:42:55, 3.51s/it] 32%|███▏ | 7017/22095 [11:47:46<14:58:59, 3.58s/it] {'loss': 0.3906, 'grad_norm': 0.7144266335689072, 'learning_rate': 7.984159675080543e-06, 'epoch': 0.32} 32%|███▏ | 7017/22095 [11:47:46<14:58:59, 3.58s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8310062 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB189sDazihSKJjy0FfXXbGzFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text is hidden in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n温度数字显示表\n铝合金不沾盘\n不锈钢机身\n台/立两用\n电饼铛\ns\n赠送全套工具,\n老师傅配方教程21世纪商贸8'}]} 32%|███▏ | 7018/22095 [11:47:49<13:58:18, 3.34s/it] {'loss': 0.3575, 'grad_norm': 0.6258472946470179, 'learning_rate': 7.98357157191402e-06, 'epoch': 0.32} 32%|███▏ | 7018/22095 [11:47:49<13:58:18, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41347 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7019/22095 [11:47:51<13:22:19, 3.19s/it] {'loss': 0.3471, 'grad_norm': 0.6200884932537759, 'learning_rate': 7.982983404639707e-06, 'epoch': 0.32} 32%|███▏ | 7019/22095 [11:47:51<13:22:19, 3.19s/it] 32%|███▏ | 7020/22095 [11:47:55<13:35:23, 3.25s/it] {'loss': 0.3554, 'grad_norm': 0.6650517614200443, 'learning_rate': 7.98239517327024e-06, 'epoch': 0.32} 32%|███▏ | 7020/22095 [11:47:55<13:35:23, 3.25s/it] 32%|███▏ | 7021/22095 [11:47:58<13:31:32, 3.23s/it] {'loss': 0.3711, 'grad_norm': 0.6530860017670242, 'learning_rate': 7.981806877818265e-06, 'epoch': 0.32} 32%|███▏ | 7021/22095 [11:47:58<13:31:32, 3.23s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307629 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB26jrZgPnD8KJjSspbXXbbEXXa_!!3173720736.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这张图上的全部文本是什么?请详细告诉我。'}, {'from': 'gpt', 'value': '图中所有文字:\n5条装\n纯棉\n防水\n买2组送2条,\n买3组送4条'}]} 32%|███▏ | 7022/22095 [11:48:02<14:05:48, 3.37s/it] {'loss': 0.3659, 'grad_norm': 0.8740355444512873, 'learning_rate': 7.981218518296414e-06, 'epoch': 0.32} 32%|███▏ | 7022/22095 [11:48:02<14:05:48, 3.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7023/22095 [11:48:05<13:57:51, 3.34s/it] {'loss': 0.3792, 'grad_norm': 0.6671952504181827, 'learning_rate': 7.980630094717337e-06, 'epoch': 0.32} 32%|███▏ | 7023/22095 [11:48:05<13:57:51, 3.34s/it] 32%|███▏ | 7024/22095 [11:48:10<15:40:11, 3.74s/it] {'loss': 0.3446, 'grad_norm': 0.6477284106255007, 'learning_rate': 7.98004160709367e-06, 'epoch': 0.32} 32%|███▏ | 7024/22095 [11:48:10<15:40:11, 3.74s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/e1cc9c76-68f1-43b2-8a5f-bcfc7a70d69e/images/step_6.png 2025-08-28 03:46:08.993917 load time: 1125.71 ms 32%|███▏ | 7025/22095 [11:48:13<15:12:25, 3.63s/it] {'loss': 0.3551, 'grad_norm': 0.6192963333013082, 'learning_rate': 7.979453055438063e-06, 'epoch': 0.32} 32%|███▏ | 7025/22095 [11:48:13<15:12:25, 3.63s/it] 32%|███▏ | 7026/22095 [11:48:17<15:56:14, 3.81s/it] {'loss': 0.328, 'grad_norm': 0.6196762931669262, 'learning_rate': 7.97886443976316e-06, 'epoch': 0.32} 32%|███▏ | 7026/22095 [11:48:17<15:56:14, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74593 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74729 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56212 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7027/22095 [11:48:25<20:40:41, 4.94s/it] {'loss': 0.5026, 'grad_norm': 0.48428443466322474, 'learning_rate': 7.978275760081611e-06, 'epoch': 0.32} 32%|███▏ | 7027/22095 [11:48:25<20:40:41, 4.94s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7028/22095 [11:48:28<18:51:44, 4.51s/it] {'loss': 0.4117, 'grad_norm': 0.661978297863087, 'learning_rate': 7.97768701640606e-06, 'epoch': 0.32} 32%|███▏ | 7028/22095 [11:48:28<18:51:44, 4.51s/it] 32%|███▏ | 7029/22095 [11:48:32<17:49:37, 4.26s/it] {'loss': 0.3473, 'grad_norm': 1.7101319009447666, 'learning_rate': 7.977098208749162e-06, 'epoch': 0.32} 32%|███▏ | 7029/22095 [11:48:32<17:49:37, 4.26s/it] 32%|███▏ | 7030/22095 [11:48:36<17:19:07, 4.14s/it] {'loss': 0.4033, 'grad_norm': 0.6697938911628487, 'learning_rate': 7.976509337123567e-06, 'epoch': 0.32} 32%|███▏ | 7030/22095 [11:48:36<17:19:07, 4.14s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7031/22095 [11:48:39<16:35:29, 3.97s/it] {'loss': 0.3733, 'grad_norm': 0.6372431024914199, 'learning_rate': 7.975920401541927e-06, 'epoch': 0.32} 32%|███▏ | 7031/22095 [11:48:39<16:35:29, 3.97s/it] 32%|███▏ | 7032/22095 [11:48:43<16:35:14, 3.96s/it] {'loss': 0.363, 'grad_norm': 0.6087178277991327, 'learning_rate': 7.975331402016898e-06, 'epoch': 0.32} 32%|███▏ | 7032/22095 [11:48:43<16:35:14, 3.96s/it] 32%|███▏ | 7033/22095 [11:48:46<15:04:10, 3.60s/it] {'loss': 0.3627, 'grad_norm': 0.6942216985962842, 'learning_rate': 7.974742338561134e-06, 'epoch': 0.32} 32%|███▏ | 7033/22095 [11:48:46<15:04:10, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but 
got module 364 32%|███▏ | 7034/22095 [11:48:54<20:15:16, 4.84s/it] {'loss': 0.5146, 'grad_norm': 0.391526321631339, 'learning_rate': 7.974153211187296e-06, 'epoch': 0.32} 32%|███▏ | 7034/22095 [11:48:54<20:15:16, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70678 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81795 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7035/22095 [11:49:03<26:12:19, 6.26s/it] {'loss': 0.4979, 'grad_norm': 0.3479777571847785, 'learning_rate': 7.973564019908038e-06, 'epoch': 0.32} 32%|███▏ | 7035/22095 [11:49:03<26:12:19, 6.26s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 32%|███▏ | 7036/22095 [11:49:07<23:26:57, 5.61s/it] {'loss': 0.3807, 'grad_norm': 1.4253779934524895, 'learning_rate': 7.972974764736023e-06, 'epoch': 0.32} 32%|███▏ | 7036/22095 [11:49:08<23:26:57, 5.61s/it] 32%|███▏ | 7037/22095 [11:49:11<20:17:54, 4.85s/it] {'loss': 0.3789, 'grad_norm': 0.6227052935548067, 'learning_rate': 7.97238544568391e-06, 'epoch': 0.32} 32%|███▏ | 7037/22095 [11:49:11<20:17:54, 4.85s/it] 32%|███▏ | 7038/22095 [11:49:14<18:15:38, 4.37s/it] {'loss': 0.3501, 'grad_norm': 0.6147570914994451, 'learning_rate': 7.971796062764363e-06, 'epoch': 0.32} 32%|███▏ | 7038/22095 [11:49:14<18:15:38, 4.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54419 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50459 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51293 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7039/22095 [11:49:17<16:41:22, 3.99s/it] {'loss': 0.3667, 'grad_norm': 0.6806938315694051, 'learning_rate': 7.971206615990046e-06, 'epoch': 0.32} 32%|███▏ | 7039/22095 [11:49:17<16:41:22, 3.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67477 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54477 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7040/22095 [11:49:21<16:16:36, 3.89s/it] {'loss': 0.441, 'grad_norm': 0.6507251787693218, 'learning_rate': 7.970617105373624e-06, 'epoch': 0.32} 32%|███▏ | 7040/22095 [11:49:21<16:16:36, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60781 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7041/22095 [11:49:24<15:13:16, 3.64s/it] {'loss': 0.3509, 'grad_norm': 0.6400026468720813, 'learning_rate': 7.970027530927765e-06, 'epoch': 0.32} 32%|███▏ | 7041/22095 [11:49:24<15:13:16, 3.64s/it] 32%|███▏ | 7042/22095 [11:49:27<14:45:43, 3.53s/it] {'loss': 0.335, 'grad_norm': 0.9682135791562172, 'learning_rate': 7.969437892665134e-06, 'epoch': 0.32} 32%|███▏ | 7042/22095 [11:49:27<14:45:43, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 7043/22095 [11:49:35<20:09:14, 4.82s/it] {'loss': 0.4637, 'grad_norm': 0.6946780013933663, 'learning_rate': 7.968848190598404e-06, 'epoch': 0.32} 32%|███▏ | 7043/22095 [11:49:35<20:09:14, 4.82s/it] 32%|███▏ | 7044/22095 [11:49:39<18:57:26, 4.53s/it] {'loss': 0.3855, 'grad_norm': 0.6146416446661337, 'learning_rate': 7.968258424740245e-06, 'epoch': 0.32} 32%|███▏ | 7044/22095 [11:49:39<18:57:26, 4.53s/it] 32%|███▏ | 7045/22095 [11:49:42<16:55:56, 4.05s/it] {'loss': 0.3648, 'grad_norm': 0.6630762453867052, 'learning_rate': 7.967668595103328e-06, 'epoch': 0.32} 32%|███▏ | 7045/22095 [11:49:42<16:55:56, 4.05s/it] 32%|███▏ | 7046/22095 [11:49:45<16:43:45, 4.00s/it] {'loss': 0.3675, 'grad_norm': 0.6409902766059519, 'learning_rate': 7.967078701700329e-06, 'epoch': 0.32} 32%|███▏ | 7046/22095 [11:49:45<16:43:45, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 7047/22095 [11:49:55<23:34:13, 5.64s/it] {'loss': 0.4827, 'grad_norm': 0.3593074982929306, 'learning_rate': 7.966488744543919e-06, 'epoch': 0.32} 32%|███▏ | 7047/22095 [11:49:55<23:34:13, 5.64s/it] 32%|███▏ | 7048/22095 [11:50:04<28:21:28, 6.78s/it] {'loss': 0.5214, 'grad_norm': 0.40176551814356487, 'learning_rate': 7.965898723646777e-06, 'epoch': 0.32} 32%|███▏ | 7048/22095 [11:50:04<28:21:28, 6.78s/it] 32%|███▏ | 7049/22095 [11:50:12<28:50:02, 6.90s/it] {'loss': 0.4759, 'grad_norm': 0.45032727794835453, 
'learning_rate': 7.965308639021581e-06, 'epoch': 0.32} 32%|███▏ | 7049/22095 [11:50:12<28:50:02, 6.90s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 32%|███▏ | 7050/22095 [11:50:15<24:35:01, 5.88s/it] {'loss': 0.3569, 'grad_norm': 0.7198669844930446, 'learning_rate': 7.964718490681009e-06, 'epoch': 0.32} 32%|███▏ | 7050/22095 [11:50:15<24:35:01, 5.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106999 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127394 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50887 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7051/22095 [11:50:19<21:52:40, 5.24s/it] {'loss': 0.3607, 'grad_norm': 0.6546251899527719, 'learning_rate': 7.964128278637745e-06, 'epoch': 0.32} 32%|███▏ | 7051/22095 [11:50:19<21:52:40, 5.24s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 32%|███▏ | 7052/22095 [11:50:22<19:07:32, 4.58s/it] {'loss': 0.3728, 'grad_norm': 0.8092815635678249, 'learning_rate': 7.963538002904464e-06, 'epoch': 0.32} 32%|███▏ | 7052/22095 [11:50:22<19:07:32, 4.58s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38831.png 2025-08-28 03:48:18.125699 load time: 1493.8 ms 32%|███▏ | 7053/22095 [11:50:25<16:57:44, 4.06s/it] {'loss': 0.3441, 'grad_norm': 1.5350541693416326, 'learning_rate': 7.962947663493855e-06, 'epoch': 0.32} 32%|███▏ | 7053/22095 [11:50:25<16:57:44, 4.06s/it] 32%|███▏ | 7054/22095 [11:50:29<16:44:44, 4.01s/it] {'loss': 0.3581, 'grad_norm': 0.6106851124656958, 'learning_rate': 7.9623572604186e-06, 'epoch': 0.32} 32%|███▏ | 7054/22095 [11:50:29<16:44:44, 4.01s/it] 32%|███▏ | 7055/22095 [11:50:33<17:24:21, 4.17s/it] {'loss': 0.3446, 'grad_norm': 0.6489555394182779, 'learning_rate': 7.961766793691387e-06, 'epoch': 0.32} 32%|███▏ | 7055/22095 [11:50:33<17:24:21, 4.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7056/22095 [11:50:37<17:00:58, 4.07s/it] {'loss': 0.3544, 'grad_norm': 0.6293509170922542, 'learning_rate': 7.961176263324902e-06, 'epoch': 0.32} 32%|███▏ | 7056/22095 [11:50:37<17:00:58, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90731 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53591 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46196 > 40960) for 4 sample(s). Truncating to 5236 with 3 samples. 
32%|███▏ | 7057/22095 [11:50:40<15:38:17, 3.74s/it] {'loss': 0.376, 'grad_norm': 0.6794823179154822, 'learning_rate': 7.960585669331832e-06, 'epoch': 0.32} 32%|███▏ | 7057/22095 [11:50:40<15:38:17, 3.74s/it] 32%|███▏ | 7058/22095 [11:50:44<16:33:10, 3.96s/it] {'loss': 0.3947, 'grad_norm': 0.628733216417228, 'learning_rate': 7.959995011724869e-06, 'epoch': 0.32} 32%|███▏ | 7058/22095 [11:50:44<16:33:10, 3.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880208 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3361, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 5\nB. 6\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 6, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8347042 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 6, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 13707, 'image': 'vrdu_table_final_2/astro-ph.CO/5bd7fff3-4ec8-4c27-b6bd-2b4b07fcef0e.png', 'image_wh': [[23, 6]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}ccccccccccc@{}}\n...\n\\end{tabular}\n```"}]} VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_062638_before_screenshot.png 2025-08-28 03:48:45.349786 load time: 1189.1 ms 32%|███▏ | 7059/22095 [11:50:55<24:16:38, 5.81s/it] {'loss': 0.4822, 'grad_norm': 0.7145812322342616, 'learning_rate': 7.959404290516705e-06, 'epoch': 0.32} 32%|███▏ | 7059/22095 [11:50:55<24:16:38, 5.81s/it] 32%|███▏ | 7060/22095 [11:50:58<21:53:34, 5.24s/it] {'loss': 0.408, 'grad_norm': 0.6517928178132028, 'learning_rate': 7.958813505720031e-06, 'epoch': 0.32} 32%|███▏ | 7060/22095 [11:50:58<21:53:34, 5.24s/it] 32%|███▏ | 7061/22095 [11:51:02<19:39:11, 4.71s/it] {'loss': 0.3909, 'grad_norm': 0.643493560485578, 'learning_rate': 7.958222657347543e-06, 'epoch': 0.32} 32%|███▏ | 7061/22095 [11:51:02<19:39:11, 4.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50275 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58273 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104318 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7062/22095 [11:51:05<18:00:29, 4.31s/it] {'loss': 0.3946, 'grad_norm': 0.6968346039393635, 'learning_rate': 7.957631745411936e-06, 'epoch': 0.32} 32%|███▏ | 7062/22095 [11:51:05<18:00:29, 4.31s/it] 32%|███▏ | 7063/22095 [11:51:08<16:15:17, 3.89s/it] {'loss': 0.3484, 'grad_norm': 0.6186660710362943, 'learning_rate': 7.957040769925906e-06, 'epoch': 0.32} 32%|███▏ | 7063/22095 [11:51:08<16:15:17, 3.89s/it] 32%|███▏ | 7064/22095 [11:51:11<15:20:44, 3.68s/it] {'loss': 0.3651, 'grad_norm': 0.6076213582496474, 'learning_rate': 7.95644973090215e-06, 'epoch': 0.32} 32%|███▏ | 7064/22095 [11:51:11<15:20:44, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 32%|███▏ | 7065/22095 [11:51:21<22:40:52, 5.43s/it] {'loss': 0.4934, 'grad_norm': 0.5507873432743083, 'learning_rate': 7.955858628353372e-06, 'epoch': 0.32} 32%|███▏ | 7065/22095 [11:51:21<22:40:52, 5.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8387779 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 54591, 'image': 'vrdu_table_final_2/astro-ph.CO/423d0278-f609-44fa-b333-c0e1ed6bf9cf.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_20/img/step_3.png 2025-08-28 03:49:19.656474 load time: 1256.76 ms 32%|███▏ | 7066/22095 [11:51:24<19:53:53, 4.77s/it] {'loss': 0.3774, 'grad_norm': 0.6447783016497618, 'learning_rate': 7.95526746229227e-06, 'epoch': 0.32} 32%|███▏ | 7066/22095 [11:51:24<19:53:53, 4.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7067/22095 [11:51:27<18:06:30, 4.34s/it] {'loss': 0.3423, 'grad_norm': 0.6033695502218002, 'learning_rate': 7.954676232731545e-06, 'epoch': 0.32} 32%|███▏ | 7067/22095 [11:51:27<18:06:30, 4.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887141 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10294, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]} 32%|███▏ | 7068/22095 [11:51:31<16:35:55, 3.98s/it] {'loss': 0.3626, 'grad_norm': 0.7129234065266957, 'learning_rate': 7.954084939683901e-06, 'epoch': 0.32} 32%|███▏ | 7068/22095 [11:51:31<16:35:55, 3.98s/it]VC:s3://gui-agent/data_20250623/windows_augment/images/androidstudio_handmade/handmade_annotation_2/images/2ba58f7bf0b5c6f4b73b9b79adac79d_id_12_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 03:49:30.148015 load time: 1138.0 ms 32%|███▏ | 7069/22095 [11:51:33<15:12:45, 3.64s/it] {'loss': 0.346, 'grad_norm': 0.6305288052748258, 'learning_rate': 7.953493583162047e-06, 'epoch': 0.32} 32%|███▏ | 7069/22095 [11:51:33<15:12:45, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43882 > 40960). 
Running this sequence through the model will result in indexing errors 32%|███▏ | 7070/22095 [11:51:43<22:39:12, 5.43s/it] {'loss': 0.4731, 'grad_norm': 0.3498155729821522, 'learning_rate': 7.952902163178687e-06, 'epoch': 0.32} 32%|███▏ | 7070/22095 [11:51:43<22:39:12, 5.43s/it] 32%|███▏ | 7071/22095 [11:51:47<20:57:00, 5.02s/it] {'loss': 0.3761, 'grad_norm': 0.6789249555155012, 'learning_rate': 7.952310679746528e-06, 'epoch': 0.32} 32%|███▏ | 7071/22095 [11:51:47<20:57:00, 5.02s/it]VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_103/img/step_0.png 2025-08-28 03:49:47.568506 load time: 1045.55 ms 32%|███▏ | 7072/22095 [11:51:50<18:26:16, 4.42s/it] {'loss': 0.3881, 'grad_norm': 0.6980391317468145, 'learning_rate': 7.951719132878279e-06, 'epoch': 0.32} 32%|███▏ | 7072/22095 [11:51:50<18:26:16, 4.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7073/22095 [11:51:54<17:49:33, 4.27s/it] {'loss': 0.3882, 'grad_norm': 0.9088700610711827, 'learning_rate': 7.951127522586653e-06, 'epoch': 0.32} 32%|███▏ | 7073/22095 [11:51:54<17:49:33, 4.27s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8359935 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 26656, 'image': 'vrdu_table_final_2/astro-ph.CO/7ad8968a-fe27-43a9-9714-b18e5889ba55.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
32%|███▏ | 7074/22095 [11:51:57<16:00:55, 3.84s/it] {'loss': 0.3502, 'grad_norm': 0.6659807627108673, 'learning_rate': 7.95053584888436e-06, 'epoch': 0.32}
32%|███▏ | 7075/22095 [11:52:01<16:30:38, 3.96s/it] {'loss': 0.4077, 'grad_norm': 0.755708696082257, 'learning_rate': 7.94994411178411e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960774 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11609, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,M是AB中点,∴BM=\\frac{1}{2}AB=5cm,又∵NB=2cm,∴MN=BM-BN=5-2=3cm.'}]}
32%|███▏ | 7076/22095 [11:52:04<15:08:16, 3.63s/it] {'loss': 0.3779, 'grad_norm': 0.6560835029799718, 'learning_rate': 7.949352311298626e-06, 'epoch': 0.32}
32%|███▏ | 7077/22095 [11:52:08<15:03:11, 3.61s/it] {'loss': 0.362, 'grad_norm': 0.6869943892018556, 'learning_rate': 7.948760447440617e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7078/22095 [11:52:18<24:02:12, 5.76s/it] {'loss': 0.4803, 'grad_norm': 0.3807530113056217, 'learning_rate': 7.948168520222802e-06, 'epoch': 0.32}
32%|███▏ | 7079/22095 [11:52:22<21:17:06, 5.10s/it] {'loss': 0.3371, 'grad_norm': 0.6263532501824917, 'learning_rate': 7.9475765296579e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047576 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7'}]}
32%|███▏ | 7080/22095 [11:52:25<19:23:28, 4.65s/it] {'loss': 0.3876, 'grad_norm': 0.6798061248166258, 'learning_rate': 7.946984475758633e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (59354 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79033 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7081/22095 [11:52:29<17:25:00, 4.18s/it] {'loss': 0.372, 'grad_norm': 0.6812449344143824, 'learning_rate': 7.946392358537719e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (142461 > 40960). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366633 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33379, 'image': 'vrdu_table_final_2/astro-ph.CO/29d12f8d-d0f8-4fbb-940d-24d92eefddb6.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
32%|███▏ | 7082/22095 [11:52:31<15:37:10, 3.75s/it] {'loss': 0.3567, 'grad_norm': 0.634619025478144, 'learning_rate': 7.945800178007883e-06, 'epoch': 0.32}
32%|███▏ | 7083/22095 [11:52:35<15:39:08, 3.75s/it] {'loss': 0.3509, 'grad_norm': 0.6872688162762708, 'learning_rate': 7.945207934181849e-06, 'epoch': 0.32}
32%|███▏ | 7084/22095 [11:52:39<15:23:20, 3.69s/it] {'loss': 0.3606, 'grad_norm': 0.6278473270352956, 'learning_rate': 7.944615627072341e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [12, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366420 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 17, 100, 100] is too small. Minimum size is 28.
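[Editor's note] The recurring `ValueError: Image size ... is too small. Minimum size is 28.` failures above all come from samples whose `image_wh` records at least one side below 28 px. A minimal sketch of pre-filtering such samples out of the manifest before training, so the dataloader never retries them — the function and field names follow the sample dumps in this log, but the filter itself is illustrative, not the repo's actual code:

```python
# Illustrative pre-filter (not the training repo's code): drop manifest
# entries whose image dimensions fall below the model's minimum side
# length, matching the "Minimum size is 28" errors in this log.
MIN_SIDE = 28

def is_large_enough(sample: dict, min_side: int = MIN_SIDE) -> bool:
    """True if every (w, h) pair in sample['image_wh'] meets min_side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def filter_manifest(samples: list[dict], min_side: int = MIN_SIDE) -> list[dict]:
    kept = [s for s in samples if is_large_enough(s, min_side)]
    dropped = len(samples) - len(kept)
    if dropped:
        print(f"Dropped {dropped} sample(s) with a side below {min_side}px")
    return kept

if __name__ == "__main__":
    manifest = [
        {"id": 26656, "image_wh": [[17, 28]]},   # width 17 < 28 -> dropped
        {"id": 10294, "image_wh": [[210, 22]]},  # height 22 < 28 -> dropped
        {"id": 1, "image_wh": [[640, 480]]},     # kept
    ]
    print([s["id"] for s in filter_manifest(manifest)])  # -> [1]
```

Filtering once at manifest-build time avoids the per-step fetch/retry overhead visible in the `[Try #0] Failed to fetch sample ...` lines.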
Problematic sample: {'id': 33166, 'image': 'vrdu_table_final_2/astro-ph.CO/f34d5a45-b2f7-4d21-a813-88c9183749a9.png', 'image_wh': [[12, 17]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\footnotesize #1\n\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
32%|███▏ | 7085/22095 [11:52:42<15:12:56, 3.65s/it] {'loss': 0.3734, 'grad_norm': 0.6898920436870292, 'learning_rate': 7.944023256692086e-06, 'epoch': 0.32}
32%|███▏ | 7086/22095 [11:52:46<15:03:35, 3.61s/it] {'loss': 0.3845, 'grad_norm': 0.6505918738584352, 'learning_rate': 7.943430823053815e-06, 'epoch': 0.32}
32%|███▏ | 7087/22095 [11:52:49<14:58:01, 3.59s/it] {'loss': 0.3345, 'grad_norm': 0.7038738512820012, 'learning_rate': 7.942838326170255e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (42907 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55552 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110306 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7088/22095 [11:52:53<14:38:20, 3.51s/it] {'loss': 0.3571, 'grad_norm': 0.6051699178090763, 'learning_rate': 7.942245766054137e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045998 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定'}, {'from': 'gpt', 'value': '【解答】解:∵M、N分别是线段AB、BC的中点,∴MB=0.5AB=3cm,NB=0.5BC=2cm,∴MN=MB+NB=3+2=5(cm),'}]}
32%|███▏ | 7089/22095 [11:53:00<20:00:39, 4.80s/it] {'loss': 0.4855, 'grad_norm': 0.4150689109819772, 'learning_rate': 7.941653142718194e-06, 'epoch': 0.32}
32%|███▏ | 7090/22095 [11:53:04<18:11:20, 4.36s/it] {'loss': 0.3285, 'grad_norm': 0.6268038080046793, 'learning_rate': 7.94106045617516e-06, 'epoch': 0.32}
32%|███▏ | 7091/22095 [11:53:07<16:47:28, 4.03s/it] {'loss': 0.3662, 'grad_norm': 0.6397684099671085, 'learning_rate': 7.94046770643777e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 03:51:06.980174 load time: 1062.91 ms
32%|███▏ | 7092/22095 [11:53:10<15:53:46, 3.81s/it] {'loss': 0.3667, 'grad_norm': 0.6908477440660391, 'learning_rate': 7.93987489351876e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (104328 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7093/22095 [11:53:14<15:16:55, 3.67s/it] {'loss': 0.3978, 'grad_norm': 0.6745483725451942, 'learning_rate': 7.939282017430867e-06, 'epoch': 0.32}
32%|███▏ | 7094/22095 [11:53:17<14:39:07, 3.52s/it] {'loss': 0.3489, 'grad_norm': 0.6529001360028059, 'learning_rate': 7.93868907818683e-06, 'epoch': 0.32}
32%|███▏ | 7095/22095 [11:53:20<14:10:16, 3.40s/it] {'loss': 0.4018, 'grad_norm': 0.6145650846844853, 'learning_rate': 7.938096075799391e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7096/22095 [11:53:29<21:32:25, 5.17s/it] {'loss': 0.487, 'grad_norm': 0.4122410958350045, 'learning_rate': 7.93750301028129e-06, 'epoch': 0.32}
32%|███▏ | 7097/22095 [11:53:33<19:51:39, 4.77s/it] {'loss': 0.3653, 'grad_norm': 0.6595309437944411, 'learning_rate': 7.936909881645275e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (46629 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7098/22095 [11:53:36<17:37:24, 4.23s/it] {'loss': 0.3788, 'grad_norm': 0.6300548129151529, 'learning_rate': 7.936316689904083e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (51169 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54824 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41856 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7099/22095 [11:53:40<17:05:39, 4.10s/it] {'loss': 0.3666, 'grad_norm': 0.6692355502644558, 'learning_rate': 7.935723435070464e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (51241 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58295 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75253 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7100/22095 [11:53:43<15:25:53, 3.70s/it] {'loss': 0.3704, 'grad_norm': 0.6494119925557855, 'learning_rate': 7.935130117157166e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7101/22095 [11:53:53<23:54:04, 5.74s/it] {'loss': 0.5311, 'grad_norm': 0.34216565367593454, 'learning_rate': 7.934536736176934e-06, 'epoch': 0.32}
32%|███▏ | 7102/22095 [11:53:57<21:16:42, 5.11s/it] {'loss': 0.3802, 'grad_norm': 1.0052287214302138, 'learning_rate': 7.933943292142524e-06, 'epoch': 0.32}
32%|███▏ | 7103/22095 [11:54:00<18:54:49, 4.54s/it] {'loss': 0.344, 'grad_norm': 0.6499179876977065, 'learning_rate': 7.93334978506668e-06, 'epoch': 0.32}
32%|███▏ | 7104/22095 [11:54:04<17:50:31, 4.28s/it] {'loss': 0.3724, 'grad_norm': 0.6181493818110998, 'learning_rate': 7.93275621496216e-06, 'epoch': 0.32}
32%|███▏ | 7105/22095 [11:54:07<16:25:02, 3.94s/it] {'loss': 0.348, 'grad_norm': 0.6791885332119639, 'learning_rate': 7.932162581841715e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7106/22095 [11:54:14<20:14:50, 4.86s/it] {'loss': 0.5062, 'grad_norm': 0.4324667087116202, 'learning_rate': 7.931568885718104e-06, 'epoch': 0.32}
32%|███▏ | 7107/22095 [11:54:23<25:56:13, 6.23s/it] {'loss': 0.512, 'grad_norm': 0.5369739435044163, 'learning_rate': 7.930975126604079e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (52770 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7108/22095 [11:54:33<29:58:19, 7.20s/it] {'loss': 0.5037, 'grad_norm': 0.30855384882874476, 'learning_rate': 7.930381304512401e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8336094 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 2714, 'image': 'vrdu_table_final_2/astro-ph.CO/dfc03797-0c57-45ae-99d4-3d5f6c6fc10c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l} #1 \\end{tabular}\n```"}]}
32%|███▏ | 7109/22095 [11:54:36<25:08:12, 6.04s/it] {'loss': 0.442, 'grad_norm': 0.6766398168097438, 'learning_rate': 7.92978741945583e-06, 'epoch': 0.32}
32%|███▏ | 7110/22095 [11:54:40<22:49:37, 5.48s/it] {'loss': 0.3785, 'grad_norm': 0.6718820626808718, 'learning_rate': 7.929193471447123e-06, 'epoch': 0.32}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f45219d4cc35521265a9fcbd416d3c17fc3501857a99990c36eb3d855895b1f9.png 2025-08-28 03:52:39.380486 load time: 1324.02 ms
32%|███▏ | 7111/22095 [11:54:43<19:31:46, 4.69s/it] {'loss': 0.3497, 'grad_norm': 0.62498935257029, 'learning_rate': 7.928599460499046e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/images/os_ubuntu_2/handmade_annotation_5/images/paste_Screenshot from 2025-07-08 17-59-54_id_5_internvl_appearance_crop_0_grounding_instructions_random.png 2025-08-28 03:52:41.512231 load time: 1043.94 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 03:52:42.118343 load time: 1312.3 ms
32%|███▏ | 7112/22095 [11:54:46<17:14:47, 4.14s/it] {'loss': 0.3715, 'grad_norm': 0.7246586779569112, 'learning_rate': 7.92800538662436e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8553992 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24619, 'image': '1565111052.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Politics & Social Sciences? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304390 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1bAy3eL2H8KJjy1zkXXXr7pXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to extract and decode all the text in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n天道酬勤\n青木政治家\n壁立\n欲则\n仞無\n剛\n千\n乃大\n有容\n纳百\n川\n海\n任意2套省20%'}]}
32%|███▏ | 7113/22095 [11:54:49<15:54:46, 3.82s/it] {'loss': 0.3734, 'grad_norm': 0.6591067393074789, 'learning_rate': 7.927411249835832e-06, 'epoch': 0.32}
32%|███▏ | 7114/22095 [11:54:52<14:55:17, 3.59s/it] {'loss': 0.3286, 'grad_norm': 0.7603779807342957, 'learning_rate': 7.926817050146227e-06, 'epoch': 0.32}
32%|███▏ | 7115/22095 [11:54:55<14:29:29, 3.48s/it] {'loss': 0.3599, 'grad_norm': 0.657032741453226, 'learning_rate': 7.926222787568314e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7116/22095 [11:55:05<22:11:28, 5.33s/it] {'loss': 0.5117, 'grad_norm': 0.8247053846685484, 'learning_rate': 7.925628462114858e-06, 'epoch': 0.32}
32%|███▏ | 7117/22095 [11:55:09<20:20:43, 4.89s/it] {'loss': 0.3377, 'grad_norm': 0.8964379263543338, 'learning_rate': 7.925034073798632e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (53210 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54765 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41984 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48141 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60896 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7118/22095 [11:55:11<17:41:48, 4.25s/it] {'loss': 0.3311, 'grad_norm': 0.6875665098075191, 'learning_rate': 7.92443962263241e-06, 'epoch': 0.32}
32%|███▏ | 7119/22095 [11:55:15<17:12:35, 4.14s/it] {'loss': 0.3867, 'grad_norm': 0.9490803870507291, 'learning_rate': 7.92384510862896e-06, 'epoch': 0.32}
32%|███▏ | 7120/22095 [11:55:19<16:02:55, 3.86s/it] {'loss': 0.3784, 'grad_norm': 0.7349779813006002, 'learning_rate': 7.92325053180106e-06, 'epoch': 0.32}
32%|███▏ | 7121/22095 [11:55:23<16:23:10, 3.94s/it] {'loss': 0.3666, 'grad_norm': 0.7006969086062932, 'learning_rate': 7.922655892161482e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_54/img/step_0.png 2025-08-28 03:53:21.448206 load time: 1369.15 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 03:53:21.449981 load time: 1466.52 ms
32%|███▏ | 7122/22095 [11:55:27<17:01:03, 4.09s/it] {'loss': 0.3581, 'grad_norm': 0.6555049359630561, 'learning_rate': 7.922061189723007e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398002 in VC:s3://internvl-moe-sft-data/. Exception: Image size [84, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 152, 'image': 'vrdu_table_final_2/astro-ph.CO/1380e8fb-ff9c-4f47-a52f-5b5762b3e632.png', 'image_wh': [[84, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{l}MMF3\\end{tabular}\n```"}]}
32%|███▏ | 7123/22095 [11:55:31<16:54:36, 4.07s/it] {'loss': 0.3809, 'grad_norm': 0.6740198833872045, 'learning_rate': 7.921466424498409e-06, 'epoch': 0.32}
32%|███▏ | 7124/22095 [11:55:34<15:31:41, 3.73s/it] {'loss': 0.3679, 'grad_norm': 0.6650014243978319, 'learning_rate': 7.920871596500473e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (48018 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70532 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98925 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42308 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7125/22095 [11:55:38<15:12:45, 3.66s/it] {'loss': 0.3366, 'grad_norm': 0.6504397000732004, 'learning_rate': 7.920276705741975e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7126/22095 [11:55:47<22:57:45, 5.52s/it] {'loss': 0.4496, 'grad_norm': 0.49837067139854274, 'learning_rate': 7.919681752235701e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (76011 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107401 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108999 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82008 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7127/22095 [11:55:51<20:00:50, 4.81s/it] {'loss': 0.3166, 'grad_norm': 0.6165258872006424, 'learning_rate': 7.919086735994433e-06, 'epoch': 0.32}
Token indices sequence length is longer than the specified maximum sequence length for this model (44642 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71162 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7128/22095 [11:55:54<18:21:13, 4.41s/it] {'loss': 0.3648, 'grad_norm': 0.6527582636116545, 'learning_rate': 7.918491657030956e-06, 'epoch': 0.32}
32%|███▏ | 7129/22095 [11:55:58<17:18:10, 4.16s/it] {'loss': 0.3332, 'grad_norm': 0.6597269628107116, 'learning_rate': 7.917896515358057e-06, 'epoch': 0.32}
32%|███▏ | 7130/22095 [11:56:01<16:09:04, 3.89s/it] {'loss': 0.3536, 'grad_norm': 0.6542637026119181, 'learning_rate': 7.917301310988525e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250612/mac/images/settings/ce733e79-9bcf-4d47-8303-951a3a1ae194/images/step_0.png 2025-08-28 03:53:59.676413 load time: 1003.39 ms
32%|███▏ | 7131/22095 [11:56:05<16:06:17, 3.87s/it] {'loss': 0.365, 'grad_norm': 0.6347125789718905, 'learning_rate': 7.916706043935145e-06, 'epoch': 0.32}
32%|███▏ | 7132/22095 [11:56:08<15:13:23, 3.66s/it] {'loss': 0.3735, 'grad_norm': 0.7115640181010907, 'learning_rate': 7.916110714210711e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7133/22095 [11:56:18<23:48:48, 5.73s/it] {'loss': 0.4915, 'grad_norm': 0.3234316538967396, 'learning_rate': 7.915515321828014e-06, 'epoch': 0.32}
32%|███▏ | 7134/22095 [11:56:22<20:32:19, 4.94s/it] {'loss': 0.3634, 'grad_norm': 0.687035292169822, 'learning_rate': 7.914919866799847e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7135/22095 [11:56:31<26:16:01, 6.32s/it] {'loss': 0.4805, 'grad_norm': 0.3137342255904164, 'learning_rate': 7.914324349139006e-06, 'epoch': 0.32}
32%|███▏ | 7136/22095 [11:56:41<30:12:53, 7.27s/it] {'loss': 0.4696, 'grad_norm': 0.27487283487033065, 'learning_rate': 7.913728768858283e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (93317 > 40960). Running this sequence through the model will result in indexing errors
32%|███▏ | 7137/22095 [11:56:44<25:20:11, 6.10s/it] {'loss': 0.3231, 'grad_norm': 0.6152486208784101, 'learning_rate': 7.91313312597048e-06, 'epoch': 0.32}
32%|███▏ | 7138/22095 [11:56:48<22:30:52, 5.42s/it] {'loss': 0.3707, 'grad_norm': 0.6482665421392981, 'learning_rate': 7.91253742048839e-06, 'epoch': 0.32}
32%|███▏ | 7139/22095 [11:56:51<19:50:06, 4.77s/it] {'loss': 0.3887, 'grad_norm': 0.6755287950541693, 'learning_rate': 7.911941652424819e-06, 'epoch': 0.32}
32%|███▏ | 7140/22095 [11:56:54<17:25:06, 4.19s/it] {'loss': 0.3947, 'grad_norm': 0.7476984596163645, 'learning_rate': 7.911345821792565e-06, 'epoch': 0.32}
32%|███▏ | 7141/22095 [11:56:57<16:18:22, 3.93s/it] {'loss': 0.398, 'grad_norm': 0.7050035119659278, 'learning_rate': 7.910749928604429e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 03:54:56.635805 load time: 1499.97 ms
32%|███▏ | 7142/22095 [11:57:00<15:20:10, 3.69s/it] {'loss': 0.3849, 'grad_norm': 0.6985845003611664, 'learning_rate': 7.910153972873218e-06, 'epoch': 0.32}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 03:54:58.907396 load time: 1358.27 ms
32%|███▏ | 7143/22095 [11:57:03<14:30:00, 3.49s/it] {'loss': 0.3831, 'grad_norm': 0.7188017184074661, 'learning_rate': 7.909557954611736e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8401323 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3486, 'image': 'vrdu_table_final_2/astro-ph.CO/aa83e60e-f987-42af-a88b-aec999a383fb.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
32%|███▏ | 7144/22095 [11:57:06<13:49:28, 3.33s/it] {'loss': 0.3489, 'grad_norm': 0.6187212364272202, 'learning_rate': 7.908961873832788e-06, 'epoch': 0.32}
Invalidate trace cache @ step 2: expected module 1, but got module 364
32%|███▏ | 7145/22095 [11:57:13<18:00:49, 4.34s/it] {'loss': 0.4918, 'grad_norm': 0.5689497713714161, 'learning_rate': 7.908365730549183e-06, 'epoch': 0.32}
32%|███▏ | 7146/22095 [11:57:17<17:33:20, 4.23s/it] {'loss': 0.352, 'grad_norm': 0.6522720842697802, 'learning_rate': 7.907769524773734e-06, 'epoch': 0.32}
32%|███▏ | 7147/22095 [11:57:20<16:31:53, 3.98s/it] {'loss': 0.3962, 'grad_norm': 0.68249189867944, 'learning_rate': 7.907173256519246e-06, 'epoch': 0.32}
32%|███▏ | 7148/22095 [11:57:24<16:04:58, 3.87s/it] {'loss': 0.3562, 'grad_norm': 0.6918661181189343, 'learning_rate': 7.906576925798535e-06, 'epoch': 0.32}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 51476, 'image': 'vrdu_table_final_2/astro-ph.CO/2ffe3834-a1c8-4205-9562-bbfc18b5998a.png', 'image_wh': [[231, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}Number of clusters\\end{tabular}\n```"}]} 32%|███▏ | 7149/22095 [11:57:27<15:22:49, 3.70s/it] {'loss': 0.3726, 'grad_norm': 0.6470440870797444, 'learning_rate': 7.905980532624411e-06, 'epoch': 0.32} 32%|███▏ | 7149/22095 [11:57:27<15:22:49, 3.70s/it] 32%|███▏ | 7150/22095 [11:57:31<14:51:51, 3.58s/it] {'loss': 0.3307, 'grad_norm': 0.6204544691347952, 'learning_rate': 7.905384077009693e-06, 'epoch': 0.32} 32%|███▏ | 7150/22095 [11:57:31<14:51:51, 3.58s/it] 32%|███▏ | 7151/22095 [11:57:34<14:52:06, 3.58s/it] {'loss': 0.3837, 'grad_norm': 0.654141371005639, 'learning_rate': 7.904787558967193e-06, 'epoch': 0.32} 32%|███▏ | 7151/22095 [11:57:34<14:52:06, 3.58s/it] 32%|███▏ | 7152/22095 [11:57:38<15:40:59, 3.78s/it] {'loss': 0.364, 'grad_norm': 0.6214250147231662, 'learning_rate': 7.904190978509729e-06, 'epoch': 0.32} 32%|███▏ | 7152/22095 [11:57:38<15:40:59, 3.78s/it] 32%|███▏ | 7153/22095 [11:57:42<15:35:52, 3.76s/it] {'loss': 0.3793, 'grad_norm': 0.6605746453137314, 'learning_rate': 7.90359433565012e-06, 'epoch': 0.32} 32%|███▏ | 7153/22095 [11:57:42<15:35:52, 3.76s/it] 32%|███▏ | 7154/22095 [11:57:45<14:33:29, 3.51s/it] {'loss': 0.3635, 'grad_norm': 0.6374246098103137, 'learning_rate': 7.902997630401188e-06, 'epoch': 0.32} 32%|███▏ | 7154/22095 [11:57:45<14:33:29, 3.51s/it] 32%|███▏ | 7155/22095 [11:57:48<13:56:27, 3.36s/it] {'loss': 0.384, 'grad_norm': 0.7547998705704276, 'learning_rate': 7.902400862775752e-06, 'epoch': 0.32} 32%|███▏ | 7155/22095 [11:57:48<13:56:27, 3.36s/it] 32%|███▏ | 7156/22095 [11:57:52<14:17:19, 3.44s/it] {'loss': 0.3638, 'grad_norm': 0.6713274567187839, 'learning_rate': 
7.901804032786637e-06, 'epoch': 0.32} 32%|███▏ | 7156/22095 [11:57:52<14:17:19, 3.44s/it]VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 03:55:51.593647 load time: 1360.3 ms 32%|███▏ | 7157/22095 [11:57:55<14:31:38, 3.50s/it] {'loss': 0.3773, 'grad_norm': 0.6410889902728407, 'learning_rate': 7.901207140446662e-06, 'epoch': 0.32} 32%|███▏ | 7157/22095 [11:57:55<14:31:38, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7158/22095 [11:57:59<14:31:42, 3.50s/it] {'loss': 0.3783, 'grad_norm': 0.6597367749925079, 'learning_rate': 7.90061018576866e-06, 'epoch': 0.32} 32%|███▏ | 7158/22095 [11:57:59<14:31:42, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047835 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 
6.5cm'}, {'from': 'gpt', 'value': '【解答】解:如图,∵点D、E分别是线段AC、CB的中点,∴DC=\\frac{1}{2}AC,CE=\\frac{1}{2}BC,∴DE=DC+CE=\\frac{1}{2}(AC+BC)=\\frac{1}{2}AB.又∵AB=10cm,∴DE=5cm;'}]} 32%|███▏ | 7159/22095 [11:58:02<13:49:07, 3.33s/it] {'loss': 0.3571, 'grad_norm': 0.7166227954783297, 'learning_rate': 7.900013168765453e-06, 'epoch': 0.32} 32%|███▏ | 7159/22095 [11:58:02<13:49:07, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7160/22095 [11:58:05<14:06:42, 3.40s/it] {'loss': 0.3568, 'grad_norm': 0.8334546986425982, 'learning_rate': 7.899416089449867e-06, 'epoch': 0.32} 32%|███▏ | 7160/22095 [11:58:05<14:06:42, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250630/mac/images/terminal/8d2ef7b7-ff2b-4f31-a280-f43e1d982b54/images/step_2.png 2025-08-28 03:56:05.034498 load time: 1058.09 ms 32%|███▏ | 7161/22095 [11:58:10<15:19:41, 3.70s/it] {'loss': 0.3913, 'grad_norm': 0.6549313556490581, 'learning_rate': 7.898818947834737e-06, 'epoch': 0.32} 32%|███▏ | 7161/22095 [11:58:10<15:19:41, 3.70s/it] 32%|███▏ | 7162/22095 [11:58:14<15:43:24, 3.79s/it] {'loss': 0.3639, 'grad_norm': 0.6616731785929209, 'learning_rate': 7.898221743932887e-06, 'epoch': 0.32} 32%|███▏ | 7162/22095 [11:58:14<15:43:24, 3.79s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_2/images/step_4.png 2025-08-28 03:56:12.545967 load time: 1065.65 ms 32%|███▏ | 7163/22095 [11:58:17<15:28:34, 3.73s/it] {'loss': 0.3985, 'grad_norm': 0.6307360694773526, 'learning_rate': 7.897624477757156e-06, 'epoch': 0.32} 32%|███▏ | 7163/22095 [11:58:17<15:28:34, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43907 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42865 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97041 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7164/22095 [11:58:21<15:04:04, 3.63s/it] {'loss': 0.3666, 'grad_norm': 0.6211038754157479, 'learning_rate': 7.897027149320375e-06, 'epoch': 0.32} 32%|███▏ | 7164/22095 [11:58:21<15:04:04, 3.63s/it] 32%|███▏ | 7165/22095 [11:58:25<15:14:00, 3.67s/it] {'loss': 0.365, 'grad_norm': 0.6638297280567542, 'learning_rate': 7.896429758635375e-06, 'epoch': 0.32} 32%|███▏ | 7165/22095 [11:58:25<15:14:00, 3.67s/it] 32%|███▏ | 7166/22095 [11:58:28<15:00:04, 3.62s/it] {'loss': 0.3813, 'grad_norm': 0.6532817268016007, 'learning_rate': 7.895832305715e-06, 'epoch': 0.32} 32%|███▏ | 7166/22095 [11:58:28<15:00:04, 3.62s/it] 32%|███▏ | 7167/22095 [11:58:31<14:34:37, 3.52s/it] {'loss': 0.3704, 'grad_norm': 0.6701885349272433, 'learning_rate': 7.895234790572077e-06, 'epoch': 0.32} 32%|███▏ | 7167/22095 [11:58:31<14:34:37, 3.52s/it] 32%|███▏ | 7168/22095 [11:58:36<15:42:41, 3.79s/it] {'loss': 0.378, 'grad_norm': 0.6442400135179758, 'learning_rate': 7.894637213219454e-06, 'epoch': 0.32} 32%|███▏ | 7168/22095 [11:58:36<15:42:41, 3.79s/it] 32%|███▏ | 7169/22095 [11:58:38<14:27:24, 3.49s/it] {'loss': 0.3926, 'grad_norm': 0.667231431672473, 'learning_rate': 7.894039573669968e-06, 'epoch': 0.32} 32%|███▏ | 7169/22095 [11:58:39<14:27:24, 3.49s/it] 32%|███▏ | 7170/22095 [11:58:41<13:49:15, 3.33s/it] {'loss': 0.3954, 'grad_norm': 0.6344096908061077, 'learning_rate': 7.893441871936456e-06, 'epoch': 0.32} 32%|███▏ | 7170/22095 [11:58:41<13:49:15, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation 32%|███▏ | 7171/22095 [11:58:45<14:26:35, 3.48s/it] {'loss': 0.3807, 'grad_norm': 0.6725657035402189, 'learning_rate': 7.892844108031768e-06, 'epoch': 0.32} 32%|███▏ | 7171/22095 [11:58:45<14:26:35, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7172/22095 [11:58:55<22:15:10, 5.37s/it] {'loss': 0.4932, 'grad_norm': 0.7062785122126237, 'learning_rate': 7.892246281968745e-06, 'epoch': 0.32} 32%|███▏ | 7172/22095 [11:58:55<22:15:10, 5.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (132562 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7173/22095 [11:58:59<20:24:56, 4.93s/it] {'loss': 0.3447, 'grad_norm': 0.6774231556142725, 'learning_rate': 7.891648393760232e-06, 'epoch': 0.32} 32%|███▏ | 7173/22095 [11:58:59<20:24:56, 4.93s/it] 32%|███▏ | 7174/22095 [11:59:03<19:56:07, 4.81s/it] {'loss': 0.3446, 'grad_norm': 0.8538387827323394, 'learning_rate': 7.891050443419074e-06, 'epoch': 0.32} 32%|███▏ | 7174/22095 [11:59:04<19:56:07, 4.81s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 32%|███▏ | 7175/22095 [11:59:06<17:30:32, 4.22s/it] {'loss': 0.4065, 'grad_norm': 0.6583879453482827, 'learning_rate': 7.890452430958123e-06, 'epoch': 0.32} 32%|███▏ | 7175/22095 [11:59:06<17:30:32, 4.22s/it] 32%|███▏ | 7176/22095 [11:59:10<17:19:42, 4.18s/it] {'loss': 0.4438, 'grad_norm': 0.681043978728226, 'learning_rate': 7.889854356390225e-06, 'epoch': 0.32} 32%|███▏ | 7176/22095 [11:59:10<17:19:42, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52987 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76407 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63324 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96377 > 40960). Running this sequence through the model will result in indexing errors 32%|███▏ | 7177/22095 [11:59:14<16:16:59, 3.93s/it] {'loss': 0.3427, 'grad_norm': 0.8220002508654024, 'learning_rate': 7.889256219728235e-06, 'epoch': 0.32} 32%|███▏ | 7177/22095 [11:59:14<16:16:59, 3.93s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 03:57:12.558299 load time: 1162.52 ms VC:s3://gui-agent/data_20250630/windows_augment/images/DR/handmade_annotation_3/images/DR_1_id_9_internvl_element-caption_crop_1_grounding_instructions_random_paste.png 2025-08-28 03:57:13.024266 load time: 1237.61 ms 32%|███▏ | 7178/22095 [11:59:17<15:06:18, 3.65s/it] {'loss': 0.3257, 'grad_norm': 0.6731314383680075, 'learning_rate': 7.888658020985e-06, 'epoch': 0.32} 32%|███▏ | 7178/22095 [11:59:17<15:06:18, 3.65s/it] 32%|███▏ | 7179/22095 [11:59:20<14:06:48, 3.41s/it] {'loss': 0.3643, 'grad_norm': 0.6885006191925244, 'learning_rate': 7.888059760173377e-06, 'epoch': 0.32} 32%|███▏ | 7179/22095 [11:59:20<14:06:48, 3.41s/it] 32%|███▏ | 7180/22095 [11:59:23<14:09:33, 3.42s/it] {'loss': 0.3772, 'grad_norm': 0.6801934899357927, 'learning_rate': 7.887461437306221e-06, 'epoch': 0.32} 32%|███▏ | 7180/22095 [11:59:23<14:09:33, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 33%|███▎ | 7181/22095 [11:59:33<21:39:18, 5.23s/it] {'loss': 0.4863, 'grad_norm': 
0.7150719356238756, 'learning_rate': 7.886863052396384e-06, 'epoch': 0.33} 33%|███▎ | 7181/22095 [11:59:33<21:39:18, 5.23s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [312, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8426211 in VC:s3://internvl-moe-sft-data/. Exception: Image size [312, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 56007, 'image': 'vrdu_texteq/astro-ph.CO/255512cf-61f2-4065-981a-6ea33afa3b14.png', 'image_wh': [[312, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'with $\\Delta N\\sim60$ and where'}]} VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_2/images/step_0.png 2025-08-28 03:57:32.876772 load time: 1246.53 ms 33%|███▎ | 7182/22095 [11:59:36<19:29:11, 4.70s/it] {'loss': 0.3638, 'grad_norm': 0.626612080010428, 'learning_rate': 7.886264605456727e-06, 'epoch': 0.33} 33%|███▎ | 7182/22095 [11:59:36<19:29:11, 4.70s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8394374 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 61209, 'image': 'vrdu_table_final_2/astro-ph.EP/776cd881-899a-4b27-9d0b-45e4af043845.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 33%|███▎ | 7183/22095 [11:59:39<17:26:28, 4.21s/it] {'loss': 0.3396, 'grad_norm': 0.6263977091282102, 'learning_rate': 7.88566609650011e-06, 'epoch': 0.33} 33%|███▎ | 7183/22095 [11:59:39<17:26:28, 4.21s/it] 33%|███▎ | 7184/22095 [11:59:42<15:53:03, 3.84s/it] {'loss': 0.3751, 'grad_norm': 0.6505262702471224, 'learning_rate': 7.88506752553939e-06, 'epoch': 0.33} 33%|███▎ | 7184/22095 [11:59:42<15:53:03, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 33%|███▎ | 7185/22095 [11:59:51<22:44:45, 5.49s/it] {'loss': 0.4744, 'grad_norm': 0.35956124288248265, 'learning_rate': 7.88446889258743e-06, 'epoch': 0.33} 33%|███▎ | 7185/22095 [11:59:51<22:44:45, 5.49s/it] 33%|███▎ | 7186/22095 [11:59:55<20:51:35, 5.04s/it] {'loss': 0.3577, 'grad_norm': 0.6126826293591199, 'learning_rate': 7.883870197657094e-06, 'epoch': 0.33} 33%|███▎ | 7186/22095 [11:59:55<20:51:35, 5.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96610 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41957 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54025 > 40960). 
Running this sequence through the model will result in indexing errors 33%|███▎ | 7187/22095 [11:59:59<19:35:31, 4.73s/it] {'loss': 0.3959, 'grad_norm': 0.6808837015295135, 'learning_rate': 7.883271440761241e-06, 'epoch': 0.33} 33%|███▎ | 7187/22095 [11:59:59<19:35:31, 4.73s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38944.png 2025-08-28 03:57:58.133232 load time: 1902.12 ms 33%|███▎ | 7188/22095 [12:00:03<17:46:58, 4.29s/it] {'loss': 0.3182, 'grad_norm': 0.726997546831792, 'learning_rate': 7.882672621912742e-06, 'epoch': 0.33} 33%|███▎ | 7188/22095 [12:00:03<17:46:58, 4.29s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [406, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8393964 in VC:s3://internvl-moe-sft-data/. Exception: Image size [406, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 60799, 'image': 'vrdu_table_final_2/astro-ph.EP/98e1208e-5560-4763-af50-c2b2a5c3a638.png', 'image_wh': [[406, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{ccc}\n\\multicolumn{1}{c}{\\footnotesize $^{1}$ Ref.~, $^{2}$ Ref.~, \n$^{3}$ Ref.~, $^{4}$ Ref.~, $^{5}$ Ref.~.}\n\\end{tabular}\n```"}]} 33%|███▎ | 7189/22095 [12:00:07<17:30:20, 4.23s/it] {'loss': 0.3575, 'grad_norm': 0.5954746959339828, 'learning_rate': 7.882073741124464e-06, 'epoch': 0.33} 33%|███▎ | 7189/22095 [12:00:07<17:30:20, 4.23s/it] 33%|███▎ | 7190/22095 [12:00:11<17:10:26, 4.15s/it] {'loss': 0.3882, 'grad_norm': 0.6277763190542311, 'learning_rate': 7.88147479840927e-06, 'epoch': 0.33} 33%|███▎ | 7190/22095 [12:00:11<17:10:26, 4.15s/it] 33%|███▎ | 7191/22095 [12:00:15<16:57:07, 4.09s/it] {'loss': 0.3792, 'grad_norm': 0.6181567794663823, 'learning_rate': 7.880875793780031e-06, 'epoch': 0.33} 33%|███▎ | 7191/22095 [12:00:15<16:57:07, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 33%|███▎ | 7192/22095 [12:00:24<23:06:27, 5.58s/it] {'loss': 0.4817, 'grad_norm': 0.6307714979444174, 'learning_rate': 7.880276727249623e-06, 'epoch': 0.33} 33%|███▎ | 7192/22095 [12:00:24<23:06:27, 5.58s/it]VC:s3://internvl2/datasets/MMMUDataset/MMMU/Agriculture/test_268_image_1.png 2025-08-28 03:58:23.624275 load time: 1425.48 ms 33%|███▎ | 7193/22095 [12:00:28<21:13:23, 5.13s/it] {'loss': 0.3931, 'grad_norm': 0.747922164182171, 'learning_rate': 7.879677598830913e-06, 'epoch': 0.33} 33%|███▎ | 7193/22095 [12:00:28<21:13:23, 5.13s/it] 33%|███▎ | 7194/22095 [12:00:32<20:44:08, 5.01s/it] {'loss': 0.3458, 'grad_norm': 0.6431937919879824, 'learning_rate': 7.879078408536774e-06, 'epoch': 0.33} 33%|███▎ | 7194/22095 [12:00:33<20:44:08, 5.01s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 33%|███▎ | 7195/22095 [12:00:42<26:18:16, 6.36s/it] {'loss': 0.4914, 'grad_norm': 0.3977026758300841, 'learning_rate': 7.878479156380085e-06, 'epoch': 0.33} 33%|███▎ | 7195/22095 [12:00:42<26:18:16, 6.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46129 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99857 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54282 > 40960). Running this sequence through the model will result in indexing errors 33%|███▎ | 7196/22095 [12:00:46<23:21:55, 5.65s/it] {'loss': 0.3708, 'grad_norm': 0.682964630211364, 'learning_rate': 7.877879842373718e-06, 'epoch': 0.33} 33%|███▎ | 7196/22095 [12:00:46<23:21:55, 5.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8960199 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11034, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 33%|███▎ | 7197/22095 [12:00:50<21:27:17, 5.18s/it] {'loss': 0.3418, 'grad_norm': 0.6231524576095671, 'learning_rate': 7.877280466530552e-06, 'epoch': 0.33} 33%|███▎ | 7197/22095 [12:00:50<21:27:17, 5.18s/it] 33%|███▎ | 7198/22095 [12:00:54<20:00:00, 4.83s/it] {'loss': 0.3746, 'grad_norm': 0.6748437842992963, 'learning_rate': 7.876681028863464e-06, 'epoch': 0.33} 33%|███▎ | 7198/22095 [12:00:54<20:00:00, 4.83s/it] 33%|███▎ | 7199/22095 [12:00:58<18:17:10, 4.42s/it] {'loss': 0.3588, 'grad_norm': 0.6550830959993491, 'learning_rate': 7.876081529385338e-06, 'epoch': 0.33} 33%|███▎ | 7199/22095 [12:00:58<18:17:10, 4.42s/it] 33%|███▎ | 7200/22095 [12:01:00<16:19:57, 3.95s/it] {'loss': 0.3705, 'grad_norm': 0.6477789521912153, 'learning_rate': 7.875481968109052e-06, 'epoch': 0.33} 33%|███▎ | 7200/22095 [12:01:00<16:19:57, 3.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 33%|███▎ | 7201/22095 [12:01:05<16:44:28, 4.05s/it] {'loss': 0.3785, 'grad_norm': 0.6372087242778248, 'learning_rate': 7.874882345047491e-06, 'epoch': 0.33} 33%|███▎ | 7201/22095 [12:01:05<16:44:28, 4.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47859 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47515 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96678 > 40960). 
Running this sequence through the model will result in indexing errors 33%|███▎ | 7202/22095 [12:01:08<16:13:54, 3.92s/it] {'loss': 0.3595, 'grad_norm': 0.655494204670402, 'learning_rate': 7.874282660213537e-06, 'epoch': 0.33} 33%|███▎ | 7202/22095 [12:01:08<16:13:54, 3.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 33%|███▎ | 7203/22095 [12:01:11<14:48:05, 3.58s/it] {'loss': 0.3709, 'grad_norm': 0.6214300206617079, 'learning_rate': 7.873682913620077e-06, 'epoch': 0.33} 33%|███▎ | 7203/22095 [12:01:11<14:48:05, 3.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047169 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 12cm\nB. 2cm\nC. 4cm\nD. 
6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 33%|███▎ | 7204/22095 [12:01:21<22:18:51, 5.39s/it] {'loss': 0.4838, 'grad_norm': 0.7395486618414999, 'learning_rate': 7.873083105279996e-06, 'epoch': 0.33} 33%|███▎ | 7204/22095 [12:01:21<22:18:51, 5.39s/it] 33%|███▎ | 7205/22095 [12:01:24<19:37:48, 4.75s/it] {'loss': 0.34, 'grad_norm': 0.6643510155786785, 'learning_rate': 7.872483235206184e-06, 'epoch': 0.33} 33%|███▎ | 7205/22095 [12:01:24<19:37:48, 4.75s/it] 33%|███▎ | 7206/22095 [12:01:28<18:28:45, 4.47s/it] {'loss': 0.3432, 'grad_norm': 0.7851872919090189, 'learning_rate': 7.87188330341153e-06, 'epoch': 0.33} 33%|███▎ | 7206/22095 [12:01:28<18:28:45, 4.47s/it] 33%|███▎ | 7207/22095 [12:01:31<17:10:40, 4.15s/it] {'loss': 0.3881, 'grad_norm': 0.6122471370986199, 'learning_rate': 7.871283309908922e-06, 'epoch': 0.33} 33%|███▎ | 7207/22095 [12:01:31<17:10:40, 4.15s/it] 33%|███▎ | 7208/22095 [12:01:34<16:00:28, 3.87s/it] {'loss': 0.4115, 'grad_norm': 0.654839829130589, 'learning_rate': 7.870683254711255e-06, 'epoch': 0.33} 33%|███▎ | 7208/22095 [12:01:34<16:00:28, 3.87s/it] 33%|███▎ | 7209/22095 [12:01:37<14:59:29, 3.63s/it] {'loss': 0.3534, 'grad_norm': 0.632937065166646, 'learning_rate': 7.870083137831423e-06, 'epoch': 0.33} 33%|███▎ | 7209/22095 [12:01:37<14:59:29, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 33%|███▎ | 7210/22095 [12:01:47<22:44:09, 5.50s/it] {'loss': 0.4862, 'grad_norm': 0.3925737232393038, 'learning_rate': 7.869482959282318e-06, 'epoch': 0.33} 33%|███▎ | 7210/22095 [12:01:47<22:44:09, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85295 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51536 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86351 > 40960; 47759 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7211/22095 [12:01:50<19:44:45, 4.78s/it] {'loss': 0.3442, 'grad_norm': 0.6745670011729434, 'learning_rate': 7.868882719076838e-06, 'epoch': 0.33}
33%|███▎ | 7212/22095 [12:01:54<18:07:00, 4.38s/it] {'loss': 0.3888, 'grad_norm': 0.6684057044920896, 'learning_rate': 7.868282417227877e-06, 'epoch': 0.33}
33%|███▎ | 7213/22095 [12:01:58<17:17:47, 4.18s/it] {'loss': 0.4021, 'grad_norm': 0.6623891104338104, 'learning_rate': 7.867682053748338e-06, 'epoch': 0.33}
33%|███▎ | 7214/22095 [12:02:01<16:52:55, 4.08s/it] {'loss': 0.3948, 'grad_norm': 0.5684645700583332, 'learning_rate': 7.86708162865112e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250616/windows_paste/images/stata/20250520_101919_21/images/before_screenshot_314_id_34_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 04:00:00.739543 load time: 1176.68 ms
33%|███▎ | 7215/22095 [12:02:09<21:13:42, 5.14s/it] {'loss': 0.4923, 'grad_norm': 0.38853354414543634, 'learning_rate': 7.866481141949123e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (78851 > 40960; 98131 > 40960; 125215 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7216/22095 [12:02:12<18:54:15, 4.57s/it] {'loss': 0.4039, 'grad_norm': 0.6899635765060905, 'learning_rate': 7.86588059365525e-06, 'epoch': 0.33}
33%|███▎ | 7217/22095 [12:02:17<18:34:31, 4.49s/it] {'loss': 0.403, 'grad_norm': 0.637953863348367, 'learning_rate': 7.865279983782402e-06, 'epoch': 0.33}
33%|███▎ | 7218/22095 [12:02:20<16:59:12, 4.11s/it] {'loss': 0.419, 'grad_norm': 0.6447741093066474, 'learning_rate': 7.864679312343491e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7219/22095 [12:02:30<24:01:47, 5.82s/it] {'loss': 0.4769, 'grad_norm': 0.3132402057909027, 'learning_rate': 7.864078579351418e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (85983 > 40960; 44132 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7220/22095 [12:02:34<21:48:46, 5.28s/it] {'loss': 0.3385, 'grad_norm': 0.6410307435099741, 'learning_rate': 7.863477784819091e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [370, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8472721 in VC:s3://internvl-moe-sft-data/. Exception: Image size [370, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 82340, 'image': 'vrdu_texteq/astro-ph.CO/416de010-88fc-43e3-a3ec-845ba5e703b1.png', 'image_wh': [[370, 23]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。 (Please recognize the text in the image and convert it to machine-readable format.)'}, {'from': 'gpt', 'value': 'with $r$ a radial coordinate and'}]}
33%|███▎ | 7221/22095 [12:02:37<19:37:03, 4.75s/it] {'loss': 0.3662, 'grad_norm': 0.8084080593940757, 'learning_rate': 7.862876928759424e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (52675 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7222/22095 [12:02:40<17:24:54, 4.22s/it] {'loss': 0.3368, 'grad_norm': 0.6242179182343427, 'learning_rate': 7.862276011185323e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (51195 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7223/22095 [12:02:43<15:46:56, 3.82s/it] {'loss': 0.3591, 'grad_norm': 0.6506736147066804, 'learning_rate': 7.8616750321097e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7224/22095 [12:02:53<23:15:02, 5.63s/it] {'loss': 0.5008, 'grad_norm': 0.3743456101413249, 'learning_rate': 7.861073991545472e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [248, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8490170 in VC:s3://internvl-moe-sft-data/. Exception: Image size [248, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12374, 'image': 'vrdu_texteq/astro-ph.CO/c8b268a4-78c6-4042-b879-5580b1e6f2c4.png', 'image_wh': [[248, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $\\Omega_\\Lambda \\equiv 1 - \\Omega_{\\rm m}$.'}]}
33%|███▎ | 7225/22095 [12:02:57<20:58:48, 5.08s/it] {'loss': 0.3881, 'grad_norm': 0.8322622971337075, 'learning_rate': 7.86047288950555e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7226/22095 [12:03:04<23:10:06, 5.61s/it] {'loss': 0.493, 'grad_norm': 0.42578988571435983, 'learning_rate': 7.859871726002852e-06, 'epoch': 0.33}
33%|███▎ | 7227/22095 [12:03:08<21:31:40, 5.21s/it] {'loss': 0.3765, 'grad_norm': 0.6491635693186513, 'learning_rate': 7.859270501050292e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (58332 > 40960; 108985 > 40960; 43604 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7228/22095 [12:03:17<25:56:41, 6.28s/it] {'loss': 0.4823, 'grad_norm': 0.2764549467599382, 'learning_rate': 7.858669214660792e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7229/22095 [12:03:20<22:21:36, 5.41s/it] {'loss': 0.351, 'grad_norm': 0.7046311928792779, 'learning_rate': 7.85806786684727e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7230/22095 [12:03:30<27:47:17, 6.73s/it] {'loss': 0.4904, 'grad_norm': 0.3321521003907507, 'learning_rate': 7.857466457622647e-06, 'epoch': 0.33}
33%|███▎ | 7231/22095 [12:03:33<23:50:57, 5.78s/it] {'loss': 0.3917, 'grad_norm': 0.6566847794699036, 'learning_rate': 7.856864986999845e-06, 'epoch': 0.33}
33%|███▎ | 7232/22095 [12:03:37<20:59:08, 5.08s/it] {'loss': 0.3318, 'grad_norm': 0.629014744960846, 'learning_rate': 7.856263454991791e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (68364 > 40960; 132828 > 40960; 60551 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7233/22095 [12:03:40<18:26:12, 4.47s/it] {'loss': 0.3438, 'grad_norm': 0.6085911810930542, 'learning_rate': 7.855661861611406e-06, 'epoch': 0.33}
33%|███▎ | 7234/22095 [12:03:43<17:02:23, 4.13s/it] {'loss': 0.419, 'grad_norm': 0.6578764127823246, 'learning_rate': 7.855060206871618e-06, 'epoch': 0.33}
33%|███▎ | 7235/22095 [12:03:46<15:34:37, 3.77s/it] {'loss': 0.3811, 'grad_norm': 0.6536238269200988, 'learning_rate': 7.854458490785354e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7236/22095 [12:03:56<22:39:10, 5.49s/it] {'loss': 0.4837, 'grad_norm': 0.381547503313166, 'learning_rate': 7.853856713365547e-06, 'epoch': 0.33}
33%|███▎ | 7237/22095 [12:03:59<20:10:00, 4.89s/it] {'loss': 0.3384, 'grad_norm': 0.7208941435848779, 'learning_rate': 7.853254874625122e-06, 'epoch': 0.33}
33%|███▎ | 7238/22095 [12:04:03<18:59:28, 4.60s/it] {'loss': 0.3551, 'grad_norm': 0.6808342916905882, 'learning_rate': 7.852652974577012e-06, 'epoch': 0.33}
33%|███▎ | 7239/22095 [12:04:06<17:17:24, 4.19s/it] {'loss': 0.375, 'grad_norm': 0.7001374804992242, 'learning_rate': 7.852051013234153e-06, 'epoch': 0.33}
33%|███▎ | 7240/22095 [12:04:09<15:39:22, 3.79s/it] {'loss': 0.3435, 'grad_norm': 0.650543073012321, 'learning_rate': 7.851448990609476e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7241/22095 [12:04:15<18:48:18, 4.56s/it] {'loss': 0.4859, 'grad_norm': 0.34818232252070125, 'learning_rate': 7.850846906715917e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (47017 > 40960; 93217 > 40960; 119331 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7242/22095 [12:04:19<17:14:18, 4.18s/it] {'loss': 0.35, 'grad_norm': 0.6569965955754113, 'learning_rate': 7.850244761566415e-06, 'epoch': 0.33}
33%|███▎ | 7243/22095 [12:04:23<17:00:36, 4.12s/it] {'loss': 0.3492, 'grad_norm': 0.6249546782859293, 'learning_rate': 7.849642555173907e-06, 'epoch': 0.33}
33%|███▎ | 7244/22095 [12:04:26<16:23:40, 3.97s/it] {'loss': 0.3906, 'grad_norm': 0.7960260771660157, 'learning_rate': 7.849040287551331e-06, 'epoch': 0.33}
33%|███▎ | 7245/22095 [12:04:29<15:03:56, 3.65s/it] {'loss': 0.3752, 'grad_norm': 0.6504881285244849, 'learning_rate': 7.848437958711631e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7246/22095 [12:04:32<14:15:40, 3.46s/it] {'loss': 0.341, 'grad_norm': 0.6562907636624832, 'learning_rate': 7.847835568667746e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6777484 in VC:s3://gui-agent/data_20250623/windows_augment/images.
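The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries above are raised from `_get_item` in `data_qwen_2.py` (line 1335). That source is not reproduced in this log, so the following is only a minimal sketch of the kind of minimum-dimension check that would produce these messages; the names `check_image_size` and `MIN_SIZE` are hypothetical, not from the actual training code.

```python
# Hypothetical sketch of a minimum-image-size check like the one raising the
# ValueError entries in this log. Not the actual data_qwen_2.py implementation.
MIN_SIZE = 28

def check_image_size(size, min_size=MIN_SIZE):
    """Reject an image (or crop box) whose width or height is below min_size.

    `size` mirrors the log format: the first two entries are width and height.
    Returns `size` unchanged when the check passes.
    """
    width, height = size[0], size[1]
    if width < min_size or height < min_size:
        raise ValueError(
            f"Image size {size} is too small. Minimum size is {min_size}."
        )
    return size

# A 370x23 image fails because its height (23) is below 28, matching the
# log entry for sample 8472721:
try:
    check_image_size([370, 23, 100, 100])
except ValueError as e:
    print(e)  # Image size [370, 23, 100, 100] is too small. Minimum size is 28.
```

In the log, the dataset wrapper catches this error, reports the problematic sample, and retries with a different one ("[Try #0] Failed to fetch sample ..."), so a single undersized image does not abort training.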
Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250509_125727_1/images/before_screenshot_1_id_121_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nHover over 'Edit customization interface' to view a description of this functionality."}, {'from': 'gpt', 'value': '\nclick(x=0.4656, y=0.7966)\n'}], 'width': 3600, 'height': 2338}
33%|███▎ | 7247/22095 [12:04:37<15:17:17, 3.71s/it] {'loss': 0.4125, 'grad_norm': 0.663970148382978, 'learning_rate': 7.847233117432623e-06, 'epoch': 0.33}
33%|███▎ | 7248/22095 [12:04:40<14:57:38, 3.63s/it] {'loss': 0.3623, 'grad_norm': 0.611816963228363, 'learning_rate': 7.846630605019204e-06, 'epoch': 0.33}
33%|███▎ | 7249/22095 [12:04:44<15:37:47, 3.79s/it] {'loss': 0.3735, 'grad_norm': 0.8418055709857473, 'learning_rate': 7.846028031440436e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (56908 > 40960; 60955 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7250/22095 [12:04:54<22:32:15, 5.47s/it] {'loss': 0.4738, 'grad_norm': 0.4046248697144449, 'learning_rate': 7.845425396709266e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7251/22095 [12:04:57<20:21:55, 4.94s/it] {'loss': 0.3467, 'grad_norm': 0.6705514103940412, 'learning_rate': 7.844822700838644e-06, 'epoch': 0.33}
33%|███▎ | 7252/22095 [12:05:01<18:30:26, 4.49s/it] {'loss': 0.388, 'grad_norm': 0.663416672149569, 'learning_rate': 7.84421994384152e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (44872 > 40960; 122972 > 40960; 41347 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7253/22095 [12:05:04<17:20:59, 4.21s/it] {'loss': 0.3884, 'grad_norm': 0.651365532004636, 'learning_rate': 7.843617125730842e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (48181 > 40960; 104014 > 40960; 142897 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7254/22095 [12:05:08<16:45:23, 4.06s/it] {'loss': 0.3548, 'grad_norm': 0.6076682619566085, 'learning_rate': 7.843014246519569e-06, 'epoch': 0.33}
33%|███▎ | 7255/22095 [12:05:11<15:32:04, 3.77s/it] {'loss': 0.3634, 'grad_norm': 0.6729372693949679, 'learning_rate': 7.84241130622065e-06, 'epoch': 0.33}
33%|███▎ | 7256/22095 [12:05:14<14:40:55, 3.56s/it] {'loss': 0.3481, 'grad_norm': 0.6264003233718723, 'learning_rate': 7.841808304847041e-06, 'epoch': 0.33}
33%|███▎ | 7257/22095 [12:05:17<13:57:59, 3.39s/it] {'loss': 0.3632, 'grad_norm': 0.8129037846541436, 'learning_rate': 7.841205242411701e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7258/22095 [12:05:24<18:20:12, 4.45s/it] {'loss': 0.4537, 'grad_norm': 0.37059103009216027, 'learning_rate': 7.840602118927584e-06, 'epoch': 0.33}
33%|███▎ | 7259/22095 [12:05:28<17:19:25, 4.20s/it] {'loss': 0.3527, 'grad_norm': 0.6352263729186042, 'learning_rate': 7.839998934407652e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7260/22095 [12:05:31<16:18:35, 3.96s/it] {'loss': 0.3522, 'grad_norm': 0.6370304748624506, 'learning_rate': 7.839395688864868e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7261/22095 [12:05:39<21:40:29, 5.26s/it] {'loss': 0.4857, 'grad_norm': 0.33410480912428436, 'learning_rate': 7.83879238231219e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (75788 > 40960; 95029 > 40960; 85174 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7262/22095 [12:05:47<24:27:08, 5.93s/it] {'loss': 0.5234, 'grad_norm': 0.39715243089542934, 'learning_rate': 7.838189014762582e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (41000 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7263/22095 [12:05:50<21:06:42, 5.12s/it] {'loss': 0.3386, 'grad_norm': 0.6725589605603078, 'learning_rate': 7.83758558622901e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (52393 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42644 > 40960) for 4 sample(s). Truncating to 22987 with 3 samples.
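The `Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42644 > 40960) for 4 sample(s). Truncating to 22987 with 3 samples.` message above suggests that when a pack of samples exceeds the model's 40960-token limit, whole samples are dropped until the pack fits. The actual packing code is not shown in this log, so the following is only an illustrative sketch under that assumption; `truncate_pack` and the example lengths are hypothetical and do not reproduce the exact counts in the message.

```python
# Hypothetical sketch of sample-level truncation for a packed sequence:
# drop whole samples from the end of the pack until the total token count
# fits within the model's maximum. Illustrative only, not the actual code.
MAX_LEN = 40960

def truncate_pack(sample_lengths, max_len=MAX_LEN):
    """Drop trailing samples until the packed length fits within max_len.

    Returns (kept_lengths, total_length).
    """
    kept = list(sample_lengths)
    while kept and sum(kept) > max_len:
        kept.pop()  # discard the last sample in the pack
    return kept, sum(kept)

# Four samples totalling 42644 tokens exceed the limit; dropping the last
# sample leaves three samples that fit.
kept, total = truncate_pack([12000, 11000, 10000, 9644])
```

Truncating at sample boundaries, rather than cutting tokens mid-sample, keeps every surviving conversation intact, which matters for multimodal samples where image placeholder tokens must stay aligned with their images.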
33%|███▎ | 7264/22095 [12:05:53<18:56:14, 4.60s/it] {'loss': 0.3806, 'grad_norm': 0.7081106837258482, 'learning_rate': 7.836982096724438e-06, 'epoch': 0.33}
33%|███▎ | 7265/22095 [12:05:58<18:42:40, 4.54s/it] {'loss': 0.3394, 'grad_norm': 0.6134248938641275, 'learning_rate': 7.836378546261834e-06, 'epoch': 0.33}
33%|███▎ | 7266/22095 [12:06:01<16:45:05, 4.07s/it] {'loss': 0.3418, 'grad_norm': 0.6459265887108824, 'learning_rate': 7.835774934854166e-06, 'epoch': 0.33}
33%|███▎ | 7267/22095 [12:06:04<15:47:31, 3.83s/it] {'loss': 0.3263, 'grad_norm': 0.6242275569284673, 'learning_rate': 7.835171262514402e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7268/22095 [12:06:09<16:28:24, 4.00s/it] {'loss': 0.3335, 'grad_norm': 0.6192643139391333, 'learning_rate': 7.834567529255519e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7269/22095 [12:06:12<15:48:59, 3.84s/it] {'loss': 0.3673, 'grad_norm': 0.7266708113236864, 'learning_rate': 7.833963735090484e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7270/22095 [12:06:22<23:19:36, 5.66s/it] {'loss': 0.4821, 'grad_norm': 0.42644557495266, 'learning_rate': 7.833359880032272e-06, 'epoch': 0.33}
33%|███▎ | 7271/22095 [12:06:32<28:13:05, 6.85s/it] {'loss': 0.51, 'grad_norm': 0.37821061706041686, 'learning_rate': 7.832755964093859e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045961 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为() (As shown in the figure, C is a point on segment AB, D is the midpoint of segment BC, AB=20, AD=14; find the length of AC.)\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7272/22095 [12:06:41<31:55:28, 7.75s/it] {'loss': 0.5135, 'grad_norm': 0.3157916385785731, 'learning_rate': 7.832151987288219e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
33%|███▎ | 7273/22095 [12:06:45<27:21:55, 6.65s/it] {'loss': 0.3712, 'grad_norm': 0.7308948326064201, 'learning_rate': 7.83154794962833e-06, 'epoch': 0.33}
33%|███▎ | 7274/22095 [12:06:50<24:42:53, 6.00s/it] {'loss': 0.3813, 'grad_norm': 0.6851166754739852, 'learning_rate': 7.830943851127175e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367286 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34034, 'image': 'vrdu_table_final_2/astro-ph.CO/37d3abd3-28f7-451c-9ded-cdf010fc4900.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
33%|███▎ | 7275/22095 [12:06:53<20:58:31, 5.10s/it] {'loss': 0.3423, 'grad_norm': 0.6385881894664626, 'learning_rate': 7.830339691797727e-06, 'epoch': 0.33}
33%|███▎ | 7276/22095 [12:06:56<18:16:34, 4.44s/it] {'loss': 0.3602, 'grad_norm': 0.9217747573027575, 'learning_rate': 7.829735471652978e-06, 'epoch': 0.33}
33%|███▎ | 7277/22095 [12:06:59<17:14:59, 4.19s/it] {'loss': 0.371, 'grad_norm': 0.6708483049494212, 'learning_rate': 7.8291311907059e-06, 'epoch': 0.33}
33%|███▎ | 7278/22095 [12:07:02<15:39:10, 3.80s/it] {'loss': 0.4031, 'grad_norm': 0.7331380798086038, 'learning_rate': 7.828526848969482e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (60636 > 40960; 79263 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7279/22095 [12:07:05<14:39:07, 3.56s/it] {'loss': 0.3531, 'grad_norm': 0.7548536600065007, 'learning_rate': 7.827922446456711e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (51180 > 40960; 112049 > 40960; 54560 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (53105 > 40960) for 4 sample(s). Truncating to 7276 with 2 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (77347 > 40960; 46855 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7280/22095 [12:07:08<13:48:37, 3.36s/it] {'loss': 0.376, 'grad_norm': 0.6395017247716274, 'learning_rate': 7.827317983180571e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (104945 > 40960; 41392 > 40960; 47118 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7281/22095 [12:07:11<13:27:50, 3.27s/it] {'loss': 0.3621, 'grad_norm': 0.688170077140715, 'learning_rate': 7.826713459154051e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7282/22095 [12:07:15<13:28:43, 3.28s/it] {'loss': 0.3872, 'grad_norm': 0.6798231401394088, 'learning_rate': 7.826108874390141e-06, 'epoch': 0.33}
33%|███▎ | 7283/22095 [12:07:18<13:32:22, 3.29s/it] {'loss': 0.358, 'grad_norm': 0.7378493099029014, 'learning_rate': 7.82550422890183e-06, 'epoch': 0.33}
33%|███▎ | 7284/22095 [12:07:21<13:50:43, 3.37s/it] {'loss': 0.3741, 'grad_norm': 0.6859895137902425, 'learning_rate': 7.824899522702112e-06, 'epoch': 0.33}
33%|███▎ | 7285/22095 [12:07:25<14:25:23, 3.51s/it] {'loss': 0.3697, 'grad_norm': 0.658983030067839, 'learning_rate': 7.824294755803978e-06, 'epoch': 0.33}
33%|███▎ | 7286/22095 [12:07:28<13:45:21, 3.34s/it] {'loss': 0.3796, 'grad_norm': 0.8659072243183273, 'learning_rate': 7.823689928220424e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7287/22095 [12:07:34<16:59:19, 4.13s/it] {'loss': 0.509, 'grad_norm': 0.8688001597626949, 'learning_rate': 7.823085039964446e-06, 'epoch': 0.33}
33%|███▎ | 7288/22095 [12:07:45<25:49:19, 6.28s/it] {'loss': 0.4867, 'grad_norm': 0.6597518755134283, 'learning_rate': 7.82248009104904e-06, 'epoch': 0.33}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_1/images/step_0.png 2025-08-28 04:05:44.272908 load time: 1090.3 ms
33%|███▎ | 7289/22095 [12:07:53<27:25:17, 6.67s/it] {'loss': 0.496, 'grad_norm': 0.3427715840543005, 'learning_rate': 7.821875081487208e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
33%|███▎ | 7290/22095 [12:07:56<23:03:57, 5.61s/it] {'loss': 0.3074, 'grad_norm': 0.6649470951982919, 'learning_rate': 7.821270011291946e-06, 'epoch': 0.33}
33%|███▎ | 7291/22095 [12:08:00<20:33:58, 5.00s/it] {'loss': 0.3383, 'grad_norm': 0.7713963361882481, 'learning_rate': 7.820664880476257e-06, 'epoch': 0.33}
33%|███▎ | 7292/22095 [12:08:04<19:32:14, 4.75s/it] {'loss': 0.3775, 'grad_norm': 0.6840185544238141, 'learning_rate': 7.820059689053142e-06, 'epoch': 0.33}
33%|███▎ | 7293/22095 [12:08:07<17:57:03, 4.37s/it] {'loss': 0.3748, 'grad_norm': 0.7067990947071694, 'learning_rate': 7.819454437035605e-06, 'epoch': 0.33}
33%|███▎ | 7294/22095 [12:08:11<16:23:16, 3.99s/it] {'loss': 0.3773, 'grad_norm': 0.6848570477143894, 'learning_rate': 7.818849124436651e-06, 'epoch': 0.33}
33%|███▎ | 7295/22095 [12:08:14<15:11:46, 3.70s/it] {'loss': 0.4016, 'grad_norm': 0.6522391481043741, 'learning_rate': 7.818243751269288e-06, 'epoch': 0.33}
33%|███▎ | 7296/22095 [12:08:17<14:51:54, 3.62s/it] {'loss': 0.3688, 'grad_norm': 0.6509650533075849, 'learning_rate': 7.817638317546521e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (54795 > 40960; 71728 > 40960; 49206 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7297/22095 [12:08:20<13:48:56, 3.36s/it] {'loss': 0.3385, 'grad_norm': 0.785981076357508, 'learning_rate': 7.817032823281362e-06, 'epoch': 0.33}
33%|███▎ | 7298/22095 [12:08:23<13:29:47, 3.28s/it] {'loss': 0.3952, 'grad_norm': 0.7189844060702856, 'learning_rate': 7.816427268486819e-06, 'epoch': 0.33}
33%|███▎ | 7299/22095 [12:08:27<14:09:57, 3.45s/it] {'loss': 0.3857, 'grad_norm': 0.6982142130746813, 'learning_rate': 7.815821653175903e-06, 'epoch': 0.33}
33%|███▎ | 7300/22095 [12:08:30<13:38:35, 3.32s/it] {'loss': 0.3254, 'grad_norm': 0.6773609322645711, 'learning_rate': 7.815215977361628e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7301/22095 [12:08:34<14:35:45, 3.55s/it] {'loss': 0.3923, 'grad_norm': 0.7337843174717779, 'learning_rate': 7.814610241057009e-06, 'epoch': 0.33}
33%|███▎ | 7302/22095 [12:08:37<14:41:34, 3.58s/it] {'loss': 0.3394, 'grad_norm': 0.617776076989984, 'learning_rate': 7.814004444275058e-06, 'epoch': 0.33}
33%|███▎ | 7303/22095 [12:08:41<14:07:51, 3.44s/it] {'loss': 0.3687, 'grad_norm': 0.6361374966992775, 'learning_rate': 7.813398587028798e-06, 'epoch': 0.33}
33%|███▎ | 7304/22095 [12:08:44<13:58:11, 3.40s/it] {'loss': 0.3844, 'grad_norm': 0.6348279118574923, 'learning_rate': 7.81279266933124e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49291 > 40960; 52935 > 40960; 54761 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7305/22095 [12:08:51<18:21:21, 4.47s/it] {'loss': 0.5487, 'grad_norm': 2.8535305256216423, 'learning_rate': 7.812186691195407e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7306/22095 [12:08:55<17:28:14, 4.25s/it] {'loss': 0.379, 'grad_norm': 0.7068173549230775, 'learning_rate': 7.811580652634319e-06, 'epoch': 0.33}
33%|███▎ | 7307/22095 [12:08:58<16:07:52, 3.93s/it] {'loss': 0.375, 'grad_norm': 0.690644774893463, 'learning_rate': 7.810974553660998e-06, 'epoch': 0.33}
33%|███▎ | 7308/22095 [12:09:01<15:26:13, 3.76s/it] {'loss': 0.3685, 'grad_norm': 0.663678126303351, 'learning_rate': 7.810368394288468e-06, 'epoch': 0.33}
33%|███▎ | 7309/22095 [12:09:05<15:45:25, 3.84s/it] {'loss': 0.3729, 'grad_norm': 0.740229603627771, 'learning_rate': 7.809762174529752e-06, 'epoch': 0.33}
33%|███▎ | 7310/22095 [12:09:08<14:56:13, 3.64s/it] {'loss': 0.3873, 'grad_norm': 0.7410494585371324, 'learning_rate': 7.809155894397876e-06, 'epoch': 0.33}
33%|███▎ | 7311/22095 [12:09:12<15:07:26, 3.68s/it] {'loss': 0.348, 'grad_norm': 0.6234515829771916, 'learning_rate': 7.808549553905867e-06, 'epoch': 0.33}
33%|███▎ | 7312/22095 [12:09:15<14:33:57, 3.55s/it] {'loss': 0.3357, 'grad_norm': 0.6860869376405904, 'learning_rate': 7.807943153066754e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7313/22095 [12:09:25<21:51:47, 5.32s/it] {'loss': 0.5247, 'grad_norm': 1.2962262711218264, 'learning_rate': 7.807336691893568e-06, 'epoch': 0.33}
33%|███▎ | 7314/22095 [12:09:29<20:03:54, 4.89s/it] {'loss': 0.3651, 'grad_norm': 0.6934296670610857, 'learning_rate': 7.806730170399337e-06, 'epoch': 0.33}
33%|███▎ | 7315/22095 [12:09:32<17:47:58, 4.34s/it] {'loss': 0.3592, 'grad_norm': 0.7025851521873115, 'learning_rate': 7.806123588597094e-06, 'epoch': 0.33}
33%|███▎ | 7316/22095 [12:09:35<15:55:30, 3.88s/it] {'loss': 0.3298, 'grad_norm': 0.768101167716597, 'learning_rate': 7.805516946499876e-06, 'epoch': 0.33}
33%|███▎ | 7317/22095 [12:09:38<15:14:55, 3.71s/it] {'loss': 0.3296, 'grad_norm': 0.6351060711723828, 'learning_rate': 7.804910244120714e-06, 'epoch': 0.33}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39078.png 2025-08-28 04:07:33.762826 load time: 1637.81
ms 33%|███▎ | 7318/22095 [12:09:41<14:32:53, 3.54s/it] {'loss': 0.3914, 'grad_norm': 0.6978681798720107, 'learning_rate': 7.804303481472645e-06, 'epoch': 0.33} 33%|███▎ | 7318/22095 [12:09:41<14:32:53, 3.54s/it] 33%|███▎ | 7319/22095 [12:09:44<13:51:36, 3.38s/it] {'loss': 0.3256, 'grad_norm': 0.6061560182864592, 'learning_rate': 7.80369665856871e-06, 'epoch': 0.33} 33%|███▎ | 7319/22095 [12:09:44<13:51:36, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47263 > 40960). Running this sequence through the model will result in indexing errors 33%|███▎ | 7320/22095 [12:09:47<13:18:29, 3.24s/it] {'loss': 0.3703, 'grad_norm': 0.6335131386121605, 'learning_rate': 7.80308977542194e-06, 'epoch': 0.33} 33%|███▎ | 7320/22095 [12:09:47<13:18:29, 3.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42968 > 40960). Running this sequence through the model will result in indexing errors 33%|███▎ | 7321/22095 [12:09:51<13:47:31, 3.36s/it] {'loss': 0.3467, 'grad_norm': 1.3965455827885276, 'learning_rate': 7.802482832045383e-06, 'epoch': 0.33} 33%|███▎ | 7321/22095 [12:09:51<13:47:31, 3.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55419 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88537 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48248 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84714 > 40960). Running this sequence through the model will result in indexing errors 33%|███▎ | 7322/22095 [12:09:54<13:36:39, 3.32s/it] {'loss': 0.4098, 'grad_norm': 0.6540973659645941, 'learning_rate': 7.801875828452077e-06, 'epoch': 0.33} 33%|███▎ | 7322/22095 [12:09:54<13:36:39, 3.32s/it] 33%|███▎ | 7323/22095 [12:09:57<13:45:41, 3.35s/it] {'loss': 0.3413, 'grad_norm': 0.5751699292719178, 'learning_rate': 7.801268764655063e-06, 'epoch': 0.33} 33%|███▎ | 7323/22095 [12:09:57<13:45:41, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42052 > 40960). Running this sequence through the model will result in indexing errors 33%|███▎ | 7324/22095 [12:10:00<13:18:04, 3.24s/it] {'loss': 0.3027, 'grad_norm': 0.6726602987568291, 'learning_rate': 7.800661640667388e-06, 'epoch': 0.33} 33%|███▎ | 7324/22095 [12:10:00<13:18:04, 3.24s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8940291 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63444, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nA. 2\nB. 3\nC. 4\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 33%|███▎ | 7325/22095 [12:10:03<12:42:37, 3.10s/it] {'loss': 0.3328, 'grad_norm': 0.6510373564214473, 'learning_rate': 7.800054456502096e-06, 'epoch': 0.33} 33%|███▎ | 7325/22095 [12:10:03<12:42:37, 3.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 33%|███▎ | 7326/22095 [12:10:07<13:20:37, 3.25s/it] {'loss': 0.3574, 'grad_norm': 0.6380091514174852, 'learning_rate': 7.799447212172233e-06, 'epoch': 0.33} 33%|███▎ | 7326/22095 [12:10:07<13:20:37, 3.25s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047782 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 6\nB. 8\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. 
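The recurring ValueError tracebacks above come from a minimum-size check on images in the dataset loader (`_get_item` in data_qwen_2.py); the failing geometry diagrams are only 19-26 pixels tall. The actual code is not shown in this log, so the following is only an illustrative sketch of that kind of guard, with hypothetical names (`MIN_IMAGE_SIZE`, `check_image_size`):

```python
# Illustrative sketch of a minimum-image-size guard like the one raising the
# "Image size ... is too small" errors above. The names below are hypothetical,
# not the actual identifiers in data_qwen_2.py.
MIN_IMAGE_SIZE = 28  # smallest width/height accepted, per the log message

def check_image_size(size, min_size=MIN_IMAGE_SIZE):
    """Raise if the image's reported dimensions fall below the minimum.

    `size` mirrors the logged form, e.g. [232, 24, 100, 100], where the
    first two entries appear to be the image's width and height.
    """
    width, height = size[0], size[1]
    if width < min_size or height < min_size:
        raise ValueError(
            f"Image size {size} is too small. Minimum size is {min_size}."
        )

check_image_size([1028, 40, 100, 100])      # 40 px tall: passes
try:
    check_image_size([232, 24, 100, 100])   # 24 px tall: rejected
except ValueError as exc:
    print(exc)
```

A guard like this rejects the sample before it reaches the image processor, which matches the log's behavior: the loader reports the problematic sample and retries ("[Try #0] Failed to fetch sample ...") instead of crashing the run.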
[Try #0] Failed to fetch sample 8893393 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16546, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, if segment AB = 10cm, M is the midpoint of AB, point N is on AB, and NB = 2cm, then the length of segment MN is ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
33%|███▎ | 7327/22095 [12:10:10<13:22:41, 3.26s/it] {'loss': 0.3581, 'grad_norm': 0.6531228281965532, 'learning_rate': 7.798839907690847e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7328/22095 [12:10:19<21:03:58, 5.14s/it] {'loss': 0.4774, 'grad_norm': 0.9721212229654579, 'learning_rate': 7.798232543070987e-06, 'epoch': 0.33}
33%|███▎ | 7329/22095 [12:10:24<19:57:05, 4.86s/it] {'loss': 0.3739, 'grad_norm': 0.6227626915657207, 'learning_rate': 7.797625118325705e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7330/22095 [12:10:33<25:45:48, 6.28s/it] {'loss': 0.4823, 'grad_norm': 0.7719234444001082, 'learning_rate': 7.797017633468052e-06, 'epoch': 0.33}
33%|███▎ | 7331/22095 [12:10:37<23:14:55, 5.67s/it] {'loss': 0.3712, 'grad_norm': 0.627263679418447, 'learning_rate': 7.796410088511078e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7332/22095 [12:10:45<25:21:28, 6.18s/it] {'loss': 0.4973, 'grad_norm': 0.3508190839711614, 'learning_rate': 7.79580248346784e-06, 'epoch': 0.33}
33%|███▎ | 7333/22095 [12:10:48<21:59:17, 5.36s/it] {'loss': 0.363, 'grad_norm': 0.6231007923758607, 'learning_rate': 7.795194818351395e-06, 'epoch': 0.33}
33%|███▎ | 7334/22095 [12:10:51<19:07:06, 4.66s/it] {'loss': 0.4091, 'grad_norm': 0.6697970573735087, 'learning_rate': 7.794587093174797e-06, 'epoch': 0.33}
33%|███▎ | 7335/22095 [12:10:54<17:10:46, 4.19s/it] {'loss': 0.3471, 'grad_norm': 0.6362010756444667, 'learning_rate': 7.793979307951108e-06, 'epoch': 0.33}
33%|███▎ | 7336/22095 [12:10:58<16:34:13, 4.04s/it] {'loss': 0.3645, 'grad_norm': 0.6596318056401064, 'learning_rate': 7.79337146269338e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (51706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84021 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41533 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83077 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85899 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7337/22095 [12:11:02<15:51:31, 3.87s/it] {'loss': 0.3606, 'grad_norm': 0.9210714304746968, 'learning_rate': 7.792763557414683e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (42425 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51888 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (141221 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7338/22095 [12:11:05<15:38:58, 3.82s/it] {'loss': 0.3364, 'grad_norm': 0.5987292893054342, 'learning_rate': 7.792155592128072e-06, 'epoch': 0.33}
33%|███▎ | 7339/22095 [12:11:28<39:15:22, 9.58s/it] {'loss': 0.3604, 'grad_norm': 0.6215979190882727, 'learning_rate': 7.791547566846612e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (47573 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50336 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83625 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94251 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78539 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7340/22095 [12:11:31<31:00:59, 7.57s/it] {'loss': 0.3662, 'grad_norm': 0.646836740101282, 'learning_rate': 7.79093948158337e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (126343 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100150 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97879 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7341/22095 [12:11:34<25:25:05, 6.20s/it] {'loss': 0.3607, 'grad_norm': 0.6432445331083202, 'learning_rate': 7.790331336351408e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (89813 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7342/22095 [12:11:37<21:19:52, 5.21s/it] {'loss': 0.4262, 'grad_norm': 0.8022329252854027, 'learning_rate': 7.7897231311638e-06, 'epoch': 0.33}
33%|███▎ | 7343/22095 [12:11:59<41:34:12, 10.14s/it] {'loss': 0.3422, 'grad_norm': 0.6805560295429158, 'learning_rate': 7.789114866033607e-06, 'epoch': 0.33}
33%|███▎ | 7344/22095 [12:12:25<61:47:36, 15.08s/it] {'loss': 0.4154, 'grad_norm': 0.7509513883759419, 'learning_rate': 7.788506540973902e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49119 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65441 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53982 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112787 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102024 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88435 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7345/22095 [12:12:31<50:01:49, 12.21s/it] {'loss': 0.5069, 'grad_norm': 1.1071314317950283, 'learning_rate': 7.787898155997755e-06, 'epoch': 0.33}
33%|███▎ | 7346/22095 [12:12:41<47:15:51, 11.54s/it] {'loss': 0.5124, 'grad_norm': 0.992944270441858, 'learning_rate': 7.787289711118238e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
33%|███▎ | 7347/22095 [12:12:44<37:01:30, 9.04s/it] {'loss': 0.3206, 'grad_norm': 0.7760811321913402, 'learning_rate': 7.786681206348428e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (106408 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7348/22095 [12:12:53<37:27:39, 9.14s/it] {'loss': 0.502, 'grad_norm': 0.5570280244602357, 'learning_rate': 7.786072641701397e-06, 'epoch': 0.33}
33%|███▎ | 7349/22095 [12:13:00<34:44:20, 8.48s/it] {'loss': 0.5162, 'grad_norm': 0.43038194913860595, 'learning_rate': 7.78546401719022e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (70157 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42450 > 40960).
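The "Token indices sequence length" warnings throughout this log are the tokenizer reporting samples that exceed the model's 40960-token maximum, and the "Rank 0: ... Truncating to 34351 with 3 samples" message suggests the data pipeline then drops or truncates offenders before packing a batch. The actual qwen-vl-finetune logic is not shown here, so the following is only a sketch of such a length gate under assumed names (`MAX_MODEL_LEN`, `gate_overlong`, and a stand-in `token_len` callable):

```python
# Illustrative length gate for over-long samples; a sketch, not the actual
# qwen-vl-finetune code. `token_len` stands in for a real tokenizer call,
# e.g. len(tokenizer(text).input_ids) with a HuggingFace-style tokenizer.
MAX_MODEL_LEN = 40960  # the maximum sequence length reported in the warnings

def gate_overlong(samples, token_len, max_len=MAX_MODEL_LEN):
    """Split samples into those that fit the context window and those that don't."""
    kept, dropped = [], []
    for sample in samples:
        n = token_len(sample)
        if n > max_len:
            # Mirrors the logged warning: feeding this to the model unmodified
            # would index past positions the model supports.
            dropped.append((sample, n))
        else:
            kept.append(sample)
    return kept, dropped

# Toy usage: integers play the role of pre-tokenized samples.
kept, dropped = gate_overlong([54795, 12000, 71728], token_len=lambda n: n)
```

Screening lengths up front like this avoids paying the forward-pass cost (or an indexing crash) for a sample the model could never consume whole.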
Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (49207 > 40960) for 4 sample(s). Truncating to 34351 with 3 samples.
33%|███▎ | 7350/22095 [12:13:05<29:35:54, 7.23s/it] {'loss': 0.3599, 'grad_norm': 0.7139064690910768, 'learning_rate': 7.784855332827979e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (81724 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7351/22095 [12:13:08<25:12:44, 6.16s/it] {'loss': 0.346, 'grad_norm': 0.6840052243027095, 'learning_rate': 7.784246588627747e-06, 'epoch': 0.33}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_8251.png 2025-08-28 04:11:07.003276 load time: 1026.31 ms
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240828_202304_before_screenshot.png 2025-08-28 04:11:07.003266 load time: 1062.44 ms
33%|███▎ | 7352/22095 [12:13:30<44:45:54, 10.93s/it] {'loss': 0.3728, 'grad_norm': 0.7906548268620531, 'learning_rate': 7.783637784602608e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7353/22095 [12:13:40<43:00:01, 10.50s/it] {'loss': 0.5166, 'grad_norm': 0.9588818268057699, 'learning_rate': 7.783028920765644e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [1028, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8515574 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1028, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 160878, 'image': 'vrdu_texteq/astro-ph.CO/f7d4629e-51e4-43a7-9840-65601e3f7a1b.png', 'image_wh': [[1028, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you transcribe the text visible in the image?'}, {'from': 'gpt', 'value': 'where the standard deviation and variance are taken across the $M$ noise realizations.'}]}
33%|███▎ | 7354/22095 [12:13:43<34:17:57, 8.38s/it] {'loss': 0.3538, 'grad_norm': 0.628264054881028, 'learning_rate': 7.782419997129934e-06, 'epoch': 0.33}
33%|███▎ | 7355/22095 [12:13:46<27:49:56, 6.80s/it] {'loss': 0.3691, 'grad_norm': 0.7094367104820966, 'learning_rate': 7.781811013708565e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_4/images/step_0.png 2025-08-28 04:11:43.713364 load time: 1305.25 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (73669 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66881 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67352 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78200 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90948 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7356/22095 [12:13:54<29:09:47, 7.12s/it] {'loss': 0.5178, 'grad_norm': 0.9712871371707572, 'learning_rate': 7.78120197051462e-06, 'epoch': 0.33}
33%|███▎ | 7357/22095 [12:13:57<24:26:36, 5.97s/it] {'loss': 0.3436, 'grad_norm': 0.7065811609441587, 'learning_rate': 7.780592867561187e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047552 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\nGiven: as shown in the figure, point C is the midpoint of segment AB, point D is the midpoint of segment BC, and AB = 20cm; then segment AD equals ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
33%|███▎ | 7358/22095 [12:14:08<29:41:48, 7.25s/it] {'loss': 0.5165, 'grad_norm': 0.8080745881239523, 'learning_rate': 7.779983704861354e-06, 'epoch': 0.33}
33%|███▎ | 7359/22095 [12:14:17<32:29:48, 7.94s/it] {'loss': 0.4997, 'grad_norm': 0.5864163976829028, 'learning_rate': 7.779374482428206e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (44992 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86314 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7360/22095 [12:14:21<27:24:49, 6.70s/it] {'loss': 0.3599, 'grad_norm': 0.6685332325376612, 'learning_rate': 7.77876520027484e-06, 'epoch': 0.33}
33%|███▎ | 7361/22095 [12:14:25<24:25:36, 5.97s/it] {'loss': 0.3582, 'grad_norm': 0.7441894827625023, 'learning_rate': 7.778155858414342e-06, 'epoch': 0.33}
33%|███▎ | 7362/22095 [12:15:06<66:27:57, 16.24s/it] {'loss': 0.3976, 'grad_norm': 0.733170379006083, 'learning_rate': 7.777546456859808e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (41822 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45382 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7363/22095 [12:15:27<72:56:56, 17.83s/it] {'loss': 0.3647, 'grad_norm': 0.621295126537377, 'learning_rate': 7.77693699562433e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage2/guiact-web-single/images/846da9f3-b6e9-4ab8-a27c-5fcbb8ac5854.jpg 2025-08-28 04:13:25.862571 load time: 1024.48 ms
33%|███▎ | 7364/22095 [12:15:32<56:52:47, 13.90s/it] {'loss': 0.5287, 'grad_norm': 0.6711556936745743, 'learning_rate': 7.776327474721009e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (106245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93469 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7365/22095 [12:15:53<66:18:58, 16.21s/it] {'loss': 0.3633, 'grad_norm': 0.722177606091501, 'learning_rate': 7.775717894162933e-06, 'epoch': 0.33}
33%|███▎ | 7366/22095 [12:16:16<73:38:58, 18.00s/it] {'loss': 0.365, 'grad_norm': 0.7038042521755242, 'learning_rate': 7.775108253963207e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46809 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80625 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7367/22095 [12:16:43<85:40:05, 20.94s/it] {'loss': 0.4828, 'grad_norm': 0.7542576892215929, 'learning_rate': 7.774498554134925e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250714/ubuntu/images/vs_code/942dd12a-1875-4903-8bea-c0cb726fc181/images/step_0.png 2025-08-28 04:14:42.177983 load time: 1027.66 ms
VC:s3://gui-agent/data_20250707/ubuntu/images/libreoffice_impress/1ce12327-548b-4c92-83a4-21f60dcbf895/images/step_1.png 2025-08-28 04:14:42.177808 load time: 1051.24 ms
33%|███▎ | 7368/22095 [12:17:11<94:15:13, 23.04s/it] {'loss': 0.4903, 'grad_norm': 0.622741190715478, 'learning_rate': 7.773888794691192e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 364, but got module 1
33%|███▎ | 7369/22095 [12:17:34<93:33:16, 22.87s/it] {'loss': 0.3455, 'grad_norm': 0.6829381274063849, 'learning_rate': 7.773278975645109e-06, 'epoch': 0.33}
33%|███▎ | 7370/22095 [12:17:38<70:36:22, 17.26s/it] {'loss': 0.36, 'grad_norm': 0.7782974536058487, 'learning_rate': 7.772669097009777e-06, 'epoch': 0.33}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8513228 in VC:s3://internvl-moe-sft-data/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 159585, 'image': 'vrdu_texteq/astro-ph.CO/156060f2-6c3a-4b10-8ac9-9fb9aab2b469.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you extract any readable text from the image?'}, {'from': 'gpt', 'value': 'and for $z \\sim 8$ as'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7371/22095 [12:17:47<60:58:22, 14.91s/it] {'loss': 0.4576, 'grad_norm': 0.5081855472190925, 'learning_rate': 7.772059158798302e-06, 'epoch': 0.33}
33%|███▎ | 7372/22095 [12:18:10<70:21:27, 17.20s/it] {'loss': 0.3765, 'grad_norm': 0.6705304178050254, 'learning_rate': 7.77144916102379e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7373/22095 [12:18:38<83:31:39, 20.43s/it] {'loss': 0.4927, 'grad_norm': 0.38144921018083416, 'learning_rate': 7.770839103699345e-06, 'epoch': 0.33}
33%|███▎ | 7374/22095 [12:18:41<62:19:23, 15.24s/it] {'loss': 0.3313, 'grad_norm': 0.7062153479980101, 'learning_rate': 7.77022898683808e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
33%|███▎ | 7375/22095 [12:18:47<50:41:33, 12.40s/it] {'loss': 0.4999, 'grad_norm': 0.4372021191401388, 'learning_rate': 7.769618810453101e-06, 'epoch': 0.33}
33%|███▎ | 7376/22095 [12:19:28<86:28:25, 21.15s/it] {'loss': 0.3894, 'grad_norm': 0.6507493359356833, 'learning_rate': 7.769008574557522e-06, 'epoch': 0.33}
33%|███▎ | 7377/22095 [12:20:27<132:39:38, 32.45s/it] {'loss': 0.3949, 'grad_norm': 0.7817304510978234, 'learning_rate': 7.76839827916445e-06, 'epoch': 0.33}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/rico/dataset/image/23648.jpg 2025-08-28 04:18:25.979774 load time: 1039.87 ms
33%|███▎ | 7378/22095 [12:21:12<148:11:50, 36.25s/it] {'loss': 0.3897, 'grad_norm': 0.6754684107822001, 'learning_rate': 7.767787924287005e-06, 'epoch': 0.33}
33%|███▎ | 7379/22095 [12:22:12<177:24:14, 43.40s/it] {'loss': 0.4152, 'grad_norm': 0.68264505263259, 'learning_rate': 7.767177509938294e-06, 'epoch': 0.33}
33%|███▎ | 7380/22095 [12:22:52<172:39:35, 42.24s/it] {'loss': 0.3981, 'grad_norm': 0.607110426394516, 'learning_rate': 7.76656703613144e-06, 'epoch': 0.33}
33%|███▎ | 7381/22095 [12:22:55<124:18:00, 30.41s/it] {'loss': 0.351, 'grad_norm': 0.6315941771087339, 'learning_rate': 7.765956502879557e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (86060 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51846 > 40960). Running this sequence through the model will result in indexing errors
33%|███▎ | 7382/22095 [12:23:35<136:38:52, 33.44s/it] {'loss': 0.3744, 'grad_norm': 0.622596703964894, 'learning_rate': 7.765345910195764e-06, 'epoch': 0.33}
33%|███▎ | 7383/22095 [12:24:32<165:45:31, 40.56s/it] {'loss': 0.3394, 'grad_norm': 0.6443042347847058, 'learning_rate': 7.76473525809318e-06, 'epoch': 0.33}
VC:s3://gui-agent/data_20250714/windows/images/libreoffice_calc/free_task_20250717_155823/images/20250717_155826_1.png 2025-08-28 04:22:31.205997 load time: 1033.94 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_756387.png 2025-08-28 04:22:31.207940 load time: 1049.14 ms
33%|███▎ | 7384/22095 [12:24:54<142:43:23, 34.93s/it] {'loss': 0.3718, 'grad_norm': 0.6177848452709123, 'learning_rate': 7.764124546584926e-06, 'epoch': 0.33}
33%|███▎ | 7385/22095 [12:26:33<220:50:56, 54.05s/it] {'loss': 0.3415, 'grad_norm': 0.6317024650528028, 'learning_rate': 7.763513775684125e-06, 'epoch': 0.33}
33%|███▎ | 7386/22095 [12:27:12<202:51:04, 49.65s/it] {'loss': 0.3432, 'grad_norm': 0.6268294522319873, 'learning_rate': 7.7629029454039e-06, 'epoch': 0.33}
33%|███▎ | 7387/22095 [12:27:52<190:23:32, 46.60s/it] {'loss': 0.4405, 'grad_norm': 0.6961497243147552, 'learning_rate': 7.762292055757379e-06, 'epoch': 0.33}
33%|███▎ | 7388/22095 [12:28:51<205:58:13, 50.42s/it] {'loss': 0.3533, 'grad_norm': 0.6416482361622061, 'learning_rate': 7.761681106757682e-06, 'epoch': 0.33}
33%|███▎ | 7388/22095 [12:28:51<205:58:13,
VC:s3://gui-agent/data_20250505/windows/images/vscode/20250422_212637_1/images/before_screenshot_32_id_48_function_2_crop_0_grounding_instructions_random.png 2025-08-28 04:26:49.848436 load time: 1104.42 ms
 33%|███▎ | 7389/22095 [12:29:51<217:56:11, 53.35s/it] {'loss': 0.3225, 'grad_norm': 0.6484749481198567, 'learning_rate': 7.761070098417943e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 33%|███▎ | 7390/22095 [12:30:19<186:41:38, 45.71s/it] {'loss': 0.4899, 'grad_norm': 0.7573052274699215, 'learning_rate': 7.760459030751285e-06, 'epoch': 0.33}
 33%|███▎ | 7391/22095 [12:30:22<134:33:01, 32.94s/it] {'loss': 0.3776, 'grad_norm': 0.6484070916262837, 'learning_rate': 7.759847903770841e-06, 'epoch': 0.33}
 33%|███▎ | 7392/22095 [12:31:21<165:42:35, 40.57s/it] {'loss': 0.3534, 'grad_norm': 0.6403106131889722, 'learning_rate': 7.759236717489743e-06, 'epoch': 0.33}
 33%|███▎ | 7393/22095 [12:31:42<141:32:06, 34.66s/it] {'loss': 0.4111, 'grad_norm': 0.7201737675332623, 'learning_rate': 7.75862547192112e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (75268 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43958 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86125 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144197 > 40960). Running this sequence through the model will result in indexing errors
 33%|███▎ | 7394/22095 [12:32:10<133:33:27, 32.71s/it] {'loss': 0.4685, 'grad_norm': 0.3878614133578035, 'learning_rate': 7.75801416707811e-06, 'epoch': 0.33}
 33%|███▎ | 7395/22095 [12:32:53<146:07:03, 35.78s/it] {'loss': 0.3261, 'grad_norm': 0.5983085649938688, 'learning_rate': 7.757402802973846e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (49652 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89567 > 40960). Running this sequence through the model will result in indexing errors
 33%|███▎ | 7396/22095 [12:35:19<281:20:08, 68.90s/it] {'loss': 0.3354, 'grad_norm': 0.7292220648834236, 'learning_rate': 7.756791379621461e-06, 'epoch': 0.33}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 33%|███▎ | 7397/22095 [12:36:01<248:18:46, 60.82s/it] {'loss': 0.3502, 'grad_norm': 0.6933013859235289, 'learning_rate': 7.756179897034101e-06, 'epoch': 0.33}
VC:s3://ocr/coco/train2014/COCO_train2014_000000513531.jpg 2025-08-28 04:33:59.561431 load time: 1026.82 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_304021.png
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_173920_before_screenshot_sub0.png 2025-08-28 04:33:59.560589 load time: 1026.81 ms
2025-08-28 04:33:59.560303 load time: 1028.14 ms
VC:s3://gui-agent/data_20250612/web/images/yang_0528112335/10_140_52_49_0528112751/img/10.png 2025-08-28 04:33:59.560461 load time: 1033.56 ms
 33%|███▎ | 7398/22095 [12:37:03<249:25:35, 61.10s/it] {'loss': 0.3665, 'grad_norm': 0.7088971477959854, 'learning_rate': 7.7555683552249e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (61365 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61219 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106347 > 40960). Running this sequence through the model will result in indexing errors
 33%|███▎ | 7399/22095 [12:37:45<226:18:57, 55.44s/it] {'loss': 0.4009, 'grad_norm': 0.6655689664230773, 'learning_rate': 7.754956754206995e-06, 'epoch': 0.33}
 33%|███▎ | 7400/22095 [12:38:25<207:08:50, 50.75s/it] {'loss': 0.4111, 'grad_norm': 0.6449599873272122, 'learning_rate': 7.754345093993531e-06, 'epoch': 0.33}
Token indices sequence length is longer than the specified maximum sequence length for this model (52637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105995 > 40960). Running this sequence through the model will result in indexing errors
 33%|███▎ | 7401/22095 [12:40:04<266:53:33, 65.39s/it] {'loss': 0.2901, 'grad_norm': 0.642456407875221, 'learning_rate': 7.753733374597651e-06, 'epoch': 0.33}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (79378 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83042 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78476 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7402/22095 [12:40:13<198:13:24, 48.57s/it] {'loss': 0.4831, 'grad_norm': 0.47876243681554803, 'learning_rate': 7.7531215960325e-06, 'epoch': 0.34}
 34%|███▎ | 7403/22095 [12:40:35<165:18:00, 40.50s/it] {'loss': 0.395, 'grad_norm': 0.6578598929289875, 'learning_rate': 7.75250975831122e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7404/22095 [12:40:41<123:13:51, 30.20s/it] {'loss': 0.4638, 'grad_norm': 0.43312235044318054, 'learning_rate': 7.751897861446957e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (116101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60308 > 40960).
Running this sequence through the model will result in indexing errors
 34%|███▎ | 7405/22095 [12:40:45<90:34:39, 22.20s/it] {'loss': 0.3715, 'grad_norm': 0.5912177865190626, 'learning_rate': 7.751285905452863e-06, 'epoch': 0.34}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_404452.png 2025-08-28 04:38:43.578816 load time: 1051.15 ms
 34%|███▎ | 7406/22095 [12:42:02<157:27:28, 38.59s/it] {'loss': 0.3882, 'grad_norm': 0.6617835213947177, 'learning_rate': 7.750673890342087e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (117292 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7407/22095 [12:43:06<188:45:41, 46.27s/it] {'loss': 0.4945, 'grad_norm': 0.33686268188490154, 'learning_rate': 7.750061816127773e-06, 'epoch': 0.34}
 34%|███▎ | 7408/22095 [12:43:28<158:58:23, 38.97s/it] {'loss': 0.3567, 'grad_norm': 0.6242490984195372, 'learning_rate': 7.749449682823077e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7409/22095 [12:43:55<144:59:09, 35.54s/it] {'loss': 0.4814, 'grad_norm': 0.3702368617253743, 'learning_rate': 7.748837490441154e-06, 'epoch': 0.34}
 34%|███▎ | 7410/22095 [12:44:03<110:58:55, 27.21s/it] {'loss': 0.4851, 'grad_norm': 0.31983652680776037, 'learning_rate': 7.748225238995155e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 34%|███▎ | 7411/22095 [12:44:45<129:27:55, 31.74s/it] {'loss': 0.3703, 'grad_norm': 0.6678022578758226, 'learning_rate': 7.747612928498236e-06, 'epoch': 0.34}
 34%|███▎ | 7412/22095 [12:45:26<140:28:20, 34.44s/it] {'loss': 0.3133, 'grad_norm': 0.6487321452560142, 'learning_rate': 7.747000558963553e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/077bbf7bbbe9ee78e666f4848678dc38158c201296281a90f1d1c0c8ab3b41ce.png 2025-08-28 04:43:24.895026 load time: 1042.73 ms
VC:s3://gui-agent/data_20250407/web/images/finance_google_com/trajectory_11/img/step_1.png 2025-08-28 04:43:24.894802 load time: 1039.31 ms
 34%|███▎ | 7413/22095 [12:45:56<135:08:18, 33.14s/it] {'loss': 0.5095, 'grad_norm': 0.37027873488301116, 'learning_rate': 7.746388130404266e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 34%|███▎ | 7414/22095 [12:46:18<121:26:33, 29.78s/it] {'loss': 0.3537, 'grad_norm': 0.5749342746992915, 'learning_rate': 7.745775642833532e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7415/22095 [12:46:28<96:32:14, 23.67s/it] {'loss': 0.5016, 'grad_norm': 0.3615744542074436, 'learning_rate': 7.745163096264512e-06, 'epoch': 0.34}
 34%|███▎ | 7416/22095 [12:47:28<141:52:28, 34.79s/it] {'loss': 0.3701, 'grad_norm': 0.6853723855206779, 'learning_rate': 7.74455049071037e-06, 'epoch': 0.34}
 34%|███▎ | 7417/22095 [12:48:27<171:25:13, 42.04s/it] {'loss': 0.329, 'grad_norm': 0.6218699873168575, 'learning_rate': 7.743937826184266e-06, 'epoch': 0.34}
 34%|███▎ | 7418/22095 [12:48:49<145:57:33, 35.80s/it] {'loss': 0.3976, 'grad_norm': 0.6405323808791121, 'learning_rate': 7.743325102699366e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (62030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51220 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7419/22095 [12:49:32<155:29:24, 38.14s/it] {'loss': 0.3702, 'grad_norm': 0.7927461115027449, 'learning_rate': 7.742712320268835e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7420/22095 [12:49:40<118:38:52, 29.11s/it] {'loss': 0.5159, 'grad_norm': 0.37805600475702256, 'learning_rate': 7.742099478905837e-06, 'epoch': 0.34}
VC:s3://mm-dataset/ocr_data/TextVQA/train_images/970404ea84e12e0d.jpg 2025-08-28 04:47:38.923599 load time: 1036.29 ms
 34%|███▎ | 7421/22095 [12:49:44<87:31:30, 21.47s/it] {'loss': 0.3676, 'grad_norm': 0.6019854929627273, 'learning_rate': 7.741486578623546e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (47166 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90308 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44085 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7422/22095 [12:50:25<111:05:21, 27.26s/it] {'loss': 0.3441, 'grad_norm': 0.5992008219954912, 'learning_rate': 7.740873619435127e-06, 'epoch': 0.34}
 34%|███▎ | 7423/22095 [12:51:23<149:12:06, 36.61s/it] {'loss': 0.3698, 'grad_norm': 0.6385439615181806, 'learning_rate': 7.740260601353755e-06, 'epoch': 0.34}
VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_30677.png 2025-08-28 04:49:21.770867 load time: 1032.39 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_812337.png 2025-08-28 04:49:21.769077 load time: 1045.26 ms
 34%|███▎ | 7424/22095 [12:51:26<108:34:38, 26.64s/it] {'loss': 0.337, 'grad_norm': 0.8383219944781505, 'learning_rate': 7.739647524392595e-06, 'epoch': 0.34}
 34%|███▎ | 7425/22095 [12:51:53<107:59:32, 26.50s/it] {'loss': 0.3804, 'grad_norm': 0.7062114631525629, 'learning_rate': 7.739034388564826e-06, 'epoch': 0.34}
 34%|███▎ | 7426/22095 [12:52:18<107:07:21, 26.29s/it] {'loss': 0.385, 'grad_norm': 1.0240473141234805, 'learning_rate': 7.738421193883618e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage2/aitw-v1/images/webshopping_269825422547520069_7.jpg 2025-08-28 04:50:17.124986 load time: 1039.25 ms
 34%|███▎ | 7427/22095 [12:52:28<86:30:23, 21.23s/it] {'loss': 0.4713, 'grad_norm': 0.4036271018286075, 'learning_rate': 7.737807940362153e-06, 'epoch': 0.34}
 34%|███▎ | 7428/22095 [12:52:49<86:37:04, 21.26s/it] {'loss': 0.3415, 'grad_norm': 0.6935362243615151, 'learning_rate': 7.7371946280136e-06, 'epoch': 0.34}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-1_94436860-split-4.jpg 2025-08-28 04:50:47.883322 load time: 1028.86 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 34%|███▎ | 7429/22095 [12:52:52<64:11:33, 15.76s/it] {'loss': 0.3547, 'grad_norm': 0.6765723463844222, 'learning_rate': 7.736581256851143e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7430/22095 [12:52:59<53:06:43, 13.04s/it] {'loss': 0.507, 'grad_norm': 0.29491475836160913, 'learning_rate': 7.735967826887957e-06, 'epoch': 0.34}
 34%|███▎ | 7431/22095 [12:53:20<63:44:10, 15.65s/it] {'loss': 0.3598, 'grad_norm': 0.6213753045889875, 'learning_rate': 7.73535433813723e-06, 'epoch': 0.34}
 34%|███▎ | 7432/22095 [12:54:02<95:30:24, 23.45s/it] {'loss': 0.3531, 'grad_norm': 0.6048305024715104, 'learning_rate': 7.734740790612137e-06, 'epoch': 0.34}
 34%|███▎ | 7433/22095 [12:54:23<92:48:29, 22.79s/it] {'loss': 0.3652, 'grad_norm': 0.6111012611969582, 'learning_rate': 7.734127184325862e-06, 'epoch': 0.34}
 34%|███▎ | 7434/22095 [12:54:45<91:53:41, 22.56s/it] {'loss': 0.3873, 'grad_norm': 0.7314719236909824, 'learning_rate': 7.73351351929159e-06, 'epoch': 0.34}
 34%|███▎ | 7435/22095 [12:54:49<68:23:24, 16.79s/it] {'loss': 0.3584, 'grad_norm': 0.6751215172117923, 'learning_rate': 7.732899795522511e-06, 'epoch': 0.34}
VC:s3://gui-agent/data_20250421/web/images/dmv_virginia_gov/trajectory_24/img/step_5.png 2025-08-28 04:52:47.499895 load time: 1023.79 ms
 34%|███▎ | 7436/22095 [12:55:51<124:16:03, 30.52s/it] {'loss': 0.3316, 'grad_norm': 0.6526724286809542, 'learning_rate': 7.732286013031807e-06, 'epoch': 0.34}
 34%|███▎ | 7437/22095 [12:56:13<113:27:28, 27.87s/it] {'loss': 0.3682, 'grad_norm': 0.8181862564068937, 'learning_rate': 7.73167217183267e-06, 'epoch': 0.34}
 34%|███▎ | 7438/22095 [12:56:38<109:29:57, 26.89s/it] {'loss': 0.3887, 'grad_norm': 0.6340294789997006, 'learning_rate': 7.731058271938286e-06, 'epoch': 0.34}
 34%|███▎ | 7439/22095 [12:56:41<80:29:11, 19.77s/it] {'loss': 0.3757, 'grad_norm': 0.6380738026004611, 'learning_rate': 7.73044431336185e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 34%|███▎ | 7440/22095 [12:56:44<59:52:50, 14.71s/it] {'loss': 0.3932, 'grad_norm': 0.7063121664201003, 'learning_rate': 7.729830296116549e-06, 'epoch': 0.34}
 34%|███▎ | 7441/22095 [12:56:47<46:06:22, 11.33s/it] {'loss': 0.3733, 'grad_norm': 0.6163861356537637, 'learning_rate': 7.729216220215579e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 34%|███▎ | 7442/22095 [12:56:51<36:57:51, 9.08s/it] {'loss': 0.3617, 'grad_norm': 0.6808044512307527, 'learning_rate': 7.728602085672136e-06, 'epoch': 0.34}
 34%|███▎ | 7443/22095 [12:57:14<54:08:52, 13.30s/it] {'loss': 0.301, 'grad_norm': 0.6416138091184128, 'learning_rate': 7.727987892499413e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (50294 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42102 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62718 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7444/22095 [12:57:17<41:43:12, 10.25s/it] {'loss': 0.3737, 'grad_norm': 0.6412030801412819, 'learning_rate': 7.72737364071061e-06, 'epoch': 0.34}
 34%|███▎ | 7445/22095 [12:57:21<33:54:20, 8.33s/it] {'loss': 0.3606, 'grad_norm': 0.6282639242310271, 'learning_rate': 7.726759330318922e-06, 'epoch': 0.34}
 34%|███▎ | 7446/22095 [12:57:26<29:17:56, 7.20s/it] {'loss': 0.3427, 'grad_norm': 0.7902263070049047, 'learning_rate': 7.726144961337552e-06, 'epoch': 0.34}
 34%|███▎ | 7447/22095 [12:57:28<23:57:36, 5.89s/it] {'loss': 0.3538, 'grad_norm': 1.2046021596760628, 'learning_rate': 7.7255305337797e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887271 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10424, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 4cm\nB. 6cm\nC. 1cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 34%|███▎ | 7448/22095 [12:57:32<20:36:04, 5.06s/it] {'loss': 0.3803, 'grad_norm': 0.67347299103925, 'learning_rate': 7.724916047658568e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (66032 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41645 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103431 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7449/22095 [12:57:34<17:52:16, 4.39s/it] {'loss': 0.3623, 'grad_norm': 0.6249721615380065, 'learning_rate': 7.724301502987357e-06, 'epoch': 0.34}
 34%|███▎ | 7450/22095 [12:57:37<16:02:18, 3.94s/it] {'loss': 0.3445, 'grad_norm': 0.6553766994577328, 'learning_rate': 7.723686899779277e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359357 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26077, 'image': 'vrdu_table_final_2/astro-ph.CO/e0d9c084-52f9-4d20-936e-b71f1e5d9977.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
 34%|███▎ | 7451/22095 [12:57:40<14:39:25, 3.60s/it] {'loss': 0.3678, 'grad_norm': 0.6312510750549889, 'learning_rate': 7.723072238047526e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (83413 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117618 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47339 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7452/22095 [12:57:49<20:41:02, 5.09s/it] {'loss': 0.4807, 'grad_norm': 0.40615804589458515, 'learning_rate': 7.72245751780532e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 34%|███▎ | 7453/22095 [12:57:52<18:16:45, 4.49s/it] {'loss': 0.3547, 'grad_norm': 0.6384877707589975, 'learning_rate': 7.721842739065862e-06, 'epoch': 0.34}
 34%|███▎ | 7454/22095 [12:57:55<16:23:23, 4.03s/it] {'loss': 0.3621, 'grad_norm': 0.6979226985911737, 'learning_rate': 7.721227901842363e-06, 'epoch': 0.34}
 34%|███▎ | 7455/22095 [12:57:58<15:26:53, 3.80s/it] {'loss': 0.3578, 'grad_norm': 0.6487294010425022, 'learning_rate': 7.720613006148034e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (92568 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68766 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▎ | 7456/22095 [12:58:01<14:14:31, 3.50s/it] {'loss': 0.3972, 'grad_norm': 0.8479401579050782, 'learning_rate': 7.719998051996087e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▎ | 7457/22095 [12:58:10<21:37:24, 5.32s/it] {'loss': 0.4766, 'grad_norm': 0.3283701852873334, 'learning_rate': 7.719383039399735e-06, 'epoch': 0.34}
 34%|███▍ | 7458/22095 [12:58:15<20:35:39, 5.07s/it] {'loss': 0.3381, 'grad_norm': 0.6778409513847219, 'learning_rate': 7.718767968372193e-06, 'epoch': 0.34}
 34%|███▍ | 7459/22095 [12:58:19<19:49:32, 4.88s/it] {'loss': 0.3235, 'grad_norm': 0.6111463890412212, 'learning_rate': 7.71815283892668e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45877 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88828 > 40960).
Running this sequence through the model will result in indexing errors
 34%|███▍ | 7460/22095 [12:58:30<27:01:37, 6.65s/it] {'loss': 0.4745, 'grad_norm': 0.3121534940673971, 'learning_rate': 7.71753765107641e-06, 'epoch': 0.34}
 34%|███▍ | 7461/22095 [12:58:34<23:38:14, 5.81s/it] {'loss': 0.3494, 'grad_norm': 0.6458181090619489, 'learning_rate': 7.716922404834602e-06, 'epoch': 0.34}
 34%|███▍ | 7462/22095 [12:58:38<21:27:55, 5.28s/it] {'loss': 0.416, 'grad_norm': 0.6411197840593588, 'learning_rate': 7.716307100214472e-06, 'epoch': 0.34}
 34%|███▍ | 7463/22095 [12:58:41<19:00:14, 4.68s/it] {'loss': 0.3312, 'grad_norm': 0.6632890569738612, 'learning_rate': 7.715691737229249e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▍ | 7464/22095 [12:58:51<25:08:20, 6.19s/it] {'loss': 0.4926, 'grad_norm': 0.2935415701154594, 'learning_rate': 7.715076315892152e-06, 'epoch': 0.34}
 34%|███▍ | 7465/22095 [12:58:55<22:46:52, 5.61s/it] {'loss': 0.3823, 'grad_norm': 0.6186396883713703, 'learning_rate': 7.714460836216402e-06, 'epoch': 0.34}
 34%|███▍ | 7466/22095 [12:58:59<21:03:52, 5.18s/it] {'loss': 0.3537, 'grad_norm': 0.6027909574603085, 'learning_rate': 7.713845298215226e-06, 'epoch': 0.34}
 34%|███▍ | 7467/22095 [12:59:03<18:45:25, 4.62s/it] {'loss': 0.3794, 'grad_norm': 0.6197701183196876, 'learning_rate': 7.713229701901848e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-3_277392917-split-1.jpg 2025-08-28 04:57:01.416801 load time: 1024.57 ms
 34%|███▍ | 7468/22095 [12:59:12<24:53:17, 6.13s/it] {'loss': 0.4657, 'grad_norm': 0.2862160805249473, 'learning_rate': 7.712614047289498e-06, 'epoch': 0.34}
 34%|███▍ | 7469/22095 [12:59:16<22:19:45, 5.50s/it] {'loss': 0.3869, 'grad_norm': 0.6806505904540876, 'learning_rate': 7.711998334391404e-06, 'epoch': 0.34}
 34%|███▍ | 7470/22095 [12:59:20<19:42:27, 4.85s/it] {'loss': 0.3814, 'grad_norm': 0.7033828216041671, 'learning_rate': 7.711382563220793e-06, 'epoch': 0.34}
 34%|███▍ | 7471/22095 [12:59:23<18:16:04, 4.50s/it] {'loss': 0.3832, 'grad_norm': 0.6488784757057378, 'learning_rate': 7.7107667337909e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (57485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48223 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80458 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▍ | 7472/22095 [12:59:29<20:02:29, 4.93s/it] {'loss': 0.4736, 'grad_norm': 0.3317792728759328, 'learning_rate': 7.710150846114954e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (100887 > 40960). Running this sequence through the model will result in indexing errors
 34%|███▍ | 7473/22095 [12:59:33<18:48:26, 4.63s/it] {'loss': 0.3642, 'grad_norm': 0.6621196774829878, 'learning_rate': 7.70953490020619e-06, 'epoch': 0.34}
 34%|███▍ | 7474/22095 [12:59:38<18:25:45, 4.54s/it] {'loss': 0.4162, 'grad_norm': 0.6740884911039556, 'learning_rate': 7.708918896077843e-06, 'epoch': 0.34}
 34%|███▍ | 7475/22095 [12:59:40<16:21:54, 4.03s/it] {'loss': 0.3285, 'grad_norm': 0.6174594802456844, 'learning_rate': 7.708302833743149e-06, 'epoch': 0.34}
 34%|███▍ | 7476/22095 [12:59:44<16:09:38, 3.98s/it] {'loss': 0.3641, 'grad_norm': 0.6464458466179519, 'learning_rate': 7.707686713215346e-06, 'epoch': 0.34}
 34%|███▍ | 7477/22095 [12:59:48<16:04:14, 3.96s/it] {'loss': 0.3353, 'grad_norm': 0.6499025032731706, 'learning_rate': 7.70707053450767e-06, 'epoch': 0.34}
 34%|███▍ | 7478/22095 [12:59:52<16:12:41, 3.99s/it] {'loss': 0.38, 'grad_norm': 0.5949856358113641, 'learning_rate': 7.706454297633363e-06, 'epoch': 0.34}
 34%|███▍ | 7479/22095 [12:59:56<15:35:49, 3.84s/it] {'loss': 0.3213, 'grad_norm': 0.6759432774690013, 'learning_rate': 7.705838002605665e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 34%|███▍ | 7480/22095 [13:00:06<23:05:28, 5.69s/it] {'loss': 0.4835, 'grad_norm': 0.29385555492762644, 'learning_rate': 7.705221649437819e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample
= self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8405811 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7998, 'image': 'vrdu_table_final_2/astro-ph.CO/7f8e454c-420d-4553-a133-c25a5223faab.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 34%|███▍ | 7481/22095 [13:00:16<28:22:16, 6.99s/it] {'loss': 0.5036, 'grad_norm': 0.3342604546667831, 'learning_rate': 7.704605238143069e-06, 'epoch': 0.34} 34%|███▍ | 7481/22095 [13:00:16<28:22:16, 6.99s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 34%|███▍ | 7482/22095 [13:00:20<24:50:41, 6.12s/it] {'loss': 0.41, 'grad_norm': 0.647801597038109, 'learning_rate': 7.703988768734658e-06, 'epoch': 0.34} 34%|███▍ | 7482/22095 [13:00:20<24:50:41, 6.12s/it] 34%|███▍ | 7483/22095 [13:00:23<21:33:47, 5.31s/it] {'loss': 0.2949, 'grad_norm': 0.63915331270805, 'learning_rate': 7.703372241225832e-06, 'epoch': 0.34} 34%|███▍ | 7483/22095 [13:00:23<21:33:47, 5.31s/it] 34%|███▍ | 7484/22095 [13:00:27<19:44:19, 4.86s/it] {'loss': 0.3707, 'grad_norm': 0.6251380338112288, 'learning_rate': 7.702755655629841e-06, 'epoch': 0.34} 34%|███▍ | 7484/22095 [13:00:27<19:44:19, 4.86s/it] 34%|███▍ | 7485/22095 [13:00:31<18:25:41, 4.54s/it] {'loss': 0.3788, 'grad_norm': 0.7472768475392628, 'learning_rate': 7.702139011959933e-06, 'epoch': 0.34} 34%|███▍ | 7485/22095 [13:00:31<18:25:41, 4.54s/it] 34%|███▍ | 7486/22095 [13:00:34<16:54:48, 4.17s/it] {'loss': 
0.3689, 'grad_norm': 0.6220251227286063, 'learning_rate': 7.701522310229353e-06, 'epoch': 0.34} 34%|███▍ | 7486/22095 [13:00:34<16:54:48, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 34%|███▍ | 7487/22095 [13:00:44<23:15:16, 5.73s/it] {'loss': 0.5129, 'grad_norm': 0.31252483447081636, 'learning_rate': 7.700905550451359e-06, 'epoch': 0.34} 34%|███▍ | 7487/22095 [13:00:44<23:15:16, 5.73s/it] 34%|███▍ | 7488/22095 [13:00:52<26:54:11, 6.63s/it] {'loss': 0.4867, 'grad_norm': 0.2934067222093203, 'learning_rate': 7.700288732639198e-06, 'epoch': 0.34} 34%|███▍ | 7488/22095 [13:00:52<26:54:11, 6.63s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 34%|███▍ | 7489/22095 [13:00:56<22:57:56, 5.66s/it] {'loss': 0.3321, 'grad_norm': 0.6546002810607425, 'learning_rate': 7.699671856806126e-06, 'epoch': 0.34} 34%|███▍ | 7489/22095 [13:00:56<22:57:56, 5.66s/it] 34%|███▍ | 7490/22095 [13:00:59<20:08:59, 4.97s/it] {'loss': 0.3346, 'grad_norm': 0.6325439862884268, 'learning_rate': 7.699054922965398e-06, 'epoch': 0.34} 34%|███▍ | 7490/22095 [13:00:59<20:08:59, 4.97s/it] 34%|███▍ | 7491/22095 [13:01:03<18:36:49, 4.59s/it] {'loss': 0.3741, 'grad_norm': 0.740205050820994, 'learning_rate': 7.698437931130266e-06, 'epoch': 0.34} 34%|███▍ | 7491/22095 [13:01:03<18:36:49, 4.59s/it] 34%|███▍ | 7492/22095 [13:01:07<17:52:05, 4.40s/it] {'loss': 0.3616, 'grad_norm': 0.6601280735750225, 'learning_rate': 7.697820881313994e-06, 'epoch': 0.34} 34%|███▍ | 7492/22095 [13:01:07<17:52:05, 4.40s/it] 34%|███▍ | 7493/22095 [13:01:11<17:15:50, 4.26s/it] {'loss': 0.3569, 'grad_norm': 0.6613533939553481, 'learning_rate': 7.697203773529835e-06, 'epoch': 0.34} 34%|███▍ | 7493/22095 [13:01:11<17:15:50, 4.26s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38226.png 2025-08-28 04:59:09.359366 load time: 1409.07 ms 34%|███▍ | 7494/22095 [13:01:14<16:06:23, 3.97s/it] {'loss': 0.3738, 'grad_norm': 0.6112239789987094, 
'learning_rate': 7.696586607791053e-06, 'epoch': 0.34} 34%|███▍ | 7494/22095 [13:01:14<16:06:23, 3.97s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8882178 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5331, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 12cm\nB. 6cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 34%|███▍ | 7495/22095 [13:01:17<14:52:10, 3.67s/it] {'loss': 0.3303, 'grad_norm': 0.7287828073782087, 'learning_rate': 7.695969384110906e-06, 'epoch': 0.34} 34%|███▍ | 7495/22095 [13:01:17<14:52:10, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59825 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49793 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75270 > 40960). 
Running this sequence through the model will result in indexing errors 34%|███▍ | 7496/22095 [13:01:21<15:04:22, 3.72s/it] {'loss': 0.3679, 'grad_norm': 0.6877461486725416, 'learning_rate': 7.695352102502655e-06, 'epoch': 0.34} 34%|███▍ | 7496/22095 [13:01:21<15:04:22, 3.72s/it] 34%|███▍ | 7497/22095 [13:01:24<14:36:15, 3.60s/it] {'loss': 0.3722, 'grad_norm': 0.7528212080107087, 'learning_rate': 7.694734762979566e-06, 'epoch': 0.34} 34%|███▍ | 7497/22095 [13:01:24<14:36:15, 3.60s/it] 34%|███▍ | 7498/22095 [13:01:27<14:27:35, 3.57s/it] {'loss': 0.3523, 'grad_norm': 0.6911967207059251, 'learning_rate': 7.694117365554905e-06, 'epoch': 0.34} 34%|███▍ | 7498/22095 [13:01:28<14:27:35, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8370371 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 37123, 'image': 'vrdu_table_final_2/astro-ph.CO/46cb6d9b-6b14-4519-8353-d048cad1e600.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 34%|███▍ | 7499/22095 [13:01:30<13:36:09, 3.36s/it] {'loss': 0.3753, 'grad_norm': 0.6781742116766306, 'learning_rate': 7.693499910241935e-06, 'epoch': 0.34} 34%|███▍ | 7499/22095 [13:01:30<13:36:09, 3.36s/it] 34%|███▍ | 7500/22095 [13:01:34<13:34:16, 3.35s/it] {'loss': 0.3711, 'grad_norm': 0.6668863342842705, 'learning_rate': 7.692882397053924e-06, 'epoch': 0.34} 34%|███▍ | 7500/22095 [13:01:34<13:34:16, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60737 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42394 > 40960). 
Running this sequence through the model will result in indexing errors 34%|███▍ | 7501/22095 [13:01:37<13:04:20, 3.22s/it] {'loss': 0.3786, 'grad_norm': 0.6889227130686282, 'learning_rate': 7.69226482600414e-06, 'epoch': 0.34} 34%|███▍ | 7501/22095 [13:01:37<13:04:20, 3.22s/it] 34%|███▍ | 7502/22095 [13:01:40<13:29:22, 3.33s/it] {'loss': 0.3838, 'grad_norm': 0.6258982774738167, 'learning_rate': 7.691647197105857e-06, 'epoch': 0.34} 34%|███▍ | 7502/22095 [13:01:40<13:29:22, 3.33s/it] 34%|███▍ | 7503/22095 [13:01:44<13:34:05, 3.35s/it] {'loss': 0.3668, 'grad_norm': 0.60363916132125, 'learning_rate': 7.69102951037234e-06, 'epoch': 0.34} 34%|███▍ | 7503/22095 [13:01:44<13:34:05, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 34%|███▍ | 7504/22095 [13:01:53<21:01:44, 5.19s/it] {'loss': 0.4735, 'grad_norm': 0.45105570128430217, 'learning_rate': 7.690411765816864e-06, 'epoch': 0.34} 34%|███▍ | 7504/22095 [13:01:53<21:01:44, 5.19s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 04:59:51.850142 load time: 1571.05 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_062603_before_screenshot.png 2025-08-28 04:59:53.113120 load time: 1168.04 ms 34%|███▍ | 7505/22095 [13:01:56<18:52:45, 4.66s/it] {'loss': 0.3415, 'grad_norm': 0.6294237191403765, 'learning_rate': 7.689793963452703e-06, 'epoch': 0.34} 34%|███▍ | 7505/22095 [13:01:57<18:52:45, 4.66s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_6.png 2025-08-28 04:59:55.915575 load time: 1157.36 ms 34%|███▍ | 7506/22095 [13:01:59<16:40:39, 4.12s/it] {'loss': 0.3648, 'grad_norm': 0.6526197354715212, 'learning_rate': 7.68917610329313e-06, 'epoch': 0.34} 34%|███▍ | 7506/22095 [13:01:59<16:40:39, 4.12s/it] 34%|███▍ | 7507/22095 
[13:02:03<15:49:11, 3.90s/it] {'loss': 0.3441, 'grad_norm': 0.6589135998127787, 'learning_rate': 7.68855818535142e-06, 'epoch': 0.34} 34%|███▍ | 7507/22095 [13:02:03<15:49:11, 3.90s/it] 34%|███▍ | 7508/22095 [13:02:06<14:44:34, 3.64s/it] {'loss': 0.3835, 'grad_norm': 0.6712897142276448, 'learning_rate': 7.687940209640853e-06, 'epoch': 0.34} 34%|███▍ | 7508/22095 [13:02:06<14:44:34, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 34%|███▍ | 7509/22095 [13:02:15<21:18:21, 5.26s/it] {'loss': 0.5078, 'grad_norm': 0.3313425590280107, 'learning_rate': 7.687322176174708e-06, 'epoch': 0.34} 34%|███▍ | 7509/22095 [13:02:15<21:18:21, 5.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7510/22095 [13:02:18<19:00:32, 4.69s/it] {'loss': 0.3913, 'grad_norm': 0.6378473326228518, 'learning_rate': 7.686704084966263e-06, 'epoch': 0.34} 34%|███▍ | 7510/22095 [13:02:18<19:00:32, 4.69s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047922 in VC:s3://multi-modal/UniGeo/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 8\nB. 16\nC. 2\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [775, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8486330 in VC:s3://internvl-moe-sft-data/. Exception: Image size [775, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 127762, 'image': 'vrdu_texteq/astro-ph.CO/b5a88fc1-6373-49b8-83a8-bcd61f078d99.png', 'image_wh': [[775, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'modes should have a moderate \ncorrelation of $r\\sim 0.5-0.55$ and'}]} 34%|███▍ | 7511/22095 [13:02:22<18:14:40, 4.50s/it] {'loss': 0.3348, 'grad_norm': 0.6571514913747152, 'learning_rate': 7.686085936028798e-06, 'epoch': 0.34} 34%|███▍ | 7511/22095 [13:02:22<18:14:40, 4.50s/it] 34%|███▍ | 7512/22095 [13:02:25<16:02:51, 3.96s/it] {'loss': 0.3366, 'grad_norm': 0.6324628293351644, 'learning_rate': 7.685467729375596e-06, 'epoch': 0.34} 34%|███▍ | 7512/22095 [13:02:25<16:02:51, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70044 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99859 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42027 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43868 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83885 > 40960). Running this sequence through the model will result in indexing errors 34%|███▍ | 7513/22095 [13:02:28<15:23:48, 3.80s/it] {'loss': 0.3518, 'grad_norm': 0.6064251790572599, 'learning_rate': 7.684849465019938e-06, 'epoch': 0.34} 34%|███▍ | 7513/22095 [13:02:28<15:23:48, 3.80s/it] 34%|███▍ | 7514/22095 [13:02:32<15:31:58, 3.84s/it] {'loss': 0.3573, 'grad_norm': 0.6791002915888196, 'learning_rate': 7.684231142975113e-06, 'epoch': 0.34} 34%|███▍ | 7514/22095 [13:02:32<15:31:58, 3.84s/it] 34%|███▍ | 7515/22095 [13:02:36<15:48:08, 3.90s/it] {'loss': 0.3393, 'grad_norm': 0.6485858044842474, 'learning_rate': 7.683612763254404e-06, 'epoch': 0.34} 34%|███▍ | 7515/22095 [13:02:36<15:48:08, 3.90s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954304 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5139, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 4\nB. 6\nC. 2\nD. 
8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7516/22095 [13:02:44<20:43:01, 5.12s/it] {'loss': 0.5176, 'grad_norm': 0.34229835613429566, 'learning_rate': 7.682994325871098e-06, 'epoch': 0.34} 34%|███▍ | 7516/22095 [13:02:44<20:43:01, 5.12s/it] 34%|███▍ | 7517/22095 [13:02:48<19:07:35, 4.72s/it] {'loss': 0.3647, 'grad_norm': 0.5985761286493197, 'learning_rate': 7.682375830838487e-06, 'epoch': 0.34} 34%|███▍ | 7517/22095 [13:02:48<19:07:35, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7518/22095 [13:02:51<16:49:11, 4.15s/it] {'loss': 0.3234, 'grad_norm': 0.6356574267140701, 'learning_rate': 7.681757278169854e-06, 'epoch': 0.34} 34%|███▍ | 7518/22095 [13:02:51<16:49:11, 4.15s/it] 34%|███▍ | 7519/22095 [13:02:55<16:59:27, 4.20s/it] {'loss': 0.3603, 'grad_norm': 0.685838642936561, 'learning_rate': 7.681138667878497e-06, 'epoch': 0.34} 34%|███▍ | 7519/22095 [13:02:55<16:59:27, 4.20s/it] 34%|███▍ | 7520/22095 [13:02:59<16:46:07, 4.14s/it] {'loss': 0.3723, 'grad_norm': 0.6511830826513854, 'learning_rate': 7.680519999977703e-06, 'epoch': 0.34} 34%|███▍ | 7520/22095 [13:02:59<16:46:07, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7521/22095 [13:03:09<23:31:19, 5.81s/it] {'loss': 0.4982, 'grad_norm': 0.31154658968417265, 'learning_rate': 7.679901274480766e-06, 'epoch': 0.34} 34%|███▍ | 7521/22095 [13:03:09<23:31:19, 5.81s/it] 34%|███▍ | 7522/22095 [13:03:13<20:52:59, 5.16s/it] {'loss': 0.4237, 'grad_norm': 0.6537163331357495, 'learning_rate': 7.67928249140098e-06, 'epoch': 0.34} 
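The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` tracebacks above come from a check in `data_qwen_2.py:_get_item` that rejects images with a side below 28 px, consistent with Qwen2.5-VL's 14-px vision patches combined with 2x2 patch merging. These bad samples could be filtered out before training rather than raised at fetch time. A minimal sketch of such a pre-filter, reading the same `image_wh` field the failing samples carry (the helper names are illustrative assumptions, not the repo's API):

```python
# Sketch: drop samples whose images fall below the 28-px minimum side
# that the dataset's _get_item check enforces. MIN_SIDE mirrors the
# "Minimum size is 28" message in the log; the function names are
# assumptions for illustration, not the training repo's actual API.
MIN_SIDE = 28

def is_trainable_image(width: int, height: int, min_side: int = MIN_SIDE) -> bool:
    """Return True if both image sides meet the minimum size."""
    return width >= min_side and height >= min_side

def filter_samples(samples):
    """Keep only samples whose 'image_wh' entries are all large enough."""
    kept = []
    for sample in samples:
        sizes = sample.get("image_wh", [])
        if all(is_trainable_image(w, h) for w, h in sizes):
            kept.append(sample)
    return kept
```

Running this once over the dataset index would eliminate the repeated `[Try #0] Failed to fetch sample ...` retries that each cost a full fetch.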
34%|███▍ | 7522/22095 [13:03:13<20:52:59, 5.16s/it] 34%|███▍ | 7523/22095 [13:03:17<19:40:03, 4.86s/it] {'loss': 0.3303, 'grad_norm': 0.6386803915716467, 'learning_rate': 7.678663650751648e-06, 'epoch': 0.34} 34%|███▍ | 7523/22095 [13:03:17<19:40:03, 4.86s/it] 34%|███▍ | 7524/22095 [13:03:20<17:42:59, 4.38s/it] {'loss': 0.366, 'grad_norm': 0.6519793736472048, 'learning_rate': 7.678044752546056e-06, 'epoch': 0.34} 34%|███▍ | 7524/22095 [13:03:20<17:42:59, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53339 > 40960). Running this sequence through the model will result in indexing errors 34%|███▍ | 7525/22095 [13:03:24<17:01:55, 4.21s/it] {'loss': 0.3097, 'grad_norm': 0.590842503948439, 'learning_rate': 7.677425796797509e-06, 'epoch': 0.34} 34%|███▍ | 7525/22095 [13:03:24<17:01:55, 4.21s/it] 34%|███▍ | 7526/22095 [13:03:27<15:36:24, 3.86s/it] {'loss': 0.3608, 'grad_norm': 0.6407967582943332, 'learning_rate': 7.676806783519304e-06, 'epoch': 0.34} 34%|███▍ | 7526/22095 [13:03:27<15:36:24, 3.86s/it] 34%|███▍ | 7527/22095 [13:03:30<15:10:41, 3.75s/it] {'loss': 0.3703, 'grad_norm': 0.7371405089237638, 'learning_rate': 7.676187712724742e-06, 'epoch': 0.34} 34%|███▍ | 7527/22095 [13:03:30<15:10:41, 3.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8965707 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 16542, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 34%|███▍ | 7528/22095 [13:03:34<14:59:03, 3.70s/it] {'loss': 0.3762, 'grad_norm': 0.6499977417250453, 'learning_rate': 7.675568584427125e-06, 'epoch': 0.34} 34%|███▍ | 7528/22095 [13:03:34<14:59:03, 3.70s/it] 34%|███▍ | 7529/22095 [13:03:38<15:25:00, 3.81s/it] {'loss': 0.3582, 'grad_norm': 0.6536935173708011, 'learning_rate': 7.674949398639759e-06, 'epoch': 0.34} 34%|███▍ | 7529/22095 [13:03:38<15:25:00, 3.81s/it] 34%|███▍ | 7530/22095 [13:03:41<14:47:17, 3.66s/it] {'loss': 0.3582, 'grad_norm': 0.6722326617995539, 'learning_rate': 7.674330155375942e-06, 'epoch': 0.34} 34%|███▍ | 7530/22095 [13:03:41<14:47:17, 3.66s/it] 34%|███▍ | 7531/22095 [13:03:46<15:38:46, 3.87s/it] {'loss': 0.3397, 'grad_norm': 0.6492041265689165, 'learning_rate': 7.673710854648988e-06, 'epoch': 0.34} 34%|███▍ | 7531/22095 [13:03:46<15:38:46, 3.87s/it] 34%|███▍ | 7532/22095 [13:03:49<15:15:09, 3.77s/it] {'loss': 0.3484, 'grad_norm': 0.6033954319869382, 'learning_rate': 7.673091496472195e-06, 'epoch': 0.34} 34%|███▍ | 7532/22095 [13:03:49<15:15:09, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 34%|███▍ | 7533/22095 [13:03:56<19:25:38, 4.80s/it] {'loss': 0.4913, 'grad_norm': 0.38637999773221254, 'learning_rate': 7.67247208085888e-06, 'epoch': 0.34} 34%|███▍ | 7533/22095 [13:03:56<19:25:38, 4.80s/it] 34%|███▍ | 7534/22095 [13:04:00<18:09:18, 4.49s/it] {'loss': 0.3513, 'grad_norm': 0.7052793212540459, 'learning_rate': 7.671852607822346e-06, 'epoch': 0.34} 34%|███▍ | 7534/22095 [13:04:00<18:09:18, 4.49s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045993 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 34%|███▍ | 7535/22095 [13:04:03<16:38:56, 4.12s/it] {'loss': 0.3736, 'grad_norm': 0.6174511378249521, 'learning_rate': 7.671233077375903e-06, 'epoch': 0.34} 34%|███▍ | 7535/22095 [13:04:03<16:38:56, 4.12s/it] 34%|███▍ | 7536/22095 [13:04:06<15:20:49, 3.79s/it] {'loss': 0.3414, 'grad_norm': 0.6239886045962482, 'learning_rate': 7.670613489532868e-06, 'epoch': 0.34} 34%|███▍ | 7536/22095 [13:04:06<15:20:49, 3.79s/it] 34%|███▍ | 7537/22095 [13:04:10<14:53:23, 3.68s/it] {'loss': 0.3477, 'grad_norm': 0.8548263136871884, 'learning_rate': 7.66999384430655e-06, 'epoch': 0.34} 34%|███▍ | 7537/22095 [13:04:10<14:53:23, 3.68s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7538/22095 [13:04:13<14:18:35, 3.54s/it] {'loss': 0.3479, 'grad_norm': 0.6002015590991432, 'learning_rate': 7.669374141710266e-06, 'epoch': 0.34} 34%|███▍ | 7538/22095 [13:04:13<14:18:35, 3.54s/it] 34%|███▍ | 7539/22095 [13:04:17<14:13:09, 3.52s/it] {'loss': 0.3686, 'grad_norm': 0.6336840470348314, 'learning_rate': 7.668754381757329e-06, 'epoch': 
0.34} 34%|███▍ | 7539/22095 [13:04:17<14:13:09, 3.52s/it] 34%|███▍ | 7540/22095 [13:04:20<13:38:20, 3.37s/it] {'loss': 0.3474, 'grad_norm': 0.6427120609580672, 'learning_rate': 7.668134564461057e-06, 'epoch': 0.34} 34%|███▍ | 7540/22095 [13:04:20<13:38:20, 3.37s/it] 34%|███▍ | 7541/22095 [13:04:23<13:10:19, 3.26s/it] {'loss': 0.3708, 'grad_norm': 0.8789691593560188, 'learning_rate': 7.667514689834766e-06, 'epoch': 0.34} 34%|███▍ | 7541/22095 [13:04:23<13:10:19, 3.26s/it] 34%|███▍ | 7542/22095 [13:04:26<12:59:46, 3.21s/it] {'loss': 0.3479, 'grad_norm': 0.6235062003928623, 'learning_rate': 7.666894757891779e-06, 'epoch': 0.34} 34%|███▍ | 7542/22095 [13:04:26<12:59:46, 3.21s/it] 34%|███▍ | 7543/22095 [13:04:30<13:54:21, 3.44s/it] {'loss': 0.3542, 'grad_norm': 0.6322898144122554, 'learning_rate': 7.666274768645413e-06, 'epoch': 0.34} 34%|███▍ | 7543/22095 [13:04:30<13:54:21, 3.44s/it] 34%|███▍ | 7544/22095 [13:04:34<14:29:41, 3.59s/it] {'loss': 0.3685, 'grad_norm': 0.6630673959545894, 'learning_rate': 7.665654722108994e-06, 'epoch': 0.34} 34%|███▍ | 7544/22095 [13:04:34<14:29:41, 3.59s/it] 34%|███▍ | 7545/22095 [13:04:37<13:58:35, 3.46s/it] {'loss': 0.3474, 'grad_norm': 0.6387799928039584, 'learning_rate': 7.665034618295838e-06, 'epoch': 0.34} 34%|███▍ | 7545/22095 [13:04:37<13:58:35, 3.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 34%|███▍ | 7546/22095 [13:04:40<13:32:20, 3.35s/it] {'loss': 0.3239, 'grad_norm': 0.5943943062030385, 'learning_rate': 7.664414457219277e-06, 'epoch': 0.34} 34%|███▍ | 7546/22095 [13:04:40<13:32:20, 3.35s/it] 34%|███▍ | 7547/22095 [13:04:44<14:13:28, 3.52s/it] {'loss': 0.3928, 'grad_norm': 0.6699691702156413, 'learning_rate': 7.66379423889263e-06, 'epoch': 0.34} 34%|███▍ | 7547/22095 [13:04:44<14:13:28, 3.52s/it] 34%|███▍ | 7548/22095 [13:04:47<14:22:35, 3.56s/it] {'loss': 0.3197, 'grad_norm': 0.7179907829644857, 'learning_rate': 7.663173963329227e-06, 'epoch': 0.34} 
34%|███▍ | 7548/22095 [13:04:47<14:22:35, 3.56s/it]
34%|███▍ | 7549/22095 [13:04:50<13:38:37, 3.38s/it] {'loss': 0.3439, 'grad_norm': 0.6417787563246365, 'learning_rate': 7.662553630542393e-06, 'epoch': 0.34}
34%|███▍ | 7550/22095 [13:04:54<13:31:11, 3.35s/it] {'loss': 0.3279, 'grad_norm': 0.6565039262127776, 'learning_rate': 7.661933240545464e-06, 'epoch': 0.34}
34%|███▍ | 7551/22095 [13:04:57<13:13:25, 3.27s/it] {'loss': 0.388, 'grad_norm': 0.6660501472957977, 'learning_rate': 7.661312793351758e-06, 'epoch': 0.34}
34%|███▍ | 7552/22095 [13:05:00<12:52:09, 3.19s/it] {'loss': 0.3604, 'grad_norm': 0.6796749549491682, 'learning_rate': 7.660692288974618e-06, 'epoch': 0.34}
34%|███▍ | 7553/22095 [13:05:03<12:41:53, 3.14s/it] {'loss': 0.3145, 'grad_norm': 0.5855195869802828, 'learning_rate': 7.660071727427372e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (51932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79488 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7554/22095 [13:05:07<13:54:27, 3.44s/it] {'loss': 0.3663, 'grad_norm': 0.6448670926148414, 'learning_rate': 7.659451108723353e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892995 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16148, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 4cm\nB. 6cm\nC. 1cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
34%|███▍ | 7555/22095 [13:05:10<13:53:40, 3.44s/it] {'loss': 0.373, 'grad_norm': 0.6840365201027633, 'learning_rate': 7.658830432875899e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
34%|███▍ | 7556/22095 [13:05:15<15:04:41, 3.73s/it] {'loss': 0.3696, 'grad_norm': 0.6320095353665366, 'learning_rate': 7.658209699898344e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (78808 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7557/22095 [13:05:19<15:27:07, 3.83s/it] {'loss': 0.4076, 'grad_norm': 0.6769106274619351, 'learning_rate': 7.657588909804028e-06, 'epoch': 0.34}
34%|███▍ | 7558/22095 [13:05:22<14:25:48, 3.57s/it] {'loss': 0.3546, 'grad_norm': 0.6830176902563303, 'learning_rate': 7.656968062606288e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8478394 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 124040, 'image': 'vrdu_texteq/astro-ph.CO/6e8b10ed-3163-4e5f-8dde-f3364a5599e2.png', 'image_wh': [[17, 14]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': '$\\alpha$'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
34%|███▍ | 7559/22095 [13:05:26<14:56:55, 3.70s/it] {'loss': 0.3308, 'grad_norm': 0.6386112845278985, 'learning_rate': 7.656347158318462e-06, 'epoch': 0.34}
34%|███▍ | 7560/22095 [13:05:29<14:22:35, 3.56s/it] {'loss': 0.36, 'grad_norm': 0.6494201950324293, 'learning_rate': 7.655726196953898e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (42075 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90016 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7561/22095 [13:05:32<14:03:08, 3.48s/it] {'loss': 0.3533, 'grad_norm': 0.6433055486507153, 'learning_rate': 7.655105178525932e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7562/22095 [13:05:42<21:59:54, 5.45s/it] {'loss': 0.4686, 'grad_norm': 0.5024231652031483, 'learning_rate': 7.65448410304791e-06, 'epoch': 0.34}
34%|███▍ | 7563/22095 [13:05:46<20:02:08, 4.96s/it] {'loss': 0.3816, 'grad_norm': 0.6118199949238902, 'learning_rate': 7.653862970533179e-06, 'epoch': 0.34}
34%|███▍ | 7564/22095 [13:05:49<17:41:55, 4.38s/it] {'loss': 0.3528, 'grad_norm': 0.6683841803842258, 'learning_rate': 7.653241780995083e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7565/22095 [13:05:54<18:31:45, 4.59s/it] {'loss': 0.4729, 'grad_norm': 0.32221683070115276, 'learning_rate': 7.652620534446968e-06, 'epoch': 0.34}
34%|███▍ | 7566/22095 [13:05:58<17:27:49, 4.33s/it] {'loss': 0.3703, 'grad_norm': 0.6251898531891275, 'learning_rate': 7.651999230902186e-06, 'epoch': 0.34}
34%|███▍ | 7567/22095 [13:06:01<16:08:31, 4.00s/it] {'loss': 0.3363, 'grad_norm': 0.7174094885394138, 'learning_rate': 7.651377870374087e-06, 'epoch': 0.34}
34%|███▍ | 7568/22095 [13:06:05<15:55:31, 3.95s/it] {'loss': 0.3548, 'grad_norm': 0.5838536870265789, 'learning_rate': 7.650756452876019e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7569/22095 [13:06:15<22:46:30, 5.64s/it] {'loss': 0.472, 'grad_norm': 0.4020356162026709, 'learning_rate': 7.650134978421335e-06, 'epoch': 0.34}
34%|███▍ | 7570/22095 [13:06:18<19:56:59, 4.94s/it] {'loss': 0.3974, 'grad_norm': 0.6858008302933762, 'learning_rate': 7.64951344702339e-06, 'epoch': 0.34}
34%|███▍ | 7571/22095 [13:06:21<18:13:25, 4.52s/it] {'loss': 0.3443, 'grad_norm': 0.6300589867991939, 'learning_rate': 7.648891858695542e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (52486 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50539 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7572/22095 [13:06:25<16:24:25, 4.07s/it] {'loss': 0.3277, 'grad_norm': 0.5986157076384225, 'learning_rate': 7.64827021345114e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7573/22095 [13:06:33<21:44:59, 5.39s/it] {'loss': 0.494, 'grad_norm': 0.3073571483471491, 'learning_rate': 7.647648511303545e-06, 'epoch': 0.34}
34%|███▍ | 7574/22095 [13:06:36<19:19:10, 4.79s/it] {'loss': 0.3809, 'grad_norm': 0.6350792333445885, 'learning_rate': 7.647026752266114e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
34%|███▍ | 7575/22095 [13:06:39<16:53:53, 4.19s/it] {'loss': 0.3917, 'grad_norm': 0.666039449246502, 'learning_rate': 7.64640493635221e-06, 'epoch': 0.34}
34%|███▍ | 7576/22095 [13:06:44<17:41:37, 4.39s/it] {'loss': 0.3641, 'grad_norm': 0.735552361903718, 'learning_rate': 7.64578306357519e-06, 'epoch': 0.34}
34%|███▍ | 7577/22095 [13:06:48<16:54:28, 4.19s/it] {'loss': 0.3148, 'grad_norm': 0.7251610813098117, 'learning_rate': 7.64516113394842e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [364, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8422438 in VC:s3://internvl-moe-sft-data/. Exception: Image size [364, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 145610, 'image': 'vrdu_texteq/astro-ph.CO/55feb00c-644b-406f-ba9f-2e396bc91260.png', 'image_wh': [[364, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $k_{\\rm max}$ is the cutoff scale.'}]}
34%|███▍ | 7578/22095 [13:06:51<15:26:28, 3.83s/it] {'loss': 0.3425, 'grad_norm': 0.6705458498226945, 'learning_rate': 7.64453914748526e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (43874 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7579/22095 [13:06:55<15:34:18, 3.86s/it] {'loss': 0.366, 'grad_norm': 0.6576445664417572, 'learning_rate': 7.643917104199076e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7580/22095 [13:07:00<17:44:35, 4.40s/it] {'loss': 0.4816, 'grad_norm': 0.35501970819457634, 'learning_rate': 7.643295004103232e-06, 'epoch': 0.34}
34%|███▍ | 7581/22095 [13:07:04<16:55:31, 4.20s/it] {'loss': 0.3533, 'grad_norm': 0.763870455320076, 'learning_rate': 7.6426728472111e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (43035 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (148488 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7582/22095 [13:07:08<16:17:27, 4.04s/it] {'loss': 0.3617, 'grad_norm': 0.6554494379318401, 'learning_rate': 7.642050633536042e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (73489 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47408 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66099 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7583/22095 [13:07:11<14:50:46, 3.68s/it] {'loss': 0.3789, 'grad_norm': 0.6627115772554796, 'learning_rate': 7.641428363091431e-06, 'epoch': 0.34}
34%|███▍ | 7584/22095 [13:07:14<14:40:00, 3.64s/it] {'loss': 0.3426, 'grad_norm': 0.6152424957501429, 'learning_rate': 7.640806035890637e-06, 'epoch': 0.34}
34%|███▍ | 7585/22095 [13:07:17<14:01:17, 3.48s/it] {'loss': 0.3383, 'grad_norm': 0.6534631216532331, 'learning_rate': 7.640183651947033e-06, 'epoch': 0.34}
34%|███▍ | 7586/22095 [13:07:21<14:15:11, 3.54s/it] {'loss': 0.3749, 'grad_norm': 0.7446778963304378, 'learning_rate': 7.639561211273989e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7587/22095 [13:07:30<21:18:56, 5.29s/it] {'loss': 0.4847, 'grad_norm': 0.31481139433953165, 'learning_rate': 7.638938713884883e-06, 'epoch': 0.34}
34%|███▍ | 7588/22095 [13:07:34<19:53:32, 4.94s/it] {'loss': 0.3569, 'grad_norm': 0.5922040233371281, 'learning_rate': 7.638316159793089e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (77861 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73583 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7589/22095 [13:07:38<18:18:50, 4.55s/it] {'loss': 0.3517, 'grad_norm': 0.6249165881142379, 'learning_rate': 7.637693549011983e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
34%|███▍ | 7590/22095 [13:07:47<23:45:52, 5.90s/it] {'loss': 0.4812, 'grad_norm': 0.3221901775760477, 'learning_rate': 7.637070881554944e-06, 'epoch': 0.34}
34%|███▍ | 7591/22095 [13:07:55<25:40:03, 6.37s/it] {'loss': 0.4885, 'grad_norm': 0.3070442368662165, 'learning_rate': 7.63644815743535e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 364, but got module 1
34%|███▍ | 7592/22095 [13:07:59<23:16:07, 5.78s/it] {'loss': 0.3514, 'grad_norm': 0.6727977151365402, 'learning_rate': 7.635825376666584e-06, 'epoch': 0.34}
34%|███▍ | 7593/22095 [13:08:02<20:32:34, 5.10s/it] {'loss': 0.3718, 'grad_norm': 0.6778367970179924, 'learning_rate': 7.635202539262025e-06, 'epoch': 0.34}
34%|███▍ | 7594/22095 [13:08:06<18:11:12, 4.52s/it] {'loss': 0.3678, 'grad_norm': 0.6923427020612646, 'learning_rate': 7.634579645235056e-06, 'epoch': 0.34}
34%|███▍ | 7595/22095 [13:08:09<16:34:00, 4.11s/it] {'loss': 0.3305, 'grad_norm': 0.6398153691505553, 'learning_rate': 7.633956694599063e-06, 'epoch': 0.34}
34%|███▍ | 7596/22095 [13:08:13<16:21:20, 4.06s/it] {'loss': 0.4038, 'grad_norm': 0.658762949535757, 'learning_rate': 7.63333368736743e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46237 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61563 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7597/22095 [13:08:23<23:38:36, 5.87s/it] {'loss': 0.4857, 'grad_norm': 0.36541076844789505, 'learning_rate': 7.632710623553543e-06, 'epoch': 0.34}
34%|███▍ | 7598/22095 [13:08:26<20:30:47, 5.09s/it] {'loss': 0.3249, 'grad_norm': 0.6564004935929247, 'learning_rate': 7.632087503170793e-06, 'epoch': 0.34}
34%|███▍ | 7599/22095 [13:08:30<18:30:16, 4.60s/it] {'loss': 0.3469, 'grad_norm': 0.6711467037755176, 'learning_rate': 7.631464326232562e-06, 'epoch': 0.34}
34%|███▍ | 7600/22095 [13:08:33<17:33:05, 4.36s/it] {'loss': 0.3429, 'grad_norm': 0.645995639153286, 'learning_rate': 7.630841092752248e-06, 'epoch': 0.34}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8351636 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 18314, 'image': 'vrdu_table_final_2/astro-ph.CO/d3eaa779-9fee-4893-9f73-cc1ae65dfef0.png', 'image_wh': [[14, 56]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #1\\\\#2\n \\end{tabular}\n```"}]}
34%|███▍ | 7601/22095 [13:08:39<18:36:40, 4.62s/it] {'loss': 0.3921, 'grad_norm': 0.6707935052593224, 'learning_rate': 7.630217802743238e-06, 'epoch': 0.34}
34%|███▍ | 7602/22095 [13:08:42<17:15:23, 4.29s/it] {'loss': 0.3883, 'grad_norm': 0.611645996050189, 'learning_rate': 7.629594456218926e-06, 'epoch': 0.34}
34%|███▍ | 7603/22095 [13:08:45<16:03:47, 3.99s/it] {'loss': 0.3792, 'grad_norm': 0.7375476165970931, 'learning_rate': 7.628971053192705e-06, 'epoch': 0.34}
34%|███▍ | 7604/22095 [13:08:49<15:18:06, 3.80s/it] {'loss': 0.377, 'grad_norm': 0.9643445777519964, 'learning_rate': 7.628347593677969e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7605/22095 [13:08:55<18:50:23, 4.68s/it] {'loss': 0.4879, 'grad_norm': 0.42861613394280557, 'learning_rate': 7.6277240776881175e-06, 'epoch': 0.34}
34%|███▍ | 7606/22095 [13:08:59<17:25:52, 4.33s/it] {'loss': 0.3676, 'grad_norm': 0.6210216944863378, 'learning_rate': 7.6271005052365465e-06, 'epoch': 0.34}
34%|███▍ | 7607/22095 [13:09:02<15:56:40, 3.96s/it] {'loss': 0.3545, 'grad_norm': 0.636479508470854, 'learning_rate': 7.6264768763366525e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (41406 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47029 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121168 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57170 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89415 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7608/22095 [13:09:05<14:55:45, 3.71s/it] {'loss': 0.3568, 'grad_norm': 0.6229405722005883, 'learning_rate': 7.6258531910018375e-06, 'epoch': 0.34}
34%|███▍ | 7609/22095 [13:09:08<13:54:01, 3.45s/it] {'loss': 0.323, 'grad_norm': 0.5915074305794326, 'learning_rate': 7.625229449245501e-06, 'epoch': 0.34}
34%|███▍ | 7610/22095 [13:09:12<14:10:09, 3.52s/it] {'loss': 0.3412, 'grad_norm': 0.5860717764241135, 'learning_rate': 7.624605651081049e-06, 'epoch': 0.34}
34%|███▍ | 7611/22095 [13:09:15<13:21:33, 3.32s/it] {'loss': 0.3505, 'grad_norm': 0.7414526006320594, 'learning_rate': 7.62398179652188e-06, 'epoch': 0.34}
34%|███▍ | 7612/22095 [13:09:18<13:44:12, 3.41s/it] {'loss': 0.3978, 'grad_norm': 0.6203614209578381, 'learning_rate': 7.623357885581403e-06, 'epoch': 0.34}
34%|███▍ | 7613/22095 [13:09:22<13:35:52, 3.38s/it] {'loss': 0.365, 'grad_norm': 0.7438136046629493, 'learning_rate': 7.622733918273021e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (56202 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63286 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88887 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50724 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7614/22095 [13:09:25<13:51:28, 3.45s/it] {'loss': 0.3576, 'grad_norm': 0.6351053778164822, 'learning_rate': 7.6221098946101415e-06, 'epoch': 0.34}
34%|███▍ | 7615/22095 [13:09:28<13:16:46, 3.30s/it] {'loss': 0.3144, 'grad_norm': 0.6493836667947682, 'learning_rate': 7.621485814606175e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7616/22095 [13:09:38<20:52:19, 5.19s/it] {'loss': 0.4993, 'grad_norm': 0.4247953999482638, 'learning_rate': 7.62086167827453e-06, 'epoch': 0.34}
34%|███▍ | 7617/22095 [13:09:41<18:21:18, 4.56s/it] {'loss': 0.3643, 'grad_norm': 0.6601210723159385, 'learning_rate': 7.620237485628614e-06, 'epoch': 0.34}
Token indices sequence length is longer than the specified maximum sequence length for this model (92301 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118328 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87364 > 40960). Running this sequence through the model will result in indexing errors
34%|███▍ | 7618/22095 [13:09:45<17:37:56, 4.38s/it] {'loss': 0.3511, 'grad_norm': 0.618361473474604, 'learning_rate': 7.619613236681845e-06, 'epoch': 0.34}
34%|███▍ | 7619/22095 [13:09:48<15:47:40, 3.93s/it] {'loss': 0.3331, 'grad_norm': 0.6427019523563446, 'learning_rate': 7.618988931447633e-06, 'epoch': 0.34}
34%|███▍ | 7620/22095 [13:09:51<14:36:39, 3.63s/it] {'loss': 0.3763, 'grad_norm': 0.6811436509905098, 'learning_rate': 7.61836456993939e-06, 'epoch': 0.34}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
34%|███▍ | 7621/22095 [13:09:55<15:07:43, 3.76s/it] {'loss': 0.3439, 'grad_norm': 0.752962159483992, 'learning_rate': 7.617740152170536e-06, 'epoch': 0.34}
Invalidate trace cache @ step 2: expected module 1, but got module 364
34%|███▍ | 7622/22095 [13:10:05<22:40:02, 5.64s/it] {'loss': 0.4688, 'grad_norm': 0.320214585605932, 'learning_rate': 7.617115678154485e-06, 'epoch': 0.34}
35%|███▍ | 7623/22095 [13:10:08<20:01:43, 4.98s/it] {'loss': 0.3607, 'grad_norm': 0.6349661881515777, 'learning_rate': 7.616491147904657e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7624/22095 [13:10:12<18:08:57, 4.52s/it] {'loss': 0.3703, 'grad_norm': 0.6861805691638369, 'learning_rate': 7.615866561434468e-06, 'epoch': 0.35}
35%|███▍ | 7625/22095 [13:10:15<17:16:47, 4.30s/it] {'loss': 0.3272, 'grad_norm': 0.6198289549983903, 'learning_rate': 7.615241918757343e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
35%|███▍ | 7626/22095 [13:10:21<19:06:20, 4.75s/it] {'loss': 0.4649, 'grad_norm': 0.2929009723502456, 'learning_rate': 7.614617219886699e-06, 'epoch': 0.35}
35%|███▍ | 7627/22095 [13:10:25<17:38:55, 4.39s/it] {'loss': 0.3646, 'grad_norm': 0.6858863221642914, 'learning_rate': 7.613992464835964e-06, 'epoch': 0.35}
35%|███▍ | 7628/22095 [13:10:28<15:51:17, 3.95s/it] {'loss': 0.3563, 'grad_norm': 0.6786937831026911, 'learning_rate': 7.613367653618558e-06, 'epoch': 0.35}
35%|███▍ | 7629/22095 [13:10:31<14:45:51, 3.67s/it] {'loss': 0.3916, 'grad_norm': 0.7177119428531356, 'learning_rate': 7.612742786247906e-06, 'epoch': 0.35}
35%|███▍ | 7630/22095 [13:10:34<14:29:09, 3.61s/it] {'loss': 0.3232, 'grad_norm': 0.603190059176882, 'learning_rate': 7.612117862737437e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7631/22095 [13:10:38<15:18:00, 3.81s/it] {'loss': 0.3464, 'grad_norm': 0.5944795316400319, 'learning_rate': 7.611492883100579e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7632/22095 [13:10:42<15:05:09, 3.76s/it] {'loss': 0.3884, 'grad_norm': 0.6558781026577434, 'learning_rate': 7.610867847350758e-06, 'epoch': 0.35}
35%|███▍ | 7633/22095 [13:10:46<15:18:01, 3.81s/it] {'loss': 0.3355, 'grad_norm': 0.637930146367971, 'learning_rate': 7.610242755501404e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7634/22095 [13:10:50<15:14:32, 3.79s/it] {'loss': 0.341, 'grad_norm': 0.5539742893944569, 'learning_rate': 7.6096176075659535e-06, 'epoch': 0.35}
35%|███▍ | 7635/22095 [13:10:53<14:54:08, 3.71s/it] {'loss': 0.3933, 'grad_norm': 0.6214660355339781, 'learning_rate': 7.608992403557833e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7636/22095 [13:10:57<14:40:12, 3.65s/it] {'loss': 0.3566, 'grad_norm': 0.6163769966528868, 'learning_rate': 7.60836714349048e-06, 'epoch': 0.35}
35%|███▍ | 7637/22095 [13:11:01<14:55:12, 3.72s/it] {'loss': 0.3892, 'grad_norm': 0.6619757585034614, 'learning_rate': 7.607741827377329e-06, 'epoch': 0.35}
35%|███▍ | 7638/22095 [13:11:04<14:21:14, 3.57s/it] {'loss': 0.3253, 'grad_norm': 0.6139244345551975, 'learning_rate': 7.607116455231811e-06, 'epoch': 0.35}
35%|███▍ | 7639/22095 [13:11:07<13:47:03, 3.43s/it] {'loss': 0.3575, 'grad_norm': 1.0639117936927491, 'learning_rate': 7.606491027067372e-06, 'epoch': 0.35}
35%|███▍ | 7640/22095 [13:11:10<13:40:56, 3.41s/it] {'loss': 0.3811, 'grad_norm': 0.71432928918362, 'learning_rate': 7.605865542897443e-06, 'epoch': 0.35}
35%|███▍ | 7641/22095 [13:11:14<14:07:44, 3.52s/it] {'loss': 0.3792, 'grad_norm': 0.59747298091808, 'learning_rate': 7.605240002735469e-06, 'epoch': 0.35}
35%|███▍ | 7642/22095 [13:11:18<14:47:10, 3.68s/it] {'loss': 0.3769, 'grad_norm': 0.6090730545711344, 'learning_rate': 7.604614406594888e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (95611 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144919 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91450 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍ | 7643/22095 [13:11:22<14:27:55, 3.60s/it] {'loss': 0.3977, 'grad_norm': 0.6527221123705976, 'learning_rate': 7.603988754489142e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (50113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43192 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍ | 7644/22095 [13:11:25<13:43:04, 3.42s/it] {'loss': 0.3345, 'grad_norm': 0.6269383326587715, 'learning_rate': 7.603363046431676e-06, 'epoch': 0.35}
35%|███▍ | 7645/22095 [13:11:28<13:26:21, 3.35s/it] {'loss': 0.383, 'grad_norm': 0.6320898437814025, 'learning_rate': 7.6027372824359336e-06, 'epoch': 0.35}
35%|███▍ | 7646/22095 [13:11:31<12:50:53, 3.20s/it] {'loss': 0.369, 'grad_norm': 0.6182188002592361, 'learning_rate': 7.60211146251536e-06, 'epoch': 0.35}
35%|███▍ | 7647/22095 [13:11:34<13:19:09, 3.32s/it] {'loss': 0.3778, 'grad_norm': 0.6648400989717609, 'learning_rate': 7.601485586683404e-06, 'epoch': 0.35}
35%|███▍ | 7648/22095 [13:11:38<13:58:11, 3.48s/it] {'loss': 0.3981, 'grad_norm': 0.640924490371323, 'learning_rate': 7.600859654953513e-06, 'epoch': 0.35}
35%|███▍ | 7649/22095 [13:11:41<13:43:47, 3.42s/it] {'loss': 0.3515, 'grad_norm': 0.657510092475443, 'learning_rate': 7.600233667339134e-06, 'epoch': 0.35}
35%|███▍ | 7650/22095 [13:11:44<13:13:53, 3.30s/it] {'loss': 0.3721, 'grad_norm': 0.6443967712541337, 'learning_rate': 7.599607623853722e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
35%|███▍ | 7651/22095 [13:11:53<19:09:10, 4.77s/it] {'loss': 0.4724, 'grad_norm': 0.3703216977234791, 'learning_rate': 7.5989815245107235e-06, 'epoch': 0.35}
35%|███▍ | 7652/22095 [13:12:02<25:14:14, 6.29s/it] {'loss': 0.4883, 'grad_norm': 0.31900130505503843, 'learning_rate': 7.5983553693235955e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7653/22095 [13:12:06<22:27:25, 5.60s/it] {'loss': 0.366, 'grad_norm': 0.668104250745774, 'learning_rate': 7.597729158305791e-06, 'epoch': 0.35}
35%|███▍ | 7654/22095 [13:12:10<20:05:08, 5.01s/it] {'loss': 0.3459, 'grad_norm': 0.6319615814150348, 'learning_rate': 7.597102891470766e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7655/22095 [13:12:13<17:50:52, 4.45s/it] {'loss': 0.3496, 'grad_norm': 0.62178518968914, 'learning_rate': 7.596476568831974e-06, 'epoch': 0.35}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (138201600 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
35%|███▍ | 7656/22095 [13:12:16<16:01:19, 3.99s/it] {'loss': 0.3431, 'grad_norm': 0.7020410645187021, 'learning_rate': 7.595850190402877e-06, 'epoch': 0.35}
35%|███▍ | 7657/22095 [13:12:20<15:47:59, 3.94s/it] {'loss': 0.3603, 'grad_norm': 0.6481777084808226, 'learning_rate': 7.595223756196931e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (43783 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87660 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90549 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍ | 7658/22095 [13:12:23<15:25:13, 3.85s/it] {'loss': 0.3261, 'grad_norm': 0.6725199795510061, 'learning_rate': 7.594597266227599e-06, 'epoch': 0.35}
35%|███▍ | 7659/22095 [13:12:27<15:05:30, 3.76s/it] {'loss': 0.3911, 'grad_norm': 0.6695539890315392, 'learning_rate': 7.593970720508337e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358337 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25049, 'image': 'vrdu_table_final_2/astro-ph.CO/81d00d9a-4b9c-4d25-a15e-ffbe78301b1f.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
35%|███▍ | 7660/22095 [13:12:30<14:02:04, 3.50s/it] {'loss': 0.329, 'grad_norm': 0.6354345747973272, 'learning_rate': 7.5933441190526146e-06, 'epoch': 0.35}
35%|███▍ | 7661/22095 [13:12:34<14:25:53, 3.60s/it] {'loss': 0.3688, 'grad_norm': 0.6397382074660813, 'learning_rate': 7.59271746187389e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍ | 7662/22095 [13:12:37<14:18:46, 3.57s/it] {'loss': 0.4006, 'grad_norm': 0.6931351805945015, 'learning_rate': 7.59209074898563e-06, 'epoch': 0.35}
35%|███▍ | 7663/22095 [13:12:40<13:29:02, 3.36s/it] {'loss': 0.3619, 'grad_norm': 0.6902782157655498, 'learning_rate': 7.591463980401302e-06, 'epoch': 0.35}
35%|███▍ | 7664/22095 [13:12:43<12:52:49, 3.21s/it] {'loss': 0.3402, 'grad_norm': 0.6431808265068502, 'learning_rate': 7.59083715613437e-06, 'epoch': 0.35}
35%|███▍ | 7665/22095 [13:12:47<13:43:46, 3.43s/it] {'loss': 0.3691, 'grad_norm': 0.6268809460813694, 'learning_rate': 7.590210276198305e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
35%|███▍ | 7666/22095 [13:12:57<21:09:19, 5.28s/it] {'loss': 0.486, 'grad_norm': 0.5677407303711991, 'learning_rate':
7.589583340606579e-06, 'epoch': 0.35} 35%|███▍ | 7666/22095 [13:12:57<21:09:19, 5.28s/it] 35%|███▍ | 7667/22095 [13:13:00<19:02:34, 4.75s/it] {'loss': 0.3523, 'grad_norm': 0.6379834997740906, 'learning_rate': 7.588956349372657e-06, 'epoch': 0.35} 35%|███▍ | 7667/22095 [13:13:00<19:02:34, 4.75s/it] 35%|███▍ | 7668/22095 [13:13:03<16:42:06, 4.17s/it] {'loss': 0.3662, 'grad_norm': 0.654715150775096, 'learning_rate': 7.588329302510017e-06, 'epoch': 0.35} 35%|███▍ | 7668/22095 [13:13:03<16:42:06, 4.17s/it] 35%|███▍ | 7669/22095 [13:13:06<15:24:43, 3.85s/it] {'loss': 0.3515, 'grad_norm': 0.6607985524070082, 'learning_rate': 7.5877022000321285e-06, 'epoch': 0.35} 35%|███▍ | 7669/22095 [13:13:06<15:24:43, 3.85s/it] 35%|███▍ | 7670/22095 [13:13:09<14:31:28, 3.62s/it] {'loss': 0.3463, 'grad_norm': 1.0173624912637775, 'learning_rate': 7.5870750419524675e-06, 'epoch': 0.35} 35%|███▍ | 7670/22095 [13:13:09<14:31:28, 3.62s/it] 35%|███▍ | 7671/22095 [13:13:13<15:19:00, 3.82s/it] {'loss': 0.3976, 'grad_norm': 0.6332869161692951, 'learning_rate': 7.586447828284509e-06, 'epoch': 0.35} 35%|███▍ | 7671/22095 [13:13:13<15:19:00, 3.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8403477 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5650, 'image': 'vrdu_table_final_2/astro-ph.CO/e26eb4cc-ccc7-44cf-b895-f07fb10075dc.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]} 35%|███▍ | 7672/22095 [13:13:17<14:40:06, 3.66s/it] {'loss': 0.3406, 'grad_norm': 0.64736306872132, 'learning_rate': 7.58582055904173e-06, 'epoch': 0.35} 35%|███▍ | 7672/22095 [13:13:17<14:40:06, 3.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66108 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131850 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44779 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▍ | 7673/22095 [13:13:20<14:49:02, 3.70s/it] {'loss': 0.4192, 'grad_norm': 0.6573305867265749, 'learning_rate': 7.585193234237611e-06, 'epoch': 0.35} 35%|███▍ | 7673/22095 [13:13:20<14:49:02, 3.70s/it] 35%|███▍ | 7674/22095 [13:13:23<13:49:55, 3.45s/it] {'loss': 0.3202, 'grad_norm': 0.635163449801587, 'learning_rate': 7.584565853885627e-06, 'epoch': 0.35} 35%|███▍ | 7674/22095 [13:13:23<13:49:55, 3.45s/it] 35%|███▍ | 7675/22095 [13:13:27<14:40:15, 3.66s/it] {'loss': 0.3622, 'grad_norm': 0.6196122097101256, 'learning_rate': 7.583938417999261e-06, 'epoch': 0.35} 35%|███▍ | 7675/22095 [13:13:27<14:40:15, 3.66s/it] 35%|███▍ | 7676/22095 [13:13:30<13:45:17, 3.43s/it] {'loss': 0.3226, 'grad_norm': 0.6688456245347942, 'learning_rate': 7.5833109265919955e-06, 'epoch': 0.35} 35%|███▍ | 7676/22095 [13:13:30<13:45:17, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▍ | 7677/22095 [13:13:33<12:56:35, 3.23s/it] {'loss': 0.3486, 'grad_norm': 0.6659623665378279, 'learning_rate': 7.5826833796773115e-06, 'epoch': 0.35} 35%|███▍ | 7677/22095 [13:13:33<12:56:35, 3.23s/it] 35%|███▍ | 7678/22095 [13:13:37<13:13:47, 3.30s/it] {'loss': 0.3452, 'grad_norm': 0.6305891165742237, 'learning_rate': 7.582055777268693e-06, 'epoch': 0.35} 35%|███▍ | 7678/22095 [13:13:37<13:13:47, 3.30s/it] 35%|███▍ | 7679/22095 [13:13:39<12:38:39, 3.16s/it] {'loss': 0.3282, 'grad_norm': 0.6338385838715717, 'learning_rate': 7.581428119379628e-06, 'epoch': 0.35} 35%|███▍ | 7679/22095 [13:13:39<12:38:39, 3.16s/it] 35%|███▍ | 7680/22095 [13:13:43<12:44:55, 3.18s/it] {'loss': 0.3418, 'grad_norm': 0.6191310672713078, 'learning_rate': 7.5808004060235995e-06, 'epoch': 0.35} 35%|███▍ | 7680/22095 [13:13:43<12:44:55, 3.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 
0: Fixed image tokens in the conversation 35%|███▍ | 7681/22095 [13:13:52<20:16:52, 5.07s/it] {'loss': 0.4705, 'grad_norm': 0.48046456954070565, 'learning_rate': 7.580172637214098e-06, 'epoch': 0.35} 35%|███▍ | 7681/22095 [13:13:52<20:16:52, 5.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▍ | 7682/22095 [13:13:55<17:56:48, 4.48s/it] {'loss': 0.347, 'grad_norm': 0.6075556600854566, 'learning_rate': 7.57954481296461e-06, 'epoch': 0.35} 35%|███▍ | 7682/22095 [13:13:55<17:56:48, 4.48s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_2/images/step_0.png 2025-08-28 05:11:53.994883 load time: 1118.31 ms 35%|███▍ | 7683/22095 [13:13:59<17:09:46, 4.29s/it] {'loss': 0.3466, 'grad_norm': 0.7261897682432522, 'learning_rate': 7.5789169332886255e-06, 'epoch': 0.35} 35%|███▍ | 7683/22095 [13:13:59<17:09:46, 4.29s/it] 35%|███▍ | 7684/22095 [13:14:02<15:26:45, 3.86s/it] {'loss': 0.3595, 'grad_norm': 0.743757384841053, 'learning_rate': 7.578288998199638e-06, 'epoch': 0.35} 35%|███▍ | 7684/22095 [13:14:02<15:26:45, 3.86s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250505_001007_2/images/before_screenshot_14_id_85_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:12:01.796926 load time: 1568.7 ms 35%|███▍ | 7685/22095 [13:14:05<14:24:14, 3.60s/it] {'loss': 0.3439, 'grad_norm': 0.6174087207637071, 'learning_rate': 7.5776610077111375e-06, 'epoch': 0.35} 35%|███▍ | 7685/22095 [13:14:05<14:24:14, 3.60s/it]VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_104/img/step_0.png 2025-08-28 05:12:03.678928 load time: 1459.76 ms 35%|███▍ | 7686/22095 [13:14:08<13:31:58, 3.38s/it] {'loss': 0.3457, 'grad_norm': 0.7420086401146211, 'learning_rate': 7.577032961836619e-06, 'epoch': 0.35} 35%|███▍ | 7686/22095 [13:14:08<13:31:58, 3.38s/it] 35%|███▍ | 7687/22095 [13:14:11<13:55:11, 3.48s/it] {'loss': 0.3644, 'grad_norm': 
0.6720111974672807, 'learning_rate': 7.576404860589579e-06, 'epoch': 0.35} 35%|███▍ | 7687/22095 [13:14:11<13:55:11, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 35%|███▍ | 7688/22095 [13:14:22<22:57:46, 5.74s/it] {'loss': 0.5039, 'grad_norm': 0.39524473437736063, 'learning_rate': 7.575776703983508e-06, 'epoch': 0.35} 35%|███▍ | 7688/22095 [13:14:22<22:57:46, 5.74s/it] 35%|███▍ | 7689/22095 [13:14:26<20:48:22, 5.20s/it] {'loss': 0.3552, 'grad_norm': 0.6664363625429858, 'learning_rate': 7.575148492031908e-06, 'epoch': 0.35} 35%|███▍ | 7689/22095 [13:14:26<20:48:22, 5.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81220 > 40960). Running this sequence through the model will result in indexing errors 35%|███▍ | 7690/22095 [13:14:30<19:10:18, 4.79s/it] {'loss': 0.4008, 'grad_norm': 0.7055820427991333, 'learning_rate': 7.574520224748276e-06, 'epoch': 0.35} 35%|███▍ | 7690/22095 [13:14:30<19:10:18, 4.79s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 05:12:29.048056 load time: 1406.71 ms 35%|███▍ | 7691/22095 [13:14:34<17:30:23, 4.38s/it] {'loss': 0.3893, 'grad_norm': 0.8020328548076989, 'learning_rate': 7.573891902146111e-06, 'epoch': 0.35} 35%|███▍ | 7691/22095 [13:14:34<17:30:23, 4.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 05:12:33.495268 load time: 1093.85 ms 
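The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries above are raised by a dataset-side sanity check in `data_qwen_2.py` (the actual code is not shown in this log). A minimal sketch of such a guard, with hypothetical names (`MIN_IMAGE_SIDE`, `validate_image_size`, `is_usable`), might look like:

```python
# Hypothetical sketch of the minimum-image-size guard behind the
# "Image size ... is too small. Minimum size is 28." errors in the log.
# Names here are illustrative, not the actual data_qwen_2.py implementation.
MIN_IMAGE_SIDE = 28  # sides smaller than this cannot be patchified by the vision encoder

def validate_image_size(width: int, height: int) -> None:
    """Raise ValueError for images too small to process, mirroring the log message."""
    if min(width, height) < MIN_IMAGE_SIDE:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {MIN_IMAGE_SIDE}."
        )

def is_usable(width: int, height: int) -> bool:
    """Filter predicate: True if the sample's image passes the size check."""
    try:
        validate_image_size(width, height)
        return True
    except ValueError:
        return False
```

With a pre-filter like this, tiny table crops such as the 14x23 px sample above would be dropped before training rather than retried and logged as fetch failures.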
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 05:12:33.050640 load time: 1507.1 ms
35%|███▍| 7692/22095 [13:14:42<21:46:40, 5.44s/it] {'loss': 0.4878, 'grad_norm': 0.3070808301664086, 'learning_rate': 7.573263524238914e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (42723 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44505 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44561 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106968 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126023 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62993 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍| 7693/22095 [13:14:45<19:50:33, 4.96s/it] {'loss': 0.3793, 'grad_norm': 0.653831880465432, 'learning_rate': 7.572635091040188e-06, 'epoch': 0.35}
35%|███▍| 7694/22095 [13:14:48<17:22:41, 4.34s/it] {'loss': 0.3251, 'grad_norm': 0.5919767279825535, 'learning_rate': 7.572006602563434e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922258 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45411, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 12cm\nB. 6cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
35%|███▍| 7695/22095 [13:14:51<15:49:43, 3.96s/it] {'loss': 0.3583, 'grad_norm': 0.6968740648600568, 'learning_rate': 7.571378058822159e-06, 'epoch': 0.35}
35%|███▍| 7696/22095 [13:14:56<16:27:21, 4.11s/it] {'loss': 0.4116, 'grad_norm': 0.6407502883602284, 'learning_rate': 7.570749459829865e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
35%|███▍| 7697/22095 [13:15:02<18:27:33, 4.62s/it] {'loss': 0.4956, 'grad_norm': 0.32059540456082625, 'learning_rate': 7.570120805600063e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍| 7698/22095 [13:15:12<24:54:49, 6.23s/it] {'loss': 0.5189, 'grad_norm': 0.3229611926125327, 'learning_rate': 7.569492096146256e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
35%|███▍| 7699/22095 [13:15:15<21:24:29, 5.35s/it] {'loss': 0.3237, 'grad_norm': 1.1869411507300307, 'learning_rate': 7.568863331481957e-06, 'epoch': 0.35}
35%|███▍| 7700/22095 [13:15:18<19:01:36, 4.76s/it] {'loss': 0.3359, 'grad_norm': 0.650355888585891, 'learning_rate': 7.568234511620674e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348866 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15537, 'image': 'vrdu_table_final_2/astro-ph.CO/0cfb068f-e10a-4901-9661-1c1af4b6fbcf.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #2 \\\\\n \\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 05:13:19.673979 load time: 1090.9 ms
35%|███▍| 7701/22095 [13:15:28<24:39:39, 6.17s/it] {'loss': 0.4972, 'grad_norm': 0.32701937546089865, 'learning_rate': 7.567605636575919e-06, 'epoch': 0.35}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/30fa924765d794b1119ecbe77ef7c9d78045bed35211f46887007cb504e9c0b0.png 2025-08-28 05:13:26.574021 load time: 1004.98 ms
35%|███▍| 7702/22095 [13:15:31<21:36:55, 5.41s/it] {'loss': 0.3598, 'grad_norm': 0.6480347956638689, 'learning_rate': 7.566976706361204e-06, 'epoch': 0.35}
35%|███▍| 7703/22095 [13:15:35<19:56:56, 4.99s/it] {'loss': 0.3799, 'grad_norm': 0.670768338920183, 'learning_rate': 7.566347720990044e-06, 'epoch': 0.35}
35%|███▍| 7704/22095 [13:15:39<17:59:50, 4.50s/it] {'loss': 0.339, 'grad_norm': 0.614859753339743, 'learning_rate': 7.565718680475953e-06, 'epoch': 0.35}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38004.png 2025-08-28 05:13:37.947703 load time: 1507.86 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/d23d7144-6085-4f97-99fe-8fff75db1ef9/images/step_0.png 2025-08-28 05:13:38.769508 load time: 1143.67 ms
35%|███▍| 7705/22095 [13:15:42<16:18:03, 4.08s/it] {'loss': 0.3403, 'grad_norm': 0.8436747082401596, 'learning_rate': 7.565089584832448e-06, 'epoch': 0.35}
35%|███▍| 7706/22095 [13:15:46<16:07:29, 4.03s/it] {'loss': 0.3566, 'grad_norm': 0.6671309781608522, 'learning_rate': 7.564460434073047e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-28 05:13:45.751525 load time: 1306.75 ms
35%|███▍| 7707/22095 [13:15:49<15:03:20, 3.77s/it] {'loss': 0.3943, 'grad_norm': 0.7655776253067961, 'learning_rate': 7.563831228211266e-06, 'epoch': 0.35}
35%|███▍| 7708/22095 [13:15:52<14:42:59, 3.68s/it] {'loss': 0.3185, 'grad_norm': 0.6443313925396289, 'learning_rate': 7.563201967260627e-06, 'epoch': 0.35}
35%|███▍| 7709/22095 [13:15:56<14:38:13, 3.66s/it] {'loss': 0.3712, 'grad_norm': 0.6727203672910881, 'learning_rate': 7.562572651234649e-06, 'epoch': 0.35}
35%|███▍| 7710/22095 [13:16:00<14:43:52, 3.69s/it] {'loss': 0.3689, 'grad_norm': 0.6965609685868654, 'learning_rate': 7.561943280146856e-06, 'epoch': 0.35}
35%|███▍| 7711/22095 [13:16:03<13:50:21, 3.46s/it] {'loss': 0.354, 'grad_norm': 0.6877882531696877, 'learning_rate': 7.56131385401077e-06, 'epoch': 0.35}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (99586880 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
35%|███▍| 7712/22095 [13:16:06<14:08:20, 3.54s/it] {'loss': 0.3654, 'grad_norm': 0.6516043811205681, 'learning_rate': 7.560684372839915e-06, 'epoch': 0.35}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38019.png 2025-08-28 05:14:02.679814 load time: 1878.8 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (65266 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64660 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍| 7713/22095 [13:16:10<13:53:18, 3.48s/it] {'loss': 0.3211, 'grad_norm': 0.6131498119539656, 'learning_rate': 7.560054836647819e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (82423 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94555 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49381 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47393 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46224 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍| 7714/22095 [13:16:13<13:23:06, 3.35s/it] {'loss': 0.3683, 'grad_norm': 0.5903973295467017, 'learning_rate': 7.559425245448006e-06, 'epoch': 0.35}
35%|███▍| 7715/22095 [13:16:16<12:53:30, 3.23s/it] {'loss': 0.3272, 'grad_norm': 0.7211704582858878, 'learning_rate': 7.558795599254005e-06, 'epoch': 0.35}
35%|███▍| 7716/22095 [13:16:19<12:42:00, 3.18s/it] {'loss': 0.3561, 'grad_norm': 0.6511390209664222, 'learning_rate': 7.558165898079346e-06, 'epoch': 0.35}
35%|███▍| 7717/22095 [13:16:23<13:59:43, 3.50s/it] {'loss': 0.3496, 'grad_norm': 0.6037263667339409, 'learning_rate': 7.5575361419375585e-06, 'epoch': 0.35}
35%|███▍| 7718/22095 [13:16:26<13:44:23, 3.44s/it] {'loss': 0.3612, 'grad_norm': 0.6542931110891881, 'learning_rate': 7.556906330842174e-06, 'epoch': 0.35}
35%|███▍| 7719/22095 [13:16:30<14:08:08, 3.54s/it] {'loss': 0.3447, 'grad_norm': 0.6544269421961227, 'learning_rate': 7.556276464806725e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍| 7720/22095 [13:16:33<13:30:00, 3.38s/it] {'loss': 0.352, 'grad_norm': 0.6179559855338046, 'learning_rate': 7.555646543844747e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
35%|███▍| 7721/22095 [13:16:41<18:34:18, 4.65s/it] {'loss': 0.4923, 'grad_norm': 0.3694286494252386, 'learning_rate': 7.555016567969773e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887881 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11034, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
35%|███▍| 7722/22095 [13:16:50<24:21:48, 6.10s/it] {'loss': 0.4915, 'grad_norm': 0.3571334190934288, 'learning_rate': 7.554386537195339e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
35%|███▍| 7723/22095 [13:16:54<21:30:23, 5.39s/it] {'loss': 0.3545, 'grad_norm': 0.639870037309397, 'learning_rate': 7.553756451534984e-06, 'epoch': 0.35}
35%|███▍| 7724/22095 [13:16:58<19:49:39, 4.97s/it] {'loss': 0.3035, 'grad_norm': 0.6321033788590497, 'learning_rate': 7.553126311002248e-06, 'epoch': 0.35}
35%|███▍| 7725/22095 [13:17:01<17:53:31, 4.48s/it] {'loss': 0.3517, 'grad_norm': 0.725958941617707, 'learning_rate': 7.552496115610668e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884878 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8031, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 6\nB. 10\nC. 8\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30267.png 2025-08-28 05:14:58.980352 load time: 1129.83 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8336071 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 2690, 'image': 'vrdu_table_final_2/astro-ph.CO/a7273bdf-7d69-4d48-981d-d149d665104a.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]}
35%|███▍| 7726/22095 [13:17:11<23:33:45, 5.90s/it] {'loss': 0.5128, 'grad_norm': 0.35637223581074784, 'learning_rate': 7.5518658653737844e-06, 'epoch': 0.35}
35%|███▍| 7727/22095 [13:17:14<20:45:46, 5.20s/it] {'loss': 0.368, 'grad_norm': 0.6564009265796242, 'learning_rate': 7.551235560305142e-06, 'epoch': 0.35}
35%|███▍| 7728/22095 [13:17:17<18:32:14, 4.64s/it] {'loss': 0.3633, 'grad_norm': 0.6910617140704731, 'learning_rate': 7.550605200418283e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
35%|███▍| 7729/22095 [13:17:21<16:47:48, 4.21s/it] {'loss': 0.4214, 'grad_norm': 0.7162155865789955, 'learning_rate': 7.549974785726753e-06, 'epoch': 0.35}
35%|███▍| 7730/22095 [13:17:24<15:35:09, 3.91s/it] {'loss': 0.3669, 'grad_norm': 0.6295258356979103, 'learning_rate': 7.549344316244094e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (53500 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50096 > 40960). Running this sequence through the model will result in indexing errors
35%|███▍| 7731/22095 [13:17:27<14:34:26, 3.65s/it] {'loss': 0.3781, 'grad_norm': 0.8504213878378942, 'learning_rate': 7.548713791983857e-06, 'epoch': 0.35}
35%|███▍| 7732/22095 [13:17:31<14:49:38, 3.72s/it] {'loss': 0.3408, 'grad_norm': 0.6899801762438271, 'learning_rate': 7.548083212959588e-06, 'epoch': 0.35}
35%|███▍| 7733/22095 [13:17:34<14:27:07, 3.62s/it] {'loss': 0.3435, 'grad_norm': 0.6768212134505426, 'learning_rate': 7.547452579184836e-06, 'epoch': 0.35}
35%|███▌| 7734/22095 [13:17:39<15:32:10, 3.89s/it] {'loss': 0.3902, 'grad_norm': 0.6615478961791598, 'learning_rate': 7.546821890673153e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/images/AI/handmade_annotation_4/images/Ai_14_id_1_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 05:15:38.140638 load time: 1110.28 ms
35%|███▌| 7735/22095 [13:17:43<15:27:19, 3.87s/it] {'loss': 0.3873, 'grad_norm': 0.6393455474883397, 'learning_rate': 7.546191147438089e-06, 'epoch': 0.35}
35%|███▌| 7736/22095 [13:17:46<14:37:15, 3.67s/it] {'loss': 0.3244, 'grad_norm': 0.6611638513107325, 'learning_rate': 7.545560349493197e-06, 'epoch': 0.35}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31191.png 2025-08-28 05:15:43.645712 load time: 1375.21 ms
35%|███▌| 7737/22095 [13:17:49<13:35:32, 3.41s/it] {'loss': 0.37, 'grad_norm': 0.6593511958016057, 'learning_rate': 7.544929496852033e-06, 'epoch': 0.35}
35%|███▌| 7738/22095 [13:17:51<13:00:43, 3.26s/it] {'loss': 0.346, 'grad_norm': 0.9726619848240113, 'learning_rate': 7.544298589528148e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/34453.png 2025-08-28 05:15:50.266695 load time: 1167.77 ms
35%|███▌| 7739/22095 [13:17:59<18:24:05, 4.61s/it] {'loss': 0.4948, 'grad_norm': 0.521846771020006, 'learning_rate': 7.5436676275351e-06, 'epoch': 0.35}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 05:15:59.238179 load time: 1264.53 ms
35%|███▌| 7740/22095 [13:18:03<17:04:15, 4.28s/it] {'loss': 0.3474, 'grad_norm': 0.6224165394621645, 'learning_rate': 7.54303661088645e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20406.png 2025-08-28 05:16:03.174895 load time: 1142.9 ms
35%|███▌| 7741/22095 [13:18:07<17:04:08, 4.28s/it] {'loss': 0.3299, 'grad_norm': 0.6603027913044086, 'learning_rate': 7.542405539595752e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (63542 > 40960). Running this sequence through the model will result in indexing errors
35%|███▌| 7742/22095 [13:18:10<15:52:11, 3.98s/it] {'loss': 0.3338, 'grad_norm': 0.6366304885536189, 'learning_rate': 7.541774413676566e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (68629 > 40960). Running this sequence through the model will result in indexing errors
35%|███▌| 7743/22095 [13:18:13<14:45:02, 3.70s/it] {'loss': 0.3731, 'grad_norm': 0.623869279581973, 'learning_rate': 7.541143233142456e-06, 'epoch': 0.35}
35%|███▌| 7744/22095 [13:18:17<15:14:25, 3.82s/it] {'loss': 0.3644, 'grad_norm': 0.6733659163360821, 'learning_rate': 7.540511998006982e-06, 'epoch': 0.35}
35%|███▌| 7745/22095 [13:18:21<15:11:02, 3.81s/it] {'loss': 0.3503, 'grad_norm': 0.8206727756834944, 'learning_rate': 7.539880708283709e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047684 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '2'}]}
35%|███▌| 7746/22095 [13:18:25<14:42:14, 3.69s/it] {'loss': 0.3412, 'grad_norm': 0.5872189014995534, 'learning_rate': 7.539249363986196e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (52596 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47031 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45956 > 40960). Running this sequence through the model will result in indexing errors
35%|███▌| 7747/22095 [13:18:28<14:01:57, 3.52s/it] {'loss': 0.3779, 'grad_norm': 0.7261999016080413, 'learning_rate': 7.538617965128018e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047605 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 16\nB. 9\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 35%|███▌ | 7748/22095 [13:18:31<13:11:06, 3.31s/it] {'loss': 0.3461, 'grad_norm': 0.6220194173842661, 'learning_rate': 7.537986511722732e-06, 'epoch': 0.35} 35%|███▌ | 7748/22095 [13:18:31<13:11:06, 3.31s/it] 35%|███▌ | 7749/22095 [13:18:34<12:45:28, 3.20s/it] {'loss': 0.3505, 'grad_norm': 0.6140523177540189, 'learning_rate': 7.537355003783915e-06, 'epoch': 0.35} 35%|███▌ | 7749/22095 [13:18:34<12:45:28, 3.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47793 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49888 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93822 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45186 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7750/22095 [13:18:42<19:01:10, 4.77s/it] {'loss': 0.497, 'grad_norm': 0.4861428500111197, 'learning_rate': 7.53672344132513e-06, 'epoch': 0.35} 35%|███▌ | 7750/22095 [13:18:42<19:01:10, 4.77s/it] 35%|███▌ | 7751/22095 [13:18:46<17:54:06, 4.49s/it] {'loss': 0.3759, 'grad_norm': 0.752805323197179, 'learning_rate': 7.53609182435995e-06, 'epoch': 0.35} 35%|███▌ | 7751/22095 [13:18:46<17:54:06, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44338 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (129980 > 40960). Running this sequence through the model will result in indexing errors 35%|███▌ | 7752/22095 [13:18:50<17:39:41, 4.43s/it] {'loss': 0.3968, 'grad_norm': 0.6070374149228687, 'learning_rate': 7.535460152901945e-06, 'epoch': 0.35} 35%|███▌ | 7752/22095 [13:18:50<17:39:41, 4.43s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_3/images/before_screenshot_11_id_105_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:16:48.902696 load time: 1044.33 ms 35%|███▌ | 7753/22095 [13:18:54<16:41:47, 4.19s/it] {'loss': 0.3702, 'grad_norm': 0.6517684478502604, 'learning_rate': 7.534828426964687e-06, 'epoch': 0.35} 35%|███▌ | 7753/22095 [13:18:54<16:41:47, 4.19s/it] 35%|███▌ | 7754/22095 [13:18:57<15:22:12, 3.86s/it] {'loss': 0.3961, 'grad_norm': 0.633839696078463, 'learning_rate': 7.534196646561754e-06, 'epoch': 0.35} 35%|███▌ | 7754/22095 [13:18:57<15:22:12, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 05:16:55.535331 load time: 1003.2 ms 35%|███▌ | 7755/22095 [13:19:03<18:43:05, 4.70s/it] {'loss': 0.5144, 
'grad_norm': 0.34285147538363264, 'learning_rate': 7.533564811706715e-06, 'epoch': 0.35} 35%|███▌ | 7755/22095 [13:19:04<18:43:05, 4.70s/it] 35%|███▌ | 7756/22095 [13:19:07<17:45:36, 4.46s/it] {'loss': 0.3658, 'grad_norm': 0.5787212050900984, 'learning_rate': 7.532932922413152e-06, 'epoch': 0.35} 35%|███▌ | 7756/22095 [13:19:07<17:45:36, 4.46s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 05:17:06.854880 load time: 1033.05 ms 35%|███▌ | 7757/22095 [13:19:11<16:52:31, 4.24s/it] {'loss': 0.3936, 'grad_norm': 0.6087775891664018, 'learning_rate': 7.532300978694639e-06, 'epoch': 0.35} 35%|███▌ | 7757/22095 [13:19:11<16:52:31, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 35%|███▌ | 7758/22095 [13:19:22<25:15:32, 6.34s/it] {'loss': 0.4857, 'grad_norm': 0.3209188592804357, 'learning_rate': 7.531668980564757e-06, 'epoch': 0.35} 35%|███▌ | 7758/22095 [13:19:22<25:15:32, 6.34s/it] 35%|███▌ | 7759/22095 [13:19:27<22:41:49, 5.70s/it] {'loss': 0.3587, 'grad_norm': 0.6416155319043709, 'learning_rate': 7.531036928037081e-06, 'epoch': 0.35} 35%|███▌ | 7759/22095 [13:19:27<22:41:49, 5.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (67416 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118948 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7760/22095 [13:19:34<24:46:51, 6.22s/it] {'loss': 0.4763, 'grad_norm': 0.29564737935085245, 'learning_rate': 7.530404821125197e-06, 'epoch': 0.35} 35%|███▌ | 7760/22095 [13:19:34<24:46:51, 6.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▌ | 7761/22095 [13:19:37<21:01:32, 5.28s/it] {'loss': 0.3247, 'grad_norm': 0.6204602551572096, 'learning_rate': 7.529772659842685e-06, 'epoch': 0.35} 35%|███▌ | 7761/22095 [13:19:37<21:01:32, 5.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▌ | 7762/22095 [13:19:41<19:31:26, 4.90s/it] {'loss': 0.3382, 'grad_norm': 0.7110343797456145, 'learning_rate': 7.529140444203127e-06, 'epoch': 0.35} 35%|███▌ | 7762/22095 [13:19:41<19:31:26, 4.90s/it] 35%|███▌ | 7763/22095 [13:19:45<17:55:49, 4.50s/it] {'loss': 0.3661, 'grad_norm': 0.5993280775274655, 'learning_rate': 7.5285081742201085e-06, 'epoch': 0.35} 35%|███▌ | 7763/22095 [13:19:45<17:55:49, 4.50s/it] 35%|███▌ | 7764/22095 [13:19:48<16:26:03, 4.13s/it] {'loss': 0.3864, 'grad_norm': 0.6200060790127362, 'learning_rate': 7.527875849907216e-06, 'epoch': 0.35} 35%|███▌ | 7764/22095 [13:19:48<16:26:03, 4.13s/it]VC:s3://gui-agent/data_20250612/mac/images/settings/fe40e085-7060-4505-942d-c26efa06cb6a/images/step_1.png 2025-08-28 05:17:46.731047 load time: 1043.01 ms 35%|███▌ | 7765/22095 [13:19:51<15:27:43, 3.88s/it] {'loss': 0.3635, 'grad_norm': 0.5925761739408619, 'learning_rate': 7.527243471278034e-06, 'epoch': 0.35} 35%|███▌ | 7765/22095 [13:19:51<15:27:43, 3.88s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_5/images/step_4.png 2025-08-28 05:17:50.040381 load time: 1129.71 ms VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_6.png 2025-08-28 05:17:50.038711 load time: 
1153.85 ms 35%|███▌ | 7766/22095 [13:19:55<15:19:29, 3.85s/it] {'loss': 0.3621, 'grad_norm': 0.7199944515738541, 'learning_rate': 7.526611038346153e-06, 'epoch': 0.35} 35%|███▌ | 7766/22095 [13:19:55<15:19:29, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41087 > 40960). Running this sequence through the model will result in indexing errors 35%|███▌ | 7767/22095 [13:19:58<14:08:13, 3.55s/it] {'loss': 0.3113, 'grad_norm': 0.6114599544378431, 'learning_rate': 7.5259785511251595e-06, 'epoch': 0.35} 35%|███▌ | 7767/22095 [13:19:58<14:08:13, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 35%|███▌ | 7768/22095 [13:20:07<21:18:43, 5.36s/it] {'loss': 0.4663, 'grad_norm': 0.3694562932467032, 'learning_rate': 7.525346009628647e-06, 'epoch': 0.35} 35%|███▌ | 7768/22095 [13:20:07<21:18:43, 5.36s/it] 35%|███▌ | 7769/22095 [13:20:12<20:46:25, 5.22s/it] {'loss': 0.3314, 'grad_norm': 0.6326146960746679, 'learning_rate': 7.524713413870201e-06, 'epoch': 0.35} 35%|███▌ | 7769/22095 [13:20:12<20:46:25, 5.22s/it] 35%|███▌ | 7770/22095 [13:20:15<17:57:51, 4.51s/it] {'loss': 0.3561, 'grad_norm': 0.6443492767754994, 'learning_rate': 7.524080763863422e-06, 'epoch': 0.35} 35%|███▌ | 7770/22095 [13:20:15<17:57:51, 4.51s/it] 35%|███▌ | 7771/22095 [13:20:18<15:58:25, 4.01s/it] {'loss': 0.341, 'grad_norm': 0.6459826223012621, 'learning_rate': 7.5234480596218965e-06, 'epoch': 0.35} 35%|███▌ | 7771/22095 [13:20:18<15:58:25, 4.01s/it] 35%|███▌ | 7772/22095 [13:20:21<14:34:58, 3.67s/it] {'loss': 0.3372, 'grad_norm': 0.6059566286637352, 'learning_rate': 7.522815301159223e-06, 'epoch': 0.35} 35%|███▌ | 7772/22095 [13:20:21<14:34:58, 3.67s/it] 35%|███▌ | 7773/22095 [13:20:25<15:11:32, 3.82s/it] {'loss': 0.3587, 'grad_norm': 0.629434666765441, 'learning_rate': 7.522182488488999e-06, 'epoch': 0.35} 35%|███▌ | 7773/22095 [13:20:25<15:11:32, 3.82s/it]Rank 0: Number of image tokens 0 does not match number of 
images 1 Rank 0: Fixed image tokens in the conversation 35%|███▌ | 7774/22095 [13:20:28<14:33:56, 3.66s/it] {'loss': 0.3562, 'grad_norm': 0.6345649812348073, 'learning_rate': 7.5215496216248175e-06, 'epoch': 0.35} 35%|███▌ | 7774/22095 [13:20:28<14:33:56, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 05:18:27.166338 load time: 1002.33 ms 35%|███▌ | 7775/22095 [13:20:36<19:39:52, 4.94s/it] {'loss': 0.4792, 'grad_norm': 0.34756610246884706, 'learning_rate': 7.520916700580279e-06, 'epoch': 0.35} 35%|███▌ | 7775/22095 [13:20:36<19:39:52, 4.94s/it] 35%|███▌ | 7776/22095 [13:20:42<20:43:27, 5.21s/it] {'loss': 0.3657, 'grad_norm': 0.6288100576988347, 'learning_rate': 7.5202837253689845e-06, 'epoch': 0.35} 35%|███▌ | 7776/22095 [13:20:42<20:43:27, 5.21s/it] 35%|███▌ | 7777/22095 [13:20:46<18:50:01, 4.74s/it] {'loss': 0.3395, 'grad_norm': 0.6067182425724056, 'learning_rate': 7.51965069600453e-06, 'epoch': 0.35} 35%|███▌ | 7777/22095 [13:20:46<18:50:01, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67941 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45062 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7778/22095 [13:20:49<16:42:13, 4.20s/it] {'loss': 0.3289, 'grad_norm': 0.6321301031695498, 'learning_rate': 7.519017612500524e-06, 'epoch': 0.35} 35%|███▌ | 7778/22095 [13:20:49<16:42:13, 4.20s/it] 35%|███▌ | 7779/22095 [13:20:52<15:11:50, 3.82s/it] {'loss': 0.3107, 'grad_norm': 0.6059871782734803, 'learning_rate': 7.5183844748705645e-06, 'epoch': 0.35} 35%|███▌ | 7779/22095 [13:20:52<15:11:50, 3.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79548 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49812 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51753 > 40960). Running this sequence through the model will result in indexing errors 35%|███▌ | 7780/22095 [13:20:55<14:48:46, 3.73s/it] {'loss': 0.3416, 'grad_norm': 0.6111636119938987, 'learning_rate': 7.517751283128258e-06, 'epoch': 0.35} 35%|███▌ | 7780/22095 [13:20:55<14:48:46, 3.73s/it] 35%|███▌ | 7781/22095 [13:20:58<13:41:37, 3.44s/it] {'loss': 0.3623, 'grad_norm': 0.6488860545994534, 'learning_rate': 7.517118037287207e-06, 'epoch': 0.35} 35%|███▌ | 7781/22095 [13:20:58<13:41:37, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76464 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64271 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72808 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7782/22095 [13:21:01<13:37:06, 3.43s/it] {'loss': 0.358, 'grad_norm': 1.131189321473581, 'learning_rate': 7.516484737361023e-06, 'epoch': 0.35} 35%|███▌ | 7782/22095 [13:21:01<13:37:06, 3.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (141227 > 40960). Running this sequence through the model will result in indexing errors Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398689 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 841, 'image': 'vrdu_table_final_2/astro-ph.CO/cf8b34f9-5549-46ad-9817-87218933c07a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 35%|███▌ | 7783/22095 [13:21:04<12:52:09, 3.24s/it] {'loss': 0.3434, 'grad_norm': 1.2056760182043709, 'learning_rate': 7.515851383363309e-06, 'epoch': 0.35} 35%|███▌ | 7783/22095 [13:21:04<12:52:09, 3.24s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:19:04.014153 load time: 1190.87 ms 35%|███▌ | 7784/22095 [13:21:07<12:22:49, 3.11s/it] {'loss': 0.3283, 'grad_norm': 0.6553290062431891, 'learning_rate': 7.515217975307677e-06, 'epoch': 0.35} 35%|███▌ | 7784/22095 [13:21:07<12:22:49, 3.11s/it] 35%|███▌ | 7785/22095 [13:21:11<13:13:03, 3.33s/it] {'loss': 0.3355, 'grad_norm': 0.6224771498224234, 'learning_rate': 7.514584513207734e-06, 'epoch': 0.35} 35%|███▌ | 7785/22095 [13:21:11<13:13:03, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49110 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61098 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55823 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7786/22095 [13:21:22<22:23:49, 5.63s/it] {'loss': 0.5041, 'grad_norm': 0.40038324245671597, 'learning_rate': 7.513950997077094e-06, 'epoch': 0.35} 35%|███▌ | 7786/22095 [13:21:22<22:23:49, 5.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49738 > 40960). Running this sequence through the model will result in indexing errors 35%|███▌ | 7787/22095 [13:21:25<19:35:36, 4.93s/it] {'loss': 0.3412, 'grad_norm': 0.626659752232271, 'learning_rate': 7.513317426929369e-06, 'epoch': 0.35} 35%|███▌ | 7787/22095 [13:21:25<19:35:36, 4.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▌ | 7788/22095 [13:21:36<26:45:32, 6.73s/it] {'loss': 0.5069, 'grad_norm': 0.3044550549445809, 'learning_rate': 7.512683802778169e-06, 'epoch': 0.35} 35%|███▌ | 7788/22095 [13:21:36<26:45:32, 6.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106076 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7789/22095 [13:21:39<22:26:39, 5.65s/it] {'loss': 0.3139, 'grad_norm': 0.719800344359679, 'learning_rate': 7.512050124637114e-06, 'epoch': 0.35} 35%|███▌ | 7789/22095 [13:21:39<22:26:39, 5.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 35%|███▌ | 7790/22095 [13:21:48<26:22:52, 6.64s/it] {'loss': 0.4754, 'grad_norm': 0.3408857775887603, 'learning_rate': 7.511416392519815e-06, 'epoch': 0.35} 35%|███▌ | 7790/22095 [13:21:48<26:22:52, 6.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 35%|███▌ | 7791/22095 [13:21:52<23:14:40, 5.85s/it] {'loss': 0.3829, 'grad_norm': 0.6812537517058064, 'learning_rate': 7.51078260643989e-06, 'epoch': 0.35} 35%|███▌ | 7791/22095 [13:21:52<23:14:40, 5.85s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922568 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 45721, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nA. 5cm\nB. 8cm\nC. 9cm\nD. 
4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 35%|███▌ | 7792/22095 [13:21:56<20:25:41, 5.14s/it] {'loss': 0.36, 'grad_norm': 0.703618849968868, 'learning_rate': 7.5101487664109605e-06, 'epoch': 0.35} 35%|███▌ | 7792/22095 [13:21:56<20:25:41, 5.14s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 05:19:55.459278 load time: 1078.98 ms 35%|███▌ | 7793/22095 [13:21:58<17:44:08, 4.46s/it] {'loss': 0.3385, 'grad_norm': 0.7611903498969986, 'learning_rate': 7.509514872446642e-06, 'epoch': 0.35} 35%|███▌ | 7793/22095 [13:21:58<17:44:08, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43968 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99569 > 40960). Running this sequence through the model will result in indexing errors 35%|███▌ | 7794/22095 [13:22:08<23:37:10, 5.95s/it] {'loss': 0.4742, 'grad_norm': 0.3892495760709264, 'learning_rate': 7.5088809245605555e-06, 'epoch': 0.35} 35%|███▌ | 7794/22095 [13:22:08<23:37:10, 5.95s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8408365 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10558, 'image': 'vrdu_table_final_2/astro-ph.CO/753181d9-446a-4625-9e54-f554d4525337.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]} 35%|███▌ | 7795/22095 [13:22:18<28:43:18, 7.23s/it] {'loss': 0.4939, 'grad_norm': 0.3789789286921561, 'learning_rate': 7.508246922766326e-06, 'epoch': 0.35} 35%|███▌ | 7795/22095 [13:22:18<28:43:18, 7.23s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 35%|███▌ | 7796/22095 [13:22:21<23:58:41, 6.04s/it] {'loss': 0.3883, 'grad_norm': 0.6595053522544354, 'learning_rate': 7.507612867077571e-06, 'epoch': 0.35} 35%|███▌ | 7796/22095 [13:22:21<23:58:41, 6.04s/it] 35%|███▌ | 7797/22095 [13:22:29<26:13:10, 6.60s/it] {'loss': 0.4853, 'grad_norm': 0.2995636894582247, 'learning_rate': 7.506978757507919e-06, 'epoch': 0.35} 35%|███▌ | 7797/22095 [13:22:29<26:13:10, 6.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 35%|███▌ | 7798/22095 [13:22:33<22:22:41, 5.63s/it] {'loss': 0.3525, 'grad_norm': 0.6309037212275371, 'learning_rate': 7.506344594070991e-06, 'epoch': 0.35} 35%|███▌ | 7798/22095 [13:22:33<22:22:41, 5.63s/it] 35%|███▌ | 7799/22095 [13:22:36<19:18:49, 4.86s/it] {'loss': 0.3816, 'grad_norm': 0.6936880134731889, 'learning_rate': 7.5057103767804175e-06, 'epoch': 0.35} 35%|███▌ | 7799/22095 [13:22:36<19:18:49, 4.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49241 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59357 > 40960). 
Running this sequence through the model will result in indexing errors 35%|███▌ | 7800/22095 [13:22:39<17:05:57, 4.31s/it] {'loss': 0.3118, 'grad_norm': 0.630733616335528, 'learning_rate': 7.505076105649822e-06, 'epoch': 0.35} 35%|███▌ | 7800/22095 [13:22:39<17:05:57, 4.31s/it] 35%|███▌ | 7801/22095 [13:22:43<16:29:28, 4.15s/it] {'loss': 0.4105, 'grad_norm': 0.6846633350976611, 'learning_rate': 7.504441780692836e-06, 'epoch': 0.35} 35%|███▌ | 7801/22095 [13:22:43<16:29:28, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 35%|███▌ | 7802/22095 [13:22:52<22:50:01, 5.75s/it] {'loss': 0.4859, 'grad_norm': 0.6526336163178629, 'learning_rate': 7.5038074019230865e-06, 'epoch': 0.35} 35%|███▌ | 7802/22095 [13:22:52<22:50:01, 5.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047227 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 8cm\nB. 10cm\nC. 12cm\nD. 
6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 05:20:51.508597 load time: 1612.99 ms 35%|███▌ | 7803/22095 [13:22:55<19:56:40, 5.02s/it] {'loss': 0.3362, 'grad_norm': 0.6354983121815554, 'learning_rate': 7.503172969354206e-06, 'epoch': 0.35} 35%|███▌ | 7803/22095 [13:22:55<19:56:40, 5.02s/it] 35%|███▌ | 7804/22095 [13:22:59<18:01:24, 4.54s/it] {'loss': 0.3745, 'grad_norm': 0.616840523677705, 'learning_rate': 7.502538482999829e-06, 'epoch': 0.35} 35%|███▌ | 7804/22095 [13:22:59<18:01:24, 4.54s/it] 35%|███▌ | 7805/22095 [13:23:02<16:45:30, 4.22s/it] {'loss': 0.3786, 'grad_norm': 0.6277516775555112, 'learning_rate': 7.501903942873584e-06, 'epoch': 0.35} 35%|███▌ | 7805/22095 [13:23:02<16:45:30, 4.22s/it] 35%|███▌ | 7806/22095 [13:23:06<16:21:52, 4.12s/it] {'loss': 0.3646, 'grad_norm': 0.6373393336769787, 'learning_rate': 7.5012693489891065e-06, 'epoch': 0.35} 35%|███▌ | 7806/22095 [13:23:06<16:21:52, 4.12s/it]VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/standard/test_823_image_1.png 2025-08-28 05:21:06.645046 load time: 1099.37 ms 35%|███▌ | 7807/22095 [13:23:10<15:52:09, 4.00s/it] {'loss': 0.332, 'grad_norm': 0.6169337950282993, 'learning_rate': 7.500634701360034e-06, 'epoch': 0.35} 35%|███▌ | 7807/22095 [13:23:10<15:52:09, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui/aguvis/aguvis-stage1/omniact/images/train_6342.png 2025-08-28 05:21:08.607668 load time: 1129.4 ms VC:s3://gui-agent/data_20250630/windows_augment/images/autocad/handmade_annotation_5/images/6_id_24_internvl_appearance_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:21:08.607478 load time: 1475.15 ms 35%|███▌ | 7808/22095 [13:23:16<18:31:10, 4.67s/it] {'loss': 0.5114, 'grad_norm': 0.3384868150450684, 'learning_rate': 7.500000000000001e-06, 'epoch': 0.35} 35%|███▌ 
 35%|███▌ | 7808/22095 [13:23:16<18:31:10, 4.67s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (43939, 44916, 49725, 41127, 55573 > 40960). Running these sequences through the model will result in indexing errors
 35%|███▌ | 7809/22095 [13:23:20<17:08:35, 4.32s/it] {'loss': 0.3781, 'grad_norm': 0.6187640351076604, 'learning_rate': 7.499365244922646e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (45236, 45263, 102109 > 40960). Running these sequences through the model will result in indexing errors
 35%|███▌ | 7810/22095 [13:23:22<15:19:15, 3.86s/it] {'loss': 0.3588, 'grad_norm': 0.6555446380655447, 'learning_rate': 7.498730436141609e-06, 'epoch': 0.35}
 35%|███▌ | 7811/22095 [13:23:26<15:12:15, 3.83s/it] {'loss': 0.3885, 'grad_norm': 0.6239720714131426, 'learning_rate': 7.498095573670528e-06, 'epoch': 0.35}
 35%|███▌ | 7812/22095 [13:23:30<14:46:04, 3.72s/it] {'loss': 0.391, 'grad_norm': 0.6362622564026761, 'learning_rate': 7.497460657523047e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:21:28.364331 load time: 1067.08 ms
VC:s3://gui-agent/data_20250623/windows_augment/images/inventor/20250512_132100_1/images/before_screenshot_1_id_113_internvl_element-caption_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 05:21:28.901548 load time: 1397.88 ms
 35%|███▌ | 7813/22095 [13:23:33<14:15:12, 3.59s/it] {'loss': 0.3378, 'grad_norm': 0.7405890819555525, 'learning_rate': 7.496825687712805e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250612/mac/images/terminal/4883f6e6-c658-4d61-9cf9-e32c2b812a80/images/step_1.png 2025-08-28 05:21:32.242959 load time: 1103.52 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_2/images/step_0.png 2025-08-28 05:21:32.339754 load time: 1355.01 ms
VC:s3://gui-agent/data_20250612/mac/images/finder/1fe1ca62-7e4a-4d85-af6c-e650a9c51129/images/step_1.png 2025-08-28 05:21:33.508672 load time: 1078.89 ms
 35%|███▌ | 7814/22095 [13:23:37<15:21:04, 3.87s/it] {'loss': 0.3531, 'grad_norm': 0.6959052316973275, 'learning_rate': 7.496190664253449e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (68649, 71529, 45267, 42311, 132147 > 40960). Running these sequences through the model will result in indexing errors
 35%|███▌ | 7815/22095 [13:23:40<14:16:45, 3.60s/it] {'loss': 0.3421, 'grad_norm': 0.6077571764611213, 'learning_rate': 7.495555587158622e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (72946, 49487 > 40960). Running these sequences through the model will result in indexing errors
 35%|███▌ | 7816/22095 [13:23:44<14:17:55, 3.61s/it] {'loss': 0.3424, 'grad_norm': 0.6469514940955886, 'learning_rate': 7.49492045644197e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:21:43.231585 load time: 1156.02 ms
 35%|███▌ | 7817/22095 [13:23:52<19:05:01, 4.81s/it] {'loss': 0.4962, 'grad_norm': 0.3811569747548171, 'learning_rate': 7.494285272117139e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_175007_3/images/before_screenshot_18_id_34_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 05:21:51.053242 load time: 1366.4 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 05:21:51.528990 load time: 1121.78 ms
 35%|███▌ | 7818/22095 [13:23:57<20:04:33, 5.06s/it] {'loss': 0.5015, 'grad_norm': 0.3295220610717146, 'learning_rate': 7.493650034197779e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880125 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3278, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 12\nB. 16\nC. 9\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 35%|███▌ | 7819/22095 [13:24:01<18:32:12, 4.67s/it] {'loss': 0.3612, 'grad_norm': 0.6595865740341526, 'learning_rate': 7.493014742697537e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 05:22:01.308102 load time: 1362.5 ms
 35%|███▌ | 7820/22095 [13:24:08<21:50:27, 5.51s/it] {'loss': 0.4877, 'grad_norm': 0.2732336585442226, 'learning_rate': 7.4923793976300665e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6779314 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250508_132635_1/images/before_screenshot_1_id_37_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': '\nClick the Group button in the Groups panel to combine multiple objects into a single entity'}, {'from': 'gpt', 'value': '\nclick(x=0.6805, y=0.0892)\n'}], 'width': 3600, 'height': 2338}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 05:22:07.256330 load time: 1017.72 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 05:22:09.088348 load time: 1128.82 ms
 35%|███▌ | 7821/22095 [13:24:12<19:08:15, 4.83s/it] {'loss': 0.3558, 'grad_norm': 0.797829661466962, 'learning_rate': 7.4917439990090165e-06, 'epoch': 0.35}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20406.png 2025-08-28 05:22:09.112393 load time: 1233.16 ms
 35%|███▌ | 7822/22095 [13:24:15<17:15:33, 4.35s/it] {'loss': 0.3307, 'grad_norm': 0.6327197220334714, 'learning_rate': 7.491108546848041e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 35%|███▌ | 7823/22095 [13:24:24<23:23:48, 5.90s/it] {'loss': 0.4877, 'grad_norm': 0.4456977492630711, 'learning_rate': 7.490473041160794e-06, 'epoch': 0.35}
 35%|███▌ | 7824/22095 [13:24:34<27:35:15, 6.96s/it] {'loss': 0.512, 'grad_norm': 0.37679343862054404, 'learning_rate': 7.489837481960931e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 35%|███▌ | 7825/22095 [13:24:37<23:13:33, 5.86s/it] {'loss': 0.3437, 'grad_norm': 0.688862193477084, 'learning_rate': 7.489201869262106e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (96024, 41715 > 40960). Running these sequences through the model will result in indexing errors
 35%|███▌ | 7826/22095 [13:24:40<20:00:13, 5.05s/it] {'loss': 0.3613, 'grad_norm': 0.612011506777598, 'learning_rate': 7.48856620307798e-06, 'epoch': 0.35}
Token indices sequence length is longer than the specified maximum sequence length for this model (62558 > 40960). Running this sequence through the model will result in indexing errors
 35%|███▌ | 7827/22095 [13:24:44<17:51:23, 4.51s/it] {'loss': 0.3514, 'grad_norm': 0.6547800401174696, 'learning_rate': 7.487930483422206e-06, 'epoch': 0.35}
 35%|███▌ | 7828/22095 [13:24:47<16:31:13, 4.17s/it] {'loss': 0.3471, 'grad_norm': 0.7017144490010115, 'learning_rate': 7.4872947103084495e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 05:22:45.755913 load time: 1322.31 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_4.png 2025-08-28 05:22:46.545137 load time: 1179.3 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 05:22:47.102223 load time: 1272.64 ms
 35%|███▌ | 7829/22095 [13:24:51<16:20:48, 4.13s/it] {'loss': 0.3258, 'grad_norm': 0.6039843313631512, 'learning_rate': 7.4866588837503686e-06, 'epoch': 0.35}
 35%|███▌ | 7830/22095 [13:24:55<15:55:20, 4.02s/it] {'loss': 0.3539, 'grad_norm': 0.6856001322498695, 'learning_rate': 7.486023003761625e-06, 'epoch': 0.35}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047932 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 35%|███▌ | 7831/22095 [13:24:58<15:22:52, 3.88s/it] {'loss': 0.3605, 'grad_norm': 0.6239377737887924, 'learning_rate': 7.48538707035588e-06, 'epoch': 0.35}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:22:58.163629 load time: 1178.6 ms
 35%|███▌ | 7832/22095 [13:25:02<15:12:47, 3.84s/it] {'loss': 0.3635, 'grad_norm': 0.7392468682747669, 'learning_rate': 7.484751083546804e-06, 'epoch': 0.35}
 35%|███▌ | 7833/22095 [13:25:05<14:10:25, 3.58s/it] {'loss': 0.349, 'grad_norm': 0.6336965745586026, 'learning_rate': 7.484115043348056e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 35%|███▌ | 7834/22095 [13:25:14<21:08:38, 5.34s/it] {'loss': 0.4627, 'grad_norm': 0.6547582868007384, 'learning_rate': 7.4834789497733065e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 35%|███▌ | 7835/22095 [13:25:21<22:58:39, 5.80s/it] {'loss': 0.4812, 'grad_norm': 0.4528387555975036, 'learning_rate': 7.482842802836221e-06, 'epoch': 0.35}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 35%|███▌ | 7836/22095 [13:25:25<20:32:10, 5.18s/it] {'loss': 0.4041, 'grad_norm': 0.6591548404559565, 'learning_rate': 7.482206602550469e-06, 'epoch': 0.35}
 35%|███▌ | 7837/22095 [13:25:29<19:20:16, 4.88s/it] {'loss': 0.3721, 'grad_norm': 0.8262674444288512, 'learning_rate': 7.481570348929722e-06, 'epoch': 0.35}
 35%|███▌ | 7838/22095 [13:25:32<17:10:26, 4.34s/it] {'loss': 0.3699, 'grad_norm': 0.7090929955067138, 'learning_rate': 7.480934041987649e-06, 'epoch': 0.35}
 35%|███▌ | 7839/22095 [13:25:36<16:27:12, 4.15s/it] {'loss': 0.3902, 'grad_norm': 1.111779718586593, 'learning_rate': 7.480297681737922e-06, 'epoch': 0.35}
 35%|███▌ | 7840/22095 [13:25:39<15:09:32, 3.83s/it] {'loss': 0.3197, 'grad_norm': 0.6493248511249113, 'learning_rate': 7.479661268194217e-06, 'epoch': 0.35}
 35%|███▌ | 7841/22095 [13:25:43<14:45:59, 3.73s/it] {'loss': 0.3554, 'grad_norm': 0.6947138559490463, 'learning_rate': 7.479024801370206e-06, 'epoch': 0.35}
 35%|███▌ | 7842/22095 [13:25:46<14:52:15, 3.76s/it] {'loss': 0.3553, 'grad_norm': 0.750127064883699, 'learning_rate': 7.478388281279566e-06, 'epoch': 0.35}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 35%|███▌ | 7843/22095 [13:25:51<15:25:36, 3.90s/it] {'loss': 0.3219, 'grad_norm': 0.6068588275400576, 'learning_rate': 7.477751707935974e-06, 'epoch': 0.35}
 36%|███▌ | 7844/22095 [13:25:55<15:34:15, 3.93s/it] {'loss': 0.3476, 'grad_norm': 0.6667280699696232, 'learning_rate': 7.477115081353107e-06, 'epoch': 0.36}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (106300000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
VC:s3://gui-agent/data_20250612/mac/images/terminal/d575fbf7-ad0d-4665-94ba-472d47b74314/images/step_1.png 2025-08-28 05:23:54.846480 load time: 1418.46 ms
 36%|███▌ | 7845/22095 [13:25:58<15:00:38, 3.79s/it] {'loss': 0.3364, 'grad_norm': 0.6720148038383784, 'learning_rate': 7.476478401544647e-06, 'epoch': 0.36}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/37949.png 2025-08-28 05:23:53.984598 load time: 1520.43 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (99684, 76075, 44360, 62177 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7846/22095 [13:26:02<14:39:36, 3.70s/it] {'loss': 0.3578, 'grad_norm': 0.6749944921167566, 'learning_rate': 7.475841668524268e-06, 'epoch': 0.36}
 36%|███▌ | 7847/22095 [13:26:05<13:52:49, 3.51s/it] {'loss': 0.3407, 'grad_norm': 0.6546624428916041, 'learning_rate': 7.475204882305659e-06, 'epoch': 0.36}
 36%|███▌ | 7848/22095 [13:26:08<13:43:39, 3.47s/it] {'loss': 0.3473, 'grad_norm': 0.6187047516268036, 'learning_rate': 7.474568042902497e-06, 'epoch': 0.36}
 36%|███▌ | 7849/22095 [13:26:12<14:14:17, 3.60s/it] {'loss': 0.3457, 'grad_norm': 0.6449786289711059, 'learning_rate': 7.4739311503284695e-06, 'epoch': 0.36}
 36%|███▌ | 7850/22095 [13:26:16<15:11:41, 3.84s/it] {'loss': 0.3385, 'grad_norm': 0.6612723807417603, 'learning_rate': 7.473294204597259e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8453832 in VC:s3://internvl-moe-sft-data/. Exception: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 111493, 'image': 'vrdu_texteq/astro-ph.CO/4b72e2b2-9da9-40ca-9059-d368a997f7d4.png', 'image_wh': [[387, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where we use $\\mathbf{Y}$ and $\\mathbf{C}_{\\rm bao}$ from.'}]}
 36%|███▌ | 7851/22095 [13:26:20<14:31:19, 3.67s/it] {'loss': 0.3679, 'grad_norm': 0.6023672840038461, 'learning_rate': 7.472657205722551e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396956 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63809, 'image': 'vrdu_table_final_2/astro-ph.EP/5545aad0-ad70-465b-a635-2b8d60639a44.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[t]{l}$e_x$\\end{tabular}\n```"}]}
 36%|███▌ | 7852/22095 [13:26:23<14:33:03, 3.68s/it] {'loss': 0.3502, 'grad_norm': 0.682869276494719, 'learning_rate': 7.472020153718036e-06, 'epoch': 0.36}
 36%|███▌ | 7853/22095 [13:26:27<14:40:53, 3.71s/it] {'loss': 0.3598, 'grad_norm': 0.658596181793742, 'learning_rate': 7.471383048597399e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 36%|███▌ | 7854/22095 [13:26:34<17:56:20, 4.53s/it] {'loss': 0.502, 'grad_norm': 1.4779270416964452, 'learning_rate': 7.47074589037433e-06, 'epoch': 0.36}
 36%|███▌ | 7855/22095 [13:26:38<17:21:25, 4.39s/it] {'loss': 0.3551, 'grad_norm': 0.6974218981775188, 'learning_rate': 7.470108679062521e-06, 'epoch': 0.36}
 36%|███▌ | 7856/22095 [13:26:41<15:34:25, 3.94s/it] {'loss': 0.3752, 'grad_norm': 0.7470662270954693, 'learning_rate': 7.469471414675662e-06, 'epoch': 0.36}
 36%|███▌ | 7857/22095 [13:26:44<14:27:29, 3.66s/it] {'loss': 0.3742, 'grad_norm': 0.698352138931862, 'learning_rate': 7.468834097227448e-06, 'epoch': 0.36}
 36%|███▌ | 7858/22095 [13:26:46<13:23:41, 3.39s/it] {'loss': 0.3402, 'grad_norm': 0.70489735180748, 'learning_rate': 7.4681967267315715e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (53207, 77602 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7859/22095 [13:26:50<13:25:35, 3.40s/it] {'loss': 0.3232, 'grad_norm': 0.6708085387779145, 'learning_rate': 7.4675593032017266e-06, 'epoch': 0.36}
 36%|███▌ | 7860/22095 [13:26:53<12:57:44, 3.28s/it] {'loss': 0.3686, 'grad_norm': 0.7004600236705987, 'learning_rate': 7.466921826651612e-06, 'epoch': 0.36}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 05:24:51.503380 load time: 1061.31 ms
 36%|███▌ | 7861/22095 [13:26:57<13:51:00, 3.50s/it] {'loss': 0.3684, 'grad_norm': 0.6822899888012113, 'learning_rate': 7.466284297094922e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (49888, 54822, 135250, 103331 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7862/22095 [13:27:00<13:11:12, 3.34s/it] {'loss': 0.3443, 'grad_norm': 0.6431997953727273, 'learning_rate': 7.46564671454536e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396957 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63810, 'image': 'vrdu_table_final_2/astro-ph.EP/14f306ab-ed96-4932-93b1-197299be93b4.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{l}$e_y$\\end{tabular}\n```"}]}
 36%|███▌ | 7863/22095 [13:27:03<12:49:20, 3.24s/it] {'loss': 0.3652, 'grad_norm': 0.6400738731668444, 'learning_rate': 7.46500907901662e-06, 'epoch': 0.36}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 36%|███▌ | 7864/22095 [13:27:06<12:37:52, 3.20s/it] {'loss': 0.3405, 'grad_norm': 0.6476788329341486, 'learning_rate': 7.4643713905224065e-06, 'epoch': 0.36}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 36%|███▌ | 7865/22095 [13:27:09<12:33:06, 3.18s/it] {'loss': 0.4326, 'grad_norm': 0.9223979899731618, 'learning_rate': 7.463733649076421e-06, 'epoch': 0.36}
 36%|███▌ | 7866/22095 [13:27:12<12:39:47, 3.20s/it] {'loss': 0.3519, 'grad_norm': 0.6300366430679043, 'learning_rate': 7.4630958546923674e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42998, 64344, 92800, 73498 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7867/22095 [13:27:22<20:17:35, 5.13s/it] {'loss': 0.4952, 'grad_norm': 1.2353016132302856, 'learning_rate': 7.462458007383946e-06, 'epoch': 0.36}
 36%|███▌ | 7868/22095 [13:27:26<18:40:51, 4.73s/it] {'loss': 0.3544, 'grad_norm': 0.5951207777646207, 'learning_rate': 7.461820107164867e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929816 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 52969, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nA. 6cm\nB. 7cm\nC. 8cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:25:24.965909 load time: 1110.05 ms
 36%|███▌ | 7869/22095 [13:27:29<16:56:30, 4.29s/it] {'loss': 0.3552, 'grad_norm': 1.5186979849658055, 'learning_rate': 7.461182154048832e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (72802 > 40960). Running this sequence through the model will result in indexing errors
 36%|███▌ | 7870/22095 [13:27:39<23:33:32, 5.96s/it] {'loss': 0.492, 'grad_norm': 0.4881240184949437, 'learning_rate': 7.460544148049555e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (56681, 41844 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7871/22095 [13:27:43<21:19:03, 5.40s/it] {'loss': 0.3741, 'grad_norm': 0.8171773576837302, 'learning_rate': 7.45990608918074e-06, 'epoch': 0.36}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 36%|███▌ | 7872/22095 [13:27:47<19:41:33, 4.98s/it] {'loss': 0.3668, 'grad_norm': 0.7048751897975343, 'learning_rate': 7.459267977456097e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (42693, 90856 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7873/22095 [13:27:50<17:41:12, 4.48s/it] {'loss': 0.3963, 'grad_norm': 0.68364472816167, 'learning_rate': 7.45862981288934e-06, 'epoch': 0.36}
 36%|███▌ | 7874/22095 [13:27:54<16:56:33, 4.29s/it] {'loss': 0.3283, 'grad_norm': 0.6353143543372414, 'learning_rate': 7.457991595494178e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 36%|███▌ | 7875/22095 [13:28:03<23:04:33, 5.84s/it] {'loss': 0.4874, 'grad_norm': 0.7565528040727152, 'learning_rate': 7.457353325284327e-06, 'epoch': 0.36}
 36%|███▌ | 7876/22095 [13:28:07<20:40:41, 5.24s/it] {'loss': 0.381, 'grad_norm': 0.6582202067423059, 'learning_rate': 7.4567150022735e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (60720 > 40960). Running this sequence through the model will result in indexing errors
 36%|███▌ | 7877/22095 [13:28:11<18:37:33, 4.72s/it] {'loss': 0.3426, 'grad_norm': 0.6378632328892668, 'learning_rate': 7.45607662647541e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (61528, 127806, 134856 > 40960). Running these sequences through the model will result in indexing errors
 36%|███▌ | 7878/22095 [13:28:20<24:15:24, 6.14s/it] {'loss': 0.476, 'grad_norm': 0.4406770262620411, 'learning_rate': 7.45543819790378e-06, 'epoch': 0.36}
 36%|███▌ | 7879/22095 [13:28:29<26:55:13, 6.82s/it] {'loss': 0.496, 'grad_norm': 0.36194061351320284, 'learning_rate': 7.454799716572324e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (43870 > 40960). Running this sequence through the model will result in indexing errors
 36%|███▌ | 7880/22095 [13:28:32<22:35:17, 5.72s/it] {'loss': 0.3941, 'grad_norm': 0.6696797075276905, 'learning_rate': 7.45416118249476e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (78012, 75303 > 40960). Running these sequences through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43339 > 40960) for 4 sample(s). Truncating to 1392 with 2 samples.
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_2.png 2025-08-28 05:26:30.586073 load time: 1113.28 ms VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_5/images/before_screenshot_50_id_124_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 05:26:31.628886 load time: 1007.56 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 05:26:31.557683 load time: 1095.63 ms VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 05:26:31.266942 load time: 1844.0 ms VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_005007_2/images/before_screenshot_14_id_160_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:26:31.921689 load time: 1551.21 ms 36%|███▌ | 7881/22095 [13:28:41<26:55:56, 6.82s/it] {'loss': 0.4982, 'grad_norm': 0.4478173564882491, 'learning_rate': 7.4535225956848115e-06, 'epoch': 0.36} 36%|███▌ | 7881/22095 [13:28:41<26:55:56, 6.82s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7882/22095 [13:28:45<23:14:42, 5.89s/it] {'loss': 0.3999, 'grad_norm': 0.6653358361375153, 'learning_rate': 7.452883956156197e-06, 'epoch': 0.36} 36%|███▌ | 7882/22095 [13:28:45<23:14:42, 5.89s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 05:26:45.116349 load time: 1098.31 ms 36%|███▌ | 7883/22095 [13:28:48<20:21:54, 5.16s/it] {'loss': 0.3903, 'grad_norm': 0.6890302334830477, 'learning_rate': 7.452245263922638e-06, 'epoch': 0.36} 36%|███▌ | 7883/22095 [13:28:48<20:21:54, 5.16s/it] 36%|███▌ | 7884/22095 [13:28:51<17:40:42, 4.48s/it] {'loss': 0.3289, 'grad_norm': 0.6117361768377864, 'learning_rate': 7.4516065189978625e-06, 'epoch': 0.36} 36%|███▌ | 7884/22095 [13:28:51<17:40:42, 4.48s/it]Invalidate 
trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7885/22095 [13:28:59<21:09:48, 5.36s/it] {'loss': 0.5051, 'grad_norm': 0.6131998172217197, 'learning_rate': 7.45096772139559e-06, 'epoch': 0.36} 36%|███▌ | 7885/22095 [13:28:59<21:09:48, 5.36s/it] 36%|███▌ | 7886/22095 [13:29:06<23:00:21, 5.83s/it] {'loss': 0.4946, 'grad_norm': 0.6483207055632356, 'learning_rate': 7.450328871129551e-06, 'epoch': 0.36} 36%|███▌ | 7886/22095 [13:29:06<23:00:21, 5.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7887/22095 [13:29:09<20:16:08, 5.14s/it] {'loss': 0.3814, 'grad_norm': 0.7360880947446763, 'learning_rate': 7.4496899682134684e-06, 'epoch': 0.36} 36%|███▌ | 7887/22095 [13:29:09<20:16:08, 5.14s/it] 36%|███▌ | 7888/22095 [13:29:19<25:49:25, 6.54s/it] {'loss': 0.5084, 'grad_norm': 0.4969670850818326, 'learning_rate': 7.449051012661073e-06, 'epoch': 0.36} 36%|███▌ | 7888/22095 [13:29:19<25:49:25, 6.54s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7889/22095 [13:29:22<21:45:50, 5.52s/it] {'loss': 0.3032, 'grad_norm': 0.654029005722767, 'learning_rate': 7.4484120044860915e-06, 'epoch': 0.36} 36%|███▌ | 7889/22095 [13:29:22<21:45:50, 5.52s/it] 36%|███▌ | 7890/22095 [13:29:25<19:01:13, 4.82s/it] {'loss': 0.3829, 'grad_norm': 0.7367063069317463, 'learning_rate': 7.447772943702258e-06, 'epoch': 0.36} 36%|███▌ | 7890/22095 [13:29:25<19:01:13, 4.82s/it] 36%|███▌ | 7891/22095 [13:29:29<17:40:17, 4.48s/it] {'loss': 0.3148, 'grad_norm': 0.6245920443685149, 'learning_rate': 7.4471338303233e-06, 'epoch': 0.36} 36%|███▌ | 7891/22095 [13:29:29<17:40:17, 4.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (133622 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7892/22095 [13:29:32<15:40:12, 3.97s/it] {'loss': 0.3702, 'grad_norm': 0.6574094557932094, 'learning_rate': 7.4464946643629535e-06, 'epoch': 0.36} 36%|███▌ | 7892/22095 [13:29:32<15:40:12, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (119306 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121796 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7893/22095 [13:29:35<15:04:49, 3.82s/it] {'loss': 0.344, 'grad_norm': 0.5573694083411818, 'learning_rate': 7.4458554458349485e-06, 'epoch': 0.36} 36%|███▌ | 7893/22095 [13:29:35<15:04:49, 3.82s/it] 36%|███▌ | 7894/22095 [13:29:39<14:52:58, 3.77s/it] {'loss': 0.3853, 'grad_norm': 0.6893382885904867, 'learning_rate': 7.445216174753022e-06, 'epoch': 0.36} 36%|███▌ | 7894/22095 [13:29:39<14:52:58, 3.77s/it] 36%|███▌ | 7895/22095 [13:29:42<13:59:48, 3.55s/it] {'loss': 0.3627, 'grad_norm': 0.6443073771654781, 'learning_rate': 7.444576851130911e-06, 'epoch': 0.36} 36%|███▌ | 7895/22095 [13:29:42<13:59:48, 3.55s/it]VC:s3://gui-agent/data_20250609/windows/images/origin/20250514_131719_1/images/before_screenshot_71_concat_left.png 2025-08-28 05:27:41.953300 load time: 1031.8 ms 36%|███▌ | 7896/22095 [13:29:45<13:40:11, 3.47s/it] {'loss': 0.3524, 'grad_norm': 0.659029696855723, 'learning_rate': 7.443937474982351e-06, 'epoch': 0.36} 36%|███▌ | 7896/22095 [13:29:45<13:40:11, 3.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7897/22095 [13:29:48<12:51:06, 3.26s/it] {'loss': 0.3294, 'grad_norm': 0.6751787259300283, 'learning_rate': 7.443298046321082e-06, 'epoch': 0.36} 36%|███▌ | 7897/22095 [13:29:48<12:51:06, 3.26s/it]Token indices sequence 
length is longer than the specified maximum sequence length for this model (59744 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42231 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54794 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7898/22095 [13:29:51<12:18:13, 3.12s/it] {'loss': 0.3308, 'grad_norm': 0.6531740656256767, 'learning_rate': 7.442658565160838e-06, 'epoch': 0.36} 36%|███▌ | 7898/22095 [13:29:51<12:18:13, 3.12s/it] 36%|███▌ | 7899/22095 [13:29:54<12:31:05, 3.17s/it] {'loss': 0.2847, 'grad_norm': 0.7154670512286024, 'learning_rate': 7.442019031515368e-06, 'epoch': 0.36} 36%|███▌ | 7899/22095 [13:29:54<12:31:05, 3.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (60059 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42475 > 40960). 
Running this sequence through the model will result in indexing errors
36%|███▌ | 7900/22095 [13:30:02<17:51:52, 4.53s/it] {'loss': 0.4848, 'grad_norm': 1.021117071184924, 'learning_rate': 7.4413794453984065e-06, 'epoch': 0.36}
36%|███▌ | 7901/22095 [13:30:11<23:39:33, 6.00s/it] {'loss': 0.4649, 'grad_norm': 0.880368866321953, 'learning_rate': 7.4407398068237e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396965 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
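The ValueError above shows the dataset loader rejecting a sample whose image (reported as 12x14 pixels) falls below the 28-pixel minimum enforced in `data_qwen_2.py`. A hypothetical pre-filter that would catch such samples before they reach the loader; the 28-pixel constant is taken from the traceback, the function itself is illustrative:

```python
MIN_SIDE = 28  # minimum image side enforced by the loader, per the traceback above

def image_large_enough(width, height, min_side=MIN_SIDE):
    """Return True if both sides of an image meet the loader's minimum size."""
    return width >= min_side and height >= min_side

# The failing sample reports image_wh [[12, 14]], so it would be filtered out:
ok = image_large_enough(12, 14)  # → False
```

Filtering at dataset-build time avoids the retry loop ("[Try #0] Failed to fetch sample ...") that the exception otherwise triggers during training.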
Problematic sample: {'id': 63818, 'image': 'vrdu_table_final_2/astro-ph.EP/d6956022-4422-4ae8-a93d-421530edadcc.png', 'image_wh': [[12, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[t]{l}z\\end{tabular}\n```"}]} Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7902/22095 [13:30:15<20:42:03, 5.25s/it] {'loss': 0.3355, 'grad_norm': 0.6463014724398792, 'learning_rate': 7.440100115804991e-06, 'epoch': 0.36} 36%|███▌ | 7902/22095 [13:30:15<20:42:03, 5.25s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_092710_4/images/before_screenshot_45_id_1320_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 05:28:14.892189 load time: 1327.55 ms 36%|███▌ | 7903/22095 [13:30:18<18:53:59, 4.79s/it] {'loss': 0.3782, 'grad_norm': 0.6454359331546151, 'learning_rate': 7.439460372356025e-06, 'epoch': 0.36} 36%|███▌ | 7903/22095 [13:30:18<18:53:59, 4.79s/it] 36%|███▌ | 7904/22095 [13:30:23<18:06:37, 4.59s/it] {'loss': 0.3673, 'grad_norm': 0.6732821923908577, 'learning_rate': 7.438820576490546e-06, 'epoch': 0.36} 36%|███▌ | 7904/22095 [13:30:23<18:06:37, 4.59s/it] 36%|███▌ | 7905/22095 [13:30:26<16:21:22, 4.15s/it] {'loss': 0.3884, 'grad_norm': 0.6885551892394901, 'learning_rate': 7.438180728222306e-06, 'epoch': 0.36} 36%|███▌ | 7905/22095 [13:30:26<16:21:22, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7906/22095 [13:30:33<20:21:30, 5.17s/it] {'loss': 0.4863, 'grad_norm': 1.0716877491064174, 'learning_rate': 7.4375408275650475e-06, 'epoch': 0.36} 36%|███▌ | 7906/22095 [13:30:33<20:21:30, 5.17s/it] 36%|███▌ | 7907/22095 [13:30:37<18:31:51, 4.70s/it] {'loss': 0.3715, 'grad_norm': 0.6553793841227316, 'learning_rate': 7.436900874532526e-06, 'epoch': 0.36} 36%|███▌ | 7907/22095 
[13:30:37<18:31:51, 4.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (95612 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77023 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7908/22095 [13:30:45<22:50:40, 5.80s/it] {'loss': 0.5012, 'grad_norm': 0.8814940863816225, 'learning_rate': 7.436260869138486e-06, 'epoch': 0.36} 36%|███▌ | 7908/22095 [13:30:45<22:50:40, 5.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55409 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121302 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48799 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7909/22095 [13:30:49<21:03:06, 5.34s/it] {'loss': 0.3533, 'grad_norm': 0.6642419514151628, 'learning_rate': 7.435620811396684e-06, 'epoch': 0.36} 36%|███▌ | 7909/22095 [13:30:49<21:03:06, 5.34s/it] 36%|███▌ | 7910/22095 [13:30:52<18:17:22, 4.64s/it] {'loss': 0.3352, 'grad_norm': 0.661127123804159, 'learning_rate': 7.434980701320871e-06, 'epoch': 0.36} 36%|███▌ | 7910/22095 [13:30:52<18:17:22, 4.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7911/22095 [13:30:56<17:01:08, 4.32s/it] {'loss': 0.401, 'grad_norm': 0.68747966185606, 'learning_rate': 7.4343405389248e-06, 'epoch': 0.36} 36%|███▌ | 7911/22095 [13:30:56<17:01:08, 4.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49683 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70829 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115149 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7912/22095 [13:31:00<16:06:38, 4.09s/it] {'loss': 0.3478, 'grad_norm': 0.6979271552290265, 'learning_rate': 7.43370032422223e-06, 'epoch': 0.36} 36%|███▌ | 7912/22095 [13:31:00<16:06:38, 4.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87398 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7913/22095 [13:31:03<15:52:52, 4.03s/it] {'loss': 0.3571, 'grad_norm': 0.7081415200993405, 'learning_rate': 7.433060057226913e-06, 'epoch': 0.36} 36%|███▌ | 7913/22095 [13:31:03<15:52:52, 4.03s/it] 36%|███▌ | 7914/22095 [13:31:07<15:08:16, 3.84s/it] {'loss': 0.3477, 'grad_norm': 0.6564714199182704, 'learning_rate': 7.432419737952607e-06, 'epoch': 0.36} 36%|███▌ | 7914/22095 [13:31:07<15:08:16, 3.84s/it] 36%|███▌ | 7915/22095 [13:31:11<15:44:19, 4.00s/it] {'loss': 0.3791, 'grad_norm': 0.6726460485709904, 'learning_rate': 7.431779366413073e-06, 'epoch': 0.36} 36%|███▌ | 7915/22095 [13:31:11<15:44:19, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7916/22095 [13:31:20<21:47:57, 5.53s/it] {'loss': 0.5026, 'grad_norm': 1.088140205794908, 'learning_rate': 7.431138942622069e-06, 'epoch': 0.36} 36%|███▌ | 7916/22095 [13:31:20<21:47:57, 5.53s/it] 36%|███▌ | 7917/22095 [13:31:27<22:35:15, 5.74s/it] {'loss': 0.5146, 'grad_norm': 0.8023510194742443, 'learning_rate': 7.430498466593355e-06, 'epoch': 0.36} 36%|███▌ | 7917/22095 [13:31:27<22:35:15, 5.74s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7918/22095 [13:31:30<19:53:36, 5.05s/it] {'loss': 0.32, 'grad_norm': 0.6175256937878455, 'learning_rate': 7.429857938340693e-06, 'epoch': 0.36} 36%|███▌ | 7918/22095 [13:31:30<19:53:36, 5.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64362 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46596 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56786 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7919/22095 [13:31:34<18:31:45, 4.71s/it] {'loss': 0.366, 'grad_norm': 0.6853174040711785, 'learning_rate': 7.429217357877848e-06, 'epoch': 0.36} 36%|███▌ | 7919/22095 [13:31:34<18:31:45, 4.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7920/22095 [13:31:44<24:20:43, 6.18s/it] {'loss': 0.4921, 'grad_norm': 0.5864777621038076, 'learning_rate': 7.4285767252185824e-06, 'epoch': 0.36} 36%|███▌ | 7920/22095 [13:31:44<24:20:43, 6.18s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/inventor/20250513_143511_911997_1564_1/images/before_screenshot_1_id_0_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:29:42.288124 load time: 1023.67 ms 36%|███▌ | 7921/22095 [13:31:50<24:10:30, 6.14s/it] {'loss': 0.4547, 'grad_norm': 0.7552732461077418, 'learning_rate': 7.427936040376662e-06, 'epoch': 0.36} 36%|███▌ | 7921/22095 [13:31:50<24:10:30, 6.14s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (65350 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7922/22095 [13:31:54<21:43:04, 5.52s/it] {'loss': 0.3188, 'grad_norm': 0.6815214328335388, 'learning_rate': 7.427295303365851e-06, 'epoch': 0.36} 36%|███▌ | 7922/22095 [13:31:54<21:43:04, 5.52s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 05:29:52.412469 load time: 1171.6 ms 36%|███▌ | 7923/22095 [13:31:57<18:50:45, 4.79s/it] {'loss': 0.3368, 'grad_norm': 0.6302372503519815, 'learning_rate': 7.426654514199921e-06, 'epoch': 0.36} 36%|███▌ | 7923/22095 [13:31:57<18:50:45, 4.79s/it] 36%|███▌ | 7924/22095 [13:32:01<17:53:14, 4.54s/it] {'loss': 0.379, 'grad_norm': 0.7116580236180317, 'learning_rate': 7.426013672892639e-06, 'epoch': 0.36} 36%|███▌ | 7924/22095 [13:32:01<17:53:14, 4.54s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 05:30:00.742571 load time: 1148.76 ms 36%|███▌ | 7925/22095 [13:32:04<15:54:16, 4.04s/it] {'loss': 0.3561, 'grad_norm': 0.6324604196524789, 'learning_rate': 7.425372779457771e-06, 'epoch': 0.36} 36%|███▌ | 7925/22095 [13:32:04<15:54:16, 4.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43785 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47330 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42239 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70030 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7926/22095 [13:32:07<15:18:54, 3.89s/it] {'loss': 0.3985, 'grad_norm': 0.9011199431306274, 'learning_rate': 7.424731833909094e-06, 'epoch': 0.36} 36%|███▌ | 7926/22095 [13:32:07<15:18:54, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7927/22095 [13:32:14<18:44:37, 4.76s/it] {'loss': 0.4937, 'grad_norm': 0.6233800357442847, 'learning_rate': 7.4240908362603745e-06, 'epoch': 0.36} 36%|███▌ | 7927/22095 [13:32:14<18:44:37, 4.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7928/22095 [13:32:18<17:50:08, 4.53s/it] {'loss': 0.3669, 'grad_norm': 0.6580323715547287, 'learning_rate': 7.423449786525391e-06, 'epoch': 0.36} 36%|███▌ | 7928/22095 [13:32:18<17:50:08, 4.53s/it] 36%|███▌ | 7929/22095 [13:32:21<16:44:02, 4.25s/it] {'loss': 0.3993, 'grad_norm': 0.6788612202436244, 'learning_rate': 7.422808684717913e-06, 'epoch': 0.36} 36%|███▌ | 7929/22095 [13:32:21<16:44:02, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (85390 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48377 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42696 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74264 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49672 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7930/22095 [13:32:31<23:00:34, 5.85s/it] {'loss': 0.4668, 'grad_norm': 0.45794257757974255, 'learning_rate': 7.422167530851716e-06, 'epoch': 0.36} 36%|███▌ | 7930/22095 [13:32:31<23:00:34, 5.85s/it] 36%|███▌ | 7931/22095 [13:32:35<20:38:16, 5.25s/it] {'loss': 0.4117, 'grad_norm': 0.7067797996645679, 'learning_rate': 7.42152632494058e-06, 'epoch': 0.36} 36%|███▌ | 7931/22095 [13:32:35<20:38:16, 5.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41351 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48811 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7932/22095 [13:32:38<17:58:42, 4.57s/it] {'loss': 0.4242, 'grad_norm': 0.7374060859848149, 'learning_rate': 7.42088506699828e-06, 'epoch': 0.36} 36%|███▌ | 7932/22095 [13:32:38<17:58:42, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (123922 > 40960). 
Running this sequence through the model will result in indexing errors VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_2/images/step_0.png 2025-08-28 05:30:37.868926 load time: 1552.11 ms 36%|███▌ | 7933/22095 [13:32:41<16:08:56, 4.11s/it] {'loss': 0.3541, 'grad_norm': 0.6940096071197507, 'learning_rate': 7.420243757038593e-06, 'epoch': 0.36} 36%|███▌ | 7933/22095 [13:32:41<16:08:56, 4.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7934/22095 [13:32:50<22:30:44, 5.72s/it] {'loss': 0.479, 'grad_norm': 0.46216186246289315, 'learning_rate': 7.419602395075304e-06, 'epoch': 0.36} 36%|███▌ | 7934/22095 [13:32:50<22:30:44, 5.72s/it] 36%|███▌ | 7935/22095 [13:32:54<19:28:58, 4.95s/it] {'loss': 0.3263, 'grad_norm': 0.7182676372748836, 'learning_rate': 7.418960981122188e-06, 'epoch': 0.36} 36%|███▌ | 7935/22095 [13:32:54<19:28:58, 4.95s/it] 36%|███▌ | 7936/22095 [13:32:57<17:55:10, 4.56s/it] {'loss': 0.3706, 'grad_norm': 0.6259753858694471, 'learning_rate': 7.418319515193032e-06, 'epoch': 0.36} 36%|███▌ | 7936/22095 [13:32:57<17:55:10, 4.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7937/22095 [13:33:00<15:51:44, 4.03s/it] {'loss': 0.3708, 'grad_norm': 0.8776219507096461, 'learning_rate': 7.4176779973016156e-06, 'epoch': 0.36} 36%|███▌ | 7937/22095 [13:33:00<15:51:44, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7938/22095 [13:33:10<22:38:58, 5.76s/it] {'loss': 0.4882, 'grad_norm': 0.41646167675109397, 'learning_rate': 7.417036427461726e-06, 'epoch': 0.36} 36%|███▌ | 7938/22095 [13:33:10<22:38:58, 5.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51822 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7939/22095 [13:33:13<19:26:13, 4.94s/it] {'loss': 0.3602, 'grad_norm': 0.7252858911467089, 'learning_rate': 7.416394805687145e-06, 'epoch': 0.36} 36%|███▌ | 7939/22095 [13:33:13<19:26:13, 4.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7940/22095 [13:33:21<23:13:49, 5.91s/it] {'loss': 0.4785, 'grad_norm': 0.3726637247353118, 'learning_rate': 7.415753131991661e-06, 'epoch': 0.36} 36%|███▌ | 7940/22095 [13:33:21<23:13:49, 5.91s/it] 36%|███▌ | 7941/22095 [13:33:24<20:13:59, 5.15s/it] {'loss': 0.3669, 'grad_norm': 0.740712850578873, 'learning_rate': 7.415111406389063e-06, 'epoch': 0.36} 36%|███▌ | 7941/22095 [13:33:24<20:13:59, 5.15s/it] 36%|███▌ | 7942/22095 [13:33:28<19:00:17, 4.83s/it] {'loss': 0.3762, 'grad_norm': 0.6646504393206524, 'learning_rate': 7.414469628893137e-06, 'epoch': 0.36} 36%|███▌ | 7942/22095 [13:33:28<19:00:17, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58595 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51762 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50994 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93114 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43744 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7943/22095 [13:33:32<17:35:06, 4.47s/it] {'loss': 0.3475, 'grad_norm': 0.6356744766828022, 'learning_rate': 7.413827799517674e-06, 'epoch': 0.36} 36%|███▌ | 7943/22095 [13:33:32<17:35:06, 4.47s/it] 36%|███▌ | 7944/22095 [13:33:36<16:26:54, 4.18s/it] {'loss': 0.3348, 'grad_norm': 0.607166637688898, 'learning_rate': 7.413185918276467e-06, 'epoch': 0.36} 36%|███▌ | 7944/22095 [13:33:36<16:26:54, 4.18s/it] 36%|███▌ | 7945/22095 [13:33:39<15:40:14, 3.99s/it] {'loss': 0.4069, 'grad_norm': 0.7131483598239857, 'learning_rate': 7.412543985183306e-06, 'epoch': 0.36} 36%|███▌ | 7945/22095 [13:33:39<15:40:14, 3.99s/it] 36%|███▌ | 7946/22095 [13:33:42<14:53:58, 3.79s/it] {'loss': 0.3566, 'grad_norm': 0.6299425375662295, 'learning_rate': 7.411902000251983e-06, 'epoch': 0.36} 36%|███▌ | 7946/22095 [13:33:42<14:53:58, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7947/22095 [13:33:52<21:31:27, 5.48s/it] {'loss': 0.4626, 'grad_norm': 0.5049556350096941, 'learning_rate': 7.411259963496294e-06, 'epoch': 0.36} 36%|███▌ | 7947/22095 [13:33:52<21:31:27, 5.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42874 > 40960). 
Running this sequence through the model will result in indexing errors
36%|███▌ | 7948/22095 [13:33:55<18:41:08, 4.75s/it] {'loss': 0.3619, 'grad_norm': 0.6506714341238237, 'learning_rate': 7.410617874930034e-06, 'epoch': 0.36}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
36%|███▌ | 7949/22095 [13:33:58<16:45:13, 4.26s/it] {'loss': 0.323, 'grad_norm': 0.639220736506604, 'learning_rate': 7.409975734566998e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8916668 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39821, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AC=6,CB=3,∴AB=6+3=9,∵O是线段AB的中点,∴AO=9÷2=4.5,∴OC=AC-AO=6-4.5=1.5.'}]}
36%|███▌ | 7950/22095 [13:34:01<14:56:09, 3.80s/it] {'loss': 0.3704, 'grad_norm': 0.7206397484804806, 'learning_rate': 7.4093335424209875e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
36%|███▌ | 7951/22095 [13:34:09<20:16:23, 5.16s/it] {'loss': 0.4924, 'grad_norm': 0.37597732159073716, 'learning_rate': 7.4086912985057976e-06, 'epoch': 0.36}
36%|███▌ | 7952/22095 [13:34:13<18:29:01, 4.70s/it] {'loss': 0.3768, 'grad_norm': 0.6455469760980388, 'learning_rate': 7.40804900283523e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (59566 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57385 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43923 > 40960). Running this sequence through the model will result in indexing errors
36%|███▌ | 7953/22095 [13:34:17<17:39:10, 4.49s/it] {'loss': 0.3262, 'grad_norm': 0.6301467889728702, 'learning_rate': 7.407406655423086e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (98976 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68091 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94414 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7954/22095 [13:34:20<15:47:34, 4.02s/it] {'loss': 0.3356, 'grad_norm': 0.6135304956212745, 'learning_rate': 7.4067642562831656e-06, 'epoch': 0.36} 36%|███▌ | 7954/22095 [13:34:20<15:47:34, 4.02s/it] 36%|███▌ | 7955/22095 [13:34:23<14:31:15, 3.70s/it] {'loss': 0.3211, 'grad_norm': 0.6199600583542145, 'learning_rate': 7.406121805429274e-06, 'epoch': 0.36} 36%|███▌ | 7955/22095 [13:34:23<14:31:15, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (107402 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53025 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7956/22095 [13:34:32<21:20:08, 5.43s/it] {'loss': 0.4783, 'grad_norm': 0.39504063291210206, 'learning_rate': 7.405479302875212e-06, 'epoch': 0.36} 36%|███▌ | 7956/22095 [13:34:32<21:20:08, 5.43s/it] 36%|███▌ | 7957/22095 [13:34:36<19:43:59, 5.02s/it] {'loss': 0.389, 'grad_norm': 0.6760457687153064, 'learning_rate': 7.404836748634791e-06, 'epoch': 0.36} 36%|███▌ | 7957/22095 [13:34:36<19:43:59, 5.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51018 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7958/22095 [13:34:39<17:13:00, 4.38s/it] {'loss': 0.3426, 'grad_norm': 0.6950822303958237, 'learning_rate': 7.404194142721812e-06, 'epoch': 0.36} 36%|███▌ | 7958/22095 [13:34:39<17:13:00, 4.38s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7959/22095 [13:34:43<16:26:44, 4.19s/it] {'loss': 0.345, 'grad_norm': 0.7108334017246865, 'learning_rate': 7.403551485150086e-06, 'epoch': 0.36} 36%|███▌ | 7959/22095 [13:34:43<16:26:44, 4.19s/it] 36%|███▌ | 7960/22095 [13:34:46<15:04:22, 3.84s/it] {'loss': 0.3175, 'grad_norm': 0.5868546699602407, 'learning_rate': 7.402908775933419e-06, 'epoch': 0.36} 36%|███▌ | 7960/22095 [13:34:46<15:04:22, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7961/22095 [13:34:56<22:35:53, 5.76s/it] {'loss': 0.4923, 'grad_norm': 0.34926700370057423, 'learning_rate': 7.402266015085624e-06, 'epoch': 0.36} 36%|███▌ | 7961/22095 [13:34:56<22:35:53, 5.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7962/22095 [13:35:06<27:48:20, 7.08s/it] {'loss': 0.4733, 'grad_norm': 0.31677625195627207, 'learning_rate': 7.401623202620509e-06, 'epoch': 0.36} 36%|███▌ | 7962/22095 [13:35:06<27:48:20, 7.08s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7963/22095 [13:35:09<23:13:50, 5.92s/it] {'loss': 0.3187, 'grad_norm': 0.6652658900532104, 'learning_rate': 7.40098033855189e-06, 'epoch': 0.36} 36%|███▌ | 7963/22095 [13:35:09<23:13:50, 5.92s/it] 36%|███▌ | 7964/22095 [13:35:20<28:25:25, 7.24s/it] {'loss': 0.5076, 'grad_norm': 0.3583699712794321, 'learning_rate': 7.4003374228935746e-06, 'epoch': 0.36} 36%|███▌ | 7964/22095 
[13:35:20<28:25:25, 7.24s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250609/windows/images/stata/20250515_164657_1/images/before_screenshot_4_concat_left.png 2025-08-28 05:33:18.528422 load time: 1017.85 ms 36%|███▌ | 7965/22095 [13:35:24<24:49:07, 6.32s/it] {'loss': 0.3531, 'grad_norm': 0.6172428639056611, 'learning_rate': 7.399694455659382e-06, 'epoch': 0.36} 36%|███▌ | 7965/22095 [13:35:24<24:49:07, 6.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61593 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65462 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94280 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7966/22095 [13:35:33<28:23:21, 7.23s/it] {'loss': 0.4982, 'grad_norm': 0.30345325260551814, 'learning_rate': 7.399051436863125e-06, 'epoch': 0.36} 36%|███▌ | 7966/22095 [13:35:33<28:23:21, 7.23s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▌ | 7967/22095 [13:35:37<23:41:18, 6.04s/it] {'loss': 0.3837, 'grad_norm': 0.7233199446249501, 'learning_rate': 7.39840836651862e-06, 'epoch': 0.36} 36%|███▌ | 7967/22095 [13:35:37<23:41:18, 6.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75547 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44497 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91210 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7968/22095 [13:35:40<21:06:42, 5.38s/it] {'loss': 0.3595, 'grad_norm': 0.6763101826128397, 'learning_rate': 7.3977652446396855e-06, 'epoch': 0.36} 36%|███▌ | 7968/22095 [13:35:40<21:06:42, 5.38s/it] 36%|███▌ | 7969/22095 [13:35:45<19:53:54, 5.07s/it] {'loss': 0.3527, 'grad_norm': 0.6170249152749456, 'learning_rate': 7.397122071240141e-06, 'epoch': 0.36} 36%|███▌ | 7969/22095 [13:35:45<19:53:54, 5.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047100 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 
32cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047923 in VC:s3://multi-modal/UniGeo/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 16\nB. 2\nC. 4\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 36%|███▌ | 7970/22095 [13:35:49<18:24:22, 4.69s/it] {'loss': 0.3615, 'grad_norm': 0.5710552406604417, 'learning_rate': 7.396478846333805e-06, 'epoch': 0.36} 36%|███▌ | 7970/22095 [13:35:49<18:24:22, 4.69s/it] 36%|███▌ | 7971/22095 [13:35:52<17:18:31, 4.41s/it] {'loss': 0.3775, 'grad_norm': 0.6389271121653773, 'learning_rate': 7.395835569934498e-06, 'epoch': 0.36} 36%|███▌ | 7971/22095 [13:35:52<17:18:31, 4.41s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30014.png 2025-08-28 05:33:49.283552 load time: 1345.4 ms 36%|███▌ | 7972/22095 [13:35:56<16:12:14, 4.13s/it] {'loss': 0.3772, 'grad_norm': 0.6142088904895509, 'learning_rate': 7.395192242056044e-06, 'epoch': 0.36} 36%|███▌ | 7972/22095 [13:35:56<16:12:14, 4.13s/it] 36%|███▌ | 7973/22095 [13:35:59<15:13:29, 3.88s/it] {'loss': 0.3425, 'grad_norm': 0.6090371660988895, 'learning_rate': 7.394548862712264e-06, 'epoch': 0.36} 36%|███▌ | 7973/22095 [13:35:59<15:13:29, 3.88s/it]Invalidate trace 
cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7974/22095 [13:36:08<21:37:30, 5.51s/it] {'loss': 0.5164, 'grad_norm': 0.35945651666517714, 'learning_rate': 7.393905431916985e-06, 'epoch': 0.36} 36%|███▌ | 7974/22095 [13:36:08<21:37:30, 5.51s/it] 36%|███▌ | 7975/22095 [13:36:12<18:58:35, 4.84s/it] {'loss': 0.3691, 'grad_norm': 0.6718626629578323, 'learning_rate': 7.393261949684027e-06, 'epoch': 0.36} 36%|███▌ | 7975/22095 [13:36:12<18:58:35, 4.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7976/22095 [13:36:17<19:47:14, 5.05s/it] {'loss': 0.4721, 'grad_norm': 0.32284190264535156, 'learning_rate': 7.392618416027224e-06, 'epoch': 0.36} 36%|███▌ | 7976/22095 [13:36:17<19:47:14, 5.05s/it] 36%|███▌ | 7977/22095 [13:36:20<17:37:40, 4.50s/it] {'loss': 0.3766, 'grad_norm': 0.6600892140380253, 'learning_rate': 7.3919748309603965e-06, 'epoch': 0.36} 36%|███▌ | 7977/22095 [13:36:20<17:37:40, 4.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7978/22095 [13:36:24<16:11:47, 4.13s/it] {'loss': 0.352, 'grad_norm': 0.6506458136837409, 'learning_rate': 7.391331194497379e-06, 'epoch': 0.36} 36%|███▌ | 7978/22095 [13:36:24<16:11:47, 4.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7979/22095 [13:36:27<15:23:11, 3.92s/it] {'loss': 0.3817, 'grad_norm': 0.6494080072148609, 'learning_rate': 7.3906875066519964e-06, 'epoch': 0.36} 36%|███▌ | 7979/22095 [13:36:27<15:23:11, 3.92s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image 
size [48, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396962 in VC:s3://internvl-moe-sft-data/. Exception: Image size [48, 17, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63815, 'image': 'vrdu_table_final_2/astro-ph.EP/a8378c9c-a0c7-4823-b6c9-feda0cbcdf4a.png', 'image_wh': [[48, 17]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$-e_x$\\\\\\end{tabular}\n```"}]} 36%|███▌ | 7980/22095 [13:36:30<14:41:31, 3.75s/it] {'loss': 0.31, 'grad_norm': 0.6758242870450416, 'learning_rate': 7.390043767438083e-06, 'epoch': 0.36} 36%|███▌ | 7980/22095 [13:36:30<14:41:31, 3.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42641 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55673 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58042 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (41317 > 40960) for 4 sample(s). Truncating to 40756 with 2 samples. 36%|███▌ | 7981/22095 [13:36:34<14:18:42, 3.65s/it] {'loss': 0.3563, 'grad_norm': 0.6400939611532723, 'learning_rate': 7.389399976869469e-06, 'epoch': 0.36} 36%|███▌ | 7981/22095 [13:36:34<14:18:42, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78803 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7982/22095 [13:36:37<13:09:39, 3.36s/it] {'loss': 0.318, 'grad_norm': 0.673003801643574, 'learning_rate': 7.388756134959989e-06, 'epoch': 0.36} 36%|███▌ | 7982/22095 [13:36:37<13:09:39, 3.36s/it] 36%|███▌ | 7983/22095 [13:36:40<13:17:57, 3.39s/it] {'loss': 0.3677, 'grad_norm': 0.609327398367172, 'learning_rate': 7.388112241723475e-06, 'epoch': 0.36} 36%|███▌ | 7983/22095 [13:36:40<13:17:57, 3.39s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7984/22095 [13:36:43<13:03:39, 3.33s/it] {'loss': 0.3236, 'grad_norm': 0.6815401431992644, 'learning_rate': 7.387468297173764e-06, 'epoch': 0.36} 36%|███▌ | 7984/22095 [13:36:43<13:03:39, 3.33s/it] 36%|███▌ | 7985/22095 [13:36:47<13:42:48, 3.50s/it] {'loss': 0.4191, 'grad_norm': 0.6689289187919621, 'learning_rate': 7.386824301324691e-06, 'epoch': 0.36} 36%|███▌ | 7985/22095 [13:36:47<13:42:48, 3.50s/it] 36%|███▌ | 7986/22095 [13:36:51<13:52:22, 3.54s/it] {'loss': 0.351, 'grad_norm': 0.6081946398697645, 'learning_rate': 7.386180254190095e-06, 'epoch': 0.36} 36%|███▌ | 7986/22095 [13:36:51<13:52:22, 3.54s/it] 36%|███▌ | 7987/22095 [13:36:54<13:54:24, 3.55s/it] {'loss': 0.3843, 'grad_norm': 0.7500202226223562, 'learning_rate': 7.3855361557838145e-06, 'epoch': 0.36} 36%|███▌ | 7987/22095 [13:36:54<13:54:24, 3.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366646 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33392, 'image': 'vrdu_table_final_2/astro-ph.CO/7762b07a-53fa-461b-91c0-c9b7206817ee.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_10/img/step_0.png 2025-08-28 05:34:53.087807 load time: 1028.69 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7988/22095 [13:36:58<13:46:30, 3.52s/it] {'loss': 0.3757, 'grad_norm': 0.6559071916954051, 'learning_rate': 7.384892006119687e-06, 'epoch': 0.36} 36%|███▌ | 7988/22095 [13:36:58<13:46:30, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047935 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 6.5\nB. 4\nC. 5\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 36%|███▌ | 7989/22095 [13:37:08<21:13:05, 5.42s/it] {'loss': 0.4832, 'grad_norm': 0.4872198828882111, 'learning_rate': 7.384247805211556e-06, 'epoch': 0.36} 36%|███▌ | 7989/22095 [13:37:08<21:13:05, 5.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66506 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7990/22095 [13:37:14<22:20:38, 5.70s/it] {'loss': 0.4856, 'grad_norm': 0.3916520443560128, 'learning_rate': 7.383603553073262e-06, 'epoch': 0.36} 36%|███▌ | 7990/22095 [13:37:14<22:20:38, 5.70s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_175007_3/images/before_screenshot_17_id_114_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 05:35:12.752699 load time: 1083.33 ms 36%|███▌ | 7991/22095 [13:37:17<19:32:22, 4.99s/it] {'loss': 0.3566, 'grad_norm': 0.5976092297300865, 'learning_rate': 7.382959249718648e-06, 'epoch': 0.36} 36%|███▌ | 7991/22095 [13:37:17<19:32:22, 4.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107171 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85094 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79229 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▌ | 7992/22095 [13:37:21<18:33:44, 4.74s/it] {'loss': 0.3494, 'grad_norm': 0.67388547399459, 'learning_rate': 7.3823148951615605e-06, 'epoch': 0.36} 36%|███▌ | 7992/22095 [13:37:21<18:33:44, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50666 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (158141 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 7993/22095 [13:37:26<17:57:41, 4.59s/it] {'loss': 0.3578, 'grad_norm': 0.6173366485190657, 'learning_rate': 7.38167048941584e-06, 'epoch': 0.36} 36%|███▌ | 7993/22095 [13:37:26<17:57:41, 4.59s/it] 36%|███▌ | 7994/22095 [13:37:29<16:22:13, 4.18s/it] {'loss': 0.348, 'grad_norm': 0.5931781342085648, 'learning_rate': 7.381026032495338e-06, 'epoch': 0.36} 36%|███▌ | 7994/22095 [13:37:29<16:22:13, 4.18s/it] 36%|███▌ | 7995/22095 [13:37:33<15:57:10, 4.07s/it] {'loss': 0.4096, 'grad_norm': 0.6809323889524895, 'learning_rate': 7.3803815244138976e-06, 'epoch': 0.36} 36%|███▌ | 7995/22095 [13:37:33<15:57:10, 4.07s/it] 36%|███▌ | 7996/22095 [13:37:36<15:01:39, 3.84s/it] {'loss': 0.3388, 'grad_norm': 0.6209283680560373, 'learning_rate': 7.379736965185369e-06, 'epoch': 0.36} 36%|███▌ | 7996/22095 [13:37:36<15:01:39, 3.84s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 7997/22095 [13:37:39<13:52:39, 3.54s/it] {'loss': 0.3515, 'grad_norm': 0.6650202358554252, 'learning_rate': 7.379092354823602e-06, 'epoch': 0.36} 36%|███▌ | 7997/22095 [13:37:39<13:52:39, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 7998/22095 [13:37:47<18:54:10, 4.83s/it] {'loss': 0.4754, 'grad_norm': 0.7646399480977066, 'learning_rate': 
7.378447693342447e-06, 'epoch': 0.36} 36%|███▌ | 7998/22095 [13:37:47<18:54:10, 4.83s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/432466e7-22e7-4194-aa9e-19f7c21adef5/images/step_5.png 2025-08-28 05:35:47.630490 load time: 1145.71 ms 36%|███▌ | 7999/22095 [13:37:51<17:53:51, 4.57s/it] {'loss': 0.3863, 'grad_norm': 0.6603043887038389, 'learning_rate': 7.377802980755756e-06, 'epoch': 0.36} 36%|███▌ | 7999/22095 [13:37:51<17:53:51, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▌ | 8000/22095 [13:38:00<23:43:17, 6.06s/it] {'loss': 0.5074, 'grad_norm': 0.42204413689316006, 'learning_rate': 7.377158217077381e-06, 'epoch': 0.36} 36%|███▌ | 8000/22095 [13:38:00<23:43:17, 6.06s/it] 36%|███▌ | 8001/22095 [13:38:40<63:51:07, 16.31s/it] {'loss': 0.499, 'grad_norm': 0.32598493421669134, 'learning_rate': 7.3765134023211785e-06, 'epoch': 0.36} 36%|███▌ | 8001/22095 [13:38:40<63:51:07, 16.31s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( Token indices sequence length is longer than the specified maximum sequence length for this model (48509 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97758 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42493 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57070 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83596 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 8002/22095 [13:38:44<48:44:45, 12.45s/it] {'loss': 0.325, 'grad_norm': 0.6719469428185199, 'learning_rate': 7.375868536501001e-06, 'epoch': 0.36} 36%|███▌ | 8002/22095 [13:38:44<48:44:45, 12.45s/it] 36%|███▌ | 8003/22095 [13:38:47<38:04:16, 9.73s/it] {'loss': 0.3806, 'grad_norm': 0.6800561272443149, 'learning_rate': 7.3752236196307045e-06, 'epoch': 0.36} 36%|███▌ | 8003/22095 [13:38:47<38:04:16, 9.73s/it] 36%|███▌ | 8004/22095 [13:38:51<30:38:28, 7.83s/it] {'loss': 0.3323, 'grad_norm': 0.6730755539353572, 'learning_rate': 7.374578651724149e-06, 'epoch': 0.36} 36%|███▌ | 8004/22095 [13:38:51<30:38:28, 7.83s/it] 36%|███▌ | 8005/22095 [13:38:54<25:48:05, 6.59s/it] {'loss': 0.354, 'grad_norm': 0.6339768447450999, 'learning_rate': 7.373933632795192e-06, 'epoch': 0.36} 36%|███▌ | 8005/22095 [13:38:54<25:48:05, 6.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8336865 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3486, 'image': 'vrdu_table_final_2/astro-ph.CO/aa83e60e-f987-42af-a88b-aec999a383fb.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 36%|███▌ | 8006/22095 [13:38:57<21:34:29, 5.51s/it] {'loss': 0.3718, 'grad_norm': 0.651900578435686, 'learning_rate': 7.37328856285769e-06, 'epoch': 0.36} 36%|███▌ | 8006/22095 [13:38:57<21:34:29, 5.51s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 36%|███▌ | 8007/22095 [13:39:00<18:44:53, 4.79s/it] {'loss': 0.3649, 'grad_norm': 0.6331718571481463, 'learning_rate': 7.372643441925508e-06, 'epoch': 0.36} 36%|███▌ | 8007/22095 [13:39:00<18:44:53, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73765 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69822 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118250 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (83973 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. 36%|███▌ | 8008/22095 [13:39:04<17:50:31, 4.56s/it] {'loss': 0.3652, 'grad_norm': 0.6150645079124384, 'learning_rate': 7.371998270012504e-06, 'epoch': 0.36} 36%|███▌ | 8008/22095 [13:39:04<17:50:31, 4.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55931 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93555 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43235 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52158 > 40960). Running this sequence through the model will result in indexing errors 36%|███▌ | 8009/22095 [13:39:08<16:37:07, 4.25s/it] {'loss': 0.3629, 'grad_norm': 0.6634436052319046, 'learning_rate': 7.371353047132542e-06, 'epoch': 0.36} 36%|███▌ | 8009/22095 [13:39:08<16:37:07, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (102027 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54324 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46605 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94998 > 40960). Running this sequence through the model will result in indexing errors 36%|███▋ | 8010/22095 [13:39:18<22:48:36, 5.83s/it] {'loss': 0.5121, 'grad_norm': 1.0406447821867784, 'learning_rate': 7.370707773299486e-06, 'epoch': 0.36} 36%|███▋ | 8010/22095 [13:39:18<22:48:36, 5.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45366 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41884 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87239 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94274 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92811 > 40960). Running this sequence through the model will result in indexing errors 36%|███▋ | 8011/22095 [13:39:21<20:11:48, 5.16s/it] {'loss': 0.3304, 'grad_norm': 0.6452982035033098, 'learning_rate': 7.370062448527202e-06, 'epoch': 0.36} 36%|███▋ | 8011/22095 [13:39:21<20:11:48, 5.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43495 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81893 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54177 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▋ | 8012/22095 [13:39:25<18:48:16, 4.81s/it] {'loss': 0.4023, 'grad_norm': 0.7508094008376939, 'learning_rate': 7.369417072829555e-06, 'epoch': 0.36} 36%|███▋ | 8012/22095 [13:39:25<18:48:16, 4.81s/it] 36%|███▋ | 8013/22095 [13:39:28<16:59:55, 4.35s/it] {'loss': 0.3824, 'grad_norm': 0.6473029618057764, 'learning_rate': 7.368771646220412e-06, 'epoch': 0.36} 36%|███▋ | 8013/22095 [13:39:28<16:59:55, 4.35s/it] 36%|███▋ | 8014/22095 [13:39:32<16:15:54, 4.16s/it] {'loss': 0.3445, 'grad_norm': 0.694529569892738, 'learning_rate': 7.36812616871364e-06, 'epoch': 0.36} 36%|███▋ | 8014/22095 [13:39:32<16:15:54, 4.16s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045964 in VC:s3://multi-modal/UniGeo/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 3\nB. 4\nC. 6\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 36%|███▋ | 8015/22095 [13:39:36<15:58:20, 4.08s/it] {'loss': 0.3497, 'grad_norm': 0.6285840846990028, 'learning_rate': 7.367480640323113e-06, 'epoch': 0.36} 36%|███▋ | 8015/22095 [13:39:36<15:58:20, 4.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 36%|███▋ | 8016/22095 [13:39:45<22:17:03, 5.70s/it] {'loss': 0.4819, 'grad_norm': 0.4083667963474437, 'learning_rate': 7.366835061062696e-06, 'epoch': 0.36} 36%|███▋ | 8016/22095 [13:39:45<22:17:03, 5.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49820 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116606 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45311 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72534 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74693 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41519 > 40960). 
Running this sequence through the model will result in indexing errors 36%|███▋ | 8017/22095 [13:39:55<26:46:18, 6.85s/it] {'loss': 0.4857, 'grad_norm': 0.39779569020588457, 'learning_rate': 7.366189430946262e-06, 'epoch': 0.36} 36%|███▋ | 8017/22095 [13:39:55<26:46:18, 6.85s/it] 36%|███▋ | 8018/22095 [13:40:04<29:53:59, 7.65s/it] {'loss': 0.4672, 'grad_norm': 0.33896363384718664, 'learning_rate': 7.365543749987685e-06, 'epoch': 0.36} 36%|███▋ | 8018/22095 [13:40:05<29:53:59, 7.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8567143 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 24645, 'image': '671501437.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a homosexuality book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [75, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8334144 in VC:s3://internvl-moe-sft-data/. Exception: Image size [75, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 754, 'image': 'vrdu_table_final_2/astro-ph.CO/e9f576fe-3750-4b89-a18c-45f296e5725a.png', 'image_wh': [[75, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{l}3C147\\end{tabular}\n```"}]} 36%|███▋ | 8019/22095 [13:40:15<32:51:11, 8.40s/it] {'loss': 0.4865, 'grad_norm': 0.3043889489229819, 'learning_rate': 7.364898018200839e-06, 'epoch': 0.36} 36%|███▋ | 8019/22095 [13:40:15<32:51:11, 8.40s/it] 36%|███▋ | 8020/22095 [13:40:25<34:46:49, 8.90s/it] {'loss': 0.4573, 'grad_norm': 0.2978844693860657, 'learning_rate': 7.364252235599596e-06, 'epoch': 0.36} 36%|███▋ | 8020/22095 [13:40:25<34:46:49, 8.90s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 36%|███▋ | 8021/22095 [13:40:28<28:15:09, 7.23s/it] {'loss': 0.3927, 'grad_norm': 0.841995713596338, 'learning_rate': 7.363606402197836e-06, 'epoch': 0.36} 36%|███▋ | 8021/22095 [13:40:28<28:15:09, 7.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48062 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55591 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44831 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8022/22095 [13:40:31<23:28:26, 6.00s/it] {'loss': 0.3717, 'grad_norm': 0.7853050599559405, 'learning_rate': 7.362960518009432e-06, 'epoch': 0.36}
36%|███▋ | 8023/22095 [13:40:35<20:52:11, 5.34s/it] {'loss': 0.3284, 'grad_norm': 0.7351000709156122, 'learning_rate': 7.362314583048265e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (79908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83167 > 40960).
Running this sequence through the model will result in indexing errors
36%|███▋ | 8024/22095 [13:40:38<17:52:11, 4.57s/it] {'loss': 0.3766, 'grad_norm': 0.8623144464403313, 'learning_rate': 7.361668597328212e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8025/22095 [13:40:47<23:38:13, 6.05s/it] {'loss': 0.4829, 'grad_norm': 0.6196968848506009, 'learning_rate': 7.361022560863154e-06, 'epoch': 0.36}
36%|███▋ | 8026/22095 [13:40:51<20:21:27, 5.21s/it] {'loss': 0.3669, 'grad_norm': 0.7176131906312808, 'learning_rate': 7.360376473666973e-06, 'epoch': 0.36}
36%|███▋ | 8027/22095 [13:40:54<17:57:11, 4.59s/it] {'loss': 0.3583, 'grad_norm': 0.6777973099375537, 'learning_rate': 7.359730335753551e-06, 'epoch': 0.36}
36%|███▋ | 8028/22095 [13:40:58<17:12:04, 4.40s/it] {'loss': 0.3671, 'grad_norm': 0.6718870397866715, 'learning_rate': 7.35908414713677e-06, 'epoch': 0.36}
36%|███▋ | 8029/22095 [13:41:01<16:28:37, 4.22s/it] {'loss': 0.3311, 'grad_norm': 0.5794901646044115, 'learning_rate': 7.358437907830518e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8030/22095 [13:41:11<22:41:20, 5.81s/it] {'loss': 0.4644, 'grad_norm': 0.368659357946139, 'learning_rate': 7.3577916178486775e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (46706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52409 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62745 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8031/22095 [13:41:15<20:14:48, 5.18s/it] {'loss': 0.3567, 'grad_norm': 0.6997437666807766, 'learning_rate': 7.357145277205138e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8032/22095 [13:41:24<25:17:03, 6.47s/it] {'loss': 0.4997, 'grad_norm': 0.3570708240657923, 'learning_rate': 7.356498885913784e-06, 'epoch': 0.36}
36%|███▋ | 8033/22095 [13:41:28<22:08:26, 5.67s/it] {'loss': 0.4016, 'grad_norm': 0.8237636191487573, 'learning_rate': 7.3558524439885075e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (96238 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8034/22095 [13:41:31<18:50:25, 4.82s/it] {'loss': 0.3163, 'grad_norm': 0.6271151725782911, 'learning_rate': 7.3552059514431985e-06, 'epoch': 0.36}
36%|███▋ | 8035/22095 [13:41:34<16:45:44, 4.29s/it] {'loss': 0.4084, 'grad_norm': 0.6676304941879442, 'learning_rate': 7.3545594082917435e-06, 'epoch': 0.36}
36%|███▋ | 8036/22095 [13:41:37<15:12:49, 3.90s/it] {'loss': 0.4009, 'grad_norm': 0.7293712480186316, 'learning_rate': 7.353912814548042e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (71665 > 40960).
Running this sequence through the model will result in indexing errors
36%|███▋ | 8037/22095 [13:41:41<15:33:21, 3.98s/it] {'loss': 0.3365, 'grad_norm': 0.7012141766795378, 'learning_rate': 7.353266170225982e-06, 'epoch': 0.36}
36%|███▋ | 8038/22095 [13:41:44<14:43:01, 3.77s/it] {'loss': 0.3778, 'grad_norm': 0.7480133971533515, 'learning_rate': 7.35261947533946e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8039/22095 [13:41:54<21:19:38, 5.46s/it] {'loss': 0.505, 'grad_norm': 0.5479419567796888, 'learning_rate': 7.35197272990237e-06, 'epoch': 0.36}
36%|███▋ | 8040/22095 [13:41:57<18:41:13, 4.79s/it] {'loss': 0.3614, 'grad_norm': 0.6619275487363858, 'learning_rate': 7.35132593392861e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
36%|███▋ | 8041/22095 [13:42:04<20:57:28, 5.37s/it] {'loss': 0.4791, 'grad_norm': 0.4062441097479375, 'learning_rate': 7.350679087432078e-06, 'epoch': 0.36}
36%|███▋ | 8042/22095 [13:42:09<21:25:57, 5.49s/it] {'loss': 0.5068, 'grad_norm': 0.29776999395359294, 'learning_rate': 7.3500321904266725e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (116001 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99249 > 40960).
Running this sequence through the model will result in indexing errors
36%|███▋ | 8043/22095 [13:42:13<19:33:02, 5.01s/it] {'loss': 0.3744, 'grad_norm': 0.8457303726828063, 'learning_rate': 7.349385242926291e-06, 'epoch': 0.36}
36%|███▋ | 8044/22095 [13:42:17<17:33:19, 4.50s/it] {'loss': 0.3642, 'grad_norm': 1.0320355475306249, 'learning_rate': 7.348738244944837e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8045/22095 [13:42:24<20:24:11, 5.23s/it] {'loss': 0.4909, 'grad_norm': 0.4901552884921019, 'learning_rate': 7.348091196496212e-06, 'epoch': 0.36}
36%|███▋ | 8046/22095 [13:42:27<18:44:58, 4.80s/it] {'loss': 0.3305, 'grad_norm': 0.749738817323516, 'learning_rate': 7.3474440975943185e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (122242 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90633 > 40960).
Running this sequence through the model will result in indexing errors
36%|███▋ | 8047/22095 [13:42:33<19:17:46, 4.94s/it] {'loss': 0.3835, 'grad_norm': 0.6980541542956954, 'learning_rate': 7.346796948253061e-06, 'epoch': 0.36}
36%|███▋ | 8048/22095 [13:42:36<17:03:57, 4.37s/it] {'loss': 0.3878, 'grad_norm': 0.6926662082886041, 'learning_rate': 7.346149748486345e-06, 'epoch': 0.36}
36%|███▋ | 8049/22095 [13:42:39<15:54:38, 4.08s/it] {'loss': 0.3591, 'grad_norm': 0.6511728917753308, 'learning_rate': 7.345502498308076e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (79723 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8050/22095 [13:42:43<16:05:02, 4.12s/it] {'loss': 0.3413, 'grad_norm': 0.63481958151217, 'learning_rate': 7.3448551977321615e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8051/22095 [13:42:50<19:14:21, 4.93s/it] {'loss': 0.4872, 'grad_norm': 0.4470564181613816, 'learning_rate': 7.344207846772511e-06, 'epoch': 0.36}
36%|███▋ | 8052/22095 [13:42:54<18:28:02, 4.73s/it] {'loss': 0.3575, 'grad_norm': 0.61445838152631, 'learning_rate': 7.3435604454430345e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8053/22095 [13:43:03<23:15:20, 5.96s/it] {'loss': 0.4941, 'grad_norm': 0.34039730000214496, 'learning_rate': 7.34291299375764e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (62981 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51459 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105660 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41024 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8054/22095 [13:43:07<21:15:29, 5.45s/it] {'loss': 0.3466, 'grad_norm': 0.5926980612134997, 'learning_rate': 7.342265491730243e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8055/22095 [13:43:14<22:23:22, 5.74s/it] {'loss': 0.4955, 'grad_norm': 0.2837414537247516, 'learning_rate': 7.341617939374753e-06, 'epoch': 0.36}
36%|███▋ | 8056/22095 [13:43:17<19:31:29, 5.01s/it] {'loss': 0.36, 'grad_norm': 0.7716956234209372, 'learning_rate': 7.340970336705084e-06, 'epoch': 0.36}
36%|███▋ | 8057/22095 [13:43:20<17:01:24, 4.37s/it] {'loss': 0.3592, 'grad_norm': 0.6427150881432653, 'learning_rate': 7.340322683735155e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
36%|███▋ | 8058/22095 [13:43:29<22:51:45, 5.86s/it] {'loss': 0.4854, 'grad_norm': 0.3857297404798561, 'learning_rate': 7.339674980478878e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [473, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8491815 in VC:s3://internvl-moe-sft-data/. Exception: Image size [473, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 144820, 'image': 'vrdu_texteq/astro-ph.CO/20035d9c-6561-4983-9900-fcf1f2af2dfd.png', 'image_wh': [[473, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'where we have taken $b=1$ and defined'}]}
36%|███▋ | 8059/22095 [13:43:33<20:06:18, 5.16s/it] {'loss': 0.3543, 'grad_norm': 0.601291663477463, 'learning_rate': 7.339027226950171e-06, 'epoch': 0.36}
36%|███▋ | 8060/22095 [13:43:36<17:43:16, 4.55s/it] {'loss': 0.3713, 'grad_norm': 0.7298123559205156, 'learning_rate': 7.338379423162953e-06, 'epoch': 0.36}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [26, 54, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8802838 in VC:s3://internvl2/datasets/chrome_writting/images/. Exception: Image size [26, 54, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6569, 'image': '0006569.jpg', 'image_wh': [[26, 54]], 'conversations': [{'from': 'human', 'value': '\nPlease write out the expression of the formula in the image using LaTeX format.\nPlease give the latex expression directly without any extra output.'}, {'from': 'gpt', 'value': '$1$'}]}
36%|███▋ | 8061/22095 [13:43:39<15:36:04, 4.00s/it] {'loss': 0.3452, 'grad_norm': 0.6565806633034736, 'learning_rate': 7.337731569131143e-06, 'epoch': 0.36}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [337, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8474662 in VC:s3://internvl-moe-sft-data/. Exception: Image size [337, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69788, 'image': 'vrdu_texteq/astro-ph.CO/b77681e4-c045-45ee-a77e-90eb87ca6273.png', 'image_wh': [[337, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'For the case $\\alpha=1$ we have:'}]}
36%|███▋ | 8062/22095 [13:43:45<18:40:55, 4.79s/it] {'loss': 0.4727, 'grad_norm': 0.3229528406928706, 'learning_rate': 7.3370836648686616e-06, 'epoch': 0.36}
Token indices sequence length is longer than the specified maximum sequence length for this model (105309 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59352 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57113 > 40960). Running this sequence through the model will result in indexing errors
36%|███▋ | 8063/22095 [13:43:49<16:58:47, 4.36s/it] {'loss': 0.348, 'grad_norm': 0.6929570188379901, 'learning_rate': 7.33643571038943e-06, 'epoch': 0.36}
36%|███▋ | 8064/22095 [13:43:52<15:59:47, 4.10s/it] {'loss': 0.3899, 'grad_norm': 0.6625662022938806, 'learning_rate': 7.33578770570737e-06, 'epoch': 0.36}
37%|███▋ | 8065/22095 [13:43:56<15:35:27, 4.00s/it] {'loss': 0.4089, 'grad_norm': 0.7017583007840354, 'learning_rate': 7.335139650836407e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8066/22095 [13:43:59<14:43:27, 3.78s/it] {'loss': 0.3396, 'grad_norm': 0.662474726091609, 'learning_rate': 7.3344915457904655e-06, 'epoch': 0.37}
37%|███▋ | 8067/22095 [13:44:02<13:32:07, 3.47s/it] {'loss': 0.3234, 'grad_norm': 0.601157599169405, 'learning_rate': 7.3338433905834685e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8068/22095 [13:44:05<12:48:12, 3.29s/it] {'loss': 0.3668, 'grad_norm': 0.6897480223541927, 'learning_rate': 7.333195185229346e-06, 'epoch': 0.37}
37%|███▋ | 8069/22095 [13:44:09<13:15:34, 3.40s/it] {'loss': 0.3894, 'grad_norm': 0.5994620628157344, 'learning_rate': 7.3325469297420246e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (58046 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96385 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8070/22095 [13:44:11<12:43:31, 3.27s/it] {'loss': 0.3513, 'grad_norm': 0.6392091647460895, 'learning_rate': 7.331898624135434e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8071/22095 [13:44:20<18:55:48, 4.86s/it] {'loss': 0.4707, 'grad_norm': 0.3701285705436311, 'learning_rate': 7.331250268423505e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8072/22095 [13:44:24<17:24:43, 4.47s/it] {'loss': 0.34, 'grad_norm': 0.5766710018646587, 'learning_rate': 7.330601862620164e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8073/22095 [13:44:27<16:40:07, 4.28s/it] {'loss': 0.3681, 'grad_norm': 0.6618459831944085, 'learning_rate': 7.3299534067393495e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (79145 > 40960).
Running this sequence through the model will result in indexing errors
37%|███▋ | 8074/22095 [13:44:31<16:06:48, 4.14s/it] {'loss': 0.3479, 'grad_norm': 0.8322945398975357, 'learning_rate': 7.329304900794991e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8075/22095 [13:44:35<15:06:38, 3.88s/it] {'loss': 0.3681, 'grad_norm': 0.8158659005341743, 'learning_rate': 7.328656344801025e-06, 'epoch': 0.37}
37%|███▋ | 8076/22095 [13:44:37<14:01:35, 3.60s/it] {'loss': 0.362, 'grad_norm': 0.8539346306396185, 'learning_rate': 7.328007738771385e-06, 'epoch': 0.37}
37%|███▋ | 8077/22095 [13:44:41<13:31:05, 3.47s/it] {'loss': 0.3609, 'grad_norm': 0.6563458590862646, 'learning_rate': 7.32735908272001e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8078/22095 [13:44:48<17:35:34, 4.52s/it] {'loss': 0.502, 'grad_norm': 0.3209877402422395, 'learning_rate': 7.326710376660836e-06, 'epoch': 0.37}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308406 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2cfG2s88lpuFjSspaXXXJKpXa_!!2560145876.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read and tell me what is encoded in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\nPH检测试纸\n广范1-14\n™\n卡尔斯\nPH检测试纸\n广范\n1-14\nQ/3211821AB001-2002\n13\n3\n5\n7\n9\n11\n1\n2\n4\n6\n8\n10\n12\n14\n80条\n一本'}]}
37%|███▋ | 8079/22095 [13:44:52<17:03:28, 4.38s/it] {'loss': 0.3865, 'grad_norm': 0.675428159697903, 'learning_rate': 7.326061620607801e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (43526 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85980 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57602 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8080/22095 [13:44:55<16:02:57, 4.12s/it] {'loss': 0.3579, 'grad_norm': 0.6703471467189661, 'learning_rate': 7.325412814574847e-06, 'epoch': 0.37}
37%|███▋ | 8081/22095 [13:44:58<14:57:35, 3.84s/it] {'loss': 0.3544, 'grad_norm': 0.602632944758576, 'learning_rate': 7.324763958575913e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8082/22095 [13:45:06<19:12:26, 4.93s/it] {'loss': 0.5226, 'grad_norm': 0.31864481918066306, 'learning_rate': 7.324115052624941e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (64052 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110444 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108306 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51134 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43238 > 40960) for 4 sample(s). Truncating to 1920 with 1 samples.
37%|███▋ | 8083/22095 [13:45:09<17:38:14, 4.53s/it] {'loss': 0.3867, 'grad_norm': 0.6424378043768086, 'learning_rate': 7.323466096735875e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8084/22095 [13:45:12<15:53:48, 4.08s/it] {'loss': 0.359, 'grad_norm': 0.6450470744304372, 'learning_rate': 7.322817090922659e-06, 'epoch': 0.37}
37%|███▋ | 8085/22095 [13:45:16<15:08:22, 3.89s/it] {'loss': 0.3518, 'grad_norm': 0.7814567957687872, 'learning_rate': 7.322168035199237e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (54459 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78163 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133061 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (41814 > 40960) for 4 sample(s). Truncating to 40830 with 2 samples.
37%|███▋ | 8086/22095 [13:45:20<14:52:42, 3.82s/it] {'loss': 0.3286, 'grad_norm': 0.6107592968368141, 'learning_rate': 7.3215189295795565e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881983 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5136, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 6\nB. 2\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047183 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. \\frac{11}{2}cm'}, {'from': 'gpt', 'value': '【解答】解:设BC=xcm,∵BC=\\frac{1}{2}AB,∴AB=2BC=2x,AC=AB+BC=3xcm,∵D为AC的中点,∴AD=DC=\\frac{1}{2}AC=1.5xcm,∵CD=3cm,∴1.5x=3,解得:x=2,即AB=2xcm=4cm,'}]}
37%|███▋ | 8087/22095 [13:45:22<13:37:21, 3.50s/it] {'loss': 0.3235, 'grad_norm': 0.6116002376209456, 'learning_rate': 7.320869774077564e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8088/22095 [13:45:32<20:36:42, 5.30s/it] {'loss': 0.4684, 'grad_norm': 0.3665436413370661, 'learning_rate': 7.320220568707207e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (89697 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118107 > 40960).
Running this sequence through the model will result in indexing errors
37%|███▋ | 8089/22095 [13:45:35<18:22:23, 4.72s/it] {'loss': 0.3676, 'grad_norm': 0.6819908460809249, 'learning_rate': 7.319571313482437e-06, 'epoch': 0.37}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892996 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16149, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 6cm\nB. 1cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
37%|███▋ | 8090/22095 [13:45:39<17:12:28, 4.42s/it] {'loss': 0.346, 'grad_norm': 0.6976100390540673, 'learning_rate': 7.318922008417203e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8091/22095 [13:45:48<22:11:06, 5.70s/it] {'loss': 0.4916, 'grad_norm': 0.32398500017315196, 'learning_rate': 7.318272653525457e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (76284 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55387 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108897 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85983 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8092/22095 [13:45:51<19:27:24, 5.00s/it] {'loss': 0.3403, 'grad_norm': 0.6873578858329197, 'learning_rate': 7.317623248821153e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (50771 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72951 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44200 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103251 > 40960).
Running this sequence through the model will result in indexing errors 37%|███▋ | 8093/22095 [13:45:55<17:48:33, 4.58s/it] {'loss': 0.3249, 'grad_norm': 0.6234197882322604, 'learning_rate': 7.316973794318242e-06, 'epoch': 0.37} 37%|███▋ | 8093/22095 [13:45:55<17:48:33, 4.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8094/22095 [13:46:04<23:54:52, 6.15s/it] {'loss': 0.4955, 'grad_norm': 0.40565248593002673, 'learning_rate': 7.316324290030682e-06, 'epoch': 0.37} 37%|███▋ | 8094/22095 [13:46:04<23:54:52, 6.15s/it] 37%|███▋ | 8095/22095 [13:46:08<20:51:12, 5.36s/it] {'loss': 0.3717, 'grad_norm': 0.7118511193486391, 'learning_rate': 7.315674735972426e-06, 'epoch': 0.37} 37%|███▋ | 8095/22095 [13:46:08<20:51:12, 5.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47662 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77064 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83417 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63140 > 40960). 
Running this sequence through the model will result in indexing errors 37%|███▋ | 8096/22095 [13:46:12<19:20:43, 4.97s/it] {'loss': 0.386, 'grad_norm': 0.6902695636733798, 'learning_rate': 7.315025132157432e-06, 'epoch': 0.37} 37%|███▋ | 8096/22095 [13:46:12<19:20:43, 4.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8097/22095 [13:46:19<22:11:30, 5.71s/it] {'loss': 0.4877, 'grad_norm': 0.3724437368013699, 'learning_rate': 7.314375478599657e-06, 'epoch': 0.37} 37%|███▋ | 8097/22095 [13:46:19<22:11:30, 5.71s/it] 37%|███▋ | 8098/22095 [13:46:23<19:30:01, 5.02s/it] {'loss': 0.3668, 'grad_norm': 0.6294176595203217, 'learning_rate': 7.313725775313061e-06, 'epoch': 0.37} 37%|███▋ | 8098/22095 [13:46:23<19:30:01, 5.02s/it] 37%|███▋ | 8099/22095 [13:46:27<18:00:38, 4.63s/it] {'loss': 0.3367, 'grad_norm': 0.6185721391255002, 'learning_rate': 7.313076022311605e-06, 'epoch': 0.37} 37%|███▋ | 8099/22095 [13:46:27<18:00:38, 4.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8100/22095 [13:46:37<24:38:49, 6.34s/it] {'loss': 0.4597, 'grad_norm': 0.2993327159764945, 'learning_rate': 7.31242621960925e-06, 'epoch': 0.37} 37%|███▋ | 8100/22095 [13:46:37<24:38:49, 6.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 37%|███▋ | 8101/22095 [13:46:47<29:11:35, 7.51s/it] {'loss': 0.4973, 'grad_norm': 0.2986381957076049, 'learning_rate': 7.311776367219956e-06, 'epoch': 0.37} 37%|███▋ | 8101/22095 [13:46:47<29:11:35, 7.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 37%|███▋ | 8102/22095 [13:46:51<24:28:13, 6.30s/it] {'loss': 0.3448, 'grad_norm': 0.6804572154313598, 'learning_rate': 7.3111264651576895e-06, 'epoch': 0.37} 37%|███▋ | 8102/22095 [13:46:51<24:28:13, 6.30s/it] 37%|███▋ | 8103/22095 [13:46:54<21:05:23, 5.43s/it] {'loss': 0.3784, 'grad_norm': 0.8765735545656241, 'learning_rate': 7.310476513436412e-06, 
'epoch': 0.37} 37%|███▋ | 8103/22095 [13:46:54<21:05:23, 5.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (90857 > 40960). Running this sequence through the model will result in indexing errors 37%|███▋ | 8104/22095 [13:47:04<26:25:55, 6.80s/it] {'loss': 0.4805, 'grad_norm': 0.2952493580258756, 'learning_rate': 7.3098265120700915e-06, 'epoch': 0.37} 37%|███▋ | 8104/22095 [13:47:04<26:25:55, 6.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 37%|███▋ | 8105/22095 [13:47:08<23:02:48, 5.93s/it] {'loss': 0.332, 'grad_norm': 0.6513747567476289, 'learning_rate': 7.3091764610726935e-06, 'epoch': 0.37} 37%|███▋ | 8105/22095 [13:47:08<23:02:48, 5.93s/it] 37%|███▋ | 8106/22095 [13:47:12<20:37:04, 5.31s/it] {'loss': 0.3715, 'grad_norm': 0.8286346174779458, 'learning_rate': 7.308526360458185e-06, 'epoch': 0.37} 37%|███▋ | 8106/22095 [13:47:12<20:37:04, 5.31s/it] 37%|███▋ | 8107/22095 [13:47:16<18:50:38, 4.85s/it] {'loss': 0.3814, 'grad_norm': 0.6412950037424189, 'learning_rate': 7.307876210240534e-06, 'epoch': 0.37} 37%|███▋ | 8107/22095 [13:47:16<18:50:38, 4.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41317 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41821 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66149 > 40960). 
Running this sequence through the model will result in indexing errors 37%|███▋ | 8108/22095 [13:47:20<18:50:06, 4.85s/it] {'loss': 0.3169, 'grad_norm': 0.741223434168854, 'learning_rate': 7.3072260104337124e-06, 'epoch': 0.37} 37%|███▋ | 8108/22095 [13:47:20<18:50:06, 4.85s/it] 37%|███▋ | 8109/22095 [13:47:24<16:50:43, 4.34s/it] {'loss': 0.3636, 'grad_norm': 0.6052335633310483, 'learning_rate': 7.3065757610516895e-06, 'epoch': 0.37} 37%|███▋ | 8109/22095 [13:47:24<16:50:43, 4.34s/it] 37%|███▋ | 8110/22095 [13:47:27<15:22:36, 3.96s/it] {'loss': 0.3614, 'grad_norm': 0.635252692031835, 'learning_rate': 7.305925462108439e-06, 'epoch': 0.37} 37%|███▋ | 8110/22095 [13:47:27<15:22:36, 3.96s/it] 37%|███▋ | 8111/22095 [13:47:30<15:06:39, 3.89s/it] {'loss': 0.3766, 'grad_norm': 0.634580241462948, 'learning_rate': 7.30527511361793e-06, 'epoch': 0.37} 37%|███▋ | 8111/22095 [13:47:30<15:06:39, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8112/22095 [13:47:37<18:23:08, 4.73s/it] {'loss': 0.5032, 'grad_norm': 0.39177715832680615, 'learning_rate': 7.30462471559414e-06, 'epoch': 0.37} 37%|███▋ | 8112/22095 [13:47:37<18:23:08, 4.73s/it] 37%|███▋ | 8113/22095 [13:47:40<16:32:17, 4.26s/it] {'loss': 0.3195, 'grad_norm': 0.6742823922848321, 'learning_rate': 7.303974268051044e-06, 'epoch': 0.37} 37%|███▋ | 8113/22095 [13:47:40<16:32:17, 4.26s/it] 37%|███▋ | 8114/22095 [13:47:43<15:20:25, 3.95s/it] {'loss': 0.3331, 'grad_norm': 0.6544761152892868, 'learning_rate': 7.303323771002615e-06, 'epoch': 0.37} 37%|███▋ | 8114/22095 [13:47:43<15:20:25, 3.95s/it] 37%|███▋ | 8115/22095 [13:47:46<14:20:22, 3.69s/it] {'loss': 0.3105, 'grad_norm': 0.6472868999827004, 'learning_rate': 7.302673224462835e-06, 'epoch': 0.37} 37%|███▋ | 8115/22095 [13:47:46<14:20:22, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8116/22095 [13:47:55<20:10:47, 5.20s/it] {'loss': 0.4879, 'grad_norm': 0.30055019896322244, 
'learning_rate': 7.302022628445678e-06, 'epoch': 0.37} 37%|███▋ | 8116/22095 [13:47:55<20:10:47, 5.20s/it] 37%|███▋ | 8117/22095 [13:48:00<19:27:32, 5.01s/it] {'loss': 0.3763, 'grad_norm': 0.6749211147111039, 'learning_rate': 7.301371982965125e-06, 'epoch': 0.37} 37%|███▋ | 8117/22095 [13:48:00<19:27:32, 5.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887138 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10291, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 3\nB. 10\nC. 5\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 37%|███▋ | 8118/22095 [13:48:04<18:19:41, 4.72s/it] {'loss': 0.333, 'grad_norm': 0.6467312875354413, 'learning_rate': 7.3007212880351565e-06, 'epoch': 0.37} 37%|███▋ | 8118/22095 [13:48:04<18:19:41, 4.72s/it] 37%|███▋ | 8119/22095 [13:48:07<16:38:35, 4.29s/it] {'loss': 0.3862, 'grad_norm': 0.6935178709365164, 'learning_rate': 7.3000705436697525e-06, 'epoch': 0.37} 37%|███▋ | 8119/22095 [13:48:07<16:38:35, 4.29s/it] 37%|███▋ | 8120/22095 [13:48:11<16:12:47, 4.18s/it] {'loss': 0.3278, 'grad_norm': 0.6099660368039522, 'learning_rate': 7.2994197498828975e-06, 'epoch': 0.37} 37%|███▋ | 8120/22095 [13:48:11<16:12:47, 4.18s/it] 37%|███▋ | 8121/22095 [13:48:14<15:17:11, 3.94s/it] {'loss': 0.3747, 'grad_norm': 0.6233582404645013, 'learning_rate': 7.298768906688576e-06, 'epoch': 0.37} 37%|███▋ | 8121/22095 [13:48:14<15:17:11, 3.94s/it] 37%|███▋ | 8122/22095 [13:48:18<15:21:11, 3.96s/it] {'loss': 0.3304, 'grad_norm': 0.6454675108502524, 'learning_rate': 7.298118014100766e-06, 'epoch': 0.37} 37%|███▋ | 8122/22095 [13:48:18<15:21:11, 3.96s/it] 37%|███▋ | 8123/22095 [13:48:21<14:17:41, 3.68s/it] {'loss': 0.3657, 'grad_norm': 0.6187331614448284, 'learning_rate': 7.297467072133463e-06, 'epoch': 0.37} 37%|███▋ | 8123/22095 [13:48:21<14:17:41, 3.68s/it] 37%|███▋ | 8124/22095 [13:48:25<13:47:51, 3.56s/it] {'loss': 0.3546, 'grad_norm': 0.6252633414228883, 'learning_rate': 7.296816080800646e-06, 'epoch': 0.37} 37%|███▋ | 8124/22095 [13:48:25<13:47:51, 3.56s/it] 37%|███▋ | 8125/22095 [13:48:28<13:08:53, 3.39s/it] {'loss': 0.3447, 'grad_norm': 0.6552389003007436, 'learning_rate': 7.296165040116308e-06, 'epoch': 0.37} 37%|███▋ | 8125/22095 [13:48:28<13:08:53, 3.39s/it] 37%|███▋ | 8126/22095 [13:48:31<13:35:38, 3.50s/it] {'loss': 0.3703, 'grad_norm': 0.6660391799782329, 'learning_rate': 7.295513950094433e-06, 'epoch': 0.37} 37%|███▋ | 8126/22095 [13:48:31<13:35:38, 3.50s/it] 37%|███▋ | 
8127/22095 [13:48:36<14:13:43, 3.67s/it] {'loss': 0.3484, 'grad_norm': 0.5997289999195763, 'learning_rate': 7.294862810749014e-06, 'epoch': 0.37} 37%|███▋ | 8127/22095 [13:48:36<14:13:43, 3.67s/it] 37%|███▋ | 8128/22095 [13:48:38<13:23:49, 3.45s/it] {'loss': 0.3371, 'grad_norm': 0.6706146748720235, 'learning_rate': 7.2942116220940406e-06, 'epoch': 0.37} 37%|███▋ | 8128/22095 [13:48:38<13:23:49, 3.45s/it] 37%|███▋ | 8129/22095 [13:48:43<14:40:46, 3.78s/it] {'loss': 0.3094, 'grad_norm': 0.6006983753253756, 'learning_rate': 7.293560384143506e-06, 'epoch': 0.37} 37%|███▋ | 8129/22095 [13:48:43<14:40:46, 3.78s/it] 37%|███▋ | 8130/22095 [13:48:46<13:30:10, 3.48s/it] {'loss': 0.3352, 'grad_norm': 0.631994678606701, 'learning_rate': 7.292909096911403e-06, 'epoch': 0.37} 37%|███▋ | 8130/22095 [13:48:46<13:30:10, 3.48s/it] 37%|███▋ | 8131/22095 [13:48:49<13:39:41, 3.52s/it] {'loss': 0.37, 'grad_norm': 0.7523108426437606, 'learning_rate': 7.292257760411726e-06, 'epoch': 0.37} 37%|███▋ | 8131/22095 [13:48:49<13:39:41, 3.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8303775 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
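The dataloader failures in this log share a few causes that are all detectable before training: images below the 28 px minimum (including zero-sized ones like the `[[0, 0]]` sample above), sequences past the 40960-token context, and images large enough to trip PIL's decompression-bomb check. A minimal pre-flight filter over the dataset manifest could catch these ahead of time; the `image_wh` sample layout and the 28 / 40960 / 89478485 thresholds are taken from the log itself, while the `count_tokens` callable is a placeholder assumption, not an API from this codebase.

```python
MIN_IMAGE_SIDE = 28          # "Minimum size is 28" in the ValueError above
MAX_SEQ_LEN = 40960          # model max length from the tokenizer warnings
PIL_BOMB_LIMIT = 89_478_485  # PIL's default Image.MAX_IMAGE_PIXELS, the
                             # limit cited by the DecompressionBombWarning

def sample_ok(sample: dict, count_tokens) -> bool:
    """Pre-flight check mirroring the failures logged in this run.

    `sample` uses the layout shown in the log ({'image_wh': [[w, h]], ...});
    `count_tokens` is a caller-supplied callable (e.g. wrapping the training
    tokenizer) and is a stand-in, not a function from the training code.
    """
    for w, h in sample.get("image_wh", []):
        if min(w, h) < MIN_IMAGE_SIDE:      # too-small or zero-sized image
            return False
        if w * h > PIL_BOMB_LIMIT:          # would trigger PIL's bomb warning
            return False
    if count_tokens(sample) > MAX_SEQ_LEN:  # would overflow the context
        return False
    return True
```

Running a filter like this once over the manifest and dropping offenders would avoid the per-step retry churn (`[Try #0] Failed to fetch sample ...`) seen throughout the run.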
Problematic sample: {'image': 'TB1XwXxcsic_eJjSZFnXXXVwVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n检测图中的文字并告诉我内容。'}, {'from': 'gpt', 'value': '图中所有文字:\n德芯\n为品质而生\n照亮您的世界\n2012C\n2012C\n全国包邮·破损补寄\n年质保\n天免费换新\n0\n9\n3\n赠\n驱动+磁铁+接线端'}]} 37%|███▋ | 8132/22095 [13:48:52<13:00:00, 3.35s/it] {'loss': 0.3563, 'grad_norm': 0.6309566047038871, 'learning_rate': 7.29160637465847e-06, 'epoch': 0.37} 37%|███▋ | 8132/22095 [13:48:52<13:00:00, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8133/22095 [13:49:01<18:34:11, 4.79s/it] {'loss': 0.4873, 'grad_norm': 0.37387998050983035, 'learning_rate': 7.290954939665632e-06, 'epoch': 0.37} 37%|███▋ | 8133/22095 [13:49:01<18:34:11, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65179 > 40960). Running this sequence through the model will result in indexing errors 37%|███▋ | 8134/22095 [13:49:05<18:16:55, 4.71s/it] {'loss': 0.3671, 'grad_norm': 0.6851452506465928, 'learning_rate': 7.290303455447208e-06, 'epoch': 0.37} 37%|███▋ | 8134/22095 [13:49:05<18:16:55, 4.71s/it] 37%|███▋ | 8135/22095 [13:49:09<17:29:10, 4.51s/it] {'loss': 0.373, 'grad_norm': 0.7434686822496943, 'learning_rate': 7.289651922017195e-06, 'epoch': 0.37} 37%|███▋ | 8135/22095 [13:49:09<17:29:10, 4.51s/it] 37%|███▋ | 8136/22095 [13:49:12<15:58:27, 4.12s/it] {'loss': 0.337, 'grad_norm': 0.5914979977178154, 'learning_rate': 7.289000339389596e-06, 'epoch': 0.37} 37%|███▋ | 8136/22095 [13:49:12<15:58:27, 4.12s/it] 37%|███▋ | 8137/22095 [13:49:16<15:35:25, 4.02s/it] {'loss': 0.363, 'grad_norm': 0.6321514469503917, 'learning_rate': 7.288348707578409e-06, 'epoch': 0.37} 37%|███▋ | 8137/22095 [13:49:16<15:35:25, 4.02s/it] 37%|███▋ | 8138/22095 [13:49:19<14:17:58, 3.69s/it] {'loss': 0.3691, 'grad_norm': 0.6629606564514017, 'learning_rate': 7.2876970265976365e-06, 'epoch': 0.37} 37%|███▋ | 8138/22095 [13:49:19<14:17:58, 
3.69s/it] 37%|███▋ | 8139/22095 [13:49:22<13:55:44, 3.59s/it] {'loss': 0.3425, 'grad_norm': 0.6713936840695812, 'learning_rate': 7.287045296461281e-06, 'epoch': 0.37} 37%|███▋ | 8139/22095 [13:49:22<13:55:44, 3.59s/it] 37%|███▋ | 8140/22095 [13:49:25<13:18:29, 3.43s/it] {'loss': 0.3648, 'grad_norm': 0.6120971669362468, 'learning_rate': 7.2863935171833465e-06, 'epoch': 0.37} 37%|███▋ | 8140/22095 [13:49:25<13:18:29, 3.43s/it] 37%|███▋ | 8141/22095 [13:49:28<12:51:31, 3.32s/it] {'loss': 0.3662, 'grad_norm': 0.6659355828139842, 'learning_rate': 7.285741688777838e-06, 'epoch': 0.37} 37%|███▋ | 8141/22095 [13:49:28<12:51:31, 3.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8142/22095 [13:49:38<20:10:23, 5.20s/it] {'loss': 0.502, 'grad_norm': 0.4104427222421775, 'learning_rate': 7.285089811258761e-06, 'epoch': 0.37} 37%|███▋ | 8142/22095 [13:49:38<20:10:23, 5.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50636 > 40960). Running this sequence through the model will result in indexing errors 37%|███▋ | 8143/22095 [13:49:41<18:03:56, 4.66s/it] {'loss': 0.33, 'grad_norm': 0.6590165373916503, 'learning_rate': 7.28443788464012e-06, 'epoch': 0.37} 37%|███▋ | 8143/22095 [13:49:41<18:03:56, 4.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52388 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87874 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71101 > 40960). 
Running this sequence through the model will result in indexing errors 37%|███▋ | 8144/22095 [13:49:45<17:18:06, 4.46s/it] {'loss': 0.3849, 'grad_norm': 0.6633311656110775, 'learning_rate': 7.283785908935927e-06, 'epoch': 0.37} 37%|███▋ | 8144/22095 [13:49:45<17:18:06, 4.46s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (92139516 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 37%|███▋ | 8145/22095 [13:49:49<16:36:47, 4.29s/it] {'loss': 0.316, 'grad_norm': 0.6063659937084452, 'learning_rate': 7.283133884160187e-06, 'epoch': 0.37} 37%|███▋ | 8145/22095 [13:49:49<16:36:47, 4.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57106 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54488 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44198 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45664 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42285 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42756 > 40960). 
Running this sequence through the model will result in indexing errors 37%|███▋ | 8146/22095 [13:49:52<15:15:45, 3.94s/it] {'loss': 0.3796, 'grad_norm': 0.5980474521604764, 'learning_rate': 7.282481810326915e-06, 'epoch': 0.37} 37%|███▋ | 8146/22095 [13:49:52<15:15:45, 3.94s/it] 37%|███▋ | 8147/22095 [13:49:56<14:21:31, 3.71s/it] {'loss': 0.3323, 'grad_norm': 0.5990355897144266, 'learning_rate': 7.281829687450117e-06, 'epoch': 0.37} 37%|███▋ | 8147/22095 [13:49:56<14:21:31, 3.71s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [262, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8525726 in VC:s3://internvl-moe-sft-data/. Exception: Image size [262, 23, 100, 100] is too small. Minimum size is 28. 
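The run also repeatedly logs `Number of image tokens 0 does not match number of images 1` followed by `Fixed image tokens in the conversation`. The trainer's actual repair is not shown in this log; a plausible sketch, assuming the `{'from': ..., 'value': ...}` turn layout from the problematic samples above and a `<image>` placeholder convention, is to prepend the missing placeholders to the first human turn.

```python
def fix_image_tokens(conversations: list[dict], num_images: int,
                     token: str = "<image>") -> list[dict]:
    """Make the placeholder count match the number of attached images.

    Assumes the conversation layout seen in the problematic samples above.
    What the trainer does when it logs "Fixed image tokens" is not visible
    here, so this is only a guess at an equivalent repair.
    """
    found = sum(turn["value"].count(token) for turn in conversations)
    missing = num_images - found
    if missing > 0:
        for turn in conversations:
            if turn.get("from") == "human":
                # Prepend the missing markers to the first human message.
                turn["value"] = token * missing + "\n" + turn["value"]
                break
    return conversations
```

A conversation that already carries the right number of placeholders passes through unchanged, so the repair is safe to apply unconditionally.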
Problematic sample: {'id': 107722, 'image': 'vrdu_texteq/astro-ph.CO/f4b0c553-2a06-4fc1-867a-e03706537cb7.png', 'image_wh': [[262, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $\\Delta $ is defined as'}]} 37%|███▋ | 8148/22095 [13:49:59<13:34:58, 3.51s/it] {'loss': 0.387, 'grad_norm': 0.6745356562340838, 'learning_rate': 7.281177515543807e-06, 'epoch': 0.37} 37%|███▋ | 8148/22095 [13:49:59<13:34:58, 3.51s/it] 37%|███▋ | 8149/22095 [13:50:02<12:52:04, 3.32s/it] {'loss': 0.3219, 'grad_norm': 0.6158116775354936, 'learning_rate': 7.280525294621999e-06, 'epoch': 0.37} 37%|███▋ | 8149/22095 [13:50:02<12:52:04, 3.32s/it] 37%|███▋ | 8150/22095 [13:50:05<12:51:04, 3.32s/it] {'loss': 0.3637, 'grad_norm': 0.6545144180352372, 'learning_rate': 7.2798730246987056e-06, 'epoch': 0.37} 37%|███▋ | 8150/22095 [13:50:05<12:51:04, 3.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8151/22095 [13:50:12<16:51:05, 4.35s/it] {'loss': 0.471, 'grad_norm': 0.4375279106063995, 'learning_rate': 7.279220705787943e-06, 'epoch': 0.37} 37%|███▋ | 8151/22095 [13:50:12<16:51:05, 4.35s/it] 37%|███▋ | 8152/22095 [13:50:21<22:45:27, 5.88s/it] {'loss': 0.4683, 'grad_norm': 0.38159927496792784, 'learning_rate': 7.278568337903729e-06, 'epoch': 0.37} 37%|███▋ | 8152/22095 [13:50:21<22:45:27, 5.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 37%|███▋ | 8153/22095 [13:50:24<19:42:16, 5.09s/it] {'loss': 0.3343, 'grad_norm': 0.6542186893818985, 'learning_rate': 7.2779159210600765e-06, 'epoch': 0.37} 37%|███▋ | 8153/22095 [13:50:24<19:42:16, 5.09s/it] 37%|███▋ | 8154/22095 [13:50:34<25:10:57, 6.50s/it] {'loss': 0.4718, 'grad_norm': 0.29272977656676263, 'learning_rate': 7.277263455271011e-06, 'epoch': 0.37} 37%|███▋ | 8154/22095 [13:50:34<25:10:57, 6.50s/it]Invalidate trace cache @ step 2: expected module 364, but got 
module 1 37%|███▋ | 8155/22095 [13:50:39<23:17:55, 6.02s/it] {'loss': 0.3524, 'grad_norm': 0.6501785373959793, 'learning_rate': 7.2766109405505445e-06, 'epoch': 0.37} 37%|███▋ | 8155/22095 [13:50:39<23:17:55, 6.02s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [284, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8503041 in VC:s3://internvl-moe-sft-data/. Exception: Image size [284, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63628, 'image': 'vrdu_texteq/astro-ph.CO/f79ed9cb-038f-46c7-989f-b6036e24c9cc.png', 'image_wh': [[284, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'The estimator of $G_K$ is'}]} 37%|███▋ | 8156/22095 [13:50:43<20:47:41, 5.37s/it] {'loss': 0.334, 'grad_norm': 0.6013988426375781, 'learning_rate': 7.275958376912703e-06, 'epoch': 0.37} 37%|███▋ | 8156/22095 [13:50:43<20:47:41, 5.37s/it] 37%|███▋ | 8157/22095 [13:50:46<18:22:38, 4.75s/it] {'loss': 0.3269, 'grad_norm': 0.620811403055435, 'learning_rate': 7.275305764371505e-06, 'epoch': 0.37} 37%|███▋ | 8157/22095 [13:50:46<18:22:38, 4.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 37%|███▋ | 8158/22095 [13:50:49<16:28:17, 4.25s/it] {'loss': 0.3129, 'grad_norm': 0.6162970042756091, 'learning_rate': 7.274653102940974e-06, 'epoch': 0.37} 37%|███▋ | 8158/22095 [13:50:49<16:28:17, 4.25s/it] 37%|███▋ | 8159/22095 [13:50:53<15:37:16, 4.04s/it] {'loss': 0.3956, 'grad_norm': 0.6723016335543197, 'learning_rate': 
7.274000392635134e-06, 'epoch': 0.37} 37%|███▋ | 8159/22095 [13:50:53<15:37:16, 4.04s/it] 37%|███▋ | 8160/22095 [13:50:56<15:10:09, 3.92s/it] {'loss': 0.3703, 'grad_norm': 0.6754868476325461, 'learning_rate': 7.273347633468011e-06, 'epoch': 0.37} 37%|███▋ | 8160/22095 [13:50:56<15:10:09, 3.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 37%|███▋ | 8161/22095 [13:50:59<14:01:19, 3.62s/it] {'loss': 0.3852, 'grad_norm': 0.6573254874606067, 'learning_rate': 7.272694825453628e-06, 'epoch': 0.37} 37%|███▋ | 8161/22095 [13:50:59<14:01:19, 3.62s/it] 37%|███▋ | 8162/22095 [13:51:04<15:02:18, 3.89s/it] {'loss': 0.381, 'grad_norm': 0.7382278556803853, 'learning_rate': 7.272041968606014e-06, 'epoch': 0.37} 37%|███▋ | 8162/22095 [13:51:04<15:02:18, 3.89s/it] 37%|███▋ | 8163/22095 [13:51:07<14:28:05, 3.74s/it] {'loss': 0.3502, 'grad_norm': 0.6845167040529891, 'learning_rate': 7.271389062939196e-06, 'epoch': 0.37} 37%|███▋ | 8163/22095 [13:51:07<14:28:05, 3.74s/it] 37%|███▋ | 8164/22095 [13:51:11<14:45:03, 3.81s/it] {'loss': 0.3503, 'grad_norm': 0.6551708802117328, 'learning_rate': 7.270736108467202e-06, 'epoch': 0.37} 37%|███▋ | 8164/22095 [13:51:11<14:45:03, 3.81s/it] 37%|███▋ | 8165/22095 [13:51:15<14:16:14, 3.69s/it] {'loss': 0.3727, 'grad_norm': 0.6211818518982469, 'learning_rate': 7.2700831052040656e-06, 'epoch': 0.37} 37%|███▋ | 8165/22095 [13:51:15<14:16:14, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44301 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47295 > 40960). 
Running this sequence through the model will result in indexing errors 37%|███▋ | 8166/22095 [13:51:18<13:55:59, 3.60s/it] {'loss': 0.3592, 'grad_norm': 0.6544819832232823, 'learning_rate': 7.269430053163813e-06, 'epoch': 0.37} 37%|███▋ | 8166/22095 [13:51:18<13:55:59, 3.60s/it] 37%|███▋ | 8167/22095 [13:51:23<15:04:56, 3.90s/it] {'loss': 0.3526, 'grad_norm': 0.6687579884282855, 'learning_rate': 7.268776952360479e-06, 'epoch': 0.37} 37%|███▋ | 8167/22095 [13:51:23<15:04:56, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44870 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56628 > 40960). Running this sequence through the model will result in indexing errors 37%|███▋ | 8168/22095 [13:51:27<15:02:54, 3.89s/it] {'loss': 0.3713, 'grad_norm': 0.6643304048573219, 'learning_rate': 7.268123802808097e-06, 'epoch': 0.37} 37%|███▋ | 8168/22095 [13:51:27<15:02:54, 3.89s/it] 37%|███▋ | 8169/22095 [13:51:30<15:08:07, 3.91s/it] {'loss': 0.3163, 'grad_norm': 0.6462119618190477, 'learning_rate': 7.2674706045207e-06, 'epoch': 0.37} 37%|███▋ | 8169/22095 [13:51:30<15:08:07, 3.91s/it] 37%|███▋ | 8170/22095 [13:51:33<14:05:38, 3.64s/it] {'loss': 0.382, 'grad_norm': 0.6632924357449026, 'learning_rate': 7.2668173575123234e-06, 'epoch': 0.37} 37%|███▋ | 8170/22095 [13:51:34<14:05:38, 3.64s/it] 37%|███▋ | 8171/22095 [13:51:37<13:25:07, 3.47s/it] {'loss': 0.4129, 'grad_norm': 0.7171910384022454, 'learning_rate': 7.2661640617970054e-06, 'epoch': 0.37} 37%|███▋ | 8171/22095 [13:51:37<13:25:07, 3.47s/it] 37%|███▋ | 8172/22095 [13:51:41<14:14:48, 3.68s/it] {'loss': 0.3729, 'grad_norm': 0.6723190617891283, 'learning_rate': 
7.26551071738878e-06, 'epoch': 0.37} 37%|███▋ | 8172/22095 [13:51:41<14:14:48, 3.68s/it] 37%|███▋ | 8173/22095 [13:51:44<13:46:45, 3.56s/it] {'loss': 0.4029, 'grad_norm': 0.6634564049143907, 'learning_rate': 7.264857324301688e-06, 'epoch': 0.37} 37%|███▋ | 8173/22095 [13:51:44<13:46:45, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55459 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48464 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50921 > 40960). Running this sequence through the model will result in indexing errors 37%|███▋ | 8174/22095 [13:51:53<19:55:06, 5.15s/it] {'loss': 0.4852, 'grad_norm': 0.7272365330599319, 'learning_rate': 7.264203882549766e-06, 'epoch': 0.37} 37%|███▋ | 8174/22095 [13:51:53<19:55:06, 5.15s/it] 37%|███▋ | 8175/22095 [13:51:57<19:14:42, 4.98s/it] {'loss': 0.3433, 'grad_norm': 0.6412200563909634, 'learning_rate': 7.26355039214706e-06, 'epoch': 0.37} 37%|███▋ | 8175/22095 [13:51:57<19:14:42, 4.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 37%|███▋ | 8176/22095 [13:52:07<24:28:25, 6.33s/it] {'loss': 0.4617, 'grad_norm': 0.3903053709652611, 'learning_rate': 7.262896853107606e-06, 'epoch': 0.37} 37%|███▋ | 8176/22095 [13:52:07<24:28:25, 6.33s/it] 37%|███▋ | 8177/22095 [13:52:10<21:03:31, 5.45s/it] {'loss': 0.3362, 'grad_norm': 0.6341750980091954, 'learning_rate': 7.262243265445449e-06, 'epoch': 0.37} 37%|███▋ | 8177/22095 [13:52:10<21:03:31, 5.45s/it] 37%|███▋ | 8178/22095 [13:52:14<18:27:21, 4.77s/it] {'loss': 0.3275, 'grad_norm': 0.6452881582463291, 'learning_rate': 7.261589629174632e-06, 'epoch': 0.37} 37%|███▋ | 
8178/22095 [13:52:14<18:27:21, 4.77s/it]
37%|███▋ | 8179/22095 [13:52:17<16:30:29, 4.27s/it] {'loss': 0.3315, 'grad_norm': 0.6057918110267597, 'learning_rate': 7.260935944309201e-06, 'epoch': 0.37}
37%|███▋ | 8180/22095 [13:52:20<15:09:49, 3.92s/it] {'loss': 0.3658, 'grad_norm': 0.6226973497002455, 'learning_rate': 7.260282210863199e-06, 'epoch': 0.37}
37%|███▋ | 8181/22095 [13:52:23<13:56:48, 3.61s/it] {'loss': 0.3743, 'grad_norm': 0.6717952130988063, 'learning_rate': 7.2596284288506745e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8182/22095 [13:52:32<20:33:36, 5.32s/it] {'loss': 0.4841, 'grad_norm': 0.7272312458191902, 'learning_rate': 7.258974598285674e-06, 'epoch': 0.37}
37%|███▋ | 8183/22095 [13:52:35<18:08:26, 4.69s/it] {'loss': 0.3406, 'grad_norm': 0.6751799745883079, 'learning_rate': 7.25832071918225e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (121697 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89899 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46799 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46598 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59779 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8184/22095 [13:52:38<16:13:13, 4.20s/it] {'loss': 0.3633, 'grad_norm': 0.6424543775896905, 'learning_rate': 7.257666791554448e-06, 'epoch': 0.37}
37%|███▋ | 8185/22095 [13:52:41<14:45:39, 3.82s/it] {'loss': 0.325, 'grad_norm': 0.653516097898005, 'learning_rate': 7.25701281541632e-06, 'epoch': 0.37}
37%|███▋ | 8186/22095 [13:52:45<14:20:37, 3.71s/it] {'loss': 0.3244, 'grad_norm': 0.5794649639415843, 'learning_rate': 7.2563587907819185e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8187/22095 [13:52:48<13:43:20, 3.55s/it] {'loss': 0.4209, 'grad_norm': 0.7157715260093064, 'learning_rate': 7.255704717665298e-06, 'epoch': 0.37}
37%|███▋ | 8188/22095 [13:52:52<13:57:27, 3.61s/it] {'loss': 0.3363, 'grad_norm': 0.5884653553265714, 'learning_rate': 7.25505059608051e-06, 'epoch': 0.37}
37%|███▋ | 8189/22095 [13:52:56<14:37:38, 3.79s/it] {'loss': 0.3378, 'grad_norm': 0.6741184230110532, 'learning_rate': 7.25439642604161e-06, 'epoch': 0.37}
37%|███▋ | 8190/22095 [13:53:00<15:11:19, 3.93s/it] {'loss': 0.3587, 'grad_norm': 0.9005729167867248, 'learning_rate': 7.253742207562655e-06, 'epoch': 0.37}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8373533 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40307, 'image': 'vrdu_table_final_2/astro-ph.CO/630bb947-dddd-4bf5-9fae-d5f9a019354a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
37%|███▋ | 8191/22095 [13:53:03<14:39:28, 3.80s/it] {'loss': 0.3348, 'grad_norm': 0.7030014372266986, 'learning_rate': 7.253087940657702e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8192/22095 [13:53:15<23:28:51, 6.08s/it] {'loss': 0.5037, 'grad_norm': 0.38392013327619495, 'learning_rate': 7.252433625340811e-06, 'epoch': 0.37}
37%|███▋ | 8193/22095 [13:53:19<20:54:50, 5.42s/it] {'loss': 0.3572, 'grad_norm': 0.6781324653726816, 'learning_rate': 7.251779261626035e-06, 'epoch': 0.37}
37%|███▋ | 8194/22095 [13:53:22<18:38:41, 4.83s/it] {'loss': 0.3056, 'grad_norm': 0.6761517906134753, 'learning_rate': 7.251124849527442e-06, 'epoch': 0.37}
37%|███▋ | 8195/22095 [13:53:26<17:41:24, 4.58s/it] {'loss': 0.3996, 'grad_norm': 0.6428503014211182, 'learning_rate': 7.250470389059088e-06, 'epoch': 0.37}
37%|███▋ | 8196/22095 [13:53:29<15:55:34, 4.13s/it] {'loss': 0.3845, 'grad_norm': 0.6354730026376507, 'learning_rate': 7.2498158802350385e-06, 'epoch': 0.37}
37%|███▋ | 8197/22095 [13:53:32<14:37:09, 3.79s/it] {'loss': 0.3168, 'grad_norm': 0.6703399462653616, 'learning_rate': 7.249161323069355e-06, 'epoch': 0.37}
37%|███▋ | 8198/22095 [13:53:36<14:20:45, 3.72s/it] {'loss': 0.3511, 'grad_norm': 0.7792293419436107, 'learning_rate': 7.248506717576102e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (49097 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42345 > 40960) for 4 sample(s). Truncating to 1452 with 3 samples.
37%|███▋ | 8199/22095 [13:53:39<13:24:49, 3.48s/it] {'loss': 0.4104, 'grad_norm': 0.6482685165776548, 'learning_rate': 7.247852063769345e-06, 'epoch': 0.37}
37%|███▋ | 8200/22095 [13:53:42<12:47:38, 3.31s/it] {'loss': 0.3241, 'grad_norm': 0.6355574305326335, 'learning_rate': 7.247197361663152e-06, 'epoch': 0.37}
37%|███▋ | 8201/22095 [13:53:45<12:31:18, 3.24s/it] {'loss': 0.3608, 'grad_norm': 0.6312939670347133, 'learning_rate': 7.246542611271587e-06, 'epoch': 0.37}
37%|███▋ | 8202/22095 [13:53:50<15:20:34, 3.98s/it] {'loss': 0.425, 'grad_norm': 0.6416849686675478, 'learning_rate': 7.245887812608725e-06, 'epoch': 0.37}
37%|███▋ | 8203/22095 [13:53:53<14:05:32, 3.65s/it] {'loss': 0.3636, 'grad_norm': 0.690208012943487, 'learning_rate': 7.245232965688629e-06, 'epoch': 0.37}
37%|███▋ | 8204/22095 [13:53:57<13:45:40, 3.57s/it] {'loss': 0.3582, 'grad_norm': 0.8421190596902749, 'learning_rate': 7.244578070525373e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (51172 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59838 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8205/22095 [13:54:01<14:17:32, 3.70s/it] {'loss': 0.3884, 'grad_norm': 0.6219765356087398, 'learning_rate': 7.243923127133028e-06, 'epoch': 0.37}
37%|███▋ | 8206/22095 [13:54:04<13:31:23, 3.51s/it] {'loss': 0.3319, 'grad_norm': 0.8775162184518147, 'learning_rate': 7.243268135525666e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8207/22095 [13:54:10<16:40:02, 4.32s/it] {'loss': 0.5203, 'grad_norm': 0.3858290275054255, 'learning_rate': 7.242613095717361e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8208/22095 [13:54:14<16:03:01, 4.16s/it] {'loss': 0.3723, 'grad_norm': 0.6141720874361678, 'learning_rate': 7.2419580077221906e-06, 'epoch': 0.37}
37%|███▋ | 8209/22095 [13:54:18<15:52:03, 4.11s/it] {'loss': 0.3723, 'grad_norm': 0.6414007757700619, 'learning_rate': 7.241302871554226e-06, 'epoch': 0.37}
37%|███▋ | 8210/22095 [13:54:21<15:00:28, 3.89s/it] {'loss': 0.377, 'grad_norm': 0.6300823776488501, 'learning_rate': 7.240647687227547e-06, 'epoch': 0.37}
37%|███▋ | 8211/22095 [13:54:24<13:46:19, 3.57s/it] {'loss': 0.4208, 'grad_norm': 0.7237074014093474, 'learning_rate': 7.23999245475623e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8212/22095 [13:54:27<13:29:11, 3.50s/it] {'loss': 0.3537, 'grad_norm': 0.7325475420895977, 'learning_rate': 7.239337174154357e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (89892 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8213/22095 [13:54:37<20:23:37, 5.29s/it] {'loss': 0.4822, 'grad_norm': 0.3137186371943764, 'learning_rate': 7.238681845436004e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (50623 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64479 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8214/22095 [13:54:40<18:08:26, 4.70s/it] {'loss': 0.3426, 'grad_norm': 0.6413702247460596, 'learning_rate': 7.238026468615255e-06, 'epoch': 0.37}
37%|███▋ | 8215/22095 [13:54:43<16:21:52, 4.24s/it] {'loss': 0.3609, 'grad_norm': 0.670904099564202, 'learning_rate': 7.23737104370619e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (80693 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50973 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8216/22095 [13:54:46<14:57:26, 3.88s/it] {'loss': 0.3477, 'grad_norm': 0.6297418857803184, 'learning_rate': 7.236715570722892e-06, 'epoch': 0.37}
37%|███▋ | 8217/22095 [13:54:50<14:40:46, 3.81s/it] {'loss': 0.308, 'grad_norm': 0.5637446500275521, 'learning_rate': 7.236060049679446e-06, 'epoch': 0.37}
37%|███▋ | 8218/22095 [13:54:53<13:43:57, 3.56s/it] {'loss': 0.3905, 'grad_norm': 0.7496022031912113, 'learning_rate': 7.2354044805899385e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (120973 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59227 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83274 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8219/22095 [13:54:56<13:19:56, 3.46s/it] {'loss': 0.3696, 'grad_norm': 0.6394406128286074, 'learning_rate': 7.234748863468453e-06, 'epoch': 0.37}
37%|███▋ | 8220/22095 [13:55:00<13:58:29, 3.63s/it] {'loss': 0.351, 'grad_norm': 0.5950512693336792, 'learning_rate': 7.234093198329078e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8221/22095 [13:55:11<22:14:37, 5.77s/it] {'loss': 0.483, 'grad_norm': 0.3256096310149524, 'learning_rate': 7.233437485185904e-06, 'epoch': 0.37}
37%|███▋ | 8222/22095 [13:55:18<24:03:29, 6.24s/it] {'loss': 0.4734, 'grad_norm': 0.28917444198334696, 'learning_rate': 7.232781724053014e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 364, but got module 1
37%|███▋ | 8223/22095 [13:55:22<21:09:37, 5.49s/it] {'loss': 0.3499, 'grad_norm': 0.5907937577751626, 'learning_rate': 7.232125914944506e-06, 'epoch': 0.37}
37%|███▋ | 8224/22095 [13:55:30<23:57:59, 6.22s/it] {'loss': 0.503, 'grad_norm': 0.2959842299878078, 'learning_rate': 7.2314700578744635e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8225/22095 [13:55:34<20:57:24, 5.44s/it] {'loss': 0.3641, 'grad_norm': 0.6258794371939906, 'learning_rate': 7.230814152856986e-06, 'epoch': 0.37}
37%|███▋ | 8226/22095 [13:55:38<19:38:15, 5.10s/it] {'loss': 0.3553, 'grad_norm': 0.6253935976814171, 'learning_rate': 7.230158199906163e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44050 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54066 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8227/22095 [13:55:48<25:32:48, 6.63s/it] {'loss': 0.4685, 'grad_norm': 0.3048760301862293, 'learning_rate': 7.2295021990360896e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (41874 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56198 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72227 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66466 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8228/22095 [13:55:52<22:32:23, 5.85s/it] {'loss': 0.3783, 'grad_norm': 0.6456964042297493, 'learning_rate': 7.228846150260861e-06, 'epoch': 0.37}
37%|███▋ | 8229/22095 [13:55:55<19:34:29, 5.08s/it] {'loss': 0.3743, 'grad_norm': 0.6633579512268112, 'learning_rate': 7.228190053594575e-06, 'epoch': 0.37}
37%|███▋ | 8230/22095 [13:55:59<17:32:10, 4.55s/it] {'loss': 0.3664, 'grad_norm': 0.6322349637412092, 'learning_rate': 7.227533909051327e-06, 'epoch': 0.37}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8364943 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31684, 'image': 'vrdu_table_final_2/astro-ph.CO/e13c5d82-3ca6-4ceb-bb97-cad7f6ce74e0.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
37%|███▋ | 8231/22095 [13:56:03<17:04:34, 4.43s/it] {'loss': 0.3535, 'grad_norm': 0.6594001063414584, 'learning_rate': 7.2268777166452175e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8232/22095 [13:56:07<16:30:15, 4.29s/it] {'loss': 0.3593, 'grad_norm': 0.6914287196399008, 'learning_rate': 7.226221476390344e-06, 'epoch': 0.37}
37%|███▋ | 8233/22095 [13:56:10<14:45:49, 3.83s/it] {'loss': 0.3559, 'grad_norm': 0.6762918894508828, 'learning_rate': 7.22556518830081e-06, 'epoch': 0.37}
37%|███▋ | 8234/22095 [13:56:13<14:01:28, 3.64s/it] {'loss': 0.3231, 'grad_norm': 0.6233630699017264, 'learning_rate': 7.224908852390714e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8235/22095 [13:56:23<21:14:23, 5.52s/it] {'loss': 0.4999, 'grad_norm': 0.32035678978033805, 'learning_rate': 7.224252468674161e-06, 'epoch': 0.37}
37%|███▋ | 8236/22095 [13:56:33<26:37:03, 6.91s/it] {'loss': 0.4856, 'grad_norm': 0.32533747372396, 'learning_rate': 7.223596037165252e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 364, but got module 1
37%|███▋ | 8237/22095 [13:56:37<23:13:04, 6.03s/it] {'loss': 0.3852, 'grad_norm': 0.7078979169891529, 'learning_rate': 7.2229395578780955e-06, 'epoch': 0.37}
37%|███▋ | 8238/22095 [13:56:40<19:59:40, 5.19s/it] {'loss': 0.3419, 'grad_norm': 0.7705429790577362, 'learning_rate': 7.222283030826795e-06, 'epoch': 0.37}
37%|███▋ | 8239/22095 [13:56:44<18:20:06, 4.76s/it] {'loss': 0.3375, 'grad_norm': 0.5864709746736614, 'learning_rate': 7.221626456025456e-06, 'epoch': 0.37}
37%|███▋ | 8240/22095 [13:56:47<16:45:42, 4.36s/it] {'loss': 0.3456, 'grad_norm': 0.6161853610664281, 'learning_rate': 7.220969833488188e-06, 'epoch': 0.37}
37%|███▋ | 8241/22095 [13:56:51<15:47:49, 4.10s/it] {'loss': 0.3624, 'grad_norm': 0.6767120084765911, 'learning_rate': 7.2203131632291e-06, 'epoch': 0.37}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8405520 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7707, 'image': 'vrdu_table_final_2/astro-ph.CO/94529a1c-bdb1-435b-992c-ec9f841ce93e.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
37%|███▋ | 8242/22095 [13:56:54<14:35:05, 3.79s/it] {'loss': 0.3455, 'grad_norm': 0.673082152049448, 'learning_rate': 7.2196564452623015e-06, 'epoch': 0.37}
37%|███▋ | 8243/22095 [13:56:57<14:08:08, 3.67s/it] {'loss': 0.3499, 'grad_norm': 0.6447050864070746, 'learning_rate': 7.218999679601903e-06, 'epoch': 0.37}
37%|███▋ | 8244/22095 [13:57:01<14:45:26, 3.84s/it] {'loss': 0.3635, 'grad_norm': 0.6341690830643045, 'learning_rate': 7.2183428662620155e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (85402 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8245/22095 [13:57:05<14:57:21, 3.89s/it] {'loss': 0.3529, 'grad_norm': 0.6284278281076813, 'learning_rate': 7.217686005256755e-06, 'epoch': 0.37}
37%|███▋ | 8246/22095 [13:57:10<15:22:24, 4.00s/it] {'loss': 0.3424, 'grad_norm': 0.6416020521885557, 'learning_rate': 7.217029096600231e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (49834 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90737 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50573 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8247/22095 [13:57:13<14:24:50, 3.75s/it] {'loss': 0.3332, 'grad_norm': 0.5983683465068236, 'learning_rate': 7.216372140306563e-06, 'epoch': 0.37}
37%|███▋ | 8248/22095 [13:57:17<14:26:57, 3.76s/it] {'loss': 0.356, 'grad_norm': 0.6559567018480142, 'learning_rate': 7.215715136389862e-06, 'epoch': 0.37}
37%|███▋ | 8249/22095 [13:57:20<14:19:59, 3.73s/it] {'loss': 0.3941, 'grad_norm': 0.6773549771240448, 'learning_rate': 7.21505808486425e-06, 'epoch': 0.37}
37%|███▋ | 8250/22095 [13:57:24<14:35:51, 3.80s/it] {'loss': 0.3708, 'grad_norm': 0.6604796043184018, 'learning_rate': 7.2144009857438436e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (58661 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45718 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8251/22095 [13:57:27<13:29:52, 3.51s/it] {'loss': 0.3573, 'grad_norm': 0.6423405444166955, 'learning_rate': 7.213743839042757e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (127060 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55265 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8252/22095 [13:57:30<13:09:17, 3.42s/it] {'loss': 0.3797, 'grad_norm': 0.6249343983434719, 'learning_rate': 7.213086644775118e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (100761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47267 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8253/22095 [13:57:40<20:01:27, 5.21s/it] {'loss': 0.4988, 'grad_norm': 0.4517351308111027, 'learning_rate': 7.212429402955043e-06, 'epoch': 0.37}
37%|███▋ | 8254/22095 [13:57:43<18:21:33, 4.78s/it] {'loss': 0.3425, 'grad_norm': 0.622628015585213, 'learning_rate': 7.211772113596656e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8255/22095 [13:57:53<23:51:45, 6.21s/it] {'loss': 0.4689, 'grad_norm': 0.333877141476755, 'learning_rate': 7.211114776714077e-06, 'epoch': 0.37}
37%|███▋ | 8256/22095 [13:57:56<20:31:24, 5.34s/it] {'loss': 0.332, 'grad_norm': 0.6688247811686822, 'learning_rate': 7.210457392321434e-06, 'epoch': 0.37}
37%|███▋ | 8257/22095 [13:58:00<19:05:41, 4.97s/it] {'loss': 0.3083, 'grad_norm': 0.5636180479977778, 'learning_rate': 7.209799960432851e-06, 'epoch': 0.37}
37%|███▋ | 8258/22095 [13:58:05<18:23:20, 4.78s/it] {'loss': 0.3139, 'grad_norm': 0.5817093516022838, 'learning_rate': 7.209142481062452e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8259/22095 [13:58:14<24:05:45, 6.27s/it] {'loss': 0.4904, 'grad_norm': 0.35989511000783425, 'learning_rate': 7.208484954224366e-06, 'epoch': 0.37}
37%|███▋ | 8260/22095 [13:58:19<22:09:47, 5.77s/it] {'loss': 0.4874, 'grad_norm': 0.3762427932277179, 'learning_rate': 7.207827379932724e-06, 'epoch': 0.37}
37%|███▋ | 8261/22095 [13:58:25<22:10:05, 5.77s/it] {'loss': 0.5235, 'grad_norm': 0.3321377843546181, 'learning_rate': 7.207169758201649e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 364, but got module 1
37%|███▋ | 8262/22095 [13:58:30<21:03:01, 5.48s/it] {'loss': 0.3389, 'grad_norm': 0.8598727794773364, 'learning_rate': 7.206512089045277e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8263/22095 [13:58:33<18:45:10, 4.88s/it] {'loss': 0.3633, 'grad_norm': 0.6917048591356012, 'learning_rate': 7.205854372477735e-06, 'epoch': 0.37}
37%|███▋ | 8264/22095 [13:58:36<16:39:34, 4.34s/it] {'loss': 0.3523, 'grad_norm': 0.6357038317196191, 'learning_rate': 7.2051966085131584e-06, 'epoch': 0.37}
37%|███▋ | 8265/22095 [13:58:40<15:38:52, 4.07s/it] {'loss': 0.2921, 'grad_norm': 0.7597069027593606, 'learning_rate': 7.20453879716568e-06, 'epoch': 0.37}
37%|███▋ | 8266/22095 [13:58:43<14:34:22, 3.79s/it] {'loss': 0.3254, 'grad_norm': 0.7653677810020674, 'learning_rate': 7.203880938449432e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
37%|███▋ | 8267/22095 [13:58:46<13:53:35, 3.62s/it] {'loss': 0.3984, 'grad_norm': 0.7374295563959377, 'learning_rate': 7.203223032378552e-06, 'epoch': 0.37}
37%|███▋ | 8268/22095 [13:58:49<13:19:40, 3.47s/it] {'loss': 0.3443, 'grad_norm': 0.6887589709014885, 'learning_rate': 7.202565078967176e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (51903 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102032 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8269/22095 [13:58:52<12:34:24, 3.27s/it] {'loss': 0.3364, 'grad_norm': 0.8163425475345482, 'learning_rate': 7.201907078229442e-06, 'epoch': 0.37}
37%|███▋ | 8270/22095 [13:58:55<12:12:22, 3.18s/it] {'loss': 0.3452, 'grad_norm': 0.6668493386525876, 'learning_rate': 7.201249030179487e-06, 'epoch': 0.37}
37%|███▋ | 8271/22095 [13:58:59<13:18:09, 3.46s/it] {'loss': 0.3949, 'grad_norm': 0.6931544278501053, 'learning_rate': 7.200590934831451e-06, 'epoch': 0.37}
37%|███▋ | 8272/22095 [13:59:02<12:50:31, 3.34s/it] {'loss': 0.3592, 'grad_norm': 0.6957182852097032, 'learning_rate': 7.1999327921994735e-06, 'epoch': 0.37}
37%|███▋ | 8273/22095 [13:59:05<12:15:10, 3.19s/it] {'loss': 0.2903, 'grad_norm': 0.8084175042085993, 'learning_rate': 7.199274602297698e-06, 'epoch': 0.37}
37%|███▋ | 8274/22095 [13:59:08<12:04:35, 3.15s/it] {'loss': 0.3224, 'grad_norm': 0.5942923954714772, 'learning_rate': 7.198616365140264e-06, 'epoch': 0.37}
37%|███▋ | 8275/22095 [13:59:11<12:12:11, 3.18s/it] {'loss': 0.3889, 'grad_norm': 0.6934474912796738, 'learning_rate': 7.197958080741319e-06, 'epoch': 0.37}
37%|███▋ | 8276/22095 [13:59:15<12:42:49, 3.31s/it] {'loss': 0.3787, 'grad_norm': 0.6741172782103588, 'learning_rate': 7.1972997491150046e-06, 'epoch': 0.37}
Invalidate trace cache @ step 2: expected module 1, but got module 364
37%|███▋ | 8277/22095 [13:59:25<20:02:31, 5.22s/it] {'loss': 0.4912, 'grad_norm': 0.7482604130933004, 'learning_rate': 7.196641370275467e-06, 'epoch': 0.37}
37%|███▋ | 8278/22095 [13:59:28<17:57:27, 4.68s/it] {'loss': 0.3698, 'grad_norm': 0.6585374578882484, 'learning_rate': 7.195982944236853e-06, 'epoch': 0.37}
37%|███▋ | 8279/22095 [13:59:33<18:27:12, 4.81s/it] {'loss': 0.3301, 'grad_norm': 0.6341317832916422, 'learning_rate': 7.195324471013309e-06, 'epoch': 0.37}
37%|███▋ | 8280/22095 [13:59:36<16:12:34, 4.22s/it] {'loss': 0.3892, 'grad_norm': 0.6603462506256079, 'learning_rate': 7.194665950618986e-06, 'epoch': 0.37}
37%|███▋ | 8281/22095 [13:59:40<15:35:51, 4.06s/it] {'loss': 0.3192, 'grad_norm': 0.6211242207857619, 'learning_rate': 7.194007383068031e-06, 'epoch': 0.37}
37%|███▋ | 8282/22095 [13:59:42<14:12:25, 3.70s/it] {'loss': 0.3717, 'grad_norm': 0.7203308404997607, 'learning_rate': 7.193348768374595e-06, 'epoch': 0.37}
37%|███▋ | 8283/22095 [13:59:47<15:26:32, 4.02s/it] {'loss': 0.3376, 'grad_norm': 0.6878629374944645, 'learning_rate': 7.192690106552833e-06, 'epoch': 0.37}
37%|███▋ | 8284/22095 [13:59:50<14:32:28, 3.79s/it] {'loss': 0.3733, 'grad_norm': 0.6750065790075197, 'learning_rate': 7.1920313976168935e-06, 'epoch': 0.37}
Token indices sequence length is longer than the specified maximum sequence length for this model (129360 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110383 > 40960). Running this sequence through the model will result in indexing errors
37%|███▋ | 8285/22095 [13:59:54<14:22:27, 3.75s/it] {'loss': 0.3331, 'grad_norm': 0.723135413194556, 'learning_rate': 7.191372641580931e-06, 'epoch': 0.37}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
38%|███▊ | 8286/22095 [13:59:57<13:41:01, 3.57s/it] {'loss': 0.2828, 'grad_norm': 0.5944902240791072, 'learning_rate': 7.190713838459101e-06, 'epoch': 0.38}
38%|███▊ | 8287/22095 [14:00:00<12:57:29, 3.38s/it] {'loss': 0.3305, 'grad_norm': 0.6726537174612722, 'learning_rate': 7.190054988265559e-06, 'epoch': 0.38}
38%|███▊ | 8288/22095 [14:00:04<13:02:30, 3.40s/it] {'loss': 0.3667, 'grad_norm': 0.6223775834738695, 'learning_rate': 7.189396091014462e-06, 'epoch': 0.38}
38%|███▊ | 8289/22095 [14:00:07<13:07:51, 3.42s/it] {'loss': 0.3757, 'grad_norm': 0.6676977292485492, 'learning_rate': 7.188737146719967e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (79070 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59806 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8290/22095 [14:00:10<12:52:50, 3.36s/it] {'loss': 0.331, 'grad_norm': 0.6564201500650513, 'learning_rate': 7.188078155396232e-06, 'epoch': 0.38}
38%|███▊ | 8291/22095 [14:00:13<12:23:57, 3.23s/it] {'loss': 0.3548, 'grad_norm': 0.6770082220237673, 'learning_rate': 7.187419117057419e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (83093 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77428 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8292/22095 [14:00:17<12:33:16, 3.27s/it] {'loss': 0.3185, 'grad_norm': 0.6494451263198261, 'learning_rate': 7.1867600317176875e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
38%|███▊ | 8293/22095 [14:00:25<18:19:56, 4.78s/it] {'loss': 0.5227, 'grad_norm': 0.7351655528808541, 'learning_rate': 7.186100899391198e-06, 'epoch': 0.38}
38%|███▊ | 8294/22095 [14:00:30<18:22:41, 4.79s/it] {'loss': 0.3454, 'grad_norm': 0.6711745935920342, 'learning_rate': 7.185441720092114e-06, 'epoch': 0.38}
38%|███▊ | 8295/22095 [14:00:34<17:30:08, 4.57s/it] {'loss': 0.3194, 'grad_norm': 0.6135920701861901, 'learning_rate': 7.1847824938346e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954495 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5330, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 10cm\nB. 12cm\nC. 6cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
38%|███▊ | 8296/22095 [14:00:37<15:34:13, 4.06s/it] {'loss': 0.391, 'grad_norm': 0.6005901565788068, 'learning_rate': 7.18412322063282e-06, 'epoch': 0.38}
38%|███▊ | 8297/22095 [14:00:40<15:13:26, 3.97s/it] {'loss': 0.3364, 'grad_norm': 0.6292338562077902, 'learning_rate': 7.183463900500941e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
38%|███▊ | 8298/22095 [14:00:49<20:33:01, 5.36s/it] {'loss': 0.5301, 'grad_norm': 0.34141101180389294, 'learning_rate': 7.182804533453127e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8878405 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 1558, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]} 38%|███▊ | 8299/22095 [14:00:53<18:32:19, 4.84s/it] {'loss': 0.3473, 'grad_norm': 0.6197088953157461, 'learning_rate': 7.182145119503549e-06, 'epoch': 0.38} 38%|███▊ | 8299/22095 [14:00:53<18:32:19, 4.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8300/22095 [14:00:59<19:56:35, 5.20s/it] {'loss': 0.4761, 'grad_norm': 0.33265449349879456, 'learning_rate': 7.181485658666375e-06, 'epoch': 0.38} 38%|███▊ | 8300/22095 [14:00:59<19:56:35, 5.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8301/22095 [14:01:03<18:23:42, 4.80s/it] {'loss': 0.3073, 'grad_norm': 0.640304362055324, 'learning_rate': 7.180826150955772e-06, 'epoch': 0.38} 38%|███▊ | 8301/22095 [14:01:03<18:23:42, 4.80s/it] 38%|███▊ | 8302/22095 [14:01:06<17:16:23, 4.51s/it] {'loss': 0.4338, 'grad_norm': 0.6860724167861825, 'learning_rate': 7.180166596385915e-06, 'epoch': 0.38} 38%|███▊ | 8302/22095 [14:01:06<17:16:23, 4.51s/it] 38%|███▊ | 8303/22095 [14:01:10<15:39:47, 4.09s/it] {'loss': 0.3085, 'grad_norm': 0.6019944334202502, 'learning_rate': 7.179506994970972e-06, 'epoch': 0.38} 38%|███▊ | 8303/22095 [14:01:10<15:39:47, 4.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49974 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (140986 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44635 > 40960). Running this sequence through the model will result in indexing errors 38%|███▊ | 8304/22095 [14:01:14<15:32:16, 4.06s/it] {'loss': 0.3716, 'grad_norm': 0.656435666847941, 'learning_rate': 7.178847346725119e-06, 'epoch': 0.38} 38%|███▊ | 8304/22095 [14:01:14<15:32:16, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70111 > 40960). Running this sequence through the model will result in indexing errors 38%|███▊ | 8305/22095 [14:01:18<15:57:09, 4.16s/it] {'loss': 0.3585, 'grad_norm': 0.6424257873205942, 'learning_rate': 7.178187651662527e-06, 'epoch': 0.38} 38%|███▊ | 8305/22095 [14:01:18<15:57:09, 4.16s/it] 38%|███▊ | 8306/22095 [14:01:22<16:02:55, 4.19s/it] {'loss': 0.4092, 'grad_norm': 0.6525988216561274, 'learning_rate': 7.177527909797373e-06, 'epoch': 0.38} 38%|███▊ | 8306/22095 [14:01:22<16:02:55, 4.19s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948709 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71862, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB段上有两个点C和D,AD=\\ frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nA. 3\nB. 4\nC. 1\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 38%|███▊ | 8307/22095 [14:01:26<15:52:55, 4.15s/it] {'loss': 0.3302, 'grad_norm': 0.679056074665069, 'learning_rate': 7.176868121143831e-06, 'epoch': 0.38} 38%|███▊ | 8307/22095 [14:01:26<15:52:55, 4.15s/it] 38%|███▊ | 8308/22095 [14:01:29<14:30:12, 3.79s/it] {'loss': 0.3311, 'grad_norm': 0.6405564739339455, 'learning_rate': 7.176208285716079e-06, 'epoch': 0.38} 38%|███▊ | 8308/22095 [14:01:29<14:30:12, 3.79s/it] 38%|███▊ | 8309/22095 [14:01:33<14:32:40, 3.80s/it] {'loss': 0.399, 'grad_norm': 0.7141348763998681, 'learning_rate': 7.175548403528295e-06, 'epoch': 0.38} 38%|███▊ | 8309/22095 [14:01:33<14:32:40, 3.80s/it] 38%|███▊ | 8310/22095 [14:01:36<13:31:07, 3.53s/it] {'loss': 0.3938, 'grad_norm': 0.7062128340100978, 'learning_rate': 7.174888474594659e-06, 'epoch': 0.38} 38%|███▊ | 8310/22095 [14:01:36<13:31:07, 3.53s/it] 38%|███▊ | 8311/22095 [14:01:40<14:21:16, 3.75s/it] {'loss': 0.3702, 'grad_norm': 0.6758179480303619, 'learning_rate': 7.174228498929347e-06, 'epoch': 0.38} 38%|███▊ | 8311/22095 [14:01:40<14:21:16, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [103, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8358093 in VC:s3://internvl-moe-sft-data/. Exception: Image size [103, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 24804, 'image': 'vrdu_table_final_2/astro-ph.CO/1d58ad8d-c1d6-4b77-945c-89e766f6b586.png', 'image_wh': [[103, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$\\mu$, $\\mu_3$, $\\epsilon_4$\\end{tabular}\n```"}]} 38%|███▊ | 8312/22095 [14:01:47<17:36:05, 4.60s/it] {'loss': 0.4848, 'grad_norm': 0.5505683467660143, 'learning_rate': 7.1735684765465444e-06, 'epoch': 0.38} 38%|███▊ | 8312/22095 [14:01:47<17:36:05, 4.60s/it] 38%|███▊ | 8313/22095 [14:01:55<21:26:06, 5.60s/it] {'loss': 0.4858, 'grad_norm': 0.45578075324415335, 'learning_rate': 7.172908407460429e-06, 'epoch': 0.38} 38%|███▊ | 8313/22095 [14:01:55<21:26:06, 5.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 38%|███▊ | 8314/22095 [14:01:59<19:25:16, 5.07s/it] {'loss': 0.316, 'grad_norm': 0.7382639843039384, 'learning_rate': 7.172248291685187e-06, 'epoch': 0.38} 38%|███▊ | 8314/22095 [14:01:59<19:25:16, 5.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047592 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 4cm\nB. 6cm\nC. 1cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 38%|███▊ | 8315/22095 [14:02:02<17:22:39, 4.54s/it] {'loss': 0.3486, 'grad_norm': 0.8101613462731673, 'learning_rate': 7.171588129234999e-06, 'epoch': 0.38} 38%|███▊ | 8315/22095 [14:02:02<17:22:39, 4.54s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [678, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8490835 in VC:s3://internvl-moe-sft-data/. Exception: Image size [678, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 49973, 'image': 'vrdu_texteq/astro-ph.CO/35a6a402-afff-4592-b3bd-e6f3e53326ce.png', 'image_wh': [[678, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'indicates that the minimum radius is about\n$\\alpha_{\\rm min} \\approx 20^\\circ$.'}]} 38%|███▊ | 8316/22095 [14:02:05<15:41:30, 4.10s/it] {'loss': 0.3294, 'grad_norm': 0.6506764101308028, 'learning_rate': 7.170927920124052e-06, 'epoch': 0.38} 38%|███▊ | 8316/22095 [14:02:05<15:41:30, 4.10s/it] 38%|███▊ | 8317/22095 [14:02:08<14:58:36, 3.91s/it] {'loss': 0.3211, 'grad_norm': 0.5881378075645671, 'learning_rate': 7.1702676643665325e-06, 'epoch': 0.38} 38%|███▊ | 8317/22095 [14:02:08<14:58:36, 3.91s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885297 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8450, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 2cm\nB. 4cm\nC. 1cm\nD. 
1.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8318/22095 [14:02:11<13:58:54, 3.65s/it] {'loss': 0.3051, 'grad_norm': 0.6630934221929646, 'learning_rate': 7.169607361976627e-06, 'epoch': 0.38} 38%|███▊ | 8318/22095 [14:02:11<13:58:54, 3.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8359353 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 26073, 'image': 'vrdu_table_final_2/astro-ph.CO/f6c47b92-534e-451a-b55e-554771c602e4.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader return img.convert("RGB") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load raise _get_oserror(err_code, encoder=False) OSError: unrecognized data stream contents when reading image file [Try #0] Failed to fetch sample 6779573 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file Problematic sample: {'image': 'autocad/20250508_132635_1/images/before_screenshot_1_id_89_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': '\nHover over the Hatch tool in the Draw panel to view its tooltip description.'}, {'from': 'gpt', 'value': '\nclick(x=0.5422, y=0.5684)\n'}], 'width': 3600, 'height': 2338} 38%|███▊ | 8319/22095 [14:02:15<13:47:43, 3.61s/it] {'loss': 0.3842, 'grad_norm': 0.6771901923424336, 'learning_rate': 7.16894701296852e-06, 'epoch': 0.38} 38%|███▊ | 8319/22095 [14:02:15<13:47:43, 3.61s/it] 38%|███▊ | 8320/22095 [14:02:18<12:50:34, 3.36s/it] {'loss': 0.3436, 'grad_norm': 0.6014323203917278, 'learning_rate': 7.168286617356406e-06, 'epoch': 0.38} 38%|███▊ | 8320/22095 [14:02:18<12:50:34, 3.36s/it] 38%|███▊ | 8321/22095 [14:02:22<13:36:05, 3.55s/it] {'loss': 0.3776, 'grad_norm': 0.6818623234030072, 'learning_rate': 7.1676261751544696e-06, 'epoch': 0.38} 38%|███▊ | 8321/22095 [14:02:22<13:36:05, 3.55s/it] 38%|███▊ | 8322/22095 [14:02:25<12:53:01, 3.37s/it] {'loss': 0.3312, 'grad_norm': 0.6291103487682812, 'learning_rate': 7.1669656863769055e-06, 'epoch': 0.38} 38%|███▊ | 8322/22095 [14:02:25<12:53:01, 3.37s/it] 38%|███▊ | 8323/22095 [14:02:28<13:05:31, 3.42s/it] {'loss': 0.3592, 'grad_norm': 0.5881389565856336, 'learning_rate': 7.166305151037905e-06, 'epoch': 0.38} 38%|███▊ | 8323/22095 [14:02:28<13:05:31, 3.42s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", 
line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [556, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8485352 in VC:s3://internvl-moe-sft-data/. Exception: Image size [556, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44545, 'image': 'vrdu_texteq/astro-ph.CO/d919ca3e-3365-419c-9b1b-73c132f9d73a.png', 'image_wh': [[556, 23]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'Then the mass associated with the radius $r$ is'}]} 38%|███▊ | 8324/22095 [14:02:31<12:46:05, 3.34s/it] {'loss': 0.3734, 'grad_norm': 0.6840081359541433, 'learning_rate': 7.165644569151658e-06, 'epoch': 0.38} 38%|███▊ | 8324/22095 [14:02:31<12:46:05, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46174 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50272 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109655 > 40960). 
Running this sequence through the model will result in indexing errors 38%|███▊ | 8325/22095 [14:02:35<13:10:20, 3.44s/it] {'loss': 0.3562, 'grad_norm': 0.6440224446417455, 'learning_rate': 7.1649839407323606e-06, 'epoch': 0.38} 38%|███▊ | 8325/22095 [14:02:35<13:10:20, 3.44s/it] 38%|███▊ | 8326/22095 [14:02:39<13:26:19, 3.51s/it] {'loss': 0.363, 'grad_norm': 0.6498735609301556, 'learning_rate': 7.164323265794209e-06, 'epoch': 0.38} 38%|███▊ | 8326/22095 [14:02:39<13:26:19, 3.51s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8327/22095 [14:02:42<13:05:40, 3.42s/it] {'loss': 0.3671, 'grad_norm': 0.641845120340163, 'learning_rate': 7.163662544351396e-06, 'epoch': 0.38} 38%|███▊ | 8327/22095 [14:02:42<13:05:40, 3.42s/it] 38%|███▊ | 8328/22095 [14:02:45<12:48:42, 3.35s/it] {'loss': 0.3521, 'grad_norm': 0.6982652323012907, 'learning_rate': 7.163001776418121e-06, 'epoch': 0.38} 38%|███▊ | 8328/22095 [14:02:45<12:48:42, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8329/22095 [14:02:48<12:31:30, 3.28s/it] {'loss': 0.3521, 'grad_norm': 0.6243317055112345, 'learning_rate': 7.162340962008581e-06, 'epoch': 0.38} 38%|███▊ | 8329/22095 [14:02:48<12:31:30, 3.28s/it] 38%|███▊ | 8330/22095 [14:02:52<13:00:04, 3.40s/it] {'loss': 0.3369, 'grad_norm': 0.6649935914439344, 'learning_rate': 7.1616801011369755e-06, 'epoch': 0.38} 38%|███▊ | 8330/22095 [14:02:52<13:00:04, 3.40s/it] 38%|███▊ | 8331/22095 [14:02:55<12:42:16, 3.32s/it] {'loss': 0.3221, 'grad_norm': 0.6427306330278062, 'learning_rate': 7.161019193817503e-06, 'epoch': 0.38} 38%|███▊ | 8331/22095 [14:02:55<12:42:16, 3.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 
8332/22095 [14:02:59<13:31:29, 3.54s/it] {'loss': 0.3212, 'grad_norm': 0.6021879131102966, 'learning_rate': 7.1603582400643646e-06, 'epoch': 0.38} 38%|███▊ | 8332/22095 [14:02:59<13:31:29, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8333/22095 [14:03:09<21:24:39, 5.60s/it] {'loss': 0.4842, 'grad_norm': 1.1681307049239376, 'learning_rate': 7.159697239891764e-06, 'epoch': 0.38} 38%|███▊ | 8333/22095 [14:03:10<21:24:39, 5.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41627 > 40960). Running this sequence through the model will result in indexing errors 38%|███▊ | 8334/22095 [14:03:13<18:59:37, 4.97s/it] {'loss': 0.3408, 'grad_norm': 0.7385597893203592, 'learning_rate': 7.159036193313902e-06, 'epoch': 0.38} 38%|███▊ | 8334/22095 [14:03:13<18:59:37, 4.97s/it] 38%|███▊ | 8335/22095 [14:03:16<17:15:06, 4.51s/it] {'loss': 0.3593, 'grad_norm': 0.6559267542328838, 'learning_rate': 7.158375100344983e-06, 'epoch': 0.38} 38%|███▊ | 8335/22095 [14:03:16<17:15:06, 4.51s/it] 38%|███▊ | 8336/22095 [14:03:19<15:34:28, 4.08s/it] {'loss': 0.3517, 'grad_norm': 0.6265308322487304, 'learning_rate': 7.157713960999212e-06, 'epoch': 0.38} 38%|███▊ | 8336/22095 [14:03:20<15:34:28, 4.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49943 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72953 > 40960). 
Running this sequence through the model will result in indexing errors 38%|███▊ | 8337/22095 [14:03:23<14:22:18, 3.76s/it] {'loss': 0.3205, 'grad_norm': 0.6300808220892394, 'learning_rate': 7.157052775290795e-06, 'epoch': 0.38} 38%|███▊ | 8337/22095 [14:03:23<14:22:18, 3.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8338/22095 [14:03:26<13:28:58, 3.53s/it] {'loss': 0.3674, 'grad_norm': 0.7314504650540607, 'learning_rate': 7.156391543233938e-06, 'epoch': 0.38} 38%|███▊ | 8338/22095 [14:03:26<13:28:58, 3.53s/it] 38%|███▊ | 8339/22095 [14:03:28<12:49:41, 3.36s/it] {'loss': 0.3975, 'grad_norm': 0.625857506921556, 'learning_rate': 7.155730264842852e-06, 'epoch': 0.38} 38%|███▊ | 8339/22095 [14:03:28<12:49:41, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8378771 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 45555, 'image': 'vrdu_table_final_2/astro-ph.CO/9ad98d43-e07d-4950-83d9-949d0f84148c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [53, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8397999 in VC:s3://internvl-moe-sft-data/. Exception: Image size [53, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 149, 'image': 'vrdu_table_final_2/astro-ph.CO/32d69117-df36-4157-b2de-26c8c3b93e9f.png', 'image_wh': [[53, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l}PwS\\end{tabular}\n```"}]} 38%|███▊ | 8340/22095 [14:03:32<12:45:42, 3.34s/it] {'loss': 0.3646, 'grad_norm': 0.6983912978077121, 'learning_rate': 7.155068940131741e-06, 'epoch': 0.38} 38%|███▊ | 8340/22095 [14:03:32<12:45:42, 3.34s/it] 38%|███▊ | 8341/22095 [14:03:35<13:07:41, 3.44s/it] {'loss': 0.3419, 'grad_norm': 0.6330685645971086, 'learning_rate': 7.154407569114818e-06, 'epoch': 0.38} 38%|███▊ | 8341/22095 [14:03:35<13:07:41, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (113053 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68816 > 40960). 
Running this sequence through the model will result in indexing errors 38%|███▊ | 8342/22095 [14:03:42<17:13:54, 4.51s/it] {'loss': 0.4922, 'grad_norm': 0.6612187254544507, 'learning_rate': 7.153746151806293e-06, 'epoch': 0.38} 38%|███▊ | 8342/22095 [14:03:42<17:13:54, 4.51s/it] 38%|███▊ | 8343/22095 [14:03:46<16:34:21, 4.34s/it] {'loss': 0.3582, 'grad_norm': 0.616178122020675, 'learning_rate': 7.153084688220379e-06, 'epoch': 0.38} 38%|███▊ | 8343/22095 [14:03:46<16:34:21, 4.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8344/22095 [14:03:50<15:59:51, 4.19s/it] {'loss': 0.343, 'grad_norm': 0.5975510050770201, 'learning_rate': 7.152423178371286e-06, 'epoch': 0.38} 38%|███▊ | 8344/22095 [14:03:50<15:59:51, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8345/22095 [14:03:59<21:12:58, 5.55s/it] {'loss': 0.4942, 'grad_norm': 0.48499907097437206, 'learning_rate': 7.15176162227323e-06, 'epoch': 0.38} 38%|███▊ | 8345/22095 [14:03:59<21:12:58, 5.55s/it] 38%|███▊ | 8346/22095 [14:04:03<19:36:57, 5.14s/it] {'loss': 0.3072, 'grad_norm': 0.6269278584054762, 'learning_rate': 7.151100019940427e-06, 'epoch': 0.38} 38%|███▊ | 8346/22095 [14:04:03<19:36:57, 5.14s/it] 38%|███▊ | 8347/22095 [14:04:07<17:45:51, 4.65s/it] {'loss': 0.3565, 'grad_norm': 1.0530027938874988, 'learning_rate': 7.1504383713870895e-06, 'epoch': 0.38} 38%|███▊ | 8347/22095 [14:04:07<17:45:51, 4.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8396004 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 62845, 'image': 'vrdu_table_final_2/astro-ph.EP/f9e839b5-de5a-441b-aa80-2e3ada560206.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
38%|███▊ | 8348/22095 [14:04:10<16:45:59, 4.39s/it] {'loss': 0.3581, 'grad_norm': 0.6296408615092338, 'learning_rate': 7.149776676627436e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
38%|███▊ | 8349/22095 [14:04:17<19:05:08, 5.00s/it] {'loss': 0.4654, 'grad_norm': 0.38352804422665704, 'learning_rate': 7.149114935675685e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (76421 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43031 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118487 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50116 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52610 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97829 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (140843 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8350/22095 [14:04:26<24:06:50, 6.32s/it] {'loss': 0.4577, 'grad_norm': 0.3512682890162911, 'learning_rate': 7.148453148546055e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 364, but got module 1
38%|███▊ | 8351/22095 [14:04:31<21:53:30, 5.73s/it] {'loss': 0.3564, 'grad_norm': 0.6552021817776984, 'learning_rate': 7.1477913152527635e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
38%|███▊ | 8352/22095 [14:04:34<19:16:14, 5.05s/it] {'loss': 0.3423, 'grad_norm': 0.6176790702307078, 'learning_rate': 7.1471294358100344e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (49996 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54877 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48834 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45127 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54244 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8353/22095 [14:04:37<17:07:47, 4.49s/it] {'loss': 0.3545, 'grad_norm': 0.765039383704981, 'learning_rate': 7.146467510232088e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045959 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 7\nB. 6\nC. 10\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
38%|███▊ | 8354/22095 [14:04:40<15:09:09, 3.97s/it] {'loss': 0.3008, 'grad_norm': 0.6006632425414946, 'learning_rate': 7.145805538533146e-06, 'epoch': 0.38}
38%|███▊ | 8355/22095 [14:04:44<14:52:49, 3.90s/it] {'loss': 0.3485, 'grad_norm': 0.6770490817071343, 'learning_rate': 7.145143520727434e-06, 'epoch': 0.38}
38%|███▊ | 8356/22095 [14:04:47<14:12:00, 3.72s/it] {'loss': 0.3462, 'grad_norm': 0.7613669358884392, 'learning_rate': 7.144481456829178e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8923673 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46826, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 4\nB. 3\nC. 6\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
38%|███▊ | 8357/22095 [14:04:50<13:43:34, 3.60s/it] {'loss': 0.3768, 'grad_norm': 0.6814959868650572, 'learning_rate': 7.1438193468525986e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8885565 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8718, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 2\nB. 3\nC. 10\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
38%|███▊ | 8358/22095 [14:05:00<20:13:21, 5.30s/it] {'loss': 0.4857, 'grad_norm': 0.5110195752010532, 'learning_rate': 7.143157190811927e-06, 'epoch': 0.38}
38%|███▊ | 8359/22095 [14:05:04<18:52:50, 4.95s/it] {'loss': 0.3829, 'grad_norm': 0.6838577856291563, 'learning_rate': 7.14249498872139e-06, 'epoch': 0.38}
38%|███▊ | 8360/22095 [14:05:08<17:44:07, 4.65s/it] {'loss': 0.3886, 'grad_norm': 0.746673282701903, 'learning_rate': 7.141832740595217e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
38%|███▊ | 8361/22095 [14:05:17<23:08:19, 6.07s/it] {'loss': 0.4868, 'grad_norm': 0.4042609860628919, 'learning_rate': 7.141170446447634e-06, 'epoch': 0.38}
38%|███▊ | 8362/22095 [14:05:20<20:01:30, 5.25s/it] {'loss': 0.3452, 'grad_norm': 0.6604856741415678, 'learning_rate': 7.140508106292876e-06, 'epoch': 0.38}
38%|███▊ | 8363/22095 [14:05:23<17:29:31, 4.59s/it] {'loss': 0.3489, 'grad_norm': 0.6152179001971164, 'learning_rate': 7.139845720145172e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (63923 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (136129 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8364/22095 [14:05:27<16:04:58, 4.22s/it] {'loss': 0.3523, 'grad_norm': 0.6226880833935763, 'learning_rate': 7.139183288018756e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (68101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118113 > 40960). Running this sequence through the model will result in indexing errors
38%|███▊ | 8365/22095 [14:05:29<14:17:07, 3.75s/it] {'loss': 0.324, 'grad_norm': 0.747242633729006, 'learning_rate': 7.13852080992786e-06, 'epoch': 0.38}
38%|███▊ | 8366/22095 [14:05:33<14:11:04, 3.72s/it] {'loss': 0.3626, 'grad_norm': 0.6172830431009755, 'learning_rate': 7.137858285886721e-06, 'epoch': 0.38}
38%|███▊ | 8367/22095 [14:05:36<13:43:54, 3.60s/it] {'loss': 0.4062, 'grad_norm': 0.6480631555125602, 'learning_rate': 7.137195715909573e-06, 'epoch': 0.38}
38%|███▊ | 8368/22095 [14:05:40<13:11:05, 3.46s/it] {'loss': 0.3615, 'grad_norm': 0.6270206468414993, 'learning_rate': 7.136533100010654e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
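The recurring "Image size [...] is too small. Minimum size is 28" failures all come from samples whose recorded width or height is below the loader's 28-pixel minimum, and each one costs a fetch, an exception, and a retry at training time. A minimal sketch of screening such samples out of the manifest beforehand, assuming the `image_wh` sample layout shown in the "Problematic sample" dumps above (the helper names and the per-side interpretation of the minimum are illustrative assumptions, not the trainer's actual API):

```python
# Pre-filter sketch: drop samples the dataset loader would reject at fetch
# time because a recorded image side is under the 28-pixel minimum.
# MIN_SIDE mirrors the "Minimum size is 28" check in the log; the sample
# dict shape ('image_wh': [[w, h], ...]) follows the logged samples.
MIN_SIDE = 28

def is_large_enough(sample, min_side=MIN_SIDE):
    """True if every recorded (width, height) pair meets the minimum side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def filter_samples(samples, min_side=MIN_SIDE):
    """Keep only samples whose images would pass the loader's size check."""
    return [s for s in samples if is_large_enough(s, min_side)]
```

Run once over the annotation files, this would remove entries like the 14x23 and 28x25 images above instead of discovering them mid-epoch.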
[Try #0] Failed to fetch sample 8358149 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 24860, 'image': 'vrdu_table_final_2/astro-ph.CO/6781f9eb-fce0-47cf-80df-f83ed9b46e9d.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$S_{2}$\\end{tabular}\n```"}]} 38%|███▊ | 8369/22095 [14:05:43<13:15:59, 3.48s/it] {'loss': 0.3309, 'grad_norm': 0.5960545309428391, 'learning_rate': 7.135870438204198e-06, 'epoch': 0.38} 38%|███▊ | 8369/22095 [14:05:43<13:15:59, 3.48s/it] 38%|███▊ | 8370/22095 [14:05:46<12:57:44, 3.40s/it] {'loss': 0.368, 'grad_norm': 0.6132416090139773, 'learning_rate': 7.1352077305044485e-06, 'epoch': 0.38} 38%|███▊ | 8370/22095 [14:05:46<12:57:44, 3.40s/it] 38%|███▊ | 8371/22095 [14:05:50<13:37:24, 3.57s/it] {'loss': 0.3448, 'grad_norm': 0.5923204264613755, 'learning_rate': 7.1345449769256416e-06, 'epoch': 0.38} 38%|███▊ | 8371/22095 [14:05:50<13:37:24, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8372/22095 [14:05:56<15:47:31, 4.14s/it] {'loss': 0.4879, 'grad_norm': 0.522084772545119, 'learning_rate': 7.133882177482019e-06, 'epoch': 0.38} 38%|███▊ | 8372/22095 [14:05:56<15:47:31, 4.14s/it] 38%|███▊ | 8373/22095 [14:06:05<21:54:22, 5.75s/it] {'loss': 0.4873, 'grad_norm': 0.4472915138713644, 'learning_rate': 7.133219332187823e-06, 'epoch': 0.38} 38%|███▊ | 8373/22095 [14:06:05<21:54:22, 5.75s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 38%|███▊ | 8374/22095 [14:06:09<19:25:40, 5.10s/it] {'loss': 0.2654, 'grad_norm': 0.5607250276819523, 'learning_rate': 7.132556441057294e-06, 'epoch': 0.38} 38%|███▊ | 8374/22095 [14:06:09<19:25:40, 5.10s/it] 38%|███▊ | 8375/22095 [14:06:13<17:48:50, 
4.67s/it] {'loss': 0.3619, 'grad_norm': 0.6821875143652115, 'learning_rate': 7.131893504104677e-06, 'epoch': 0.38} 38%|███▊ | 8375/22095 [14:06:13<17:48:50, 4.67s/it] 38%|███▊ | 8376/22095 [14:06:16<16:09:18, 4.24s/it] {'loss': 0.353, 'grad_norm': 0.6041315305261291, 'learning_rate': 7.131230521344217e-06, 'epoch': 0.38} 38%|███▊ | 8376/22095 [14:06:16<16:09:18, 4.24s/it] 38%|███▊ | 8377/22095 [14:06:19<15:26:16, 4.05s/it] {'loss': 0.3891, 'grad_norm': 0.6264230011816776, 'learning_rate': 7.130567492790157e-06, 'epoch': 0.38} 38%|███▊ | 8377/22095 [14:06:19<15:26:16, 4.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43851 > 40960). Running this sequence through the model will result in indexing errors 38%|███▊ | 8378/22095 [14:06:23<14:32:56, 3.82s/it] {'loss': 0.3123, 'grad_norm': 0.6333311023261349, 'learning_rate': 7.129904418456745e-06, 'epoch': 0.38} 38%|███▊ | 8378/22095 [14:06:23<14:32:56, 3.82s/it] 38%|███▊ | 8379/22095 [14:06:26<14:03:49, 3.69s/it] {'loss': 0.3426, 'grad_norm': 0.699733653598983, 'learning_rate': 7.129241298358231e-06, 'epoch': 0.38} 38%|███▊ | 8379/22095 [14:06:26<14:03:49, 3.69s/it] 38%|███▊ | 8380/22095 [14:06:30<14:01:03, 3.68s/it] {'loss': 0.3427, 'grad_norm': 0.6604529388686061, 'learning_rate': 7.128578132508859e-06, 'epoch': 0.38} 38%|███▊ | 8380/22095 [14:06:30<14:01:03, 3.68s/it] 38%|███▊ | 8381/22095 [14:06:33<13:18:35, 3.49s/it] {'loss': 0.3482, 'grad_norm': 0.6022601095193064, 'learning_rate': 7.127914920922883e-06, 'epoch': 0.38} 38%|███▊ | 8381/22095 [14:06:33<13:18:35, 3.49s/it] 38%|███▊ | 8382/22095 [14:06:37<13:43:24, 3.60s/it] {'loss': 0.3629, 'grad_norm': 0.7434609203247454, 'learning_rate': 7.127251663614547e-06, 'epoch': 0.38} 38%|███▊ | 8382/22095 [14:06:37<13:43:24, 3.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8383/22095 [14:06:40<13:28:04, 3.54s/it] {'loss': 0.3559, 
'grad_norm': 0.6220340421846623, 'learning_rate': 7.126588360598109e-06, 'epoch': 0.38} 38%|███▊ | 8383/22095 [14:06:40<13:28:04, 3.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8384/22095 [14:06:43<13:18:43, 3.50s/it] {'loss': 0.335, 'grad_norm': 0.6740928356810203, 'learning_rate': 7.125925011887818e-06, 'epoch': 0.38} 38%|███▊ | 8384/22095 [14:06:43<13:18:43, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8385/22095 [14:06:47<13:10:22, 3.46s/it] {'loss': 0.3686, 'grad_norm': 0.7475687257974962, 'learning_rate': 7.125261617497926e-06, 'epoch': 0.38} 38%|███▊ | 8385/22095 [14:06:47<13:10:22, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8386/22095 [14:06:55<18:52:11, 4.96s/it] {'loss': 0.4895, 'grad_norm': 0.8291350729963307, 'learning_rate': 7.12459817744269e-06, 'epoch': 0.38} 38%|███▊ | 8386/22095 [14:06:55<18:52:11, 4.96s/it] 38%|███▊ | 8387/22095 [14:06:59<17:29:40, 4.59s/it] {'loss': 0.3739, 'grad_norm': 0.632240955498442, 'learning_rate': 7.123934691736365e-06, 'epoch': 0.38} 38%|███▊ | 8387/22095 [14:06:59<17:29:40, 4.59s/it] 38%|███▊ | 8388/22095 [14:07:02<15:45:15, 4.14s/it] {'loss': 0.3869, 'grad_norm': 0.6842845487643655, 'learning_rate': 7.123271160393206e-06, 'epoch': 0.38} 38%|███▊ | 8388/22095 [14:07:02<15:45:15, 4.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62646 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76532 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47856 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (123807 > 40960). Running this sequence through the model will result in indexing errors 38%|███▊ | 8389/22095 [14:07:06<15:05:41, 3.96s/it] {'loss': 0.3238, 'grad_norm': 0.6127478816563668, 'learning_rate': 7.122607583427472e-06, 'epoch': 0.38} 38%|███▊ | 8389/22095 [14:07:06<15:05:41, 3.96s/it] 38%|███▊ | 8390/22095 [14:07:09<13:59:30, 3.68s/it] {'loss': 0.3773, 'grad_norm': 0.6564050899226402, 'learning_rate': 7.121943960853418e-06, 'epoch': 0.38} 38%|███▊ | 8390/22095 [14:07:09<13:59:30, 3.68s/it] 38%|███▊ | 8391/22095 [14:07:12<13:33:05, 3.56s/it] {'loss': 0.3562, 'grad_norm': 0.6262183560049303, 'learning_rate': 7.121280292685307e-06, 'epoch': 0.38} 38%|███▊ | 8391/22095 [14:07:12<13:33:05, 3.56s/it] 38%|███▊ | 8392/22095 [14:07:16<13:41:48, 3.60s/it] {'loss': 0.3793, 'grad_norm': 0.636568518471423, 'learning_rate': 7.120616578937397e-06, 'epoch': 0.38} 38%|███▊ | 8392/22095 [14:07:16<13:41:48, 3.60s/it] 38%|███▊ | 8393/22095 [14:07:19<13:23:11, 3.52s/it] {'loss': 0.3582, 'grad_norm': 0.6503707596443266, 'learning_rate': 7.1199528196239495e-06, 'epoch': 0.38} 38%|███▊ | 8393/22095 [14:07:19<13:23:11, 3.52s/it] 38%|███▊ | 8394/22095 [14:07:22<13:09:11, 3.46s/it] {'loss': 0.3641, 'grad_norm': 0.648733837093838, 'learning_rate': 7.119289014759228e-06, 'epoch': 0.38} 38%|███▊ | 8394/22095 [14:07:22<13:09:11, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63088 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59522 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50303 > 40960). 
Running this sequence through the model will result in indexing errors 38%|███▊ | 8395/22095 [14:07:26<13:43:26, 3.61s/it] {'loss': 0.3614, 'grad_norm': 0.6033527463965486, 'learning_rate': 7.118625164357493e-06, 'epoch': 0.38} 38%|███▊ | 8395/22095 [14:07:26<13:43:26, 3.61s/it] 38%|███▊ | 8396/22095 [14:07:30<13:36:06, 3.57s/it] {'loss': 0.387, 'grad_norm': 0.6515459622908713, 'learning_rate': 7.117961268433012e-06, 'epoch': 0.38} 38%|███▊ | 8396/22095 [14:07:30<13:36:06, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8397/22095 [14:07:36<16:50:02, 4.42s/it] {'loss': 0.4947, 'grad_norm': 0.4871667020278753, 'learning_rate': 7.117297327000046e-06, 'epoch': 0.38} 38%|███▊ | 8397/22095 [14:07:36<16:50:02, 4.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8398/22095 [14:07:40<15:50:28, 4.16s/it] {'loss': 0.3506, 'grad_norm': 0.6267015745529267, 'learning_rate': 7.116633340072863e-06, 'epoch': 0.38} 38%|███▊ | 8398/22095 [14:07:40<15:50:28, 4.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8399/22095 [14:07:43<15:02:35, 3.95s/it] {'loss': 0.3351, 'grad_norm': 0.6194526502875497, 'learning_rate': 7.115969307665733e-06, 'epoch': 0.38} 38%|███▊ | 8399/22095 [14:07:43<15:02:35, 3.95s/it] 38%|███▊ | 8400/22095 [14:07:47<14:49:07, 3.90s/it] {'loss': 0.3439, 'grad_norm': 0.6102449278498785, 'learning_rate': 7.115305229792918e-06, 'epoch': 0.38} 38%|███▊ | 8400/22095 [14:07:47<14:49:07, 3.90s/it] 38%|███▊ | 8401/22095 [14:07:50<14:06:35, 3.71s/it] {'loss': 0.3679, 'grad_norm': 0.7654057374127621, 'learning_rate': 7.114641106468692e-06, 'epoch': 0.38} 38%|███▊ | 8401/22095 [14:07:50<14:06:35, 3.71s/it] 38%|███▊ | 8402/22095 [14:07:54<13:48:46, 3.63s/it] 
{'loss': 0.3582, 'grad_norm': 0.6773434947716912, 'learning_rate': 7.113976937707324e-06, 'epoch': 0.38} 38%|███▊ | 8402/22095 [14:07:54<13:48:46, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8403/22095 [14:08:03<20:28:06, 5.38s/it] {'loss': 0.4777, 'grad_norm': 0.31799243148880396, 'learning_rate': 7.1133127235230825e-06, 'epoch': 0.38} 38%|███▊ | 8403/22095 [14:08:03<20:28:06, 5.38s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8404/22095 [14:08:07<18:32:36, 4.88s/it] {'loss': 0.3615, 'grad_norm': 0.6289377071621696, 'learning_rate': 7.1126484639302425e-06, 'epoch': 0.38} 38%|███▊ | 8404/22095 [14:08:07<18:32:36, 4.88s/it] 38%|███▊ | 8405/22095 [14:08:10<17:05:56, 4.50s/it] {'loss': 0.3255, 'grad_norm': 0.6278500233355753, 'learning_rate': 7.111984158943075e-06, 'epoch': 0.38} 38%|███▊ | 8405/22095 [14:08:10<17:05:56, 4.50s/it] 38%|███▊ | 8406/22095 [14:08:14<16:13:16, 4.27s/it] {'loss': 0.3665, 'grad_norm': 0.6312913692634605, 'learning_rate': 7.1113198085758535e-06, 'epoch': 0.38} 38%|███▊ | 8406/22095 [14:08:14<16:13:16, 4.27s/it] 38%|███▊ | 8407/22095 [14:08:17<14:47:01, 3.89s/it] {'loss': 0.3658, 'grad_norm': 0.6265823361581598, 'learning_rate': 7.110655412842855e-06, 'epoch': 0.38} 38%|███▊ | 8407/22095 [14:08:17<14:47:01, 3.89s/it] 38%|███▊ | 8408/22095 [14:08:21<14:19:43, 3.77s/it] {'loss': 0.3447, 'grad_norm': 0.6374687650436551, 'learning_rate': 7.109990971758354e-06, 'epoch': 0.38} 38%|███▊ | 8408/22095 [14:08:21<14:19:43, 3.77s/it] 38%|███▊ | 8409/22095 [14:08:23<13:09:17, 3.46s/it] {'loss': 0.337, 'grad_norm': 0.6293191892231478, 'learning_rate': 7.109326485336626e-06, 'epoch': 0.38} 38%|███▊ | 8409/22095 [14:08:23<13:09:17, 3.46s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = 
self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8310011 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB17ggQcJHO8KJjSZFtXXchfXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n从图片中辨认文字。'}, {'from': 'gpt', 'value': '图中所有文字:\nE14大红\nE27尖泡\nE14中红\nE12中红\nE27圆泡\nE12透明\nB22圆泡\nE12红色\n钨丝款\n好家依生活馆\nhttps://shop108874005.taobao.com/'}]} 38%|███▊ | 8410/22095 [14:08:26<12:46:50, 3.36s/it] {'loss': 0.3398, 'grad_norm': 0.7138126845674831, 'learning_rate': 7.108661953591953e-06, 'epoch': 0.38} 38%|███▊ | 8410/22095 [14:08:26<12:46:50, 3.36s/it] 38%|███▊ | 8411/22095 [14:08:30<13:12:06, 3.47s/it] {'loss': 0.3442, 'grad_norm': 0.7703350926583172, 'learning_rate': 7.107997376538606e-06, 'epoch': 0.38} 38%|███▊ | 8411/22095 [14:08:30<13:12:06, 3.47s/it] 38%|███▊ | 8412/22095 [14:08:33<12:37:07, 3.32s/it] {'loss': 0.3825, 'grad_norm': 0.658339137124922, 'learning_rate': 7.107332754190874e-06, 'epoch': 0.38} 38%|███▊ | 8412/22095 [14:08:33<12:37:07, 3.32s/it] 38%|███▊ | 8413/22095 [14:08:37<13:33:47, 3.57s/it] {'loss': 0.4002, 'grad_norm': 0.6314047841513415, 'learning_rate': 7.1066680865630335e-06, 'epoch': 0.38} 38%|███▊ | 8413/22095 [14:08:37<13:33:47, 3.57s/it] 38%|███▊ | 8414/22095 [14:08:40<13:03:03, 3.43s/it] {'loss': 0.3591, 'grad_norm': 0.6477014404457618, 'learning_rate': 7.106003373669363e-06, 'epoch': 0.38} 38%|███▊ | 8414/22095 [14:08:40<13:03:03, 3.43s/it] 38%|███▊ | 8415/22095 [14:08:44<13:34:06, 3.57s/it] {'loss': 0.33, 'grad_norm': 0.6041802367550553, 'learning_rate': 7.10533861552415e-06, 'epoch': 0.38} 38%|███▊ | 8415/22095 [14:08:44<13:34:06, 3.57s/it] 38%|███▊ | 
8416/22095 [14:08:48<13:59:29, 3.68s/it] {'loss': 0.3893, 'grad_norm': 0.6430900504898354, 'learning_rate': 7.104673812141676e-06, 'epoch': 0.38} 38%|███▊ | 8416/22095 [14:08:48<13:59:29, 3.68s/it] 38%|███▊ | 8417/22095 [14:08:53<14:47:01, 3.89s/it] {'loss': 0.3551, 'grad_norm': 0.6314476922290938, 'learning_rate': 7.104008963536224e-06, 'epoch': 0.38} 38%|███▊ | 8417/22095 [14:08:53<14:47:01, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8418/22095 [14:08:56<14:06:03, 3.71s/it] {'loss': 0.3344, 'grad_norm': 0.7428431343765651, 'learning_rate': 7.1033440697220845e-06, 'epoch': 0.38} 38%|███▊ | 8418/22095 [14:08:56<14:06:03, 3.71s/it] 38%|███▊ | 8419/22095 [14:08:59<13:44:47, 3.62s/it] {'loss': 0.354, 'grad_norm': 0.6049014863415836, 'learning_rate': 7.102679130713538e-06, 'epoch': 0.38} 38%|███▊ | 8419/22095 [14:08:59<13:44:47, 3.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 38%|███▊ | 8420/22095 [14:09:03<13:25:15, 3.53s/it] {'loss': 0.3793, 'grad_norm': 0.6490441683398328, 'learning_rate': 7.102014146524877e-06, 'epoch': 0.38} 38%|███▊ | 8420/22095 [14:09:03<13:25:15, 3.53s/it] 38%|███▊ | 8421/22095 [14:09:05<12:39:44, 3.33s/it] {'loss': 0.3522, 'grad_norm': 0.6654902860757159, 'learning_rate': 7.101349117170386e-06, 'epoch': 0.38} 38%|███▊ | 8421/22095 [14:09:06<12:39:44, 3.33s/it] 38%|███▊ | 8422/22095 [14:09:08<12:14:13, 3.22s/it] {'loss': 0.3493, 'grad_norm': 0.6555822112102216, 'learning_rate': 7.1006840426643576e-06, 'epoch': 0.38} 38%|███▊ | 8422/22095 [14:09:08<12:14:13, 3.22s/it] 38%|███▊ | 8423/22095 [14:09:12<12:02:57, 3.17s/it] {'loss': 0.3384, 'grad_norm': 0.6365048153101841, 'learning_rate': 7.10001892302108e-06, 'epoch': 0.38} 38%|███▊ | 8423/22095 [14:09:12<12:02:57, 3.17s/it] 38%|███▊ | 8424/22095 [14:09:16<13:25:51, 3.54s/it] {'loss': 0.3476, 'grad_norm': 0.6887085895296732, 
'learning_rate': 7.099353758254846e-06, 'epoch': 0.38} 38%|███▊ | 8424/22095 [14:09:16<13:25:51, 3.54s/it] 38%|███▊ | 8425/22095 [14:09:19<13:02:24, 3.43s/it] {'loss': 0.3124, 'grad_norm': 0.6080229886754357, 'learning_rate': 7.0986885483799475e-06, 'epoch': 0.38} 38%|███▊ | 8425/22095 [14:09:19<13:02:24, 3.43s/it] 38%|███▊ | 8426/22095 [14:09:22<12:19:07, 3.24s/it] {'loss': 0.3001, 'grad_norm': 0.647329971750059, 'learning_rate': 7.098023293410677e-06, 'epoch': 0.38} 38%|███▊ | 8426/22095 [14:09:22<12:19:07, 3.24s/it] 38%|███▊ | 8427/22095 [14:09:25<12:35:50, 3.32s/it] {'loss': 0.3491, 'grad_norm': 0.6508641604197013, 'learning_rate': 7.09735799336133e-06, 'epoch': 0.38} 38%|███▊ | 8427/22095 [14:09:25<12:35:50, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (100554 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53217 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47803 > 40960) for 4 sample(s). Truncating to 6843 with 3 samples. 
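The "Truncating to 6843 with 3 samples" message above suggests that when a packed batch of samples exceeds the model maximum (here 40960 tokens), whole samples are dropped from the end until the packed length fits. A minimal sketch of that behavior under the stated assumption; the function name and list-of-token-id-lists interface are illustrative, not the trainer's actual implementation:

```python
# Packed-length truncation sketch: given per-sample token id lists, keep a
# prefix of whole samples whose combined length stays within the model
# maximum (40960 in the log), rather than cutting a sample mid-sequence.
MAX_LEN = 40960

def truncate_packed(sample_token_ids, max_len=MAX_LEN):
    """Return (kept_samples, total_length) fitting within max_len."""
    kept, total = [], 0
    for ids in sample_token_ids:
        if total + len(ids) > max_len:
            break  # dropping this and all later samples keeps packing whole
        kept.append(ids)
        total += len(ids)
    return kept, total
```

This matches the shape of the logged event: 4 samples overflow the limit, so the pack is rebuilt from the first 3 at a shorter total length.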
38%|███▊ | 8428/22095 [14:09:29<13:01:16, 3.43s/it] {'loss': 0.327, 'grad_norm': 0.6191855403182719, 'learning_rate': 7.096692648246203e-06, 'epoch': 0.38} 38%|███▊ | 8428/22095 [14:09:29<13:01:16, 3.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, tcs_loader=self.tcs_loader) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader return img.convert("RGB") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load raise _get_oserror(err_code, encoder=False) OSError: unrecognized data stream contents when reading 
image file [Try #0] Failed to fetch sample 7322508 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file Problematic sample: {'image': 'inventor/20250511_134843_1/images/before_screenshot_1_id_149_internvl_appearance_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nOutput only the bounding box in your response. A rectangular button with a light gray background labeled 'Split' in black text. The button has a subtle 3D effect with a slight bevel on its edges. It shows a small icon above the text that resembles a divided or split object."}, {'from': 'gpt', 'value': "A rectangular button with a light gray background labeled 'Split' in black text. The button has a subtle 3D effect with a slight bevel on its edges. It shows a small icon above the text that resembles a divided or split object.[[665, 716, 680, 725]]"}], 'width': 3600, 'height': 2338} 38%|███▊ | 8429/22095 [14:09:33<13:22:46, 3.52s/it] {'loss': 0.3401, 'grad_norm': 1.0524445853788542, 'learning_rate': 7.096027258079587e-06, 'epoch': 0.38} 38%|███▊ | 8429/22095 [14:09:33<13:22:46, 3.52s/it] 38%|███▊ | 8430/22095 [14:09:36<12:57:23, 3.41s/it] {'loss': 0.3223, 'grad_norm': 0.6245006137232687, 'learning_rate': 7.095361822875786e-06, 'epoch': 0.38} 38%|███▊ | 8430/22095 [14:09:36<12:57:23, 3.41s/it] 38%|███▊ | 8431/22095 [14:09:39<12:32:11, 3.30s/it] {'loss': 0.3883, 'grad_norm': 0.6383255613126531, 'learning_rate': 7.094696342649092e-06, 'epoch': 0.38} 38%|███▊ | 8431/22095 [14:09:39<12:32:11, 3.30s/it] 38%|███▊ | 8432/22095 [14:09:42<12:24:34, 3.27s/it] {'loss': 0.3414, 'grad_norm': 0.6874734938881214, 'learning_rate': 7.094030817413808e-06, 'epoch': 0.38} 38%|███▊ | 8432/22095 [14:09:42<12:24:34, 3.27s/it] 38%|███▊ | 8433/22095 [14:09:46<12:27:32, 3.28s/it] {'loss': 0.3289, 'grad_norm': 0.5845736916154386, 'learning_rate': 7.093365247184234e-06, 'epoch': 0.38} 38%|███▊ | 8433/22095 
[14:09:46<12:27:32, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 38%|███▊ | 8434/22095 [14:09:54<18:27:59, 4.87s/it] {'loss': 0.4684, 'grad_norm': 0.5131578573043148, 'learning_rate': 7.09269963197467e-06, 'epoch': 0.38} 38%|███▊ | 8434/22095 [14:09:54<18:27:59, 4.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69284 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60686 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113942 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64660 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114342 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56401 > 40960). 
Running this sequence through the model will result in indexing errors 38%|███▊ | 8435/22095 [14:09:57<16:39:33, 4.39s/it] {'loss': 0.4068, 'grad_norm': 0.6426976817130162, 'learning_rate': 7.092033971799417e-06, 'epoch': 0.38} 38%|███▊ | 8435/22095 [14:09:57<16:39:33, 4.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [345, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8532479 in VC:s3://internvl-moe-sft-data/. Exception: Image size [345, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16238, 'image': 'vrdu_texteq/astro-ph.CO/68fc2c89-5d3c-47b0-bd18-e9a8439cb579.png', 'image_wh': [[345, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'In this case $\\alpha \\simeq i\\kappa$ and then'}]} 38%|███▊ | 8436/22095 [14:10:07<22:41:05, 5.98s/it] {'loss': 0.4816, 'grad_norm': 0.3146884550883189, 'learning_rate': 7.09136826667278e-06, 'epoch': 0.38} 38%|███▊ | 8436/22095 [14:10:07<22:41:05, 5.98s/it] 38%|███▊ | 8437/22095 [14:10:12<21:16:48, 5.61s/it] {'loss': 0.3525, 'grad_norm': 0.8773610470531031, 'learning_rate': 7.0907025166090615e-06, 'epoch': 0.38} 38%|███▊ | 8437/22095 [14:10:12<21:16:48, 5.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47807 > 40960). 
Running this sequence through the model will result in indexing errors
 38%|███▊ | 8438/22095 [14:10:16<19:15:53, 5.08s/it] {'loss': 0.3579, 'grad_norm': 0.6356233415643519, 'learning_rate': 7.090036721622567e-06, 'epoch': 0.38}
 38%|███▊ | 8439/22095 [14:10:19<16:46:34, 4.42s/it] {'loss': 0.3432, 'grad_norm': 0.7375475936847615, 'learning_rate': 7.089370881727604e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348835 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15505, 'image': 'vrdu_table_final_2/astro-ph.CO/4b79198a-52b8-4035-a54e-3498c98d1ac3.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$S_{3}$\\end{tabular}\n```"}]}
 38%|███▊ | 8440/22095 [14:10:21<14:57:23, 3.94s/it] {'loss': 0.3353, 'grad_norm': 0.6364433797110283, 'learning_rate': 7.0887049969384756e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8441/22095 [14:10:25<14:28:48, 3.82s/it] {'loss': 0.2949, 'grad_norm': 0.5537782812413454, 'learning_rate': 7.088039067269493e-06, 'epoch': 0.38}
 38%|███▊ | 8442/22095 [14:10:28<13:17:41, 3.51s/it] {'loss': 0.355, 'grad_norm': 0.6125051256367956, 'learning_rate': 7.087373092734964e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 38%|███▊ | 8443/22095 [14:10:38<20:53:54, 5.51s/it] {'loss': 0.4963, 'grad_norm': 0.5627303676401337, 'learning_rate': 7.086707073349197e-06, 'epoch': 0.38}
 38%|███▊ | 8444/22095 [14:10:42<18:56:14, 4.99s/it] {'loss': 0.3288, 'grad_norm': 0.6606337857741633, 'learning_rate': 7.086041009126504e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [464, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8480502 in VC:s3://internvl-moe-sft-data/. Exception: Image size [464, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36744, 'image': 'vrdu_texteq/astro-ph.CO/ca36079f-6f4b-4bab-b535-131af82882de.png', 'image_wh': [[464, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'where $Y \\simeq 0.25$ is the helium fraction.'}]}
 38%|███▊ | 8445/22095 [14:10:45<17:15:25, 4.55s/it] {'loss': 0.317, 'grad_norm': 0.6542767650447026, 'learning_rate': 7.0853749000811965e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 38%|███▊ | 8446/22095 [14:10:56<24:13:49, 6.39s/it] {'loss': 0.4834, 'grad_norm': 0.40963340986258956, 'learning_rate': 7.084708746227589e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8447/22095 [14:11:00<21:41:36, 5.72s/it] {'loss': 0.3572, 'grad_norm': 0.5833579676406702, 'learning_rate': 7.084042547579992e-06, 'epoch': 0.38}
 38%|███▊ | 8448/22095 [14:11:03<18:52:14, 4.98s/it] {'loss': 0.3346, 'grad_norm': 0.5936420943321252, 'learning_rate': 7.08337630415272e-06, 'epoch': 0.38}
 38%|███▊ | 8449/22095 [14:11:07<17:57:49, 4.74s/it] {'loss': 0.36, 'grad_norm': 0.6331350642510168, 'learning_rate': 7.082710015960091e-06, 'epoch': 0.38}
 38%|███▊ | 8450/22095 [14:11:11<16:37:19, 4.39s/it] {'loss': 0.3742, 'grad_norm': 0.6776388853726919, 'learning_rate': 7.08204368301642e-06, 'epoch': 0.38}
 38%|███▊ | 8451/22095 [14:11:15<16:16:08, 4.29s/it] {'loss': 0.3375, 'grad_norm': 0.6536913237318658, 'learning_rate': 7.081377305336025e-06, 'epoch': 0.38}
 38%|███▊ | 8452/22095 [14:11:18<15:08:57, 4.00s/it] {'loss': 0.3457, 'grad_norm': 0.6744660832075964, 'learning_rate': 7.080710882933225e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8453/22095 [14:11:22<15:01:44, 3.97s/it] {'loss': 0.3678, 'grad_norm': 0.620253089835594, 'learning_rate': 7.080044415822337e-06, 'epoch': 0.38}
 38%|███▊ | 8454/22095 [14:11:26<14:22:31, 3.79s/it] {'loss': 0.3096, 'grad_norm': 0.6134172858330584, 'learning_rate': 7.079377904017683e-06, 'epoch': 0.38}
 38%|███▊ | 8455/22095 [14:11:29<14:05:03, 3.72s/it] {'loss': 0.3486, 'grad_norm': 0.6199238081498524, 'learning_rate': 7.078711347533585e-06, 'epoch': 0.38}
 38%|███▊ | 8456/22095 [14:11:33<14:08:28, 3.73s/it] {'loss': 0.3599, 'grad_norm': 0.6391699024069745, 'learning_rate': 7.078044746384365e-06, 'epoch': 0.38}
 38%|███▊ | 8457/22095 [14:11:36<13:09:31, 3.47s/it] {'loss': 0.3583, 'grad_norm': 0.6803022292367258, 'learning_rate': 7.077378100584344e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304592 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1cUBSa.UIL1JjSZFrXXb3xFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n劳烦帮忙读出并报告这张图上的文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n自发光\n联创胶粘制品厂\n全夜光\n全夜光批发联创胶粘制品厂'}]}
 38%|███▊ | 8458/22095 [14:11:44<18:49:04, 4.97s/it] {'loss': 0.5134, 'grad_norm': 0.7097666551654821, 'learning_rate': 7.076711410147849e-06, 'epoch': 0.38}
 38%|███▊ | 8459/22095 [14:11:48<16:54:38, 4.46s/it] {'loss': 0.3769, 'grad_norm': 0.6725298205711432, 'learning_rate': 7.076044675089203e-06, 'epoch': 0.38}
 38%|███▊ | 8460/22095 [14:11:51<15:19:28, 4.05s/it] {'loss': 0.3512, 'grad_norm': 0.7210712947175153, 'learning_rate': 7.075377895422735e-06, 'epoch': 0.38}
 38%|███▊ | 8461/22095 [14:11:54<15:00:36, 3.96s/it] {'loss': 0.3757, 'grad_norm': 0.5975178708495634, 'learning_rate': 7.074711071162768e-06, 'epoch': 0.38}
 38%|███▊ | 8462/22095 [14:11:58<14:01:49, 3.70s/it] {'loss': 0.3438, 'grad_norm': 0.5915727400449279, 'learning_rate': 7.074044202323632e-06, 'epoch': 0.38}
 38%|███▊ | 8463/22095 [14:12:00<13:11:36, 3.48s/it] {'loss': 0.3433, 'grad_norm': 0.6853253841017686, 'learning_rate': 7.073377288919657e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11303587 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 405, 'image': 'airplane_app/1090.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这是什么'}, {'from': 'gpt', 'value': '这是一款旧版的计算工具软件,界面文字是中文。左侧是不同的计算工具选项,包括基础资料、砌体、模板、钢筋制安掺和料、防水混凝土模板浇筑、其他钢筋混凝土等类别。从界面设计来看,这类软件主要用于工程、建筑、施工等领域的计算和管理。不过具体是哪款软件以及版本号、开发者等信息,仅凭这张截图无法确定更多详细信息。'}]}
 38%|███▊ | 8464/22095 [14:12:03<12:37:13, 3.33s/it] {'loss': 0.3868, 'grad_norm': 2.0075612522721986, 'learning_rate': 7.072710330965171e-06, 'epoch': 0.38}
 38%|███▊ | 8465/22095 [14:12:07<13:01:20, 3.44s/it] {'loss': 0.3257, 'grad_norm': 0.649754434745739, 'learning_rate': 7.072043328474507e-06, 'epoch': 0.38}
 38%|███▊ | 8466/22095 [14:12:10<12:39:18, 3.34s/it] {'loss': 0.3428, 'grad_norm': 0.5866531935730779, 'learning_rate': 7.071376281461994e-06, 'epoch': 0.38}
 38%|███▊ | 8467/22095 [14:12:13<12:19:07, 3.25s/it] {'loss': 0.3747, 'grad_norm': 0.6168824482768775, 'learning_rate': 7.0707091899419685e-06, 'epoch': 0.38}
 38%|███▊ | 8468/22095 [14:12:16<11:56:06, 3.15s/it] {'loss': 0.3468, 'grad_norm': 0.6362663525465235, 'learning_rate': 7.070042053928763e-06, 'epoch': 0.38}
 38%|███▊ | 8469/22095 [14:12:19<11:48:49, 3.12s/it] {'loss':
0.3614, 'grad_norm': 0.6595243836995929, 'learning_rate': 7.0693748734367076e-06, 'epoch': 0.38}
 38%|███▊ | 8470/22095 [14:12:22<11:39:02, 3.08s/it] {'loss': 0.3327, 'grad_norm': 0.6817648534737756, 'learning_rate': 7.068707648480145e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (88344 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8471/22095 [14:12:26<12:23:28, 3.27s/it] {'loss': 0.3607, 'grad_norm': 0.6195886664641986, 'learning_rate': 7.068040379073406e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (76281 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8472/22095 [14:12:29<12:38:16, 3.34s/it] {'loss': 0.3758, 'grad_norm': 0.6799828912841049, 'learning_rate': 7.067373065230834e-06, 'epoch': 0.38}
 38%|███▊ | 8473/22095 [14:12:33<12:24:45, 3.28s/it] {'loss': 0.3508, 'grad_norm': 0.6652330351884505, 'learning_rate': 7.0667057069667625e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8474/22095 [14:12:36<12:15:43, 3.24s/it] {'loss': 0.3389, 'grad_norm': 0.704107025489184, 'learning_rate': 7.066038304295533e-06, 'epoch': 0.38}
 38%|███▊ | 8475/22095 [14:12:39<11:58:28, 3.17s/it] {'loss': 0.3261, 'grad_norm': 0.5711102874307746, 'learning_rate': 7.065370857231484e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8476/22095 [14:12:42<12:33:59, 3.32s/it] {'loss': 0.372, 'grad_norm': 0.7373487417259906, 'learning_rate': 7.064703365788961e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8477/22095 [14:12:45<11:56:08, 3.16s/it] {'loss': 0.2955, 'grad_norm': 0.616650070917948, 'learning_rate': 7.064035829982302e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8607257 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24916, 'image': '1885726031.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Cookbooks, Food & Wine? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
 38%|███▊ | 8478/22095 [14:12:48<11:53:55, 3.15s/it] {'loss': 0.3735, 'grad_norm': 1.5530480359895966, 'learning_rate': 7.063368249825855e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71842 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94594 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8479/22095 [14:12:57<17:37:16, 4.66s/it] {'loss': 0.4856, 'grad_norm': 0.5846369271162104, 'learning_rate': 7.062700625333958e-06, 'epoch': 0.38}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8397387 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
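The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` lines are the standard Hugging Face tokenizer warning: a sample tokenized past the model's 40 960-token context window. One way to keep such samples out of the batch (not shown in the log; this is a hypothetical pre-filter, with the tokenizer abstracted to any callable that returns token ids) is to measure lengths before collation:

```python
MAX_SEQ_LEN = 40960  # model maximum, taken from the warnings above


def is_within_limit(token_ids, max_len=MAX_SEQ_LEN):
    """True if a tokenized sample fits the model's context window."""
    return len(token_ids) <= max_len


def filter_overlong(samples, tokenize, max_len=MAX_SEQ_LEN):
    """Split samples into (kept, dropped_lengths) by tokenized length.

    `tokenize` maps a sample's text to a list of token ids; with real
    data this would be the model's tokenizer, here it can be any
    callable. Each dropped length mirrors the "(N > 40960)" reported
    in the log, so the over-long samples can be audited later.
    """
    kept, dropped = [], []
    for text in samples:
        ids = tokenize(text)
        if is_within_limit(ids, max_len):
            kept.append(text)
        else:
            dropped.append(len(ids))
    return kept, dropped
```

For example, with a toy whitespace "tokenizer" and a limit of 3, `filter_overlong(["a b", "a b c d"], str.split, max_len=3)` keeps the first sample and records length 4 for the second.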
Problematic sample: {'id': 64241, 'image': 'vrdu_table_final_2/astro-ph.EP/0e8db629-f054-4670-8183-4991d00110e0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8480/22095 [14:13:04<20:46:41, 5.49s/it] {'loss': 0.4911, 'grad_norm': 0.5626364168272818, 'learning_rate': 7.0620329565209625e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 38%|███▊ | 8481/22095 [14:13:07<18:21:07, 4.85s/it] {'loss': 0.3438, 'grad_norm': 0.6561269380631167, 'learning_rate': 7.06136524340121e-06, 'epoch': 0.38}
 38%|███▊ | 8482/22095 [14:13:12<18:10:02, 4.80s/it] {'loss': 0.3875, 'grad_norm': 0.7824785353303435, 'learning_rate': 7.06069748598905e-06, 'epoch': 0.38}
 38%|███▊ | 8483/22095 [14:13:15<15:53:19, 4.20s/it] {'loss': 0.3339, 'grad_norm': 0.6290979318250164, 'learning_rate': 7.0600296842988305e-06, 'epoch': 0.38}
 38%|███▊ | 8484/22095 [14:13:18<14:42:42, 3.89s/it] {'loss': 0.3701, 'grad_norm': 0.6420260562777924, 'learning_rate': 7.0593618383448995e-06, 'epoch': 0.38}
 38%|███▊ | 8485/22095 [14:13:22<15:02:55, 3.98s/it] {'loss': 0.3389, 'grad_norm': 0.6896202946580808, 'learning_rate': 7.0586939481416065e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (49032 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118188 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8486/22095 [14:13:25<13:40:09, 3.62s/it] {'loss': 0.3515, 'grad_norm': 0.6372493711656572, 'learning_rate': 7.058026013703304e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 38%|███▊ | 8487/22095 [14:13:35<20:37:44, 5.46s/it] {'loss': 0.4759, 'grad_norm': 0.7460403703740545, 'learning_rate': 7.057358035044344e-06, 'epoch': 0.38}
 38%|███▊ | 8488/22095 [14:13:40<19:57:58, 5.28s/it] {'loss': 0.3474, 'grad_norm': 0.6708392710946648, 'learning_rate': 7.0566900121790775e-06, 'epoch': 0.38}
 38%|███▊ | 8489/22095 [14:13:42<17:12:14, 4.55s/it] {'loss': 0.3688, 'grad_norm': 0.6826571122233077, 'learning_rate': 7.05602194512186e-06, 'epoch': 0.38}
 38%|███▊ | 8490/22095 [14:13:45<15:21:27, 4.06s/it] {'loss': 0.3285, 'grad_norm': 0.6349015181977148, 'learning_rate': 7.055353833887045e-06, 'epoch': 0.38}
 38%|███▊ | 8491/22095 [14:13:49<15:21:12, 4.06s/it] {'loss': 0.3661, 'grad_norm': 0.6544983112078551, 'learning_rate': 7.054685678488991e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8492/22095 [14:13:53<14:37:39, 3.87s/it] {'loss': 0.3361, 'grad_norm': 0.6218008745167531, 'learning_rate': 7.054017478942048e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (47795 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55653 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54877 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70374 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103264 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8493/22095 [14:13:56<13:49:19, 3.66s/it] {'loss': 0.3383, 'grad_norm': 0.6182357777520696, 'learning_rate': 7.05334923526058e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 38%|███▊ | 8494/22095 [14:14:06<21:27:45, 5.68s/it] {'loss': 0.4587, 'grad_norm': 0.37265872321485444, 'learning_rate': 7.052680947458944e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (83160 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51890 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77525 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61444 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59799 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8495/22095 [14:14:10<19:18:47, 5.11s/it] {'loss': 0.3411, 'grad_norm': 0.7152783685098152, 'learning_rate': 7.052012615551498e-06, 'epoch': 0.38}
 38%|███▊ | 8496/22095 [14:14:14<18:07:52, 4.80s/it] {'loss': 0.3328, 'grad_norm': 0.6712838532770757, 'learning_rate': 7.051344239552603e-06, 'epoch': 0.38}
 38%|███▊ | 8497/22095 [14:14:17<15:52:08, 4.20s/it] {'loss': 0.3337, 'grad_norm': 0.6847543279933849, 'learning_rate': 7.050675819476623e-06, 'epoch': 0.38}
 38%|███▊ | 8498/22095 [14:14:20<14:52:27, 3.94s/it] {'loss': 0.3469, 'grad_norm': 0.6356767753615933, 'learning_rate': 7.0500073553379136e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (48128 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74331 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8499/22095 [14:14:24<14:11:31, 3.76s/it] {'loss': 0.364, 'grad_norm': 0.6343150294252268, 'learning_rate': 7.049338847150845e-06, 'epoch': 0.38}
 38%|███▊ | 8500/22095 [14:14:27<14:01:42, 3.71s/it] {'loss': 0.3565, 'grad_norm': 0.6221401451155776, 'learning_rate': 7.048670294929777e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 38%|███▊ | 8501/22095 [14:14:37<20:33:01, 5.44s/it] {'loss': 0.4921, 'grad_norm': 0.38006128196027994, 'learning_rate': 7.0480016986890775e-06, 'epoch': 0.38}
 38%|███▊ | 8502/22095 [14:14:40<18:32:57, 4.91s/it] {'loss': 0.3808, 'grad_norm': 0.6303947949330196, 'learning_rate': 7.047333058443111e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44627 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8503/22095 [14:14:51<25:26:33, 6.74s/it] {'loss': 0.4623, 'grad_norm': 0.3335813258860991, 'learning_rate': 7.046664374206246e-06, 'epoch': 0.38}
 38%|███▊ | 8504/22095 [14:14:55<21:54:08, 5.80s/it] {'loss': 0.3664, 'grad_norm': 0.6438951118125865, 'learning_rate': 7.045995645992848e-06, 'epoch': 0.38}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 38%|███▊ | 8505/22095 [14:14:58<18:39:26, 4.94s/it] {'loss': 0.3756, 'grad_norm': 1.1040704150941836, 'learning_rate': 7.045326873817289e-06, 'epoch': 0.38}
Token indices sequence length is longer than the specified maximum sequence length for this model (52571 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78598 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83301 > 40960). Running this sequence through the model will result in indexing errors
 38%|███▊ | 8506/22095 [14:15:01<16:32:32, 4.38s/it] {'loss': 0.3627, 'grad_norm': 0.7376731985285496, 'learning_rate': 7.0446580576939346e-06, 'epoch': 0.38}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44142 > 40960). Running this sequence through the model will result in indexing errors
 39%|███▊ | 8507/22095 [14:15:10<21:28:59, 5.69s/it] {'loss': 0.5044, 'grad_norm': 0.31769986927945976, 'learning_rate': 7.043989197637161e-06, 'epoch': 0.39}
 39%|███▊ | 8508/22095 [14:15:14<19:22:58, 5.14s/it] {'loss': 0.3845, 'grad_norm': 0.6709946462444102, 'learning_rate': 7.043320293661335e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358147 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
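The paired `Rank 0: Number of image tokens N does not match number of images M` / `Rank 0: Fixed image tokens in the conversation` messages scattered through the log show the loader reconciling `<image>` placeholders in a conversation with the images actually attached (both too few, `0 vs 1`, and too many, `2 vs 1`, occur above). How `data_qwen_2.py` performs the repair is not visible in the log; one hypothetical sketch removes surplus placeholders from the end and prepends missing ones to the first turn:

```python
IMAGE_TOKEN = "<image>"  # placeholder assumed from common VLM conversation formats


def fix_image_tokens(conversations, num_images):
    """Make the number of IMAGE_TOKEN placeholders match num_images.

    `conversations` is a list of {'from': ..., 'value': ...} turns, as
    in the problematic samples logged above. Extra placeholders are
    removed from the last turns backwards; missing ones are prepended
    to the first turn. Returns (fixed_conversations, was_fixed).
    """
    count = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    if count == num_images:
        return conversations, False
    fixed = [dict(turn) for turn in conversations]  # shallow copies, inputs untouched
    if count > num_images:
        surplus = count - num_images
        for turn in reversed(fixed):
            while surplus and IMAGE_TOKEN in turn["value"]:
                # drop the last placeholder occurrence in this turn
                i = turn["value"].rindex(IMAGE_TOKEN)
                turn["value"] = turn["value"][:i] + turn["value"][i + len(IMAGE_TOKEN):]
                surplus -= 1
    else:
        missing = num_images - count
        fixed[0]["value"] = IMAGE_TOKEN * missing + "\n" + fixed[0]["value"]
    return fixed, True
```

This is only a plausible reading of the log lines, not the project's actual repair logic; the real code may instead re-derive placeholders from the processor's image grid.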
Problematic sample: {'id': 24858, 'image': 'vrdu_table_final_2/astro-ph.CO/ac0b58db-069b-474c-b26f-ae8f88d92cbb.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{1}$\\end{tabular}\n```"}]}
 39%|███▊ | 8509/22095 [14:15:16<16:39:07, 4.41s/it] {'loss': 0.3091, 'grad_norm': 0.6800794170963768, 'learning_rate': 7.0426513457808334e-06, 'epoch': 0.39}
 39%|███▊ | 8510/22095 [14:15:21<16:28:07, 4.36s/it] {'loss': 0.3342, 'grad_norm': 0.6348048129635555, 'learning_rate': 7.041982354010026e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 39%|███▊ | 8511/22095 [14:15:28<19:31:10, 5.17s/it] {'loss': 0.4947, 'grad_norm': 0.3253351493544958, 'learning_rate': 7.041313318363291e-06, 'epoch': 0.39}
 39%|███▊ | 8512/22095 [14:15:31<17:15:42, 4.58s/it] {'loss': 0.3766, 'grad_norm': 0.6502405066779691, 'learning_rate': 7.0406442388550016e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (51589 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48810 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77224 > 40960). Running this sequence through the model will result in indexing errors
 39%|███▊ | 8513/22095 [14:15:35<16:27:32, 4.36s/it] {'loss': 0.3657, 'grad_norm': 0.7237918307112422, 'learning_rate': 7.039975115499534e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957198 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8033, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 10\nB. 8\nC. 7\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:∵AB=20,AD=14,∴BD=AB-AD=20-14=6,∵D为线段BC的中点,∴BC=2BD=12,∴AC=AB-BC=20-12=8.'}]}
 39%|███▊ | 8514/22095 [14:15:45<23:20:05, 6.19s/it] {'loss': 0.5045, 'grad_norm': 0.2887575351537154, 'learning_rate': 7.039305948311268e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (55303 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54154 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86677 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85082 > 40960). Running this sequence through the model will result in indexing errors
 39%|███▊ | 8515/22095 [14:15:55<26:51:56, 7.12s/it] {'loss': 0.4686, 'grad_norm': 0.2819195233385298, 'learning_rate': 7.038636737304578e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 39%|███▊ | 8516/22095 [14:15:58<22:29:50, 5.96s/it] {'loss': 0.2888, 'grad_norm': 0.6932569182724204, 'learning_rate': 7.037967482493848e-06, 'epoch': 0.39}
 39%|███▊ | 8517/22095 [14:16:01<19:31:28, 5.18s/it] {'loss': 0.3678, 'grad_norm': 0.6763109238832349, 'learning_rate': 7.037298183893455e-06, 'epoch': 0.39}
 39%|███▊ | 8518/22095 [14:16:04<17:01:20, 4.51s/it] {'loss': 0.3418, 'grad_norm': 0.6190293228159512, 'learning_rate': 7.036628841517783e-06, 'epoch': 0.39}
 39%|███▊ | 8519/22095 [14:16:07<15:27:41, 4.10s/it] {'loss': 0.2971, 'grad_norm': 0.6782949993433552, 'learning_rate': 7.03595945538121e-06, 'epoch': 0.39}
 39%|███▊ | 8520/22095 [14:16:10<14:17:56, 3.79s/it] {'loss': 0.3748, 'grad_norm': 0.73709177330135, 'learning_rate': 7.035290025498121e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 39%|███▊ | 8521/22095 [14:16:20<20:36:52, 5.47s/it] {'loss': 0.5048, 'grad_norm': 0.34987588010434106, 'learning_rate': 7.0346205518829015e-06, 'epoch': 0.39}
 39%|███▊ | 8522/22095 [14:16:23<18:30:16, 4.91s/it] {'loss': 0.3662, 'grad_norm': 0.6244611959658506, 'learning_rate': 7.033951034549935e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 39%|███▊ | 8523/22095 [14:16:33<23:36:49, 6.26s/it] {'loss': 0.4954, 'grad_norm': 0.30869497121445927, 'learning_rate': 7.033281473513608e-06, 'epoch': 0.39}
 39%|███▊ | 8524/22095 [14:16:36<20:00:45, 5.31s/it] {'loss': 0.3213, 'grad_norm': 0.6578803599813655, 'learning_rate': 7.032611868788306e-06, 'epoch': 0.39}
 39%|███▊ | 8525/22095 [14:16:39<17:30:45, 4.65s/it] {'loss': 0.3556, 'grad_norm': 0.6887830147701085, 'learning_rate': 7.031942220388418e-06, 'epoch': 0.39}
 39%|███▊ | 8526/22095 [14:16:42<16:18:10, 4.33s/it] {'loss': 0.3319, 'grad_norm': 0.6383950468224833, 'learning_rate': 7.031272528328332e-06, 'epoch': 0.39}
 39%|███▊ | 8527/22095 [14:16:46<15:07:40, 4.01s/it] {'loss': 0.3399, 'grad_norm': 0.6276590559652488, 'learning_rate': 7.030602792622439e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 39%|███▊ | 8528/22095 [14:16:49<13:50:52, 3.67s/it] {'loss': 0.3898, 'grad_norm': 0.6297693881364205, 'learning_rate': 7.029933013285127e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047218 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]} 39%|███▊ | 8529/22095 [14:16:53<14:12:14, 3.77s/it] {'loss': 0.3021, 'grad_norm': 0.6446800167295498, 'learning_rate': 7.0292631903307895e-06, 'epoch': 0.39} 39%|███▊ | 8529/22095 [14:16:53<14:12:14, 3.77s/it] 39%|███▊ | 8530/22095 [14:16:57<14:21:36, 3.81s/it] {'loss': 0.3825, 'grad_norm': 0.6276385632566854, 'learning_rate': 7.028593323773819e-06, 'epoch': 0.39} 39%|███▊ | 8530/22095 [14:16:57<14:21:36, 3.81s/it] 39%|███▊ | 8531/22095 [14:17:00<13:43:36, 3.64s/it] {'loss': 0.3399, 'grad_norm': 0.8475822563749026, 'learning_rate': 7.027923413628608e-06, 'epoch': 0.39} 39%|███▊ | 8531/22095 [14:17:00<13:43:36, 3.64s/it] 39%|███▊ | 8532/22095 [14:17:03<13:01:10, 3.46s/it] {'loss': 0.3282, 'grad_norm': 0.5886263362806824, 'learning_rate': 7.027253459909551e-06, 'epoch': 0.39} 39%|███▊ | 8532/22095 [14:17:03<13:01:10, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69850 > 40960). 
Running this sequence through the model will result in indexing errors 39%|███▊ | 8533/22095 [14:17:07<13:22:50, 3.55s/it] {'loss': 0.3683, 'grad_norm': 0.5911191875338679, 'learning_rate': 7.026583462631044e-06, 'epoch': 0.39} 39%|███▊ | 8533/22095 [14:17:07<13:22:50, 3.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954306 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5141, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 6\nB. 2\nC. 8\nD. 
4'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]} 39%|███▊ | 8534/22095 [14:17:10<12:45:05, 3.39s/it] {'loss': 0.3613, 'grad_norm': 0.5994274018430233, 'learning_rate': 7.025913421807482e-06, 'epoch': 0.39} 39%|███▊ | 8534/22095 [14:17:10<12:45:05, 3.39s/it] 39%|███▊ | 8535/22095 [14:17:13<13:04:04, 3.47s/it] {'loss': 0.296, 'grad_norm': 0.6328861964249055, 'learning_rate': 7.025243337453263e-06, 'epoch': 0.39} 39%|███▊ | 8535/22095 [14:17:13<13:04:04, 3.47s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948711 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71864, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB段上有两个点C和D,AD=\\ frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nA. 1\nB. 2\nC. 3\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885570 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8723, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 2\nB. 3\nC. 10\nD. 5'}, {'from': 'gpt', 'value': '【解答】解:(1)由点M、N分别是线段AC、BC的中点,得MC=\\frac{1}{2}AC=\\frac{1}{2}×4=2,NC=\\frac{1}{2}BC=\\frac{1}{2}×6=3.由线段的和差,得MN=MC+NC=2+3=5;'}]} 39%|███▊ | 8536/22095 [14:17:16<12:40:00, 3.36s/it] {'loss': 0.3296, 'grad_norm': 0.7749002298447163, 'learning_rate': 7.024573209582783e-06, 'epoch': 0.39} 39%|███▊ | 8536/22095 [14:17:16<12:40:00, 3.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8306276 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1qDh2f3n.PuJjSZFkXXc_lpXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n劳烦帮忙读出并报告这张图上的文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n八大优点\n性能优越\n寿命长\n04\n05\n持液性高\n安全性高\n03\n06\n华\n芬\n农\n机\n足容量\n内阻小\n02\n07\nSEALED-LEAD-ACIDBATTERY6-FMD-12.0(12V12.0AH/20HR)\nCE\nPb\n维护简单\nPb\n01\n08\n耐高低温\n蓄电池八大优点'}]} 39%|███▊ | 8537/22095 [14:17:20<12:30:33, 3.32s/it] {'loss': 0.3091, 'grad_norm': 0.6125646428390575, 'learning_rate': 7.0239030382104445e-06, 'epoch': 0.39} 39%|███▊ | 8537/22095 [14:17:20<12:30:33, 3.32s/it] 39%|███▊ | 8538/22095 [14:17:22<11:56:25, 3.17s/it] {'loss': 0.345, 'grad_norm': 0.6023846423567603, 'learning_rate': 7.023232823350646e-06, 'epoch': 0.39} 39%|███▊ | 8538/22095 [14:17:22<11:56:25, 3.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73084 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84614 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106283 > 40960). 
Running this sequence through the model will result in indexing errors
39%|███▊ | 8539/22095 [14:17:26<12:37:09, 3.35s/it] {'loss': 0.351, 'grad_norm': 0.6524052798822022, 'learning_rate': 7.022562565017788e-06, 'epoch': 0.39}
39%|███▊ | 8540/22095 [14:17:29<12:02:49, 3.20s/it] {'loss': 0.3365, 'grad_norm': 0.6046955830979769, 'learning_rate': 7.021892263226271e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▊ | 8541/22095 [14:17:36<16:31:18, 4.39s/it] {'loss': 0.5064, 'grad_norm': 0.46937747458957113, 'learning_rate': 7.0212219179904996e-06, 'epoch': 0.39}
39%|███▊ | 8542/22095 [14:17:41<16:40:25, 4.43s/it] {'loss': 0.3157, 'grad_norm': 0.583346339177339, 'learning_rate': 7.020551529324877e-06, 'epoch': 0.39}
39%|███▊ | 8543/22095 [14:17:44<15:08:19, 4.02s/it] {'loss': 0.3187, 'grad_norm': 0.637042671091487, 'learning_rate': 7.019881097243808e-06, 'epoch': 0.39}
39%|███▊ | 8544/22095 [14:17:47<13:54:57, 3.70s/it] {'loss': 0.3934, 'grad_norm': 0.6792643299062936, 'learning_rate': 7.019210621761698e-06, 'epoch': 0.39}
39%|███▊ | 8545/22095 [14:17:51<14:14:54, 3.79s/it] {'loss': 0.3355, 'grad_norm': 0.6082188060902372, 'learning_rate': 7.018540102892952e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▊ | 8546/22095 [14:17:55<14:17:29, 3.80s/it] {'loss': 0.3702, 'grad_norm': 0.6264804930140524, 'learning_rate': 7.017869540651979e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▊ | 8547/22095 [14:18:02<18:12:47, 4.84s/it] {'loss': 0.4738, 'grad_norm': 0.2948388318861362, 'learning_rate': 7.017198935053189e-06, 'epoch': 0.39}
39%|███▊ | 8548/22095 [14:18:05<16:23:33, 4.36s/it] {'loss': 0.3663, 'grad_norm': 0.6400740222345551, 'learning_rate': 7.016528286110986e-06, 'epoch': 0.39}
39%|███▊ | 8549/22095 [14:18:08<15:07:02, 4.02s/it] {'loss': 0.3538, 'grad_norm': 0.6173682417693278, 'learning_rate': 7.0158575938397856e-06, 'epoch': 0.39}
39%|███▊ | 8550/22095 [14:18:12<14:54:58, 3.96s/it] {'loss': 0.377, 'grad_norm': 0.6417881388009231, 'learning_rate': 7.015186858253995e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (88115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80908 > 40960). Running this sequence through the model will result in indexing errors
39%|███▊ | 8551/22095 [14:18:16<14:42:19, 3.91s/it] {'loss': 0.3569, 'grad_norm': 0.7749490309522027, 'learning_rate': 7.01451607936803e-06, 'epoch': 0.39}
39%|███▊ | 8552/22095 [14:18:19<14:16:33, 3.79s/it] {'loss': 0.3697, 'grad_norm': 0.6523873906515983, 'learning_rate': 7.013845257196301e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (146274 > 40960). Running this sequence through the model will result in indexing errors
39%|███▊ | 8553/22095 [14:18:22<13:16:13, 3.53s/it] {'loss': 0.3814, 'grad_norm': 0.6536056364088692, 'learning_rate': 7.013174391753222e-06, 'epoch': 0.39}
39%|███▊ | 8554/22095 [14:18:26<13:48:06, 3.67s/it] {'loss': 0.3477, 'grad_norm': 0.6198002088630558, 'learning_rate': 7.012503483053209e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (74302 > 40960). Running this sequence through the model will result in indexing errors
39%|███▊ | 8555/22095 [14:18:29<13:10:45, 3.50s/it] {'loss': 0.3823, 'grad_norm': 0.6534747588713392, 'learning_rate': 7.0118325311106774e-06, 'epoch': 0.39}
39%|███▊ | 8556/22095 [14:18:33<12:53:33, 3.43s/it] {'loss': 0.3591, 'grad_norm': 0.5953562968043516, 'learning_rate': 7.011161535940042e-06, 'epoch': 0.39}
39%|███▊ | 8557/22095 [14:18:37<13:37:18, 3.62s/it] {'loss': 0.4196, 'grad_norm': 0.7085087214260861, 'learning_rate': 7.0104904975557245e-06, 'epoch': 0.39}
39%|███▊ | 8558/22095 [14:18:40<13:45:49, 3.66s/it] {'loss': 0.3636, 'grad_norm': 0.6173288924149105, 'learning_rate': 7.009819415972136e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▊ | 8559/22095 [14:18:48<17:38:43, 4.69s/it] {'loss': 0.4874, 'grad_norm': 0.39121363517546165, 'learning_rate': 7.009148291203707e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (85833 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126219 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88813 > 40960). Running this sequence through the model will result in indexing errors
39%|███▊ | 8560/22095 [14:18:57<22:50:55, 6.08s/it] {'loss': 0.4868, 'grad_norm': 0.35145612881645605, 'learning_rate': 7.008477123264849e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
39%|███▊ | 8561/22095 [14:19:00<19:54:50, 5.30s/it] {'loss': 0.3342, 'grad_norm': 0.6610092433572923, 'learning_rate': 7.007805912169985e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8909860 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33013, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nA. 10\nB. 12\nC. 16\nD. 9\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
39%|███▉ | 8562/22095 [14:19:10<24:31:01, 6.52s/it] {'loss': 0.4758, 'grad_norm': 0.2939613276299066, 'learning_rate': 7.00713465793354e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
39%|███▉ | 8563/22095 [14:19:14<21:45:31, 5.79s/it] {'loss': 0.3631, 'grad_norm': 0.7630658549598067, 'learning_rate': 7.006463360569935e-06, 'epoch': 0.39}
39%|███▉ | 8564/22095 [14:19:18<19:54:42, 5.30s/it] {'loss': 0.343, 'grad_norm': 0.6235937228967168, 'learning_rate': 7.005792020093596e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (63416 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66211 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8565/22095 [14:19:22<17:55:19, 4.77s/it] {'loss': 0.2882, 'grad_norm': 0.5722417308890443, 'learning_rate': 7.005120636518945e-06, 'epoch': 0.39}
39%|███▉ | 8566/22095 [14:19:25<16:13:46, 4.32s/it] {'loss': 0.3704, 'grad_norm': 0.7099392150866518, 'learning_rate': 7.004449209860411e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965314 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16149, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 6cm\nB. 1cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
39%|███▉ | 8567/22095 [14:20:06<57:39:32, 15.34s/it] {'loss': 0.3677, 'grad_norm': 0.6039098044963439, 'learning_rate': 7.003777740132419e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8568/22095 [14:20:13<48:42:42, 12.96s/it] {'loss': 0.4731, 'grad_norm': 0.4846309482443427, 'learning_rate': 7.003106227349399e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (74669 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8569/22095 [14:20:17<38:09:46, 10.16s/it] {'loss': 0.316, 'grad_norm': 0.6206185445623791, 'learning_rate': 7.002434671525776e-06, 'epoch': 0.39}
39%|███▉ | 8570/22095 [14:20:20<30:42:46, 8.17s/it] {'loss': 0.3469, 'grad_norm': 0.6252646740494667, 'learning_rate': 7.001763072675984e-06, 'epoch': 0.39}
39%|███▉ | 8571/22095 [14:20:24<25:23:03, 6.76s/it] {'loss': 0.3514, 'grad_norm': 0.5977188730549259, 'learning_rate': 7.0010914308144495e-06, 'epoch': 0.39}
39%|███▉ | 8572/22095 [14:20:27<20:58:31, 5.58s/it] {'loss': 0.3466, 'grad_norm': 0.7207959710427151, 'learning_rate': 7.000419745955608e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (56204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48666 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63524 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8573/22095 [14:20:34<23:03:54, 6.14s/it] {'loss': 0.518, 'grad_norm': 0.3444553025472076, 'learning_rate': 6.999748018113889e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8574/22095 [14:20:38<20:07:19, 5.36s/it] {'loss': 0.3185, 'grad_norm': 0.5920163103863829, 'learning_rate': 6.999076247303727e-06, 'epoch': 0.39}
39%|███▉ | 8575/22095 [14:20:41<17:43:36, 4.72s/it] {'loss': 0.3491, 'grad_norm': 0.6407433001736258, 'learning_rate': 6.998404433539556e-06, 'epoch': 0.39}
39%|███▉ | 8576/22095 [14:20:44<16:16:58, 4.34s/it] {'loss': 0.3723, 'grad_norm': 0.6195459294965006, 'learning_rate': 6.997732576835812e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8577/22095 [14:20:52<20:10:09, 5.37s/it] {'loss': 0.494, 'grad_norm': 0.29890055951620087, 'learning_rate': 6.997060677206928e-06, 'epoch': 0.39}
39%|███▉ | 8578/22095 [14:20:55<17:49:52, 4.75s/it] {'loss': 0.3426, 'grad_norm': 0.6119850633665948, 'learning_rate': 6.996388734667347e-06, 'epoch': 0.39}
39%|███▉ | 8579/22095 [14:21:00<17:23:31, 4.63s/it] {'loss': 0.3454, 'grad_norm': 0.7260265378995115, 'learning_rate': 6.995716749231501e-06, 'epoch': 0.39}
39%|███▉ | 8580/22095 [14:21:03<15:46:17, 4.20s/it] {'loss': 0.3197, 'grad_norm':
0.5750007638619133, 'learning_rate': 6.995044720913834e-06, 'epoch': 0.39} 39%|███▉ | 8580/22095 [14:21:03<15:46:17, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8581/22095 [14:21:12<20:45:49, 5.53s/it] {'loss': 0.5086, 'grad_norm': 0.3534386399426898, 'learning_rate': 6.994372649728781e-06, 'epoch': 0.39} 39%|███▉ | 8581/22095 [14:21:12<20:45:49, 5.53s/it] 39%|███▉ | 8582/22095 [14:21:15<18:17:43, 4.87s/it] {'loss': 0.377, 'grad_norm': 0.7465663173943224, 'learning_rate': 6.993700535690786e-06, 'epoch': 0.39} 39%|███▉ | 8582/22095 [14:21:15<18:17:43, 4.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307805 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2AtGqczgy_uJjSZR0XXaK5pXa_!!45757858.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n我想知道这张图中的所有文字写了什么,你能全部识别出来吗?'}, {'from': 'gpt', 'value': '图中所有文字:\n通用平面袋\n无需纹路袋\n加宽加热丝\n全国包邮\nLack\n封口\n多功能真实封装机\n一年保修/七天退换\n正品\n注意事项\nLack\n百信包装'}]} Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8583/22095 [14:21:37<37:06:24, 9.89s/it] {'loss': 0.3151, 'grad_norm': 0.6635549051184639, 'learning_rate': 6.993028378814288e-06, 'epoch': 0.39} 39%|███▉ | 8583/22095 [14:21:37<37:06:24, 9.89s/it] 39%|███▉ | 8584/22095 [14:21:40<29:53:11, 7.96s/it] {'loss': 0.367, 'grad_norm': 0.6300073254729769, 'learning_rate': 6.992356179113735e-06, 'epoch': 0.39} 39%|███▉ | 8584/22095 [14:21:40<29:53:11, 7.96s/it] 39%|███▉ | 8585/22095 [14:22:02<46:11:15, 12.31s/it] {'loss': 0.3958, 'grad_norm': 0.6398951088006489, 'learning_rate': 6.991683936603562e-06, 'epoch': 0.39} 39%|███▉ | 8585/22095 [14:22:03<46:11:15, 12.31s/it] 39%|███▉ | 8586/22095 [14:22:05<35:37:32, 9.49s/it] {'loss': 0.376, 'grad_norm': 0.6160243481480446, 'learning_rate': 6.991011651298223e-06, 'epoch': 0.39} 39%|███▉ | 8586/22095 [14:22:05<35:37:32, 9.49s/it] 39%|███▉ | 8587/22095 [14:22:10<29:38:16, 7.90s/it] {'loss': 0.2957, 'grad_norm': 0.6365641272355594, 'learning_rate': 6.990339323212154e-06, 'epoch': 0.39} 39%|███▉ | 8587/22095 [14:22:10<29:38:16, 7.90s/it] 39%|███▉ | 8588/22095 [14:22:13<24:50:46, 6.62s/it] {'loss': 0.3649, 'grad_norm': 0.640848693473412, 'learning_rate': 6.989666952359809e-06, 'epoch': 0.39} 39%|███▉ | 8588/22095 [14:22:13<24:50:46, 6.62s/it] 39%|███▉ | 8589/22095 [14:22:53<62:13:51, 16.59s/it] {'loss': 0.3437, 'grad_norm': 0.7008769461942187, 'learning_rate': 6.988994538755631e-06, 'epoch': 0.39} 39%|███▉ | 8589/22095 [14:22:53<62:13:51, 16.59s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [364, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8477026 in VC:s3://internvl-moe-sft-data/. Exception: Image size [364, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10762, 'image': 'vrdu_texteq/astro-ph.CO/8a1975f3-bf74-4556-9c92-d917e74a43e6.png', 'image_wh': [[364, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'where the coefficients $c_n$ read:'}]} 39%|███▉ | 8590/22095 [14:23:16<69:51:27, 18.62s/it] {'loss': 0.3158, 'grad_norm': 0.601194932850971, 'learning_rate': 6.988322082414069e-06, 'epoch': 0.39} 39%|███▉ | 8590/22095 [14:23:16<69:51:27, 18.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49969 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83625 > 40960). 
Running this sequence through the model will result in indexing errors 39%|███▉ | 8591/22095 [14:23:56<93:55:25, 25.04s/it] {'loss': 0.3819, 'grad_norm': 0.6295313129212234, 'learning_rate': 6.987649583349572e-06, 'epoch': 0.39} 39%|███▉ | 8591/22095 [14:23:56<93:55:25, 25.04s/it] 39%|███▉ | 8592/22095 [14:24:39<113:28:11, 30.25s/it] {'loss': 0.3393, 'grad_norm': 0.674097830312063, 'learning_rate': 6.98697704157659e-06, 'epoch': 0.39} 39%|███▉ | 8592/22095 [14:24:39<113:28:11, 30.25s/it] 39%|███▉ | 8593/22095 [14:25:19<124:03:10, 33.08s/it] {'loss': 0.3284, 'grad_norm': 0.6002185493604161, 'learning_rate': 6.986304457109574e-06, 'epoch': 0.39} 39%|███▉ | 8593/22095 [14:25:19<124:03:10, 33.08s/it] 39%|███▉ | 8594/22095 [14:25:41<111:56:51, 29.85s/it] {'loss': 0.337, 'grad_norm': 0.669798338747915, 'learning_rate': 6.9856318299629755e-06, 'epoch': 0.39} 39%|███▉ | 8594/22095 [14:25:41<111:56:51, 29.85s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8595/22095 [14:26:02<102:01:48, 27.21s/it] {'loss': 0.3352, 'grad_norm': 0.6334498651193496, 'learning_rate': 6.984959160151248e-06, 'epoch': 0.39} 39%|███▉ | 8595/22095 [14:26:02<102:01:48, 27.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8596/22095 [14:26:05<75:01:52, 20.01s/it] {'loss': 0.3368, 'grad_norm': 0.610674143561136, 'learning_rate': 6.984286447688844e-06, 'epoch': 0.39} 39%|███▉ | 8596/22095 [14:26:05<75:01:52, 20.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 39%|███▉ | 8597/22095 [14:26:14<62:28:59, 16.66s/it] {'loss': 0.4979, 'grad_norm': 0.34689978420777917, 'learning_rate': 6.983613692590219e-06, 'epoch': 0.39} 39%|███▉ | 8597/22095 [14:26:14<62:28:59, 16.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8598/22095 
[14:26:18<47:56:16, 12.79s/it] {'loss': 0.3903, 'grad_norm': 0.6095161885854828, 'learning_rate': 6.9829408948698274e-06, 'epoch': 0.39} 39%|███▉ | 8598/22095 [14:26:18<47:56:16, 12.79s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [678, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8474279 in VC:s3://internvl-moe-sft-data/. Exception: Image size [678, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 120823, 'image': 'vrdu_texteq/astro-ph.CO/fbacb41f-5741-42d9-a65c-929a4ca6dc00.png', 'image_wh': [[678, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'Now we examine the void volume fraction $F_{\\rm v}$ defined as'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 39%|███▉ | 8599/22095 [14:26:47<66:09:02, 17.65s/it] {'loss': 0.4885, 'grad_norm': 0.3022837257767364, 'learning_rate': 6.982268054542127e-06, 'epoch': 0.39} 39%|███▉ | 8599/22095 [14:26:47<66:09:02, 17.65s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8600/22095 [14:27:47<114:14:32, 30.48s/it] {'loss': 0.344, 'grad_norm': 0.6723165210046169, 'learning_rate': 6.981595171621572e-06, 'epoch': 0.39} 39%|███▉ | 8600/22095 [14:27:47<114:14:32, 30.48s/it]VC:s3://ocr/coco/train2014/COCO_train2014_000000558387.jpg 2025-08-28 06:25:45.893322 load time: 1032.34 ms VC:s3://gui-agent/data_20250624/ubuntu/images/libreoffice_calc/5120062c-4ddf-4360-9b26-cbc10880789d/images/step_6.png 2025-08-28 06:25:45.895038 load time: 1040.32 ms 
VC:s3://gui-agent/data_20250612/mac/images/vs_code/f3d618a7-c22f-44f6-9fa4-b1afcbb0d67a/images/step_0.png 2025-08-28 06:25:45.893229 load time: 1022.99 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8601/22095 [14:28:45<144:50:00, 38.64s/it] {'loss': 0.367, 'grad_norm': 0.7296296760079936, 'learning_rate': 6.980922246122626e-06, 'epoch': 0.39} 39%|███▉ | 8601/22095 [14:28:45<144:50:00, 38.64s/it]VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_165310_before_screenshot_sub0.png 2025-08-28 06:26:43.582269 load time: 1033.5 ms 39%|███▉ | 8602/22095 [14:29:07<126:36:37, 33.78s/it] {'loss': 0.337, 'grad_norm': 0.6500560720045101, 'learning_rate': 6.980249278059742e-06, 'epoch': 0.39} 39%|███▉ | 8602/22095 [14:29:07<126:36:37, 33.78s/it]VC:s3://gui/data_20250328/icon_canva/images/mobile_1080x2340_1743150472_canvas.png 2025-08-28 06:27:06.025322 load time: 1030.08 ms 39%|███▉ | 8603/22095 [14:30:08<156:23:52, 41.73s/it] {'loss': 0.3465, 'grad_norm': 0.6182250761658985, 'learning_rate': 6.979576267447385e-06, 'epoch': 0.39} 39%|███▉ | 8603/22095 [14:30:08<156:23:52, 41.73s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_107199.png 2025-08-28 06:28:06.311491 load time: 1031.78 ms 39%|███▉ | 8604/22095 [14:30:46<152:40:00, 40.74s/it] {'loss': 0.3402, 'grad_norm': 0.6968747034851669, 'learning_rate': 6.9789032143000125e-06, 'epoch': 0.39} 39%|███▉ | 8604/22095 [14:30:46<152:40:00, 40.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_81632.png 2025-08-28 06:28:44.727973 load time: 1029.82 ms VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-3_277399093-split-7.jpg 2025-08-28 06:28:44.729995 load time: 1033.16 ms 
VC:s3://gui-agent/data_20250612/web/images/yang_0527174255/10_140_52_49_0527200650/img/7.png 2025-08-28 06:28:44.727787 load time: 1044.08 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8605/22095 [14:31:06<129:51:01, 34.65s/it] {'loss': 0.3708, 'grad_norm': 0.6643017799130763, 'learning_rate': 6.978230118632088e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (57203 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41467 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43258 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8606/22095 [14:31:10<95:16:26, 25.43s/it] {'loss': 0.3538, 'grad_norm': 0.6227026827677045, 'learning_rate': 6.977556980458073e-06, 'epoch': 0.39}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-3_275568409-split-0.jpg 2025-08-28 06:29:09.082560 load time: 1021.77 ms
VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_6029.png 2025-08-28 06:29:09.082980 load time: 1021.22 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_510883.png 2025-08-28 06:29:09.082408 load time: 1024.71 ms
VC:s3://gui-agent/data_20250407/web/images/rottentomatoes_com/trajectory_59/img/step_1.png 2025-08-28 06:29:09.082707 load time: 1084.73 ms
39%|███▉ | 8607/22095 [14:31:50<111:25:57, 29.74s/it] {'loss': 0.3822, 'grad_norm': 0.6742507578789183, 'learning_rate': 6.976883799792434e-06, 'epoch': 0.39}
39%|███▉ | 8608/22095 [14:32:35<128:32:11, 34.31s/it] {'loss': 0.3246, 'grad_norm': 0.6814209280655347, 'learning_rate': 6.9762105766496315e-06, 'epoch': 0.39}
39%|███▉ | 8609/22095 [14:32:38<93:37:53, 24.99s/it] {'loss': 0.334, 'grad_norm': 0.647736370344086, 'learning_rate': 6.975537311044136e-06, 'epoch': 0.39}
39%|███▉ | 8610/22095 [14:33:36<130:00:35, 34.71s/it] {'loss': 0.3683, 'grad_norm': 0.6228079659433321, 'learning_rate': 6.974864002990409e-06, 'epoch': 0.39}
39%|███▉ | 8611/22095 [14:33:40<95:53:40, 25.60s/it] {'loss': 0.3641, 'grad_norm': 0.6380735975515563, 'learning_rate': 6.97419065250292e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (50324 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80534 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42207 > 40960).
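The repeated tokenizer warnings (e.g. 57203 > 40960) mean some samples exceed the model's 40960-token context window and would raise indexing errors if used unmodified. A minimal pre-filter over already-tokenized samples; the data shape here is illustrative, not the trainer's actual pipeline:

```python
# Hypothetical context-window pre-filter: drop samples whose token id
# sequence exceeds the 40960-token limit reported in the warnings above,
# returning the survivors plus a count of what was discarded.
MAX_SEQ_LEN = 40960

def drop_overlong(samples, max_len=MAX_SEQ_LEN):
    """Keep only token id lists that fit the context window."""
    kept, dropped = [], 0
    for ids in samples:
        if len(ids) <= max_len:
            kept.append(ids)
        else:
            dropped += 1
    return kept, dropped
```

Filtering (or truncating) at preprocessing time turns these per-step warnings into a one-time dataset statistic.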
Running this sequence through the model will result in indexing errors
39%|███▉ | 8612/22095 [14:34:39<132:50:52, 35.47s/it] {'loss': 0.3522, 'grad_norm': 1.1646045621050385, 'learning_rate': 6.973517259596138e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8613/22095 [14:35:19<138:46:50, 37.06s/it] {'loss': 0.3384, 'grad_norm': 0.5993642665906557, 'learning_rate': 6.9728438242845295e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8614/22095 [14:35:27<105:40:51, 28.22s/it] {'loss': 0.4898, 'grad_norm': 0.4615567787347811, 'learning_rate': 6.972170346582568e-06, 'epoch': 0.39}
39%|███▉ | 8615/22095 [14:36:48<165:07:17, 44.10s/it] {'loss': 0.3456, 'grad_norm': 0.6679530107934812, 'learning_rate': 6.9714968265047234e-06, 'epoch': 0.39}
VC:s3://gui/aguvis/aguvis-stage2/amex/images/ef79fd119373413abe481911c89bf4b7step15.png 2025-08-28 06:34:46.853791 load time: 1044.85 ms
39%|███▉ | 8616/22095 [14:37:29<161:27:13, 43.12s/it] {'loss': 0.3786, 'grad_norm': 0.6438760491238555, 'learning_rate': 6.9708232640654646e-06, 'epoch': 0.39}
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/53500.jpg 2025-08-28 06:35:27.693625 load time: 1051.18 ms
39%|███▉ | 8617/22095 [14:39:01<216:00:00, 57.69s/it] {'loss': 0.3107, 'grad_norm': 0.6502734654050656, 'learning_rate': 6.9701496592792695e-06, 'epoch': 0.39}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/rico/dataset/image/53799.jpg 2025-08-28 06:36:59.415850 load time: 1031.58 ms
39%|███▉ | 8618/22095 [14:39:24<177:31:21, 47.42s/it] {'loss': 0.3594, 'grad_norm': 0.655581936435849, 'learning_rate': 6.969476012160607e-06, 'epoch': 0.39}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_121/img/step_5.png 2025-08-28 06:37:22.840971 load time: 1045.2 ms
39%|███▉ | 8619/22095 [14:39:46<148:23:40, 39.64s/it] {'loss': 0.3011, 'grad_norm': 0.5840458237076638, 'learning_rate': 6.9688023227239555e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8620/22095 [14:40:13<135:10:11, 36.11s/it] {'loss': 0.4994, 'grad_norm': 0.3520744592844673, 'learning_rate': 6.968128590983787e-06, 'epoch': 0.39}
VC:s3://gui-agent/data_20250630/web/images/yang_0708152720/10_140_52_49_0708161752/img/11.png 2025-08-28 06:38:12.209041 load time: 1018.43 ms
39%|███▉ | 8621/22095 [14:40:35<118:42:28, 31.72s/it] {'loss': 0.3088, 'grad_norm': 0.6239584092422918, 'learning_rate': 6.967454816954581e-06, 'epoch': 0.39}
39%|███▉ | 8622/22095 [14:41:35<150:54:32, 40.32s/it] {'loss': 0.3539, 'grad_norm': 0.5966980508763927, 'learning_rate': 6.966781000650813e-06, 'epoch': 0.39}
39%|███▉ | 8623/22095 [14:42:16<151:38:35, 40.52s/it] {'loss': 0.3503, 'grad_norm': 0.6395866873179475, 'learning_rate': 6.966107142086962e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 108, in __call__
    img_value_str = self._get(fn)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 99, in _get
    return self.client.get(fn)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 517, in get
    data, _ = self.get_with_info(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 514, in get_with_info
    return self._get_local_client().get_with_info(uri, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 478, in get_with_info
    return client.get(filepath), info
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 167, in get
    return self._client.get_object(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 569, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 1023, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
[Try #0] Failed to fetch sample 1034021 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. Exception: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
Problematic sample: {'image': 'bf3459bcaf434803a580fcd36cbe71aestep0.png', 'conversations': [{'from': 'human', 'value': '\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nOpen AP News. Share the link of the first article in the "Business" category\n\nPrevious operations:\nNone'}, {'from': 'gpt', 'value': "\nTap on the AP News app to open it.\n\n\nterminate(status='success')\n"}]}
VC:s3://gui/aguvis/aguvis-stage2/aitw-v1/images/single_2064665811836874972.756-842.Go to Alienware Area-51m R1 and then add to cart_1.jpg 2025-08-28 06:40:15.062597 load time: 1042.73 ms
VC:s3://gui-agent/data_20250421/Android/wuba/Cycle_0_Iter_23/images/screenshot-348-1745116134.0808809-before.png 2025-08-28 06:40:15.062816 load time: 1041.59 ms
39%|███▉ | 8624/22095 [14:43:00<155:23:59, 41.53s/it] {'loss': 0.4116, 'grad_norm': 0.9190596121602028, 'learning_rate': 6.965433241277506e-06, 'epoch': 0.39}
39%|███▉ | 8625/22095 [14:43:39<152:14:53, 40.69s/it] {'loss': 0.3258, 'grad_norm': 0.6402288106541163, 'learning_rate': 6.964759298236927e-06, 'epoch': 0.39}
39%|███▉ | 8626/22095 [14:44:01<131:04:52, 35.04s/it] {'loss': 0.3668, 'grad_norm': 0.6504348325632919, 'learning_rate': 6.964085312979706e-06, 'epoch': 0.39}
39%|███▉ | 8627/22095 [14:45:40<203:03:59, 54.28s/it] {'loss': 0.312, 'grad_norm': 0.6216577096624377, 'learning_rate': 6.963411285520322e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8628/22095 [14:45:43<145:44:54, 38.96s/it] {'loss': 0.3615, 'grad_norm': 0.6559684801547805, 'learning_rate': 6.962737215873261e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (88025 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45271 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80102 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8629/22095 [14:46:09<130:34:58, 34.91s/it] {'loss': 0.3883, 'grad_norm': 0.6719182999512758, 'learning_rate': 6.962063104053003e-06, 'epoch': 0.39}
39%|███▉ | 8630/22095 [14:46:31<116:58:52, 31.28s/it] {'loss': 0.3304, 'grad_norm': 0.6345286639652378, 'learning_rate': 6.961388950074038e-06, 'epoch': 0.39}
39%|███▉ | 8631/22095 [14:46:53<105:41:35, 28.26s/it] {'loss': 0.3543, 'grad_norm': 0.7763447214288505, 'learning_rate': 6.960714753950847e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (58416 > 40960).
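The NoSuchKey traceback earlier in the log surfaces as "[Try #0] Failed to fetch sample …", i.e. the loader distinguishes retryable fetch attempts from permanently missing keys. A sketch of that pattern with the fetch function injected; all names here are illustrative, not the actual `s3_fileio` API:

```python
import time

# Hypothetical retry-then-skip fetch: a permanently missing key (stand-in:
# KeyError, playing the role of botocore's NoSuchKey) fails fast so the
# caller can resample, while transient I/O errors are retried with a delay.
def fetch_with_retry(get_fn, key, tries=3, delay=0.1):
    for _attempt in range(tries):
        try:
            return get_fn(key)
        except KeyError:
            return None          # missing key: not transient, do not retry
        except OSError:
            time.sleep(delay)    # transient I/O error: back off and retry
    return None
```

Returning `None` (rather than raising) lets the dataset's `__getitem__` substitute another sample, which matches the "Failed to fetch sample … [Try #0]" behavior seen above.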
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49427 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8632/22095 [14:47:36<122:18:36, 32.71s/it] {'loss': 0.3274, 'grad_norm': 0.6207165825510894, 'learning_rate': 6.960040515697918e-06, 'epoch': 0.39}
39%|███▉ | 8633/22095 [14:47:57<109:58:16, 29.41s/it] {'loss': 0.3376, 'grad_norm': 0.6863537745023689, 'learning_rate': 6.9593662353297375e-06, 'epoch': 0.39}
39%|███▉ | 8634/22095 [14:48:58<145:27:06, 38.90s/it] {'loss': 0.3066, 'grad_norm': 0.6378501990610004, 'learning_rate': 6.958691912860794e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (62863 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52010 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88325 > 40960).
Running this sequence through the model will result in indexing errors
39%|███▉ | 8635/22095 [14:49:38<145:39:53, 38.96s/it] {'loss': 0.3535, 'grad_norm': 0.618393802299301, 'learning_rate': 6.958017548305578e-06, 'epoch': 0.39}
39%|███▉ | 8636/22095 [14:50:17<145:41:35, 38.97s/it] {'loss': 0.3512, 'grad_norm': 0.6521097916175539, 'learning_rate': 6.95734314167858e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8637/22095 [14:50:45<133:57:38, 35.83s/it] {'loss': 0.4716, 'grad_norm': 0.35284015932868484, 'learning_rate': 6.956668692994286e-06, 'epoch': 0.39}
39%|███▉ | 8638/22095 [14:51:08<118:57:27, 31.82s/it] {'loss': 0.3446, 'grad_norm': 0.6396524024828847, 'learning_rate': 6.955994202267193e-06, 'epoch': 0.39}
39%|███▉ | 8639/22095 [14:51:47<127:45:11, 34.18s/it] {'loss': 0.321, 'grad_norm': 0.6206237459503144, 'learning_rate': 6.955319669511793e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8640/22095 [14:52:16<121:38:51, 32.55s/it] {'loss': 0.4887, 'grad_norm': 0.29714977861285985, 'learning_rate': 6.954645094742577e-06, 'epoch': 0.39}
39%|███▉ | 8641/22095 [14:52:20<89:46:57, 24.02s/it] {'loss': 0.3238, 'grad_norm': 0.6168756752989212, 'learning_rate': 6.9539704779740415e-06, 'epoch': 0.39}
39%|███▉ | 8642/22095 [14:52:42<87:50:03, 23.50s/it] {'loss': 0.3272, 'grad_norm': 0.593369640280946, 'learning_rate': 6.953295819220681e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43057 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44050 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62961 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45541 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77971 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8643/22095 [14:52:52<72:26:26, 19.39s/it] {'loss': 0.4801, 'grad_norm': 0.287094237240238, 'learning_rate': 6.952621118496994e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (41350 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126214 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85703 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69169 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41491 > 40960).
Running this sequence through the model will result in indexing errors
39%|███▉ | 8644/22095 [14:53:18<79:26:05, 21.26s/it] {'loss': 0.3513, 'grad_norm': 0.6246467794851839, 'learning_rate': 6.9519463758174745e-06, 'epoch': 0.39}
39%|███▉ | 8645/22095 [14:53:40<80:11:51, 21.47s/it] {'loss': 0.3522, 'grad_norm': 0.6384595000969184, 'learning_rate': 6.951271591196623e-06, 'epoch': 0.39}
39%|███▉ | 8646/22095 [14:54:20<101:10:25, 27.08s/it] {'loss': 0.3511, 'grad_norm': 0.6290461760291466, 'learning_rate': 6.950596764648938e-06, 'epoch': 0.39}
39%|███▉ | 8647/22095 [14:54:43<96:54:13, 25.94s/it] {'loss': 0.3602, 'grad_norm': 0.6447982036114861, 'learning_rate': 6.9499218961889205e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045996 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 无法确定\nB. 1cm\nC. 4cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
39%|███▉ | 8648/22095 [14:55:23<112:42:49, 30.18s/it] {'loss': 0.3224, 'grad_norm': 0.6259492232560252, 'learning_rate': 6.949246985831069e-06, 'epoch': 0.39}
39%|███▉ | 8649/22095 [14:56:03<122:58:53, 32.93s/it] {'loss': 0.2964, 'grad_norm': 0.644176305316592, 'learning_rate': 6.948572033589887e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893504 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16657, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 4cm\nB. 6cm\nC. 12cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
39%|███▉ | 8650/22095 [14:56:42<130:24:11, 34.92s/it] {'loss': 0.2948, 'grad_norm': 0.6377018378945989, 'learning_rate': 6.9478970394798755e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8651/22095 [14:56:51<101:00:24, 27.05s/it] {'loss': 0.5053, 'grad_norm': 0.34507467046136586, 'learning_rate': 6.9472220035155394e-06, 'epoch': 0.39}
39%|███▉ | 8652/22095 [14:57:17<99:34:39, 26.67s/it] {'loss': 0.4863, 'grad_norm': 0.32562521184348303, 'learning_rate': 6.9465469257113825e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
39%|███▉ | 8653/22095 [14:57:20<73:19:05, 19.64s/it] {'loss': 0.3197, 'grad_norm': 0.7729315088269995, 'learning_rate': 6.945871806081911e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8956349 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7184, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 2cm\nB. 4cm\nC. 1cm\nD. 1.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
39%|███▉ | 8654/22095 [14:58:04<100:48:11, 27.00s/it] {'loss': 0.3031, 'grad_norm': 0.6479846752585515, 'learning_rate': 6.945196644641631e-06, 'epoch': 0.39}
39%|███▉ | 8655/22095 [14:59:09<143:51:32, 38.53s/it] {'loss': 0.3529, 'grad_norm': 0.6411110092285045, 'learning_rate': 6.944521441405049e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [206, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8450123 in VC:s3://internvl-moe-sft-data/. Exception: Image size [206, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49994, 'image': 'vrdu_texteq/astro-ph.CO/27a2774e-c8d9-4343-b5de-32cbf197dad9.png', 'image_wh': [[206, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'with ${\\hat{\\bm x}}_i \\cdot \\delta{\\bm v}_i =0$.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8656/22095 [14:59:31<124:34:42, 33.37s/it] {'loss': 0.3444, 'grad_norm': 0.7137929377143929, 'learning_rate': 6.943846196386673e-06, 'epoch': 0.39}
39%|███▉ | 8657/22095 [14:59:35<91:41:09, 24.56s/it] {'loss': 0.3802, 'grad_norm': 0.6422030826957724, 'learning_rate': 6.943170909601013e-06, 'epoch': 0.39}
39%|███▉ | 8658/22095 [14:59:56<87:24:45, 23.42s/it] {'loss': 0.3379, 'grad_norm': 0.6520353535812056, 'learning_rate': 6.942495581062578e-06, 'epoch': 0.39}
39%|███▉ | 8659/22095 [15:00:17<84:57:46, 22.76s/it] {'loss': 0.3506, 'grad_norm': 0.6514855160159058, 'learning_rate': 6.94182021078588e-06, 'epoch': 0.39}
39%|███▉ | 8660/22095 [15:00:20<63:24:45, 16.99s/it] {'loss': 0.3467, 'grad_norm': 0.6215040971747843, 'learning_rate': 6.941144798785429e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8661/22095 [15:00:25<49:28:32, 13.26s/it] {'loss': 0.3085, 'grad_norm': 0.6040186673839965, 'learning_rate': 6.9404693450757366e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (83615 > 40960).
Running this sequence through the model will result in indexing errors
39%|███▉ | 8662/22095 [15:00:29<38:45:23, 10.39s/it] {'loss': 0.3384, 'grad_norm': 0.6247800042275801, 'learning_rate': 6.939793849671318e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047595 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC=2MC,BC=2CN,由线段的和差得AC-BC=2MC-2NC=2(MC-NC)=2×2=4cm,'}]}
39%|███▉ | 8663/22095 [15:00:32<31:06:19, 8.34s/it] {'loss': 0.3421, 'grad_norm': 0.6908979460847018, 'learning_rate': 6.939118312586688e-06, 'epoch': 0.39}
39%|███▉ | 8664/22095 [15:01:19<74:10:31, 19.88s/it] {'loss': 0.3246, 'grad_norm': 0.610891920990332, 'learning_rate': 6.938442733836361e-06, 'epoch': 0.39}
39%|███▉ | 8665/22095 [15:01:22<55:19:18, 14.83s/it] {'loss': 0.317, 'grad_norm': 0.6193622434028262, 'learning_rate': 6.9377671134348535e-06, 'epoch': 0.39}
39%|███▉ | 8666/22095 [15:01:44<63:32:37, 17.03s/it] {'loss': 0.3232, 'grad_norm': 0.6122163861863653, 'learning_rate': 6.93709145139668e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8667/22095 [15:02:07<69:44:23, 18.70s/it] {'loss': 0.273, 'grad_norm': 0.6713130434464446, 'learning_rate': 6.936415747736363e-06, 'epoch': 0.39}
39%|███▉ | 8668/22095 [15:02:10<52:01:54, 13.95s/it] {'loss': 0.3356, 'grad_norm': 0.5804329365473563, 'learning_rate': 6.935740002468417e-06, 'epoch': 0.39}
39%|███▉ | 8669/22095 [15:02:13<40:08:07, 10.76s/it] {'loss': 0.3619, 'grad_norm': 0.6704525479055023, 'learning_rate': 6.935064215607364e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8670/22095 [15:02:19<34:26:16, 9.23s/it] {'loss': 0.4794, 'grad_norm': 0.5214643061521009, 'learning_rate': 6.934388387167726e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (67406 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64413 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8671/22095 [15:02:22<28:13:36, 7.57s/it] {'loss': 0.3655, 'grad_norm': 0.7548003104336839, 'learning_rate': 6.933712517164019e-06, 'epoch': 0.39}
39%|███▉ | 8672/22095 [15:02:25<23:01:47, 6.18s/it] {'loss': 0.2902, 'grad_norm': 0.6111333579007964, 'learning_rate': 6.933036605610773e-06, 'epoch': 0.39}
39%|███▉ | 8673/22095 [15:02:29<20:08:55, 5.40s/it] {'loss': 0.3162, 'grad_norm': 0.6062223905435875, 'learning_rate': 6.932360652522504e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8674/22095 [15:02:52<40:01:04, 10.73s/it] {'loss': 0.3598, 'grad_norm': 0.6587197514595092, 'learning_rate': 6.93168465791374e-06, 'epoch': 0.39}
39%|███▉ | 8675/22095 [15:03:17<56:19:02, 15.11s/it] {'loss': 0.3691, 'grad_norm': 0.6525325821139949, 'learning_rate': 6.931008621799007e-06, 'epoch': 0.39}
39%|███▉ | 8676/22095 [15:03:21<43:32:54, 11.68s/it] {'loss': 0.3762, 'grad_norm': 0.6237709662578704, 'learning_rate': 6.930332544192829e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (62186 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44675 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44116 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8677/22095 [15:03:43<55:26:00, 14.87s/it] {'loss': 0.3514, 'grad_norm': 0.6245083049472832, 'learning_rate': 6.929656425109731e-06, 'epoch': 0.39}
39%|███▉ | 8678/22095 [15:03:47<42:25:27, 11.38s/it] {'loss': 0.3382, 'grad_norm': 0.6767809573399547, 'learning_rate': 6.9289802645642455e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8679/22095 [15:03:56<40:16:39, 10.81s/it] {'loss': 0.4813, 'grad_norm': 0.41616801093824357, 'learning_rate': 6.928304062570897e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (49987 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44404 > 40960).
Running this sequence through the model will result in indexing errors 39%|███▉ | 8680/22095 [15:03:59<31:52:10, 8.55s/it] {'loss': 0.3297, 'grad_norm': 0.7707353902927744, 'learning_rate': 6.927627819144217e-06, 'epoch': 0.39} 39%|███▉ | 8680/22095 [15:03:59<31:52:10, 8.55s/it] 39%|███▉ | 8681/22095 [15:04:03<26:26:52, 7.10s/it] {'loss': 0.3728, 'grad_norm': 0.6385764521466519, 'learning_rate': 6.926951534298736e-06, 'epoch': 0.39} 39%|███▉ | 8681/22095 [15:04:03<26:26:52, 7.10s/it] 39%|███▉ | 8682/22095 [15:04:06<21:55:58, 5.89s/it] {'loss': 0.3528, 'grad_norm': 0.6644952441279728, 'learning_rate': 6.926275208048984e-06, 'epoch': 0.39} 39%|███▉ | 8682/22095 [15:04:06<21:55:58, 5.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43369 > 40960). Running this sequence through the model will result in indexing errors 39%|███▉ | 8683/22095 [15:04:10<19:40:47, 5.28s/it] {'loss': 0.3732, 'grad_norm': 0.6429728839288491, 'learning_rate': 6.925598840409493e-06, 'epoch': 0.39} 39%|███▉ | 8683/22095 [15:04:10<19:40:47, 5.28s/it] 39%|███▉ | 8684/22095 [15:04:13<17:41:23, 4.75s/it] {'loss': 0.349, 'grad_norm': 0.6111124585554257, 'learning_rate': 6.924922431394798e-06, 'epoch': 0.39} 39%|███▉ | 8684/22095 [15:04:13<17:41:23, 4.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 39%|███▉ | 8685/22095 [15:04:24<24:40:41, 6.63s/it] {'loss': 0.4543, 'grad_norm': 0.2904134706836162, 'learning_rate': 6.924245981019432e-06, 'epoch': 0.39} 39%|███▉ | 8685/22095 [15:04:24<24:40:41, 6.63s/it] 39%|███▉ | 8686/22095 [15:04:29<22:26:16, 6.02s/it] {'loss': 0.365, 'grad_norm': 0.6445296325024885, 'learning_rate': 6.92356948929793e-06, 'epoch': 0.39} 39%|███▉ | 8686/22095 [15:04:29<22:26:16, 6.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44966 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71768 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66779 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103597 > 40960). Running this sequence through the model will result in indexing errors 39%|███▉ | 8687/22095 [15:04:33<19:48:42, 5.32s/it] {'loss': 0.3587, 'grad_norm': 0.622289117555611, 'learning_rate': 6.922892956244827e-06, 'epoch': 0.39} 39%|███▉ | 8687/22095 [15:04:33<19:48:42, 5.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43078 > 40960). 
Running this sequence through the model will result in indexing errors 39%|███▉ | 8688/22095 [15:04:41<23:19:03, 6.26s/it] {'loss': 0.5007, 'grad_norm': 0.2850696554299266, 'learning_rate': 6.92221638187466e-06, 'epoch': 0.39} 39%|███▉ | 8688/22095 [15:04:41<23:19:03, 6.26s/it] 39%|███▉ | 8689/22095 [15:04:45<20:15:34, 5.44s/it] {'loss': 0.3236, 'grad_norm': 0.6122601998647502, 'learning_rate': 6.921539766201967e-06, 'epoch': 0.39} 39%|███▉ | 8689/22095 [15:04:45<20:15:34, 5.44s/it] 39%|███▉ | 8690/22095 [15:04:48<17:30:35, 4.70s/it] {'loss': 0.3563, 'grad_norm': 0.709577675183887, 'learning_rate': 6.920863109241285e-06, 'epoch': 0.39} 39%|███▉ | 8690/22095 [15:04:48<17:30:35, 4.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 39%|███▉ | 8691/22095 [15:04:56<21:18:39, 5.72s/it] {'loss': 0.4917, 'grad_norm': 0.34339414934632356, 'learning_rate': 6.920186411007155e-06, 'epoch': 0.39} 39%|███▉ | 8691/22095 [15:04:56<21:18:39, 5.72s/it] 39%|███▉ | 8692/22095 [15:05:00<19:06:41, 5.13s/it] {'loss': 0.3629, 'grad_norm': 0.7456381059814225, 'learning_rate': 6.919509671514116e-06, 'epoch': 0.39} 39%|███▉ | 8692/22095 [15:05:00<19:06:41, 5.13s/it] 39%|███▉ | 8693/22095 [15:05:03<16:40:43, 4.48s/it] {'loss': 0.3314, 'grad_norm': 0.6745920091564857, 'learning_rate': 6.91883289077671e-06, 'epoch': 0.39} 39%|███▉ | 8693/22095 [15:05:03<16:40:43, 4.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55780 > 40960). 
Running this sequence through the model will result in indexing errors 39%|███▉ | 8694/22095 [15:05:07<16:09:58, 4.34s/it] {'loss': 0.384, 'grad_norm': 0.6477926625245785, 'learning_rate': 6.918156068809479e-06, 'epoch': 0.39} 39%|███▉ | 8694/22095 [15:05:07<16:09:58, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8918152 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 41305, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nA. 8cm\nB. 16cm\nC. 32cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 39%|███▉ | 8695/22095 [15:05:14<19:32:31, 5.25s/it] {'loss': 0.4691, 'grad_norm': 0.2827842762719348, 'learning_rate': 6.917479205626965e-06, 'epoch': 0.39} 39%|███▉ | 8695/22095 [15:05:14<19:32:31, 5.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46921 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80799 > 40960). 
Running this sequence through the model will result in indexing errors
39%|███▉ | 8696/22095 [15:05:20<20:44:31, 5.57s/it] {'loss': 0.4701, 'grad_norm': 0.2858251259802762, 'learning_rate': 6.916802301243711e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (96545 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45332 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8697/22095 [15:05:24<18:25:20, 4.95s/it] {'loss': 0.3364, 'grad_norm': 0.6481176264184711, 'learning_rate': 6.916125355674264e-06, 'epoch': 0.39}
39%|███▉ | 8698/22095 [15:05:27<16:35:58, 4.46s/it] {'loss': 0.3543, 'grad_norm': 0.6563203748286447, 'learning_rate': 6.915448368933166e-06, 'epoch': 0.39}
39%|███▉ | 8699/22095 [15:05:31<15:33:24, 4.18s/it] {'loss': 0.3515, 'grad_norm': 0.6412880506103583, 'learning_rate': 6.914771341034967e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (57585 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72647 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112083 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8700/22095 [15:05:35<16:05:11, 4.32s/it] {'loss': 0.4955, 'grad_norm': 0.29259987008716926, 'learning_rate': 6.914094271994211e-06, 'epoch': 0.39}
39%|███▉ | 8701/22095 [15:05:38<14:49:13, 3.98s/it] {'loss': 0.3612, 'grad_norm': 0.6285536609147793, 'learning_rate': 6.913417161825449e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (88564 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94444 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8702/22095 [15:05:49<22:23:45, 6.02s/it] {'loss': 0.4956, 'grad_norm': 0.2906222931727884, 'learning_rate': 6.912740010543229e-06, 'epoch': 0.39}
39%|███▉ | 8703/22095 [15:05:53<19:28:13, 5.23s/it] {'loss': 0.3475, 'grad_norm': 0.7385466443098395, 'learning_rate': 6.912062818162101e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8704/22095 [15:06:00<21:27:17, 5.77s/it] {'loss': 0.4687, 'grad_norm': 0.2980880050919162, 'learning_rate': 6.911385584696615e-06, 'epoch': 0.39}
39%|███▉ | 8705/22095 [15:06:10<26:08:55, 7.03s/it] {'loss': 0.4903, 'grad_norm': 0.3011430205086542, 'learning_rate': 6.910708310161323e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 364, but got module 1
39%|███▉ | 8706/22095 [15:06:13<22:23:23, 6.02s/it] {'loss': 0.3609, 'grad_norm': 0.6786333339665385, 'learning_rate': 6.910030994570778e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (84681 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8707/22095 [15:06:17<20:00:31, 5.38s/it] {'loss': 0.3728, 'grad_norm': 0.6605605300378226, 'learning_rate': 6.909353637939533e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8708/22095 [15:06:27<24:51:19, 6.68s/it] {'loss': 0.498, 'grad_norm': 0.3013435270774591, 'learning_rate': 6.908676240282141e-06, 'epoch': 0.39}
39%|███▉ | 8709/22095 [15:06:31<21:44:30, 5.85s/it] {'loss': 0.3521, 'grad_norm': 0.6301341462442973, 'learning_rate': 6.907998801613162e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8710/22095 [15:06:40<25:55:55, 6.97s/it] {'loss': 0.4472, 'grad_norm': 0.28087605899378787, 'learning_rate': 6.907321321947146e-06, 'epoch': 0.39}
39%|███▉ | 8711/22095 [15:06:44<22:31:22, 6.06s/it] {'loss': 0.3586, 'grad_norm': 0.7817288977807025, 'learning_rate': 6.906643801298654e-06, 'epoch': 0.39}
39%|███▉ | 8712/22095 [15:06:48<20:22:20, 5.48s/it] {'loss': 0.3506, 'grad_norm': 1.1246420663438081, 'learning_rate': 6.9059662396822415e-06, 'epoch': 0.39}
39%|███▉ | 8713/22095 [15:06:52<18:10:02, 4.89s/it] {'loss': 0.3839, 'grad_norm': 0.6710933590669182, 'learning_rate': 6.905288637112468e-06, 'epoch': 0.39}
39%|███▉ | 8714/22095 [15:06:55<15:50:58, 4.26s/it] {'loss': 0.3217, 'grad_norm': 0.7314548660479658, 'learning_rate': 6.904610993603894e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8715/22095 [15:07:03<20:42:31, 5.57s/it] {'loss': 0.497, 'grad_norm': 0.3041015740060619, 'learning_rate': 6.90393330917108e-06, 'epoch': 0.39}
39%|███▉ | 8716/22095 [15:07:07<18:05:49, 4.87s/it] {'loss': 0.3684, 'grad_norm': 0.6279941312234719, 'learning_rate': 6.903255583828585e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8717/22095 [15:07:10<16:29:20, 4.44s/it] {'loss': 0.3792, 'grad_norm': 0.7164162219907889, 'learning_rate': 6.902577817590975e-06, 'epoch': 0.39}
39%|███▉ | 8718/22095 [15:07:13<14:48:35, 3.99s/it] {'loss': 0.3403, 'grad_norm': 0.6107098021250627, 'learning_rate': 6.901900010472811e-06, 'epoch': 0.39}
Token indices sequence length is longer than the specified maximum sequence length for this model (61952 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75959 > 40960). Running this sequence through the model will result in indexing errors
39%|███▉ | 8719/22095 [15:07:16<14:01:37, 3.78s/it] {'loss': 0.35, 'grad_norm': 0.5809393957416973, 'learning_rate': 6.901222162488655e-06, 'epoch': 0.39}
39%|███▉ | 8720/22095 [15:07:19<13:08:42, 3.54s/it] {'loss': 0.3235, 'grad_norm': 0.6329370924311692, 'learning_rate': 6.9005442736530745e-06, 'epoch': 0.39}
39%|███▉ | 8721/22095 [15:07:23<13:35:49, 3.66s/it] {'loss': 0.3836, 'grad_norm': 0.6260093623120035, 'learning_rate': 6.899866343980635e-06, 'epoch': 0.39}
39%|███▉ | 8722/22095 [15:07:27<14:14:18, 3.83s/it] {'loss': 0.3625, 'grad_norm': 0.6883826862977819, 'learning_rate': 6.899188373485903e-06, 'epoch': 0.39}
39%|███▉ | 8723/22095 [15:07:31<13:35:38, 3.66s/it] {'loss': 0.3133, 'grad_norm': 0.5849484026828342, 'learning_rate': 6.8985103621834455e-06, 'epoch': 0.39}
39%|███▉ | 8724/22095 [15:07:34<13:01:02, 3.50s/it] {'loss': 0.3496, 'grad_norm': 0.623484541700811, 'learning_rate': 6.8978323100878305e-06, 'epoch': 0.39}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8915460 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38613, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 6.5\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
39%|███▉ | 8725/22095 [15:07:37<12:29:39, 3.36s/it] {'loss': 0.3508, 'grad_norm': 0.5962752422609053, 'learning_rate': 6.897154217213629e-06, 'epoch': 0.39}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
39%|███▉ | 8726/22095 [15:07:40<12:47:28, 3.44s/it] {'loss': 0.3668, 'grad_norm': 0.6707375328358476, 'learning_rate': 6.8964760835754095e-06, 'epoch': 0.39}
Invalidate trace cache @ step 2: expected module 1, but got module 364
39%|███▉ | 8727/22095 [15:07:49<18:03:11, 4.86s/it] {'loss': 0.5035, 'grad_norm': 0.32723708194093143, 'learning_rate': 6.895797909187745e-06, 'epoch': 0.39}
40%|███▉ | 8728/22095 [15:07:53<17:14:08, 4.64s/it] {'loss': 0.3286, 'grad_norm': 0.5904918431761856, 'learning_rate': 6.8951196940652045e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8380333 in VC:s3://internvl-moe-sft-data/. Exception: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47120, 'image': 'vrdu_table_final_2/astro-ph.CO/a13b5d34-3bbf-47ef-8b77-469c47beec40.png', 'image_wh': [[117, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{c} \\emph{Evolution}\\\\ \\fncb\\end{tabular}\n```'}]}
40%|███▉ | 8729/22095 [15:07:56<15:21:44, 4.14s/it] {'loss': 0.3425, 'grad_norm': 0.6513250106633817, 'learning_rate': 6.894441438222362e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (41040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90793 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115420 > 40960). Running this sequence through the model will result in indexing errors
40%|███▉ | 8730/22095 [15:07:59<14:13:00, 3.83s/it] {'loss': 0.3482, 'grad_norm': 0.7455422173357793, 'learning_rate': 6.89376314167379e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (63165 > 40960). Running this sequence through the model will result in indexing errors
40%|███▉ | 8731/22095 [15:08:02<13:58:01, 3.76s/it] {'loss': 0.3727, 'grad_norm': 0.6774642915248136, 'learning_rate': 6.893084804434067e-06, 'epoch': 0.4}
40%|███▉ | 8732/22095 [15:08:05<13:01:06, 3.51s/it] {'loss': 0.3227, 'grad_norm': 0.6096825832240897, 'learning_rate': 6.892406426517764e-06, 'epoch': 0.4}
40%|███▉ | 8733/22095 [15:08:08<12:24:19, 3.34s/it] {'loss': 0.3622, 'grad_norm': 0.666446310795665, 'learning_rate': 6.8917280079394596e-06, 'epoch': 0.4}
40%|███▉ | 8734/22095 [15:08:11<12:14:11, 3.30s/it] {'loss': 0.3496, 'grad_norm': 0.6059322710118061, 'learning_rate': 6.891049548713731e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954493 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5328, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
40%|███▉ | 8735/22095 [15:08:20<18:17:08, 4.93s/it] {'loss': 0.4777, 'grad_norm': 0.3275420726550203, 'learning_rate': 6.8903710488551544e-06, 'epoch': 0.4}
40%|███▉ | 8736/22095 [15:08:24<16:56:27, 4.57s/it] {'loss': 0.382, 'grad_norm': 0.6373966786981582, 'learning_rate': 6.889692508378312e-06, 'epoch': 0.4}
40%|███▉ | 8737/22095 [15:08:27<15:25:52, 4.16s/it] {'loss': 0.3628, 'grad_norm': 0.657915152352585, 'learning_rate': 6.889013927297778e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [103, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8393517 in VC:s3://internvl-moe-sft-data/. Exception: Image size [103, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 60349, 'image': 'vrdu_table_final_2/astro-ph.EP/539cb6e4-cb14-4c15-9c3d-34ac1b3fcad6.png', 'image_wh': [[103, 23]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}l@{}}Latitude\\end{tabular}\n```"}]}
40%|███▉ | 8738/22095 [15:08:38<22:23:44, 6.04s/it] {'loss': 0.4972, 'grad_norm': 0.43447516800031616, 'learning_rate': 6.888335305628138e-06, 'epoch': 0.4}
40%|███▉ | 8739/22095 [15:08:47<26:41:10, 7.19s/it] {'loss': 0.4785, 'grad_norm': 0.294283044657983, 'learning_rate': 6.887656643383972e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
40%|███▉ | 8740/22095 [15:08:51<22:11:59, 5.98s/it] {'loss': 0.3414, 'grad_norm': 0.701929657079196, 'learning_rate': 6.886977940579862e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|███▉ | 8741/22095 [15:08:54<19:24:14, 5.23s/it] {'loss': 0.3436, 'grad_norm': 0.6942881894781736, 'learning_rate': 6.886299197230391e-06, 'epoch': 0.4}
40%|███▉ | 8742/22095 [15:08:59<19:20:16, 5.21s/it] {'loss': 0.3261, 'grad_norm': 0.5906622336773468, 'learning_rate': 6.885620413350145e-06, 'epoch': 0.4}
40%|███▉ | 8743/22095 [15:09:02<16:45:55, 4.52s/it] {'loss': 0.3508, 'grad_norm': 0.732763832599632, 'learning_rate': 6.884941588953706e-06, 'epoch': 0.4}
40%|███▉ | 8744/22095 [15:09:05<15:11:26, 4.10s/it] {'loss': 0.3474, 'grad_norm': 0.7552325739031256, 'learning_rate': 6.884262724055663e-06, 'epoch': 0.4}
40%|███▉ | 8745/22095 [15:09:08<13:51:49, 3.74s/it] {'loss': 0.3221, 'grad_norm': 0.6542164691291017, 'learning_rate': 6.8835838186705985e-06, 'epoch': 0.4}
40%|███▉ | 8746/22095 [15:09:12<13:38:13, 3.68s/it] {'loss': 0.3626, 'grad_norm': 0.6364917758168952, 'learning_rate': 6.8829048728131056e-06, 'epoch': 0.4}
40%|███▉ | 8747/22095 [15:09:15<12:58:11, 3.50s/it] {'loss': 0.3339, 'grad_norm': 0.6307626027957765, 'learning_rate': 6.882225886497768e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|███▉ | 8748/22095 [15:09:19<13:21:30, 3.60s/it] {'loss': 0.3451, 'grad_norm': 0.6846426846072403, 'learning_rate': 6.8815468597391785e-06, 'epoch': 0.4}
40%|███▉ | 8749/22095 [15:09:23<14:28:20, 3.90s/it] {'loss': 0.3539, 'grad_norm': 0.6193145573313631, 'learning_rate': 6.880867792551924e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (100834 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65001 > 40960). Running this sequence through the model will result in indexing errors
40%|███▉ | 8750/22095 [15:09:27<13:56:30, 3.76s/it] {'loss': 0.3727, 'grad_norm': 0.6411114657727086, 'learning_rate': 6.880188684950599e-06, 'epoch': 0.4}
40%|███▉ | 8751/22095 [15:09:30<13:35:18, 3.67s/it] {'loss': 0.3605, 'grad_norm': 0.676463974354279, 'learning_rate': 6.879509536949792e-06, 'epoch': 0.4}
40%|███▉ | 8752/22095 [15:09:33<12:46:01, 3.44s/it] {'loss': 0.3631, 'grad_norm': 0.6642196025191914, 'learning_rate': 6.878830348564098e-06, 'epoch': 0.4}
40%|███▉ | 8753/22095 [15:09:38<13:58:20, 3.77s/it] {'loss': 0.3443, 'grad_norm': 0.6307656132854766, 'learning_rate': 6.878151119808111e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302479 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1MApwitqUQKJjSZFIXXcOkFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这张照片中的所有文本是什么内容?'}, {'from': 'gpt', 'value': '图中所有文字:\n七天包退换永久提供技术支持\n万像素\n1000\n带硬底座\n高清晰\nA4\n添音\n增值发票\n全国联保\n数码'}]}
40%|███▉ | 8754/22095 [15:09:41<14:03:12, 3.79s/it] {'loss': 0.3222, 'grad_norm': 0.6247080085850564, 'learning_rate': 6.8774718506964245e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|███▉ | 8755/22095 [15:09:44<13:00:42, 3.51s/it] {'loss': 0.3564, 'grad_norm': 0.9383560582951295, 'learning_rate': 6.876792541243633e-06, 'epoch': 0.4}
40%|███▉ | 8756/22095 [15:09:47<12:18:52, 3.32s/it] {'loss': 0.364, 'grad_norm': 0.6281135643962057, 'learning_rate': 6.876113191464336e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (44719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44927 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60581 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65946 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42015 > 40960) for 4 sample(s). Truncating to 37522 with 3 samples.
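For context on the two kinds of length messages above: the "Token indices sequence length is longer than the specified maximum" lines are the tokenizer's warning that a sample exceeds the model's 40960-token maximum, while the "Rank 0: ... Truncating" line shows the training loop actually shortening such samples before batching. A minimal sketch of that kind of guard, with purely illustrative names (`guard_length` is not the actual data_qwen_2.py implementation):

```python
# Illustrative sketch of a max-length guard: warn when a tokenized sample
# exceeds the model maximum (40960 in this run) and truncate it so it cannot
# cause out-of-range position/index lookups inside the model.
MAX_LEN = 40960

def guard_length(token_ids, max_len=MAX_LEN):
    """Return token_ids, truncated to max_len with a warning if too long."""
    if len(token_ids) > max_len:
        print(
            f"Token indices sequence length is longer than the specified "
            f"maximum sequence length for this model ({len(token_ids)} > {max_len})."
        )
        return token_ids[:max_len]
    return token_ids
```

Truncating from the right is the simplest policy; a packing-aware loader (as the "for 4 sample(s). Truncating to 37522 with 3 samples" message suggests) would instead drop whole samples from the packed group until it fits.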
Token indices sequence length is longer than the specified maximum sequence length for this model (103594 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54129 > 40960). Running this sequence through the model will result in indexing errors 40%|███▉ | 8757/22095 [15:09:51<12:30:43, 3.38s/it] {'loss': 0.289, 'grad_norm': 0.7782318894620474, 'learning_rate': 6.875433801373128e-06, 'epoch': 0.4} 40%|███▉ | 8757/22095 [15:09:51<12:30:43, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (108888 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91547 > 40960). Running this sequence through the model will result in indexing errors 40%|███▉ | 8758/22095 [15:10:00<18:46:24, 5.07s/it] {'loss': 0.4773, 'grad_norm': 0.4686477527301178, 'learning_rate': 6.8747543709846064e-06, 'epoch': 0.4} 40%|███▉ | 8758/22095 [15:10:00<18:46:24, 5.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308090 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2KjkQgNrI8KJjy0FpXXb5hVXa_!!2805951914.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n列出图片包含的文字内容。'}, {'from': 'gpt', 'value': '图中所有文字:\n超大液晶屏\n10年换新\n防爆型\n预设胎压\n自动冲停\n家车\n两用'}]}
 40%|███▉ | 8759/22095 [15:10:03<17:07:32, 4.62s/it] {'loss': 0.3568, 'grad_norm': 0.5931616218567658, 'learning_rate': 6.8740749003133725e-06, 'epoch': 0.4}
 40%|███▉ | 8760/22095 [15:10:07<15:47:24, 4.26s/it] {'loss': 0.3537, 'grad_norm': 0.6620622851903066, 'learning_rate': 6.873395389374024e-06, 'epoch': 0.4}
 40%|███▉ | 8761/22095 [15:10:10<14:13:04, 3.84s/it] {'loss': 0.3823, 'grad_norm': 0.6560883533650138, 'learning_rate': 6.872715838181161e-06, 'epoch': 0.4}
 40%|███▉ | 8762/22095 [15:10:13<13:29:12, 3.64s/it] {'loss': 0.3656, 'grad_norm': 0.6712679049437771, 'learning_rate': 6.872036246749387e-06, 'epoch': 0.4}
 40%|███▉ | 8763/22095 [15:10:16<12:37:52, 3.41s/it] {'loss': 0.3273, 'grad_norm': 0.576014190977509, 'learning_rate': 6.871356615093306e-06, 'epoch': 0.4}
 40%|███▉ | 8764/22095 [15:10:19<12:48:38, 3.46s/it] {'loss': 0.3706, 'grad_norm': 0.6717922727634008, 'learning_rate': 6.870676943227516e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8765/22095 [15:10:29<20:24:38, 5.51s/it] {'loss': 0.4814, 'grad_norm': 0.40196289686369996, 'learning_rate': 6.869997231166625e-06, 'epoch': 0.4}
 40%|███▉ | 8766/22095 [15:10:39<24:29:31, 6.61s/it] {'loss': 0.4755, 'grad_norm': 0.39845981511037476, 'learning_rate': 6.869317478925236e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (59497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44765 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8767/22095 [15:10:42<21:10:16, 5.72s/it] {'loss': 0.3497, 'grad_norm': 0.7263105208145871, 'learning_rate': 6.8686376865179576e-06, 'epoch': 0.4}
 40%|███▉ | 8768/22095 [15:10:46<18:30:41, 5.00s/it] {'loss': 0.3601, 'grad_norm': 0.6656152094364597, 'learning_rate': 6.867957853959392e-06, 'epoch': 0.4}
 40%|███▉ | 8769/22095 [15:10:49<16:28:29, 4.45s/it] {'loss': 0.3859, 'grad_norm': 0.6096887779698981, 'learning_rate': 6.86727798126415e-06, 'epoch': 0.4}
 40%|███▉ | 8770/22095 [15:10:52<14:57:15, 4.04s/it] {'loss': 0.3555, 'grad_norm': 0.7120773328838061, 'learning_rate': 6.866598068446839e-06, 'epoch': 0.4}
 40%|███▉ | 8771/22095 [15:10:56<14:40:19, 3.96s/it] {'loss': 0.3321, 'grad_norm': 0.9733028374256942, 'learning_rate': 6.8659181155220674e-06, 'epoch': 0.4}
 40%|███▉ | 8772/22095 [15:10:59<13:58:17, 3.78s/it] {'loss': 0.3416, 'grad_norm': 0.7205067740179895, 'learning_rate': 6.865238122504449e-06, 'epoch': 0.4}
 40%|███▉ | 8773/22095 [15:11:03<14:24:14, 3.89s/it] {'loss': 0.3867, 'grad_norm': 0.7190878853947535, 'learning_rate': 6.86455808940859e-06, 'epoch': 0.4}
 40%|███▉ | 8774/22095 [15:11:06<13:07:40, 3.55s/it] {'loss': 0.3064, 'grad_norm': 0.5962105622474269, 'learning_rate': 6.863878016249103e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8775/22095 [15:11:14<17:42:24, 4.79s/it] {'loss': 0.4991, 'grad_norm': 0.4949437489129078, 'learning_rate': 6.8631979030406045e-06, 'epoch': 0.4}
 40%|███▉ | 8776/22095 [15:11:17<16:19:14, 4.41s/it] {'loss': 0.3329, 'grad_norm': 0.7032902018583524, 'learning_rate': 6.862517749797703e-06, 'epoch': 0.4}
 40%|███▉ | 8777/22095 [15:11:21<15:55:03, 4.30s/it] {'loss': 0.3734, 'grad_norm': 0.6427673222676692, 'learning_rate': 6.861837556535018e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8778/22095 [15:11:25<15:03:25, 4.07s/it] {'loss': 0.3366, 'grad_norm': 0.6180141133170143, 'learning_rate': 6.86115732326716e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (90356 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41591 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69236 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44058 > 40960) for 4 sample(s). Truncating to 3098 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (57533 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94653 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8779/22095 [15:11:28<13:47:47, 3.73s/it] {'loss': 0.3522, 'grad_norm': 0.6308046882524578, 'learning_rate': 6.860477050008749e-06, 'epoch': 0.4}
 40%|███▉ | 8780/22095 [15:11:31<13:37:46, 3.69s/it] {'loss': 0.4036, 'grad_norm': 0.6888985871886418, 'learning_rate': 6.859796736774399e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (85503 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8781/22095 [15:11:35<14:03:12, 3.80s/it] {'loss': 0.3372, 'grad_norm': 0.6252466017722336, 'learning_rate': 6.859116383578729e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (91961 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8782/22095 [15:11:40<14:52:37, 4.02s/it] {'loss': 0.3682, 'grad_norm': 0.6339694577973134, 'learning_rate': 6.858435990436357e-06, 'epoch': 0.4}
 40%|███▉ | 8783/22095 [15:11:43<14:15:59, 3.86s/it] {'loss': 0.3422, 'grad_norm': 0.6378813574811948, 'learning_rate': 6.857755557361904e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8309604 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB14.UuSFXXXXc2XVXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n提取这张照片中的文本信息,但不要输出坐标。'}, {'from': 'gpt', 'value': '图中所有文字:\n进口台湾玉线纯手工编织、不褪色、不过敏、无味\n好好干好好赚\n辟邪七星阵\n转运结\n转运结\n四通发达\n四通发达\n腰链可调节3-6厘米\n12生肖转运珠可调节\n生肖猴子!\n默认发\n辟邪七星阵\n7\n天无理由退货\n红玛瑙通过珠宝检测研究中心鉴定'}]}
 40%|███▉ | 8784/22095 [15:11:47<13:41:13, 3.70s/it] {'loss': 0.3562, 'grad_norm': 0.6418165335242025, 'learning_rate': 6.8570750843699906e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8785/22095 [15:11:51<14:42:02, 3.98s/it] {'loss': 0.3942, 'grad_norm': 0.6998441305303985, 'learning_rate': 6.856394571475236e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [562, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8436784 in VC:s3://internvl-moe-sft-data/. Exception: Image size [562, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25533, 'image': 'vrdu_texteq/astro-ph.CO/6ee2fa14-3473-4bf3-9566-a47aac6a4fcc.png', 'image_wh': [[562, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $n_i$ are variables of the new distribution.'}]}
 40%|███▉ | 8786/22095 [15:11:55<14:36:09, 3.95s/it] {'loss': 0.3044, 'grad_norm': 0.6763851297883828, 'learning_rate': 6.855714018692266e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8787/22095 [15:12:05<20:50:00, 5.64s/it] {'loss': 0.4853, 'grad_norm': 0.3356185758187047, 'learning_rate': 6.855033426035698e-06, 'epoch': 0.4}
 40%|███▉ | 8788/22095 [15:12:08<18:17:46, 4.95s/it] {'loss': 0.3482, 'grad_norm': 0.6597605889712057, 'learning_rate': 6.854352793520161e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8337290 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3912, 'image': 'vrdu_table_final_2/astro-ph.CO/3407a9cc-1e77-47ca-a061-a64fd7199e17.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
 40%|███▉ | 8789/22095 [15:12:16<21:12:39, 5.74s/it] {'loss': 0.5004, 'grad_norm': 0.30797573717135995, 'learning_rate': 6.853672121160277e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8790/22095 [15:12:23<22:53:59, 6.20s/it] {'loss': 0.4789, 'grad_norm': 0.29981839225811097, 'learning_rate': 6.852991408970673e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 40%|███▉ | 8791/22095 [15:12:26<19:33:20, 5.29s/it] {'loss': 0.3341, 'grad_norm': 0.7316998984685005, 'learning_rate': 6.852310656965973e-06, 'epoch': 0.4}
 40%|███▉ | 8792/22095 [15:12:30<17:49:53, 4.83s/it] {'loss': 0.3401, 'grad_norm': 0.6594598143900187, 'learning_rate': 6.8516298651608075e-06, 'epoch': 0.4}
 40%|███▉ | 8793/22095 [15:12:33<16:05:37, 4.36s/it] {'loss': 0.2948, 'grad_norm': 0.647731276075123, 'learning_rate': 6.850949033569802e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8794/22095 [15:12:37<15:24:54, 4.17s/it] {'loss': 0.3411, 'grad_norm': 0.6491077344661909, 'learning_rate': 6.850268162207587e-06, 'epoch': 0.4}
 40%|███▉ | 8795/22095 [15:12:41<15:13:18, 4.12s/it] {'loss': 0.3782, 'grad_norm': 0.6955332825784432, 'learning_rate': 6.84958725108879e-06, 'epoch': 0.4}
 40%|███▉ | 8796/22095 [15:12:45<15:00:29, 4.06s/it] {'loss': 0.3427, 'grad_norm': 0.6033407972613656, 'learning_rate': 6.848906300228047e-06, 'epoch': 0.4}
 40%|███▉ | 8797/22095 [15:12:48<14:13:53, 3.85s/it] {'loss': 0.3563, 'grad_norm': 0.6390814926999342, 'learning_rate': 6.8482253096399835e-06, 'epoch': 0.4}
 40%|███▉ | 8798/22095 [15:12:52<14:17:21, 3.87s/it] {'loss': 0.4011, 'grad_norm': 0.8164864194155153, 'learning_rate': 6.847544279339235e-06, 'epoch': 0.4}
 40%|███▉ | 8799/22095 [15:12:56<14:35:10, 3.95s/it] {'loss': 0.3459, 'grad_norm': 0.6229443416891212, 'learning_rate': 6.8468632093404356e-06, 'epoch': 0.4}
 40%|███▉ | 8800/22095 [15:13:00<14:07:18, 3.82s/it] {'loss': 0.3232, 'grad_norm': 0.6137351192779077, 'learning_rate': 6.846182099658216e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (45130 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113724 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120464 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8801/22095 [15:13:03<13:09:39, 3.56s/it] {'loss': 0.3585, 'grad_norm': 0.6480828136217117, 'learning_rate': 6.845500950307215e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (78079 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111029 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8802/22095 [15:13:06<13:25:34, 3.64s/it] {'loss': 0.3287, 'grad_norm': 0.6361253861236457, 'learning_rate': 6.8448197613020664e-06, 'epoch': 0.4}
 40%|███▉ | 8803/22095 [15:13:10<12:59:15, 3.52s/it] {'loss': 0.3557, 'grad_norm': 0.6415046705485342, 'learning_rate': 6.844138532657405e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (99603 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57915 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53854 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94247 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8804/22095 [15:13:13<12:49:17, 3.47s/it] {'loss': 0.3274, 'grad_norm': 0.6278134895515164, 'learning_rate': 6.843457264387874e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877033 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 186, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 40%|███▉ | 8805/22095 [15:13:16<12:19:29, 3.34s/it] {'loss': 0.3383, 'grad_norm': 0.6363884007409067, 'learning_rate': 6.842775956508104e-06, 'epoch': 0.4}
 40%|███▉ | 8806/22095 [15:13:19<11:53:54, 3.22s/it] {'loss': 0.335, 'grad_norm': 0.837352345152312, 'learning_rate': 6.8420946090327416e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8807/22095 [15:13:23<12:48:24, 3.47s/it] {'loss': 0.3698, 'grad_norm': 0.5971161813211178, 'learning_rate': 6.841413221976422e-06, 'epoch': 0.4}
 40%|███▉ | 8808/22095 [15:13:27<13:19:12, 3.61s/it] {'loss': 0.3433, 'grad_norm': 0.7140643777217771, 'learning_rate': 6.840731795353788e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8809/22095 [15:13:35<18:36:36, 5.04s/it] {'loss': 0.4989, 'grad_norm': 0.523852046512976, 'learning_rate': 6.840050329179481e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (49642 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62875 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88273 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8810/22095 [15:13:45<23:31:19, 6.37s/it] {'loss': 0.4669, 'grad_norm': 0.4224525675611614, 'learning_rate': 6.839368823468144e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (57476 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86475 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58328 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54902 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8811/22095 [15:13:48<19:57:17, 5.41s/it] {'loss': 0.3543, 'grad_norm': 0.6361397108087635, 'learning_rate': 6.838687278234419e-06, 'epoch': 0.4}
 40%|███▉ | 8812/22095 [15:13:56<22:30:36, 6.10s/it] {'loss': 0.4774, 'grad_norm': 0.3237349798154476, 'learning_rate': 6.838005693492953e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 40%|███▉ | 8813/22095 [15:13:59<19:29:22, 5.28s/it] {'loss': 0.3643, 'grad_norm': 0.6200239783378564, 'learning_rate': 6.837324069258389e-06, 'epoch': 0.4}
 40%|███▉ | 8814/22095 [15:14:03<17:52:38, 4.85s/it] {'loss': 0.3307, 'grad_norm': 0.742821653188746, 'learning_rate': 6.836642405545374e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365007 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31748, 'image': 'vrdu_table_final_2/astro-ph.CO/d3132c5a-48ee-4ff3-b682-cb067f0860b8.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
 40%|███▉ | 8815/22095 [15:14:07<16:50:25, 4.57s/it] {'loss': 0.3176, 'grad_norm': 0.5872253694957739, 'learning_rate': 6.8359607023685544e-06, 'epoch': 0.4}
 40%|███▉ | 8816/22095 [15:14:11<16:01:58, 4.35s/it] {'loss': 0.3363, 'grad_norm': 0.6693686789127838, 'learning_rate': 6.835278959742577e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893391 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16544, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 3cm\nB. 2cm\nC. 5cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 40%|███▉ | 8817/22095 [15:14:14<14:37:25, 3.96s/it] {'loss': 0.3102, 'grad_norm': 0.5871827507512425, 'learning_rate': 6.8345971776820944e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8818/22095 [15:14:23<20:40:53, 5.61s/it] {'loss': 0.4883, 'grad_norm': 0.6081941918857661, 'learning_rate': 6.833915356201749e-06, 'epoch': 0.4}
 40%|███▉ | 8819/22095 [15:14:33<24:50:58, 6.74s/it] {'loss': 0.5231, 'grad_norm': 0.542069814873775, 'learning_rate': 6.833233495316198e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (58635 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8820/22095 [15:14:36<21:11:02, 5.74s/it] {'loss': 0.3471, 'grad_norm': 0.6396501003092583, 'learning_rate': 6.832551595040089e-06, 'epoch': 0.4}
 40%|███▉ | 8821/22095 [15:14:39<18:27:46, 5.01s/it] {'loss': 0.3832, 'grad_norm': 0.6680543486040922, 'learning_rate': 6.8318696553880736e-06, 'epoch': 0.4}
 40%|███▉ | 8822/22095 [15:14:43<16:43:23, 4.54s/it] {'loss': 0.3086, 'grad_norm': 0.651301525618315, 'learning_rate': 6.831187676374807e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8823/22095 [15:14:46<14:51:09, 4.03s/it] {'loss': 0.3631, 'grad_norm': 0.6727077273084098, 'learning_rate': 6.83050565801494e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|███▉ | 8824/22095 [15:14:49<14:35:02, 3.96s/it] {'loss': 0.3353, 'grad_norm': 0.6163866932954916, 'learning_rate': 6.8298236003231264e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (91480 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8825/22095 [15:14:53<14:07:12, 3.83s/it] {'loss': 0.3233, 'grad_norm': 0.6197188225438527, 'learning_rate': 6.829141503314027e-06, 'epoch': 0.4}
 40%|███▉ | 8826/22095 [15:14:56<13:09:54, 3.57s/it] {'loss': 0.3059, 'grad_norm': 0.6424342060833129, 'learning_rate': 6.8284593670022925e-06, 'epoch': 0.4}
 40%|███▉ | 8827/22095 [15:15:00<13:19:31, 3.62s/it] {'loss': 0.3934, 'grad_norm': 0.6480051962242951, 'learning_rate': 6.827777191402584e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42444 > 40960). Running this sequence through the model will result in indexing errors
 40%|███▉ | 8828/22095 [15:15:08<18:11:17, 4.94s/it] {'loss': 0.5059, 'grad_norm': 0.9945980509456495, 'learning_rate': 6.827094976529555e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [53, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333541 in VC:s3://internvl-moe-sft-data/. Exception: Image size [53, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 149, 'image': 'vrdu_table_final_2/astro-ph.CO/32d69117-df36-4157-b2de-26c8c3b93e9f.png', 'image_wh': [[53, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l}PwS\\end{tabular}\n```"}]}
 40%|███▉ | 8829/22095 [15:15:11<16:32:55, 4.49s/it] {'loss': 0.3237, 'grad_norm': 0.6656078511701508, 'learning_rate': 6.826412722397867e-06, 'epoch': 0.4}
 40%|███▉ | 8830/22095 [15:15:15<15:43:38, 4.27s/it] {'loss': 0.3523, 'grad_norm': 0.6621313999952261, 'learning_rate': 6.8257304290221794e-06, 'epoch': 0.4}
 40%|███▉ | 8831/22095 [15:15:18<14:54:18, 4.05s/it] {'loss': 0.3349, 'grad_norm': 0.6759962831435172, 'learning_rate': 6.8250480964171526e-06, 'epoch': 0.4}
 40%|███▉ | 8832/22095 [15:15:22<14:45:03, 4.00s/it] {'loss': 0.3805, 'grad_norm': 0.6632127212233728, 'learning_rate': 6.824365724597446e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8833/22095 [15:15:28<17:09:34, 4.66s/it] {'loss': 0.4721, 'grad_norm': 0.33370362668382353, 'learning_rate': 6.823683313577725e-06, 'epoch': 0.4}
 40%|███▉ | 8834/22095 [15:15:32<15:55:48, 4.32s/it] {'loss': 0.3339, 'grad_norm': 0.5915639289253449, 'learning_rate': 6.823000863372649e-06, 'epoch': 0.4}
 40%|███▉ | 8835/22095 [15:15:36<15:07:21, 4.11s/it] {'loss': 0.3628, 'grad_norm': 0.6474460312332149, 'learning_rate': 6.822318373996884e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|███▉ | 8836/22095 [15:15:45<20:35:29, 5.59s/it] {'loss': 0.4712, 'grad_norm': 0.3924738120952629, 'learning_rate': 6.8216358454650935e-06, 'epoch': 0.4}
 40%|███▉ | 8837/22095 [15:15:50<20:34:01, 5.58s/it] {'loss': 0.313, 'grad_norm': 0.6035930899859268, 'learning_rate': 6.820953277791944e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (43027 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8838/22095 [15:15:53<17:53:42, 4.86s/it] {'loss': 0.3697, 'grad_norm': 0.6635310443361517, 'learning_rate': 6.8202706709921e-06, 'epoch': 0.4}
 40%|████ | 8839/22095 [15:15:57<16:44:34, 4.55s/it] {'loss': 0.3649, 'grad_norm': 0.6681270122548083, 'learning_rate': 6.81958802508023e-06, 'epoch': 0.4}
 40%|████ | 8840/22095 [15:16:01<15:26:21, 4.19s/it] {'loss': 0.3629, 'grad_norm': 0.5962241863603737, 'learning_rate': 6.818905340071004e-06, 'epoch': 0.4}
 40%|████ | 8841/22095 [15:16:04<14:53:00, 4.04s/it] {'loss': 0.328, 'grad_norm': 0.6215104277591916, 'learning_rate': 6.818222615979087e-06, 'epoch': 0.4}
 40%|████ | 8842/22095 [15:16:08<14:30:02, 3.94s/it] {'loss': 0.3285, 'grad_norm': 0.6175036142464018, 'learning_rate': 6.817539852819149e-06, 'epoch': 0.4}
 40%|████ | 8843/22095 [15:16:11<14:03:34, 3.82s/it] {'loss': 0.3486, 'grad_norm': 0.705849429599764, 'learning_rate': 6.816857050605864e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (53210 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (138784 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41704 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62085 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89563 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118120 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8844/22095 [15:16:15<13:16:39, 3.61s/it] {'loss': 0.3474, 'grad_norm': 0.6307225714621325, 'learning_rate': 6.8161742093539005e-06, 'epoch': 0.4}
 40%|████ | 8845/22095 [15:16:18<13:29:46, 3.67s/it] {'loss': 0.354, 'grad_norm': 0.6726724488003374, 'learning_rate': 6.81549132907793e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|████ | 8846/22095 [15:16:28<19:32:36, 5.31s/it] {'loss': 0.5051, 'grad_norm': 0.48933163712064615, 'learning_rate': 6.814808409792628e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (44864 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108850 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47049 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8847/22095 [15:16:31<17:44:23, 4.82s/it] {'loss': 0.3468, 'grad_norm': 0.6639468435423697, 'learning_rate': 6.814125451512666e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (45475 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82932 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8848/22095 [15:16:35<16:44:02, 4.55s/it] {'loss': 0.3581, 'grad_norm': 0.6380662284908794, 'learning_rate': 6.8134424542527215e-06, 'epoch': 0.4}
 40%|████ | 8849/22095 [15:16:39<15:41:39, 4.27s/it] {'loss': 0.3805, 'grad_norm': 0.6132429048830895, 'learning_rate': 6.812759418027466e-06, 'epoch': 0.4}
 40%|████ | 8850/22095 [15:16:42<15:02:29, 4.09s/it] {'loss': 0.3224, 'grad_norm': 0.6307231742366833, 'learning_rate': 6.812076342851579e-06, 'epoch': 0.4}
 40%|████ | 8851/22095 [15:16:46<14:50:15, 4.03s/it] {'loss': 0.3307, 'grad_norm': 0.6191078091758072, 'learning_rate': 6.811393228739737e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|████ | 8852/22095 [15:16:52<16:27:28, 4.47s/it] {'loss': 0.4633, 'grad_norm': 0.32682970215003815, 'learning_rate': 6.810710075706618e-06, 'epoch': 0.4}
 40%|████ | 8853/22095 [15:16:57<17:00:35, 4.62s/it]
{'loss': 0.3689, 'grad_norm': 0.6713896907402072, 'learning_rate': 6.8100268837669e-06, 'epoch': 0.4} 40%|████ | 8853/22095 [15:16:57<17:00:35, 4.62s/it] 40%|████ | 8854/22095 [15:17:00<15:23:19, 4.18s/it] {'loss': 0.3655, 'grad_norm': 0.66965729091601, 'learning_rate': 6.809343652935263e-06, 'epoch': 0.4} 40%|████ | 8854/22095 [15:17:00<15:23:19, 4.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (54080 > 40960). Running this sequence through the model will result in indexing errors 40%|████ | 8855/22095 [15:17:10<21:22:00, 5.81s/it] {'loss': 0.4726, 'grad_norm': 0.3073650613234617, 'learning_rate': 6.808660383226388e-06, 'epoch': 0.4} 40%|████ | 8855/22095 [15:17:10<21:22:00, 5.81s/it] 40%|████ | 8856/22095 [15:17:19<25:16:30, 6.87s/it] {'loss': 0.4807, 'grad_norm': 0.3061154699957953, 'learning_rate': 6.807977074654957e-06, 'epoch': 0.4} 40%|████ | 8856/22095 [15:17:19<25:16:30, 6.87s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 40%|████ | 8857/22095 [15:17:23<22:21:40, 6.08s/it] {'loss': 0.3332, 'grad_norm': 0.6343734983995349, 'learning_rate': 6.807293727235651e-06, 'epoch': 0.4} 40%|████ | 8857/22095 [15:17:23<22:21:40, 6.08s/it] 40%|████ | 8858/22095 [15:17:27<19:31:48, 5.31s/it] {'loss': 0.3331, 'grad_norm': 0.6222786895743901, 'learning_rate': 6.806610340983154e-06, 'epoch': 0.4} 40%|████ | 8858/22095 [15:17:27<19:31:48, 5.31s/it] 40%|████ | 8859/22095 [15:17:31<18:35:04, 5.05s/it] {'loss': 0.3793, 'grad_norm': 0.5848984802642063, 'learning_rate': 6.8059269159121484e-06, 'epoch': 0.4} 40%|████ | 8859/22095 [15:17:31<18:35:04, 5.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 40%|████ | 8860/22095 [15:17:34<16:17:40, 4.43s/it] {'loss': 0.3325, 'grad_norm': 0.6518113755263946, 'learning_rate': 6.8052434520373204e-06, 'epoch': 0.4} 
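The tokenizer warnings above ("Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)") flag samples that exceed the model's context window; later in the log the loader reports truncating such samples to 40960. A minimal sketch of that kind of length guard, assuming a plain list of token ids — `MAX_MODEL_LEN` and `clip_to_max_len` are illustrative names, not functions from the actual training code:

```python
# Hypothetical length guard mirroring the warnings in this log.
# MAX_MODEL_LEN matches the 40960 limit reported above.
MAX_MODEL_LEN = 40960

def clip_to_max_len(input_ids, max_len=MAX_MODEL_LEN):
    """Warn and truncate when a tokenized sample exceeds the model's context."""
    if len(input_ids) > max_len:
        print(f"Token indices sequence length is longer than the specified "
              f"maximum sequence length for this model ({len(input_ids)} > {max_len}). "
              f"Truncating.")
        return input_ids[:max_len]
    return input_ids
```

Truncating silently can cut off the assistant turn in long multimodal conversations, so in practice such samples are often filtered out at preprocessing time instead.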
40%|████ | 8861/22095 [15:17:37<14:39:11, 3.99s/it] {'loss': 0.348, 'grad_norm': 0.6355972797900781, 'learning_rate': 6.804559949373355e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877034 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 187, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 16cm\nB. 10cm\nC. 5cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
40%|████ | 8862/22095 [15:17:42<15:14:25, 4.15s/it] {'loss': 0.3771, 'grad_norm': 0.6869756142935315, 'learning_rate': 6.803876407934939e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8863/22095 [15:17:52<21:46:26, 5.92s/it] {'loss': 0.4636, 'grad_norm': 0.3713660385866207, 'learning_rate': 6.803192827736758e-06, 'epoch': 0.4}
40%|████ | 8864/22095 [15:17:55<19:20:29, 5.26s/it] {'loss': 0.314, 'grad_norm': 0.649725866030626, 'learning_rate': 6.802509208793502e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (76049, 120996 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8865/22095 [15:17:59<17:23:00, 4.73s/it] {'loss': 0.3146, 'grad_norm': 0.6290007862830288, 'learning_rate': 6.80182555111986e-06, 'epoch': 0.4}
40%|████ | 8866/22095 [15:18:03<16:20:34, 4.45s/it] {'loss': 0.3255, 'grad_norm': 0.6416736202002891, 'learning_rate': 6.80114185473052e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|████ | 8867/22095 [15:18:06<14:58:24, 4.08s/it] {'loss': 0.3269, 'grad_norm': 0.646691722276954, 'learning_rate': 6.800458119640172e-06, 'epoch': 0.4}
40%|████ | 8868/22095 [15:18:09<13:58:20, 3.80s/it] {'loss': 0.3505, 'grad_norm': 0.601916942813255, 'learning_rate': 6.79977434586351e-06, 'epoch': 0.4}
40%|████ | 8869/22095 [15:18:13<13:39:45, 3.72s/it] {'loss': 0.3107, 'grad_norm': 0.6052031851252415, 'learning_rate': 6.799090533415225e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 45, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358038 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 45, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24749, 'image': 'vrdu_table_final_2/astro-ph.CO/bdf77567-3d23-44ce-b540-b9ce6e34e2a1.png', 'image_wh': [[25, 45]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$\\am$, $\\ab$,\\\\ $\\at$, $\\ak$\\end{tabular}\n```"}]}
40%|████ | 8870/22095 [15:18:17<14:26:23, 3.93s/it] {'loss': 0.3625, 'grad_norm': 0.6420105058443195, 'learning_rate': 6.798406682310009e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (88224, 78590, 46440 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8871/22095 [15:18:20<13:26:21, 3.66s/it] {'loss': 0.3505, 'grad_norm': 0.6665437907809222, 'learning_rate': 6.797722792562558e-06, 'epoch': 0.4}
40%|████ | 8872/22095 [15:18:24<13:39:06, 3.72s/it] {'loss': 0.3484, 'grad_norm': 0.6927302844427357, 'learning_rate': 6.797038864187564e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8873/22095 [15:18:33<19:59:30, 5.44s/it] {'loss': 0.4747, 'grad_norm': 0.30249294964986345, 'learning_rate': 6.796354897199726e-06, 'epoch': 0.4}
40%|████ | 8874/22095 [15:18:37<17:33:53, 4.78s/it] {'loss': 0.3636, 'grad_norm': 0.6182420381926265, 'learning_rate': 6.795670891613737e-06, 'epoch': 0.4}
40%|████ | 8875/22095 [15:18:40<15:51:19, 4.32s/it] {'loss': 0.3258, 'grad_norm': 0.6310058170885068, 'learning_rate': 6.794986847444296e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (65650, 43365 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8876/22095 [15:18:43<14:44:49, 4.02s/it] {'loss': 0.3617, 'grad_norm': 0.6888547574317977, 'learning_rate': 6.7943027647061e-06, 'epoch': 0.4}
40%|████ | 8877/22095 [15:18:47<14:34:09, 3.97s/it] {'loss': 0.3105, 'grad_norm': 0.5791059330086225, 'learning_rate': 6.793618643413848e-06, 'epoch': 0.4}
40%|████ | 8878/22095 [15:18:50<13:28:05, 3.67s/it] {'loss': 0.4149, 'grad_norm': 0.6450799542232147, 'learning_rate': 6.792934483582242e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8879/22095 [15:19:00<20:06:17, 5.48s/it] {'loss': 0.4816, 'grad_norm': 0.3069299614041707, 'learning_rate': 6.792250285225978e-06, 'epoch': 0.4}
40%|████ | 8880/22095 [15:19:03<18:18:49, 4.99s/it] {'loss': 0.3549, 'grad_norm': 0.6801140072717393, 'learning_rate': 6.791566048359761e-06, 'epoch': 0.4}
40%|████ | 8881/22095 [15:19:07<17:11:18, 4.68s/it] {'loss': 0.3048, 'grad_norm': 0.6316838717075581, 'learning_rate': 6.7908817729982936e-06, 'epoch': 0.4}
40%|████ | 8882/22095 [15:19:11<16:07:58, 4.40s/it] {'loss': 0.3788, 'grad_norm': 0.6214297573389774, 'learning_rate': 6.790197459156275e-06, 'epoch': 0.4}
40%|████ | 8883/22095 [15:19:15<15:04:50, 4.11s/it] {'loss': 0.3267, 'grad_norm': 0.6234239462090553, 'learning_rate': 6.789513106848412e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8884/22095 [15:19:21<17:38:16, 4.81s/it] {'loss': 0.4826, 'grad_norm': 0.29946524483125314, 'learning_rate': 6.788828716089407e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (45557, 47386, 130555, 64279, 53657, 51079, 77159 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8885/22095 [15:19:25<16:42:20, 4.55s/it] {'loss': 0.3385, 'grad_norm': 0.612541727714678, 'learning_rate': 6.78814428689397e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047943 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 6\nB. 2\nC. 8\nD. 4'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]}
40%|████ | 8886/22095 [15:19:29<15:49:31, 4.31s/it] {'loss': 0.3526, 'grad_norm': 0.5987562227498454, 'learning_rate': 6.787459819276802e-06, 'epoch': 0.4}
40%|████ | 8887/22095 [15:19:32<14:11:53, 3.87s/it] {'loss': 0.3332, 'grad_norm': 0.6365973311112314, 'learning_rate': 6.786775313252611e-06, 'epoch': 0.4}
40%|████ | 8888/22095 [15:19:35<13:46:37, 3.76s/it] {'loss': 0.3377, 'grad_norm': 0.5884045771732902, 'learning_rate': 6.7860907688361084e-06, 'epoch': 0.4}
40%|████ | 8889/22095 [15:19:38<13:26:24, 3.66s/it] {'loss': 0.3605, 'grad_norm': 0.6385536028054486, 'learning_rate': 6.785406186042e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308924 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
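The recurring "Image size [...] is too small. Minimum size is 28" failures above are all raised for samples whose recorded `image_wh` has a side below 28 pixels. One hedged sketch of a pre-filter that would catch these before the dataloader retries them — `MIN_SIDE` and `is_trainable_image` are illustrative names, not part of the actual data_qwen_2.py code:

```python
# Hypothetical pre-filter for the "Image size ... is too small" samples above:
# reject any sample whose stored (width, height) falls below the 28-pixel minimum.
MIN_SIDE = 28

def is_trainable_image(sample):
    """Return True only if every recorded (width, height) meets the minimum side."""
    return all(w >= MIN_SIDE and h >= MIN_SIDE
               for w, h in sample.get("image_wh", []))

samples = [
    {"id": 187, "image_wh": [[157, 19]]},   # 19 px tall -> rejected (as in the log)
    {"id": 1,   "image_wh": [[640, 480]]},  # fine
]
kept = [s for s in samples if is_trainable_image(s)]
```

Filtering offline from the metadata avoids paying a fetch-and-retry round trip per bad sample during training.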
Problematic sample: {'image': 'T1mLyYXvRaXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text information from this photo.'}, {'from': 'gpt', 'value': 'All words in the image:\n1英寸螺口弯针头\n(45度)\n1.5英寸\n鑫源电子产品\n规格:15G-27G\n管长:25mm'}]}
40%|████ | 8890/22095 [15:19:42<13:22:43, 3.65s/it] {'loss': 0.3168, 'grad_norm': 0.5846199958871605, 'learning_rate': 6.7847215648849964e-06, 'epoch': 0.4}
40%|████ | 8891/22095 [15:19:45<12:27:11, 3.40s/it] {'loss': 0.383, 'grad_norm': 0.6891479504414567, 'learning_rate': 6.784036905379807e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (45968, 67745 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8892/22095 [15:19:49<13:22:28, 3.65s/it] {'loss': 0.3155, 'grad_norm': 0.6574568805801104, 'learning_rate': 6.783352207541144e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8893/22095 [15:20:00<21:01:59, 5.74s/it] {'loss': 0.4899, 'grad_norm': 0.34285635218528, 'learning_rate': 6.782667471383719e-06, 'epoch': 0.4}
40%|████ | 8894/22095 [15:20:03<18:31:03, 5.05s/it] {'loss': 0.3361, 'grad_norm': 0.5997713023380722, 'learning_rate': 6.7819826969222465e-06, 'epoch': 0.4}
40%|████ | 8895/22095 [15:20:07<17:00:11, 4.64s/it] {'loss': 0.3324, 'grad_norm': 0.6014580200566753, 'learning_rate': 6.781297884171436e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (58773, 60819, 101679, 54849, 42727 > 40960). Running these sequences through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47392 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
40%|████ | 8896/22095 [15:20:11<16:53:26, 4.61s/it] {'loss': 0.3372, 'grad_norm': 0.7053990750107237, 'learning_rate': 6.780613033146008e-06, 'epoch': 0.4}
40%|████ | 8897/22095 [15:20:15<15:20:22, 4.18s/it] {'loss': 0.3809, 'grad_norm': 0.6005162280243345, 'learning_rate': 6.779928143860672e-06, 'epoch': 0.4}
40%|████ | 8898/22095 [15:20:18<14:09:43, 3.86s/it] {'loss': 0.3489, 'grad_norm': 0.6332156859771381, 'learning_rate': 6.779243216330149e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (43014, 46368, 89010, 49837 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8899/22095 [15:20:21<13:56:00, 3.80s/it] {'loss': 0.3228, 'grad_norm': 0.5848052680601269, 'learning_rate': 6.7785582505691525e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8900/22095 [15:20:31<20:10:40, 5.51s/it] {'loss': 0.4915, 'grad_norm': 0.3297464573181845, 'learning_rate': 6.777873246592403e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified
    visual_processed = processor.preprocess(image, return_tensors="pt")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess
    patches, image_grid_thw = self._preprocess(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess
    resized_height, resized_width = smart_resize(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize
    raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}")
ValueError: height:21 and width:135 must be larger than factor:28
[Try #0] Failed to fetch sample 2128867 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:21 and width:135 must be larger than factor:28
Problematic sample: {'image': 'b740dccee641dd995e5ce727ca3882efdf31feffa6d5688fe120c85e9c186e93.png', 'conversations': [{'from': 'human', 'value': "\nLet me describe the visual characteristics of this Text label:\nThe element is a text label with the words 'Your Library' in white, set against a black background. The font is sans-serif, providing a clean and modern look. It is part of a vertical navigation menu on the left side of the interface, which features other similar text labels and icons.\n\nUsage and purpose of this Text label:\nThe primary function of this element is to navigate the user to their personal library within the application. When clicked, it likely displays the user's saved music, playlists, and other personalized content."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]', 'recipient': 'all', 'end_turn': True}]}
40%|████ | 8901/22095 [15:20:34<18:03:53, 4.93s/it] {'loss': 0.3845, 'grad_norm': 0.6603865664890434, 'learning_rate': 6.777188204414615e-06, 'epoch': 0.4}
40%|████ | 8902/22095 [15:20:37<15:59:26, 4.36s/it] {'loss': 0.3845, 'grad_norm': 0.6331108359086557, 'learning_rate': 6.776503124050514e-06, 'epoch': 0.4}
40%|████ | 8903/22095 [15:20:41<15:31:20, 4.24s/it] {'loss': 0.3873, 'grad_norm': 0.6393073819687393, 'learning_rate': 6.775818005514815e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (57484 > 40960). Running this sequence through the model will result in indexing errors
40%|████ | 8904/22095 [15:20:52<22:07:33, 6.04s/it] {'loss': 0.4827, 'grad_norm': 0.2893071946967361, 'learning_rate': 6.7751328488222414e-06, 'epoch': 0.4}
40%|████ | 8905/22095 [15:20:55<19:16:03, 5.26s/it] {'loss': 0.3215, 'grad_norm': 0.6510819344466393, 'learning_rate': 6.774447653987515e-06, 'epoch': 0.4}
40%|████ | 8906/22095 [15:20:58<16:46:13, 4.58s/it] {'loss': 0.3793, 'grad_norm': 0.7675879619243432, 'learning_rate': 6.773762421025359e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8907/22095 [15:21:08<22:13:11, 6.07s/it] {'loss': 0.4829, 'grad_norm': 0.29502548677594626, 'learning_rate': 6.773077149950494e-06, 'epoch': 0.4}
40%|████ | 8908/22095 [15:21:11<19:18:36, 5.27s/it] {'loss': 0.3047, 'grad_norm': 0.6446546984632263, 'learning_rate': 6.772391840777648e-06, 'epoch': 0.4}
40%|████ | 8909/22095 [15:21:15<17:28:22, 4.77s/it] {'loss': 0.3229, 'grad_norm': 0.7347921913071014, 'learning_rate': 6.771706493521546e-06, 'epoch': 0.4}
40%|████ | 8910/22095 [15:21:19<16:48:26, 4.59s/it] {'loss': 0.3785, 'grad_norm': 0.6707052281228602, 'learning_rate': 6.771021108196912e-06, 'epoch': 0.4}
40%|████ | 8911/22095 [15:21:22<15:19:54, 4.19s/it] {'loss': 0.3774, 'grad_norm': 0.6264078396574432, 'learning_rate': 6.770335684818472e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|████ | 8912/22095 [15:21:29<18:24:39, 5.03s/it] {'loss': 0.4636, 'grad_norm': 0.2943032704082629, 'learning_rate': 6.7696502234009576e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8360552 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27279, 'image': 'vrdu_table_final_2/astro-ph.CO/00e2c001-e0b4-44f0-a5b2-37fb3a5fd272.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
40%|████ | 8913/22095 [15:21:33<17:02:36, 4.65s/it] {'loss': 0.3458, 'grad_norm': 0.6016668486483328, 'learning_rate': 6.768964723959093e-06, 'epoch': 0.4}
40%|████ | 8914/22095 [15:21:36<15:19:00, 4.18s/it] {'loss': 0.3216, 'grad_norm': 0.8107831458714004, 'learning_rate': 6.768279186507611e-06, 'epoch': 0.4}
40%|████ | 8915/22095 [15:21:39<14:27:22, 3.95s/it] {'loss': 0.3378, 'grad_norm': 0.640569774825471, 'learning_rate': 6.7675936110612405e-06, 'epoch': 0.4}
40%|████ | 8916/22095 [15:21:43<14:20:24, 3.92s/it] {'loss': 0.3228, 'grad_norm': 0.5799696440458336, 'learning_rate': 6.766907997634711e-06, 'epoch': 0.4}
40%|████ | 8917/22095 [15:21:46<13:35:36, 3.71s/it] {'loss': 0.3377, 'grad_norm': 0.6273281359680126, 'learning_rate': 6.766222346242755e-06, 'epoch': 0.4}
40%|████ | 8918/22095 [15:21:50<13:31:03, 3.69s/it] {'loss': 0.3666, 'grad_norm': 0.8178978298391293, 'learning_rate': 6.765536656900105e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8919/22095 [15:21:57<16:41:29, 4.56s/it] {'loss': 0.4742, 'grad_norm': 0.318327709742364, 'learning_rate': 6.764850929621496e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|████ | 8920/22095 [15:22:00<15:42:09, 4.29s/it] {'loss': 0.3087, 'grad_norm': 0.5859553890996915, 'learning_rate': 6.764165164421661e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (113698, 124760 > 40960). Running these sequences through the model will result in indexing errors
40%|████ | 8921/22095 [15:22:03<14:18:34, 3.91s/it] {'loss': 0.3398, 'grad_norm': 0.6387102179408355, 'learning_rate': 6.763479361315334e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (51727 > 40960). Running this sequence through the model will result in indexing errors
40%|████ | 8922/22095 [15:22:06<13:18:15, 3.64s/it] {'loss': 0.331, 'grad_norm': 0.5772994972957782, 'learning_rate': 6.762793520317251e-06, 'epoch': 0.4}
40%|████ | 8923/22095 [15:22:10<13:49:12, 3.78s/it] {'loss': 0.3443, 'grad_norm': 0.5899030881067611, 'learning_rate': 6.7621076414421505e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (78548 > 40960). Running this sequence through the model will result in indexing errors
40%|████ | 8924/22095 [15:22:14<13:55:46, 3.81s/it] {'loss': 0.3731, 'grad_norm': 0.669877351584218, 'learning_rate': 6.761421724704768e-06, 'epoch': 0.4}
40%|████ | 8925/22095 [15:22:17<13:03:20, 3.57s/it] {'loss': 0.3242, 'grad_norm': 0.6487265557529385, 'learning_rate': 6.760735770119843e-06, 'epoch': 0.4}
40%|████ | 8926/22095 [15:22:20<12:26:26, 3.40s/it] {'loss': 0.311, 'grad_norm': 0.6209192812599255, 'learning_rate': 6.7600497777021125e-06, 'epoch': 0.4}
40%|████ | 8927/22095 [15:22:23<11:49:10, 3.23s/it] {'loss': 0.2858, 'grad_norm': 0.6228468323866984, 'learning_rate': 6.7593637474663195e-06, 'epoch': 0.4}
40%|████ | 8928/22095 [15:22:27<12:15:12, 3.35s/it] {'loss': 0.369, 'grad_norm': 0.6184542504966413, 'learning_rate': 6.758677679427204e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (65434 > 40960). Running this sequence through the model will result in indexing errors
40%|████ | 8929/22095 [15:22:37<19:27:16, 5.32s/it] {'loss': 0.4819, 'grad_norm': 0.3459435067429982, 'learning_rate': 6.757991573599504e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8899860 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23013, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nA. 16cm\nB. 4cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
40%|████ | 8930/22095 [15:22:41<18:25:20, 5.04s/it] {'loss': 0.3685, 'grad_norm': 0.6301670896170039, 'learning_rate': 6.7573054299979655e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
40%|████ | 8931/22095 [15:22:44<16:18:19, 4.46s/it] {'loss': 0.3526, 'grad_norm': 0.6889789268158945, 'learning_rate': 6.756619248637331e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
40%|████ | 8932/22095 [15:22:54<21:38:39, 5.92s/it] {'loss': 0.4792, 'grad_norm': 0.2939816782313439, 'learning_rate': 6.755933029532342e-06, 'epoch': 0.4}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396966 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63819, 'image': 'vrdu_table_final_2/astro-ph.EP/59d974cc-a259-4f27-a994-fdbd9963c455.png', 'image_wh': [[14, 20]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}y\\end{tabular}\n```"}]}
 40%|████ | 8933/22095 [15:22:57<18:52:42, 5.16s/it] {'loss': 0.3159, 'grad_norm': 0.8951588454851417, 'learning_rate': 6.755246772697748e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (70536 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46877 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8934/22095 [15:23:01<17:23:51, 4.76s/it] {'loss': 0.3452, 'grad_norm': 0.7378167619676134, 'learning_rate': 6.754560478148289e-06, 'epoch': 0.4}
 40%|████ | 8935/22095 [15:23:05<16:51:21, 4.61s/it] {'loss': 0.3129, 'grad_norm': 0.6214978287086871, 'learning_rate': 6.753874145898716e-06, 'epoch': 0.4}
 40%|████ | 8936/22095 [15:23:09<15:39:33, 4.28s/it] {'loss': 0.3781, 'grad_norm': 0.6569656789319982, 'learning_rate': 6.753187775963773e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 40%|████ | 8937/22095 [15:23:19<22:31:26, 6.16s/it] {'loss': 0.4666, 'grad_norm': 0.32579131965062763, 'learning_rate': 6.752501368358209e-06, 'epoch': 0.4}
 40%|████ | 8938/22095 [15:23:24<21:05:05, 5.77s/it] {'loss': 0.368, 'grad_norm': 0.6825464587480135, 'learning_rate': 6.751814923096773e-06, 'epoch': 0.4}
 40%|████ | 8939/22095 [15:23:28<19:08:44, 5.24s/it] {'loss': 0.3341, 'grad_norm': 0.6420022976480055, 'learning_rate': 6.751128440194216e-06, 'epoch': 0.4}
 40%|████ | 8940/22095 [15:23:32<17:27:30, 4.78s/it] {'loss': 0.3565, 'grad_norm': 0.6656853771714591, 'learning_rate': 6.750441919665286e-06, 'epoch': 0.4}
 40%|████ | 8941/22095 [15:23:34<15:20:36, 4.20s/it] {'loss': 0.3591, 'grad_norm': 0.6230241276435788, 'learning_rate': 6.7497553615247355e-06, 'epoch': 0.4}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [123, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8528800 in VC:s3://internvl-moe-sft-data/. Exception: Image size [123, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 84635, 'image': 'vrdu_texteq/astro-ph.CO/3671ac15-de91-45b1-8b5b-77de0a1f4763.png', 'image_wh': [[123, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'with $\\mathscr{S}_{lm}$:'}]}
 40%|████ | 8942/22095 [15:23:44<21:01:37, 5.76s/it] {'loss': 0.4804, 'grad_norm': 0.29656768184411453, 'learning_rate': 6.749068765787316e-06, 'epoch': 0.4}
 40%|████ | 8943/22095 [15:23:47<18:18:43, 5.01s/it] {'loss': 0.3531, 'grad_norm': 0.6631787295713072, 'learning_rate': 6.748382132467781e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (104474 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8944/22095 [15:23:51<16:46:47, 4.59s/it] {'loss': 0.3612, 'grad_norm': 0.616176087322102, 'learning_rate': 6.7476954615808835e-06, 'epoch': 0.4}
 40%|████ | 8945/22095 [15:23:54<14:52:56, 4.07s/it] {'loss': 0.3362, 'grad_norm': 0.6365449640829076, 'learning_rate': 6.747008753141377e-06, 'epoch': 0.4}
 40%|████ | 8946/22095 [15:23:57<14:09:16, 3.88s/it] {'loss': 0.3492, 'grad_norm': 0.646535504292674, 'learning_rate': 6.74632200716402e-06, 'epoch': 0.4}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 40%|████ | 8947/22095 [15:24:01<13:48:53, 3.78s/it] {'loss': 0.3331, 'grad_norm': 0.6643896596439289, 'learning_rate': 6.745635223663565e-06, 'epoch': 0.4}
Token indices sequence length is longer than the specified maximum sequence length for this model (81854 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (134368 > 40960). Running this sequence through the model will result in indexing errors
 40%|████ | 8948/22095 [15:24:03<12:45:45, 3.49s/it] {'loss': 0.3664, 'grad_norm': 0.6066751912169945, 'learning_rate': 6.7449484026547705e-06, 'epoch': 0.4}
 41%|████ | 8949/22095 [15:24:06<12:05:49, 3.31s/it] {'loss': 0.3496, 'grad_norm': 0.6417521544879048, 'learning_rate': 6.744261544152395e-06, 'epoch': 0.41}
 41%|████ | 8950/22095 [15:24:10<12:03:10, 3.30s/it] {'loss': 0.2984, 'grad_norm': 0.6291110101036786, 'learning_rate': 6.743574648171195e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (95762 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85338 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108790 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86160 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44334 > 40960).
Running this sequence through the model will result in indexing errors
 41%|████ | 8951/22095 [15:24:17<16:23:37, 4.49s/it] {'loss': 0.4766, 'grad_norm': 0.3148204187264297, 'learning_rate': 6.7428877147259305e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7323152 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250508_161646_1/images/before_screenshot_1_id_73_internvl_position_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nOutput only the bounding box in your response. Located in the upper portion of the AutoCAD interface, within the ribbon toolbar area. It sits in a row of tabs/buttons, positioned below the main application menu and file operations toolbar, and above the drawing workspace. It's in the central-left area of the top ribbon, within what appears to be the Block definition section."}, {'from': 'gpt', 'value': "Located in the upper portion of the AutoCAD interface, within the ribbon toolbar area. It sits in a row of tabs/buttons, positioned below the main application menu and file operations toolbar, and above the drawing workspace. It's in the central-left area of the top ribbon, within what appears to be the Block definition section.[[306, 215, 348, 226]]"}], 'width': 2704, 'height': 1756}
 41%|████ | 8952/22095 [15:24:24<19:27:02, 5.33s/it] {'loss': 0.4854, 'grad_norm': 0.30458788376917484, 'learning_rate': 6.742200743831364e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 41%|████ | 8953/22095 [15:24:27<17:06:28, 4.69s/it] {'loss': 0.3461, 'grad_norm': 0.6258531165395406, 'learning_rate': 6.741513735502252e-06, 'epoch': 0.41}
 41%|████ | 8954/22095 [15:24:30<15:20:30, 4.20s/it] {'loss': 0.3262, 'grad_norm': 0.6165429605530012, 'learning_rate': 6.740826689753359e-06, 'epoch': 0.41}
 41%|████ | 8955/22095 [15:24:33<13:58:03, 3.83s/it] {'loss': 0.2975, 'grad_norm': 0.6414593000502342, 'learning_rate': 6.740139606599448e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8956/22095 [15:24:42<18:42:46, 5.13s/it] {'loss': 0.493, 'grad_norm': 0.27098158133221006, 'learning_rate': 6.73945248605528e-06, 'epoch': 0.41}
 41%|████ | 8957/22095 [15:24:45<17:07:36, 4.69s/it] {'loss': 0.3464, 'grad_norm': 0.6298077983608283, 'learning_rate': 6.738765328135621e-06, 'epoch': 0.41}
 41%|████ | 8958/22095 [15:24:48<15:34:30, 4.27s/it] {'loss': 0.3491, 'grad_norm': 0.6444828431474185, 'learning_rate': 6.7380781328552346e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8959/22095 [15:24:58<21:50:08, 5.98s/it] {'loss': 0.4622, 'grad_norm': 0.2905457425379055, 'learning_rate': 6.737390900228888e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (46546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46377 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49330 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68514 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8960/22095 [15:25:02<18:47:27, 5.15s/it] {'loss': 0.3944, 'grad_norm': 0.694225612326219, 'learning_rate': 6.736703630271347e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8347882 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 14549, 'image': 'vrdu_table_final_2/astro-ph.CO/eb8a3b6a-8b63-4113-8756-fcc459921732.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
 41%|████ | 8961/22095 [15:25:05<16:45:36, 4.59s/it] {'loss': 0.3612, 'grad_norm': 0.6095087960379569, 'learning_rate': 6.736016322997379e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [556, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8471368 in VC:s3://internvl-moe-sft-data/. Exception: Image size [556, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19301, 'image': 'vrdu_texteq/astro-ph.CO/445cf642-8654-4423-8444-8848d78ee0e0.png', 'image_wh': [[556, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'Since we are in the matter era we set $\\omega_{m}\\equiv 0$.'}]}
 41%|████ | 8962/22095 [15:25:08<15:12:49, 4.17s/it] {'loss': 0.328, 'grad_norm': 0.7793100472578198, 'learning_rate': 6.7353289784217525e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (43260 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8963/22095 [15:25:12<14:35:07, 4.00s/it] {'loss': 0.3221, 'grad_norm': 0.5994659358964869, 'learning_rate': 6.734641596559234e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (61690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72897 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108449 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44043 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80655 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46415 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8964/22095 [15:25:15<14:09:40, 3.88s/it] {'loss': 0.3511, 'grad_norm': 0.7179033980482736, 'learning_rate': 6.733954177424598e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047933 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 5\nB. 6\nC. 6.5\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 41%|████ | 8965/22095 [15:25:19<13:38:25, 3.74s/it] {'loss': 0.3766, 'grad_norm': 0.6262554659821052, 'learning_rate': 6.733266721032609e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [545, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8441049 in VC:s3://internvl-moe-sft-data/. Exception: Image size [545, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4321, 'image': 'vrdu_texteq/astro-ph.CO/d5907639-e0ab-4478-a1fd-7a342535bda0.png', 'image_wh': [[545, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'Hence the theories with $\\lambda<0$ are not viable.'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358144 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24855, 'image': 'vrdu_table_final_2/astro-ph.CO/a727a273-7057-4b2e-949d-847b3cad43da.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
 41%|████ | 8966/22095 [15:25:22<12:57:10, 3.55s/it] {'loss': 0.3574, 'grad_norm': 0.6368406550673319, 'learning_rate': 6.732579227398043e-06, 'epoch': 0.41}
 41%|████ | 8967/22095 [15:25:30<17:47:08, 4.88s/it] {'loss': 0.3499, 'grad_norm': 0.6136063256982253, 'learning_rate': 6.731891696535671e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8968/22095 [15:25:39<22:48:45, 6.26s/it] {'loss': 0.4881, 'grad_norm': 0.33916521123273174, 'learning_rate': 6.731204128460265e-06, 'epoch': 0.41}
 41%|████ | 8969/22095 [15:25:43<20:04:31, 5.51s/it] {'loss': 0.3713, 'grad_norm': 0.6510734811700759, 'learning_rate': 6.730516523186599e-06, 'epoch': 0.41}
 41%|████ | 8970/22095 [15:25:47<18:01:24, 4.94s/it] {'loss': 0.3469, 'grad_norm': 0.6554821074225733, 'learning_rate': 6.729828880729448e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8971/22095 [15:25:52<18:09:31, 4.98s/it] {'loss': 0.4643, 'grad_norm': 0.29512133094170623, 'learning_rate': 6.7291412011035866e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8395631 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 62469, 'image': 'vrdu_table_final_2/astro-ph.EP/13efc73b-3d0f-498b-a7fe-3e28cd8c30f0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
 41%|████ | 8972/22095 [15:26:01<23:05:09, 6.33s/it] {'loss': 0.452, 'grad_norm': 0.2946429814813781, 'learning_rate': 6.728453484323791e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 41%|████ | 8973/22095 [15:26:04<19:37:56, 5.39s/it] {'loss': 0.3698, 'grad_norm': 0.7983353679920422, 'learning_rate': 6.727765730404841e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (42629 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57422 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55909 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8974/22095 [15:26:08<17:16:39, 4.74s/it] {'loss': 0.3641, 'grad_norm': 0.6758477176684954, 'learning_rate': 6.7270779393615095e-06, 'epoch': 0.41}
 41%|████ | 8975/22095 [15:26:11<16:01:11, 4.40s/it] {'loss': 0.2701, 'grad_norm': 0.589628555008071, 'learning_rate': 6.726390111208579e-06, 'epoch': 0.41}
 41%|████ | 8976/22095 [15:26:15<15:01:16, 4.12s/it] {'loss': 0.2902, 'grad_norm': 0.6707201662154986, 'learning_rate': 6.725702245960827e-06, 'epoch': 0.41}
 41%|████ | 8977/22095 [15:26:18<14:01:23, 3.85s/it] {'loss': 0.3468, 'grad_norm': 0.6308128620949878, 'learning_rate': 6.725014343633033e-06, 'epoch': 0.41}
 41%|████ | 8978/22095 [15:26:21<13:05:13, 3.59s/it] {'loss': 0.2778, 'grad_norm': 0.6225704778634961, 'learning_rate': 6.7243264042399795e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 41%|████ | 8979/22095 [15:26:24<12:15:05, 3.36s/it] {'loss': 0.349, 'grad_norm': 0.6543361066301392, 'learning_rate': 6.7236384277964465e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (47210 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8980/22095 [15:26:27<12:32:37, 3.44s/it] {'loss': 0.3477, 'grad_norm': 0.6259696940928321, 'learning_rate': 6.722950414317218e-06, 'epoch': 0.41}
 41%|████ | 8981/22095 [15:26:30<12:08:11, 3.33s/it] {'loss': 0.3543, 'grad_norm': 0.66971951479067, 'learning_rate': 6.722262363817077e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8373197 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39970, 'image': 'vrdu_table_final_2/astro-ph.CO/a747ccd7-e900-4252-970e-00804ef77cfc.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
 41%|████ | 8982/22095 [15:26:40<18:54:26, 5.19s/it] {'loss': 0.4749, 'grad_norm': 0.3774940043813065, 'learning_rate': 6.721574276310807e-06, 'epoch': 0.41}
 41%|████ | 8983/22095 [15:26:43<16:40:34, 4.58s/it] {'loss': 0.3383, 'grad_norm': 0.706138680641507, 'learning_rate': 6.720886151813194e-06, 'epoch': 0.41}
 41%|████ | 8984/22095 [15:26:47<15:43:16, 4.32s/it] {'loss': 0.41, 'grad_norm': 0.6063299043643507, 'learning_rate': 6.720197990339022e-06, 'epoch': 0.41}
 41%|████ | 8985/22095 [15:26:50<14:08:47, 3.88s/it] {'loss': 0.3464, 'grad_norm': 0.6917969396568446, 'learning_rate': 6.719509791903078e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (42282 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48833 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47642 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43205 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71719 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8986/22095 [15:26:53<13:16:23, 3.65s/it] {'loss': 0.3329, 'grad_norm': 0.5925119104848456, 'learning_rate': 6.718821556520151e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (42988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58776 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8987/22095 [15:26:56<13:04:51, 3.59s/it] {'loss': 0.3235, 'grad_norm': 0.7172368950413045, 'learning_rate': 6.718133284205026e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8988/22095 [15:27:06<19:41:33, 5.41s/it] {'loss': 0.482, 'grad_norm': 0.35530670970656764, 'learning_rate': 6.717444974972495e-06, 'epoch': 0.41}
 41%|████ | 8989/22095 [15:27:10<17:48:55, 4.89s/it] {'loss': 0.3467, 'grad_norm': 0.6179833011689086, 'learning_rate': 6.716756628837345e-06, 'epoch': 0.41}
 41%|████ | 8990/22095 [15:27:13<16:23:14, 4.50s/it] {'loss': 0.3698, 'grad_norm': 0.6400629806758316, 'learning_rate': 6.716068245814369e-06, 'epoch': 0.41}
 41%|████ | 8991/22095 [15:27:16<14:56:46, 4.11s/it] {'loss': 0.3034, 'grad_norm': 0.5903144512944397, 'learning_rate': 6.715379825918357e-06, 'epoch': 0.41}
 41%|████ | 8992/22095 [15:27:20<14:04:35, 3.87s/it] {'loss': 0.3332, 'grad_norm': 0.5593568955536351, 'learning_rate': 6.714691369164099e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (113599 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (145224 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8993/22095 [15:27:23<13:39:09, 3.75s/it] {'loss': 0.3597, 'grad_norm': 0.6698171715141726, 'learning_rate': 6.714002875566392e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 41%|████ | 8994/22095 [15:27:33<19:57:24, 5.48s/it] {'loss': 0.4941, 'grad_norm': 0.4575285327706009, 'learning_rate': 6.713314345140025e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 41%|████ | 8995/22095 [15:27:42<24:11:14, 6.65s/it] {'loss': 0.4714, 'grad_norm': 0.29855367789502585, 'learning_rate': 6.712625777899797e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 41%|████ | 8996/22095 [15:27:46<21:00:51, 5.78s/it] {'loss': 0.3387, 'grad_norm': 0.6371976573051277, 'learning_rate': 6.7119371738605e-06, 'epoch': 0.41}
 41%|████ | 8997/22095 [15:27:50<18:54:45, 5.20s/it] {'loss': 0.3558, 'grad_norm': 0.5841837035168561, 'learning_rate': 6.711248533036931e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (83780 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79122 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48648 > 40960). Running this sequence through the model will result in indexing errors
 41%|████ | 8998/22095 [15:27:53<16:47:40, 4.62s/it] {'loss': 0.3371, 'grad_norm': 0.5992793271836137, 'learning_rate': 6.710559855443885e-06, 'epoch': 0.41}
 41%|████ | 8999/22095 [15:27:57<16:20:30, 4.49s/it] {'loss': 0.3865, 'grad_norm': 0.6635349205549409, 'learning_rate': 6.709871141096164e-06, 'epoch': 0.41}
 41%|████ | 9000/22095 [15:28:01<15:34:11, 4.28s/it] {'loss': 0.3479, 'grad_norm': 0.5703140692706825, 'learning_rate': 6.709182390008563e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46461 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52050 >
Running this sequence through the model will result in indexing errors 41%|████ | 9001/22095 [15:28:09<19:34:08, 5.38s/it] {'loss': 0.494, 'grad_norm': 0.3243766357473611, 'learning_rate': 6.70849360219588e-06, 'epoch': 0.41} 41%|████ | 9001/22095 [15:28:09<19:34:08, 5.38s/it] 41%|████ | 9002/22095 [15:28:12<17:17:25, 4.75s/it] {'loss': 0.3719, 'grad_norm': 0.6563326984838999, 'learning_rate': 6.70780477767292e-06, 'epoch': 0.41} 41%|████ | 9002/22095 [15:28:12<17:17:25, 4.75s/it] 41%|████ | 9003/22095 [15:28:15<15:26:51, 4.25s/it] {'loss': 0.3754, 'grad_norm': 0.6227235866029053, 'learning_rate': 6.7071159164544775e-06, 'epoch': 0.41} 41%|████ | 9003/22095 [15:28:15<15:26:51, 4.25s/it] 41%|████ | 9004/22095 [15:28:18<13:57:26, 3.84s/it] {'loss': 0.3696, 'grad_norm': 0.6609459209436451, 'learning_rate': 6.706427018555359e-06, 'epoch': 0.41} 41%|████ | 9004/22095 [15:28:18<13:57:26, 3.84s/it] 41%|████ | 9005/22095 [15:28:21<13:28:43, 3.71s/it] {'loss': 0.2979, 'grad_norm': 0.5852792648844204, 'learning_rate': 6.705738083990363e-06, 'epoch': 0.41} 41%|████ | 9005/22095 [15:28:22<13:28:43, 3.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 41%|████ | 9006/22095 [15:28:31<19:43:59, 5.43s/it] {'loss': 0.5131, 'grad_norm': 0.30829767078260123, 'learning_rate': 6.705049112774295e-06, 'epoch': 0.41} 41%|████ | 9006/22095 [15:28:31<19:43:59, 5.43s/it] 41%|████ | 9007/22095 [15:28:35<17:59:36, 4.95s/it] {'loss': 0.3456, 'grad_norm': 0.6404114897481298, 'learning_rate': 6.704360104921959e-06, 'epoch': 0.41} 41%|████ | 9007/22095 [15:28:35<17:59:36, 4.95s/it] 41%|████ | 9008/22095 [15:28:38<15:52:12, 4.37s/it] {'loss': 0.3363, 'grad_norm': 0.6660237164001803, 'learning_rate': 6.703671060448158e-06, 'epoch': 0.41} 41%|████ | 9008/22095 [15:28:38<15:52:12, 4.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (86137 > 40960). 
Running this sequence through the model will result in indexing errors 41%|████ | 9009/22095 [15:28:43<16:41:57, 4.59s/it] {'loss': 0.4687, 'grad_norm': 0.2834674880371996, 'learning_rate': 6.702981979367699e-06, 'epoch': 0.41} 41%|████ | 9009/22095 [15:28:43<16:41:57, 4.59s/it] 41%|████ | 9010/22095 [15:28:47<15:41:57, 4.32s/it] {'loss': 0.3304, 'grad_norm': 0.5858283436495668, 'learning_rate': 6.7022928616953865e-06, 'epoch': 0.41} 41%|████ | 9010/22095 [15:28:47<15:41:57, 4.32s/it] 41%|████ | 9011/22095 [15:28:50<14:50:23, 4.08s/it] {'loss': 0.3362, 'grad_norm': 0.5975401010177234, 'learning_rate': 6.701603707446029e-06, 'epoch': 0.41} 41%|████ | 9011/22095 [15:28:50<14:50:23, 4.08s/it] 41%|████ | 9012/22095 [15:28:53<13:45:47, 3.79s/it] {'loss': 0.3549, 'grad_norm': 0.6278500455622565, 'learning_rate': 6.7009145166344355e-06, 'epoch': 0.41} 41%|████ | 9012/22095 [15:28:53<13:45:47, 3.79s/it] 41%|████ | 9013/22095 [15:28:56<12:44:26, 3.51s/it] {'loss': 0.3401, 'grad_norm': 0.5997105715074218, 'learning_rate': 6.700225289275411e-06, 'epoch': 0.41} 41%|████ | 9013/22095 [15:28:56<12:44:26, 3.51s/it] 41%|████ | 9014/22095 [15:28:59<12:30:31, 3.44s/it] {'loss': 0.3463, 'grad_norm': 0.6259148881571266, 'learning_rate': 6.699536025383768e-06, 'epoch': 0.41} 41%|████ | 9014/22095 [15:28:59<12:30:31, 3.44s/it] 41%|████ | 9015/22095 [15:29:02<11:57:08, 3.29s/it] {'loss': 0.335, 'grad_norm': 0.6228649006571026, 'learning_rate': 6.698846724974315e-06, 'epoch': 0.41} 41%|████ | 9015/22095 [15:29:02<11:57:08, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (79629 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112807 > 40960). 
Running this sequence through the model will result in indexing errors 41%|████ | 9016/22095 [15:29:10<17:13:17, 4.74s/it] {'loss': 0.4701, 'grad_norm': 0.30266594936736707, 'learning_rate': 6.6981573880618636e-06, 'epoch': 0.41} 41%|████ | 9016/22095 [15:29:10<17:13:17, 4.74s/it] 41%|████ | 9017/22095 [15:29:19<20:52:49, 5.75s/it] {'loss': 0.465, 'grad_norm': 0.2953912308079833, 'learning_rate': 6.697468014661226e-06, 'epoch': 0.41} 41%|████ | 9017/22095 [15:29:19<20:52:49, 5.75s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (97314 > 40960). Running this sequence through the model will result in indexing errors 41%|████ | 9018/22095 [15:29:23<18:57:59, 5.22s/it] {'loss': 0.3318, 'grad_norm': 0.6024204480029517, 'learning_rate': 6.696778604787213e-06, 'epoch': 0.41} 41%|████ | 9018/22095 [15:29:23<18:57:59, 5.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 41%|████ | 9019/22095 [15:29:26<17:19:08, 4.77s/it] {'loss': 0.3546, 'grad_norm': 0.6727613447472707, 'learning_rate': 6.69608915845464e-06, 'epoch': 0.41} 41%|████ | 9019/22095 [15:29:26<17:19:08, 4.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65391 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101713 > 40960). Running this sequence through the model will result in indexing errors 41%|████ | 9020/22095 [15:29:29<15:28:06, 4.26s/it] {'loss': 0.3267, 'grad_norm': 0.6725825317612492, 'learning_rate': 6.69539967567832e-06, 'epoch': 0.41} 41%|████ | 9020/22095 [15:29:29<15:28:06, 4.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43121 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70550 > 40960). Running this sequence through the model will result in indexing errors 41%|████ | 9021/22095 [15:29:33<15:05:44, 4.16s/it] {'loss': 0.3287, 'grad_norm': 0.6251203031313752, 'learning_rate': 6.694710156473067e-06, 'epoch': 0.41} 41%|████ | 9021/22095 [15:29:33<15:05:44, 4.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62619 > 40960). Running this sequence through the model will result in indexing errors 41%|████ | 9022/22095 [15:29:36<13:41:39, 3.77s/it] {'loss': 0.3493, 'grad_norm': 0.5948194157228681, 'learning_rate': 6.694020600853699e-06, 'epoch': 0.41} 41%|████ | 9022/22095 [15:29:36<13:41:39, 3.77s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8893503 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16656, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 
12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 41%|████ | 9023/22095 [15:29:39<13:06:38, 3.61s/it] {'loss': 0.3588, 'grad_norm': 0.6262507575282605, 'learning_rate': 6.69333100883503e-06, 'epoch': 0.41} 41%|████ | 9023/22095 [15:29:39<13:06:38, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 41%|████ | 9024/22095 [15:29:49<19:34:20, 5.39s/it] {'loss': 0.4705, 'grad_norm': 0.35450308304949074, 'learning_rate': 6.692641380431879e-06, 'epoch': 0.41} 41%|████ | 9024/22095 [15:29:49<19:34:20, 5.39s/it] 41%|████ | 9025/22095 [15:29:53<17:47:13, 4.90s/it] {'loss': 0.3682, 'grad_norm': 0.607371939659254, 'learning_rate': 6.691951715659063e-06, 'epoch': 0.41} 41%|████ | 9025/22095 [15:29:53<17:47:13, 4.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44290 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41732 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80297 > 40960). 
Running this sequence through the model will result in indexing errors 41%|████ | 9026/22095 [15:29:59<19:36:27, 5.40s/it] {'loss': 0.4836, 'grad_norm': 0.3412133730192881, 'learning_rate': 6.691262014531401e-06, 'epoch': 0.41} 41%|████ | 9026/22095 [15:29:59<19:36:27, 5.40s/it] 41%|████ | 9027/22095 [15:30:03<18:00:25, 4.96s/it] {'loss': 0.3402, 'grad_norm': 0.6366134224531387, 'learning_rate': 6.690572277063711e-06, 'epoch': 0.41} 41%|████ | 9027/22095 [15:30:03<18:00:25, 4.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 41%|████ | 9028/22095 [15:30:10<20:21:42, 5.61s/it] {'loss': 0.5028, 'grad_norm': 0.2912418177907017, 'learning_rate': 6.689882503270818e-06, 'epoch': 0.41} 41%|████ | 9028/22095 [15:30:10<20:21:42, 5.61s/it] 41%|████ | 9029/22095 [15:30:19<23:29:20, 6.47s/it] {'loss': 0.4761, 'grad_norm': 0.3163477138902858, 'learning_rate': 6.689192693167539e-06, 'epoch': 0.41} 41%|████ | 9029/22095 [15:30:19<23:29:20, 6.47s/it] 41%|████ | 9030/22095 [15:30:25<23:01:37, 6.35s/it] {'loss': 0.473, 'grad_norm': 0.28719296847973913, 'learning_rate': 6.688502846768697e-06, 'epoch': 0.41} 41%|████ | 9030/22095 [15:30:25<23:01:37, 6.35s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 41%|████ | 9031/22095 [15:30:29<20:23:57, 5.62s/it] {'loss': 0.3251, 'grad_norm': 0.6244954493296053, 'learning_rate': 6.6878129640891135e-06, 'epoch': 0.41} 41%|████ | 9031/22095 [15:30:29<20:23:57, 5.62s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 41%|████ | 9032/22095 [15:30:32<17:56:28, 4.94s/it] {'loss': 0.3849, 'grad_norm': 0.6516279469831836, 'learning_rate': 6.687123045143613e-06, 'epoch': 0.41} 41%|████ | 9032/22095 [15:30:32<17:56:28, 4.94s/it] 41%|████ | 9033/22095 [15:30:36<16:47:19, 4.63s/it] {'loss': 0.2979, 'grad_norm': 0.5998224028305761, 'learning_rate': 6.686433089947022e-06, 'epoch': 0.41} 41%|████ | 9033/22095 [15:30:36<16:47:19, 4.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 41%|████ | 9034/22095 [15:30:46<22:11:25, 6.12s/it] {'loss': 0.4565, 'grad_norm': 0.3103766703711543, 'learning_rate': 6.685743098514161e-06, 'epoch': 0.41} 41%|████ | 9034/22095 [15:30:46<22:11:25, 6.12s/it] 41%|████ | 9035/22095 [15:30:49<19:05:37, 5.26s/it] {'loss': 0.3101, 'grad_norm': 0.6139132238441113, 'learning_rate': 6.685053070859861e-06, 'epoch': 0.41} 41%|████ | 9035/22095 [15:30:49<19:05:37, 5.26s/it] 41%|████ | 9036/22095 [15:30:52<16:52:27, 4.65s/it] {'loss': 0.3024, 'grad_norm': 0.6160606143251894, 'learning_rate': 6.684363006998944e-06, 'epoch': 0.41} 41%|████ | 9036/22095 [15:30:52<16:52:27, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 41%|████ | 9037/22095 [15:30:59<19:19:16, 5.33s/it] {'loss': 0.4851, 'grad_norm': 0.30752656365509423, 'learning_rate': 6.683672906946239e-06, 'epoch': 0.41} 41%|████ | 9037/22095 [15:30:59<19:19:16, 5.33s/it] 41%|████ | 9038/22095 [15:31:02<16:53:17, 4.66s/it] {'loss': 0.369, 'grad_norm': 0.6513482944547165, 'learning_rate': 6.682982770716575e-06, 'epoch': 0.41} 41%|████ | 9038/22095 [15:31:02<16:53:17, 4.66s/it] 41%|████ | 9039/22095 [15:31:05<15:09:11, 4.18s/it] {'loss': 0.3203, 'grad_norm': 0.7791012897019023, 'learning_rate': 6.682292598324779e-06, 'epoch': 0.41} 41%|████ | 9039/22095 [15:31:05<15:09:11, 4.18s/it]Token indices sequence length is longer than the specified 
maximum sequence length for this model (64474 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50271 > 40960). Running this sequence through the model will result in indexing errors 41%|████ | 9040/22095 [15:31:08<13:46:41, 3.80s/it] {'loss': 0.3487, 'grad_norm': 0.6894452263955075, 'learning_rate': 6.681602389785683e-06, 'epoch': 0.41} 41%|████ | 9040/22095 [15:31:08<13:46:41, 3.80s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308004 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2HSPAi4HI8KJjy1zbXXaxdpXa_!!3470550527.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否告诉我图中所有的文字是什么?'}, {'from': 'gpt', 'value': '图中所有文字:\n限时特别优惠\n¥\n17\n¥\n件套\n你还在犹豫什么?\n亏本大让利\n金卡思\n936D+\nESD\nSAFE\n200\n抢!\n-\n+\nCAL\n℃'}]}
41%|████ | 9041/22095 [15:31:11<13:04:36, 3.61s/it] {'loss': 0.3672, 'grad_norm': 0.6752996121282719, 'learning_rate': 6.680912145114116e-06, 'epoch': 0.41}
41%|████ | 9042/22095 [15:31:15<13:05:49, 3.61s/it] {'loss': 0.3466, 'grad_norm': 0.6197637473794768, 'learning_rate': 6.680221864324908e-06, 'epoch': 0.41}
41%|████ | 9043/22095 [15:31:18<12:16:39, 3.39s/it] {'loss': 0.3436, 'grad_norm': 0.6856247512459379, 'learning_rate': 6.679531547432896e-06, 'epoch': 0.41}
41%|████ | 9044/22095 [15:31:22<12:58:07, 3.58s/it] {'loss': 0.3553, 'grad_norm': 0.6567848141481791, 'learning_rate': 6.6788411944529064e-06, 'epoch': 0.41}
41%|████ | 9045/22095 [15:31:25<13:03:46, 3.60s/it] {'loss': 0.3257, 'grad_norm': 0.7058855753044633, 'learning_rate': 6.678150805399777e-06, 'epoch': 0.41}
41%|████ | 9046/22095 [15:31:29<13:18:25, 3.67s/it] {'loss': 0.3346, 'grad_norm': 0.588293854134537, 'learning_rate': 6.67746038028834e-06, 'epoch': 0.41}
41%|████ | 9047/22095 [15:31:33<13:02:38, 3.60s/it] {'loss': 0.3763, 'grad_norm': 0.6538508362128201, 'learning_rate': 6.676769919133431e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8959590 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10425, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 6cm\nB. 1cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
41%|████ | 9048/22095 [15:31:42<19:29:15, 5.38s/it] {'loss': 0.4864, 'grad_norm': 0.4019729609796205, 'learning_rate': 6.6760794219498874e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396926 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63779, 'image': 'vrdu_table_final_2/astro-ph.EP/b512beee-6f19-426c-a56d-c83e0c30545a.png', 'image_wh': [[14, 20]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}y\\end{tabular}\n```'}]}
41%|████ | 9049/22095 [15:31:46<17:53:07, 4.94s/it] {'loss': 0.3817, 'grad_norm': 0.65655875129064, 'learning_rate': 6.675388888752544e-06, 'epoch': 0.41}
41%|████ | 9050/22095 [15:31:49<15:51:02, 4.37s/it] {'loss': 0.3708, 'grad_norm': 0.6169079869153397, 'learning_rate': 6.674698319556239e-06, 'epoch': 0.41}
41%|████ | 9051/22095 [15:31:52<14:15:42, 3.94s/it] {'loss': 0.3638, 'grad_norm': 0.6588170636410992, 'learning_rate': 6.674007714375812e-06, 'epoch': 0.41}
41%|████ | 9052/22095 [15:31:55<13:22:39, 3.69s/it] {'loss': 0.3221, 'grad_norm': 0.6373846978424119, 'learning_rate': 6.673317073226097e-06, 'epoch': 0.41}
41%|████ | 9053/22095 [15:31:58<12:53:02, 3.56s/it] {'loss': 0.3713, 'grad_norm': 0.6560858708014767, 'learning_rate': 6.672626396121942e-06, 'epoch': 0.41}
41%|████ | 9054/22095 [15:32:02<13:08:35, 3.63s/it] {'loss': 0.3882, 'grad_norm': 0.6087462390017654, 'learning_rate': 6.671935683078179e-06, 'epoch': 0.41}
41%|████ | 9055/22095 [15:32:05<12:21:19, 3.41s/it] {'loss': 0.3496, 'grad_norm': 0.705119345393188, 'learning_rate': 6.6712449341096555e-06, 'epoch': 0.41}
41%|████ | 9056/22095 [15:32:09<12:56:47, 3.57s/it] {'loss': 0.3491, 'grad_norm': 0.6344883167152161, 'learning_rate': 6.67055414923121e-06, 'epoch': 0.41}
41%|████ | 9057/22095 [15:32:13<13:28:40, 3.72s/it] {'loss': 0.3344, 'grad_norm': 0.6210838758555896, 'learning_rate': 6.669863328457686e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9058/22095 [15:32:24<21:18:11, 5.88s/it] {'loss': 0.4915, 'grad_norm': 0.3234130847164188, 'learning_rate': 6.6691724718039285e-06, 'epoch': 0.41}
41%|████ | 9059/22095 [15:32:27<18:39:32, 5.15s/it] {'loss': 0.3855, 'grad_norm': 0.6170558332327496, 'learning_rate': 6.668481579284781e-06, 'epoch': 0.41}
41%|████ | 9060/22095 [15:32:30<16:19:13, 4.51s/it] {'loss': 0.3475, 'grad_norm': 0.6614143628085448, 'learning_rate': 6.667790650915089e-06, 'epoch': 0.41}
41%|████ | 9061/22095 [15:32:34<14:57:08, 4.13s/it] {'loss': 0.3502, 'grad_norm': 0.642262702067851, 'learning_rate': 6.667099686709697e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047180 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 5cm\nB. \\frac{11}{2}cm\nC. 4cm\nD. \\frac{9}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
41%|████ | 9062/22095 [15:32:38<14:37:30, 4.04s/it] {'loss': 0.3367, 'grad_norm': 0.6701479057169873, 'learning_rate': 6.666408686683455e-06, 'epoch': 0.41}
41%|████ | 9063/22095 [15:32:41<14:07:53, 3.90s/it] {'loss': 0.298, 'grad_norm': 0.5870143826263454, 'learning_rate': 6.665717650851205e-06, 'epoch': 0.41}
41%|████ | 9064/22095 [15:32:45<13:57:29, 3.86s/it] {'loss': 0.3733, 'grad_norm': 0.6748879525929066, 'learning_rate': 6.665026579227802e-06, 'epoch': 0.41}
41%|████ | 9065/22095 [15:32:48<12:47:03, 3.53s/it] {'loss': 0.3482, 'grad_norm': 0.6586658160921205, 'learning_rate': 6.66433547182809e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (53550 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9066/22095 [15:32:51<12:05:44, 3.34s/it] {'loss': 0.3387, 'grad_norm': 0.7078301453617645, 'learning_rate': 6.663644328666921e-06, 'epoch': 0.41}
41%|████ | 9067/22095 [15:32:54<12:21:55, 3.42s/it] {'loss': 0.332, 'grad_norm': 0.6306545682547016, 'learning_rate': 6.662953149759144e-06, 'epoch': 0.41}
41%|████ | 9068/22095 [15:32:57<12:12:44, 3.37s/it] {'loss': 0.3714, 'grad_norm': 0.6647220500569193, 'learning_rate': 6.6622619351196115e-06, 'epoch': 0.41}
41%|████ | 9069/22095 [15:33:01<12:21:11, 3.41s/it] {'loss': 0.3428, 'grad_norm': 0.6229576833143935, 'learning_rate': 6.661570684763175e-06, 'epoch': 0.41}
41%|████ | 9070/22095 [15:33:04<12:10:29, 3.37s/it] {'loss': 0.3541, 'grad_norm': 0.6352150216178791, 'learning_rate': 6.660879398704689e-06, 'epoch': 0.41}
41%|████ | 9071/22095 [15:33:08<12:30:44, 3.46s/it] {'loss': 0.3341, 'grad_norm': 0.6452785895743032, 'learning_rate': 6.660188076959004e-06, 'epoch': 0.41}
41%|████ | 9072/22095 [15:33:12<12:51:47, 3.56s/it] {'loss': 0.3371, 'grad_norm': 0.6412904402714849, 'learning_rate': 6.659496719540976e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (73683 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71167 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79489 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9073/22095 [15:33:15<12:11:53, 3.37s/it] {'loss': 0.3943, 'grad_norm': 0.6635652999800646, 'learning_rate': 6.658805326465462e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9074/22095 [15:33:22<16:14:50, 4.49s/it] {'loss': 0.4658, 'grad_norm': 0.31807832099944305, 'learning_rate': 6.658113897747315e-06, 'epoch': 0.41}
41%|████ | 9075/22095 [15:33:25<15:00:44, 4.15s/it] {'loss': 0.306, 'grad_norm': 0.6611497775536742, 'learning_rate': 6.657422433401392e-06, 'epoch': 0.41}
41%|████ | 9076/22095 [15:33:28<14:04:39, 3.89s/it] {'loss': 0.354, 'grad_norm': 0.6214667092573483, 'learning_rate': 6.656730933442552e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8494284 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 104915, 'image': 'vrdu_texteq/astro-ph.CO/9594ea22-fb12-46c1-9ff0-889e9890fc53.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': '$\\approx$90'}]}
41%|████ | 9077/22095 [15:33:31<13:02:26, 3.61s/it] {'loss': 0.3566, 'grad_norm': 0.6319617771618785, 'learning_rate': 6.656039397885653e-06, 'epoch': 0.41}
41%|████ | 9078/22095 [15:33:34<12:33:14, 3.47s/it] {'loss': 0.3404, 'grad_norm': 0.6248363665958656, 'learning_rate': 6.6553478267455526e-06, 'epoch': 0.41}
41%|████ | 9079/22095 [15:33:38<12:09:54, 3.36s/it] {'loss': 0.3264, 'grad_norm': 0.6214439842268216, 'learning_rate': 6.654656220037112e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (65212 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76999 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9080/22095 [15:33:40<11:36:15, 3.21s/it] {'loss': 0.3447, 'grad_norm': 0.6171175050479654, 'learning_rate': 6.653964577775192e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9081/22095 [15:33:50<18:15:34, 5.05s/it] {'loss': 0.4523, 'grad_norm': 0.3451405243141214, 'learning_rate': 6.653272899974652e-06, 'epoch': 0.41}
41%|████ | 9082/22095 [15:33:53<16:35:10, 4.59s/it] {'loss': 0.3703, 'grad_norm': 0.6359812714302097, 'learning_rate': 6.652581186650355e-06, 'epoch': 0.41}
41%|████ | 9083/22095 [15:33:57<16:05:59, 4.45s/it] {'loss': 0.3389, 'grad_norm': 0.624058737568414, 'learning_rate': 6.651889437817165e-06, 'epoch': 0.41}
41%|████ | 9084/22095 [15:34:01<15:25:40, 4.27s/it] {'loss': 0.2987, 'grad_norm': 0.574939779007028, 'learning_rate': 6.6511976534899414e-06, 'epoch': 0.41}
41%|████ | 9085/22095 [15:34:05<14:22:55, 3.98s/it] {'loss': 0.352, 'grad_norm': 0.6465268244781667, 'learning_rate': 6.650505833683555e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████ | 9086/22095 [15:34:08<13:40:54, 3.79s/it] {'loss': 0.3659, 'grad_norm': 0.6275208315388503, 'learning_rate': 6.649813978412866e-06, 'epoch': 0.41}
41%|████ | 9087/22095 [15:34:11<12:49:19, 3.55s/it] {'loss': 0.3388, 'grad_norm': 0.6279978110455504, 'learning_rate': 6.6491220876927406e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46438 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59048 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9088/22095 [15:34:16<14:58:58, 4.15s/it] {'loss': 0.4842, 'grad_norm': 0.36609485100695516, 'learning_rate': 6.648430161538047e-06, 'epoch': 0.41}
41%|████ | 9089/22095 [15:34:21<15:17:17, 4.23s/it] {'loss': 0.3161, 'grad_norm': 0.6285456556219239, 'learning_rate': 6.6477381999636525e-06, 'epoch': 0.41}
41%|████ | 9090/22095 [15:34:24<13:55:12, 3.85s/it] {'loss': 0.3104, 'grad_norm': 0.6172940666972289, 'learning_rate': 6.647046202984424e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9091/22095 [15:34:32<18:16:47, 5.06s/it] {'loss': 0.4603, 'grad_norm': 0.266702196880152, 'learning_rate': 6.646354170615232e-06, 'epoch': 0.41}
41%|████ | 9092/22095 [15:34:35<16:40:48, 4.62s/it] {'loss': 0.3367, 'grad_norm': 0.6172185599650262, 'learning_rate': 6.645662102870944e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (52024 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50579 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65026 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46365 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9093/22095 [15:34:39<15:24:15, 4.27s/it] {'loss': 0.3369, 'grad_norm': 0.6231255908509027, 'learning_rate': 6.644969999766434e-06, 'epoch': 0.41}
41%|████ | 9094/22095 [15:34:42<13:55:28, 3.86s/it] {'loss': 0.2951, 'grad_norm': 0.6711560277546211, 'learning_rate': 6.644277861316569e-06, 'epoch': 0.41}
41%|████ | 9095/22095 [15:34:45<13:13:21, 3.66s/it] {'loss': 0.3233, 'grad_norm': 0.7050844540839191, 'learning_rate': 6.643585687536224e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9096/22095 [15:34:55<20:01:24, 5.55s/it] {'loss': 0.4817, 'grad_norm': 0.33534558413590837, 'learning_rate': 6.642893478440269e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (65339 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9097/22095 [15:34:59<18:33:29, 5.14s/it] {'loss': 0.3155, 'grad_norm': 0.6289465132386788, 'learning_rate': 6.6422012340435796e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (65669 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47288 > 40960).
Running this sequence through the model will result in indexing errors
41%|████ | 9098/22095 [15:35:02<16:22:29, 4.54s/it] {'loss': 0.3155, 'grad_norm': 0.6164854466612605, 'learning_rate': 6.641508954361029e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████ | 9099/22095 [15:35:06<15:41:53, 4.35s/it] {'loss': 0.3495, 'grad_norm': 0.9583668980218552, 'learning_rate': 6.640816639407494e-06, 'epoch': 0.41}
41%|████ | 9100/22095 [15:35:09<14:10:28, 3.93s/it] {'loss': 0.3603, 'grad_norm': 0.6494624743247183, 'learning_rate': 6.640124289197845e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████ | 9101/22095 [15:35:18<20:14:47, 5.61s/it] {'loss': 0.495, 'grad_norm': 0.3167337648889493, 'learning_rate': 6.639431903746967e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████ | 9102/22095 [15:35:23<18:47:34, 5.21s/it] {'loss': 0.4093, 'grad_norm': 0.6147842236862651, 'learning_rate': 6.638739483069729e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8300316 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB14.UuSFXXXXc2XVXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n提取这张照片中的文本信息,但不要输出坐标。'}, {'from': 'gpt', 'value': '图中所有文字:\n进口台湾玉线纯手工编织、不褪色、不过敏、无味\n好好干好好赚\n辟邪七星阵\n转运结\n转运结\n四通发达\n四通发达\n腰链可调节3-6厘米\n12生肖转运珠可调节\n生肖猴子!\n默认发\n辟邪七星阵\n7\n天无理由退货\n红玛瑙通过珠宝检测研究中心鉴定'}]}
41%|████ | 9103/22095 [15:35:26<16:11:05, 4.48s/it] {'loss': 0.3575, 'grad_norm': 0.6248505173002057, 'learning_rate': 6.6380470271810146e-06, 'epoch': 0.41}
41%|████ | 9104/22095 [15:35:29<15:06:45, 4.19s/it] {'loss': 0.3657, 'grad_norm': 0.6509884609769292, 'learning_rate': 6.637354536095699e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11307022 in VC:s3://internvl2/datasets/crawler_data2/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 584, 'image': 'chat/274.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请说明图中的聊天记录。'}, {'from': 'gpt', 'value': '这张图片展示了微信聊天记录,聊天双方分别使用了绿色对话框和红色对话框,其中绿色对话框配有一个青蛙形象的头像,红色对话框使用的是一个穿红色T恤的人物头像。以下是对话内容的详细说明:\n\n1. 青蛙形象的用户发送了一条消息:\n - "铁子!"(意思是“兄弟!”或“朋友!”)\n\n2. 另外一位用户收到消息后,回复了两个问号提示困惑:“??”。\n\n3. 接下来,青蛙形象的用户发送了一条消息,内容为:\n - "微信能发隐藏消息了你知道吗" (意思是“你知道微信现在可以发送隐藏消息了吗?”)\n\n4. 红色对话框的用户回应:\n - “(⊙o⊙)啥?”(表示吃惊的表情,意思是“什么?”)\n\n5. 青蛙形象的用户继续对话:\n - “不信你看” (表示“如果你不信,可以看一下”)\n\n6. 然后,青蛙形象的用户发送了一张图片,内容为:\n - “你点开” (意思是“你点开看看”)\n\n7. 红色头像的用户没有作出文字回应,仅仅用了三个省略号提示:“…”\n\n8. 青蛙形象的用户再次发送了一张相同的图片:\n - “你点开”(意思是“你点开看看”)\n\n9. 红色头像的用户用另一个省略号回应:“…”\n\n整个对话的情景是青蛙形象的用户尝试让对方相信微信可以发送隐藏消息,通过不断重复“你点开”来提醒对方查看所谓的隐藏消息,而对方则表达出试探性的困惑和不确定。这其中可能包含一种恶作剧或调侃的意味。'}]}
41%|████ | 9105/22095 [15:35:38<20:15:52, 5.62s/it] {'loss': 0.5212, 'grad_norm': 0.31781228423799274, 'learning_rate': 6.636662009828665e-06, 'epoch': 0.41}
41%|████ | 9106/22095 [15:35:47<24:19:33, 6.74s/it] {'loss': 0.4782, 'grad_norm': 0.2928137845304512, 'learning_rate': 6.635969448394789e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 364, but got module 1
41%|████ | 9107/22095 [15:35:51<20:50:34, 5.78s/it] {'loss': 0.3403, 'grad_norm': 0.6442762188793938, 'learning_rate': 6.635276851808955e-06, 'epoch': 0.41}
41%|████ | 9108/22095 [15:35:54<17:57:25, 4.98s/it] {'loss': 0.3659, 'grad_norm': 0.6580738386452342, 'learning_rate': 6.634584220086043e-06, 'epoch': 0.41}
41%|████ | 9109/22095 [15:35:57<16:12:45, 4.49s/it] {'loss': 0.3583, 'grad_norm': 0.6581215553798003, 'learning_rate': 6.633891553240938e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████ | 9110/22095 [15:36:01<14:58:55, 4.15s/it] {'loss': 0.3061, 'grad_norm': 0.6474338516977602, 'learning_rate': 6.63319885128852e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (56368 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101682 > 40960). Running this sequence through the model will result in indexing errors
41%|████ | 9111/22095 [15:36:05<15:06:46, 4.19s/it] {'loss': 0.339, 'grad_norm': 0.6093922450524109, 'learning_rate': 6.632506114243676e-06, 'epoch': 0.41}
41%|████ | 9112/22095 [15:36:09<14:45:38, 4.09s/it] {'loss': 0.3515, 'grad_norm': 0.6672804842611628, 'learning_rate': 6.631813342121289e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922571 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
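The ValueError above is raised by `_get_item` in `qwenvl/data/data_qwen_2.py` (line 1335) when a sample's recorded image size falls below a 28-pixel minimum; the failing samples all carry `image_wh` entries such as `[[0, 0]]` or `[[169, 18]]`. A minimal sketch of that kind of guard follows; the function name and the two-element message format are assumptions, since only the raised message (which prints a four-element size) is visible in the log.

```python
MIN_IMAGE_SIZE = 28  # minimum side length, per the error message above

def check_image_size(width, height, min_size=MIN_IMAGE_SIZE):
    """Reject images whose recorded width or height is below min_size."""
    if min(width, height) < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_size}."
        )

# e.g. the geoqa+ sample above with image_wh [[169, 18]] fails because 18 < 28
```

Failing fast here is why the loader logs "[Try #0] Failed to fetch sample …" and retries with a different sample instead of crashing the run.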
Problematic sample: {'id': 45724, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
41%|████ | 9113/22095 [15:36:12<14:08:02, 3.92s/it] {'loss': 0.341, 'grad_norm': 0.6295667090545991, 'learning_rate': 6.631120534936244e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047603 in VC:s3://multi-modal/UniGeo/.
Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 10\nB. 12\nC. 16\nD. 9\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████ | 9114/22095 [15:36:16<13:18:12, 3.69s/it] {'loss': 0.312, 'grad_norm': 0.6006838717938614, 'learning_rate': 6.6304276927034305e-06, 'epoch': 0.41}
41%|████▏ | 9115/22095 [15:36:19<13:11:19, 3.66s/it] {'loss': 0.3529, 'grad_norm': 0.7046041718813307, 'learning_rate': 6.629734815437731e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████▏ | 9116/22095 [15:36:22<12:52:07, 3.57s/it] {'loss': 0.3505, 'grad_norm': 0.6305552420053111, 'learning_rate': 6.629041903154038e-06, 'epoch': 0.41}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████▏ | 9117/22095 [15:36:26<12:25:23, 3.45s/it] {'loss': 0.342, 'grad_norm': 0.6647842074571177, 'learning_rate': 6.628348955867237e-06, 'epoch': 0.41}
41%|████▏ | 9118/22095 [15:36:29<12:03:04, 3.34s/it] {'loss': 0.3299, 'grad_norm': 0.7025522272858689, 'learning_rate': 6.627655973592216e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9119/22095 [15:36:37<16:54:32, 4.69s/it] {'loss': 0.4953, 'grad_norm': 0.4316424212966116, 'learning_rate': 6.626962956343868e-06, 'epoch': 0.41}
41%|████▏ | 9120/22095 [15:36:40<15:55:54, 4.42s/it] {'loss': 0.3277, 'grad_norm': 0.6497356997555592, 'learning_rate': 6.626269904137086e-06, 'epoch': 0.41}
41%|████▏ | 9121/22095 [15:36:43<14:18:17, 3.97s/it] {'loss': 0.3319, 'grad_norm': 0.6221328793676323, 'learning_rate': 6.625576816986754e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (143592 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9122/22095 [15:36:47<13:38:16, 3.78s/it] {'loss': 0.3294, 'grad_norm': 0.6172935644265816, 'learning_rate': 6.624883694907772e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (115635 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107161 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9123/22095 [15:36:50<13:16:26, 3.68s/it] {'loss': 0.3276, 'grad_norm': 0.6632952225923628, 'learning_rate': 6.624190537915028e-06, 'epoch': 0.41}
41%|████▏ | 9124/22095 [15:36:55<14:15:58, 3.96s/it] {'loss': 0.3668, 'grad_norm': 0.6007638726533377, 'learning_rate': 6.6234973460234184e-06, 'epoch': 0.41}
41%|████▏ | 9125/22095 [15:36:58<13:20:57, 3.71s/it] {'loss': 0.3211, 'grad_norm': 0.6670716039159356, 'learning_rate': 6.6228041192478365e-06, 'epoch': 0.41}
41%|████▏ | 9126/22095 [15:37:02<13:58:05, 3.88s/it] {'loss': 0.3393, 'grad_norm': 0.608979806865586, 'learning_rate': 6.622110857603179e-06, 'epoch': 0.41}
41%|████▏ | 9127/22095 [15:37:05<13:03:54, 3.63s/it] {'loss': 0.3359, 'grad_norm': 0.6191617627583516, 'learning_rate': 6.6214175611043395e-06, 'epoch': 0.41}
41%|████▏ | 9127/22095
[15:37:05<13:03:54, 3.63s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████▏ | 9128/22095 [15:37:09<13:47:18, 3.83s/it] {'loss': 0.3426, 'grad_norm': 0.7211852075791401, 'learning_rate': 6.620724229766219e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9129/22095 [15:37:19<19:43:48, 5.48s/it] {'loss': 0.4728, 'grad_norm': 0.3787113096246672, 'learning_rate': 6.62003086360371e-06, 'epoch': 0.41}
41%|████▏ | 9130/22095 [15:37:22<17:13:55, 4.78s/it] {'loss': 0.3416, 'grad_norm': 0.7311943998256455, 'learning_rate': 6.6193374626317155e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████▏ | 9131/22095 [15:37:29<20:00:18, 5.56s/it] {'loss': 0.4925, 'grad_norm': 0.3177175282849726, 'learning_rate': 6.61864402686513e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [312, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8434821 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [312, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5238, 'image': 'vrdu_texteq/astro-ph.CO/e443d8e2-357d-4cf0-afdf-b4abc1b43fea.png', 'image_wh': [[312, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'for the $\\alpha$NFW model and'}]}
41%|████▏ | 9132/22095 [15:37:35<19:53:58, 5.53s/it] {'loss': 0.4782, 'grad_norm': 0.2790578219806817, 'learning_rate': 6.617950556318858e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 364, but got module 1
41%|████▏ | 9133/22095 [15:37:38<17:48:33, 4.95s/it] {'loss': 0.3752, 'grad_norm': 0.7027309997642883, 'learning_rate': 6.617257051007796e-06, 'epoch': 0.41}
41%|████▏ | 9134/22095 [15:37:42<16:08:17, 4.48s/it] {'loss': 0.3496, 'grad_norm': 0.6393780180131584, 'learning_rate': 6.616563510946848e-06, 'epoch': 0.41}
41%|████▏ | 9135/22095 [15:37:45<14:20:57, 3.99s/it] {'loss': 0.3649, 'grad_norm': 0.593594767525673, 'learning_rate': 6.615869936150914e-06, 'epoch': 0.41}
41%|████▏ | 9136/22095 [15:37:48<13:31:27, 3.76s/it] {'loss': 0.3348, 'grad_norm': 0.610098381858523, 'learning_rate': 6.6151763266348975e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (49895 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76647 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81557 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9137/22095 [15:37:51<12:51:44, 3.57s/it] {'loss': 0.3531, 'grad_norm': 0.6090898027798755, 'learning_rate': 6.614482682413703e-06, 'epoch': 0.41}
41%|████▏ | 9138/22095 [15:37:54<11:59:34, 3.33s/it] {'loss': 0.3304, 'grad_norm': 0.6430454610132569, 'learning_rate': 6.613789003502236e-06, 'epoch': 0.41}
41%|████▏ | 9139/22095 [15:37:57<11:39:29, 3.24s/it] {'loss': 0.3471, 'grad_norm': 0.6193389709370466, 'learning_rate': 6.6130952899153966e-06, 'epoch': 0.41}
41%|████▏ | 9140/22095 [15:38:00<11:15:55, 3.13s/it] {'loss': 0.3211, 'grad_norm': 0.6107043710984218, 'learning_rate': 6.6124015416680955e-06, 'epoch': 0.41}
41%|████▏ | 9141/22095 [15:38:02<11:02:00, 3.07s/it] {'loss': 0.3417, 'grad_norm': 0.6248234817107684, 'learning_rate': 6.611707758775238e-06, 'epoch': 0.41}
41%|████▏ | 9142/22095 [15:38:05<10:48:32, 3.00s/it] {'loss': 0.3247, 'grad_norm': 0.6257364165127264, 'learning_rate': 6.611013941251728e-06, 'epoch': 0.41}
41%|████▏ | 9143/22095 [15:38:08<10:42:03, 2.97s/it] {'loss': 0.3538, 'grad_norm': 0.6544488321986621, 'learning_rate': 6.61032008911248e-06, 'epoch': 0.41}
41%|████▏ | 9144/22095 [15:38:11<10:56:17, 3.04s/it] {'loss': 0.3456, 'grad_norm': 0.5954957989123458, 'learning_rate': 6.609626202372396e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59651 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88699 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60186 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77726 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9145/22095 [15:38:19<16:02:55, 4.46s/it] {'loss': 0.4729, 'grad_norm': 0.49113248105346125, 'learning_rate': 6.6089322810463895e-06, 'epoch': 0.41}
41%|████▏ | 9146/22095 [15:38:23<15:33:14, 4.32s/it] {'loss': 0.3611, 'grad_norm': 0.8460178834179111, 'learning_rate': 6.60823832514937e-06, 'epoch': 0.41}
41%|████▏ | 9147/22095 [15:38:28<15:35:12, 4.33s/it] {'loss': 0.3432, 'grad_norm': 0.6319870355690369, 'learning_rate': 6.6075443346962475e-06, 'epoch': 0.41}
41%|████▏ | 9148/22095 [15:38:31<14:13:33, 3.96s/it] {'loss': 0.3163, 'grad_norm': 0.6076256702515779, 'learning_rate': 6.606850309701936e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (72329 > 40960).
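The recurring "Rank 0: Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" pair above indicates the loader repairs conversations whose text lacks an image placeholder for each attached image. Below is a sketch of one way such a repair can work; the `<image>` placeholder string and the prepend strategy are assumptions for illustration, not the repo's confirmed behavior.

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder marker

def fix_image_tokens(text, num_images):
    """Ensure the prompt contains exactly one placeholder per image."""
    if text.count(IMAGE_TOKEN) == num_images:
        return text
    # Remove stray placeholders, then prepend the correct number.
    text = text.replace(IMAGE_TOKEN, "").lstrip("\n")
    return (IMAGE_TOKEN + "\n") * num_images + text

fixed = fix_image_tokens("Please extract the text from the image.", 1)
```

Repairing in the loader keeps the step alive; without it, the vision features would have no positions to be spliced into and the forward pass would fail.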
Running this sequence through the model will result in indexing errors
41%|████▏ | 9149/22095 [15:38:34<13:57:44, 3.88s/it] {'loss': 0.3641, 'grad_norm': 0.696995594690972, 'learning_rate': 6.606156250181346e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9150/22095 [15:38:42<18:31:21, 5.15s/it] {'loss': 0.4977, 'grad_norm': 0.3384968033801821, 'learning_rate': 6.6054621561493896e-06, 'epoch': 0.41}
41%|████▏ | 9151/22095 [15:38:47<17:56:40, 4.99s/it] {'loss': 0.3514, 'grad_norm': 0.6415829878760637, 'learning_rate': 6.604768027620984e-06, 'epoch': 0.41}
41%|████▏ | 9152/22095 [15:38:51<16:38:40, 4.63s/it] {'loss': 0.3517, 'grad_norm': 0.5922317216981312, 'learning_rate': 6.60407386461104e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8910451 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33604, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]}
41%|████▏ | 9153/22095 [15:39:00<21:48:43, 6.07s/it] {'loss': 0.4755, 'grad_norm': 0.33405495227931115, 'learning_rate': 6.603379667134478e-06, 'epoch': 0.41}
41%|████▏ | 9154/22095 [15:39:04<19:03:22, 5.30s/it] {'loss': 0.3481, 'grad_norm': 0.6350463815012574, 'learning_rate': 6.602685435206209e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (59103 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57600 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9155/22095 [15:39:07<16:45:50, 4.66s/it] {'loss': 0.3343, 'grad_norm': 0.6168553855298148, 'learning_rate': 6.6019911688411535e-06, 'epoch': 0.41}
41%|████▏ | 9156/22095 [15:39:10<15:08:43, 4.21s/it] {'loss': 0.3139, 'grad_norm': 0.7275595796187577, 'learning_rate': 6.601296868054227e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
41%|████▏ | 9157/22095 [15:39:20<20:51:39, 5.80s/it] {'loss': 0.4838, 'grad_norm': 0.3305580939019304, 'learning_rate': 6.600602532860349e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (59982 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65750 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9158/22095 [15:39:23<18:28:05, 5.14s/it] {'loss': 0.3148, 'grad_norm': 0.5836929686129411, 'learning_rate': 6.599908163274439e-06, 'epoch': 0.41}
41%|████▏ | 9159/22095 [15:39:27<16:45:03, 4.66s/it] {'loss': 0.3648, 'grad_norm': 0.5992491283122371, 'learning_rate': 6.599213759311416e-06, 'epoch': 0.41}
Token indices sequence length is longer than the specified maximum sequence length for this model (104828 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55934 > 40960). Running this sequence through the model will result in indexing errors
41%|████▏ | 9160/22095 [15:39:30<15:21:08, 4.27s/it] {'loss': 0.384, 'grad_norm': 0.6499845187153163, 'learning_rate': 6.598519320986201e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8309063 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1.K45onvI8KJjSspjXXcgjXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nI require the transcribed text from this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n书包暴力测试\n好书包质量过硬\n书包承重测试\n好书包才能\n承受34斤\n超强测试\n16.92\n0.00'}]}
41%|████▏ | 9161/22095 [15:39:33<13:51:34, 3.86s/it] {'loss': 0.3766, 'grad_norm': 0.6498925304476317, 'learning_rate': 6.5978248483137165e-06, 'epoch': 0.41}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8910447 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33600, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
41%|████▏ | 9162/22095 [15:39:37<13:49:25, 3.85s/it] {'loss': 0.3851, 'grad_norm': 0.6344836570957235, 'learning_rate': 6.597130341308881e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9163/22095 [15:39:47<20:19:12, 5.66s/it] {'loss': 0.4889, 'grad_norm': 0.31294675135178723, 'learning_rate': 6.5964357999866214e-06, 'epoch': 0.41}
41%|████▏ | 9164/22095 [15:39:50<18:08:14, 5.05s/it] {'loss': 0.3172, 'grad_norm': 0.6378095099327816, 'learning_rate': 6.595741224361858e-06, 'epoch': 0.41}
41%|████▏ | 9165/22095 [15:39:54<16:10:46, 4.50s/it] {'loss': 0.3677, 'grad_norm': 0.6152584122949394, 'learning_rate': 6.595046614449518e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9166/22095 [15:40:03<21:09:09, 5.89s/it] {'loss': 0.4763, 'grad_norm': 0.2874984063973487, 'learning_rate': 6.594351970264525e-06, 'epoch': 0.41}
41%|████▏ | 9167/22095 [15:40:06<18:29:25, 5.15s/it] {'loss': 0.3501, 'grad_norm': 0.816731893554995, 'learning_rate': 6.593657291821804e-06, 'epoch': 0.41}
Invalidate trace cache @ step 2: expected module 1, but got module 364
41%|████▏ | 9168/22095 [15:40:16<23:15:19, 6.48s/it] {'loss': 0.5067, 'grad_norm': 0.30283741282069704, 'learning_rate': 6.592962579136283e-06, 'epoch': 0.41}
41%|████▏ | 9169/22095 [15:40:19<19:52:04, 5.53s/it] {'loss': 0.3793, 'grad_norm': 0.6416453864713597, 'learning_rate': 6.592267832222888e-06, 'epoch': 0.41}
41%|████▏ | 9169/22095 [15:40:19<19:52:04,
5.53s/it] 42%|████▏ | 9170/22095 [15:40:23<17:47:02, 4.95s/it] {'loss': 0.3798, 'grad_norm': 0.6468314186517801, 'learning_rate': 6.591573051096549e-06, 'epoch': 0.42} 42%|████▏ | 9170/22095 [15:40:23<17:47:02, 4.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81334 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100820 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9171/22095 [15:40:26<16:11:56, 4.51s/it] {'loss': 0.3458, 'grad_norm': 0.6383963610937056, 'learning_rate': 6.5908782357721914e-06, 'epoch': 0.42} 42%|████▏ | 9171/22095 [15:40:26<16:11:56, 4.51s/it] 42%|████▏ | 9172/22095 [15:40:30<15:02:54, 4.19s/it] {'loss': 0.3717, 'grad_norm': 0.6506341832672858, 'learning_rate': 6.590183386264748e-06, 'epoch': 0.42} 42%|████▏ | 9172/22095 [15:40:30<15:02:54, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60340 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46883 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9173/22095 [15:40:33<13:51:17, 3.86s/it] {'loss': 0.3672, 'grad_norm': 0.7739364454783211, 'learning_rate': 6.5894885025891455e-06, 'epoch': 0.42} 42%|████▏ | 9173/22095 [15:40:33<13:51:17, 3.86s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8393047 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 59878, 'image': 'vrdu_table_final_2/astro-ph.EP/a278e9e8-4c85-42cf-b7b5-44a35874b1e0.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]} 42%|████▏ | 9174/22095 [15:40:36<13:07:27, 3.66s/it] {'loss': 0.2971, 'grad_norm': 0.6417639748068534, 'learning_rate': 6.5887935847603204e-06, 'epoch': 0.42} 42%|████▏ | 9174/22095 [15:40:36<13:07:27, 3.66s/it] 42%|████▏ | 9175/22095 [15:40:39<12:45:42, 3.56s/it] {'loss': 0.3405, 'grad_norm': 0.5913598035198431, 'learning_rate': 6.588098632793197e-06, 'epoch': 0.42} 42%|████▏ | 9175/22095 [15:40:39<12:45:42, 3.56s/it] 42%|████▏ | 9176/22095 [15:40:43<12:42:56, 3.54s/it] {'loss': 0.3168, 'grad_norm': 0.6306761921126739, 'learning_rate': 6.5874036467027135e-06, 'epoch': 0.42} 42%|████▏ | 9176/22095 [15:40:43<12:42:56, 3.54s/it] 42%|████▏ | 9177/22095 [15:40:46<12:06:55, 3.38s/it] {'loss': 0.3436, 
'grad_norm': 0.6407055478791495, 'learning_rate': 6.5867086265038005e-06, 'epoch': 0.42} 42%|████▏ | 9177/22095 [15:40:46<12:06:55, 3.38s/it] 42%|████▏ | 9178/22095 [15:40:49<12:04:31, 3.37s/it] {'loss': 0.3196, 'grad_norm': 0.7739259144316215, 'learning_rate': 6.586013572211394e-06, 'epoch': 0.42} 42%|████▏ | 9178/22095 [15:40:49<12:04:31, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9179/22095 [15:40:57<17:22:49, 4.84s/it] {'loss': 0.4897, 'grad_norm': 0.3843724146831492, 'learning_rate': 6.585318483840424e-06, 'epoch': 0.42} 42%|████▏ | 9179/22095 [15:40:57<17:22:49, 4.84s/it] 42%|████▏ | 9180/22095 [15:41:01<15:50:24, 4.42s/it] {'loss': 0.3899, 'grad_norm': 0.6610425816406361, 'learning_rate': 6.58462336140583e-06, 'epoch': 0.42} 42%|████▏ | 9180/22095 [15:41:01<15:50:24, 4.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9181/22095 [15:41:10<21:27:38, 5.98s/it] {'loss': 0.4788, 'grad_norm': 0.28457472167734965, 'learning_rate': 6.583928204922546e-06, 'epoch': 0.42} 42%|████▏ | 9181/22095 [15:41:10<21:27:38, 5.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8341514 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
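The recurring `ValueError: Image size [...] is too small. Minimum size is 28` is raised by `_get_item` in `data_qwen_2.py` when an image's width or height (the first two entries of `image_wh`) falls below the preprocessor's 28-pixel floor. Rather than paying a fetch-and-retry per bad sample at train time, the annotations can be filtered up front. A minimal sketch, assuming samples carry the same `image_wh` field seen in the log (the helper names are hypothetical; the 28-pixel floor is taken from the error message):

```python
MIN_SIDE = 28  # minimum width/height accepted by the preprocessor, per the logged error


def is_large_enough(sample, min_side=MIN_SIDE):
    """Return True if every image in the sample meets the minimum side length."""
    for w, h in sample.get("image_wh", []):
        if w < min_side or h < min_side:
            return False
    return True


def filter_annotations(samples):
    """Drop samples that would raise the 'too small' ValueError during training."""
    return [s for s in samples if is_large_enough(s)]


# Shapes copied from the log: [163, 21] is rejected (height < 28), [640, 480] is kept.
samples = [
    {"id": 33600, "image_wh": [[163, 21]]},
    {"id": 1, "image_wh": [[640, 480]]},
]
print([s["id"] for s in filter_annotations(samples)])  # -> [1]
```

Pre-filtering also avoids the "[Try #0] Failed to fetch sample ..." resampling loop that each rejected item triggers.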
Problematic sample: {'id': 8159, 'image': 'vrdu_table_final_2/astro-ph.CO/f04b6c54-8e5a-40fd-a3a6-5cd53646078c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 42%|████▏ | 9182/22095 [15:41:14<19:10:42, 5.35s/it] {'loss': 0.3485, 'grad_norm': 0.62001978773301, 'learning_rate': 6.5832330144055116e-06, 'epoch': 0.42} 42%|████▏ | 9182/22095 [15:41:14<19:10:42, 5.35s/it] 42%|████▏ | 9183/22095 [15:41:18<17:29:28, 4.88s/it] {'loss': 0.3399, 'grad_norm': 0.6396613108916812, 'learning_rate': 6.58253778986966e-06, 'epoch': 0.42} 42%|████▏ | 9183/22095 [15:41:18<17:29:28, 4.88s/it] 42%|████▏ | 9184/22095 [15:41:21<15:20:45, 4.28s/it] {'loss': 0.3741, 'grad_norm': 0.6411496367755108, 'learning_rate': 6.5818425313299325e-06, 'epoch': 0.42} 42%|████▏ | 9184/22095 [15:41:21<15:20:45, 4.28s/it] 42%|████▏ | 9185/22095 [15:41:24<14:20:57, 4.00s/it] {'loss': 0.3171, 'grad_norm': 0.6754167372367614, 'learning_rate': 6.581147238801268e-06, 'epoch': 0.42} 42%|████▏ | 9185/22095 [15:41:24<14:20:57, 4.00s/it] 42%|████▏ | 9186/22095 [15:41:27<13:06:58, 3.66s/it] {'loss': 0.3366, 'grad_norm': 0.6254106383490038, 'learning_rate': 6.5804519122986045e-06, 'epoch': 0.42} 42%|████▏ | 9186/22095 [15:41:27<13:06:58, 3.66s/it] 42%|████▏ | 9187/22095 [15:41:31<13:04:40, 3.65s/it] {'loss': 0.3561, 'grad_norm': 0.6739536124384865, 'learning_rate': 6.5797565518368835e-06, 'epoch': 0.42} 42%|████▏ | 9187/22095 [15:41:31<13:04:40, 3.65s/it] 42%|████▏ | 9188/22095 [15:41:34<13:00:17, 3.63s/it] {'loss': 0.3764, 'grad_norm': 0.7134668711042326, 'learning_rate': 6.579061157431046e-06, 'epoch': 0.42} 42%|████▏ | 9188/22095 [15:41:34<13:00:17, 3.63s/it] 42%|████▏ | 9189/22095 [15:41:39<13:43:41, 3.83s/it] {'loss': 0.3168, 'grad_norm': 0.630690784551514, 
'learning_rate': 6.578365729096034e-06, 'epoch': 0.42} 42%|████▏ | 9189/22095 [15:41:39<13:43:41, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54542 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73608 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9190/22095 [15:41:42<12:58:21, 3.62s/it] {'loss': 0.3313, 'grad_norm': 0.631189148468012, 'learning_rate': 6.57767026684679e-06, 'epoch': 0.42} 42%|████▏ | 9190/22095 [15:41:42<12:58:21, 3.62s/it] 42%|████▏ | 9191/22095 [15:41:45<12:25:00, 3.46s/it] {'loss': 0.3645, 'grad_norm': 0.6650612871222172, 'learning_rate': 6.576974770698259e-06, 'epoch': 0.42} 42%|████▏ | 9191/22095 [15:41:45<12:25:00, 3.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9192/22095 [15:41:49<13:09:58, 3.67s/it] {'loss': 0.3724, 'grad_norm': 0.6459197229536661, 'learning_rate': 6.576279240665381e-06, 'epoch': 0.42} 42%|████▏ | 9192/22095 [15:41:49<13:09:58, 3.67s/it] 42%|████▏ | 9193/22095 [15:41:53<13:07:39, 3.66s/it] {'loss': 0.3587, 'grad_norm': 0.583243594845825, 'learning_rate': 6.575583676763105e-06, 'epoch': 0.42} 42%|████▏ | 9193/22095 [15:41:53<13:07:39, 3.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9194/22095 [15:41:56<13:01:06, 3.63s/it] {'loss': 0.349, 'grad_norm': 0.6114416733865374, 'learning_rate': 6.574888079006374e-06, 'epoch': 0.42} 42%|████▏ | 9194/22095 [15:41:56<13:01:06, 3.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 
9195/22095 [15:42:00<13:12:35, 3.69s/it] {'loss': 0.3292, 'grad_norm': 0.6310044548859889, 'learning_rate': 6.574192447410136e-06, 'epoch': 0.42} 42%|████▏ | 9195/22095 [15:42:00<13:12:35, 3.69s/it] 42%|████▏ | 9196/22095 [15:42:03<12:49:31, 3.58s/it] {'loss': 0.2981, 'grad_norm': 0.6890581030769432, 'learning_rate': 6.573496781989336e-06, 'epoch': 0.42} 42%|████▏ | 9196/22095 [15:42:03<12:49:31, 3.58s/it] 42%|████▏ | 9197/22095 [15:42:07<12:39:12, 3.53s/it] {'loss': 0.3722, 'grad_norm': 0.6077206380083984, 'learning_rate': 6.572801082758923e-06, 'epoch': 0.42} 42%|████▏ | 9197/22095 [15:42:07<12:39:12, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9198/22095 [15:42:16<19:14:08, 5.37s/it] {'loss': 0.5054, 'grad_norm': 0.5266079459598753, 'learning_rate': 6.5721053497338464e-06, 'epoch': 0.42} 42%|████▏ | 9198/22095 [15:42:16<19:14:08, 5.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106211 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9199/22095 [15:42:20<17:10:42, 4.80s/it] {'loss': 0.3137, 'grad_norm': 0.6082102498170532, 'learning_rate': 6.571409582929053e-06, 'epoch': 0.42} 42%|████▏ | 9199/22095 [15:42:20<17:10:42, 4.80s/it] 42%|████▏ | 9200/22095 [15:42:23<15:12:20, 4.25s/it] {'loss': 0.3673, 'grad_norm': 0.6345993046401832, 'learning_rate': 6.570713782359493e-06, 'epoch': 0.42} 42%|████▏ | 9200/22095 [15:42:23<15:12:20, 4.25s/it] 42%|████▏ | 9201/22095 [15:42:26<13:57:40, 3.90s/it] {'loss': 0.3619, 'grad_norm': 0.6354457264023634, 'learning_rate': 6.57001794804012e-06, 'epoch': 0.42} 42%|████▏ | 9201/22095 [15:42:26<13:57:40, 3.90s/it] 42%|████▏ | 9202/22095 [15:42:29<12:46:45, 3.57s/it] {'loss': 0.3376, 'grad_norm': 0.7378579848599901, 'learning_rate': 6.569322079985881e-06, 'epoch': 0.42} 42%|████▏ | 9202/22095 [15:42:29<12:46:45, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (127940 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9203/22095 [15:42:32<12:39:58, 3.54s/it] {'loss': 0.3416, 'grad_norm': 0.6618244842604736, 'learning_rate': 6.568626178211732e-06, 'epoch': 0.42} 42%|████▏ | 9203/22095 [15:42:32<12:39:58, 3.54s/it] 42%|████▏ | 9204/22095 [15:42:35<12:21:50, 3.45s/it] {'loss': 0.3417, 'grad_norm': 0.6361598077473152, 'learning_rate': 6.567930242732624e-06, 'epoch': 0.42} 42%|████▏ | 9204/22095 [15:42:35<12:21:50, 3.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8924511 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 47664, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 42%|████▏ | 9205/22095 [15:42:40<13:04:26, 3.65s/it] {'loss': 0.3632, 'grad_norm': 0.6876223638053927, 'learning_rate': 6.5672342735635095e-06, 'epoch': 0.42} 42%|████▏ | 9205/22095 [15:42:40<13:04:26, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (59909 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92528 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9206/22095 [15:42:49<19:06:11, 5.34s/it] {'loss': 0.4684, 'grad_norm': 0.3977714891906502, 'learning_rate': 6.566538270719345e-06, 'epoch': 0.42} 42%|████▏ | 9206/22095 [15:42:49<19:06:11, 5.34s/it] 42%|████▏ | 9207/22095 [15:42:58<23:34:01, 6.58s/it] {'loss': 0.4653, 'grad_norm': 0.31331673993273423, 'learning_rate': 6.565842234215085e-06, 'epoch': 0.42} 42%|████▏ | 9207/22095 [15:42:58<23:34:01, 6.58s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 42%|████▏ | 9208/22095 [15:43:03<21:12:34, 5.92s/it] {'loss': 0.3515, 'grad_norm': 0.7117005769691125, 'learning_rate': 6.5651461640656825e-06, 'epoch': 0.42} 42%|████▏ | 9208/22095 [15:43:03<21:12:34, 5.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9209/22095 [15:43:06<18:49:03, 5.26s/it] {'loss': 0.3227, 'grad_norm': 0.6359816588793948, 'learning_rate': 6.564450060286098e-06, 'epoch': 0.42} 42%|████▏ | 9209/22095 [15:43:06<18:49:03, 5.26s/it] 42%|████▏ | 9210/22095 [15:43:10<17:10:30, 4.80s/it] {'loss': 0.3229, 'grad_norm': 1.259017491550393, 'learning_rate': 6.563753922891284e-06, 'epoch': 0.42} 42%|████▏ | 9210/22095 [15:43:10<17:10:30, 4.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9211/22095 [15:43:18<20:09:27, 5.63s/it] {'loss': 0.4819, 'grad_norm': 0.4617368788687044, 'learning_rate': 6.563057751896204e-06, 'epoch': 0.42} 42%|████▏ | 9211/22095 [15:43:18<20:09:27, 5.63s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 
100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885568 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8721, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 5\nB. 2\nC. 3\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 42%|████▏ | 9212/22095 [15:43:22<19:04:29, 5.33s/it] {'loss': 0.331, 'grad_norm': 0.6432718249738655, 'learning_rate': 6.562361547315811e-06, 'epoch': 0.42} 42%|████▏ | 9212/22095 [15:43:22<19:04:29, 5.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48892 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54666 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47027 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90110 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47423 > 40960) for 4 sample(s). 
Truncating to 1171 with 2 samples. 42%|████▏ | 9213/22095 [15:43:26<17:01:07, 4.76s/it] {'loss': 0.3203, 'grad_norm': 0.6933318021267154, 'learning_rate': 6.561665309165067e-06, 'epoch': 0.42} 42%|████▏ | 9213/22095 [15:43:26<17:01:07, 4.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (53708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69057 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9214/22095 [15:43:35<21:54:25, 6.12s/it] {'loss': 0.4685, 'grad_norm': 0.31445652021371545, 'learning_rate': 6.560969037458933e-06, 'epoch': 0.42} 42%|████▏ | 9214/22095 [15:43:35<21:54:25, 6.12s/it] 42%|████▏ | 9215/22095 [15:43:38<18:53:10, 5.28s/it] {'loss': 0.3184, 'grad_norm': 0.6828889036220073, 'learning_rate': 6.5602727322123675e-06, 'epoch': 0.42} 42%|████▏ | 9215/22095 [15:43:38<18:53:10, 5.28s/it] 42%|████▏ | 9216/22095 [15:43:42<16:39:16, 4.66s/it] {'loss': 0.3864, 'grad_norm': 0.6778087903452624, 'learning_rate': 6.5595763934403335e-06, 'epoch': 0.42} 42%|████▏ | 9216/22095 [15:43:42<16:39:16, 4.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9217/22095 [15:43:45<15:06:30, 4.22s/it] {'loss': 0.3403, 'grad_norm': 0.68361582002995, 'learning_rate': 6.5588800211577915e-06, 'epoch': 0.42} 42%|████▏ | 9217/22095 [15:43:45<15:06:30, 4.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9218/22095 [15:43:49<14:45:58, 4.13s/it] {'loss': 0.3495, 'grad_norm': 0.6222438144782241, 'learning_rate': 6.558183615379708e-06, 'epoch': 0.42} 42%|████▏ | 9218/22095 [15:43:49<14:45:58, 4.13s/it]Token indices sequence length 
is longer than the specified maximum sequence length for this model (45485 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44577 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9219/22095 [15:43:53<14:25:52, 4.03s/it] {'loss': 0.3167, 'grad_norm': 0.6665841861318481, 'learning_rate': 6.557487176121042e-06, 'epoch': 0.42} 42%|████▏ | 9219/22095 [15:43:53<14:25:52, 4.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9220/22095 [15:43:56<13:49:41, 3.87s/it] {'loss': 0.403, 'grad_norm': 0.6534720487634615, 'learning_rate': 6.5567907033967616e-06, 'epoch': 0.42} 42%|████▏ | 9220/22095 [15:43:56<13:49:41, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (130027 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9221/22095 [15:44:07<21:50:27, 6.11s/it] {'loss': 0.4636, 'grad_norm': 0.4525851786579055, 'learning_rate': 6.556094197221828e-06, 'epoch': 0.42} 42%|████▏ | 9221/22095 [15:44:07<21:50:27, 6.11s/it] 42%|████▏ | 9222/22095 [15:44:15<23:31:55, 6.58s/it] {'loss': 0.4762, 'grad_norm': 0.35344989041189595, 'learning_rate': 6.5553976576112124e-06, 'epoch': 0.42} 42%|████▏ | 9222/22095 [15:44:15<23:31:55, 6.58s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8364940 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 31681, 'image': 'vrdu_table_final_2/astro-ph.CO/61b600b4-d689-4afb-b9dd-621b2374e163.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 42%|████▏ | 9223/22095 [15:44:19<20:36:09, 5.76s/it] {'loss': 0.2829, 'grad_norm': 0.6190618498454409, 'learning_rate': 6.554701084579876e-06, 'epoch': 0.42} 42%|████▏ | 9223/22095 [15:44:19<20:36:09, 5.76s/it] 42%|████▏ | 9224/22095 [15:44:24<19:54:53, 5.57s/it] {'loss': 0.3461, 'grad_norm': 0.6814680694155807, 'learning_rate': 6.554004478142789e-06, 'epoch': 0.42} 42%|████▏ | 9224/22095 [15:44:24<19:54:53, 5.57s/it] 42%|████▏ | 9225/22095 [15:44:28<18:16:32, 5.11s/it] {'loss': 0.3507, 'grad_norm': 0.67215300592011, 'learning_rate': 6.553307838314919e-06, 'epoch': 0.42} 42%|████▏ | 9225/22095 [15:44:28<18:16:32, 5.11s/it] 42%|████▏ | 9226/22095 [15:44:32<16:40:29, 4.66s/it] {'loss': 0.3373, 'grad_norm': 0.6642982648054744, 'learning_rate': 6.552611165111233e-06, 'epoch': 0.42} 42%|████▏ | 9226/22095 [15:44:32<16:40:29, 4.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9227/22095 [15:44:42<22:30:12, 6.30s/it] {'loss': 0.4814, 'grad_norm': 0.5308400151907808, 'learning_rate': 6.551914458546702e-06, 'epoch': 0.42} 42%|████▏ | 9227/22095 [15:44:42<22:30:12, 6.30s/it] 42%|████▏ | 9228/22095 [15:44:46<20:11:02, 5.65s/it] {'loss': 0.3097, 'grad_norm': 0.6098024845830202, 'learning_rate': 6.5512177186362956e-06, 'epoch': 0.42} 42%|████▏ | 9228/22095 [15:44:46<20:11:02, 5.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 
Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9229/22095 [15:44:52<21:00:00, 5.88s/it] {'loss': 0.4894, 'grad_norm': 0.3971927840088478, 'learning_rate': 6.5505209453949844e-06, 'epoch': 0.42} 42%|████▏ | 9229/22095 [15:44:52<21:00:00, 5.88s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9230/22095 [15:44:56<19:10:14, 5.36s/it] {'loss': 0.304, 'grad_norm': 0.6215690645807083, 'learning_rate': 6.5498241388377415e-06, 'epoch': 0.42} 42%|████▏ | 9230/22095 [15:44:57<19:10:14, 5.36s/it] 42%|████▏ | 9231/22095 [15:45:00<16:43:43, 4.68s/it] {'loss': 0.3561, 'grad_norm': 0.6501272460909904, 'learning_rate': 6.549127298979535e-06, 'epoch': 0.42} 42%|████▏ | 9231/22095 [15:45:00<16:43:43, 4.68s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9232/22095 [15:45:03<15:44:13, 4.40s/it] {'loss': 0.3503, 'grad_norm': 1.047121293479635, 'learning_rate': 6.5484304258353435e-06, 'epoch': 0.42} 42%|████▏ | 9232/22095 [15:45:03<15:44:13, 4.40s/it] 42%|████▏ | 9233/22095 [15:45:06<14:16:18, 3.99s/it] {'loss': 0.3353, 'grad_norm': 0.6911381850116922, 'learning_rate': 6.547733519420136e-06, 'epoch': 0.42} 42%|████▏ | 9233/22095 [15:45:06<14:16:18, 3.99s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [864, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8460240 in VC:s3://internvl-moe-sft-data/. 
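The paired messages `Number of image tokens 0 does not match number of images 1` / `Fixed image tokens in the conversation` indicate conversations whose text is missing (or, as in the `2 does not match 1` case, duplicating) the image placeholder the collator expects, one per attached image. A hedged sketch of such a repair step is below; the `<image>` literal and the prepend-to-first-human-turn policy are assumptions, since the actual fix lives inside the training code:

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; the real token comes from the processor


def fix_image_tokens(conversations, num_images):
    """Make the placeholder count match the number of attached images.

    Extra placeholders are stripped; missing ones are prepended to the first
    human turn, mirroring the 'Fixed image tokens' behaviour in the log.
    """
    count = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversations)
    if count == num_images:
        return conversations
    fixed = [dict(turn) for turn in conversations]
    # Drop all placeholders, then re-insert exactly num_images of them.
    for turn in fixed:
        turn["value"] = turn["value"].replace(IMAGE_TOKEN, "").lstrip("\n")
    for turn in fixed:
        if turn["from"] == "human":
            turn["value"] = (IMAGE_TOKEN + "\n") * num_images + turn["value"]
            break
    return fixed


conv = [{"from": "human", "value": "Transcribe the text."},
        {"from": "gpt", "value": "All words in the image: ..."}]
out = fix_image_tokens(conv, 1)
print(out[0]["value"].count(IMAGE_TOKEN))  # -> 1
```

Auto-repair keeps the step alive, but a mismatch usually points at a malformed annotation worth fixing at the source.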
Exception: Image size [864, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 128732, 'image': 'vrdu_texteq/astro-ph.CO/32885d81-cb8d-4d7d-a787-2ab06d263509.png', 'image_wh': [[864, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'As in the main text let us write the order $n$ contribution to the $\\delta \\tau^m$ as'}]} 42%|████▏ | 9234/22095 [15:45:09<13:14:43, 3.71s/it] {'loss': 0.3237, 'grad_norm': 0.6769210876746538, 'learning_rate': 6.54703657974889e-06, 'epoch': 0.42} 42%|████▏ | 9234/22095 [15:45:09<13:14:43, 3.71s/it] 42%|████▏ | 9235/22095 [15:45:13<12:46:45, 3.58s/it] {'loss': 0.3453, 'grad_norm': 0.6153345028518419, 'learning_rate': 6.546339606836578e-06, 'epoch': 0.42} 42%|████▏ | 9235/22095 [15:45:13<12:46:45, 3.58s/it] 42%|████▏ | 9236/22095 [15:45:16<13:00:10, 3.64s/it] {'loss': 0.358, 'grad_norm': 0.6601778241017583, 'learning_rate': 6.545642600698179e-06, 'epoch': 0.42} 42%|████▏ | 9236/22095 [15:45:16<13:00:10, 3.64s/it] 42%|████▏ | 9237/22095 [15:45:19<12:13:56, 3.42s/it] {'loss': 0.3401, 'grad_norm': 0.6643447241178514, 'learning_rate': 6.544945561348665e-06, 'epoch': 0.42} 42%|████▏ | 9237/22095 [15:45:19<12:13:56, 3.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8942810 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 65963, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知段AB=12,则将段AB延伸至点C,使BC=\\ frac{1}{2}AB,点D为段AC的中点,段BD的长度为()\nA. 3\nB. 4\nC. 5\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AB=12,且BC=\\frac{1}{2}AB∴BC=6,AC=18而点D是线段AC的中点,∴AD=\\frac{1}{2}AC=\\frac{1}{2}×18=9而BD=AB-AD=12-9=3'}]} 42%|████▏ | 9238/22095 [15:45:23<11:58:16, 3.35s/it] {'loss': 0.3554, 'grad_norm': 0.6757724192115454, 'learning_rate': 6.544248488803017e-06, 'epoch': 0.42} 42%|████▏ | 9238/22095 [15:45:23<11:58:16, 3.35s/it] 42%|████▏ | 9239/22095 [15:45:27<12:36:43, 3.53s/it] {'loss': 0.4217, 'grad_norm': 0.611764580109547, 'learning_rate': 6.5435513830762125e-06, 'epoch': 0.42} 42%|████▏ | 9239/22095 [15:45:27<12:36:43, 3.53s/it] 42%|████▏ | 9240/22095 [15:45:29<11:55:32, 3.34s/it] {'loss': 0.3357, 'grad_norm': 0.5760886846843914, 'learning_rate': 6.542854244183229e-06, 'epoch': 0.42} 42%|████▏ | 9240/22095 [15:45:29<11:55:32, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47726 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53508 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9241/22095 [15:45:34<12:53:24, 3.61s/it] {'loss': 0.3631, 'grad_norm': 0.6448729001164268, 'learning_rate': 6.542157072139046e-06, 'epoch': 0.42} 42%|████▏ | 9241/22095 [15:45:34<12:53:24, 3.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396442 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63293, 'image': 'vrdu_table_final_2/astro-ph.EP/b3fe3f77-7e36-4b46-9903-018908a47e8b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 42%|████▏ | 9242/22095 [15:45:37<12:50:59, 3.60s/it] {'loss': 0.3061, 'grad_norm': 0.6129535632789035, 'learning_rate': 6.541459866958644e-06, 'epoch': 0.42} 42%|████▏ | 9242/22095 [15:45:37<12:50:59, 3.60s/it] 42%|████▏ | 9243/22095 [15:45:41<12:31:57, 3.51s/it] {'loss': 0.3362, 'grad_norm': 0.617909176415105, 'learning_rate': 6.540762628657003e-06, 'epoch': 0.42} 42%|████▏ | 9243/22095 [15:45:41<12:31:57, 3.51s/it] 42%|████▏ | 9244/22095 [15:45:45<13:27:51, 3.77s/it] {'loss': 0.3623, 'grad_norm': 0.6611050290972347, 'learning_rate': 6.5400653572491055e-06, 'epoch': 0.42} 42%|████▏ | 9244/22095 [15:45:45<13:27:51, 3.77s/it]Token indices sequence length is longer than the 
specified maximum sequence length for this model (44144 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9245/22095 [15:45:48<12:38:04, 3.54s/it] {'loss': 0.3086, 'grad_norm': 0.5944838215802871, 'learning_rate': 6.539368052749935e-06, 'epoch': 0.42} 42%|████▏ | 9245/22095 [15:45:48<12:38:04, 3.54s/it] 42%|████▏ | 9246/22095 [15:45:52<12:57:10, 3.63s/it] {'loss': 0.3581, 'grad_norm': 0.6339489370126075, 'learning_rate': 6.538670715174471e-06, 'epoch': 0.42} 42%|████▏ | 9246/22095 [15:45:52<12:57:10, 3.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9247/22095 [15:45:55<12:06:04, 3.39s/it] {'loss': 0.3388, 'grad_norm': 0.600420415879879, 'learning_rate': 6.537973344537699e-06, 'epoch': 0.42} 42%|████▏ | 9247/22095 [15:45:55<12:06:04, 3.39s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9248/22095 [15:45:58<11:42:52, 3.28s/it] {'loss': 0.3529, 'grad_norm': 0.6029103116209852, 'learning_rate': 6.537275940854604e-06, 'epoch': 0.42} 42%|████▏ | 9248/22095 [15:45:58<11:42:52, 3.28s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957615 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
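The tokenizer warnings `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` and the later `Truncating to 1171 with 2 samples` line show over-long samples being handled downstream. A minimal length-guard sketch, assuming a flat list of token ids; the 40960 limit is from the log, while the truncate-vs-skip policy is an assumption (the actual pipeline truncates at pack time):

```python
MAX_LEN = 40960  # model context limit reported in the warnings


def guard_length(input_ids, max_len=MAX_LEN, policy="truncate"):
    """Clamp or reject token sequences that exceed the context window."""
    if len(input_ids) <= max_len:
        return input_ids
    if policy == "truncate":
        return input_ids[:max_len]
    if policy == "skip":
        return None  # caller resamples another example
    raise ValueError(f"unknown policy: {policy}")


ids = list(range(41358))  # length taken from one warning in the log
print(len(guard_length(ids)))             # -> 40960
print(guard_length(ids, policy="skip"))   # -> None
```

Truncation avoids the indexing errors the warning promises, at the cost of losing the sequence tail; for multimodal samples, skipping is often safer than cutting through image-token spans.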
Problematic sample: {'id': 8450, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 2cm\nB. 4cm\nC. 1cm\nD. 1.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9249/22095 [15:46:06<17:29:33, 4.90s/it] {'loss': 0.4934, 'grad_norm': 0.6447280673198621, 'learning_rate': 6.536578504140172e-06, 'epoch': 0.42} 42%|████▏ | 9249/22095 [15:46:06<17:29:33, 4.90s/it] 42%|████▏ | 9250/22095 [15:46:10<16:07:26, 4.52s/it] {'loss': 0.3363, 'grad_norm': 0.6194822194770873, 'learning_rate': 6.535881034409384e-06, 'epoch': 0.42} 42%|████▏ | 9250/22095 [15:46:10<16:07:26, 4.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9251/22095 [15:46:13<14:58:57, 4.20s/it] {'loss': 0.3372, 'grad_norm': 0.6502620940642008, 'learning_rate': 6.535183531677232e-06, 'epoch': 0.42} 42%|████▏ | 9251/22095 [15:46:13<14:58:57, 4.20s/it] 42%|████▏ | 9252/22095 [15:46:17<14:25:28, 4.04s/it] {'loss': 0.3933, 'grad_norm': 0.6700261335411145, 'learning_rate': 6.534485995958699e-06, 'epoch': 0.42} 42%|████▏ | 9252/22095 [15:46:17<14:25:28, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9253/22095 [15:46:26<20:07:30, 5.64s/it] {'loss': 0.492, 'grad_norm': 0.349521227256, 'learning_rate': 6.533788427268777e-06, 'epoch': 0.42} 42%|████▏ | 9253/22095 [15:46:26<20:07:30, 5.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51456 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9254/22095 [15:46:30<17:37:46, 4.94s/it] {'loss': 0.3542, 'grad_norm': 0.6771202574472491, 'learning_rate': 6.533090825622451e-06, 'epoch': 0.42} 42%|████▏ | 9254/22095 [15:46:30<17:37:46, 4.94s/it] 42%|████▏ | 9255/22095 [15:46:33<16:18:07, 4.57s/it] {'loss': 0.3166, 'grad_norm': 0.6251717433479416, 'learning_rate': 6.532393191034711e-06, 'epoch': 0.42} 42%|████▏ | 9255/22095 [15:46:33<16:18:07, 4.57s/it] 42%|████▏ | 9256/22095 [15:46:37<15:13:31, 4.27s/it] {'loss': 0.3458, 'grad_norm': 0.6941485466388639, 'learning_rate': 6.53169552352055e-06, 'epoch': 0.42} 42%|████▏ | 9256/22095 [15:46:37<15:13:31, 4.27s/it] 42%|████▏ | 9257/22095 [15:46:41<14:40:43, 4.12s/it] {'loss': 0.3605, 'grad_norm': 0.6536270290477265, 'learning_rate': 6.530997823094956e-06, 'epoch': 0.42} 42%|████▏ | 9257/22095 [15:46:41<14:40:43, 4.12s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8359355 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 26075, 'image': 'vrdu_table_final_2/astro-ph.CO/967a1749-7d5f-413f-9805-8aef2b367463.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 42%|████▏ | 9258/22095 [15:46:44<13:32:38, 3.80s/it] {'loss': 0.35, 'grad_norm': 0.6279317153068226, 'learning_rate': 6.530300089772918e-06, 'epoch': 0.42} 42%|████▏ | 9258/22095 [15:46:44<13:32:38, 3.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9259/22095 [15:46:47<12:31:29, 3.51s/it] {'loss': 0.3297, 'grad_norm': 0.6270067678352912, 'learning_rate': 6.529602323569435e-06, 'epoch': 0.42} 42%|████▏ | 9259/22095 [15:46:47<12:31:29, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51153 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9260/22095 [15:46:50<12:38:53, 3.55s/it] {'loss': 0.3209, 'grad_norm': 0.6002916694129556, 'learning_rate': 6.528904524499492e-06, 'epoch': 0.42} 42%|████▏ | 9260/22095 [15:46:50<12:38:53, 3.55s/it] 42%|████▏ | 9261/22095 [15:46:54<12:17:42, 3.45s/it] {'loss': 0.3243, 'grad_norm': 0.5958028898914056, 'learning_rate': 6.5282066925780896e-06, 'epoch': 0.42} 42%|████▏ | 9261/22095 [15:46:54<12:17:42, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (51831 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9262/22095 [15:47:02<18:10:21, 5.10s/it] {'loss': 0.5249, 'grad_norm': 0.41948313179471364, 'learning_rate': 6.527508827820217e-06, 'epoch': 0.42} 42%|████▏ | 9262/22095 [15:47:02<18:10:21, 5.10s/it] 42%|████▏ | 9263/22095 [15:47:06<16:35:32, 4.65s/it] {'loss': 0.3436, 'grad_norm': 0.6650697757035404, 'learning_rate': 6.526810930240872e-06, 'epoch': 0.42} 42%|████▏ | 9263/22095 [15:47:06<16:35:32, 4.65s/it] 42%|████▏ | 9264/22095 [15:47:10<15:23:05, 4.32s/it] {'loss': 0.3139, 'grad_norm': 0.5925651770758924, 'learning_rate': 6.526112999855049e-06, 'epoch': 0.42} 42%|████▏ | 9264/22095 [15:47:10<15:23:05, 4.32s/it] 42%|████▏ | 9265/22095 [15:47:13<14:08:45, 3.97s/it] {'loss': 0.3447, 'grad_norm': 0.6675276809606014, 'learning_rate': 6.525415036677745e-06, 'epoch': 0.42} 42%|████▏ | 9265/22095 [15:47:13<14:08:45, 3.97s/it] 42%|████▏ | 9266/22095 [15:47:16<13:04:40, 3.67s/it] {'loss': 0.3409, 'grad_norm': 0.6131476470944407, 'learning_rate': 6.524717040723956e-06, 'epoch': 0.42} 42%|████▏ | 9266/22095 [15:47:16<13:04:40, 3.67s/it] 42%|████▏ | 9267/22095 [15:47:19<12:56:37, 3.63s/it] {'loss': 0.3838, 'grad_norm': 0.6353899230609578, 'learning_rate': 6.524019012008681e-06, 'epoch': 0.42} 42%|████▏ | 9267/22095 [15:47:19<12:56:37, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9268/22095 [15:47:27<17:31:42, 4.92s/it] {'loss': 0.4586, 'grad_norm': 0.3245527559211798, 'learning_rate': 6.523320950546919e-06, 'epoch': 0.42} 42%|████▏ | 9268/22095 [15:47:27<17:31:42, 4.92s/it] 42%|████▏ | 9269/22095 [15:47:31<16:02:02, 4.50s/it] {'loss': 0.366, 'grad_norm': 0.705392549434997, 'learning_rate': 6.522622856353667e-06, 'epoch': 0.42} 42%|████▏ | 9269/22095 [15:47:31<16:02:02, 4.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation 42%|████▏ | 9270/22095 [15:47:36<16:43:04, 4.69s/it] {'loss': 0.5068, 'grad_norm': 0.3172452665983041, 'learning_rate': 6.521924729443928e-06, 'epoch': 0.42} 42%|████▏ | 9270/22095 [15:47:36<16:43:04, 4.69s/it] 42%|████▏ | 9271/22095 [15:47:39<15:03:06, 4.23s/it] {'loss': 0.3742, 'grad_norm': 0.6478833118357851, 'learning_rate': 6.521226569832699e-06, 'epoch': 0.42} 42%|████▏ | 9271/22095 [15:47:39<15:03:06, 4.23s/it] 42%|████▏ | 9272/22095 [15:47:42<14:00:54, 3.93s/it] {'loss': 0.3379, 'grad_norm': 0.6690294381143386, 'learning_rate': 6.520528377534984e-06, 'epoch': 0.42} 42%|████▏ | 9272/22095 [15:47:42<14:00:54, 3.93s/it] 42%|████▏ | 9273/22095 [15:47:46<14:07:31, 3.97s/it] {'loss': 0.3108, 'grad_norm': 0.5924226382675255, 'learning_rate': 6.519830152565784e-06, 'epoch': 0.42} 42%|████▏ | 9273/22095 [15:47:46<14:07:31, 3.97s/it] 42%|████▏ | 9274/22095 [15:47:50<14:03:45, 3.95s/it] {'loss': 0.3508, 'grad_norm': 0.8549559792432717, 'learning_rate': 6.5191318949401005e-06, 'epoch': 0.42} 42%|████▏ | 9274/22095 [15:47:50<14:03:45, 3.95s/it] 42%|████▏ | 9275/22095 [15:47:54<13:55:44, 3.91s/it] {'loss': 0.3518, 'grad_norm': 0.6788708282638739, 'learning_rate': 6.51843360467294e-06, 'epoch': 0.42} 42%|████▏ | 9275/22095 [15:47:54<13:55:44, 3.91s/it] 42%|████▏ | 9276/22095 [15:47:57<13:24:19, 3.76s/it] {'loss': 0.3614, 'grad_norm': 0.6608690918320347, 'learning_rate': 6.517735281779304e-06, 'epoch': 0.42} 42%|████▏ | 9276/22095 [15:47:57<13:24:19, 3.76s/it] 42%|████▏ | 9277/22095 [15:48:01<12:47:36, 3.59s/it] {'loss': 0.3568, 'grad_norm': 0.644622363537874, 'learning_rate': 6.517036926274198e-06, 'epoch': 0.42} 42%|████▏ | 9277/22095 [15:48:01<12:47:36, 3.59s/it] 42%|████▏ | 9278/22095 [15:48:05<13:23:38, 3.76s/it] {'loss': 0.3761, 'grad_norm': 0.672102533674108, 'learning_rate': 6.51633853817263e-06, 'epoch': 0.42} 42%|████▏ | 9278/22095 [15:48:05<13:23:38, 3.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image 
tokens in the conversation 42%|████▏ | 9279/22095 [15:48:08<12:44:30, 3.58s/it] {'loss': 0.317, 'grad_norm': 0.6160561545018294, 'learning_rate': 6.5156401174896e-06, 'epoch': 0.42} 42%|████▏ | 9279/22095 [15:48:08<12:44:30, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48074 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9280/22095 [15:48:11<11:51:00, 3.33s/it] {'loss': 0.3363, 'grad_norm': 0.6409995525830237, 'learning_rate': 6.514941664240122e-06, 'epoch': 0.42} 42%|████▏ | 9280/22095 [15:48:11<11:51:00, 3.33s/it] 42%|████▏ | 9281/22095 [15:48:15<12:39:42, 3.56s/it] {'loss': 0.2734, 'grad_norm': 0.6213300786894572, 'learning_rate': 6.5142431784391976e-06, 'epoch': 0.42} 42%|████▏ | 9281/22095 [15:48:15<12:39:42, 3.56s/it] 42%|████▏ | 9282/22095 [15:48:18<12:15:57, 3.45s/it] {'loss': 0.3613, 'grad_norm': 0.6648642571051393, 'learning_rate': 6.513544660101841e-06, 'epoch': 0.42} 42%|████▏ | 9282/22095 [15:48:18<12:15:57, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9283/22095 [15:48:27<17:42:09, 4.97s/it] {'loss': 0.4955, 'grad_norm': 0.4573516667736759, 'learning_rate': 6.512846109243056e-06, 'epoch': 0.42} 42%|████▏ | 9283/22095 [15:48:27<17:42:09, 4.97s/it] 42%|████▏ | 9284/22095 [15:48:30<16:06:03, 4.52s/it] {'loss': 0.3704, 'grad_norm': 0.6287135334931294, 'learning_rate': 6.512147525877856e-06, 'epoch': 0.42} 42%|████▏ | 9284/22095 [15:48:30<16:06:03, 4.52s/it] 42%|████▏ | 9285/22095 [15:48:33<14:56:47, 4.20s/it] {'loss': 0.3798, 'grad_norm': 0.6254628698269683, 'learning_rate': 6.5114489100212485e-06, 'epoch': 0.42} 42%|████▏ | 9285/22095 [15:48:33<14:56:47, 4.20s/it] 42%|████▏ | 9286/22095 [15:48:37<13:53:26, 3.90s/it] {'loss': 0.3456, 'grad_norm': 0.6000869513094843, 'learning_rate': 6.510750261688246e-06, 'epoch': 0.42} 42%|████▏ | 9286/22095 [15:48:37<13:53:26, 3.90s/it] 42%|████▏ | 9287/22095 
[15:48:40<13:14:19, 3.72s/it] {'loss': 0.3511, 'grad_norm': 0.6583406363645683, 'learning_rate': 6.510051580893861e-06, 'epoch': 0.42} 42%|████▏ | 9287/22095 [15:48:40<13:14:19, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9288/22095 [15:48:48<17:39:36, 4.96s/it] {'loss': 0.479, 'grad_norm': 0.30321244352810744, 'learning_rate': 6.509352867653106e-06, 'epoch': 0.42} 42%|████▏ | 9288/22095 [15:48:48<17:39:36, 4.96s/it] 42%|████▏ | 9289/22095 [15:48:51<16:05:59, 4.53s/it] {'loss': 0.3494, 'grad_norm': 0.6540125283971915, 'learning_rate': 6.508654121980992e-06, 'epoch': 0.42} 42%|████▏ | 9289/22095 [15:48:51<16:05:59, 4.53s/it] 42%|████▏ | 9290/22095 [15:48:55<15:26:29, 4.34s/it] {'loss': 0.3639, 'grad_norm': 0.6378658016227312, 'learning_rate': 6.507955343892536e-06, 'epoch': 0.42} 42%|████▏ | 9290/22095 [15:48:55<15:26:29, 4.34s/it] 42%|████▏ | 9291/22095 [15:48:59<14:25:18, 4.05s/it] {'loss': 0.3373, 'grad_norm': 0.6512258485466873, 'learning_rate': 6.507256533402749e-06, 'epoch': 0.42} 42%|████▏ | 9291/22095 [15:48:59<14:25:18, 4.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46825 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46425 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108062 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9292/22095 [15:49:02<13:14:00, 3.72s/it] {'loss': 0.3492, 'grad_norm': 0.6278434316788836, 'learning_rate': 6.506557690526649e-06, 'epoch': 0.42} 42%|████▏ | 9292/22095 [15:49:02<13:14:00, 3.72s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8335269 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 1882, 'image': 'vrdu_table_final_2/astro-ph.CO/2f0d8460-67b1-4173-94c8-28ebace550c6.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]} 42%|████▏ | 9293/22095 [15:49:04<12:12:28, 3.43s/it] {'loss': 0.365, 'grad_norm': 0.6065740516331564, 'learning_rate': 6.5058588152792516e-06, 'epoch': 0.42} 42%|████▏ | 9293/22095 [15:49:04<12:12:28, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9294/22095 [15:49:07<11:33:48, 3.25s/it] {'loss': 0.3135, 'grad_norm': 0.6239073368527323, 'learning_rate': 6.5051599076755735e-06, 'epoch': 0.42} 42%|████▏ | 9294/22095 [15:49:07<11:33:48, 3.25s/it] 42%|████▏ | 9295/22095 [15:49:11<12:10:15, 3.42s/it] {'loss': 0.3243, 'grad_norm': 0.6390514081870159, 'learning_rate': 6.50446096773063e-06, 'epoch': 0.42} 42%|████▏ | 
9295/22095 [15:49:11<12:10:15, 3.42s/it] 42%|████▏ | 9296/22095 [15:49:14<11:48:45, 3.32s/it] {'loss': 0.304, 'grad_norm': 0.6657578724741773, 'learning_rate': 6.503761995459443e-06, 'epoch': 0.42} 42%|████▏ | 9296/22095 [15:49:14<11:48:45, 3.32s/it] 42%|████▏ | 9297/22095 [15:49:17<11:44:12, 3.30s/it] {'loss': 0.2998, 'grad_norm': 0.608600570311915, 'learning_rate': 6.503062990877028e-06, 'epoch': 0.42} 42%|████▏ | 9297/22095 [15:49:17<11:44:12, 3.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 6, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8335786 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 6, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 2402, 'image': 'vrdu_table_final_2/astro-ph.CO/2978d904-03a0-4bfd-9758-9c806df47e04.png', 'image_wh': [[14, 6]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}\n...\n\\end{tabular}\n```"}]} 42%|████▏ | 9298/22095 [15:49:21<12:01:01, 3.38s/it] {'loss': 0.3535, 'grad_norm': 0.5960942254073808, 'learning_rate': 6.502363953998406e-06, 'epoch': 0.42} 42%|████▏ | 9298/22095 [15:49:21<12:01:01, 3.38s/it] 42%|████▏ | 9299/22095 [15:49:24<12:05:24, 3.40s/it] {'loss': 0.3158, 'grad_norm': 0.5951803109195455, 'learning_rate': 6.501664884838597e-06, 'epoch': 0.42} 42%|████▏ | 9299/22095 [15:49:24<12:05:24, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 42%|████▏ | 9300/22095 [15:49:34<19:12:13, 5.40s/it] {'loss': 0.4864, 'grad_norm': 0.3982193169565782, 'learning_rate': 6.500965783412621e-06, 'epoch': 0.42} 42%|████▏ | 9300/22095 [15:49:34<19:12:13, 5.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54119 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111878 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9301/22095 [15:49:38<17:10:57, 4.83s/it] {'loss': 0.3853, 'grad_norm': 0.6354665260421296, 'learning_rate': 6.5002666497355015e-06, 'epoch': 0.42} 42%|████▏ | 9301/22095 [15:49:38<17:10:57, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63962 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55927 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49379 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9302/22095 [15:49:41<15:39:50, 4.41s/it] {'loss': 0.3432, 'grad_norm': 0.7922319996822186, 'learning_rate': 6.4995674838222575e-06, 'epoch': 0.42} 42%|████▏ | 9302/22095 [15:49:41<15:39:50, 4.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922962 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 46115, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 
无法确定'}, {'from': 'gpt', 'value': '【解答】解:∵M、N分别是线段AB、BC的中点,∴MB=0.5AB=3cm,NB=0.5BC=2cm,∴MN=MB+NB=3+2=5(cm),'}]} 42%|████▏ | 9303/22095 [15:49:45<14:34:35, 4.10s/it] {'loss': 0.3436, 'grad_norm': 0.720572508906418, 'learning_rate': 6.498868285687916e-06, 'epoch': 0.42} 42%|████▏ | 9303/22095 [15:49:45<14:34:35, 4.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9304/22095 [15:49:49<14:25:11, 4.06s/it] {'loss': 0.3878, 'grad_norm': 1.0918756163190484, 'learning_rate': 6.498169055347498e-06, 'epoch': 0.42} 42%|████▏ | 9304/22095 [15:49:49<14:25:11, 4.06s/it] 42%|████▏ | 9305/22095 [15:49:52<13:55:17, 3.92s/it] {'loss': 0.3567, 'grad_norm': 0.6265199496335899, 'learning_rate': 6.497469792816027e-06, 'epoch': 0.42} 42%|████▏ | 9305/22095 [15:49:52<13:55:17, 3.92s/it] 42%|████▏ | 9306/22095 [15:49:55<12:49:14, 3.61s/it] {'loss': 0.3204, 'grad_norm': 0.6569177766236907, 'learning_rate': 6.49677049810853e-06, 'epoch': 0.42} 42%|████▏ | 9306/22095 [15:49:55<12:49:14, 3.61s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (106300000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9307/22095 [15:49:58<12:14:26, 3.45s/it] {'loss': 0.345, 'grad_norm': 0.5878224560887056, 'learning_rate': 6.4960711712400314e-06, 'epoch': 0.42} 42%|████▏ | 9307/22095 [15:49:58<12:14:26, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83624 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90667 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43258 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66380 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9308/22095 [15:50:02<12:18:25, 3.46s/it] {'loss': 0.3928, 'grad_norm': 0.6379857860705639, 'learning_rate': 6.4953718122255584e-06, 'epoch': 0.42} 42%|████▏ | 9308/22095 [15:50:02<12:18:25, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9309/22095 [15:50:12<19:30:12, 5.49s/it] {'loss': 0.464, 'grad_norm': 0.3598080210324236, 'learning_rate': 6.494672421080139e-06, 'epoch': 0.42} 42%|████▏ | 9309/22095 [15:50:12<19:30:12, 5.49s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [323, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8449411 in VC:s3://internvl-moe-sft-data/. Exception: Image size [323, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 47177, 'image': 'vrdu_texteq/astro-ph.CO/06129f97-8161-43b1-86c7-bffeeea158b7.png', 'image_wh': [[323, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $t_{\\rm i}$ is the initial time.'}]} 42%|████▏ | 9310/22095 [15:50:18<20:04:40, 5.65s/it] {'loss': 0.501, 'grad_norm': 0.33433311563012347, 'learning_rate': 6.493972997818798e-06, 'epoch': 0.42} 42%|████▏ | 9310/22095 [15:50:18<20:04:40, 5.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 42%|████▏ | 9311/22095 [15:50:21<17:42:07, 4.98s/it] {'loss': 0.3916, 'grad_norm': 0.6860341505203552, 'learning_rate': 6.493273542456567e-06, 'epoch': 0.42} 42%|████▏ | 9311/22095 [15:50:21<17:42:07, 4.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9312/22095 [15:50:25<15:55:49, 4.49s/it] {'loss': 0.3437, 'grad_norm': 0.7154400037378822, 'learning_rate': 6.492574055008474e-06, 'epoch': 0.42} 42%|████▏ | 9312/22095 [15:50:25<15:55:49, 4.49s/it] 42%|████▏ | 9313/22095 [15:50:28<15:09:21, 4.27s/it] {'loss': 0.351, 'grad_norm': 0.6669624091081103, 'learning_rate': 6.491874535489547e-06, 'epoch': 0.42} 42%|████▏ | 9313/22095 [15:50:29<15:09:21, 4.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (109904 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83184 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9314/22095 [15:50:36<18:49:19, 5.30s/it] {'loss': 0.492, 'grad_norm': 0.4255153708413403, 'learning_rate': 6.4911749839148195e-06, 'epoch': 0.42} 42%|████▏ | 9314/22095 [15:50:36<18:49:19, 5.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9315/22095 [15:50:40<17:07:34, 4.82s/it] {'loss': 0.3438, 'grad_norm': 0.6582232973062351, 'learning_rate': 6.490475400299321e-06, 'epoch': 0.42} 42%|████▏ | 9315/22095 [15:50:40<17:07:34, 4.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 42%|████▏ | 9316/22095 [15:50:46<18:51:51, 5.31s/it] {'loss': 0.4968, 'grad_norm': 0.35842591242246086, 'learning_rate': 6.489775784658083e-06, 'epoch': 0.42} 42%|████▏ | 9316/22095 [15:50:46<18:51:51, 5.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46474 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51381 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48265 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77351 > 40960). 
Running this sequence through the model will result in indexing errors 42%|████▏ | 9317/22095 [15:50:50<16:48:08, 4.73s/it] {'loss': 0.3256, 'grad_norm': 0.6233884916153226, 'learning_rate': 6.489076137006141e-06, 'epoch': 0.42} 42%|████▏ | 9317/22095 [15:50:50<16:48:08, 4.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93412 > 40960). Running this sequence through the model will result in indexing errors 42%|████▏ | 9318/22095 [15:50:54<16:18:13, 4.59s/it] {'loss': 0.3612, 'grad_norm': 0.626987885497307, 'learning_rate': 6.488376457358525e-06, 'epoch': 0.42} 42%|████▏ | 9318/22095 [15:50:54<16:18:13, 4.59s/it] 42%|████▏ | 9319/22095 [15:50:57<14:32:33, 4.10s/it] {'loss': 0.3438, 'grad_norm': 0.7825176726361815, 'learning_rate': 6.487676745730271e-06, 'epoch': 0.42} 42%|████▏ | 9319/22095 [15:50:57<14:32:33, 4.10s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [270, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8499303 in VC:s3://internvl-moe-sft-data/. Exception: Image size [270, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 32314, 'image': 'vrdu_texteq/astro-ph.CO/f3ba4faf-3fdd-4216-956d-272d50a76692.png', 'image_wh': [[270, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $\\bar{v}_i$ is transverse.'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8965315 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16150, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in results = [self.process_image_unified(file) for file in image_file] File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified image = load_image(image_file, 
    tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924299 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nWhat type of mining is shown in the state of Arizona? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Copper mining is shown in the state of Arizona.\nThe text does not directly mention any mining activities in Arizona, but it is known that Arizona is one of the top copper-producing states in the United States. Copper is a vital mineral used in many industries, including construction and electronics. Therefore, it can be inferred that the mention of Arizona in the text is indirectly referring to copper mining.'}]}
42%|████▏ | 9320/22095 [15:51:01<14:07:56, 3.98s/it] {'loss': 0.3372, 'grad_norm': 0.7636418718424111, 'learning_rate': 6.4869770021364105e-06, 'epoch': 0.42}
42%|████▏ | 9321/22095 [15:51:04<13:28:07, 3.80s/it] {'loss': 0.3473, 'grad_norm': 0.6968989386721643, 'learning_rate': 6.486277226591982e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9322/22095 [15:51:10<16:08:22, 4.55s/it] {'loss': 0.4952, 'grad_norm': 0.41793635105649984, 'learning_rate': 6.4855774191120215e-06, 'epoch': 0.42}
42%|████▏ | 9323/22095 [15:51:14<15:27:34, 4.36s/it] {'loss': 0.3457, 'grad_norm': 0.6678593798332022, 'learning_rate': 6.484877579711565e-06, 'epoch': 0.42}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8897247 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20400, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D为CB段中点,Cd=3,AB=11,则AC长度为()\nA. 4\nB. 5\nC. 6\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
42%|████▏ | 9324/22095 [15:51:18<15:19:44, 4.32s/it] {'loss': 0.34, 'grad_norm': 0.634953767633606, 'learning_rate': 6.484177708405649e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9325/22095 [15:51:30<23:03:06, 6.50s/it] {'loss': 0.482, 'grad_norm': 0.3158100492797705, 'learning_rate': 6.4834778052093125e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (84380 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44702 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9326/22095 [15:51:35<21:17:21, 6.00s/it] {'loss': 0.329, 'grad_norm': 0.5584665390209338, 'learning_rate': 6.482777870137594e-06, 'epoch': 0.42}
42%|████▏ | 9327/22095 [15:51:38<18:09:20, 5.12s/it] {'loss': 0.3376, 'grad_norm': 0.5845534166457408, 'learning_rate': 6.4820779032055335e-06, 'epoch': 0.42}
42%|████▏ | 9328/22095 [15:51:41<15:52:22, 4.48s/it] {'loss': 0.341, 'grad_norm': 0.6833774180728347, 'learning_rate': 6.481377904428171e-06, 'epoch': 0.42}
42%|████▏ | 9329/22095 [15:51:44<14:43:06, 4.15s/it] {'loss': 0.3291, 'grad_norm': 0.6828282315122735, 'learning_rate': 6.4806778738205455e-06, 'epoch': 0.42}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
42%|████▏ | 9330/22095 [15:51:47<13:35:21, 3.83s/it] {'loss': 0.319, 'grad_norm': 0.613631604631963, 'learning_rate': 6.479977811397702e-06, 'epoch': 0.42}
42%|████▏ | 9331/22095 [15:51:52<14:05:30, 3.97s/it] {'loss': 0.3601, 'grad_norm': 0.6997442427758623, 'learning_rate': 6.479277717174679e-06, 'epoch': 0.42}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    raise ValueError(
ValueError: Number of image tokens ['data/dialogs/other_screenshot/original/ProfileDetailsDialog_1739921015.458943.png'] does not match number of images None
[Try #0] Failed to fetch sample 1868189 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. Exception: Number of image tokens ['data/dialogs/other_screenshot/original/ProfileDetailsDialog_1739921015.458943.png'] does not match number of images None
Problematic sample: {'image': 'data/dialogs/other_screenshot/original/ProfileDetailsDialog_1739921015.458943.png', 'conversations': [], 'image_id': 'data/dialogs/other_screenshot/original/ProfileDetailsDialog_1739921015.458943.png'}
42%|████▏ | 9332/22095 [15:51:55<13:17:02, 3.75s/it] {'loss': 0.3411, 'grad_norm': 0.6189173652158538, 'learning_rate': 6.478577591166523e-06, 'epoch': 0.42}
42%|████▏ | 9333/22095 [15:51:58<12:48:13, 3.61s/it] {'loss': 0.2974, 'grad_norm': 0.5729779897205061, 'learning_rate': 6.477877433388274e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (79547 > 40960).
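The "Number of image tokens ... does not match number of images" failures above come from samples whose conversation text carries no image placeholder (or an empty 'conversations' list) while an image is still attached. A minimal pre-filter in the spirit of that check can be sketched as follows; the helper name and the '<image>' placeholder convention are assumptions for illustration, not the actual data_qwen_2.py code:

```python
IMAGE_TOKEN = "<image>"  # placeholder convention assumed; the real token may differ

def image_tokens_consistent(sample):
    """Return True when the number of image placeholders in the conversation
    matches the number of images attached to the sample."""
    image = sample.get("image")
    if image is None:
        n_images = 0
    elif isinstance(image, str):
        n_images = 1
    else:
        n_images = len(image)
    n_tokens = sum(turn.get("value", "").count(IMAGE_TOKEN)
                   for turn in sample.get("conversations", []))
    return n_tokens == n_images
```

Running such a check during data preparation would skip samples like the ProfileDetailsDialog one above ('conversations': [] with an image attached) instead of letting them raise inside __getitem__ at training time.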
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91820 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9334/22095 [15:52:09<19:57:34, 5.63s/it] {'loss': 0.4943, 'grad_norm': 0.3622437132156431, 'learning_rate': 6.477177243854978e-06, 'epoch': 0.42}
42%|████▏ | 9335/22095 [15:52:14<19:27:40, 5.49s/it] {'loss': 0.483, 'grad_norm': 0.32321408167241117, 'learning_rate': 6.476477022581681e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (72269 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114523 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9336/22095 [15:52:17<17:34:26, 4.96s/it] {'loss': 0.326, 'grad_norm': 0.6281626093772569, 'learning_rate': 6.475776769583426e-06, 'epoch': 0.42}
42%|████▏ | 9337/22095 [15:52:21<16:16:05, 4.59s/it] {'loss': 0.3697, 'grad_norm': 0.6403466744410745, 'learning_rate': 6.475076484875262e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (58586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44529 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47750 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9338/22095 [15:52:24<14:50:11, 4.19s/it] {'loss': 0.3769, 'grad_norm': 0.6502452275171857, 'learning_rate': 6.4743761684722354e-06, 'epoch': 0.42}
42%|████▏ | 9339/22095 [15:52:27<13:36:44, 3.84s/it] {'loss': 0.339, 'grad_norm': 0.5923763294515152, 'learning_rate': 6.4736758203893915e-06, 'epoch': 0.42}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 81, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8377781 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 81, 100, 100] is too small. Minimum size is 28.
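Several tracebacks in this log ("Image size [...] is too small. Minimum size is 28.") are raised for images narrower than 28 px on a side, which matches the smallest tile Qwen2.5-VL's vision encoder can process (14 px patches with 2x2 patch merging). A sketch of a pre-filter that drops such samples before training starts; the helper name is hypothetical:

```python
MIN_SIDE = 28  # smallest side Qwen2.5-VL can encode: 14 px patch * 2x2 merge

def has_usable_images(sample, min_side=MIN_SIDE):
    """Return False for samples whose recorded width/height ('image_wh')
    falls below the minimum side length, e.g. the [[170, 21]] and [[17, 81]]
    samples reported in this log."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))
```

Filtering on the stored 'image_wh' metadata avoids opening every file; it only works when that metadata is accurate (the [[0, 0]] sample later in the log suggests it is not always populated).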
Problematic sample: {'id': 44564, 'image': 'vrdu_table_final_2/astro-ph.CO/fe833fab-473f-462c-bdd0-0c0a358c8f1d.png', 'image_wh': [[17, 81]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$x$ \\\\$y$ \\\\ $z$ \\end{tabular}\n```"}]}
42%|████▏ | 9340/22095 [15:52:31<13:34:50, 3.83s/it] {'loss': 0.3564, 'grad_norm': 0.6721740786041263, 'learning_rate': 6.472975440641781e-06, 'epoch': 0.42}
42%|████▏ | 9341/22095 [15:52:35<12:58:52, 3.66s/it] {'loss': 0.3491, 'grad_norm': 0.6428716930000375, 'learning_rate': 6.472275029244452e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9342/22095 [15:52:44<18:39:27, 5.27s/it] {'loss': 0.4759, 'grad_norm': 0.4617371834957663, 'learning_rate': 6.471574586212454e-06, 'epoch': 0.42}
42%|████▏ | 9343/22095 [15:52:51<21:09:43, 5.97s/it] {'loss': 0.4809, 'grad_norm': 0.39874311204630764, 'learning_rate': 6.470874111560837e-06, 'epoch': 0.42}
42%|████▏ | 9344/22095 [15:53:01<25:38:07, 7.24s/it] {'loss': 0.4753, 'grad_norm': 0.3015463366870361, 'learning_rate': 6.470173605304655e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (105684 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41304 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76336 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55493 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9345/22095 [15:53:05<22:03:17, 6.23s/it] {'loss': 0.3577, 'grad_norm': 0.6940333918436483, 'learning_rate': 6.469473067458956e-06, 'epoch': 0.42}
42%|████▏ | 9346/22095 [15:53:15<25:17:28, 7.14s/it] {'loss': 0.4999, 'grad_norm': 0.3984963380341798, 'learning_rate': 6.468772498038795e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
42%|████▏ | 9347/22095 [15:53:18<21:26:15, 6.05s/it] {'loss': 0.3329, 'grad_norm': 0.6775492573141172, 'learning_rate': 6.468071897059222e-06, 'epoch': 0.42}
42%|████▏ | 9348/22095 [15:53:22<19:06:15, 5.40s/it] {'loss': 0.339, 'grad_norm': 0.7675894133760427, 'learning_rate': 6.467371264535295e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (49893 > 40960).
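The repeated tokenizer warning (e.g. "84380 > 40960") means some packed conversations exceed the model's 40960-token context; as the warning says, they will become indexing errors if they reach the model. One option is to gate samples on length up front and drop or re-chunk the over-long ones. A sketch of just the length gate, with a hypothetical helper name:

```python
MAX_LEN = 40960  # model context length reported in the warnings above

def fits_context(token_ids, max_len=MAX_LEN):
    """Gate a tokenized sample on context length. Plain truncation would
    silently cut off the assistant's answer at the end of a conversation,
    so for SFT data it is usually safer to drop or re-chunk the sample."""
    return len(token_ids) <= max_len
```

For image-heavy samples the token count is dominated by vision tokens, so capping image resolution at preprocessing time is another way to bring these sequences under the limit.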
Running this sequence through the model will result in indexing errors
42%|████▏ | 9349/22095 [15:53:26<17:16:04, 4.88s/it] {'loss': 0.4015, 'grad_norm': 0.658077152410635, 'learning_rate': 6.466670600482065e-06, 'epoch': 0.42}
42%|████▏ | 9350/22095 [15:53:30<16:40:15, 4.71s/it] {'loss': 0.3805, 'grad_norm': 0.6308682728934399, 'learning_rate': 6.465969904914589e-06, 'epoch': 0.42}
42%|████▏ | 9351/22095 [15:53:33<14:49:55, 4.19s/it] {'loss': 0.3121, 'grad_norm': 0.5976569284225053, 'learning_rate': 6.4652691778479215e-06, 'epoch': 0.42}
42%|████▏ | 9352/22095 [15:53:37<14:50:00, 4.19s/it] {'loss': 0.3197, 'grad_norm': 0.7100140072873342, 'learning_rate': 6.4645684192971195e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9353/22095 [15:53:47<20:32:59, 5.81s/it] {'loss': 0.4765, 'grad_norm': 0.6518239354126296, 'learning_rate': 6.463867629277241e-06, 'epoch': 0.42}
42%|████▏ | 9354/22095 [15:53:50<18:20:01, 5.18s/it] {'loss': 0.3341, 'grad_norm': 0.6269798987646074, 'learning_rate': 6.463166807803342e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (55874 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9355/22095 [15:53:53<15:49:46, 4.47s/it] {'loss': 0.338, 'grad_norm': 0.7638670907037846, 'learning_rate': 6.462465954890482e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (76845 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53974 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9356/22095 [15:53:57<15:18:32, 4.33s/it] {'loss': 0.3279, 'grad_norm': 0.6502171356431934, 'learning_rate': 6.46176507055372e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9357/22095 [15:54:04<17:35:22, 4.97s/it] {'loss': 0.4782, 'grad_norm': 0.33969432269480376, 'learning_rate': 6.461064154808118e-06, 'epoch': 0.42}
42%|████▏ | 9358/22095 [15:54:07<16:01:24, 4.53s/it] {'loss': 0.3244, 'grad_norm': 0.7200506948589684, 'learning_rate': 6.460363207668734e-06, 'epoch': 0.42}
42%|████▏ | 9359/22095 [15:54:11<15:19:27, 4.33s/it] {'loss': 0.2856, 'grad_norm': 0.59014599078168, 'learning_rate': 6.45966222915063e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (49605 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47770 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (153457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103180 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97587 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9360/22095 [15:54:16<16:16:58, 4.60s/it] {'loss': 0.3552, 'grad_norm': 0.7047584502504611, 'learning_rate': 6.4589612192688656e-06, 'epoch': 0.42}
42%|████▏ | 9361/22095 [15:54:19<14:46:12, 4.18s/it] {'loss': 0.3337, 'grad_norm': 0.6219126607206661, 'learning_rate': 6.458260178038508e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (52637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71991 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9362/22095 [15:54:23<13:51:52, 3.92s/it] {'loss': 0.3628, 'grad_norm': 0.6525766470892861, 'learning_rate': 6.457559105474617e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9363/22095 [15:54:29<16:03:31, 4.54s/it] {'loss': 0.5076, 'grad_norm': 0.49286730889390995, 'learning_rate': 6.456858001592257e-06, 'epoch': 0.42}
42%|████▏ | 9364/22095 [15:54:38<21:29:27, 6.08s/it] {'loss': 0.5279, 'grad_norm': 0.43228591511802883, 'learning_rate': 6.456156866406493e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 364, but got module 1
42%|████▏ | 9365/22095 [15:54:42<19:18:43, 5.46s/it] {'loss': 0.3571, 'grad_norm': 0.7548313358238873, 'learning_rate': 6.45545569993239e-06, 'epoch': 0.42}
42%|████▏ | 9366/22095 [15:54:46<17:42:18, 5.01s/it] {'loss': 0.2961, 'grad_norm': 0.6385538040486504, 'learning_rate': 6.454754502185015e-06, 'epoch': 0.42}
42%|████▏ | 9367/22095 [15:54:50<16:30:25, 4.67s/it] {'loss': 0.3793, 'grad_norm': 0.7150213127800817, 'learning_rate': 6.454053273179435e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9368/22095 [15:55:00<21:38:33, 6.12s/it] {'loss': 0.4684, 'grad_norm': 0.42403833084216347, 'learning_rate': 6.453352012930713e-06, 'epoch': 0.42}
42%|████▏ | 9369/22095 [15:55:03<18:31:25, 5.24s/it] {'loss': 0.4078, 'grad_norm': 0.8437666534491314, 'learning_rate': 6.452650721453921e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
42%|████▏ | 9370/22095 [15:55:11<21:02:15, 5.95s/it] {'loss': 0.5022, 'grad_norm': 0.44738550576229486, 'learning_rate': 6.451949398764127e-06, 'epoch': 0.42}
42%|████▏ | 9371/22095 [15:55:14<18:41:04, 5.29s/it] {'loss': 0.3763, 'grad_norm': 0.6883373927932057, 'learning_rate': 6.451248044876399e-06, 'epoch': 0.42}
42%|████▏ | 9372/22095 [15:55:18<16:57:03, 4.80s/it] {'loss': 0.3658, 'grad_norm': 0.6465158845984057, 'learning_rate': 6.450546659805807e-06, 'epoch': 0.42}
42%|████▏ | 9373/22095 [15:55:22<16:28:27, 4.66s/it] {'loss': 0.3262, 'grad_norm': 0.5976605595425212, 'learning_rate': 6.449845243567424e-06, 'epoch': 0.42}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [370, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8505574 in VC:s3://internvl-moe-sft-data/. Exception: Image size [370, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64543, 'image': 'vrdu_texteq/astro-ph.CO/1256454e-320e-4d96-be9f-8ee992ccf02c.png', 'image_wh': [[370, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'where $\\nu_1$ and $\\nu_2$ are constants.'}]}
42%|████▏ | 9374/22095 [15:55:27<16:10:28, 4.58s/it] {'loss': 0.3406, 'grad_norm': 0.6650479113683798, 'learning_rate': 6.449143796176318e-06, 'epoch': 0.42}
42%|████▏ | 9375/22095 [15:55:31<15:38:22, 4.43s/it] {'loss': 0.3514, 'grad_norm': 0.6812372296874605, 'learning_rate': 6.448442317647563e-06, 'epoch': 0.42}
42%|████▏ | 9376/22095 [15:55:34<14:41:31, 4.16s/it] {'loss': 0.3203, 'grad_norm': 0.5904246462988874, 'learning_rate': 6.447740807996232e-06, 'epoch': 0.42}
42%|████▏ | 9377/22095 [15:55:37<13:24:03, 3.79s/it] {'loss': 0.3109, 'grad_norm': 0.6512807613106717, 'learning_rate': 6.447039267237397e-06, 'epoch': 0.42}
42%|████▏ | 9378/22095 [15:55:40<12:26:30, 3.52s/it] {'loss': 0.3841, 'grad_norm': 0.7020948553285277, 'learning_rate': 6.446337695386132e-06, 'epoch': 0.42}
42%|████▏ | 9379/22095 [15:55:44<12:47:08, 3.62s/it] {'loss': 0.3559, 'grad_norm': 0.6360695297590829, 'learning_rate': 6.445636092457512e-06, 'epoch': 0.42}
42%|████▏ | 9380/22095 [15:55:47<12:41:49, 3.59s/it] {'loss': 0.303, 'grad_norm': 0.6904837117783633, 'learning_rate': 6.444934458466614e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (80850 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45927 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84257 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50616 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9381/22095 [15:55:51<12:10:12, 3.45s/it] {'loss': 0.3771, 'grad_norm': 0.7228940949355944, 'learning_rate': 6.444232793428511e-06, 'epoch': 0.42}
42%|████▏ | 9382/22095 [15:55:54<11:55:49, 3.38s/it] {'loss': 0.3762, 'grad_norm': 0.6104398617473498, 'learning_rate': 6.4435310973582795e-06, 'epoch': 0.42}
42%|████▏ | 9383/22095 [15:55:57<11:57:18, 3.39s/it] {'loss': 0.3394, 'grad_norm': 0.579187174280872, 'learning_rate': 6.442829370271e-06, 'epoch': 0.42}
42%|████▏ | 9384/22095 [15:56:00<11:39:14, 3.30s/it] {'loss': 0.3552, 'grad_norm': 0.7785916025725149, 'learning_rate': 6.442127612181747e-06, 'epoch': 0.42}
42%|████▏ | 9385/22095 [15:56:04<11:44:08, 3.32s/it] {'loss': 0.3298, 'grad_norm': 0.6105297157822821, 'learning_rate': 6.441425823105603e-06, 'epoch': 0.42}
42%|████▏ | 9386/22095 [15:56:07<11:19:17, 3.21s/it] {'loss': 0.3274, 'grad_norm': 0.6995238406533221, 'learning_rate': 6.440724003057643e-06, 'epoch': 0.42}
42%|████▏ | 9387/22095 [15:56:11<12:06:39, 3.43s/it] {'loss': 0.3037, 'grad_norm': 0.6808658548787668, 'learning_rate': 6.440022152052951e-06, 'epoch': 0.42}
42%|████▏ | 9388/22095 [15:56:14<12:25:59, 3.52s/it] {'loss': 0.3627, 'grad_norm': 0.6164624205567748, 'learning_rate': 6.4393202701066046e-06, 'epoch': 0.42}
Invalidate trace cache @ step 2: expected module 1, but got module 364
42%|████▏ | 9389/22095 [15:56:20<14:24:09, 4.08s/it] {'loss': 0.4658, 'grad_norm': 0.5725999624006355, 'learning_rate': 6.4386183572336854e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (53452 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45867 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41345 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81309 > 40960). Running this sequence through the model will result in indexing errors
42%|████▏ | 9390/22095 [15:56:23<13:49:56, 3.92s/it] {'loss': 0.3545, 'grad_norm': 0.6049117002644915, 'learning_rate': 6.437916413449278e-06, 'epoch': 0.42}
Token indices sequence length is longer than the specified maximum sequence length for this model (48889 > 40960).
Running this sequence through the model will result in indexing errors
43%|████▎ | 9391/22095 [15:56:27<13:32:49, 3.84s/it] {'loss': 0.3626, 'grad_norm': 0.6477856887287778, 'learning_rate': 6.437214438768462e-06, 'epoch': 0.43}
43%|████▎ | 9392/22095 [15:56:31<13:24:49, 3.80s/it] {'loss': 0.3455, 'grad_norm': 0.6093961114293333, 'learning_rate': 6.436512433206321e-06, 'epoch': 0.43}
43%|████▎ | 9393/22095 [15:56:34<12:34:28, 3.56s/it] {'loss': 0.3519, 'grad_norm': 0.6660332275889961, 'learning_rate': 6.435810396777941e-06, 'epoch': 0.43}
43%|████▎ | 9394/22095 [15:56:38<13:14:23, 3.75s/it] {'loss': 0.3201, 'grad_norm': 0.6843726330753063, 'learning_rate': 6.435108329498404e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (113225 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93722 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63239 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80769 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78006 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9395/22095 [15:56:41<12:17:00, 3.48s/it] {'loss': 0.3305, 'grad_norm': 0.6189098497854676, 'learning_rate': 6.434406231382797e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (94263 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9396/22095 [15:56:44<12:26:07, 3.53s/it] {'loss': 0.3814, 'grad_norm': 0.6627815625904901, 'learning_rate': 6.433704102446207e-06, 'epoch': 0.43}
43%|████▎ | 9397/22095 [15:56:48<12:47:20, 3.63s/it] {'loss': 0.3447, 'grad_norm': 0.6338700019878458, 'learning_rate': 6.433001942703717e-06, 'epoch': 0.43}
Traceback (most recent call last):
Invalidate trace cache @ step 2: expected module 1, but got module 364
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307748 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
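The "[Try #0] Failed to fetch sample ..." lines show the dataset retrying after a bad sample instead of killing the run. A common shape for such a wrapper looks like the following; this is a generic sketch with invented names, not the actual data_qwen_2.py implementation:

```python
import random

def getitem_with_retries(get_item, index, dataset_len, max_tries=10):
    """Call get_item(index); on failure, log the exception in the same
    '[Try #N] ...' style as the training log and retry with a random index."""
    for attempt in range(max_tries):
        try:
            return get_item(index)
        except Exception as exc:
            print(f"[Try #{attempt}] Failed to fetch sample {index}. Exception: {exc}")
            index = random.randrange(dataset_len)
    raise RuntimeError(f"gave up after {max_tries} failed samples")
```

Resampling keeps the run alive, but every "[Try #N]" line is a sample silently swapped out; a persistently failing shard shows up only as noise like the lines above, so these messages are worth aggregating rather than ignoring.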
Problematic sample: {'image': 'TB29pORthRDOuFjSZFzXXcIipXa_!!3165088771.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract text from the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n送\n心盘+2个花筒+垫纸\n特价包邮\n冠军\n全网销量\n8点前\n当天\n下单都发货\n新款13件套+水晶+玻璃罩'}]}
43%|████▎ | 9398/22095 [15:56:58<18:56:21, 5.37s/it] {'loss': 0.4892, 'grad_norm': 0.4237312118309754, 'learning_rate': 6.432299752170419e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9399/22095 [15:57:07<23:21:31, 6.62s/it] {'loss': 0.5054, 'grad_norm': 0.37646514307108286, 'learning_rate': 6.431597530861396e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 364, but got module 1
43%|████▎ | 9400/22095 [15:57:10<19:43:29, 5.59s/it] {'loss': 0.3495, 'grad_norm': 0.7563459022781303, 'learning_rate': 6.430895278791739e-06, 'epoch': 0.43}
43%|████▎ | 9401/22095 [15:57:14<18:03:08, 5.12s/it] {'loss': 0.3455, 'grad_norm': 0.6547677278910567, 'learning_rate': 6.4301929959765375e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (50920 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71248 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86587 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60093 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9402/22095 [15:57:17<15:47:34, 4.48s/it] {'loss': 0.3177, 'grad_norm': 0.6676633441951189, 'learning_rate': 6.429490682430881e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9403/22095 [15:57:27<21:03:47, 5.97s/it] {'loss': 0.4467, 'grad_norm': 0.36912778358643694, 'learning_rate': 6.42878833816986e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (45812 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43473 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120365 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9404/22095 [15:57:30<18:22:51, 5.21s/it] {'loss': 0.3217, 'grad_norm': 0.7687649341155516, 'learning_rate': 6.428085963208567e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (47388 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9405/22095 [15:57:33<15:50:45, 4.50s/it] {'loss': 0.3548, 'grad_norm': 0.6299817488297312, 'learning_rate': 6.427383557562091e-06, 'epoch': 0.43}
43%|████▎ | 9406/22095 [15:57:37<14:53:30, 4.22s/it] {'loss': 0.3691, 'grad_norm': 0.624422589718829, 'learning_rate': 6.426681121245527e-06, 'epoch': 0.43}
43%|████▎ | 9407/22095 [15:57:40<14:24:06, 4.09s/it] {'loss': 0.3522, 'grad_norm': 0.6652405335093032, 'learning_rate': 6.4259786542739676e-06, 'epoch': 0.43}
43%|████▎ | 9408/22095 [15:57:43<13:14:28, 3.76s/it] {'loss': 0.4053, 'grad_norm': 0.6581907622201878, 'learning_rate': 6.425276156662506e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8340547 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7191, 'image': 'vrdu_table_final_2/astro-ph.CO/5a4c928d-1154-4e9f-bc05-7105816d329a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]}
43%|████▎ | 9409/22095 [15:57:46<12:32:43, 3.56s/it] {'loss': 0.3439, 'grad_norm': 0.6736635664062993, 'learning_rate': 6.424573628426239e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
43%|████▎ | 9410/22095 [15:57:56<18:55:30, 5.37s/it] {'loss': 0.4577, 'grad_norm': 0.35440063532291827, 'learning_rate': 6.423871069580256e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (42708 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46336 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9411/22095 [15:57:59<16:37:01, 4.72s/it] {'loss': 0.3462, 'grad_norm': 0.6730631122694064, 'learning_rate': 6.423168480139661e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9412/22095 [15:58:08<20:30:32, 5.82s/it] {'loss': 0.4678, 'grad_norm': 0.2975473247448006, 'learning_rate': 6.4224658601195445e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (44862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79364 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9413/22095 [15:58:11<17:43:34, 5.03s/it] {'loss': 0.3254, 'grad_norm': 0.7888191764002732, 'learning_rate': 6.4217632095350046e-06, 'epoch': 0.43}
43%|████▎ | 9414/22095 [15:58:15<16:27:11, 4.67s/it] {'loss': 0.3372, 'grad_norm': 0.6764281596247608, 'learning_rate': 6.421060528401141e-06, 'epoch': 0.43}
43%|████▎ | 9415/22095 [15:58:18<14:51:39, 4.22s/it] {'loss': 0.3741, 'grad_norm': 0.6446197101185022, 'learning_rate': 6.42035781673305e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047672 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]} 43%|████▎ | 9416/22095 [15:58:21<13:29:39, 3.83s/it] {'loss': 0.3316, 'grad_norm': 0.6554610625181327, 'learning_rate': 6.419655074545833e-06, 'epoch': 0.43} 43%|████▎ | 9416/22095 [15:58:21<13:29:39, 3.83s/it] 43%|████▎ | 9417/22095 [15:58:24<12:43:15, 3.61s/it] {'loss': 0.3527, 'grad_norm': 0.7530874118558856, 'learning_rate': 6.41895230185459e-06, 'epoch': 0.43} 43%|████▎ | 9417/22095 [15:58:24<12:43:15, 3.61s/it] 43%|████▎ | 9418/22095 [15:58:27<11:55:17, 3.39s/it] {'loss': 0.3683, 'grad_norm': 0.670199306873179, 'learning_rate': 6.418249498674417e-06, 'epoch': 0.43} 43%|████▎ | 9418/22095 [15:58:27<11:55:17, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (130997 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52062 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98735 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9419/22095 [15:58:36<18:17:50, 5.20s/it] {'loss': 0.4751, 'grad_norm': 0.36679226234773377, 'learning_rate': 6.41754666502042e-06, 'epoch': 0.43} 43%|████▎ | 9419/22095 [15:58:36<18:17:50, 5.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41470 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54790 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55971 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9420/22095 [15:58:40<16:40:26, 4.74s/it] {'loss': 0.3224, 'grad_norm': 0.6648165630423366, 'learning_rate': 6.416843800907698e-06, 'epoch': 0.43} 43%|████▎ | 9420/22095 [15:58:40<16:40:26, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9421/22095 [15:58:49<20:59:34, 5.96s/it] {'loss': 0.4722, 'grad_norm': 0.34829911958308424, 'learning_rate': 6.416140906351355e-06, 'epoch': 0.43} 43%|████▎ | 9421/22095 [15:58:49<20:59:34, 5.96s/it] 43%|████▎ | 9422/22095 [15:58:53<18:47:47, 5.34s/it] {'loss': 0.3925, 'grad_norm': 0.6661627189350696, 'learning_rate': 6.4154379813664926e-06, 'epoch': 0.43} 43%|████▎ | 9422/22095 [15:58:53<18:47:47, 5.34s/it] 43%|████▎ | 9423/22095 [15:58:56<16:22:38, 4.65s/it] {'loss': 0.3515, 'grad_norm': 0.6552253402331372, 'learning_rate': 6.4147350259682155e-06, 'epoch': 0.43} 43%|████▎ | 9423/22095 [15:58:56<16:22:38, 4.65s/it] 43%|████▎ | 9424/22095 [15:58:58<14:33:25, 4.14s/it] {'loss': 0.3435, 'grad_norm': 0.6309157437493499, 'learning_rate': 6.414032040171627e-06, 'epoch': 0.43} 43%|████▎ | 9424/22095 [15:58:59<14:33:25, 4.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79623 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85872 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9425/22095 [15:59:02<13:50:41, 3.93s/it] {'loss': 0.3759, 'grad_norm': 0.6562280887615455, 'learning_rate': 6.413329023991834e-06, 'epoch': 0.43} 43%|████▎ | 9425/22095 [15:59:02<13:50:41, 3.93s/it] 43%|████▎ | 9426/22095 [15:59:06<13:45:40, 3.91s/it] {'loss': 0.3487, 'grad_norm': 0.6730454798509388, 'learning_rate': 6.412625977443939e-06, 'epoch': 0.43} 43%|████▎ | 9426/22095 [15:59:06<13:45:40, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9427/22095 [15:59:13<16:59:01, 4.83s/it] {'loss': 0.5055, 'grad_norm': 0.3568611834228988, 'learning_rate': 6.411922900543053e-06, 'epoch': 0.43} 43%|████▎ | 9427/22095 [15:59:13<16:59:01, 4.83s/it] 43%|████▎ | 9428/22095 [15:59:22<21:57:33, 6.24s/it] {'loss': 0.4881, 'grad_norm': 0.3593917177276333, 'learning_rate': 6.411219793304278e-06, 'epoch': 0.43} 43%|████▎ | 9428/22095 [15:59:22<21:57:33, 6.24s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (49981 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9429/22095 [15:59:27<20:00:28, 5.69s/it] {'loss': 0.3341, 'grad_norm': 0.6544861807828374, 'learning_rate': 6.410516655742725e-06, 'epoch': 0.43} 43%|████▎ | 9429/22095 [15:59:27<20:00:28, 5.69s/it] 43%|████▎ | 9430/22095 [15:59:31<18:34:15, 5.28s/it] {'loss': 0.3393, 'grad_norm': 0.655777602680432, 'learning_rate': 6.4098134878735005e-06, 'epoch': 0.43} 43%|████▎ | 9430/22095 [15:59:31<18:34:15, 5.28s/it] 43%|████▎ | 9431/22095 [15:59:34<16:24:38, 4.67s/it] {'loss': 0.2939, 'grad_norm': 0.6226018999883778, 'learning_rate': 6.409110289711715e-06, 'epoch': 0.43} 43%|████▎ | 9431/22095 [15:59:34<16:24:38, 4.67s/it] 43%|████▎ | 9432/22095 [15:59:38<15:29:33, 4.40s/it] {'loss': 0.364, 'grad_norm': 0.7018429768891283, 'learning_rate': 6.4084070612724765e-06, 'epoch': 0.43} 43%|████▎ | 9432/22095 [15:59:38<15:29:33, 4.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9433/22095 [15:59:48<20:47:55, 5.91s/it] {'loss': 0.4937, 'grad_norm': 0.34061795632785796, 'learning_rate': 6.407703802570896e-06, 'epoch': 0.43} 43%|████▎ | 9433/22095 [15:59:48<20:47:55, 5.91s/it] 43%|████▎ | 9434/22095 [15:59:51<18:42:31, 5.32s/it] {'loss': 0.3767, 'grad_norm': 0.7167933010917638, 'learning_rate': 6.407000513622083e-06, 'epoch': 0.43} 43%|████▎ | 9434/22095 [15:59:51<18:42:31, 5.32s/it] 43%|████▎ | 9435/22095 [15:59:55<17:13:43, 4.90s/it] {'loss': 0.3876, 'grad_norm': 0.643637606655535, 'learning_rate': 6.4062971944411514e-06, 'epoch': 0.43} 43%|████▎ | 9435/22095 [15:59:55<17:13:43, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62140 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67442 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9436/22095 [15:59:59<15:32:35, 4.42s/it] {'loss': 0.3753, 'grad_norm': 0.6931332491271285, 'learning_rate': 6.405593845043212e-06, 'epoch': 0.43} 43%|████▎ | 9436/22095 [15:59:59<15:32:35, 4.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9437/22095 [16:00:08<20:55:09, 5.95s/it] {'loss': 0.4618, 'grad_norm': 0.3162355029302319, 'learning_rate': 6.4048904654433785e-06, 'epoch': 0.43} 43%|████▎ | 9437/22095 [16:00:08<20:55:09, 5.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9438/22095 [16:00:12<18:22:26, 5.23s/it] {'loss': 0.347, 'grad_norm': 0.7194911038746254, 'learning_rate': 6.4041870556567645e-06, 'epoch': 0.43} 43%|████▎ | 9438/22095 [16:00:12<18:22:26, 5.23s/it] 43%|████▎ | 9439/22095 [16:00:16<17:12:02, 4.89s/it] {'loss': 0.3973, 'grad_norm': 0.6319541792320363, 'learning_rate': 6.4034836156984805e-06, 'epoch': 0.43} 43%|████▎ | 9439/22095 [16:00:16<17:12:02, 4.89s/it] 43%|████▎ | 9440/22095 [16:00:19<15:00:24, 4.27s/it] {'loss': 0.3761, 'grad_norm': 0.6732680070654008, 'learning_rate': 6.4027801455836466e-06, 'epoch': 0.43} 43%|████▎ | 9440/22095 [16:00:19<15:00:24, 4.27s/it] 43%|████▎ | 9441/22095 [16:00:23<14:48:08, 4.21s/it] {'loss': 0.3256, 'grad_norm': 0.5858062417249877, 'learning_rate': 6.402076645327374e-06, 'epoch': 0.43} 43%|████▎ | 9441/22095 [16:00:23<14:48:08, 4.21s/it] 43%|████▎ | 9442/22095 [16:00:26<13:25:03, 3.82s/it] {'loss': 0.3255, 'grad_norm': 0.6912276975662498, 'learning_rate': 6.401373114944781e-06, 'epoch': 0.43} 43%|████▎ | 9442/22095 [16:00:26<13:25:03, 3.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53805 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9443/22095 [16:00:28<12:13:51, 3.48s/it] {'loss': 0.3305, 'grad_norm': 0.5969928258655933, 'learning_rate': 6.400669554450985e-06, 'epoch': 0.43} 43%|████▎ | 9443/22095 [16:00:28<12:13:51, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047217 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 8\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Token indices sequence length is longer than the specified maximum sequence length for this model (138261 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9444/22095 [16:00:36<17:08:09, 4.88s/it] {'loss': 0.4703, 'grad_norm': 0.3981362523108663, 'learning_rate': 6.3999659638611e-06, 'epoch': 0.43} 43%|████▎ | 9444/22095 [16:00:36<17:08:09, 4.88s/it] 43%|████▎ | 9445/22095 [16:00:40<15:33:24, 4.43s/it] {'loss': 0.3542, 'grad_norm': 0.6625477549277766, 'learning_rate': 6.399262343190247e-06, 'epoch': 0.43} 43%|████▎ | 9445/22095 [16:00:40<15:33:24, 4.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9446/22095 [16:00:43<14:12:26, 4.04s/it] {'loss': 0.3116, 'grad_norm': 1.0213337138421563, 'learning_rate': 6.398558692453545e-06, 'epoch': 0.43} 43%|████▎ | 9446/22095 [16:00:43<14:12:26, 4.04s/it] 43%|████▎ | 9447/22095 [16:00:46<12:54:43, 3.68s/it] {'loss': 0.3276, 'grad_norm': 0.6018532829667804, 'learning_rate': 6.397855011666109e-06, 'epoch': 0.43} 43%|████▎ | 9447/22095 [16:00:46<12:54:43, 3.68s/it] 43%|████▎ | 9448/22095 [16:00:49<12:54:24, 3.67s/it] {'loss': 0.3211, 'grad_norm': 0.6584790078653553, 'learning_rate': 6.397151300843065e-06, 'epoch': 0.43} 43%|████▎ | 9448/22095 [16:00:49<12:54:24, 3.67s/it] 43%|████▎ | 9449/22095 [16:00:53<12:43:33, 3.62s/it] {'loss': 0.3514, 'grad_norm': 1.0077965139681038, 'learning_rate': 6.396447559999528e-06, 'epoch': 0.43} 43%|████▎ | 9449/22095 [16:00:53<12:43:33, 3.62s/it] 43%|████▎ | 9450/22095 [16:00:56<12:32:09, 3.57s/it] {'loss': 0.3492, 'grad_norm': 0.6114458956301414, 'learning_rate': 6.3957437891506236e-06, 'epoch': 0.43} 43%|████▎ | 9450/22095 [16:00:56<12:32:09, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9451/22095 [16:01:04<16:22:52, 4.66s/it] {'loss': 0.4868, 'grad_norm': 0.3438870869900401, 'learning_rate': 6.395039988311472e-06, 'epoch': 0.43} 43%|████▎ | 9451/22095 [16:01:04<16:22:52, 4.66s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887137 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10290, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887270 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10423, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9452/22095 [16:01:08<16:20:49, 4.65s/it] {'loss': 0.3622, 'grad_norm': 0.6191567022098997, 'learning_rate': 6.394336157497195e-06, 'epoch': 0.43} 43%|████▎ | 9452/22095 [16:01:08<16:20:49, 4.65s/it] 43%|████▎ | 9453/22095 [16:01:12<14:55:48, 4.25s/it] {'loss': 0.3345, 'grad_norm': 0.654114259431335, 'learning_rate': 6.393632296722916e-06, 'epoch': 0.43} 43%|████▎ | 9453/22095 [16:01:12<14:55:48, 4.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9454/22095 [16:01:15<13:50:26, 3.94s/it] {'loss': 0.3388, 'grad_norm': 0.631728787366428, 'learning_rate': 6.39292840600376e-06, 'epoch': 0.43} 43%|████▎ | 9454/22095 [16:01:15<13:50:26, 3.94s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9455/22095 [16:01:19<14:01:57, 4.00s/it] {'loss': 0.3387, 'grad_norm': 0.6681836940210151, 'learning_rate': 6.39222448535485e-06, 'epoch': 0.43} 43%|████▎ | 9455/22095 [16:01:19<14:01:57, 4.00s/it] 43%|████▎ | 9456/22095 [16:01:22<13:27:51, 3.84s/it] {'loss': 0.3091, 'grad_norm': 0.6476944324988394, 'learning_rate': 6.3915205347913124e-06, 'epoch': 0.43} 43%|████▎ | 9456/22095 [16:01:22<13:27:51, 3.84s/it] 43%|████▎ | 9457/22095 [16:01:25<12:34:39, 3.58s/it] {'loss': 0.3336, 'grad_norm': 0.5720201070992996, 'learning_rate': 6.3908165543282706e-06, 'epoch': 0.43} 43%|████▎ | 9457/22095 [16:01:25<12:34:39, 3.58s/it] 43%|████▎ | 9458/22095 [16:01:29<12:22:20, 3.52s/it] {'loss': 0.2851, 'grad_norm': 0.5846592923118611, 'learning_rate': 6.390112543980854e-06, 'epoch': 0.43} 43%|████▎ | 9458/22095 [16:01:29<12:22:20, 3.52s/it]Token indices sequence length is longer than the specified maximum sequence 
length for this model (89178 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52639 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51395 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75737 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66911 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9459/22095 [16:01:32<11:46:56, 3.36s/it] {'loss': 0.3803, 'grad_norm': 0.6430948603061865, 'learning_rate': 6.389408503764188e-06, 'epoch': 0.43} 43%|████▎ | 9459/22095 [16:01:32<11:46:56, 3.36s/it] 43%|████▎ | 9460/22095 [16:01:36<12:28:43, 3.56s/it] {'loss': 0.3493, 'grad_norm': 0.6632950837946977, 'learning_rate': 6.3887044336934005e-06, 'epoch': 0.43} 43%|████▎ | 9460/22095 [16:01:36<12:28:43, 3.56s/it] 43%|████▎ | 9461/22095 [16:01:39<11:50:15, 3.37s/it] {'loss': 0.3627, 'grad_norm': 0.6500691181724101, 'learning_rate': 6.38800033378362e-06, 'epoch': 0.43} 43%|████▎ | 9461/22095 [16:01:39<11:50:15, 3.37s/it] 43%|████▎ | 9462/22095 [16:01:42<11:51:38, 3.38s/it] {'loss': 0.3479, 'grad_norm': 0.6254439914297663, 'learning_rate': 6.387296204049975e-06, 'epoch': 0.43} 43%|████▎ | 9462/22095 [16:01:42<11:51:38, 3.38s/it] 43%|████▎ | 9463/22095 [16:01:46<11:54:19, 3.39s/it] {'loss': 0.3588, 'grad_norm': 0.6662471119654548, 'learning_rate': 6.386592044507595e-06, 'epoch': 0.43} 43%|████▎ | 9463/22095 [16:01:46<11:54:19, 3.39s/it] 43%|████▎ | 9464/22095 [16:01:50<12:57:57, 3.70s/it] {'loss': 0.3142, 'grad_norm': 
0.6723755867111733, 'learning_rate': 6.385887855171611e-06, 'epoch': 0.43} 43%|████▎ | 9464/22095 [16:01:50<12:57:57, 3.70s/it] 43%|████▎ | 9465/22095 [16:01:54<13:06:47, 3.74s/it] {'loss': 0.3371, 'grad_norm': 0.6204698827060886, 'learning_rate': 6.3851836360571525e-06, 'epoch': 0.43} 43%|████▎ | 9465/22095 [16:01:54<13:06:47, 3.74s/it] 43%|████▎ | 9466/22095 [16:01:57<12:47:21, 3.65s/it] {'loss': 0.3384, 'grad_norm': 0.7133235517253447, 'learning_rate': 6.384479387179353e-06, 'epoch': 0.43} 43%|████▎ | 9466/22095 [16:01:57<12:47:21, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9467/22095 [16:02:07<18:54:02, 5.39s/it] {'loss': 0.4872, 'grad_norm': 0.37723556554414667, 'learning_rate': 6.383775108553344e-06, 'epoch': 0.43} 43%|████▎ | 9467/22095 [16:02:07<18:54:02, 5.39s/it] 43%|████▎ | 9468/22095 [16:02:11<17:35:13, 5.01s/it] {'loss': 0.372, 'grad_norm': 0.7365255621440524, 'learning_rate': 6.383070800194257e-06, 'epoch': 0.43} 43%|████▎ | 9468/22095 [16:02:11<17:35:13, 5.01s/it] 43%|████▎ | 9469/22095 [16:02:15<16:30:38, 4.71s/it] {'loss': 0.3212, 'grad_norm': 0.5662704539748532, 'learning_rate': 6.382366462117227e-06, 'epoch': 0.43} 43%|████▎ | 9469/22095 [16:02:15<16:30:38, 4.71s/it] 43%|████▎ | 9470/22095 [16:02:19<15:35:17, 4.44s/it] {'loss': 0.3255, 'grad_norm': 0.6703587861944948, 'learning_rate': 6.381662094337385e-06, 'epoch': 0.43} 43%|████▎ | 9470/22095 [16:02:19<15:35:17, 4.44s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too 
small. Minimum size is 28. [Try #0] Failed to fetch sample 8307588 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB26.l8jwvD8KJjy0FlXXagBFXa_!!2556476274.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nConvert the text in this image to written form.'}, {'from': 'gpt', 'value': 'All words in the image:\n强劲动力\n去球\n不伤衣\n粘毛器\n加\n设计\n双倍\n高艺顺\n60\n低.\n中.\n高.\n动力'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [264, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8419821 in VC:s3://internvl-moe-sft-data/. Exception: Image size [264, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 158585, 'image': 'vrdu_texteq/astro-ph.CO/7d437201-662f-4f58-94b2-870740e67b1b.png', 'image_wh': [[264, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'and with the $2\\sigma$ C.L.:'}]} 43%|████▎ | 9471/22095 [16:02:23<15:01:22, 4.28s/it] {'loss': 0.3454, 'grad_norm': 0.6570286513757565, 'learning_rate': 6.380957696869872e-06, 'epoch': 0.43} 43%|████▎ | 9471/22095 [16:02:23<15:01:22, 4.28s/it] 43%|████▎ | 9472/22095 [16:02:26<14:27:59, 4.13s/it] {'loss': 0.3366, 'grad_norm': 0.653602329680775, 'learning_rate': 6.380253269729816e-06, 'epoch': 0.43} 43%|████▎ | 9472/22095 [16:02:26<14:27:59, 4.13s/it] 43%|████▎ | 9473/22095 [16:02:30<13:43:03, 3.91s/it] {'loss': 0.3405, 'grad_norm': 0.6247528482411291, 'learning_rate': 6.379548812932358e-06, 'epoch': 0.43} 43%|████▎ | 9473/22095 [16:02:30<13:43:03, 3.91s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (94609968 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 43%|████▎ | 9474/22095 [16:02:33<13:15:44, 3.78s/it] {'loss': 0.3396, 'grad_norm': 0.6564823764144673, 'learning_rate': 6.3788443264926325e-06, 'epoch': 0.43} 43%|████▎ | 9474/22095 [16:02:33<13:15:44, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42281 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53027 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44230 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90912 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49411 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83318 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49108 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102901 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9475/22095 [16:02:37<13:29:41, 3.85s/it] {'loss': 0.3855, 'grad_norm': 0.6338050514675024, 'learning_rate': 6.378139810425777e-06, 'epoch': 0.43} 43%|████▎ | 9475/22095 [16:02:37<13:29:41, 3.85s/it] 43%|████▎ | 9476/22095 [16:02:41<13:25:41, 3.83s/it] {'loss': 0.343, 'grad_norm': 0.6320833873764625, 'learning_rate': 6.37743526474693e-06, 'epoch': 0.43} 43%|████▎ | 9476/22095 [16:02:41<13:25:41, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47488 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47761 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41239 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44771 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9477/22095 [16:02:44<12:36:21, 3.60s/it] {'loss': 0.3721, 'grad_norm': 0.6692924545788045, 'learning_rate': 6.37673068947123e-06, 'epoch': 0.43} 43%|████▎ | 9477/22095 [16:02:44<12:36:21, 3.60s/it] 43%|████▎ | 9478/22095 [16:02:48<12:41:54, 3.62s/it] {'loss': 0.3149, 'grad_norm': 0.6481553878042078, 'learning_rate': 6.376026084613813e-06, 'epoch': 0.43} 43%|████▎ | 9478/22095 [16:02:48<12:41:54, 3.62s/it] 43%|████▎ | 9479/22095 [16:02:51<12:07:47, 3.46s/it] {'loss': 0.3226, 'grad_norm': 0.6127879848183435, 'learning_rate': 6.375321450189826e-06, 'epoch': 0.43} 43%|████▎ | 9479/22095 [16:02:51<12:07:47, 3.46s/it] 43%|████▎ | 9480/22095 [16:02:54<12:10:33, 3.47s/it] {'loss': 0.3491, 'grad_norm': 0.6519182770593378, 'learning_rate': 6.374616786214402e-06, 'epoch': 0.43} 43%|████▎ | 9480/22095 [16:02:54<12:10:33, 3.47s/it] 43%|████▎ | 9481/22095 [16:02:58<11:54:54, 3.40s/it] {'loss': 0.339, 'grad_norm': 0.6238691493270582, 'learning_rate': 6.373912092702686e-06, 'epoch': 0.43} 43%|████▎ | 9481/22095 [16:02:58<11:54:54, 3.40s/it] 43%|████▎ | 9482/22095 [16:03:00<11:18:43, 3.23s/it] {'loss': 0.312, 'grad_norm': 0.6332219367972534, 'learning_rate': 6.3732073696698194e-06, 'epoch': 0.43} 43%|████▎ | 9482/22095 [16:03:00<11:18:43, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9483/22095 [16:03:10<17:55:12, 5.12s/it] {'loss': 0.4572, 'grad_norm': 0.3263672533628197, 'learning_rate': 6.372502617130942e-06, 'epoch': 0.43} 43%|████▎ | 9483/22095 [16:03:10<17:55:12, 5.12s/it] 43%|████▎ | 9484/22095 [16:03:14<16:36:39, 4.74s/it] {'loss': 0.352, 'grad_norm': 0.5997874965565261, 'learning_rate': 6.371797835101201e-06, 'epoch': 0.43} 43%|████▎ | 9484/22095 
[16:03:14<16:36:39, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9485/22095 [16:03:23<21:30:19, 6.14s/it] {'loss': 0.4785, 'grad_norm': 0.31386368068126863, 'learning_rate': 6.371093023595736e-06, 'epoch': 0.43} 43%|████▎ | 9485/22095 [16:03:23<21:30:19, 6.14s/it] 43%|████▎ | 9486/22095 [16:03:27<19:33:33, 5.58s/it] {'loss': 0.3197, 'grad_norm': 0.5963847286606663, 'learning_rate': 6.370388182629693e-06, 'epoch': 0.43} 43%|████▎ | 9486/22095 [16:03:27<19:33:33, 5.58s/it] 43%|████▎ | 9487/22095 [16:03:31<17:59:19, 5.14s/it] {'loss': 0.3258, 'grad_norm': 0.6467820210257044, 'learning_rate': 6.3696833122182175e-06, 'epoch': 0.43} 43%|████▎ | 9487/22095 [16:03:32<17:59:19, 5.14s/it] 43%|████▎ | 9488/22095 [16:03:34<15:42:38, 4.49s/it] {'loss': 0.3583, 'grad_norm': 0.601051314895126, 'learning_rate': 6.368978412376456e-06, 'epoch': 0.43} 43%|████▎ | 9488/22095 [16:03:34<15:42:38, 4.49s/it] 43%|████▎ | 9489/22095 [16:03:38<14:28:16, 4.13s/it] {'loss': 0.3357, 'grad_norm': 0.5897040599319083, 'learning_rate': 6.3682734831195495e-06, 'epoch': 0.43} 43%|████▎ | 9489/22095 [16:03:38<14:28:16, 4.13s/it] 43%|████▎ | 9490/22095 [16:03:42<14:56:13, 4.27s/it] {'loss': 0.2893, 'grad_norm': 0.5789772461006665, 'learning_rate': 6.367568524462651e-06, 'epoch': 0.43} 43%|████▎ | 9490/22095 [16:03:42<14:56:13, 4.27s/it] 43%|████▎ | 9491/22095 [16:03:45<13:38:13, 3.90s/it] {'loss': 0.3493, 'grad_norm': 0.6720690671844922, 'learning_rate': 6.366863536420903e-06, 'epoch': 0.43} 43%|████▎ | 9491/22095 [16:03:45<13:38:13, 3.90s/it] 43%|████▎ | 9492/22095 [16:03:49<13:18:11, 3.80s/it] {'loss': 0.3756, 'grad_norm': 0.9106057929009909, 'learning_rate': 6.3661585190094555e-06, 'epoch': 0.43} 43%|████▎ | 9492/22095 [16:03:49<13:18:11, 3.80s/it] 43%|████▎ | 9493/22095 [16:03:52<12:12:25, 3.49s/it] {'loss': 0.3645, 'grad_norm': 0.6571383440927613, 'learning_rate': 6.365453472243458e-06, 'epoch': 0.43} 43%|████▎ | 9493/22095 [16:03:52<12:12:25, 
3.49s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9494/22095 [16:03:55<11:48:52, 3.38s/it] {'loss': 0.3029, 'grad_norm': 0.6064326061612422, 'learning_rate': 6.36474839613806e-06, 'epoch': 0.43}
43%|████▎ | 9495/22095 [16:03:58<11:19:41, 3.24s/it] {'loss': 0.3526, 'grad_norm': 0.691674373407346, 'learning_rate': 6.364043290708409e-06, 'epoch': 0.43}
43%|████▎ | 9496/22095 [16:04:01<10:58:41, 3.14s/it] {'loss': 0.31, 'grad_norm': 0.6137817520380174, 'learning_rate': 6.363338155969658e-06, 'epoch': 0.43}
43%|████▎ | 9497/22095 [16:04:04<11:11:30, 3.20s/it] {'loss': 0.3724, 'grad_norm': 0.6353400075810914, 'learning_rate': 6.362632991936956e-06, 'epoch': 0.43}
43%|████▎ | 9498/22095 [16:04:07<11:06:46, 3.18s/it] {'loss': 0.3388, 'grad_norm': 0.6286136029257747, 'learning_rate': 6.361927798625458e-06, 'epoch': 0.43}
43%|████▎ | 9499/22095 [16:04:10<11:04:11, 3.16s/it] {'loss': 0.3398, 'grad_norm': 0.6081121318544058, 'learning_rate': 6.361222576050312e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (195698800 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7925639 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (195698800 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/10446.png', 'image_wh': [[15800, 12386]], 'conversations': [{'from': 'human', 'value': '<image>\nWhat percent of cases have not been confirmed by RT-PCR of throat swab? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': '38% not confirmed by RT-PCR.\nAccording to the text, 38% of cases have not been confirmed by RT-PCR of throat swab. This means that almost 2 out of 5 cases have not been tested using this method and therefore may not have an official diagnosis of COVID-19. It is important to note that the text does not specify why these cases were not confirmed using RT-PCR and if alternative tests were used to diagnose them.'}]}
43%|████▎ | 9500/22095 [16:04:14<12:04:44, 3.45s/it] {'loss': 0.3425, 'grad_norm': 0.6939297880857666, 'learning_rate': 6.360517324226676e-06, 'epoch': 0.43}
43%|████▎ | 9501/22095 [16:04:18<11:46:43, 3.37s/it] {'loss': 0.3216, 'grad_norm': 0.6198614495621046, 'learning_rate': 6.3598120431697e-06, 'epoch': 0.43}
43%|████▎ | 9502/22095 [16:04:21<11:46:11, 3.36s/it] {'loss': 0.3396, 'grad_norm': 0.6307838569841074, 'learning_rate': 6.35910673289454e-06, 'epoch': 0.43}
43%|████▎ | 9503/22095 [16:04:24<11:40:45, 3.34s/it] {'loss': 0.3387, 'grad_norm': 0.6183345620328679, 'learning_rate': 6.358401393416349e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
43%|████▎ | 9504/22095 [16:04:32<16:50:37, 4.82s/it] {'loss': 0.4909, 'grad_norm': 0.4830405915588279, 'learning_rate': 6.357696024750286e-06, 'epoch': 0.43}
43%|████▎ | 9505/22095 [16:04:37<16:40:39, 4.77s/it] {'loss': 0.3208, 'grad_norm': 0.6336024479610478, 'learning_rate': 6.356990626911503e-06, 'epoch': 0.43}
43%|████▎ | 9506/22095 [16:04:41<15:37:57, 4.47s/it] {'loss': 0.364, 'grad_norm': 0.6690021288856202, 'learning_rate': 6.356285199915162e-06, 'epoch': 0.43} 43%|████▎ | 
9506/22095 [16:04:41<15:37:57, 4.47s/it] 43%|████▎ | 9507/22095 [16:04:44<13:57:15, 3.99s/it] {'loss': 0.3436, 'grad_norm': 0.6478736001609527, 'learning_rate': 6.355579743776415e-06, 'epoch': 0.43} 43%|████▎ | 9507/22095 [16:04:44<13:57:15, 3.99s/it] 43%|████▎ | 9508/22095 [16:04:48<14:05:34, 4.03s/it] {'loss': 0.3633, 'grad_norm': 0.7058901153166096, 'learning_rate': 6.354874258510425e-06, 'epoch': 0.43} 43%|████▎ | 9508/22095 [16:04:48<14:05:34, 4.03s/it] 43%|████▎ | 9509/22095 [16:04:51<13:10:25, 3.77s/it] {'loss': 0.3173, 'grad_norm': 0.646687768947303, 'learning_rate': 6.3541687441323466e-06, 'epoch': 0.43} 43%|████▎ | 9509/22095 [16:04:51<13:10:25, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9510/22095 [16:04:54<12:50:12, 3.67s/it] {'loss': 0.3546, 'grad_norm': 0.6384243764796345, 'learning_rate': 6.353463200657341e-06, 'epoch': 0.43} 43%|████▎ | 9510/22095 [16:04:54<12:50:12, 3.67s/it] 43%|████▎ | 9511/22095 [16:04:59<13:18:28, 3.81s/it] {'loss': 0.3214, 'grad_norm': 0.5782346792836871, 'learning_rate': 6.352757628100569e-06, 'epoch': 0.43} 43%|████▎ | 9511/22095 [16:04:59<13:18:28, 3.81s/it] 43%|████▎ | 9512/22095 [16:05:02<12:48:53, 3.67s/it] {'loss': 0.3601, 'grad_norm': 0.6410978720387365, 'learning_rate': 6.352052026477189e-06, 'epoch': 0.43} 43%|████▎ | 9512/22095 [16:05:02<12:48:53, 3.67s/it] 43%|████▎ | 9513/22095 [16:05:06<13:02:14, 3.73s/it] {'loss': 0.2959, 'grad_norm': 0.6338795047306764, 'learning_rate': 6.351346395802365e-06, 'epoch': 0.43} 43%|████▎ | 9513/22095 [16:05:06<13:02:14, 3.73s/it] 43%|████▎ | 9514/22095 [16:05:10<13:24:20, 3.84s/it] {'loss': 0.3601, 'grad_norm': 0.6219979703602239, 'learning_rate': 6.350640736091256e-06, 'epoch': 0.43} 43%|████▎ | 9514/22095 [16:05:10<13:24:20, 3.84s/it] 43%|████▎ | 9515/22095 [16:05:13<12:54:31, 3.69s/it] {'loss': 0.3235, 'grad_norm': 0.603935883727528, 'learning_rate': 6.349935047359026e-06, 'epoch': 0.43} 
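The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" warnings above mean some samples tokenize past the model's context window; such samples can be filtered out before training rather than warned about per step. A minimal pre-filter sketch, tokenizer-agnostic: `count_tokens` is a hypothetical stand-in for a real call like `len(tokenizer(text)["input_ids"])`, and 40960 matches the limit printed in this log.

```python
# Sketch: drop samples that would exceed the model's maximum sequence
# length (40960 here), instead of letting the tokenizer warn at runtime.
from typing import Callable, Iterable, List


def filter_overlong(samples: Iterable[str],
                    count_tokens: Callable[[str], int],
                    max_len: int = 40960) -> List[str]:
    """Keep only samples whose token count fits in the context window."""
    kept = []
    for text in samples:
        n = count_tokens(text)
        if n <= max_len:
            kept.append(text)
        else:
            # Mirrors the log's "(n > max_len)" diagnostic.
            print(f"dropping sample: {n} > {max_len}")
    return kept


# Toy usage with whitespace "tokens" standing in for a real tokenizer:
docs = ["short sample", "word " * 50000]
kept = filter_overlong(docs, lambda t: len(t.split()), max_len=40960)
```

In a real pipeline the length check would run once over the dataset (or inside `__getitem__` with truncation as a fallback), so the warning never reaches the training loop.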
43%|████▎ | 9515/22095 [16:05:13<12:54:31, 3.69s/it] 43%|████▎ | 9516/22095 [16:05:17<13:14:09, 3.79s/it] {'loss': 0.3183, 'grad_norm': 0.6304081216699563, 'learning_rate': 6.349229329620839e-06, 'epoch': 0.43} 43%|████▎ | 9516/22095 [16:05:17<13:14:09, 3.79s/it] 43%|████▎ | 9517/22095 [16:05:20<12:38:19, 3.62s/it] {'loss': 0.3703, 'grad_norm': 0.6260900712445192, 'learning_rate': 6.348523582891857e-06, 'epoch': 0.43} 43%|████▎ | 9517/22095 [16:05:20<12:38:19, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52079 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9518/22095 [16:05:24<12:09:31, 3.48s/it] {'loss': 0.3287, 'grad_norm': 0.7608811482008814, 'learning_rate': 6.347817807187242e-06, 'epoch': 0.43} 43%|████▎ | 9518/22095 [16:05:24<12:09:31, 3.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (150012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63060 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9519/22095 [16:05:27<11:31:55, 3.30s/it] {'loss': 0.32, 'grad_norm': 0.6650272533873631, 'learning_rate': 6.347112002522167e-06, 'epoch': 0.43} 43%|████▎ | 9519/22095 [16:05:27<11:31:55, 3.30s/it] 43%|████▎ | 9520/22095 [16:05:30<11:36:25, 3.32s/it] {'loss': 0.3253, 'grad_norm': 0.6347777593589885, 'learning_rate': 6.346406168911787e-06, 'epoch': 0.43} 43%|████▎ | 9520/22095 [16:05:30<11:36:25, 3.32s/it] 43%|████▎ | 9521/22095 [16:05:33<11:36:06, 3.32s/it] {'loss': 0.3733, 'grad_norm': 0.6417946074599703, 'learning_rate': 6.3457003063712775e-06, 'epoch': 0.43} 43%|████▎ | 9521/22095 [16:05:33<11:36:06, 3.32s/it] 43%|████▎ | 9522/22095 [16:05:36<11:20:08, 3.25s/it] {'loss': 0.322, 'grad_norm': 0.7687534975018414, 'learning_rate': 6.344994414915801e-06, 'epoch': 0.43} 43%|████▎ | 9522/22095 [16:05:36<11:20:08, 3.25s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (142643256 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 43%|████▎ | 9523/22095 [16:05:39<11:11:26, 3.20s/it] {'loss': 0.3365, 'grad_norm': 0.7554017704102255, 'learning_rate': 6.3442884945605244e-06, 'epoch': 0.43} 43%|████▎ | 9523/22095 [16:05:39<11:11:26, 3.20s/it] 43%|████▎ | 9524/22095 [16:05:43<11:06:23, 3.18s/it] {'loss': 0.3264, 'grad_norm': 0.6077760535588105, 'learning_rate': 6.343582545320617e-06, 'epoch': 0.43} 43%|████▎ | 9524/22095 [16:05:43<11:06:23, 3.18s/it] 43%|████▎ | 9525/22095 [16:05:45<10:46:37, 3.09s/it] {'loss': 0.3541, 'grad_norm': 0.7121399654195334, 'learning_rate': 6.342876567211247e-06, 'epoch': 0.43} 43%|████▎ | 9525/22095 [16:05:45<10:46:37, 3.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (56939 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107650 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9526/22095 [16:05:52<14:44:59, 4.22s/it] {'loss': 0.4888, 'grad_norm': 0.5372455333369079, 'learning_rate': 6.3421705602475835e-06, 'epoch': 0.43} 43%|████▎ | 9526/22095 [16:05:52<14:44:59, 4.22s/it] 43%|████▎ | 9527/22095 [16:05:58<16:12:23, 4.64s/it] {'loss': 0.5041, 'grad_norm': 0.4297497874863686, 'learning_rate': 6.341464524444798e-06, 'epoch': 0.43} 43%|████▎ | 9527/22095 [16:05:58<16:12:23, 4.64s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 43%|████▎ | 9528/22095 [16:06:02<15:39:24, 4.49s/it] {'loss': 0.3267, 'grad_norm': 0.592031368327786, 'learning_rate': 6.340758459818058e-06, 'epoch': 0.43} 43%|████▎ | 9528/22095 [16:06:02<15:39:24, 4.49s/it] 43%|████▎ | 9529/22095 [16:06:06<14:48:48, 4.24s/it] {'loss': 0.3383, 'grad_norm': 0.6250388602448749, 'learning_rate': 6.340052366382539e-06, 'epoch': 0.43} 43%|████▎ | 9529/22095 [16:06:06<14:48:48, 4.24s/it] 43%|████▎ | 9530/22095 [16:06:09<13:57:36, 4.00s/it] {'loss': 0.3561, 'grad_norm': 0.6089617843367738, 'learning_rate': 6.339346244153408e-06, 'epoch': 0.43} 43%|████▎ | 9530/22095 [16:06:09<13:57:36, 4.00s/it] 43%|████▎ | 9531/22095 [16:06:12<13:10:03, 3.77s/it] {'loss': 0.3427, 'grad_norm': 0.6268140034238477, 'learning_rate': 6.3386400931458415e-06, 'epoch': 0.43} 43%|████▎ | 9531/22095 [16:06:12<13:10:03, 3.77s/it] 43%|████▎ | 9532/22095 [16:06:15<12:12:19, 3.50s/it] {'loss': 0.3338, 'grad_norm': 0.6129149633673354, 'learning_rate': 6.33793391337501e-06, 'epoch': 0.43} 43%|████▎ | 9532/22095 [16:06:15<12:12:19, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78183 > 40960). 
Running this sequence through the model will result in indexing errors
43%|████▎ | 9533/22095 [16:06:19<12:20:22, 3.54s/it] {'loss': 0.3824, 'grad_norm': 0.6096958155296873, 'learning_rate': 6.337227704856088e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [675, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8357276 in VC:s3://internvl-moe-sft-data/. Exception: Image size [675, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23985, 'image': 'vrdu_table_final_2/astro-ph.CO/dc89dc34-d472-4607-b9bd-ee8e34699ff3.png', 'image_wh': [[675, 25]], 'conversations': [{'from': 'human', 'value': '<image>\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{l}\n\\footnotemark[1] All values measured for the CO J=3-2 line in km s$^{-1}$. \\\\\n\\end{tabular}\n```"}]}
43%|████▎ | 9534/22095 [16:06:28<18:31:26, 5.31s/it] {'loss': 0.4856, 'grad_norm': 0.7077925181477205, 'learning_rate': 6.336521467604248e-06, 'epoch': 0.43}
43%|████▎ | 9535/22095 [16:06:32<16:39:14, 4.77s/it] {'loss': 0.3558, 'grad_norm': 0.698918442674996, 'learning_rate': 6.33581520163467e-06, 'epoch': 0.43}
43%|████▎ | 9536/22095 [16:06:35<14:42:18, 4.22s/it] {'loss': 0.3547, 'grad_norm': 0.6263258929760982, 'learning_rate': 6.335108906962523e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9537/22095 [16:06:38<14:04:35, 4.04s/it] {'loss': 0.3516, 'grad_norm': 0.6214809292754883, 'learning_rate': 6.334402583602988e-06, 'epoch': 0.43}
43%|████▎ | 9538/22095 [16:06:42<13:16:25, 3.81s/it] {'loss': 0.3184, 'grad_norm': 0.6941423915887114, 'learning_rate': 6.333696231571238e-06, 'epoch': 0.43}
43%|████▎ | 9539/22095 [16:06:45<12:38:22, 3.62s/it] {'loss': 0.3666, 'grad_norm': 0.805609294731211, 'learning_rate': 6.332989850882453e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9540/22095 [16:06:48<12:15:42, 3.52s/it] {'loss': 0.3617, 'grad_norm': 0.6618733159358412, 'learning_rate': 6.33228344155181e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
43%|████▎ | 9541/22095 [16:06:55<15:38:37, 4.49s/it] {'loss': 0.4706, 'grad_norm': 0.34824166171139703, 'learning_rate': 6.331577003594487e-06, 'epoch': 0.43} 43%|████▎ | 9541/22095 
[16:06:55<15:38:37, 4.49s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9542/22095 [16:06:58<14:29:41, 4.16s/it] {'loss': 0.3587, 'grad_norm': 0.7031531376323451, 'learning_rate': 6.330870537025664e-06, 'epoch': 0.43} 43%|████▎ | 9542/22095 [16:06:58<14:29:41, 4.16s/it] 43%|████▎ | 9543/22095 [16:07:02<14:07:17, 4.05s/it] {'loss': 0.3356, 'grad_norm': 0.629939383500528, 'learning_rate': 6.3301640418605205e-06, 'epoch': 0.43} 43%|████▎ | 9543/22095 [16:07:02<14:07:17, 4.05s/it] 43%|████▎ | 9544/22095 [16:07:06<14:07:17, 4.05s/it] {'loss': 0.3183, 'grad_norm': 0.7383560966237717, 'learning_rate': 6.329457518114237e-06, 'epoch': 0.43} 43%|████▎ | 9544/22095 [16:07:06<14:07:17, 4.05s/it] 43%|████▎ | 9545/22095 [16:07:09<12:53:14, 3.70s/it] {'loss': 0.3452, 'grad_norm': 0.6529920903315911, 'learning_rate': 6.3287509658019955e-06, 'epoch': 0.43} 43%|████▎ | 9545/22095 [16:07:09<12:53:14, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62614 > 40960). 
Running this sequence through the model will result in indexing errors
43%|████▎ | 9546/22095 [16:07:13<13:05:10, 3.75s/it] {'loss': 0.3728, 'grad_norm': 0.6080741176068959, 'learning_rate': 6.328044384938977e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9547/22095 [16:07:17<13:02:37, 3.74s/it] {'loss': 0.3624, 'grad_norm': 0.6343854373387572, 'learning_rate': 6.327337775540362e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [1245, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8462665 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1245, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49986, 'image': 'vrdu_texteq/astro-ph.CO/5d227913-a42a-4a4d-aa05-42d52a1a6eaa.png', 'image_wh': [[1245, 23]], 'conversations': [{'from': 'human', 'value': '<image>\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where $\\kappa$ is a numerical factor which accounts\nfor the model uncertainties of EBL. Here we take $\\kappa \\sim 1$~.'}]}
43%|████▎ | 9548/22095 [16:07:20<12:57:08, 3.72s/it] {'loss': 0.3479, 'grad_norm': 0.6243543241577083, 'learning_rate': 6.326631137621336e-06, 'epoch': 0.43}
43%|████▎ | 9549/22095 [16:07:23<12:24:49, 3.56s/it] {'loss': 0.3652, 'grad_norm': 0.6233388741859107, 'learning_rate': 6.32592447119708e-06, 'epoch': 0.43}
43%|████▎ | 9550/22095 [16:07:26<11:45:44, 3.38s/it] {'loss': 0.3699, 'grad_norm': 0.6586508655334762, 'learning_rate': 6.32521777628278e-06, 'epoch': 0.43}
43%|████▎ | 9551/22095 [16:07:30<12:23:52, 3.56s/it] {'loss': 0.2788, 'grad_norm': 0.5813175230678054, 'learning_rate': 6.324511052893621e-06, 'epoch': 0.43}
43%|████▎ | 9552/22095 [16:07:33<11:56:37, 3.43s/it] {'loss': 0.3374, 'grad_norm': 0.6953927867608366, 'learning_rate': 6.323804301044787e-06, 'epoch': 0.43}
43%|████▎ | 9553/22095 [16:07:36<11:28:51, 3.30s/it] {'loss': 0.3052, 'grad_norm': 0.5648584961368391, 'learning_rate': 6.323097520751463e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
43%|████▎ | 9554/22095 [16:07:44<15:40:04, 4.50s/it] {'loss': 0.5073, 'grad_norm': 0.4741493747837489, 'learning_rate': 6.322390712028839e-06, 'epoch': 0.43}
43%|████▎ | 9555/22095 [16:07:47<14:41:11, 4.22s/it] {'loss': 0.377, 'grad_norm': 0.6282792193419438, 'learning_rate': 6.321683874892097e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this 
model (48595 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88648 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9556/22095 [16:07:50<13:17:23, 3.82s/it] {'loss': 0.317, 'grad_norm': 0.6002379549916733, 'learning_rate': 6.3209770093564315e-06, 'epoch': 0.43} 43%|████▎ | 9556/22095 [16:07:50<13:17:23, 3.82s/it] 43%|████▎ | 9557/22095 [16:07:54<12:46:59, 3.67s/it] {'loss': 0.3424, 'grad_norm': 0.6804592964508102, 'learning_rate': 6.320270115437024e-06, 'epoch': 0.43} 43%|████▎ | 9557/22095 [16:07:54<12:46:59, 3.67s/it] 43%|████▎ | 9558/22095 [16:07:57<12:19:55, 3.54s/it] {'loss': 0.3416, 'grad_norm': 0.6396321643958531, 'learning_rate': 6.319563193149069e-06, 'epoch': 0.43} 43%|████▎ | 9558/22095 [16:07:57<12:19:55, 3.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9559/22095 [16:08:00<12:12:03, 3.50s/it] {'loss': 0.3518, 'grad_norm': 0.6131211335825673, 'learning_rate': 6.318856242507751e-06, 'epoch': 0.43} 43%|████▎ | 9559/22095 [16:08:00<12:12:03, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72760 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9560/22095 [16:08:04<12:24:43, 3.56s/it] {'loss': 0.3267, 'grad_norm': 0.6539758858362998, 'learning_rate': 6.318149263528266e-06, 'epoch': 0.43} 43%|████▎ | 9560/22095 [16:08:04<12:24:43, 3.56s/it] 43%|████▎ | 9561/22095 [16:08:08<12:39:15, 3.63s/it] {'loss': 0.3866, 'grad_norm': 0.6427223122736058, 'learning_rate': 6.3174422562258e-06, 'epoch': 0.43} 43%|████▎ | 9561/22095 [16:08:08<12:39:15, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46273 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116576 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71466 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9562/22095 [16:08:11<12:11:26, 3.50s/it] {'loss': 0.3473, 'grad_norm': 0.620529057754623, 'learning_rate': 6.316735220615546e-06, 'epoch': 0.43} 43%|████▎ | 9562/22095 [16:08:11<12:11:26, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55253 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49568 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9563/22095 [16:08:14<11:29:43, 3.30s/it] {'loss': 0.3013, 'grad_norm': 0.5354036421937424, 'learning_rate': 6.316028156712697e-06, 'epoch': 0.43} 43%|████▎ | 9563/22095 [16:08:14<11:29:43, 3.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9564/22095 [16:08:20<14:17:58, 4.11s/it] {'loss': 0.4936, 'grad_norm': 0.38415502626453674, 'learning_rate': 6.315321064532444e-06, 'epoch': 0.43} 43%|████▎ | 9564/22095 [16:08:20<14:17:58, 4.11s/it] 43%|████▎ | 9565/22095 [16:08:24<14:04:59, 4.05s/it] {'loss': 0.3507, 'grad_norm': 0.6673503544936131, 'learning_rate': 6.31461394408998e-06, 'epoch': 0.43} 43%|████▎ | 9565/22095 [16:08:24<14:04:59, 4.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56403 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9566/22095 [16:08:27<13:06:06, 3.76s/it] {'loss': 0.3543, 'grad_norm': 0.6086413226302155, 'learning_rate': 6.313906795400503e-06, 'epoch': 0.43} 43%|████▎ | 9566/22095 [16:08:27<13:06:06, 3.76s/it] 43%|████▎ | 9567/22095 [16:08:30<12:30:00, 3.59s/it] {'loss': 0.3341, 'grad_norm': 0.613894175157072, 'learning_rate': 6.313199618479202e-06, 'epoch': 0.43} 43%|████▎ | 9567/22095 [16:08:30<12:30:00, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9568/22095 [16:08:41<19:54:31, 5.72s/it] {'loss': 0.476, 'grad_norm': 0.27093995914272184, 'learning_rate': 6.312492413341274e-06, 'epoch': 0.43} 43%|████▎ | 9568/22095 [16:08:41<19:54:31, 5.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9569/22095 [16:08:45<18:21:47, 5.28s/it] {'loss': 0.3675, 'grad_norm': 0.6441830810283002, 'learning_rate': 6.311785180001917e-06, 'epoch': 0.43} 43%|████▎ | 9569/22095 [16:08:45<18:21:47, 5.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (72358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69984 > 40960). 
Running this sequence through the model will result in indexing errors 43%|████▎ | 9570/22095 [16:08:55<23:24:04, 6.73s/it] {'loss': 0.5011, 'grad_norm': 0.2881058366572796, 'learning_rate': 6.311077918476324e-06, 'epoch': 0.43} 43%|████▎ | 9570/22095 [16:08:55<23:24:04, 6.73s/it] 43%|████▎ | 9571/22095 [16:09:05<27:07:05, 7.80s/it] {'loss': 0.4897, 'grad_norm': 0.3016593831565181, 'learning_rate': 6.3103706287796925e-06, 'epoch': 0.43} 43%|████▎ | 9571/22095 [16:09:05<27:07:05, 7.80s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 43%|████▎ | 9572/22095 [16:09:10<23:32:59, 6.77s/it] {'loss': 0.3611, 'grad_norm': 0.6379247355962797, 'learning_rate': 6.309663310927222e-06, 'epoch': 0.43} 43%|████▎ | 9572/22095 [16:09:10<23:32:59, 6.77s/it] 43%|████▎ | 9573/22095 [16:09:13<20:33:12, 5.91s/it] {'loss': 0.3237, 'grad_norm': 0.6403026737285819, 'learning_rate': 6.30895596493411e-06, 'epoch': 0.43} 43%|████▎ | 9573/22095 [16:09:14<20:33:12, 5.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9574/22095 [16:09:20<21:24:54, 6.16s/it] {'loss': 0.4911, 'grad_norm': 0.29694733852073946, 'learning_rate': 6.308248590815552e-06, 'epoch': 0.43} 43%|████▎ | 9574/22095 [16:09:20<21:24:54, 6.16s/it] 43%|████▎ | 9575/22095 [16:09:24<18:26:23, 5.30s/it] {'loss': 0.3418, 'grad_norm': 0.6408292536123505, 'learning_rate': 6.3075411885867525e-06, 'epoch': 0.43} 43%|████▎ | 9575/22095 [16:09:24<18:26:23, 5.30s/it] 43%|████▎ | 9576/22095 [16:09:27<16:58:11, 4.88s/it] {'loss': 0.3343, 'grad_norm': 0.6038428892113239, 'learning_rate': 6.306833758262906e-06, 'epoch': 0.43} 43%|████▎ | 9576/22095 [16:09:27<16:58:11, 4.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 43%|████▎ | 9577/22095 [16:09:31<15:06:34, 4.35s/it] {'loss': 0.3618, 'grad_norm': 0.6628742329075414, 'learning_rate': 6.306126299859218e-06, 'epoch': 0.43} 43%|████▎ | 9577/22095 
[16:09:31<15:06:34, 4.35s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
43%|████▎ | 9578/22095 [16:09:38<18:39:56, 5.37s/it] {'loss': 0.5013, 'grad_norm': 0.3045993291323623, 'learning_rate': 6.305418813390885e-06, 'epoch': 0.43}
43%|████▎ | 9579/22095 [16:09:41<16:19:53, 4.70s/it] {'loss': 0.3365, 'grad_norm': 0.6780581289844321, 'learning_rate': 6.304711298873113e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8898846 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 21999, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "<image>\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nA. 3\nB. 6\nC. 5\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9580/22095 [16:09:44<14:25:03, 4.15s/it] {'loss': 0.3188, 'grad_norm': 0.6211410836718435, 'learning_rate': 6.304003756321101e-06, 'epoch': 0.43}
43%|████▎ | 9581/22095 [16:09:49<14:38:42, 4.21s/it] {'loss': 0.3625, 'grad_norm': 0.7669868311980905, 'learning_rate': 6.303296185750054e-06, 'epoch': 0.43}
43%|████▎ | 9582/22095 [16:09:52<13:20:01, 3.84s/it] {'loss': 0.3708, 'grad_norm': 0.7573626626127393, 'learning_rate': 6.302588587175175e-06, 'epoch': 0.43}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
43%|████▎ | 9583/22095 [16:09:54<12:20:32, 3.55s/it] {'loss': 0.3345, 'grad_norm': 0.6855161490216711, 'learning_rate': 6.301880960611668e-06, 'epoch': 0.43}
43%|████▎ | 9584/22095 [16:09:57<11:26:26, 3.29s/it] {'loss': 0.323, 'grad_norm': 0.6437440624954825, 'learning_rate': 6.301173306074735e-06, 'epoch': 0.43}
43%|████▎ | 9585/22095 [16:10:01<11:54:39, 3.43s/it] {'loss': 0.3519, 'grad_norm': 0.6232157340832585, 'learning_rate': 6.300465623579587e-06, 'epoch': 0.43}
43%|████▎ | 9586/22095 [16:10:05<12:20:46, 3.55s/it] {'loss': 0.3498, 'grad_norm': 0.6634296907842349, 'learning_rate': 6.299757913141424e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (75582 > 40960). 
Running this sequence through the model will result in indexing errors
43%|████▎ | 9587/22095 [16:10:08<12:24:17, 3.57s/it] {'loss': 0.3335, 'grad_norm': 1.2642722267366815, 'learning_rate': 6.299050174775458e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8373105 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39878, 'image': 'vrdu_table_final_2/astro-ph.CO/8a49f0b1-7022-4834-916e-00ae469fd4c0.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '<image>\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
43%|████▎ | 9588/22095 [16:10:11<11:37:05, 3.34s/it] {'loss': 0.3338, 'grad_norm': 0.651210690916979, 'learning_rate': 6.298342408496892e-06, 'epoch': 0.43}
43%|████▎ | 9589/22095 [16:10:15<11:39:45, 3.36s/it] {'loss': 0.3365, 'grad_norm': 0.8266010855192732, 'learning_rate': 6.297634614320937e-06, 'epoch': 0.43}
43%|████▎ | 9590/22095 [16:10:18<11:32:50, 3.32s/it] {'loss': 0.3787, 'grad_norm': 0.777980841927406, 'learning_rate': 6.2969267922627975e-06, 'epoch': 0.43} 43%|████▎ | 9591/22095 [16:10:21<11:24:28, 3.28s/it] {'loss': 
0.3893, 'grad_norm': 0.6771848978264605, 'learning_rate': 6.296218942337685e-06, 'epoch': 0.43} 43%|████▎ | 9591/22095 [16:10:21<11:24:28, 3.28s/it] 43%|████▎ | 9592/22095 [16:10:25<11:39:08, 3.36s/it] {'loss': 0.3225, 'grad_norm': 0.6163189995087976, 'learning_rate': 6.295511064560808e-06, 'epoch': 0.43} 43%|████▎ | 9592/22095 [16:10:25<11:39:08, 3.36s/it] 43%|████▎ | 9593/22095 [16:10:28<11:40:12, 3.36s/it] {'loss': 0.3155, 'grad_norm': 0.657764399625542, 'learning_rate': 6.294803158947378e-06, 'epoch': 0.43} 43%|████▎ | 9593/22095 [16:10:28<11:40:12, 3.36s/it] 43%|████▎ | 9594/22095 [16:10:31<11:43:08, 3.37s/it] {'loss': 0.3458, 'grad_norm': 0.6305694061235845, 'learning_rate': 6.294095225512604e-06, 'epoch': 0.43} 43%|████▎ | 9594/22095 [16:10:31<11:43:08, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9595/22095 [16:10:39<16:40:22, 4.80s/it] {'loss': 0.4632, 'grad_norm': 0.35629379362146724, 'learning_rate': 6.293387264271699e-06, 'epoch': 0.43} 43%|████▎ | 9595/22095 [16:10:39<16:40:22, 4.80s/it] 43%|████▎ | 9596/22095 [16:10:43<15:03:28, 4.34s/it] {'loss': 0.3321, 'grad_norm': 0.6351505528430289, 'learning_rate': 6.292679275239875e-06, 'epoch': 0.43} 43%|████▎ | 9596/22095 [16:10:43<15:03:28, 4.34s/it] 43%|████▎ | 9597/22095 [16:10:46<13:54:09, 4.00s/it] {'loss': 0.2933, 'grad_norm': 0.6206739807132818, 'learning_rate': 6.29197125843234e-06, 'epoch': 0.43} 43%|████▎ | 9597/22095 [16:10:46<13:54:09, 4.00s/it] 43%|████▎ | 9598/22095 [16:10:49<12:48:16, 3.69s/it] {'loss': 0.335, 'grad_norm': 0.6170676077633372, 'learning_rate': 6.291263213864314e-06, 'epoch': 0.43} 43%|████▎ | 9598/22095 [16:10:49<12:48:16, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 43%|████▎ | 9599/22095 [16:10:58<18:52:53, 5.44s/it] {'loss': 0.5105, 'grad_norm': 0.29918808007546976, 'learning_rate': 6.290555141551006e-06, 'epoch': 0.43} 43%|████▎ | 9599/22095 [16:10:58<18:52:53, 5.44s/it] 43%|████▎ | 9600/22095 
[16:11:02<16:44:56, 4.83s/it] {'loss': 0.318, 'grad_norm': 0.6466778800542244, 'learning_rate': 6.289847041507632e-06, 'epoch': 0.43} 43%|████▎ | 9600/22095 [16:11:02<16:44:56, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (115728 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108239 > 40960). Running this sequence through the model will result in indexing errors 43%|████▎ | 9601/22095 [16:11:05<14:56:00, 4.30s/it] {'loss': 0.3263, 'grad_norm': 0.6644688745721504, 'learning_rate': 6.289138913749406e-06, 'epoch': 0.43} 43%|████▎ | 9601/22095 [16:11:05<14:56:00, 4.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8358058 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28. 
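Editor's note: the recurring `ValueError: Image size ... is too small. Minimum size is 28` above comes from `_get_item` rejecting images whose width or height is under 28 px (presumably the minimum the Qwen2.5-VL vision tower can patch, given 14-px patches and a 2x2 patch merge). Rather than failing per-sample at fetch time, such samples can be screened out of the manifest beforehand. A minimal sketch, assuming each sample record carries the `image_wh` list of `[width, height]` pairs seen in the "Problematic sample" dumps (the function name and `min_size` parameter are hypothetical, not part of the training code):

```python
def filter_small_images(samples, min_size=28):
    """Keep only samples whose every image is at least min_size px on each side.

    Assumes each sample dict has an 'image_wh' key holding a list of
    [width, height] pairs, as in the log's 'Problematic sample' dumps.
    """
    kept = []
    for sample in samples:
        dims = sample.get("image_wh", [])
        # Drop the sample if any image dimension is below the minimum.
        if all(w >= min_size and h >= min_size for w, h in dims):
            kept.append(sample)
    return kept


samples = [
    {"id": 39878, "image_wh": [[14, 23]]},    # too small (as in the log) -> dropped
    {"id": 12345, "image_wh": [[640, 480]]},  # large enough -> kept
]
print([s["id"] for s in filter_small_images(samples)])  # -> [12345]
```

An alternative to dropping is upscaling tiny images to the minimum side length before patching, which preserves the sample at the cost of blurring; either way the per-step fetch retries disappear.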
Problematic sample: {'id': 24769, 'image': 'vrdu_table_final_2/astro-ph.CO/73d6c61e-a698-4878-8126-ba9d0e1fbed3.png', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\eftcamb basis\\end{tabular}\n```"}]}
43%|████▎ | 9602/22095 [16:11:08<13:23:55, 3.86s/it] {'loss': 0.3247, 'grad_norm': 0.5956829179386198, 'learning_rate': 6.2884307582915434e-06, 'epoch': 0.43}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924584 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47737, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 8cm\nB. 16cm\nC. 32cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
43%|████▎ | 9603/22095 [16:11:12<14:09:41, 4.08s/it] {'loss': 0.3776, 'grad_norm': 0.6384688707836155, 'learning_rate': 6.287722575149262e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (79227 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109649 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84075 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9604/22095 [16:11:16<13:15:49, 3.82s/it] {'loss': 0.3438, 'grad_norm': 0.6210441993230379, 'learning_rate': 6.287014364337778e-06, 'epoch': 0.43}
43%|████▎ | 9605/22095 [16:11:20<13:27:30, 3.88s/it] {'loss': 0.3044, 'grad_norm': 0.6515747518311703, 'learning_rate': 6.286306125872307e-06, 'epoch': 0.43}
43%|████▎ | 9606/22095 [16:11:23<12:35:02, 3.63s/it] {'loss': 0.3433, 'grad_norm': 0.6065693284092167, 'learning_rate': 6.285597859768069e-06, 'epoch': 0.43}
43%|████▎ | 9607/22095 [16:11:26<12:51:54, 3.71s/it] {'loss': 0.3329, 'grad_norm': 0.649773066821743, 'learning_rate': 6.28488956604028e-06, 'epoch': 0.43}
43%|████▎ | 9608/22095 [16:11:30<12:46:23, 3.68s/it] {'loss': 0.3311, 'grad_norm': 0.6402533582637937, 'learning_rate': 6.284181244704161e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (80042 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48237 > 40960).
Running this sequence through the model will result in indexing errors
43%|████▎ | 9609/22095 [16:11:40<18:48:41, 5.42s/it] {'loss': 0.4732, 'grad_norm': 0.34486747340777774, 'learning_rate': 6.2834728957749315e-06, 'epoch': 0.43}
43%|████▎ | 9610/22095 [16:11:43<16:47:31, 4.84s/it] {'loss': 0.352, 'grad_norm': 0.6517980038717447, 'learning_rate': 6.2827645192678114e-06, 'epoch': 0.43}
Token indices sequence length is longer than the specified maximum sequence length for this model (41034 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111469 > 40960). Running this sequence through the model will result in indexing errors
43%|████▎ | 9611/22095 [16:11:47<16:13:22, 4.68s/it] {'loss': 0.3426, 'grad_norm': 0.6361557329684554, 'learning_rate': 6.282056115198021e-06, 'epoch': 0.43}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▎ | 9612/22095 [16:11:57<21:07:07, 6.09s/it] {'loss': 0.4682, 'grad_norm': 0.29829048121652124, 'learning_rate': 6.2813476835807814e-06, 'epoch': 0.44}
44%|████▎ | 9613/22095 [16:12:00<18:09:42, 5.24s/it] {'loss': 0.3527, 'grad_norm': 0.6764980210107786, 'learning_rate': 6.280639224431317e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (42449 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55170 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▎ | 9614/22095 [16:12:04<17:07:11, 4.94s/it] {'loss': 0.3193, 'grad_norm': 0.6549779058980503, 'learning_rate': 6.27993073776485e-06, 'epoch': 0.44}
44%|████▎ | 9615/22095 [16:12:07<15:18:36, 4.42s/it] {'loss': 0.3335, 'grad_norm': 0.6323811410845982, 'learning_rate': 6.279222223596599e-06, 'epoch': 0.44}
44%|████▎ | 9616/22095 [16:12:11<14:24:08, 4.15s/it] {'loss': 0.3349, 'grad_norm': 0.5938579736378444, 'learning_rate': 6.278513681941793e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55175 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9617/22095 [16:12:16<15:28:16, 4.46s/it] {'loss': 0.4687, 'grad_norm': 0.2802435542448092, 'learning_rate': 6.277805112815656e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▎ | 9618/22095 [16:12:19<14:06:12, 4.07s/it] {'loss': 0.3541, 'grad_norm': 0.6716376377922707, 'learning_rate': 6.277096516233409e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▎ | 9619/22095 [16:12:29<19:38:41, 5.67s/it] {'loss': 0.4743, 'grad_norm': 0.29189397078435236, 'learning_rate': 6.276387892210281e-06, 'epoch': 0.44}
44%|████▎ | 9620/22095 [16:12:32<17:08:56, 4.95s/it] {'loss': 0.3602, 'grad_norm': 0.7074551039812558, 'learning_rate': 6.275679240761499e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (88443 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9621/22095 [16:12:35<15:29:14, 4.47s/it] {'loss': 0.319, 'grad_norm': 0.5986286546331557, 'learning_rate': 6.274970561902286e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (91896 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99826 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9622/22095 [16:12:46<21:30:40, 6.21s/it] {'loss': 0.4556, 'grad_norm': 0.2924588447120755, 'learning_rate': 6.274261855647872e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (75497 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9623/22095 [16:12:50<20:00:42, 5.78s/it] {'loss': 0.3189, 'grad_norm': 0.6350275796798999, 'learning_rate': 6.273553122013485e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (48966 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44861 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48950 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▎ | 9624/22095 [16:12:54<17:19:33, 5.00s/it] {'loss': 0.3469, 'grad_norm': 0.6569204309990117, 'learning_rate': 6.272844361014352e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▎ | 9625/22095 [16:13:03<21:53:08, 6.32s/it] {'loss': 0.475, 'grad_norm': 0.28160925038059104, 'learning_rate': 6.272135572665704e-06, 'epoch': 0.44}
44%|████▎ | 9626/22095 [16:13:06<18:52:59, 5.45s/it] {'loss': 0.3391, 'grad_norm': 0.6936511520119449, 'learning_rate': 6.271426756982768e-06, 'epoch': 0.44}
44%|████▎ | 9627/22095 [16:13:10<17:11:29, 4.96s/it] {'loss': 0.3363, 'grad_norm': 0.6334270977412247, 'learning_rate': 6.270717913980777e-06, 'epoch': 0.44}
44%|████▎ | 9628/22095 [16:13:13<15:25:22, 4.45s/it] {'loss': 0.3338, 'grad_norm': 0.7076211424429047, 'learning_rate': 6.270009043674959e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▎ | 9629/22095 [16:13:24<21:56:59, 6.34s/it] {'loss': 0.4749, 'grad_norm': 0.2949010307384991, 'learning_rate': 6.26930014608055e-06, 'epoch': 0.44}
44%|████▎ | 9630/22095 [16:13:32<23:17:47, 6.73s/it] {'loss': 0.5113, 'grad_norm': 0.2991584601582421, 'learning_rate': 6.268591221212779e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (56880 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58782 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49011 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79731 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61251 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68338 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9631/22095 [16:13:41<26:10:40, 7.56s/it] {'loss': 0.4618, 'grad_norm': 0.28826042168950156, 'learning_rate': 6.2678822690868765e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047924 in VC:s3://multi-modal/UniGeo/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
44%|████▎ | 9632/22095 [16:13:49<26:17:20, 7.59s/it] {'loss': 0.4611, 'grad_norm': 0.27923258726620354, 'learning_rate': 6.267173289718079e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 364, but got module 1
44%|████▎ | 9633/22095 [16:13:54<23:10:19, 6.69s/it] {'loss': 0.3311, 'grad_norm': 0.6535515640602259, 'learning_rate': 6.2664642831216206e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (78551 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88276 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9634/22095 [16:13:58<20:16:59, 5.86s/it] {'loss': 0.3343, 'grad_norm': 0.658058957760891, 'learning_rate': 6.265755249312733e-06, 'epoch': 0.44}
44%|████▎ | 9635/22095 [16:14:01<18:00:08, 5.20s/it] {'loss': 0.2928, 'grad_norm': 0.7420584561566588, 'learning_rate': 6.2650461883066534e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957616 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8451, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 4cm\nB. 1cm\nC. 1.5cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
44%|████▎ | 9636/22095 [16:14:04<15:53:08, 4.59s/it] {'loss': 0.3421, 'grad_norm': 0.679580769956985, 'learning_rate': 6.264337100118615e-06, 'epoch': 0.44}
44%|████▎ | 9637/22095 [16:14:08<14:23:47, 4.16s/it] {'loss': 0.4015, 'grad_norm': 0.6600756328928645, 'learning_rate': 6.263627984763858e-06, 'epoch': 0.44}
44%|████▎ | 9638/22095 [16:14:11<13:20:02, 3.85s/it] {'loss': 0.3845, 'grad_norm': 0.6217618393021392, 'learning_rate': 6.262918842257615e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▎ | 9639/22095 [16:14:21<20:00:58, 5.79s/it] {'loss': 0.4553, 'grad_norm': 0.42654541926319295, 'learning_rate': 6.262209672615125e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882167 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5320, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
44%|████▎ | 9640/22095 [16:14:24<17:22:54, 5.02s/it] {'loss': 0.3663, 'grad_norm': 0.5991685886740397, 'learning_rate': 6.261500475851625e-06, 'epoch': 0.44}
44%|████▎ | 9641/22095 [16:14:28<15:50:45, 4.58s/it] {'loss': 0.3534, 'grad_norm': 0.6526326165159332, 'learning_rate': 6.260791251982354e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▎ | 9642/22095 [16:14:38<21:25:54, 6.20s/it] {'loss': 0.4822, 'grad_norm': 0.33023414793782335, 'learning_rate': 6.260082001022553e-06, 'epoch': 0.44}
44%|████▎ | 9643/22095 [16:14:41<18:41:21, 5.40s/it] {'loss': 0.3539, 'grad_norm': 0.614008987239421, 'learning_rate': 6.259372722987459e-06, 'epoch': 0.44}
44%|████▎ | 9644/22095 [16:14:44<16:16:09, 4.70s/it] {'loss': 0.3327, 'grad_norm': 0.6193780654351428, 'learning_rate': 6.2586634178923124e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▎ | 9645/22095 [16:14:48<15:32:53, 4.50s/it] {'loss': 0.3385, 'grad_norm': 0.678929067511659, 'learning_rate': 6.257954085752356e-06, 'epoch': 0.44}
44%|████▎ | 9646/22095 [16:14:51<14:00:41, 4.05s/it] {'loss': 0.3944, 'grad_norm': 0.6455644243579621, 'learning_rate': 6.257244726582829e-06, 'epoch': 0.44}
44%|████▎ | 9647/22095 [16:14:55<13:32:35, 3.92s/it] {'loss': 0.3395, 'grad_norm': 0.6332816139581993, 'learning_rate': 6.256535340398974e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (54450 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60348 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88847 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9648/22095 [16:14:58<12:40:15, 3.66s/it] {'loss': 0.3107, 'grad_norm': 0.6241992920208398, 'learning_rate': 6.255825927216032e-06, 'epoch': 0.44}
44%|████▎ | 9649/22095 [16:15:01<11:54:29, 3.44s/it] {'loss': 0.3049, 'grad_norm': 0.6499774891475161, 'learning_rate': 6.2551164870492506e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8376564 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 43345, 'image': 'vrdu_table_final_2/astro-ph.CO/7e166fde-f4b1-4805-ac85-8358e7fa0ce7.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
44%|████▎ | 9650/22095 [16:15:05<12:02:51, 3.49s/it] {'loss': 0.3219, 'grad_norm': 0.5852926834843499, 'learning_rate': 6.25440701991387e-06, 'epoch': 0.44}
44%|████▎ | 9651/22095 [16:15:08<12:02:34, 3.48s/it] {'loss': 0.3546, 'grad_norm': 0.682648927831333, 'learning_rate': 6.253697525825134e-06, 'epoch': 0.44}
44%|████▎ | 9652/22095 [16:15:11<11:29:42, 3.33s/it] {'loss': 0.3562, 'grad_norm': 0.6755898462967815, 'learning_rate': 6.25298800479829e-06, 'epoch': 0.44}
44%|████▎ | 9653/22095 [16:15:15<11:56:47, 3.46s/it] {'loss': 0.342, 'grad_norm': 0.6553521064520352, 'learning_rate': 6.252278456848581e-06, 'epoch': 0.44}
44%|████▎ | 9654/22095 [16:15:18<12:11:03, 3.53s/it] {'loss': 0.3483, 'grad_norm': 0.6414939440118836, 'learning_rate': 6.251568881991256e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (96245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121729 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▎ | 9655/22095 [16:15:22<11:57:46, 3.46s/it] {'loss': 0.3432, 'grad_norm': 0.6554899417647392, 'learning_rate': 6.250859280241557e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45685 > 40960). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [1528, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8439966 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1528, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34017, 'image': 'vrdu_texteq/astro-ph.CO/f3ef8de5-a248-4a25-8aa9-68761b804508.png', 'image_wh': [[1528, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where we have used that the fourth moment of a normal distribution is 3$\sigma^4$. Therefore our estimate for the fourth moment is:'}]}
44%|████▎ | 9656/22095 [16:15:32<18:35:21, 5.38s/it] {'loss': 0.4453, 'grad_norm': 0.39347989973742065, 'learning_rate': 6.250149651614735e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▎ | 9657/22095 [16:15:36<17:08:12, 4.96s/it] {'loss': 0.33, 'grad_norm': 0.616710306543043, 'learning_rate': 6.249439996126036e-06, 'epoch': 0.44}
44%|████▎ | 9658/22095 [16:15:39<15:02:55, 4.36s/it] {'loss': 0.3539, 'grad_norm': 0.6916743798248338, 'learning_rate': 6.24873031379071e-06, 'epoch': 0.44}
44%|████▎ | 9659/22095 [16:15:41<13:28:29, 3.90s/it] {'loss': 0.3506, 'grad_norm': 0.7770154644576485, 'learning_rate': 6.248020604624004e-06, 'epoch': 0.44}
44%|████▎ | 9660/22095 [16:15:44<12:26:42, 3.60s/it] {'loss': 0.3122, 'grad_norm': 0.6583445912764695, 'learning_rate': 6.247310868641168e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (51736 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63262 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9661/22095 [16:15:47<11:42:07, 3.39s/it] {'loss': 0.328, 'grad_norm': 0.8566787002605626, 'learning_rate': 6.246601105857453e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (89250 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43401 > 40960). Running this sequence through the model will result in indexing errors
44%|████▎ | 9662/22095 [16:15:51<11:47:41, 3.42s/it] {'loss': 0.3277, 'grad_norm': 0.5977562191253866, 'learning_rate': 6.245891316288108e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047666 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]}
44%|████▎ | 9663/22095 [16:15:59<16:26:45, 4.76s/it] {'loss': 0.4739, 'grad_norm': 0.5231617692788478, 'learning_rate': 6.245181499948385e-06, 'epoch': 0.44}
44%|████▎ | 9664/22095 [16:16:03<15:43:25, 4.55s/it] {'loss': 0.3472, 'grad_norm': 0.6619926024483229, 'learning_rate': 6.244471656853538e-06, 'epoch': 0.44}
44%|████▎ | 9665/22095 [16:16:07<15:02:33, 4.36s/it] {'loss': 0.3355, 'grad_norm': 0.6386867229353993, 'learning_rate': 6.243761787018814e-06, 'epoch': 0.44}
44%|████▎ | 9666/22095 [16:16:11<15:07:33, 4.38s/it] {'loss': 0.3314, 'grad_norm': 0.6153655896855599, 'learning_rate': 6.2430518904594715e-06, 'epoch': 0.44}
44%|████▍ | 9667/22095 [16:16:14<13:28:13, 3.90s/it] {'loss': 0.3335, 'grad_norm': 0.6500289470819034, 'learning_rate': 6.24234196719076e-06, 'epoch': 0.44}
44%|████▍ | 9668/22095 [16:16:17<12:42:39, 3.68s/it] {'loss': 0.3209, 'grad_norm': 0.6105016241224432, 'learning_rate': 6.241632017227937e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (71474 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48823 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▍ | 9669/22095 [16:16:27<19:45:48, 5.73s/it] {'loss': 0.4953, 'grad_norm': 0.33115584574894136, 'learning_rate': 6.240922040586254e-06, 'epoch': 0.44}
44%|████▍ | 9670/22095 [16:16:37<23:34:33, 6.83s/it] {'loss': 0.4787, 'grad_norm': 0.33989481089576024, 'learning_rate': 6.240212037280967e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (137358 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43561 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9671/22095 [16:16:40<20:05:50, 5.82s/it] {'loss': 0.3285, 'grad_norm': 0.6816818191990349, 'learning_rate': 6.239502007327334e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (55490 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9672/22095 [16:16:44<17:48:51, 5.16s/it] {'loss': 0.308, 'grad_norm': 0.6037031811269876, 'learning_rate': 6.2387919507406085e-06, 'epoch': 0.44}
44%|████▍ | 9673/22095 [16:16:47<15:38:07, 4.53s/it] {'loss': 0.3443, 'grad_norm': 0.6195194185977877, 'learning_rate': 6.238081867536049e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (42755 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41361 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48675 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70145 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9674/22095 [16:16:51<15:11:25, 4.40s/it] {'loss': 0.3311, 'grad_norm': 0.5802615024132223, 'learning_rate': 6.237371757728914e-06, 'epoch': 0.44}
44%|████▍ | 9675/22095 [16:16:55<14:40:40, 4.25s/it] {'loss': 0.332, 'grad_norm': 0.6486256013867389, 'learning_rate': 6.236661621334458e-06, 'epoch': 0.44}
44%|████▍ | 9676/22095 [16:16:58<13:37:35, 3.95s/it] {'loss': 0.3169, 'grad_norm': 0.5983038373745534, 'learning_rate': 6.235951458367943e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9677/22095 [16:17:08<20:04:25, 5.82s/it] {'loss': 0.4952, 'grad_norm': 0.3263293080204479, 'learning_rate': 6.235241268844626e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (89298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60809 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49916 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93665 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9678/22095 [16:17:12<17:38:31, 5.11s/it] {'loss': 0.3285, 'grad_norm': 0.6615016165723331, 'learning_rate': 6.234531052779769e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9679/22095 [16:17:22<22:41:59, 6.58s/it] {'loss': 0.4683, 'grad_norm': 0.2958982133149087, 'learning_rate': 6.233820810188631e-06, 'epoch': 0.44}
44%|████▍ | 9680/22095 [16:17:26<20:28:01, 5.93s/it] {'loss': 0.349, 'grad_norm': 0.597163484202998, 'learning_rate': 6.233110541086473e-06, 'epoch': 0.44}
44%|████▍ | 9681/22095 [16:17:30<17:42:41, 5.14s/it] {'loss': 0.3328, 'grad_norm': 0.6319411261177034, 'learning_rate': 6.2324002454885565e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42921 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64215 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85393 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9682/22095 [16:17:35<17:52:58, 5.19s/it] {'loss': 0.4843, 'grad_norm': 0.28556124909790315, 'learning_rate': 6.231689923410144e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (86551 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85096 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111467 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9683/22095 [16:17:38<15:56:11, 4.62s/it] {'loss': 0.3118, 'grad_norm': 0.6600743868567222, 'learning_rate': 6.230979574866498e-06, 'epoch': 0.44}
44%|████▍ | 9684/22095 [16:17:41<14:02:34, 4.07s/it] {'loss': 0.3368, 'grad_norm': 0.7072446001623893, 'learning_rate': 6.230269199872881e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (44740 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41949 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▍ | 9685/22095 [16:17:44<13:03:28, 3.79s/it] {'loss': 0.3548, 'grad_norm': 0.6703396770777446, 'learning_rate': 6.22955879844456e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42218 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72431 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9686/22095 [16:17:55<20:48:11, 6.04s/it] {'loss': 0.4889, 'grad_norm': 0.39158306035806945, 'learning_rate': 6.228848370596793e-06, 'epoch': 0.44}
44%|████▍ | 9687/22095 [16:17:59<18:03:24, 5.24s/it] {'loss': 0.3602, 'grad_norm': 0.6576588402412944, 'learning_rate': 6.228137916344852e-06, 'epoch': 0.44}
44%|████▍ | 9688/22095 [16:18:02<15:58:11, 4.63s/it] {'loss': 0.3318, 'grad_norm': 0.638629956214423, 'learning_rate': 6.227427435703997e-06, 'epoch': 0.44}
44%|████▍ | 9689/22095 [16:18:06<15:02:58, 4.37s/it] {'loss': 0.3348, 'grad_norm': 0.633776454109039, 'learning_rate': 6.2267169286894954e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8411158 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 13363, 'image': 'vrdu_table_final_2/astro-ph.CO/f8ddf088-bab5-406a-ba41-0c1b5ec347fc.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #2 \\\\\n \\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9690/22095 [16:18:16<20:53:35, 6.06s/it] {'loss': 0.4711, 'grad_norm': 0.3955281000545189, 'learning_rate': 6.2260063953166165e-06, 'epoch': 0.44}
44%|████▍ | 9691/22095 [16:18:22<20:39:11, 5.99s/it] {'loss': 0.5086, 'grad_norm': 0.2967833814586568, 'learning_rate': 6.225295835600624e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 364, but got module 1
44%|████▍ | 9692/22095 [16:18:26<18:40:02, 5.42s/it] {'loss': 0.3629, 'grad_norm': 0.6881500244083311, 'learning_rate': 6.2245852495567885e-06, 'epoch': 0.44}
44%|████▍ | 9693/22095 [16:18:35<23:06:00, 6.71s/it] {'loss': 0.4664, 'grad_norm': 0.2702582686060595, 'learning_rate': 6.2238746372003775e-06, 'epoch': 0.44}
44%|████▍ | 9694/22095 [16:18:42<23:01:13, 6.68s/it] {'loss': 0.4739, 'grad_norm': 0.2826358246260504, 'learning_rate': 6.223163998546657e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 364, but got module 1
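The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries above come from a dataset-side size guard (28 px matches the smallest vision patch Qwen2.5-VL can process). A minimal sketch of such a guard, assuming PIL-style `[width, height]` metadata as in the logged `image_wh` fields; the function names are illustrative, not the trainer's actual helpers:

```python
# Hypothetical pre-filter mirroring the minimum-size guard whose ValueError
# appears in the log; 28 px is the floor quoted in the error message.
MIN_IMAGE_SIDE = 28

def check_min_image_size(width: int, height: int, min_side: int = MIN_IMAGE_SIDE) -> None:
    """Raise ValueError when either side is below the minimum, as the loader does."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_side}."
        )

def sample_is_usable(sample: dict) -> bool:
    """Return True only if every image in the sample passes the size check."""
    try:
        for w, h in sample.get("image_wh", []):
            check_min_image_size(w, h)
    except ValueError:
        return False
    return True
```

Filtering such samples once, before training, avoids the repeated fetch-and-retry failures seen in the log.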
44%|████▍ | 9695/22095 [16:18:45<19:35:06, 5.69s/it] {'loss': 0.3642, 'grad_norm': 0.6637887438021715, 'learning_rate': 6.2224533336109015e-06, 'epoch': 0.44}
44%|████▍ | 9696/22095 [16:18:49<17:20:58, 5.04s/it] {'loss': 0.3628, 'grad_norm': 0.612207286684882, 'learning_rate': 6.221742642408377e-06, 'epoch': 0.44}
44%|████▍ | 9697/22095 [16:18:52<15:23:05, 4.47s/it] {'loss': 0.3216, 'grad_norm': 0.6165039348405416, 'learning_rate': 6.221031924954356e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9698/22095 [16:18:56<15:06:58, 4.39s/it] {'loss': 0.3381, 'grad_norm': 0.696379443461046, 'learning_rate': 6.220321181264108e-06, 'epoch': 0.44}
44%|████▍ | 9699/22095 [16:19:00<14:01:55, 4.08s/it] {'loss': 0.3482, 'grad_norm': 0.5904740059119723, 'learning_rate': 6.2196104113529064e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (48841 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41613 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9700/22095 [16:19:02<12:41:18, 3.69s/it] {'loss': 0.3034, 'grad_norm': 0.7721270978948145, 'learning_rate': 6.218899615236022e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (165158 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9701/22095 [16:19:06<12:43:52, 3.70s/it] {'loss': 0.3402, 'grad_norm': 0.7227083202010125, 'learning_rate': 6.21818879292873e-06, 'epoch': 0.44}
44%|████▍ | 9702/22095 [16:19:09<11:47:36, 3.43s/it] {'loss': 0.2952, 'grad_norm': 0.6098500040445197, 'learning_rate': 6.217477944446301e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9703/22095 [16:19:12<11:10:32, 3.25s/it] {'loss': 0.357, 'grad_norm': 0.6829152157415639, 'learning_rate': 6.216767069804011e-06, 'epoch': 0.44}
44%|████▍ | 9704/22095 [16:19:15<11:29:31, 3.34s/it] {'loss': 0.375, 'grad_norm': 0.7364773799141779, 'learning_rate': 6.216056169017133e-06, 'epoch': 0.44}
44%|████▍ | 9705/22095 [16:19:19<11:28:20, 3.33s/it] {'loss': 0.3248, 'grad_norm': 0.6548002280902314, 'learning_rate': 6.215345242100942e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9706/22095 [16:19:22<11:09:42, 3.24s/it] {'loss': 0.3308, 'grad_norm': 0.6254654159755485, 'learning_rate': 6.214634289070717e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (47269 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117832 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89386 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9707/22095 [16:19:25<11:29:56, 3.34s/it] {'loss': 0.3062, 'grad_norm': 0.634866839571993, 'learning_rate': 6.213923309941728e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9708/22095 [16:19:34<17:04:04, 4.96s/it] {'loss': 0.5051, 'grad_norm': 0.40657094828930934, 'learning_rate': 6.213212304729259e-06, 'epoch': 0.44}
44%|████▍ | 9709/22095 [16:19:38<15:43:54, 4.57s/it] {'loss': 0.3353, 'grad_norm': 0.6105111080738309, 'learning_rate': 6.212501273448581e-06, 'epoch': 0.44}
44%|████▍ | 9710/22095 [16:19:41<14:24:58, 4.19s/it] {'loss': 0.3619, 'grad_norm': 0.662434514457218, 'learning_rate': 6.211790216114976e-06, 'epoch': 0.44}
44%|████▍ | 9711/22095 [16:19:44<13:47:10, 4.01s/it] {'loss': 0.3231, 'grad_norm': 0.6727588402946885, 'learning_rate': 6.21107913274372e-06, 'epoch': 0.44}
44%|████▍ | 9712/22095 [16:19:48<13:03:47, 3.80s/it] {'loss': 0.3426, 'grad_norm': 0.6425714546278731, 'learning_rate': 6.210368023350094e-06, 'epoch': 0.44}
44%|████▍ | 9713/22095 [16:19:51<12:12:33, 3.55s/it] {'loss': 0.3355, 'grad_norm': 0.61234101605595, 'learning_rate': 6.209656887949376e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (51055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130092 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79904 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49279 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9714/22095 [16:19:54<12:07:35, 3.53s/it] {'loss': 0.3608, 'grad_norm': 0.634668015159857, 'learning_rate': 6.208945726556848e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9715/22095 [16:20:01<16:00:46, 4.66s/it] {'loss': 0.4741, 'grad_norm': 0.29388674268361786, 'learning_rate': 6.2082345391877865e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (75143 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92322 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83251 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67976 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▍ | 9716/22095 [16:20:05<14:28:11, 4.21s/it] {'loss': 0.3334, 'grad_norm': 0.6535017456665534, 'learning_rate': 6.207523325857479e-06, 'epoch': 0.44}
44%|████▍ | 9717/22095 [16:20:09<14:24:53, 4.19s/it] {'loss': 0.3798, 'grad_norm': 0.6852149829343164, 'learning_rate': 6.206812086581201e-06, 'epoch': 0.44}
44%|████▍ | 9718/22095 [16:20:12<13:12:52, 3.84s/it] {'loss': 0.3105, 'grad_norm': 0.6729182496858611, 'learning_rate': 6.206100821374238e-06, 'epoch': 0.44}
44%|████▍ | 9719/22095 [16:20:15<12:21:52, 3.60s/it] {'loss': 0.3502, 'grad_norm': 0.6320681877620283, 'learning_rate': 6.205389530251873e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9720/22095 [16:20:18<12:09:16, 3.54s/it] {'loss': 0.3513, 'grad_norm': 0.6497975622661094, 'learning_rate': 6.204678213229389e-06, 'epoch': 0.44}
44%|████▍ | 9721/22095 [16:20:21<11:36:28, 3.38s/it] {'loss': 0.3292, 'grad_norm': 0.6672429135341902, 'learning_rate': 6.203966870322071e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9722/22095 [16:20:29<15:51:11, 4.61s/it] {'loss': 0.4851, 'grad_norm': 0.31104056891057263, 'learning_rate': 6.2032555015452036e-06, 'epoch': 0.44}
44%|████▍ | 9723/22095 [16:20:33<15:17:19, 4.45s/it] {'loss': 0.3552, 'grad_norm': 0.6751221659802483, 'learning_rate': 6.202544106914068e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9724/22095 [16:20:43<21:05:08, 6.14s/it] {'loss': 0.4623, 'grad_norm': 0.2951432521217396, 'learning_rate': 6.201832686443955e-06, 'epoch': 0.44}
44%|████▍ | 9725/22095 [16:20:47<18:36:10, 5.41s/it] {'loss': 0.3288, 'grad_norm': 0.6698341616356065, 'learning_rate': 6.201121240150147e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881050 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4203, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 无法确定\nB. 1cm\nC. 4cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
44%|████▍ | 9726/22095 [16:20:50<16:05:41, 4.68s/it] {'loss': 0.32, 'grad_norm': 0.6358296298524535, 'learning_rate': 6.200409768047935e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (89516 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65152 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▍ | 9727/22095 [16:20:54<15:20:02, 4.46s/it] {'loss': 0.333, 'grad_norm': 0.6169335416900048, 'learning_rate': 6.199698270152602e-06, 'epoch': 0.44}
44%|████▍ | 9728/22095 [16:20:57<14:25:00, 4.20s/it] {'loss': 0.3533, 'grad_norm': 0.6492481963366252, 'learning_rate': 6.198986746479439e-06, 'epoch': 0.44}
44%|████▍ | 9729/22095 [16:21:00<13:21:44, 3.89s/it] {'loss': 0.3324, 'grad_norm': 0.6320510462358143, 'learning_rate': 6.198275197043732e-06, 'epoch': 0.44}
44%|████▍ | 9730/22095 [16:21:04<13:27:35, 3.92s/it] {'loss': 0.3547, 'grad_norm': 0.6742375310083387, 'learning_rate': 6.197563621860771e-06, 'epoch': 0.44}
44%|████▍ | 9731/22095 [16:21:08<13:43:10, 3.99s/it] {'loss': 0.362, 'grad_norm': 0.6290118966322753, 'learning_rate': 6.196852020945846e-06, 'epoch': 0.44}
44%|████▍ | 9732/22095 [16:21:11<12:39:08, 3.68s/it] {'loss': 0.3841, 'grad_norm': 0.6904537020386137, 'learning_rate': 6.196140394314247e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9733/22095 [16:21:15<12:38:20, 3.68s/it] {'loss': 0.3469, 'grad_norm': 0.581190785112767, 'learning_rate': 6.195428741981266e-06, 'epoch': 0.44}
44%|████▍ | 9734/22095 [16:21:18<11:42:28, 3.41s/it] {'loss': 0.3413, 'grad_norm': 0.6780527641923901, 'learning_rate': 6.194717063962191e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9735/22095 [16:21:25<15:10:47, 4.42s/it] {'loss': 0.474, 'grad_norm': 0.32026258131864827, 'learning_rate': 6.194005360272317e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9736/22095 [16:21:28<13:53:24, 4.05s/it] {'loss': 0.3376, 'grad_norm': 0.7847063668187274, 'learning_rate': 6.193293630926933e-06, 'epoch': 0.44}
44%|████▍ | 9737/22095 [16:21:31<13:18:09, 3.88s/it] {'loss': 0.3821, 'grad_norm': 0.6635025468514317, 'learning_rate': 6.192581875941336e-06, 'epoch': 0.44}
44%|████▍ | 9738/22095 [16:21:35<13:13:28, 3.85s/it] {'loss': 0.3396, 'grad_norm': 0.649767023666101, 'learning_rate': 6.191870095330817e-06, 'epoch': 0.44}
44%|████▍ | 9739/22095 [16:21:38<12:42:46, 3.70s/it] {'loss': 0.3332, 'grad_norm': 0.6212240638480885, 'learning_rate': 6.191158289110669e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (50339 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43597 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49298 > 40960).
Running this sequence through the model will result in indexing errors
44%|████▍ | 9740/22095 [16:21:42<12:12:19, 3.56s/it] {'loss': 0.3473, 'grad_norm': 0.6820651677451741, 'learning_rate': 6.1904464572961874e-06, 'epoch': 0.44}
44%|████▍ | 9741/22095 [16:21:45<11:30:27, 3.35s/it] {'loss': 0.3568, 'grad_norm': 0.6214271892311886, 'learning_rate': 6.1897345999026695e-06, 'epoch': 0.44}
44%|████▍ | 9742/22095 [16:21:48<11:11:42, 3.26s/it] {'loss': 0.3334, 'grad_norm': 0.6027011619205818, 'learning_rate': 6.1890227169454075e-06, 'epoch': 0.44}
44%|████▍ | 9743/22095 [16:21:51<11:28:30, 3.34s/it] {'loss': 0.3459, 'grad_norm': 0.6328640836910864, 'learning_rate': 6.188310808439701e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (55576 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9744/22095 [16:21:55<11:46:21, 3.43s/it] {'loss': 0.3734, 'grad_norm': 0.6179245657168015, 'learning_rate': 6.187598874400842e-06, 'epoch': 0.44}
44%|████▍ | 9745/22095 [16:21:58<11:50:15, 3.45s/it] {'loss': 0.342, 'grad_norm': 0.7759051603585791, 'learning_rate': 6.1868869148441325e-06, 'epoch': 0.44}
44%|████▍ | 9746/22095 [16:22:02<12:04:57, 3.52s/it] {'loss': 0.3105, 'grad_norm': 0.606907011740406, 'learning_rate': 6.1861749297848685e-06, 'epoch': 0.44}
44%|████▍ | 9747/22095 [16:22:05<11:49:08, 3.45s/it] {'loss': 0.3557, 'grad_norm': 0.6237538836904465, 'learning_rate': 6.185462919238348e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9748/22095 [16:22:10<13:02:23, 3.80s/it] {'loss': 0.4823, 'grad_norm': 0.3348760559457101, 'learning_rate': 6.184750883219869e-06, 'epoch': 0.44}
44%|████▍ | 9749/22095 [16:22:13<12:31:12, 3.65s/it] {'loss': 0.3532, 'grad_norm': 0.6390519401669638, 'learning_rate': 6.184038821744733e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
44%|████▍ | 9750/22095 [16:22:16<11:50:24, 3.45s/it] {'loss': 0.2972, 'grad_norm': 0.657523255701622, 'learning_rate': 6.18332673482824e-06, 'epoch': 0.44}
44%|████▍ | 9751/22095 [16:22:19<11:24:52, 3.33s/it] {'loss': 0.3295, 'grad_norm': 0.6160734718098636, 'learning_rate': 6.18261462248569e-06, 'epoch': 0.44}
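The repeated `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings are emitted by the Hugging Face tokenizer whenever an encoded sample exceeds `model_max_length`. A minimal sketch of a length guard that could run before batching; the keep-the-head truncation policy is an assumption, not what the training script actually does (a real pipeline might instead drop the sample or trim conversation turns):

```python
# MAX_LEN comes from the warning text above (40960).
MAX_LEN = 40960

def clamp_token_ids(token_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Truncate an over-long id sequence so it cannot index past the model's limit.

    Keeping the head is the simplest policy; it is assumed here for
    illustration only and can destroy the answer portion of a sample.
    """
    if len(token_ids) > max_len:
        return token_ids[:max_len]
    return token_ids
```

Applying such a guard (or filtering these samples out entirely) silences the warning and prevents the indexing errors it predicts.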
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (71415 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45186 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47297 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87179 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9752/22095 [16:22:24<13:05:48, 3.82s/it] {'loss': 0.4782, 'grad_norm': 0.2923393424606555, 'learning_rate': 6.181902484732381e-06, 'epoch': 0.44}
44%|████▍ | 9753/22095 [16:22:29<14:13:17, 4.15s/it] {'loss': 0.3, 'grad_norm': 0.6990437189078308, 'learning_rate': 6.181190321583621e-06, 'epoch': 0.44}
44%|████▍ | 9754/22095 [16:22:32<13:04:05, 3.81s/it] {'loss': 0.3256, 'grad_norm': 0.6699717353002903, 'learning_rate': 6.180478133054707e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (51844 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9755/22095 [16:22:35<11:59:15, 3.50s/it] {'loss': 0.2922, 'grad_norm': 0.6438335030507906, 'learning_rate': 6.179765919160945e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (113247 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41154 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9756/22095 [16:22:38<11:14:50, 3.28s/it] {'loss': 0.349, 'grad_norm': 0.6392233623323376, 'learning_rate': 6.179053679917635e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (87876 > 40960). Running this sequence through the model will result in indexing errors
44%|████▍ | 9757/22095 [16:22:41<11:04:36, 3.23s/it] {'loss': 0.3244, 'grad_norm': 0.7026479472661328, 'learning_rate': 6.1783414153400835e-06, 'epoch': 0.44}
44%|████▍ | 9758/22095 [16:22:44<11:10:50, 3.26s/it] {'loss': 0.3285, 'grad_norm': 0.6389687826947396, 'learning_rate': 6.177629125443594e-06, 'epoch': 0.44}
44%|████▍ | 9759/22095 [16:22:47<10:44:33, 3.14s/it] {'loss': 0.354, 'grad_norm': 0.6566514186922977, 'learning_rate': 6.176916810243471e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
44%|████▍ | 9760/22095 [16:22:57<18:01:03, 5.26s/it] {'loss': 0.4714, 'grad_norm': 0.43650505652041166, 'learning_rate': 6.176204469755021e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8916663 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39816, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 44%|████▍ | 9761/22095 [16:23:01<16:32:08, 4.83s/it] {'loss': 0.3237, 'grad_norm': 0.7489027344347635, 'learning_rate': 6.175492103993548e-06, 'epoch': 0.44}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (138240000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn(
 44%|████▍ | 9762/22095 [16:23:04<15:05:47, 4.41s/it] {'loss': 0.3311, 'grad_norm': 0.7357275635937931, 'learning_rate': 6.1747797129743605e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 44%|████▍ | 9763/22095 [16:23:14<20:02:04, 5.85s/it] {'loss': 0.4697, 'grad_norm': 0.31040027336716486, 'learning_rate': 6.174067296712765e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950721 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1556, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 4\nB. 6\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 44%|████▍ | 9764/22095 [16:23:21<22:00:59, 6.43s/it] {'loss': 0.5069, 'grad_norm': 0.2952355102718732, 'learning_rate': 6.173354855224071e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 44%|████▍ | 9765/22095 [16:23:25<19:27:10, 5.68s/it] {'loss': 0.2843, 'grad_norm': 0.6238306699955422, 'learning_rate': 6.1726423885235816e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (83043 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50816 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69876 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42170 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129058 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44602 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (134063 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9766/22095 [16:23:28<16:54:26, 4.94s/it] {'loss': 0.3352, 'grad_norm': 0.6155755513933405, 'learning_rate': 6.1719298966266114e-06, 'epoch': 0.44} 44%|████▍ | 9766/22095 [16:23:29<16:54:26, 4.94s/it] 44%|████▍ | 9767/22095 [16:23:33<15:58:39, 4.67s/it] {'loss': 0.3683, 'grad_norm': 0.5998973251292518, 'learning_rate': 6.1712173795484665e-06, 'epoch': 0.44} 44%|████▍ | 9767/22095 [16:23:33<15:58:39, 4.67s/it] 44%|████▍ | 9768/22095 [16:23:36<14:24:50, 4.21s/it] {'loss': 0.3218, 'grad_norm': 0.6325835388706952, 'learning_rate': 6.170504837304458e-06, 'epoch': 0.44} 44%|████▍ | 9768/22095 [16:23:36<14:24:50, 4.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9769/22095 [16:23:39<13:43:03, 4.01s/it] {'loss': 0.2931, 'grad_norm': 0.6759271954619623, 'learning_rate': 6.169792269909893e-06, 'epoch': 0.44} 44%|████▍ | 9769/22095 [16:23:39<13:43:03, 4.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45977 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45502 > 40960). 
Running this sequence through the model will result in indexing errors
 44%|████▍ | 9770/22095 [16:23:42<12:32:17, 3.66s/it] {'loss': 0.3695, 'grad_norm': 0.7109614631678387, 'learning_rate': 6.169079677380086e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396942 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63795, 'image': 'vrdu_table_final_2/astro-ph.EP/6ab66a76-4da0-4c31-aeb8-15054a0ccc58.png', 'image_wh': [[20, 14]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[t]{l}x\\end{tabular}\n```"}]}
 44%|████▍ | 9771/22095 [16:23:46<12:32:49, 3.67s/it] {'loss': 0.3557, 'grad_norm': 0.6078522568876246, 'learning_rate': 6.168367059730348e-06, 'epoch': 0.44}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 44%|████▍ | 9772/22095 [16:23:49<12:08:36, 3.55s/it] {'loss': 0.3428, 'grad_norm': 0.6527485169924666, 'learning_rate': 6.167654416975991e-06, 'epoch': 0.44}
 44%|████▍ | 9773/22095 [16:23:53<12:05:40, 3.53s/it] {'loss': 0.3384, 'grad_norm': 0.5826122835066895, 'learning_rate': 6.166941749132325e-06, 'epoch': 0.44}
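The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` entries in this log all come from annotation records whose stored `image_wh` has a side below the 28-pixel minimum enforced in `data_qwen_2.py`. Rather than failing inside `__getitem__` at step time, such records could be dropped in a one-pass pre-filter over the annotation list before training. The sketch below illustrates that check; the `MIN_IMAGE_SIZE` constant and both helper functions are hypothetical names, not part of the actual codebase:

```python
# Hypothetical pre-filter for annotation records shaped like the
# "Problematic sample" dicts in the log (each has an 'image_wh' list
# of [width, height] pairs). MIN_IMAGE_SIZE mirrors the logged
# "Minimum size is 28" constraint.
MIN_IMAGE_SIZE = 28


def is_valid_sample(sample: dict) -> bool:
    """Return False if any recorded image side is below the minimum."""
    for width, height in sample.get("image_wh", []):
        if width < MIN_IMAGE_SIZE or height < MIN_IMAGE_SIZE:
            return False
    return True


def filter_samples(samples: list[dict]) -> list[dict]:
    """Keep only samples whose images meet the size constraint."""
    kept = [s for s in samples if is_valid_sample(s)]
    dropped = len(samples) - len(kept)
    if dropped:
        print(f"Dropped {dropped} sample(s) below {MIN_IMAGE_SIZE}px")
    return kept
```

Applied to the geoqa+ record above (`image_wh` of `[[133, 25]]`), the sample would be rejected up front because its height of 25 px is under the threshold, avoiding the retry-and-traceback cycle seen in the log.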
44%|████▍ | 9773/22095 [16:23:53<12:05:40, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 44%|████▍ | 9774/22095 [16:23:58<13:53:56, 4.06s/it] {'loss': 0.4777, 'grad_norm': 0.5247511503156523, 'learning_rate': 6.166229056214665e-06, 'epoch': 0.44} 44%|████▍ | 9774/22095 [16:23:58<13:53:56, 4.06s/it] 44%|████▍ | 9775/22095 [16:24:02<13:47:42, 4.03s/it] {'loss': 0.3281, 'grad_norm': 0.6002895767581452, 'learning_rate': 6.165516338238324e-06, 'epoch': 0.44} 44%|████▍ | 9775/22095 [16:24:02<13:47:42, 4.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9776/22095 [16:24:05<13:18:05, 3.89s/it] {'loss': 0.3679, 'grad_norm': 0.6204952381918223, 'learning_rate': 6.164803595218618e-06, 'epoch': 0.44} 44%|████▍ | 9776/22095 [16:24:05<13:18:05, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (123074 > 40960). 
Running this sequence through the model will result in indexing errors 44%|████▍ | 9777/22095 [16:24:09<13:09:21, 3.84s/it] {'loss': 0.288, 'grad_norm': 0.5812317480029556, 'learning_rate': 6.16409082717086e-06, 'epoch': 0.44} 44%|████▍ | 9777/22095 [16:24:09<13:09:21, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 44%|████▍ | 9778/22095 [16:24:18<18:53:47, 5.52s/it] {'loss': 0.4954, 'grad_norm': 0.3027247180783634, 'learning_rate': 6.163378034110364e-06, 'epoch': 0.44} 44%|████▍ | 9778/22095 [16:24:19<18:53:47, 5.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9779/22095 [16:24:22<17:14:57, 5.04s/it] {'loss': 0.3415, 'grad_norm': 0.5976073792363202, 'learning_rate': 6.162665216052448e-06, 'epoch': 0.44} 44%|████▍ | 9779/22095 [16:24:22<17:14:57, 5.04s/it] 44%|████▍ | 9780/22095 [16:24:44<34:04:42, 9.96s/it] {'loss': 0.3267, 'grad_norm': 0.6213027447734806, 'learning_rate': 6.161952373012427e-06, 'epoch': 0.44} 44%|████▍ | 9780/22095 [16:24:44<34:04:42, 9.96s/it] 44%|████▍ | 9781/22095 [16:25:07<47:36:44, 13.92s/it] {'loss': 0.3131, 'grad_norm': 0.6648348362802524, 'learning_rate': 6.161239505005618e-06, 'epoch': 0.44} 44%|████▍ | 9781/22095 [16:25:07<47:36:44, 13.92s/it] 44%|████▍ | 9782/22095 [16:25:10<36:43:22, 10.74s/it] {'loss': 0.3248, 'grad_norm': 0.6201774829422037, 'learning_rate': 6.160526612047339e-06, 'epoch': 0.44} 44%|████▍ | 9782/22095 [16:25:10<36:43:22, 10.74s/it] 44%|████▍ | 9783/22095 [16:25:14<29:39:31, 8.67s/it] {'loss': 0.3755, 'grad_norm': 0.5839515753747767, 'learning_rate': 6.159813694152907e-06, 'epoch': 0.44} 44%|████▍ | 9783/22095 [16:25:14<29:39:31, 8.67s/it] 44%|████▍ | 9784/22095 [16:25:18<24:27:52, 7.15s/it] {'loss': 0.36, 'grad_norm': 0.6353528762384202, 'learning_rate': 6.1591007513376425e-06, 'epoch': 0.44} 44%|████▍ | 9784/22095 [16:25:18<24:27:52, 7.15s/it]Invalidate trace cache @ step 2: expected module 1, but 
got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47765 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9785/22095 [16:25:27<26:10:12, 7.65s/it] {'loss': 0.4675, 'grad_norm': 0.406392908771862, 'learning_rate': 6.1583877836168615e-06, 'epoch': 0.44} 44%|████▍ | 9785/22095 [16:25:27<26:10:12, 7.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67899 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67068 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9786/22095 [16:25:30<21:47:31, 6.37s/it] {'loss': 0.3431, 'grad_norm': 0.7766715682857842, 'learning_rate': 6.157674791005884e-06, 'epoch': 0.44} 44%|████▍ | 9786/22095 [16:25:30<21:47:31, 6.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9787/22095 [16:25:33<18:24:59, 5.39s/it] {'loss': 0.3392, 'grad_norm': 0.6466693556842529, 'learning_rate': 6.1569617735200314e-06, 'epoch': 0.44} 44%|████▍ | 9787/22095 [16:25:33<18:24:59, 5.39s/it] 44%|████▍ | 9788/22095 [16:25:36<15:43:47, 4.60s/it] {'loss': 0.3699, 'grad_norm': 0.6287656784939121, 'learning_rate': 6.156248731174623e-06, 'epoch': 0.44} 44%|████▍ | 9788/22095 [16:25:36<15:43:47, 4.60s/it] 44%|████▍ | 9789/22095 [16:25:39<14:20:32, 4.20s/it] {'loss': 0.3456, 'grad_norm': 0.6648703914194296, 'learning_rate': 6.155535663984982e-06, 'epoch': 0.44} 44%|████▍ | 9789/22095 [16:25:39<14:20:32, 4.20s/it] 44%|████▍ | 9790/22095 [16:25:43<13:57:06, 4.08s/it] {'loss': 0.3244, 
'grad_norm': 0.6262050918375093, 'learning_rate': 6.154822571966428e-06, 'epoch': 0.44} 44%|████▍ | 9790/22095 [16:25:43<13:57:06, 4.08s/it] 44%|████▍ | 9791/22095 [16:25:47<13:31:10, 3.96s/it] {'loss': 0.3541, 'grad_norm': 0.5743181017272367, 'learning_rate': 6.154109455134283e-06, 'epoch': 0.44} 44%|████▍ | 9791/22095 [16:25:47<13:31:10, 3.96s/it] 44%|████▍ | 9792/22095 [16:25:50<12:29:00, 3.65s/it] {'loss': 0.3589, 'grad_norm': 0.6474090918108056, 'learning_rate': 6.15339631350387e-06, 'epoch': 0.44} 44%|████▍ | 9792/22095 [16:25:50<12:29:00, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (92142 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57120 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110350 > 40960). 
Running this sequence through the model will result in indexing errors 44%|████▍ | 9793/22095 [16:25:59<18:24:47, 5.39s/it] {'loss': 0.4708, 'grad_norm': 0.2949861466946327, 'learning_rate': 6.152683147090514e-06, 'epoch': 0.44} 44%|████▍ | 9793/22095 [16:25:59<18:24:47, 5.39s/it] 44%|████▍ | 9794/22095 [16:26:08<22:02:11, 6.45s/it] {'loss': 0.4968, 'grad_norm': 0.3234561818756022, 'learning_rate': 6.151969955909536e-06, 'epoch': 0.44} 44%|████▍ | 9794/22095 [16:26:08<22:02:11, 6.45s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 44%|████▍ | 9795/22095 [16:26:11<18:47:12, 5.50s/it] {'loss': 0.4069, 'grad_norm': 0.6340148275150607, 'learning_rate': 6.151256739976264e-06, 'epoch': 0.44} 44%|████▍ | 9795/22095 [16:26:11<18:47:12, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57492 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41970 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107359 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9796/22095 [16:26:15<17:07:44, 5.01s/it] {'loss': 0.3435, 'grad_norm': 0.6197326211786413, 'learning_rate': 6.150543499306016e-06, 'epoch': 0.44} 44%|████▍ | 9796/22095 [16:26:15<17:07:44, 5.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (112394 > 40960). 
Running this sequence through the model will result in indexing errors
 44%|████▍ | 9797/22095 [16:26:19<15:39:40, 4.58s/it] {'loss': 0.3038, 'grad_norm': 0.5949256976740551, 'learning_rate': 6.149830233914127e-06, 'epoch': 0.44}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8894380 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17533, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 8cm\nB. 1lcm\nC. 13cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 44%|████▍ | 9798/22095 [16:26:23<15:10:40, 4.44s/it] {'loss': 0.3539, 'grad_norm': 0.6661137701878944, 'learning_rate': 6.149116943815915e-06, 'epoch': 0.44}
 44%|████▍ | 9799/22095 [16:26:27<14:29:34, 4.24s/it] {'loss': 0.3425, 'grad_norm': 0.6438231289467011, 'learning_rate': 6.148403629026709e-06, 'epoch': 0.44}
 44%|████▍ | 9800/22095 [16:26:29<13:07:43, 3.84s/it] {'loss': 0.3322, 'grad_norm': 0.585216600771846, 'learning_rate': 6.147690289561836e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 44%|████▍ | 9801/22095 [16:26:39<19:09:51, 5.61s/it] {'loss': 0.4644, 'grad_norm': 0.3512680335430706, 'learning_rate': 6.146976925436625e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (47206 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74236 > 40960).
Running this sequence through the model will result in indexing errors 44%|████▍ | 9802/22095 [16:26:43<17:16:04, 5.06s/it] {'loss': 0.2967, 'grad_norm': 0.6362038853976834, 'learning_rate': 6.146263536666401e-06, 'epoch': 0.44} 44%|████▍ | 9802/22095 [16:26:43<17:16:04, 5.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 44%|████▍ | 9803/22095 [16:26:51<20:24:01, 5.97s/it] {'loss': 0.4783, 'grad_norm': 0.33783993682468194, 'learning_rate': 6.145550123266496e-06, 'epoch': 0.44} 44%|████▍ | 9803/22095 [16:26:51<20:24:01, 5.97s/it] 44%|████▍ | 9804/22095 [16:27:00<23:29:19, 6.88s/it] {'loss': 0.4825, 'grad_norm': 0.3090639197217425, 'learning_rate': 6.1448366852522346e-06, 'epoch': 0.44} 44%|████▍ | 9804/22095 [16:27:00<23:29:19, 6.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 44%|████▍ | 9805/22095 [16:27:04<20:10:55, 5.91s/it] {'loss': 0.3328, 'grad_norm': 0.583863872091518, 'learning_rate': 6.144123222638952e-06, 'epoch': 0.44} 44%|████▍ | 9805/22095 [16:27:04<20:10:55, 5.91s/it] 44%|████▍ | 9806/22095 [16:27:07<17:46:48, 5.21s/it] {'loss': 0.3285, 'grad_norm': 0.6303943405691291, 'learning_rate': 6.143409735441972e-06, 'epoch': 0.44} 44%|████▍ | 9806/22095 [16:27:07<17:46:48, 5.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118888 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64013 > 40960). 
Running this sequence through the model will result in indexing errors 44%|████▍ | 9807/22095 [16:27:11<16:27:39, 4.82s/it] {'loss': 0.3571, 'grad_norm': 0.6284380418815342, 'learning_rate': 6.1426962236766294e-06, 'epoch': 0.44} 44%|████▍ | 9807/22095 [16:27:11<16:27:39, 4.82s/it] 44%|████▍ | 9808/22095 [16:27:33<33:29:45, 9.81s/it] {'loss': 0.3188, 'grad_norm': 0.6136933151196827, 'learning_rate': 6.141982687358255e-06, 'epoch': 0.44} 44%|████▍ | 9808/22095 [16:27:33<33:29:45, 9.81s/it] 44%|████▍ | 9809/22095 [16:27:55<45:50:48, 13.43s/it] {'loss': 0.346, 'grad_norm': 0.6394782780168193, 'learning_rate': 6.14126912650218e-06, 'epoch': 0.44} 44%|████▍ | 9809/22095 [16:27:55<45:50:48, 13.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 44%|████▍ | 9810/22095 [16:28:23<60:43:49, 17.80s/it] {'loss': 0.494, 'grad_norm': 0.3835124101312275, 'learning_rate': 6.140555541123737e-06, 'epoch': 0.44} 44%|████▍ | 9810/22095 [16:28:23<60:43:49, 17.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9811/22095 [16:28:26<46:17:17, 13.57s/it] {'loss': 0.3775, 'grad_norm': 0.6657903827089986, 'learning_rate': 6.1398419312382575e-06, 'epoch': 0.44} 44%|████▍ | 9811/22095 [16:28:26<46:17:17, 13.57s/it] 44%|████▍ | 9812/22095 [16:28:48<54:20:05, 15.92s/it] {'loss': 0.3312, 'grad_norm': 0.6194909844891778, 'learning_rate': 6.139128296861076e-06, 'epoch': 0.44} 44%|████▍ | 9812/22095 [16:28:48<54:20:05, 15.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43601 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77631 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45861 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54949 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9813/22095 [16:29:10<61:14:19, 17.95s/it] {'loss': 0.3239, 'grad_norm': 0.6453686197003785, 'learning_rate': 6.138414638007526e-06, 'epoch': 0.44} 44%|████▍ | 9813/22095 [16:29:10<61:14:19, 17.95s/it] 44%|████▍ | 9814/22095 [16:29:35<67:43:18, 19.85s/it] {'loss': 0.3028, 'grad_norm': 0.6254271915732517, 'learning_rate': 6.137700954692944e-06, 'epoch': 0.44} 44%|████▍ | 9814/22095 [16:29:35<67:43:18, 19.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 44%|████▍ | 9815/22095 [16:29:43<55:53:36, 16.39s/it] {'loss': 0.4782, 'grad_norm': 0.3586269129410446, 'learning_rate': 6.136987246932658e-06, 'epoch': 0.44} 44%|████▍ | 9815/22095 [16:29:43<55:53:36, 16.39s/it] 44%|████▍ | 9816/22095 [16:29:46<42:31:19, 12.47s/it] {'loss': 0.3272, 'grad_norm': 0.6018532392814256, 'learning_rate': 6.136273514742013e-06, 'epoch': 0.44} 44%|████▍ | 9816/22095 [16:29:46<42:31:19, 12.47s/it] 44%|████▍ | 9817/22095 [16:30:07<51:18:47, 15.05s/it] {'loss': 0.3353, 'grad_norm': 0.6227910130275708, 'learning_rate': 6.135559758136337e-06, 'epoch': 0.44} 44%|████▍ | 9817/22095 [16:30:07<51:18:47, 15.05s/it] 44%|████▍ | 9818/22095 [16:30:11<39:36:23, 11.61s/it] {'loss': 0.3647, 'grad_norm': 0.6510502346855318, 'learning_rate': 6.13484597713097e-06, 'epoch': 0.44} 44%|████▍ | 9818/22095 [16:30:11<39:36:23, 11.61s/it] 44%|████▍ | 9819/22095 [16:30:53<71:19:10, 20.91s/it] {'loss': 0.3081, 'grad_norm': 0.5901502996575975, 'learning_rate': 6.134132171741247e-06, 'epoch': 0.44} 44%|████▍ | 9819/22095 [16:30:54<71:19:10, 
20.91s/it]
 44%|████▍ | 9820/22095 [16:32:11<129:08:01, 37.87s/it] {'loss': 0.369, 'grad_norm': 0.6095995854076501, 'learning_rate': 6.133418341982509e-06, 'epoch': 0.44}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8404942 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7128, 'image': 'vrdu_table_final_2/astro-ph.CO/e6577296-a0cd-4548-986e-2ffd979cbb29.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
 44%|████▍ | 9821/22095 [16:32:58<138:01:04, 40.48s/it] {'loss': 0.4773, 'grad_norm': 0.30644324654506866, 'learning_rate': 6.132704487870091e-06, 'epoch': 0.44}
 44%|████▍ | 9822/22095 [16:33:21<120:35:05, 35.37s/it] {'loss': 0.343, 'grad_norm': 0.5812312860977836, 'learning_rate': 6.131990609419334e-06, 'epoch': 0.44}
 44%|████▍ | 9823/22095 [16:33:44<108:11:23, 31.74s/it] {'loss': 0.3077, 'grad_norm': 0.6686231801887372, 'learning_rate': 6.131276706645572e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (85314 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72658 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42524 > 40960). Running this sequence through the model will result in indexing errors
 44%|████▍ | 9824/22095 [16:34:06<97:46:08, 28.68s/it] {'loss': 0.3347, 'grad_norm': 0.5833261560869663, 'learning_rate': 6.130562779564151e-06, 'epoch': 0.44}
Token indices sequence length is longer than the specified maximum sequence length for this model (43592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45959 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79404 > 40960).
Running this sequence through the model will result in indexing errors 44%|████▍ | 9825/22095 [16:34:26<89:23:21, 26.23s/it] {'loss': 0.3275, 'grad_norm': 0.7014683950261718, 'learning_rate': 6.129848828190405e-06, 'epoch': 0.44} 44%|████▍ | 9825/22095 [16:34:26<89:23:21, 26.23s/it] 44%|████▍ | 9826/22095 [16:34:50<86:34:57, 25.41s/it] {'loss': 0.3752, 'grad_norm': 0.6994318413430156, 'learning_rate': 6.129134852539682e-06, 'epoch': 0.44} 44%|████▍ | 9826/22095 [16:34:50<86:34:57, 25.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 44%|████▍ | 9827/22095 [16:35:11<82:12:29, 24.12s/it] {'loss': 0.3189, 'grad_norm': 0.6038864527395649, 'learning_rate': 6.128420852627316e-06, 'epoch': 0.44} 44%|████▍ | 9827/22095 [16:35:11<82:12:29, 24.12s/it] 44%|████▍ | 9828/22095 [16:35:32<79:06:04, 23.21s/it] {'loss': 0.3217, 'grad_norm': 0.6531328968033617, 'learning_rate': 6.127706828468653e-06, 'epoch': 0.44} 44%|████▍ | 9828/22095 [16:35:32<79:06:04, 23.21s/it] 44%|████▍ | 9829/22095 [16:35:53<77:18:13, 22.69s/it] {'loss': 0.3516, 'grad_norm': 0.6903825925649602, 'learning_rate': 6.126992780079032e-06, 'epoch': 0.44} 44%|████▍ | 9829/22095 [16:35:54<77:18:13, 22.69s/it] 44%|████▍ | 9830/22095 [16:36:16<77:02:06, 22.61s/it] {'loss': 0.3222, 'grad_norm': 0.6003467816513118, 'learning_rate': 6.1262787074738e-06, 'epoch': 0.44} 44%|████▍ | 9830/22095 [16:36:16<77:02:06, 22.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58272 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99393 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109472 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88542 > 40960). Running this sequence through the model will result in indexing errors 44%|████▍ | 9831/22095 [16:37:15<114:14:11, 33.53s/it] {'loss': 0.3023, 'grad_norm': 0.5740307584921875, 'learning_rate': 6.125564610668294e-06, 'epoch': 0.44} 44%|████▍ | 9831/22095 [16:37:15<114:14:11, 33.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (87340 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69505 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78516 > 40960). 
Running this sequence through the model will result in indexing errors 44%|████▍ | 9832/22095 [16:37:24<89:36:40, 26.31s/it] {'loss': 0.4725, 'grad_norm': 0.33965468133552557, 'learning_rate': 6.124850489677865e-06, 'epoch': 0.44} 44%|████▍ | 9832/22095 [16:37:24<89:36:40, 26.31s/it] 45%|████▍ | 9833/22095 [16:37:52<91:25:36, 26.84s/it] {'loss': 0.4507, 'grad_norm': 0.2948191948780353, 'learning_rate': 6.1241363445178515e-06, 'epoch': 0.45} 45%|████▍ | 9833/22095 [16:37:52<91:25:36, 26.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9834/22095 [16:37:56<68:02:07, 19.98s/it] {'loss': 0.3551, 'grad_norm': 0.6054233565022529, 'learning_rate': 6.1234221752036015e-06, 'epoch': 0.45} 45%|████▍ | 9834/22095 [16:37:56<68:02:07, 19.98s/it] 45%|████▍ | 9835/22095 [16:38:38<90:26:12, 26.56s/it] {'loss': 0.3295, 'grad_norm': 0.5959582579867868, 'learning_rate': 6.122707981750458e-06, 'epoch': 0.45} 45%|████▍ | 9835/22095 [16:38:38<90:26:12, 26.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89993 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75416 > 40960). 
Running this sequence through the model will result in indexing errors
 45%|████▍ | 9836/22095 [16:39:21<106:26:50, 31.26s/it] {'loss': 0.342, 'grad_norm': 0.6347395989261895, 'learning_rate': 6.12199376417377e-06, 'epoch': 0.45}
 45%|████▍ | 9837/22095 [16:40:21<136:19:59, 40.04s/it] {'loss': 0.399, 'grad_norm': 0.6944384128632148, 'learning_rate': 6.121279522488881e-06, 'epoch': 0.45}
 45%|████▍ | 9838/22095 [16:42:05<201:13:55, 59.10s/it] {'loss': 0.3245, 'grad_norm': 1.8592269415278064, 'learning_rate': 6.120565256711138e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8379746 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46531, 'image': 'vrdu_table_final_2/astro-ph.CO/34b1dfc1-8ca0-4c70-a820-c8f12af2eff8.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}4 \\\\ 2\\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 45%|████▍ | 9839/22095 [16:42:34<170:34:38, 50.10s/it] {'loss': 0.4517, 'grad_norm': 0.5513116467297371, 'learning_rate': 6.11985096685589e-06, 'epoch': 0.45}
 45%|████▍ | 9840/22095 [16:43:01<147:26:24, 43.31s/it] {'loss': 0.4674, 'grad_norm': 0.5414042070571672, 'learning_rate': 6.1191366529384845e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
 45%|████▍ | 9841/22095 [16:43:05<107:19:17, 31.53s/it] {'loss': 0.3548, 'grad_norm': 0.6531904168417739, 'learning_rate': 6.118422314974269e-06, 'epoch': 0.45}
 45%|████▍ | 9842/22095 [16:43:28<98:47:37, 29.03s/it] {'loss': 0.3676, 'grad_norm': 0.7155330778152321, 'learning_rate': 6.117707952978593e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8409782 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11978, 'image': 'vrdu_table_final_2/astro-ph.CO/d68e3990-c36f-4d1c-a303-87443fa558a0.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]}
 45%|████▍ | 9843/22095 [16:44:30<131:44:01, 38.71s/it] {'loss': 0.3315, 'grad_norm': 0.820859464855041, 'learning_rate': 6.116993566966807e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 45%|████▍ | 9844/22095 [16:45:33<156:55:27, 46.11s/it] {'loss': 0.4529, 'grad_norm': 0.33759141723286856, 'learning_rate': 6.1162791569542576e-06, 'epoch': 0.45}
 45%|████▍ | 9845/22095 [16:45:37<113:49:55, 33.45s/it] {'loss': 0.3414, 'grad_norm': 0.6551364903781752, 'learning_rate': 6.1155647229562994e-06, 'epoch': 0.45}
 45%|████▍ | 9846/22095 [16:45:58<101:12:09, 29.74s/it] {'loss': 0.3875, 'grad_norm': 0.7403036860239424, 'learning_rate': 6.1148502649882805e-06, 'epoch': 0.45}
 45%|████▍ | 9847/22095 [16:46:55<129:22:11, 38.03s/it] {'loss': 0.3745, 'grad_norm': 0.7837013726439063, 'learning_rate': 6.114135783065553e-06, 'epoch': 0.45}
 45%|████▍ | 9848/22095 [16:47:18<113:03:33, 33.23s/it] {'loss': 0.407, 'grad_norm': 0.6477213100740911, 'learning_rate': 6.113421277203471e-06, 'epoch': 0.45}
 45%|████▍ | 9849/22095 [16:47:40<102:18:37, 30.08s/it] {'loss':
0.3495, 'grad_norm': 0.614216462619982, 'learning_rate': 6.112706747417384e-06, 'epoch': 0.45} 45%|████▍ | 9849/22095 [16:47:40<102:18:37, 30.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 9850/22095 [16:48:06<97:52:48, 28.78s/it] {'loss': 0.4765, 'grad_norm': 0.325186835765987, 'learning_rate': 6.111992193722647e-06, 'epoch': 0.45} 45%|████▍ | 9850/22095 [16:48:06<97:52:48, 28.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49464 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41771 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9851/22095 [16:48:29<92:00:30, 27.05s/it] {'loss': 0.3754, 'grad_norm': 0.71102109767876, 'learning_rate': 6.111277616134613e-06, 'epoch': 0.45} 45%|████▍ | 9851/22095 [16:48:29<92:00:30, 27.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (116148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42645 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54132 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (105224 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▍ | 9852/22095 [16:49:45<142:23:00, 41.87s/it] {'loss': 0.3431, 'grad_norm': 0.6230716349605854, 'learning_rate': 6.1105630146686345e-06, 'epoch': 0.45} 45%|████▍ | 9852/22095 [16:49:45<142:23:00, 41.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8359784 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 26505, 'image': 'vrdu_table_final_2/astro-ph.CO/b32f64b6-e3e1-4873-944b-5fb7331971a8.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 45%|████▍ | 9853/22095 [16:50:30<144:38:23, 42.53s/it] {'loss': 0.2981, 'grad_norm': 0.6124749160422528, 'learning_rate': 6.109848389340071e-06, 'epoch': 0.45} 45%|████▍ | 9853/22095 [16:50:30<144:38:23, 42.53s/it] 45%|████▍ | 9854/22095 [16:51:32<164:48:03, 48.47s/it] {'loss': 0.3268, 'grad_norm': 0.6262606678189672, 'learning_rate': 6.109133740164271e-06, 'epoch': 0.45} 45%|████▍ | 9854/22095 [16:51:32<164:48:03, 48.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53511 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63221 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42907 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47485 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77622 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9855/22095 [16:51:55<139:02:47, 40.90s/it] {'loss': 0.3142, 'grad_norm': 0.6227322143402623, 'learning_rate': 6.108419067156595e-06, 'epoch': 0.45} 45%|████▍ | 9855/22095 [16:51:55<139:02:47, 40.90s/it] 45%|████▍ | 9856/22095 [16:52:17<119:52:30, 35.26s/it] {'loss': 0.3388, 'grad_norm': 0.651331240442703, 'learning_rate': 6.1077043703323964e-06, 'epoch': 0.45} 45%|████▍ | 9856/22095 [16:52:17<119:52:30, 35.26s/it] 45%|████▍ | 9857/22095 [16:52:39<105:53:46, 31.15s/it] {'loss': 0.3347, 'grad_norm': 0.6447017477504052, 'learning_rate': 6.106989649707034e-06, 'epoch': 0.45} 45%|████▍ | 9857/22095 [16:52:39<105:53:46, 31.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 45%|████▍ | 9858/22095 [16:53:18<114:27:53, 33.67s/it] {'loss': 0.3285, 'grad_norm': 1.0241416300529436, 'learning_rate': 6.106274905295864e-06, 'epoch': 0.45} 45%|████▍ | 9858/22095 [16:53:18<114:27:53, 33.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 9859/22095 [16:53:28<89:53:41, 26.45s/it] {'loss': 0.4933, 'grad_norm': 0.3596739071402451, 'learning_rate': 
6.105560137114244e-06, 'epoch': 0.45} 45%|████▍ | 9859/22095 [16:53:28<89:53:41, 26.45s/it] 45%|████▍ | 9860/22095 [16:53:38<72:48:40, 21.42s/it] {'loss': 0.4793, 'grad_norm': 0.34139558368903, 'learning_rate': 6.1048453451775305e-06, 'epoch': 0.45} 45%|████▍ | 9860/22095 [16:53:38<72:48:40, 21.42s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9861/22095 [16:54:00<73:29:05, 21.62s/it] {'loss': 0.3348, 'grad_norm': 0.6831938419121877, 'learning_rate': 6.104130529501086e-06, 'epoch': 0.45} 45%|████▍ | 9861/22095 [16:54:00<73:29:05, 21.62s/it] 45%|████▍ | 9862/22095 [16:54:21<73:38:23, 21.67s/it] {'loss': 0.3799, 'grad_norm': 0.6882346739189739, 'learning_rate': 6.103415690100265e-06, 'epoch': 0.45} 45%|████▍ | 9862/22095 [16:54:21<73:38:23, 21.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (123074 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9863/22095 [16:54:31<61:13:05, 18.02s/it] {'loss': 0.459, 'grad_norm': 0.3176810590962061, 'learning_rate': 6.102700826990432e-06, 'epoch': 0.45} 45%|████▍ | 9863/22095 [16:54:31<61:13:05, 18.02s/it] 45%|████▍ | 9864/22095 [16:54:57<69:05:26, 20.34s/it] {'loss': 0.3489, 'grad_norm': 0.6221193294243147, 'learning_rate': 6.101985940186943e-06, 'epoch': 0.45} 45%|████▍ | 9864/22095 [16:54:57<69:05:26, 20.34s/it] 45%|████▍ | 9865/22095 [16:55:39<91:51:16, 27.04s/it] {'loss': 0.3321, 'grad_norm': 0.6180950453337573, 'learning_rate': 6.101271029705163e-06, 'epoch': 0.45} 45%|████▍ | 9865/22095 [16:55:39<91:51:16, 27.04s/it] 45%|████▍ | 9866/22095 [16:56:02<87:29:32, 25.76s/it] {'loss': 0.3362, 'grad_norm': 0.6067883624713181, 'learning_rate': 6.100556095560448e-06, 'epoch': 0.45} 45%|████▍ | 9866/22095 [16:56:02<87:29:32, 25.76s/it] 45%|████▍ | 9867/22095 [16:56:24<83:15:40, 24.51s/it] {'loss': 0.3529, 'grad_norm': 
0.6901054349850831, 'learning_rate': 6.099841137768164e-06, 'epoch': 0.45} 45%|████▍ | 9867/22095 [16:56:24<83:15:40, 24.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41436 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90381 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9868/22095 [16:57:25<120:10:47, 35.38s/it] {'loss': 0.3405, 'grad_norm': 0.7772309288047123, 'learning_rate': 6.099126156343672e-06, 'epoch': 0.45} 45%|████▍ | 9868/22095 [16:57:25<120:10:47, 35.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69497 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9869/22095 [16:57:47<107:27:06, 31.64s/it] {'loss': 0.3095, 'grad_norm': 0.6338037509155247, 'learning_rate': 6.098411151302335e-06, 'epoch': 0.45} 45%|████▍ | 9869/22095 [16:57:47<107:27:06, 31.64s/it] 45%|████▍ | 9870/22095 [16:57:51<78:45:28, 23.19s/it] {'loss': 0.365, 'grad_norm': 0.6478361223504291, 'learning_rate': 6.097696122659515e-06, 'epoch': 0.45} 45%|████▍ | 9870/22095 [16:57:51<78:45:28, 23.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74379 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46818 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120003 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▍ | 9871/22095 [16:58:00<64:52:09, 19.10s/it] {'loss': 0.5074, 'grad_norm': 0.3435502929420577, 'learning_rate': 6.096981070430577e-06, 'epoch': 0.45} 45%|████▍ | 9871/22095 [16:58:00<64:52:09, 19.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51410 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62212 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49312 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107984 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9872/22095 [16:58:23<68:52:23, 20.28s/it] {'loss': 0.3604, 'grad_norm': 0.5684308557539682, 'learning_rate': 6.096265994630886e-06, 'epoch': 0.45} 45%|████▍ | 9872/22095 [16:58:24<68:52:23, 20.28s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396975 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [17, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63828, 'image': 'vrdu_table_final_2/astro-ph.EP/30058f90-1f29-40d5-abf4-c10f35f66c91.png', 'image_wh': [[17, 14]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$\\omega$\\end{tabular}\n```"}]} 45%|████▍ | 9873/22095 [16:59:22<107:38:03, 31.70s/it] {'loss': 0.3511, 'grad_norm': 0.6546593740824457, 'learning_rate': 6.095550895275803e-06, 'epoch': 0.45} 45%|████▍ | 9873/22095 [16:59:22<107:38:03, 31.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43426 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46707 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9874/22095 [16:59:25<78:18:44, 23.07s/it] {'loss': 0.3389, 'grad_norm': 0.6351718584179072, 'learning_rate': 6.094835772380699e-06, 'epoch': 0.45} 45%|████▍ | 9874/22095 [16:59:25<78:18:44, 23.07s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 45%|████▍ | 9875/22095 [17:00:22<113:27:49, 33.43s/it] {'loss': 0.3783, 'grad_norm': 0.6333198074375345, 'learning_rate': 6.094120625960934e-06, 'epoch': 0.45} 45%|████▍ | 9875/22095 [17:00:22<113:27:49, 33.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 9876/22095 [17:00:47<104:14:35, 30.71s/it] {'loss': 0.4683, 'grad_norm': 0.28208508094965357, 'learning_rate': 6.09340545603188e-06, 'epoch': 0.45} 45%|████▍ | 9876/22095 [17:00:47<104:14:35, 30.71s/it] 45%|████▍ | 9877/22095 [17:01:14<100:37:56, 29.65s/it] {'loss': 0.463, 'grad_norm': 0.27034783904859405, 'learning_rate': 6.092690262608899e-06, 'epoch': 0.45} 45%|████▍ | 9877/22095 [17:01:14<100:37:56, 29.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9878/22095 [17:01:18<74:56:41, 22.08s/it] {'loss': 0.3384, 'grad_norm': 0.6585479271201585, 'learning_rate': 6.091975045707361e-06, 'epoch': 0.45} 45%|████▍ | 9878/22095 [17:01:18<74:56:41, 22.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62282 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▍ | 9879/22095 [17:01:46<80:12:09, 23.64s/it] {'loss': 0.4755, 'grad_norm': 0.3012915983187069, 'learning_rate': 6.091259805342632e-06, 'epoch': 0.45} 45%|████▍ | 9879/22095 [17:01:46<80:12:09, 23.64s/it] 45%|████▍ | 9880/22095 [17:02:14<85:03:46, 25.07s/it] {'loss': 0.4655, 'grad_norm': 0.30539675156875673, 'learning_rate': 6.0905445415300835e-06, 'epoch': 0.45} 45%|████▍ | 9880/22095 [17:02:14<85:03:46, 25.07s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9881/22095 [17:02:18<63:27:44, 18.71s/it] {'loss': 0.3302, 'grad_norm': 0.5853264373929327, 'learning_rate': 6.089829254285079e-06, 'epoch': 0.45} 45%|████▍ | 9881/22095 [17:02:18<63:27:44, 18.71s/it] 45%|████▍ | 9882/22095 [17:03:04<91:10:02, 26.87s/it] {'loss': 0.4778, 'grad_norm': 0.27688497076430446, 'learning_rate': 6.089113943622994e-06, 'epoch': 0.45} 45%|████▍ | 9882/22095 [17:03:04<91:10:02, 26.87s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9883/22095 [17:03:28<87:56:19, 25.92s/it] {'loss': 0.3475, 'grad_norm': 0.6386649390187925, 'learning_rate': 6.088398609559193e-06, 'epoch': 0.45} 45%|████▍ | 9883/22095 [17:03:28<87:56:19, 25.92s/it] 45%|████▍ | 9884/22095 [17:03:31<65:01:08, 19.17s/it] {'loss': 0.3465, 'grad_norm': 0.67766272206694, 'learning_rate': 6.08768325210905e-06, 'epoch': 0.45} 45%|████▍ | 9884/22095 [17:03:31<65:01:08, 19.17s/it] 45%|████▍ | 9885/22095 [17:03:53<67:43:17, 19.97s/it] {'loss': 0.3466, 'grad_norm': 0.5990159683409747, 'learning_rate': 6.086967871287934e-06, 'epoch': 0.45} 45%|████▍ | 9885/22095 [17:03:53<67:43:17, 19.97s/it] 45%|████▍ | 9886/22095 [17:04:16<71:01:04, 20.94s/it] {'loss': 0.3293, 'grad_norm': 0.617226860730486, 'learning_rate': 6.086252467111216e-06, 'epoch': 0.45} 45%|████▍ | 9886/22095 [17:04:16<71:01:04, 20.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices 
sequence length is longer than the specified maximum sequence length for this model (84264 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102554 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9887/22095 [17:04:44<78:06:44, 23.03s/it] {'loss': 0.4634, 'grad_norm': 0.32740738856566126, 'learning_rate': 6.0855370395942705e-06, 'epoch': 0.45} 45%|████▍ | 9887/22095 [17:04:44<78:06:44, 23.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 45%|████▍ | 9888/22095 [17:05:11<82:04:32, 24.21s/it] {'loss': 0.4757, 'grad_norm': 0.33128541122915056, 'learning_rate': 6.0848215887524665e-06, 'epoch': 0.45} 45%|████▍ | 9888/22095 [17:05:11<82:04:32, 24.21s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9889/22095 [17:05:14<60:45:36, 17.92s/it] {'loss': 0.3073, 'grad_norm': 0.6868043705289181, 'learning_rate': 6.084106114601178e-06, 'epoch': 0.45} 45%|████▍ | 9889/22095 [17:05:14<60:45:36, 17.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44050 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60276 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45258 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44659 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▍ | 9890/22095 [17:05:18<46:03:29, 13.59s/it] {'loss': 0.3492, 'grad_norm': 0.6463951687803111, 'learning_rate': 6.08339061715578e-06, 'epoch': 0.45} 45%|████▍ | 9890/22095 [17:05:18<46:03:29, 13.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50950 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9891/22095 [17:06:00<75:28:04, 22.26s/it] {'loss': 0.2924, 'grad_norm': 0.5876814576312829, 'learning_rate': 6.082675096431645e-06, 'epoch': 0.45} 45%|████▍ | 9891/22095 [17:06:00<75:28:04, 22.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 45%|████▍ | 9892/22095 [17:06:03<56:15:25, 16.60s/it] {'loss': 0.3545, 'grad_norm': 0.6016861692698359, 'learning_rate': 6.081959552444147e-06, 'epoch': 0.45} 45%|████▍ | 9892/22095 [17:06:03<56:15:25, 16.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045960 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 6\nB. 10\nC. 8\nD. 
7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 45%|████▍ | 9893/22095 [17:06:06<42:18:24, 12.48s/it] {'loss': 0.348, 'grad_norm': 0.607095977406509, 'learning_rate': 6.081243985208662e-06, 'epoch': 0.45} 45%|████▍ | 9893/22095 [17:06:06<42:18:24, 12.48s/it] 45%|████▍ | 9894/22095 [17:06:11<34:29:06, 10.18s/it] {'loss': 0.3336, 'grad_norm': 0.6987791654284069, 'learning_rate': 6.0805283947405625e-06, 'epoch': 0.45} 45%|████▍ | 9894/22095 [17:06:11<34:29:06, 10.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 9895/22095 [17:06:41<54:38:01, 16.12s/it] {'loss': 0.4934, 'grad_norm': 0.33277908746130674, 'learning_rate': 6.079812781055228e-06, 'epoch': 0.45} 45%|████▍ | 9895/22095 [17:06:41<54:38:01, 16.12s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67613 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79172 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42839 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▍ | 9896/22095 [17:06:44<41:32:18, 12.26s/it] {'loss': 0.3706, 'grad_norm': 0.7554432633335122, 'learning_rate': 6.0790971441680325e-06, 'epoch': 0.45} 45%|████▍ | 9896/22095 [17:06:44<41:32:18, 12.26s/it] 45%|████▍ | 9897/22095 [17:06:47<31:56:17, 9.43s/it] {'loss': 0.3155, 'grad_norm': 0.6070321705846707, 'learning_rate': 6.078381484094353e-06, 'epoch': 0.45} 45%|████▍ | 9897/22095 [17:06:47<31:56:17, 9.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 9898/22095 [17:06:57<32:02:32, 9.46s/it] {'loss': 0.4661, 'grad_norm': 0.2850215988806868, 'learning_rate': 6.077665800849568e-06, 'epoch': 0.45} 45%|████▍ | 9898/22095 [17:06:57<32:02:32, 9.46s/it] 45%|████▍ | 9899/22095 [17:07:27<53:27:49, 15.78s/it] {'loss': 0.4477, 'grad_norm': 0.27157890571522925, 'learning_rate': 6.076950094449055e-06, 'epoch': 0.45} 45%|████▍ | 9899/22095 [17:07:27<53:27:49, 15.78s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 45%|████▍ | 9900/22095 [17:07:31<41:09:43, 12.15s/it] {'loss': 0.3479, 'grad_norm': 0.637999409504055, 'learning_rate': 6.076234364908192e-06, 'epoch': 0.45} 45%|████▍ | 9900/22095 [17:07:31<41:09:43, 12.15s/it] 45%|████▍ | 9901/22095 [17:07:35<32:37:24, 9.63s/it] {'loss': 0.3614, 'grad_norm': 0.7043538513070119, 'learning_rate': 6.07551861224236e-06, 'epoch': 0.45} 45%|████▍ | 9901/22095 [17:07:35<32:37:24, 9.63s/it] 45%|████▍ | 9902/22095 [17:08:17<66:12:32, 19.55s/it] {'loss': 0.3262, 'grad_norm': 0.6120136354672181, 'learning_rate': 6.074802836466932e-06, 'epoch': 0.45} 45%|████▍ | 9902/22095 [17:08:17<66:12:32, 19.55s/it] 45%|████▍ | 9903/22095 [17:08:38<66:54:08, 19.75s/it] {'loss': 0.3485, 'grad_norm': 0.6531903882646292, 'learning_rate': 6.074087037597296e-06, 'epoch': 0.45} 45%|████▍ | 9903/22095 [17:08:38<66:54:08, 19.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▍ | 
9904/22095 [17:09:05<74:57:38, 22.14s/it] {'loss': 0.4707, 'grad_norm': 0.35229527558395135, 'learning_rate': 6.073371215648824e-06, 'epoch': 0.45} 45%|████▍ | 9904/22095 [17:09:05<74:57:38, 22.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49534 > 40960). Running this sequence through the model will result in indexing errors 45%|████▍ | 9905/22095 [17:09:09<56:24:26, 16.66s/it] {'loss': 0.3817, 'grad_norm': 0.6613695716254754, 'learning_rate': 6.072655370636905e-06, 'epoch': 0.45} 45%|████▍ | 9905/22095 [17:09:09<56:24:26, 16.66s/it] 45%|████▍ | 9906/22095 [17:09:53<83:50:17, 24.76s/it] {'loss': 0.3511, 'grad_norm': 0.6818675839553595, 'learning_rate': 6.071939502576916e-06, 'epoch': 0.45} 45%|████▍ | 9906/22095 [17:09:53<83:50:17, 24.76s/it] 45%|████▍ | 9907/22095 [17:09:57<63:01:30, 18.62s/it] {'loss': 0.3688, 'grad_norm': 0.6641385200315042, 'learning_rate': 6.071223611484238e-06, 'epoch': 0.45} 45%|████▍ | 9907/22095 [17:09:57<63:01:30, 18.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [406, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8519923 in VC:s3://internvl-moe-sft-data/. Exception: Image size [406, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 115376, 'image': 'vrdu_texteq/astro-ph.CO/4aa926ea-274a-411a-81f1-826e0e54e7ae.png', 'image_wh': [[406, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where we sum over $c$ from $1$ to $n$.'}]}
45%|████▍ | 9908/22095 [17:10:01<47:51:27, 14.14s/it] {'loss': 0.3579, 'grad_norm': 0.6913005755156559, 'learning_rate': 6.070507697374255e-06, 'epoch': 0.45}
45%|████▍ | 9909/22095 [17:10:23<56:18:56, 16.64s/it] {'loss': 0.3429, 'grad_norm': 0.6397984850762364, 'learning_rate': 6.06979176026235e-06, 'epoch': 0.45}
45%|████▍ | 9910/22095 [17:10:27<42:55:06, 12.68s/it] {'loss': 0.3481, 'grad_norm': 0.5996746250995699, 'learning_rate': 6.069075800163905e-06, 'epoch': 0.45}
45%|████▍ | 9911/22095 [17:10:30<33:17:30, 9.84s/it] {'loss': 0.3566, 'grad_norm': 0.6204333710521447, 'learning_rate': 6.068359817094305e-06, 'epoch': 0.45}
45%|████▍ | 9912/22095 [17:10:51<44:54:23, 13.27s/it] {'loss': 0.3729, 'grad_norm': 0.68970625469837, 'learning_rate': 6.067643811068933e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▍ | 9913/22095 [17:10:59<39:16:43, 11.61s/it] {'loss': 0.5064, 'grad_norm': 0.3392200840779211, 'learning_rate': 6.066927782103176e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (61928 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65066 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59806 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9914/22095 [17:11:27<56:13:30, 16.62s/it] {'loss': 0.5039, 'grad_norm': 0.30088581321615016, 'learning_rate': 6.066211730212416e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (46965 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106964 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9915/22095 [17:11:36<48:34:19, 14.36s/it] {'loss': 0.4639, 'grad_norm': 0.28778244578525586, 'learning_rate': 6.0654956554120415e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▍ | 9916/22095 [17:11:40<37:17:52, 11.02s/it] {'loss': 0.339, 'grad_norm': 0.6963911431712173, 'learning_rate': 6.064779557717437e-06, 'epoch': 0.45}
45%|████▍ | 9917/22095 [17:11:43<29:23:35, 8.69s/it] {'loss': 0.3543, 'grad_norm': 0.6489724625248927, 'learning_rate': 6.064063437143991e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (100209 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41005 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53709 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (49021 > 40960) for 4 sample(s). Truncating to 3062 with 2 samples.
45%|████▍ | 9918/22095 [17:11:47<24:29:57, 7.24s/it] {'loss': 0.3149, 'grad_norm': 0.6240808389505063, 'learning_rate': 6.063347293707089e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (96212 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62364 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59307 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9919/22095 [17:11:50<20:20:05, 6.01s/it] {'loss': 0.3497, 'grad_norm': 0.622898892449197, 'learning_rate': 6.06263112742212e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (42709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104321 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85507 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44995 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9920/22095 [17:11:54<18:10:10, 5.37s/it] {'loss': 0.357, 'grad_norm': 0.6727813964875871, 'learning_rate': 6.06191493830447e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (72339 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48565 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59641 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9921/22095 [17:11:58<16:37:42, 4.92s/it] {'loss': 0.3959, 'grad_norm': 0.6661519934507191, 'learning_rate': 6.061198726369531e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▍ | 9922/22095 [17:12:07<21:09:15, 6.26s/it] {'loss': 0.4497, 'grad_norm': 0.4142201447101892, 'learning_rate': 6.060482491632692e-06, 'epoch': 0.45}
45%|████▍ | 9923/22095 [17:12:14<21:52:49, 6.47s/it] {'loss': 0.4966, 'grad_norm': 0.393005672498883, 'learning_rate': 6.0597662341093385e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (91220 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9924/22095 [17:12:17<18:32:09, 5.48s/it] {'loss': 0.3212, 'grad_norm': 0.6824675162900966, 'learning_rate': 6.059049953814866e-06, 'epoch': 0.45}
45%|████▍ | 9925/22095 [17:12:21<16:30:37, 4.88s/it] {'loss': 0.374, 'grad_norm': 0.6373071535052531, 'learning_rate': 6.058333650764661e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43909 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9926/22095 [17:12:26<17:20:24, 5.13s/it] {'loss': 0.5098, 'grad_norm': 0.3104536489191795, 'learning_rate': 6.057617324974117e-06, 'epoch': 0.45}
45%|████▍ | 9927/22095 [17:12:33<19:15:03, 5.70s/it] {'loss': 0.4813, 'grad_norm': 0.3229491305240693, 'learning_rate': 6.056900976458624e-06, 'epoch': 0.45}
45%|████▍ | 9928/22095 [17:12:43<22:59:50, 6.80s/it] {'loss': 0.4662, 'grad_norm': 0.30806458088232463, 'learning_rate': 6.056184605233576e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (57136 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9929/22095 [17:12:52<25:39:25, 7.59s/it] {'loss': 0.473, 'grad_norm': 0.28375591384829485, 'learning_rate': 6.0554682113143634e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▍ | 9930/22095 [17:12:57<22:49:49, 6.76s/it] {'loss': 0.3145, 'grad_norm': 0.7300146494047709, 'learning_rate': 6.054751794716383e-06, 'epoch': 0.45}
45%|████▍ | 9931/22095 [17:13:01<19:52:07, 5.88s/it] {'loss': 0.3321, 'grad_norm': 0.6897056093316484, 'learning_rate': 6.054035355455023e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▍ | 9932/22095 [17:13:08<21:21:28, 6.32s/it] {'loss': 0.47, 'grad_norm': 0.36312836939211174, 'learning_rate': 6.053318893545683e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▍ | 9933/22095 [17:13:12<18:34:14, 5.50s/it] {'loss': 0.336, 'grad_norm': 0.6457343812310226, 'learning_rate': 6.052602409003752e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▍ | 9934/22095 [17:13:22<23:21:39, 6.92s/it] {'loss': 0.4805, 'grad_norm': 0.34105938676114667, 'learning_rate': 6.051885901844631e-06, 'epoch': 0.45}
45%|████▍ | 9935/22095 [17:13:26<20:08:30, 5.96s/it] {'loss': 0.3401, 'grad_norm': 0.675213898440973, 'learning_rate': 6.0511693720837115e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (42343 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9936/22095 [17:13:30<18:16:11, 5.41s/it] {'loss': 0.3298, 'grad_norm': 0.6437818053252721, 'learning_rate': 6.05045281973639e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▍ | 9937/22095 [17:13:39<22:07:39, 6.55s/it] {'loss': 0.4943, 'grad_norm': 0.28992887781932325, 'learning_rate': 6.049736244818064e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (67714 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46275 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9938/22095 [17:13:43<19:12:31, 5.69s/it] {'loss': 0.3324, 'grad_norm': 0.6648857665223082, 'learning_rate': 6.049019647344131e-06, 'epoch': 0.45}
45%|████▍ | 9939/22095 [17:13:45<16:16:07, 4.82s/it] {'loss': 0.3156, 'grad_norm': 0.5901260207845679, 'learning_rate': 6.048303027329987e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366673 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33419, 'image': 'vrdu_table_final_2/astro-ph.CO/625210ea-fd99-41e4-9ef3-fff633d66ddb.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$S_{9}$\\end{tabular}\n```"}]}
45%|████▍ | 9940/22095 [17:13:49<15:14:25, 4.51s/it] {'loss': 0.3203, 'grad_norm': 0.6111094970963682, 'learning_rate': 6.047586384791031e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (43930 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121388 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89762 > 40960). Running this sequence through the model will result in indexing errors
45%|████▍ | 9941/22095 [17:13:53<14:12:13, 4.21s/it] {'loss': 0.3222, 'grad_norm': 0.6613587091807451, 'learning_rate': 6.0468697197426595e-06, 'epoch': 0.45}
45%|████▍ | 9942/22095 [17:13:56<12:49:49, 3.80s/it] {'loss': 0.3667, 'grad_norm': 0.6615717557607383, 'learning_rate': 6.046153032200275e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (57794 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41691 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41501 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60618 > 40960). Running this sequence through the model will result in indexing errors
45%|████▌ | 9943/22095 [17:13:58<11:43:07, 3.47s/it] {'loss': 0.2944, 'grad_norm': 0.6185763797191234, 'learning_rate': 6.045436322179274e-06, 'epoch': 0.45}
45%|████▌ | 9944/22095 [17:14:01<11:13:45, 3.33s/it] {'loss': 0.3209, 'grad_norm': 0.5794541395057855, 'learning_rate': 6.044719589695056e-06, 'epoch': 0.45}
45%|████▌ | 9945/22095 [17:14:04<11:05:27, 3.29s/it] {'loss': 0.3123, 'grad_norm': 0.6453103902286667, 'learning_rate': 6.044002834763023e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▌ | 9946/22095 [17:14:13<16:23:19, 4.86s/it] {'loss': 0.4973, 'grad_norm': 0.3764726932362269, 'learning_rate': 6.043286057398576e-06, 'epoch': 0.45}
45%|████▌ | 9947/22095 [17:14:16<14:45:56, 4.38s/it] {'loss': 0.2867, 'grad_norm': 0.5987607755953244, 'learning_rate': 6.042569257617117e-06, 'epoch': 0.45}
45%|████▌ | 9948/22095 [17:14:20<13:48:42, 4.09s/it] {'loss': 0.3281, 'grad_norm': 0.6875835265469559, 'learning_rate': 6.041852435434044e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▌ | 9949/22095 [17:14:28<17:44:14, 5.26s/it] {'loss': 0.4826, 'grad_norm': 0.31049192884635557, 'learning_rate': 6.041135590864764e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [781, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8419352 in VC:s3://internvl-moe-sft-data/. Exception: Image size [781, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11568, 'image': 'vrdu_texteq/astro-ph.CO/20b4e07d-1d9a-410c-a28e-6f5299a11d47.png', 'image_wh': [[781, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'There is a small contribution to the $V$-band flux from the broad'}]}
45%|████▌ | 9950/22095 [17:14:31<15:39:10, 4.64s/it] {'loss': 0.3443, 'grad_norm': 0.6758650923785007, 'learning_rate': 6.040418723924677e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045994 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 4cm\nB. 5cm\nC. 无法确定\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
45%|████▌ | 9951/22095 [17:14:40<20:42:25, 6.14s/it] {'loss': 0.452, 'grad_norm': 0.28328861483255147, 'learning_rate': 6.039701834629189e-06, 'epoch': 0.45}
45%|████▌ | 9952/22095 [17:14:44<18:11:23, 5.39s/it] {'loss': 0.3517, 'grad_norm': 0.6284360416631046, 'learning_rate': 6.0389849229936995e-06, 'epoch': 0.45}
45%|████▌ | 9953/22095 [17:14:48<16:41:36, 4.95s/it] {'loss': 0.3349, 'grad_norm': 0.654522012389242, 'learning_rate': 6.038267989033616e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8299636 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'T1mLyYXvRaXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text information from this photo.'}, {'from': 'gpt', 'value': 'All words in the image:\n1英寸螺口弯针头\n(45度)\n1.5英寸\n鑫源电子产品\n规格:15G-27G\n管长:25mm'}]}
45%|████▌ | 9954/22095 [17:14:51<14:29:05, 4.29s/it] {'loss': 0.36, 'grad_norm': 0.6509169037023864, 'learning_rate': 6.03755103276434e-06, 'epoch': 0.45}
45%|████▌ | 9955/22095 [17:14:54<13:08:44, 3.90s/it] {'loss': 0.3259, 'grad_norm': 0.613634808440209, 'learning_rate': 6.036834054201283e-06, 'epoch': 0.45}
45%|████▌ | 9956/22095 [17:14:57<12:32:18, 3.72s/it] {'loss': 0.3353, 'grad_norm': 0.6591701614555701, 'learning_rate': 6.036117053359844e-06, 'epoch': 0.45}
45%|████▌ | 9957/22095 [17:15:01<12:29:16, 3.70s/it] {'loss': 0.3627, 'grad_norm': 0.6706779719525753, 'learning_rate': 6.035400030255431e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (50022 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81760 > 40960). Running this sequence through the model will result in indexing errors
45%|████▌ | 9958/22095 [17:15:04<11:49:03, 3.51s/it] {'loss': 0.3374, 'grad_norm': 0.5676917733584951, 'learning_rate': 6.034682984903453e-06, 'epoch': 0.45}
45%|████▌ | 9959/22095 [17:15:08<12:09:39, 3.61s/it] {'loss': 0.3101, 'grad_norm': 0.6145185773045998, 'learning_rate': 6.0339659173193146e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (60442 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68319 > 40960). Running this sequence through the model will result in indexing errors
45%|████▌ | 9960/22095 [17:15:12<12:39:06, 3.75s/it] {'loss': 0.3428, 'grad_norm': 0.6095296510631012, 'learning_rate': 6.033248827518424e-06, 'epoch': 0.45}
45%|████▌ | 9961/22095 [17:15:15<12:10:35, 3.61s/it] {'loss': 0.3494, 'grad_norm': 0.9770970068781155, 'learning_rate': 6.032531715516191e-06, 'epoch': 0.45}
45%|████▌ | 9962/22095 [17:15:19<12:11:15, 3.62s/it] {'loss': 0.3174, 'grad_norm': 0.6038847607884669, 'learning_rate': 6.03181458132802e-06, 'epoch': 0.45}
45%|████▌ | 9963/22095 [17:15:22<11:39:04, 3.46s/it] {'loss': 0.3378, 'grad_norm': 0.6458052103580482, 'learning_rate': 6.031097424969326e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▌ | 9964/22095 [17:15:25<11:12:38, 3.33s/it] {'loss': 0.3476, 'grad_norm': 0.6262492290191575, 'learning_rate': 6.030380246455513e-06, 'epoch': 0.45}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [128, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390166 in VC:s3://internvl-moe-sft-data/. Exception: Image size [128, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56985, 'image': 'vrdu_table_final_2/astro-ph.EP/4120329c-c193-4eb9-b5d0-8e09a3474c15.png', 'image_wh': [[128, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c} Resolution \\\\ \\end{tabular}\n```"}]}
45%|████▌ | 9965/22095 [17:15:28<11:35:56, 3.44s/it] {'loss': 0.3662, 'grad_norm': 0.6787994342237069, 'learning_rate': 6.0296630458019925e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▌ | 9966/22095 [17:15:35<14:59:10, 4.45s/it] {'loss': 0.4631, 'grad_norm': 0.47006450266382405, 'learning_rate': 6.028945823024176e-06, 'epoch': 0.45}
45%|████▌ | 9967/22095 [17:15:39<14:39:55, 4.35s/it] {'loss': 0.3742, 'grad_norm': 0.6496704316985095, 'learning_rate': 6.0282285781374746e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▌ | 9968/22095 [17:15:48<18:49:57, 5.59s/it] {'loss': 0.4825, 'grad_norm': 0.3344543499700861, 'learning_rate': 6.027511311157298e-06, 'epoch': 0.45}
45%|████▌ | 9969/22095 [17:15:51<16:28:56, 4.89s/it] {'loss': 0.3752, 'grad_norm': 0.6464649385526088, 'learning_rate': 6.026794022099061e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8482555 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39983, 'image': 'vrdu_texteq/astro-ph.CO/b130d4f3-5c28-459c-bc90-fe59e4789dc7.png', 'image_wh': [[350, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'Redshift slice $0.55 < z < 0.7$:'}]}
45%|████▌ | 9970/22095 [17:15:54<14:29:37, 4.30s/it] {'loss': 0.34, 'grad_norm': 0.6656227654498554, 'learning_rate': 6.026076710978172e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8403096 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5268, 'image': 'vrdu_table_final_2/astro-ph.CO/7ae129df-d6d7-44c4-adaf-39df5fc4b34a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [709, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8487922 in VC:s3://internvl-moe-sft-data/. Exception: Image size [709, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17301, 'image': 'vrdu_texteq/astro-ph.CO/370f445d-823c-45a6-9498-b3bede35e4b7.png', 'image_wh': [[709, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $V$ is the sum of all the Voronoi volumes in the void.'}]}
45%|████▌ | 9971/22095 [17:16:01<16:49:11, 4.99s/it] {'loss': 0.4541, 'grad_norm': 0.3244651461172815, 'learning_rate': 6.0253593778100475e-06, 'epoch': 0.45}
45%|████▌ | 9972/22095 [17:16:04<15:07:29, 4.49s/it] {'loss': 0.3385, 'grad_norm': 0.631254794194664, 'learning_rate': 6.0246420226100976e-06, 'epoch': 0.45}
45%|████▌ | 9973/22095 [17:16:07<13:31:44, 4.02s/it] {'loss': 0.3506, 'grad_norm': 0.6291399810343311, 'learning_rate': 6.023924645393739e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [339, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8496438 in VC:s3://internvl-moe-sft-data/. Exception: Image size [339, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 128730, 'image': 'vrdu_texteq/astro-ph.CO/ec1ebaa3-9e3c-471e-824d-11358d7c2fb2.png', 'image_wh': [[339, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'As $k\\to 0$ this is of the form'}]}
45%|████▌ | 9974/22095 [17:16:10<12:27:16, 3.70s/it] {'loss': 0.3311, 'grad_norm': 0.6180885336147697, 'learning_rate': 6.023207246176383e-06, 'epoch': 0.45}
45%|████▌ | 9975/22095 [17:16:14<12:32:23, 3.72s/it] {'loss': 0.3407, 'grad_norm': 0.6788040399146853, 'learning_rate': 6.0224898249734466e-06, 'epoch': 0.45}
45%|████▌ | 9976/22095 [17:16:17<12:37:19, 3.75s/it] {'loss': 0.3521, 'grad_norm': 0.7289625570092265, 'learning_rate': 6.021772381800344e-06, 'epoch': 0.45}
45%|████▌ | 9977/22095 [17:16:21<12:11:25, 3.62s/it] {'loss': 0.3814, 'grad_norm': 0.6269926510587083, 'learning_rate': 6.021054916672491e-06, 'epoch': 0.45}
45%|████▌ | 9978/22095 [17:16:24<12:17:37, 3.65s/it] {'loss': 0.3716, 'grad_norm': 0.6911981045719757, 'learning_rate': 6.020337429605304e-06, 'epoch': 0.45}
45%|████▌ | 9979/22095 [17:16:27<11:36:32, 3.45s/it] {'loss': 0.2915, 'grad_norm': 0.6034806566182572, 'learning_rate': 6.019619920614199e-06, 'epoch': 0.45}
45%|████▌ | 9980/22095 [17:16:31<12:02:13, 3.58s/it] {'loss': 0.3452, 'grad_norm': 0.62852179758487, 'learning_rate': 6.0189023897145944e-06, 'epoch': 0.45}
45%|████▌ | 9981/22095 [17:16:35<12:20:25, 3.67s/it] {'loss': 0.3315, 'grad_norm': 0.647496318788268, 'learning_rate': 6.0181848369219055e-06, 'epoch': 0.45}
45%|████▌ | 9982/22095 [17:16:39<12:31:09, 3.72s/it] {'loss': 0.3492, 'grad_norm': 0.6219216916623668, 'learning_rate': 6.017467262251553e-06, 'epoch': 0.45}
45%|████▌ | 9983/22095 [17:16:43<12:18:35, 3.66s/it] {'loss': 0.284, 'grad_norm': 0.5964205467527914, 'learning_rate': 6.016749665718953e-06, 'epoch': 0.45}
45%|████▌ | 9984/22095 [17:16:46<11:42:02, 3.48s/it] {'loss': 0.3743, 'grad_norm': 0.6214559322845244, 'learning_rate': 6.016032047339526e-06, 'epoch': 0.45}
45%|████▌ | 9985/22095 [17:16:49<11:45:01, 3.49s/it] {'loss': 0.3482, 'grad_norm': 0.5930174872567172, 'learning_rate': 6.01531440712869e-06, 'epoch': 0.45}
45%|████▌ | 9986/22095 [17:16:52<11:32:18, 3.43s/it] {'loss': 0.3815, 'grad_norm': 0.6644054531840476, 'learning_rate': 6.014596745101866e-06, 'epoch': 0.45}
45%|████▌ | 9987/22095 [17:16:55<11:05:10, 3.30s/it] {'loss': 0.3439, 'grad_norm': 0.6062908315932274, 'learning_rate': 6.0138790612744746e-06, 'epoch': 0.45}
45%|████▌ | 9988/22095 [17:16:59<11:29:18, 3.42s/it] {'loss': 0.3537, 'grad_norm': 0.6729276946757002, 'learning_rate': 6.013161355661935e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (227188 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45095 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57623 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70332 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43143 > 40960). Running this sequence through the model will result in indexing errors
45%|████▌ | 9989/22095 [17:17:05<13:59:12, 4.16s/it] {'loss': 0.3689, 'grad_norm': 0.7996378503409862, 'learning_rate': 6.01244362827967e-06, 'epoch': 0.45}
45%|████▌ | 9990/22095 [17:17:08<12:43:45, 3.79s/it] {'loss': 0.3175, 'grad_norm': 0.6228845676127899, 'learning_rate': 6.011725879143102e-06, 'epoch': 0.45}
45%|████▌ | 9991/22095 [17:17:11<11:46:57, 3.50s/it] {'loss': 0.3412, 'grad_norm': 0.6135257766245439, 'learning_rate': 6.01100810826765e-06, 'epoch': 0.45}
45%|████▌ | 9992/22095 [17:17:14<11:50:27, 3.52s/it] {'loss': 0.328, 'grad_norm': 0.6441029814830849, 'learning_rate': 6.0102903156687406e-06, 'epoch': 0.45}
45%|████▌ | 9993/22095 [17:17:18<12:27:29, 3.71s/it] {'loss': 0.3718, 'grad_norm': 0.6892365356820823, 'learning_rate': 6.009572501361794e-06, 'epoch': 0.45}
45%|████▌ | 9994/22095 [17:17:23<12:51:02, 3.82s/it] {'loss': 0.346, 'grad_norm': 0.6175618766242349, 'learning_rate': 6.008854665362236e-06, 'epoch': 0.45}
45%|████▌ | 9995/22095 [17:17:26<12:16:04, 3.65s/it] {'loss': 0.3737, 'grad_norm': 0.6940736989878789, 'learning_rate': 6.00813680768549e-06, 'epoch': 0.45}
45%|████▌ | 9996/22095 [17:17:30<12:23:14, 3.69s/it] {'loss': 0.3341, 'grad_norm': 0.6413699942659871, 'learning_rate': 6.007418928346979e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
45%|████▌ | 9997/22095 [17:17:39<18:19:22, 5.45s/it] {'loss': 0.5034, 'grad_norm': 0.46958215559153266, 'learning_rate': 6.0067010273621295e-06, 'epoch': 0.45}
45%|████▌ | 9998/22095 [17:17:42<15:56:56, 4.75s/it] {'loss': 0.3008, 'grad_norm': 0.6344552552649131, 'learning_rate': 6.005983104746367e-06, 'epoch': 0.45}
45%|████▌ | 9999/22095 [17:17:46<14:41:20, 4.37s/it] {'loss': 0.3428, 'grad_norm': 0.6182104080609797, 'learning_rate': 6.005265160515117e-06, 'epoch': 0.45}
45%|████▌ | 10000/22095 [17:17:49<13:44:06, 4.09s/it] {'loss': 0.3569, 'grad_norm': 0.6217252706723825, 'learning_rate': 6.004547194683806e-06, 'epoch': 0.45}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
45%|████▌ | 10001/22095 [17:18:24<44:49:22, 13.34s/it] {'loss': 0.3729, 'grad_norm': 0.6887992737610831, 'learning_rate': 6.003829207267863e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▌ | 10002/22095 [17:18:27<34:45:34, 10.35s/it] {'loss': 0.3274, 'grad_norm': 0.6410925267438082, 'learning_rate': 6.00311119828271e-06, 'epoch': 0.45}
45%|████▌ | 10003/22095 [17:18:31<28:03:01, 8.35s/it] {'loss': 0.359, 'grad_norm': 0.6721644066136669, 'learning_rate': 6.002393167743782e-06, 'epoch': 0.45}
45%|████▌ | 10004/22095 [17:18:35<23:13:24, 6.91s/it] {'loss': 0.3413, 'grad_norm': 0.6646472727003765, 'learning_rate': 6.001675115666501e-06, 'epoch': 0.45}
45%|████▌ | 10005/22095 [17:18:38<19:32:01, 5.82s/it] {'loss': 0.3476, 'grad_norm': 0.6296735501659243, 'learning_rate': 6.000957042066299e-06, 'epoch': 0.45}
45%|████▌ | 10006/22095 [17:18:41<16:47:50, 5.00s/it] {'loss': 0.3066, 'grad_norm': 0.6067665919217092, 'learning_rate': 6.0002389469586035e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (64577 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55771 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109741 > 40960). Running this sequence through the model will result in indexing errors
45%|████▌ | 10007/22095 [17:18:50<20:42:39, 6.17s/it] {'loss': 0.493, 'grad_norm': 0.39061303340342207, 'learning_rate': 5.999520830358845e-06, 'epoch': 0.45}
45%|████▌ | 10008/22095 [17:18:54<18:42:22, 5.57s/it] {'loss': 0.3357, 'grad_norm': 0.7134542623683294, 'learning_rate': 5.998802692282454e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
45%|████▌ | 10009/22095 [17:18:58<16:34:42, 4.94s/it] {'loss': 0.3566, 'grad_norm': 0.7105305038574864, 'learning_rate': 5.998084532744861e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (52069 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54020 > 40960). Running this sequence through the model will result in indexing errors 45%|████▌ | 10010/22095 [17:19:01<14:48:54, 4.41s/it] {'loss': 0.3439, 'grad_norm': 0.6754264853847363, 'learning_rate': 5.997366351761497e-06, 'epoch': 0.45} 45%|████▌ | 10010/22095 [17:19:01<14:48:54, 4.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76844 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65827 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47463 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74431 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▌ | 10011/22095 [17:19:05<14:33:06, 4.34s/it] {'loss': 0.3478, 'grad_norm': 0.6172224892039797, 'learning_rate': 5.996648149347794e-06, 'epoch': 0.45} 45%|████▌ | 10011/22095 [17:19:05<14:33:06, 4.34s/it] 45%|████▌ | 10012/22095 [17:19:08<13:02:44, 3.89s/it] {'loss': 0.359, 'grad_norm': 0.6040767266365203, 'learning_rate': 5.995929925519181e-06, 'epoch': 0.45} 45%|████▌ | 10012/22095 [17:19:08<13:02:44, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 45%|████▌ | 10013/22095 [17:19:11<12:08:08, 3.62s/it] {'loss': 0.3322, 'grad_norm': 0.6136311395265972, 'learning_rate': 5.9952116802910945e-06, 'epoch': 0.45} 45%|████▌ | 10013/22095 [17:19:11<12:08:08, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (101987 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115727 > 40960). 
Running this sequence through the model will result in indexing errors 45%|████▌ | 10014/22095 [17:19:15<12:40:10, 3.78s/it] {'loss': 0.3624, 'grad_norm': 0.6227029274185454, 'learning_rate': 5.994493413678964e-06, 'epoch': 0.45} 45%|████▌ | 10014/22095 [17:19:15<12:40:10, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▌ | 10015/22095 [17:19:25<18:39:47, 5.56s/it] {'loss': 0.4818, 'grad_norm': 0.3313337340020232, 'learning_rate': 5.993775125698226e-06, 'epoch': 0.45} 45%|████▌ | 10015/22095 [17:19:25<18:39:47, 5.56s/it] 45%|████▌ | 10016/22095 [17:19:28<16:18:54, 4.86s/it] {'loss': 0.3081, 'grad_norm': 0.5690430086005838, 'learning_rate': 5.993056816364312e-06, 'epoch': 0.45} 45%|████▌ | 10016/22095 [17:19:28<16:18:54, 4.86s/it] 45%|████▌ | 10017/22095 [17:19:31<14:41:02, 4.38s/it] {'loss': 0.4077, 'grad_norm': 0.6301972050072376, 'learning_rate': 5.992338485692657e-06, 'epoch': 0.45} 45%|████▌ | 10017/22095 [17:19:31<14:41:02, 4.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▌ | 10018/22095 [17:19:41<19:48:28, 5.90s/it] {'loss': 0.4723, 'grad_norm': 0.28937216440665076, 'learning_rate': 5.991620133698694e-06, 'epoch': 0.45} 45%|████▌ | 10018/22095 [17:19:41<19:48:28, 5.90s/it] 45%|████▌ | 10019/22095 [17:19:50<23:20:02, 6.96s/it] {'loss': 0.4923, 'grad_norm': 0.4170841645776839, 'learning_rate': 5.990901760397863e-06, 'epoch': 0.45} 45%|████▌ | 10019/22095 [17:19:50<23:20:02, 6.96s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [514, 23, 100, 100] is too small. 
Minimum size is 28. [Try #0] Failed to fetch sample 8511874 in VC:s3://internvl-moe-sft-data/. Exception: Image size [514, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 40703, 'image': 'vrdu_texteq/astro-ph.CO/537f87cb-ad0d-4871-b834-14be71b565a6.png', 'image_wh': [[514, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': "We will call it ``const $\\Delta z$'' model hereafter."}]} 45%|████▌ | 10020/22095 [17:19:54<20:19:36, 6.06s/it] {'loss': 0.3438, 'grad_norm': 0.626958759180911, 'learning_rate': 5.990183365805594e-06, 'epoch': 0.45} 45%|████▌ | 10020/22095 [17:19:54<20:19:36, 6.06s/it] 45%|████▌ | 10021/22095 [17:19:57<17:31:26, 5.22s/it] {'loss': 0.3539, 'grad_norm': 0.7145696132172378, 'learning_rate': 5.989464949937328e-06, 'epoch': 0.45} 45%|████▌ | 10021/22095 [17:19:57<17:31:26, 5.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 45%|████▌ | 10022/22095 [17:20:07<21:44:51, 6.48s/it] {'loss': 0.4736, 'grad_norm': 0.28916337801422026, 'learning_rate': 5.988746512808497e-06, 'epoch': 0.45} 45%|████▌ | 10022/22095 [17:20:07<21:44:51, 6.48s/it] 45%|████▌ | 10023/22095 [17:20:10<18:27:41, 5.51s/it] {'loss': 0.3466, 'grad_norm': 0.8029091456258143, 'learning_rate': 5.988028054434542e-06, 'epoch': 0.45} 45%|████▌ | 10023/22095 [17:20:10<18:27:41, 5.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41494 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97857 > 40960). 
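The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` failures above are raised for samples whose recorded `image_wh` has a side below the 28 px minimum. One way to avoid burning retries at training time is to screen the annotation list up front. A minimal sketch, assuming each sample carries the `image_wh` field shown in the log (the helper name and `min_side` parameter are hypothetical, not the repo's actual API):

```python
# Hypothetical pre-filter: drop annotations whose recorded image size has a
# side shorter than the model's minimum (28 px in this run's log).
def filter_small_images(samples, min_side=28):
    kept = []
    for sample in samples:
        # `image_wh` is a list of [width, height] pairs, one per image.
        sizes = sample.get("image_wh", [])
        if all(w >= min_side and h >= min_side for w, h in sizes):
            kept.append(sample)
    return kept

samples = [
    {"image": "ok.png", "image_wh": [[514, 230]]},
    {"image": "too_small.png", "image_wh": [[514, 23]]},  # height 23 < 28 -> dropped
]
print([s["image"] for s in filter_small_images(samples)])  # -> ['ok.png']
```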
 45%|████▌ | 10024/22095 [17:20:14<16:53:32, 5.04s/it] {'loss': 0.3498, 'grad_norm': 0.660620246553375, 'learning_rate': 5.987309574830897e-06, 'epoch': 0.45}
 45%|████▌ | 10025/22095 [17:20:17<14:39:47, 4.37s/it] {'loss': 0.3371, 'grad_norm': 0.6306767920116505, 'learning_rate': 5.986591074013002e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 45%|████▌ | 10026/22095 [17:20:20<13:46:00, 4.11s/it] {'loss': 0.3572, 'grad_norm': 0.6301389145434231, 'learning_rate': 5.985872551996294e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 45%|████▌ | 10027/22095 [17:20:24<13:07:43, 3.92s/it] {'loss': 0.3909, 'grad_norm': 0.653586722565106, 'learning_rate': 5.9851540087962134e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047591 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, point C lies on segment AB, M is the midpoint of AC, and N is the midpoint of BC. If MC is 2cm longer than NC, then AC is longer than BC by ()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 45%|████▌ | 10028/22095 [17:20:27<12:24:52, 3.70s/it] {'loss': 0.3169, 'grad_norm': 0.7102396128575281, 'learning_rate': 5.984435444428199e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8405746 in VC:s3://internvl-moe-sft-data/. Exception: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7933, 'image': 'vrdu_table_final_2/astro-ph.CO/f7c6018e-a885-4370-9434-75c86c1a7d62.png', 'image_wh': [[109, 20]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha - \\alpha_{\rm true}$\\end{tabular}\n```"}]}
 45%|████▌ | 10029/22095 [17:20:32<13:51:58, 4.14s/it] {'loss': 0.4836, 'grad_norm': 0.31410983897179384, 'learning_rate': 5.9837168589076915e-06, 'epoch': 0.45}
 45%|████▌ | 10030/22095 [17:20:36<13:33:08, 4.04s/it] {'loss': 0.3427, 'grad_norm': 0.6940641462750593, 'learning_rate': 5.982998252250127e-06, 'epoch': 0.45}
 45%|████▌ | 10031/22095 [17:20:39<12:46:18, 3.81s/it] {'loss': 0.3049, 'grad_norm': 0.829965545364354, 'learning_rate': 5.982279624470951e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 45%|████▌ | 10032/22095 [17:20:47<16:51:38, 5.03s/it] {'loss': 0.4807, 'grad_norm': 0.32688203077652456, 'learning_rate': 5.981560975585604e-06, 'epoch': 0.45}
 45%|████▌ | 10033/22095 [17:20:52<16:29:31, 4.92s/it] {'loss': 0.3039, 'grad_norm': 0.6398806724095104, 'learning_rate': 5.980842305609524e-06, 'epoch': 0.45}
 45%|████▌ | 10034/22095 [17:20:56<15:28:27, 4.62s/it] {'loss': 0.369, 'grad_norm': 3.293531905029241, 'learning_rate': 5.9801236145581575e-06, 'epoch': 0.45}
 45%|████▌ | 10035/22095 [17:20:59<14:31:41, 4.34s/it] {'loss': 0.3643, 'grad_norm': 0.5979734329060953, 'learning_rate': 5.979404902446944e-06, 'epoch': 0.45}
 45%|████▌ | 10036/22095 [17:21:03<13:39:49, 4.08s/it] {'loss': 0.3453, 'grad_norm': 0.6247016742930394, 'learning_rate': 5.978686169291325e-06, 'epoch': 0.45}
 45%|████▌ | 10037/22095 [17:21:06<12:26:39, 3.72s/it] {'loss': 0.3212, 'grad_norm': 0.8567027597175304, 'learning_rate': 5.977967415106748e-06, 'epoch': 0.45}
 45%|████▌ | 10038/22095 [17:21:10<12:56:49, 3.87s/it] {'loss': 0.3437, 'grad_norm': 0.6229950767660342, 'learning_rate': 5.977248639908655e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 45%|████▌ | 10039/22095 [17:21:18<17:24:05, 5.20s/it] {'loss': 0.4906, 'grad_norm': 0.3089428583891721, 'learning_rate': 5.976529843712489e-06, 'epoch': 0.45}
 45%|████▌ | 10040/22095 [17:21:22<15:49:37, 4.73s/it] {'loss': 0.3332, 'grad_norm': 0.6260978746591849, 'learning_rate': 5.975811026533698e-06, 'epoch': 0.45}
 45%|████▌ | 10041/22095 [17:21:25<14:30:59, 4.34s/it] {'loss': 0.3336, 'grad_norm': 0.838058394573743, 'learning_rate': 5.975092188387722e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 45%|████▌ | 10042/22095 [17:21:28<13:13:50, 3.95s/it] {'loss': 0.3577, 'grad_norm': 0.7191431229815619, 'learning_rate': 5.974373329290012e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (46745, 48302, 41027 > 40960). Running this sequence through the model will result in indexing errors
 45%|████▌ | 10043/22095 [17:21:31<12:28:25, 3.73s/it] {'loss': 0.314, 'grad_norm': 1.3640864595801048, 'learning_rate': 5.97365444925601e-06, 'epoch': 0.45}
 45%|████▌ | 10044/22095 [17:21:34<11:33:37, 3.45s/it] {'loss': 0.3784, 'grad_norm': 0.6553645603471241, 'learning_rate': 5.972935548301165e-06, 'epoch': 0.45}
 45%|████▌ | 10045/22095 [17:21:38<11:46:17, 3.52s/it] {'loss': 0.3119, 'grad_norm': 0.6919586894797458, 'learning_rate': 5.972216626440923e-06, 'epoch': 0.45}
 45%|████▌ | 10046/22095 [17:21:41<11:44:26, 3.51s/it] {'loss': 0.3095, 'grad_norm': 0.6058196014582589, 'learning_rate': 5.971497683690732e-06, 'epoch': 0.45}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387776 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54588, 'image': 'vrdu_table_final_2/astro-ph.CO/082fc70d-86d4-454d-afda-4d16619a610b.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
 45%|████▌ | 10047/22095 [17:21:45<11:58:42, 3.58s/it] {'loss': 0.3469, 'grad_norm': 0.5859664888367128, 'learning_rate': 5.970778720066039e-06, 'epoch': 0.45}
 45%|████▌ | 10048/22095 [17:21:48<11:32:06, 3.45s/it] {'loss': 0.3582, 'grad_norm': 0.6679169505616824, 'learning_rate': 5.970059735582295e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 45%|████▌ | 10049/22095 [17:21:51<11:05:13, 3.31s/it] {'loss': 0.3352, 'grad_norm': 0.6748470447738677, 'learning_rate': 5.969340730254943e-06, 'epoch': 0.45}
Token indices sequence length is longer than the specified maximum sequence length for this model (71724, 43323, 96242 > 40960). Running this sequence through the model will result in indexing errors
 45%|████▌ | 10050/22095 [17:21:55<11:09:18, 3.33s/it] {'loss': 0.3526, 'grad_norm': 0.6253274352764578, 'learning_rate': 5.96862170409944e-06, 'epoch': 0.45}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 45%|████▌ | 10051/22095 [17:22:00<13:39:57, 4.08s/it] {'loss': 0.5057, 'grad_norm': 0.3514815605156929, 'learning_rate': 5.967902657131228e-06, 'epoch': 0.45}
 45%|████▌ | 10052/22095 [17:22:04<13:29:59, 4.04s/it] {'loss': 0.3268, 'grad_norm': 0.6386680369414963, 'learning_rate': 5.967183589365761e-06, 'epoch': 0.45}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 45%|████▌ | 10053/22095 [17:22:07<12:16:12, 3.67s/it] {'loss': 0.3392, 'grad_norm': 0.6423595813623103, 'learning_rate': 5.96646450081849e-06, 'epoch': 0.45}
 46%|████▌ | 10054/22095 [17:22:10<11:53:51, 3.56s/it] {'loss': 0.3584, 'grad_norm': 0.5911899645875737, 'learning_rate': 5.965745391504866e-06, 'epoch': 0.46}
 46%|████▌ | 10055/22095 [17:22:14<12:06:32, 3.62s/it] {'loss': 0.3638, 'grad_norm': 0.6756603866248126, 'learning_rate': 5.965026261440338e-06, 'epoch': 0.46}
 46%|████▌ | 10056/22095 [17:22:19<12:56:37, 3.87s/it] {'loss': 0.2995, 'grad_norm': 0.5730749821872515, 'learning_rate': 5.964307110640359e-06, 'epoch': 0.46}
 46%|████▌ | 10057/22095 [17:22:23<13:01:57, 3.90s/it] {'loss': 0.3, 'grad_norm': 0.647423316803012, 'learning_rate': 5.963587939120383e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 46%|████▌ | 10058/22095 [17:22:33<19:36:23, 5.86s/it] {'loss': 0.5188, 'grad_norm': 0.43464516641582995, 'learning_rate': 5.962868746895863e-06, 'epoch': 0.46}
 46%|████▌ | 10059/22095 [17:22:37<17:18:07, 5.18s/it] {'loss': 0.3265, 'grad_norm': 0.6385870718981455, 'learning_rate': 5.962149533982249e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (41820, 88146, 122960 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10060/22095 [17:22:41<16:06:35, 4.82s/it] {'loss': 0.3227, 'grad_norm': 0.586098357918665, 'learning_rate': 5.961430300394996e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (47831, 60130 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10061/22095 [17:22:44<14:15:23, 4.26s/it] {'loss': 0.324, 'grad_norm': 0.6571832827328082, 'learning_rate': 5.960711046149561e-06, 'epoch': 0.46}
 46%|████▌ | 10062/22095 [17:22:48<14:01:29, 4.20s/it] {'loss': 0.3186, 'grad_norm': 0.6963240291123793, 'learning_rate': 5.959991771261393e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (145781 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10063/22095 [17:22:52<13:46:54, 4.12s/it] {'loss': 0.3183, 'grad_norm': 0.675978472481428, 'learning_rate': 5.959272475745953e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 46%|████▌ | 10064/22095 [17:22:56<13:35:35, 4.07s/it] {'loss': 0.3561, 'grad_norm': 0.6799292075558153, 'learning_rate': 5.958553159618693e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (56768 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10065/22095 [17:22:59<13:15:03, 3.97s/it] {'loss': 0.327, 'grad_norm': 0.6488099183480327, 'learning_rate': 5.957833822895069e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (71260, 45465, 60829, 64705, 101801 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10066/22095 [17:23:03<12:45:55, 3.82s/it] {'loss': 0.3028, 'grad_norm': 0.6392367712514024, 'learning_rate': 5.957114465590537e-06, 'epoch': 0.46}
 46%|████▌ | 10067/22095 [17:23:06<12:30:38, 3.74s/it] {'loss': 0.3145, 'grad_norm': 0.650807133681978, 'learning_rate': 5.9563950877205564e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (89089 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10068/22095 [17:23:16<18:30:05, 5.54s/it] {'loss': 0.4869, 'grad_norm': 0.35525035348306405, 'learning_rate': 5.955675689300583e-06, 'epoch': 0.46}
 46%|████▌ | 10069/22095 [17:23:25<22:23:34, 6.70s/it] {'loss': 0.4686, 'grad_norm': 0.3131661760499863, 'learning_rate': 5.954956270346074e-06, 'epoch': 0.46}
 46%|████▌ | 10070/22095 [17:23:32<22:10:32, 6.64s/it] {'loss': 0.4722, 'grad_norm': 0.28290393614239373, 'learning_rate': 5.954236830872486e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 46%|████▌ | 10071/22095 [17:23:36<19:24:10, 5.81s/it] {'loss': 0.3424, 'grad_norm': 0.6734508988009212, 'learning_rate': 5.953517370895281e-06, 'epoch': 0.46}
 46%|████▌ | 10072/22095 [17:23:40<17:16:57, 5.17s/it] {'loss': 0.3573, 'grad_norm': 0.6159152425039114, 'learning_rate': 5.9527978904299156e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 46%|████▌ | 10073/22095 [17:23:51<23:21:53, 7.00s/it] {'loss': 0.4681, 'grad_norm': 0.40209035644627683, 'learning_rate': 5.952078389491849e-06, 'epoch': 0.46}
 46%|████▌ | 10074/22095 [17:24:00<25:58:16, 7.78s/it] {'loss': 0.4741, 'grad_norm': 0.3838006111097636, 'learning_rate': 5.951358868096543e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 46%|████▌ | 10075/22095 [17:24:04<21:28:35, 6.43s/it] {'loss': 0.3061, 'grad_norm': 0.5806324459925772, 'learning_rate': 5.950639326259456e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (108352, 45654, 72977 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10076/22095 [17:24:11<22:05:18, 6.62s/it] {'loss': 0.4833, 'grad_norm': 0.2915926212204432, 'learning_rate': 5.949919763996049e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (46439, 61534 > 40960). Running this sequence through the model will result in indexing errors
 46%|████▌ | 10077/22095 [17:24:14<18:43:28, 5.61s/it] {'loss': 0.2768, 'grad_norm': 0.6698235355513487, 'learning_rate': 5.949200181321785e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047228 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, M is the midpoint of segment AB, N is a point on segment AM such that AN:MN = 1:2. If AN = 2cm, then segment AB = ()\nA. 10cm\nB. 12cm\nC. 6cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (47515, 85342 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46744 > 40960) for 4 sample(s). Truncating to 5784 with 3 samples.
 46%|████▌ | 10078/22095 [17:24:18<16:41:31, 5.00s/it] {'loss': 0.3365, 'grad_norm': 0.661875473907528, 'learning_rate': 5.948480578252124e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (49986 > 40960). Running this sequence through the model will result in indexing errors
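The tokenizer warnings and the `Rank 0: ... Truncating ...` message above both concern sequences exceeding the model's 40960-token window, and both point at the same mitigation: clamp tokenized sequences before they reach the model. A minimal sketch, assuming token IDs arrive as a plain Python list (the function name is hypothetical; 40960 is the maximum length reported in this log):

```python
# Hypothetical length clamp for tokenized samples that exceed the model's
# maximum sequence length (40960 in this run).
def clamp_to_max_len(input_ids, max_len=40960):
    if len(input_ids) > max_len:
        # Keep the first `max_len` tokens; a real pipeline might instead drop
        # the sample or truncate at a message boundary to avoid cutting a
        # conversation turn in half.
        return input_ids[:max_len]
    return input_ids

ids = list(range(45095))           # one of the offending lengths in the log
print(len(clamp_to_max_len(ids)))  # -> 40960
```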
Running this sequence through the model will result in indexing errors 46%|████▌ | 10079/22095 [17:24:21<14:50:38, 4.45s/it] {'loss': 0.334, 'grad_norm': 0.7092775293804304, 'learning_rate': 5.9477609548025295e-06, 'epoch': 0.46} 46%|████▌ | 10079/22095 [17:24:21<14:50:38, 4.45s/it] 46%|████▌ | 10080/22095 [17:24:24<13:28:14, 4.04s/it] {'loss': 0.3144, 'grad_norm': 0.6381995964901996, 'learning_rate': 5.9470413109884605e-06, 'epoch': 0.46} 46%|████▌ | 10080/22095 [17:24:24<13:28:14, 4.04s/it] 46%|████▌ | 10081/22095 [17:24:27<12:51:43, 3.85s/it] {'loss': 0.3292, 'grad_norm': 0.7031847456684301, 'learning_rate': 5.946321646825385e-06, 'epoch': 0.46} 46%|████▌ | 10081/22095 [17:24:27<12:51:43, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79000 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10082/22095 [17:24:30<12:06:58, 3.63s/it] {'loss': 0.3261, 'grad_norm': 0.6255334778708324, 'learning_rate': 5.945601962328762e-06, 'epoch': 0.46} 46%|████▌ | 10082/22095 [17:24:30<12:06:58, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59534 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80646 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43676 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10083/22095 [17:24:33<11:29:32, 3.44s/it] {'loss': 0.3444, 'grad_norm': 0.680176425307315, 'learning_rate': 5.9448822575140575e-06, 'epoch': 0.46} 46%|████▌ | 10083/22095 [17:24:33<11:29:32, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10084/22095 [17:24:43<17:36:21, 5.28s/it] {'loss': 0.4819, 'grad_norm': 0.5395328200815813, 'learning_rate': 5.944162532396735e-06, 'epoch': 0.46} 46%|████▌ | 10084/22095 [17:24:43<17:36:21, 5.28s/it] 46%|████▌ | 10085/22095 [17:24:53<21:56:46, 6.58s/it] {'loss': 0.5027, 'grad_norm': 0.3849787355614753, 'learning_rate': 5.94344278699226e-06, 'epoch': 0.46} 46%|████▌ | 10085/22095 [17:24:53<21:56:46, 6.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▌ | 10086/22095 [17:24:56<18:31:08, 5.55s/it] {'loss': 0.3238, 'grad_norm': 0.6424042333189114, 'learning_rate': 5.942723021316096e-06, 'epoch': 0.46} 46%|████▌ | 10086/22095 [17:24:56<18:31:08, 5.55s/it] 46%|████▌ | 10087/22095 [17:25:06<22:56:35, 6.88s/it] {'loss': 0.4707, 'grad_norm': 0.31396481615912764, 'learning_rate': 5.94200323538371e-06, 'epoch': 0.46} 46%|████▌ | 10087/22095 [17:25:06<22:56:35, 6.88s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. 
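The repeated `ValueError: Image size [...] is too small. Minimum size is 28.` records above come from a minimum-side check in `data_qwen_2.py`'s `_get_item`, which fires only after the sample has been fetched. A minimal sketch of filtering such samples out up front; the helper name and the standalone `samples` list are illustrative, only the 28-pixel threshold and the `image_wh` field shape come from the dumps in this log:

```python
# Hypothetical pre-filter sketch: drop samples whose image has any side
# below the 28-pixel minimum that _get_item enforces. Names are
# illustrative, not the actual training code.
MIN_SIDE = 28

def is_valid_sample(sample: dict) -> bool:
    """Return False if any (width, height) in image_wh is below MIN_SIDE,
    as with the [[230, 20]] and [[198, 24]] samples logged above."""
    for width, height in sample.get("image_wh", []):
        if width < MIN_SIDE or height < MIN_SIDE:
            return False
    return True

samples = [
    {"image": "calculation_images/5087.png", "image_wh": [[230, 20]]},
    {"image": "images/5014.png", "image_wh": [[198, 24]]},
    {"image": "ok.png", "image_wh": [[640, 480]]},
]
kept = [s for s in samples if is_valid_sample(s)]
print([s["image"] for s in kept])  # ['ok.png']
```

Filtering at dataset-build time would avoid the `[Try #0] Failed to fetch sample ...` retry path during training.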
[Try #0] Failed to fetch sample 8942808 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65961, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, given that segment AB = 12, extend segment AB to point C so that BC = \\frac{1}{2}AB. Point D is the midpoint of segment AC; the length of segment BD is ()\nA. 6\nB. 3\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
46%|████▌ | 10088/22095 [17:25:09<19:38:35, 5.89s/it] {'loss': 0.3282, 'grad_norm': 0.6379754659076351, 'learning_rate': 5.941283429210568e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10089/22095 [17:25:13<17:06:48, 5.13s/it] {'loss': 0.3359, 'grad_norm': 0.80606623399743, 'learning_rate': 5.940563602812136e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10090/22095 [17:25:17<16:10:37, 4.85s/it] {'loss': 0.5133, 'grad_norm': 0.5049334906739511, 'learning_rate': 5.939843756203881e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (100054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41082 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77326 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10091/22095 [17:25:25<19:19:56, 5.80s/it] {'loss': 0.4887, 'grad_norm': 0.4405229130785534, 'learning_rate': 5.939123889401269e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10092/22095 [17:25:29<17:26:28, 5.23s/it] {'loss': 0.3418, 'grad_norm': 0.6133949372009695, 'learning_rate': 5.9384040024197706e-06, 'epoch': 0.46}
46%|████▌ | 10093/22095 [17:25:32<15:20:15, 4.60s/it] {'loss': 0.2896, 'grad_norm': 0.6528224932137077, 'learning_rate': 5.937684095274852e-06, 'epoch': 0.46}
46%|████▌ | 10094/22095 [17:25:36<14:48:54, 4.44s/it] {'loss': 0.3126, 'grad_norm': 0.6773183075316381, 'learning_rate': 5.9369641679819825e-06, 'epoch': 0.46}
46%|████▌ | 10095/22095 [17:25:40<14:02:48, 4.21s/it] {'loss': 0.2843, 'grad_norm': 0.6173430889617483, 'learning_rate': 5.936244220556629e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (63974 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61187 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129071 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85821 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127804 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10096/22095 [17:25:44<13:55:42, 4.18s/it] {'loss': 0.3498, 'grad_norm': 0.6545161701377518, 'learning_rate': 5.935524253014263e-06, 'epoch': 0.46}
46%|████▌ | 10097/22095 [17:25:47<13:18:44, 3.99s/it] {'loss': 0.3623, 'grad_norm': 0.6926825859793615, 'learning_rate': 5.934804265370355e-06, 'epoch': 0.46}
46%|████▌ | 10098/22095 [17:25:52<13:44:05, 4.12s/it] {'loss': 0.3769, 'grad_norm': 0.6709432458048451, 'learning_rate': 5.934084257640374e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10099/22095 [17:26:02<19:56:38, 5.99s/it] {'loss': 0.4754, 'grad_norm': 0.7025990623404197, 'learning_rate': 5.933364229839791e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8411514 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
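The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings mark samples that tokenize past the model's 40960-token limit, and the log's own `Rank 0: ... Truncating to ...` lines show the loader already truncates overlong packs. A minimal sketch of that kind of length gate; the helper names are hypothetical and only the 40960 limit comes from the log:

```python
# Hypothetical length-gate sketch mirroring the overflow warnings above.
# A real run would apply this to the tokenizer's actual input_ids.
MAX_LEN = 40960

def within_limit(input_ids, max_len=MAX_LEN):
    """True if the sequence fits the model's maximum sequence length."""
    return len(input_ids) <= max_len

def truncate(input_ids, max_len=MAX_LEN):
    """Hard-truncate overlong sequences before they reach the model,
    where positions past max_len would cause indexing errors."""
    return input_ids[:max_len]

ids = list(range(45654))        # one of the lengths reported in this log
print(within_limit(ids))        # False
print(len(truncate(ids)))       # 40960
```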
Problematic sample: {'id': 13721, 'image': 'vrdu_table_final_2/astro-ph.CO/4af78706-c702-4053-8a4c-3aeabb9eec80.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
46%|████▌ | 10100/22095 [17:26:06<17:29:52, 5.25s/it] {'loss': 0.3262, 'grad_norm': 0.6522116942364439, 'learning_rate': 5.9326441819840785e-06, 'epoch': 0.46}
46%|████▌ | 10101/22095 [17:26:09<15:45:08, 4.73s/it] {'loss': 0.3285, 'grad_norm': 0.7262529789467604, 'learning_rate': 5.931924114088704e-06, 'epoch': 0.46}
46%|████▌ | 10102/22095 [17:26:13<14:46:38, 4.44s/it] {'loss': 0.2971, 'grad_norm': 0.6095318022692825, 'learning_rate': 5.931204026169146e-06, 'epoch': 0.46}
46%|████▌ | 10103/22095 [17:26:16<13:12:36, 3.97s/it] {'loss': 0.3, 'grad_norm': 0.6440862160573638, 'learning_rate': 5.930483918240871e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10104/22095 [17:26:26<19:23:53, 5.82s/it] {'loss': 0.4633, 'grad_norm': 0.3511639562008092, 'learning_rate': 5.929763790319355e-06, 'epoch': 0.46}
46%|████▌ | 10105/22095 [17:26:30<17:44:52, 5.33s/it] {'loss': 0.3472, 'grad_norm': 0.6597290674517585, 'learning_rate': 5.929043642420072e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (46447 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10106/22095 [17:26:34<16:33:04, 4.97s/it] {'loss': 0.3625, 'grad_norm': 0.6926020451172388, 'learning_rate': 5.928323474558492e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (53521 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41709 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10107/22095 [17:26:38<15:24:37, 4.63s/it] {'loss': 0.3357, 'grad_norm': 0.6236702207782863, 'learning_rate': 5.9276032867500935e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10108/22095 [17:26:42<14:35:58, 4.38s/it] {'loss': 0.3405, 'grad_norm': 0.6585100389290004, 'learning_rate': 5.926883079010348e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10109/22095 [17:26:45<13:20:05, 4.01s/it] {'loss': 0.3067, 'grad_norm': 0.6536198954693773, 'learning_rate': 5.926162851354733e-06, 'epoch': 0.46}
46%|████▌ | 10110/22095 [17:26:48<12:09:02, 3.65s/it] {'loss': 0.3512, 'grad_norm': 0.6709630553993108, 'learning_rate': 5.925442603798721e-06, 'epoch': 0.46}
46%|████▌ | 10111/22095 [17:26:51<11:59:35, 3.60s/it] {'loss': 0.3776, 'grad_norm': 0.6357049472328576, 'learning_rate': 5.924722336357793e-06, 'epoch': 0.46}
46%|████▌ | 10112/22095 [17:26:54<11:16:09, 3.39s/it] {'loss': 0.3589, 'grad_norm': 0.6401419572145257, 'learning_rate': 5.924002049047419e-06, 'epoch': 0.46}
46%|████▌ | 10113/22095 [17:26:58<11:34:23, 3.48s/it] {'loss': 0.3536, 'grad_norm': 0.6412345227285043, 'learning_rate': 5.92328174188308e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10114/22095 [17:27:03<13:44:01, 4.13s/it] {'loss': 0.4642, 'grad_norm': 0.35092114987906203, 'learning_rate': 5.922561414880253e-06, 'epoch': 0.46}
46%|████▌ | 10115/22095 [17:27:13<18:56:31, 5.69s/it] {'loss': 0.467, 'grad_norm': 0.3163893146299063, 'learning_rate': 5.9218410680544135e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
46%|████▌ | 10116/22095 [17:27:17<16:58:13, 5.10s/it] {'loss': 0.3683, 'grad_norm': 0.6642074094805996, 'learning_rate': 5.92112070142104e-06, 'epoch': 0.46}
46%|████▌ | 10117/22095 [17:27:20<15:09:26, 4.56s/it] {'loss': 0.3682, 'grad_norm': 0.6728589957006714, 'learning_rate': 5.920400314995612e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10118/22095 [17:27:23<13:29:13, 4.05s/it] {'loss': 0.353, 'grad_norm': 0.6188594622706161, 'learning_rate': 5.919679908793609e-06, 'epoch': 0.46}
46%|████▌ | 10119/22095 [17:27:26<12:53:41, 3.88s/it] {'loss': 0.3518, 'grad_norm': 0.6152141100731404, 'learning_rate': 5.91895948283051e-06, 'epoch': 0.46}
46%|████▌ | 10120/22095 [17:27:29<12:09:41, 3.66s/it] {'loss': 0.3551, 'grad_norm': 1.3605112239088757, 'learning_rate': 5.918239037121791e-06, 'epoch': 0.46}
46%|████▌ | 10121/22095 [17:27:33<12:12:55, 3.67s/it] {'loss': 0.3611, 'grad_norm': 0.6493685868114359, 'learning_rate': 5.917518571682938e-06, 'epoch': 0.46}
46%|████▌ | 10122/22095 [17:27:36<11:31:05, 3.46s/it] {'loss': 0.3527, 'grad_norm': 0.6374143410362819, 'learning_rate': 5.9167980865294285e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10123/22095 [17:27:39<11:02:54, 3.32s/it] {'loss': 0.3147, 'grad_norm': 0.6177615252904042, 'learning_rate': 5.916077581676743e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8883159 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6312, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, given segment AB = 9 and BC = 5, with point D the midpoint of segment AC, the length of segment AD is ()\nA. 2\nB. 2.5\nC. 4.5\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10124/22095 [17:27:43<11:19:23, 3.41s/it] {'loss': 0.33, 'grad_norm': 0.6017872906281839, 'learning_rate': 5.915357057140364e-06, 'epoch': 0.46}
46%|████▌ | 10125/22095 [17:27:46<11:03:35, 3.33s/it] {'loss': 0.2914, 'grad_norm': 2.520985279055733, 'learning_rate': 5.914636512935773e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [934, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465032 in VC:s3://internvl-moe-sft-data/. Exception: Image size [934, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 80744, 'image': 'vrdu_texteq/astro-ph.CO/bf0179f1-9e7c-4c98-afc2-4ae4c8211197.png', 'image_wh': [[934, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image and convert it into a machine-readable format.'}, {'from': 'gpt', 'value': 'The factor $c$ can be chosen to minimize a distance metric as discussed below.'}]}
46%|████▌ | 10126/22095 [17:27:49<10:46:18, 3.24s/it] {'loss': 0.3159, 'grad_norm': 0.6258472550396663, 'learning_rate': 5.913915949078453e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (60788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69991 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47653 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63403 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10127/22095 [17:27:53<11:26:53, 3.44s/it] {'loss': 0.3225, 'grad_norm': 0.6200775670641495, 'learning_rate': 5.913195365583886e-06, 'epoch': 0.46}
46%|████▌ | 10128/22095 [17:27:56<11:04:10, 3.33s/it] {'loss': 0.3223, 'grad_norm': 0.6198459344095034, 'learning_rate': 5.912474762467554e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10129/22095 [17:27:59<10:45:38, 3.24s/it] {'loss': 0.3457, 'grad_norm': 0.663417256125671, 'learning_rate': 5.911754139744944e-06, 'epoch': 0.46}
46%|████▌ | 10130/22095 [17:28:02<11:10:43, 3.36s/it] {'loss': 0.3483, 'grad_norm': 0.6357687156435802, 'learning_rate': 5.911033497431535e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8897252 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20405, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, D is the midpoint of segment CB, CD = 3, AB = 11; the length of AC is ()\nA. 4\nB. 5\nC. 6\nD. 8'}, {'from': 'gpt', 'value': 'Solution: Since D is the midpoint of segment CB and CD = 3, BC = 2CD = 6, so AC = AB - BC = 5.'}]}
46%|████▌ | 10131/22095 [17:28:05<10:39:48, 3.21s/it] {'loss': 0.3519, 'grad_norm': 0.6808329048633953, 'learning_rate': 5.910312835542818e-06, 'epoch': 0.46}
46%|████▌ | 10132/22095 [17:28:08<10:34:37, 3.18s/it] {'loss': 0.3417, 'grad_norm': 0.6184081392584136, 'learning_rate': 5.909592154094272e-06, 'epoch': 0.46}
46%|████▌ | 10133/22095 [17:28:12<11:01:03, 3.32s/it] {'loss': 0.3706, 'grad_norm': 0.8011839566403799, 'learning_rate': 5.908871453101382e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (64345 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10134/22095 [17:28:15<11:02:58, 3.33s/it] {'loss': 0.3616, 'grad_norm': 0.6182329034996796, 'learning_rate': 5.908150732579638e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (124461 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85042 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10135/22095 [17:28:25<17:03:33, 5.13s/it] {'loss': 0.4797, 'grad_norm': 0.5550657593697929, 'learning_rate': 5.907429992544524e-06, 'epoch': 0.46}
46%|████▌ | 10136/22095 [17:28:28<15:13:16, 4.58s/it] {'loss': 0.2912, 'grad_norm': 0.6363540464766801, 'learning_rate': 5.906709233011526e-06, 'epoch': 0.46}
46%|████▌ | 10137/22095 [17:28:31<13:49:51, 4.16s/it] {'loss': 0.3403, 'grad_norm': 0.6775860869864486, 'learning_rate': 5.905988453996132e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (78133 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84818 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49508 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41661 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10138/22095 [17:28:34<12:24:58, 3.74s/it] {'loss': 0.3386, 'grad_norm': 0.677654388601567, 'learning_rate': 5.905267655513828e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (52906 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10139/22095 [17:28:37<11:48:37, 3.56s/it] {'loss': 0.3217, 'grad_norm': 0.6242789450230746, 'learning_rate': 5.904546837580102e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8364128 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30868, 'image': 'vrdu_table_final_2/astro-ph.CO/f8758213-8b5c-44ff-a01b-cb325c6bff7c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10140/22095 [17:28:45<16:26:34, 4.95s/it] {'loss': 0.5025, 'grad_norm': 0.3399240313310149, 'learning_rate': 5.903826000210444e-06, 'epoch': 0.46}
46%|████▌ | 10141/22095 [17:28:49<14:49:26, 4.46s/it] {'loss': 0.3618, 'grad_norm': 0.7333455939094149, 'learning_rate': 5.903105143420339e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10142/22095 [17:28:57<18:34:07, 5.59s/it] {'loss': 0.4924, 'grad_norm': 0.32216027105417944, 'learning_rate': 5.9023842672252805e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (56457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60151 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60671 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103597 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10143/22095 [17:29:05<21:09:49, 6.37s/it] {'loss': 0.4918, 'grad_norm': 0.3183411706250615, 'learning_rate': 5.901663371640754e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 364, but got module 1
46%|████▌ | 10144/22095 [17:29:08<18:13:24, 5.49s/it] {'loss': 0.3371, 'grad_norm': 0.6769035244721872, 'learning_rate': 5.9009424566822515e-06, 'epoch': 0.46}
46%|████▌ | 10145/22095 [17:29:12<16:46:43, 5.05s/it] {'loss': 0.3708, 'grad_norm': 0.6855526785778393, 'learning_rate': 5.900221522365262e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10146/22095 [17:29:22<21:12:54, 6.39s/it] {'loss': 0.5104, 'grad_norm': 0.30930897707540267, 'learning_rate': 5.899500568705279e-06, 'epoch': 0.46}
46%|████▌ | 10147/22095 [17:29:26<18:23:51, 5.54s/it] {'loss': 0.3168, 'grad_norm': 0.6335489140165369, 'learning_rate': 5.898779595717788e-06, 'epoch': 0.46}
46%|████▌ | 10148/22095 [17:29:30<16:54:10, 5.09s/it] {'loss': 0.3341, 'grad_norm': 0.6643824401837953, 'learning_rate': 5.898058603418287e-06, 'epoch': 0.46}
46%|████▌ | 10149/22095 [17:29:33<15:42:57, 4.74s/it] {'loss': 0.3671, 'grad_norm': 0.6753211229313297, 'learning_rate': 5.897337591822262e-06, 'epoch': 0.46}
46%|████▌ | 10150/22095 [17:29:37<14:34:43, 4.39s/it] {'loss': 0.337, 'grad_norm': 0.6164101604781804, 'learning_rate': 5.896616560945211e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10151/22095 [17:29:47<20:17:33, 6.12s/it] {'loss': 0.4682, 'grad_norm': 0.35054630893544075, 'learning_rate': 5.89589551080262e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (75233 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41588 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76942 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42717 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10152/22095 [17:29:51<17:49:44, 5.37s/it] {'loss': 0.358, 'grad_norm': 0.6409338224630452, 'learning_rate': 5.89517444140999e-06, 'epoch': 0.46}
46%|████▌ | 10153/22095 [17:29:54<15:59:49, 4.82s/it] {'loss': 0.3228, 'grad_norm': 0.6369081943211347, 'learning_rate': 5.8944533527828095e-06, 'epoch': 0.46}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11304305 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6, 'image': 'airplane_app/149.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is this?'}, {'from': 'gpt', 'value': 'This is an Air China boarding pass. The detailed information on it is:\n\n- Flight segment: Air China CA1515/HU1223\n- Date: January 13, 2017\n- Departure: Beijing Capital Airport T1\n- Destination: Shanghai Hongqiao Airport T1\n- Passenger name: Zhang Xiaoming\n- Seat number: 36A\n- Cabin class: First class\n- Boarding time: 20:40\n- Gate: C15\n- Membership card number/contract: CA1234567890\n- Flight number: 999542123456\n- Membership points: 123\n- Boarding sequence: 15\n\nThe “973 Games” logo at the lower right suggests this boarding-pass image may have come from a game or app.\n\nNote: this is only a mock boarding-pass image, not a real boarding pass.'}]}
46%|████▌ | 10154/22095 [17:29:58<14:22:14, 4.33s/it] {'loss': 0.321, 'grad_norm': 0.6745637529016413, 'learning_rate': 5.893732244936572e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43999 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83129 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104656 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10155/22095 [17:30:07<19:21:21, 5.84s/it] {'loss': 0.4715, 'grad_norm': 0.33008809593587707, 'learning_rate': 5.893011117886775e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (105312 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10156/22095 [17:30:10<16:34:59, 5.00s/it] {'loss': 0.3288, 'grad_norm': 0.7145102204647745, 'learning_rate': 5.892289971648912e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (85099 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45463 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78609 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62620 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10157/22095 [17:30:13<14:32:48, 4.39s/it] {'loss': 0.3219, 'grad_norm': 0.6899872862577504, 'learning_rate': 5.8915688062384755e-06, 'epoch': 0.46}
Token indices sequence length is longer than the specified maximum sequence length for this model (67130 > 40960). Running this sequence through the model will result in indexing errors
46%|████▌ | 10158/22095 [17:30:16<13:29:02, 4.07s/it] {'loss': 0.358, 'grad_norm': 0.5941872763439713, 'learning_rate': 5.890847621670966e-06, 'epoch': 0.46}
46%|████▌ | 10159/22095 [17:30:19<12:28:16, 3.76s/it] {'loss': 0.3593, 'grad_norm': 0.6521882399774729, 'learning_rate': 5.8901264179618755e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
46%|████▌ | 10160/22095 [17:30:29<18:27:10, 5.57s/it] {'loss': 0.4555, 'grad_norm': 0.41761082565214547, 'learning_rate': 5.889405195126704e-06, 'epoch': 0.46}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
46%|████▌ | 10161/22095 [17:30:32<16:14:31, 4.90s/it] {'loss': 0.3178, 'grad_norm': 0.5930093261474315, 'learning_rate': 5.8886839531809455e-06, 'epoch': 0.46}
46%|████▌ | 10162/22095 [17:30:35<14:15:36, 4.30s/it] {'loss': 0.3226, 'grad_norm': 0.6120930429914294, 'learning_rate': 5.8879626921400975e-06, 'epoch': 0.46}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (117086 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63437 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10163/22095 [17:30:44<18:38:25, 5.62s/it] {'loss': 0.4654, 'grad_norm': 0.30768656389224336, 'learning_rate': 5.88724141201966e-06, 'epoch': 0.46} 46%|████▌ | 10163/22095 [17:30:44<18:38:25, 5.62s/it] 46%|████▌ | 10164/22095 [17:30:53<22:12:23, 6.70s/it] {'loss': 0.4662, 'grad_norm': 0.29084001888272587, 'learning_rate': 5.886520112835128e-06, 'epoch': 0.46} 46%|████▌ | 10164/22095 [17:30:53<22:12:23, 6.70s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▌ | 10165/22095 [17:31:03<25:28:45, 7.69s/it] {'loss': 0.4972, 'grad_norm': 0.28239681672321787, 'learning_rate': 5.8857987946020025e-06, 'epoch': 0.46} 46%|████▌ | 10165/22095 [17:31:03<25:28:45, 7.69s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 46%|████▌ | 10166/22095 [17:31:07<21:08:42, 6.38s/it] {'loss': 0.3313, 'grad_norm': 0.6171878946319844, 'learning_rate': 5.8850774573357804e-06, 'epoch': 0.46} 46%|████▌ | 10166/22095 [17:31:07<21:08:42, 6.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45130 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59733 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10167/22095 [17:31:11<18:51:23, 5.69s/it] {'loss': 0.3886, 'grad_norm': 0.6448552071058545, 'learning_rate': 5.884356101051962e-06, 'epoch': 0.46} 46%|████▌ | 10167/22095 [17:31:11<18:51:23, 5.69s/it] 46%|████▌ | 10168/22095 [17:31:14<16:08:13, 4.87s/it] {'loss': 0.3452, 'grad_norm': 0.6837662949759267, 'learning_rate': 5.8836347257660485e-06, 'epoch': 0.46} 46%|████▌ | 10168/22095 [17:31:14<16:08:13, 4.87s/it] 46%|████▌ | 10169/22095 [17:31:17<14:10:06, 4.28s/it] {'loss': 0.3138, 'grad_norm': 0.6473672470153388, 'learning_rate': 5.882913331493538e-06, 'epoch': 0.46} 46%|████▌ | 10169/22095 [17:31:17<14:10:06, 4.28s/it] 46%|████▌ | 10170/22095 [17:31:20<13:35:02, 4.10s/it] {'loss': 0.3047, 'grad_norm': 0.601651739774847, 'learning_rate': 5.882191918249931e-06, 'epoch': 0.46} 46%|████▌ | 10170/22095 [17:31:20<13:35:02, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047680 in VC:s3://multi-modal/UniGeo/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 3\nB. 4\nC. 1\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▌ | 10171/22095 [17:31:31<20:07:24, 6.08s/it] {'loss': 0.4877, 'grad_norm': 0.35601805678176673, 'learning_rate': 5.881470486050731e-06, 'epoch': 0.46} 46%|████▌ | 10171/22095 [17:31:31<20:07:24, 6.08s/it] 46%|████▌ | 10172/22095 [17:31:34<17:21:51, 5.24s/it] {'loss': 0.3386, 'grad_norm': 0.6413091667459374, 'learning_rate': 5.880749034911435e-06, 'epoch': 0.46} 46%|████▌ | 10172/22095 [17:31:34<17:21:51, 5.24s/it] 46%|████▌ | 10173/22095 [17:31:37<15:04:11, 4.55s/it] {'loss': 0.3215, 'grad_norm': 0.5842090741092955, 'learning_rate': 5.880027564847549e-06, 'epoch': 0.46} 46%|████▌ | 10173/22095 [17:31:37<15:04:11, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90015 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10174/22095 [17:31:41<14:26:03, 4.36s/it] {'loss': 0.3427, 'grad_norm': 0.7689309377557272, 'learning_rate': 5.879306075874572e-06, 'epoch': 0.46} 46%|████▌ | 10174/22095 [17:31:41<14:26:03, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47497 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102387 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10175/22095 [17:31:50<19:11:14, 5.79s/it] {'loss': 0.4716, 'grad_norm': 0.3315562112584059, 'learning_rate': 5.8785845680080085e-06, 'epoch': 0.46} 46%|████▌ | 10175/22095 [17:31:50<19:11:14, 5.79s/it] 46%|████▌ | 10176/22095 [17:31:54<17:24:09, 5.26s/it] {'loss': 0.3191, 'grad_norm': 0.6272824790835605, 'learning_rate': 5.877863041263362e-06, 'epoch': 0.46} 46%|████▌ | 10176/22095 [17:31:54<17:24:09, 5.26s/it] 46%|████▌ | 10177/22095 [17:31:57<15:20:32, 4.63s/it] {'loss': 0.3509, 'grad_norm': 0.6484198612109718, 'learning_rate': 5.877141495656136e-06, 'epoch': 0.46} 46%|████▌ | 10177/22095 [17:31:57<15:20:32, 4.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77971 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73494 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10178/22095 [17:32:00<13:42:02, 4.14s/it] {'loss': 0.2981, 'grad_norm': 0.5869832697863312, 'learning_rate': 5.876419931201829e-06, 'epoch': 0.46} 46%|████▌ | 10178/22095 [17:32:00<13:42:02, 4.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129657 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10179/22095 [17:32:04<13:06:31, 3.96s/it] {'loss': 0.3253, 'grad_norm': 0.6531080746471695, 'learning_rate': 5.875698347915954e-06, 'epoch': 0.46} 46%|████▌ | 10179/22095 [17:32:04<13:06:31, 3.96s/it] 46%|████▌ | 10180/22095 [17:32:07<12:12:13, 3.69s/it] {'loss': 0.3144, 'grad_norm': 0.6298903489672832, 'learning_rate': 5.8749767458140075e-06, 'epoch': 0.46} 46%|████▌ | 10180/22095 [17:32:07<12:12:13, 3.69s/it] 46%|████▌ | 10181/22095 [17:32:11<12:08:28, 3.67s/it] {'loss': 0.328, 'grad_norm': 0.5973823819739114, 'learning_rate': 5.8742551249115e-06, 'epoch': 0.46} 46%|████▌ | 10181/22095 [17:32:11<12:08:28, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10182/22095 [17:32:21<18:56:29, 5.72s/it] {'loss': 0.4677, 'grad_norm': 0.39421276793144844, 'learning_rate': 5.873533485223934e-06, 'epoch': 0.46} 46%|████▌ | 10182/22095 [17:32:21<18:56:29, 5.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45775 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10183/22095 [17:32:31<23:05:41, 6.98s/it] {'loss': 0.4668, 'grad_norm': 0.33173025820647933, 'learning_rate': 5.872811826766817e-06, 'epoch': 0.46} 46%|████▌ | 10183/22095 [17:32:31<23:05:41, 6.98s/it] 46%|████▌ | 10184/22095 [17:32:40<25:29:59, 7.71s/it] {'loss': 0.4817, 'grad_norm': 0.2915282231759761, 'learning_rate': 5.872090149555653e-06, 'epoch': 0.46} 46%|████▌ | 10184/22095 [17:32:40<25:29:59, 7.71s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (70599 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68682 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10185/22095 [17:32:45<22:31:35, 6.81s/it] {'loss': 0.3945, 'grad_norm': 1.2720611030473974, 'learning_rate': 5.871368453605951e-06, 'epoch': 0.46} 46%|████▌ | 10185/22095 [17:32:45<22:31:35, 6.81s/it] 46%|████▌ | 10186/22095 [17:32:56<26:48:34, 8.10s/it] {'loss': 0.4937, 'grad_norm': 0.429056553504757, 'learning_rate': 5.870646738933218e-06, 'epoch': 0.46} 46%|████▌ | 10186/22095 [17:32:56<26:48:34, 8.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55849 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72020 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10187/22095 [17:33:06<28:27:08, 8.60s/it] {'loss': 0.4667, 'grad_norm': 0.4424597400168609, 'learning_rate': 5.869925005552959e-06, 'epoch': 0.46} 46%|████▌ | 10187/22095 [17:33:06<28:27:08, 8.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 46%|████▌ | 10188/22095 [17:33:10<24:18:06, 7.35s/it] {'loss': 0.334, 'grad_norm': 0.6422267533943191, 'learning_rate': 5.869203253480684e-06, 'epoch': 0.46} 46%|████▌ | 10188/22095 [17:33:10<24:18:06, 7.35s/it] 46%|████▌ | 10189/22095 [17:33:14<20:44:14, 6.27s/it] {'loss': 0.3131, 'grad_norm': 0.624955624943747, 'learning_rate': 5.868481482731903e-06, 'epoch': 0.46} 46%|████▌ | 10189/22095 [17:33:14<20:44:14, 6.27s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [606, 23, 100, 100] is too 
small. Minimum size is 28. [Try #0] Failed to fetch sample 8449211 in VC:s3://internvl-moe-sft-data/. Exception: Image size [606, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 93998, 'image': 'vrdu_texteq/astro-ph.CO/db00bcc3-bf0a-44b9-ae6d-a444932fb47a.png', 'image_wh': [[606, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'The derivative of the distance modulus to bin $a$ is'}]} 46%|████▌ | 10190/22095 [17:33:18<18:35:54, 5.62s/it] {'loss': 0.3318, 'grad_norm': 0.6120821241352987, 'learning_rate': 5.867759693322119e-06, 'epoch': 0.46} 46%|████▌ | 10190/22095 [17:33:18<18:35:54, 5.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10191/22095 [17:33:29<24:04:40, 7.28s/it] {'loss': 0.5031, 'grad_norm': 0.30119929184566724, 'learning_rate': 5.867037885266845e-06, 'epoch': 0.46} 46%|████▌ | 10191/22095 [17:33:29<24:04:40, 7.28s/it] 46%|████▌ | 10192/22095 [17:33:33<20:41:37, 6.26s/it] {'loss': 0.3349, 'grad_norm': 0.6428062728626699, 'learning_rate': 5.86631605858159e-06, 'epoch': 0.46} 46%|████▌ | 10192/22095 [17:33:33<20:41:37, 6.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10193/22095 [17:33:44<24:49:39, 7.51s/it] {'loss': 0.4707, 'grad_norm': 0.28165607627203954, 'learning_rate': 5.865594213281864e-06, 'epoch': 0.46} 46%|████▌ | 10193/22095 [17:33:44<24:49:39, 7.51s/it] 46%|████▌ | 10194/22095 [17:33:48<21:13:17, 6.42s/it] {'loss': 0.356, 'grad_norm': 0.6336775479151228, 'learning_rate': 5.864872349383177e-06, 'epoch': 0.46} 46%|████▌ | 10194/22095 [17:33:48<21:13:17, 6.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (58993 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80658 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10195/22095 [17:33:58<25:05:10, 7.59s/it] {'loss': 0.4849, 'grad_norm': 0.2913933081534654, 'learning_rate': 5.864150466901038e-06, 'epoch': 0.46} 46%|████▌ | 10195/22095 [17:33:58<25:05:10, 7.59s/it] 46%|████▌ | 10196/22095 [17:34:02<21:53:16, 6.62s/it] {'loss': 0.3303, 'grad_norm': 0.598970252669126, 'learning_rate': 5.863428565850961e-06, 'epoch': 0.46} 46%|████▌ | 10196/22095 [17:34:02<21:53:16, 6.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76959 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10197/22095 [17:34:05<18:27:25, 5.58s/it] {'loss': 0.3021, 'grad_norm': 0.608507335922516, 'learning_rate': 5.862706646248455e-06, 'epoch': 0.46} 46%|████▌ | 10197/22095 [17:34:05<18:27:25, 5.58s/it] 46%|████▌ | 10198/22095 [17:34:09<16:28:39, 4.99s/it] {'loss': 0.3543, 'grad_norm': 0.6200924088031894, 'learning_rate': 5.861984708109035e-06, 'epoch': 0.46} 46%|████▌ | 10198/22095 [17:34:09<16:28:39, 4.99s/it] 46%|████▌ | 10199/22095 [17:34:12<14:55:30, 4.52s/it] {'loss': 0.3112, 'grad_norm': 0.5752767246422879, 'learning_rate': 5.861262751448208e-06, 'epoch': 0.46} 46%|████▌ | 10199/22095 [17:34:12<14:55:30, 4.52s/it] 46%|████▌ | 10200/22095 [17:34:16<13:50:01, 4.19s/it] {'loss': 0.3551, 'grad_norm': 0.6281904087924641, 'learning_rate': 5.860540776281492e-06, 'epoch': 0.46} 46%|████▌ | 10200/22095 [17:34:16<13:50:01, 4.19s/it] 46%|████▌ | 10201/22095 [17:34:19<12:30:54, 3.79s/it] {'loss': 0.341, 'grad_norm': 0.6328412769197119, 'learning_rate': 5.859818782624395e-06, 'epoch': 0.46} 46%|████▌ | 10201/22095 [17:34:19<12:30:54, 3.79s/it] 46%|████▌ | 10202/22095 [17:34:22<11:50:42, 3.59s/it] {'loss': 
0.3139, 'grad_norm': 0.6876693828114777, 'learning_rate': 5.8590967704924365e-06, 'epoch': 0.46} 46%|████▌ | 10202/22095 [17:34:22<11:50:42, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10203/22095 [17:34:31<17:32:34, 5.31s/it] {'loss': 0.4687, 'grad_norm': 0.31990046259109833, 'learning_rate': 5.858374739901125e-06, 'epoch': 0.46} 46%|████▌ | 10203/22095 [17:34:31<17:32:34, 5.31s/it] 46%|████▌ | 10204/22095 [17:34:35<16:09:24, 4.89s/it] {'loss': 0.3444, 'grad_norm': 0.6150479228587775, 'learning_rate': 5.857652690865976e-06, 'epoch': 0.46} 46%|████▌ | 10204/22095 [17:34:35<16:09:24, 4.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (72026 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131021 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89230 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10205/22095 [17:34:45<21:07:11, 6.39s/it] {'loss': 0.4619, 'grad_norm': 0.2747063505928363, 'learning_rate': 5.856930623402506e-06, 'epoch': 0.46} 46%|████▌ | 10205/22095 [17:34:45<21:07:11, 6.39s/it] 46%|████▌ | 10206/22095 [17:34:49<18:21:46, 5.56s/it] {'loss': 0.3926, 'grad_norm': 0.6457097815772058, 'learning_rate': 5.856208537526229e-06, 'epoch': 0.46} 46%|████▌ | 10206/22095 [17:34:49<18:21:46, 5.56s/it] 46%|████▌ | 10207/22095 [17:34:53<16:44:11, 5.07s/it] {'loss': 0.3503, 'grad_norm': 0.6171431621945438, 'learning_rate': 5.855486433252658e-06, 'epoch': 0.46} 46%|████▌ | 10207/22095 [17:34:53<16:44:11, 5.07s/it] 46%|████▌ | 10208/22095 [17:34:55<14:27:53, 4.38s/it] {'loss': 0.3435, 'grad_norm': 0.6364217330608023, 'learning_rate': 5.854764310597314e-06, 'epoch': 0.46} 46%|████▌ | 10208/22095 [17:34:55<14:27:53, 4.38s/it] 46%|████▌ | 10209/22095 [17:34:59<13:29:58, 4.09s/it] {'loss': 0.3563, 'grad_norm': 0.6389061843500499, 'learning_rate': 5.8540421695757064e-06, 'epoch': 0.46} 46%|████▌ | 10209/22095 [17:34:59<13:29:58, 4.09s/it] 46%|████▌ | 10210/22095 [17:35:02<12:45:09, 3.86s/it] {'loss': 0.3703, 'grad_norm': 0.6891372948551845, 'learning_rate': 5.85332001020336e-06, 'epoch': 0.46} 46%|████▌ | 10210/22095 [17:35:02<12:45:09, 3.86s/it] 46%|████▌ | 10211/22095 [17:35:06<12:48:32, 3.88s/it] {'loss': 0.3664, 'grad_norm': 0.6416000376271216, 'learning_rate': 5.852597832495785e-06, 'epoch': 0.46} 46%|████▌ | 10211/22095 [17:35:06<12:48:32, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51361 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10212/22095 [17:35:09<11:55:47, 3.61s/it] {'loss': 0.3413, 'grad_norm': 0.6537246700366226, 'learning_rate': 5.851875636468501e-06, 'epoch': 0.46} 46%|████▌ | 10212/22095 [17:35:09<11:55:47, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▌ | 10213/22095 [17:35:17<16:27:55, 4.99s/it] {'loss': 0.4696, 'grad_norm': 0.3856112248677831, 'learning_rate': 5.851153422137026e-06, 'epoch': 0.46} 46%|████▌ | 10213/22095 [17:35:18<16:27:55, 4.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▌ | 10214/22095 [17:35:21<15:13:15, 4.61s/it] {'loss': 0.377, 'grad_norm': 0.6338681957306734, 'learning_rate': 5.850431189516878e-06, 'epoch': 0.46} 46%|████▌ | 10214/22095 [17:35:21<15:13:15, 4.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50898 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67439 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45454 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74247 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▌ | 10215/22095 [17:35:24<13:49:35, 4.19s/it] {'loss': 0.3399, 'grad_norm': 0.5774558126226176, 'learning_rate': 5.849708938623575e-06, 'epoch': 0.46} 46%|████▌ | 10215/22095 [17:35:24<13:49:35, 4.19s/it] 46%|████▌ | 10216/22095 [17:35:27<12:35:44, 3.82s/it] {'loss': 0.3139, 'grad_norm': 0.6104011755295417, 'learning_rate': 5.848986669472637e-06, 'epoch': 0.46} 46%|████▌ | 10216/22095 [17:35:27<12:35:44, 3.82s/it] 46%|████▌ | 10217/22095 [17:35:30<11:44:16, 3.56s/it] {'loss': 0.3121, 'grad_norm': 1.0495709555357173, 'learning_rate': 5.848264382079584e-06, 'epoch': 0.46} 46%|████▌ | 10217/22095 [17:35:30<11:44:16, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58105 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51121 > 40960). Running this sequence through the model will result in indexing errors 46%|████▌ | 10218/22095 [17:35:34<11:49:04, 3.58s/it] {'loss': 0.3383, 'grad_norm': 0.7581803532503051, 'learning_rate': 5.847542076459933e-06, 'epoch': 0.46} 46%|████▌ | 10218/22095 [17:35:34<11:49:04, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50328 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47385 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44570 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54493 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▋ | 10219/22095 [17:35:38<12:21:44, 3.75s/it] {'loss': 0.2785, 'grad_norm': 0.830956095614911, 'learning_rate': 5.846819752629208e-06, 'epoch': 0.46} 46%|████▋ | 10219/22095 [17:35:38<12:21:44, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (126602 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107988 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10220/22095 [17:35:47<17:36:26, 5.34s/it] {'loss': 0.4788, 'grad_norm': 0.3575301386080226, 'learning_rate': 5.846097410602925e-06, 'epoch': 0.46} 46%|████▋ | 10220/22095 [17:35:47<17:36:26, 5.34s/it] 46%|████▋ | 10221/22095 [17:35:50<15:39:06, 4.75s/it] {'loss': 0.3295, 'grad_norm': 0.6153185689546572, 'learning_rate': 5.84537505039661e-06, 'epoch': 0.46} 46%|████▋ | 10221/22095 [17:35:50<15:39:06, 4.75s/it] 46%|████▋ | 10222/22095 [17:35:53<14:10:41, 4.30s/it] {'loss': 0.3411, 'grad_norm': 0.6523731086184149, 'learning_rate': 5.844652672025779e-06, 'epoch': 0.46} 46%|████▋ | 10222/22095 [17:35:53<14:10:41, 4.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10223/22095 [17:36:02<18:51:05, 5.72s/it] {'loss': 0.465, 'grad_norm': 0.3059244969304898, 'learning_rate': 5.843930275505958e-06, 'epoch': 0.46} 46%|████▋ | 10223/22095 [17:36:02<18:51:05, 5.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▋ | 10224/22095 [17:36:07<17:47:44, 5.40s/it] {'loss': 0.3175, 'grad_norm': 0.6114200072668081, 'learning_rate': 
5.843207860852667e-06, 'epoch': 0.46} 46%|████▋ | 10224/22095 [17:36:07<17:47:44, 5.40s/it] 46%|████▋ | 10225/22095 [17:36:10<15:18:15, 4.64s/it] {'loss': 0.3491, 'grad_norm': 0.6347751736428424, 'learning_rate': 5.842485428081428e-06, 'epoch': 0.46} 46%|████▋ | 10225/22095 [17:36:10<15:18:15, 4.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10226/22095 [17:36:20<21:03:01, 6.38s/it] {'loss': 0.4776, 'grad_norm': 0.2788723118165968, 'learning_rate': 5.841762977207764e-06, 'epoch': 0.46} 46%|████▋ | 10226/22095 [17:36:20<21:03:01, 6.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8899857 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 23010, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 46%|████▋ | 10227/22095 [17:36:28<22:22:32, 6.79s/it] {'loss': 0.4959, 'grad_norm': 0.2938092295138972, 'learning_rate': 5.841040508247201e-06, 'epoch': 0.46} 46%|████▋ | 10227/22095 [17:36:28<22:22:32, 6.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68976 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44297 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57444 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48319 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98888 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80136 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104808 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▋ | 10228/22095 [17:36:38<25:12:48, 7.65s/it] {'loss': 0.4595, 'grad_norm': 0.2843870448715732, 'learning_rate': 5.840318021215259e-06, 'epoch': 0.46} 46%|████▋ | 10228/22095 [17:36:38<25:12:48, 7.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 46%|████▋ | 10229/22095 [17:36:41<21:13:38, 6.44s/it] {'loss': 0.3508, 'grad_norm': 0.778637112356431, 'learning_rate': 5.839595516127464e-06, 'epoch': 0.46} 46%|████▋ | 10229/22095 [17:36:41<21:13:38, 6.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▋ | 10230/22095 [17:36:45<18:13:55, 5.53s/it] {'loss': 0.3166, 'grad_norm': 0.6174599810697912, 'learning_rate': 5.838872992999339e-06, 'epoch': 0.46} 46%|████▋ | 10230/22095 [17:36:45<18:13:55, 5.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118093 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10231/22095 [17:36:49<16:41:14, 5.06s/it] {'loss': 0.3472, 'grad_norm': 0.7235202774561535, 'learning_rate': 5.8381504518464114e-06, 'epoch': 0.46} 46%|████▋ | 10231/22095 [17:36:49<16:41:14, 5.06s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047664 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 10cm\nB. 16cm\nC. 4cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8364939 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 31680, 'image': 'vrdu_table_final_2/astro-ph.CO/63018153-2cb7-4872-bd22-971d70149460.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} 46%|████▋ | 10232/22095 [17:36:53<15:37:49, 4.74s/it] {'loss': 0.3, 'grad_norm': 0.6361329502755545, 'learning_rate': 5.837427892684205e-06, 'epoch': 0.46} 46%|████▋ | 10232/22095 [17:36:53<15:37:49, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46737 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54882 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51928 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10233/22095 [17:36:56<14:30:01, 4.40s/it] {'loss': 0.333, 'grad_norm': 0.6060017986409006, 'learning_rate': 5.836705315528244e-06, 'epoch': 0.46} 46%|████▋ | 10233/22095 [17:36:56<14:30:01, 4.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▋ | 10234/22095 [17:37:01<14:14:41, 4.32s/it] {'loss': 0.3558, 'grad_norm': 0.6487855691336682, 'learning_rate': 5.8359827203940555e-06, 'epoch': 0.46} 46%|████▋ | 10234/22095 [17:37:01<14:14:41, 4.32s/it] 46%|████▋ | 10235/22095 [17:37:04<13:40:58, 4.15s/it] {'loss': 0.3884, 'grad_norm': 0.6297909598410385, 'learning_rate': 5.835260107297167e-06, 'epoch': 0.46} 46%|████▋ | 10235/22095 [17:37:04<13:40:58, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42286 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71371 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▋ | 10236/22095 [17:37:14<18:52:26, 5.73s/it] {'loss': 0.4537, 'grad_norm': 0.40330348336805316, 'learning_rate': 5.834537476253102e-06, 'epoch': 0.46} 46%|████▋ | 10236/22095 [17:37:14<18:52:26, 5.73s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8522330 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 15302, 'image': 'vrdu_texteq/astro-ph.CO/4297f436-0a9d-4cba-b2bf-6f3dc3d466f3.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': '$\\approx$34'}]} 46%|████▋ | 10237/22095 [17:37:17<16:43:21, 5.08s/it] {'loss': 0.3219, 'grad_norm': 0.609177172025508, 'learning_rate': 5.833814827277391e-06, 'epoch': 0.46} 46%|████▋ | 10237/22095 [17:37:17<16:43:21, 5.08s/it] 46%|████▋ | 10238/22095 [17:37:21<15:22:40, 4.67s/it] {'loss': 0.3323, 'grad_norm': 0.747417185536518, 'learning_rate': 5.83309216038556e-06, 'epoch': 0.46} 46%|████▋ | 10238/22095 [17:37:21<15:22:40, 4.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10239/22095 [17:37:31<20:50:03, 6.33s/it] {'loss': 0.4879, 'grad_norm': 0.2943224458498767, 'learning_rate': 5.832369475593138e-06, 'epoch': 0.46} 46%|████▋ | 10239/22095 [17:37:31<20:50:03, 6.33s/it] 46%|████▋ | 10240/22095 [17:37:36<19:15:06, 5.85s/it] {'loss': 0.3296, 'grad_norm': 0.6302865670246146, 'learning_rate': 5.831646772915651e-06, 'epoch': 0.46} 
46%|████▋ | 10240/22095 [17:37:36<19:15:06, 5.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10241/22095 [17:37:46<23:26:58, 7.12s/it] {'loss': 0.4706, 'grad_norm': 0.33260995177021796, 'learning_rate': 5.8309240523686295e-06, 'epoch': 0.46} 46%|████▋ | 10241/22095 [17:37:46<23:26:58, 7.12s/it] 46%|████▋ | 10242/22095 [17:37:49<19:29:22, 5.92s/it] {'loss': 0.3561, 'grad_norm': 0.6227230630505952, 'learning_rate': 5.830201313967603e-06, 'epoch': 0.46} 46%|████▋ | 10242/22095 [17:37:49<19:29:22, 5.92s/it] 46%|████▋ | 10243/22095 [17:37:53<16:59:03, 5.16s/it] {'loss': 0.2956, 'grad_norm': 0.5977630892372184, 'learning_rate': 5.829478557728098e-06, 'epoch': 0.46} 46%|████▋ | 10243/22095 [17:37:53<16:59:03, 5.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10244/22095 [17:38:02<21:07:04, 6.42s/it] {'loss': 0.4798, 'grad_norm': 0.3112423505124785, 'learning_rate': 5.828755783665649e-06, 'epoch': 0.46} 46%|████▋ | 10244/22095 [17:38:02<21:07:04, 6.42s/it] 46%|████▋ | 10245/22095 [17:38:05<18:16:42, 5.55s/it] {'loss': 0.3492, 'grad_norm': 0.7156177789007848, 'learning_rate': 5.828032991795781e-06, 'epoch': 0.46} 46%|████▋ | 10245/22095 [17:38:05<18:16:42, 5.55s/it] 46%|████▋ | 10246/22095 [17:38:09<16:21:04, 4.97s/it] {'loss': 0.3904, 'grad_norm': 0.6311863336708481, 'learning_rate': 5.827310182134029e-06, 'epoch': 0.46} 46%|████▋ | 10246/22095 [17:38:09<16:21:04, 4.97s/it] 46%|████▋ | 10247/22095 [17:38:13<15:22:27, 4.67s/it] {'loss': 0.2968, 'grad_norm': 0.6156550034528948, 'learning_rate': 5.8265873546959205e-06, 'epoch': 0.46} 46%|████▋ | 10247/22095 [17:38:13<15:22:27, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48172 > 40960). 
Running this sequence through the model will result in indexing errors 46%|████▋ | 10248/22095 [17:38:16<13:32:55, 4.12s/it] {'loss': 0.3233, 'grad_norm': 0.6030965572245929, 'learning_rate': 5.825864509496991e-06, 'epoch': 0.46} 46%|████▋ | 10248/22095 [17:38:16<13:32:55, 4.12s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [948, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8483790 in VC:s3://internvl-moe-sft-data/. Exception: Image size [948, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 95144, 'image': 'vrdu_texteq/astro-ph.CO/262358e0-0af7-46e1-bbdd-1b3ea819fe7e.png', 'image_wh': [[948, 25]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': "where $\\delta_D$ is Dirac's delta function and where we have introduced the notation"}]} 46%|████▋ | 10249/22095 [17:38:19<12:41:10, 3.86s/it] {'loss': 0.3911, 'grad_norm': 0.6954897934939829, 'learning_rate': 5.825141646552767e-06, 'epoch': 0.46} 46%|████▋ | 10249/22095 [17:38:19<12:41:10, 3.86s/it] 46%|████▋ | 10250/22095 [17:38:23<12:17:10, 3.73s/it] {'loss': 0.3344, 'grad_norm': 0.6408441042598859, 'learning_rate': 5.8244187658787855e-06, 'epoch': 0.46} 46%|████▋ | 10250/22095 [17:38:23<12:17:10, 3.73s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( 
ValueError: Image size [112, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8489808 in VC:s3://internvl-moe-sft-data/. Exception: Image size [112, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 13656, 'image': 'vrdu_texteq/astro-ph.CO/db9d8767-433c-4b09-84b9-9eca573c0a54.png', 'image_wh': [[112, 23]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'as $r \\rightarrow 0$.'}]} 46%|████▋ | 10251/22095 [17:38:27<12:35:28, 3.83s/it] {'loss': 0.3723, 'grad_norm': 0.6074890088107604, 'learning_rate': 5.8236958674905746e-06, 'epoch': 0.46} 46%|████▋ | 10251/22095 [17:38:27<12:35:28, 3.83s/it] 46%|████▋ | 10252/22095 [17:38:31<12:47:45, 3.89s/it] {'loss': 0.3434, 'grad_norm': 0.6334622958231805, 'learning_rate': 5.82297295140367e-06, 'epoch': 0.46} 46%|████▋ | 10252/22095 [17:38:31<12:47:45, 3.89s/it] 46%|████▋ | 10253/22095 [17:38:35<12:54:55, 3.93s/it] {'loss': 0.3199, 'grad_norm': 0.5730384902807221, 'learning_rate': 5.822250017633605e-06, 'epoch': 0.46} 46%|████▋ | 10253/22095 [17:38:35<12:54:55, 3.93s/it] 46%|████▋ | 10254/22095 [17:38:38<12:17:53, 3.74s/it] {'loss': 0.3377, 'grad_norm': 0.6189461609429648, 'learning_rate': 5.821527066195911e-06, 'epoch': 0.46} 46%|████▋ | 10254/22095 [17:38:38<12:17:53, 3.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55843 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78368 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64721 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51074 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10255/22095 [17:38:41<11:21:55, 3.46s/it] {'loss': 0.3073, 'grad_norm': 0.6371926272056898, 'learning_rate': 5.820804097106125e-06, 'epoch': 0.46} 46%|████▋ | 10255/22095 [17:38:41<11:21:55, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 46%|████▋ | 10256/22095 [17:38:50<17:30:49, 5.33s/it] {'loss': 0.489, 'grad_norm': 0.36073886007825423, 'learning_rate': 5.82008111037978e-06, 'epoch': 0.46} 46%|████▋ | 10256/22095 [17:38:50<17:30:49, 5.33s/it] 46%|████▋ | 10257/22095 [17:38:54<15:39:59, 4.76s/it] {'loss': 0.3613, 'grad_norm': 0.6265469379681541, 'learning_rate': 5.819358106032409e-06, 'epoch': 0.46} 46%|████▋ | 10257/22095 [17:38:54<15:39:59, 4.76s/it] 46%|████▋ | 10258/22095 [17:38:57<14:04:00, 4.28s/it] {'loss': 0.3313, 'grad_norm': 0.6752147741747988, 'learning_rate': 5.81863508407955e-06, 'epoch': 0.46} 46%|████▋ | 10258/22095 [17:38:57<14:04:00, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52562 > 40960). 
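The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings mean some conversations tokenize past the model's context limit; feeding them through unmodified would cause the indexing errors the tokenizer warns about. A minimal length-handling sketch (the 40960 limit is taken from the warnings; whether to truncate or drop is a training-recipe choice, not something this log specifies):

```python
MAX_SEQ_LEN = 40960  # context limit reported in the tokenizer warnings above

def clip_or_drop(token_ids: list[int], drop: bool = False):
    """Truncate an over-long token sequence, or drop it entirely when drop=True."""
    if len(token_ids) <= MAX_SEQ_LEN:
        return token_ids
    return None if drop else token_ids[:MAX_SEQ_LEN]
```

Dropping is usually safer for multimodal data, since naive truncation can cut through image placeholder tokens mid-sequence.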
Running this sequence through the model will result in indexing errors 46%|████▋ | 10259/22095 [17:39:01<13:21:12, 4.06s/it] {'loss': 0.3466, 'grad_norm': 0.6333988101454193, 'learning_rate': 5.817912044536735e-06, 'epoch': 0.46} 46%|████▋ | 10259/22095 [17:39:01<13:21:12, 4.06s/it] 46%|████▋ | 10260/22095 [17:39:03<12:08:09, 3.69s/it] {'loss': 0.3298, 'grad_norm': 0.7462015858385178, 'learning_rate': 5.8171889874195066e-06, 'epoch': 0.46} 46%|████▋ | 10260/22095 [17:39:03<12:08:09, 3.69s/it] 46%|████▋ | 10261/22095 [17:39:06<11:24:14, 3.47s/it] {'loss': 0.3552, 'grad_norm': 0.6957970518827933, 'learning_rate': 5.8164659127433935e-06, 'epoch': 0.46} 46%|████▋ | 10261/22095 [17:39:06<11:24:14, 3.47s/it] 46%|████▋ | 10262/22095 [17:39:10<11:30:51, 3.50s/it] {'loss': 0.3092, 'grad_norm': 0.6435510261970632, 'learning_rate': 5.815742820523936e-06, 'epoch': 0.46} 46%|████▋ | 10262/22095 [17:39:10<11:30:51, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922567 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 45720, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 
9cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 46%|████▋ | 10263/22095 [17:39:17<15:04:24, 4.59s/it] {'loss': 0.4796, 'grad_norm': 0.3325824347052253, 'learning_rate': 5.815019710776671e-06, 'epoch': 0.46} 46%|████▋ | 10263/22095 [17:39:17<15:04:24, 4.59s/it] 46%|████▋ | 10264/22095 [17:39:21<14:01:57, 4.27s/it] {'loss': 0.3059, 'grad_norm': 0.7540769034192851, 'learning_rate': 5.814296583517135e-06, 'epoch': 0.46} 46%|████▋ | 10264/22095 [17:39:21<14:01:57, 4.27s/it] 46%|████▋ | 10265/22095 [17:39:24<13:27:35, 4.10s/it] {'loss': 0.365, 'grad_norm': 0.623352839941175, 'learning_rate': 5.813573438760867e-06, 'epoch': 0.46} 46%|████▋ | 10265/22095 [17:39:24<13:27:35, 4.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75170 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10266/22095 [17:39:29<13:37:57, 4.15s/it] {'loss': 0.3726, 'grad_norm': 0.626453682575965, 'learning_rate': 5.812850276523405e-06, 'epoch': 0.46} 46%|████▋ | 10266/22095 [17:39:29<13:37:57, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 46%|████▋ | 10267/22095 [17:39:38<18:23:13, 5.60s/it] {'loss': 0.4693, 'grad_norm': 0.29882319766048127, 'learning_rate': 5.812127096820285e-06, 'epoch': 0.46} 46%|████▋ | 10267/22095 [17:39:38<18:23:13, 5.60s/it] 46%|████▋ | 10268/22095 [17:39:41<16:14:28, 4.94s/it] {'loss': 0.329, 'grad_norm': 0.5995073751713392, 'learning_rate': 5.811403899667049e-06, 'epoch': 0.46} 46%|████▋ | 10268/22095 [17:39:41<16:14:28, 4.94s/it] 46%|████▋ | 10269/22095 [17:39:44<14:26:20, 4.40s/it] {'loss': 0.2995, 'grad_norm': 0.5822564001987135, 'learning_rate': 5.810680685079236e-06, 'epoch': 0.46} 46%|████▋ | 10269/22095 [17:39:44<14:26:20, 4.40s/it] 46%|████▋ | 10270/22095 
[17:39:47<12:53:51, 3.93s/it] {'loss': 0.327, 'grad_norm': 0.6120648327385629, 'learning_rate': 5.809957453072385e-06, 'epoch': 0.46} 46%|████▋ | 10270/22095 [17:39:47<12:53:51, 3.93s/it] 46%|████▋ | 10271/22095 [17:39:50<12:11:00, 3.71s/it] {'loss': 0.3188, 'grad_norm': 0.5772405836572486, 'learning_rate': 5.809234203662034e-06, 'epoch': 0.46} 46%|████▋ | 10271/22095 [17:39:50<12:11:00, 3.71s/it] 46%|████▋ | 10272/22095 [17:39:54<12:17:11, 3.74s/it] {'loss': 0.3568, 'grad_norm': 0.6548826917793775, 'learning_rate': 5.808510936863727e-06, 'epoch': 0.46} 46%|████▋ | 10272/22095 [17:39:54<12:17:11, 3.74s/it] 46%|████▋ | 10273/22095 [17:39:57<11:55:47, 3.63s/it] {'loss': 0.3347, 'grad_norm': 0.6218802174741334, 'learning_rate': 5.807787652693002e-06, 'epoch': 0.46} 46%|████▋ | 10273/22095 [17:39:57<11:55:47, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (95181 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62299 > 40960). Running this sequence through the model will result in indexing errors 46%|████▋ | 10274/22095 [17:40:07<17:33:56, 5.35s/it] {'loss': 0.4809, 'grad_norm': 0.3925733151859355, 'learning_rate': 5.8070643511654025e-06, 'epoch': 0.46} 46%|████▋ | 10274/22095 [17:40:07<17:33:56, 5.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8932781 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 55934, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为直线段AB的上点,P点为AC的中点,Q点为BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 6cm\nB. 12cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 47%|████▋ | 10275/22095 [17:40:10<16:01:05, 4.88s/it] {'loss': 0.313, 'grad_norm': 0.6150511972804596, 'learning_rate': 5.806341032296468e-06, 'epoch': 0.47} 47%|████▋ | 10275/22095 [17:40:10<16:01:05, 4.88s/it] 47%|████▋ | 10276/22095 [17:40:14<14:40:54, 4.47s/it] {'loss': 0.3701, 'grad_norm': 0.6160171370967106, 'learning_rate': 5.805617696101742e-06, 'epoch': 0.47} 47%|████▋ | 10276/22095 [17:40:14<14:40:54, 4.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10277/22095 [17:40:17<13:06:05, 3.99s/it] {'loss': 0.3755, 'grad_norm': 0.5990757050539252, 'learning_rate': 5.804894342596766e-06, 'epoch': 0.47} 47%|████▋ | 10277/22095 [17:40:17<13:06:05, 3.99s/it] 47%|████▋ | 10278/22095 [17:40:20<11:54:11, 3.63s/it] {'loss': 0.2972, 'grad_norm': 0.5648945834265401, 'learning_rate': 5.804170971797081e-06, 'epoch': 0.47} 47%|████▋ | 10278/22095 [17:40:20<11:54:11, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52617 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53066 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41352 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45904 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10279/22095 [17:40:23<11:42:12, 3.57s/it] {'loss': 0.368, 'grad_norm': 0.6026134317216376, 'learning_rate': 5.803447583718234e-06, 'epoch': 0.47} 47%|████▋ | 10279/22095 [17:40:23<11:42:12, 3.57s/it] 47%|████▋ | 10280/22095 [17:40:26<11:30:12, 3.51s/it] {'loss': 0.3353, 'grad_norm': 0.5500918012924714, 'learning_rate': 5.802724178375762e-06, 'epoch': 0.47} 47%|████▋ | 10280/22095 [17:40:26<11:30:12, 3.51s/it] 47%|████▋ | 10281/22095 [17:40:30<11:26:32, 3.49s/it] {'loss': 0.2981, 'grad_norm': 0.6222704125021451, 'learning_rate': 5.802000755785217e-06, 'epoch': 0.47} 47%|████▋ | 10281/22095 [17:40:30<11:26:32, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10282/22095 [17:40:39<17:17:02, 5.27s/it] {'loss': 0.49, 'grad_norm': 0.3463249038069997, 'learning_rate': 5.801277315962139e-06, 'epoch': 0.47} 47%|████▋ | 10282/22095 [17:40:39<17:17:02, 5.27s/it] 47%|████▋ | 10283/22095 [17:40:43<15:18:51, 4.67s/it] {'loss': 0.3667, 'grad_norm': 0.6643901305295368, 'learning_rate': 5.80055385892207e-06, 'epoch': 0.47} 47%|████▋ | 10283/22095 [17:40:43<15:18:51, 4.67s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10284/22095 [17:40:46<14:06:00, 4.30s/it] {'loss': 0.3516, 'grad_norm': 0.6112542275471315, 'learning_rate': 5.799830384680558e-06, 'epoch': 0.47} 47%|████▋ | 10284/22095 [17:40:46<14:06:00, 4.30s/it] 47%|████▋ | 10285/22095 [17:40:50<13:27:36, 4.10s/it] {'loss': 0.3737, 'grad_norm': 0.6482440782183497, 'learning_rate': 5.799106893253148e-06, 'epoch': 0.47} 47%|████▋ | 10285/22095 [17:40:50<13:27:36, 4.10s/it] 47%|████▋ | 10286/22095 [17:40:53<13:07:18, 4.00s/it] {'loss': 0.3306, 
'grad_norm': 0.5779577042672598, 'learning_rate': 5.798383384655384e-06, 'epoch': 0.47} 47%|████▋ | 10286/22095 [17:40:53<13:07:18, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10287/22095 [17:41:03<18:22:05, 5.60s/it] {'loss': 0.4888, 'grad_norm': 0.2977800126206191, 'learning_rate': 5.7976598589028154e-06, 'epoch': 0.47} 47%|████▋ | 10287/22095 [17:41:03<18:22:05, 5.60s/it] 47%|████▋ | 10288/22095 [17:41:09<19:08:59, 5.84s/it] {'loss': 0.4909, 'grad_norm': 0.29912546170849535, 'learning_rate': 5.796936316010984e-06, 'epoch': 0.47} 47%|████▋ | 10288/22095 [17:41:09<19:08:59, 5.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8337288 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3910, 'image': 'vrdu_table_final_2/astro-ph.CO/bab9bd78-a02e-4097-a49e-5544398324c3.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} 47%|████▋ | 10289/22095 [17:41:13<17:31:56, 5.35s/it] {'loss': 0.3606, 'grad_norm': 0.5550307674716896, 'learning_rate': 5.796212755995439e-06, 'epoch': 0.47} 47%|████▋ | 10289/22095 [17:41:13<17:31:56, 5.35s/it] 47%|████▋ | 10290/22095 [17:41:17<16:25:07, 5.01s/it] {'loss': 0.32, 'grad_norm': 0.6016564912884554, 'learning_rate': 5.795489178871728e-06, 'epoch': 0.47} 47%|████▋ | 10290/22095 [17:41:18<16:25:07, 5.01s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8395336 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 62174, 'image': 'vrdu_table_final_2/astro-ph.EP/adb168d2-2e33-42ff-973c-4b685d7b3ff7.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (111125556 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 47%|████▋ | 10291/22095 [17:41:20<14:17:25, 4.36s/it] {'loss': 0.3154, 'grad_norm': 0.6441104882214519, 'learning_rate': 5.794765584655397e-06, 'epoch': 0.47} 47%|████▋ | 10291/22095 [17:41:20<14:17:25, 4.36s/it] 47%|████▋ | 10292/22095 [17:41:24<13:24:54, 4.09s/it] {'loss': 0.3459, 'grad_norm': 0.6519116239052571, 'learning_rate': 5.794041973361996e-06, 'epoch': 0.47} 47%|████▋ | 10292/22095 [17:41:24<13:24:54, 4.09s/it] 47%|████▋ | 10293/22095 [17:41:28<13:17:18, 4.05s/it] {'loss': 0.3281, 'grad_norm': 0.6734882207659202, 'learning_rate': 5.793318345007071e-06, 'epoch': 0.47} 47%|████▋ | 10293/22095 [17:41:28<13:17:18, 4.05s/it] 47%|████▋ | 10294/22095 [17:41:31<12:36:41, 3.85s/it] {'loss': 0.303, 'grad_norm': 0.657001665342723, 'learning_rate': 5.7925946996061696e-06, 'epoch': 0.47} 47%|████▋ | 10294/22095 [17:41:31<12:36:41, 3.85s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10295/22095 [17:41:34<11:31:58, 3.52s/it] {'loss': 0.3287, 'grad_norm': 0.784940555608557, 'learning_rate': 5.791871037174844e-06, 'epoch': 0.47} 47%|████▋ | 10295/22095 [17:41:34<11:31:58, 3.52s/it] 
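The `DecompressionBombWarning` above (111,125,556 pixels against Pillow's default 89,478,485-pixel ceiling) is a safety check, not a data error; Pillow warns past the limit and raises past twice the limit. For trusted data the ceiling can be raised via `Image.MAX_IMAGE_PIXELS`. A small pure-Python check for spotting such images ahead of time:

```python
# Default Pillow ceiling; opening a larger image triggers DecompressionBombWarning.
PIL_DEFAULT_MAX_PIXELS = 89_478_485

def exceeds_pixel_limit(width: int, height: int,
                        limit: int = PIL_DEFAULT_MAX_PIXELS) -> bool:
    """True when an image of this size would trip Pillow's bomb warning."""
    return width * height > limit
```

Setting `Image.MAX_IMAGE_PIXELS` to a higher value (or `None`) silences the warning for trusted corpora; leaving it in place is the conservative default.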
47%|████▋ | 10296/22095 [17:41:37<11:31:54, 3.52s/it] {'loss': 0.3177, 'grad_norm': 1.2622160018753232, 'learning_rate': 5.7911473577286415e-06, 'epoch': 0.47} 47%|████▋ | 10296/22095 [17:41:37<11:31:54, 3.52s/it] 47%|████▋ | 10297/22095 [17:41:40<10:53:03, 3.32s/it] {'loss': 0.3239, 'grad_norm': 0.7490174556923634, 'learning_rate': 5.790423661283112e-06, 'epoch': 0.47} 47%|████▋ | 10297/22095 [17:41:40<10:53:03, 3.32s/it] 47%|████▋ | 10298/22095 [17:41:44<11:00:58, 3.36s/it] {'loss': 0.3614, 'grad_norm': 0.676624989660703, 'learning_rate': 5.789699947853807e-06, 'epoch': 0.47} 47%|████▋ | 10298/22095 [17:41:44<11:00:58, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [787, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8352304 in VC:s3://internvl-moe-sft-data/. Exception: Image size [787, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 18985, 'image': 'vrdu_table_final_2/astro-ph.CO/1a541d60-9a99-4e39-84d2-4618462b3a07.png', 'image_wh': [[787, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{lccccc}\n\\multicolumn{1}{c}{\\footnotesize $^\\dagger$ Using lattice QCD, this transition is normally calculated to 150--170 MeV.}\n\\end{tabular}\n```"}]} 47%|████▋ | 10299/22095 [17:41:47<10:59:24, 3.35s/it] {'loss': 0.3225, 'grad_norm': 0.652027534931178, 'learning_rate': 5.788976217456275e-06, 'epoch': 0.47} 47%|████▋ | 10299/22095 [17:41:47<10:59:24, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93212 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10300/22095 [17:41:51<11:15:25, 3.44s/it] {'loss': 0.3683, 'grad_norm': 0.5830828483232589, 'learning_rate': 5.788252470106066e-06, 'epoch': 0.47} 47%|████▋ | 10300/22095 [17:41:51<11:15:25, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44172 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10301/22095 [17:41:54<10:40:17, 3.26s/it] {'loss': 0.3223, 'grad_norm': 0.5600393977412688, 'learning_rate': 5.787528705818732e-06, 'epoch': 0.47} 47%|████▋ | 10301/22095 [17:41:54<10:40:17, 3.26s/it] 47%|████▋ | 10302/22095 [17:41:57<10:40:37, 3.26s/it] {'loss': 0.3012, 'grad_norm': 0.6438021708441337, 'learning_rate': 5.786804924609827e-06, 'epoch': 0.47} 47%|████▋ | 10302/22095 [17:41:57<10:40:37, 3.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10303/22095 [17:42:00<10:26:02, 3.19s/it] {'loss': 0.3149, 'grad_norm': 0.6132114991086414, 'learning_rate': 5.786081126494899e-06, 'epoch': 0.47} 47%|████▋ | 10303/22095 [17:42:00<10:26:02, 3.19s/it] 47%|████▋ | 10304/22095 [17:42:04<11:14:47, 3.43s/it] {'loss': 0.3078, 'grad_norm': 0.6359022819021914, 'learning_rate': 5.785357311489502e-06, 'epoch': 0.47} 47%|████▋ | 10304/22095 [17:42:04<11:14:47, 3.43s/it] 47%|████▋ | 10305/22095 [17:42:07<11:20:53, 3.47s/it] {'loss': 0.3595, 'grad_norm': 0.652813458711096, 'learning_rate': 5.784633479609188e-06, 'epoch': 0.47} 47%|████▋ | 10305/22095 [17:42:07<11:20:53, 3.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10306/22095 [17:42:10<10:40:57, 3.26s/it] {'loss': 0.3429, 'grad_norm': 0.6192102296622036, 'learning_rate': 5.783909630869513e-06, 'epoch': 0.47} 47%|████▋ | 10306/22095 [17:42:10<10:40:57, 3.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10307/22095 [17:42:13<10:22:33, 3.17s/it] {'loss': 0.28, 'grad_norm': 0.5587784962323307, 'learning_rate': 5.7831857652860234e-06, 'epoch': 0.47} 47%|████▋ | 10307/22095 [17:42:13<10:22:33, 3.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10308/22095 
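The `Rank 0: Number of image tokens 0 does not match number of images 1 ... Fixed image tokens in the conversation` messages indicate the loader patches conversations whose text is missing an image placeholder. A minimal repair sketch (the `<image>` placeholder string is an assumption based on the conversation format visible in the logged samples):

```python
IMAGE_TOKEN = "<image>"  # placeholder assumed by the data pipeline

def fix_image_tokens(text: str, num_images: int) -> str:
    """Prepend missing <image> placeholders so their count matches num_images."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing <= 0:
        return text
    return (IMAGE_TOKEN + "\n") * missing + text
```

Validating placeholder counts at data-preparation time would turn these silent runtime fixes into visible dataset errors.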
[17:42:21<15:16:21, 4.66s/it] {'loss': 0.4761, 'grad_norm': 0.4213056610694686, 'learning_rate': 5.782461882874281e-06, 'epoch': 0.47} 47%|████▋ | 10308/22095 [17:42:21<15:16:21, 4.66s/it] 47%|████▋ | 10309/22095 [17:42:31<20:20:03, 6.21s/it] {'loss': 0.4666, 'grad_norm': 0.3523271291831369, 'learning_rate': 5.781737983649833e-06, 'epoch': 0.47} 47%|████▋ | 10309/22095 [17:42:31<20:20:03, 6.21s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 47%|████▋ | 10310/22095 [17:42:35<18:14:29, 5.57s/it] {'loss': 0.3487, 'grad_norm': 0.6272547002980707, 'learning_rate': 5.781014067628239e-06, 'epoch': 0.47} 47%|████▋ | 10310/22095 [17:42:35<18:14:29, 5.57s/it] 47%|████▋ | 10311/22095 [17:42:38<15:52:52, 4.85s/it] {'loss': 0.3387, 'grad_norm': 0.6324788202023445, 'learning_rate': 5.78029013482505e-06, 'epoch': 0.47} 47%|████▋ | 10311/22095 [17:42:38<15:52:52, 4.85s/it] 47%|████▋ | 10312/22095 [17:42:42<14:40:55, 4.49s/it] {'loss': 0.373, 'grad_norm': 0.6166975355394055, 'learning_rate': 5.779566185255823e-06, 'epoch': 0.47} 47%|████▋ | 10312/22095 [17:42:42<14:40:55, 4.49s/it] 47%|████▋ | 10313/22095 [17:42:45<13:35:50, 4.15s/it] {'loss': 0.3311, 'grad_norm': 0.646302923312705, 'learning_rate': 5.778842218936113e-06, 'epoch': 0.47} 47%|████▋ | 10313/22095 [17:42:45<13:35:50, 4.15s/it] 47%|████▋ | 10314/22095 [17:42:49<13:14:21, 4.05s/it] {'loss': 0.3905, 'grad_norm': 0.6577866467883527, 'learning_rate': 5.778118235881475e-06, 'epoch': 0.47} 47%|████▋ | 10314/22095 [17:42:49<13:14:21, 4.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10315/22095 [17:42:52<12:04:23, 3.69s/it] {'loss': 0.3354, 'grad_norm': 0.645231514488921, 'learning_rate': 5.777394236107465e-06, 'epoch': 0.47} 47%|████▋ | 10315/22095 [17:42:52<12:04:23, 3.69s/it] 47%|████▋ | 10316/22095 [17:42:55<11:26:58, 3.50s/it] {'loss': 0.3309, 'grad_norm': 0.6187727317439589, 'learning_rate': 
5.776670219629643e-06, 'epoch': 0.47} 47%|████▋ | 10316/22095 [17:42:55<11:26:58, 3.50s/it] 47%|████▋ | 10317/22095 [17:42:58<10:40:32, 3.26s/it] {'loss': 0.3635, 'grad_norm': 0.6915513013065251, 'learning_rate': 5.775946186463561e-06, 'epoch': 0.47} 47%|████▋ | 10317/22095 [17:42:58<10:40:32, 3.26s/it] 47%|████▋ | 10318/22095 [17:43:01<10:38:56, 3.26s/it] {'loss': 0.3722, 'grad_norm': 0.6626864260683646, 'learning_rate': 5.775222136624781e-06, 'epoch': 0.47} 47%|████▋ | 10318/22095 [17:43:01<10:38:56, 3.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045962 in VC:s3://multi-modal/UniGeo/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 10\nB. 8\nC. 7\nD. 
6'}, {'from': 'gpt', 'value': '【解答】解:∵AB=20,AD=14,∴BD=AB-AD=20-14=6,∵D为线段BC的中点,∴BC=2BD=12,∴AC=AB-BC=20-12=8.'}]} 47%|████▋ | 10319/22095 [17:43:04<10:24:59, 3.18s/it] {'loss': 0.3243, 'grad_norm': 0.6645297758465186, 'learning_rate': 5.774498070128857e-06, 'epoch': 0.47} 47%|████▋ | 10319/22095 [17:43:04<10:24:59, 3.18s/it] 47%|████▋ | 10320/22095 [17:43:08<11:07:27, 3.40s/it] {'loss': 0.2781, 'grad_norm': 0.6766053400804806, 'learning_rate': 5.773773986991348e-06, 'epoch': 0.47} 47%|████▋ | 10320/22095 [17:43:08<11:07:27, 3.40s/it] 47%|████▋ | 10321/22095 [17:43:12<11:31:53, 3.53s/it] {'loss': 0.3932, 'grad_norm': 0.6428071642509061, 'learning_rate': 5.773049887227813e-06, 'epoch': 0.47} 47%|████▋ | 10321/22095 [17:43:12<11:31:53, 3.53s/it] 47%|████▋ | 10322/22095 [17:43:15<11:30:18, 3.52s/it] {'loss': 0.3251, 'grad_norm': 0.6598168419273106, 'learning_rate': 5.772325770853809e-06, 'epoch': 0.47} 47%|████▋ | 10322/22095 [17:43:15<11:30:18, 3.52s/it] 47%|████▋ | 10323/22095 [17:43:18<10:43:08, 3.28s/it] {'loss': 0.3507, 'grad_norm': 0.6502389888661539, 'learning_rate': 5.771601637884897e-06, 'epoch': 0.47} 47%|████▋ | 10323/22095 [17:43:18<10:43:08, 3.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61898 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127180 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44595 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10324/22095 [17:43:22<11:29:06, 3.51s/it] {'loss': 0.2911, 'grad_norm': 0.7030154069910686, 'learning_rate': 5.770877488336636e-06, 'epoch': 0.47} 47%|████▋ | 10324/22095 [17:43:22<11:29:06, 3.51s/it] 47%|████▋ | 10325/22095 [17:43:25<11:19:45, 3.47s/it] {'loss': 0.3697, 'grad_norm': 0.6618412096184401, 'learning_rate': 5.770153322224584e-06, 'epoch': 0.47} 47%|████▋ | 10325/22095 [17:43:25<11:19:45, 3.47s/it] 47%|████▋ | 10326/22095 [17:43:29<11:00:17, 3.37s/it] {'loss': 0.2987, 'grad_norm': 0.592769704678834, 'learning_rate': 5.769429139564303e-06, 'epoch': 0.47} 47%|████▋ | 10326/22095 [17:43:29<11:00:17, 3.37s/it] 47%|████▋ | 10327/22095 [17:43:33<11:44:42, 3.59s/it] {'loss': 0.3575, 'grad_norm': 0.6440489207513519, 'learning_rate': 5.7687049403713545e-06, 'epoch': 0.47} 47%|████▋ | 10327/22095 [17:43:33<11:44:42, 3.59s/it] 47%|████▋ | 10328/22095 [17:43:36<11:49:34, 3.62s/it] {'loss': 0.3235, 'grad_norm': 0.794780089394584, 'learning_rate': 5.767980724661295e-06, 'epoch': 0.47} 47%|████▋ | 10328/22095 [17:43:36<11:49:34, 3.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10329/22095 [17:43:40<12:00:12, 3.67s/it] {'loss': 0.3428, 'grad_norm': 0.6140401917762476, 'learning_rate': 5.767256492449691e-06, 'epoch': 0.47} 47%|████▋ | 10329/22095 [17:43:40<12:00:12, 3.67s/it] 47%|████▋ | 10330/22095 [17:43:44<11:48:43, 3.61s/it] {'loss': 0.3425, 'grad_norm': 0.6446457210170683, 'learning_rate': 5.7665322437521e-06, 'epoch': 0.47} 47%|████▋ | 10330/22095 [17:43:44<11:48:43, 3.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10331/22095 [17:43:46<11:03:31, 3.38s/it] {'loss': 0.3229, 'grad_norm': 0.6975506143904119, 'learning_rate': 5.765807978584086e-06, 'epoch': 0.47} 47%|████▋ | 10331/22095 [17:43:46<11:03:31, 
3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10332/22095 [17:43:56<17:34:59, 5.38s/it] {'loss': 0.4884, 'grad_norm': 0.6558090303313163, 'learning_rate': 5.76508369696121e-06, 'epoch': 0.47} 47%|████▋ | 10332/22095 [17:43:56<17:34:59, 5.38s/it] 47%|████▋ | 10333/22095 [17:44:00<15:56:21, 4.88s/it] {'loss': 0.3582, 'grad_norm': 0.6493883687712777, 'learning_rate': 5.764359398899035e-06, 'epoch': 0.47} 47%|████▋ | 10333/22095 [17:44:00<15:56:21, 4.88s/it] 47%|████▋ | 10334/22095 [17:44:04<14:34:30, 4.46s/it] {'loss': 0.3433, 'grad_norm': 0.6830461604852215, 'learning_rate': 5.763635084413124e-06, 'epoch': 0.47} 47%|████▋ | 10334/22095 [17:44:04<14:34:30, 4.46s/it] 47%|████▋ | 10335/22095 [17:44:07<13:40:18, 4.19s/it] {'loss': 0.2892, 'grad_norm': 0.5862373864261052, 'learning_rate': 5.762910753519041e-06, 'epoch': 0.47} 47%|████▋ | 10335/22095 [17:44:07<13:40:18, 4.19s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10336/22095 [17:44:11<13:02:03, 3.99s/it] {'loss': 0.284, 'grad_norm': 0.6203705565094673, 'learning_rate': 5.7621864062323484e-06, 'epoch': 0.47} 47%|████▋ | 10336/22095 [17:44:11<13:02:03, 3.99s/it] 47%|████▋ | 10337/22095 [17:44:14<12:16:00, 3.76s/it] {'loss': 0.3411, 'grad_norm': 0.6226866995996769, 'learning_rate': 5.7614620425686115e-06, 'epoch': 0.47} 47%|████▋ | 10337/22095 [17:44:14<12:16:00, 3.76s/it] 47%|████▋ | 10338/22095 [17:44:18<12:42:54, 3.89s/it] {'loss': 0.3842, 'grad_norm': 0.6302494720705131, 'learning_rate': 5.760737662543393e-06, 'epoch': 0.47} 47%|████▋ | 10338/22095 [17:44:18<12:42:54, 3.89s/it] 47%|████▋ | 10339/22095 [17:44:21<11:40:53, 3.58s/it] {'loss': 0.3205, 'grad_norm': 0.6905149999540106, 'learning_rate': 5.760013266172261e-06, 'epoch': 0.47} 47%|████▋ | 10339/22095 [17:44:21<11:40:53, 3.58s/it] 47%|████▋ | 10340/22095 [17:44:25<11:43:53, 3.59s/it] {'loss': 0.3267, 'grad_norm': 
0.6035717777445082, 'learning_rate': 5.759288853470776e-06, 'epoch': 0.47} 47%|████▋ | 10340/22095 [17:44:25<11:43:53, 3.59s/it] 47%|████▋ | 10341/22095 [17:44:28<11:30:14, 3.52s/it] {'loss': 0.3656, 'grad_norm': 0.6758276913963119, 'learning_rate': 5.758564424454505e-06, 'epoch': 0.47} 47%|████▋ | 10341/22095 [17:44:28<11:30:14, 3.52s/it] 47%|████▋ | 10342/22095 [17:44:31<10:51:24, 3.33s/it] {'loss': 0.3276, 'grad_norm': 0.6773124521999553, 'learning_rate': 5.757839979139015e-06, 'epoch': 0.47} 47%|████▋ | 10342/22095 [17:44:31<10:51:24, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8558102 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 23069, 'image': '806513969.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a youngster related book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 47%|████▋ | 10343/22095 [17:44:40<16:49:20, 5.15s/it] {'loss': 0.4777, 'grad_norm': 0.39690207641958924, 'learning_rate': 5.757115517539871e-06, 'epoch': 0.47} 47%|████▋ | 10343/22095 [17:44:40<16:49:20, 5.15s/it] 47%|████▋ | 10344/22095 [17:44:44<15:30:46, 4.75s/it] {'loss': 0.3494, 'grad_norm': 0.6325568396327806, 'learning_rate': 5.7563910396726406e-06, 'epoch': 0.47} 47%|████▋ | 10344/22095 [17:44:44<15:30:46, 4.75s/it] 47%|████▋ | 10345/22095 [17:44:47<13:32:41, 4.15s/it] {'loss': 0.3444, 'grad_norm': 0.6558991024020747, 'learning_rate': 5.7556665455528905e-06, 'epoch': 0.47} 47%|████▋ | 10345/22095 [17:44:47<13:32:41, 4.15s/it] 47%|████▋ | 10346/22095 [17:44:50<12:17:30, 3.77s/it] {'loss': 0.3216, 'grad_norm': 0.5942159698676818, 'learning_rate': 5.7549420351961845e-06, 'epoch': 0.47} 47%|████▋ | 10346/22095 [17:44:50<12:17:30, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10347/22095 [17:44:53<11:33:37, 3.54s/it] {'loss': 0.3273, 'grad_norm': 0.7086765666045822, 'learning_rate': 5.754217508618096e-06, 'epoch': 0.47} 47%|████▋ | 10347/22095 [17:44:53<11:33:37, 3.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54742 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94329 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10348/22095 [17:44:56<10:49:06, 3.32s/it] {'loss': 0.3766, 'grad_norm': 1.5854606445523287, 'learning_rate': 5.7534929658341875e-06, 'epoch': 0.47} 47%|████▋ | 10348/22095 [17:44:56<10:49:06, 3.32s/it] 47%|████▋ | 10349/22095 [17:44:59<11:04:41, 3.40s/it] {'loss': 0.3445, 'grad_norm': 0.6060422652921184, 'learning_rate': 5.75276840686003e-06, 'epoch': 0.47} 47%|████▋ | 10349/22095 [17:44:59<11:04:41, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66962 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47308 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43698 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10350/22095 [17:45:03<11:06:02, 3.40s/it] {'loss': 0.368, 'grad_norm': 0.6780583191381162, 'learning_rate': 5.752043831711191e-06, 'epoch': 0.47} 47%|████▋ | 10350/22095 [17:45:03<11:06:02, 3.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [434, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8432625 in VC:s3://internvl-moe-sft-data/. Exception: Image size [434, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 96487, 'image': 'vrdu_texteq/astro-ph.CO/9472c7b4-e30f-4ac4-ae13-11821ba5a1eb.png', 'image_wh': [[434, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where $A$ can be written in the form'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [292, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8512872 in VC:s3://internvl-moe-sft-data/. Exception: Image size [292, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 24120, 'image': 'vrdu_texteq/astro-ph.CO/7acddec2-5d18-403a-b6ce-00a405c6d6e2.png', 'image_wh': [[292, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'The solution for $h$ reads'}]} 47%|████▋ | 10351/22095 [17:45:05<10:36:28, 3.25s/it] {'loss': 0.3302, 'grad_norm': 0.6071328346358127, 'learning_rate': 5.75131924040324e-06, 'epoch': 0.47} 47%|████▋ | 10351/22095 [17:45:05<10:36:28, 3.25s/it] 47%|████▋ | 10352/22095 [17:45:09<10:32:16, 3.23s/it] {'loss': 0.3249, 'grad_norm': 0.6544464830019272, 'learning_rate': 5.750594632951746e-06, 'epoch': 0.47} 47%|████▋ | 10352/22095 [17:45:09<10:32:16, 3.23s/it] 47%|████▋ | 10353/22095 [17:45:13<11:16:25, 3.46s/it] {'loss': 0.3449, 'grad_norm': 0.6721548509646648, 'learning_rate': 5.749870009372279e-06, 'epoch': 0.47} 47%|████▋ | 10353/22095 [17:45:13<11:16:25, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (134307 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54057 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86750 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10354/22095 [17:45:16<10:46:27, 3.30s/it] {'loss': 0.3217, 'grad_norm': 0.6522678147244234, 'learning_rate': 5.7491453696804075e-06, 'epoch': 0.47} 47%|████▋ | 10354/22095 [17:45:16<10:46:27, 3.30s/it] 47%|████▋ | 10355/22095 [17:45:18<10:26:30, 3.20s/it] {'loss': 0.3414, 'grad_norm': 0.600514796449989, 'learning_rate': 5.7484207138917046e-06, 'epoch': 0.47} 47%|████▋ | 10355/22095 [17:45:18<10:26:30, 3.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10356/22095 [17:45:28<16:30:10, 5.06s/it] {'loss': 0.4632, 'grad_norm': 0.34556402129044617, 'learning_rate': 5.747696042021737e-06, 'epoch': 0.47} 47%|████▋ | 10356/22095 [17:45:28<16:30:10, 5.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10357/22095 [17:45:31<14:46:11, 4.53s/it] {'loss': 0.3596, 'grad_norm': 0.6345826108971352, 'learning_rate': 5.746971354086079e-06, 'epoch': 0.47} 47%|████▋ | 10357/22095 [17:45:31<14:46:11, 4.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10358/22095 [17:45:38<17:15:16, 5.29s/it] {'loss': 0.4986, 'grad_norm': 0.30393717099708784, 'learning_rate': 5.746246650100302e-06, 'epoch': 0.47} 47%|████▋ | 10358/22095 [17:45:38<17:15:16, 5.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69032 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10359/22095 [17:45:41<15:09:05, 4.65s/it] {'loss': 0.3326, 'grad_norm': 0.8897658153634812, 'learning_rate': 5.745521930079974e-06, 'epoch': 0.47} 47%|████▋ | 10359/22095 [17:45:41<15:09:05, 4.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69313 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10360/22095 [17:45:45<13:41:56, 4.20s/it] {'loss': 0.3094, 'grad_norm': 0.5911021530626996, 'learning_rate': 5.744797194040672e-06, 'epoch': 0.47} 47%|████▋ | 10360/22095 [17:45:45<13:41:56, 4.20s/it] 47%|████▋ | 10361/22095 [17:45:48<12:35:53, 3.87s/it] {'loss': 0.3444, 'grad_norm': 0.666037617818459, 'learning_rate': 5.744072441997964e-06, 'epoch': 0.47} 47%|████▋ | 10361/22095 [17:45:48<12:35:53, 3.87s/it]Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8336070 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 2689, 'image': 'vrdu_table_final_2/astro-ph.CO/ec0fe774-3ac1-44ea-a803-03066e15ad5a.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]} 47%|████▋ | 10362/22095 [17:45:57<18:05:41, 5.55s/it] {'loss': 0.4974, 'grad_norm': 0.30775239245639197, 'learning_rate': 5.743347673967425e-06, 'epoch': 0.47} 47%|████▋ | 10362/22095 [17:45:57<18:05:41, 5.55s/it] 47%|████▋ | 10363/22095 [17:46:01<16:29:17, 5.06s/it] {'loss': 0.3626, 'grad_norm': 0.668637519512142, 'learning_rate': 5.742622889964628e-06, 'epoch': 0.47} 47%|████▋ | 10363/22095 [17:46:01<16:29:17, 5.06s/it] 47%|████▋ | 10364/22095 [17:46:05<15:12:06, 4.67s/it] {'loss': 0.3252, 'grad_norm': 0.6288653334835292, 'learning_rate': 5.7418980900051445e-06, 'epoch': 0.47} 47%|████▋ | 10364/22095 [17:46:05<15:12:06, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49190 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (128060 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10365/22095 [17:46:08<13:34:38, 4.17s/it] {'loss': 0.3328, 'grad_norm': 0.6235682340663499, 'learning_rate': 5.74117327410455e-06, 'epoch': 0.47} 47%|████▋ | 10365/22095 [17:46:08<13:34:38, 4.17s/it] 47%|████▋ | 10366/22095 [17:46:11<13:00:48, 3.99s/it] {'loss': 0.3471, 'grad_norm': 0.6477660006050718, 'learning_rate': 5.740448442278419e-06, 'epoch': 0.47} 47%|████▋ | 10366/22095 [17:46:11<13:00:48, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (64654 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68526 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84610 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10367/22095 [17:46:21<18:28:53, 5.67s/it] {'loss': 0.4813, 'grad_norm': 0.3137744726867989, 'learning_rate': 5.739723594542323e-06, 'epoch': 0.47} 47%|████▋ | 10367/22095 [17:46:21<18:28:53, 5.67s/it] 47%|████▋ | 10368/22095 [17:46:25<16:42:59, 5.13s/it] {'loss': 0.3605, 'grad_norm': 0.652174300270186, 'learning_rate': 5.738998730911842e-06, 'epoch': 0.47} 47%|████▋ | 10368/22095 [17:46:25<16:42:59, 5.13s/it] 47%|████▋ | 10369/22095 [17:46:28<14:55:51, 4.58s/it] {'loss': 0.3369, 'grad_norm': 0.6532414271831297, 'learning_rate': 5.738273851402547e-06, 'epoch': 0.47} 47%|████▋ | 10369/22095 [17:46:28<14:55:51, 4.58s/it] 47%|████▋ | 10370/22095 [17:46:31<13:15:31, 4.07s/it] {'loss': 0.3022, 'grad_norm': 0.673193705744866, 'learning_rate': 5.737548956030014e-06, 'epoch': 0.47} 47%|████▋ | 10370/22095 [17:46:31<13:15:31, 4.07s/it] 47%|████▋ | 10371/22095 [17:46:35<12:49:29, 3.94s/it] {'loss': 0.3666, 'grad_norm': 0.6975430135053898, 'learning_rate': 5.736824044809818e-06, 'epoch': 0.47} 47%|████▋ | 10371/22095 [17:46:35<12:49:29, 3.94s/it] 47%|████▋ | 10372/22095 [17:46:38<11:48:12, 3.62s/it] {'loss': 0.3067, 'grad_norm': 0.6213955186406469, 'learning_rate': 5.736099117757536e-06, 'epoch': 0.47} 47%|████▋ | 10372/22095 [17:46:38<11:48:12, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10373/22095 [17:46:47<17:27:12, 5.36s/it] {'loss': 0.4613, 'grad_norm': 0.33070487315086977, 'learning_rate': 5.735374174888747e-06, 'epoch': 0.47} 47%|████▋ | 10373/22095 [17:46:47<17:27:12, 5.36s/it] 47%|████▋ | 10374/22095 [17:46:50<15:25:03, 4.74s/it] {'loss': 0.3415, 'grad_norm': 0.5927061983360309, 'learning_rate': 5.734649216219025e-06, 'epoch': 0.47} 47%|████▋ | 10374/22095 [17:46:50<15:25:03, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10375/22095 [17:47:00<19:55:48, 6.12s/it] {'loss': 0.4421, 'grad_norm': 
0.3027285314363435, 'learning_rate': 5.733924241763946e-06, 'epoch': 0.47} 47%|████▋ | 10375/22095 [17:47:00<19:55:48, 6.12s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [37, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 7805986 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [37, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': '27326', 'image': '51716.jpg', 'image_wh': [[37, 25]], 'conversations': [{'from': 'human', 'value': "\n Here is the caption I wrote for the image.\nThis image displays a mathematical notation, specifically the symbol \\\\( M_{xy} \\\\). Here, the letter \\\\( M \\\\) is italicized and in uppercase, while the subscript \\\\( xy \\\\) comprises lowercase italicized letters \\\\( x \\\\) and \\\\( y \\\\). This notation style is prevalent in mathematical contexts, often seen in equations, formulas, or descriptions of mathematical models.\n\n### Analysis\n1. **Notation Context**:\n - The symbol \\\\( M_{xy} \\\\) could be a component of an equation within a complex mathematical system. For instance, it might denote a measurement of distance, a moment of inertia, or another significant mathematical concept that uses subscripts to distinguish between various related entities or directions.\n\n2. 
**Common Uses**:\n - **Geography**: In geographical studies, \\\\( M_{xy} \\\\) might represent the distance from point \\\\( x \\\\) to point \\\\( y \\\\) on a map.\n - **Mechanics and Architecture**: In mechanics or architecture, it could indicate a moment of inertia or a structural parameter concerning two points, \\\\( x \\\\) and \\\\( y \\\\).\n - **Computer Science or Data Analysis**: It could symbolize a metric with indices \\\\( x \\\\) and \\\\( y \\\\) defining data categories or different points in a dataset.\n\n### Logical Chain of Thought\nContemplating the potential applications of \\\\( M_{xy} \\\\), let's explore a scenario in geography:\n\n**Step-by-Step Analysis for Geography**:\n- **Identify Context**: Assume we are mapping the distances between various locations on a map, focusing on the distance from point \\\\( x \\\\) to point \\\\( y \\\\).\n- **Parameter Definition**: Here, \\\\( M_{xy} \\\\) represents the geographical distance from point \\\\( x \\\\) to point \\\\( y \\\\).\n- **Application**:\n - If \\\\( M_{xy} \\\\) = 120 kilometers, it signifies that there's a distance of 120 kilometers between point \\\\( x \\\\) and point \\\\( y \\\\).\n - Such a matrix of distances, represented by \\\\( M \\\\), is essential for geographical mapping and spatial analysis.\n\nMoreover, considering other domains:\n- **Mechanics**: In the analysis of mechanical systems, \\\\( M_{xy} \\\\) could denote the moment of inertia concerning the axes defined by points \\\\( x \\\\) and \\\\( y \\\\).\n- **Computer Science**: When dealing with multidimensional data arrays, \\\\( M_{xy} \\\\) might represent the intensity or value at coordinates \\\\( x \\\\) and \\\\( y \\\\) within a digital image or a dataset.\n\n### Conclusion\nThe notation \\\\( M_{xy} \\\\) indicates a parameter dependent on two subscripts. 
Through this comprehensive examination, we highlighted potential domains—mainly in geography (as the distance between points on a map), but also in mechanics and computer science. Each domain lends a distinct interpretation to this notation, highlighting its flexibility and significance in a variety of academic and practical contexts. \nPlease assess if my caption is accurate, and if it’s not, please point out all the errors."}, {'from': 'gpt', 'value': "Your description is well-structured and detailed, demonstrating a comprehensive understanding of how mathematical notations can be interpreted across different fields. However, there's a fundamental discrepancy in your analysis—it appears to be based on the notation \\( M_{xy} \\), whereas the image shows \\( P_{st} \\). This mismatch leads to incorrect specific applications and examples tied to the \\( M_{xy} \\) notation.\n\nTo correct this, simply replace all instances of \\( M_{xy} \\) with \\( P_{st} \\) and adjust the examples and applications accordingly. For instance, the geographical mapping example would no longer be appropriate, as \\( P_{st} \\) is less likely to represent distance and more likely to represent a probability or a power value in other contexts like probability theory or physics and engineering, respectively.\n\nAdditionally, ensure that the examples and interpretations align with the typical usage of \\( P_{st} \\) as seen in probability, physics, economics, or engineering. 
This adjustment will make your analysis accurate and relevant to the notation depicted in the image."}]} 47%|████▋ | 10376/22095 [17:47:03<17:22:41, 5.34s/it] {'loss': 0.363, 'grad_norm': 0.6054536612662688, 'learning_rate': 5.733199251539091e-06, 'epoch': 0.47} 47%|████▋ | 10376/22095 [17:47:03<17:22:41, 5.34s/it] 47%|████▋ | 10377/22095 [17:47:06<15:05:40, 4.64s/it] {'loss': 0.3624, 'grad_norm': 0.7190888725596375, 'learning_rate': 5.732474245560035e-06, 'epoch': 0.47} 47%|████▋ | 10377/22095 [17:47:06<15:05:40, 4.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [431, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8458030 in VC:s3://internvl-moe-sft-data/. Exception: Image size [431, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 110248, 'image': 'vrdu_texteq/astro-ph.CO/bf31bd45-329f-4d6f-84bc-83509f2714b7.png', 'image_wh': [[431, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'With $\\delta=0$ we find the distribution'}]} 47%|████▋ | 10378/22095 [17:47:10<14:24:56, 4.43s/it] {'loss': 0.3174, 'grad_norm': 0.6658433291423098, 'learning_rate': 5.7317492238423565e-06, 'epoch': 0.47} 47%|████▋ | 10378/22095 [17:47:10<14:24:56, 4.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8338453 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5080, 'image': 'vrdu_table_final_2/astro-ph.CO/965a126e-ca38-4ca6-a7e0-97062eb7e90b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 47%|████▋ | 10379/22095 [17:47:13<13:07:12, 4.03s/it] {'loss': 0.3492, 'grad_norm': 0.6978929635468228, 'learning_rate': 5.731024186401636e-06, 'epoch': 0.47} 47%|████▋ | 10379/22095 [17:47:13<13:07:12, 4.03s/it] 47%|████▋ | 10380/22095 [17:47:16<12:03:25, 3.71s/it] {'loss': 0.3135, 'grad_norm': 0.5995286401107972, 'learning_rate': 5.730299133253449e-06, 'epoch': 0.47} 47%|████▋ | 10380/22095 [17:47:16<12:03:25, 3.71s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8340377 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 7021, 'image': 'vrdu_table_final_2/astro-ph.CO/9742f02a-39c7-4844-8887-bb1f2ef54172.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 47%|████▋ | 10381/22095 [17:47:19<11:27:54, 3.52s/it] {'loss': 0.326, 'grad_norm': 0.6186094867021583, 'learning_rate': 5.729574064413378e-06, 'epoch': 0.47} 47%|████▋ | 10381/22095 [17:47:19<11:27:54, 3.52s/it] 47%|████▋ | 10382/22095 [17:47:23<11:32:27, 3.55s/it] {'loss': 0.2979, 'grad_norm': 0.5917704318434182, 'learning_rate': 5.728848979897001e-06, 'epoch': 0.47} 47%|████▋ | 10382/22095 [17:47:23<11:32:27, 3.55s/it] 47%|████▋ | 10383/22095 [17:47:26<11:01:15, 3.39s/it] {'loss': 0.3497, 'grad_norm': 0.6118993093230494, 'learning_rate': 5.728123879719898e-06, 'epoch': 0.47} 47%|████▋ | 10383/22095 [17:47:26<11:01:15, 3.39s/it] 47%|████▋ | 10384/22095 [17:47:29<10:27:07, 3.21s/it] {'loss': 0.3151, 'grad_norm': 0.6497195551446878, 'learning_rate': 5.727398763897648e-06, 'epoch': 0.47} 47%|████▋ | 10384/22095 [17:47:29<10:27:07, 3.21s/it] 47%|████▋ | 10385/22095 [17:47:32<10:56:11, 3.36s/it] {'loss': 0.3738, 'grad_norm': 0.6050964668972018, 'learning_rate': 5.726673632445834e-06, 'epoch': 0.47} 47%|████▋ | 10385/22095 [17:47:32<10:56:11, 3.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10386/22095 [17:47:36<10:52:28, 3.34s/it] {'loss': 0.3026, 'grad_norm': 0.6806531900254521, 'learning_rate': 5.725948485380034e-06, 'epoch': 0.47} 47%|████▋ | 10386/22095 [17:47:36<10:52:28, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42874 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58194 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138305 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98777 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10387/22095 [17:47:40<11:26:25, 3.52s/it] {'loss': 0.4091, 'grad_norm': 0.663190369071432, 'learning_rate': 5.725223322715833e-06, 'epoch': 0.47} 47%|████▋ | 10387/22095 [17:47:40<11:26:25, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10388/22095 [17:47:46<13:53:31, 4.27s/it] {'loss': 0.4785, 'grad_norm': 0.38758399729658655, 'learning_rate': 5.724498144468807e-06, 'epoch': 0.47} 47%|████▋ | 10388/22095 [17:47:46<13:53:31, 4.27s/it] 47%|████▋ | 10389/22095 [17:47:49<12:51:08, 3.95s/it] {'loss': 0.3497, 'grad_norm': 0.6718509573949357, 'learning_rate': 5.7237729506545435e-06, 'epoch': 0.47} 47%|████▋ | 10389/22095 [17:47:49<12:51:08, 3.95s/it] 47%|████▋ | 10390/22095 [17:47:52<12:25:50, 3.82s/it] {'loss': 0.3236, 'grad_norm': 0.699050141291865, 'learning_rate': 5.723047741288621e-06, 'epoch': 0.47} 47%|████▋ | 10390/22095 [17:47:52<12:25:50, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10391/22095 [17:48:01<17:33:03, 5.40s/it] {'loss': 0.4859, 'grad_norm': 0.3215874781013927, 'learning_rate': 5.722322516386623e-06, 'epoch': 0.47} 47%|████▋ | 10391/22095 [17:48:01<17:33:03, 5.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65888 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51198 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10392/22095 [17:48:05<15:47:52, 4.86s/it] {'loss': 0.3606, 'grad_norm': 0.8302704908623061, 'learning_rate': 5.7215972759641335e-06, 'epoch': 0.47} 47%|████▋ | 10392/22095 [17:48:05<15:47:52, 4.86s/it] 47%|████▋ | 10393/22095 [17:48:08<13:59:58, 4.31s/it] {'loss': 0.3094, 'grad_norm': 0.6579121221890892, 'learning_rate': 5.720872020036734e-06, 'epoch': 0.47} 47%|████▋ | 10393/22095 [17:48:08<13:59:58, 4.31s/it] 47%|████▋ | 10394/22095 [17:48:11<12:42:52, 3.91s/it] {'loss': 0.3317, 'grad_norm': 0.6251053099740931, 'learning_rate': 5.720146748620009e-06, 'epoch': 0.47} 47%|████▋ | 10394/22095 [17:48:11<12:42:52, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10395/22095 [17:48:20<18:02:59, 5.55s/it] {'loss': 0.4976, 'grad_norm': 0.4553160396122376, 'learning_rate': 5.719421461729544e-06, 'epoch': 0.47} 47%|████▋ | 10395/22095 [17:48:20<18:02:59, 5.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87499 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55455 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10396/22095 [17:48:24<16:17:14, 5.01s/it] {'loss': 0.3031, 'grad_norm': 0.6385209815002308, 'learning_rate': 5.718696159380918e-06, 'epoch': 0.47} 47%|████▋ | 10396/22095 [17:48:24<16:17:14, 5.01s/it] 47%|████▋ | 10397/22095 [17:48:27<14:41:33, 4.52s/it] {'loss': 0.3575, 'grad_norm': 0.6550005906490711, 'learning_rate': 5.717970841589722e-06, 'epoch': 0.47} 47%|████▋ | 10397/22095 [17:48:28<14:41:33, 4.52s/it] 47%|████▋ | 10398/22095 [17:48:31<13:18:12, 4.09s/it] {'loss': 0.3571, 'grad_norm': 0.6144433391942784, 'learning_rate': 5.717245508371535e-06, 'epoch': 0.47} 47%|████▋ | 10398/22095 [17:48:31<13:18:12, 4.09s/it] 47%|████▋ | 10399/22095 [17:48:34<12:57:07, 3.99s/it] {'loss': 0.3345, 'grad_norm': 0.56431820576879, 'learning_rate': 5.716520159741946e-06, 'epoch': 0.47} 47%|████▋ | 10399/22095 [17:48:34<12:57:07, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (77029 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56720 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51585 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88703 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10400/22095 [17:48:40<14:35:01, 4.49s/it] {'loss': 0.4817, 'grad_norm': 0.31773722371787644, 'learning_rate': 5.715794795716539e-06, 'epoch': 0.47} 47%|████▋ | 10400/22095 [17:48:40<14:35:01, 4.49s/it] 47%|████▋ | 10401/22095 [17:48:44<14:07:18, 4.35s/it] {'loss': 0.3792, 'grad_norm': 0.6638658663750014, 'learning_rate': 5.7150694163109015e-06, 'epoch': 0.47} 47%|████▋ | 10401/22095 [17:48:44<14:07:18, 4.35s/it] 47%|████▋ | 10402/22095 [17:48:47<12:43:44, 3.92s/it] {'loss': 0.3311, 'grad_norm': 0.6109695735079277, 'learning_rate': 5.714344021540616e-06, 'epoch': 0.47} 47%|████▋ | 10402/22095 [17:48:47<12:43:44, 3.92s/it] 47%|████▋ | 10403/22095 [17:48:51<13:04:00, 4.02s/it] {'loss': 0.3193, 'grad_norm': 0.634634568023838, 'learning_rate': 5.713618611421273e-06, 'epoch': 0.47} 47%|████▋ | 10403/22095 [17:48:51<13:04:00, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41029 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107403 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10404/22095 [17:48:59<16:30:38, 5.08s/it] {'loss': 0.4807, 'grad_norm': 0.2813397261233842, 'learning_rate': 5.712893185968458e-06, 'epoch': 0.47} 47%|████▋ | 10404/22095 [17:48:59<16:30:38, 5.08s/it] 47%|████▋ | 10405/22095 [17:49:02<15:00:03, 4.62s/it] {'loss': 0.3455, 'grad_norm': 0.6239281195599348, 'learning_rate': 5.712167745197757e-06, 'epoch': 0.47} 47%|████▋ | 10405/22095 [17:49:02<15:00:03, 4.62s/it] 47%|████▋ | 10406/22095 [17:49:06<14:10:05, 4.36s/it] {'loss': 0.3336, 'grad_norm': 0.6635538212106871, 'learning_rate': 5.71144228912476e-06, 'epoch': 0.47} 47%|████▋ | 10406/22095 [17:49:06<14:10:05, 4.36s/it] 47%|████▋ | 10407/22095 [17:49:09<12:59:23, 4.00s/it] {'loss': 0.3805, 'grad_norm': 0.7367756633644262, 'learning_rate': 5.710716817765052e-06, 'epoch': 0.47} 47%|████▋ | 10407/22095 [17:49:09<12:59:23, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10408/22095 [17:49:18<17:40:47, 5.45s/it] {'loss': 0.4866, 'grad_norm': 0.30067415828969046, 'learning_rate': 5.709991331134224e-06, 'epoch': 0.47} 47%|████▋ | 10408/22095 [17:49:18<17:40:47, 5.45s/it] 47%|████▋ | 10409/22095 [17:49:21<15:35:30, 4.80s/it] {'loss': 0.2583, 'grad_norm': 0.6064990682433967, 'learning_rate': 5.709265829247861e-06, 'epoch': 0.47} 47%|████▋ | 10409/22095 [17:49:21<15:35:30, 4.80s/it] 47%|████▋ | 10410/22095 [17:49:25<14:27:02, 4.45s/it] {'loss': 0.3187, 'grad_norm': 0.5816228931837361, 'learning_rate': 5.7085403121215545e-06, 'epoch': 0.47} 47%|████▋ | 10410/22095 [17:49:25<14:27:02, 4.45s/it] 47%|████▋ | 10411/22095 [17:49:28<13:03:33, 4.02s/it] {'loss': 0.3383, 'grad_norm': 0.6846169928806831, 'learning_rate': 5.707814779770892e-06, 'epoch': 0.47} 47%|████▋ | 10411/22095 [17:49:28<13:03:33, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10412/22095 [17:49:37<18:16:08, 5.63s/it] {'loss': 0.4872, 'grad_norm': 
0.2864393917728087, 'learning_rate': 5.707089232211463e-06, 'epoch': 0.47} 47%|████▋ | 10412/22095 [17:49:37<18:16:08, 5.63s/it] 47%|████▋ | 10413/22095 [17:49:41<16:40:25, 5.14s/it] {'loss': 0.3431, 'grad_norm': 0.7883095576233838, 'learning_rate': 5.70636366945886e-06, 'epoch': 0.47} 47%|████▋ | 10413/22095 [17:49:41<16:40:25, 5.14s/it] 47%|████▋ | 10414/22095 [17:49:45<14:47:10, 4.56s/it] {'loss': 0.3319, 'grad_norm': 0.671171558514905, 'learning_rate': 5.70563809152867e-06, 'epoch': 0.47} 47%|████▋ | 10414/22095 [17:49:45<14:47:10, 4.56s/it] 47%|████▋ | 10415/22095 [17:49:47<13:07:39, 4.05s/it] {'loss': 0.3253, 'grad_norm': 0.570402605761979, 'learning_rate': 5.704912498436486e-06, 'epoch': 0.47} 47%|████▋ | 10415/22095 [17:49:47<13:07:39, 4.05s/it] 47%|████▋ | 10416/22095 [17:49:51<12:48:59, 3.95s/it] {'loss': 0.3538, 'grad_norm': 0.6494215852181807, 'learning_rate': 5.704186890197897e-06, 'epoch': 0.47} 47%|████▋ | 10416/22095 [17:49:51<12:48:59, 3.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111268 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10417/22095 [17:49:54<11:47:12, 3.63s/it] {'loss': 0.3157, 'grad_norm': 0.6309491842680105, 'learning_rate': 5.703461266828493e-06, 'epoch': 0.47} 47%|████▋ | 10417/22095 [17:49:54<11:47:12, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49570 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10418/22095 [17:49:57<10:55:31, 3.37s/it] {'loss': 0.3297, 'grad_norm': 0.6158447888774433, 'learning_rate': 5.702735628343869e-06, 'epoch': 0.47} 47%|████▋ | 10418/22095 [17:49:57<10:55:31, 3.37s/it] 47%|████▋ | 10419/22095 [17:50:00<10:32:11, 3.25s/it] {'loss': 0.3274, 'grad_norm': 0.6330071116585992, 'learning_rate': 5.702009974759612e-06, 'epoch': 0.47} 47%|████▋ | 10419/22095 [17:50:00<10:32:11, 3.25s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047897 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 8cm\nB. 1lcm\nC. 13cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 47%|████▋ | 10420/22095 [17:50:03<10:25:50, 3.22s/it] {'loss': 0.329, 'grad_norm': 0.6302672089025184, 'learning_rate': 5.701284306091319e-06, 'epoch': 0.47} 47%|████▋ | 10420/22095 [17:50:03<10:25:50, 3.22s/it] 47%|████▋ | 10421/22095 [17:50:06<10:18:35, 3.18s/it] {'loss': 0.3319, 'grad_norm': 0.698076458574524, 'learning_rate': 5.700558622354579e-06, 'epoch': 0.47} 47%|████▋ | 10421/22095 [17:50:06<10:18:35, 3.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44126 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10422/22095 [17:50:09<10:14:46, 3.16s/it] {'loss': 0.3318, 'grad_norm': 0.6162874189440831, 'learning_rate': 5.699832923564986e-06, 'epoch': 0.47} 47%|████▋ | 10422/22095 [17:50:09<10:14:46, 3.16s/it] 47%|████▋ | 10423/22095 [17:50:12<10:26:19, 3.22s/it] {'loss': 0.3253, 'grad_norm': 0.6526134943107602, 'learning_rate': 5.699107209738133e-06, 'epoch': 0.47} 47%|████▋ | 10423/22095 [17:50:12<10:26:19, 3.22s/it] 47%|████▋ | 10424/22095 [17:50:15<10:13:49, 3.16s/it] {'loss': 0.4051, 'grad_norm': 0.680267091801362, 'learning_rate': 5.698381480889614e-06, 'epoch': 0.47} 47%|████▋ | 10424/22095 [17:50:15<10:13:49, 3.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (48696 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50429 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42514 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56220 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10425/22095 [17:50:25<16:18:29, 5.03s/it] {'loss': 0.4714, 'grad_norm': 0.30455858467410707, 'learning_rate': 5.697655737035019e-06, 'epoch': 0.47} 47%|████▋ | 10425/22095 [17:50:25<16:18:29, 5.03s/it] 47%|████▋ | 10426/22095 [17:50:28<14:37:14, 4.51s/it] {'loss': 0.33, 'grad_norm': 0.6318519040245598, 'learning_rate': 5.6969299781899486e-06, 'epoch': 0.47} 47%|████▋ | 10426/22095 [17:50:28<14:37:14, 4.51s/it] 47%|████▋ | 10427/22095 [17:50:32<13:35:01, 4.19s/it] {'loss': 0.3563, 'grad_norm': 0.5964106428025935, 'learning_rate': 5.696204204369991e-06, 'epoch': 0.47} 47%|████▋ | 10427/22095 [17:50:32<13:35:01, 4.19s/it] 47%|████▋ | 10428/22095 [17:50:35<13:05:32, 4.04s/it] {'loss': 0.3742, 'grad_norm': 0.6215879264225173, 'learning_rate': 5.695478415590745e-06, 'epoch': 0.47} 47%|████▋ | 10428/22095 [17:50:36<13:05:32, 4.04s/it] 47%|████▋ | 10429/22095 [17:50:38<12:13:23, 3.77s/it] {'loss': 0.3318, 'grad_norm': 0.6929168753091881, 'learning_rate': 5.6947526118678024e-06, 'epoch': 0.47} 47%|████▋ | 10429/22095 [17:50:38<12:13:23, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10430/22095 [17:50:48<18:19:39, 5.66s/it] {'loss': 0.4975, 'grad_norm': 0.3668308493260659, 'learning_rate': 5.69402679321676e-06, 'epoch': 0.47} 47%|████▋ | 10430/22095 [17:50:49<18:19:39, 5.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43609 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41199 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111513 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10431/22095 [17:50:52<16:24:17, 5.06s/it] {'loss': 0.3537, 'grad_norm': 0.6125920894562785, 'learning_rate': 5.693300959653214e-06, 'epoch': 0.47} 47%|████▋ | 10431/22095 [17:50:52<16:24:17, 5.06s/it] 47%|████▋ | 10432/22095 [17:50:55<14:17:53, 4.41s/it] {'loss': 0.346, 'grad_norm': 0.6896258816056711, 'learning_rate': 5.69257511119276e-06, 'epoch': 0.47} 47%|████▋ | 10432/22095 [17:50:55<14:17:53, 4.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (53577 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10433/22095 [17:51:01<15:54:31, 4.91s/it] {'loss': 0.4737, 'grad_norm': 0.29626764972290515, 'learning_rate': 5.691849247850993e-06, 'epoch': 0.47} 47%|████▋ | 10433/22095 [17:51:01<15:54:31, 4.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57596 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10434/22095 [17:51:04<14:07:41, 4.36s/it] {'loss': 0.2943, 'grad_norm': 0.6294195346534802, 'learning_rate': 5.691123369643511e-06, 'epoch': 0.47} 47%|████▋ | 10434/22095 [17:51:04<14:07:41, 4.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (140196 > 40960). 
Running this sequence through the model will result in indexing errors 47%|████▋ | 10435/22095 [17:51:09<14:22:07, 4.44s/it] {'loss': 0.3178, 'grad_norm': 0.6396559300429897, 'learning_rate': 5.690397476585909e-06, 'epoch': 0.47} 47%|████▋ | 10435/22095 [17:51:09<14:22:07, 4.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10436/22095 [17:51:13<13:43:13, 4.24s/it] {'loss': 0.3441, 'grad_norm': 0.6148819531156378, 'learning_rate': 5.689671568693788e-06, 'epoch': 0.47} 47%|████▋ | 10436/22095 [17:51:13<13:43:13, 4.24s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922260 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 45413, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵AN:MN=1:2,且AN=2,∴2:MN=1:2,∴MN=4cm,∴AM=6cm.∵M是线段AB的中点,∴AB=2AM,∴AB=12cm,故D答案正确.'}]} 47%|████▋ | 10437/22095 [17:51:16<13:09:49, 4.06s/it] {'loss': 0.3151, 'grad_norm': 0.6441787478142561, 'learning_rate': 5.688945645982743e-06, 'epoch': 0.47} 47%|████▋ | 10437/22095 [17:51:16<13:09:49, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41275 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48659 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93031 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89115 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10438/22095 [17:51:20<12:32:01, 3.87s/it] {'loss': 0.3661, 'grad_norm': 0.6019671802140769, 'learning_rate': 5.68821970846837e-06, 'epoch': 0.47} 47%|████▋ | 10438/22095 [17:51:20<12:32:01, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10439/22095 [17:51:27<15:45:16, 4.87s/it] {'loss': 0.4804, 'grad_norm': 0.4033827542244026, 'learning_rate': 5.687493756166272e-06, 'epoch': 0.47} 47%|████▋ | 10439/22095 [17:51:27<15:45:16, 4.87s/it] 47%|████▋ | 10440/22095 [17:51:31<15:22:16, 4.75s/it] {'loss': 0.3211, 'grad_norm': 0.6473518747090413, 'learning_rate': 5.686767789092041e-06, 'epoch': 0.47} 47%|████▋ | 10440/22095 [17:51:31<15:22:16, 4.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10441/22095 [17:51:40<19:03:30, 5.89s/it] {'loss': 0.4899, 'grad_norm': 0.3296715058419835, 'learning_rate': 5.6860418072612826e-06, 'epoch': 0.47} 47%|████▋ | 10441/22095 [17:51:40<19:03:30, 5.89s/it] 47%|████▋ | 10442/22095 [17:51:46<18:54:48, 5.84s/it] {'loss': 0.4893, 'grad_norm': 0.270224074842136, 'learning_rate': 5.6853158106895915e-06, 'epoch': 0.47} 47%|████▋ | 10442/22095 [17:51:46<18:54:48, 5.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this 
model (41389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87103 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64573 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53210 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10443/22095 [17:51:49<16:45:39, 5.18s/it] {'loss': 0.3276, 'grad_norm': 0.657635254692437, 'learning_rate': 5.684589799392568e-06, 'epoch': 0.47} 47%|████▋ | 10443/22095 [17:51:49<16:45:39, 5.18s/it] 47%|████▋ | 10444/22095 [17:51:53<15:42:04, 4.85s/it] {'loss': 0.3139, 'grad_norm': 0.6170343378678285, 'learning_rate': 5.683863773385813e-06, 'epoch': 0.47} 47%|████▋ | 10444/22095 [17:51:53<15:42:04, 4.85s/it] 47%|████▋ | 10445/22095 [17:51:57<14:24:32, 4.45s/it] {'loss': 0.2783, 'grad_norm': 0.6288497648807237, 'learning_rate': 5.683137732684926e-06, 'epoch': 0.47} 47%|████▋ | 10445/22095 [17:51:57<14:24:32, 4.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10446/22095 [17:52:07<19:56:35, 6.16s/it] {'loss': 0.4675, 'grad_norm': 0.42879652640310395, 'learning_rate': 5.682411677305506e-06, 'epoch': 0.47} 47%|████▋ | 10446/22095 [17:52:07<19:56:35, 6.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60760 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102818 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101583 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10447/22095 [17:52:11<17:48:52, 5.51s/it] {'loss': 0.3025, 'grad_norm': 0.666824200507803, 'learning_rate': 5.681685607263156e-06, 'epoch': 0.47} 47%|████▋ | 10447/22095 [17:52:11<17:48:52, 5.51s/it] 47%|████▋ | 10448/22095 [17:52:15<16:00:30, 4.95s/it] {'loss': 0.3202, 'grad_norm': 0.6456795722225278, 'learning_rate': 5.680959522573476e-06, 'epoch': 0.47} 47%|████▋ | 10448/22095 [17:52:15<16:00:30, 4.95s/it] 47%|████▋ | 10449/22095 [17:52:18<14:15:28, 4.41s/it] {'loss': 0.358, 'grad_norm': 0.6255694791465786, 'learning_rate': 5.680233423252066e-06, 'epoch': 0.47} 47%|████▋ | 10449/22095 [17:52:18<14:15:28, 4.41s/it] 47%|████▋ | 10450/22095 [17:52:21<13:05:35, 4.05s/it] {'loss': 0.298, 'grad_norm': 0.6549822291733464, 'learning_rate': 5.67950730931453e-06, 'epoch': 0.47} 47%|████▋ | 10450/22095 [17:52:21<13:05:35, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10451/22095 [17:52:32<19:23:07, 5.99s/it] {'loss': 0.4755, 'grad_norm': 0.3193477388383573, 'learning_rate': 5.678781180776469e-06, 'epoch': 0.47} 47%|████▋ | 10451/22095 [17:52:32<19:23:07, 5.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 47%|████▋ | 10452/22095 [17:52:36<17:32:16, 5.42s/it] {'loss': 0.3221, 'grad_norm': 0.6423122948962887, 'learning_rate': 5.678055037653485e-06, 'epoch': 0.47} 47%|████▋ | 10452/22095 [17:52:36<17:32:16, 5.42s/it] 47%|████▋ | 10453/22095 [17:52:39<15:19:48, 4.74s/it] {'loss': 0.3089, 'grad_norm': 
0.5848128098939198, 'learning_rate': 5.677328879961182e-06, 'epoch': 0.47} 47%|████▋ | 10453/22095 [17:52:39<15:19:48, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79725 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45750 > 40960). Running this sequence through the model will result in indexing errors 47%|████▋ | 10454/22095 [17:52:42<13:53:41, 4.30s/it] {'loss': 0.3378, 'grad_norm': 0.617186467546983, 'learning_rate': 5.676602707715159e-06, 'epoch': 0.47} 47%|████▋ | 10454/22095 [17:52:42<13:53:41, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10455/22095 [17:52:52<19:52:10, 6.15s/it] {'loss': 0.4806, 'grad_norm': 0.3529437506726879, 'learning_rate': 5.675876520931023e-06, 'epoch': 0.47} 47%|████▋ | 10455/22095 [17:52:53<19:52:10, 6.15s/it] 47%|████▋ | 10456/22095 [17:52:56<17:07:14, 5.30s/it] {'loss': 0.3185, 'grad_norm': 0.6428562390362067, 'learning_rate': 5.675150319624375e-06, 'epoch': 0.47} 47%|████▋ | 10456/22095 [17:52:56<17:07:14, 5.30s/it] 47%|████▋ | 10457/22095 [17:53:00<15:56:26, 4.93s/it] {'loss': 0.3537, 'grad_norm': 0.637439086908186, 'learning_rate': 5.674424103810822e-06, 'epoch': 0.47} 47%|████▋ | 10457/22095 [17:53:00<15:56:26, 4.93s/it] 47%|████▋ | 10458/22095 [17:53:03<13:48:09, 4.27s/it] {'loss': 0.3302, 'grad_norm': 0.6085350660379343, 'learning_rate': 5.6736978735059665e-06, 'epoch': 0.47} 47%|████▋ | 10458/22095 [17:53:03<13:48:09, 4.27s/it] 47%|████▋ | 10459/22095 [17:53:06<13:12:25, 4.09s/it] {'loss': 0.3168, 'grad_norm': 0.6298430509677303, 'learning_rate': 5.672971628725412e-06, 'epoch': 0.47} 47%|████▋ | 10459/22095 [17:53:06<13:12:25, 4.09s/it] 47%|████▋ | 10460/22095 [17:53:09<12:10:30, 3.77s/it] {'loss': 0.3543, 'grad_norm': 0.6593200109445425, 'learning_rate': 5.672245369484765e-06, 'epoch': 
0.47} 47%|████▋ | 10460/22095 [17:53:09<12:10:30, 3.77s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8367661 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 34409, 'image': 'vrdu_table_final_2/astro-ph.CO/f57a6f14-3282-4891-b3dd-0d226e25cb8f.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 47%|████▋ | 10461/22095 [17:53:13<12:23:03, 3.83s/it] {'loss': 0.3174, 'grad_norm': 0.6021724432032531, 'learning_rate': 5.671519095799629e-06, 'epoch': 0.47} 47%|████▋ | 10461/22095 [17:53:13<12:23:03, 3.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 47%|████▋ | 10462/22095 [17:53:23<18:02:31, 5.58s/it] {'loss': 0.4881, 'grad_norm': 0.38837544330242485, 'learning_rate': 5.67079280768561e-06, 'epoch': 0.47} 47%|████▋ | 10462/22095 [17:53:23<18:02:31, 5.58s/it] 47%|████▋ | 10463/22095 [17:53:27<16:18:06, 5.05s/it] {'loss': 0.3139, 'grad_norm': 0.635804739256016, 'learning_rate': 5.670066505158314e-06, 'epoch': 0.47} 47%|████▋ | 10463/22095 [17:53:27<16:18:06, 5.05s/it] 47%|████▋ | 10464/22095 [17:53:30<14:35:10, 4.51s/it] {'loss': 0.3154, 'grad_norm': 0.6266395671735902, 'learning_rate': 5.6693401882333455e-06, 'epoch': 0.47} 47%|████▋ | 10464/22095 [17:53:30<14:35:10, 
4.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047751 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AC=6,CB=3,∴AB=6+3=9,∵O是线段AB的中点,∴AO=9÷2=4.5,∴OC=AC-AO=6-4.5=1.5.'}]} 47%|████▋ | 10465/22095 [17:53:34<13:48:03, 4.27s/it] {'loss': 0.3305, 'grad_norm': 0.6620216762722568, 'learning_rate': 5.668613856926312e-06, 'epoch': 0.47} 47%|████▋ | 10465/22095 [17:53:34<13:48:03, 4.27s/it] 47%|████▋ | 10466/22095 [17:53:37<12:49:22, 3.97s/it] {'loss': 0.3249, 'grad_norm': 1.1799266727472142, 'learning_rate': 5.667887511252823e-06, 'epoch': 0.47} 47%|████▋ | 10466/22095 [17:53:37<12:49:22, 3.97s/it] 47%|████▋ | 10467/22095 [17:53:40<11:54:55, 3.69s/it] {'loss': 0.3139, 'grad_norm': 0.6204130806355646, 'learning_rate': 5.667161151228481e-06, 'epoch': 0.47} 47%|████▋ | 10467/22095 [17:53:40<11:54:55, 3.69s/it] 47%|████▋ | 10468/22095 [17:53:44<12:21:45, 3.83s/it] {'loss': 0.322, 'grad_norm': 0.6310237088814322, 'learning_rate': 5.666434776868895e-06, 'epoch': 0.47} 47%|████▋ | 10468/22095 [17:53:44<12:21:45, 3.83s/it] 47%|████▋ | 10469/22095 [17:53:47<11:41:03, 3.62s/it] {'loss': 0.3484, 'grad_norm': 0.6381134611880385, 'learning_rate': 5.665708388189672e-06, 'epoch': 0.47} 47%|████▋ | 10469/22095 [17:53:47<11:41:03, 3.62s/it] 47%|████▋ | 
 47%|████▋ | 10470/22095 [17:53:51<11:17:06, 3.49s/it] {'loss': 0.2972, 'grad_norm': 0.5889469271310414, 'learning_rate': 5.664981985206421e-06, 'epoch': 0.47}
 47%|████▋ | 10471/22095 [17:53:54<11:38:35, 3.61s/it] {'loss': 0.3813, 'grad_norm': 0.711647404359138, 'learning_rate': 5.664255567934749e-06, 'epoch': 0.47}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 47%|████▋ | 10472/22095 [17:53:58<11:42:41, 3.63s/it] {'loss': 0.3075, 'grad_norm': 0.636677445282444, 'learning_rate': 5.663529136390264e-06, 'epoch': 0.47}
 47%|████▋ | 10473/22095 [17:54:02<11:39:51, 3.61s/it] {'loss': 0.3174, 'grad_norm': 0.6477771180653775, 'learning_rate': 5.662802690588578e-06, 'epoch': 0.47}
 47%|████▋ | 10474/22095 [17:54:05<10:59:10, 3.40s/it] {'loss': 0.3416, 'grad_norm': 0.6224392332441187, 'learning_rate': 5.662076230545297e-06, 'epoch': 0.47}
 47%|████▋ | 10475/22095 [17:54:08<10:41:28, 3.31s/it] {'loss': 0.3335, 'grad_norm': 0.6727880059139598, 'learning_rate': 5.66134975627603e-06, 'epoch': 0.47}
 47%|████▋ | 10476/22095 [17:54:11<11:06:08, 3.44s/it] {'loss': 0.3144, 'grad_norm': 0.6891150407798442, 'learning_rate': 5.660623267796389e-06, 'epoch': 0.47}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [753, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8532239 in VC:s3://internvl-moe-sft-data/. Exception: Image size [753, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 116693, 'image': 'vrdu_texteq/astro-ph.CO/380ed1f1-f85a-41fb-bcc0-9f351b0dcb7a.png', 'image_wh': [[753, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $a_\\star$ is the scale factor at the RD to KD transition. Thus'}]}
 47%|████▋ | 10477/22095 [17:54:16<11:54:16, 3.69s/it] {'loss': 0.3266, 'grad_norm': 0.6435883187225452, 'learning_rate': 5.659896765121982e-06, 'epoch': 0.47}
 47%|████▋ | 10478/22095 [17:54:19<11:54:55, 3.69s/it] {'loss': 0.3437, 'grad_norm': 0.608530341011818, 'learning_rate': 5.659170248268422e-06, 'epoch': 0.47}
Token indices sequence length is longer than the specified maximum sequence length for this model (104255 > 40960). Running this sequence through the model will result in indexing errors
 47%|████▋ | 10479/22095 [17:54:23<11:33:19, 3.58s/it] {'loss': 0.3254, 'grad_norm': 0.7542906131639765, 'learning_rate': 5.658443717251316e-06, 'epoch': 0.47}
Token indices sequence length is longer than the specified maximum sequence length for this model (65417 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121394 > 40960). Running this sequence through the model will result in indexing errors
 47%|████▋ | 10480/22095 [17:54:27<11:57:09, 3.70s/it] {'loss': 0.3608, 'grad_norm': 0.6917075448860384, 'learning_rate': 5.657717172086278e-06, 'epoch': 0.47}
 47%|████▋ | 10481/22095 [17:54:30<11:33:41, 3.58s/it] {'loss': 0.3296, 'grad_norm': 0.5847839847757341, 'learning_rate': 5.656990612788918e-06, 'epoch': 0.47}
 47%|████▋ | 10482/22095 [17:54:33<11:16:22, 3.49s/it] {'loss': 0.3042, 'grad_norm': 0.5982518729555611, 'learning_rate': 5.656264039374846e-06, 'epoch': 0.47}
 47%|████▋ | 10483/22095 [17:54:36<10:44:13, 3.33s/it] {'loss': 0.2949, 'grad_norm': 0.5898479533174593, 'learning_rate': 5.6555374518596765e-06, 'epoch': 0.47}
 47%|████▋ | 10484/22095 [17:54:39<10:21:34, 3.21s/it] {'loss': 0.3386, 'grad_norm': 0.6297858839636593, 'learning_rate': 5.654810850259021e-06, 'epoch': 0.47}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 47%|████▋ | 10485/22095 [17:54:43<10:31:22, 3.26s/it] {'loss': 0.3221, 'grad_norm': 0.6019311826617614, 'learning_rate': 5.65408423458849e-06, 'epoch': 0.47}
 47%|████▋ | 10486/22095 [17:54:46<10:54:03, 3.38s/it] {'loss': 0.318, 'grad_norm': 0.6191628764874518, 'learning_rate': 5.653357604863698e-06, 'epoch': 0.47}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 47%|████▋ | 10487/22095 [17:54:56<17:04:36, 5.30s/it] {'loss': 0.4458, 'grad_norm': 0.3862071876986101, 'learning_rate': 5.65263096110026e-06, 'epoch': 0.47}
 47%|████▋ | 10488/22095 [17:55:06<21:20:48, 6.62s/it] {'loss': 0.4774, 'grad_norm': 0.32403615101505817, 'learning_rate': 5.651904303313784e-06, 'epoch': 0.47}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 47%|████▋ | 10489/22095 [17:55:10<19:04:33, 5.92s/it] {'loss': 0.336, 'grad_norm': 0.6090597584329148, 'learning_rate': 5.6511776315198886e-06, 'epoch': 0.47}
 47%|████▋ | 10490/22095 [17:55:14<16:53:02, 5.24s/it] {'loss': 0.3159, 'grad_norm': 0.6378578217907716, 'learning_rate': 5.650450945734185e-06, 'epoch': 0.47}
Token indices sequence length is longer than the specified maximum sequence length for this model (59532 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47655 > 40960). Running this sequence through the model will result in indexing errors
 47%|████▋ | 10491/22095 [17:55:17<15:23:07, 4.77s/it] {'loss': 0.3137, 'grad_norm': 0.7852386464362128, 'learning_rate': 5.649724245972288e-06, 'epoch': 0.47}
Token indices sequence length is longer than the specified maximum sequence length for this model (41306 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108943 > 40960). Running this sequence through the model will result in indexing errors
 47%|████▋ | 10492/22095 [17:55:21<14:25:38, 4.48s/it] {'loss': 0.3738, 'grad_norm': 0.7013033187362679, 'learning_rate': 5.6489975322498124e-06, 'epoch': 0.47}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 47%|████▋ | 10493/22095 [17:55:25<13:47:35, 4.28s/it] {'loss': 0.3624, 'grad_norm': 0.6168953929319065, 'learning_rate': 5.6482708045823734e-06, 'epoch': 0.47}
 47%|████▋ | 10494/22095 [17:55:28<12:32:55, 3.89s/it] {'loss': 0.3126, 'grad_norm': 0.6178354670293249, 'learning_rate': 5.647544062985586e-06, 'epoch': 0.47}
 47%|████▋ | 10495/22095 [17:55:31<12:06:30, 3.76s/it] {'loss': 0.31, 'grad_norm': 0.6318537399545678, 'learning_rate': 5.646817307475066e-06, 'epoch': 0.47}
Token indices sequence length is longer than the specified maximum sequence length for this model (83341 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57854 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10496/22095 [17:55:35<11:41:14, 3.63s/it] {'loss': 0.2877, 'grad_norm': 0.644458175134238, 'learning_rate': 5.646090538066426e-06, 'epoch': 0.48}
 48%|████▊ | 10497/22095 [17:55:38<11:11:52, 3.48s/it] {'loss': 0.3776, 'grad_norm': 0.642582769804473, 'learning_rate': 5.645363754775288e-06, 'epoch': 0.48}
 48%|████▊ | 10498/22095 [17:55:41<10:32:56, 3.27s/it] {'loss': 0.3314, 'grad_norm': 0.7798488205437791, 'learning_rate': 5.644636957617264e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10499/22095 [17:55:50<16:43:30, 5.19s/it] {'loss': 0.5135, 'grad_norm': 0.4828585852711643, 'learning_rate': 5.643910146607972e-06, 'epoch': 0.48}
 48%|████▊ | 10500/22095 [17:56:00<20:47:02, 6.45s/it] {'loss': 0.4605, 'grad_norm': 0.40002404946581677, 'learning_rate': 5.643183321763027e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 48%|████▊ | 10501/22095 [17:56:03<17:42:21, 5.50s/it] {'loss': 0.3191, 'grad_norm': 0.6665771337235158, 'learning_rate': 5.642456483098049e-06, 'epoch': 0.48}
 48%|████▊ | 10502/22095 [17:56:06<15:32:01, 4.82s/it] {'loss': 0.2882, 'grad_norm': 0.6147364813893643, 'learning_rate': 5.641729630628654e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (48936 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59846 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45449 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111016 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42689 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10503/22095 [17:56:09<13:57:50, 4.34s/it] {'loss': 0.2919, 'grad_norm': 0.6099779425362699, 'learning_rate': 5.641002764370461e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10504/22095 [17:56:19<18:54:01, 5.87s/it] {'loss': 0.4744, 'grad_norm': 0.43910766007800495, 'learning_rate': 5.6402758843390844e-06, 'epoch': 0.48}
 48%|████▊ | 10505/22095 [17:56:22<16:12:59, 5.04s/it] {'loss': 0.3509, 'grad_norm': 0.6479746229972967, 'learning_rate': 5.63954899055015e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [95, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8374902 in VC:s3://internvl-moe-sft-data/. Exception: Image size [95, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41678, 'image': 'vrdu_table_final_2/astro-ph.CO/9d7ae02a-30cd-45ee-baa8-c4cdb278b3b5.png', 'image_wh': [[95, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}} IF$_{\\rm CONV}$ \\end{tabular}\n```"}]}
 48%|████▊ | 10506/22095 [17:56:25<14:13:49, 4.42s/it] {'loss': 0.318, 'grad_norm': 0.6720997314016489, 'learning_rate': 5.638822083019267e-06, 'epoch': 0.48}
 48%|████▊ | 10507/22095 [17:56:28<12:59:03, 4.03s/it] {'loss': 0.3371, 'grad_norm': 0.644017282983267, 'learning_rate': 5.638095161762064e-06, 'epoch': 0.48}
 48%|████▊ | 10508/22095 [17:56:32<12:35:19, 3.91s/it] {'loss': 0.3441, 'grad_norm': 0.6956189399848878, 'learning_rate': 5.637368226794153e-06, 'epoch': 0.48}
 48%|████▊ | 10509/22095 [17:56:35<11:47:09, 3.66s/it] {'loss': 0.2998, 'grad_norm': 0.7176511786720804, 'learning_rate': 5.6366412781311575e-06, 'epoch': 0.48}
 48%|████▊ | 10510/22095 [17:56:38<11:36:20, 3.61s/it] {'loss': 0.3256, 'grad_norm': 0.6742388327178932, 'learning_rate': 5.635914315788695e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (92923 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60649 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45134 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120965 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10511/22095 [17:56:48<17:12:26, 5.35s/it] {'loss': 0.4666, 'grad_norm': 0.3869309011976393, 'learning_rate': 5.635187339782389e-06, 'epoch': 0.48}
 48%|████▊ | 10512/22095 [17:56:51<15:25:08, 4.79s/it] {'loss': 0.3445, 'grad_norm': 0.6723727392135527, 'learning_rate': 5.634460350127855e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (45093 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58367 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10513/22095 [17:56:54<13:56:19, 4.33s/it] {'loss': 0.3382, 'grad_norm': 0.7130082870729761, 'learning_rate': 5.633733346840719e-06, 'epoch': 0.48}
 48%|████▊ | 10514/22095 [17:56:59<14:32:43, 4.52s/it] {'loss': 0.3402, 'grad_norm': 0.6353001641146917, 'learning_rate': 5.633006329936599e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (43012 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73469 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78599 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57032 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67924 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10515/22095 [17:57:03<13:35:52, 4.23s/it] {'loss': 0.3894, 'grad_norm': 0.6832213982688691, 'learning_rate': 5.632279299431117e-06, 'epoch': 0.48}
 48%|████▊ | 10516/22095 [17:57:06<13:01:11, 4.05s/it] {'loss': 0.3305, 'grad_norm': 0.6601553798734152, 'learning_rate': 5.631552255339896e-06, 'epoch': 0.48}
 48%|████▊ | 10517/22095 [17:57:10<12:57:29, 4.03s/it] {'loss': 0.3038, 'grad_norm': 0.5967921800961582, 'learning_rate': 5.630825197678556e-06, 'epoch': 0.48}
 48%|████▊ | 10518/22095 [17:57:14<12:12:57, 3.80s/it] {'loss': 0.3315, 'grad_norm': 0.6301548350456041, 'learning_rate': 5.630098126462719e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10519/22095 [17:57:24<18:08:17, 5.64s/it] {'loss': 0.4891, 'grad_norm': 0.32232962909444457, 'learning_rate': 5.629371041708009e-06, 'epoch': 0.48}
 48%|████▊ | 10520/22095 [17:57:28<16:24:40, 5.10s/it] {'loss': 0.3032, 'grad_norm': 0.6719022010093874, 'learning_rate': 5.6286439434300476e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
 48%|████▊ | 10521/22095 [17:57:37<20:37:28, 6.42s/it] {'loss': 0.4913, 'grad_norm': 0.3066219988017249, 'learning_rate': 5.627916831644459e-06, 'epoch': 0.48}
 48%|████▊ | 10522/22095 [17:57:41<17:52:53, 5.56s/it] {'loss': 0.3651, 'grad_norm': 0.7207359042047378, 'learning_rate': 5.627189706366866e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10523/22095 [17:57:50<21:38:11, 6.73s/it] {'loss': 0.4776, 'grad_norm': 0.37989185098569295, 'learning_rate': 5.626462567612892e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308136 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2MVFMjjqhSKJjSspnXXc79XXa_!!2822611227.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n提取这张照片中的文本信息。'}, {'from': 'gpt', 'value': '图中所有文字:\n升级版十二首\n现货\n毅航教玩具\n萨克斯来袭\nOR\nALL-AMERICANFLA\nR\nF\nB\nA\nY\nL\n送电池+螺丝刀'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 48%|████▊ | 10524/22095 [17:57:53<18:20:29, 5.71s/it] {'loss': 0.3261, 'grad_norm': 0.647918593072506, 'learning_rate': 5.625735415398164e-06, 'epoch': 0.48}
 48%|████▊ | 10525/22095 [17:57:57<15:59:54, 4.98s/it] {'loss': 0.3424, 'grad_norm': 0.6128445967722367, 'learning_rate': 5.625008249738301e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [523, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8475520 in VC:s3://internvl-moe-sft-data/. Exception: Image size [523, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 147403, 'image': 'vrdu_texteq/astro-ph.CO/8e1fe338-8963-4125-a901-8f835abef784.png', 'image_wh': [[523, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and we therefore obtain constraints on $b$ as'}]}
 48%|████▊ | 10526/22095 [17:58:00<14:38:46, 4.56s/it] {'loss': 0.2983, 'grad_norm': 0.6504907307873667, 'learning_rate': 5.624281070648933e-06, 'epoch': 0.48}
 48%|████▊ | 10527/22095 [17:58:03<12:59:51, 4.04s/it] {'loss': 0.326, 'grad_norm': 0.6773751990955243, 'learning_rate': 5.623553878145679e-06, 'epoch': 0.48}
 48%|████▊ | 10528/22095 [17:58:06<12:00:13, 3.74s/it] {'loss': 0.3685, 'grad_norm': 0.6813088755616269, 'learning_rate': 5.622826672244169e-06, 'epoch': 0.48}
 48%|████▊ | 10529/22095 [17:58:09<11:26:59, 3.56s/it] {'loss': 0.3429, 'grad_norm': 0.599714585578754, 'learning_rate': 5.622099452960027e-06, 'epoch': 0.48}
 48%|████▊ | 10530/22095 [17:58:12<10:54:01, 3.39s/it] {'loss': 0.3029, 'grad_norm': 0.6266975657425706, 'learning_rate': 5.621372220308877e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8931430 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54583, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nA. 8\nB. 10\nC. 12\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 48%|████▊ | 10531/22095 [17:58:15<10:18:40, 3.21s/it] {'loss': 0.3379, 'grad_norm': 0.638104136678306, 'learning_rate': 5.620644974306347e-06, 'epoch': 0.48}
 48%|████▊ | 10532/22095 [17:58:18<10:11:19, 3.17s/it] {'loss': 0.3223, 'grad_norm': 0.5813658371373994, 'learning_rate': 5.619917714968064e-06, 'epoch': 0.48}
 48%|████▊ | 10533/22095 [17:58:22<10:29:00, 3.26s/it] {'loss': 0.3755, 'grad_norm': 0.6488863979765176, 'learning_rate': 5.619190442309651e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (53592 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10534/22095 [17:58:26<11:16:46, 3.51s/it] {'loss': 0.3545, 'grad_norm': 0.6424566258027432, 'learning_rate': 5.61846315634674e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (52916 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10535/22095 [17:58:30<11:50:15, 3.69s/it] {'loss': 0.3029, 'grad_norm': 0.6801460304007487, 'learning_rate': 5.617735857094951e-06, 'epoch': 0.48}
 48%|████▊ | 10536/22095 [17:58:33<11:14:23, 3.50s/it] {'loss': 0.2941, 'grad_norm': 0.5846304119121462, 'learning_rate': 5.61700854456992e-06, 'epoch': 0.48}
 48%|████▊ | 10537/22095 [17:58:36<11:20:31, 3.53s/it] {'loss': 0.3463, 'grad_norm': 0.6947565676732286, 'learning_rate': 5.616281218787268e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (133303 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82339 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102209 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10538/22095 [17:58:40<11:03:39, 3.45s/it] {'loss': 0.2924, 'grad_norm': 0.588970540066978, 'learning_rate': 5.6155538797626254e-06, 'epoch': 0.48}
 48%|████▊ | 10539/22095 [17:58:44<11:35:17, 3.61s/it] {'loss': 0.3402, 'grad_norm': 0.5913635705662535, 'learning_rate': 5.614826527511621e-06, 'epoch': 0.48}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 48%|████▊ | 10540/22095 [17:58:47<11:20:44, 3.53s/it] {'loss': 0.3182, 'grad_norm': 0.6070944121687863, 'learning_rate': 5.614099162049883e-06, 'epoch': 0.48}
 48%|████▊ | 10541/22095 [17:58:50<10:39:47, 3.32s/it] {'loss': 0.3275, 'grad_norm': 0.6484577073713257, 'learning_rate': 5.613371783393039e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10542/22095 [17:59:00<16:47:20, 5.23s/it] {'loss': 0.4919, 'grad_norm': 0.4262153304553133, 'learning_rate': 5.612644391556721e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/images/os_ubuntu/handmade_annotation_6/images/paste_Screenshot from 2025-07-04 18-35-05_id_2_function_1_crop_1_grounding_instructions_random.png 2025-08-28 09:56:58.759024 load time: 1068.03 ms
 48%|████▊ | 10543/22095 [17:59:07<19:24:53, 6.05s/it] {'loss': 0.4972, 'grad_norm': 0.3873556240946461, 'learning_rate': 5.611916986556555e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8374002 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40777, 'image': 'vrdu_table_final_2/astro-ph.CO/1e412d7a-e81d-47ad-9451-303028c61dc7.png', 'image_wh': [[1975, 6]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{p{\\textwidth}}\\hline\\ \\end{tabular}\n```"}]}
 48%|████▊ | 10544/22095 [17:59:11<17:11:05, 5.36s/it] {'loss': 0.3236, 'grad_norm': 0.6849575773347174, 'learning_rate': 5.611189568408173e-06, 'epoch': 0.48}
 48%|████▊ | 10545/22095 [17:59:15<15:28:02, 4.82s/it] {'loss': 0.2837, 'grad_norm': 0.5726982431800233, 'learning_rate': 5.610462137127205e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10546/22095 [17:59:24<19:52:48, 6.20s/it] {'loss': 0.5073, 'grad_norm': 0.4824256200216908, 'learning_rate': 5.609734692729278e-06, 'epoch': 0.48}
 48%|████▊ | 10547/22095 [17:59:27<17:04:26, 5.32s/it] {'loss': 0.3515, 'grad_norm': 0.6105993805076046, 'learning_rate': 5.609007235230029e-06, 'epoch': 0.48}
 48%|████▊ | 10548/22095 [17:59:31<15:05:02, 4.70s/it] {'loss': 0.3263, 'grad_norm': 0.5849218911471216, 'learning_rate': 5.60827976464508e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (130634 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10549/22095 [17:59:35<14:17:51, 4.46s/it] {'loss': 0.3844, 'grad_norm': 0.6010479395747507, 'learning_rate': 5.607552280990071e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (43309 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96922 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10550/22095 [17:59:38<12:58:16, 4.04s/it] {'loss': 0.3451, 'grad_norm': 0.5929995553550435, 'learning_rate': 5.606824784280629e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (100886 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45617 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45757 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78358 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10551/22095 [17:59:41<12:28:44, 3.89s/it] {'loss': 0.3029, 'grad_norm': 0.5829228500969121, 'learning_rate': 5.606097274532385e-06, 'epoch': 0.48}
 48%|████▊ | 10552/22095 [17:59:44<11:40:03, 3.64s/it] {'loss': 0.3454, 'grad_norm': 0.6339342525556836, 'learning_rate': 5.6053697517609725e-06, 'epoch': 0.48}
 48%|████▊ | 10553/22095 [17:59:47<10:48:05, 3.37s/it] {'loss': 0.2935, 'grad_norm': 0.6276833410613624, 'learning_rate': 5.604642215982025e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (45804 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10554/22095 [17:59:51<11:23:29, 3.55s/it] {'loss': 0.3491, 'grad_norm': 0.6209626862316334, 'learning_rate': 5.60391466721117e-06, 'epoch': 0.48}
 48%|████▊ | 10555/22095 [17:59:55<11:28:52, 3.58s/it] {'loss': 0.3242, 'grad_norm': 0.6916101322205742, 'learning_rate': 5.603187105464045e-06, 'epoch': 0.48}
 48%|████▊ | 10556/22095 [17:59:59<11:53:31, 3.71s/it] {'loss': 0.3542, 'grad_norm': 0.5816352573986318, 'learning_rate': 5.6024595307562815e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10557/22095 [18:00:06<15:40:36, 4.89s/it] {'loss': 0.4782, 'grad_norm': 0.47474496295445967, 'learning_rate': 5.601731943103515e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (49004 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76289 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10558/22095 [18:00:16<20:25:05, 6.37s/it] {'loss': 0.4784, 'grad_norm': 0.41798351738900213, 'learning_rate': 5.601004342521374e-06, 'epoch': 0.48}
 48%|████▊ | 10559/22095 [18:00:26<23:22:49, 7.30s/it] {'loss': 0.4703, 'grad_norm': 0.33629964526700756, 'learning_rate': 5.6002767290254975e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 48%|████▊ | 10560/22095 [18:00:30<20:14:54, 6.32s/it] {'loss': 0.284, 'grad_norm': 0.5380229410130636, 'learning_rate': 5.599549102631516e-06, 'epoch': 0.48}
 48%|████▊ | 10561/22095 [18:00:34<18:08:20, 5.66s/it] {'loss': 0.3106, 'grad_norm': 0.8051297804005851, 'learning_rate': 5.598821463355069e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/web/images/yang_0528112335/10_140_52_49_0528151614/img/0.png 2025-08-28 09:58:32.537248 load time: 1049.19 ms
 48%|████▊ | 10562/22095 [18:00:42<20:42:31, 6.46s/it] {'loss': 0.4833, 'grad_norm': 0.5170642766936449, 'learning_rate': 5.598093811211785e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8344075 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10727, 'image': 'vrdu_table_final_2/astro-ph.CO/a1102c05-6ff0-4965-af9b-cfab23096f41.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
 48%|████▊ | 10563/22095 [18:00:46<17:53:21, 5.58s/it] {'loss': 0.2974, 'grad_norm': 0.5963256229323173, 'learning_rate': 5.597366146217303e-06, 'epoch': 0.48}
 48%|████▊ | 10564/22095 [18:00:49<15:19:39, 4.79s/it] {'loss': 0.3371, 'grad_norm': 0.6599750262075692, 'learning_rate': 5.596638468387255e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (43904 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53301 > 40960). Running this sequence through the model will result in indexing errors
 48%|████▊ | 10565/22095 [18:00:51<13:33:29, 4.23s/it] {'loss': 0.297, 'grad_norm': 0.9292747092655834, 'learning_rate': 5.595910777737281e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 48%|████▊ | 10566/22095 [18:01:00<18:01:31, 5.63s/it] {'loss': 0.482, 'grad_norm': 0.5795709443727703, 'learning_rate': 5.5951830742830145e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [759, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8497075 in VC:s3://internvl-moe-sft-data/. Exception: Image size [759, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 62121, 'image': 'vrdu_texteq/astro-ph.CO/025e8ea7-df7e-443d-ad9a-7b9450a594d6.png', 'image_wh': [[759, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'and the bubble final size after a time $t_{\\rm cr}+t_{\\rm acc}+t_{\\rm end}\\sim t_{\\rm end}$ is:'}]} 48%|████▊ | 10567/22095 [18:01:04<16:33:47, 5.17s/it] {'loss': 0.3363, 'grad_norm': 0.6727811870569353, 'learning_rate': 5.594455358040091e-06, 'epoch': 0.48} 48%|████▊ | 10567/22095 [18:01:04<16:33:47, 5.17s/it] 48%|████▊ | 10568/22095 [18:01:08<15:15:02, 4.76s/it] {'loss': 0.3043, 'grad_norm': 0.657959114215397, 'learning_rate': 5.5937276290241486e-06, 'epoch': 0.48} 48%|████▊ | 10568/22095 [18:01:08<15:15:02, 4.76s/it] 48%|████▊ | 10569/22095 [18:01:12<14:26:49, 4.51s/it] {'loss': 0.36, 'grad_norm': 0.6521324485995212, 'learning_rate': 5.5929998872508215e-06, 'epoch': 0.48} 48%|████▊ | 10569/22095 [18:01:12<14:26:49, 4.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59734 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42880 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47390 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48864 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59875 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10570/22095 [18:01:15<13:13:40, 4.13s/it] {'loss': 0.34, 'grad_norm': 0.5943329694166126, 'learning_rate': 5.592272132735749e-06, 'epoch': 0.48} 48%|████▊ | 10570/22095 [18:01:15<13:13:40, 4.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (149096 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65580 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68817 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88111 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91132 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110206 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10571/22095 [18:01:19<12:15:59, 3.83s/it] {'loss': 0.3467, 'grad_norm': 0.6612431740464342, 'learning_rate': 5.591544365494567e-06, 'epoch': 0.48} 48%|████▊ | 10571/22095 [18:01:19<12:15:59, 3.83s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [12, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8410672 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 17, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 12871, 'image': 'vrdu_table_final_2/astro-ph.CO/27850288-601b-4679-a2fd-859894298097.png', 'image_wh': [[12, 17]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\footnotesize #1\n\\end{tabular}\n```"}]} 48%|████▊ | 10572/22095 [18:01:21<11:20:53, 3.55s/it] {'loss': 0.2876, 'grad_norm': 0.6059079878953904, 'learning_rate': 5.590816585542913e-06, 'epoch': 0.48} 48%|████▊ | 10572/22095 [18:01:21<11:20:53, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10573/22095 [18:01:29<14:53:48, 4.65s/it] {'loss': 0.4569, 'grad_norm': 0.4460620888325759, 'learning_rate': 5.590088792896427e-06, 'epoch': 0.48} 48%|████▊ | 10573/22095 [18:01:29<14:53:48, 4.65s/it] 48%|████▊ | 10574/22095 [18:01:33<14:44:32, 4.61s/it] {'loss': 0.3438, 'grad_norm': 0.6267320935882589, 'learning_rate': 5.589360987570745e-06, 'epoch': 0.48} 
48%|████▊ | 10574/22095 [18:01:33<14:44:32, 4.61s/it] 48%|████▊ | 10575/22095 [18:01:37<13:57:13, 4.36s/it] {'loss': 0.3631, 'grad_norm': 0.6944370493131874, 'learning_rate': 5.588633169581502e-06, 'epoch': 0.48} 48%|████▊ | 10575/22095 [18:01:37<13:57:13, 4.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 48%|████▊ | 10576/22095 [18:01:41<13:26:24, 4.20s/it] {'loss': 0.3306, 'grad_norm': 0.5718505842229085, 'learning_rate': 5.5879053389443435e-06, 'epoch': 0.48} 48%|████▊ | 10576/22095 [18:01:41<13:26:24, 4.20s/it] 48%|████▊ | 10577/22095 [18:01:44<12:06:35, 3.79s/it] {'loss': 0.3318, 'grad_norm': 0.654911543789487, 'learning_rate': 5.587177495674902e-06, 'epoch': 0.48} 48%|████▊ | 10577/22095 [18:01:44<12:06:35, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10578/22095 [18:01:53<17:41:13, 5.53s/it] {'loss': 0.459, 'grad_norm': 0.38311412169455544, 'learning_rate': 5.586449639788822e-06, 'epoch': 0.48} 48%|████▊ | 10578/22095 [18:01:53<17:41:13, 5.53s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 48%|████▊ | 10579/22095 [18:02:03<21:29:29, 6.72s/it] {'loss': 0.4551, 'grad_norm': 0.296067994927196, 'learning_rate': 5.5857217713017394e-06, 'epoch': 0.48} 48%|████▊ | 10579/22095 [18:02:03<21:29:29, 6.72s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 48%|████▊ | 10580/22095 [18:02:06<18:06:59, 5.66s/it] {'loss': 0.3378, 'grad_norm': 0.6393993738142472, 'learning_rate': 5.584993890229296e-06, 'epoch': 0.48} 48%|████▊ | 10580/22095 [18:02:06<18:06:59, 5.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45381 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10581/22095 [18:02:10<16:41:24, 5.22s/it] {'loss': 0.3513, 'grad_norm': 0.6430280142761382, 'learning_rate': 5.584265996587129e-06, 'epoch': 0.48} 48%|████▊ | 10581/22095 [18:02:10<16:41:24, 5.22s/it] 48%|████▊ | 10582/22095 [18:02:13<14:22:46, 4.50s/it] {'loss': 0.3143, 'grad_norm': 0.6254186892649977, 'learning_rate': 5.583538090390882e-06, 'epoch': 0.48} 48%|████▊ | 10582/22095 [18:02:13<14:22:46, 4.50s/it] 48%|████▊ | 10583/22095 [18:02:16<12:53:00, 4.03s/it] {'loss': 0.3176, 'grad_norm': 0.6131120850240496, 'learning_rate': 5.582810171656191e-06, 'epoch': 0.48} 48%|████▊ | 10583/22095 [18:02:16<12:53:00, 4.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42207 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77509 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10584/22095 [18:02:19<12:30:23, 3.91s/it] {'loss': 0.2955, 'grad_norm': 0.6319043952102459, 'learning_rate': 5.582082240398702e-06, 'epoch': 0.48} 48%|████▊ | 10584/22095 [18:02:20<12:30:23, 3.91s/it] 48%|████▊ | 10585/22095 [18:02:22<11:24:56, 3.57s/it] {'loss': 0.3094, 'grad_norm': 0.6220473199791497, 'learning_rate': 5.5813542966340514e-06, 'epoch': 0.48} 48%|████▊ | 10585/22095 [18:02:22<11:24:56, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8953370 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4205, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定'}, {'from': 'gpt', 'value': '【解答】解:∵M、N分别是线段AB、BC的中点,∴MB=0.5AB=3cm,NB=0.5BC=2cm,∴MN=MB+NB=3+2=5(cm),'}]} 48%|████▊ | 10586/22095 [18:02:25<10:56:06, 3.42s/it] {'loss': 0.347, 'grad_norm': 0.6886628697806193, 'learning_rate': 5.580626340377884e-06, 'epoch': 0.48} 48%|████▊ | 10586/22095 [18:02:25<10:56:06, 3.42s/it] 48%|████▊ | 10587/22095 [18:02:29<10:42:52, 3.35s/it] {'loss': 0.3447, 'grad_norm': 0.6260532435901625, 'learning_rate': 5.579898371645839e-06, 'epoch': 0.48} 48%|████▊ | 10587/22095 [18:02:29<10:42:52, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44463 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (136469 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10588/22095 [18:02:33<11:20:17, 3.55s/it] {'loss': 0.3037, 'grad_norm': 0.6086583527185936, 'learning_rate': 5.5791703904535584e-06, 'epoch': 0.48} 48%|████▊ | 10588/22095 [18:02:33<11:20:17, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49967 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41962 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111288 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10589/22095 [18:02:42<17:27:09, 5.46s/it] {'loss': 0.4868, 'grad_norm': 0.41963576368214195, 'learning_rate': 5.578442396816685e-06, 'epoch': 0.48} 48%|████▊ | 10589/22095 [18:02:42<17:27:09, 5.46s/it] 48%|████▊ | 10590/22095 [18:02:46<15:43:17, 4.92s/it] {'loss': 0.3631, 'grad_norm': 0.8103420874521906, 'learning_rate': 5.577714390750862e-06, 'epoch': 0.48} 48%|████▊ | 10590/22095 [18:02:46<15:43:17, 4.92s/it] 48%|████▊ | 10591/22095 [18:02:50<14:26:25, 4.52s/it] {'loss': 0.3607, 'grad_norm': 0.8064511619876602, 'learning_rate': 5.576986372271731e-06, 'epoch': 0.48} 48%|████▊ | 10591/22095 [18:02:50<14:26:25, 4.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10592/22095 [18:03:00<19:40:59, 6.16s/it] {'loss': 0.4842, 'grad_norm': 0.31885495383354473, 'learning_rate': 5.576258341394936e-06, 'epoch': 0.48} 48%|████▊ | 10592/22095 [18:03:00<19:40:59, 6.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49480 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10593/22095 [18:03:04<17:31:54, 5.49s/it] {'loss': 0.3272, 'grad_norm': 0.6218266829354264, 'learning_rate': 5.575530298136116e-06, 'epoch': 0.48} 48%|████▊ | 10593/22095 [18:03:04<17:31:54, 5.49s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [1120, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8488356 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1120, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 52310, 'image': 'vrdu_texteq/astro-ph.CO/a07a2920-2726-40fc-a51e-22a17ed15764.png', 'image_wh': [[1120, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'The D-brane and K$\\ddot{\\rm a}$hler-moduli inflation models are consistent with the observational data.'}]} 48%|████▊ | 10594/22095 [18:03:07<15:56:59, 4.99s/it] {'loss': 0.3527, 'grad_norm': 0.704607964592815, 'learning_rate': 5.574802242510921e-06, 'epoch': 0.48} 48%|████▊ | 10594/22095 [18:03:07<15:56:59, 4.99s/it] 48%|████▊ | 10595/22095 [18:03:11<14:07:33, 4.42s/it] {'loss': 0.3301, 'grad_norm': 0.6220004783827727, 'learning_rate': 5.574074174534989e-06, 'epoch': 0.48} 48%|████▊ | 10595/22095 [18:03:11<14:07:33, 4.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41066 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66223 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51415 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (125121 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10596/22095 [18:03:19<17:44:27, 5.55s/it] {'loss': 0.5055, 'grad_norm': 0.3650889133524078, 'learning_rate': 5.573346094223966e-06, 'epoch': 0.48} 48%|████▊ | 10596/22095 [18:03:19<17:44:27, 5.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [439, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8529988 in VC:s3://internvl-moe-sft-data/. Exception: Image size [439, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 119321, 'image': 'vrdu_texteq/astro-ph.CO/f94d1e8f-f201-4e2e-a0b9-77252a31ab6c.png', 'image_wh': [[439, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'while in the small $x$ limit {\\it i.e.} $x \\ll 1$'}]} 48%|████▊ | 10597/22095 [18:03:28<21:18:46, 6.67s/it] {'loss': 0.4825, 'grad_norm': 0.35017989979027714, 'learning_rate': 5.5726180015934976e-06, 'epoch': 0.48} 48%|████▊ | 10597/22095 [18:03:28<21:18:46, 6.67s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 9047896 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 48%|████▊ | 10598/22095 [18:03:32<19:00:23, 5.95s/it] {'loss': 0.2814, 'grad_norm': 0.5869712702415046, 'learning_rate': 5.571889896659225e-06, 'epoch': 0.48} 48%|████▊ | 10598/22095 [18:03:32<19:00:23, 5.95s/it] 48%|████▊ | 10599/22095 [18:03:36<16:36:41, 5.20s/it] {'loss': 0.3136, 'grad_norm': 0.814425945966293, 'learning_rate': 5.571161779436797e-06, 'epoch': 0.48} 48%|████▊ | 10599/22095 [18:03:36<16:36:41, 5.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50432 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47617 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54272 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10600/22095 [18:03:40<15:47:31, 4.95s/it] {'loss': 0.3181, 'grad_norm': 0.6462944916736755, 'learning_rate': 5.570433649941855e-06, 'epoch': 0.48} 48%|████▊ | 10600/22095 [18:03:40<15:47:31, 4.95s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10601/22095 [18:03:48<18:44:25, 5.87s/it] {'loss': 0.508, 'grad_norm': 0.293109634291855, 'learning_rate': 5.5697055081900465e-06, 'epoch': 0.48} 48%|████▊ | 10601/22095 [18:03:48<18:44:25, 5.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92646 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52397 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60115 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10602/22095 [18:03:52<16:40:49, 5.22s/it] {'loss': 0.3507, 'grad_norm': 0.5914173537503168, 'learning_rate': 5.568977354197016e-06, 'epoch': 0.48} 48%|████▊ | 10602/22095 [18:03:52<16:40:49, 5.22s/it] 48%|████▊ | 10603/22095 [18:03:56<15:17:48, 4.79s/it] {'loss': 0.3485, 'grad_norm': 0.6150955968191683, 'learning_rate': 5.568249187978412e-06, 'epoch': 0.48} 48%|████▊ | 10603/22095 [18:03:56<15:17:48, 4.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (75509 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10604/22095 [18:04:05<19:50:46, 6.22s/it] {'loss': 0.4621, 'grad_norm': 0.32318681448665276, 'learning_rate': 5.567521009549874e-06, 'epoch': 0.48} 48%|████▊ | 10604/22095 [18:04:05<19:50:46, 6.22s/it] 48%|████▊ | 10605/22095 [18:04:09<17:17:29, 5.42s/it] {'loss': 0.3452, 'grad_norm': 0.7618492618904363, 'learning_rate': 5.566792818927056e-06, 'epoch': 0.48} 48%|████▊ | 10605/22095 [18:04:09<17:17:29, 5.42s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 10:02:09.150069 load time: 1077.34 ms 48%|████▊ | 10606/22095 [18:04:12<15:13:14, 4.77s/it] {'loss': 0.3312, 'grad_norm': 0.6691405458194113, 'learning_rate': 5.566064616125599e-06, 'epoch': 0.48} 48%|████▊ | 10606/22095 [18:04:12<15:13:14, 4.77s/it] 48%|████▊ | 10607/22095 [18:04:15<13:48:24, 4.33s/it] {'loss': 0.3358, 'grad_norm': 0.6048012011493709, 'learning_rate': 5.565336401161153e-06, 'epoch': 0.48} 48%|████▊ | 10607/22095 [18:04:15<13:48:24, 4.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10608/22095 [18:04:25<18:49:31, 5.90s/it] {'loss': 0.4653, 'grad_norm': 0.3210420083699887, 'learning_rate': 5.564608174049364e-06, 'epoch': 0.48} 48%|████▊ | 10608/22095 [18:04:25<18:49:31, 5.90s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_98/img/step_0.png 2025-08-28 10:02:25.335197 load time: 1111.79 ms 48%|████▊ | 10609/22095 [18:04:29<16:46:06, 5.26s/it] {'loss': 0.3502, 'grad_norm': 0.6075408372209316, 'learning_rate': 5.5638799348058795e-06, 'epoch': 0.48} 48%|████▊ | 10609/22095 [18:04:29<16:46:06, 5.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10610/22095 [18:04:38<20:53:09, 6.55s/it] {'loss': 0.4962, 'grad_norm': 
0.30198056087813385, 'learning_rate': 5.563151683446346e-06, 'epoch': 0.48} 48%|████▊ | 10610/22095 [18:04:38<20:53:09, 6.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54313 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46419 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10611/22095 [18:04:42<17:50:30, 5.59s/it] {'loss': 0.3455, 'grad_norm': 0.6459826113080686, 'learning_rate': 5.562423419986415e-06, 'epoch': 0.48} 48%|████▊ | 10611/22095 [18:04:42<17:50:30, 5.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10612/22095 [18:04:50<20:51:30, 6.54s/it] {'loss': 0.457, 'grad_norm': 0.26951371419713627, 'learning_rate': 5.561695144441729e-06, 'epoch': 0.48} 48%|████▊ | 10612/22095 [18:04:50<20:51:30, 6.54s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_2/images/step_0.png 2025-08-28 10:02:49.032675 load time: 1195.88 ms VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_2/images/before_screenshot_7_id_94_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:02:50.171317 load time: 1063.53 ms 48%|████▊ | 10613/22095 [18:04:54<17:46:01, 5.57s/it] {'loss': 0.3443, 'grad_norm': 0.624369197658816, 'learning_rate': 5.5609668568279415e-06, 'epoch': 0.48} 48%|████▊ | 10613/22095 [18:04:54<17:46:01, 5.57s/it] 48%|████▊ | 10614/22095 [18:04:57<15:38:20, 4.90s/it] {'loss': 0.301, 'grad_norm': 0.6414411152346636, 'learning_rate': 5.560238557160698e-06, 'epoch': 0.48} 48%|████▊ | 10614/22095 [18:04:57<15:38:20, 4.90s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_182623_3/images/before_screenshot_39_id_36_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:02:54.660298 load time: 1810.96 
ms VC:s3://gui-agent/mind2web_train/images/f57e6c0a-8f8b-4756-9f1d-1bdea7a0af5c/images/7.png 2025-08-28 10:02:55.695866 load time: 1224.58 ms 48%|████▊ | 10615/22095 [18:05:01<14:49:01, 4.65s/it] {'loss': 0.3197, 'grad_norm': 0.7575620000309918, 'learning_rate': 5.559510245455649e-06, 'epoch': 0.48} 48%|████▊ | 10615/22095 [18:05:01<14:49:01, 4.65s/it] 48%|████▊ | 10616/22095 [18:05:05<14:01:56, 4.40s/it] {'loss': 0.3071, 'grad_norm': 0.6573322938311212, 'learning_rate': 5.558781921728443e-06, 'epoch': 0.48} 48%|████▊ | 10616/22095 [18:05:05<14:01:56, 4.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43650 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10617/22095 [18:05:08<13:16:48, 4.17s/it] {'loss': 0.3711, 'grad_norm': 0.7094660698496261, 'learning_rate': 5.558053585994729e-06, 'epoch': 0.48} 48%|████▊ | 10617/22095 [18:05:08<13:16:48, 4.17s/it] 48%|████▊ | 10618/22095 [18:05:12<12:39:53, 3.97s/it] {'loss': 0.3501, 'grad_norm': 0.6309626577717088, 'learning_rate': 5.557325238270158e-06, 'epoch': 0.48} 48%|████▊ | 10618/22095 [18:05:12<12:39:53, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10619/22095 [18:05:21<17:30:02, 5.49s/it] {'loss': 0.4785, 'grad_norm': 0.5516977033728775, 'learning_rate': 5.5565968785703795e-06, 'epoch': 0.48} 48%|████▊ | 10619/22095 [18:05:21<17:30:02, 5.49s/it] 48%|████▊ | 10620/22095 [18:05:25<15:44:46, 4.94s/it] {'loss': 0.3558, 'grad_norm': 0.6460332565278315, 'learning_rate': 5.5558685069110444e-06, 'epoch': 0.48} 48%|████▊ | 10620/22095 [18:05:25<15:44:46, 4.94s/it] 48%|████▊ | 10621/22095 [18:05:27<13:40:52, 4.29s/it] {'loss': 0.3051, 'grad_norm': 0.6748996044819889, 'learning_rate': 5.5551401233078e-06, 'epoch': 0.48} 48%|████▊ | 10621/22095 [18:05:27<13:40:52, 4.29s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 10:03:26.967151 load time: 1042.21 ms 48%|████▊ | 10622/22095 [18:05:31<12:55:50, 4.06s/it] {'loss': 0.3576, 'grad_norm': 0.6158170111840895, 'learning_rate': 5.554411727776301e-06, 'epoch': 0.48} 48%|████▊ | 10622/22095 [18:05:31<12:55:50, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41175 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46604 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43388 > 40960). Running this sequence through the model will result in indexing errors 48%|████▊ | 10623/22095 [18:05:35<12:46:29, 4.01s/it] {'loss': 0.3463, 'grad_norm': 0.6420214315777303, 'learning_rate': 5.553683320332196e-06, 'epoch': 0.48} 48%|████▊ | 10623/22095 [18:05:35<12:46:29, 4.01s/it] 48%|████▊ | 10624/22095 [18:05:38<12:01:58, 3.78s/it] {'loss': 0.3237, 'grad_norm': 0.6657582922286736, 'learning_rate': 5.552954900991139e-06, 'epoch': 0.48} 48%|████▊ | 10624/22095 [18:05:38<12:01:58, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10625/22095 [18:05:45<14:40:06, 4.60s/it] {'loss': 0.5012, 'grad_norm': 0.2967738912280458, 'learning_rate': 5.552226469768777e-06, 'epoch': 0.48} 48%|████▊ | 10625/22095 [18:05:45<14:40:06, 4.60s/it] 48%|████▊ | 10626/22095 [18:05:48<13:16:07, 4.16s/it] {'loss': 0.3351, 'grad_norm': 0.6290390226298818, 'learning_rate': 5.551498026680766e-06, 'epoch': 0.48} 48%|████▊ | 10626/22095 [18:05:48<13:16:07, 4.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
48%|████▊ | 10627/22095 [18:05:52<13:37:03, 4.27s/it] {'loss': 0.3395, 'grad_norm': 0.6340394446650136, 'learning_rate': 5.550769571742755e-06, 'epoch': 0.48}
48%|████▊ | 10628/22095 [18:05:55<12:06:40, 3.80s/it] {'loss': 0.3375, 'grad_norm': 0.6845926201588363, 'learning_rate': 5.550041104970398e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (124757 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10629/22095 [18:05:59<12:05:53, 3.80s/it] {'loss': 0.3517, 'grad_norm': 0.637658919649331, 'learning_rate': 5.5493126263793465e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (56502 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44401 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10630/22095 [18:06:03<12:13:59, 3.84s/it] {'loss': 0.3642, 'grad_norm': 0.6112584722416579, 'learning_rate': 5.548584135985253e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
48%|████▊ | 10631/22095 [18:06:09<14:48:33, 4.65s/it] {'loss': 0.464, 'grad_norm': 0.29810893204105327, 'learning_rate': 5.547855633803773e-06, 'epoch': 0.48}
48%|████▊ | 10632/22095 [18:06:16<16:39:01, 5.23s/it] {'loss': 0.4612, 'grad_norm': 0.2780123534377946, 'learning_rate': 5.547127119850557e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
48%|████▊ | 10633/22095 [18:06:19<14:43:15, 4.62s/it] {'loss': 0.319, 'grad_norm': 0.7624268380123517, 'learning_rate': 5.546398594141259e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (55591 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61588 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10634/22095 [18:06:22<13:19:14, 4.18s/it] {'loss': 0.3073, 'grad_norm': 0.6333384073958104, 'learning_rate': 5.545670056691535e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_6.png 2025-08-28 10:04:21.780236 load time: 1060.76 ms
48%|████▊ | 10635/22095 [18:06:25<12:02:21, 3.78s/it] {'loss': 0.324, 'grad_norm': 0.6160571792514077, 'learning_rate': 5.544941507517036e-06, 'epoch': 0.48}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_103/img/step_0.png 2025-08-28 10:04:25.500129 load time: 1141.61 ms
48%|████▊ | 10636/22095 [18:06:28<11:31:08, 3.62s/it] {'loss': 0.3533, 'grad_norm': 0.8222033027521287, 'learning_rate': 5.544212946633418e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png
VC:s3://gui-agent/data_20250624/web/images/yang_0626114758/google_com_0626213618/img/4.png 2025-08-28 10:04:27.021584 load time: 1056.95 ms
2025-08-28 10:04:26.710819 load time: 1209.71 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:04:27.507388 load time: 1732.06 ms
48%|████▊ | 10637/22095 [18:06:35<14:40:27, 4.61s/it] {'loss': 0.4877, 'grad_norm': 0.3127573927438028, 'learning_rate': 5.543484374056336e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (46246 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72550 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10638/22095 [18:06:42<16:22:02, 5.14s/it] {'loss': 0.4931, 'grad_norm': 0.30309197025768586, 'learning_rate': 5.542755789801442e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (46810 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (142083 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10639/22095 [18:06:45<14:45:24, 4.64s/it] {'loss': 0.3357, 'grad_norm': 0.7781188479398874, 'learning_rate': 5.542027193884395e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (111062 > 40960). Running this sequence through the model will result in indexing errors
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:04:45.026037 load time: 1493.9 ms
48%|████▊ | 10640/22095 [18:06:49<14:36:34, 4.59s/it] {'loss': 0.3053, 'grad_norm': 0.6058750001614668, 'learning_rate': 5.541298586320848e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (41591 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10641/22095 [18:06:53<13:26:01, 4.22s/it] {'loss': 0.3496, 'grad_norm': 0.7037355994647424, 'learning_rate': 5.540569967126457e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/mind2web_train/images/6d338fef-6d40-4f08-a045-861ddbc3d9f4/images/2.png 2025-08-28 10:04:54.029990 load time: 1081.12 ms
48%|████▊ | 10642/22095 [18:07:01<17:25:02, 5.47s/it] {'loss': 0.4922, 'grad_norm': 0.2824906190415698, 'learning_rate': 5.539841336316878e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (44677 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10643/22095 [18:07:04<15:15:08, 4.79s/it] {'loss': 0.3252, 'grad_norm': 0.6428244763588142, 'learning_rate': 5.539112693907765e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (45197 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68371 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10644/22095 [18:07:07<13:26:29, 4.23s/it] {'loss': 0.3226, 'grad_norm': 0.8841811706079792, 'learning_rate': 5.538384039914777e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [648, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8468986 in VC:s3://internvl-moe-sft-data/. Exception: Image size [648, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 14339, 'image': 'vrdu_texteq/astro-ph.CO/964a6328-47b5-4116-a0bd-cffa979183de.png', 'image_wh': [[648, 25]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'Note that these formulae are correct when $\\lambda_{\\rm obs}>\\lambda_{\\rm L}$.'}]}
48%|████▊ | 10645/22095 [18:07:12<14:05:43, 4.43s/it] {'loss': 0.2941, 'grad_norm': 0.687525201512995, 'learning_rate': 5.53765537435357e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/a20b9928c84f34609341f80726830037d55d8a74072f967e838e5bc2ee218e1d.png 2025-08-28 10:05:12.524832 load time: 1235.11 ms
48%|████▊ | 10646/22095 [18:07:23<20:15:17, 6.37s/it] {'loss': 0.4843, 'grad_norm': 0.27305283829515287, 'learning_rate': 5.536926697239799e-06, 'epoch': 0.48}
48%|████▊ | 10647/22095 [18:07:31<21:32:27, 6.77s/it] {'loss': 0.4813, 'grad_norm': 0.2857701176133048, 'learning_rate': 5.536198008589123e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/agentnet/win_mac_images/b8072484-3a80-40a2-908c-d92b32571778.png 2025-08-28 10:05:31.135374 load time: 1761.4 ms
48%|████▊ | 10648/22095 [18:07:35<18:48:54, 5.92s/it] {'loss': 0.3229, 'grad_norm': 0.6667747799625429, 'learning_rate': 5.535469308417198e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_1.png 2025-08-28 10:05:34.013115 load time: 1351.0 ms
48%|████▊ | 10649/22095 [18:07:44<22:03:25, 6.94s/it] {'loss': 0.4841, 'grad_norm': 0.2865961501721208, 'learning_rate': 5.5347405967396825e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
48%|████▊ | 10650/22095 [18:07:48<18:46:26, 5.91s/it] {'loss': 0.3061, 'grad_norm': 0.6549681438750801, 'learning_rate': 5.534011873572235e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365278 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32019, 'image': 'vrdu_table_final_2/astro-ph.CO/a2e6b623-0c1c-4b64-acbf-c2ac9ae0e375.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}1\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047792 in VC:s3://multi-modal/UniGeo/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
48%|████▊ | 10651/22095 [18:07:56<20:55:14, 6.58s/it] {'loss': 0.4667, 'grad_norm': 0.3055832816720461, 'learning_rate': 5.533283138930511e-06, 'epoch': 0.48}
48%|████▊ | 10652/22095 [18:08:06<24:16:41, 7.64s/it] {'loss': 0.4667, 'grad_norm': 0.288024145984902, 'learning_rate': 5.532554392830171e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 364, but got module 1
48%|████▊ | 10653/22095 [18:08:09<20:02:21, 6.31s/it] {'loss': 0.2995, 'grad_norm': 0.6439717321302626, 'learning_rate': 5.531825635286872e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_0.png 2025-08-28 10:06:08.644815 load time: 1925.57 ms
48%|████▊ | 10654/22095 [18:08:13<17:31:18, 5.51s/it] {'loss': 0.3659, 'grad_norm': 0.7070447804383497, 'learning_rate': 5.531096866316273e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:06:13.050043 load time: 1032.54 ms
48%|████▊ | 10655/22095 [18:08:16<15:21:05, 4.83s/it] {'loss': 0.3284, 'grad_norm': 0.6090117333768033, 'learning_rate': 5.530368085934036e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_5/images/before_screenshot_51_id_113_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:06:15.288404 load time: 1754.28 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_1/images/before_screenshot_1_id_272_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:06:15.952327 load time: 1399.84 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_4/images/step_0.png 2025-08-28 10:06:16.073663 load time: 1322.45 ms
48%|████▊ | 10656/22095 [18:08:19<13:59:00, 4.40s/it] {'loss': 0.3244, 'grad_norm': 0.649044910192023, 'learning_rate': 5.529639294155815e-06, 'epoch': 0.48}
48%|████▊ | 10657/22095 [18:08:23<13:36:17, 4.28s/it] {'loss': 0.3237, 'grad_norm': 0.6588154605789093, 'learning_rate': 5.528910490997275e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_1/images/step_1.png 2025-08-28 10:06:22.154156 load time: 1282.49 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_8/images/before_screenshot_33_id_84_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:06:22.594096 load time: 2032.04 ms
48%|████▊ | 10658/22095 [18:08:26<12:12:17, 3.84s/it] {'loss': 0.3157, 'grad_norm': 0.6840278031937022, 'learning_rate': 5.528181676474071e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_0.png 2025-08-28 10:06:25.943205 load time: 1060.96 ms
48%|████▊ | 10659/22095 [18:08:29<11:15:25, 3.54s/it] {'loss': 0.3481, 'grad_norm': 0.7609837095962596, 'learning_rate': 5.527452850601864e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:06:27.821021 load time: 1332.65 ms
48%|████▊ | 10660/22095 [18:08:33<11:17:25, 3.55s/it] {'loss': 0.3152, 'grad_norm': 0.766232069555319, 'learning_rate': 5.526724013396317e-06, 'epoch': 0.48}
48%|████▊ | 10661/22095 [18:08:36<10:58:59, 3.46s/it] {'loss': 0.2826, 'grad_norm': 0.870320948169904, 'learning_rate': 5.5259951648730885e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [717, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8523653 in VC:s3://internvl-moe-sft-data/. Exception: Image size [717, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 161719, 'image': 'vrdu_texteq/astro-ph.CO/ce215c51-527d-4b9f-bef7-a09590256677.png', 'image_wh': [[717, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'The number of e-folds is related to the curvaton field $\\sigma$ via'}]}
VC:s3://gui-agent/data_20250630/windows_augment/images/AI/handmade_annotation_2/images/Ai_5_id_8_internvl_element-caption_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:06:35.382363 load time: 1128.38 ms
48%|████▊ | 10662/22095 [18:08:40<11:38:26, 3.67s/it] {'loss': 0.3707, 'grad_norm': 0.7347925460790161, 'learning_rate': 5.525266305047838e-06, 'epoch': 0.48}
48%|████▊ | 10663/22095 [18:08:43<11:04:23, 3.49s/it] {'loss': 0.3407, 'grad_norm': 0.6507573047099565, 'learning_rate': 5.52453743393623e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (50917 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44785 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10664/22095 [18:08:46<10:53:37, 3.43s/it] {'loss': 0.3185, 'grad_norm': 0.6773965666426126, 'learning_rate': 5.523808551553922e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (117696 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10665/22095 [18:08:51<11:36:03, 3.65s/it] {'loss': 0.3734, 'grad_norm': 0.726493244938831, 'learning_rate': 5.523079657916578e-06, 'epoch': 0.48}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250630/windows_augment/images/DR/handmade_annotation_1/images/DR_9_id_21_internvl_appearance_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:06:49.324653 load time: 1377.73 ms
48%|████▊ | 10666/22095 [18:08:54<11:43:42, 3.69s/it] {'loss': 0.3267, 'grad_norm': 0.7505689732391188, 'learning_rate': 5.522350753039858e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250407/web/images/adidas_com_cn/trajectory_48/img/step_1.png 2025-08-28 10:06:54.199602 load time: 1025.35 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_3/images/step_0.png 2025-08-28 10:06:53.794510 load time: 1446.05 ms
48%|████▊ | 10667/22095 [18:08:57<11:02:32, 3.48s/it] {'loss': 0.3462, 'grad_norm': 0.6373008470479439, 'learning_rate': 5.521621836939424e-06, 'epoch': 0.48}
48%|████▊ | 10668/22095 [18:09:00<10:38:50, 3.35s/it] {'loss': 0.347, 'grad_norm': 0.656718497793086, 'learning_rate': 5.520892909630939e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:06:59.154618 load time: 1255.44 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_5.png 2025-08-28 10:07:00.818974 load time: 1047.59 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:07:00.901602 load time: 1177.81 ms
48%|████▊ | 10669/22095 [18:09:04<11:08:53, 3.51s/it] {'loss': 0.3255, 'grad_norm': 0.7163210793168534, 'learning_rate': 5.520163971130066e-06, 'epoch': 0.48}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/images/blender/handmade_annotation_2/images/blender(11)_id_6_internvl_appearance_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:07:03.901444 load time: 1002.15 ms
48%|████▊ | 10670/22095 [18:09:07<10:30:53, 3.31s/it] {'loss': 0.3682, 'grad_norm': 0.8900367229797369, 'learning_rate': 5.519435021452466e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 10:07:04.317465 load time: 1589.8 ms
VC:s3://gui-agent/data_20250630/windows_data_20250703/images/os_windows/handmade_annotation_7/images/Windows_1.PNG 2025-08-28 10:07:07.646446 load time: 1052.58 ms
48%|████▊ | 10671/22095 [18:09:10<10:31:40, 3.32s/it] {'loss': 0.3681, 'grad_norm': 0.6803573374057947, 'learning_rate': 5.518706060613805e-06, 'epoch': 0.48}
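Editor's note: the dataloader retries above all trace back to the same check, `ValueError: Image size ... is too small. Minimum size is 28.`, and every failing record carries its dimensions in an `image_wh` field. A minimal sketch of filtering such records out ahead of training, assuming only that samples expose `image_wh` as `[width, height]` pairs the way the "Problematic sample" dumps do (the helper name and sample dicts below are ours, not the training script's):

```python
MIN_IMAGE_SIDE = 28  # minimum side length enforced by the dataset code in the log


def is_image_large_enough(sample, min_side=MIN_IMAGE_SIDE):
    """True if every [width, height] pair in the sample meets the minimum.

    Samples without an 'image_wh' field pass trivially (all() over an
    empty sequence is True), mirroring text-only records.
    """
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))


# Hypothetical records shaped like the failing ones in the log:
samples = [
    {"image_wh": [[648, 25]]},   # rejected: height 25 < 28
    {"image_wh": [[0, 0]]},      # rejected: degenerate size, as seen above
    {"image_wh": [[640, 480]]},  # kept
]
kept = [s for s in samples if is_image_large_enough(s)]
```

Filtering once when the manifest is built avoids paying the fetch-and-retry cost at every step, which is where the `[Try #0] Failed to fetch sample ...` lines come from.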
48%|████▊ | 10671/22095 [18:09:10<10:31:40, 3.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/terminal/95886ee2-46c8-4a0f-865b-9ddbfb2af444/images/step_2.png 2025-08-28 10:07:10.918946 load time: 1004.07 ms 48%|████▊ | 10672/22095 [18:09:15<11:30:52, 3.63s/it] {'loss': 0.483, 'grad_norm': 0.41030123308996586, 'learning_rate': 5.5179770886297405e-06, 'epoch': 0.48} 48%|████▊ | 10672/22095 [18:09:15<11:30:52, 3.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 48%|████▊ | 10673/22095 [18:09:18<11:03:25, 3.49s/it] {'loss': 0.3629, 'grad_norm': 0.6613579069249995, 'learning_rate': 5.517248105515941e-06, 'epoch': 0.48} 48%|████▊ | 10673/22095 [18:09:18<11:03:25, 3.49s/it] 48%|████▊ | 10674/22095 [18:09:22<11:24:59, 3.60s/it] {'loss': 0.313, 'grad_norm': 0.7609906339983002, 'learning_rate': 5.5165191112880674e-06, 'epoch': 0.48} 48%|████▊ | 10674/22095 [18:09:22<11:24:59, 3.60s/it] 48%|████▊ | 10675/22095 [18:09:25<10:47:49, 3.40s/it] {'loss': 0.3746, 'grad_norm': 0.6434155323688646, 'learning_rate': 5.515790105961785e-06, 'epoch': 0.48} 48%|████▊ | 10675/22095 [18:09:25<10:47:49, 3.40s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/2acc5969-d89d-471b-9f90-58fc22597739/images/step_0.png 2025-08-28 10:07:24.719014 load time: 1020.81 ms 48%|████▊ | 10676/22095 [18:09:29<11:26:21, 3.61s/it] {'loss': 0.3438, 'grad_norm': 0.6547562892095287, 'learning_rate': 5.515061089552758e-06, 'epoch': 0.48} 48%|████▊ | 10676/22095 [18:09:29<11:26:21, 3.61s/it] 48%|████▊ | 10677/22095 [18:09:32<10:45:41, 3.39s/it] {'loss': 0.362, 'grad_norm': 0.6847596676297534, 'learning_rate': 5.514332062076649e-06, 'epoch': 0.48} 48%|████▊ | 10677/22095 [18:09:32<10:45:41, 3.39s/it] 48%|████▊ | 10678/22095 [18:09:35<10:22:44, 3.27s/it] 
{'loss': 0.3041, 'grad_norm': 0.6315597904582344, 'learning_rate': 5.513603023549124e-06, 'epoch': 0.48} 48%|████▊ | 10678/22095 [18:09:35<10:22:44, 3.27s/it] 48%|████▊ | 10679/22095 [18:09:38<10:08:18, 3.20s/it] {'loss': 0.3337, 'grad_norm': 0.9328268878726889, 'learning_rate': 5.512873973985847e-06, 'epoch': 0.48} 48%|████▊ | 10679/22095 [18:09:38<10:08:18, 3.20s/it] 48%|████▊ | 10680/22095 [18:09:41<10:11:37, 3.21s/it] {'loss': 0.3443, 'grad_norm': 0.6916736855340677, 'learning_rate': 5.512144913402485e-06, 'epoch': 0.48} 48%|████▊ | 10680/22095 [18:09:41<10:11:37, 3.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8304433 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1bPtGLXXXXXaEXFXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请帮忙详细地输出出图片中的所有文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n尚洁\n2016新款\n物理灭蚊\n母婴推荐\n灭蚊器\n光触媒灭蚊灯\n无辐射\nAPP20:00-23:59\n限时秒杀\n正品联保\n¥\n156\n原价256'}]} 48%|████▊ | 10681/22095 [18:09:45<11:20:35, 3.58s/it] {'loss': 0.3758, 'grad_norm': 0.6105004747411769, 'learning_rate': 5.5114158418147005e-06, 'epoch': 0.48} 48%|████▊ | 10681/22095 [18:09:45<11:20:35, 3.58s/it] 48%|████▊ | 10682/22095 [18:09:48<10:38:23, 3.36s/it] {'loss': 0.348, 'grad_norm': 0.6197122850557252, 'learning_rate': 5.51068675923816e-06, 'epoch': 0.48} 48%|████▊ | 10682/22095 [18:09:48<10:38:23, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045967 in VC:s3://multi-modal/UniGeo/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]} 48%|████▊ | 10683/22095 [18:09:52<10:51:25, 3.42s/it] {'loss': 0.296, 'grad_norm': 0.6173874073638086, 'learning_rate': 5.50995766568853e-06, 'epoch': 0.48} 48%|████▊ | 10683/22095 [18:09:52<10:51:25, 3.42s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/f9be7ed3-49aa-4f23-a176-7af6afdfae84/images/step_6.png 2025-08-28 10:07:51.793307 load time: 1433.0 ms 48%|████▊ | 10684/22095 [18:09:56<11:17:30, 3.56s/it] {'loss': 0.3266, 'grad_norm': 0.7139824073742472, 'learning_rate': 5.509228561181476e-06, 'epoch': 0.48} 48%|████▊ | 10684/22095 [18:09:56<11:17:30, 3.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [442, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8456950 in VC:s3://internvl-moe-sft-data/. Exception: Image size [442, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 68663, 'image': 'vrdu_texteq/astro-ph.CO/4bd19355-5fed-40ad-a0ba-e08a09e66c1d.png', 'image_wh': [[442, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where summation over $a$ is assumed.'}]} 48%|████▊ | 10685/22095 [18:09:59<10:57:49, 3.46s/it] {'loss': 0.3495, 'grad_norm': 0.6004396779219929, 'learning_rate': 5.508499445732664e-06, 'epoch': 0.48} 48%|████▊ | 10685/22095 [18:09:59<10:57:49, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:07:58.696638 load time: 1240.76 ms 48%|████▊ | 10686/22095 [18:10:08<16:23:18, 5.17s/it] {'loss': 0.4929, 'grad_norm': 0.3708540694305002, 'learning_rate': 5.507770319357762e-06, 'epoch': 0.48} 48%|████▊ | 10686/22095 [18:10:08<16:23:18, 5.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57503 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67984 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46506 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94068 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10687/22095 [18:10:12<14:51:47, 4.69s/it] {'loss': 0.3306, 'grad_norm': 0.6407768627436697, 'learning_rate': 5.507041182072434e-06, 'epoch': 0.48} 48%|████▊ | 10687/22095 [18:10:12<14:51:47, 4.69s/it] 48%|████▊ | 10688/22095 [18:10:15<13:39:01, 4.31s/it] {'loss': 0.3176, 'grad_norm': 0.6358955180217345, 'learning_rate': 5.506312033892348e-06, 'epoch': 0.48} 48%|████▊ | 10688/22095 [18:10:15<13:39:01, 4.31s/it] 48%|████▊ | 10689/22095 [18:10:18<12:14:47, 3.87s/it] {'loss': 0.3283, 'grad_norm': 0.6569460673575273, 'learning_rate': 5.505582874833172e-06, 'epoch': 0.48} 48%|████▊ | 10689/22095 [18:10:18<12:14:47, 3.87s/it]VC:s3://gui-agent/data_20250630/mac/images/terminal/82454962-a1bb-4086-b60b-2a998b5fbb4e/images/step_0.png 2025-08-28 10:08:17.183037 load time: 1026.63 ms 48%|████▊ | 10690/22095 [18:10:21<11:15:25, 3.55s/it] {'loss': 0.3068, 'grad_norm': 0.644320307937128, 'learning_rate': 5.5048537049105725e-06, 'epoch': 0.48} 48%|████▊ | 10690/22095 [18:10:21<11:15:25, 3.55s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 48%|████▊ | 10691/22095 [18:10:24<11:13:06, 3.54s/it] {'loss': 0.3419, 'grad_norm': 0.7781673045915627, 'learning_rate': 5.504124524140218e-06, 'epoch': 0.48} 48%|████▊ | 10691/22095 [18:10:24<11:13:06, 3.54s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 10:08:23.045693 load time: 1033.2 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_1/images/step_0.png 2025-08-28 10:08:23.045797 load time: 1869.52 ms 48%|████▊ | 10692/22095 [18:10:28<11:35:08, 3.66s/it] {'loss': 0.3007, 'grad_norm': 0.6220391667566061, 'learning_rate': 5.503395332537775e-06, 'epoch': 0.48} 48%|████▊ | 10692/22095 [18:10:28<11:35:08, 3.66s/it] 48%|████▊ | 10693/22095 [18:10:32<11:19:32, 3.58s/it] {'loss': 0.3194, 
'grad_norm': 0.6090721522407699, 'learning_rate': 5.502666130118912e-06, 'epoch': 0.48} 48%|████▊ | 10693/22095 [18:10:32<11:19:32, 3.58s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_202007_3/images/before_screenshot_24_id_125_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:08:30.355250 load time: 1583.81 ms VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250507_011522_1/images/before_screenshot_3_id_108_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:08:31.593952 load time: 1235.72 ms 48%|████▊ | 10694/22095 [18:10:35<11:09:10, 3.52s/it] {'loss': 0.3303, 'grad_norm': 0.6498239850687649, 'learning_rate': 5.501936916899299e-06, 'epoch': 0.48} 48%|████▊ | 10694/22095 [18:10:35<11:09:10, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 48%|████▊ | 10695/22095 [18:10:42<14:02:37, 4.43s/it] {'loss': 0.504, 'grad_norm': 0.5225679995389199, 'learning_rate': 5.5012076928946035e-06, 'epoch': 0.48} 48%|████▊ | 10695/22095 [18:10:42<14:02:37, 4.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45554 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81447 > 40960). 
Running this sequence through the model will result in indexing errors 48%|████▊ | 10696/22095 [18:10:45<12:54:41, 4.08s/it] {'loss': 0.3156, 'grad_norm': 0.6396807582547848, 'learning_rate': 5.500478458120493e-06, 'epoch': 0.48} 48%|████▊ | 10696/22095 [18:10:45<12:54:41, 4.08s/it] 48%|████▊ | 10697/22095 [18:10:49<12:37:48, 3.99s/it] {'loss': 0.3393, 'grad_norm': 0.6745967780966945, 'learning_rate': 5.499749212592638e-06, 'epoch': 0.48} 48%|████▊ | 10697/22095 [18:10:49<12:37:48, 3.99s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [225, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8456965 in VC:s3://internvl-moe-sft-data/. Exception: Image size [225, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 81691, 'image': 'vrdu_texteq/astro-ph.CO/f4ed583d-1838-4ada-acc0-53a038c6c70b.png', 'image_wh': [[225, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'where $A$ and $h$ are'}]}
48%|████▊ | 10698/22095 [18:10:52<11:55:14, 3.77s/it] {'loss': 0.3375, 'grad_norm': 0.5948531132657466, 'learning_rate': 5.499019956326707e-06, 'epoch': 0.48}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38869.png 2025-08-28 10:08:50.586497 load time: 1570.71 ms
48%|████▊ | 10699/22095 [18:10:55<11:46:06, 3.72s/it] {'loss': 0.365, 'grad_norm': 0.7187557241178281, 'learning_rate': 5.498290689338369e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250504_153105_5/images/before_screenshot_50_id_136_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:08:54.193321 load time: 1445.13 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f45219d4cc35521265a9fcbd416d3c17fc3501857a99990c36eb3d855895b1f9.png 2025-08-28 10:08:55.207587 load time: 1571.17 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/37837.png 2025-08-28 10:08:54.733078 load time: 2133.64 ms
48%|████▊ | 10700/22095 [18:10:59<12:00:48, 3.80s/it] {'loss': 0.3192, 'grad_norm': 0.6229183218355234, 'learning_rate': 5.497561411643295e-06, 'epoch': 0.48}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
48%|████▊ | 10701/22095 [18:11:03<12:00:07, 3.79s/it] {'loss': 0.3165, 'grad_norm': 0.65019446915603, 'learning_rate': 5.496832123257154e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/settings/9db8b077-9fe1-4e64-9ae8-95dea252fdb4/images/step_1.png 2025-08-28 10:09:03.157651 load time: 1222.62 ms
48%|████▊ | 10702/22095 [18:11:06<11:17:20, 3.57s/it] {'loss': 0.3165, 'grad_norm': 0.6392369740491995, 'learning_rate': 5.496102824195618e-06, 'epoch': 0.48}
48%|████▊ | 10703/22095 [18:11:09<11:00:09, 3.48s/it] {'loss': 0.3322, 'grad_norm': 0.6310505782519518, 'learning_rate': 5.495373514474356e-06, 'epoch': 0.48}
Invalidate trace cache @ step 2: expected module 1, but got module 364
48%|████▊ | 10704/22095 [18:11:18<15:21:18, 4.85s/it] {'loss': 0.4691, 'grad_norm': 0.3597350218284494, 'learning_rate': 5.494644194109037e-06, 'epoch': 0.48}
48%|████▊ | 10705/22095 [18:11:21<13:56:58, 4.41s/it] {'loss': 0.3297, 'grad_norm': 0.7423058383703874, 'learning_rate': 5.493914863115334e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/calculator/d1e543d1-751f-4201-8047-6e1e7f14dec2/images/step_4.png 2025-08-28 10:09:19.056032 load time: 1221.37 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 10:09:21.331861 load time: 1158.71 ms
48%|████▊ | 10706/22095 [18:11:24<12:52:44, 4.07s/it] {'loss': 0.3338, 'grad_norm': 0.6461765373067897, 'learning_rate': 5.493185521508918e-06, 'epoch': 0.48}
48%|████▊ | 10707/22095 [18:11:28<12:49:01, 4.05s/it] {'loss': 0.3471, 'grad_norm': 0.6258937227965276, 'learning_rate': 5.492456169305459e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308305 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2aCrAgtbJ8KJjy1zjXXaqapXa_!!3327750394.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请帮我从这张图片中提炼出全部文字内容。'}, {'from': 'gpt', 'value': '图中所有文字:\n购买\n手柄扁圆设计,握感舒适\n2\n理由\n挂钩设计,方便悬挂收纳\n手柄边缘光滑,握感舒适'}]}
48%|████▊ | 10708/22095 [18:11:32<12:07:17, 3.83s/it] {'loss': 0.2936, 'grad_norm': 0.6575981123752621, 'learning_rate': 5.49172680652063e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906518 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29671, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nA. 16cm\nB. 10cm\nC. 5cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 10:09:30.312416 load time: 1029.63 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f45219d4cc35521265a9fcbd416d3c17fc3501857a99990c36eb3d855895b1f9.png 2025-08-28 10:09:31.124368 load time: 1018.43 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:09:31.843723 load time: 1273.44 ms
48%|████▊ | 10709/22095 [18:11:35<11:33:12, 3.65s/it] {'loss': 0.3563, 'grad_norm': 0.6326251364558374, 'learning_rate': 5.490997433170102e-06, 'epoch': 0.48}
48%|████▊ | 10710/22095 [18:11:38<11:15:20, 3.56s/it] {'loss': 0.3364, 'grad_norm': 0.621611401740867, 'learning_rate': 5.490268049269547e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (45685 > 40960).
Running this sequence through the model will result in indexing errors
48%|████▊ | 10711/22095 [18:11:41<10:45:09, 3.40s/it] {'loss': 0.3161, 'grad_norm': 0.6495453676007767, 'learning_rate': 5.489538654834638e-06, 'epoch': 0.48}
48%|████▊ | 10712/22095 [18:11:44<10:07:40, 3.20s/it] {'loss': 0.3478, 'grad_norm': 0.7567639136682123, 'learning_rate': 5.488809249881046e-06, 'epoch': 0.48}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38214.png 2025-08-28 10:09:42.655874 load time: 1043.77 ms
48%|████▊ | 10713/22095 [18:11:47<9:48:26, 3.10s/it] {'loss': 0.3391, 'grad_norm': 0.6084504260324419, 'learning_rate': 5.488079834424446e-06, 'epoch': 0.48}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_2.png 2025-08-28 10:09:47.166373 load time: 1241.4 ms
48%|████▊ | 10714/22095 [18:11:50<10:02:52, 3.18s/it] {'loss': 0.3693, 'grad_norm': 0.6199904202505113, 'learning_rate': 5.487350408480507e-06, 'epoch': 0.48}
48%|████▊ | 10715/22095 [18:11:53<10:01:26, 3.17s/it] {'loss': 0.3143, 'grad_norm': 0.694471474371993, 'learning_rate': 5.486620972064907e-06, 'epoch': 0.48}
Token indices sequence length is longer than the specified maximum sequence length for this model (60497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43170 > 40960). Running this sequence through the model will result in indexing errors
48%|████▊ | 10716/22095 [18:11:57<10:24:19, 3.29s/it] {'loss': 0.3475, 'grad_norm': 0.6589954660090999, 'learning_rate': 5.485891525193316e-06, 'epoch': 0.48}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892994 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16147, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_1/images/step_3.png 2025-08-28 10:09:55.959567 load time: 1008.25 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_212129_1/images/before_screenshot_1_id_50_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:09:56.246835 load time: 1545.7 ms
49%|████▊ | 10717/22095 [18:12:00<10:03:15, 3.18s/it] {'loss': 0.3641, 'grad_norm': 0.7121177870279708, 'learning_rate': 5.485162067881407e-06, 'epoch': 0.49}
49%|████▊ | 10718/22095 [18:12:03<10:09:51, 3.22s/it] {'loss': 0.2968, 'grad_norm': 0.6038698862707603, 'learning_rate': 5.484432600144857e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▊ | 10719/22095 [18:12:11<14:25:16, 4.56s/it] {'loss': 0.4761, 'grad_norm': 0.3526127088415516, 'learning_rate': 5.483703121999337e-06, 'epoch': 0.49}
49%|████▊ | 10720/22095 [18:12:15<14:23:44, 4.56s/it] {'loss': 0.3395, 'grad_norm': 0.6175444607040002, 'learning_rate': 5.482973633460524e-06, 'epoch': 0.49}
49%|████▊ | 10721/22095 [18:12:18<13:05:02, 4.14s/it] {'loss': 0.3059, 'grad_norm': 0.6735534842917275, 'learning_rate': 5.48224413454409e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358178 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24889, 'image': 'vrdu_table_final_2/astro-ph.CO/9aa1e143-05ca-4fbb-bd88-f61c62aadf50.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
49%|████▊ | 10722/22095 [18:12:22<12:08:12, 3.84s/it] {'loss': 0.3289, 'grad_norm': 0.6806734732368638, 'learning_rate': 5.481514625265709e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:10:20.392076 load time: 1086.18 ms
49%|████▊ | 10723/22095 [18:12:26<13:04:33, 4.14s/it] {'loss': 0.3057, 'grad_norm': 0.6232723889284854, 'learning_rate': 5.480785105641061e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_3/images/step_0.png 2025-08-28 10:10:26.210391 load time: 1209.45 ms
49%|████▊ | 10724/22095 [18:12:37<18:49:43, 5.96s/it] {'loss': 0.4743, 'grad_norm': 0.3112196698409816, 'learning_rate': 5.480055575685815e-06, 'epoch': 0.49}
49%|████▊ | 10725/22095 [18:12:41<17:19:15, 5.48s/it] {'loss': 0.3451, 'grad_norm': 0.6491875086620946, 'learning_rate': 5.479326035415651e-06, 'epoch': 0.49}
49%|████▊ | 10726/22095 [18:12:44<15:04:58, 4.78s/it] {'loss': 0.3236, 'grad_norm': 0.5653696631158297, 'learning_rate': 5.47859648484624e-06, 'epoch': 0.49}
49%|████▊ | 10727/22095 [18:12:48<14:11:14, 4.49s/it] {'loss': 0.3604, 'grad_norm': 0.6438329748116453, 'learning_rate': 5.477866923993262e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/vivado/20250508_133202_871894_3985_1/images/before_screenshot_1_id_0_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:10:47.928658 load time: 1405.74 ms
49%|████▊ | 10728/22095 [18:12:51<13:04:53, 4.14s/it] {'loss': 0.3349, 'grad_norm': 0.6606980186635552, 'learning_rate': 5.477137352872393e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/calculator/d1e543d1-751f-4201-8047-6e1e7f14dec2/images/step_8.png 2025-08-28 10:10:51.395917 load time: 1399.44 ms
49%|████▊ | 10729/22095 [18:12:54<11:55:24, 3.78s/it] {'loss': 0.3571, 'grad_norm': 0.6504920369442497, 'learning_rate': 5.476407771499305e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (52207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58362 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59973 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10730/22095 [18:12:57<10:55:53, 3.46s/it] {'loss': 0.3178, 'grad_norm': 0.6187465532547302, 'learning_rate': 5.475678179889678e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_5/images/step_6.png 2025-08-28 10:10:56.974260 load time: 1119.86 ms
49%|████▊ | 10731/22095 [18:13:01<11:11:37, 3.55s/it] {'loss': 0.3314, 'grad_norm': 0.618073859499965, 'learning_rate': 5.474948578059188e-06, 'epoch': 0.49}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▊ | 10732/22095 [18:13:04<10:53:15, 3.45s/it] {'loss': 0.2919, 'grad_norm': 0.60108084486261, 'learning_rate': 5.474218966023512e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (98621 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124958 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10733/22095 [18:13:08<11:17:42, 3.58s/it] {'loss': 0.3514, 'grad_norm': 0.6285127287027362, 'learning_rate': 5.473489343798327e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/reminders/375bea2e-27d3-4521-aeef-5f7b6fcb5726/images/step_0.png 2025-08-28 10:11:07.021953 load time: 1160.05 ms
49%|████▊ | 10734/22095 [18:13:11<10:38:16, 3.37s/it] {'loss': 0.302, 'grad_norm': 0.5864677323792765, 'learning_rate': 5.472759711399311e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/settings/d9f91a67-b7b2-4c00-add0-da44a4621f69/images/step_0.png 2025-08-28 10:11:07.856528 load time: 1460.89 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/9db8b077-9fe1-4e64-9ae8-95dea252fdb4/images/step_0.png 2025-08-28 10:11:09.756428 load time: 1055.69 ms
49%|████▊ | 10735/22095 [18:13:14<10:52:53, 3.45s/it] {'loss': 0.3118, 'grad_norm': 0.6108680980516022, 'learning_rate': 5.472030068842139e-06, 'epoch': 0.49}
49%|████▊ | 10736/22095 [18:13:18<11:04:52, 3.51s/it] {'loss': 0.3982, 'grad_norm': 0.6878846950476892, 'learning_rate': 5.471300416142492e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (71571 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10737/22095 [18:13:22<11:22:11, 3.60s/it] {'loss': 0.3123, 'grad_norm': 0.5736098866508658, 'learning_rate': 5.470570753316046e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46672 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57942 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44071 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50770 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43402 > 40960).
Running this sequence through the model will result in indexing errors
49%|████▊ | 10738/22095 [18:13:30<15:28:59, 4.91s/it] {'loss': 0.486, 'grad_norm': 0.39332779899616394, 'learning_rate': 5.469841080378479e-06, 'epoch': 0.49}
49%|████▊ | 10739/22095 [18:13:34<14:47:11, 4.69s/it] {'loss': 0.2917, 'grad_norm': 0.6134887660443886, 'learning_rate': 5.469111397345471e-06, 'epoch': 0.49}
49%|████▊ | 10740/22095 [18:13:37<13:29:21, 4.28s/it] {'loss': 0.3249, 'grad_norm': 0.616320225429893, 'learning_rate': 5.468381704232699e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▊ | 10741/22095 [18:13:47<18:16:00, 5.79s/it] {'loss': 0.4679, 'grad_norm': 0.30036757827431215, 'learning_rate': 5.467652001055844e-06, 'epoch': 0.49}
49%|████▊ | 10742/22095 [18:13:50<15:49:46, 5.02s/it] {'loss': 0.314, 'grad_norm': 0.627407516553, 'learning_rate': 5.466922287830584e-06, 'epoch': 0.49}
VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/vision/test_468_image.png 2025-08-28 10:11:49.035499 load time: 1179.65 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 10:11:49.173142 load time: 1141.24 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:11:50.745022 load time: 1128.85 ms
49%|████▊ | 10743/22095 [18:13:54<14:41:53, 4.66s/it] {'loss': 0.36, 'grad_norm': 0.6642704155718921, 'learning_rate': 5.466192564572597e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_211653/images/step_4_id_56_internvl_element-caption_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:11:52.395471 load time: 1006.15 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 10:11:53.760023 load time: 1044.19 ms
VC:s3://gui/aguvis/aguvis-stage1/omniact/images/train_6347.png 2025-08-28 10:11:54.024992 load time: 1137.65 ms
49%|████▊ | 10744/22095 [18:13:57<13:41:48, 4.34s/it] {'loss': 0.3316, 'grad_norm': 0.6556948496626056, 'learning_rate': 5.465462831297564e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (73000 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10745/22095 [18:14:07<19:08:34, 6.07s/it] {'loss': 0.4845, 'grad_norm': 0.3150740101579165, 'learning_rate': 5.464733088021165e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/inventor/20250513_143703_755135_4964_1/images/before_screenshot_1_id_0_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:12:06.109941 load time: 1484.09 ms
49%|████▊ | 10746/22095 [18:14:17<22:30:13, 7.14s/it] {'loss': 0.4992, 'grad_norm': 0.3502713140017343, 'learning_rate': 5.464003334759077e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250612/mac/images/finder/1fe1ca62-7e4a-4d85-af6c-e650a9c51129/images/step_1.png 2025-08-28 10:12:16.680478 load time: 1187.46 ms
49%|████▊ | 10747/22095 [18:14:20<18:54:34, 6.00s/it] {'loss': 0.324, 'grad_norm': 0.675623605474163, 'learning_rate': 5.463273571526985e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 10:12:19.581482 load time: 1117.33 ms
49%|████▊ | 10748/22095 [18:14:24<16:16:32, 5.16s/it] {'loss': 0.3361, 'grad_norm': 0.6686506686394733, 'learning_rate': 5.462543798340565e-06, 'epoch': 0.49}
49%|████▊ | 10749/22095 [18:14:27<14:41:49, 4.66s/it] {'loss': 0.3557, 'grad_norm': 0.6719577676699698, 'learning_rate': 5.4618140152155e-06, 'epoch': 0.49}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20475.png 2025-08-28 10:12:24.434924 load time: 1536.87 ms
49%|████▊ | 10750/22095 [18:14:31<13:46:03, 4.37s/it] {'loss': 0.3191, 'grad_norm': 0.625166400770162, 'learning_rate': 5.461084222167471e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:12:30.471393 load time: 1091.85 ms
49%|████▊ | 10751/22095 [18:14:34<12:54:57, 4.10s/it] {'loss': 0.3423, 'grad_norm': 0.6150470753407132, 'learning_rate': 5.460354419212156e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_1/images/step_0.png 2025-08-28 10:12:31.034224 load time: 1744.79 ms
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250508_132635_1/images/before_screenshot_1_id_26_function_0_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 10:12:32.933248 load time: 1243.49 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:12:33.969518 load time: 1175.99 ms
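The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines are Hugging Face tokenizer warnings: the tokenized conversation exceeds the model's 40960-token context, and feeding it through unmodified would index past the position range. A hedged sketch of a pre-tokenization length guard; 40960 is the limit reported above, while the function names are illustrative, not the training code's own:

```python
# Illustrative length guard; MAX_MODEL_LEN comes from the logged warnings,
# everything else is an assumption about how one might handle overflow.
MAX_MODEL_LEN = 40960

def exceeds_context(input_ids, max_len=MAX_MODEL_LEN):
    """Mirror the tokenizer's check: True when the sample would overflow."""
    return len(input_ids) > max_len

def clamp_ids(input_ids, max_len=MAX_MODEL_LEN):
    """Truncate instead of warning; this drops the tail of the conversation,
    which may be preferable to an indexing error at the embedding layer."""
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]
```

Whether to truncate, split, or drop such samples is a data-curation decision; the log shows they are currently passed through with only a warning.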
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_11/images/before_screenshot_54_id_204_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:12:34.059004 load time: 1647.16 ms
49%|████▊ | 10752/22095 [18:14:38<12:32:54, 3.98s/it] {'loss': 0.3481, 'grad_norm': 0.6496271771479472, 'learning_rate': 5.4596246063652405e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▊ | 10753/22095 [18:14:47<17:41:23, 5.61s/it] {'loss': 0.4824, 'grad_norm': 0.35633811231314305, 'learning_rate': 5.458894783642402e-06, 'epoch': 0.49}
49%|████▊ | 10754/22095 [18:14:51<15:29:41, 4.92s/it] {'loss': 0.3248, 'grad_norm': 0.6279048917219101, 'learning_rate': 5.458164951059326e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (113345 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42136 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10755/22095 [18:14:53<13:34:47, 4.31s/it] {'loss': 0.3321, 'grad_norm': 0.8146324246424269, 'learning_rate': 5.457435108631691e-06, 'epoch': 0.49}
49%|████▊ | 10756/22095 [18:14:57<12:47:16, 4.06s/it] {'loss': 0.2981, 'grad_norm': 0.6477127892981791, 'learning_rate': 5.456705256375181e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (43626 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95387 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57267 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47257 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58272 > 40960). Running this sequence through the model will result in indexing errors
49%|████▊ | 10757/22095 [18:15:00<12:13:49, 3.88s/it] {'loss': 0.3271, 'grad_norm': 0.6579613350271492, 'learning_rate': 5.455975394305477e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:12:59.684530 load time: 1123.73 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 10:12:59.743681 load time: 1126.29 ms
49%|████▊ | 10758/22095 [18:15:04<11:54:42, 3.78s/it] {'loss': 0.3102, 'grad_norm': 0.5833393576827078, 'learning_rate': 5.455245522438263e-06, 'epoch': 0.49}
49%|████▊ | 10759/22095 [18:15:07<11:22:30, 3.61s/it] {'loss': 0.3344, 'grad_norm': 0.6247727229746702, 'learning_rate': 5.4545156407892204e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359937 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26658, 'image': 'vrdu_table_final_2/astro-ph.CO/acb631c6-1a1f-446d-b2fe-393cf5205957.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892993 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16146, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
VC:s3://internvl2/datasets/MMMUDataset/MMMU/Agriculture/test_255_image_1.png 2025-08-28 10:13:05.959118 load time: 1812.07 ms
49%|████▊ | 10760/22095 [18:15:11<11:40:31, 3.71s/it] {'loss': 0.2856, 'grad_norm': 0.689951121413877, 'learning_rate': 5.453785749374033e-06, 'epoch': 0.49}
49%|████▊ | 10761/22095 [18:15:15<12:06:43, 3.85s/it] {'loss': 0.348, 'grad_norm': 0.64134476580352, 'learning_rate': 5.453055848208383e-06, 'epoch': 0.49}
49%|████▊ | 10762/22095 [18:15:18<11:17:45, 3.59s/it] {'loss': 0.3274, 'grad_norm': 0.6136023603777939, 'learning_rate': 5.452325937307955e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▊ | 10763/22095 [18:15:26<14:45:48, 4.69s/it] {'loss': 0.4868, 'grad_norm': 0.3744441172659562, 'learning_rate': 5.4515960166884315e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8933827 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56980, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_5/images/step_0.png 2025-08-28 10:13:24.308884 load time: 1367.85 ms
49%|████▊ | 10764/22095 [18:15:30<14:18:19, 4.54s/it] {'loss': 0.3281, 'grad_norm': 0.6212186162849846, 'learning_rate': 5.450866086365496e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 10:13:28.512363 load time: 1051.59 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://internvl2/datasets/MMMUDataset/MMMU/Agriculture/validation_28_image_1.png 2025-08-28 10:13:30.068647 load time: 1107.81 ms
49%|████▊ | 10765/22095 [18:15:41<20:25:50, 6.49s/it] {'loss': 0.4618, 'grad_norm': 0.2842456058853913, 'learning_rate': 5.450136146354834e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250421/web/images/wa_map/trajectory_15/img/step_1.png 2025-08-28 10:13:40.693230 load time: 1002.59 ms
49%|████▊ | 10766/22095 [18:15:45<17:57:22, 5.71s/it] {'loss': 0.3403, 'grad_norm': 0.6214680943609159, 'learning_rate': 5.449406196672129e-06, 'epoch': 0.49}
49%|████▊ | 10767/22095 [18:15:48<15:39:14, 4.97s/it] {'loss': 0.3273, 'grad_norm': 0.6344077106540689, 'learning_rate': 5.448676237333064e-06, 'epoch': 0.49}
49%|████▊ | 10768/22095 [18:15:52<14:24:29, 4.58s/it] {'loss': 0.3384, 'grad_norm': 0.7736892506859431, 'learning_rate': 5.447946268353324e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [256, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8429225 in VC:s3://internvl-moe-sft-data/. Exception: Image size [256, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 94910, 'image': 'vrdu_texteq/astro-ph.CO/8e51deea-e04b-4cef-aa86-f4a96c13c318.png', 'image_wh': [[256, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'where \\( ~ia\\overline{v}_{D}~~= \\overline{u}_{D} \\) .'}]}
49%|████▊ | 10769/22095 [18:15:55<12:53:07, 4.10s/it] {'loss': 0.296, 'grad_norm': 0.6034017418949362, 'learning_rate': 5.447216289748596e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/settings/9f7edf74-7688-4efc-998b-d858abee5e38/images/step_7.png 2025-08-28 10:13:52.302001 load time: 1033.09 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (85860 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44217 > 40960).
Running this sequence through the model will result in indexing errors
49%|████▊ | 10770/22095 [18:15:58<12:11:31, 3.88s/it] {'loss': 0.3702, 'grad_norm': 0.6784861654016594, 'learning_rate': 5.446486301534564e-06, 'epoch': 0.49}
49%|████▊ | 10771/22095 [18:16:01<11:28:21, 3.65s/it] {'loss': 0.2995, 'grad_norm': 0.6188537207264507, 'learning_rate': 5.445756303726913e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 10:13:59.795652 load time: 1128.35 ms
49%|████▉ | 10772/22095 [18:16:05<11:48:48, 3.76s/it] {'loss': 0.3689, 'grad_norm': 0.7374548514598774, 'learning_rate': 5.445026296341325e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_0.png 2025-08-28 10:14:04.767871 load time: 1286.14 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:14:04.839869 load time: 1864.24 ms
49%|████▉ | 10773/22095 [18:16:08<11:16:15, 3.58s/it] {'loss': 0.3288, 'grad_norm': 0.5612933828257421, 'learning_rate': 5.44429627939349e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (47644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65063 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47600 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80562 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10774/22095 [18:16:12<11:40:58, 3.72s/it] {'loss': 0.3163, 'grad_norm': 0.5757263653169219, 'learning_rate': 5.443566252899093e-06, 'epoch': 0.49}
49%|████▉ | 10775/22095 [18:16:15<11:03:55, 3.52s/it] {'loss': 0.3361, 'grad_norm': 0.6066722995925015, 'learning_rate': 5.442836216873819e-06, 'epoch': 0.49}
49%|████▉ | 10776/22095 [18:16:19<11:19:23, 3.60s/it] {'loss': 0.3517, 'grad_norm': 0.6216811471150734, 'learning_rate': 5.442106171333355e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/terminal/95886ee2-46c8-4a0f-865b-9ddbfb2af444/images/step_0.png 2025-08-28 10:14:17.857226 load time: 1404.69 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:14:18.850856 load time: 1317.21 ms
49%|████▉ | 10777/22095 [18:16:22<11:04:22, 3.52s/it] {'loss': 0.3058, 'grad_norm': 0.6070444961735875, 'learning_rate': 5.441376116293388e-06, 'epoch': 0.49}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10778/22095 [18:16:31<15:23:00, 4.89s/it] {'loss': 0.4778, 'grad_norm': 0.4419608326387728, 'learning_rate': 5.4406460517696035e-06, 'epoch': 0.49}
49%|████▉ | 10779/22095 [18:16:34<14:19:47, 4.56s/it] {'loss': 0.297, 'grad_norm': 0.6615232986768225, 'learning_rate': 5.439915977777689e-06, 'epoch': 0.49}
49%|████▉ | 10780/22095 [18:16:37<12:39:31, 4.03s/it] {'loss': 0.287, 'grad_norm': 0.6232505080950455, 'learning_rate': 5.43918589433333e-06, 'epoch': 0.49}
49%|████▉ | 10781/22095 [18:16:41<12:18:25, 3.92s/it] {'loss': 0.3024, 'grad_norm': 0.6065556060776699, 'learning_rate': 5.438455801452216e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▉ | 10782/22095 [18:16:50<17:37:42, 5.61s/it] {'loss': 0.4714, 'grad_norm': 0.2888300571280535, 'learning_rate': 5.437725699150031e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (46031 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109846 > 40960).
Running this sequence through the model will result in indexing errors
49%|████▉ | 10783/22095 [18:16:54<15:50:22, 5.04s/it] {'loss': 0.3645, 'grad_norm': 0.6129089740283804, 'learning_rate': 5.43699558744247e-06, 'epoch': 0.49}
49%|████▉ | 10784/22095 [18:16:57<13:58:05, 4.45s/it] {'loss': 0.3244, 'grad_norm': 0.6321064368855184, 'learning_rate': 5.4362654663452115e-06, 'epoch': 0.49}
49%|████▉ | 10785/22095 [18:17:01<13:30:36, 4.30s/it] {'loss': 0.3237, 'grad_norm': 0.6236913706818845, 'learning_rate': 5.435535335873951e-06, 'epoch': 0.49}
49%|████▉ | 10786/22095 [18:17:04<12:21:55, 3.94s/it] {'loss': 0.3217, 'grad_norm': 0.6794406833180068, 'learning_rate': 5.434805196044372e-06, 'epoch': 0.49}
49%|████▉ | 10787/22095 [18:17:07<11:35:28, 3.69s/it] {'loss': 0.3482, 'grad_norm': 0.6015193977905489, 'learning_rate': 5.434075046872165e-06, 'epoch': 0.49}
49%|████▉ | 10788/22095 [18:17:11<11:55:47, 3.80s/it] {'loss': 0.3236, 'grad_norm': 0.7173251165967416, 'learning_rate': 5.4333448883730176e-06, 'epoch': 0.49}
49%|████▉ | 10789/22095 [18:17:16<12:44:56, 4.06s/it] {'loss': 0.3191, 'grad_norm': 0.5847772835682842, 'learning_rate': 5.432614720562621e-06, 'epoch': 0.49}
49%|████▉ | 10790/22095 [18:17:19<11:39:32, 3.71s/it] {'loss': 0.3342, 'grad_norm': 0.564475806463531, 'learning_rate': 5.431884543456662e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (61545 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53857 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56013 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121399 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71741 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113425 > 40960).
Running this sequence through the model will result in indexing errors
49%|████▉ | 10791/22095 [18:17:29<17:20:38, 5.52s/it] {'loss': 0.4891, 'grad_norm': 0.3581390264447621, 'learning_rate': 5.43115435707083e-06, 'epoch': 0.49}
49%|████▉ | 10792/22095 [18:17:35<17:47:50, 5.67s/it] {'loss': 0.4979, 'grad_norm': 0.31725716739829074, 'learning_rate': 5.430424161420817e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10793/22095 [18:17:38<15:31:48, 4.95s/it] {'loss': 0.3351, 'grad_norm': 0.6605082182680612, 'learning_rate': 5.429693956522308e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_2/images/step_12.png 2025-08-28 10:15:35.427272 load time: 1221.81 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10794/22095 [18:17:42<14:20:00, 4.57s/it] {'loss': 0.276, 'grad_norm': 0.6187867157877088, 'learning_rate': 5.428963742390998e-06, 'epoch': 0.49}
49%|████▉ | 10795/22095 [18:17:44<12:38:15, 4.03s/it] {'loss': 0.3166, 'grad_norm': 0.6339357376584758, 'learning_rate': 5.428233519042574e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8944627 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 67780, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:15:43.090695 load time: 1320.73 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/03624071-2760-416c-80f5-d94aa8dfebce/images/step_3.png 2025-08-28 10:15:44.445186 load time: 1173.84 ms
49%|████▉ | 10796/22095 [18:17:48<11:58:04, 3.81s/it] {'loss': 0.3244, 'grad_norm': 0.6775069897284436, 'learning_rate': 5.427503286492727e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_170451_4/images/before_screenshot_46_id_64_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:15:47.740686 load time: 1542.02 ms
49%|████▉ | 10797/22095 [18:17:51<11:22:27, 3.62s/it] {'loss': 0.3392, 'grad_norm': 0.5769499742725703, 'learning_rate': 5.426773044757146e-06, 'epoch': 0.49}
49%|████▉ | 10798/22095 [18:17:55<11:55:20, 3.80s/it] {'loss': 0.3598, 'grad_norm': 0.6189106653236469, 'learning_rate': 5.426042793851525e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10799/22095 [18:18:06<18:50:09, 6.00s/it] {'loss': 0.4796, 'grad_norm': 0.4653135340607896, 'learning_rate': 5.4253125337915514e-06, 'epoch': 0.49}
49%|████▉ | 10800/22095 [18:18:10<16:57:36, 5.41s/it] {'loss': 0.325, 'grad_norm': 0.5879197568638728, 'learning_rate': 5.424582264592919e-06, 'epoch': 0.49}
49%|████▉ | 10801/22095 [18:18:14<15:07:25, 4.82s/it] {'loss': 0.3432, 'grad_norm': 0.6698195305676378, 'learning_rate': 5.423851986271316e-06, 'epoch': 0.49}
49%|████▉ | 10802/22095 [18:18:16<13:06:58, 4.18s/it] {'loss': 0.3195, 'grad_norm': 0.5785312045865859, 'learning_rate': 5.423121698842437e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (56922 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47036 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43256 > 40960).
Running this sequence through the model will result in indexing errors
49%|████▉ | 10803/22095 [18:18:20<12:23:21, 3.95s/it] {'loss': 0.317, 'grad_norm': 0.6450008538019762, 'learning_rate': 5.422391402321971e-06, 'epoch': 0.49}
49%|████▉ | 10804/22095 [18:18:23<12:05:34, 3.86s/it] {'loss': 0.3354, 'grad_norm': 0.6335218164038602, 'learning_rate': 5.421661096725612e-06, 'epoch': 0.49}
49%|████▉ | 10805/22095 [18:18:26<11:23:29, 3.63s/it] {'loss': 0.3516, 'grad_norm': 0.5882818683954184, 'learning_rate': 5.42093078206905e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/terminal/3f2b3961-0642-4fca-9848-82ff1d70c9af/images/step_3.png 2025-08-28 10:16:25.753229 load time: 1030.21 ms
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_98/img/step_0.png 2025-08-28 10:16:26.434669 load time: 1053.52 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/ef46b385-4881-42a9-838d-26099df92ad9/images/step_0.png 2025-08-28 10:16:26.168310 load time: 1607.43 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_5.png 2025-08-28 10:16:27.108069 load time: 1011.54 ms
49%|████▉ | 10806/22095 [18:18:30<10:59:24, 3.50s/it] {'loss': 0.3559, 'grad_norm': 0.6605788221478196, 'learning_rate': 5.42020045836798e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_1.png 2025-08-28 10:16:29.488029 load time: 1055.92 ms
49%|████▉ | 10807/22095 [18:18:34<11:17:50, 3.60s/it] {'loss': 0.3454, 'grad_norm': 0.6831599714525608, 'learning_rate': 5.419470125638091e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:16:33.782902 load time: 1082.77 ms
VC:s3://gui-agent/data_20250623/windows_augment/images/android_studio/2025-06-18_203434/images/step_4_id_34_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:16:33.841010 load time: 1296.48 ms
49%|████▉ | 10808/22095 [18:18:37<11:10:29, 3.56s/it] {'loss': 0.3969, 'grad_norm': 0.820649625790173, 'learning_rate': 5.418739783895079e-06, 'epoch': 0.49}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30197.png 2025-08-28 10:16:33.678274 load time: 1887.88 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 10:16:35.770070 load time: 1328.62 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:16:36.844148 load time: 1010.69 ms
49%|████▉ | 10809/22095 [18:18:40<10:35:16, 3.38s/it] {'loss': 0.3134, 'grad_norm': 0.5944023432998939, 'learning_rate': 5.418009433154633e-06, 'epoch': 0.49}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39147.png 2025-08-28 10:16:37.140519 load time: 1412.23 ms
VC:s3://gui-agent/data_20250630/windows_augment/images/FL/handmade_annotation_1/images/FL_3_id_31_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:16:40.353889 load time: 1098.91 ms
49%|████▉ | 10810/22095 [18:18:44<11:29:13, 3.66s/it] {'loss': 0.3646, 'grad_norm': 0.6119284376862187, 'learning_rate': 5.41727907343245e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250421/web/images/rei_com/trajectory_36/img/step_0.png 2025-08-28 10:16:44.973872 load time: 1079.05 ms
49%|████▉ | 10811/22095 [18:18:53<16:30:51, 5.27s/it] {'loss': 0.5179, 'grad_norm': 0.33465584922221014, 'learning_rate': 5.41654870474422e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/settings/d9f91a67-b7b2-4c00-add0-da44a4621f69/images/step_0.png 2025-08-28 10:16:53.626150 load time: 1530.01 ms
49%|████▉ | 10812/22095 [18:18:57<15:09:50, 4.84s/it] {'loss': 0.3378, 'grad_norm': 0.6614144868552617, 'learning_rate': 5.4158183271056385e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/34181.png 2025-08-28 10:16:57.640713 load time: 3814.42 ms
49%|████▉ | 10813/22095 [18:19:06<19:19:04, 6.16s/it] {'loss': 0.4535, 'grad_norm': 0.2797843285100049, 'learning_rate': 5.415087940532398e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:17:05.155089 load time: 1038.03 ms
49%|████▉ | 10814/22095 [18:19:10<16:33:38, 5.28s/it] {'loss': 0.3209, 'grad_norm': 0.6000761307553717, 'learning_rate': 5.414357545040193e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/calculator/d1e543d1-751f-4201-8047-6e1e7f14dec2/images/step_5.png 2025-08-28 10:17:09.518391 load time: 1058.78 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_5.png 2025-08-28 10:17:09.678330 load time: 1073.59 ms
49%|████▉ | 10815/22095 [18:19:13<14:45:22, 4.71s/it] {'loss': 0.3127, 'grad_norm': 0.6367329656813222, 'learning_rate': 5.413627140644716e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365011 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31752, 'image': 'vrdu_table_final_2/astro-ph.CO/46265eea-889b-4e63-b359-41a8c8ac3475.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
49%|████▉ | 10816/22095 [18:19:16<13:11:17, 4.21s/it] {'loss': 0.3298, 'grad_norm': 0.6852619243332301, 'learning_rate': 5.412896727361663e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (47666 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46550 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45661 > 40960) for 4 sample(s). Truncating to 1004 with 1 samples.
49%|████▉ | 10817/22095 [18:19:19<12:01:37, 3.84s/it] {'loss': 0.2659, 'grad_norm': 0.6331187290160614, 'learning_rate': 5.4121663052067265e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (133060 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10818/22095 [18:19:22<11:05:35, 3.54s/it] {'loss': 0.3313, 'grad_norm': 0.6319135130209138, 'learning_rate': 5.411435874195602e-06, 'epoch': 0.49}
49%|████▉ | 10819/22095 [18:19:25<10:37:39, 3.39s/it] {'loss': 0.3094, 'grad_norm': 0.6181685095469309, 'learning_rate': 5.410705434343985e-06, 'epoch': 0.49}
49%|████▉ | 10820/22095 [18:19:29<11:10:25, 3.57s/it] {'loss': 0.2987, 'grad_norm': 0.5959127812339418, 'learning_rate': 5.409974985667569e-06, 'epoch': 0.49}
49%|████▉ | 10821/22095 [18:19:32<10:53:08, 3.48s/it] {'loss': 0.3174, 'grad_norm': 0.6046669266642816, 'learning_rate': 5.409244528182051e-06, 'epoch': 0.49}
49%|████▉ | 10822/22095 [18:19:37<12:08:53, 3.88s/it] {'loss': 0.32, 'grad_norm': 0.6079813229979458, 'learning_rate': 5.408514061903123e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250509_125727_1/images/before_screenshot_1_id_103_function_1_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 10:17:35.719760 load time: 1731.32 ms
49%|████▉ | 10823/22095 [18:19:40<11:42:16, 3.74s/it] {'loss': 0.3443, 'grad_norm': 0.6102670213077448, 'learning_rate': 5.407783586846484e-06, 'epoch': 0.49}
49%|████▉ | 10824/22095 [18:19:44<11:40:06, 3.73s/it] {'loss': 0.3698, 'grad_norm': 0.6393248736433033, 'learning_rate': 5.407053103027826e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
49%|████▉ | 10825/22095 [18:19:49<12:51:46, 4.11s/it] {'loss': 0.4866, 'grad_norm': 0.36085845063400557, 'learning_rate': 5.40632261046285e-06, 'epoch': 0.49}
49%|████▉ | 10826/22095 [18:19:52<12:02:56, 3.85s/it] {'loss': 0.3598, 'grad_norm': 0.6483175296284592, 'learning_rate': 5.405592109167247e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302979 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1QH7wevNNTKJjSspfXXbXIFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n仅输出照片中的文字内容,不包含文字的排列位置。'}, {'from': 'gpt', 'value': '图中所有文字:\n一套10个装只能绣一个名字\n我爱姓名贴\nWOAIXINGMINGTIE\n陈显伦'}]}
49%|████▉ | 10827/22095 [18:19:56<11:35:17, 3.70s/it] {'loss': 0.2913, 'grad_norm': 0.6434207102297304, 'learning_rate': 5.404861599156715e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8894381 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
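Alongside the image-size failures, the log repeatedly warns "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)", and one Rank 0 line shows the pipeline truncating over-length batches. A minimal sketch of such a length guard is below; only the 40960 limit comes from the log, the function name and truncate-from-the-end strategy are assumptions (a real pipeline might instead drop the sample or truncate per turn).

```python
# Illustrative length guard for the repeated "(N > 40960)" warnings in the log.
# MAX_LEN matches the model limit the warnings report; everything else is assumed.
MAX_LEN = 40960

def clip_token_ids(token_ids, max_len=MAX_LEN):
    """Truncate a token-id sequence to the maximum context length.

    Returns (ids, truncated) so callers can log exactly the condition the
    transformers warning flags instead of indexing past the embedding table.
    """
    if len(token_ids) <= max_len:
        return token_ids, False
    return token_ids[:max_len], True
```

Applied before collation, a sequence of 133060 ids (like the one warned about above) would come back clipped to 40960 with `truncated=True`.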
Problematic sample: {'id': 17534, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 1lcm\nB. 13cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_3/images/step_0.png 2025-08-28 10:17:55.680915 load time: 1149.24 ms 49%|████▉ | 10828/22095 [18:19:59<10:52:46, 3.48s/it] {'loss': 0.328, 'grad_norm': 0.6106852104150455, 'learning_rate': 5.404131080446952e-06, 'epoch': 0.49} 49%|████▉ | 10828/22095 [18:19:59<10:52:46, 3.48s/it] 49%|████▉ | 10829/22095 [18:20:02<11:11:06, 3.57s/it] {'loss': 0.3257, 'grad_norm': 0.5672917334952595, 'learning_rate': 5.403400553053654e-06, 'epoch': 0.49} 49%|████▉ | 10829/22095 [18:20:02<11:11:06, 3.57s/it] 49%|████▉ | 10830/22095 [18:20:06<11:26:02, 3.65s/it] {'loss': 0.3027, 'grad_norm': 0.5955844527154185, 'learning_rate': 5.402670016992514e-06, 'epoch': 0.49} 49%|████▉ | 10830/22095 [18:20:06<11:26:02, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65629 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97240 > 40960). Running this sequence through the model will result in indexing errors 49%|████▉ | 10831/22095 [18:20:11<12:18:59, 3.94s/it] {'loss': 0.3474, 'grad_norm': 0.6209346722321435, 'learning_rate': 5.401939472279235e-06, 'epoch': 0.49} 49%|████▉ | 10831/22095 [18:20:11<12:18:59, 3.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53524 > 40960). 
Running this sequence through the model will result in indexing errors
49%|████▉ | 10832/22095 [18:20:14<11:27:01, 3.66s/it] {'loss': 0.2958, 'grad_norm': 0.5779865260605683, 'learning_rate': 5.401208918929509e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (89821 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10833/22095 [18:20:17<10:36:36, 3.39s/it] {'loss': 0.3225, 'grad_norm': 0.6041288922310544, 'learning_rate': 5.400478356959037e-06, 'epoch': 0.49}
49%|████▉ | 10834/22095 [18:20:20<11:01:26, 3.52s/it] {'loss': 0.3329, 'grad_norm': 0.5982865724226629, 'learning_rate': 5.399747786383515e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_1/images/step_0.png 2025-08-28 10:18:19.230200 load time: 1096.91 ms
49%|████▉ | 10835/22095 [18:20:25<11:33:30, 3.70s/it] {'loss': 0.3717, 'grad_norm': 0.6113716925853488, 'learning_rate': 5.39901720721864e-06, 'epoch': 0.49}
49%|████▉ | 10836/22095 [18:20:27<10:48:26, 3.46s/it] {'loss': 0.3493, 'grad_norm': 0.6375109781082008, 'learning_rate': 5.398286619480111e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_4/images/before_screenshot_39_id_39_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:18:26.803439 load time: 1825.34 ms
49%|████▉ | 10837/22095 [18:20:31<11:04:09, 3.54s/it] {'loss': 0.3364, 'grad_norm': 0.6280684294342922, 'learning_rate': 5.397556023183627e-06, 'epoch': 0.49}
VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/6501970517773858_11.png 2025-08-28 10:18:29.957603 load time: 1033.0 ms
49%|████▉ | 10838/22095 [18:20:35<11:09:19, 3.57s/it] {'loss': 0.3083, 'grad_norm': 0.632665793054112, 'learning_rate': 5.396825418344883e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59903 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55131 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10839/22095 [18:20:43<15:35:21, 4.99s/it] {'loss': 0.4969, 'grad_norm': 0.35018151725738517, 'learning_rate': 5.39609480497958e-06, 'epoch': 0.49}
49%|████▉ | 10840/22095 [18:20:51<18:01:05, 5.76s/it] {'loss': 0.4889, 'grad_norm': 0.3135080903253166, 'learning_rate': 5.395364183103418e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 364, but got module 1
49%|████▉ | 10841/22095 [18:20:54<15:36:41, 4.99s/it] {'loss': 0.3122, 'grad_norm': 0.6470675002316936, 'learning_rate': 5.394633552732091e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:18:53.785673 load time: 1143.04 ms
49%|████▉ | 10842/22095 [18:20:58<14:21:30, 4.59s/it] {'loss': 0.3127, 'grad_norm': 0.6641179250395611, 'learning_rate': 5.393902913881304e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/vs_code/2b180dfe-089e-4384-a548-b060a85bc9e6/images/step_0.png 2025-08-28 10:18:56.321119 load time: 1047.48 ms
49%|████▉ | 10843/22095 [18:21:01<12:51:22, 4.11s/it] {'loss': 0.2972, 'grad_norm': 0.6837994695623595, 'learning_rate': 5.393172266566751e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [20, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8152887 in VC:s3://multi-modal/laion_gpt4v/images/. Exception: Image size [20, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1923, 'image': '622f702ea27fea3f4bb430f8c01572f8.jpg', 'image_wh': [[20, 20]], 'conversations': [{'from': 'human', 'value': '\nWrite a detailed description of this image.'}, {'from': 'gpt', 'value': 'This image appears to be the cover of a book. The top portion of the image contains text that reads "Literature and Language." Below this, in larger, bold font, is the title "Plexippus: or, the aspiring plebeian. ... Volume 2 of 2." The author\'s name, "Richard Graves," is printed below the title. The central portion of the image features two illustrations of spherical objects, resembling armillary spheres, which are models representing the celestial sphere with rings and hoops representing the equator and other astronomical circles. The background of the cover is a light beige color, and there is a visible crease or shadow suggesting that the book cover may be slightly bent or that the image shows two adjoining pages. The overall style of the image is simple and straightforward, with a focus on the text and the illustrations of the armillary spheres.'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_1/images/step_2.png 2025-08-28 10:18:59.753171 load time: 1233.76 ms
VC:s3://gui-agent/data_20250612/mac/images/finder/1fe1ca62-7e4a-4d85-af6c-e650a9c51129/images/step_2.png 2025-08-28 10:19:00.869979 load time: 1012.29 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_5.png 2025-08-28 10:19:00.853616 load time: 1354.92 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 10:19:01.093428 load time: 1082.9 ms
49%|████▉ | 10844/22095 [18:21:04<12:02:50, 3.85s/it] {'loss': 0.3528, 'grad_norm': 0.57976310001045, 'learning_rate': 5.392441610804135e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 10:19:03.664254 load time: 1591.06 ms
49%|████▉ | 10845/22095 [18:21:07<11:41:21, 3.74s/it] {'loss': 0.3338, 'grad_norm': 0.6808006276626665, 'learning_rate': 5.391710946609152e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_54/img/step_4.png 2025-08-28 10:19:06.042822 load time: 1634.6 ms
49%|████▉ | 10846/22095 [18:21:10<10:50:48, 3.47s/it] {'loss': 0.3185, 'grad_norm': 0.6162543213535839, 'learning_rate': 5.390980273997507e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/terminal/432466e7-22e7-4194-aa9e-19f7c21adef5/images/step_5.png 2025-08-28 10:19:10.269262 load time: 1114.45 ms
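The `ValueError: Image size [...] is too small. Minimum size is 28` failures above all come from samples whose recorded `image_wh` has a side below the loader's 28-pixel minimum. One way to keep them from surfacing mid-epoch is to pre-filter the sample list on that metadata before training. A minimal sketch, assuming each sample dict carries an `image_wh` list of `[width, height]` pairs as shown in the "Problematic sample" records; `filter_samples` and `is_too_small` are hypothetical helpers, not part of the training code:

```python
# Hypothetical metadata pre-filter for the "Image size ... is too small" failures.
# Assumption: each sample dict carries 'image_wh' as a list of [width, height]
# pairs, as printed in the "Problematic sample" log records.

MIN_SIDE = 28  # the loader's stated minimum: "Minimum size is 28"

def is_too_small(image_wh):
    """True if any image in the sample has a side below the minimum."""
    return any(w < MIN_SIDE or h < MIN_SIDE for w, h in image_wh)

def filter_samples(samples):
    """Split samples into (kept, dropped) so undersized images never
    reach _get_item and raise a ValueError mid-training."""
    kept, dropped = [], []
    for sample in samples:
        (dropped if is_too_small(sample.get("image_wh", [])) else kept).append(sample)
    return kept, dropped
```

Run once over the dataset index at startup, this turns per-step fetch retries into a single up-front report of dropped samples.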
VC:s3://gui-agent/data_20250612/mac/images/calculator/2f03d0e5-118a-43d8-abe2-78c680ff4f86/images/step_0.png 2025-08-28 10:19:10.681051 load time: 1004.55 ms
49%|████▉ | 10847/22095 [18:21:18<14:45:31, 4.72s/it] {'loss': 0.4859, 'grad_norm': 0.4064457148287276, 'learning_rate': 5.390249592984894e-06, 'epoch': 0.49}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10848/22095 [18:21:22<14:01:19, 4.49s/it] {'loss': 0.3389, 'grad_norm': 0.6968576972450587, 'learning_rate': 5.389518903587016e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_5/images/step_4.png 2025-08-28 10:19:19.566158 load time: 1091.25 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250501_133654_4/images/before_screenshot_33_id_106_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:19:23.192806 load time: 1069.4 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 10:19:22.367435 load time: 2081.75 ms
49%|████▉ | 10849/22095 [18:21:26<13:53:29, 4.45s/it] {'loss': 0.3141, 'grad_norm': 0.5918707805528982, 'learning_rate': 5.388788205819575e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366677 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33423, 'image': 'vrdu_table_final_2/astro-ph.CO/96369ef9-5cb6-4527-b4f4-4dddd017f4f1.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{1}$\\end{tabular}\n```"}]}
49%|████▉ | 10850/22095 [18:21:29<12:57:13, 4.15s/it] {'loss': 0.3083, 'grad_norm': 0.6700538372595888, 'learning_rate': 5.38805749969827e-06, 'epoch': 0.49}
49%|████▉ | 10851/22095 [18:21:33<11:56:26, 3.82s/it] {'loss': 0.3105, 'grad_norm': 0.5692254658928143, 'learning_rate': 5.387326785238798e-06, 'epoch': 0.49}
49%|████▉ | 10852/22095 [18:21:36<11:13:16, 3.59s/it] {'loss': 0.3289, 'grad_norm': 0.7663508797023766, 'learning_rate': 5.386596062456865e-06, 'epoch': 0.49}
49%|████▉ | 10853/22095 [18:21:39<10:58:41, 3.52s/it] {'loss': 0.329, 'grad_norm': 0.7226289709380682, 'learning_rate': 5.385865331368169e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:19:36.366885 load time: 1480.52 ms
49%|████▉ | 10854/22095 [18:21:42<10:39:00, 3.41s/it] {'loss': 0.3443, 'grad_norm': 0.6834416027758144, 'learning_rate': 5.385134591988412e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 10:19:39.426997 load time: 1400.87 ms
49%|████▉ | 10855/22095 [18:21:45<10:12:46, 3.27s/it] {'loss': 0.3413, 'grad_norm': 0.6554737185398497, 'learning_rate': 5.384403844333297e-06, 'epoch': 0.49}
49%|████▉ | 10856/22095 [18:21:49<10:26:08, 3.34s/it] {'loss': 0.3498, 'grad_norm': 0.6095854602314272, 'learning_rate': 5.383673088418523e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/finder/8af7889e-fbfc-443f-8629-e5b6b0484c7d/images/step_0.png 2025-08-28 10:19:48.635904 load time: 1090.8 ms
49%|████▉ | 10857/22095 [18:21:52<10:25:24, 3.34s/it] {'loss': 0.3505, 'grad_norm': 0.6254987626059457, 'learning_rate': 5.382942324259792e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965311 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16146, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
49%|████▉ | 10858/22095 [18:21:56<10:47:14, 3.46s/it] {'loss': 0.3281, 'grad_norm': 0.600840135233815, 'learning_rate': 5.382211551872808e-06, 'epoch': 0.49}
49%|████▉ | 10859/22095 [18:21:58<10:05:20, 3.23s/it] {'loss': 0.3398, 'grad_norm': 0.7062759844203941, 'learning_rate': 5.38148077127327e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_5.png 2025-08-28 10:19:57.116729 load time: 1034.06 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:19:57.116414 load time: 1080.81 ms
49%|████▉ | 10860/22095 [18:22:01<9:48:39, 3.14s/it] {'loss': 0.3508, 'grad_norm': 0.6182077333196346, 'learning_rate': 5.380749982476884e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_3/images/step_0.png 2025-08-28 10:20:00.049185 load time: 1039.27 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:20:00.534462 load time: 1493.69 ms
49%|████▉ | 10861/22095 [18:22:04<9:50:26, 3.15s/it] {'loss': 0.3335, 'grad_norm': 0.6250150814729192, 'learning_rate': 5.380019185499348e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:20:03.699718 load time: 1245.34 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:20:04.575747 load time: 1042.42 ms
49%|████▉ | 10862/22095 [18:22:08<9:57:32, 3.19s/it] {'loss': 0.3034, 'grad_norm': 0.644831247436958, 'learning_rate': 5.379288380356369e-06, 'epoch': 0.49}
49%|████▉ | 10863/22095 [18:22:11<10:26:39, 3.35s/it] {'loss': 0.3236, 'grad_norm': 0.6896682788376114, 'learning_rate': 5.378557567063646e-06, 'epoch': 0.49}
49%|████▉ | 10864/22095 [18:22:15<10:35:28, 3.39s/it] {'loss': 0.3136, 'grad_norm': 0.6116663195157774, 'learning_rate': 5.3778267456368836e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250630/windows_augment/images/inventor/handmade_annotation_2/images/1_id_6_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:20:13.726830 load time: 1311.9 ms
VC:s3://gui-agent/data_20250612/mac/images/finder/8af7889e-fbfc-443f-8629-e5b6b0484c7d/images/step_0.png 2025-08-28 10:20:14.133814 load time: 1053.58 ms
VC:s3://gui-agent/data_20250612/mac/images/finder/e71f2734-fd4f-4c3c-bc42-83b36162a0a4/images/step_0.png 2025-08-28 10:20:14.973433 load time: 1474.08 ms
49%|████▉ | 10865/22095 [18:22:19<11:01:28, 3.53s/it] {'loss': 0.4097, 'grad_norm': 0.686230452116269, 'learning_rate': 5.377095916091786e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (50140 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53870 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10866/22095 [18:22:22<10:55:59, 3.51s/it] {'loss': 0.3125, 'grad_norm': 0.6313581308698523, 'learning_rate': 5.376365078444053e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (92888 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (135727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46604 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80454 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112509 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10867/22095 [18:22:26<11:23:36, 3.65s/it] {'loss': 0.3094, 'grad_norm': 0.6151971347833678, 'learning_rate': 5.375634232709392e-06, 'epoch': 0.49}
49%|████▉ | 10868/22095 [18:22:30<11:12:52, 3.60s/it] {'loss': 0.3611, 'grad_norm': 0.6493310508228657, 'learning_rate': 5.374903378903506e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [492, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8473734 in VC:s3://internvl-moe-sft-data/. Exception: Image size [492, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6319, 'image': 'vrdu_texteq/astro-ph.CO/25365790-dea7-4bf2-bfec-8483f43587ad.png', 'image_wh': [[492, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'The effective number ${\\mit \\Delta} N_{\\rm eff}$ is defined as'}]}
49%|████▉ | 10869/22095 [18:22:33<10:47:03, 3.46s/it] {'loss': 0.3345, 'grad_norm': 0.624810363256607, 'learning_rate': 5.374172517042095e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_100/img/step_3.png 2025-08-28 10:20:32.799867 load time: 1293.51 ms
49%|████▉ | 10870/22095 [18:22:36<10:12:49, 3.28s/it] {'loss': 0.3351, 'grad_norm': 0.620486668426, 'learning_rate': 5.373441647140868e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:20:35.258823 load time: 1205.92 ms
49%|████▉ | 10871/22095 [18:22:39<9:57:14, 3.19s/it] {'loss': 0.3263, 'grad_norm': 0.6993162835016654, 'learning_rate': 5.372710769215528e-06, 'epoch': 0.49}
49%|████▉ | 10872/22095 [18:22:42<10:03:25, 3.23s/it] {'loss': 0.3029, 'grad_norm': 0.6195139398422571, 'learning_rate': 5.371979883281775e-06, 'epoch': 0.49}
49%|████▉ | 10873/22095 [18:22:45<9:55:03, 3.18s/it] {'loss': 0.3384, 'grad_norm': 0.6278648577458785, 'learning_rate': 5.37124898935532e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:20:42.717778 load time: 1526.18 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:20:45.848875 load time: 1311.73 ms
49%|████▉ | 10874/22095 [18:22:55<15:55:55, 5.11s/it] {'loss': 0.4994, 'grad_norm': 0.3736919014627429, 'learning_rate': 5.370518087451861e-06, 'epoch': 0.49}
49%|████▉ | 10875/22095 [18:23:04<20:04:47, 6.44s/it] {'loss': 0.467, 'grad_norm': 0.31140376218161614, 'learning_rate': 5.36978717758711e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/mind2web_train/images/851998b2-fda2-4bd4-a822-f1871a9fde12/images/1.png 2025-08-28 10:21:04.005118 load time: 1291.25 ms
VC:s3://gui-agent/data_20250630/windows_augment/images/PR/handmade_annotation_1/images/PR (3)_id_3_internvl_element-caption_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:21:04.665678 load time: 1465.49 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/34247.png 2025-08-28 10:21:04.461576 load time: 1618.47 ms
49%|████▉ | 10876/22095 [18:23:08<17:34:48, 5.64s/it] {'loss': 0.3101, 'grad_norm': 0.6264908346834731, 'learning_rate': 5.369056259776766e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:21:06.794329 load time: 1024.95 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/DocVQA/pngs/xrlk0226_3.png 2025-08-28 10:21:07.355810 load time: 1057.51 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/cf183c3a-83c3-4eb7-a60d-c6cbfaa27f3e/images/step_6.png 2025-08-28 10:21:08.306324 load time: 1122.29 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/cf183c3a-83c3-4eb7-a60d-c6cbfaa27f3e/images/step_6.png 2025-08-28 10:21:07.947401 load time: 1460.32 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_1/images/step_0.png 2025-08-28 10:21:08.029825 load time: 1886.44 ms
49%|████▉ | 10877/22095 [18:23:12<15:49:01, 5.08s/it] {'loss': 0.3226, 'grad_norm': 0.6107193434356737, 'learning_rate': 5.368325334036537e-06, 'epoch': 0.49}
49%|████▉ | 10878/22095 [18:23:16<14:45:33, 4.74s/it] {'loss': 0.3546, 'grad_norm': 0.6140612466191963, 'learning_rate': 5.367594400382128e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 10:21:15.664576 load time: 1133.76 ms
49%|████▉ | 10879/22095 [18:23:19<12:59:15, 4.17s/it] {'loss': 0.3197, 'grad_norm': 0.6609013483376353, 'learning_rate': 5.366863458829245e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/terminal/4883f6e6-c658-4d61-9cf9-e32c2b812a80/images/step_0.png 2025-08-28 10:21:17.924089 load time: 1394.29 ms
49%|████▉ | 10880/22095 [18:23:22<12:25:18, 3.99s/it] {'loss': 0.3521, 'grad_norm': 0.6760753743411584, 'learning_rate': 5.36613250939359e-06, 'epoch': 0.49}
49%|████▉ | 10881/22095 [18:23:26<12:27:23, 4.00s/it] {'loss': 0.3776, 'grad_norm': 0.6654052260248408, 'learning_rate': 5.365401552090876e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887140 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10293, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 5\nB. 2\nC. 3\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_3/images/step_3.png 2025-08-28 10:21:26.080118 load time: 1032.23 ms
49%|████▉ | 10882/22095 [18:23:29<11:26:33, 3.67s/it] {'loss': 0.3614, 'grad_norm': 0.6076148237516632, 'learning_rate': 5.364670586936801e-06, 'epoch': 0.49}
VC:s3://gui-agent/mind2web_train/images/3a85b415-9e68-4cf0-91be-386d4d8f0710/images/3.png 2025-08-28 10:21:25.547341 load time: 2408.8 ms
49%|████▉ | 10883/22095 [18:23:32<10:53:53, 3.50s/it] {'loss': 0.3794, 'grad_norm': 0.6324147862809746, 'learning_rate': 5.363939613947078e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8309201 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB105UWiC3PL1JjSZFtXXclRVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you decode and provide me with the exact words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n24V暖手桌垫\n安全第一\n双档可调\nSpring♬\nhffgyxj'}]}
49%|████▉ | 10884/22095 [18:23:35<10:42:45, 3.44s/it] {'loss': 0.3159, 'grad_norm': 0.674557660483406, 'learning_rate': 5.363208633137409e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_170451_4/images/before_screenshot_41_id_1483_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:21:32.746347 load time: 1521.86 ms
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_10/img/step_0.png 2025-08-28 10:21:35.527695 load time: 1025.25 ms
49%|████▉ | 10885/22095 [18:23:39<10:44:33, 3.45s/it] {'loss': 0.3391, 'grad_norm': 0.6074914429601798, 'learning_rate': 5.3624776445235025e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (43457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76906 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57395 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61687 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10886/22095 [18:23:42<10:28:14, 3.36s/it] {'loss': 0.3425, 'grad_norm': 0.6394413452281954, 'learning_rate': 5.361746648121064e-06, 'epoch': 0.49}
49%|████▉ | 10887/22095 [18:23:46<11:10:28, 3.59s/it] {'loss': 0.3405, 'grad_norm': 0.5797041229041443, 'learning_rate': 5.361015643945803e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (58583 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63111 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66520 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112380 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109265 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10888/22095 [18:23:49<10:32:25, 3.39s/it] {'loss': 0.3199, 'grad_norm': 0.6373249682044135, 'learning_rate': 5.3602846320134216e-06, 'epoch': 0.49}
49%|████▉ | 10889/22095 [18:23:52<10:19:51, 3.32s/it] {'loss': 0.2873, 'grad_norm': 0.6562131440044063, 'learning_rate': 5.359553612339633e-06, 'epoch': 0.49}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952444 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3279, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 16\nB. 9\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
49%|████▉ | 10890/22095 [18:23:56<11:01:56, 3.54s/it] {'loss': 0.3358, 'grad_norm': 0.7090555888596984, 'learning_rate': 5.358822584940139e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_175007_3/images/before_screenshot_24_id_83_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:21:56.059195 load time: 1034.37 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10891/22095 [18:24:00<10:58:15, 3.53s/it] {'loss': 0.336, 'grad_norm': 0.605339486263966, 'learning_rate': 5.358091549830651e-06, 'epoch': 0.49}
49%|████▉ | 10892/22095 [18:24:03<10:27:20, 3.36s/it] {'loss': 0.3124, 'grad_norm': 0.6203741456837942, 'learning_rate': 5.357360507026875e-06, 'epoch': 0.49}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
49%|████▉ | 10893/22095 [18:24:06<10:28:49, 3.37s/it] {'loss': 0.3535, 'grad_norm': 0.685886739956919, 'learning_rate': 5.35662945654452e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53308 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54475 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10894/22095 [18:24:14<14:11:05, 4.56s/it] {'loss': 0.4987, 'grad_norm': 0.5849598914637093, 'learning_rate': 5.3558983983992915e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250612/mac/images/clock/733fbdf3-b935-4450-a8ab-82c3e5f42705/images/step_2.png 2025-08-28 10:22:12.309357 load time: 1165.82 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 10:22:13.999002 load time: 1457.32 ms
49%|████▉ | 10895/22095 [18:24:17<13:31:33, 4.35s/it] {'loss': 0.2888, 'grad_norm': 0.6099707630801916, 'learning_rate': 5.355167332606901e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_2/img/step_0.png 2025-08-28 10:22:16.157987 load time: 1426.76 ms
VC:s3://gui-agent/data_20250612/mac/images/terminal/af851dfd-b7ce-4e95-95cf-c0fce6b8bb15/images/step_4.png 2025-08-28 10:22:17.089010 load time: 1106.73 ms
VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_6.png 2025-08-28 10:22:17.730363 load time: 1032.59 ms
49%|████▉ | 10896/22095 [18:24:27<18:30:39, 5.95s/it] {'loss': 0.4624, 'grad_norm': 0.387638619022389, 'learning_rate': 5.354436259183054e-06, 'epoch': 0.49}
49%|████▉ | 10897/22095 [18:24:31<16:15:04, 5.22s/it] {'loss': 0.3194, 'grad_norm': 0.6293204916290952, 'learning_rate': 5.353705178143462e-06, 'epoch': 0.49}
49%|████▉ | 10898/22095 [18:24:34<14:42:01, 4.73s/it] {'loss': 0.3186, 'grad_norm': 0.6166390666247227, 'learning_rate': 5.352974089503832e-06, 'epoch': 0.49}
49%|████▉ | 10899/22095 [18:24:37<13:23:36, 4.31s/it] {'loss': 0.3787, 'grad_norm': 0.6305222876486003, 'learning_rate': 5.352242993279871e-06, 'epoch': 0.49}
49%|████▉ | 10900/22095 [18:24:40<12:02:39, 3.87s/it] {'loss': 0.3472, 'grad_norm': 0.6176998912923989, 'learning_rate': 5.351511889487293e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 10:22:38.120058 load time: 1010.08 ms
VC:s3://gui-agent/data_20250612/mac/images/terminal/95886ee2-46c8-4a0f-865b-9ddbfb2af444/images/step_2.png 2025-08-28 10:22:38.287905 load time: 1040.79 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_3.png 2025-08-28 10:22:39.128312 load time: 1635.8 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:22:39.522742 load time: 1399.04 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:22:39.894692 load time: 1135.3 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:22:40.680473 load time: 1295.03 ms
49%|████▉ | 10901/22095 [18:24:44<11:44:50, 3.78s/it] {'loss': 0.3142, 'grad_norm': 0.6752658070048627, 'learning_rate': 5.350780778141801e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20441.png 2025-08-28 10:22:40.880021 load time: 3179.91 ms
49%|████▉ | 10902/22095 [18:24:54<17:28:14, 5.62s/it] {'loss': 0.4815, 'grad_norm': 0.7026151470603141, 'learning_rate': 5.35004965925911e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (51286 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46803 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55770 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10903/22095 [18:24:57<15:35:29, 5.02s/it] {'loss': 0.3572, 'grad_norm': 0.6056281896817226, 'learning_rate': 5.349318532854924e-06, 'epoch': 0.49}
49%|████▉ | 10904/22095 [18:25:01<14:06:52, 4.54s/it] {'loss': 0.3166, 'grad_norm': 0.6059450088427075, 'learning_rate': 5.348587398944959e-06, 'epoch': 0.49}
49%|████▉ | 10905/22095 [18:25:04<13:13:14, 4.25s/it] {'loss': 0.2925, 'grad_norm': 0.6160544560726442, 'learning_rate': 5.347856257544919e-06, 'epoch': 0.49}
49%|████▉ | 10906/22095 [18:25:07<11:59:53, 3.86s/it] {'loss': 0.3096, 'grad_norm': 0.6557893318809236, 'learning_rate': 5.347125108670516e-06, 'epoch': 0.49}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:23:07.395772 load time: 1517.36 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39216.png 2025-08-28 10:23:06.836505 load time: 2186.33 ms
49%|████▉ | 10907/22095 [18:25:11<12:03:07, 3.88s/it] {'loss': 0.3395, 'grad_norm': 0.6031112497162145, 'learning_rate': 5.3463939523374616e-06, 'epoch': 0.49}
49%|████▉ | 10908/22095 [18:25:14<11:09:01, 3.59s/it] {'loss': 0.362, 'grad_norm': 0.7063980157611425, 'learning_rate': 5.345662788561466e-06, 'epoch': 0.49}
49%|████▉ | 10909/22095 [18:25:18<11:01:13, 3.55s/it] {'loss': 0.3011, 'grad_norm': 0.6090917761194881, 'learning_rate': 5.344931617358237e-06, 'epoch': 0.49}
Token indices sequence length is longer than the specified maximum sequence length for this model (117747 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41723 > 40960). Running this sequence through the model will result in indexing errors
49%|████▉ | 10910/22095 [18:25:21<10:39:44, 3.43s/it] {'loss': 0.3522, 'grad_norm': 0.6698751151563621, 'learning_rate': 5.344200438743489e-06, 'epoch': 0.49}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (113769 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86161 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93577 > 40960).
Running this sequence through the model will result in indexing errors 49%|████▉ | 10911/22095 [18:25:30<16:11:56, 5.21s/it] {'loss': 0.4968, 'grad_norm': 0.3498846920345231, 'learning_rate': 5.343469252732928e-06, 'epoch': 0.49} 49%|████▉ | 10911/22095 [18:25:30<16:11:56, 5.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111378 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108443 > 40960). Running this sequence through the model will result in indexing errors 49%|████▉ | 10912/22095 [18:25:34<14:48:26, 4.77s/it] {'loss': 0.3478, 'grad_norm': 0.6144344328879905, 'learning_rate': 5.34273805934227e-06, 'epoch': 0.49} 49%|████▉ | 10912/22095 [18:25:34<14:48:26, 4.77s/it] 49%|████▉ | 10913/22095 [18:25:37<13:05:28, 4.21s/it] {'loss': 0.3395, 'grad_norm': 0.6833336709587083, 'learning_rate': 5.342006858587222e-06, 'epoch': 0.49} 49%|████▉ | 10913/22095 [18:25:37<13:05:28, 4.21s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:23:35.422287 load time: 1320.4 ms Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8554615 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 22978, 'image': '935926232.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': "Is this book related to Children's Books? Answer the question using a single word or phrase."}, {'from': 'gpt', 'value': 'No'}]} VC:s3://gui-agent/data_20250630/mac/images/terminal/5685b8a4-5bcb-4b03-8a69-df5db43dbe42/images/step_3.png 2025-08-28 10:23:36.280695 load time: 1505.72 ms 49%|████▉ | 10914/22095 [18:25:40<12:10:25, 3.92s/it] {'loss': 0.3197, 'grad_norm': 0.6541159079724937, 'learning_rate': 5.341275650483497e-06, 'epoch': 0.49} 49%|████▉ | 10914/22095 [18:25:40<12:10:25, 3.92s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 10:23:38.866688 load time: 1179.61 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_1/images/step_0.png 2025-08-28 10:23:39.493075 load time: 1490.64 ms 49%|████▉ | 10915/22095 [18:25:43<11:37:25, 3.74s/it] {'loss': 0.3453, 'grad_norm': 0.5724005590293713, 'learning_rate': 5.340544435046807e-06, 'epoch': 0.49} 49%|████▉ | 10915/22095 [18:25:43<11:37:25, 3.74s/it] 49%|████▉ | 10916/22095 [18:25:47<11:23:14, 3.67s/it] {'loss': 0.3212, 'grad_norm': 0.6626481102515759, 'learning_rate': 5.3398132122928635e-06, 'epoch': 0.49} 49%|████▉ | 10916/22095 [18:25:47<11:23:14, 3.67s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = 
self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8894382 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 17535, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 13cm\nB. 7cm\nC. 8cm\nD. 1lcm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 49%|████▉ | 10917/22095 [18:25:51<11:29:13, 3.70s/it] {'loss': 0.3028, 'grad_norm': 0.6123494656814092, 'learning_rate': 5.339081982237377e-06, 'epoch': 0.49} 49%|████▉ | 10917/22095 [18:25:51<11:29:13, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 49%|████▉ | 10918/22095 [18:26:02<18:14:32, 5.88s/it] {'loss': 0.4727, 'grad_norm': 0.3694123866230547, 'learning_rate': 5.3383507448960605e-06, 'epoch': 0.49} 49%|████▉ | 10918/22095 [18:26:02<18:14:32, 5.88s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_005007_1/images/before_screenshot_1_id_152_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:24:00.415209 load time: 1116.24 ms VC:s3://gui-agent/data_20250612/mac/images/settings/fd4c6d2d-8881-4c0a-819b-06672eb781fc/images/step_0.png 2025-08-28 10:24:02.324096 load time: 1000.18 ms 49%|████▉ | 10919/22095 [18:26:12<22:32:26, 7.26s/it] {'loss': 0.4782, 'grad_norm': 0.38805313791080737, 'learning_rate': 5.3376195002846255e-06, 'epoch': 0.49} 49%|████▉ | 10919/22095 [18:26:12<22:32:26, 7.26s/it] 49%|████▉ | 10920/22095 [18:26:20<23:18:53, 7.51s/it] {'loss': 0.4617, 'grad_norm': 0.27574780421912554, 'learning_rate': 5.336888248418784e-06, 'epoch': 0.49} 49%|████▉ | 
10920/22095 [18:26:20<23:18:53, 7.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 49%|████▉ | 10921/22095 [18:26:24<19:32:31, 6.30s/it] {'loss': 0.3636, 'grad_norm': 0.5884279939932177, 'learning_rate': 5.3361569893142505e-06, 'epoch': 0.49} 49%|████▉ | 10921/22095 [18:26:24<19:32:31, 6.30s/it] 49%|████▉ | 10922/22095 [18:26:27<16:54:55, 5.45s/it] {'loss': 0.3125, 'grad_norm': 0.6648088950131924, 'learning_rate': 5.335425722986735e-06, 'epoch': 0.49} 49%|████▉ | 10922/22095 [18:26:27<16:54:55, 5.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8356402 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 23110, 'image': 'vrdu_table_final_2/astro-ph.CO/405fe621-a6e3-4cbc-b232-c5ed8eba9afa.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8903165 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 26318, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 5cm\nB. \\frac{11}{2}cm\nC. 4cm\nD. 
\\frac{9}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:24:27.158923 load time: 1113.17 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:24:27.796736 load time: 1055.76 ms 49%|████▉ | 10923/22095 [18:26:30<14:48:26, 4.77s/it] {'loss': 0.3286, 'grad_norm': 0.6967573887842057, 'learning_rate': 5.334694449451949e-06, 'epoch': 0.49} 49%|████▉ | 10923/22095 [18:26:30<14:48:26, 4.77s/it] 49%|████▉ | 10924/22095 [18:26:35<15:00:44, 4.84s/it] {'loss': 0.3723, 'grad_norm': 0.5998366409788873, 'learning_rate': 5.3339631687256085e-06, 'epoch': 0.49} 49%|████▉ | 10924/22095 [18:26:35<15:00:44, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70116 > 40960). Running this sequence through the model will result in indexing errors 49%|████▉ | 10925/22095 [18:26:38<13:09:05, 4.24s/it] {'loss': 0.308, 'grad_norm': 0.6340096059926653, 'learning_rate': 5.333231880823425e-06, 'epoch': 0.49} 49%|████▉ | 10925/22095 [18:26:38<13:09:05, 4.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_1.png 2025-08-28 10:24:37.344184 load time: 1332.77 ms 49%|████▉ | 10926/22095 [18:26:42<12:55:49, 4.17s/it] {'loss': 0.3365, 'grad_norm': 0.6382985585954123, 'learning_rate': 5.3325005857611126e-06, 'epoch': 0.49} 49%|████▉ | 10926/22095 [18:26:42<12:55:49, 4.17s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 10:24:40.961237 load time: 1209.28 ms 49%|████▉ | 10927/22095 [18:26:45<11:57:40, 3.86s/it] {'loss': 0.3088, 'grad_norm': 
0.6236564404757622, 'learning_rate': 5.331769283554382e-06, 'epoch': 0.49} 49%|████▉ | 10927/22095 [18:26:45<11:57:40, 3.86s/it] 49%|████▉ | 10928/22095 [18:26:48<11:14:17, 3.62s/it] {'loss': 0.3432, 'grad_norm': 0.6272800491529169, 'learning_rate': 5.33103797421895e-06, 'epoch': 0.49} 49%|████▉ | 10928/22095 [18:26:48<11:14:17, 3.62s/it] 49%|████▉ | 10929/22095 [18:26:52<11:00:58, 3.55s/it] {'loss': 0.292, 'grad_norm': 0.6036608440507453, 'learning_rate': 5.33030665777053e-06, 'epoch': 0.49} 49%|████▉ | 10929/22095 [18:26:52<11:00:58, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59309 > 40960). Running this sequence through the model will result in indexing errors 49%|████▉ | 10930/22095 [18:26:55<10:23:19, 3.35s/it] {'loss': 0.3284, 'grad_norm': 0.6449185476085723, 'learning_rate': 5.329575334224832e-06, 'epoch': 0.49} 49%|████▉ | 10930/22095 [18:26:55<10:23:19, 3.35s/it] 49%|████▉ | 10931/22095 [18:26:58<10:05:32, 3.25s/it] {'loss': 0.3207, 'grad_norm': 0.6568644023628296, 'learning_rate': 5.328844003597573e-06, 'epoch': 0.49} 49%|████▉ | 10931/22095 [18:26:58<10:05:32, 3.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 10:24:58.177408 load time: 1067.23 ms 49%|████▉ | 10932/22095 [18:27:01<9:56:52, 3.21s/it] {'loss': 0.3314, 'grad_norm': 0.6664172500534838, 'learning_rate': 5.328112665904465e-06, 'epoch': 0.49} 49%|████▉ | 10932/22095 [18:27:01<9:56:52, 3.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38032.png 2025-08-28 10:25:00.855751 load time: 2749.9 ms 49%|████▉ | 10933/22095 [18:27:07<13:00:10, 4.19s/it] {'loss': 0.4498, 'grad_norm': 0.7000066487436837, 'learning_rate': 
5.3273813211612254e-06, 'epoch': 0.49} 49%|████▉ | 10933/22095 [18:27:07<13:00:10, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45317 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60863 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43390 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75920 > 40960). Running this sequence through the model will result in indexing errors 49%|████▉ | 10934/22095 [18:27:10<11:59:10, 3.87s/it] {'loss': 0.3427, 'grad_norm': 0.6227058982525127, 'learning_rate': 5.3266499693835664e-06, 'epoch': 0.49} 49%|████▉ | 10934/22095 [18:27:10<11:59:10, 3.87s/it] 49%|████▉ | 10935/22095 [18:27:14<11:34:58, 3.74s/it] {'loss': 0.3172, 'grad_norm': 0.6609441288231116, 'learning_rate': 5.325918610587202e-06, 'epoch': 0.49} 49%|████▉ | 10935/22095 [18:27:14<11:34:58, 3.74s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/f9be7ed3-49aa-4f23-a176-7af6afdfae84/images/step_6.png 2025-08-28 10:25:12.596242 load time: 1400.79 ms 49%|████▉ | 10936/22095 [18:27:17<10:49:13, 3.49s/it] {'loss': 0.3608, 'grad_norm': 0.6541893702664565, 'learning_rate': 5.325187244787848e-06, 'epoch': 0.49} 49%|████▉ | 10936/22095 [18:27:17<10:49:13, 3.49s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise 
ValueError( ValueError: Image size [675, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8455096 in VC:s3://internvl-moe-sft-data/. Exception: Image size [675, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 126373, 'image': 'vrdu_texteq/astro-ph.CO/5d1df627-efb9-4811-b95c-ceaa888384c2.png', 'image_wh': [[675, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'where $n$ is the number of elements in the data vector $d$.'}]} VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_092710_4/images/before_screenshot_45_id_1405_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:25:16.120238 load time: 1134.7 ms 49%|████▉ | 10937/22095 [18:27:20<10:27:12, 3.37s/it] {'loss': 0.299, 'grad_norm': 0.6666335731827807, 'learning_rate': 5.324455872001221e-06, 'epoch': 0.49} 49%|████▉ | 10937/22095 [18:27:20<10:27:12, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_0.png 2025-08-28 10:25:18.366972 load time: 1056.41 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 10:25:19.008212 load time: 1062.31 ms 50%|████▉ | 10938/22095 [18:27:28<14:43:28, 4.75s/it] {'loss': 0.4673, 'grad_norm': 0.3023990661780046, 'learning_rate': 5.32372449224303e-06, 'epoch': 0.5} 50%|████▉ | 10938/22095 [18:27:28<14:43:28, 4.75s/it]VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_053634_before_screenshot.png 2025-08-28 10:25:26.578365 load time: 1222.18 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 10:25:27.609565 load time: 1504.6 ms 50%|████▉ | 10939/22095 [18:27:31<13:21:19, 4.31s/it] {'loss': 0.307, 'grad_norm': 0.6952874310170034, 
'learning_rate': 5.322993105528996e-06, 'epoch': 0.5} 50%|████▉ | 10939/22095 [18:27:31<13:21:19, 4.31s/it] 50%|████▉ | 10940/22095 [18:27:34<12:04:03, 3.89s/it] {'loss': 0.3121, 'grad_norm': 0.654364686935079, 'learning_rate': 5.322261711874831e-06, 'epoch': 0.5} 50%|████▉ | 10940/22095 [18:27:34<12:04:03, 3.89s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_4.png 2025-08-28 10:25:34.285041 load time: 1036.4 ms 50%|████▉ | 10941/22095 [18:27:37<11:34:45, 3.74s/it] {'loss': 0.346, 'grad_norm': 0.6533502738870345, 'learning_rate': 5.321530311296253e-06, 'epoch': 0.5} 50%|████▉ | 10941/22095 [18:27:37<11:34:45, 3.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 10:25:37.249414 load time: 1004.11 ms 50%|████▉ | 10942/22095 [18:27:47<16:58:51, 5.48s/it] {'loss': 0.4691, 'grad_norm': 0.36164641617759674, 'learning_rate': 5.320798903808976e-06, 'epoch': 0.5} 50%|████▉ | 10942/22095 [18:27:47<16:58:51, 5.48s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_3/images/step_3.png 2025-08-28 10:25:46.463308 load time: 1246.6 ms 50%|████▉ | 10943/22095 [18:27:50<14:48:30, 4.78s/it] {'loss': 0.3039, 'grad_norm': 0.6153895176913012, 'learning_rate': 5.320067489428715e-06, 'epoch': 0.5} 50%|████▉ | 10943/22095 [18:27:50<14:48:30, 4.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 10:25:49.614487 load time: 1029.38 ms 50%|████▉ | 10944/22095 [18:27:53<13:04:08, 4.22s/it] {'loss': 0.3082, 'grad_norm': 0.6072482821425039, 'learning_rate': 5.319336068171187e-06, 'epoch': 0.5} 50%|████▉ | 10944/22095 [18:27:53<13:04:08, 
4.22s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 10:25:52.932076 load time: 1449.12 ms 50%|████▉ | 10945/22095 [18:27:57<13:02:19, 4.21s/it] {'loss': 0.3422, 'grad_norm': 0.6703237557809038, 'learning_rate': 5.318604640052107e-06, 'epoch': 0.5} 50%|████▉ | 10945/22095 [18:27:57<13:02:19, 4.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/57f0e951e810592d9ceae1ee4edc34446d856723e561f6008a9f3984f3d70a51.png 2025-08-28 10:25:57.044317 load time: 1248.75 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_2/images/step_1.png 2025-08-28 10:25:57.396068 load time: 1268.08 ms 50%|████▉ | 10946/22095 [18:28:00<11:51:03, 3.83s/it] {'loss': 0.3292, 'grad_norm': 0.617543960875071, 'learning_rate': 5.317873205087193e-06, 'epoch': 0.5} 50%|████▉ | 10946/22095 [18:28:00<11:51:03, 3.83s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [398, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8507905 in VC:s3://internvl-moe-sft-data/. Exception: Image size [398, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 101010, 'image': 'vrdu_texteq/astro-ph.CO/8b2145c7-734c-4c33-abe2-b92c65ece4f9.png', 'image_wh': [[398, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $G$ is the Newton constant.'}]} VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_43_id_119_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 10:25:59.338244 load time: 1590.24 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-28 10:26:00.007306 load time: 1524.1 ms 50%|████▉ | 10947/22095 [18:28:03<10:57:27, 3.54s/it] {'loss': 0.3505, 'grad_norm': 0.6251190285600072, 'learning_rate': 5.31714176329216e-06, 'epoch': 0.5} 50%|████▉ | 10947/22095 [18:28:03<10:57:27, 3.54s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_4.png 2025-08-28 10:26:02.562886 load time: 1224.36 ms 50%|████▉ | 10948/22095 [18:28:06<10:46:58, 3.48s/it] {'loss': 0.3389, 'grad_norm': 0.6543858933536746, 'learning_rate': 5.3164103146827225e-06, 'epoch': 0.5} 50%|████▉ | 10948/22095 [18:28:06<10:46:58, 3.48s/it] 50%|████▉ | 10949/22095 [18:28:16<16:33:46, 5.35s/it] {'loss': 0.3592, 'grad_norm': 0.7297640462728502, 'learning_rate': 5.315678859274601e-06, 'epoch': 0.5} 50%|████▉ | 10949/22095 [18:28:16<16:33:46, 5.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61153 > 40960). Running this sequence through the model will result in indexing errors 50%|████▉ | 10950/22095 [18:28:19<14:12:39, 4.59s/it] {'loss': 0.3158, 'grad_norm': 0.6665550483127983, 'learning_rate': 5.314947397083512e-06, 'epoch': 0.5} 50%|████▉ | 10950/22095 [18:28:19<14:12:39, 4.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59917 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106969 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58418 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102334 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63118 > 40960). Running this sequence through the model will result in indexing errors 50%|████▉ | 10951/22095 [18:28:22<12:36:57, 4.08s/it] {'loss': 0.3262, 'grad_norm': 0.6365177479194217, 'learning_rate': 5.314215928125167e-06, 'epoch': 0.5} 50%|████▉ | 10951/22095 [18:28:22<12:36:57, 4.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (110204 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46501 > 40960). 
Running this sequence through the model will result in indexing errors 50%|████▉ | 10952/22095 [18:28:25<12:11:00, 3.94s/it] {'loss': 0.3406, 'grad_norm': 0.6694352069351751, 'learning_rate': 5.313484452415289e-06, 'epoch': 0.5} 50%|████▉ | 10952/22095 [18:28:25<12:11:00, 3.94s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 10:26:24.144815 load time: 1112.57 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:26:26.184932 load time: 1006.06 ms 50%|████▉ | 10953/22095 [18:28:29<11:58:41, 3.87s/it] {'loss': 0.3397, 'grad_norm': 0.667497336946359, 'learning_rate': 5.312752969969592e-06, 'epoch': 0.5} 50%|████▉ | 10953/22095 [18:28:29<11:58:41, 3.87s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/3a52484c7bdb64fb04f1a7a17274d8179c509aa59f3132caf08a2a5cf8932ec3.png 2025-08-28 10:26:27.833898 load time: 1669.81 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 10:26:28.369310 load time: 1225.78 ms 50%|████▉ | 10954/22095 [18:28:32<11:03:17, 3.57s/it] {'loss': 0.3268, 'grad_norm': 0.6467036143338789, 'learning_rate': 5.3120214808037954e-06, 'epoch': 0.5} 50%|████▉ | 10954/22095 [18:28:32<11:03:17, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8956348 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7183, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1.5cm\nB. 2cm\nC. 4cm\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 50%|████▉ | 10955/22095 [18:28:35<11:02:26, 3.57s/it] {'loss': 0.3347, 'grad_norm': 0.6394702341304671, 'learning_rate': 5.311289984933615e-06, 'epoch': 0.5} 50%|████▉ | 10955/22095 [18:28:35<11:02:26, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8405005 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 7191, 'image': 'vrdu_table_final_2/astro-ph.CO/5a4c928d-1154-4e9f-bc05-7105816d329a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/notes_1/images/step_0.png 2025-08-28 10:26:35.607484 load time: 1489.67 ms 50%|████▉ | 10956/22095 [18:28:39<10:49:06, 3.50s/it] {'loss': 0.3179, 'grad_norm': 0.6664012078186998, 'learning_rate': 5.310558482374768e-06, 'epoch': 0.5} 50%|████▉ | 10956/22095 [18:28:39<10:49:06, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047546 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]} VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_2/images/step_0.png 2025-08-28 10:26:37.592914 load time: 1274.01 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|████▉ | 10957/22095 [18:28:42<10:18:44, 3.33s/it] {'loss': 0.3369, 'grad_norm': 0.6043814803870569, 'learning_rate': 5.309826973142974e-06, 'epoch': 0.5} 50%|████▉ | 10957/22095 [18:28:42<10:18:44, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (45969 > 40960). Running this sequence through the model will result in indexing errors 50%|████▉ | 10958/22095 [18:28:51<15:41:01, 5.07s/it] {'loss': 0.4913, 'grad_norm': 0.37418229367821654, 'learning_rate': 5.30909545725395e-06, 'epoch': 0.5} 50%|████▉ | 10958/22095 [18:28:51<15:41:01, 5.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|████▉ | 10959/22095 [18:29:00<19:49:38, 6.41s/it] {'loss': 0.4493, 'grad_norm': 0.3446965224369728, 'learning_rate': 5.308363934723412e-06, 'epoch': 0.5} 50%|████▉ | 10959/22095 [18:29:00<19:49:38, 6.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [578, 25, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8477699 in VC:s3://internvl-moe-sft-data/. Exception: Image size [578, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 59572, 'image': 'vrdu_texteq/astro-ph.CO/b8ae1967-8725-4c00-bf73-1da8ac37b2e9.png', 'image_wh': [[578, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'with $\\alpha_i \\in \\mathbb{C}$. The caustic conditions tell us that'}]}
50%|████▉ | 10960/22095 [18:29:08<21:04:54, 6.82s/it] {'loss': 0.4693, 'grad_norm': 0.29210342646383686, 'learning_rate': 5.307632405567084e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/terminal/e42e08f7-e360-485b-bab8-238070c09891/images/step_0.png 2025-08-28 10:27:07.436334 load time: 1025.89 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 10:27:08.717599 load time: 1310.4 ms
50%|████▉ | 10961/22095 [18:29:12<17:55:56, 5.80s/it] {'loss': 0.3238, 'grad_norm': 0.6848138089237998, 'learning_rate': 5.306900869800676e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:27:11.277746 load time: 1093.74 ms
VC:s3://gui-agent/data_20250612/windows/images/settings/free_task_20250606_212846/images/20250606_212904_9.png 2025-08-28 10:27:11.588349 load time: 1018.19 ms
50%|████▉ | 10962/22095 [18:29:15<15:38:25, 5.06s/it] {'loss': 0.3727, 'grad_norm': 0.6610513471135908, 'learning_rate': 5.306169327439914e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:27:14.774332 load time: 1192.02 ms
50%|████▉ | 10963/22095 [18:29:18<13:37:27, 4.41s/it] {'loss': 0.3011, 'grad_norm': 0.6254059324403175, 'learning_rate': 5.3054377785005114e-06, 'epoch': 0.5}
50%|████▉ | 10964/22095 [18:29:21<12:49:18, 4.15s/it] {'loss': 0.3804, 'grad_norm': 0.6963258697924865, 'learning_rate': 5.30470622299819e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_0.png 2025-08-28 10:27:20.992671 load time: 1305.02 ms
50%|████▉ | 10965/22095 [18:29:24<11:35:24, 3.75s/it] {'loss': 0.2995, 'grad_norm': 0.6253787977907792, 'learning_rate': 5.303974660948669e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 10966/22095 [18:29:34<17:01:04, 5.50s/it] {'loss': 0.4814, 'grad_norm': 0.5194836757059452, 'learning_rate': 5.3032430923676635e-06, 'epoch': 0.5}
50%|████▉ | 10967/22095 [18:29:37<14:43:09, 4.76s/it] {'loss': 0.3655, 'grad_norm': 0.7144946249067363, 'learning_rate': 5.302511517270897e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (68031 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62657 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67127 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125784 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56499 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10968/22095 [18:29:40<12:55:54, 4.18s/it] {'loss': 0.3143, 'grad_norm': 0.6770453635481531, 'learning_rate': 5.301779935674087e-06, 'epoch': 0.5}
50%|████▉ | 10969/22095 [18:29:43<11:46:48, 3.81s/it] {'loss': 0.3671, 'grad_norm': 0.6699003525392917, 'learning_rate': 5.301048347592952e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (72506 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 10970/22095 [18:29:46<11:22:07, 3.68s/it] {'loss': 0.3028, 'grad_norm': 0.7024479803516988, 'learning_rate': 5.300316753043214e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (74007 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10971/22095 [18:29:50<11:22:41, 3.68s/it] {'loss': 0.3316, 'grad_norm': 0.6159114504436758, 'learning_rate': 5.299585152040592e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (95641 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46672 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10972/22095 [18:29:52<10:33:28, 3.42s/it] {'loss': 0.3352, 'grad_norm': 0.6961387616550401, 'learning_rate': 5.298853544600802e-06, 'epoch': 0.5}
50%|████▉ | 10973/22095 [18:29:56<10:21:46, 3.35s/it] {'loss': 0.3507, 'grad_norm': 0.628516422333494, 'learning_rate': 5.298121930739571e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (61772 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50754 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10974/22095 [18:30:00<10:53:29, 3.53s/it] {'loss': 0.3468, 'grad_norm': 0.6162577726628683, 'learning_rate': 5.297390310472612e-06, 'epoch': 0.5}
50%|████▉ | 10975/22095 [18:30:02<10:12:22, 3.30s/it] {'loss': 0.3158, 'grad_norm': 0.6160097811332337, 'learning_rate': 5.29665868381565e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 10976/22095 [18:30:12<16:10:29, 5.24s/it] {'loss': 0.4435, 'grad_norm': 0.3708650924765746, 'learning_rate': 5.295927050784404e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_4/images/step_0.png 2025-08-28 10:28:12.123808 load time: 1014.77 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:28:12.184633 load time: 1424.42 ms
50%|████▉ | 10977/22095 [18:30:18<16:59:17, 5.50s/it] {'loss': 0.4923, 'grad_norm': 0.34981254449603477, 'learning_rate': 5.295195411394595e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (100712 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10978/22095 [18:30:22<15:20:21, 4.97s/it] {'loss': 0.3009, 'grad_norm': 0.6413983531235229, 'learning_rate': 5.2944637656619415e-06, 'epoch': 0.5}
50%|████▉ | 10979/22095 [18:30:29<17:40:49, 5.73s/it] {'loss': 0.4772, 'grad_norm': 0.2988611839833011, 'learning_rate': 5.293732113602169e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 10:28:29.232791 load time: 1306.7 ms
50%|████▉ | 10980/22095 [18:30:38<20:39:31, 6.69s/it] {'loss': 0.457, 'grad_norm': 0.30784643760060076, 'learning_rate': 5.293000455230992e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 364, but got module 1
50%|████▉ | 10981/22095 [18:30:42<17:48:39, 5.77s/it] {'loss': 0.3285, 'grad_norm': 0.5847235092637628, 'learning_rate': 5.292268790564138e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_185849_2/images/before_screenshot_7_id_27_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:28:42.094613 load time: 1212.11 ms
50%|████▉ | 10982/22095 [18:30:52<21:32:13, 6.98s/it] {'loss': 0.467, 'grad_norm': 0.33329237159104297, 'learning_rate': 5.291537119617322e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250612/mac/images/settings/ce733e79-9bcf-4d47-8303-951a3a1ae194/images/step_0.png 2025-08-28 10:28:51.809691 load time: 1007.49 ms
50%|████▉ | 10983/22095 [18:30:55<18:02:37, 5.85s/it] {'loss': 0.3166, 'grad_norm': 0.7333630104412473, 'learning_rate': 5.290805442406273e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_23.png 2025-08-28 10:28:53.804216 load time: 1059.31 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 10984/22095 [18:30:59<15:56:30, 5.17s/it] {'loss': 0.3379, 'grad_norm': 0.651647334533816, 'learning_rate': 5.290073758946705e-06, 'epoch': 0.5}
50%|████▉ | 10985/22095 [18:31:03<14:55:39, 4.84s/it] {'loss': 0.3138, 'grad_norm': 0.6267299723598055, 'learning_rate': 5.289342069254345e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/chrome_2/images/step_0.png 2025-08-28 10:29:04.077063 load time: 1724.84 ms
50%|████▉ | 10986/22095 [18:31:12<18:44:27, 6.07s/it] {'loss': 0.5256, 'grad_norm': 0.348045624189992, 'learning_rate': 5.288610373344911e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (74353 > 40960).
Running this sequence through the model will result in indexing errors
50%|████▉ | 10987/22095 [18:31:15<16:14:44, 5.27s/it] {'loss': 0.3337, 'grad_norm': 0.642822543396966, 'learning_rate': 5.287878671234127e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8937804 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 60957, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nA. \\frac{11}{2}cm\nB. 4cm\nC. \\frac{9}{2}cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (86255 > 40960). Running this sequence through the model will result in indexing errors
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_1/images/step_3.png 2025-08-28 10:29:14.352524 load time: 1323.16 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (123110 > 40960). Running this sequence through the model will result in indexing errors
VC:s3://gui-agent/data_20250612/mac/images/desktop/d3102782-cd38-4f40-b8d1-eb8388e87c99/images/step_3.png 2025-08-28 10:29:15.583148 load time: 1342.23 ms
50%|████▉ | 10988/22095 [18:31:19<15:16:31, 4.95s/it] {'loss': 0.3539, 'grad_norm': 0.6261370129498127, 'learning_rate': 5.287146962937715e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/terminal/e42e08f7-e360-485b-bab8-238070c09891/images/step_0.png 2025-08-28 10:29:18.798778 load time: 1016.43 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 10:29:20.396873 load time: 1430.65 ms
50%|████▉ | 10989/22095 [18:31:28<18:40:37, 6.05s/it] {'loss': 0.4861, 'grad_norm': 0.31546532431943075, 'learning_rate': 5.286415248471397e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8885298 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8451, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 4cm\nB. 1cm\nC. 1.5cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [259, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8432623 in VC:s3://internvl-moe-sft-data/. Exception: Image size [259, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63133, 'image': 'vrdu_texteq/astro-ph.CO/96952c03-6183-4487-bb31-d2e053ea1e07.png', 'image_wh': [[259, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'where $ic$ is defined as'}]}
50%|████▉ | 10990/22095 [18:31:32<16:29:37, 5.35s/it] {'loss': 0.3291, 'grad_norm': 0.8233396237313507, 'learning_rate': 5.285683527850892e-06, 'epoch': 0.5}
50%|████▉ | 10991/22095 [18:31:34<14:14:31, 4.62s/it] {'loss': 0.3274, 'grad_norm': 0.6464844367231789, 'learning_rate': 5.284951801091929e-06, 'epoch': 0.5}
50%|████▉ | 10992/22095 [18:31:38<13:22:26, 4.34s/it] {'loss': 0.3618, 'grad_norm': 0.6104061184013967, 'learning_rate': 5.284220068210225e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/terminal/d575fbf7-ad0d-4665-94ba-472d47b74314/images/step_1.png 2025-08-28 10:29:37.920242 load time: 1244.18 ms
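[Editor's note] The recurring tokenizer warning "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" means some samples exceed the model's 40960-token context and would index out of range if fed through unmodified. A minimal sketch of a guard one could apply at collation time; the helper is a hypothetical illustration, not the run's actual handling:

```python
# Hypothetical guard: truncate token-id sequences to the model's maximum
# context so overlong samples (e.g. the 125784-token one in the log) never
# reach the model's position embeddings.
MODEL_MAX_LEN = 40960  # limit cited by the tokenizer warnings

def clamp_to_max_len(input_ids: list, max_len: int = MODEL_MAX_LEN) -> list:
    """Return input_ids truncated to at most max_len tokens."""
    if len(input_ids) > max_len:
        # Same condition the tokenizer warns about: len(input_ids) > model max.
        return input_ids[:max_len]
    return input_ids

overlong = list(range(45969))  # same length as the first warning in the log
clamped = clamp_to_max_len(overlong)
```

Truncation silently drops the tail of the conversation, so in practice one might instead skip or split such samples; either way the warning stops being fatal.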
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 10:29:37.673819 load time: 1469.49 ms
50%|████▉ | 10993/22095 [18:31:41<12:01:57, 3.90s/it] {'loss': 0.3895, 'grad_norm': 0.6293890350828011, 'learning_rate': 5.283488329221506e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (53272 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44466 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10994/22095 [18:31:44<11:07:50, 3.61s/it] {'loss': 0.3484, 'grad_norm': 0.6475500629074776, 'learning_rate': 5.2827565841414915e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 10995/22095 [18:31:53<16:10:45, 5.25s/it] {'loss': 0.4795, 'grad_norm': 0.35003481742079523, 'learning_rate': 5.282024832985908e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (107918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44307 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10996/22095 [18:31:57<15:10:41, 4.92s/it] {'loss': 0.3034, 'grad_norm': 0.6124094693385026, 'learning_rate': 5.281293075770476e-06, 'epoch': 0.5}
50%|████▉ | 10997/22095 [18:32:01<14:28:03, 4.69s/it] {'loss': 0.3426, 'grad_norm': 0.6289972302871515, 'learning_rate': 5.280561312510921e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (42760 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 10998/22095 [18:32:05<13:36:38, 4.42s/it] {'loss': 0.326, 'grad_norm': 0.6095955157067299, 'learning_rate': 5.279829543222963e-06, 'epoch': 0.5}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20411.png 2025-08-28 10:29:55.971193 load time: 6770.46 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39137.png 2025-08-28 10:30:04.903730 load time: 1013.6 ms
50%|████▉ | 10999/22095 [18:32:08<12:39:08, 4.10s/it] {'loss': 0.3287, 'grad_norm': 0.660604168797498, 'learning_rate': 5.27909776792233e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (86671 > 40960).
Running this sequence through the model will result in indexing errors
50%|████▉ | 11000/22095 [18:32:12<12:06:00, 3.93s/it] {'loss': 0.3563, 'grad_norm': 0.6395277156110963, 'learning_rate': 5.278365986624743e-06, 'epoch': 0.5}
50%|████▉ | 11001/22095 [18:32:15<11:18:05, 3.67s/it] {'loss': 0.3451, 'grad_norm': 0.6254936495960255, 'learning_rate': 5.277634199345924e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 11002/22095 [18:32:24<15:57:50, 5.18s/it] {'loss': 0.4898, 'grad_norm': 0.2960462330487585, 'learning_rate': 5.2769024061016e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [50, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358070 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24781, 'image': 'vrdu_table_final_2/astro-ph.CO/62daabb4-21d4-45e0-850f-bc35440bf10f.png', 'image_wh': [[50, 9]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$\\am=-\\ab$\\end{tabular}\n```'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 10:30:23.033475 load time: 1022.11 ms
50%|████▉ | 11003/22095 [18:32:28<15:07:26, 4.91s/it] {'loss': 0.3291, 'grad_norm': 0.6469695831600601, 'learning_rate': 5.276170606907492e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 11004/22095 [18:32:38<19:29:54, 6.33s/it] {'loss': 0.4513, 'grad_norm': 0.2650054780274163, 'learning_rate': 5.275438801779328e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 11005/22095 [18:32:41<16:29:26, 5.35s/it] {'loss': 0.3008, 'grad_norm': 0.5869666126213625, 'learning_rate': 5.27470699073283e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30254.png 2025-08-28 10:30:40.547018 load time: 1164.68 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/cf183c3a-83c3-4eb7-a60d-c6cbfaa27f3e/images/step_3.png 2025-08-28 10:30:40.574098 load time: 1380.8 ms
50%|████▉ | 11006/22095 [18:32:51<21:14:29, 6.90s/it] {'loss': 0.4997, 'grad_norm': 0.2852893613309126, 'learning_rate': 5.273975173783721e-06, 'epoch': 0.5}
50%|████▉ | 11007/22095 [18:32:55<18:22:50, 5.97s/it] {'loss': 0.3148, 'grad_norm': 0.5978448571071595, 'learning_rate': 5.273243350947728e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 11008/22095 [18:33:04<21:22:10, 6.94s/it] {'loss': 0.4952, 'grad_norm': 0.2849176493083047, 'learning_rate': 5.272511522240574e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250624/web/images/yang_0626114758/google_com_0626162543/img/12.png 2025-08-28 10:31:03.054041 load time: 1017.92 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/d23d7144-6085-4f97-99fe-8fff75db1ef9/images/step_0.png 2025-08-28 10:31:04.192415 load time: 1107.47 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_5.png 2025-08-28 10:31:04.738930 load time: 1082.63 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 10:31:04.151271 load time: 1639.54 ms
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 10:31:04.387348 load time: 1474.85 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:31:04.874768 load time: 1035.64 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_1/images/step_0.png 2025-08-28 10:31:05.031294 load time: 1190.97 ms
50%|████▉ | 11009/22095 [18:33:08<18:32:39, 6.02s/it] {'loss': 0.3416, 'grad_norm': 0.586350458099298, 'learning_rate': 5.271779687677984e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 11010/22095 [18:33:12<16:28:51, 5.35s/it] {'loss': 0.3097, 'grad_norm': 0.6386831237999933, 'learning_rate': 5.271047847275685e-06, 'epoch': 0.5}
50%|████▉ | 11011/22095 [18:33:36<33:47:34, 10.98s/it] {'loss': 0.3842, 'grad_norm': 0.6339958723931352, 'learning_rate': 5.270316001049398e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54018 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57352 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48499 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106850 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 11012/22095 [18:33:46<32:31:18, 10.56s/it] {'loss': 0.4807, 'grad_norm': 0.29838913407663087, 'learning_rate': 5.269584149014852e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8953367 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4202, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 5cm\nB. 无法确定\nC. 1cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
50%|████▉ | 11013/22095 [18:33:49<25:53:43, 8.41s/it] {'loss': 0.3287, 'grad_norm': 0.6826512568699339, 'learning_rate': 5.268852291187771e-06, 'epoch': 0.5}
50%|████▉ | 11014/22095 [18:33:52<21:10:03, 6.88s/it] {'loss': 0.3152, 'grad_norm': 0.5931840879498935, 'learning_rate': 5.2681204275838785e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (42531 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (141139 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46813 > 40960). Running this sequence through the model will result in indexing errors
50%|████▉ | 11015/22095 [18:33:56<18:17:28, 5.94s/it] {'loss': 0.3263, 'grad_norm': 0.6657488830808198, 'learning_rate': 5.267388558218902e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 11016/22095 [18:33:59<15:26:47, 5.02s/it] {'loss': 0.3041, 'grad_norm': 0.6057858608335768, 'learning_rate': 5.266656683108566e-06, 'epoch': 0.5}
50%|████▉ | 11017/22095 [18:34:02<13:36:25, 4.42s/it] {'loss': 0.3572, 'grad_norm': 0.6745789119505499, 'learning_rate': 5.265924802268598e-06, 'epoch': 0.5}
50%|████▉ | 11018/22095 [18:34:05<12:18:24, 4.00s/it] {'loss': 0.3098, 'grad_norm': 0.5747209305871905, 'learning_rate': 5.265192915714723e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 11019/22095 [18:34:08<11:25:54, 3.72s/it] {'loss': 0.2639, 'grad_norm': 0.6218352907198297, 'learning_rate': 5.2644610234626646e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
50%|████▉ | 11020/22095 [18:34:18<16:45:24, 5.45s/it] {'loss': 0.4646, 'grad_norm': 0.3348861378427615, 'learning_rate': 5.2637291255281545e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398223 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 374, 'image': 'vrdu_table_final_2/astro-ph.CO/0fea7510-b803-41b9-94b7-e34373412534.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]}
50%|████▉ | 11021/22095 [18:34:21<15:01:01, 4.88s/it] {'loss': 0.304, 'grad_norm': 0.6534352481784338, 'learning_rate': 5.262997221926912e-06, 'epoch': 0.5}
50%|████▉ | 11022/22095 [18:34:24<13:31:04, 4.39s/it] {'loss': 0.3518, 'grad_norm': 0.6371580418242253, 'learning_rate': 5.262265312674669e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (60040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51178 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63151 > 40960).
Running this sequence through the model will result in indexing errors 50%|████▉ | 11023/22095 [18:34:35<19:05:40, 6.21s/it] {'loss': 0.4731, 'grad_norm': 0.29895083134067113, 'learning_rate': 5.261533397787149e-06, 'epoch': 0.5} 50%|████▉ | 11023/22095 [18:34:35<19:05:40, 6.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59693 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48499 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44146 > 40960). Running this sequence through the model will result in indexing errors 50%|████▉ | 11024/22095 [18:34:39<16:59:28, 5.53s/it] {'loss': 0.3198, 'grad_norm': 0.6893328823430755, 'learning_rate': 5.26080147728008e-06, 'epoch': 0.5} 50%|████▉ | 11024/22095 [18:34:39<16:59:28, 5.53s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [442, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8502663 in VC:s3://internvl-moe-sft-data/. Exception: Image size [442, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 76738, 'image': 'vrdu_texteq/astro-ph.CO/b6efd79c-be3d-403d-ac15-c983e2dd0212.png', 'image_wh': [[442, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'where $N$ is a normalisation constant'}]} 50%|████▉ | 11025/22095 [18:34:42<14:49:04, 4.82s/it] {'loss': 0.3078, 'grad_norm': 0.6131173186560834, 'learning_rate': 5.260069551169187e-06, 'epoch': 0.5} 50%|████▉ | 11025/22095 [18:34:42<14:49:04, 4.82s/it] 50%|████▉ | 11026/22095 [18:35:03<30:03:46, 9.78s/it] {'loss': 0.3216, 'grad_norm': 0.6330615061810914, 'learning_rate': 5.2593376194702e-06, 'epoch': 0.5} 50%|████▉ | 11026/22095 [18:35:03<30:03:46, 9.78s/it] 50%|████▉ | 11027/22095 [18:35:07<24:09:25, 7.86s/it] {'loss': 0.3118, 'grad_norm': 0.6459216134848859, 'learning_rate': 5.258605682198842e-06, 'epoch': 0.5} 50%|████▉ | 11027/22095 [18:35:07<24:09:25, 7.86s/it] 50%|████▉ | 11028/22095 [18:35:11<20:44:07, 6.75s/it] {'loss': 0.3307, 'grad_norm': 0.6205966481412192, 'learning_rate': 5.2578737393708435e-06, 'epoch': 0.5} 50%|████▉ | 11028/22095 [18:35:11<20:44:07, 6.75s/it] 50%|████▉ | 11029/22095 [18:35:14<17:51:37, 5.81s/it] {'loss': 0.3494, 'grad_norm': 0.7227561225552619, 'learning_rate': 5.257141791001931e-06, 'epoch': 0.5} 50%|████▉ | 11029/22095 [18:35:14<17:51:37, 5.81s/it]VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/images/os_ubuntu/handmade_annotation_30/images/paste_Screenshot from 2025-07-04 18-11-22_id_4_function_2_crop_1_grounding_instructions_random.png 2025-08-28 10:33:14.261218 load time: 1266.33 ms 50%|████▉ | 11030/22095 [18:35:17<15:15:49, 4.97s/it] {'loss': 0.3502, 'grad_norm': 0.5829741951280177, 'learning_rate': 5.256409837107828e-06, 'epoch': 0.5} 50%|████▉ | 11030/22095 [18:35:17<15:15:49, 4.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
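The `ValueError: Image size ... is too small. Minimum size is 28.` tracebacks above all come from samples whose images are below the vision encoder's minimum side length. A minimal pre-filter sketch that would drop such samples before training, assuming each sample records its size under `image_wh` as `[[width, height]]` the way the problematic samples printed above do (the helper names are hypothetical, not part of the actual training code):

```python
MIN_IMAGE_SIDE = 28  # minimum side length reported in the errors above


def is_image_large_enough(sample, min_side=MIN_IMAGE_SIDE):
    """Return True if every image in the sample meets the minimum size."""
    for width, height in sample.get("image_wh", []):
        if width < min_side or height < min_side:
            return False
    return True


def filter_samples(samples, min_side=MIN_IMAGE_SIDE):
    """Drop samples whose images would raise ValueError in the data loader."""
    return [s for s in samples if is_image_large_enough(s, min_side)]
```

Running such a filter once over the dataset manifest would avoid the repeated fetch/retry cycles visible in this log.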
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 10:33:16.195482 load time: 1143.44 ms
50%|████▉ | 11031/22095 [18:35:27<19:29:17, 6.34s/it] {'loss': 0.5144, 'grad_norm': 0.3573351925053009, 'learning_rate': 5.255677877704269e-06, 'epoch': 0.5}
50%|████▉ | 11032/22095 [18:35:37<22:56:10, 7.46s/it] {'loss': 0.4578, 'grad_norm': 0.4306011684363518, 'learning_rate': 5.254945912806977e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
50%|████▉ | 11033/22095 [18:35:41<19:41:37, 6.41s/it] {'loss': 0.3542, 'grad_norm': 0.5924615467188293, 'learning_rate': 5.254213942431679e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/mac/images/terminal/af851dfd-b7ce-4e95-95cf-c0fce6b8bb15/images/step_2.png 2025-08-28 10:33:39.775623 load time: 1020.54 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_1/images/step_0.png 2025-08-28 10:33:40.392888 load time: 1232.74 ms
50%|████▉ | 11034/22095 [18:35:45<17:34:41, 5.72s/it] {'loss': 0.3409, 'grad_norm': 0.6294281367848047, 'learning_rate': 5.253481966594104e-06, 'epoch': 0.5}
50%|████▉ | 11035/22095 [18:35:49<16:00:09, 5.21s/it] {'loss': 0.3123, 'grad_norm': 0.624064714320788, 'learning_rate': 5.25274998530998e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_1.png 2025-08-28 10:33:49.097997 load time: 1109.32 ms
50%|████▉ | 11036/22095 [18:36:10<30:50:40, 10.04s/it] {'loss': 0.2917, 'grad_norm': 0.7475311399192298, 'learning_rate': 5.252017998595036e-06, 'epoch': 0.5}
50%|████▉ | 11037/22095 [18:36:14<25:01:53, 8.15s/it] {'loss': 0.3123, 'grad_norm': 0.6201713220823594, 'learning_rate': 5.2512860064649985e-06, 'epoch': 0.5}
50%|████▉ | 11038/22095 [18:36:18<20:45:35, 6.76s/it] {'loss': 0.3383, 'grad_norm': 0.6376182014202468, 'learning_rate': 5.250554008935596e-06, 'epoch': 0.5}
50%|████▉ | 11039/22095 [18:36:21<17:14:28, 5.61s/it] {'loss': 0.2996, 'grad_norm': 0.6267144751580137, 'learning_rate': 5.24982200602256e-06, 'epoch': 0.5}
50%|████▉ | 11040/22095 [18:36:43<32:55:12, 10.72s/it] {'loss': 0.3086, 'grad_norm': 0.6305389310730376, 'learning_rate': 5.249089997741613e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/safari_1/images/step_0.png 2025-08-28 10:34:44.362117 load time: 1048.11 ms
50%|████▉ | 11041/22095 [18:36:47<26:27:37, 8.62s/it] {'loss': 0.2934, 'grad_norm': 0.6201450886917379, 'learning_rate': 5.248357984108489e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
50%|████▉ | 11042/22095 [18:36:50<21:10:23, 6.90s/it] {'loss': 0.3555, 'grad_norm': 0.6182082306856194, 'learning_rate': 5.247625965138915e-06, 'epoch': 0.5}
50%|████▉ | 11043/22095 [18:37:12<35:01:49, 11.41s/it] {'loss': 0.35, 'grad_norm': 0.6015189328325169, 'learning_rate': 5.246893940848619e-06, 'epoch': 0.5}
50%|████▉ | 11044/22095 [18:37:15<27:29:01, 8.95s/it] {'loss': 0.3129, 'grad_norm': 0.5770996555177424, 'learning_rate': 5.24616191125333e-06, 'epoch': 0.5}
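The recurring pair `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` indicates the loader repairs prompts whose image-placeholder count disagrees with the number of attached images. A hypothetical sketch of such a repair (the `<image>` placeholder string and the function name are assumptions based on the prompts printed in this log, not the actual training code):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder string used in the prompts


def fix_image_tokens(text, num_images):
    """Prepend missing image placeholders so the count matches num_images."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing > 0:
        # one placeholder per line, ahead of the user text
        text = (IMAGE_TOKEN + "\n") * missing + text
    return text
```

A prompt that already contains the right number of placeholders passes through unchanged.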
50%|████▉ | 11045/22095 [18:37:18<21:56:53, 7.15s/it] {'loss': 0.3085, 'grad_norm': 0.6121094238797752, 'learning_rate': 5.245429876368777e-06, 'epoch': 0.5}
50%|████▉ | 11046/22095 [18:37:40<35:12:28, 11.47s/it] {'loss': 0.373, 'grad_norm': 0.6523235892928385, 'learning_rate': 5.244697836210691e-06, 'epoch': 0.5}
50%|████▉ | 11047/22095 [18:37:43<27:52:09, 9.08s/it] {'loss': 0.3459, 'grad_norm': 0.6345743591337244, 'learning_rate': 5.2439657907948005e-06, 'epoch': 0.5}
50%|█████ | 11048/22095 [18:37:46<22:09:13, 7.22s/it] {'loss': 0.3521, 'grad_norm': 0.6277061830033281, 'learning_rate': 5.243233740136833e-06, 'epoch': 0.5}
50%|█████ | 11049/22095 [18:37:49<18:33:47, 6.05s/it] {'loss': 0.3166, 'grad_norm': 0.584450845482722, 'learning_rate': 5.24250168425252e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (49216 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70111 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48324 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48016 > 40960). Running this sequence through the model will result in indexing errors
50%|█████ | 11050/22095 [18:37:52<15:42:59, 5.12s/it] {'loss': 0.3242, 'grad_norm': 0.6260654492263931, 'learning_rate': 5.241769623157591e-06, 'epoch': 0.5}
50%|█████ | 11051/22095 [18:38:32<47:38:32, 15.53s/it] {'loss': 0.3088, 'grad_norm': 0.6012131043558517, 'learning_rate': 5.241037556867775e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (41611 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44752 > 40960). Running this sequence through the model will result in indexing errors
50%|█████ | 11052/22095 [18:38:35<36:16:20, 11.82s/it] {'loss': 0.3373, 'grad_norm': 0.7609827413186744, 'learning_rate': 5.2403054853988025e-06, 'epoch': 0.5}
50%|█████ | 11053/22095 [18:38:58<46:10:27, 15.05s/it] {'loss': 0.3348, 'grad_norm': 0.6481397861005885, 'learning_rate': 5.239573408766402e-06, 'epoch': 0.5}
50%|█████ | 11054/22095 [18:39:01<34:54:41, 11.38s/it] {'loss': 0.3084, 'grad_norm': 0.6230503792736052, 'learning_rate': 5.2388413269863046e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_2/images/step_0.png 2025-08-28 10:37:01.862859 load time: 1090.7 ms
50%|█████ | 11055/22095 [18:39:58<77:36:52, 25.31s/it] {'loss': 0.3683, 'grad_norm': 0.6175391721691664, 'learning_rate': 5.238109240074242e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047101 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 8cm\nB. 16cm\nC. 32cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_0.png 2025-08-28 10:37:58.270711 load time: 1128.32 ms
50%|█████ | 11056/22095 [18:40:20<74:16:53, 24.22s/it] {'loss': 0.3168, 'grad_norm': 0.6082765218977666, 'learning_rate': 5.237377148045942e-06, 'epoch': 0.5}
50%|█████ | 11057/22095 [18:41:00<88:25:03, 28.84s/it] {'loss': 0.322, 'grad_norm': 0.6186954600485816, 'learning_rate': 5.236645050917137e-06, 'epoch': 0.5}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_121220.png 2025-08-28 10:38:58.449012 load time: 1069.6 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 10:38:59.527772 load time: 1249.35 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 10:39:00.395039 load time: 1313.55 ms
50%|█████ | 11058/22095 [18:41:09<70:50:33, 23.11s/it] {'loss': 0.5052, 'grad_norm': 0.4754584167898786, 'learning_rate': 5.235912948703557e-06, 'epoch': 0.5}
VC:s3://internvl-moe-sft-data/vrdu_texteq/astro-ph.CO/ab541429-647f-4aee-8b7c-7c3bd2b857c0.png 2025-08-28 10:39:08.187112 load time: 1018.75 ms
50%|█████ | 11059/22095 [18:41:33<71:44:27, 23.40s/it] {'loss': 0.3247, 'grad_norm': 0.5948232513771863, 'learning_rate': 5.235180841420932e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43323 > 40960). Running this sequence through the model will result in indexing errors
50%|█████ | 11060/22095 [18:41:40<56:38:12, 18.48s/it] {'loss': 0.4682, 'grad_norm': 0.3340250369796509, 'learning_rate': 5.234448729084993e-06, 'epoch': 0.5}
VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/f714496456164947d0469827beb647f7.png 2025-08-28 10:39:39.261023 load time: 1041.18 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047736 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 7cm\nB. 8cm\nC. 5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly.'}, {'from': 'gpt', 'value': 'A'}]}
50%|█████ | 11061/22095 [18:41:44<42:33:29, 13.89s/it] {'loss': 0.3387, 'grad_norm': 0.6822242617146005, 'learning_rate': 5.233716611711469e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359351 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26071, 'image': 'vrdu_table_final_2/astro-ph.CO/e4adb042-fcba-4b31-b6e6-97115f0a187c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
50%|█████ | 11062/22095 [18:42:08<52:21:43, 17.09s/it] {'loss': 0.3623, 'grad_norm': 0.6499428614474025, 'learning_rate': 5.232984489316095e-06, 'epoch': 0.5}
50%|█████ | 11063/22095 [18:42:52<76:33:32, 24.98s/it] {'loss': 0.2922, 'grad_norm': 0.5985211579875924, 'learning_rate': 5.2322523619146e-06, 'epoch': 0.5}
Token indices sequence length is longer than the specified maximum sequence length for this model (58197 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42010 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105684 > 40960). Running this sequence through the model will result in indexing errors
50%|█████ | 11064/22095 [18:43:32<90:32:04, 29.55s/it] {'loss': 0.3392, 'grad_norm': 0.6165914090170149, 'learning_rate': 5.2315202295227144e-06, 'epoch': 0.5}
50%|█████ | 11065/22095 [18:43:53<83:00:52, 27.09s/it] {'loss': 0.3121, 'grad_norm': 0.5905261648096668, 'learning_rate': 5.2307880921561695e-06, 'epoch': 0.5}
VC:s3://gui/aguvis/aguvis-stage1/widget_captioning/images/61794.jpg 2025-08-28 10:41:51.967814 load time: 1025.15 ms
50%|█████ | 11066/22095 [18:44:15<77:44:53, 25.38s/it] {'loss': 0.3261, 'grad_norm': 0.6898713814030945, 'learning_rate': 5.230055949830698e-06, 'epoch': 0.5}
50%|█████ | 11067/22095 [18:44:38<76:06:15, 24.84s/it] {'loss': 0.353, 'grad_norm': 0.6222355567404058, 'learning_rate': 5.229323802562031e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/iphone/images/Translate/Iter_11/images/Step_8.png 2025-08-28 10:42:36.933815 load time: 1025.42 ms
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-1_92596554-split-5.jpg 2025-08-28 10:42:36.933466 load time: 1026.71 ms
VC:s3://internvl2/datasets/screen2words/images/0010736.jpg 2025-08-28 10:42:36.932854 load time: 1026.94 ms
VC:s3://gui/visual_inputs/multi_modal/agent_data/rico/dataset/image/52263.jpg 2025-08-28 10:42:36.931623 load time: 1045.46 ms
VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/images/android_studio_mac/handmade_annotation_10/images/10_id_3_function_0_crop_0_grounding_instructions_random_pure_paste.png 2025-08-28 10:42:36.932121 load time: 1043.42 ms
50%|█████ | 11068/22095 [18:44:45<59:27:01, 19.41s/it] {'loss': 0.4744, 'grad_norm': 0.5075586203431988, 'learning_rate': 5.2285916503659e-06, 'epoch': 0.5}
VC:s3://gui-agent/data_20250612/web/images/yang_0613164240/10_140_52_49_0613164415/img/2.png 2025-08-28 10:42:43.661947 load time: 1162.1 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 10:42:43.663201 load time: 1065.02 ms
50%|█████ | 11069/22095 [18:44:48<44:38:54, 14.58s/it] {'loss': 0.3207, 'grad_norm': 0.6291194527210394, 'learning_rate': 5.227859493258035e-06, 'epoch': 0.5}
50%|█████ | 11070/22095 [18:45:28<68:14:01, 22.28s/it] {'loss': 0.3069, 'grad_norm': 0.6418232667533473, 'learning_rate': 5.227127331254171e-06, 'epoch': 0.5}
50%|█████ | 11071/22095 [18:46:11<87:09:35, 28.46s/it] {'loss': 0.3068, 'grad_norm': 0.6018724740067458, 'learning_rate': 5.226395164370038e-06, 'epoch': 0.5}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240827_043425_before_screenshot_sub3.png 2025-08-28 10:44:10.106668 load time: 1036.3 ms
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/63242.jpg 2025-08-28 10:44:10.105041 load time: 1048.83 ms
VC:s3://gui-agent/data_20250421/web/images/gmail/trajectory_146/img/step_5.png 2025-08-28 10:44:10.107081 load time: 1049.09 ms
50%|█████ | 11072/22095 [18:46:58<104:02:42, 33.98s/it] {'loss': 0.454, 'grad_norm': 0.3252219454503633, 'learning_rate': 5.225662992621367e-06, 'epoch': 0.5}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_062701_before_screenshot_sub0.png 2025-08-28 10:44:56.960836 load time: 1037.37 ms
VC:s3://gui-agent/agentnet/ubuntu_images/5df511ab-1403-4e9d-8792-c99f2e994214.png 2025-08-28 10:44:56.962019 load time: 1076.35 ms
50%|█████ | 11073/22095 [18:47:20<92:47:00, 30.30s/it] {'loss': 0.35, 'grad_norm': 0.7285264576358847, 'learning_rate': 5.224930816023892e-06, 'epoch': 0.5}
50%|█████ | 11074/22095 [18:48:37<135:27:48, 44.25s/it] {'loss': 0.3039, 'grad_norm': 0.6205010962097538, 'learning_rate': 5.224198634593344e-06, 'epoch': 0.5}
50%|█████ | 11075/22095 [18:48:59<115:08:04, 37.61s/it] {'loss': 0.3091, 'grad_norm': 0.594041615008846, 'learning_rate': 5.223466448345457e-06, 'epoch': 0.5}
50%|█████ | 11076/22095 [18:49:58<134:59:02, 44.10s/it] {'loss': 0.3225, 'grad_norm': 0.6321829785744144, 'learning_rate': 5.222734257295963e-06, 'epoch': 0.5}
VC:s3://internvl2/datasets/VCR-wiki-en-easy/images/0013926.jpg 2025-08-28 10:47:56.843641 load time: 1023.91 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:47:56.843597 load time: 1040.74 ms
50%|█████ | 11077/22095 [18:50:20<114:54:22, 37.54s/it] {'loss': 0.2942, 'grad_norm': 0.6118644687548812, 'learning_rate': 5.222002061460592e-06, 'epoch': 0.5}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_139818.png 2025-08-28 10:48:19.096337 load time: 1050.32 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_0.png 2025-08-28 10:48:19.470701 load time: 1023.87 ms
50%|█████ | 11078/22095 [18:51:01<117:49:09, 38.50s/it] {'loss': 0.3398, 'grad_norm': 0.7053326924588188, 'learning_rate': 5.22126986085508e-06, 'epoch': 0.5}
50%|█████ | 11079/22095 [18:51:04<85:15:46, 27.86s/it] {'loss': 0.3353, 'grad_norm': 0.6527266573297478, 'learning_rate': 5.220537655495156e-06, 'epoch': 0.5}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38206.png 2025-08-28 10:49:01.094077 load time: 1843.89 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (41018 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84071 > 40960). Running this sequence through the model will result in indexing errors
50%|█████ | 11080/22095 [18:51:46<98:01:58, 32.04s/it] {'loss': 0.362, 'grad_norm': 0.6638030771401946, 'learning_rate': 5.219805445396558e-06, 'epoch': 0.5}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047660 in VC:s3://multi-modal/UniGeo/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]} VC:s3://internvl2/datasets/MMMUDataset/MMMU/Clinical_Medicine/test_238_image_1.png 2025-08-28 10:49:44.649722 load time: 1067.26 ms VC:s3://gui/data_20250328/icon_canva/images/mobile_1080x2340_1743154694_canvas.png 2025-08-28 10:49:44.646611 load time: 1051.82 ms VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/images/pycharm/handmade_annotation_3/images/pycharm_3_id_27_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:49:45.051090 load time: 1016.91 ms 50%|█████ | 11081/22095 [18:52:06<87:14:16, 28.51s/it] {'loss': 0.3098, 'grad_norm': 0.620148977612418, 'learning_rate': 5.219073230575014e-06, 'epoch': 0.5} 50%|█████ | 11081/22095 [18:52:06<87:14:16, 28.51s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11082/22095 [18:52:49<100:06:48, 32.73s/it] {'loss': 0.319, 'grad_norm': 0.6372724691585553, 'learning_rate': 5.218341011046259e-06, 'epoch': 0.5} 50%|█████ | 11082/22095 [18:52:49<100:06:48, 32.73s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/agent_data/OS-Atlas/androidworld/db1e3109-b1bc-47c8-a7ca-44dc437479ec.png 2025-08-28 10:50:47.490440 load time: 1041.64 ms 50%|█████ | 11083/22095 [18:53:11<90:29:48, 29.58s/it] {'loss': 0.3637, 'grad_norm': 0.6373736598102927, 'learning_rate': 5.217608786826028e-06, 'epoch': 0.5} 50%|█████ | 11083/22095 [18:53:11<90:29:48, 29.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77901 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101293 > 40960). 
Running this sequence through the model will result in indexing errors 50%|█████ | 11084/22095 [18:53:15<67:16:49, 22.00s/it] {'loss': 0.3337, 'grad_norm': 0.605732810497086, 'learning_rate': 5.216876557930052e-06, 'epoch': 0.5} 50%|█████ | 11084/22095 [18:53:15<67:16:49, 22.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49738 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47430 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (133143 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67437 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11085/22095 [18:53:58<85:52:16, 28.08s/it] {'loss': 0.3213, 'grad_norm': 0.6222119929539311, 'learning_rate': 5.216144324374064e-06, 'epoch': 0.5} 50%|█████ | 11085/22095 [18:53:58<85:52:16, 28.08s/it] 50%|█████ | 11086/22095 [18:54:57<115:01:44, 37.62s/it] {'loss': 0.3204, 'grad_norm': 0.6386771222508453, 'learning_rate': 5.215412086173798e-06, 'epoch': 0.5} 50%|█████ | 11086/22095 [18:54:57<115:01:44, 37.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. 
Minimum size is 28. [Try #0] Failed to fetch sample 8307515 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB24EgZXgoSMeJjSspaXXbMOFXa_!!752472399.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nLook at this picture and tell me what is written on it without outputting text coordinates.'}, {'from': 'gpt', 'value': 'All words in the image:\n热卖\n外開門専用\n无需装门拉手*\n一按即开再按即关\n保用2万次\n*\n美学\n强力拍门器'}]} 50%|█████ | 11087/22095 [18:55:19<99:52:41, 32.66s/it] {'loss': 0.3172, 'grad_norm': 0.6350812082463306, 'learning_rate': 5.214679843344989e-06, 'epoch': 0.5} 50%|█████ | 11087/22095 [18:55:19<99:52:41, 32.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 50%|█████ | 11088/22095 [18:55:47<95:53:57, 31.37s/it] {'loss': 0.477, 'grad_norm': 0.5767092330974389, 'learning_rate': 5.213947595903369e-06, 'epoch': 0.5} 50%|█████ | 11088/22095 [18:55:47<95:53:57, 31.37s/it] 50%|█████ | 11089/22095 [18:55:50<70:18:45, 23.00s/it] {'loss': 0.3516, 'grad_norm': 0.6352459911259863, 'learning_rate': 5.213215343864674e-06, 'epoch': 0.5} 50%|█████ | 11089/22095 [18:55:50<70:18:45, 23.00s/it]VC:s3://gui-agent/data_20250526/windows/images/inventor/20250513_095212_1/images/before_screenshot_3_id_82_function_2_crop_0_grounding_instructions_point_o.png 2025-08-28 10:53:49.091289 load time: 1041.35 ms VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/40a08a474d9a74d4c705c1cc06ab98e9.png 2025-08-28 10:53:49.091395 load time: 1044.47 ms 50%|█████ | 11090/22095 [18:57:10<121:57:22, 39.89s/it] {'loss': 0.3, 'grad_norm': 2.2769096522106915, 'learning_rate': 5.212483087244633e-06, 'epoch': 0.5} 50%|█████ | 11090/22095 [18:57:10<121:57:22, 39.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43076 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113671 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63093 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51557 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48744 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11091/22095 [18:58:46<174:03:12, 56.94s/it] {'loss': 0.3866, 'grad_norm': 0.6258735223163712, 'learning_rate': 5.211750826058986e-06, 'epoch': 0.5} 50%|█████ | 11091/22095 [18:58:46<174:03:12, 56.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8339019 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5650, 'image': 'vrdu_table_final_2/astro-ph.CO/e26eb4cc-ccc7-44cf-b895-f07fb10075dc.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]} 50%|█████ | 11092/22095 [18:59:16<149:25:36, 48.89s/it] {'loss': 0.4903, 'grad_norm': 0.3613147591509796, 'learning_rate': 5.211018560323462e-06, 'epoch': 0.5} 50%|█████ | 11092/22095 [18:59:16<149:25:36, 48.89s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 10:57:15.934954 load time: 1179.62 ms VC:s3://gui-agent/data_20250616/windows_paste/images/vivado/20250508_133204_622077_1/images/before_screenshot_1_id_0_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 10:57:16.468837 load time: 1164.73 ms 50%|█████ | 11093/22095 [18:59:45<130:29:38, 42.70s/it] {'loss': 0.4728, 'grad_norm': 0.36778750299336693, 'learning_rate': 5.2102862900537975e-06, 'epoch': 0.5} 50%|█████ | 11093/22095 [18:59:45<130:29:38, 42.70s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/5324683322975207_7.png 2025-08-28 10:57:43.491272 load time: 1031.55 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11094/22095 [19:00:47<148:38:17, 48.64s/it] {'loss': 0.3507, 'grad_norm': 0.6867057430629306, 'learning_rate': 5.209554015265727e-06, 'epoch': 0.5} 50%|█████ | 11094/22095 [19:00:47<148:38:17, 48.64s/it]VC:s3://multi-modal/Super-CLEVR/images/superCLEVR_new_002413.png 2025-08-28 10:58:45.994989 load time: 1025.32 ms VC:s3://gui-agent/data_20250707/windows/images/chrome/free_task_20250619_200657/images/20250619_200704_3.png 2025-08-28 10:58:45.992809 
load time: 1029.12 ms VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/data/tabs/other_screenshot/original/ProductCategoryTabs_1739986776.4921775.png 2025-08-28 10:58:45.993793 load time: 1037.04 ms VC:s3://gui-agent/data_20250612/web/images/yang_0526142648/10_140_52_49_0526142801/img/16.png 2025-08-28 10:58:45.992660 load time: 1080.92 ms 50%|█████ | 11095/22095 [19:00:50<106:56:57, 35.00s/it] {'loss': 0.3298, 'grad_norm': 0.668992589318168, 'learning_rate': 5.208821735974984e-06, 'epoch': 0.5} 50%|█████ | 11095/22095 [19:00:50<106:56:57, 35.00s/it] 50%|█████ | 11096/22095 [19:02:30<166:36:17, 54.53s/it] {'loss': 0.3499, 'grad_norm': 0.6752741459092744, 'learning_rate': 5.208089452197302e-06, 'epoch': 0.5} 50%|█████ | 11096/22095 [19:02:31<166:36:17, 54.53s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307960 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
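The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` tracebacks above come from a dataset-side guard in `data_qwen_2.py` that rejects samples whose image (or crop) has an edge shorter than 28 px before they reach the vision encoder. A minimal sketch of such a guard, assuming nothing about the real `_get_item` beyond the error text (the function name `check_image_size` is illustrative, not the actual API):

```python
def check_image_size(width, height, min_size=28):
    """Reject images with any dimension below the minimum edge length.

    Mirrors the log's guard: samples with image_wh like [0, 0] or
    [25, 23] raise before tokenization, so the loader can retry with
    a different sample instead of crashing the worker.
    """
    if width < min_size or height < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_size}."
        )
    return True
```

In the log this raise is caught upstream ("[Try #0] Failed to fetch sample ..."), logged together with the problematic sample, and the fetch is retried.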
Problematic sample: {'image': 'TB2G8XUbIrI8KJjy0FhXXbfnpXa_!!2655502098.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the text hidden in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n三角形电钻用锤\n形电锤用锤\n方'}]} 50%|█████ | 11097/22095 [19:02:52<136:11:10, 44.58s/it] {'loss': 0.3207, 'grad_norm': 0.6438409777835392, 'learning_rate': 5.20735716394842e-06, 'epoch': 0.5} 50%|█████ | 11097/22095 [19:02:52<136:11:10, 44.58s/it] 50%|█████ | 11098/22095 [19:03:33<133:17:50, 43.64s/it] {'loss': 0.3713, 'grad_norm': 0.6102737159370142, 'learning_rate': 5.206624871244066e-06, 'epoch': 0.5} 50%|█████ | 11098/22095 [19:03:33<133:17:50, 43.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (64841 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80309 > 40960). 
Running this sequence through the model will result in indexing errors 50%|█████ | 11099/22095 [19:04:02<119:16:06, 39.05s/it] {'loss': 0.4836, 'grad_norm': 0.433094389848305, 'learning_rate': 5.205892574099981e-06, 'epoch': 0.5} 50%|█████ | 11099/22095 [19:04:02<119:16:06, 39.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11100/22095 [19:04:05<86:47:13, 28.42s/it] {'loss': 0.3217, 'grad_norm': 0.6388238597155772, 'learning_rate': 5.205160272531895e-06, 'epoch': 0.5} 50%|█████ | 11100/22095 [19:04:05<86:47:13, 28.42s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [170, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8446536 in VC:s3://internvl-moe-sft-data/. Exception: Image size [170, 25, 100, 100] is too small. Minimum size is 28. 
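The paired messages "Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" indicate the loader repairs conversations whose text is missing `<image>` placeholders for the attached images. The actual repair logic is not shown in the log; a plausible sketch, assuming the common convention of one `<image>` token per image prepended to the first human turn (`fix_image_tokens` is a hypothetical name):

```python
def fix_image_tokens(conversations, num_images, token="<image>"):
    """Ensure the conversation carries one placeholder per attached image.

    Counts `token` occurrences across all turns; if fewer than
    num_images, prepends the missing placeholders to the first human
    message. This is a guess at what 'Fixed image tokens in the
    conversation' does, based only on the log messages.
    """
    count = sum(turn["value"].count(token) for turn in conversations)
    missing = num_images - count
    if missing > 0:
        for turn in conversations:
            if turn["from"] == "human":
                turn["value"] = token * missing + "\n" + turn["value"]
                break
    return conversations
```

Without such a repair, the collator would find no position at which to splice in the vision features and the sample would have to be dropped.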
Problematic sample: {'id': 94222, 'image': 'vrdu_texteq/astro-ph.CO/9d82ed61-0213-48ec-a98c-3eb653e4135a.png', 'image_wh': [[170, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'and $\\mathcal{M}_2$ with:'}]} VC:s3://gui-agent/data_20250714/windows/images/photoshop/free_task_20250709_161656/images/20250709_161658_1.png 2025-08-28 11:02:04.011394 load time: 1032.59 ms 50%|█████ | 11101/22095 [19:04:26<79:59:28, 26.19s/it] {'loss': 0.3346, 'grad_norm': 0.6274527566329003, 'learning_rate': 5.204427966555545e-06, 'epoch': 0.5} 50%|█████ | 11101/22095 [19:04:26<79:59:28, 26.19s/it] 50%|█████ | 11102/22095 [19:04:31<59:53:44, 19.61s/it] {'loss': 0.3253, 'grad_norm': 0.6025656163217872, 'learning_rate': 5.203695656186667e-06, 'epoch': 0.5} 50%|█████ | 11102/22095 [19:04:31<59:53:44, 19.61s/it] 50%|█████ | 11103/22095 [19:04:34<44:44:06, 14.65s/it] {'loss': 0.3418, 'grad_norm': 0.6551152806125176, 'learning_rate': 5.202963341440994e-06, 'epoch': 0.5} 50%|█████ | 11103/22095 [19:04:34<44:44:06, 14.65s/it] 50%|█████ | 11104/22095 [19:04:56<51:51:57, 16.99s/it] {'loss': 0.3047, 'grad_norm': 0.816953226291965, 'learning_rate': 5.202231022334262e-06, 'epoch': 0.5} 50%|█████ | 11104/22095 [19:04:56<51:51:57, 16.99s/it] 50%|█████ | 11105/22095 [19:04:59<39:15:49, 12.86s/it] {'loss': 0.3312, 'grad_norm': 0.6103218202483534, 'learning_rate': 5.201498698882207e-06, 'epoch': 0.5} 50%|█████ | 11105/22095 [19:04:59<39:15:49, 12.86s/it] 50%|█████ | 11106/22095 [19:05:40<64:45:14, 21.21s/it] {'loss': 0.3227, 'grad_norm': 0.6104779818808732, 'learning_rate': 5.200766371100564e-06, 'epoch': 0.5} 50%|█████ | 11106/22095 [19:05:40<64:45:14, 21.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui/aguvis/aguvis-stage2/guiact-web-single/images/5c67965a-5c0c-4aed-8b12-ef18f36fc811.jpg 2025-08-28 11:03:38.752879 load time: 1030.12 ms VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/18973.jpg 2025-08-28 
11:03:38.753323 load time: 1042.68 ms 50%|█████ | 11107/22095 [19:05:50<54:25:23, 17.83s/it] {'loss': 0.4819, 'grad_norm': 0.33092996473441333, 'learning_rate': 5.200034039005068e-06, 'epoch': 0.5} 50%|█████ | 11107/22095 [19:05:50<54:25:23, 17.83s/it] 50%|█████ | 11108/22095 [19:06:30<75:03:48, 24.60s/it] {'loss': 0.3256, 'grad_norm': 0.5760559752047034, 'learning_rate': 5.199301702611454e-06, 'epoch': 0.5} 50%|█████ | 11108/22095 [19:06:30<75:03:48, 24.60s/it] 50%|█████ | 11109/22095 [19:07:29<106:08:10, 34.78s/it] {'loss': 0.3257, 'grad_norm': 0.7355190355427469, 'learning_rate': 5.1985693619354604e-06, 'epoch': 0.5} 50%|█████ | 11109/22095 [19:07:29<106:08:10, 34.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43132 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11110/22095 [19:07:51<94:56:25, 31.11s/it] {'loss': 0.3348, 'grad_norm': 0.5909027528222714, 'learning_rate': 5.197837016992819e-06, 'epoch': 0.5} 50%|█████ | 11110/22095 [19:07:51<94:56:25, 31.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11111/22095 [19:08:15<87:57:43, 28.83s/it] {'loss': 0.3381, 'grad_norm': 0.7936770031677421, 'learning_rate': 5.1971046677992695e-06, 'epoch': 0.5} 50%|█████ | 11111/22095 [19:08:15<87:57:43, 28.83s/it] 50%|█████ | 11112/22095 [19:08:58<101:13:38, 33.18s/it] {'loss': 0.3098, 'grad_norm': 0.6005895346477476, 'learning_rate': 5.196372314370545e-06, 'epoch': 0.5} 50%|█████ | 11112/22095 [19:08:58<101:13:38, 33.18s/it] 50%|█████ | 11113/22095 [19:09:24<94:46:04, 31.07s/it] {'loss': 0.3555, 'grad_norm': 0.6393324770340331, 'learning_rate': 5.195639956722382e-06, 'epoch': 0.5} 50%|█████ | 11113/22095 [19:09:24<94:46:04, 31.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43240 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99300 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43651 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11114/22095 [19:10:04<102:14:53, 33.52s/it] {'loss': 0.3672, 'grad_norm': 0.7363714170001564, 'learning_rate': 5.194907594870519e-06, 'epoch': 0.5} 50%|█████ | 11114/22095 [19:10:04<102:14:53, 33.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11115/22095 [19:10:45<109:18:14, 35.84s/it] {'loss': 0.3095, 'grad_norm': 0.6282518354837773, 'learning_rate': 5.194175228830689e-06, 'epoch': 0.5} 50%|█████ | 11115/22095 [19:10:45<109:18:14, 35.84s/it] 50%|█████ | 11116/22095 [19:10:48<79:46:46, 26.16s/it] {'loss': 0.3435, 'grad_norm': 0.6451065493325112, 'learning_rate': 5.19344285861863e-06, 'epoch': 0.5} 50%|█████ | 11116/22095 [19:10:48<79:46:46, 26.16s/it] 50%|█████ | 11117/22095 [19:11:28<91:48:35, 30.11s/it] {'loss': 0.327, 'grad_norm': 0.6009264106991573, 'learning_rate': 5.192710484250078e-06, 'epoch': 0.5} 50%|█████ | 11117/22095 [19:11:28<91:48:35, 30.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 50%|█████ | 11118/22095 [19:11:38<73:34:01, 24.13s/it] {'loss': 0.4864, 'grad_norm': 0.36986139771972265, 'learning_rate': 5.19197810574077e-06, 'epoch': 0.5} 50%|█████ | 11118/22095 [19:11:38<73:34:01, 24.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11119/22095 [19:12:02<73:50:05, 24.22s/it] {'loss': 0.2666, 'grad_norm': 0.6092739699754802, 'learning_rate': 5.191245723106442e-06, 'epoch': 0.5} 50%|█████ | 11119/22095 
[19:12:02<73:50:05, 24.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45410 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87375 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11120/22095 [19:13:01<105:33:39, 34.63s/it] {'loss': 0.3426, 'grad_norm': 0.6406215574886052, 'learning_rate': 5.1905133363628314e-06, 'epoch': 0.5} 50%|█████ | 11120/22095 [19:13:01<105:33:39, 34.63s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/agent_data/OS-Atlas/androidworld/95b04981-e3b4-4487-ae3b-70fe678509ca.png 2025-08-28 11:11:00.033748 load time: 1049.95 ms VC:s3://gui-agent/data_20250421/Android/tencentmap/Cycle_0_Iter_4/images/screenshot-61-1745021942.0965388-before.png 2025-08-28 11:11:00.033329 load time: 1049.86 ms VC:s3://gui-agent/data_20250612/mac/images/settings/fe40e085-7060-4505-942d-c26efa06cb6a/images/step_1.png 2025-08-28 11:11:00.033693 load time: 1044.41 ms 50%|█████ | 11121/22095 [19:13:04<76:49:05, 25.20s/it] {'loss': 0.3212, 'grad_norm': 0.6530649316219899, 'learning_rate': 5.189780945525673e-06, 'epoch': 0.5} 50%|█████ | 11121/22095 [19:13:04<76:49:05, 25.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 50%|█████ | 11122/22095 [19:13:14<62:53:23, 20.63s/it] {'loss': 0.4628, 'grad_norm': 0.2882004458790363, 'learning_rate': 5.189048550610706e-06, 'epoch': 0.5} 50%|█████ | 11122/22095 [19:13:14<62:53:23, 20.63s/it]VC:s3://gui-agent/data_20250609/windows/images/word/20250430_193816_1/images/before_screenshot_1_concat_right.png 2025-08-28 11:11:13.197352 load time: 1041.0 ms 50%|█████ | 11123/22095 [19:13:18<47:26:03, 15.56s/it] {'loss': 0.3331, 'grad_norm': 0.6282642020762652, 'learning_rate': 5.188316151633665e-06, 'epoch': 0.5} 50%|█████ | 11123/22095 [19:13:18<47:26:03, 
15.56s/it]VC:s3://gui-agent/data_20250714/ubuntu/images/vs_code/28c6b105-ca98-4aaa-927b-12bbe3698d7e/images/step_1.png 2025-08-28 11:11:16.934940 load time: 1039.2 ms 50%|█████ | 11124/22095 [19:13:39<52:17:25, 17.16s/it] {'loss': 0.3781, 'grad_norm': 0.6610170936830095, 'learning_rate': 5.187583748610289e-06, 'epoch': 0.5} 50%|█████ | 11124/22095 [19:13:39<52:17:25, 17.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/7677283634733148_25.png 2025-08-28 11:11:39.473273 load time: 1137.13 ms 50%|█████ | 11125/22095 [19:14:02<57:50:36, 18.98s/it] {'loss': 0.3464, 'grad_norm': 0.6332532439068955, 'learning_rate': 5.186851341556315e-06, 'epoch': 0.5} 50%|█████ | 11125/22095 [19:14:02<57:50:36, 18.98s/it] 50%|█████ | 11126/22095 [19:14:07<44:22:58, 14.57s/it] {'loss': 0.293, 'grad_norm': 0.6949540478270081, 'learning_rate': 5.186118930487479e-06, 'epoch': 0.5} 50%|█████ | 11126/22095 [19:14:07<44:22:58, 14.57s/it] 50%|█████ | 11127/22095 [19:14:49<69:31:17, 22.82s/it] {'loss': 0.3259, 'grad_norm': 0.67081655894998, 'learning_rate': 5.185386515419518e-06, 'epoch': 0.5} 50%|█████ | 11127/22095 [19:14:49<69:31:17, 22.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://internvl2/datasets/VCR-wiki-en-easy/images/0016903.jpg 2025-08-28 11:12:47.388117 load time: 1043.96 ms VC:s3://gui-agent/data_20250612/web/images/yang_0528112335/10_140_52_49_0528153057/img/3.png 2025-08-28 11:12:47.387791 load time: 1056.16 ms 50%|█████ | 11128/22095 [19:14:59<57:53:21, 19.00s/it] {'loss': 0.4637, 'grad_norm': 0.3210088732142528, 'learning_rate': 5.184654096368172e-06, 'epoch': 0.5} 50%|█████ | 11128/22095 [19:14:59<57:53:21, 19.00s/it]VC:s3://gui/OS-Atlas/desktop_domain/macos_images/20240905_145144_screenshot_sub0.png 2025-08-28 11:12:57.485544 load time: 1019.25 ms 
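The frequent "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines are only warnings from the tokenizer; nothing truncates the ids at that point, and feeding them forward would index past the model's maximum position, exactly as the warning says. A minimal guard the data pipeline could apply after tokenization (a sketch, assuming a hard 40960-token budget; `guard_sequence_length` is an illustrative name, not part of the training code):

```python
def guard_sequence_length(input_ids, max_len=40960):
    """Truncate token id sequences that exceed the model maximum.

    Returns the (possibly truncated) ids and a flag saying whether
    truncation happened, so the caller can log or drop the sample.
    """
    if len(input_ids) > max_len:
        return input_ids[:max_len], True
    return input_ids, False
```

Hard truncation can cut a conversation mid-turn; dropping or re-chunking over-long samples upstream is usually preferable, but either way the sequence must be brought under the limit before the forward pass.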
VC:s3://gui-agent/data_20250421/Android/deepseek/Cycle_0_Iter_25/images/screenshot-383-1744964586.9424603-before.png 2025-08-28 11:12:57.487406 load time: 1047.9 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11129/22095 [19:15:21<60:29:08, 19.86s/it] {'loss': 0.3081, 'grad_norm': 0.5915652837478227, 'learning_rate': 5.183921673349174e-06, 'epoch': 0.5} 50%|█████ | 11129/22095 [19:15:21<60:29:08, 19.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75534 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46323 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144195 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11130/22095 [19:15:25<46:03:43, 15.12s/it] {'loss': 0.3451, 'grad_norm': 0.6268015361227933, 'learning_rate': 5.183189246378266e-06, 'epoch': 0.5} 50%|█████ | 11130/22095 [19:15:25<46:03:43, 15.12s/it]VC:s3://gui-agent/data_20250714/windows/images/adobe_illustrator/free_task_20250714_180543/images/20250714_180617_16.png 2025-08-28 11:13:23.411437 load time: 1021.34 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_031729_before_screenshot.png 2025-08-28 11:13:23.412234 load time: 1052.0 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_6.png 2025-08-28 11:13:25.014813 load time: 1194.36 ms VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20464.png 2025-08-28 11:13:25.162893 load time: 1159.44 ms 50%|█████ | 11131/22095 [19:16:06<70:02:05, 23.00s/it] {'loss': 0.3106, 'grad_norm': 0.7243887671807692, 'learning_rate': 5.182456815471184e-06, 
'epoch': 0.5} 50%|█████ | 11131/22095 [19:16:06<70:02:05, 23.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46372 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51049 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11132/22095 [19:16:27<68:37:46, 22.54s/it] {'loss': 0.3437, 'grad_norm': 0.6726835956341942, 'learning_rate': 5.181724380643664e-06, 'epoch': 0.5} 50%|█████ | 11132/22095 [19:16:27<68:37:46, 22.54s/it]VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/67016.jpg 2025-08-28 11:14:26.249444 load time: 1034.2 ms VC:s3://gui/visual_inputs/multi_modal_2024/gui_data/ui_data/OpenApp/image/46866.jpg 2025-08-28 11:14:26.247844 load time: 1046.64 ms VC:s3://gui-agent/agentnet/ubuntu_images/37439585-30a0-4523-880d-ead7ad22f07d.png 2025-08-28 11:14:26.247721 load time: 1031.77 ms VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_809097.png 2025-08-28 11:14:26.247458 load time: 1071.13 ms VC:s3://gui-agent/data_20250407/windows/images/excel/20250404_233446_1/images/before_screenshot_2.png 2025-08-28 11:14:26.249676 load time: 1352.62 ms VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/30fa924765d794b1119ecbe77ef7c9d78045bed35211f46887007cb504e9c0b0.png 2025-08-28 11:14:28.887301 load time: 1187.03 ms 50%|█████ | 11133/22095 [19:16:50<68:47:09, 22.59s/it] {'loss': 0.2941, 'grad_norm': 0.6491434431452936, 'learning_rate': 5.180991941911446e-06, 'epoch': 0.5} 50%|█████ | 11133/22095 [19:16:50<68:47:09, 22.59s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_230204.png 2025-08-28 
11:14:48.958921 load time: 1049.6 ms 50%|█████ | 11134/22095 [19:16:53<51:02:15, 16.76s/it] {'loss': 0.3248, 'grad_norm': 0.6497408706120921, 'learning_rate': 5.180259499290268e-06, 'epoch': 0.5} 50%|█████ | 11134/22095 [19:16:53<51:02:15, 16.76s/it] 50%|█████ | 11135/22095 [19:16:57<38:56:56, 12.79s/it] {'loss': 0.2914, 'grad_norm': 0.657340082524694, 'learning_rate': 5.179527052795865e-06, 'epoch': 0.5} 50%|█████ | 11135/22095 [19:16:57<38:56:56, 12.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80645 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11136/22095 [19:17:19<47:35:02, 15.63s/it] {'loss': 0.3023, 'grad_norm': 0.6404778084269498, 'learning_rate': 5.178794602443978e-06, 'epoch': 0.5} 50%|█████ | 11136/22095 [19:17:19<47:35:02, 15.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56236 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71694 > 40960). 
Running this sequence through the model will result in indexing errors 50%|█████ | 11137/22095 [19:17:22<35:51:42, 11.78s/it] {'loss': 0.3541, 'grad_norm': 0.7219616940330175, 'learning_rate': 5.178062148250343e-06, 'epoch': 0.5} 50%|█████ | 11137/22095 [19:17:22<35:51:42, 11.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 50%|█████ | 11138/22095 [19:17:28<31:04:14, 10.21s/it] {'loss': 0.4824, 'grad_norm': 0.4489197089113747, 'learning_rate': 5.177329690230702e-06, 'epoch': 0.5} 50%|█████ | 11138/22095 [19:17:29<31:04:14, 10.21s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 50%|█████ | 11139/22095 [19:17:32<24:37:31, 8.09s/it] {'loss': 0.3333, 'grad_norm': 0.619375488651365, 'learning_rate': 5.176597228400789e-06, 'epoch': 0.5} 50%|█████ | 11139/22095 [19:17:32<24:37:31, 8.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://multi-modal/Super-CLEVR/images/superCLEVR_new_006639.png 2025-08-28 11:15:30.398974 load time: 1042.28 ms VC:s3://gui-agent/data_20250612/android/images/Total_data_windows_0612_medium_data_device1_Simple_Calendar_Pro/SimpleCalendarDeleteEventsOnRelativeDay_9/images/003_click_1748922640798.png 2025-08-28 11:15:30.397326 load time: 1055.28 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_155723_before_screenshot.png 2025-08-28 11:15:30.397084 load time: 1059.96 ms 50%|█████ | 11140/22095 [19:17:41<26:11:59, 8.61s/it] {'loss': 0.4816, 'grad_norm': 0.31081728274069303, 'learning_rate': 5.175864762776343e-06, 'epoch': 0.5} 50%|█████ | 11140/22095 [19:17:41<26:11:59, 8.61s/it]VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-1_91961345-split-3.jpg 2025-08-28 
11:15:40.213516 load time: 1037.1 ms VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/GUICourse/guienv/chunk_54/C4web50k-3_275645094-split-21.png 2025-08-28 11:15:40.213329 load time: 1387.26 ms VC:s3://sa-1b/sa_000002/sa_23302.jpg 2025-08-28 11:15:40.214127 load time: 1463.61 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/safari_3/images/step_0.png 2025-08-28 11:15:42.075227 load time: 1077.41 ms 50%|█████ | 11141/22095 [19:17:45<21:53:06, 7.19s/it] {'loss': 0.347, 'grad_norm': 0.6297193844551721, 'learning_rate': 5.175132293373105e-06, 'epoch': 0.5} 50%|█████ | 11141/22095 [19:17:45<21:53:06, 7.19s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 11:15:45.185906 load time: 1388.14 ms 50%|█████ | 11142/22095 [19:18:28<54:16:55, 17.84s/it] {'loss': 0.3354, 'grad_norm': 0.6650090584331564, 'learning_rate': 5.174399820206811e-06, 'epoch': 0.5} 50%|█████ | 11142/22095 [19:18:28<54:16:55, 17.84s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/2acc5969-d89d-471b-9f90-58fc22597739/images/step_2.png 2025-08-28 11:16:27.825428 load time: 1224.1 ms 50%|█████ | 11143/22095 [19:18:31<41:02:08, 13.49s/it] {'loss': 0.3263, 'grad_norm': 0.6580033293915457, 'learning_rate': 5.1736673432932e-06, 'epoch': 0.5} 50%|█████ | 11143/22095 [19:18:31<41:02:08, 13.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/windows/images/settings/free_task_20250606_180411/images/20250606_180413_1.png 2025-08-28 11:16:30.127698 load time: 1013.18 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 11:16:30.127512 load time: 1477.12 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:16:31.245674 load time: 1259.44 ms 50%|█████ | 11144/22095 
[19:18:39<35:15:21, 11.59s/it] {'loss': 0.4612, 'grad_norm': 0.3533605652723998, 'learning_rate': 5.172934862648012e-06, 'epoch': 0.5} 50%|█████ | 11144/22095 [19:18:39<35:15:21, 11.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922569 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 45722, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC中点,如果CD=4cm,AB=13cm,BC长度为()\nA. 8cm\nB. 9cm\nC. 4cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 50%|█████ | 11145/22095 [19:18:48<33:14:36, 10.93s/it] {'loss': 0.4858, 'grad_norm': 0.3096719747489321, 'learning_rate': 5.172202378286986e-06, 'epoch': 0.5} 50%|█████ | 11145/22095 [19:18:48<33:14:36, 10.93s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11146/22095 [19:18:51<26:20:48, 8.66s/it] {'loss': 0.3483, 'grad_norm': 0.976650936575463, 'learning_rate': 5.171469890225857e-06, 'epoch': 0.5} 50%|█████ | 11146/22095 [19:18:51<26:20:48, 8.66s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 11:16:50.600234 load time: 1043.62 ms VC:s3://gui-agent/data_20250612/mac/images/terminal/2bcc2507-16b4-45d6-93d4-8ef594edd6f3/images/step_2.png 2025-08-28 11:16:50.938816 load time: 1194.94 ms 50%|█████ | 11147/22095 [19:18:55<22:17:13, 7.33s/it] {'loss': 0.354, 'grad_norm': 0.6514853901406809, 'learning_rate': 5.17073739848037e-06, 'epoch': 0.5} 50%|█████ | 11147/22095 [19:18:56<22:17:13, 7.33s/it] 50%|█████ | 11148/22095 [19:19:22<39:22:37, 12.95s/it] {'loss': 0.3374, 'grad_norm': 0.6230159049535104, 'learning_rate': 5.170004903066258e-06, 'epoch': 0.5} 50%|█████ | 11148/22095 [19:19:22<39:22:37, 12.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11149/22095 [19:19:25<30:18:15, 9.97s/it] {'loss': 0.3315, 'grad_norm': 0.5977594402004174, 'learning_rate': 5.169272403999265e-06, 'epoch': 0.5} 50%|█████ | 11149/22095 [19:19:25<30:18:15, 9.97s/it] 50%|█████ | 11150/22095 [19:19:28<24:06:53, 7.93s/it] {'loss': 0.3276, 'grad_norm': 0.6652306304159178, 'learning_rate': 5.1685399012951244e-06, 'epoch': 0.5} 50%|█████ | 11150/22095 [19:19:28<24:06:53, 
7.93s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 50%|█████ | 11151/22095 [19:19:31<20:03:16, 6.60s/it] {'loss': 0.3413, 'grad_norm': 0.6319080666281376, 'learning_rate': 5.167807394969583e-06, 'epoch': 0.5} 50%|█████ | 11151/22095 [19:19:31<20:03:16, 6.60s/it]VC:s3://gui-agent/data_20250624/web/images/yang_0626114758/google_com_0626205948/img/1.png 2025-08-28 11:17:30.003278 load time: 1004.73 ms VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30183.png 2025-08-28 11:17:30.521933 load time: 2078.61 ms 50%|█████ | 11152/22095 [19:19:53<33:30:03, 11.02s/it] {'loss': 0.3306, 'grad_norm': 0.6793929270264313, 'learning_rate': 5.1670748850383734e-06, 'epoch': 0.5} 50%|█████ | 11152/22095 [19:19:53<33:30:03, 11.02s/it] 50%|█████ | 11153/22095 [19:19:55<26:03:02, 8.57s/it] {'loss': 0.3292, 'grad_norm': 0.6600474914651242, 'learning_rate': 5.166342371517239e-06, 'epoch': 0.5} 50%|█████ | 11153/22095 [19:19:55<26:03:02, 8.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66410 > 40960). Running this sequence through the model will result in indexing errors 50%|█████ | 11154/22095 [19:19:58<20:47:03, 6.84s/it] {'loss': 0.3273, 'grad_norm': 0.5960777358642412, 'learning_rate': 5.165609854421917e-06, 'epoch': 0.5} 50%|█████ | 11154/22095 [19:19:58<20:47:03, 6.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47021 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77384 > 40960). 
Running this sequence through the model will result in indexing errors 50%|█████ | 11155/22095 [19:20:01<17:15:37, 5.68s/it] {'loss': 0.3084, 'grad_norm': 0.6414202022312718, 'learning_rate': 5.164877333768149e-06, 'epoch': 0.5} 50%|█████ | 11155/22095 [19:20:01<17:15:37, 5.68s/it] 50%|█████ | 11156/22095 [19:20:05<15:42:51, 5.17s/it] {'loss': 0.3788, 'grad_norm': 0.5750570450752185, 'learning_rate': 5.1641448095716715e-06, 'epoch': 0.5} 50%|█████ | 11156/22095 [19:20:05<15:42:51, 5.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250504_222213_4/images/before_screenshot_51_id_144_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:18:04.797108 load time: 1222.98 ms 50%|█████ | 11157/22095 [19:20:08<13:45:56, 4.53s/it] {'loss': 0.2903, 'grad_norm': 0.5541977367870765, 'learning_rate': 5.163412281848229e-06, 'epoch': 0.5} 50%|█████ | 11157/22095 [19:20:08<13:45:56, 4.53s/it] 51%|█████ | 11158/22095 [19:20:11<12:18:53, 4.05s/it] {'loss': 0.3191, 'grad_norm': 0.6017463923431956, 'learning_rate': 5.162679750613555e-06, 'epoch': 0.51} 51%|█████ | 11158/22095 [19:20:11<12:18:53, 4.05s/it] 51%|█████ | 11159/22095 [19:20:14<11:17:08, 3.72s/it] {'loss': 0.2896, 'grad_norm': 0.6164484835509431, 'learning_rate': 5.1619472158833964e-06, 'epoch': 0.51} 51%|█████ | 11159/22095 [19:20:14<11:17:08, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129891 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11160/22095 [19:20:17<10:41:33, 3.52s/it] {'loss': 0.3643, 'grad_norm': 0.5741582036164451, 'learning_rate': 5.161214677673487e-06, 'epoch': 0.51} 51%|█████ | 11160/22095 [19:20:17<10:41:33, 3.52s/it] 51%|█████ | 11161/22095 [19:20:20<10:09:57, 3.35s/it] {'loss': 0.326, 'grad_norm': 0.6408902888090979, 'learning_rate': 5.16048213599957e-06, 'epoch': 0.51} 51%|█████ | 11161/22095 [19:20:20<10:09:57, 3.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8895851 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 19004, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nA. 1lcm\nB. 13cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:18:20.363408 load time: 1117.5 ms 51%|█████ | 11162/22095 [19:20:28<14:33:29, 4.79s/it] {'loss': 0.471, 'grad_norm': 0.3830985888230755, 'learning_rate': 5.159749590877384e-06, 'epoch': 0.51} 51%|█████ | 11162/22095 [19:20:28<14:33:29, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49109 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42562 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86505 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (52422 > 40960) for 4 sample(s). Truncating to 11462 with 3 samples. 51%|█████ | 11163/22095 [19:20:38<19:03:15, 6.27s/it] {'loss': 0.477, 'grad_norm': 0.35226583525158767, 'learning_rate': 5.159017042322671e-06, 'epoch': 0.51} 51%|█████ | 11163/22095 [19:20:38<19:03:15, 6.27s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_1/images/before_screenshot_1_id_161_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:18:37.383849 load time: 1565.72 ms 51%|█████ | 11164/22095 [19:20:42<16:39:23, 5.49s/it] {'loss': 0.2958, 'grad_norm': 0.6303737577199218, 'learning_rate': 5.158284490351169e-06, 'epoch': 0.51} 51%|█████ | 11164/22095 [19:20:42<16:39:23, 5.49s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8354016 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 20701, 'image': 'vrdu_table_final_2/astro-ph.CO/b1432601-8a11-44d9-8b00-4313619ba40c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 51%|█████ | 11165/22095 [19:20:45<15:10:07, 5.00s/it] {'loss': 0.3251, 'grad_norm': 0.6127947670780783, 'learning_rate': 5.157551934978622e-06, 'epoch': 0.51} 51%|█████ | 11165/22095 [19:20:45<15:10:07, 5.00s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_7.png 2025-08-28 11:18:45.515898 load time: 1212.22 ms 51%|█████ | 11166/22095 [19:20:49<13:44:37, 4.53s/it] {'loss': 0.3087, 'grad_norm': 0.6064243651942174, 'learning_rate': 5.156819376220765e-06, 'epoch': 0.51} 51%|█████ | 11166/22095 [19:20:49<13:44:37, 4.53s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_2/images/step_8.png 2025-08-28 11:18:47.695241 load time: 1002.78 ms 51%|█████ | 11167/22095 [19:20:53<13:23:31, 4.41s/it] {'loss': 0.3313, 'grad_norm': 1.068945333103363, 'learning_rate': 5.1560868140933425e-06, 'epoch': 0.51} 51%|█████ | 11167/22095 [19:20:53<13:23:31, 4.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11168/22095 [19:21:01<16:35:22, 5.47s/it] {'loss': 0.4639, 'grad_norm': 0.41989461112524973, 'learning_rate': 5.155354248612095e-06, 'epoch': 0.51} 51%|█████ | 11168/22095 [19:21:01<16:35:22, 5.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/settings/9db8b077-9fe1-4e64-9ae8-95dea252fdb4/images/step_1.png 2025-08-28 11:18:59.761491 load time: 1120.07 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed 
image tokens in the conversation 51%|█████ | 11169/22095 [19:21:05<15:07:24, 4.98s/it] {'loss': 0.3175, 'grad_norm': 0.7009044558515606, 'learning_rate': 5.1546216797927594e-06, 'epoch': 0.51} 51%|█████ | 11169/22095 [19:21:05<15:07:24, 4.98s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924296 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nIn which state is Carver Museum is located? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Texas.\nThe text does not specifically mention the Carver Museum, but it does mention the state of Texas in the list of states within the United States. 
Therefore, it can be inferred that the Carver Museum is located in Texas.'}]} VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_121/img/step_1.png 2025-08-28 11:19:04.764673 load time: 1045.99 ms 51%|█████ | 11170/22095 [19:21:09<14:15:13, 4.70s/it] {'loss': 0.353, 'grad_norm': 0.6599507990452416, 'learning_rate': 5.1538891076510815e-06, 'epoch': 0.51} 51%|█████ | 11170/22095 [19:21:09<14:15:13, 4.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11171/22095 [19:21:18<18:38:18, 6.14s/it] {'loss': 0.4709, 'grad_norm': 0.34121769850879935, 'learning_rate': 5.153156532202795e-06, 'epoch': 0.51} 51%|█████ | 11171/22095 [19:21:18<18:38:18, 6.14s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_6.png 2025-08-28 11:19:18.440954 load time: 1288.19 ms 51%|█████ | 11172/22095 [19:21:22<16:32:37, 5.45s/it] {'loss': 0.3466, 'grad_norm': 0.6500256984129645, 'learning_rate': 5.152423953463649e-06, 'epoch': 0.51} 51%|█████ | 11172/22095 [19:21:22<16:32:37, 5.45s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_1/images/step_0.png 2025-08-28 11:19:21.715943 load time: 1099.65 ms 51%|█████ | 11173/22095 [19:21:26<14:43:59, 4.86s/it] {'loss': 0.3298, 'grad_norm': 0.6725159093189567, 'learning_rate': 5.151691371449378e-06, 'epoch': 0.51} 51%|█████ | 11173/22095 [19:21:26<14:43:59, 4.86s/it] 51%|█████ | 11174/22095 [19:21:28<12:48:27, 4.22s/it] {'loss': 0.2905, 'grad_norm': 0.5950271952157588, 'learning_rate': 5.150958786175727e-06, 'epoch': 0.51} 51%|█████ | 11174/22095 [19:21:28<12:48:27, 4.22s/it] 51%|█████ | 11175/22095 [19:21:32<11:47:10, 3.89s/it] {'loss': 0.282, 'grad_norm': 0.6326018823306971, 'learning_rate': 5.1502261976584354e-06, 'epoch': 0.51} 51%|█████ | 11175/22095 [19:21:32<11:47:10, 3.89s/it] 51%|█████ | 
11176/22095 [19:21:34<10:51:33, 3.58s/it] {'loss': 0.3124, 'grad_norm': 0.589511282902164, 'learning_rate': 5.149493605913244e-06, 'epoch': 0.51} 51%|█████ | 11176/22095 [19:21:34<10:51:33, 3.58s/it] 51%|█████ | 11177/22095 [19:21:37<10:23:12, 3.42s/it] {'loss': 0.3334, 'grad_norm': 0.9978463493192055, 'learning_rate': 5.148761010955893e-06, 'epoch': 0.51} 51%|█████ | 11177/22095 [19:21:37<10:23:12, 3.42s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 11:19:37.422658 load time: 1112.49 ms 51%|█████ | 11178/22095 [19:21:42<11:01:21, 3.63s/it] {'loss': 0.3836, 'grad_norm': 0.6701595925463175, 'learning_rate': 5.1480284128021265e-06, 'epoch': 0.51} 51%|█████ | 11178/22095 [19:21:42<11:01:21, 3.63s/it] 51%|█████ | 11179/22095 [19:21:45<10:54:51, 3.60s/it] {'loss': 0.3671, 'grad_norm': 0.6096570947381155, 'learning_rate': 5.147295811467681e-06, 'epoch': 0.51} 51%|█████ | 11179/22095 [19:21:45<10:54:51, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47958 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49363 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11180/22095 [19:21:49<11:06:01, 3.66s/it] {'loss': 0.3427, 'grad_norm': 0.6534848644153554, 'learning_rate': 5.146563206968303e-06, 'epoch': 0.51} 51%|█████ | 11180/22095 [19:21:49<11:06:01, 3.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60700 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53011 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11181/22095 [19:21:52<10:34:48, 3.49s/it] {'loss': 0.3127, 'grad_norm': 0.6840519353875196, 'learning_rate': 5.1458305993197326e-06, 'epoch': 0.51} 51%|█████ | 11181/22095 [19:21:52<10:34:48, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80644 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75584 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44785 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11182/22095 [19:21:55<10:00:40, 3.30s/it] {'loss': 0.3149, 'grad_norm': 0.6609155547256175, 'learning_rate': 5.145097988537709e-06, 'epoch': 0.51} 51%|█████ | 11182/22095 [19:21:55<10:00:40, 3.30s/it] 51%|█████ | 11183/22095 [19:21:58<9:30:36, 3.14s/it] {'loss': 0.3082, 'grad_norm': 0.6332826976514082, 'learning_rate': 5.144365374637976e-06, 'epoch': 0.51} 51%|█████ | 11183/22095 [19:21:58<9:30:36, 3.14s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 11:19:57.064692 load time: 1145.94 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:19:57.526022 load time: 1090.81 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 11:19:57.872795 load time: 1002.08 ms 51%|█████ | 11184/22095 [19:22:02<10:22:51, 3.43s/it] {'loss': 0.3321, 'grad_norm': 0.6580709619987227, 'learning_rate': 5.143632757636275e-06, 'epoch': 0.51} 51%|█████ | 11184/22095 [19:22:02<10:22:51, 3.43s/it] 51%|█████ | 11185/22095 
[19:22:05<9:57:44, 3.29s/it] {'loss': 0.2912, 'grad_norm': 0.6328645701871171, 'learning_rate': 5.142900137548346e-06, 'epoch': 0.51} 51%|█████ | 11185/22095 [19:22:05<9:57:44, 3.29s/it] 51%|█████ | 11186/22095 [19:22:08<10:00:03, 3.30s/it] {'loss': 0.3509, 'grad_norm': 0.6329476213869506, 'learning_rate': 5.142167514389933e-06, 'epoch': 0.51} 51%|█████ | 11186/22095 [19:22:08<10:00:03, 3.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52887 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44127 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55066 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46706 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79812 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11187/22095 [19:22:11<9:56:34, 3.28s/it] {'loss': 0.3406, 'grad_norm': 0.6852797503715907, 'learning_rate': 5.141434888176775e-06, 'epoch': 0.51} 51%|█████ | 11187/22095 [19:22:11<9:56:34, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250623/windows_augment/images/inventor/20250513_095212_1/images/before_screenshot_1_id_127_internvl_position_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 11:20:10.905305 load time: 1036.51 ms 51%|█████ | 11188/22095 [19:22:20<15:18:25, 5.05s/it] {'loss': 0.487, 'grad_norm': 0.46909894318034845, 'learning_rate': 5.140702258924618e-06, 'epoch': 0.51} 51%|█████ | 11188/22095 [19:22:20<15:18:25, 5.05s/it] 51%|█████ | 11189/22095 [19:22:27<16:45:55, 5.53s/it] {'loss': 0.4912, 'grad_norm': 0.3779597483581766, 'learning_rate': 5.1399696266491996e-06, 'epoch': 0.51} 51%|█████ | 11189/22095 [19:22:27<16:45:55, 5.53s/it] 51%|█████ | 11190/22095 [19:22:37<20:22:58, 6.73s/it] {'loss': 0.486, 'grad_norm': 0.2965653746703262, 'learning_rate': 5.1392369913662646e-06, 'epoch': 0.51} 51%|█████ | 11190/22095 [19:22:37<20:22:58, 6.73s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:20:35.792848 load time: 1017.45 ms VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/17d889dde6fdc256ad29650d59b78fd97a42b5762061f527fd3e2766966f8a46.png 2025-08-28 11:20:37.128457 load time: 1158.92 ms 51%|█████ | 11191/22095 [19:22:41<18:03:44, 5.96s/it] {'loss': 0.3288, 'grad_norm': 0.6462902080209708, 'learning_rate': 5.138504353091555e-06, 'epoch': 0.51} 51%|█████ | 11191/22095 [19:22:41<18:03:44, 5.96s/it] 51%|█████ | 11192/22095 [19:22:45<16:35:32, 5.48s/it] {'loss': 0.317, 'grad_norm': 0.6200576019698926, 'learning_rate': 5.137771711840811e-06, 'epoch': 
0.51} 51%|█████ | 11192/22095 [19:22:45<16:35:32, 5.48s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10166.png 2025-08-28 11:20:44.732992 load time: 1052.36 ms 51%|█████ | 11193/22095 [19:22:48<14:39:09, 4.84s/it] {'loss': 0.3249, 'grad_norm': 0.6191220053106656, 'learning_rate': 5.137039067629776e-06, 'epoch': 0.51} 51%|█████ | 11193/22095 [19:22:48<14:39:09, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52666 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42942 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46069 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58644 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11194/22095 [19:22:53<14:41:58, 4.85s/it] {'loss': 0.3895, 'grad_norm': 0.6109383276969177, 'learning_rate': 5.136306420474193e-06, 'epoch': 0.51} 51%|█████ | 11194/22095 [19:22:53<14:41:58, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11195/22095 [19:23:02<18:02:04, 5.96s/it] {'loss': 0.4816, 'grad_norm': 0.540868332291682, 'learning_rate': 5.135573770389804e-06, 'epoch': 0.51} 51%|█████ | 11195/22095 [19:23:02<18:02:04, 5.96s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-28 11:21:00.686557 load time: 1045.03 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:21:01.674015 load time: 1013.93 ms 51%|█████ | 11196/22095 [19:23:06<16:31:59, 5.46s/it] {'loss': 0.3157, 'grad_norm': 0.5663607782035478, 'learning_rate': 5.134841117392349e-06, 'epoch': 0.51} 51%|█████ | 11196/22095 [19:23:06<16:31:59, 5.46s/it] 51%|█████ | 11197/22095 [19:23:10<14:40:17, 4.85s/it] {'loss': 0.3748, 'grad_norm': 0.6249994867232201, 'learning_rate': 5.134108461497576e-06, 'epoch': 0.51} 51%|█████ | 11197/22095 [19:23:10<14:40:17, 4.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90753 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57511 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11198/22095 [19:23:14<14:23:31, 4.75s/it] {'loss': 0.3007, 'grad_norm': 0.644905525457812, 'learning_rate': 5.133375802721221e-06, 'epoch': 0.51} 51%|█████ | 11198/22095 [19:23:14<14:23:31, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69550 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71496 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120357 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11199/22095 [19:23:18<13:44:53, 4.54s/it] {'loss': 0.3555, 'grad_norm': 0.5906218504091207, 'learning_rate': 5.132643141079031e-06, 'epoch': 0.51} 51%|█████ | 11199/22095 [19:23:18<13:44:53, 4.54s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_2/images/step_0.png 2025-08-28 11:21:17.876495 load time: 1076.08 ms 51%|█████ | 11200/22095 [19:23:22<13:00:37, 4.30s/it] {'loss': 0.3488, 'grad_norm': 0.5924636723678586, 'learning_rate': 5.131910476586747e-06, 'epoch': 0.51} 51%|█████ | 11200/22095 [19:23:22<13:00:37, 4.30s/it] 51%|█████ | 11201/22095 [19:23:27<13:18:51, 4.40s/it] {'loss': 0.2914, 'grad_norm': 0.617360493101454, 'learning_rate': 5.131177809260113e-06, 'epoch': 0.51} 51%|█████ | 11201/22095 [19:23:27<13:18:51, 4.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11202/22095 [19:23:35<16:47:30, 5.55s/it] {'loss': 0.5056, 'grad_norm': 0.37941447725663624, 'learning_rate': 5.130445139114869e-06, 'epoch': 0.51} 51%|█████ | 11202/22095 [19:23:35<16:47:30, 5.55s/it] 51%|█████ | 11203/22095 [19:23:38<14:41:48, 4.86s/it] {'loss': 0.3294, 'grad_norm': 0.6042484854768109, 'learning_rate': 5.129712466166761e-06, 'epoch': 0.51} 51%|█████ | 11203/22095 [19:23:38<14:41:48, 4.86s/it] 51%|█████ | 11204/22095 [19:23:41<13:22:08, 4.42s/it] {'loss': 0.3125, 'grad_norm': 0.6631656173398461, 'learning_rate': 5.1289797904315295e-06, 'epoch': 0.51} 51%|█████ | 11204/22095 [19:23:41<13:22:08, 
4.42s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:21:38.908008 load time: 1350.63 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:21:40.217942 load time: 1014.07 ms 51%|█████ | 11205/22095 [19:23:45<12:10:27, 4.02s/it] {'loss': 0.2989, 'grad_norm': 0.6901252842609684, 'learning_rate': 5.12824711192492e-06, 'epoch': 0.51} 51%|█████ | 11205/22095 [19:23:45<12:10:27, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 11:21:46.021770 load time: 1015.53 ms 51%|█████ | 11206/22095 [19:23:53<16:09:01, 5.34s/it] {'loss': 0.4765, 'grad_norm': 0.3076522049578289, 'learning_rate': 5.127514430662671e-06, 'epoch': 0.51} 51%|█████ | 11206/22095 [19:23:53<16:09:01, 5.34s/it] 51%|█████ | 11207/22095 [19:24:02<19:50:54, 6.56s/it] {'loss': 0.4618, 'grad_norm': 0.28162741026972016, 'learning_rate': 5.126781746660532e-06, 'epoch': 0.51} 51%|█████ | 11207/22095 [19:24:02<19:50:54, 6.56s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 51%|█████ | 11208/22095 [19:24:06<17:11:17, 5.68s/it] {'loss': 0.3878, 'grad_norm': 0.7051014395322401, 'learning_rate': 5.126049059934239e-06, 'epoch': 0.51} 51%|█████ | 11208/22095 [19:24:06<17:11:17, 5.68s/it]VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_25.png 2025-08-28 11:22:05.467103 load time: 1022.64 ms 51%|█████ | 11209/22095 [19:24:09<14:53:19, 4.92s/it] {'loss': 0.3245, 'grad_norm': 0.659795061917355, 'learning_rate': 5.1253163704995425e-06, 'epoch': 0.51} 51%|█████ | 11209/22095 [19:24:09<14:53:19, 4.92s/it] 51%|█████ | 11210/22095 [19:24:12<13:01:18, 4.31s/it] {'loss': 0.3236, 'grad_norm': 0.6166099899942831, 'learning_rate': 5.124583678372179e-06, 
'epoch': 0.51} 51%|█████ | 11210/22095 [19:24:12<13:01:18, 4.31s/it]VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_211310/images/step_0_concat_left.png 2025-08-28 11:22:09.559299 load time: 1524.71 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 11:22:12.346309 load time: 1497.22 ms 51%|█████ | 11211/22095 [19:24:15<12:12:29, 4.04s/it] {'loss': 0.3747, 'grad_norm': 0.6118398319259457, 'learning_rate': 5.1238509835678966e-06, 'epoch': 0.51} 51%|█████ | 11211/22095 [19:24:15<12:12:29, 4.04s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/cf183c3a-83c3-4eb7-a60d-c6cbfaa27f3e/images/step_2.png 2025-08-28 11:22:14.207513 load time: 1363.4 ms 51%|█████ | 11212/22095 [19:24:19<11:29:18, 3.80s/it] {'loss': 0.3546, 'grad_norm': 0.645756325863196, 'learning_rate': 5.1231182861024365e-06, 'epoch': 0.51} 51%|█████ | 11212/22095 [19:24:19<11:29:18, 3.80s/it] 51%|█████ | 11213/22095 [19:24:22<10:42:37, 3.54s/it] {'loss': 0.3146, 'grad_norm': 1.0032332349126258, 'learning_rate': 5.122385585991543e-06, 'epoch': 0.51} 51%|█████ | 11213/22095 [19:24:22<10:42:37, 3.54s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 11:22:21.301937 load time: 1036.13 ms 51%|█████ | 11214/22095 [19:24:25<10:42:33, 3.54s/it] {'loss': 0.3278, 'grad_norm': 0.6702414915122858, 'learning_rate': 5.121652883250958e-06, 'epoch': 0.51} 51%|█████ | 11214/22095 [19:24:25<10:42:33, 3.54s/it] 51%|█████ | 11215/22095 [19:24:29<10:34:26, 3.50s/it] {'loss': 0.3213, 'grad_norm': 0.6723140979649809, 'learning_rate': 5.120920177896427e-06, 'epoch': 0.51} 51%|█████ | 11215/22095 [19:24:29<10:34:26, 3.50s/it] 51%|█████ | 11216/22095 [19:24:32<10:15:45, 3.40s/it] {'loss': 0.3214, 'grad_norm': 0.5790786377335796, 'learning_rate': 5.120187469943693e-06, 'epoch': 0.51} 51%|█████ | 11216/22095 [19:24:32<10:15:45, 
3.40s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 11:22:31.300330 load time: 1043.23 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:22:32.086865 load time: 1255.24 ms 51%|█████ | 11217/22095 [19:24:35<10:30:39, 3.48s/it] {'loss': 0.3392, 'grad_norm': 0.6766593472851177, 'learning_rate': 5.1194547594085e-06, 'epoch': 0.51} 51%|█████ | 11217/22095 [19:24:35<10:30:39, 3.48s/it]VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 11:22:35.500257 load time: 1065.58 ms 51%|█████ | 11218/22095 [19:24:39<10:45:57, 3.56s/it] {'loss': 0.3047, 'grad_norm': 0.6590885592070846, 'learning_rate': 5.11872204630659e-06, 'epoch': 0.51} 51%|█████ | 11218/22095 [19:24:39<10:45:57, 3.56s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_1/images/step_0.png 2025-08-28 11:22:37.954719 load time: 1884.93 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:22:39.207420 load time: 1546.56 ms 51%|█████ | 11219/22095 [19:24:43<10:49:12, 3.58s/it] {'loss': 0.3255, 'grad_norm': 0.5935583909536788, 'learning_rate': 5.117989330653708e-06, 'epoch': 0.51} 51%|█████ | 11219/22095 [19:24:43<10:49:12, 3.58s/it]VC:s3://gui/aguvis/aguvis-stage1/omniact/images/train_6394.png 2025-08-28 11:22:42.640709 load time: 1170.23 ms 51%|█████ | 11220/22095 [19:24:46<10:31:22, 3.48s/it] {'loss': 0.3217, 'grad_norm': 0.6477657735336387, 'learning_rate': 5.117256612465598e-06, 'epoch': 0.51} 51%|█████ | 11220/22095 [19:24:46<10:31:22, 3.48s/it] 51%|█████ | 11221/22095 [19:24:49<9:53:26, 3.27s/it] {'loss': 0.3173, 'grad_norm': 0.6513408752224927, 'learning_rate': 5.116523891758002e-06, 'epoch': 0.51} 51%|█████ | 11221/22095 [19:24:49<9:53:26, 
3.27s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/17d889dde6fdc256ad29650d59b78fd97a42b5762061f527fd3e2766966f8a46.png 2025-08-28 11:22:48.537891 load time: 1036.53 ms 51%|█████ | 11222/22095 [19:24:53<10:23:38, 3.44s/it] {'loss': 0.3308, 'grad_norm': 0.6476151623312782, 'learning_rate': 5.115791168546667e-06, 'epoch': 0.51} 51%|█████ | 11222/22095 [19:24:53<10:23:38, 3.44s/it] 51%|█████ | 11223/22095 [19:24:56<10:41:59, 3.54s/it] {'loss': 0.3149, 'grad_norm': 0.6360775434592607, 'learning_rate': 5.115058442847335e-06, 'epoch': 0.51} 51%|█████ | 11223/22095 [19:24:56<10:41:59, 3.54s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 11:22:56.538665 load time: 1052.63 ms 51%|█████ | 11224/22095 [19:25:01<11:16:59, 3.74s/it] {'loss': 0.31, 'grad_norm': 0.6270942302739652, 'learning_rate': 5.1143257146757495e-06, 'epoch': 0.51} 51%|█████ | 11224/22095 [19:25:01<11:16:59, 3.74s/it] 51%|█████ | 11225/22095 [19:25:04<10:57:28, 3.63s/it] {'loss': 0.3273, 'grad_norm': 0.9616494452634846, 'learning_rate': 5.113592984047657e-06, 'epoch': 0.51} 51%|█████ | 11225/22095 [19:25:04<10:57:28, 3.63s/it] 51%|█████ | 11226/22095 [19:25:08<10:57:33, 3.63s/it] {'loss': 0.3141, 'grad_norm': 0.6000044934131067, 'learning_rate': 5.1128602509788e-06, 'epoch': 0.51} 51%|█████ | 11226/22095 [19:25:08<10:57:33, 3.63s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/432466e7-22e7-4194-aa9e-19f7c21adef5/images/step_5.png 2025-08-28 11:23:06.400376 load time: 1073.01 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_1/images/step_0.png 2025-08-28 11:23:07.810493 load time: 1271.51 ms 51%|█████ | 11227/22095 [19:25:11<10:31:39, 3.49s/it] {'loss': 0.2911, 'grad_norm': 0.605687051745947, 'learning_rate': 5.112127515484923e-06, 'epoch': 0.51} 51%|█████ | 
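The recurring pair `Number of image tokens 0 does not match number of images 1` / `Fixed image tokens in the conversation` indicates rank 0 is patching missing image placeholders at load time. A minimal sketch of such a repair, assuming the `<image>` placeholder convention and the `conversations` layout shown in the sample dumps (`fix_image_tokens` is an illustrative name; the actual repair in `data_qwen_2.py` may differ):

```python
IMAGE_TOKEN = "<image>"  # placeholder convention assumed from the sample dumps

def fix_image_tokens(conversation, num_images):
    """If the conversation carries fewer image placeholders than there are
    images, prepend the missing ones to the first turn, mirroring the
    'Fixed image tokens in the conversation' repair reported by rank 0."""
    found = sum(turn["value"].count(IMAGE_TOKEN) for turn in conversation)
    missing = num_images - found
    if missing > 0 and conversation:
        conversation[0]["value"] = IMAGE_TOKEN * missing + "\n" + conversation[0]["value"]
    return conversation
```

A conversation that already carries the right number of placeholders passes through unchanged.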
11227/22095 [19:25:11<10:31:39, 3.49s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_2/images/step_0.png 2025-08-28 11:23:11.878761 load time: 1028.82 ms 51%|█████ | 11228/22095 [19:25:15<10:48:07, 3.58s/it] {'loss': 0.3262, 'grad_norm': 0.6413182962936163, 'learning_rate': 5.111394777581769e-06, 'epoch': 0.51} 51%|█████ | 11228/22095 [19:25:15<10:48:07, 3.58s/it] 51%|█████ | 11229/22095 [19:25:18<10:22:00, 3.43s/it] {'loss': 0.3477, 'grad_norm': 0.6508238650826035, 'learning_rate': 5.110662037285084e-06, 'epoch': 0.51} 51%|█████ | 11229/22095 [19:25:18<10:22:00, 3.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50902 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44655 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89794 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11230/22095 [19:25:22<10:54:17, 3.61s/it] {'loss': 0.3572, 'grad_norm': 0.584862532364185, 'learning_rate': 5.109929294610611e-06, 'epoch': 0.51} 51%|█████ | 11230/22095 [19:25:22<10:54:17, 3.61s/it] 51%|█████ | 11231/22095 [19:25:25<10:59:27, 3.64s/it] {'loss': 0.327, 'grad_norm': 0.6690967334167736, 'learning_rate': 5.109196549574097e-06, 'epoch': 0.51} 51%|█████ | 11231/22095 [19:25:25<10:59:27, 3.64s/it] 51%|█████ | 11232/22095 [19:25:29<10:40:25, 3.54s/it] {'loss': 0.303, 'grad_norm': 0.6375743641978658, 'learning_rate': 5.108463802191282e-06, 'epoch': 0.51} 51%|█████ | 11232/22095 [19:25:29<10:40:25, 3.54s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8405972 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8159, 'image': 'vrdu_table_final_2/astro-ph.CO/f04b6c54-8e5a-40fd-a3a6-5cd53646078c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 51%|█████ | 11233/22095 [19:25:34<11:57:43, 3.96s/it] {'loss': 0.3497, 'grad_norm': 0.6475998504990564, 'learning_rate': 5.1077310524779144e-06, 'epoch': 0.51} 51%|█████ | 11233/22095 [19:25:34<11:57:43, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79691 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44330 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49309 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11234/22095 [19:25:37<11:26:29, 3.79s/it] {'loss': 0.3137, 'grad_norm': 0.7126058888744391, 'learning_rate': 5.106998300449738e-06, 'epoch': 0.51} 51%|█████ | 11234/22095 [19:25:37<11:26:29, 3.79s/it] 51%|█████ | 11235/22095 [19:25:41<11:11:37, 3.71s/it] {'loss': 0.3389, 'grad_norm': 0.6367622694735621, 'learning_rate': 5.106265546122495e-06, 'epoch': 0.51} 51%|█████ | 11235/22095 [19:25:41<11:11:37, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45968 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81521 > 40960). 
Running this sequence through the model will result in indexing errors
51%|█████ | 11236/22095 [19:25:44<11:09:10, 3.70s/it] {'loss': 0.3334, 'grad_norm': 0.6321256652100014, 'learning_rate': 5.105532789511935e-06, 'epoch': 0.51}
51%|█████ | 11236/22095 [19:25:44<11:09:10, 3.70s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348836 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15506, 'image': 'vrdu_table_final_2/astro-ph.CO/9e691543-4cfb-48e6-b993-a9b4273d1a7a.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$S_{4}$\\end{tabular}\n```"}]}
51%|█████ | 11237/22095 [19:25:48<10:45:57, 3.57s/it] {'loss': 0.3278, 'grad_norm': 0.5958640465462083, 'learning_rate': 5.104800030633795e-06, 'epoch': 0.51}
51%|█████ | 11237/22095 [19:25:48<10:45:57, 3.57s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
51%|█████ | 11238/22095 [19:25:59<17:31:21, 5.81s/it] {'loss': 0.4554, 'grad_norm': 0.5115932366401839, 'learning_rate': 5.104067269503828e-06, 'epoch': 0.51}
51%|█████ | 11238/22095 [19:25:59<17:31:21, 5.81s/it]
51%|█████ | 11239/22095 [19:26:02<15:10:29, 5.03s/it] {'loss': 0.3469,
'grad_norm': 0.60649625395521, 'learning_rate': 5.103334506137773e-06, 'epoch': 0.51} 51%|█████ | 11239/22095 [19:26:02<15:10:29, 5.03s/it] 51%|█████ | 11240/22095 [19:26:06<14:09:06, 4.69s/it] {'loss': 0.3477, 'grad_norm': 0.6774631407983585, 'learning_rate': 5.102601740551376e-06, 'epoch': 0.51} 51%|█████ | 11240/22095 [19:26:06<14:09:06, 4.69s/it] 51%|█████ | 11241/22095 [19:26:09<13:05:03, 4.34s/it] {'loss': 0.3248, 'grad_norm': 1.0746385590831669, 'learning_rate': 5.101868972760384e-06, 'epoch': 0.51} 51%|█████ | 11241/22095 [19:26:09<13:05:03, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 11:24:07.955379 load time: 1041.22 ms 51%|█████ | 11242/22095 [19:26:19<17:49:29, 5.91s/it] {'loss': 0.4738, 'grad_norm': 0.3245666411781549, 'learning_rate': 5.101136202780541e-06, 'epoch': 0.51} 51%|█████ | 11242/22095 [19:26:19<17:49:29, 5.91s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_5/images/step_3.png 2025-08-28 11:24:17.539002 load time: 1224.12 ms VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_212129_2/images/before_screenshot_9_id_33_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:24:18.139450 load time: 1293.0 ms 51%|█████ | 11243/22095 [19:26:24<17:04:13, 5.66s/it] {'loss': 0.3212, 'grad_norm': 0.5872003958509445, 'learning_rate': 5.100403430627591e-06, 'epoch': 0.51} 51%|█████ | 11243/22095 [19:26:24<17:04:13, 5.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11244/22095 [19:26:33<20:30:21, 6.80s/it] {'loss': 0.5003, 'grad_norm': 0.315392723450212, 'learning_rate': 5.099670656317279e-06, 'epoch': 0.51} 51%|█████ | 11244/22095 [19:26:33<20:30:21, 6.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43037 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129685 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66135 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50525 > 40960). Running this sequence through the model will result in indexing errors
51%|█████ | 11245/22095 [19:26:36<17:08:11, 5.69s/it] {'loss': 0.3016, 'grad_norm': 0.6502171119613663, 'learning_rate': 5.098937879865352e-06, 'epoch': 0.51}
51%|█████ | 11245/22095 [19:26:36<17:08:11, 5.69s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [42, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8461085 in VC:s3://internvl-moe-sft-data/. Exception: Image size [42, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 72438, 'image': 'vrdu_texteq/astro-ph.CO/be16526f-ea60-4fa1-9c8f-5ff2b279472a.png', 'image_wh': [[42, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': '$\\bf M_2$'}]} 51%|█████ | 11246/22095 [19:26:40<14:52:08, 4.93s/it] {'loss': 0.3329, 'grad_norm': 0.6229935855299926, 'learning_rate': 5.098205101287554e-06, 'epoch': 0.51} 51%|█████ | 11246/22095 [19:26:40<14:52:08, 4.93s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:24:38.859696 load time: 1215.02 ms 51%|█████ | 11247/22095 [19:26:44<14:11:54, 4.71s/it] {'loss': 0.3536, 'grad_norm': 0.6350228907401154, 'learning_rate': 5.09747232059963e-06, 'epoch': 0.51} 51%|█████ | 11247/22095 [19:26:44<14:11:54, 4.71s/it] 51%|█████ | 11248/22095 [19:26:47<12:27:11, 4.13s/it] {'loss': 0.3211, 'grad_norm': 0.5996675389445115, 'learning_rate': 5.096739537817324e-06, 'epoch': 0.51} 51%|█████ | 11248/22095 [19:26:47<12:27:11, 4.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (61127 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68605 > 40960). 
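The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings mean some conversations tokenize well past the model's 40960-token window; the log later shows the trainer truncating such batches (`Truncating to 7254 with 3 samples`). A minimal sketch of the simplest guard, plain tail truncation (the 40960 threshold comes from the warnings; the function name is hypothetical):

```python
MODEL_MAX_LEN = 40960  # maximum sequence length reported in the warnings

def clamp_to_max_len(input_ids, max_len=MODEL_MAX_LEN):
    """Drop trailing token ids beyond the model's window so position
    lookups cannot index out of range at forward time."""
    if len(input_ids) <= max_len:
        return input_ids
    return input_ids[:max_len]

# The 79691-token sample from the log would be cut to exactly 40960 ids;
# short sequences pass through untouched.
assert len(clamp_to_max_len(list(range(79691)))) == MODEL_MAX_LEN
assert clamp_to_max_len([1, 2, 3]) == [1, 2, 3]
```

Note that blind tail truncation can cut through an assistant turn or orphan image placeholder tokens, which is presumably why the trainer's own logic truncates at sample boundaries instead.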
Running this sequence through the model will result in indexing errors
51%|█████ | 11249/22095 [19:26:55<16:08:09, 5.36s/it] {'loss': 0.4433, 'grad_norm': 0.28875990227643683, 'learning_rate': 5.096006752956383e-06, 'epoch': 0.51}
51%|█████ | 11249/22095 [19:26:55<16:08:09, 5.36s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <listcomp>
    results = [self.process_image_unified(file) for file in image_file]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    image = load_image(image_file, tcs_loader=self.tcs_loader)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7323258 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250508_161646_1/images/before_screenshot_1_id_73_internvl_position_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nExtract the coordinates for: Located in the top ribbon toolbar of AutoCAD Mechanical 2019, within the Data section. It sits to the right of the 'Update Fields' button and to the left of the 'Download from Source' button in the same toolbar row."}, {'from': 'gpt', 'value': "Located in the top ribbon toolbar of AutoCAD Mechanical 2019, within the Data section.
It sits to the right of the 'Update Fields' button and to the left of the 'Download from Source' button in the same toolbar row.[[343, 618, 355, 655]]"}], 'width': 3024, 'height': 1964}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_3/images/before_screenshot_11_id_119_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:24:53.528388 load time: 1121.78 ms
51%|█████ | 11250/22095 [19:26:58<14:07:11, 4.69s/it] {'loss': 0.3585, 'grad_norm': 0.6452691574848929, 'learning_rate': 5.09527396603255e-06, 'epoch': 0.51}
51%|█████ | 11250/22095 [19:26:58<14:07:11, 4.69s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358076 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
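The `OSError: unrecognized data stream contents when reading image file` above is Pillow failing mid-decode on a corrupt object: because PIL decodes lazily, the error only surfaces once `pil_loader` calls `img.convert("RGB")`. A defensive loader sketch that forces the full decode immediately and returns `None` for bad streams instead of raising (`safe_pil_load` is a hypothetical helper, not the repo's API; it assumes the raw bytes are already in hand):

```python
import io

from PIL import Image


def safe_pil_load(raw_bytes):
    """Decode image bytes defensively.

    Forces the full decode up front (instead of Pillow's lazy decode),
    so truncated or corrupt streams fail here and return None rather
    than raising deep inside a later convert() call.
    """
    try:
        img = Image.open(io.BytesIO(raw_bytes))
        img.load()  # trigger the actual decode now
        return img.convert("RGB")
    except OSError:  # UnidentifiedImageError is a subclass of OSError
        return None


# Garbage bytes come back as None; a valid in-memory PNG round-trips.
assert safe_pil_load(b"not an image") is None
```

The caller can then drop or resample `None` results, turning a hard crash in the data worker into a skippable sample.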
Problematic sample: {'id': 24787, 'image': 'vrdu_table_final_2/astro-ph.CO/d6b340d4-f9ee-41e1-a2ba-275502dea82e.png', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\eftcamb basis\\end{tabular}\n```"}]} 51%|█████ | 11251/22095 [19:27:01<12:38:23, 4.20s/it] {'loss': 0.3045, 'grad_norm': 0.6289696919418744, 'learning_rate': 5.094541177061575e-06, 'epoch': 0.51} 51%|█████ | 11251/22095 [19:27:01<12:38:23, 4.20s/it] 51%|█████ | 11252/22095 [19:27:04<11:42:15, 3.89s/it] {'loss': 0.3139, 'grad_norm': 0.6431979381717381, 'learning_rate': 5.093808386059199e-06, 'epoch': 0.51} 51%|█████ | 11252/22095 [19:27:04<11:42:15, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████ | 11253/22095 [19:27:14<16:43:54, 5.56s/it] {'loss': 0.4865, 'grad_norm': 0.3145776248691224, 'learning_rate': 5.093075593041169e-06, 'epoch': 0.51} 51%|█████ | 11253/22095 [19:27:14<16:43:54, 5.56s/it] 51%|█████ | 11254/22095 [19:27:18<15:36:54, 5.19s/it] {'loss': 0.2992, 'grad_norm': 0.6125843389798492, 'learning_rate': 5.092342798023231e-06, 'epoch': 0.51} 51%|█████ | 11254/22095 [19:27:18<15:36:54, 5.19s/it] 51%|█████ | 11255/22095 [19:27:22<14:13:26, 4.72s/it] {'loss': 0.3389, 'grad_norm': 0.6316823394050614, 'learning_rate': 5.09161000102113e-06, 'epoch': 0.51} 51%|█████ | 11255/22095 [19:27:22<14:13:26, 4.72s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/vivado/20250509_111638_059895_1047_1/images/before_screenshot_1_id_0_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:25:21.502048 load time: 1768.4 ms 51%|█████ | 11256/22095 [19:27:25<12:58:39, 4.31s/it] {'loss': 0.3666, 'grad_norm': 0.6598121599800412, 'learning_rate': 5.09087720205061e-06, 'epoch': 0.51} 51%|█████ | 11256/22095 
[19:27:25<12:58:39, 4.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_10/images/20250417140218.png 2025-08-28 11:25:22.447928 load time: 1414.94 ms VC:s3://gui-agent/data_20250630/mac/images/terminal/5685b8a4-5bcb-4b03-8a69-df5db43dbe42/images/step_0.png 2025-08-28 11:25:23.643737 load time: 1028.54 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:25:23.643949 load time: 1328.37 ms 51%|█████ | 11257/22095 [19:27:34<17:35:14, 5.84s/it] {'loss': 0.4977, 'grad_norm': 0.28874403727307535, 'learning_rate': 5.09014440112742e-06, 'epoch': 0.51} 51%|█████ | 11257/22095 [19:27:34<17:35:14, 5.84s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 51%|█████ | 11258/22095 [19:27:38<15:46:18, 5.24s/it] {'loss': 0.308, 'grad_norm': 0.583549177944878, 'learning_rate': 5.089411598267301e-06, 'epoch': 0.51} 51%|█████ | 11258/22095 [19:27:38<15:46:18, 5.24s/it] 51%|█████ | 11259/22095 [19:27:41<13:45:15, 4.57s/it] {'loss': 0.3034, 'grad_norm': 0.6404139081606993, 'learning_rate': 5.0886787934860035e-06, 'epoch': 0.51} 51%|█████ | 11259/22095 [19:27:41<13:45:15, 4.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11260/22095 [19:27:44<12:15:49, 4.07s/it] {'loss': 0.3312, 'grad_norm': 0.7182314766797886, 'learning_rate': 5.087945986799271e-06, 'epoch': 0.51} 51%|█████ | 11260/22095 [19:27:44<12:15:49, 4.07s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250502_111053_6/images/before_screenshot_60_id_36_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:25:41.750373 load time: 1079.66 
ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:25:43.701682 load time: 1049.72 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_3/images/step_0.png 2025-08-28 11:25:43.869867 load time: 1167.2 ms 51%|█████ | 11261/22095 [19:27:47<11:25:39, 3.80s/it] {'loss': 0.3477, 'grad_norm': 0.575133981644903, 'learning_rate': 5.087213178222849e-06, 'epoch': 0.51} 51%|█████ | 11261/22095 [19:27:47<11:25:39, 3.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11262/22095 [19:27:50<10:34:47, 3.52s/it] {'loss': 0.3013, 'grad_norm': 0.6096373866619839, 'learning_rate': 5.086480367772483e-06, 'epoch': 0.51} 51%|█████ | 11262/22095 [19:27:50<10:34:47, 3.52s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_2/images/step_1.png 2025-08-28 11:25:50.271158 load time: 1287.72 ms 51%|█████ | 11263/22095 [19:27:54<10:47:39, 3.59s/it] {'loss': 0.3047, 'grad_norm': 0.6335065629864226, 'learning_rate': 5.085747555463921e-06, 'epoch': 0.51} 51%|█████ | 11263/22095 [19:27:54<10:47:39, 3.59s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_2.png 2025-08-28 11:25:50.862789 load time: 1645.8 ms Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047593 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. 
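The `UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images` seen above fires when a P-mode PNG with transparency is converted straight to RGB. A small conversion sketch that routes such images through RGBA first (the white flattening background is an assumption; `to_rgb` is a hypothetical helper, not the repo's `pil_loader`):

```python
from PIL import Image


def to_rgb(img):
    """Convert any PIL image to RGB, sending palette images that carry
    transparency through RGBA first. This avoids the UserWarning above
    and uses the alpha channel instead of silently discarding it."""
    if img.mode == "P" and "transparency" in img.info:
        img = img.convert("RGBA")
    if img.mode == "RGBA":
        # Flatten onto white; the background colour is a choice, not a rule.
        background = Image.new("RGB", img.size, (255, 255, 255))
        background.paste(img, mask=img.getchannel("A"))
        return background
    return img.convert("RGB")


# A palette image with a transparency index comes out as clean RGB.
palette_img = Image.new("RGB", (4, 4), "red").convert("P")
palette_img.info["transparency"] = 0
assert to_rgb(palette_img).mode == "RGB"
```

Doing this once in the loader silences the warning for every affected sample instead of spamming it across ranks.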
Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 6cm\nB. 1cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 51%|█████ | 11264/22095 [19:27:57<10:16:27, 3.41s/it] {'loss': 0.34, 'grad_norm': 0.7209290500060698, 'learning_rate': 5.0850147413129054e-06, 'epoch': 0.51} 51%|█████ | 11264/22095 [19:27:57<10:16:27, 3.41s/it] 51%|█████ | 11265/22095 [19:28:00<9:52:54, 3.28s/it] {'loss': 0.3312, 'grad_norm': 0.5751831416002953, 'learning_rate': 5.084281925335186e-06, 'epoch': 0.51} 51%|█████ | 11265/22095 [19:28:00<9:52:54, 3.28s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:25:59.008410 load time: 1118.24 ms 51%|█████ | 11266/22095 [19:28:03<9:35:41, 3.19s/it] {'loss': 0.33, 'grad_norm': 0.5940786723235196, 'learning_rate': 5.083549107546505e-06, 'epoch': 0.51} 51%|█████ | 11266/22095 [19:28:03<9:35:41, 3.19s/it] 51%|█████ | 11267/22095 [19:28:06<9:47:15, 3.25s/it] {'loss': 0.3349, 'grad_norm': 0.5952467247815935, 'learning_rate': 5.082816287962612e-06, 'epoch': 0.51} 51%|█████ | 11267/22095 [19:28:06<9:47:15, 3.25s/it] 51%|█████ | 11268/22095 [19:28:09<9:40:54, 3.22s/it] {'loss': 0.3127, 'grad_norm': 0.6749626366271329, 'learning_rate': 5.08208346659925e-06, 'epoch': 0.51} 51%|█████ | 11268/22095 [19:28:09<9:40:54, 3.22s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 11:26:09.570136 load time: 1142.78 ms 51%|█████ | 11269/22095 [19:28:13<10:10:56, 3.39s/it] {'loss': 0.3224, 'grad_norm': 0.6412464993754627, 'learning_rate': 5.0813506434721675e-06, 'epoch': 0.51} 51%|█████ | 11269/22095 [19:28:13<10:10:56, 
3.39s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/2bcc2507-16b4-45d6-93d4-8ef594edd6f3/images/step_4.png 2025-08-28 11:26:10.746958 load time: 1181.04 ms 51%|█████ | 11270/22095 [19:28:16<9:50:13, 3.27s/it] {'loss': 0.333, 'grad_norm': 0.6567906439007134, 'learning_rate': 5.080617818597109e-06, 'epoch': 0.51} 51%|█████ | 11270/22095 [19:28:16<9:50:13, 3.27s/it] 51%|█████ | 11271/22095 [19:28:20<10:12:41, 3.40s/it] {'loss': 0.305, 'grad_norm': 0.6648483420021117, 'learning_rate': 5.07988499198982e-06, 'epoch': 0.51} 51%|█████ | 11271/22095 [19:28:20<10:12:41, 3.40s/it] 51%|█████ | 11272/22095 [19:28:23<10:09:28, 3.38s/it] {'loss': 0.3293, 'grad_norm': 0.5562962246107327, 'learning_rate': 5.07915216366605e-06, 'epoch': 0.51} 51%|█████ | 11272/22095 [19:28:23<10:09:28, 3.38s/it] 51%|█████ | 11273/22095 [19:28:26<10:04:31, 3.35s/it] {'loss': 0.3272, 'grad_norm': 0.6539207385618393, 'learning_rate': 5.078419333641542e-06, 'epoch': 0.51} 51%|█████ | 11273/22095 [19:28:26<10:04:31, 3.35s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/inventor/20250512_140254_1/images/before_screenshot_5_id_163_function_0_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 11:26:23.544259 load time: 1576.89 ms VC:s3://gui-agent/data_20250612/mac/images/map/00f8416e-ae71-4e19-9e4f-bbda30643657/images/step_6.png 2025-08-28 11:26:26.326742 load time: 1179.63 ms VC:s3://gui-agent/data_20250421/web/images/wa_map/trajectory_10/img/step_0.png 2025-08-28 11:26:26.866168 load time: 1143.36 ms 51%|█████ | 11274/22095 [19:28:30<10:03:45, 3.35s/it] {'loss': 0.3564, 'grad_norm': 0.5975765423928, 'learning_rate': 5.0776865019320435e-06, 'epoch': 0.51} 51%|█████ | 11274/22095 [19:28:30<10:03:45, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48636 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59147 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (128714 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (48214 > 40960) for 4 sample(s). Truncating to 7254 with 3 samples. 51%|█████ | 11275/22095 [19:28:33<10:10:45, 3.39s/it] {'loss': 0.3093, 'grad_norm': 0.6711899726697937, 'learning_rate': 5.0769536685533005e-06, 'epoch': 0.51} 51%|█████ | 11275/22095 [19:28:33<10:10:45, 3.39s/it] 51%|█████ | 11276/22095 [19:28:36<9:56:07, 3.31s/it] {'loss': 0.3185, 'grad_norm': 0.6561532773126665, 'learning_rate': 5.07622083352106e-06, 'epoch': 0.51} 51%|█████ | 11276/22095 [19:28:36<9:56:07, 3.31s/it] 51%|█████ | 11277/22095 [19:28:40<9:49:02, 3.27s/it] {'loss': 0.3302, 'grad_norm': 0.6482268957086683, 'learning_rate': 5.075487996851067e-06, 'epoch': 0.51} 51%|█████ | 11277/22095 [19:28:40<9:49:02, 3.27s/it] 51%|█████ | 11278/22095 [19:28:43<10:08:48, 3.38s/it] {'loss': 0.3065, 'grad_norm': 0.6693747821770387, 'learning_rate': 5.074755158559071e-06, 'epoch': 0.51} 51%|█████ | 11278/22095 [19:28:43<10:08:48, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11279/22095 [19:28:51<14:28:20, 4.82s/it] {'loss': 0.4807, 'grad_norm': 0.3465167500866455, 'learning_rate': 5.074022318660813e-06, 'epoch': 0.51} 51%|█████ | 11279/22095 [19:28:51<14:28:20, 4.82s/it] 51%|█████ | 11280/22095 
[19:28:55<13:02:36, 4.34s/it] {'loss': 0.321, 'grad_norm': 0.6136755901790223, 'learning_rate': 5.073289477172045e-06, 'epoch': 0.51} 51%|█████ | 11280/22095 [19:28:55<13:02:36, 4.34s/it]VC:s3://gui-agent/data_20250421/web/images/wa_map/trajectory_38/img/step_0.png 2025-08-28 11:26:53.321836 load time: 1123.44 ms 51%|█████ | 11281/22095 [19:28:58<12:13:12, 4.07s/it] {'loss': 0.3116, 'grad_norm': 0.6950315561088651, 'learning_rate': 5.072556634108511e-06, 'epoch': 0.51} 51%|█████ | 11281/22095 [19:28:58<12:13:12, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57919 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11282/22095 [19:29:01<11:11:46, 3.73s/it] {'loss': 0.3314, 'grad_norm': 0.629696999529083, 'learning_rate': 5.0718237894859564e-06, 'epoch': 0.51} 51%|█████ | 11282/22095 [19:29:01<11:11:46, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/images/matlab/handmade_annotation_3/images/ML_3_id_10_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:27:01.399961 load time: 1086.83 ms 51%|█████ | 11283/22095 [19:29:10<15:47:52, 5.26s/it] {'loss': 0.4862, 'grad_norm': 0.32162850665106396, 'learning_rate': 5.0710909433201305e-06, 'epoch': 0.51} 51%|█████ | 11283/22095 [19:29:10<15:47:52, 5.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [400, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8500380 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [400, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37889, 'image': 'vrdu_texteq/astro-ph.CO/822d8b8e-3544-4b32-9690-8c270749b169.png', 'image_wh': [[400, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'so we have $\\epsilon \\approx \\Delta b$ when $\\Delta b \\ll 1$.'}]} VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30693.png 2025-08-28 11:27:09.101903 load time: 1828.57 ms 51%|█████ | 11284/22095 [19:29:13<14:07:20, 4.70s/it] {'loss': 0.3205, 'grad_norm': 0.6296288455006523, 'learning_rate': 5.07035809562678e-06, 'epoch': 0.51} 51%|█████ | 11284/22095 [19:29:13<14:07:20, 4.70s/it] 51%|█████ | 11285/22095 [19:29:17<13:42:26, 4.56s/it] {'loss': 0.328, 'grad_norm': 0.6243151764913002, 'learning_rate': 5.069625246421646e-06, 'epoch': 0.51} 51%|█████ | 11285/22095 [19:29:17<13:42:26, 4.56s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11286/22095 [19:29:21<12:27:34, 4.15s/it] {'loss': 0.3147, 'grad_norm': 0.6881553488129036, 'learning_rate': 5.068892395720482e-06, 'epoch': 0.51} 51%|█████ | 11286/22095 [19:29:21<12:27:34, 4.15s/it] 51%|█████ | 11287/22095 [19:29:24<11:37:18, 3.87s/it] {'loss': 0.3508, 'grad_norm': 0.6794158457913292, 'learning_rate': 5.068159543539031e-06, 'epoch': 0.51} 51%|█████ | 11287/22095 [19:29:24<11:37:18, 3.87s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/4883f6e6-c658-4d61-9cf9-e32c2b812a80/images/step_1.png 2025-08-28 11:27:23.799912 load time: 1350.86 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_2/images/step_0.png 2025-08-28 11:27:23.898098 load time: 1655.57 ms 51%|█████ | 11288/22095 [19:29:28<11:28:36, 3.82s/it] {'loss': 0.3202, 'grad_norm': 0.679924159439684, 'learning_rate': 5.067426689893043e-06, 'epoch': 0.51} 51%|█████ | 11288/22095 
[19:29:28<11:28:36, 3.82s/it] 51%|█████ | 11289/22095 [19:29:31<10:59:48, 3.66s/it] {'loss': 0.3471, 'grad_norm': 0.5821612922899742, 'learning_rate': 5.0666938347982595e-06, 'epoch': 0.51} 51%|█████ | 11289/22095 [19:29:31<10:59:48, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (99586880 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38019.png 2025-08-28 11:27:27.765213 load time: 2483.46 ms 51%|█████ | 11290/22095 [19:29:40<16:12:45, 5.40s/it] {'loss': 0.471, 'grad_norm': 0.29824764368092205, 'learning_rate': 5.065960978270432e-06, 'epoch': 0.51} 51%|█████ | 11290/22095 [19:29:40<16:12:45, 5.40s/it] 51%|█████ | 11291/22095 [19:29:44<14:55:32, 4.97s/it] {'loss': 0.3381, 'grad_norm': 0.8062790804593769, 'learning_rate': 5.065228120325305e-06, 'epoch': 0.51} 51%|█████ | 11291/22095 [19:29:44<14:55:32, 4.97s/it]VC:s3://gui/aguvis/aguvis-stage2/android_control/images/11259/screenshot_0.png 2025-08-28 11:27:44.687081 load time: 1026.44 ms 51%|█████ | 11292/22095 [19:29:48<14:06:53, 4.70s/it] {'loss': 0.3005, 'grad_norm': 0.8877515178860867, 'learning_rate': 5.064495260978627e-06, 'epoch': 0.51} 51%|█████ | 11292/22095 [19:29:48<14:06:53, 4.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/terminal/2bcc2507-16b4-45d6-93d4-8ef594edd6f3/images/step_2.png 2025-08-28 11:27:48.299244 load time: 1302.32 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 11:27:48.666275 load time: 1244.59 ms 51%|█████ | 11293/22095 [19:29:57<17:27:41, 5.82s/it] {'loss': 0.4749, 'grad_norm': 0.3518551067373952, 'learning_rate': 
5.063762400246142e-06, 'epoch': 0.51}
51%|█████ | 11293/22095 [19:29:57<17:27:41, 5.82s/it]
51%|█████ | 11294/22095 [19:30:02<16:37:20, 5.54s/it] {'loss': 0.3357, 'grad_norm': 0.5545536333545275, 'learning_rate': 5.0630295381436024e-06, 'epoch': 0.51}
51%|█████ | 11294/22095 [19:30:02<16:37:20, 5.54s/it]
51%|█████ | 11295/22095 [19:30:05<15:06:43, 5.04s/it] {'loss': 0.3143, 'grad_norm': 0.6323065540402106, 'learning_rate': 5.0622966746867474e-06, 'epoch': 0.51}
51%|█████ | 11295/22095 [19:30:05<15:06:43, 5.04s/it]
51%|█████ | 11296/22095 [19:30:09<13:38:43, 4.55s/it] {'loss': 0.3411, 'grad_norm': 0.6076090835310118, 'learning_rate': 5.061563809891331e-06, 'epoch': 0.51}
51%|█████ | 11296/22095 [19:30:09<13:38:43, 4.55s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [359, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8509585 in VC:s3://internvl-moe-sft-data/. Exception: Image size [359, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 30651, 'image': 'vrdu_texteq/astro-ph.CO/8c6749cc-abd5-46e8-b258-4a129b169bb7.png', 'image_wh': [[359, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $\\mathbf F$ is the Fisher Matrix.'}]} 51%|█████ | 11297/22095 [19:30:12<12:42:19, 4.24s/it] {'loss': 0.342, 'grad_norm': 0.653226132656761, 'learning_rate': 5.060830943773096e-06, 'epoch': 0.51} 51%|█████ | 11297/22095 [19:30:12<12:42:19, 4.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/images/eviews/handmade_annotation_2/images/Eviews_1_id_6_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:28:11.166142 load time: 1211.91 ms VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_8.png 2025-08-28 11:28:12.932408 load time: 1345.52 ms 51%|█████ | 11298/22095 [19:30:16<12:21:13, 4.12s/it] {'loss': 0.3033, 'grad_norm': 0.6571716045609006, 'learning_rate': 5.060098076347793e-06, 'epoch': 0.51} 51%|█████ | 11298/22095 [19:30:16<12:21:13, 4.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (143038128 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_10.png 2025-08-28 11:28:15.792191 load time: 1051.3 ms VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31442.png 2025-08-28 11:28:13.624869 load time: 2084.81 ms 51%|█████ | 11299/22095 [19:30:20<11:39:57, 3.89s/it] {'loss': 0.3646, 'grad_norm': 0.7121177366376688, 'learning_rate': 5.059365207631164e-06, 'epoch': 0.51} 51%|█████ | 11299/22095 [19:30:20<11:39:57, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45080 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54075 > 40960). Running this sequence through the model will result in indexing errors 51%|█████ | 11300/22095 [19:30:23<11:32:43, 3.85s/it] {'loss': 0.316, 'grad_norm': 0.6071299612800266, 'learning_rate': 5.05863233763896e-06, 'epoch': 0.51} 51%|█████ | 11300/22095 [19:30:23<11:32:43, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45269 > 40960). 
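The two `DecompressionBombWarning` entries (99,586,880 and 143,038,128 pixels against Pillow's default 89,478,485-pixel ceiling) are Pillow's safety guard against decompression-bomb images, not a data error: some of the screenshots here are simply very large. If this bucket is trusted, the limit can be raised explicitly; a sketch (the 160 MP ceiling is an assumption chosen to clear the largest image in the log):

```python
from PIL import Image

# Pillow warns when width * height exceeds Image.MAX_IMAGE_PIXELS
# (default 89,478,485, matching the number in the warning). The largest
# screenshot logged here is ~143 MP, so a 160 MP ceiling clears it for
# trusted data; setting the attribute to None disables the check
# entirely, which is riskier with untrusted inputs.
Image.MAX_IMAGE_PIXELS = 160_000_000

assert Image.MAX_IMAGE_PIXELS > 143_038_128
```

This should run once at process start (e.g. next to the dataloader setup) so every worker inherits the raised limit.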
Running this sequence through the model will result in indexing errors 51%|█████ | 11301/22095 [19:30:27<11:16:23, 3.76s/it] {'loss': 0.3396, 'grad_norm': 0.6936881305190062, 'learning_rate': 5.057899466386927e-06, 'epoch': 0.51} 51%|█████ | 11301/22095 [19:30:27<11:16:23, 3.76s/it] 51%|█████ | 11302/22095 [19:30:30<11:03:45, 3.69s/it] {'loss': 0.3067, 'grad_norm': 0.6665339230126445, 'learning_rate': 5.057166593890813e-06, 'epoch': 0.51} 51%|█████ | 11302/22095 [19:30:30<11:03:45, 3.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250616/windows_paste/images/autocad/20250509_145313_988389_1251_1/images/before_screenshot_1_id_0_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:28:29.842726 load time: 1353.1 ms 51%|█████ | 11303/22095 [19:30:33<10:16:40, 3.43s/it] {'loss': 0.3372, 'grad_norm': 0.6278926273373675, 'learning_rate': 5.056433720166365e-06, 'epoch': 0.51} 51%|█████ | 11303/22095 [19:30:33<10:16:40, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 11:28:32.705731 load time: 1458.59 ms 51%|█████ | 11304/22095 [19:30:37<10:11:18, 3.40s/it] {'loss': 0.3061, 'grad_norm': 0.7060565458924746, 'learning_rate': 5.0557008452293275e-06, 'epoch': 0.51} 51%|█████ | 11304/22095 [19:30:37<10:11:18, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62589 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11305/22095 [19:30:39<9:37:32, 3.21s/it] {'loss': 0.2859, 'grad_norm': 0.5859283629350893, 'learning_rate': 5.054967969095453e-06, 'epoch': 0.51} 51%|█████ | 11305/22095 [19:30:39<9:37:32, 3.21s/it] 51%|█████ | 11306/22095 [19:30:43<9:41:22, 3.23s/it] {'loss': 0.3336, 'grad_norm': 0.624192608388278, 'learning_rate': 5.054235091780483e-06, 'epoch': 0.51} 51%|█████ | 11306/22095 [19:30:43<9:41:22, 3.23s/it]VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_10/img/step_0.png 2025-08-28 11:28:42.837080 load time: 1635.15 ms 51%|█████ | 11307/22095 [19:30:47<10:27:07, 3.49s/it] {'loss': 0.3499, 'grad_norm': 0.6051449201263233, 'learning_rate': 5.0535022133001684e-06, 'epoch': 0.51} 51%|█████ | 11307/22095 [19:30:47<10:27:07, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250501_170451_4/images/before_screenshot_41_id_1485_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:28:46.587143 load time: 1053.72 ms VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_47_id_94_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:28:47.249836 load time: 1275.48 ms 51%|█████ | 11308/22095 [19:30:54<13:53:52, 4.64s/it] {'loss': 0.4626, 'grad_norm': 0.33856949948383025, 'learning_rate': 5.052769333670255e-06, 'epoch': 0.51} 51%|█████ | 11308/22095 [19:30:54<13:53:52, 4.64s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:28:53.269562 load time: 2049.84 ms 51%|█████ | 11309/22095 [19:31:03<18:12:15, 6.08s/it] {'loss': 0.4817, 'grad_norm': 0.3127064522869994, 'learning_rate': 5.052036452906493e-06, 'epoch': 0.51} 51%|█████ | 11309/22095 [19:31:03<18:12:15, 6.08s/it]Invalidate trace cache @ step 2: expected module 364, but got 
module 1 51%|█████ | 11310/22095 [19:31:07<16:07:30, 5.38s/it] {'loss': 0.2916, 'grad_norm': 0.692213222529521, 'learning_rate': 5.051303571024625e-06, 'epoch': 0.51} 51%|█████ | 11310/22095 [19:31:07<16:07:30, 5.38s/it] 51%|█████ | 11311/22095 [19:31:11<14:32:18, 4.85s/it] {'loss': 0.3441, 'grad_norm': 0.6157695962818099, 'learning_rate': 5.050570688040402e-06, 'epoch': 0.51} 51%|█████ | 11311/22095 [19:31:11<14:32:18, 4.85s/it]VC:s3://gui-agent/data_20250612/mac/images/reminders/ffb7349e-e87d-4f78-8cb6-445c57daba79/images/step_2.png 2025-08-28 11:29:10.264732 load time: 1062.78 ms VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_2/img/step_0.png 2025-08-28 11:29:10.306176 load time: 1217.89 ms VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20420.png 2025-08-28 11:29:09.625578 load time: 1459.37 ms 51%|█████ | 11312/22095 [19:31:15<13:34:21, 4.53s/it] {'loss': 0.3353, 'grad_norm': 0.703509672456279, 'learning_rate': 5.0498378039695685e-06, 'epoch': 0.51} 51%|█████ | 11312/22095 [19:31:15<13:34:21, 4.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11313/22095 [19:31:25<18:28:15, 6.17s/it] {'loss': 0.4974, 'grad_norm': 0.3098488163330391, 'learning_rate': 5.0491049188278755e-06, 'epoch': 0.51} 51%|█████ | 11313/22095 [19:31:25<18:28:15, 6.17s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 11:29:24.467989 load time: 1465.8 ms 51%|█████ | 11314/22095 [19:31:29<16:36:10, 5.54s/it] {'loss': 0.2952, 'grad_norm': 0.6039855223359988, 'learning_rate': 5.048372032631067e-06, 'epoch': 0.51} 51%|█████ | 11314/22095 [19:31:29<16:36:10, 
5.54s/it]VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/images/eviews/handmade_annotation_2/images/Eviews_1_id_18_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:29:27.480147 load time: 1198.35 ms 51%|█████ | 11315/22095 [19:31:32<14:38:08, 4.89s/it] {'loss': 0.314, 'grad_norm': 0.6613158443050209, 'learning_rate': 5.047639145394895e-06, 'epoch': 0.51} 51%|█████ | 11315/22095 [19:31:32<14:38:08, 4.89s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/57f0e951e810592d9ceae1ee4edc34446d856723e561f6008a9f3984f3d70a51.png 2025-08-28 11:29:31.462609 load time: 1407.31 ms 51%|█████ | 11316/22095 [19:31:35<12:38:16, 4.22s/it] {'loss': 0.3231, 'grad_norm': 0.6614902158701181, 'learning_rate': 5.0469062571351e-06, 'epoch': 0.51} 51%|█████ | 11316/22095 [19:31:35<12:38:16, 4.22s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_3/images/step_3.png 2025-08-28 11:29:33.091846 load time: 1083.86 ms 51%|█████ | 11317/22095 [19:31:38<11:33:31, 3.86s/it] {'loss': 0.3198, 'grad_norm': 0.9304745086510386, 'learning_rate': 5.046173367867438e-06, 'epoch': 0.51} 51%|█████ | 11317/22095 [19:31:38<11:33:31, 3.86s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████ | 11318/22095 [19:31:41<10:52:02, 3.63s/it] {'loss': 0.2927, 'grad_norm': 0.5757091533057657, 'learning_rate': 5.045440477607649e-06, 'epoch': 0.51} 51%|█████ | 11318/22095 [19:31:41<10:52:02, 3.63s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 11:29:39.616433 load time: 1790.69 ms 51%|█████ | 11319/22095 [19:31:44<10:43:54, 3.59s/it] {'loss': 0.3387, 'grad_norm': 0.6358379134113163, 'learning_rate': 5.0447075863714845e-06, 'epoch': 0.51} 51%|█████ | 11319/22095 [19:31:44<10:43:54, 
3.59s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:29:43.653682 load time: 1102.56 ms 51%|█████ | 11320/22095 [19:31:48<10:40:30, 3.57s/it] {'loss': 0.3569, 'grad_norm': 0.6245896771537696, 'learning_rate': 5.0439746941746914e-06, 'epoch': 0.51} 51%|█████ | 11320/22095 [19:31:48<10:40:30, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48593 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47332 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44013 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53249 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62920 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42018 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████ | 11321/22095 [19:31:52<10:53:54, 3.64s/it] {'loss': 0.3068, 'grad_norm': 0.5986751300931925, 'learning_rate': 5.043241801033016e-06, 'epoch': 0.51} 51%|█████ | 11321/22095 [19:31:52<10:53:54, 3.64s/it] 51%|█████ | 11322/22095 [19:31:55<10:22:25, 3.47s/it] {'loss': 0.3433, 'grad_norm': 0.6944406862173468, 'learning_rate': 5.0425089069622094e-06, 'epoch': 0.51} 51%|█████ | 11322/22095 [19:31:55<10:22:25, 3.47s/it]VC:s3://gui-agent/data_20250421/web/images/dmv_virginia_gov/trajectory_17/img/step_5.png 2025-08-28 11:29:54.771527 load time: 1049.97 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:29:55.149098 load time: 1103.43 ms 51%|█████ | 11323/22095 [19:31:58<10:11:35, 3.41s/it] {'loss': 0.3522, 'grad_norm': 0.6517562624227294, 'learning_rate': 5.041776011978016e-06, 'epoch': 0.51} 51%|█████ | 11323/22095 [19:31:58<10:11:35, 3.41s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/ui2json_os_d20240822_v1/omniact/photos_screen_2.png 2025-08-28 11:29:57.363469 load time: 1071.75 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_183722_1/images/before_screenshot_1_id_90_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:29:58.051681 load time: 1370.71 ms 51%|█████▏ | 11324/22095 [19:32:03<11:18:49, 3.78s/it] {'loss': 0.3264, 'grad_norm': 0.6062106577965188, 'learning_rate': 5.041043116096184e-06, 'epoch': 0.51} 51%|█████▏ | 11324/22095 [19:32:03<11:18:49, 3.78s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_2.png 2025-08-28 11:30:01.417591 load time: 1027.72 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 
11:30:01.419321 load time: 1011.03 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_0.png 2025-08-28 11:30:02.267650 load time: 1373.84 ms 51%|█████▏ | 11325/22095 [19:32:06<10:38:17, 3.56s/it] {'loss': 0.282, 'grad_norm': 0.6044463216389774, 'learning_rate': 5.040310219332462e-06, 'epoch': 0.51} 51%|█████▏ | 11325/22095 [19:32:06<10:38:17, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41749 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52056 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48326 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83783 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81967 > 40960). Running this sequence through the model will result in indexing errors 51%|█████▏ | 11326/22095 [19:32:09<10:12:03, 3.41s/it] {'loss': 0.3062, 'grad_norm': 0.6393686203566962, 'learning_rate': 5.039577321702597e-06, 'epoch': 0.51} 51%|█████▏ | 11326/22095 [19:32:09<10:12:03, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50679 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████▏ | 11327/22095 [19:32:12<10:14:52, 3.43s/it] {'loss': 0.2901, 'grad_norm': 0.6003692003449826, 'learning_rate': 5.038844423222337e-06, 'epoch': 0.51} 51%|█████▏ | 11327/22095 [19:32:12<10:14:52, 3.43s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:30:12.444191 load time: 1364.77 ms 51%|█████▏ | 11328/22095 [19:32:15<10:02:35, 3.36s/it] {'loss': 0.2987, 'grad_norm': 0.588872996754063, 'learning_rate': 5.038111523907429e-06, 'epoch': 0.51} 51%|█████▏ | 11328/22095 [19:32:15<10:02:35, 3.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8952445 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3280, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12'}]} 51%|█████▏ | 11329/22095 [19:32:22<13:06:14, 4.38s/it] {'loss': 0.4625, 'grad_norm': 0.4081869555533952, 'learning_rate': 5.037378623773622e-06, 'epoch': 0.51} 51%|█████▏ | 11329/22095 [19:32:22<13:06:14, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73243 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████▏ | 11330/22095 [19:32:26<12:44:28, 4.26s/it] {'loss': 0.3555, 'grad_norm': 0.5975544563016666, 'learning_rate': 5.0366457228366625e-06, 'epoch': 0.51} 51%|█████▏ | 11330/22095 [19:32:26<12:44:28, 4.26s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/f9be7ed3-49aa-4f23-a176-7af6afdfae84/images/step_3.png 2025-08-28 11:30:24.930144 load time: 1151.63 ms 51%|█████▏ | 11331/22095 [19:32:29<11:50:15, 3.96s/it] {'loss': 0.3226, 'grad_norm': 0.6545036025728562, 'learning_rate': 5.0359128211123e-06, 'epoch': 0.51} 51%|█████▏ | 11331/22095 [19:32:29<11:50:15, 3.96s/it] 51%|█████▏ | 11332/22095 [19:32:32<11:00:36, 3.68s/it] {'loss': 0.3669, 'grad_norm': 0.6590359641589497, 'learning_rate': 5.03517991861628e-06, 'epoch': 0.51} 51%|█████▏ | 11332/22095 [19:32:32<11:00:36, 3.68s/it] 51%|█████▏ | 11333/22095 [19:32:36<10:27:49, 3.50s/it] {'loss': 0.3039, 'grad_norm': 0.6099890479128197, 'learning_rate': 5.0344470153643525e-06, 'epoch': 0.51} 51%|█████▏ | 11333/22095 [19:32:36<10:27:49, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (59225 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45155 > 40960). Running this sequence through the model will result in indexing errors 51%|█████▏ | 11334/22095 [19:32:43<13:57:20, 4.67s/it] {'loss': 0.4761, 'grad_norm': 0.28940934663786255, 'learning_rate': 5.033714111372264e-06, 'epoch': 0.51} 51%|█████▏ | 11334/22095 [19:32:43<13:57:20, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76815 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76926 > 40960). Running this sequence through the model will result in indexing errors 51%|█████▏ | 11335/22095 [19:32:51<16:44:04, 5.60s/it] {'loss': 0.4659, 'grad_norm': 0.27877054420668457, 'learning_rate': 5.0329812066557625e-06, 'epoch': 0.51} 51%|█████▏ | 11335/22095 [19:32:51<16:44:04, 5.60s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_0.png 2025-08-28 11:30:50.229827 load time: 1070.69 ms VC:s3://gui-agent/data_20250612/mac/images/calculator/e24c5377-9274-4e23-9add-775946a51baa/images/step_1.png 2025-08-28 11:30:51.157818 load time: 1014.76 ms 51%|█████▏ | 11336/22095 [19:32:54<14:58:52, 5.01s/it] {'loss': 0.3442, 'grad_norm': 0.6698297968837902, 'learning_rate': 5.032248301230598e-06, 'epoch': 0.51} 51%|█████▏ | 11336/22095 [19:32:54<14:58:52, 5.01s/it] 51%|█████▏ | 11337/22095 [19:32:57<13:19:37, 4.46s/it] {'loss': 0.3132, 'grad_norm': 0.6981877078120617, 'learning_rate': 5.031515395112514e-06, 'epoch': 0.51} 51%|█████▏ | 11337/22095 [19:32:58<13:19:37, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (87716 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48868 > 40960). 
Running this sequence through the model will result in indexing errors 51%|█████▏ | 11338/22095 [19:33:06<16:55:55, 5.67s/it] {'loss': 0.4765, 'grad_norm': 0.29108841447013006, 'learning_rate': 5.030782488317264e-06, 'epoch': 0.51} 51%|█████▏ | 11338/22095 [19:33:06<16:55:55, 5.67s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31489.png 2025-08-28 11:31:06.433338 load time: 1058.15 ms 51%|█████▏ | 11339/22095 [19:33:10<15:01:49, 5.03s/it] {'loss': 0.3438, 'grad_norm': 0.6216874067123067, 'learning_rate': 5.0300495808605905e-06, 'epoch': 0.51} 51%|█████▏ | 11339/22095 [19:33:10<15:01:49, 5.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 51%|█████▏ | 11340/22095 [19:33:13<13:38:03, 4.56s/it] {'loss': 0.295, 'grad_norm': 0.6140580587652936, 'learning_rate': 5.029316672758244e-06, 'epoch': 0.51} 51%|█████▏ | 11340/22095 [19:33:13<13:38:03, 4.56s/it] 51%|█████▏ | 11341/22095 [19:33:16<12:19:45, 4.13s/it] {'loss': 0.3516, 'grad_norm': 0.6596484104592925, 'learning_rate': 5.028583764025973e-06, 'epoch': 0.51} 51%|█████▏ | 11341/22095 [19:33:16<12:19:45, 4.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_2/images/step_0.png 2025-08-28 11:31:15.880848 load time: 1944.53 ms 51%|█████▏ | 11342/22095 [19:33:25<16:17:18, 5.45s/it] {'loss': 0.5116, 'grad_norm': 0.3130909373251786, 'learning_rate': 5.027850854679525e-06, 'epoch': 0.51} 51%|█████▏ | 11342/22095 [19:33:25<16:17:18, 5.45s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/20475.png 2025-08-28 11:31:23.791345 load time: 1062.58 ms 51%|█████▏ | 11343/22095 [19:33:35<20:58:21, 7.02s/it] {'loss': 0.4425, 'grad_norm': 0.2835968055163799, 'learning_rate': 5.0271179447346465e-06, 'epoch': 0.51} 51%|█████▏ | 11343/22095 [19:33:35<20:58:21, 7.02s/it]Invalidate trace cache @ step 2: 
expected module 364, but got module 1 51%|█████▏ | 11344/22095 [19:33:40<18:25:59, 6.17s/it] {'loss': 0.3356, 'grad_norm': 0.7193538124059337, 'learning_rate': 5.026385034207087e-06, 'epoch': 0.51} 51%|█████▏ | 11344/22095 [19:33:40<18:25:59, 6.17s/it] 51%|█████▏ | 11345/22095 [19:33:43<15:51:29, 5.31s/it] {'loss': 0.3215, 'grad_norm': 0.5762159401245821, 'learning_rate': 5.0256521231125945e-06, 'epoch': 0.51} 51%|█████▏ | 11345/22095 [19:33:43<15:51:29, 5.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 11:31:43.325672 load time: 1158.12 ms 51%|█████▏ | 11346/22095 [19:33:52<19:41:23, 6.59s/it] {'loss': 0.457, 'grad_norm': 0.27371331037951185, 'learning_rate': 5.024919211466916e-06, 'epoch': 0.51} 51%|█████▏ | 11346/22095 [19:33:52<19:41:23, 6.59s/it] 51%|█████▏ | 11347/22095 [19:33:56<16:38:56, 5.58s/it] {'loss': 0.3389, 'grad_norm': 0.6497819866175213, 'learning_rate': 5.024186299285801e-06, 'epoch': 0.51} 51%|█████▏ | 11347/22095 [19:33:56<16:38:56, 5.58s/it] 51%|█████▏ | 11348/22095 [19:34:00<15:12:59, 5.10s/it] {'loss': 0.3606, 'grad_norm': 0.5756216408873701, 'learning_rate': 5.023453386584997e-06, 'epoch': 0.51} 51%|█████▏ | 11348/22095 [19:34:00<15:12:59, 5.10s/it] 51%|█████▏ | 11349/22095 [19:34:03<13:48:32, 4.63s/it] {'loss': 0.3028, 'grad_norm': 0.581163075070484, 'learning_rate': 5.02272047338025e-06, 'epoch': 0.51} 51%|█████▏ | 11349/22095 [19:34:03<13:48:32, 4.63s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 11:32:01.902703 load time: 1495.39 ms 51%|█████▏ | 11350/22095 [19:34:06<12:15:59, 4.11s/it] {'loss': 0.3246, 'grad_norm': 0.5916833302080331, 'learning_rate': 5.021987559687311e-06, 'epoch': 0.51} 51%|█████▏ | 11350/22095 [19:34:06<12:15:59, 4.11s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (89941 > 40960). Running this sequence through the model will result in indexing errors 51%|█████▏ | 11351/22095 [19:34:10<11:44:01, 3.93s/it] {'loss': 0.3248, 'grad_norm': 0.6405202372136433, 'learning_rate': 5.021254645521927e-06, 'epoch': 0.51} 51%|█████▏ | 11351/22095 [19:34:10<11:44:01, 3.93s/it]VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_7.png 2025-08-28 11:32:08.322224 load time: 1306.22 ms 51%|█████▏ | 11352/22095 [19:34:13<10:58:15, 3.68s/it] {'loss': 0.2805, 'grad_norm': 0.5926543425677399, 'learning_rate': 5.020521730899846e-06, 'epoch': 0.51} 51%|█████▏ | 11352/22095 [19:34:13<10:58:15, 3.68s/it] 51%|█████▏ | 11353/22095 [19:34:16<10:56:12, 3.67s/it] {'loss': 0.3524, 'grad_norm': 0.676497412921749, 'learning_rate': 5.019788815836816e-06, 'epoch': 0.51} 51%|█████▏ | 11353/22095 [19:34:16<10:56:12, 3.67s/it] 51%|█████▏ | 11354/22095 [19:34:19<10:20:53, 3.47s/it] {'loss': 0.4115, 'grad_norm': 0.6304341978323126, 'learning_rate': 5.019055900348584e-06, 'epoch': 0.51} 51%|█████▏ | 11354/22095 [19:34:19<10:20:53, 3.47s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_5.png 2025-08-28 11:32:18.055140 load time: 1102.39 ms VC:s3://gui-agent/data_20250612/mac/images/terminal/95886ee2-46c8-4a0f-865b-9ddbfb2af444/images/step_0.png 2025-08-28 11:32:19.135306 load time: 1018.43 ms 51%|█████▏ | 11355/22095 [19:34:23<10:41:19, 3.58s/it] {'loss': 0.3192, 'grad_norm': 0.5920342969027814, 'learning_rate': 5.018322984450902e-06, 'epoch': 0.51} 51%|█████▏ | 11355/22095 [19:34:23<10:41:19, 3.58s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in 
_get_item raise ValueError( ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8895854 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 19007, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm'}, {'from': 'gpt', 'value': '【解答】解:∵CB=3cm,DB=5cm,∴CD=5-3=2cm,∵D是AC的中点,∴AC=2CD=4cm,∴AB=AC+CB=4+3=7cm.'}]} 51%|█████▏ | 11356/22095 [19:34:27<10:36:14, 3.55s/it] {'loss': 0.3172, 'grad_norm': 0.6092487652218684, 'learning_rate': 5.0175900681595116e-06, 'epoch': 0.51} 51%|█████▏ | 11356/22095 [19:34:27<10:36:14, 3.55s/it] 51%|█████▏ | 11357/22095 [19:34:30<10:12:58, 3.43s/it] {'loss': 0.352, 'grad_norm': 0.6443659421278713, 'learning_rate': 5.016857151490167e-06, 'epoch': 0.51} 51%|█████▏ | 11357/22095 [19:34:30<10:12:58, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_4.png 2025-08-28 11:32:30.100873 load time: 1099.4 ms VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250501_133654_1/images/before_screenshot_6_id_187_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:32:30.069571 load time: 1184.3 ms 51%|█████▏ | 11358/22095 [19:34:33<10:07:07, 3.39s/it] {'loss': 0.3265, 'grad_norm': 0.6284718857153448, 'learning_rate': 5.016124234458612e-06, 'epoch': 0.51} 51%|█████▏ | 11358/22095 [19:34:33<10:07:07, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:32:31.880131 load time: 1050.63 ms
51%|█████▏ | 11359/22095 [19:34:41<14:15:10, 4.78s/it] {'loss': 0.4694, 'grad_norm': 0.3664084196581059, 'learning_rate': 5.0153913170806e-06, 'epoch': 0.51}
51%|█████▏ | 11360/22095 [19:34:45<13:10:21, 4.42s/it] {'loss': 0.3564, 'grad_norm': 0.6508373755946546, 'learning_rate': 5.0146583993718746e-06, 'epoch': 0.51}
Token indices sequence length is longer than the specified maximum sequence length for this model (41283 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71397 > 40960). Running this sequence through the model will result in indexing errors
51%|█████▏ | 11361/22095 [19:34:48<12:01:00, 4.03s/it] {'loss': 0.3143, 'grad_norm': 0.7032696127025374, 'learning_rate': 5.013925481348184e-06, 'epoch': 0.51}
51%|█████▏ | 11362/22095 [19:34:51<11:43:56, 3.94s/it] {'loss': 0.335, 'grad_norm': 0.6514752461003346, 'learning_rate': 5.013192563025279e-06, 'epoch': 0.51}
51%|█████▏ | 11363/22095 [19:34:55<11:44:53, 3.94s/it] {'loss': 0.3743, 'grad_norm': 0.6165901346699098, 'learning_rate': 5.012459644418905e-06, 'epoch': 0.51}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_1/images/step_0.png 2025-08-28 11:32:54.765165 load time: 1973.66 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/cf183c3a-83c3-4eb7-a60d-c6cbfaa27f3e/images/step_0.png 2025-08-28 11:32:55.464604 load time: 1419.46 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_4/images/step_2.png 2025-08-28 11:32:56.085694 load time: 1054.12 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:32:55.864190 load time: 1434.38 ms
51%|█████▏ | 11364/22095 [19:34:59<11:43:11, 3.93s/it] {'loss': 0.3444, 'grad_norm': 0.6550129626238077, 'learning_rate': 5.0117267255448125e-06, 'epoch': 0.51}
51%|█████▏ | 11365/22095 [19:35:02<10:51:56, 3.65s/it] {'loss': 0.3336, 'grad_norm': 0.6607140941538844, 'learning_rate': 5.010993806418749e-06, 'epoch': 0.51}
51%|█████▏ | 11366/22095 [19:35:05<10:10:04, 3.41s/it] {'loss': 0.3137, 'grad_norm': 0.6011220664787138, 'learning_rate': 5.010260887056461e-06, 'epoch': 0.51}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
51%|█████▏ | 11367/22095 [19:35:09<10:17:37, 3.45s/it] {'loss': 0.2907, 'grad_norm': 0.7794734661688585, 'learning_rate': 5.0095279674736985e-06, 'epoch': 0.51}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047549 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
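Editor's note: the DecompressionBombWarning near the top of this chunk fires because one image (143,038,128 pixels) exceeds Pillow's default `Image.MAX_IMAGE_PIXELS` limit of 89,478,485. A sketch of the two usual responses, using a hypothetical stdlib-only guard (only the PIL attribute name is from the library itself):

```python
# For trusted data, Pillow's safety limit can be raised before decoding:
#     from PIL import Image
#     Image.MAX_IMAGE_PIXELS = 200_000_000   # default is 89_478_485
# Alternatively, screen pixel counts up front (hypothetical guard):
PIL_DEFAULT_LIMIT = 89_478_485  # Pillow's default decompression-bomb threshold

def within_pixel_budget(num_pixels, budget=PIL_DEFAULT_LIMIT):
    """True if an image's pixel count stays under the decompression-bomb limit."""
    return num_pixels <= budget
```

The 143,038,128-pixel image from the warning would fail this check, so a loader could skip or downscale it instead of merely warning.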
Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 16cm\nB. 10cm\nC. 5cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 11:33:08.984096 load time: 1291.37 ms 51%|█████▏ | 11368/22095 [19:35:12<10:10:56, 3.42s/it] {'loss': 0.359, 'grad_norm': 0.6761020044403644, 'learning_rate': 5.00879504768621e-06, 'epoch': 0.51} 51%|█████▏ | 11368/22095 [19:35:12<10:10:56, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████▏ | 11369/22095 [19:35:21<15:29:14, 5.20s/it] {'loss': 0.4823, 'grad_norm': 0.3171163849441446, 'learning_rate': 5.0080621277097415e-06, 'epoch': 0.51} 51%|█████▏ | 11369/22095 [19:35:21<15:29:14, 5.20s/it]VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 11:33:20.776121 load time: 1058.57 ms 51%|█████▏ | 11370/22095 [19:35:25<13:50:07, 4.64s/it] {'loss': 0.3481, 'grad_norm': 0.6345630972622137, 'learning_rate': 5.007329207560045e-06, 'epoch': 0.51} 51%|█████▏ | 11370/22095 [19:35:25<13:50:07, 4.64s/it]VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 11:33:25.020947 load time: 1417.0 ms 51%|█████▏ | 11371/22095 [19:35:29<13:15:39, 4.45s/it] {'loss': 0.3116, 'grad_norm': 0.6281598512262935, 'learning_rate': 5.006596287252864e-06, 'epoch': 0.51} 51%|█████▏ | 11371/22095 [19:35:29<13:15:39, 4.45s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_171641_1/images/before_screenshot_1_id_177_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:33:28.866670 load time: 1332.11 ms 51%|█████▏ | 
11372/22095 [19:35:33<12:36:36, 4.23s/it] {'loss': 0.3117, 'grad_norm': 0.5969350370380021, 'learning_rate': 5.005863366803949e-06, 'epoch': 0.51} 51%|█████▏ | 11372/22095 [19:35:33<12:36:36, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 51%|█████▏ | 11373/22095 [19:35:41<16:04:43, 5.40s/it] {'loss': 0.4937, 'grad_norm': 0.3021692952870953, 'learning_rate': 5.005130446229051e-06, 'epoch': 0.51} 51%|█████▏ | 11373/22095 [19:35:41<16:04:43, 5.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922959 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 46112, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 5cm\nB. 无法确定\nC. 1cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 51%|█████▏ | 11374/22095 [19:35:44<14:07:59, 4.75s/it] {'loss': 0.3018, 'grad_norm': 0.6542062893315373, 'learning_rate': 5.004397525543912e-06, 'epoch': 0.51} 51%|█████▏ | 11374/22095 [19:35:44<14:07:59, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85889 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42445 > 40960). 
Running this sequence through the model will result in indexing errors
51%|█████▏ | 11375/22095 [19:35:48<13:28:22, 4.52s/it] {'loss': 0.3452, 'grad_norm': 0.6304255438734256, 'learning_rate': 5.003664604764287e-06, 'epoch': 0.51}
Token indices sequence length is longer than the specified maximum sequence length for this model (43353 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47938 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74297 > 40960). Running this sequence through the model will result in indexing errors
51%|█████▏ | 11376/22095 [19:35:51<12:05:12, 4.06s/it] {'loss': 0.3184, 'grad_norm': 0.6261344215797569, 'learning_rate': 5.0029316839059185e-06, 'epoch': 0.51}
51%|█████▏ | 11377/22095 [19:35:54<10:58:28, 3.69s/it] {'loss': 0.3365, 'grad_norm': 0.6652000712960027, 'learning_rate': 5.002198762984558e-06, 'epoch': 0.51}
Token indices sequence length is longer than the specified maximum sequence length for this model (87467 > 40960). Running this sequence through the model will result in indexing errors
51%|█████▏ | 11378/22095 [19:35:56<10:13:01, 3.43s/it] {'loss': 0.3421, 'grad_norm': 0.6554706239445697, 'learning_rate': 5.001465842015952e-06, 'epoch': 0.51}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_2/images/step_0.png 2025-08-28 11:33:56.624980 load time: 1097.41 ms
52%|█████▏ | 11379/22095 [19:36:00<10:43:03, 3.60s/it] {'loss': 0.3314, 'grad_norm': 0.6191293900481116, 'learning_rate': 5.00073292101585e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (48146 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110975 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11380/22095 [19:36:04<10:53:34, 3.66s/it] {'loss': 0.3071, 'grad_norm': 0.6021614928686941, 'learning_rate': 5e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:34:03.433879 load time: 1168.77 ms
52%|█████▏ | 11381/22095 [19:36:07<10:07:27, 3.40s/it] {'loss': 0.3469, 'grad_norm': 0.6004614620135654, 'learning_rate': 4.999267078984151e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/calculator/d1e543d1-751f-4201-8047-6e1e7f14dec2/images/step_7.png 2025-08-28 11:34:05.863135 load time: 1186.1 ms
52%|█████▏ | 11382/22095 [19:36:10<9:44:15, 3.27s/it] {'loss': 0.2796, 'grad_norm': 0.6382424917561594, 'learning_rate': 4.9985341579840505e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_062010_before_screenshot.png 2025-08-28 11:34:08.820520 load time: 1272.59 ms
52%|█████▏ | 11383/22095 [19:36:18<13:46:07, 4.63s/it] {'loss': 0.5161, 'grad_norm': 0.3659738748625842, 'learning_rate': 4.997801237015443e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_5/images/before_screenshot_51_id_58_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:34:17.105242 load time: 1303.77 ms
52%|█████▏ | 11384/22095 [19:36:28<18:52:56, 6.35s/it] {'loss': 0.4659, 'grad_norm': 0.3201373508939831, 'learning_rate': 4.997068316094082e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/3228704213401472_6.png 2025-08-28 11:34:27.413553 load time: 1027.06 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 11:34:27.775769 load time: 1476.23 ms
52%|█████▏ | 11385/22095 [19:36:32<16:19:31, 5.49s/it] {'loss': 0.3298, 'grad_norm': 0.6206303491788823, 'learning_rate': 4.996335395235715e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250630/windows_augment/images/FL/handmade_annotation_1/images/FL_3_id_20_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:34:30.453242 load time: 1080.81 ms
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_2/img/step_1.png 2025-08-28 11:34:32.290817 load time: 1239.3 ms
52%|█████▏ | 11386/22095 [19:36:42<20:57:32, 7.05s/it] {'loss': 0.4794, 'grad_norm': 0.27895766028848795, 'learning_rate': 4.9956024744560895e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
52%|█████▏ | 11387/22095 [19:36:46<17:32:20, 5.90s/it] {'loss': 0.3421, 'grad_norm': 0.679592436620248, 'learning_rate': 4.994869553770951e-06, 'epoch': 0.52}
52%|█████▏ | 11388/22095 [19:36:55<20:45:57, 6.98s/it] {'loss': 0.4615, 'grad_norm': 0.28995181663669395, 'learning_rate': 4.99413663319605e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8399728 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1883, 'image': 'vrdu_table_final_2/astro-ph.CO/79e03ec1-8f9d-4318-8019-cc5d52f103fb.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_18/images/20250417140207.png 2025-08-28 11:34:54.471493 load time: 1001.84 ms
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_104/img/step_0.png 2025-08-28 11:34:54.333907 load time: 1306.75 ms
52%|█████▏ | 11389/22095 [19:36:58<17:26:48, 5.87s/it] {'loss': 0.3421, 'grad_norm': 0.6250008664608733, 'learning_rate': 4.9934037127471375e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_0.png 2025-08-28 11:34:59.049082 load time: 1040.11 ms
52%|█████▏ | 11390/22095 [19:37:02<15:08:07, 5.09s/it] {'loss': 0.3616, 'grad_norm': 0.6482942267558705, 'learning_rate': 4.992670792439958e-06, 'epoch': 0.52}
52%|█████▏ | 11391/22095 [19:37:05<13:16:13, 4.46s/it] {'loss': 0.366, 'grad_norm': 0.6523165173209333, 'learning_rate': 4.9919378722902585e-06, 'epoch': 0.52}
52%|█████▏ | 11392/22095 [19:37:08<12:16:26, 4.13s/it] {'loss': 0.3269, 'grad_norm': 0.6858262611847109, 'learning_rate': 4.991204952313793e-06, 'epoch': 0.52}
52%|█████▏ | 11393/22095 [19:37:11<11:39:02, 3.92s/it] {'loss': 0.3338, 'grad_norm': 0.6718800704422592, 'learning_rate': 4.990472032526302e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (42733 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57759 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89474 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11394/22095 [19:37:16<11:50:55, 3.99s/it] {'loss': 0.3295, 'grad_norm': 0.6642519794348746, 'learning_rate': 4.98973911294354e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11395/22095 [19:37:20<12:01:01, 4.04s/it] {'loss': 0.3274, 'grad_norm': 0.9539961494847039, 'learning_rate': 4.989006193581254e-06, 'epoch': 0.52}
52%|█████▏ | 11396/22095 [19:37:24<11:59:19, 4.03s/it] {'loss': 0.3493, 'grad_norm': 0.7514131009659355, 'learning_rate': 4.98827327445519e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/settings/b079d8c8-d553-4a1e-8f01-7e9e79e07722/images/step_0.png 2025-08-28 11:35:22.516789 load time: 1439.36 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906522 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29675, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 5cm'}, {'from': 'gpt', 'value': '【解答】解:∵点C是线段AB的中点,AB=20cm,∴BC=\\frac{1}{2}AB=\\frac{1}{2}×20cm=10cm,∵点D是线段BC的中点,∴BD=\\frac{1}{2}BC=\\frac{1}{2}×10cm=5cm,∴AD=AB-BD=20cm-5cm=15cm.'}]}
52%|█████▏ | 11397/22095 [19:37:27<11:39:37, 3.92s/it] {'loss': 0.303, 'grad_norm': 0.5885314984600968, 'learning_rate': 4.987540355581095e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (64112 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57764 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64072 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41410 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42104 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50531 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80656 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11398/22095 [19:37:31<10:56:59, 3.69s/it] {'loss': 0.3593, 'grad_norm': 0.6082433375446966, 'learning_rate': 4.986807436974723e-06, 'epoch': 0.52}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/DocVQA/pngs/jgmk0226_1.png 2025-08-28 11:35:29.819707 load time: 1539.99 ms
52%|█████▏ | 11399/22095 [19:37:34<10:36:10, 3.57s/it] {'loss': 0.3339, 'grad_norm': 0.6451130905415389, 'learning_rate': 4.986074518651817e-06, 'epoch': 0.52}
VC:s3://gui/OS-Atlas/desktop_domain/macos_images/20240905_150336_screenshot.png 2025-08-28 11:35:33.582147 load time: 1245.36 ms
52%|█████▏ | 11400/22095 [19:37:37<10:03:16, 3.38s/it] {'loss': 0.3196, 'grad_norm': 0.5772202526007606, 'learning_rate': 4.985341600628127e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:35:34.097826 load time: 1325.11 ms
52%|█████▏ | 11401/22095 [19:37:41<10:21:15, 3.49s/it] {'loss': 0.3233, 'grad_norm': 0.5913749203713625, 'learning_rate': 4.984608682919402e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (52604 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54783 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53888 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11402/22095 [19:37:47<13:10:04, 4.43s/it] {'loss': 0.4645, 'grad_norm': 0.3501031557695471, 'learning_rate': 4.983875765541389e-06, 'epoch': 0.52}
52%|█████▏ | 11403/22095 [19:37:51<12:18:34, 4.14s/it] {'loss': 0.2862, 'grad_norm': 0.6909893434416505, 'learning_rate': 4.9831428485098336e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369313 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36065, 'image': 'vrdu_table_final_2/astro-ph.CO/edd16d8d-61a4-460c-9ecb-bb3d677a3c99.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/podcasts_2/images/step_0.png 2025-08-28 11:35:49.877682 load time: 1837.23 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 11:35:50.646924 load time: 1609.93 ms
52%|█████▏ | 11404/22095 [19:37:54<11:54:33, 4.01s/it] {'loss': 0.2947, 'grad_norm': 0.6190287034727685, 'learning_rate': 4.982409931840489e-06, 'epoch': 0.52}
52%|█████▏ | 11405/22095 [19:37:57<11:03:05, 3.72s/it] {'loss': 0.3384, 'grad_norm': 0.608323485565074, 'learning_rate': 4.981677015549101e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250501_133654_1/images/before_screenshot_6_id_185_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:35:56.152334 load time: 1579.55 ms
52%|█████▏ | 11406/22095 [19:38:07<16:16:55, 5.48s/it] {'loss': 0.4672, 'grad_norm': 0.2996223595119399, 'learning_rate': 4.9809440996514175e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/finder/8af7889e-fbfc-443f-8629-e5b6b0484c7d/images/step_1.png 2025-08-28 11:36:07.489528 load time: 1257.51 ms
52%|█████▏ | 11407/22095 [19:38:11<14:45:50, 4.97s/it] {'loss': 0.4021, 'grad_norm': 0.6549238199747524, 'learning_rate': 4.980211184163185e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250630/windows_augment/images/blender/handmade_annotation_4/images/blender (1)_id_5_internvl_element-caption_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:36:10.128916 load time: 1002.68 ms
52%|█████▏ | 11408/22095 [19:38:15<13:54:44, 4.69s/it] {'loss': 0.301, 'grad_norm': 0.545305893167641, 'learning_rate': 4.979478269100156e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_4/images/before_screenshot_14_id_125_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:36:14.342374 load time: 1043.68 ms
52%|█████▏ | 11409/22095 [19:38:19<13:16:39, 4.47s/it] {'loss': 0.3317, 'grad_norm': 0.5946659894475338, 'learning_rate': 4.978745354478074e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11410/22095 [19:38:28<17:53:07, 6.03s/it] {'loss': 0.5026, 'grad_norm': 0.34696223925319947, 'learning_rate': 4.97801244031269e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11411/22095 [19:38:32<15:44:14, 5.30s/it] {'loss': 0.3588, 'grad_norm': 0.5886503604054116, 'learning_rate': 4.977279526619752e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (44250 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59315 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108919 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11412/22095 [19:38:35<13:59:02, 4.71s/it] {'loss': 0.3335, 'grad_norm': 0.6532625865775712, 'learning_rate': 4.976546613415005e-06, 'epoch': 0.52}
52%|█████▏ | 11413/22095 [19:38:38<12:29:21, 4.21s/it] {'loss': 0.3029, 'grad_norm': 0.5755714762624607, 'learning_rate': 4.9758137007141996e-06, 'epoch': 0.52}
52%|█████▏ | 11414/22095 [19:38:42<11:32:41, 3.89s/it] {'loss': 0.2644, 'grad_norm': 0.6314543801653296, 'learning_rate': 4.975080788533086e-06, 'epoch': 0.52}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_034601_before_screenshot.png 2025-08-28 11:36:39.707256 load time: 1011.43 ms
52%|█████▏ | 11415/22095 [19:38:44<10:42:44, 3.61s/it] {'loss': 0.3214, 'grad_norm': 0.6535819873033479, 'learning_rate': 4.974347876887408e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8959456 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10291, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 3\nB. 10\nC. 5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
52%|█████▏ | 11416/22095 [19:38:48<10:25:12, 3.51s/it] {'loss': 0.3325, 'grad_norm': 0.6748569895089654, 'learning_rate': 4.9736149657929136e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (124541 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102855 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11417/22095 [19:38:51<10:21:23, 3.49s/it] {'loss': 0.2784, 'grad_norm': 0.5890510812062223, 'learning_rate': 4.972882055265354e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:36:50.339780 load time: 1345.49 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11418/22095 [19:38:55<10:18:26, 3.48s/it] {'loss': 0.3645, 'grad_norm': 0.669667646454698, 'learning_rate': 4.9721491453204775e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (62429 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87234 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57537 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118756 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11419/22095 [19:38:57<9:44:46, 3.29s/it] {'loss': 0.356, 'grad_norm': 0.6469370141276698, 'learning_rate': 4.971416235974029e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11420/22095 [19:39:07<15:14:00, 5.14s/it] {'loss': 0.4433, 'grad_norm': 0.28727861727393994, 'learning_rate': 4.970683327241756e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_204854_4/images/before_screenshot_38_id_202_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:37:05.727656 load time: 1685.39 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/9db8b077-9fe1-4e64-9ae8-95dea252fdb4/images/step_1.png 2025-08-28 11:37:06.346193 load time: 1171.52 ms
52%|█████▏ | 11421/22095 [19:39:11<14:14:12, 4.80s/it] {'loss': 0.3127, 'grad_norm': 0.5721570053047997, 'learning_rate': 4.969950419139412e-06, 'epoch': 0.52}
52%|█████▏ | 11422/22095 [19:39:14<12:38:21, 4.26s/it] {'loss': 0.2943, 'grad_norm': 0.6452400902352549, 'learning_rate': 4.969217511682738e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 11:37:11.647837 load time: 1351.26 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (75054 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11423/22095 [19:39:17<11:53:05, 4.01s/it] {'loss': 0.3301, 'grad_norm': 0.6123018177339106, 'learning_rate': 4.968484604887486e-06, 'epoch': 0.52}
52%|█████▏ | 11424/22095 [19:39:21<11:52:13, 4.00s/it] {'loss': 0.3226, 'grad_norm': 0.6869000275438506, 'learning_rate': 4.967751698769404e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11425/22095 [19:39:31<17:09:00, 5.79s/it] {'loss': 0.4798, 'grad_norm': 0.33237830928716483, 'learning_rate': 4.967018793344238e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (45255 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11426/22095 [19:39:34<14:49:34, 5.00s/it] {'loss': 0.3272, 'grad_norm': 0.6681644421207563, 'learning_rate': 4.966285888627737e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11427/22095 [19:39:45<19:27:46, 6.57s/it] {'loss': 0.4678, 'grad_norm': 0.35537307589594674, 'learning_rate': 4.965552984635649e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 11:37:43.491631 load time: 1007.81 ms
VC:s3://gui-agent/data_20250612/mac/images/settings/9db8b077-9fe1-4e64-9ae8-95dea252fdb4/images/step_7.png 2025-08-28 11:37:44.671678 load time: 1205.49 ms
52%|█████▏ | 11428/22095 [19:39:48<16:21:25, 5.52s/it] {'loss': 0.3503, 'grad_norm': 0.5954443500581541, 'learning_rate': 4.964820081383721e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250407/windows/images/settings/20250410_194406_1/images/before_screenshot_7.png 2025-08-28 11:37:46.568620 load time: 1235.67 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/safari_4/images/step_0.png 2025-08-28 11:37:46.568475 load time: 1312.23 ms
52%|█████▏ | 11429/22095 [19:39:52<14:56:25, 5.04s/it] {'loss': 0.3322, 'grad_norm': 1.2948284935381316, 'learning_rate': 4.964087178887702e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/contacts_1/images/step_15.png 2025-08-28 11:37:50.509497 load time: 1006.42 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_3.png 2025-08-28 11:37:52.057938 load time: 1045.82 ms
52%|█████▏ | 11430/22095 [19:39:55<13:21:17, 4.51s/it] {'loss': 0.3744, 'grad_norm': 0.6466216348525649, 'learning_rate': 4.9633542771633374e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11431/22095 [19:40:02<15:20:05, 5.18s/it] {'loss': 0.4401, 'grad_norm': 0.28981058998753617, 'learning_rate': 4.96262137622638e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250509_125727_1/images/before_screenshot_1_id_43_function_1_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 11:38:00.495970 load time: 1242.74 ms
VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_211444/images/step_0_id_17_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:38:02.072011 load time: 1083.49 ms
52%|█████▏ | 11432/22095 [19:40:05<13:49:57, 4.67s/it] {'loss': 0.3396, 'grad_norm': 0.6355869243233664, 'learning_rate': 4.961888476092572e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 11:38:03.983269 load time: 1682.4 ms
52%|█████▏ | 11433/22095 [19:40:09<12:47:55, 4.32s/it] {'loss': 0.3626, 'grad_norm': 0.669479187452769, 'learning_rate': 4.961155576777665e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11434/22095 [19:40:16<15:43:24, 5.31s/it] {'loss': 0.4604, 'grad_norm': 0.2818465609687323, 'learning_rate': 4.960422678297405e-06, 'epoch': 0.52}
52%|█████▏ | 11435/22095 [19:40:20<14:39:51, 4.95s/it] {'loss': 0.3208, 'grad_norm': 0.643082002501434, 'learning_rate': 4.959689780667541e-06, 'epoch': 0.52}
52%|█████▏ | 11436/22095 [19:40:24<13:27:29, 4.55s/it] {'loss': 0.3216, 'grad_norm': 0.6454671244405962, 'learning_rate': 4.958956883903816e-06, 'epoch': 0.52}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30201.png 2025-08-28 11:38:23.928042 load time: 2115.15 ms
52%|█████▏ | 11437/22095 [19:40:28<12:58:55, 4.38s/it] {'loss': 0.2993, 'grad_norm': 0.6115329967832458, 'learning_rate': 4.958223988021986e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (64051 > 40960).
Running this sequence through the model will result in indexing errors
52%|█████▏ | 11438/22095 [19:40:31<11:36:24, 3.92s/it] {'loss': 0.3235, 'grad_norm': 0.6792218491410067, 'learning_rate': 4.957491093037792e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (50754 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44847 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61351 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45598 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11439/22095 [19:40:34<10:33:42, 3.57s/it] {'loss': 0.3097, 'grad_norm': 0.6469072805006693, 'learning_rate': 4.9567581989669846e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/calculator/e1cc9c76-68f1-43b2-8a5f-bcfc7a70d69e/images/step_4.png 2025-08-28 11:38:34.908301 load time: 1045.97 ms
52%|█████▏ | 11440/22095 [19:40:37<10:42:22, 3.62s/it] {'loss': 0.332, 'grad_norm': 0.5611885638407211, 'learning_rate': 4.956025305825311e-06, 'epoch': 0.52}
52%|█████▏ | 11441/22095 [19:40:42<11:41:24, 3.95s/it] {'loss': 0.3216, 'grad_norm': 0.646053881191501, 'learning_rate': 4.955292413628517e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 11:38:41.711623 load time: 1594.44 ms
52%|█████▏ | 11442/22095 [19:40:46<11:46:11, 3.98s/it] {'loss': 0.3362, 'grad_norm': 0.6438452394996295, 'learning_rate': 4.954559522392353e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (64938 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64741 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95027 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11443/22095 [19:40:50<11:53:43, 4.02s/it] {'loss': 0.3116, 'grad_norm': 0.5463017390785394, 'learning_rate': 4.953826632132565e-06, 'epoch': 0.52}
52%|█████▏ | 11444/22095 [19:40:53<10:48:44, 3.65s/it] {'loss': 0.3027, 'grad_norm': 0.6006738174475235, 'learning_rate': 4.953093742864901e-06, 'epoch': 0.52}
52%|█████▏ | 11445/22095 [19:40:57<10:43:37, 3.63s/it] {'loss': 0.3432, 'grad_norm': 0.6338506106940345, 'learning_rate': 4.952360854605107e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250623/windows_augment/images/autocad/20250508_161646_1/images/before_screenshot_1_id_132_function_1_crop_1_grounding_instructions_point_o_paste.png 2025-08-28 11:38:56.399484 load time: 1210.93 ms
52%|█████▏ | 11446/22095 [19:41:08<17:13:04, 5.82s/it] {'loss': 0.4782, 'grad_norm': 0.30322297477801935, 'learning_rate': 4.9516279673689325e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:39:06.870663 load time: 1145.06 ms
52%|█████▏ | 11447/22095 [19:41:18<21:05:03, 7.13s/it] {'loss': 0.4573, 'grad_norm': 0.2954691434464442, 'learning_rate': 4.950895081172126e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (78184 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61489 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68361 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48611 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11448/22095 [19:41:21<17:38:27, 5.96s/it] {'loss': 0.3375, 'grad_norm': 0.60676611158701, 'learning_rate': 4.950162196030432e-06, 'epoch': 0.52}
52%|█████▏ | 11449/22095 [19:41:24<15:22:27, 5.20s/it] {'loss': 0.3341, 'grad_norm': 0.6813301058139706, 'learning_rate': 4.949429311959599e-06, 'epoch': 0.52}
52%|█████▏ | 11450/22095 [19:41:28<13:38:59, 4.62s/it] {'loss': 0.3329, 'grad_norm': 0.6132195451458272, 'learning_rate': 4.948696428975378e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/34181.png 2025-08-28 11:39:27.215939 load time: 3634.55 ms
52%|█████▏ | 11451/22095 [19:41:37<17:28:55, 5.91s/it] {'loss': 0.4948, 'grad_norm': 0.3179752179157839, 'learning_rate': 4.94796354709351e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:39:36.866716 load time: 1101.62 ms
52%|█████▏ | 11452/22095 [19:41:40<15:08:29, 5.12s/it] {'loss': 0.307, 'grad_norm': 0.6149764342268045, 'learning_rate': 4.947230666329746e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_005007_2/images/before_screenshot_14_id_76_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:39:36.884922 load time: 1680.08 ms
VC:s3://gui-agent/data_20250612/mac/images/calculator/e1cc9c76-68f1-43b2-8a5f-bcfc7a70d69e/images/step_4.png 2025-08-28 11:39:39.275911 load time: 1013.02 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/vscode_1/images/step_2.png 2025-08-28 11:39:39.445904 load time: 1298.35 ms
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_202007_8/images/before_screenshot_55_id_31_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:39:40.208547 load time: 1301.44 ms
52%|█████▏ | 11453/22095 [19:41:44<13:58:07, 4.73s/it] {'loss': 0.3167, 'grad_norm': 0.6185752635572527, 'learning_rate': 4.946497786699834e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:39:43.018205 load time: 1696.19 ms
VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_211202/images/step_1_id_26_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:39:44.024511 load time: 1697.4 ms
52%|█████▏ | 11454/22095 [19:41:48<13:33:43, 4.59s/it] {'loss': 0.3347, 'grad_norm': 0.6235007636911877, 'learning_rate': 4.945764908219518e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (63025 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11455/22095 [19:41:51<12:06:34, 4.10s/it] {'loss': 0.3323, 'grad_norm': 0.5803562587200566, 'learning_rate': 4.945032030904549e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11456/22095 [19:42:00<16:44:24, 5.66s/it] {'loss': 0.4791, 'grad_norm': 0.30145818078884856, 'learning_rate': 4.944299154770673e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_13_id_101_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:40:00.435046 load time: 1453.93 ms
52%|█████▏ | 11457/22095 [19:42:03<14:32:52, 4.92s/it] {'loss': 0.3044, 'grad_norm': 0.5993805071540821, 'learning_rate': 4.943566279833637e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (64679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46372 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11458/22095 [19:42:07<13:10:50, 4.46s/it] {'loss': 0.334, 'grad_norm': 0.7045733996670277, 'learning_rate': 4.942833406109188e-06, 'epoch': 0.52}
52%|█████▏ | 11459/22095 [19:42:10<11:41:02, 3.95s/it] {'loss': 0.3033, 'grad_norm': 0.7412287485525683, 'learning_rate': 4.942100533613073e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:40:06.912651 load time: 1265.13 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/settings_1/images/step_3.png 2025-08-28 11:40:07.589974 load time: 1022.6 ms
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 11:40:08.342834 load time: 1322.25 ms
52%|█████▏ | 11460/22095 [19:42:13<11:21:47, 3.85s/it] {'loss': 0.3015, 'grad_norm': 0.6682742478151386, 'learning_rate': 4.9413676623610415e-06, 'epoch': 0.52}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 11:40:11.934766 load time: 1556.18 ms
52%|█████▏ | 11461/22095 [19:42:17<11:31:10, 3.90s/it] {'loss': 0.3295, 'grad_norm': 0.6134757832433286, 'learning_rate': 4.940634792368838e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8397556 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64412, 'image': 'vrdu_table_final_2/astro-ph.EP/0f4a4a00-06c3-410e-bd62-ae25102d07ff.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]}
52%|█████▏ | 11462/22095 [19:42:20<10:52:59, 3.68s/it] {'loss': 0.3094, 'grad_norm': 0.5984181963560531, 'learning_rate': 4.93990192365221e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:40:20.010289 load time: 1140.44 ms
52%|█████▏ | 11463/22095 [19:42:29<14:54:59, 5.05s/it] {'loss': 0.4776, 'grad_norm': 0.2826570298747315, 'learning_rate': 4.939169056226905e-06, 'epoch': 0.52}
52%|█████▏ | 11464/22095 [19:42:32<13:25:28, 4.55s/it] {'loss': 0.3114, 'grad_norm': 0.6281200564906452, 'learning_rate': 4.93843619010867e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 11:40:32.667112 load time: 1003.91 ms
52%|█████▏ | 11465/22095 [19:42:36<12:33:14, 4.25s/it] {'loss': 0.3136, 'grad_norm': 0.5984727108969583, 'learning_rate': 4.9377033253132525e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_1/images/step_7.png 2025-08-28 11:40:35.504343 load time: 2129.65 ms
52%|█████▏ | 11466/22095 [19:42:44<16:08:51, 5.47s/it] {'loss': 0.4897, 'grad_norm': 0.2765346528439434, 'learning_rate': 4.936970461856401e-06, 'epoch': 0.52}
52%|█████▏ | 11467/22095 [19:42:47<14:11:47, 4.81s/it] {'loss': 0.2882, 'grad_norm': 0.5694173173865967, 'learning_rate': 4.9362375997538585e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_8.png 2025-08-28 11:40:46.945658 load time: 1118.46 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:40:46.984813 load time: 1394.95 ms
52%|█████▏ | 11468/22095 [19:42:50<12:29:50, 4.23s/it] {'loss': 0.33, 'grad_norm': 0.6698294625395641, 'learning_rate': 4.935504739021373e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 11:40:49.362652 load time: 1007.94 ms
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/33998.png 2025-08-28 11:40:49.195659 load time: 1687.85 ms
52%|█████▏ | 11469/22095 [19:42:54<12:06:54, 4.10s/it] {'loss': 0.3066, 'grad_norm': 0.6551708984371104, 'learning_rate': 4.934771879674697e-06, 'epoch': 0.52}
Traceback (most recent call last): File
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ sample = self._get_item(i) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item raise ValueError( ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11266109 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 956, 'image': 'MMMU_Pro/standard/test_955_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words.Note that you do not need to give a specific answer to the question.Here are the content and options of this question: In the following reaction , what mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: ['38.50 grams', '40.00 grams', '35.50 grams', '37.00 grams', '34.78 grams', ' 36.78 grams', '36.50 grams', ' 36.76 grams', ' 36.77 grams']"}, {'from': 'gpt', 'value': "## Introduction to Chemical Reactions and Stoichiometry\n\nChemical reactions involve the transformation of reactants into products. They are represented by balanced chemical equations that show the ratio in which substances react and form products. 
Understanding chemical reactions is crucial in many fields, including chemistry, biology, and environmental science.\n\n### Concepts of Chemical Equations\n\nA chemical equation is a symbolic representation of a chemical reaction. It consists of reactants (the starting substances) and products (the substances formed) separated by an arrow. \n\n#### Example:\n\\[ \\text{H}_2\\text{SO}_4 + \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + \\text{H}_2\\text{O} \\]\n\nHere, sulfuric acid (\\(\\text{H}_2\\text{SO}_4\\)) reacts with sodium hydroxide (\\(\\text{NaOH}\\)) to form sodium sulfate (\\(\\text{Na}_2\\text{SO}_4\\)) and water (\\(\\text{H}_2\\text{O}\\)).\n\n#### Balancing Chemical Equations\n\nTo obey the law of conservation of mass, chemical equations must be balanced. This implies that the number of atoms of each element must be the same on both sides of the equation. Balancing involves adjusting coefficients (the numbers before molecules or atoms) to achieve this equality.\n\n### Stoichiometry\n\nStoichiometry is the quantitative study of reactants and products in a chemical reaction. It allows us to predict how much of each substance is needed or produced in a reaction. \n\n#### Moles and Molar Mass\n\n- **Moles**: A mole is a unit that measures the amount of substance. It is equivalent to \\(6.022 \\times 10^{23}\\) entities (Avogadro's number).\n- **Molar Mass**: The mass of one mole of a substance, usually in grams per mole (g/mol). For instance, the molar mass of \\(\\text{H}_2\\text{SO}_4\\) is 98.08 g/mol and for \\(\\text{NaOH}\\), it is 40.00 g/mol.\n\n### Stoichiometric Calculations\n\nStoichiometric calculations involve using balanced equations to determine the mass or volume of reactants or products. Let’s look at an example based on the equation given above:\n\n1. **Balance the Equation**:\n \\[ \\text{H}_2\\text{SO}_4 + 2\\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2\\text{H}_2\\text{O} \\]\n\n2. 
**Identify Molar Ratios**:\n From the balanced equation, 1 mole of \\(\\text{H}_2\\text{SO}_4\\) reacts with 2 moles of \\(\\text{NaOH}\\).\n\n3. **Calculate Required Mass**:\n To find out how much \\(\\text{H}_2\\text{SO}_4\\) is needed to react with a certain amount of \\(\\text{NaOH}\\), we use the relation:\n \\[ \\text{Mass of substance} = \\text{Moles} \\times \\text{Molar Mass} \\]\n\n### Application in Real-Life and Industry\n\nUnderstanding stoichiometry and chemical reactions is crucial in fields such as pharmaceuticals for drug synthesis, agriculture for fertilizer manufacturing, and environmental science for pollution management.\n\n### Example of Stoichiometric Calculation\n\nTo illustrate, if we have 0.75 moles of \\(\\text{NaOH}\\):\n\n1. According to the reaction, 0.75 moles of \\(\\text{NaOH}\\) require 0.375 moles of \\(\\text{H}_2\\text{SO}_4\\) (because it takes half as many moles of \\(\\text{H}_2\\text{SO}_4\\) to completely react with \\(\\text{NaOH}\\)).\n\n2. Calculating the mass of \\(\\text{H}_2\\text{SO}_4\\):\n \\[ \\text{Mass} = 0.375 \\, \\text{moles} \\times 98.08 \\, \\text{g/mol} = 36.78 \\, \\text{grams} \\]\n\n### Acids, Bases, and Neutralization Reactions\n\n#### Acid-Base Reactions\n\n- **Acids**: Substances that donate protons (\\(H^+\\)) in a reaction, such as \\(\\text{H}_2\\text{SO}_4\\).\n- **Bases**: Substances that accept protons, such as \\(\\text{NaOH}\\).\n\nThe process of neutralization involves an acid and a base reacting to form water and a salt. 
In our case, \\(\\text{H}_2\\text{SO}_4\\) neutralizes \\(\\text{NaOH}\\).\n\n### Properties of Sulfuric Acid and Sodium Hydroxide\n\n#### Sulfuric Acid (\\(\\text{H}_2\\text{SO}_4\\))\n\n- **Properties**: Dense, oily liquid, strong acid, corrosive.\n- **Uses**: Manufacture of fertilizers, cleaning agents, and in petroleum refining.\n\n#### Sodium Hydroxide (\\(\\text{NaOH}\\))\n\n- **Properties**: White, crystalline solid, very soluble in water, strong base.\n- **Uses**: Paper production, soap manufacturing, water treatment.\n\n### Safety and Handling\n\nBoth substances are hazardous and require careful handling. Proper protective equipment and protocols must be followed to prevent skin burns and inhalation injuries.\n\n### Advanced Concepts\n\n#### Limiting Reactants\n\nIn a reaction, the limiting reactant is the substance that is completely consumed first, determining the amount of product formed. Understanding limiting reactants is essential for optimizing yields in industrial processes.\n\n#### Titration\n\nA laboratory technique that involves the gradual addition of one reactant from a burette to another until the reaction is complete. This technique is often used to determine the concentration of an unknown solution.\n\n### Examples and Practice Problems\n\n##### Example Problem:\nSuppose you have 50.0 grams of \\(\\text{NaOH}\\). How much \\(\\text{H}_2\\text{SO}_4\\) would be required for complete reaction?\n\n1. Convert grams of \\(\\text{NaOH}\\) to moles: 50.0 g \\(\\times \\frac{1 \\, \\text{mol}}{40.00 \\, \\text{g}}\\) = 1.25 mol.\n\n2. Use the mole ratio to find moles of \\(\\text{H}_2\\text{SO}_4\\): \n \\(\\frac{1.25 \\, \\text{mol NaOH}}{2} = 0.625 \\, \\text{mol H}_2\\text{SO}_4\\).\n\n3. 
Calculate mass of \\(\\text{H}_2\\text{SO}_4\\): 0.625 mol \\(\\times 98.08 \\, \\text{g/mol} = 61.30 \\, \\text{g}\\).\n\n### Conclusion\n\nUnderstanding the principles of chemical equations, stoichiometry, and acid-base reactions is fundamental to mastering chemistry. These concepts not only allow us to predict and quantify the results of reactions but also enable the practical application of chemistry in various industries and scientific research fields. As demonstrated in the example of \\(\\text{H}_2\\text{SO}_4\\) and \\(\\text{NaOH}\\), these principles form the cornerstone of chemical processes that drive technological advancements and innovation."}]} VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_3/images/step_0.png 2025-08-28 11:40:53.736619 load time: 1338.32 ms VC:s3://gui-agent/data_20250612/mac/images/calculator/f9be7ed3-49aa-4f23-a176-7af6afdfae84/images/step_3.png 2025-08-28 11:40:54.342623 load time: 1127.92 ms 52%|█████▏ | 11470/22095 [19:42:58<12:02:17, 4.08s/it] {'loss': 0.3236, 'grad_norm': 0.6106802527555026, 'learning_rate': 4.9340390217295695e-06, 'epoch': 0.52} 52%|█████▏ | 11470/22095 [19:42:58<12:02:17, 4.08s/it] 52%|█████▏ | 11471/22095 [19:43:01<11:20:16, 3.84s/it] {'loss': 0.359, 'grad_norm': 0.6727567886294621, 'learning_rate': 4.933306165201741e-06, 'epoch': 0.52} 52%|█████▏ | 11471/22095 [19:43:01<11:20:16, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54813 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90377 > 40960). 
Running this sequence through the model will result in indexing errors 52%|█████▏ | 11472/22095 [19:43:05<11:40:09, 3.95s/it] {'loss': 0.3012, 'grad_norm': 0.6107305453605221, 'learning_rate': 4.93257331010696e-06, 'epoch': 0.52} 52%|█████▏ | 11472/22095 [19:43:05<11:40:09, 3.95s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_12/images/20250417140224.png 2025-08-28 11:41:04.108941 load time: 1014.34 ms 52%|█████▏ | 11473/22095 [19:43:08<10:40:55, 3.62s/it] {'loss': 0.2876, 'grad_norm': 0.6547948104154359, 'learning_rate': 4.93184045646097e-06, 'epoch': 0.52} 52%|█████▏ | 11473/22095 [19:43:08<10:40:55, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49022 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96732 > 40960). Running this sequence through the model will result in indexing errors 52%|█████▏ | 11474/22095 [19:43:12<10:50:24, 3.67s/it] {'loss': 0.401, 'grad_norm': 0.7767733500126526, 'learning_rate': 4.9311076042795185e-06, 'epoch': 0.52} 52%|█████▏ | 11474/22095 [19:43:12<10:50:24, 3.67s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:41:10.749930 load time: 1123.64 ms VC:s3://gui-agent/data_20250630/windows_augment/images/SW/handmade_annotation_1/images/2_id_3_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:41:11.438170 load time: 1472.81 ms 52%|█████▏ | 11475/22095 [19:43:15<10:12:11, 3.46s/it] {'loss': 0.3949, 'grad_norm': 0.6586598931865223, 'learning_rate': 4.9303747535783546e-06, 'epoch': 0.52} 52%|█████▏ | 11475/22095 [19:43:15<10:12:11, 3.46s/it] 52%|█████▏ | 11476/22095 [19:43:18<9:42:49, 3.29s/it] {'loss': 0.3286, 'grad_norm': 0.6023598627621121, 'learning_rate': 4.929641904373224e-06, 'epoch': 0.52} 52%|█████▏ | 
11476/22095 [19:43:18<9:42:49, 3.29s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:41:17.453521 load time: 1085.98 ms 52%|█████▏ | 11477/22095 [19:43:21<9:42:49, 3.29s/it] {'loss': 0.3372, 'grad_norm': 0.6771772438763664, 'learning_rate': 4.928909056679871e-06, 'epoch': 0.52} 52%|█████▏ | 11477/22095 [19:43:21<9:42:49, 3.29s/it] 52%|█████▏ | 11478/22095 [19:43:24<9:25:31, 3.20s/it] {'loss': 0.3104, 'grad_norm': 0.6669304323314843, 'learning_rate': 4.9281762105140435e-06, 'epoch': 0.52} 52%|█████▏ | 11478/22095 [19:43:24<9:25:31, 3.20s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_1.png 2025-08-28 11:41:22.520237 load time: 1100.51 ms Token indices sequence length is longer than the specified maximum sequence length for this model (46811 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88071 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46580 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58207 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108987 > 40960). 
Running this sequence through the model will result in indexing errors 52%|█████▏ | 11479/22095 [19:43:27<9:19:45, 3.16s/it] {'loss': 0.3041, 'grad_norm': 0.5836886978307488, 'learning_rate': 4.927443365891491e-06, 'epoch': 0.52} 52%|█████▏ | 11479/22095 [19:43:27<9:19:45, 3.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:41:26.812414 load time: 1153.36 ms 52%|█████▏ | 11480/22095 [19:43:37<15:30:19, 5.26s/it] {'loss': 0.4568, 'grad_norm': 0.2876301646577038, 'learning_rate': 4.926710522827956e-06, 'epoch': 0.52} 52%|█████▏ | 11480/22095 [19:43:37<15:30:19, 5.26s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_2/images/step_0.png 2025-08-28 11:41:37.425308 load time: 1840.24 ms 52%|█████▏ | 11481/22095 [19:43:41<14:17:40, 4.85s/it] {'loss': 0.3551, 'grad_norm': 0.6165277494595637, 'learning_rate': 4.925977681339187e-06, 'epoch': 0.52} 52%|█████▏ | 11481/22095 [19:43:41<14:17:40, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (119291 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49664 > 40960). 
Running this sequence through the model will result in indexing errors 52%|█████▏ | 11482/22095 [19:43:46<14:03:57, 4.77s/it] {'loss': 0.4685, 'grad_norm': 0.27093643070250045, 'learning_rate': 4.925244841440932e-06, 'epoch': 0.52} 52%|█████▏ | 11482/22095 [19:43:46<14:03:57, 4.77s/it] 52%|█████▏ | 11483/22095 [19:43:49<12:38:03, 4.29s/it] {'loss': 0.3018, 'grad_norm': 0.6405904609560221, 'learning_rate': 4.924512003148934e-06, 'epoch': 0.52} 52%|█████▏ | 11483/22095 [19:43:49<12:38:03, 4.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77053 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64250 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (134172 > 40960). Running this sequence through the model will result in indexing errors 52%|█████▏ | 11484/22095 [19:43:53<12:16:14, 4.16s/it] {'loss': 0.3549, 'grad_norm': 0.9864860414307957, 'learning_rate': 4.923779166478941e-06, 'epoch': 0.52} 52%|█████▏ | 11484/22095 [19:43:53<12:16:14, 4.16s/it] 52%|█████▏ | 11485/22095 [19:43:56<11:48:43, 4.01s/it] {'loss': 0.3307, 'grad_norm': 0.6053341823481457, 'learning_rate': 4.923046331446701e-06, 'epoch': 0.52} 52%|█████▏ | 11485/22095 [19:43:57<11:48:43, 4.01s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_202007_6/images/before_screenshot_48_id_61_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:41:55.272020 load time: 1149.17 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 11:41:55.700183 load time: 1014.81 ms 52%|█████▏ | 11486/22095 [19:43:59<10:53:47, 3.70s/it] {'loss': 0.4052, 'grad_norm': 0.687499422803569, 'learning_rate': 
4.922313498067957e-06, 'epoch': 0.52} 52%|█████▏ | 11486/22095 [19:43:59<10:53:47, 3.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/images/inventor/handmade_annotation_1/images/inventor_1_id_17_internvl_position_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:41:58.245773 load time: 1113.73 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_3/images/step_0.png 2025-08-28 11:41:58.805447 load time: 1103.04 ms 52%|█████▏ | 11487/22095 [19:44:08<15:20:04, 5.20s/it] {'loss': 0.5052, 'grad_norm': 0.30206540097624307, 'learning_rate': 4.921580666358459e-06, 'epoch': 0.52} 52%|█████▏ | 11487/22095 [19:44:08<15:20:04, 5.20s/it] 52%|█████▏ | 11488/22095 [19:44:11<13:31:15, 4.59s/it] {'loss': 0.2951, 'grad_norm': 0.637064127625573, 'learning_rate': 4.92084783633395e-06, 'epoch': 0.52} 52%|█████▏ | 11488/22095 [19:44:11<13:31:15, 4.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76685 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84182 > 40960). Running this sequence through the model will result in indexing errors 52%|█████▏ | 11489/22095 [19:44:14<12:08:03, 4.12s/it] {'loss': 0.3268, 'grad_norm': 0.6493410177667419, 'learning_rate': 4.92011500801018e-06, 'epoch': 0.52} 52%|█████▏ | 11489/22095 [19:44:14<12:08:03, 4.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74479 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47026 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50703 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43108 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11490/22095 [19:44:21<14:31:32, 4.93s/it] {'loss': 0.4844, 'grad_norm': 0.2834604949324538, 'learning_rate': 4.919382181402892e-06, 'epoch': 0.52}
52%|█████▏ | 11491/22095 [19:44:25<13:19:39, 4.52s/it] {'loss': 0.4206, 'grad_norm': 1.102465837368673, 'learning_rate': 4.918649356527833e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11492/22095 [19:44:34<17:45:00, 6.03s/it] {'loss': 0.4756, 'grad_norm': 0.32046127818036063, 'learning_rate': 4.917916533400751e-06, 'epoch': 0.52}
52%|█████▏ | 11493/22095 [19:44:38<15:16:49, 5.19s/it] {'loss': 0.3912, 'grad_norm': 0.7355754043795567, 'learning_rate': 4.917183712037389e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250630/windows_augment/images/autocad/handmade_annotation_6/images/1_id_74_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:42:35.310439 load time: 1456.2 ms
52%|█████▏ | 11494/22095 [19:44:41<13:40:10, 4.64s/it] {'loss': 0.3091, 'grad_norm': 0.638127357125275, 'learning_rate': 4.916450892453495e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/finder/8af7889e-fbfc-443f-8629-e5b6b0484c7d/images/step_2.png 2025-08-28 11:42:40.182622 load time: 1061.39 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:42:41.120606 load time: 1200.07 ms
52%|█████▏ | 11495/22095 [19:44:44<12:13:31, 4.15s/it] {'loss': 0.3264, 'grad_norm': 0.6315955219941163, 'learning_rate': 4.915718074664816e-06, 'epoch': 0.52}
52%|█████▏ | 11496/22095 [19:44:47<11:11:52, 3.80s/it] {'loss': 0.2954, 'grad_norm': 0.7076413970554138, 'learning_rate': 4.914985258687096e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [400, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8476991 in VC:s3://internvl-moe-sft-data/. Exception: Image size [400, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 129922, 'image': 'vrdu_texteq/astro-ph.CO/43bd143e-2784-4ec6-aa26-bcca8b5e78ef.png', 'image_wh': [[400, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $\\theta_{\\rm Ein}$ is the Einstein radius.'}]}
52%|█████▏ | 11497/22095 [19:44:54<14:18:13, 4.86s/it] {'loss': 0.4464, 'grad_norm': 0.27708411098994, 'learning_rate': 4.91425244453608e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:42:55.341639 load time: 1200.39 ms
52%|█████▏ | 11498/22095 [19:45:02<17:15:35, 5.86s/it] {'loss': 0.4828, 'grad_norm': 0.31059135604941385, 'learning_rate': 4.9135196322275195e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
52%|█████▏ | 11499/22095 [19:45:06<15:32:55, 5.28s/it] {'loss': 0.3207, 'grad_norm': 0.6448738124679796, 'learning_rate': 4.912786821777152e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [670, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8433517 in VC:s3://internvl-moe-sft-data/. Exception: Image size [670, 23, 100, 100] is too small. Minimum size is 28.
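[annotation] The repeated "Image size ... is too small. Minimum size is 28." failures above all come from a size guard in `data_qwen_2.py` (`_get_item`, line 1335), whose source is not shown in this log. The sketch below is an assumption about what such a guard looks like, inferred only from the error messages; the function and variable names are illustrative, not the actual ones. The 28-pixel floor is consistent with Qwen2.5-VL's vision encoder, which uses 14-pixel patches merged 2x2, so any side shorter than 28 px cannot produce a single vision token.

```python
MIN_SIDE = 28  # smallest usable side: 14 px patch * 2x2 spatial merge


def check_image_sizes(image_whs):
    """Raise if any image side in a sample is below MIN_SIDE.

    image_whs: list of [width, height] pairs, one per image in the sample
    (the 'image_wh' field of the problematic samples logged above).
    Hypothetical sketch -- not the actual data_qwen_2.py code.
    """
    # Flatten all widths and heights into one list, as the logged
    # "[400, 25, ...]" messages suggest the real check does.
    sizes = [dim for wh in image_whs for dim in wh]
    if sizes and min(sizes) < MIN_SIDE:
        raise ValueError(
            f"Image size {sizes} is too small. Minimum size is {MIN_SIDE}."
        )
```

Under this sketch, the sample with `'image_wh': [[400, 25]]` fails because its 25 px height is below the floor, matching the tracebacks in the log; a dataloader would typically catch this and retry with another index, which is what the "[Try #0] Failed to fetch sample" lines indicate.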
Problematic sample: {'id': 121588, 'image': 'vrdu_texteq/astro-ph.CO/8103d1a4-0155-46b0-9bd7-91b811d86852.png', 'image_wh': [[670, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': '$\\lambda$6355 feature. In the notation used in BSNIP~II\nthis is'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    sample = self._get_item(i)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    raise ValueError(
ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8343907 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10558, 'image': 'vrdu_table_final_2/astro-ph.CO/753181d9-446a-4625-9e54-f554d4525337.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]}
52%|█████▏ | 11500/22095 [19:45:10<13:45:16, 4.67s/it] {'loss': 0.3071, 'grad_norm': 0.6242971855514119, 'learning_rate': 4.912054013200731e-06, 'epoch': 0.52}
52%|█████▏ | 11501/22095 [19:45:13<12:15:53, 4.17s/it] {'loss': 0.32, 'grad_norm': 0.5925596570301539, 'learning_rate': 4.911321206513996e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (66989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63263 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105664 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11502/22095 [19:45:16<11:28:29, 3.90s/it] {'loss': 0.3359, 'grad_norm': 0.5976703369179355, 'learning_rate': 4.9105884017327e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (114532 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11503/22095 [19:45:19<10:35:03, 3.60s/it] {'loss': 0.3594, 'grad_norm': 0.6275887504151985, 'learning_rate': 4.9098555988725814e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/17d889dde6fdc256ad29650d59b78fd97a42b5762061f527fd3e2766966f8a46.png 2025-08-28 11:43:19.103836 load time: 1183.1 ms
52%|█████▏ | 11504/22095 [19:45:23<11:04:27, 3.76s/it] {'loss': 0.3609, 'grad_norm': 0.6165560172448781, 'learning_rate': 4.909122797949391e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (54618 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11505/22095 [19:45:27<10:59:09, 3.73s/it] {'loss': 0.2873, 'grad_norm': 0.5923730829115844, 'learning_rate': 4.908389998978872e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (56145 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11506/22095 [19:45:30<10:58:15, 3.73s/it] {'loss': 0.3318, 'grad_norm': 0.625363493919042, 'learning_rate': 4.90765720197677e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11507/22095 [19:45:33<10:17:36, 3.50s/it] {'loss': 0.3188, 'grad_norm': 0.6421525850337274, 'learning_rate': 4.9069244069588305e-06, 'epoch': 0.52}
52%|█████▏ | 11508/22095 [19:45:37<10:28:37, 3.56s/it] {'loss': 0.2584, 'grad_norm': 0.6784976499251623, 'learning_rate': 4.906191613940802e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (78334 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11509/22095 [19:45:41<10:41:49, 3.64s/it] {'loss': 0.3429, 'grad_norm': 0.5921940899188074, 'learning_rate': 4.905458822938426e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11510/22095 [19:45:47<12:49:49, 4.36s/it] {'loss': 0.4917, 'grad_norm': 0.3467998770991752, 'learning_rate': 4.904726033967449e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_2/images/step_0.png 2025-08-28 11:43:45.614322 load time: 1417.29 ms
52%|█████▏ | 11511/22095 [19:45:50<11:37:36, 3.95s/it] {'loss': 0.2988, 'grad_norm': 0.6696798074266498, 'learning_rate': 4.903993247043619e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250630/windows_augment/images/autocad/handmade_annotation_6/images/1_id_17_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:43:50.138977 load time: 1165.95 ms
52%|█████▏ | 11512/22095 [19:45:53<10:46:30, 3.67s/it] {'loss': 0.3349, 'grad_norm': 0.6357144820286588, 'learning_rate': 4.903260462182679e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/17d889dde6fdc256ad29650d59b78fd97a42b5762061f527fd3e2766966f8a46.png 2025-08-28 11:43:54.441998 load time: 1060.41 ms
52%|█████▏ | 11513/22095 [19:46:00<13:39:28, 4.65s/it] {'loss': 0.4921, 'grad_norm': 0.2981795596572952, 'learning_rate': 4.9025276794003715e-06, 'epoch': 0.52}
52%|█████▏ | 11514/22095 [19:46:09<17:48:51, 6.06s/it] {'loss': 0.4528, 'grad_norm': 0.2738371351530118, 'learning_rate': 4.901794898712448e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    try:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    if min(sizes) < 28:
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8310314 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1AYXGmh6I8KJjy0FgXXXXzVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n你能给我图片中所有的文本内容吗?'}, {'from': 'gpt', 'value': '图中所有文字:\nNEW\n进口材质\n精品标签\n高清打印强粘覆膜\n进口精品不伤机器\n强粘覆膜\n精品贴\n6\nTTe精品标签色带\nmm\n白底黑字\nTAPE\nTTe-211\n百佳办公\n直销旗舰店\nHttp://bjbg118.taobao.com'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11515/22095 [19:46:13<15:47:12, 5.37s/it] {'loss': 0.3219, 'grad_norm': 0.6725567010375184, 'learning_rate': 4.901062120134649e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (72063 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52649 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11516/22095 [19:46:16<13:53:45, 4.73s/it] {'loss': 0.3185, 'grad_norm': 0.652312178444958, 'learning_rate': 4.900329343682722e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 11:44:16.547744 load time: 1112.75 ms
52%|█████▏ | 11517/22095 [19:46:25<17:59:25, 6.12s/it] {'loss': 0.5009, 'grad_norm': 0.30183426080952397, 'learning_rate': 4.899596569372409e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/terminal/d575fbf7-ad0d-4665-94ba-472d47b74314/images/step_1.png 2025-08-28 11:44:25.390813 load time: 1055.25 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    try:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    if min(sizes) < 28:
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047171 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵点P是AC的中点,点Q是BC的中点,线段AC=8cm,线段BC=4cm,∴CP=4cm,CQ=2cm,∴PQ=4+2=6cm.'}]}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_202007_3/images/before_screenshot_25_id_25_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:44:25.724204 load time: 1180.23 ms
52%|█████▏ | 11518/22095 [19:46:29<15:57:17, 5.43s/it] {'loss': 0.3348, 'grad_norm': 0.6150731244812301, 'learning_rate': 4.898863797219461e-06, 'epoch': 0.52}
52%|█████▏ | 11519/22095 [19:46:33<14:25:44, 4.91s/it] {'loss': 0.3102, 'grad_norm': 0.6168255764535435, 'learning_rate': 4.898131027239617e-06, 'epoch': 0.52}
52%|█████▏ | 11520/22095 [19:46:36<13:05:16, 4.46s/it] {'loss': 0.3573, 'grad_norm': 0.6182232484985482, 'learning_rate': 4.897398259448625e-06, 'epoch': 0.52}
52%|█████▏ | 11521/22095 [19:46:41<12:50:55, 4.37s/it] {'loss': 0.3264, 'grad_norm': 0.6503476877876679, 'learning_rate': 4.89666549386223e-06, 'epoch': 0.52}
52%|█████▏ | 11522/22095 [19:46:44<12:04:04, 4.11s/it] {'loss': 0.3065, 'grad_norm': 0.6180965256332858, 'learning_rate': 4.895932730496174e-06, 'epoch': 0.52}
52%|█████▏ | 11523/22095 [19:46:49<12:28:05, 4.25s/it] {'loss': 0.3076, 'grad_norm': 0.6210216574725677, 'learning_rate': 4.895199969366206e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11524/22095 [19:46:59<17:25:34, 5.93s/it] {'loss': 0.498, 'grad_norm': 0.35488493465928805, 'learning_rate': 4.894467210488069e-06, 'epoch': 0.52}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_4/images/step_4.png 2025-08-28 11:44:57.287393 load time: 1294.03 ms
52%|█████▏ | 11525/22095 [19:47:02<15:15:53, 5.20s/it] {'loss': 0.3155, 'grad_norm': 0.5825552011467424, 'learning_rate': 4.893734453877506e-06, 'epoch': 0.52}
52%|█████▏ | 11526/22095 [19:47:05<13:18:44, 4.53s/it] {'loss': 0.325, 'grad_norm': 0.723602606465192, 'learning_rate': 4.893001699550263e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (55647 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11527/22095 [19:47:09<13:17:55, 4.53s/it] {'loss': 0.3568, 'grad_norm': 0.6108827405279276, 'learning_rate': 4.892268947522088e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (51973 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50777 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69921 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52100 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11528/22095 [19:47:14<13:01:35, 4.44s/it] {'loss': 0.2834, 'grad_norm': 0.6114150637030855, 'learning_rate': 4.891536197808719e-06, 'epoch': 0.52}
52%|█████▏ | 11529/22095 [19:47:17<12:09:23, 4.14s/it] {'loss': 0.3451, 'grad_norm': 0.6360114618316366, 'learning_rate': 4.890803450425905e-06, 'epoch': 0.52}
52%|█████▏ | 11530/22095 [19:47:20<10:53:10, 3.71s/it] {'loss': 0.3402, 'grad_norm': 0.6545738871505882, 'learning_rate': 4.890070705389388e-06, 'epoch': 0.52}
52%|█████▏ | 11531/22095 [19:47:23<10:07:48, 3.45s/it] {'loss': 0.337, 'grad_norm': 0.6206975055805826, 'learning_rate': 4.889337962714918e-06, 'epoch': 0.52}
52%|█████▏ | 11532/22095 [19:47:26<9:57:36, 3.39s/it] {'loss': 0.3216, 'grad_norm': 0.7917530209205013, 'learning_rate': 4.888605222418232e-06, 'epoch': 0.52}
52%|█████▏ | 11533/22095 [19:47:29<9:36:35, 3.28s/it] {'loss': 0.2912, 'grad_norm': 1.048978913657367, 'learning_rate': 4.887872484515078e-06, 'epoch': 0.52}
52%|█████▏ | 11534/22095 [19:47:33<10:04:02, 3.43s/it] {'loss': 0.3535, 'grad_norm': 0.6780769881876405, 'learning_rate': 4.8871397490212015e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/finder/1fe1ca62-7e4a-4d85-af6c-e650a9c51129/images/step_3.png 2025-08-28 11:45:31.559808 load time: 1053.46 ms
52%|█████▏ | 11535/22095 [19:47:42<14:49:51, 5.06s/it] {'loss': 0.4988, 'grad_norm': 0.33566560310405, 'learning_rate': 4.886407015952344e-06, 'epoch': 0.52}
52%|█████▏ | 11536/22095 [19:47:53<20:06:41, 6.86s/it] {'loss': 0.4726, 'grad_norm': 0.33959867580729625, 'learning_rate': 4.8856742853242504e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (97054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120499 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11537/22095 [19:47:57<18:09:24, 6.19s/it] {'loss': 0.2987, 'grad_norm': 0.6735425814731475, 'learning_rate': 4.884941557152666e-06, 'epoch': 0.52}
52%|█████▏ | 11538/22095 [19:48:01<15:46:18, 5.38s/it] {'loss': 0.2959, 'grad_norm': 0.654398883844253, 'learning_rate': 4.884208831453335e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    try:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    if min(sizes) < 28:
ValueError: Image size [23, 89, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8345991 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 89, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12648, 'image': 'vrdu_table_final_2/astro-ph.CO/576925fe-7cb5-48d1-b9ae-d5f1bc5ff1f2.png', 'image_wh': [[23, 89]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}\n $A$\\\\\n $B$\\\\\n $C$\\\\\n \\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11539/22095 [19:48:09<18:35:56, 6.34s/it] {'loss': 0.4815, 'grad_norm': 0.28491726359378344, 'learning_rate': 4.883476108241999e-06, 'epoch': 0.52}
52%|█████▏ | 11540/22095 [19:48:13<16:28:47, 5.62s/it] {'loss': 0.383, 'grad_norm': 0.632877892939454, 'learning_rate': 4.882743387534406e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (69129 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76368 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11541/22095 [19:48:17<14:58:26, 5.11s/it] {'loss': 0.3215, 'grad_norm': 0.678673092334952, 'learning_rate': 4.882010669346294e-06, 'epoch': 0.52}
52%|█████▏ | 11542/22095 [19:48:21<14:02:45, 4.79s/it] {'loss': 0.2891, 'grad_norm': 0.5418262072495581, 'learning_rate': 4.881277953693412e-06, 'epoch': 0.52}
52%|█████▏ | 11543/22095 [19:48:24<12:18:50, 4.20s/it] {'loss': 0.3584, 'grad_norm': 0.6492432998704616, 'learning_rate': 4.8805452405915025e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    try:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    if min(sizes) < 28:
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946047 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69200, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 7.5\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 11:46:24.226099 load time: 1049.02 ms
52%|█████▏ | 11544/22095 [19:48:27<11:29:44, 3.92s/it] {'loss': 0.3324, 'grad_norm': 0.6027133643001226, 'learning_rate': 4.879812530056309e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11545/22095 [19:48:37<16:18:38, 5.57s/it] {'loss': 0.4705, 'grad_norm': 0.37705179015063844, 'learning_rate': 4.879079822103575e-06, 'epoch': 0.52}
52%|█████▏ | 11546/22095 [19:48:40<14:27:24, 4.93s/it] {'loss': 0.2968, 'grad_norm': 0.6388994946240203, 'learning_rate': 4.878347116749042e-06, 'epoch': 0.52}
52%|█████▏ | 11547/22095 [19:48:44<13:04:13, 4.46s/it] {'loss': 0.2954, 'grad_norm': 1.0835052752581886, 'learning_rate': 4.877614414008459e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11548/22095 [19:48:47<12:31:45, 4.28s/it] {'loss': 0.3191, 'grad_norm': 0.6052171996998544, 'learning_rate': 4.876881713897565e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11549/22095 [19:48:55<15:17:20, 5.22s/it] {'loss': 0.4955, 'grad_norm': 0.2959331868748914, 'learning_rate': 4.876149016432104e-06, 'epoch': 0.52}
52%|█████▏ | 11550/22095 [19:48:59<14:21:51, 4.90s/it] {'loss': 0.3521, 'grad_norm': 0.6744346120219048, 'learning_rate': 4.875416321627823e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (71881 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11551/22095 [19:49:02<12:42:30, 4.34s/it] {'loss': 0.3312, 'grad_norm': 0.6485308645284827, 'learning_rate': 4.87468362950046e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (44772 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43557 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84657 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99090 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110431 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11552/22095 [19:49:06<12:35:25, 4.30s/it] {'loss': 0.327, 'grad_norm': 0.6479885264159125, 'learning_rate': 4.873950940065762e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    for attempt_idx in range(num_base_retries):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11306949 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 504, 'image': 'chat/801.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n谁能解读一下图中的聊天内容?'}, {'from': 'gpt', 'value': '图片显示的是一部手机的聊天界面,使用的类似微信的聊天应用。聊天内容如下:\n\n- 左边的头像发的消息:\n1. 三\n2. 你用我教你的,回家。\n\n- 右边的头像发的消息:\n1. 是我家。快。\n\n目前显示的聊天内容就这些。整个会话似乎涉及某人使用某种方法回到他们家,并且对方在催促速度快些。'}]}
52%|█████▏ | 11553/22095 [19:49:10<11:43:05, 4.00s/it] {'loss': 0.3253, 'grad_norm': 0.6287323547050803, 'learning_rate': 4.8732182533394716e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11554/22095 [19:49:13<10:53:44, 3.72s/it] {'loss': 0.3297, 'grad_norm': 0.5957143013642198, 'learning_rate': 4.87248556933733e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (112719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47489 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41039 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11555/22095 [19:49:15<9:59:47, 3.41s/it] {'loss': 0.3196, 'grad_norm': 0.7821061377676696, 'learning_rate': 4.871752888075082e-06, 'epoch': 0.52}
52%|█████▏ | 11556/22095 [19:49:19<10:14:10, 3.50s/it] {'loss': 0.2934, 'grad_norm': 0.5824779569512576, 'learning_rate': 4.871020209568473e-06, 'epoch': 0.52}
52%|█████▏ | 11557/22095 [19:49:22<9:45:52, 3.34s/it] {'loss': 0.3042, 'grad_norm': 0.6703735640538666, 'learning_rate': 4.870287533833241e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    for attempt_idx in range(num_base_retries):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408762 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
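[annotation] The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines above are the standard Hugging Face tokenizer warning, emitted when an encoded sequence exceeds the tokenizer's `model_max_length` (40960 here). The warning itself does not truncate anything, which is why it adds "Running this sequence through the model will result in indexing errors": positions past the context limit would index outside the model's position range. The guard below is a hypothetical sketch of a clip step a pipeline could apply before the forward pass; it is not part of the logged codebase.

```python
MODEL_MAX_LENGTH = 40960  # matches the limit in the warnings above


def clip_to_context(input_ids, max_len=MODEL_MAX_LENGTH):
    """Warn and truncate sequences that exceed the model context window.

    Hypothetical guard mirroring the HF tokenizer warning seen in this log
    (e.g. 47026 > 40960): anything past max_len is dropped so the forward
    pass never indexes beyond the position range.
    """
    if len(input_ids) > max_len:
        print(
            f"Token indices sequence length is longer than the specified "
            f"maximum sequence length for this model ({len(input_ids)} > {max_len})."
        )
        return input_ids[:max_len]
    return input_ids
```

In practice, truncating a multimodal sample this bluntly can cut through image placeholder tokens, so long samples are more often filtered out or repacked upstream; the sketch only illustrates where the 40960 boundary bites.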
Problematic sample: {'id': 10956, 'image': 'vrdu_table_final_2/astro-ph.CO/dc9aeac2-350b-41af-8c9e-8bb89563b9f4.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
52%|█████▏ | 11558/22095 [19:49:25<9:25:28, 3.22s/it] {'loss': 0.3039, 'grad_norm': 0.6302364020589345, 'learning_rate': 4.8695548608851326e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (58714 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41848 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45108 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11559/22095 [19:49:34<14:47:52, 5.06s/it] {'loss': 0.4762, 'grad_norm': 0.3537950301088894, 'learning_rate': 4.868822190739888e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (121283 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (128717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54702 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50490 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85109 > 40960). Running this sequence through the model will result in indexing errors
52%|█████▏ | 11560/22095 [19:49:44<18:57:33, 6.48s/it] {'loss': 0.4747, 'grad_norm': 0.32810713616438986, 'learning_rate': 4.868089523413255e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
52%|█████▏ | 11561/22095 [19:49:47<16:11:57, 5.54s/it] {'loss': 0.3298, 'grad_norm': 0.6764435138048115, 'learning_rate': 4.86735685892097e-06, 'epoch': 0.52}
52%|█████▏ | 11562/22095 [19:49:51<14:08:22, 4.83s/it] {'loss': 0.2964, 'grad_norm': 0.665599259745116, 'learning_rate': 4.8666241972787794e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11563/22095 [19:50:01<18:49:33, 6.43s/it] {'loss': 0.4723, 'grad_norm': 0.29847661924202107, 'learning_rate': 4.865891538502427e-06, 'epoch': 0.52}
52%|█████▏ | 11564/22095 [19:50:05<16:42:30, 5.71s/it] {'loss': 0.3278, 'grad_norm': 0.7256024754638591, 'learning_rate': 4.8651588826076514e-06, 'epoch': 0.52}
52%|█████▏ | 11565/22095 [19:50:08<14:13:01, 4.86s/it] {'loss': 0.3186, 'grad_norm': 0.633130441389736, 'learning_rate': 4.864426229610197e-06, 'epoch': 0.52}
52%|█████▏ | 11566/22095 [19:50:11<12:46:17, 4.37s/it] {'loss': 0.2952, 'grad_norm': 0.7023084421439088, 'learning_rate': 4.863693579525809e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
52%|█████▏ | 11567/22095 [19:50:20<17:12:17, 5.88s/it] {'loss': 0.4613, 'grad_norm': 0.32772249762550243, 'learning_rate': 4.862960932370225e-06, 'epoch': 0.52}
52%|█████▏ | 11568/22095 [19:50:30<20:44:49, 7.10s/it] {'loss': 0.4735, 'grad_norm': 0.481162743463051, 'learning_rate': 4.862228288159191e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 364, but got module 1
52%|█████▏ | 11569/22095 [19:50:34<18:04:27, 6.18s/it] {'loss': 0.3308, 'grad_norm': 0.5770663986524796, 'learning_rate': 4.861495646908448e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
52%|█████▏ | 11570/22095 [19:50:38<15:52:13, 5.43s/it] {'loss': 0.3551, 'grad_norm': 0.7454149959163855, 'learning_rate': 4.860763008633736e-06, 'epoch': 0.52}
52%|█████▏ | 11571/22095 [19:50:42<14:29:48, 4.96s/it] {'loss': 0.3473, 'grad_norm': 1.1015416066071197, 'learning_rate': 4.860030373350801e-06, 'epoch': 0.52}
52%|█████▏ | 11572/22095 [19:50:45<12:52:49, 4.41s/it] {'loss': 0.3224, 'grad_norm': 0.6270488120483275, 'learning_rate': 4.859297741075384e-06, 'epoch': 0.52}
52%|█████▏ | 11573/22095 [19:50:48<11:45:23, 4.02s/it] {'loss': 0.3071, 'grad_norm': 0.6364875464470632, 'learning_rate': 4.858565111823226e-06, 'epoch': 0.52}
52%|█████▏ | 11574/22095 [19:50:51<10:46:08, 3.68s/it] {'loss': 0.3626, 'grad_norm': 0.6294535199966994, 'learning_rate': 4.857832485610068e-06,
'epoch': 0.52} 52%|█████▏ | 11574/22095 [19:50:51<10:46:08, 3.68s/it] 52%|█████▏ | 11575/22095 [19:50:54<10:25:39, 3.57s/it] {'loss': 0.3054, 'grad_norm': 0.6204239645520809, 'learning_rate': 4.857099862451654e-06, 'epoch': 0.52} 52%|█████▏ | 11575/22095 [19:50:54<10:25:39, 3.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 52%|█████▏ | 11576/22095 [19:50:58<10:29:34, 3.59s/it] {'loss': 0.3272, 'grad_norm': 0.6299344433260511, 'learning_rate': 4.856367242363727e-06, 'epoch': 0.52} 52%|█████▏ | 11576/22095 [19:50:58<10:29:34, 3.59s/it] 52%|█████▏ | 11577/22095 [19:51:02<10:52:07, 3.72s/it] {'loss': 0.3414, 'grad_norm': 0.7757501561103395, 'learning_rate': 4.8556346253620256e-06, 'epoch': 0.52} 52%|█████▏ | 11577/22095 [19:51:02<10:52:07, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81444 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55924 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (140984 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65751 > 40960). 
Running this sequence through the model will result in indexing errors
 52%|█████▏ | 11578/22095 [19:51:05<10:41:52, 3.66s/it] {'loss': 0.3632, 'grad_norm': 0.6331417772382716, 'learning_rate': 4.854902011462291e-06, 'epoch': 0.52}
 52%|█████▏ | 11579/22095 [19:51:09<10:36:30, 3.63s/it] {'loss': 0.289, 'grad_norm': 0.5749494648969943, 'learning_rate': 4.85416940068027e-06, 'epoch': 0.52}
 52%|█████▏ | 11580/22095 [19:51:12<10:24:20, 3.56s/it] {'loss': 0.2715, 'grad_norm': 0.6052192024100661, 'learning_rate': 4.853436793031698e-06, 'epoch': 0.52}
Token indices sequence length is longer than the specified maximum sequence length for this model (52232 > 40960). Running this sequence through the model will result in indexing errors
 52%|█████▏ | 11581/22095 [19:51:15<9:37:53, 3.30s/it] {'loss': 0.3065, 'grad_norm': 0.6163087970541375, 'learning_rate': 4.852704188532319e-06, 'epoch': 0.52}
 52%|█████▏ | 11582/22095 [19:51:19<9:59:38, 3.42s/it] {'loss': 0.3127, 'grad_norm': 0.650723415979576, 'learning_rate': 4.851971587197877e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 52%|█████▏ | 11583/22095 [19:51:22<9:35:49, 3.29s/it] {'loss': 0.3626, 'grad_norm': 0.694249073684704, 'learning_rate': 4.8512389890441085e-06, 'epoch': 0.52}
 52%|█████▏ | 11584/22095 [19:51:25<9:11:31, 3.15s/it] {'loss': 0.2904, 'grad_norm': 0.6401599007480614, 'learning_rate': 4.850506394086758e-06, 'epoch': 0.52}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    num_base_retries = 1
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    self.list_data_dict[i].get("height", 100),
ValueError: Image size [45, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8436020 in VC:s3://internvl-moe-sft-data/. Exception: Image size [45, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 143296, 'image': 'vrdu_texteq/astro-ph.CO/ed2e9807-8308-4619-8c92-5b01dde2ce68.png', 'image_wh': [[45, 20]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': '$\\qquad${\\tt end}'}]}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_104/img/step_0.png 2025-08-28 11:49:24.540791 load time: 1221.79 ms
 52%|█████▏ | 11585/22095 [19:51:28<9:16:47, 3.18s/it] {'loss': 0.3397, 'grad_norm': 0.6155923503742697, 'learning_rate': 4.849773802341567e-06, 'epoch': 0.52}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/terminal/432466e7-22e7-4194-aa9e-19f7c21adef5/images/step_0.png 2025-08-28 11:49:26.016697 load time: 1160.79 ms
 52%|█████▏ | 11586/22095 [19:51:38<14:58:27, 5.13s/it] {'loss': 0.4777, 'grad_norm': 0.5400596329824081, 'learning_rate': 4.849041213824274e-06, 'epoch': 0.52}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 52%|█████▏ | 11587/22095 [19:51:41<13:23:44, 4.59s/it] {'loss': 0.3501, 'grad_norm': 0.6594264393188617, 'learning_rate': 4.8483086285506224e-06, 'epoch': 0.52}
 52%|█████▏ | 11587/22095 [19:51:41<13:23:44,
4.59s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38206.png 2025-08-28 11:49:37.703222 load time: 1832.38 ms 52%|█████▏ | 11588/22095 [19:51:44<12:14:16, 4.19s/it] {'loss': 0.2842, 'grad_norm': 0.5901626896745396, 'learning_rate': 4.847576046536351e-06, 'epoch': 0.52} 52%|█████▏ | 11588/22095 [19:51:44<12:14:16, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (46893 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47281 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45270 > 40960). Running this sequence through the model will result in indexing errors 52%|█████▏ | 11589/22095 [19:51:53<16:21:09, 5.60s/it] {'loss': 0.467, 'grad_norm': 0.313815295453429, 'learning_rate': 4.8468434677972055e-06, 'epoch': 0.52} 52%|█████▏ | 11589/22095 [19:51:53<16:21:09, 5.60s/it] 52%|█████▏ | 11590/22095 [19:51:57<15:21:03, 5.26s/it] {'loss': 0.3314, 'grad_norm': 0.6058793660821811, 'learning_rate': 4.846110892348921e-06, 'epoch': 0.52} 52%|█████▏ | 11590/22095 [19:51:57<15:21:03, 5.26s/it] 52%|█████▏ | 11591/22095 [19:52:01<13:50:55, 4.75s/it] {'loss': 0.3176, 'grad_norm': 0.6104563403943248, 'learning_rate': 4.845378320207241e-06, 'epoch': 0.52} 52%|█████▏ | 11591/22095 [19:52:01<13:50:55, 4.75s/it] 52%|█████▏ | 11592/22095 [19:52:04<12:13:09, 4.19s/it] {'loss': 0.2999, 'grad_norm': 0.6523091636477124, 'learning_rate': 4.844645751387908e-06, 'epoch': 0.52} 52%|█████▏ | 11592/22095 [19:52:04<12:13:09, 4.19s/it] 52%|█████▏ | 11593/22095 [19:52:07<11:14:59, 3.86s/it] {'loss': 0.3172, 'grad_norm': 0.7426514955082142, 'learning_rate': 4.843913185906658e-06, 
'epoch': 0.52} 52%|█████▏ | 11593/22095 [19:52:07<11:14:59, 3.86s/it] 52%|█████▏ | 11594/22095 [19:52:10<10:22:28, 3.56s/it] {'loss': 0.3273, 'grad_norm': 0.7036324622303272, 'learning_rate': 4.843180623779235e-06, 'epoch': 0.52} 52%|█████▏ | 11594/22095 [19:52:10<10:22:28, 3.56s/it] 52%|█████▏ | 11595/22095 [19:52:13<10:11:40, 3.50s/it] {'loss': 0.311, 'grad_norm': 0.5821863957649411, 'learning_rate': 4.84244806502138e-06, 'epoch': 0.52} 52%|█████▏ | 11595/22095 [19:52:13<10:11:40, 3.50s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:50:12.497428 load time: 1014.06 ms VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_9/images/20250417140122.png 2025-08-28 11:50:13.175574 load time: 1014.0 ms 52%|█████▏ | 11596/22095 [19:52:17<10:37:34, 3.64s/it] {'loss': 0.3352, 'grad_norm': 0.5758372192065354, 'learning_rate': 4.8417155096488315e-06, 'epoch': 0.52} 52%|█████▏ | 11596/22095 [19:52:17<10:37:34, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/mac/images/settings/ce733e79-9bcf-4d47-8303-951a3a1ae194/images/step_0.png 2025-08-28 11:50:15.978559 load time: 1126.3 ms 52%|█████▏ | 11597/22095 [19:52:28<16:29:15, 5.65s/it] {'loss': 0.4698, 'grad_norm': 0.5169097065158664, 'learning_rate': 4.84098295767733e-06, 'epoch': 0.52} 52%|█████▏ | 11597/22095 [19:52:28<16:29:15, 5.65s/it] 52%|█████▏ | 11598/22095 [19:52:33<15:59:42, 5.49s/it] {'loss': 0.3466, 'grad_norm': 0.6203385669439295, 'learning_rate': 4.840250409122617e-06, 'epoch': 0.52} 52%|█████▏ | 11598/22095 [19:52:33<15:59:42, 5.49s/it] 52%|█████▏ | 11599/22095 [19:52:37<14:36:24, 5.01s/it] {'loss': 0.2755, 'grad_norm': 0.6227472133526765, 'learning_rate': 4.8395178640004316e-06, 'epoch': 0.52} 52%|█████▏ | 11599/22095 [19:52:37<14:36:24, 5.01s/it] 53%|█████▎ | 11600/22095 [19:52:40<13:08:25, 4.51s/it] {'loss': 0.3559, 'grad_norm': 
0.6781019728826658, 'learning_rate': 4.838785322326514e-06, 'epoch': 0.53} 53%|█████▎ | 11600/22095 [19:52:40<13:08:25, 4.51s/it] 53%|█████▎ | 11601/22095 [19:52:43<12:03:02, 4.13s/it] {'loss': 0.3216, 'grad_norm': 0.6091503037903379, 'learning_rate': 4.838052784116606e-06, 'epoch': 0.53} 53%|█████▎ | 11601/22095 [19:52:43<12:03:02, 4.13s/it] 53%|█████▎ | 11602/22095 [19:52:46<10:49:15, 3.71s/it] {'loss': 0.3209, 'grad_norm': 0.6534442212691152, 'learning_rate': 4.837320249386446e-06, 'epoch': 0.53} 53%|█████▎ | 11602/22095 [19:52:46<10:49:15, 3.71s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 11:50:43.612177 load time: 1124.74 ms VC:s3://gui-agent/data_20250612/mac/images/calculator/6554dd0e-1a16-4b46-9ac0-15f4f672cbe7/images/step_7.png 2025-08-28 11:50:44.645466 load time: 1011.79 ms 53%|█████▎ | 11603/22095 [19:52:49<10:02:43, 3.45s/it] {'loss': 0.3384, 'grad_norm': 0.629883307804516, 'learning_rate': 4.836587718151773e-06, 'epoch': 0.53} 53%|█████▎ | 11603/22095 [19:52:49<10:02:43, 3.45s/it] 53%|█████▎ | 11604/22095 [19:52:52<9:35:58, 3.29s/it] {'loss': 0.34, 'grad_norm': 0.6153519783719406, 'learning_rate': 4.8358551904283285e-06, 'epoch': 0.53} 53%|█████▎ | 11604/22095 [19:52:52<9:35:58, 3.29s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30182.png 2025-08-28 11:50:51.660918 load time: 1134.81 ms 53%|█████▎ | 11605/22095 [19:52:55<9:53:38, 3.40s/it] {'loss': 0.343, 'grad_norm': 0.6418449757511626, 'learning_rate': 4.835122666231854e-06, 'epoch': 0.53} 53%|█████▎ | 11605/22095 [19:52:55<9:53:38, 3.40s/it] 53%|█████▎ | 11606/22095 [19:52:59<10:21:23, 3.55s/it] {'loss': 0.3172, 'grad_norm': 0.6674292870560887, 'learning_rate': 4.834390145578085e-06, 'epoch': 0.53} 53%|█████▎ | 11606/22095 [19:52:59<10:21:23, 3.55s/it] 53%|█████▎ | 11607/22095 [19:53:03<10:25:21, 3.58s/it] {'loss': 0.33, 'grad_norm': 0.6841831017464953, 
'learning_rate': 4.833657628482762e-06, 'epoch': 0.53}
 53%|█████▎ | 11608/22095 [19:53:06<9:54:33, 3.40s/it] {'loss': 0.3138, 'grad_norm': 0.656399605022193, 'learning_rate': 4.832925114961629e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_2/images/step_0.png 2025-08-28 11:51:04.588108 load time: 1204.45 ms
 53%|█████▎ | 11609/22095 [19:53:09<9:23:59, 3.23s/it] {'loss': 0.3187, 'grad_norm': 0.6375523783307073, 'learning_rate': 4.832192605030419e-06, 'epoch': 0.53}
 53%|█████▎ | 11610/22095 [19:53:12<9:47:00, 3.36s/it] {'loss': 0.3179, 'grad_norm': 0.6306405944594975, 'learning_rate': 4.8314600987048755e-06, 'epoch': 0.53}
 53%|█████▎ | 11611/22095 [19:53:16<10:10:32, 3.49s/it] {'loss': 0.3457, 'grad_norm': 0.6388363788630063, 'learning_rate': 4.8307275960007385e-06, 'epoch': 0.53}
 53%|█████▎ | 11612/22095 [19:53:19<9:32:12, 3.28s/it] {'loss': 0.3226, 'grad_norm': 0.6314349162576515, 'learning_rate': 4.829995096933744e-06, 'epoch': 0.53}
 53%|█████▎ | 11613/22095 [19:53:23<9:58:09, 3.42s/it] {'loss': 0.3318, 'grad_norm': 0.6298986320385729, 'learning_rate': 4.829262601519632e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    return [new_conversations]
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    )
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    def check_mm_input(self, idx: int) -> bool:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    return tcs_loader(image_path)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    img = self.pil_loader(img_value_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    return img.convert("RGB")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7322390 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'inventor/20250511_134843_1/images/before_screenshot_1_id_149_internvl_appearance_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nQuery:A labeled button that reads 'Modify' with a dropdown arrow. It has a light gray background with slightly darker gray borders, appearing as part of the ribbon interface toolbar. The button has a rectangular shape with rounded corners and is part of a grouped set of controls in the interface.\nOutput only the bounding box in your response"}, {'from': 'gpt', 'value': "A labeled button that reads 'Modify' with a dropdown arrow. It has a light gray background with slightly darker gray borders, appearing as part of the ribbon interface toolbar. The button has a rectangular shape with rounded corners and is part of a grouped set of controls in the interface.[[556, 137, 660, 145]]"}], 'width': 3600, 'height': 2338}
 53%|█████▎ | 11614/22095 [19:53:27<10:28:24, 3.60s/it] {'loss': 0.3065, 'grad_norm': 0.5812589402658017, 'learning_rate': 4.828530109774143e-06, 'epoch': 0.53}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/57f0e951e810592d9ceae1ee4edc34446d856723e561f6008a9f3984f3d70a51.png 2025-08-28 11:51:25.419938 load time: 1268.46 ms
VC:s3://gui-agent/data_20250612/mac/images/finder/a47acc16-0ade-4eee-bdee-b1e4cad95e05/images/step_0.png 2025-08-28 11:51:26.851470 load time: 1133.48 ms
 53%|█████▎ | 11615/22095 [19:53:29<9:48:35, 3.37s/it] {'loss': 0.3103, 'grad_norm': 0.6130135799885693, 'learning_rate': 4.827797621713017e-06, 'epoch': 0.53}
 53%|█████▎ | 11616/22095 [19:53:34<10:23:53, 3.57s/it] {'loss': 0.3412, 'grad_norm': 0.6913402064094225, 'learning_rate': 4.827065137351989e-06, 'epoch': 0.53}
 53%|█████▎ | 11617/22095 [19:53:37<10:00:24, 3.44s/it] {'loss': 0.3162, 'grad_norm': 0.609221764111646, 'learning_rate': 4.8263326567068e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11618/22095 [19:53:40<9:33:56,
3.29s/it] {'loss': 0.3289, 'grad_norm': 0.6355749663590441, 'learning_rate': 4.82560017979319e-06, 'epoch': 0.53} 53%|█████▎ | 11618/22095 [19:53:40<9:33:56, 3.29s/it] 53%|█████▎ | 11619/22095 [19:53:44<10:07:52, 3.48s/it] {'loss': 0.3576, 'grad_norm': 0.5889558380160951, 'learning_rate': 4.824867706626896e-06, 'epoch': 0.53} 53%|█████▎ | 11619/22095 [19:53:44<10:07:52, 3.48s/it] 53%|█████▎ | 11620/22095 [19:53:47<10:18:20, 3.54s/it] {'loss': 0.3272, 'grad_norm': 0.6284121917912118, 'learning_rate': 4.824135237223657e-06, 'epoch': 0.53} 53%|█████▎ | 11620/22095 [19:53:47<10:18:20, 3.54s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250504_153105_1/images/before_screenshot_1_id_127_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:51:45.995913 load time: 1180.18 ms 53%|█████▎ | 11621/22095 [19:53:50<9:47:48, 3.37s/it] {'loss': 0.3105, 'grad_norm': 0.7364661098326682, 'learning_rate': 4.823402771599213e-06, 'epoch': 0.53} 53%|█████▎ | 11621/22095 [19:53:50<9:47:48, 3.37s/it] 53%|█████▎ | 11622/22095 [19:53:54<9:57:38, 3.42s/it] {'loss': 0.3555, 'grad_norm': 0.6478860127947099, 'learning_rate': 4.8226703097693e-06, 'epoch': 0.53} 53%|█████▎ | 11622/22095 [19:53:54<9:57:38, 3.42s/it] 53%|█████▎ | 11623/22095 [19:53:57<10:01:26, 3.45s/it] {'loss': 0.3183, 'grad_norm': 0.6213982022709907, 'learning_rate': 4.821937851749656e-06, 'epoch': 0.53} 53%|█████▎ | 11623/22095 [19:53:57<10:01:26, 3.45s/it] 53%|█████▎ | 11624/22095 [19:54:00<9:28:59, 3.26s/it] {'loss': 0.3429, 'grad_norm': 0.6582634639981945, 'learning_rate': 4.8212053975560234e-06, 'epoch': 0.53} 53%|█████▎ | 11624/22095 [19:54:00<9:28:59, 3.26s/it] 53%|█████▎ | 11625/22095 [19:54:03<9:35:05, 3.30s/it] {'loss': 0.3219, 'grad_norm': 0.8059204720277006, 'learning_rate': 4.820472947204136e-06, 'epoch': 0.53} 53%|█████▎ | 11625/22095 [19:54:03<9:35:05, 3.30s/it] 53%|█████▎ | 11626/22095 [19:54:07<9:59:52, 3.44s/it] {'loss': 0.305, 'grad_norm': 0.6953029754941127, 'learning_rate': 
4.8197405007097346e-06, 'epoch': 0.53} 53%|█████▎ | 11626/22095 [19:54:07<9:59:52, 3.44s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 11:52:04.547130 load time: 1423.94 ms VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_2/images/step_3.png 2025-08-28 11:52:07.629975 load time: 1060.86 ms 53%|█████▎ | 11627/22095 [19:54:12<11:04:12, 3.81s/it] {'loss': 0.3335, 'grad_norm': 0.7495626317789454, 'learning_rate': 4.819008058088557e-06, 'epoch': 0.53} 53%|█████▎ | 11627/22095 [19:54:12<11:04:12, 3.81s/it] 53%|█████▎ | 11628/22095 [19:54:15<10:30:45, 3.62s/it] {'loss': 0.3214, 'grad_norm': 1.065609757684985, 'learning_rate': 4.8182756193563365e-06, 'epoch': 0.53} 53%|█████▎ | 11628/22095 [19:54:15<10:30:45, 3.62s/it] 53%|█████▎ | 11629/22095 [19:54:19<10:29:56, 3.61s/it] {'loss': 0.3208, 'grad_norm': 0.6059383855298436, 'learning_rate': 4.817543184528817e-06, 'epoch': 0.53} 53%|█████▎ | 11629/22095 [19:54:19<10:29:56, 3.61s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_5/images/before_screenshot_51_id_113_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 11:52:19.023082 load time: 1093.49 ms 53%|█████▎ | 11630/22095 [19:54:22<10:10:27, 3.50s/it] {'loss': 0.2865, 'grad_norm': 0.6914288972480178, 'learning_rate': 4.816810753621735e-06, 'epoch': 0.53} 53%|█████▎ | 11630/22095 [19:54:22<10:10:27, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57920 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55855 > 40960). 
Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11631/22095 [19:54:26<10:20:02, 3.56s/it] {'loss': 0.3535, 'grad_norm': 0.7088506050035354, 'learning_rate': 4.816078326650827e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8917248 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40401, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 12\nB. 6\nC. 8\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
(The sample's question in English: "As shown in the figure, point C divides segment AB into two parts in the ratio 1:3, and point D is the midpoint of AB. If CD = 2, the length of AB is ( ).")
 53%|█████▎ | 11632/22095 [19:54:29<10:20:10, 3.56s/it] {'loss': 0.3199, 'grad_norm': 0.6231122942346944, 'learning_rate': 4.8153459036318295e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250707/windows_augment_data_20250707/images/os_ubuntu_2/handmade_annotation_7/images/paste_Screenshot from 2025-07-09 14-15-49_id_2_function_2_crop_1_grounding_instructions_random.png 2025-08-28 11:52:29.129975 load time: 1297.21 ms
 53%|█████▎ | 11633/22095 [19:54:32<9:48:57, 3.38s/it] {'loss': 0.3321, 'grad_norm': 0.7525461765440211, 'learning_rate': 4.8146134845804825e-06, 'epoch': 0.53}
 53%|█████▎ | 11634/22095 [19:54:35<9:28:09, 3.26s/it] {'loss': 0.3682, 'grad_norm': 0.6610384539648507, 'learning_rate': 4.813881069512523e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (65401 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98942 > 40960).
Running this sequence through the model will result in indexing errors 53%|█████▎ | 11635/22095 [19:54:38<9:21:25, 3.22s/it] {'loss': 0.3105, 'grad_norm': 0.6021964475738, 'learning_rate': 4.813148658443687e-06, 'epoch': 0.53} 53%|█████▎ | 11635/22095 [19:54:38<9:21:25, 3.22s/it] 53%|█████▎ | 11636/22095 [19:54:42<9:34:06, 3.29s/it] {'loss': 0.3171, 'grad_norm': 0.7096372561872358, 'learning_rate': 4.812416251389711e-06, 'epoch': 0.53} 53%|█████▎ | 11636/22095 [19:54:42<9:34:06, 3.29s/it] 53%|█████▎ | 11637/22095 [19:54:45<9:33:58, 3.29s/it] {'loss': 0.2988, 'grad_norm': 0.6117060167385769, 'learning_rate': 4.811683848366337e-06, 'epoch': 0.53} 53%|█████▎ | 11637/22095 [19:54:45<9:33:58, 3.29s/it]VC:s3://gui-agent/data_20250612/mac/images/calculator/425d5e0b-af14-4fc3-a73c-1d97ab91471a/images/step_8.png 2025-08-28 11:52:44.210850 load time: 1019.33 ms VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250502_111053_6/images/before_screenshot_59_id_40_function_1_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:52:44.726626 load time: 1060.48 ms VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_1/images/before_screenshot_1_id_270_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 11:52:45.526756 load time: 1324.95 ms 53%|█████▎ | 11638/22095 [19:54:49<9:53:30, 3.41s/it] {'loss': 0.3235, 'grad_norm': 0.6371498511510651, 'learning_rate': 4.810951449389296e-06, 'epoch': 0.53} 53%|█████▎ | 11638/22095 [19:54:49<9:53:30, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11639/22095 [19:54:55<12:50:37, 4.42s/it] {'loss': 0.5039, 'grad_norm': 0.42872180050104086, 'learning_rate': 4.810219054474328e-06, 'epoch': 0.53} 53%|█████▎ | 11639/22095 [19:54:55<12:50:37, 4.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 53%|█████▎ | 11640/22095 [19:54:59<11:48:33, 4.07s/it] {'loss': 0.2954, 'grad_norm': 
0.6127103887238807, 'learning_rate': 4.809486663637171e-06, 'epoch': 0.53} 53%|█████▎ | 11640/22095 [19:54:59<11:48:33, 4.07s/it] 53%|█████▎ | 11641/22095 [19:55:02<11:18:19, 3.89s/it] {'loss': 0.3038, 'grad_norm': 0.6174635844891634, 'learning_rate': 4.808754276893561e-06, 'epoch': 0.53} 53%|█████▎ | 11641/22095 [19:55:02<11:18:19, 3.89s/it] 53%|█████▎ | 11642/22095 [19:55:06<11:06:02, 3.82s/it] {'loss': 0.2961, 'grad_norm': 0.6389246437029213, 'learning_rate': 4.808021894259231e-06, 'epoch': 0.53} 53%|█████▎ | 11642/22095 [19:55:06<11:06:02, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11643/22095 [19:55:16<16:22:27, 5.64s/it] {'loss': 0.4631, 'grad_norm': 0.33784446396240825, 'learning_rate': 4.807289515749922e-06, 'epoch': 0.53} 53%|█████▎ | 11643/22095 [19:55:16<16:22:27, 5.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85253 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11644/22095 [19:55:19<14:13:04, 4.90s/it] {'loss': 0.3594, 'grad_norm': 0.6487534965345723, 'learning_rate': 4.806557141381372e-06, 'epoch': 0.53} 53%|█████▎ | 11644/22095 [19:55:19<14:13:04, 4.90s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_19/images/20250417140147.png 2025-08-28 11:53:19.330776 load time: 1052.61 ms 53%|█████▎ | 11645/22095 [19:55:22<12:37:17, 4.35s/it] {'loss': 0.3189, 'grad_norm': 0.6231353459848805, 'learning_rate': 4.8058247711693125e-06, 'epoch': 0.53} 53%|█████▎ | 11645/22095 [19:55:22<12:37:17, 4.35s/it]VC:s3://gui-agent/data_20250612/mac/images/clock/2893e5ec-9c82-44a4-a95d-ef60cb2dec0a/images/step_0.png 2025-08-28 11:53:21.352769 load time: 1122.68 ms 53%|█████▎ | 11646/22095 [19:55:25<11:42:56, 4.04s/it] {'loss': 0.3131, 'grad_norm': 0.6998728150085539, 'learning_rate': 4.805092405129482e-06, 'epoch': 0.53} 53%|█████▎ | 11646/22095 [19:55:25<11:42:56, 4.04s/it] 
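The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)` warnings mean some samples tokenize past the 40960-token context window and would be truncated or mis-indexed downstream. A minimal sketch of screening such samples out ahead of time; `token_len` is a stand-in for a real tokenizer call and is an assumption, not the code used in this run.

```python
# Hypothetical length screen for the "(N > 40960)" warnings above: partition
# samples by tokenized length before they ever reach the model.
MAX_TOKENS = 40960

def split_by_length(samples, token_len, max_tokens: int = MAX_TOKENS):
    """token_len: any callable mapping a sample to its token count
    (a stand-in for len(tokenizer(text)['input_ids']) in a real pipeline)."""
    kept, dropped = [], []
    for s in samples:
        (kept if token_len(s) <= max_tokens else dropped).append(s)
    return kept, dropped

# Toy "tokenizer" (one token per character), just to exercise the logic.
kept, dropped = split_by_length(["short", "x" * 50000], len)
```

Counting tokens once during dataset preparation is cheap compared with discovering the overflow on every epoch, as happens repeatedly in this log.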
 53%|█████▎ | 11647/22095 [19:55:28<10:47:01, 3.72s/it] {'loss': 0.3058, 'grad_norm': 0.6695260173459374, 'learning_rate': 4.8043600432776186e-06, 'epoch': 0.53}
 53%|█████▎ | 11648/22095 [19:55:32<10:29:04, 3.61s/it] {'loss': 0.352, 'grad_norm': 0.6494845805375873, 'learning_rate': 4.803627685629456e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [275, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8469682 in VC:s3://internvl-moe-sft-data/. Exception: Image size [275, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 51215, 'image': 'vrdu_texteq/astro-ph.CO/bbdbb50a-f2cf-4231-a53d-774a18b42614.png', 'image_wh': [[275, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'where $K$ is a constant.'}]}
 53%|█████▎ | 11649/22095 [19:55:35<10:09:05, 3.50s/it] {'loss': 0.3103, 'grad_norm': 0.593318239997085, 'learning_rate': 4.802895332200732e-06, 'epoch': 0.53}
 53%|█████▎ | 11650/22095 [19:55:38<10:20:00, 3.56s/it] {'loss': 0.3239, 'grad_norm': 0.6348791995672993, 'learning_rate': 4.8021629830071824e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8601023 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24614, 'image': '938817477.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
 53%|█████▎ | 11651/22095 [19:55:42<10:30:46, 3.62s/it] {'loss': 0.361, 'grad_norm': 0.6810575830189948, 'learning_rate': 4.801430638064541e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11652/22095 [19:55:52<15:31:57, 5.35s/it] {'loss': 0.4956, 'grad_norm': 0.3611467775995699, 'learning_rate': 4.800698297388546e-06, 'epoch': 0.53}
 53%|█████▎ | 11653/22095 [19:55:56<14:30:52, 5.00s/it] {'loss': 0.3363, 'grad_norm': 0.7888433588801044, 'learning_rate': 4.799965960994934e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 11:53:55.881418 load time: 1038.7 ms
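The recurring "Image size ... is too small. Minimum size is 28" failures above come from samples whose annotated width or height falls below the 28-pixel minimum the vision patchifier accepts (e.g. `image_wh: [[275, 23]]` or the degenerate `[[0, 0]]`). A minimal pre-filtering sketch that would skip such samples before they ever reach `__getitem__` (hypothetical helper names; the training code's actual validator is not shown in the log):

```python
MIN_SIDE = 28  # smallest width/height accepted, taken from the logged error message

def is_valid_image_size(image_wh):
    """Return True when both sides of a (width, height) pair meet the minimum."""
    w, h = image_wh
    return w >= MIN_SIDE and h >= MIN_SIDE

def filter_samples(samples):
    """Drop samples whose annotated image size would fail the size check."""
    return [
        s for s in samples
        if all(is_valid_image_size(wh) for wh in s.get("image_wh", []))
    ]

samples = [
    {"id": 51215, "image_wh": [[275, 23]]},  # height 23 < 28 -> dropped
    {"id": 24614, "image_wh": [[0, 0]]},     # degenerate size  -> dropped
    {"id": 1, "image_wh": [[640, 480]]},     # valid            -> kept
]
print([s["id"] for s in filter_samples(samples)])  # -> [1]
```

Filtering on the stored `image_wh` metadata avoids both the retry overhead (`[Try #0] Failed to fetch sample ...`) and the wasted S3 fetch for samples that will be rejected anyway.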
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_2/images/step_0.png 2025-08-28 11:53:55.968106 load time: 1078.52 ms
 53%|█████▎ | 11654/22095 [19:55:59<13:03:54, 4.50s/it] {'loss': 0.3621, 'grad_norm': 0.6675214604942749, 'learning_rate': 4.799233628899438e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11655/22095 [19:56:03<12:05:17, 4.17s/it] {'loss': 0.2869, 'grad_norm': 0.6828697170062485, 'learning_rate': 4.798501301117795e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11656/22095 [19:56:07<12:15:00, 4.22s/it] {'loss': 0.342, 'grad_norm': 0.8840805701229352, 'learning_rate': 4.79776897766574e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387994 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54808, 'image': 'vrdu_table_final_2/astro-ph.CO/5b129a1d-b935-4e0d-bf21-4a0c6ffded08.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]}
 53%|█████▎ | 11657/22095 [19:56:10<11:22:58, 3.93s/it] {'loss': 0.2812, 'grad_norm': 0.6454186370060752, 'learning_rate': 4.797036658559008e-06, 'epoch': 0.53}
 53%|█████▎ | 11658/22095 [19:56:14<10:56:20, 3.77s/it] {'loss': 0.3479, 'grad_norm': 0.7100634649960617, 'learning_rate': 4.796304343813334e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8956352 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7187, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm'}, {'from': 'gpt', 'value': '【解答】解:根据上图所示OB=5cm-OA,∵OA=(AB+BC)÷2=4cm,∴OB=1cm.'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 11:54:13.022967 load time: 1000.29 ms
 53%|█████▎ | 11659/22095 [19:56:17<10:30:19, 3.62s/it] {'loss': 0.3524, 'grad_norm': 0.6197573068724485, 'learning_rate': 4.795572033444456e-06, 'epoch': 0.53}
 53%|█████▎ | 11660/22095 [19:56:20<9:55:47, 3.43s/it] {'loss': 0.3208, 'grad_norm': 0.9175159401326038, 'learning_rate': 4.794839727468107e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250612/mac/images/settings/d9f91a67-b7b2-4c00-add0-da44a4621f69/images/step_0.png 2025-08-28 11:54:17.506527 load time: 1072.89 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_1/images/step_4.png 2025-08-28 11:54:19.123764 load time: 1040.2 ms
 53%|█████▎ | 11661/22095 [19:56:23<9:27:55, 3.27s/it] {'loss': 0.359, 'grad_norm': 0.6546605773311394, 'learning_rate': 4.7941074259000205e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11662/22095 [19:56:26<9:07:17, 3.15s/it] {'loss': 0.3707, 'grad_norm': 0.6532201480708522, 'learning_rate': 4.793375128755934e-06, 'epoch': 0.53}
 53%|█████▎ | 11663/22095 [19:56:29<9:26:33, 3.26s/it] {'loss': 0.3373, 'grad_norm': 0.6959393775248613, 'learning_rate': 4.792642836051582e-06, 'epoch': 0.53}
 53%|█████▎ | 11664/22095 [19:56:32<9:09:46, 3.16s/it] {'loss': 0.3366, 'grad_norm': 0.6183465447029041, 'learning_rate': 4.7919105478026985e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (42319 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48231 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83544 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80567 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46711 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (56891 > 40960) for 4 sample(s). Truncating to 23854 with 2 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (60715 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11665/22095 [19:56:35<9:00:53, 3.11s/it] {'loss': 0.305, 'grad_norm': 0.6034941995919838, 'learning_rate': 4.791178264025017e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (116444 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11666/22095 [19:56:39<9:40:59, 3.34s/it] {'loss': 0.3577, 'grad_norm': 0.6695252246471096, 'learning_rate': 4.790445984734276e-06, 'epoch': 0.53}
 53%|█████▎ | 11667/22095 [19:56:42<9:42:26, 3.35s/it] {'loss': 0.3265, 'grad_norm': 0.6469582967562887, 'learning_rate': 4.789713709946204e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (67667 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60905 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59217 > 40960).
Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11668/22095 [19:56:45<9:24:56, 3.25s/it] {'loss': 0.3115, 'grad_norm': 0.614241262918158, 'learning_rate': 4.78898143967654e-06, 'epoch': 0.53}
 53%|█████▎ | 11669/22095 [19:56:49<9:47:44, 3.38s/it] {'loss': 0.2803, 'grad_norm': 0.6096260352455624, 'learning_rate': 4.788249173941018e-06, 'epoch': 0.53}
 53%|█████▎ | 11670/22095 [19:56:52<9:32:01, 3.29s/it] {'loss': 0.3376, 'grad_norm': 0.7147214143476651, 'learning_rate': 4.787516912755369e-06, 'epoch': 0.53}
 53%|█████▎ | 11671/22095 [19:56:56<9:57:56, 3.44s/it] {'loss': 0.3161, 'grad_norm': 0.5721691495643542, 'learning_rate': 4.786784656135328e-06, 'epoch': 0.53}
 53%|█████▎ | 11672/22095 [19:56:59<9:50:03, 3.40s/it] {'loss': 0.2999, 'grad_norm': 0.6560440703019803, 'learning_rate': 4.7860524040966316e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [170, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8355867 in VC:s3://internvl-moe-sft-data/. Exception: Image size [170, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 22571, 'image': 'vrdu_table_final_2/astro-ph.CO/b342f18c-1895-4d77-8538-5561ce710476.png', 'image_wh': [[170, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{r} Distance ratio \\end{tabular}\n```"}]}
 53%|█████▎ | 11673/22095 [19:57:09<15:23:49, 5.32s/it] {'loss': 0.4573, 'grad_norm': 0.35524932920415997, 'learning_rate': 4.785320156655013e-06, 'epoch': 0.53}
 53%|█████▎ | 11674/22095 [19:57:12<13:35:26, 4.69s/it] {'loss': 0.3017, 'grad_norm': 0.5887683799085369, 'learning_rate': 4.784587913826203e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (101116 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11675/22095 [19:57:15<12:02:47, 4.16s/it] {'loss': 0.3053, 'grad_norm': 0.5629299722363656, 'learning_rate': 4.7838556756259365e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11676/22095 [19:57:22<14:12:07, 4.91s/it] {'loss': 0.4844, 'grad_norm': 0.30189560182789005, 'learning_rate': 4.78312344206995e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11677/22095 [19:57:25<12:42:12, 4.39s/it] {'loss': 0.3099, 'grad_norm': 0.6264257087723134, 'learning_rate': 4.782391213173973e-06, 'epoch': 0.53}
 53%|█████▎ | 11678/22095 [19:57:28<11:36:12, 4.01s/it] {'loss': 0.2973, 'grad_norm': 0.6104495057889123, 'learning_rate': 4.7816589889537415e-06, 'epoch': 0.53}
 53%|█████▎ | 11679/22095 [19:57:31<10:50:37, 3.75s/it] {'loss': 0.3305, 'grad_norm': 0.6157904204618331, 'learning_rate': 4.780926769424988e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11680/22095 [19:57:37<12:39:27, 4.38s/it] {'loss': 0.4591, 'grad_norm': 0.3013608723061759, 'learning_rate': 4.780194554603444e-06, 'epoch': 0.53}
 53%|█████▎ | 11681/22095 [19:57:41<12:01:07, 4.15s/it] {'loss': 0.3438, 'grad_norm': 0.6707113511900932, 'learning_rate': 4.779462344504845e-06, 'epoch': 0.53}
 53%|█████▎ | 11682/22095 [19:57:44<11:18:20, 3.91s/it] {'loss': 0.3302, 'grad_norm': 0.6069204030793611, 'learning_rate': 4.778730139144923e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11683/22095 [19:57:49<12:28:09, 4.31s/it] {'loss': 0.4674, 'grad_norm': 0.30036751368928316, 'learning_rate': 4.777997938539411e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (54350 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53530 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48442 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93567 > 40960).
Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11684/22095 [19:57:54<12:44:11, 4.40s/it] {'loss': 0.3414, 'grad_norm': 0.6392490111304531, 'learning_rate': 4.777265742704039e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (50714 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54870 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44342 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58551 > 40960).
Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11685/22095 [19:57:57<11:49:31, 4.09s/it] {'loss': 0.293, 'grad_norm': 0.6088450841981451, 'learning_rate': 4.776533551654543e-06, 'epoch': 0.53}
 53%|█████▎ | 11686/22095 [19:58:02<12:10:54, 4.21s/it] {'loss': 0.3402, 'grad_norm': 0.6248569779027412, 'learning_rate': 4.775801365406657e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250623/windows_augment/images/inventor/20250513_095212_1/images/before_screenshot_1_id_66_internvl_appearance_crop_0_grounding_instructions_point_o_paste.png 2025-08-28 11:56:02.083997 load time: 1184.91 ms
 53%|█████▎ | 11687/22095 [19:58:06<11:49:17, 4.09s/it] {'loss': 0.3631, 'grad_norm': 0.6331192626517268, 'learning_rate': 4.77506918397611e-06, 'epoch': 0.53}
 53%|█████▎ | 11688/22095 [19:58:09<11:08:03, 3.85s/it] {'loss': 0.2863, 'grad_norm': 0.5615598344171098, 'learning_rate': 4.774337007378633e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/calendar_1/images/step_0.png 2025-08-28 11:56:08.118450 load time: 1183.44 ms
 53%|█████▎ | 11689/22095 [19:58:13<11:11:56, 3.87s/it] {'loss': 0.379, 'grad_norm': 0.6345617408032932, 'learning_rate': 4.773604835629965e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/vision/test_1723_image.png 2025-08-28 11:56:13.125690 load time: 1034.87 ms
 53%|█████▎ | 11690/22095 [19:58:16<10:30:55, 3.64s/it] {'loss': 0.3255, 'grad_norm': 0.6236578233133567, 'learning_rate': 4.77287266874583e-06, 'epoch': 0.53}
 53%|█████▎ | 11691/22095 [19:58:19<10:18:58, 3.57s/it] {'loss': 0.3424, 'grad_norm': 0.6598727164174507, 'learning_rate': 4.772140506741966e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11692/22095 [19:58:29<15:25:24, 5.34s/it] {'loss': 0.4797, 'grad_norm': 0.33072579076807246, 'learning_rate': 4.771408349634103e-06, 'epoch': 0.53}
 53%|█████▎ | 11693/22095 [19:58:38<19:13:51, 6.66s/it] {'loss': 0.4725, 'grad_norm': 0.3321147784678341, 'learning_rate': 4.770676197437971e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 11:56:37.683653 load time: 1146.64 ms
 53%|█████▎ | 11694/22095 [19:58:42<16:28:44, 5.70s/it] {'loss': 0.3207, 'grad_norm': 0.6858860519788593, 'learning_rate': 4.769944050169303e-06, 'epoch': 0.53}
 53%|█████▎ | 11695/22095 [19:58:46<14:39:19, 5.07s/it] {'loss': 0.3649, 'grad_norm': 0.6174437330371626, 'learning_rate': 4.769211907843833e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 11:56:45.254235 load time: 1039.76 ms
 53%|█████▎ | 11696/22095 [19:58:49<13:41:44, 4.74s/it] {'loss': 0.32, 'grad_norm': 0.6631526550644542, 'learning_rate': 4.768479770477287e-06, 'epoch': 0.53}
 53%|█████▎ | 11697/22095 [19:58:53<12:58:31, 4.49s/it] {'loss': 0.3154, 'grad_norm': 0.6050793631720137, 'learning_rate': 4.767747638085402e-06, 'epoch': 0.53}
 53%|█████▎ | 11698/22095 [19:58:58<13:01:16, 4.51s/it] {'loss': 0.3788, 'grad_norm': 0.6513012927299898, 'learning_rate': 4.767015510683906e-06, 'epoch': 0.53}
 53%|█████▎ | 11699/22095 [19:59:02<12:21:50, 4.28s/it] {'loss': 0.3555, 'grad_norm': 0.7069803491491897, 'learning_rate': 4.766283388288532e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (81526 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11700/22095 [19:59:05<11:46:58, 4.08s/it] {'loss': 0.3533, 'grad_norm': 0.6905703982268144, 'learning_rate': 4.765551270915008e-06, 'epoch': 0.53}
 53%|█████▎ | 11701/22095 [19:59:09<11:07:48, 3.85s/it] {'loss': 0.2813, 'grad_norm': 0.6164310287508942, 'learning_rate': 4.764819158579069e-06, 'epoch': 0.53}
 53%|█████▎ | 11702/22095 [19:59:13<11:21:30, 3.93s/it] {'loss': 0.3171, 'grad_norm': 0.6334263773508509, 'learning_rate': 4.764087051296445e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_17/images/20250417135943.png 2025-08-28 11:57:11.543493 load time: 1330.45 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_1/images/step_0.png 2025-08-28 11:57:12.629930 load time: 1245.54 ms
 53%|█████▎ | 11703/22095 [19:59:22<16:22:05, 5.67s/it] {'loss': 0.4844, 'grad_norm': 0.4756404789746504, 'learning_rate': 4.763354949082864e-06, 'epoch': 0.53}
 53%|█████▎ | 11704/22095 [19:59:26<14:16:37, 4.95s/it] {'loss': 0.3588, 'grad_norm': 0.8110841726189192, 'learning_rate': 4.762622851954058e-06, 'epoch': 0.53}
 53%|█████▎ | 11705/22095 [19:59:29<12:52:26, 4.46s/it] {'loss': 0.3587, 'grad_norm': 0.6715038490678876, 'learning_rate': 4.761890759925759e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11706/22095 [19:59:36<15:14:31, 5.28s/it] {'loss': 0.4792, 'grad_norm': 0.29191115580705324, 'learning_rate': 4.761158673013696e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396950 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
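The recurring "Rank 0: Number of image tokens ... does not match number of images" / "Fixed image tokens in the conversation" pairs indicate conversations where the count of image placeholders disagrees with the number of attached images, which the loader then patches up before tokenization. A minimal reconciliation sketch (the `<image>` placeholder string and the fix-up policy are assumptions; the log does not show the actual repair code):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder string for one attached image

def fix_image_tokens(text, num_images):
    """Make the number of IMAGE_TOKEN placeholders in text match num_images.

    Missing placeholders are prepended; surplus ones are dropped from the
    end, mirroring the spirit of the 'Fixed image tokens' log message.
    """
    count = text.count(IMAGE_TOKEN)
    if count < num_images:
        # e.g. "Number of image tokens 0 does not match number of images 1"
        text = (IMAGE_TOKEN + "\n") * (num_images - count) + text
    elif count > num_images:
        # e.g. "Number of image tokens 2 does not match number of images 1"
        parts = text.split(IMAGE_TOKEN)
        text = IMAGE_TOKEN.join(parts[: num_images + 1]) + "".join(parts[num_images + 1:])
    return text

print(fix_image_tokens("Describe the screenshot.", 1).count(IMAGE_TOKEN))  # -> 1
```

Reconciling the placeholder count up front keeps the vision features and the text token stream aligned, so the model never receives an image without a slot to attend from (or a slot without an image).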
Problematic sample: {'id': 63803, 'image': 'vrdu_table_final_2/astro-ph.EP/61ed9aeb-1111-40ba-8513-054c0e92e10f.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$\\phi$\\end{tabular}\n```"}]}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f745714e97855ab35e3205169098c5b789a17f11abb9edfdf9df2aa11513519c.png 2025-08-28 11:57:35.545666 load time: 1010.14 ms
 53%|█████▎ | 11707/22095 [19:59:40<13:46:37, 4.77s/it] {'loss': 0.342, 'grad_norm': 0.67648546466302, 'learning_rate': 4.7604265912336e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (86119 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48633 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104361 > 40960).
Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11708/22095 [19:59:48<16:45:22, 5.81s/it] {'loss': 0.4784, 'grad_norm': 0.3268432712087116, 'learning_rate': 4.759694514601201e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8401746 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3910, 'image': 'vrdu_table_final_2/astro-ph.CO/bab9bd78-a02e-4097-a49e-5544398324c3.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
 53%|█████▎ | 11709/22095 [19:59:52<15:18:38, 5.31s/it] {'loss': 0.304, 'grad_norm': 0.6779791178983064, 'learning_rate': 4.758962443132227e-06, 'epoch': 0.53}
 53%|█████▎ | 11710/22095 [19:59:56<13:57:10, 4.84s/it] {'loss': 0.3333, 'grad_norm': 0.6503922808669877, 'learning_rate': 4.75823037684241e-06, 'epoch': 0.53}
 53%|█████▎ | 11711/22095 [19:59:59<12:23:35, 4.30s/it] {'loss': 0.3482, 'grad_norm': 0.6275606411625194, 'learning_rate': 4.757498315747482e-06, 'epoch': 0.53}
 53%|█████▎ | 11712/22095 [20:00:02<11:06:08, 3.85s/it] {'loss': 0.3182, 'grad_norm': 0.6126662598486359, 'learning_rate': 4.756766259863169e-06, 'epoch': 0.53}
 53%|█████▎ | 11713/22095 [20:00:05<10:31:52, 3.65s/it] {'loss': 0.3135, 'grad_norm': 0.604527940770792, 'learning_rate': 4.756034209205201e-06, 'epoch': 0.53}
 53%|█████▎ | 11714/22095 [20:00:08<9:59:11, 3.46s/it] {'loss': 0.2854, 'grad_norm': 0.6128021844596666, 'learning_rate': 4.75530216378931e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_4/images/step_0.png 2025-08-28 11:58:06.791545 load time: 1178.29 ms
 53%|█████▎ | 11715/22095 [20:00:11<9:59:00, 3.46s/it] {'loss': 0.313, 'grad_norm': 0.6237930727497758, 'learning_rate': 4.754570123631224e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914855 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38008, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [692, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8525368 in VC:s3://internvl-moe-sft-data/. Exception: Image size [692, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 132299, 'image': 'vrdu_texteq/astro-ph.CO/d6fb9357-a10b-4751-a773-5e7c8b221a16.png', 'image_wh': [[692, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'The accretion rate of a BH of mass $M_{\\rm PBH}$ can be cast as'}]}
 53%|█████▎ | 11716/22095 [20:00:17<12:01:13, 4.17s/it] {'loss': 0.468, 'grad_norm': 0.41318223344137056, 'learning_rate': 4.753838088746672e-06, 'epoch': 0.53}
 53%|█████▎ | 11717/22095 [20:00:21<11:16:50, 3.91s/it] {'loss': 0.3176, 'grad_norm': 0.6638863894739259, 'learning_rate': 4.753106059151382e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11718/22095 [20:00:28<13:59:09, 4.85s/it] {'loss': 0.4704, 'grad_norm': 0.32642953381378825, 'learning_rate': 4.752374034861088e-06, 'epoch': 0.53}
 53%|█████▎ | 11719/22095 [20:00:31<12:46:42, 4.43s/it] {'loss': 0.3375, 'grad_norm': 0.621866477687391, 'learning_rate': 4.7516420158915115e-06, 'epoch': 0.53}
 53%|█████▎ | 11720/22095 [20:00:35<12:17:18, 4.26s/it] {'loss': 0.3308, 'grad_norm': 0.6712285445401825, 'learning_rate': 4.750910002258387e-06, 'epoch': 0.53}
 53%|█████▎ | 11721/22095 [20:00:38<11:21:43, 3.94s/it] {'loss': 0.293, 'grad_norm': 0.7017424575427466, 'learning_rate': 4.750177993977442e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (108984 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11722/22095 [20:00:42<11:00:37, 3.82s/it] {'loss': 0.3111, 'grad_norm': 0.7136507819466259, 'learning_rate': 4.7494459910644044e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 53%|█████▎ | 11723/22095 [20:00:45<10:29:03, 3.64s/it] {'loss': 0.3338, 'grad_norm': 0.6721779070751936, 'learning_rate': 4.7487139935350015e-06, 'epoch': 0.53}
 53%|█████▎ | 11724/22095 [20:00:48<9:50:41, 3.42s/it] {'loss': 0.2902, 'grad_norm': 0.6295776044002247, 'learning_rate': 4.747982001404965e-06, 'epoch': 0.53}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/d639d36cbeb936c568d59407a247a94e9887ea96c62cca7a213fb2c303210be1.png 2025-08-28 11:58:45.532039 load time: 1415.66 ms
 53%|█████▎ | 11725/22095 [20:00:51<9:27:46, 3.29s/it] {'loss': 0.3294, 'grad_norm': 0.64717952938952, 'learning_rate': 4.7472500146900206e-06, 'epoch': 0.53}
 53%|█████▎ | 11726/22095 [20:00:54<9:25:56, 3.27s/it] {'loss': 0.3328, 'grad_norm': 0.6384251289927297, 'learning_rate': 4.746518033405897e-06, 'epoch': 0.53}
 53%|█████▎ | 11727/22095 [20:00:57<9:21:13, 3.25s/it] {'loss': 0.3558, 'grad_norm': 0.6299181902472535, 'learning_rate': 4.745786057568324e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (44076 > 40960). Running this sequence through the model will result in indexing errors
 53%|█████▎ | 11728/22095 [20:01:01<9:41:45, 3.37s/it] {'loss': 0.4017, 'grad_norm': 0.6467058997655225, 'learning_rate': 4.745054087193025e-06, 'epoch': 0.53}
 53%|█████▎ | 11729/22095 [20:01:04<9:13:00, 3.20s/it] {'loss': 0.3431, 'grad_norm': 0.6051062425231508, 'learning_rate': 4.744322122295732e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 53%|█████▎ | 11730/22095 [20:01:11<12:59:48, 4.51s/it] {'loss': 0.503, 'grad_norm': 0.4436395210006455, 'learning_rate': 4.743590162892171e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (45641 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103076 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72942 > 40960).
Running this sequence through the model will result in indexing errors 53%|█████▎ | 11731/22095 [20:01:15<12:05:37, 4.20s/it] {'loss': 0.3252, 'grad_norm': 0.7358701096074631, 'learning_rate': 4.742858208998072e-06, 'epoch': 0.53} 53%|█████▎ | 11731/22095 [20:01:15<12:05:37, 4.20s/it] 53%|█████▎ | 11732/22095 [20:01:18<10:57:14, 3.81s/it] {'loss': 0.284, 'grad_norm': 0.5952948832084214, 'learning_rate': 4.742126260629158e-06, 'epoch': 0.53} 53%|█████▎ | 11732/22095 [20:01:18<10:57:14, 3.81s/it] 53%|█████▎ | 11733/22095 [20:01:21<10:38:59, 3.70s/it] {'loss': 0.336, 'grad_norm': 0.6126645832337427, 'learning_rate': 4.741394317801158e-06, 'epoch': 0.53} 53%|█████▎ | 11733/22095 [20:01:21<10:38:59, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71987 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45036 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11734/22095 [20:01:25<10:29:52, 3.65s/it] {'loss': 0.3016, 'grad_norm': 0.5700785120288356, 'learning_rate': 4.740662380529802e-06, 'epoch': 0.53} 53%|█████▎ | 11734/22095 [20:01:25<10:29:52, 3.65s/it] 53%|█████▎ | 11735/22095 [20:01:27<9:44:44, 3.39s/it] {'loss': 0.2852, 'grad_norm': 0.6028915201145363, 'learning_rate': 4.739930448830814e-06, 'epoch': 0.53} 53%|█████▎ | 11735/22095 [20:01:27<9:44:44, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (129107 > 40960). 
Running this sequence through the model will result in indexing errors VC:s3://gui-agent/data_20250612/mac/images/calculator/49c94f6a-b29e-4461-8f3c-7265418d21d1/images/step_3.png 2025-08-28 11:59:26.156407 load time: 1107.61 ms 53%|█████▎ | 11736/22095 [20:01:32<11:05:02, 3.85s/it] {'loss': 0.4607, 'grad_norm': 0.29438188912144864, 'learning_rate': 4.739198522719922e-06, 'epoch': 0.53} 53%|█████▎ | 11736/22095 [20:01:32<11:05:02, 3.85s/it] 53%|█████▎ | 11737/22095 [20:01:37<11:34:50, 4.02s/it] {'loss': 0.3175, 'grad_norm': 0.6169329863522592, 'learning_rate': 4.738466602212854e-06, 'epoch': 0.53} 53%|█████▎ | 11737/22095 [20:01:37<11:34:50, 4.02s/it] 53%|█████▎ | 11738/22095 [20:01:41<11:32:58, 4.01s/it] {'loss': 0.3394, 'grad_norm': 0.6408398342033451, 'learning_rate': 4.737734687325332e-06, 'epoch': 0.53} 53%|█████▎ | 11738/22095 [20:01:41<11:32:58, 4.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11739/22095 [20:01:50<16:19:37, 5.68s/it] {'loss': 0.4713, 'grad_norm': 0.2797451455880777, 'learning_rate': 4.737002778073089e-06, 'epoch': 0.53} 53%|█████▎ | 11739/22095 [20:01:50<16:19:37, 5.68s/it] 53%|█████▎ | 11740/22095 [20:01:54<14:16:41, 4.96s/it] {'loss': 0.3008, 'grad_norm': 0.8173420620683368, 'learning_rate': 4.736270874471849e-06, 'epoch': 0.53} 53%|█████▎ | 11740/22095 [20:01:54<14:16:41, 4.96s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39066.png 2025-08-28 11:59:53.678169 load time: 1074.02 ms 53%|█████▎ | 11741/22095 [20:01:57<12:44:21, 4.43s/it] {'loss': 0.3178, 'grad_norm': 0.6721444700649927, 'learning_rate': 4.735538976537336e-06, 'epoch': 0.53} 53%|█████▎ | 11741/22095 [20:01:57<12:44:21, 4.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11742/22095 [20:02:06<17:06:52, 5.95s/it] {'loss': 0.4832, 'grad_norm': 
0.2953942616732436, 'learning_rate': 4.734807084285278e-06, 'epoch': 0.53} 53%|█████▎ | 11742/22095 [20:02:06<17:06:52, 5.95s/it]VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250501_133654_4/images/before_screenshot_44_id_29_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 12:00:06.804369 load time: 1174.05 ms 53%|█████▎ | 11743/22095 [20:02:10<14:57:54, 5.20s/it] {'loss': 0.2878, 'grad_norm': 0.6173867128773916, 'learning_rate': 4.734075197731403e-06, 'epoch': 0.53} 53%|█████▎ | 11743/22095 [20:02:10<14:57:54, 5.20s/it] 53%|█████▎ | 11744/22095 [20:02:13<13:10:04, 4.58s/it] {'loss': 0.3174, 'grad_norm': 0.6106232963497055, 'learning_rate': 4.733343316891435e-06, 'epoch': 0.53} 53%|█████▎ | 11744/22095 [20:02:13<13:10:04, 4.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50474 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67981 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11745/22095 [20:02:17<12:22:39, 4.31s/it] {'loss': 0.3542, 'grad_norm': 0.6509230072417117, 'learning_rate': 4.7326114417811e-06, 'epoch': 0.53} 53%|█████▎ | 11745/22095 [20:02:17<12:22:39, 4.31s/it] 53%|█████▎ | 11746/22095 [20:02:19<11:08:33, 3.88s/it] {'loss': 0.2748, 'grad_norm': 0.627189292175373, 'learning_rate': 4.7318795724161214e-06, 'epoch': 0.53} 53%|█████▎ | 11746/22095 [20:02:19<11:08:33, 3.88s/it] 53%|█████▎ | 11747/22095 [20:02:23<10:56:11, 3.80s/it] {'loss': 0.3616, 'grad_norm': 0.7136115075309686, 'learning_rate': 4.731147708812232e-06, 'epoch': 0.53} 53%|█████▎ | 11747/22095 [20:02:23<10:56:11, 3.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67582 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51826 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47674 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11748/22095 [20:02:26<10:22:47, 3.61s/it] {'loss': 0.3701, 'grad_norm': 0.6629747069512445, 'learning_rate': 4.730415850985149e-06, 'epoch': 0.53} 53%|█████▎ | 11748/22095 [20:02:26<10:22:47, 3.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 280, in load s = read(self.decodermaxblock) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 989, in load_read cid, pos, length = self.png.read() File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 173, in read length = i32(s) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/_binary.py", line 95, in i32be return unpack_from(">I", c, o)[0] struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0) The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item ) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in ) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified cur_len if ("image" in sample) or ("video" in sample) else -cur_len File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image return tcs_loader(image_path) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ img = self.pil_loader(img_value_str) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader return img.convert("RGB") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 287, in load raise OSError(msg) from e OSError: image file is truncated [Try #0] Failed to fetch sample 329452 in VC:s3://gui/aguvis/aguvis-stage1/omniact/images. 
Exception: image file is truncated Problematic sample: {'image': 'train_1500.png', 'conversations': [{'from': 'human', 'value': '\nDistribute the present map location.'}, {'from': 'gpt', 'value': '\nclick(x=0.9646, y=0.0194)\n'}]} 53%|█████▎ | 11749/22095 [20:02:30<10:50:26, 3.77s/it] {'loss': 0.3288, 'grad_norm': 0.6906180915021461, 'learning_rate': 4.729683998950602e-06, 'epoch': 0.53} 53%|█████▎ | 11749/22095 [20:02:30<10:50:26, 3.77s/it] 53%|█████▎ | 11750/22095 [20:02:34<10:49:24, 3.77s/it] {'loss': 0.3461, 'grad_norm': 0.5979704596044886, 'learning_rate': 4.728952152724317e-06, 'epoch': 0.53} 53%|█████▎ | 11750/22095 [20:02:34<10:49:24, 3.77s/it] 53%|█████▎ | 11751/22095 [20:02:38<10:45:52, 3.75s/it] {'loss': 0.3514, 'grad_norm': 0.6733068733347735, 'learning_rate': 4.728220312322017e-06, 'epoch': 0.53} 53%|█████▎ | 11751/22095 [20:02:38<10:45:52, 3.75s/it] 53%|█████▎ | 11752/22095 [20:02:42<11:06:14, 3.86s/it] {'loss': 0.3166, 'grad_norm': 0.6288131092820356, 'learning_rate': 4.7274884777594265e-06, 'epoch': 0.53} 53%|█████▎ | 11752/22095 [20:02:42<11:06:14, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74865 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49453 > 40960). 
Running this sequence through the model will result in indexing errors 53%|█████▎ | 11753/22095 [20:02:52<16:35:15, 5.77s/it] {'loss': 0.4755, 'grad_norm': 0.34938270335009647, 'learning_rate': 4.726756649052274e-06, 'epoch': 0.53} 53%|█████▎ | 11753/22095 [20:02:52<16:35:15, 5.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item return sources ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8914673 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37826, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 8\nB. 16\nC. 2\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 53%|█████▎ | 11754/22095 [20:02:56<14:37:36, 5.09s/it] {'loss': 0.2964, 'grad_norm': 0.8193599994439758, 'learning_rate': 4.726024826216281e-06, 'epoch': 0.53} 53%|█████▎ | 11754/22095 [20:02:56<14:37:36, 5.09s/it] 53%|█████▎ | 11755/22095 [20:02:59<13:29:01, 4.69s/it] {'loss': 0.2954, 'grad_norm': 0.6055021071486388, 'learning_rate': 4.725293009267173e-06, 'epoch': 0.53} 53%|█████▎ | 11755/22095 [20:02:59<13:29:01, 4.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11756/22095 [20:03:07<15:45:03, 5.48s/it] {'loss': 0.4802, 'grad_norm': 0.31638532661399055, 'learning_rate': 4.724561198220672e-06, 'epoch': 0.53} 53%|█████▎ | 11756/22095 [20:03:07<15:45:03, 5.48s/it] 53%|█████▎ | 11757/22095 [20:03:11<14:17:33, 4.98s/it] {'loss': 0.3484, 'grad_norm': 0.6145143406936915, 'learning_rate': 4.7238293930925085e-06, 'epoch': 0.53} 53%|█████▎ | 11757/22095 [20:03:11<14:17:33, 4.98s/it] 53%|█████▎ | 11758/22095 [20:03:13<12:27:56, 4.34s/it] {'loss': 0.3007, 'grad_norm': 1.0571107759072538, 'learning_rate': 4.723097593898402e-06, 'epoch': 0.53} 53%|█████▎ | 11758/22095 [20:03:13<12:27:56, 4.34s/it] 53%|█████▎ | 11759/22095 [20:03:17<12:14:08, 4.26s/it] {'loss': 0.3161, 'grad_norm': 0.6978453898443266, 'learning_rate': 4.7223658006540775e-06, 'epoch': 0.53} 53%|█████▎ | 11759/22095 [20:03:17<12:14:08, 4.26s/it] 53%|█████▎ | 11760/22095 [20:03:21<11:51:05, 4.13s/it] {'loss': 0.3334, 'grad_norm': 0.7433654020085223, 'learning_rate': 4.7216340133752604e-06, 'epoch': 0.53} 53%|█████▎ | 11760/22095 [20:03:21<11:51:05, 4.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 53%|█████▎ | 11761/22095 [20:03:30<15:43:20, 5.48s/it] {'loss': 0.4618, 'grad_norm': 0.30745122387587526, 'learning_rate': 4.720902232077671e-06, 'epoch': 0.53} 53%|█████▎ | 11761/22095 [20:03:30<15:43:20, 5.48s/it]Token indices sequence length is 
longer than the specified maximum sequence length for this model (49242 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46509 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11762/22095 [20:03:33<14:01:25, 4.89s/it] {'loss': 0.2957, 'grad_norm': 1.7468362884445348, 'learning_rate': 4.720170456777036e-06, 'epoch': 0.53} 53%|█████▎ | 11762/22095 [20:03:33<14:01:25, 4.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (114287 > 40960). Running this sequence through the model will result in indexing errors 53%|█████▎ | 11763/22095 [20:03:38<13:38:53, 4.76s/it] {'loss': 0.3068, 'grad_norm': 0.5927317587947376, 'learning_rate': 4.719438687489081e-06, 'epoch': 0.53} 53%|█████▎ | 11763/22095 [20:03:38<13:38:53, 4.76s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 12:01:37.991760 load time: 1140.23 ms 53%|█████▎ | 11764/22095 [20:03:42<12:41:10, 4.42s/it] {'loss': 0.3427, 'grad_norm': 0.773194991169054, 'learning_rate': 4.718706924229525e-06, 'epoch': 0.53} 53%|█████▎ | 11764/22095 [20:03:42<12:41:10, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80596 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85521 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47826 > 40960). 
Running this sequence through the model will result in indexing errors 53%|█████▎ | 11765/22095 [20:03:45<11:37:12, 4.05s/it] {'loss': 0.303, 'grad_norm': 0.6088699127754879, 'learning_rate': 4.7179751670140936e-06, 'epoch': 0.53} 53%|█████▎ | 11765/22095 [20:03:45<11:37:12, 4.05s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30592.png 2025-08-28 12:01:44.633518 load time: 1519.09 ms 53%|█████▎ | 11766/22095 [20:03:48<11:03:03, 3.85s/it] {'loss': 0.3374, 'grad_norm': 0.6149492255144813, 'learning_rate': 4.717243415858511e-06, 'epoch': 0.53} 53%|█████▎ | 11766/22095 [20:03:48<11:03:03, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74689 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50249 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95913 > 40960). 
Running this sequence through the model will result in indexing errors 53%|█████▎ | 11767/22095 [20:03:51<10:19:14, 3.60s/it] {'loss': 0.3539, 'grad_norm': 0.6561851623753537, 'learning_rate': 4.716511670778496e-06, 'epoch': 0.53} 53%|█████▎ | 11767/22095 [20:03:51<10:19:14, 3.60s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_3/images/20250417135851.png 2025-08-28 12:01:50.745646 load time: 1174.28 ms 53%|█████▎ | 11768/22095 [20:03:54<9:37:14, 3.35s/it] {'loss': 0.3041, 'grad_norm': 0.6043910413103322, 'learning_rate': 4.715779931789776e-06, 'epoch': 0.53} 53%|█████▎ | 11768/22095 [20:03:54<9:37:14, 3.35s/it] 53%|█████▎ | 11769/22095 [20:03:57<9:06:17, 3.17s/it] {'loss': 0.2939, 'grad_norm': 0.6451846146810193, 'learning_rate': 4.715048198908074e-06, 'epoch': 0.53} 53%|█████▎ | 11769/22095 [20:03:57<9:06:17, 3.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250414/mac/images/right_click_handmade/handmade_annotation_3/images/5ed781f773f9eecedfcca9027dc000c.png 2025-08-28 12:01:56.500014 load time: 1011.94 ms 53%|█████▎ | 11770/22095 [20:04:04<12:55:57, 4.51s/it] {'loss': 0.509, 'grad_norm': 0.3657427690358148, 'learning_rate': 4.7143164721491095e-06, 'epoch': 0.53} 53%|█████▎ | 11770/22095 [20:04:04<12:55:57, 4.51s/it] 53%|█████▎ | 11771/22095 [20:04:11<14:37:01, 5.10s/it] {'loss': 0.4882, 'grad_norm': 0.3300517167423059, 'learning_rate': 4.713584751528605e-06, 'epoch': 0.53} 53%|█████▎ | 11771/22095 [20:04:11<14:37:01, 5.10s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 53%|█████▎ | 11772/22095 [20:04:14<13:06:03, 4.57s/it] {'loss': 0.3101, 'grad_norm': 0.8410660502421503, 'learning_rate': 4.712853037062286e-06, 'epoch': 0.53} 53%|█████▎ | 11772/22095 [20:04:14<13:06:03, 4.57s/it] 53%|█████▎ | 11773/22095 [20:04:24<17:17:23, 6.03s/it] {'loss': 0.4687, 'grad_norm': 0.3151118849340022, 'learning_rate': 4.712121328765875e-06, 'epoch': 0.53} 
Invalidate trace cache @ step 2: expected module 364, but got module 1
53%|█████▎ | 11774/22095 [20:04:27<14:55:14, 5.20s/it] {'loss': 0.3459, 'grad_norm': 0.6545005536550695, 'learning_rate': 4.71138962665509e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
53%|█████▎ | 11775/22095 [20:04:36<17:58:35, 6.27s/it] {'loss': 0.481, 'grad_norm': 0.30887497244294765, 'learning_rate': 4.710657930745656e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
53%|█████▎ | 11776/22095 [20:04:39<15:18:06, 5.34s/it] {'loss': 0.3226, 'grad_norm': 0.6612389703396073, 'learning_rate': 4.709926241053296e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (75207 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79070 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42768 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54915 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43387 > 40960). Running this sequence through the model will result in indexing errors
53%|█████▎ | 11777/22095 [20:04:42<13:50:13, 4.83s/it] {'loss': 0.3174, 'grad_norm': 0.6661687889059671, 'learning_rate': 4.709194557593729e-06, 'epoch': 0.53}
53%|█████▎ | 11778/22095 [20:04:46<13:06:14, 4.57s/it] {'loss': 0.3221, 'grad_norm': 0.6595961119315379, 'learning_rate': 4.708462880382677e-06, 'epoch': 0.53}
53%|█████▎ | 11779/22095 [20:04:50<12:37:58, 4.41s/it] {'loss': 0.2952, 'grad_norm': 0.7566412686848811, 'learning_rate': 4.707731209435864e-06, 'epoch': 0.53}
53%|█████▎ | 11780/22095 [20:04:54<12:05:11, 4.22s/it] {'loss': 0.3253, 'grad_norm': 0.6319506838900698, 'learning_rate': 4.706999544769009e-06, 'epoch': 0.53}
53%|█████▎ | 11781/22095 [20:04:57<10:56:17, 3.82s/it] {'loss': 0.3122, 'grad_norm': 0.5638834312022738, 'learning_rate': 4.706267886397833e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 12:02:54.995737 load time: 1016.13 ms
53%|█████▎ | 11782/22095 [20:05:00<10:10:58, 3.55s/it] {'loss': 0.3171, 'grad_norm': 0.570363300428756, 'learning_rate': 4.705536234338059e-06, 'epoch': 0.53}
VC:s3://gui-agent/mind2web_train/images/c7b0d1bc-2a0c-4060-92dd-cd4b8721b625/images/0.png 2025-08-28 12:03:00.190269 load time: 1152.06 ms
53%|█████▎ | 11783/22095 [20:05:04<10:25:06, 3.64s/it] {'loss': 0.32, 'grad_norm': 0.6580123748014653, 'learning_rate': 4.704804588605407e-06, 'epoch': 0.53}
53%|█████▎ | 11784/22095 [20:05:07<9:57:42, 3.48s/it] {'loss': 0.3385, 'grad_norm': 1.0069993479241546, 'learning_rate': 4.704072949215598e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
53%|█████▎ | 11785/22095 [20:05:16<15:12:52, 5.31s/it] {'loss': 0.4799, 'grad_norm': 0.43109927661137915, 'learning_rate': 4.703341316184351e-06, 'epoch': 0.53}
53%|█████▎ | 11786/22095 [20:05:20<14:06:02, 4.92s/it] {'loss': 0.3446, 'grad_norm': 0.6811949904047957, 'learning_rate': 4.702609689527389e-06, 'epoch': 0.53}
53%|█████▎ | 11787/22095 [20:05:25<13:23:37, 4.68s/it] {'loss': 0.386, 'grad_norm': 0.6131719461888363, 'learning_rate': 4.701878069260432e-06, 'epoch': 0.53}
53%|█████▎ | 11788/22095 [20:05:28<12:02:51, 4.21s/it] {'loss': 0.3326, 'grad_norm': 0.6740418379070753, 'learning_rate': 4.701146455399198e-06, 'epoch': 0.53}
53%|█████▎ | 11789/22095 [20:05:31<11:10:35, 3.90s/it] {'loss': 0.3025, 'grad_norm': 0.6461944062048777, 'learning_rate': 4.7004148479594114e-06, 'epoch': 0.53}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (108952500 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
VC:s3://gui-agent/data_20250616/windows_paste/images/excel/20250502_100758_5/images/before_screenshot_20_id_65_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 12:03:30.481989 load time: 1145.29 ms
53%|█████▎ | 11790/22095 [20:05:34<10:30:40, 3.67s/it] {'loss': 0.3169, 'grad_norm': 0.6409431886352649, 'learning_rate': 4.699683246956787e-06, 'epoch': 0.53}
53%|█████▎ | 11791/22095 [20:05:37<9:57:25, 3.48s/it] {'loss': 0.3526, 'grad_norm': 0.7155276422903248, 'learning_rate': 4.698951652407048e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250506_150548_1/images/before_screenshot_28_id_119_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 12:03:35.156392 load time: 1000.49 ms
Token indices sequence length is longer than the specified maximum sequence length for this model (43909 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50152 > 40960). Running this sequence through the model will result in indexing errors
53%|█████▎ | 11792/22095 [20:05:40<9:53:44, 3.46s/it] {'loss': 0.3639, 'grad_norm': 0.6465783751741991, 'learning_rate': 4.698220064325915e-06, 'epoch': 0.53}
53%|█████▎ | 11793/22095 [20:05:44<10:11:15, 3.56s/it] {'loss': 0.3421, 'grad_norm': 0.6097472048185627, 'learning_rate': 4.697488482729105e-06, 'epoch': 0.53}
53%|█████▎ | 11794/22095 [20:05:49<11:16:46, 3.94s/it] {'loss': 0.3602, 'grad_norm': 0.6409515693192006, 'learning_rate': 4.696756907632336e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8579962 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7170, 'image': '764106120.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this an exam preparation book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comics book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
53%|█████▎ | 11795/22095 [20:05:53<11:16:56, 3.94s/it] {'loss': 0.3309, 'grad_norm': 0.599249241356723, 'learning_rate': 4.6960253390513346e-06, 'epoch': 0.53}
53%|█████▎ | 11796/22095 [20:05:56<10:39:30, 3.73s/it] {'loss': 0.3502, 'grad_norm': 0.6196382148914346, 'learning_rate': 4.6952937770018105e-06, 'epoch': 0.53}
Token indices sequence length is longer than the specified maximum sequence length for this model (64803 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52848 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73419 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57081 > 40960). Running this sequence through the model will result in indexing errors
53%|█████▎ | 11797/22095 [20:05:59<9:57:43, 3.48s/it] {'loss': 0.2808, 'grad_norm': 0.6316758562418318, 'learning_rate': 4.694562221499489e-06, 'epoch': 0.53}
53%|█████▎ | 11798/22095 [20:06:02<9:29:52, 3.32s/it] {'loss': 0.3623, 'grad_norm': 0.6741241563752027, 'learning_rate': 4.693830672560089e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_195453_5/images/before_screenshot_52_id_111_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 12:04:01.563465 load time: 1335.3 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/app_music_1/images/step_0.png 2025-08-28 12:04:02.346304 load time: 1016.46 ms
53%|█████▎ | 11799/22095 [20:06:06<9:39:19, 3.38s/it] {'loss': 0.3471, 'grad_norm': 0.587340740877701, 'learning_rate': 4.6930991301993255e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/terminal_1/images/step_2.png 2025-08-28 12:04:04.415360 load time: 1039.3 ms
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/mail_1/images/step_0.png 2025-08-28 12:04:04.413120 load time: 1082.85 ms
53%|█████▎ | 11800/22095 [20:06:15<14:28:44, 5.06s/it] {'loss': 0.4839, 'grad_norm': 0.32964651554241226, 'learning_rate': 4.692367594432919e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250616/windows_paste/images/ppt/20250504_153105_5/images/before_screenshot_50_id_139_function_0_crop_0_grounding_instructions_random_paste.png 2025-08-28 12:04:14.149603 load time: 1049.16 ms
53%|█████▎ | 11801/22095 [20:06:19<13:37:21, 4.76s/it] {'loss': 0.3331, 'grad_norm': 0.6854707444673713, 'learning_rate': 4.6916360652765876e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
53%|█████▎ | 11802/22095 [20:06:23<13:02:37, 4.56s/it] {'loss': 0.3483, 'grad_norm': 0.7373786452883535, 'learning_rate': 4.690904542746052e-06, 'epoch': 0.53}
53%|█████▎ | 11803/22095 [20:06:26<11:41:50, 4.09s/it] {'loss': 0.3411, 'grad_norm': 0.6466194356635377, 'learning_rate': 4.690173026857028e-06, 'epoch': 0.53}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_98/img/step_0.png 2025-08-28 12:04:26.136809 load time: 1261.01 ms
53%|█████▎ | 11804/22095 [20:06:30<11:30:52, 4.03s/it] {'loss': 0.3298, 'grad_norm': 0.6102879667747352, 'learning_rate': 4.689441517625232e-06, 'epoch': 0.53}
Invalidate trace cache @ step 2: expected module 1, but got module 364
53%|█████▎ | 11805/22095 [20:06:39<16:04:58, 5.63s/it] {'loss': 0.4822, 'grad_norm': 0.30553155491046163, 'learning_rate': 4.688710015066388e-06, 'epoch': 0.53}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [295, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8487033 in VC:s3://internvl-moe-sft-data/. Exception: Image size [295, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 89806, 'image': 'vrdu_texteq/astro-ph.CO/0ea2e771-8cfd-4909-9fce-aa6636e98972.png', 'image_wh': [[295, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'The distance kernel $W^{g}$:'}]}
53%|█████▎ | 11806/22095 [20:06:42<14:10:25, 4.96s/it] {'loss': 0.2957, 'grad_norm': 0.650611281210423, 'learning_rate': 4.687978519196205e-06, 'epoch': 0.53}
53%|█████▎ | 11807/22095 [20:06:47<13:40:01, 4.78s/it] {'loss': 0.3534, 'grad_norm': 0.5941169919812131, 'learning_rate': 4.687247030030409e-06, 'epoch': 0.53}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
53%|█████▎ | 11808/22095 [20:06:51<13:27:21, 4.71s/it] {'loss': 0.29, 'grad_norm': 0.6493684457101689, 'learning_rate': 4.686515547584713e-06, 'epoch': 0.53}
53%|█████▎ | 11809/22095 [20:06:54<11:45:36, 4.12s/it] {'loss': 0.3088, 'grad_norm': 0.6147351712741508, 'learning_rate': 4.685784071874835e-06, 'epoch': 0.53}
53%|█████▎ | 11810/22095 [20:06:57<11:08:46, 3.90s/it] {'loss': 0.3271, 'grad_norm': 0.6363524473472438, 'learning_rate': 4.68505260291649e-06, 'epoch': 0.53}
53%|█████▎ | 11811/22095 [20:07:00<10:20:28, 3.62s/it] {'loss': 0.3188, 'grad_norm': 0.618546025225323, 'learning_rate': 4.6843211407254e-06, 'epoch': 0.53}
53%|█████▎ | 11812/22095 [20:07:04<10:21:51, 3.63s/it] {'loss': 0.3589, 'grad_norm': 0.6070095148382589, 'learning_rate': 4.683589685317278e-06, 'epoch': 0.53}
53%|█████▎ | 11813/22095 [20:07:07<10:02:29, 3.52s/it] {'loss':
0.3006, 'grad_norm': 0.619057250383705, 'learning_rate': 4.682858236707842e-06, 'epoch': 0.53} 53%|█████▎ | 11813/22095 [20:07:07<10:02:29, 3.52s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item return sources ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8301545 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1ELryLpXXXXcIXFXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nIdentify text from the picture.'}, {'from': 'gpt', 'value': 'All words in the image:\n英格索兰正品气动隔膜泵中国区域总代理\n厂家电话\n13663713257\nARO进口气动隔膜泵\nBQG70/0.2隔膜泵\nBQG150/0.2隔膜泵\nBQG350/0.2隔膜泵\n广西桂林工具有限公司\nBQG450/0.2隔膜泵'}]} 53%|█████▎ | 11814/22095 [20:07:11<9:53:54, 3.47s/it] {'loss': 0.2924, 'grad_norm': 0.6694414833050002, 'learning_rate': 4.682126794912808e-06, 'epoch': 0.53} 53%|█████▎ | 11814/22095 [20:07:11<9:53:54, 3.47s/it] 53%|█████▎ | 11815/22095 [20:07:13<9:19:17, 3.26s/it] {'loss': 0.3283, 'grad_norm': 0.6538425008294642, 'learning_rate': 4.681395359947894e-06, 'epoch': 0.53} 53%|█████▎ | 11815/22095 [20:07:13<9:19:17, 3.26s/it] 53%|█████▎ | 11816/22095 [20:07:16<9:04:44, 3.18s/it] {'loss': 0.3126, 'grad_norm': 0.618884000810644, 'learning_rate': 4.680663931828815e-06, 'epoch': 0.53} 53%|█████▎ | 11816/22095 [20:07:16<9:04:44, 3.18s/it] 53%|█████▎ | 11817/22095 [20:07:19<8:51:46, 3.10s/it] {'loss': 0.3021, 'grad_norm': 0.6622986593486823, 'learning_rate': 4.679932510571286e-06, 'epoch': 0.53} 53%|█████▎ | 11817/22095 [20:07:19<8:51:46, 3.10s/it] 53%|█████▎ | 11818/22095 [20:07:22<8:40:26, 
3.04s/it] {'loss': 0.3415, 'grad_norm': 0.6046152483107878, 'learning_rate': 4.679201096191027e-06, 'epoch': 0.53} 53%|█████▎ | 11818/22095 [20:07:22<8:40:26, 3.04s/it] 53%|█████▎ | 11819/22095 [20:07:25<8:28:06, 2.97s/it] {'loss': 0.2878, 'grad_norm': 0.6457583281268079, 'learning_rate': 4.6784696887037475e-06, 'epoch': 0.53} 53%|█████▎ | 11819/22095 [20:07:25<8:28:06, 2.97s/it] 53%|█████▎ | 11820/22095 [20:07:28<8:18:08, 2.91s/it] {'loss': 0.3254, 'grad_norm': 0.7731153771800899, 'learning_rate': 4.6777382881251695e-06, 'epoch': 0.53} 53%|█████▎ | 11820/22095 [20:07:28<8:18:08, 2.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118277 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44167 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79568 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79945 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▎ | 11821/22095 [20:07:31<8:38:25, 3.03s/it] {'loss': 0.3089, 'grad_norm': 0.5818533268500036, 'learning_rate': 4.677006894471006e-06, 'epoch': 0.54} 54%|█████▎ | 11821/22095 [20:07:31<8:38:25, 3.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_212959/images/step_0_id_22_function_2_crop_0_grounding_instructions_random_paste.png 2025-08-28 12:05:30.942269 load time: 1174.22 ms 54%|█████▎ | 11822/22095 [20:07:34<8:37:22, 3.02s/it] {'loss': 0.3404, 'grad_norm': 0.6437935399271492, 'learning_rate': 4.676275507756972e-06, 'epoch': 0.54} 54%|█████▎ | 11822/22095 [20:07:34<8:37:22, 3.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50891 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94732 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▎ | 11823/22095 [20:07:37<8:40:16, 3.04s/it] {'loss': 0.3071, 'grad_norm': 0.6062609675532306, 'learning_rate': 4.6755441279987815e-06, 'epoch': 0.54} 54%|█████▎ | 11823/22095 [20:07:37<8:40:16, 3.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (137589 > 40960). 
Running this sequence through the model will result in indexing errors VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 12:05:36.994995 load time: 1143.9 ms 54%|█████▎ | 11824/22095 [20:07:40<8:43:41, 3.06s/it] {'loss': 0.3151, 'grad_norm': 0.6875800441401139, 'learning_rate': 4.674812755212154e-06, 'epoch': 0.54} 54%|█████▎ | 11824/22095 [20:07:40<8:43:41, 3.06s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item return sources ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398232 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 383, 'image': 'vrdu_table_final_2/astro-ph.CO/22b3fb85-a525-49bc-8fa7-7a82926820a6.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]} 54%|█████▎ | 11825/22095 [20:07:44<8:59:00, 3.15s/it] {'loss': 0.3227, 'grad_norm': 0.6476206078752045, 'learning_rate': 4.674081389412799e-06, 'epoch': 0.54} 54%|█████▎ | 11825/22095 [20:07:44<8:59:00, 3.15s/it] 54%|█████▎ | 11826/22095 [20:07:47<9:27:17, 3.31s/it] {'loss': 0.3612, 'grad_norm': 0.7150470473074529, 'learning_rate': 4.673350030616435e-06, 'epoch': 0.54} 54%|█████▎ | 11826/22095 [20:07:47<9:27:17, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99798 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77670 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▎ | 11827/22095 [20:07:52<10:08:07, 3.55s/it] {'loss': 0.3064, 'grad_norm': 0.8702744022584356, 'learning_rate': 4.6726186788387745e-06, 'epoch': 0.54} 54%|█████▎ | 11827/22095 [20:07:52<10:08:07, 3.55s/it] 54%|█████▎ | 11828/22095 [20:07:55<9:42:57, 3.41s/it] {'loss': 0.3063, 'grad_norm': 0.6838788147840302, 'learning_rate': 4.671887334095537e-06, 'epoch': 0.54} 54%|█████▎ | 11828/22095 [20:07:55<9:42:57, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▎ | 11829/22095 [20:08:02<13:34:32, 4.76s/it] {'loss': 0.4708, 'grad_norm': 0.3631132216259713, 'learning_rate': 4.671155996402429e-06, 'epoch': 0.54} 54%|█████▎ | 11829/22095 [20:08:03<13:34:32, 4.76s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item return sources ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8408765 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10959, 'image': 'vrdu_table_final_2/astro-ph.CO/35b64eee-535b-4415-9d92-b80c6ad94416.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▎ | 11830/22095 [20:08:06<12:14:32, 4.29s/it] {'loss': 0.2972, 'grad_norm': 0.7416558030623343, 'learning_rate': 4.670424665775169e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▎ | 11831/22095 [20:08:12<13:58:31, 4.90s/it] {'loss': 0.4574, 'grad_norm': 0.3263187846566589, 'learning_rate': 4.669693342229473e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (63516 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91711 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75711 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11832/22095 [20:08:16<13:17:41, 4.66s/it] {'loss': 0.3016, 'grad_norm': 0.6714728109702466, 'learning_rate': 4.668962025781051e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (100325 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91458 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11833/22095 [20:08:19<12:03:48, 4.23s/it] {'loss': 0.308, 'grad_norm': 0.6091662070395258, 'learning_rate': 4.668230716445618e-06, 'epoch': 0.54}
54%|█████▎ | 11834/22095 [20:08:23<11:50:38, 4.16s/it] {'loss': 0.365, 'grad_norm': 0.6675165747541576, 'learning_rate': 4.66749941423889e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▎ | 11835/22095 [20:08:33<16:34:27, 5.82s/it] {'loss': 0.4817, 'grad_norm': 0.3165058569241368, 'learning_rate': 4.666768119176576e-06, 'epoch': 0.54}
54%|█████▎ | 11836/22095 [20:08:36<14:26:56, 5.07s/it] {'loss': 0.3129, 'grad_norm': 0.6903447640884273, 'learning_rate': 4.666036831274392e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▎ | 11837/22095 [20:08:42<15:16:26, 5.36s/it] {'loss': 0.4939, 'grad_norm': 0.37165014809763863, 'learning_rate': 4.665305550548053e-06, 'epoch': 0.54}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▎ | 11838/22095 [20:08:46<13:27:24, 4.72s/it] {'loss': 0.2911, 'grad_norm': 0.6023690777695794, 'learning_rate': 4.664574277013267e-06, 'epoch': 0.54}
54%|█████▎ | 11839/22095 [20:08:49<11:57:25, 4.20s/it] {'loss': 0.3533, 'grad_norm': 0.6341556166653007, 'learning_rate': 4.663843010685751e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▎ | 11840/22095 [20:08:58<16:26:59, 5.77s/it] {'loss': 0.4711, 'grad_norm': 0.3014725055190625, 'learning_rate': 4.663111751581217e-06, 'epoch': 0.54}
54%|█████▎ | 11841/22095 [20:09:08<19:53:32, 6.98s/it] {'loss': 0.4784, 'grad_norm': 0.2846751634981725, 'learning_rate': 4.662380499715376e-06, 'epoch': 0.54}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
54%|█████▎ | 11842/22095 [20:09:11<16:40:03, 5.85s/it] {'loss': 0.3562, 'grad_norm': 0.6283147441083387, 'learning_rate': 4.661649255103941e-06, 'epoch': 0.54}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_1/images/20250417140010.png 2025-08-28 12:07:10.459016 load time: 1348.68 ms
54%|█████▎ | 11843/22095 [20:09:15<14:50:23, 5.21s/it] {'loss': 0.4033, 'grad_norm': 0.7101453282416704, 'learning_rate': 4.660918017762624e-06, 'epoch': 0.54}
VC:s3://gui-agent/data_20250630/mac/images/terminal/5685b8a4-5bcb-4b03-8a69-df5db43dbe42/images/step_5.png 2025-08-28 12:07:15.026554 load time: 1036.79 ms
54%|█████▎ | 11844/22095 [20:09:18<12:55:33, 4.54s/it] {'loss': 0.3172, 'grad_norm': 0.6798369669668829, 'learning_rate': 4.660186787707137e-06, 'epoch': 0.54}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38144.png 2025-08-28 12:07:17.580547 load time: 1126.33 ms
54%|█████▎ | 11845/22095 [20:09:21<11:57:59, 4.20s/it] {'loss': 0.3335, 'grad_norm': 0.6234796803913034, 'learning_rate': 4.6594555649531935e-06, 'epoch': 0.54}
54%|█████▎ | 11846/22095 [20:09:25<11:32:56, 4.06s/it] {'loss': 0.3402, 'grad_norm': 0.6276398422477145, 'learning_rate': 4.658724349516504e-06, 'epoch': 0.54}
54%|█████▎ | 11847/22095 [20:09:29<11:28:24, 4.03s/it] {'loss': 0.2826, 'grad_norm': 0.6032229180107805, 'learning_rate': 4.657993141412781e-06, 'epoch': 0.54}
54%|█████▎ | 11848/22095 [20:09:32<10:57:07, 3.85s/it] {'loss': 0.3014, 'grad_norm': 0.6492236595026835, 'learning_rate': 4.657261940657732e-06, 'epoch': 0.54}
54%|█████▎ | 11849/22095 [20:09:35<10:23:31, 3.65s/it] {'loss': 0.3241, 'grad_norm': 0.6576568803938141, 'learning_rate': 4.656530747267073e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (53150 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61217 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11850/22095 [20:09:38<9:46:41, 3.44s/it] {'loss': 0.2732, 'grad_norm': 0.6720134070854751, 'learning_rate': 4.6557995612565146e-06, 'epoch': 0.54}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▎ | 11851/22095 [20:09:42<9:36:28, 3.38s/it] {'loss': 0.324, 'grad_norm': 0.6301081903669309, 'learning_rate': 4.655068382641764e-06, 'epoch': 0.54}
54%|█████▎ | 11852/22095 [20:09:45<9:21:18, 3.29s/it] {'loss': 0.3196, 'grad_norm': 0.649505180454233, 'learning_rate': 4.654337211438535e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047900 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7cm'}]}
54%|█████▎ | 11853/22095 [20:09:48<8:57:24, 3.15s/it] {'loss': 0.3554, 'grad_norm': 0.9158481667574583, 'learning_rate': 4.653606047662541e-06, 'epoch': 0.54}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_5/images/20250417140054.png 2025-08-28 12:07:47.320718 load time: 1137.11 ms
54%|█████▎ | 11854/22095 [20:09:51<9:06:55, 3.20s/it] {'loss': 0.3233, 'grad_norm': 0.5772610368338478, 'learning_rate': 4.652874891329484e-06, 'epoch': 0.54}
54%|█████▎ | 11855/22095 [20:09:54<8:53:20, 3.13s/it] {'loss': 0.3304, 'grad_norm': 0.6187458467750405, 'learning_rate': 4.652143742455082e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960201 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11036, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1.5\nB. 2\nC. 0.5\nD. 1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_16/images/20250417135830.png 2025-08-28 12:07:53.230998 load time: 1170.24 ms
54%|█████▎ | 11856/22095 [20:09:57<9:17:01, 3.26s/it] {'loss': 0.3128, 'grad_norm': 0.6887980125160978, 'learning_rate': 4.651412601055042e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (48982 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74999 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41303 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11857/22095 [20:10:01<9:56:34, 3.50s/it] {'loss': 0.2979, 'grad_norm': 0.5715217572713842, 'learning_rate': 4.650681467145077e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (91112 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113019 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11858/22095 [20:10:11<15:26:53, 5.43s/it] {'loss': 0.5086, 'grad_norm': 0.47967059834371656, 'learning_rate': 4.649950340740892e-06, 'epoch': 0.54}
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 12:08:10.176821 load time: 1026.85 ms
54%|█████▎ | 11859/22095 [20:10:15<13:50:43, 4.87s/it] {'loss': 0.3593, 'grad_norm': 0.6855366105109115, 'learning_rate': 4.649219221858199e-06, 'epoch': 0.54}
54%|█████▎ | 11860/22095 [20:10:18<12:30:34, 4.40s/it] {'loss': 0.2789, 'grad_norm': 0.5939512965029445, 'learning_rate': 4.64848811051271e-06, 'epoch': 0.54}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▎ | 11861/22095 [20:10:22<11:36:51, 4.09s/it] {'loss': 0.33, 'grad_norm': 0.7928225031359858, 'learning_rate': 4.6477570067201295e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (85311 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71523 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108205 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11862/22095 [20:10:25<10:41:36, 3.76s/it] {'loss': 0.3168, 'grad_norm': 0.6282339858757565, 'learning_rate': 4.647025910496169e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (118958 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11863/22095 [20:10:28<10:25:43, 3.67s/it] {'loss': 0.3152, 'grad_norm': 0.6349383292936635, 'learning_rate': 4.646294821856539e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    return sources
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8344304 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10956, 'image': 'vrdu_table_final_2/astro-ph.CO/dc9aeac2-350b-41af-8c9e-8bb89563b9f4.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
54%|█████▎ | 11864/22095 [20:10:31<9:53:36, 3.48s/it] {'loss': 0.3244, 'grad_norm': 0.636330307450891, 'learning_rate': 4.6455637408169466e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (96788 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▎ | 11865/22095 [20:10:41<15:05:04, 5.31s/it] {'loss': 0.5009, 'grad_norm': 0.3136639017064418, 'learning_rate': 4.6448326673931e-06, 'epoch': 0.54}
54%|█████▎ | 11866/22095 [20:10:50<18:37:10, 6.55s/it] {'loss': 0.4579, 'grad_norm': 0.26578378504096073, 'learning_rate': 4.644101601600711e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 364, but got module 1
54%|█████▎ | 11867/22095 [20:10:53<15:44:05, 5.54s/it] {'loss': 0.3301, 'grad_norm': 0.6772093834795655, 'learning_rate': 4.6433705434554825e-06, 'epoch': 0.54}
54%|█████▎ | 11868/22095 [20:10:57<13:47:48, 4.86s/it] {'loss': 0.3938, 'grad_norm': 0.6869899521420297, 'learning_rate': 4.6426394929731264e-06, 'epoch': 0.54}
54%|█████▎ | 11869/22095 [20:10:59<12:06:57, 4.27s/it] {'loss': 0.287, 'grad_norm': 0.6032550183160156, 'learning_rate': 4.641908450169351e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▎ | 11870/22095 [20:11:07<14:37:18, 5.15s/it] {'loss': 0.4619, 'grad_norm': 0.2729589981531489, 'learning_rate': 4.641177415059863e-06, 'epoch': 0.54}
54%|█████▎ | 11871/22095 [20:11:16<18:33:01, 6.53s/it] {'loss': 0.4725, 'grad_norm': 0.2930555677217789, 'learning_rate': 4.640446387660369e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▎ | 11872/22095 [20:11:20<15:48:52, 5.57s/it] {'loss': 0.3643, 'grad_norm': 0.6367420254060255, 'learning_rate': 4.639715367986578e-06, 'epoch': 0.54}
54%|█████▎ | 11873/22095 [20:11:29<19:04:56, 6.72s/it] {'loss': 0.4915, 'grad_norm': 0.2707564173906194, 'learning_rate': 4.6389843560541995e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 364, but got module 1
54%|█████▎ | 11874/22095 [20:11:33<16:15:51, 5.73s/it] {'loss': 0.3589, 'grad_norm': 0.7680386608088853, 'learning_rate': 4.638253351878937e-06, 'epoch': 0.54}
54%|█████▎ | 11875/22095 [20:11:36<14:10:36, 4.99s/it] {'loss': 0.3109, 'grad_norm': 0.6348233087753784, 'learning_rate': 4.637522355476499e-06, 'epoch': 0.54}
54%|█████▎ | 11876/22095 [20:11:40<13:03:56, 4.60s/it] {'loss': 0.2914, 'grad_norm': 0.7222645882308814, 'learning_rate': 4.636791366862593e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (59226 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56613 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▍ | 11877/22095 [20:11:43<11:59:56, 4.23s/it] {'loss': 0.3326, 'grad_norm': 0.5781883641735861, 'learning_rate': 4.636060386052924e-06, 'epoch': 0.54}
54%|█████▍ | 11878/22095 [20:11:47<12:01:13, 4.24s/it] {'loss': 0.2968, 'grad_norm': 0.593240330734267, 'learning_rate': 4.635329413063199e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    msg = self.transform_coordinates(msg, new_image_size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    rank0_print("Fixed image tokens in the conversation")
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307669 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB27rKvAORnpuFjSZFCXXX2DXXa_!!1075074163.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n久量\nDP\n护眼LED台灯\n小身材\n大能量\nUSB充电\n支持\n送'}]}
54%|█████▍ | 11879/22095 [20:11:50<10:49:03, 3.81s/it] {'loss': 0.2873, 'grad_norm': 0.846105349465719, 'learning_rate': 4.634598447909127e-06, 'epoch': 0.54}
54%|█████▍ | 11880/22095 [20:11:53<10:26:58, 3.68s/it] {'loss': 0.3117, 'grad_norm': 0.9134058698420537, 'learning_rate': 4.633867490606411e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (56006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54837 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60714 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41743 > 40960).
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11881/22095 [20:11:56<9:53:34, 3.49s/it] {'loss': 0.3077, 'grad_norm': 0.6051682716152033, 'learning_rate': 4.633136541170757e-06, 'epoch': 0.54} 54%|█████▍ | 11881/22095 [20:11:56<9:53:34, 3.49s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11882/22095 [20:12:00<9:35:46, 3.38s/it] {'loss': 0.3233, 'grad_norm': 0.6438228832444766, 'learning_rate': 4.632405599617875e-06, 'epoch': 0.54} 54%|█████▍ | 11882/22095 [20:12:00<9:35:46, 3.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ msg = self.transform_coordinates(msg, new_image_size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item rank0_print("Fixed image tokens in the conversation") ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8914366 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37519, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nA. 4\nB. 8\nC. 16\nD. 
2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 54%|█████▍ | 11883/22095 [20:12:03<9:18:59, 3.28s/it] {'loss': 0.3048, 'grad_norm': 0.6111067636628326, 'learning_rate': 4.631674665963464e-06, 'epoch': 0.54} 54%|█████▍ | 11883/22095 [20:12:03<9:18:59, 3.28s/it] 54%|█████▍ | 11884/22095 [20:12:06<9:03:41, 3.19s/it] {'loss': 0.3436, 'grad_norm': 0.5708830172022095, 'learning_rate': 4.630943740223235e-06, 'epoch': 0.54} 54%|█████▍ | 11884/22095 [20:12:06<9:03:41, 3.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11885/22095 [20:12:13<12:35:22, 4.44s/it] {'loss': 0.4633, 'grad_norm': 0.385893873408589, 'learning_rate': 4.630212822412891e-06, 'epoch': 0.54} 54%|█████▍ | 11885/22095 [20:12:13<12:35:22, 4.44s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/39084.png 2025-08-28 12:10:12.758128 load time: 1342.44 ms 54%|█████▍ | 11886/22095 [20:12:16<11:33:05, 4.07s/it] {'loss': 0.3244, 'grad_norm': 0.6496981603736385, 'learning_rate': 4.62948191254814e-06, 'epoch': 0.54} 54%|█████▍ | 11886/22095 [20:12:16<11:33:05, 4.07s/it] 54%|█████▍ | 11887/22095 [20:12:20<11:13:27, 3.96s/it] {'loss': 0.3126, 'grad_norm': 0.5687704852137276, 'learning_rate': 4.6287510106446814e-06, 'epoch': 0.54} 54%|█████▍ | 11887/22095 [20:12:20<11:13:27, 3.96s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ msg = self.transform_coordinates(msg, new_image_size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item rank0_print("Fixed image tokens in the conversation") ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8960202 in VC:s3://multi-modal/playground/data/geoqa+/. 
Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11037, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 2\nB. 0.5\nC. 1\nD. 1.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 54%|█████▍ | 11888/22095 [20:12:23<10:43:09, 3.78s/it] {'loss': 0.3365, 'grad_norm': 0.6763024486567406, 'learning_rate': 4.628020116718225e-06, 'epoch': 0.54} 54%|█████▍ | 11888/22095 [20:12:23<10:43:09, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11889/22095 [20:12:33<15:40:36, 5.53s/it] {'loss': 0.4631, 'grad_norm': 0.3024211002191299, 'learning_rate': 4.627289230784474e-06, 'epoch': 0.54} 54%|█████▍ | 11889/22095 [20:12:33<15:40:36, 5.53s/it] 54%|█████▍ | 11890/22095 [20:12:36<13:40:23, 4.82s/it] {'loss': 0.2705, 'grad_norm': 0.6528307867202187, 'learning_rate': 4.626558352859133e-06, 'epoch': 0.54} 54%|█████▍ | 11890/22095 [20:12:36<13:40:23, 4.82s/it]VC:s3://gui-agent/data_20250623/windows_augment/images/pycharm/2025-06-18_211202/images/step_1_id_31_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 12:10:36.380863 load time: 1125.76 ms 54%|█████▍ | 11891/22095 [20:12:40<13:18:44, 4.70s/it] {'loss': 0.3328, 'grad_norm': 0.6384263104660005, 'learning_rate': 4.625827482957904e-06, 'epoch': 0.54} 54%|█████▍ | 11891/22095 [20:12:40<13:18:44, 4.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41860 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71691 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52809 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11892/22095 [20:12:44<12:19:31, 4.35s/it] {'loss': 0.3629, 'grad_norm': 0.6278570776518351, 'learning_rate': 4.625096621096497e-06, 'epoch': 0.54} 54%|█████▍ | 11892/22095 [20:12:44<12:19:31, 4.35s/it] 54%|█████▍ | 11893/22095 [20:12:48<11:59:44, 4.23s/it] {'loss': 0.3031, 'grad_norm': 0.6506955725953594, 'learning_rate': 4.624365767290609e-06, 'epoch': 0.54} 54%|█████▍ | 11893/22095 [20:12:48<11:59:44, 4.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11894/22095 [20:12:52<11:46:24, 4.15s/it] {'loss': 0.3065, 'grad_norm': 0.6058536493638982, 'learning_rate': 4.6236349215559476e-06, 'epoch': 0.54} 54%|█████▍ | 11894/22095 [20:12:52<11:46:24, 4.15s/it] 54%|█████▍ | 11895/22095 [20:12:55<11:17:03, 3.98s/it] {'loss': 0.3567, 'grad_norm': 0.6477269762566259, 'learning_rate': 4.6229040839082174e-06, 'epoch': 0.54} 54%|█████▍ | 11895/22095 [20:12:55<11:17:03, 3.98s/it] 54%|█████▍ | 11896/22095 [20:12:59<10:46:05, 3.80s/it] {'loss': 0.358, 'grad_norm': 0.7315641361074862, 'learning_rate': 4.622173254363117e-06, 'epoch': 0.54} 54%|█████▍ | 11896/22095 [20:12:59<10:46:05, 3.80s/it] 54%|█████▍ | 11897/22095 [20:13:03<10:41:45, 3.78s/it] {'loss': 0.3315, 'grad_norm': 0.6390613193796425, 'learning_rate': 4.621442432936355e-06, 'epoch': 0.54} 54%|█████▍ | 11897/22095 [20:13:03<10:41:45, 3.78s/it] 54%|█████▍ | 11898/22095 [20:13:06<10:33:30, 3.73s/it] {'loss': 0.3121, 'grad_norm': 0.6932636661113306, 'learning_rate': 4.620711619643633e-06, 'epoch': 0.54} 54%|█████▍ | 11898/22095 [20:13:06<10:33:30, 3.73s/it] 54%|█████▍ | 11899/22095 [20:13:09<10:11:08, 3.60s/it] {'loss': 0.3536, 'grad_norm': 0.6466938654027299, 'learning_rate': 
4.619980814500654e-06, 'epoch': 0.54} 54%|█████▍ | 11899/22095 [20:13:09<10:11:08, 3.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11900/22095 [20:13:19<15:04:42, 5.32s/it] {'loss': 0.5017, 'grad_norm': 0.3754239818716435, 'learning_rate': 4.619250017523118e-06, 'epoch': 0.54} 54%|█████▍ | 11900/22095 [20:13:19<15:04:42, 5.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79121 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84537 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50761 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11901/22095 [20:13:22<13:36:16, 4.80s/it] {'loss': 0.3208, 'grad_norm': 0.6201964027603627, 'learning_rate': 4.61851922872673e-06, 'epoch': 0.54} 54%|█████▍ | 11901/22095 [20:13:22<13:36:16, 4.80s/it] 54%|█████▍ | 11902/22095 [20:13:26<12:32:31, 4.43s/it] {'loss': 0.3018, 'grad_norm': 0.6683817172821697, 'learning_rate': 4.617788448127194e-06, 'epoch': 0.54} 54%|█████▍ | 11902/22095 [20:13:26<12:32:31, 4.43s/it] 54%|█████▍ | 11903/22095 [20:13:30<11:52:09, 4.19s/it] {'loss': 0.3449, 'grad_norm': 0.6207113423464251, 'learning_rate': 4.6170576757402095e-06, 'epoch': 0.54} 54%|█████▍ | 11903/22095 [20:13:30<11:52:09, 4.19s/it] 54%|█████▍ | 11904/22095 [20:13:33<11:29:26, 4.06s/it] {'loss': 0.3246, 'grad_norm': 0.6061645330534171, 'learning_rate': 4.616326911581478e-06, 'epoch': 0.54} 54%|█████▍ | 11904/22095 [20:13:33<11:29:26, 4.06s/it] 54%|█████▍ | 11905/22095 [20:13:36<10:26:04, 3.69s/it] {'loss': 0.2933, 'grad_norm': 
0.6102535429940648, 'learning_rate': 4.6155961556667064e-06, 'epoch': 0.54} 54%|█████▍ | 11905/22095 [20:13:36<10:26:04, 3.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11906/22095 [20:13:40<10:19:59, 3.65s/it] {'loss': 0.3462, 'grad_norm': 0.6458502246912481, 'learning_rate': 4.614865408011589e-06, 'epoch': 0.54} 54%|█████▍ | 11906/22095 [20:13:40<10:19:59, 3.65s/it] 54%|█████▍ | 11907/22095 [20:13:43<9:55:39, 3.51s/it] {'loss': 0.3193, 'grad_norm': 0.7296757629234377, 'learning_rate': 4.614134668631832e-06, 'epoch': 0.54} 54%|█████▍ | 11907/22095 [20:13:43<9:55:39, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65933 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11908/22095 [20:13:47<10:25:08, 3.68s/it] {'loss': 0.3622, 'grad_norm': 0.6260925635905281, 'learning_rate': 4.613403937543138e-06, 'epoch': 0.54} 54%|█████▍ | 11908/22095 [20:13:47<10:25:08, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11909/22095 [20:13:54<13:33:30, 4.79s/it] {'loss': 0.4524, 'grad_norm': 0.3092383604250637, 'learning_rate': 4.612673214761204e-06, 'epoch': 0.54} 54%|█████▍ | 11909/22095 [20:13:54<13:33:30, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52630 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47649 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72600 > 40960). 
Running this sequence through the model will result in indexing errors
54%|█████▍ | 11910/22095 [20:13:58<12:45:55, 4.51s/it] {'loss': 0.3545, 'grad_norm': 0.641101217881589, 'learning_rate': 4.611942500301733e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    msg = self.transform_coordinates(msg, new_image_size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    rank0_print("Fixed image tokens in the conversation")
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8517074 in VC:s3://internvl-moe-sft-data/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56428, 'image': 'vrdu_texteq/astro-ph.CO/a4ad619a-96da-4f1b-b6b0-332376df11e3.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'in the limit $r\\rightarrow0$.'}]}
54%|█████▍ | 11911/22095 [20:14:02<11:51:37, 4.19s/it] {'loss': 0.3894, 'grad_norm': 0.729212585377849, 'learning_rate': 4.611211794180427e-06, 'epoch': 0.54}
54%|█████▍ | 11912/22095 [20:14:05<11:29:17, 4.06s/it] {'loss': 0.3575, 'grad_norm': 0.6985165644437046, 'learning_rate': 4.610481096412985e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (81633 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54601 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67509 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11913/22095 [20:14:09<10:55:33, 3.86s/it] {'loss': 0.3114, 'grad_norm': 0.5623032596839105, 'learning_rate': 4.609750407015107e-06, 'epoch': 0.54} 54%|█████▍ | 11913/22095 [20:14:09<10:55:33, 3.86s/it] 54%|█████▍ | 11914/22095 [20:14:12<10:13:03, 3.61s/it] {'loss': 0.3008, 'grad_norm': 0.6227257576443729, 'learning_rate': 4.609019726002494e-06, 'epoch': 0.54} 54%|█████▍ | 11914/22095 [20:14:12<10:13:03, 3.61s/it] 54%|█████▍ | 11915/22095 [20:14:15<9:54:23, 3.50s/it] {'loss': 0.3494, 'grad_norm': 0.6173945881059006, 'learning_rate': 4.608289053390849e-06, 'epoch': 0.54} 54%|█████▍ | 11915/22095 [20:14:15<9:54:23, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250630/windows_augment_data_20250703/images/eviews/handmade_annotation_1/images/EV_2_id_0_function_2_crop_1_grounding_instructions_random_paste.png 2025-08-28 12:12:13.228922 load time: 1123.1 ms 54%|█████▍ | 11916/22095 [20:14:20<11:27:26, 4.05s/it] {'loss': 0.5019, 'grad_norm': 0.3187630793576256, 'learning_rate': 4.6075583891958665e-06, 'epoch': 0.54} 54%|█████▍ | 11916/22095 [20:14:20<11:27:26, 4.05s/it] 54%|█████▍ | 11917/22095 [20:14:24<11:22:34, 4.02s/it] {'loss': 0.3434, 'grad_norm': 0.6328780046178049, 'learning_rate': 4.606827733433249e-06, 'epoch': 0.54} 54%|█████▍ | 11917/22095 [20:14:24<11:22:34, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11918/22095 [20:14:32<14:08:03, 5.00s/it] {'loss': 0.46, 'grad_norm': 0.2991991417478379, 'learning_rate': 4.606097086118699e-06, 'epoch': 0.54} 54%|█████▍ | 11918/22095 [20:14:32<14:08:03, 5.00s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens 
in the conversation 54%|█████▍ | 11919/22095 [20:14:35<12:36:04, 4.46s/it] {'loss': 0.3301, 'grad_norm': 0.5951216847187082, 'learning_rate': 4.60536644726791e-06, 'epoch': 0.54} 54%|█████▍ | 11919/22095 [20:14:35<12:36:04, 4.46s/it] 54%|█████▍ | 11920/22095 [20:14:39<12:31:05, 4.43s/it] {'loss': 0.2981, 'grad_norm': 0.6331957282164389, 'learning_rate': 4.604635816896583e-06, 'epoch': 0.54} 54%|█████▍ | 11920/22095 [20:14:39<12:31:05, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11921/22095 [20:14:46<14:48:39, 5.24s/it] {'loss': 0.4919, 'grad_norm': 0.28609010619166414, 'learning_rate': 4.6039051950204215e-06, 'epoch': 0.54} 54%|█████▍ | 11921/22095 [20:14:46<14:48:39, 5.24s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_4/images/20250417140025.png 2025-08-28 12:12:45.115609 load time: 1008.22 ms 54%|█████▍ | 11922/22095 [20:14:50<13:52:51, 4.91s/it] {'loss': 0.3241, 'grad_norm': 0.5983856542995213, 'learning_rate': 4.603174581655118e-06, 'epoch': 0.54} 54%|█████▍ | 11922/22095 [20:14:50<13:52:51, 4.91s/it] 54%|█████▍ | 11923/22095 [20:14:54<12:32:45, 4.44s/it] {'loss': 0.3307, 'grad_norm': 0.65502210479594, 'learning_rate': 4.602443976816375e-06, 'epoch': 0.54} 54%|█████▍ | 11923/22095 [20:14:54<12:32:45, 4.44s/it] 54%|█████▍ | 11924/22095 [20:14:57<11:52:39, 4.20s/it] {'loss': 0.3455, 'grad_norm': 0.6122185417503756, 'learning_rate': 4.601713380519891e-06, 'epoch': 0.54} 54%|█████▍ | 11924/22095 [20:14:57<11:52:39, 4.20s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ msg = self.transform_coordinates(msg, new_image_size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item rank0_print("Fixed image tokens in the conversation") ValueError: Image size 
[14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8337168 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3790, 'image': 'vrdu_table_final_2/astro-ph.CO/a16bdba6-eb8b-414f-8ec4-9bb0c7cbfd6c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]}
54%|█████▍ | 11925/22095 [20:15:01<11:09:28, 3.95s/it] {'loss': 0.3278, 'grad_norm': 0.6313224879589558, 'learning_rate': 4.600982792781361e-06, 'epoch': 0.54}
54%|█████▍ | 11926/22095 [20:15:04<10:22:41, 3.67s/it] {'loss': 0.3315, 'grad_norm': 0.608001776525165, 'learning_rate': 4.600252213616486e-06, 'epoch': 0.54}
54%|█████▍ | 11927/22095 [20:15:08<10:53:28, 3.86s/it] {'loss': 0.3284, 'grad_norm': 0.6289475431513395, 'learning_rate': 4.599521643040964e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (44292 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129827 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101367 > 40960).
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11928/22095 [20:15:12<11:04:25, 3.92s/it] {'loss': 0.3224, 'grad_norm': 0.6749502573548707, 'learning_rate': 4.598791081070493e-06, 'epoch': 0.54} 54%|█████▍ | 11928/22095 [20:15:12<11:04:25, 3.92s/it] 54%|█████▍ | 11929/22095 [20:15:15<10:17:42, 3.65s/it] {'loss': 0.2764, 'grad_norm': 0.5380611898001285, 'learning_rate': 4.598060527720766e-06, 'epoch': 0.54} 54%|█████▍ | 11929/22095 [20:15:15<10:17:42, 3.65s/it] 54%|█████▍ | 11930/22095 [20:15:19<10:03:06, 3.56s/it] {'loss': 0.329, 'grad_norm': 0.6538404966143314, 'learning_rate': 4.597329983007486e-06, 'epoch': 0.54} 54%|█████▍ | 11930/22095 [20:15:19<10:03:06, 3.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ msg = self.transform_coordinates(msg, new_image_size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item rank0_print("Fixed image tokens in the conversation") ValueError: Image size [464, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8475143 in VC:s3://internvl-moe-sft-data/. Exception: Image size [464, 25, 100, 100] is too small. Minimum size is 28. 
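The many "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" warnings indicate tokenized conversations that exceed the model's context window and would hit indexing errors downstream. A minimal sketch of handling such sequences ahead of batching, either dropping or truncating them; `MAX_LEN` mirrors the 40960 limit in the log, and `fit_to_context` is a hypothetical helper, not the actual data-pipeline API.

```python
# Illustrative length guard for already-tokenized samples.
MAX_LEN = 40960  # context limit reported in the tokenizer warnings above


def fit_to_context(token_ids, max_len=MAX_LEN, truncate=False):
    """Return ids unchanged if they fit; a truncated copy if truncate=True; else None (drop)."""
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len] if truncate else None
```

Dropping is usually safer than truncating for multimodal SFT data, since truncation can sever an image placeholder from its answer; either way the decision should happen once at dataset-build time rather than per step.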
Problematic sample: {'id': 112346, 'image': 'vrdu_texteq/astro-ph.CO/1a4f705e-fe83-4dae-9cfc-0ffc3e6f48db.png', 'image_wh': [[464, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'where $V_{\\mathcal{D}}$ is the volume of the domain'}]} 54%|█████▍ | 11931/22095 [20:15:22<10:09:57, 3.60s/it] {'loss': 0.3138, 'grad_norm': 0.6662952534304654, 'learning_rate': 4.5965994469463485e-06, 'epoch': 0.54} 54%|█████▍ | 11931/22095 [20:15:22<10:09:57, 3.60s/it] 54%|█████▍ | 11932/22095 [20:15:26<10:20:28, 3.66s/it] {'loss': 0.3088, 'grad_norm': 0.59477988751589, 'learning_rate': 4.595868919553049e-06, 'epoch': 0.54} 54%|█████▍ | 11932/22095 [20:15:26<10:20:28, 3.66s/it] 54%|█████▍ | 11933/22095 [20:15:29<9:46:08, 3.46s/it] {'loss': 0.3148, 'grad_norm': 0.5960129169306702, 'learning_rate': 4.595138400843285e-06, 'epoch': 0.54} 54%|█████▍ | 11933/22095 [20:15:29<9:46:08, 3.46s/it] 54%|█████▍ | 11934/22095 [20:15:32<9:13:02, 3.27s/it] {'loss': 0.3107, 'grad_norm': 0.8784161525700802, 'learning_rate': 4.594407890832755e-06, 'epoch': 0.54} 54%|█████▍ | 11934/22095 [20:15:32<9:13:02, 3.27s/it] 54%|█████▍ | 11935/22095 [20:15:36<9:41:45, 3.44s/it] {'loss': 0.3674, 'grad_norm': 0.7629956582515477, 'learning_rate': 4.5936773895371525e-06, 'epoch': 0.54} 54%|█████▍ | 11935/22095 [20:15:36<9:41:45, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47963 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49747 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44202 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61014 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11936/22095 [20:15:46<15:31:25, 5.50s/it] {'loss': 0.4688, 'grad_norm': 0.38093627317634576, 'learning_rate': 4.592946896972174e-06, 'epoch': 0.54} 54%|█████▍ | 11936/22095 [20:15:46<15:31:25, 5.50s/it] 54%|█████▍ | 11937/22095 [20:15:49<13:31:19, 4.79s/it] {'loss': 0.2927, 'grad_norm': 0.5874686472601677, 'learning_rate': 4.592216413153519e-06, 'epoch': 0.54} 54%|█████▍ | 11937/22095 [20:15:49<13:31:19, 4.79s/it] 54%|█████▍ | 11938/22095 [20:15:52<11:56:38, 4.23s/it] {'loss': 0.2726, 'grad_norm': 0.6845177792070586, 'learning_rate': 4.591485938096879e-06, 'epoch': 0.54} 54%|█████▍ | 11938/22095 [20:15:52<11:56:38, 4.23s/it] 54%|█████▍ | 11939/22095 [20:15:56<11:49:12, 4.19s/it] {'loss': 0.2617, 'grad_norm': 0.6942333260757441, 'learning_rate': 4.590755471817951e-06, 'epoch': 0.54} 54%|█████▍ | 11939/22095 [20:15:56<11:49:12, 4.19s/it] 54%|█████▍ | 11940/22095 [20:16:00<11:06:42, 3.94s/it] {'loss': 0.3057, 'grad_norm': 0.5685941422602182, 'learning_rate': 4.590025014332431e-06, 'epoch': 0.54} 54%|█████▍ | 11940/22095 [20:16:00<11:06:42, 3.94s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11941/22095 [20:16:04<11:21:35, 4.03s/it] {'loss': 0.3292, 'grad_norm': 0.6192698308865857, 'learning_rate': 4.589294565656017e-06, 'epoch': 0.54} 54%|█████▍ | 11941/22095 [20:16:04<11:21:35, 4.03s/it] 54%|█████▍ | 11942/22095 [20:16:07<10:57:20, 3.88s/it] {'loss': 0.3679, 'grad_norm': 0.7505932995096428, 'learning_rate': 4.5885641258044e-06, 'epoch': 0.54} 54%|█████▍ | 11942/22095 [20:16:07<10:57:20, 
3.88s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 54%|█████▍ | 11943/22095 [20:16:11<10:21:08, 3.67s/it] {'loss': 0.3437, 'grad_norm': 0.5872677963439269, 'learning_rate': 4.587833694793274e-06, 'epoch': 0.54} 54%|█████▍ | 11943/22095 [20:16:11<10:21:08, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11944/22095 [20:16:19<14:05:01, 4.99s/it] {'loss': 0.4798, 'grad_norm': 0.29940284734855005, 'learning_rate': 4.587103272638339e-06, 'epoch': 0.54} 54%|█████▍ | 11944/22095 [20:16:19<14:05:01, 4.99s/it] 54%|█████▍ | 11945/22095 [20:16:23<13:41:55, 4.86s/it] {'loss': 0.3247, 'grad_norm': 0.7274460135206671, 'learning_rate': 4.586372859355285e-06, 'epoch': 0.54} 54%|█████▍ | 11945/22095 [20:16:23<13:41:55, 4.86s/it] 54%|█████▍ | 11946/22095 [20:16:27<12:51:55, 4.56s/it] {'loss': 0.3083, 'grad_norm': 0.6549271468756779, 'learning_rate': 4.585642454959809e-06, 'epoch': 0.54} 54%|█████▍ | 11946/22095 [20:16:27<12:51:55, 4.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41783 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51179 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11947/22095 [20:16:30<11:34:08, 4.10s/it] {'loss': 0.2975, 'grad_norm': 0.6872198529055045, 'learning_rate': 4.584912059467604e-06, 'epoch': 0.54} 54%|█████▍ | 11947/22095 [20:16:30<11:34:08, 4.10s/it] 54%|█████▍ | 11948/22095 [20:16:33<10:26:24, 3.70s/it] {'loss': 0.3594, 'grad_norm': 0.7216276634194774, 'learning_rate': 4.584181672894362e-06, 'epoch': 0.54} 54%|█████▍ | 11948/22095 [20:16:33<10:26:24, 3.70s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [562, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8500512 in VC:s3://internvl-moe-sft-data/. Exception: Image size [562, 23, 100, 100] is too small. Minimum size is 28. 
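The PIL UserWarning logged earlier ("Palette images with Transparency expressed in bytes should be converted to RGBA images") can be silenced by converting palette-mode images explicitly before preprocessing. A sketch assuming Pillow is available; `to_rgb` is an illustrative helper, not part of the training code.

```python
from PIL import Image


def to_rgb(img: Image.Image) -> Image.Image:
    """Normalize any input image to RGB, handling palette transparency explicitly."""
    if img.mode == "P" and "transparency" in img.info:
        # convert before Pillow has to do it implicitly (the source of the warning)
        img = img.convert("RGBA")
    if img.mode == "RGBA":
        # composite onto white to drop the alpha channel deterministically
        background = Image.new("RGB", img.size, (255, 255, 255))
        background.paste(img, mask=img.split()[-1])
        return background
    return img.convert("RGB")
```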
Problematic sample: {'id': 102364, 'image': 'vrdu_texteq/astro-ph.CO/45eaa458-bb5d-4faa-bddb-da596d3d62fb.png', 'image_wh': [[562, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'We can write the field $X$ in terms of modes as'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11949/22095 [20:16:37<10:51:54, 3.86s/it] {'loss': 0.3129, 'grad_norm': 0.6088900714811756, 'learning_rate': 4.5834512952557805e-06, 'epoch': 0.54} 54%|█████▍ | 11949/22095 [20:16:37<10:51:54, 3.86s/it] 54%|█████▍ | 11950/22095 [20:16:41<10:45:35, 3.82s/it] {'loss': 0.3398, 'grad_norm': 0.5911821508409031, 'learning_rate': 4.582720926567552e-06, 'epoch': 0.54} 54%|█████▍ | 11950/22095 [20:16:41<10:45:35, 3.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11951/22095 [20:16:44<10:31:30, 3.74s/it] {'loss': 0.3065, 'grad_norm': 0.6734692864936836, 'learning_rate': 4.581990566845368e-06, 'epoch': 0.54} 54%|█████▍ | 11951/22095 [20:16:44<10:31:30, 3.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11952/22095 [20:16:47<9:42:52, 3.45s/it] {'loss': 0.3369, 'grad_norm': 0.6074726667239979, 'learning_rate': 4.581260216104923e-06, 'epoch': 0.54} 54%|█████▍ | 11952/22095 [20:16:47<9:42:52, 3.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [0, 0, 100, 100] is too small. 
Minimum size is 28. [Try #0] Failed to fetch sample 8308290 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2TKImbRDH8KJjSspnXXbNAVXa_!!3310486094.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract text from the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n鼎新包装\n封口粘\n韧性强\n防爆边\n首件有优惠\n全国多地区包邮\n每个ID限购一次\n规格齐全等你来选'}]} 54%|█████▍ | 11953/22095 [20:16:50<9:35:07, 3.40s/it] {'loss': 0.3316, 'grad_norm': 0.779060196209588, 'learning_rate': 4.580529874361911e-06, 'epoch': 0.54} 54%|█████▍ | 11953/22095 [20:16:50<9:35:07, 3.40s/it] 54%|█████▍ | 11954/22095 [20:16:55<10:15:34, 3.64s/it] {'loss': 0.3096, 'grad_norm': 0.6229799956881542, 'learning_rate': 4.579799541632022e-06, 'epoch': 0.54} 54%|█████▍ | 11954/22095 [20:16:55<10:15:34, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11955/22095 [20:17:04<15:13:21, 5.40s/it] {'loss': 0.4898, 'grad_norm': 0.30134750706287144, 'learning_rate': 4.5790692179309506e-06, 'epoch': 0.54} 54%|█████▍ | 11955/22095 [20:17:04<15:13:21, 5.40s/it] 54%|█████▍ | 11956/22095 [20:17:07<13:14:23, 4.70s/it] {'loss': 0.2994, 'grad_norm': 0.6001279768603515, 'learning_rate': 4.578338903274389e-06, 'epoch': 0.54} 54%|█████▍ | 11956/22095 [20:17:07<13:14:23, 4.70s/it] 54%|█████▍ | 11957/22095 [20:17:11<12:13:07, 4.34s/it] {'loss': 0.3237, 'grad_norm': 0.6335982180014519, 'learning_rate': 4.577608597678031e-06, 'epoch': 0.54} 54%|█████▍ | 11957/22095 [20:17:11<12:13:07, 4.34s/it] 54%|█████▍ | 11958/22095 [20:17:14<11:03:46, 3.93s/it] {'loss': 0.3291, 'grad_norm': 0.6388397554978568, 'learning_rate': 4.576878301157564e-06, 'epoch': 0.54} 54%|█████▍ | 11958/22095 [20:17:14<11:03:46, 3.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52489 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52275 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53432 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114568 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11959/22095 [20:17:17<10:59:45, 3.91s/it] {'loss': 0.2893, 'grad_norm': 0.6780612805093535, 'learning_rate': 4.576148013728685e-06, 'epoch': 0.54} 54%|█████▍ | 11959/22095 [20:17:17<10:59:45, 3.91s/it] 54%|█████▍ | 11960/22095 [20:17:21<10:39:14, 3.78s/it] {'loss': 0.288, 'grad_norm': 0.6035805490342991, 'learning_rate': 4.575417735407084e-06, 'epoch': 0.54} 54%|█████▍ | 11960/22095 [20:17:21<10:39:14, 3.78s/it] 54%|█████▍ | 11961/22095 [20:17:24<10:06:18, 3.59s/it] {'loss': 0.3083, 'grad_norm': 0.5938415113037409, 'learning_rate': 4.57468746620845e-06, 'epoch': 0.54} 54%|█████▍ | 11961/22095 [20:17:24<10:06:18, 3.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8381404 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 48193, 'image': 'vrdu_table_final_2/astro-ph.CO/f0bcb3cd-df05-4120-8d01-03abd49df6d3.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 54%|█████▍ | 11962/22095 [20:17:27<9:36:09, 3.41s/it] {'loss': 0.2967, 'grad_norm': 0.6037836247611696, 'learning_rate': 4.573957206148476e-06, 'epoch': 0.54} 54%|█████▍ | 11962/22095 [20:17:27<9:36:09, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49109 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11963/22095 [20:17:31<10:10:47, 3.62s/it] {'loss': 0.3599, 'grad_norm': 0.5917513817545543, 'learning_rate': 4.573226955242856e-06, 'epoch': 0.54} 54%|█████▍ | 11963/22095 [20:17:31<10:10:47, 3.62s/it] 54%|█████▍ | 11964/22095 [20:17:35<10:08:02, 3.60s/it] {'loss': 0.3089, 'grad_norm': 0.6220484958209552, 'learning_rate': 4.5724967135072746e-06, 'epoch': 0.54} 54%|█████▍ | 11964/22095 [20:17:35<10:08:02, 3.60s/it] 54%|█████▍ | 11965/22095 [20:17:38<10:10:11, 3.61s/it] {'loss': 0.3187, 'grad_norm': 0.6425750847613974, 'learning_rate': 4.571766480957427e-06, 'epoch': 0.54} 54%|█████▍ | 11965/22095 [20:17:38<10:10:11, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51120 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48079 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55272 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89914 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11966/22095 [20:17:44<11:42:49, 4.16s/it] {'loss': 0.3204, 'grad_norm': 0.6090489388183843, 'learning_rate': 4.571036257609004e-06, 'epoch': 0.54} 54%|█████▍ | 11966/22095 [20:17:44<11:42:49, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11967/22095 [20:17:53<16:19:37, 5.80s/it] {'loss': 0.4816, 'grad_norm': 0.31832057329905533, 'learning_rate': 4.570306043477693e-06, 'epoch': 0.54} 54%|█████▍ | 11967/22095 [20:17:53<16:19:37, 5.80s/it] 54%|█████▍ | 11968/22095 [20:17:57<14:24:39, 5.12s/it] {'loss': 0.3218, 'grad_norm': 0.6710364857290778, 'learning_rate': 4.569575838579184e-06, 'epoch': 0.54} 54%|█████▍ | 11968/22095 [20:17:57<14:24:39, 5.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 11969/22095 [20:18:01<13:11:58, 4.69s/it] {'loss': 0.3415, 'grad_norm': 0.6389355538423641, 'learning_rate': 4.56884564292917e-06, 'epoch': 0.54} 54%|█████▍ | 11969/22095 [20:18:01<13:11:58, 4.69s/it] 54%|█████▍ | 11970/22095 [20:18:04<12:17:58, 4.37s/it] {'loss': 0.3246, 'grad_norm': 0.682930932654138, 'learning_rate': 4.568115456543339e-06, 'epoch': 0.54} 54%|█████▍ | 11970/22095 [20:18:04<12:17:58, 4.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11971/22095 [20:18:13<15:51:33, 5.64s/it] {'loss': 0.4868, 'grad_norm': 0.2845147003828224, 
'learning_rate': 4.567385279437381e-06, 'epoch': 0.54} 54%|█████▍ | 11971/22095 [20:18:13<15:51:33, 5.64s/it] 54%|█████▍ | 11972/22095 [20:18:22<18:52:49, 6.71s/it] {'loss': 0.4951, 'grad_norm': 0.2978005161473394, 'learning_rate': 4.566655111626982e-06, 'epoch': 0.54} 54%|█████▍ | 11972/22095 [20:18:22<18:52:49, 6.71s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (48731 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55693 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82021 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11973/22095 [20:18:26<16:15:03, 5.78s/it] {'loss': 0.362, 'grad_norm': 0.688595153123291, 'learning_rate': 4.565924953127837e-06, 'epoch': 0.54} 54%|█████▍ | 11973/22095 [20:18:26<16:15:03, 5.78s/it] 54%|█████▍ | 11974/22095 [20:18:35<19:15:23, 6.85s/it] {'loss': 0.4786, 'grad_norm': 0.28024177680746104, 'learning_rate': 4.56519480395563e-06, 'epoch': 0.54} 54%|█████▍ | 11974/22095 [20:18:35<19:15:23, 6.85s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 54%|█████▍ | 11975/22095 [20:18:39<16:27:34, 5.86s/it] {'loss': 0.3376, 'grad_norm': 0.6663854522634133, 'learning_rate': 4.564464664126052e-06, 'epoch': 0.54} 54%|█████▍ | 11975/22095 [20:18:39<16:27:34, 5.86s/it] 54%|█████▍ | 11976/22095 [20:18:42<14:45:29, 5.25s/it] {'loss': 0.3519, 'grad_norm': 0.6908051407952255, 'learning_rate': 4.56373453365479e-06, 'epoch': 0.54} 54%|█████▍ | 11976/22095 [20:18:42<14:45:29, 5.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75151 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11977/22095 [20:18:45<12:41:57, 4.52s/it] {'loss': 0.3238, 'grad_norm': 0.658050688240133, 'learning_rate': 4.563004412557532e-06, 'epoch': 0.54} 54%|█████▍ | 11977/22095 [20:18:45<12:41:57, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56943 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51722 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87601 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47962 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11978/22095 [20:18:48<11:25:02, 4.06s/it] {'loss': 0.3674, 'grad_norm': 0.6400947024626424, 'learning_rate': 4.562274300849968e-06, 'epoch': 0.54} 54%|█████▍ | 11978/22095 [20:18:48<11:25:02, 4.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71572 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47270 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57870 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91700 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11979/22095 [20:18:51<10:36:05, 3.77s/it] {'loss': 0.3258, 'grad_norm': 0.611671600869507, 'learning_rate': 4.561544198547786e-06, 'epoch': 0.54} 54%|█████▍ | 11979/22095 [20:18:51<10:36:05, 3.77s/it] 54%|█████▍ | 11980/22095 [20:18:54<9:42:22, 3.45s/it] {'loss': 0.317, 'grad_norm': 0.6596228395551971, 'learning_rate': 4.560814105666672e-06, 'epoch': 0.54} 54%|█████▍ | 11980/22095 [20:18:54<9:42:22, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41374 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87988 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86259 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11981/22095 [20:19:02<13:32:15, 4.82s/it] {'loss': 0.4869, 'grad_norm': 0.3664205560652866, 'learning_rate': 4.560084022222313e-06, 'epoch': 0.54} 54%|█████▍ | 11981/22095 [20:19:02<13:32:15, 4.82s/it] 54%|█████▍ | 11982/22095 [20:19:06<12:26:24, 4.43s/it] {'loss': 0.3114, 'grad_norm': 0.655928615674818, 'learning_rate': 4.559353948230399e-06, 'epoch': 0.54} 54%|█████▍ | 11982/22095 [20:19:06<12:26:24, 4.43s/it] 54%|█████▍ | 11983/22095 [20:19:08<11:07:34, 3.96s/it] {'loss': 0.2704, 'grad_norm': 0.591209692657875, 'learning_rate': 4.558623883706613e-06, 'epoch': 0.54} 54%|█████▍ | 11983/22095 [20:19:08<11:07:34, 3.96s/it] 54%|█████▍ | 11984/22095 [20:19:12<10:26:18, 3.72s/it] {'loss': 0.3081, 'grad_norm': 0.6378830156479883, 'learning_rate': 4.5578938286666455e-06, 'epoch': 0.54} 54%|█████▍ | 11984/22095 [20:19:12<10:26:18, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11985/22095 [20:19:20<14:03:37, 5.01s/it] {'loss': 0.4662, 'grad_norm': 0.28458597921919504, 'learning_rate': 4.557163783126181e-06, 'epoch': 0.54} 54%|█████▍ | 11985/22095 [20:19:20<14:03:37, 5.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55015 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 11986/22095 [20:19:27<15:52:57, 5.66s/it] {'loss': 0.4925, 'grad_norm': 0.29464568064879143, 'learning_rate': 4.556433747100909e-06, 'epoch': 0.54} 54%|█████▍ | 11986/22095 [20:19:27<15:52:57, 5.66s/it] 54%|█████▍ | 11987/22095 [20:19:36<19:03:41, 6.79s/it] {'loss': 0.4858, 'grad_norm': 0.2928511770969385, 'learning_rate': 4.5557037206065105e-06, 'epoch': 0.54} 54%|█████▍ | 11987/22095 [20:19:36<19:03:41, 6.79s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 54%|█████▍ | 11988/22095 [20:19:39<15:55:35, 5.67s/it] {'loss': 0.3027, 'grad_norm': 0.6863478044764233, 'learning_rate': 4.554973703658676e-06, 'epoch': 0.54} 54%|█████▍ | 11988/22095 [20:19:39<15:55:35, 5.67s/it] 54%|█████▍ | 11989/22095 [20:19:47<17:58:41, 6.40s/it] {'loss': 0.4724, 'grad_norm': 0.3077387580483022, 'learning_rate': 4.554243696273091e-06, 'epoch': 0.54} 54%|█████▍ | 11989/22095 [20:19:47<17:58:41, 6.40s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (59910 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11990/22095 [20:19:51<15:17:56, 5.45s/it] {'loss': 0.3115, 'grad_norm': 0.5953582888405915, 'learning_rate': 4.553513698465438e-06, 'epoch': 0.54} 54%|█████▍ | 11990/22095 [20:19:51<15:17:56, 5.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59541 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57416 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48466 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101100 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62585 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53100 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46637 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 11991/22095 [20:19:54<13:38:25, 4.86s/it] {'loss': 0.3038, 'grad_norm': 0.6285184049048317, 'learning_rate': 4.552783710251404e-06, 'epoch': 0.54} 54%|█████▍ | 11991/22095 [20:19:54<13:38:25, 4.86s/it] 54%|█████▍ | 11992/22095 [20:19:57<12:05:16, 4.31s/it] {'loss': 0.3, 'grad_norm': 0.5900146272819862, 'learning_rate': 4.5520537316466775e-06, 'epoch': 0.54} 54%|█████▍ | 11992/22095 [20:19:57<12:05:16, 4.31s/it] 54%|█████▍ | 11993/22095 [20:20:01<11:48:10, 4.21s/it] {'loss': 0.3211, 'grad_norm': 0.5857511473636708, 'learning_rate': 4.551323762666937e-06, 'epoch': 0.54} 54%|█████▍ | 11993/22095 [20:20:01<11:48:10, 4.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11994/22095 [20:20:09<14:34:14, 5.19s/it] {'loss': 0.4651, 'grad_norm': 0.35763795383161173, 'learning_rate': 4.550593803327873e-06, 'epoch': 0.54} 54%|█████▍ | 11994/22095 [20:20:09<14:34:14, 5.19s/it] 54%|█████▍ | 11995/22095 [20:20:12<12:51:02, 4.58s/it] {'loss': 0.3648, 'grad_norm': 0.6251418565575303, 'learning_rate': 4.5498638536451675e-06, 'epoch': 0.54} 54%|█████▍ | 11995/22095 [20:20:12<12:51:02, 4.58s/it] 54%|█████▍ | 11996/22095 
[20:20:15<11:48:50, 4.21s/it] {'loss': 0.3317, 'grad_norm': 0.6511923733741692, 'learning_rate': 4.5491339136345055e-06, 'epoch': 0.54} 54%|█████▍ | 11996/22095 [20:20:15<11:48:50, 4.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047735 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 6cm\nB. 7cm\nC. 8cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 54%|█████▍ | 11997/22095 [20:20:19<11:29:06, 4.09s/it] {'loss': 0.3394, 'grad_norm': 0.6871015698177254, 'learning_rate': 4.548403983311569e-06, 'epoch': 0.54} 54%|█████▍ | 11997/22095 [20:20:19<11:29:06, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 11998/22095 [20:20:29<16:06:03, 5.74s/it] {'loss': 0.505, 'grad_norm': 0.3014881726501736, 'learning_rate': 4.547674062692046e-06, 'epoch': 0.54} 54%|█████▍ | 11998/22095 [20:20:29<16:06:03, 5.74s/it] 54%|█████▍ | 11999/22095 [20:20:39<20:12:32, 7.21s/it] {'loss': 0.4723, 'grad_norm': 0.28402470065143265, 'learning_rate': 4.546944151791618e-06, 'epoch': 0.54} 54%|█████▍ | 11999/22095 [20:20:39<20:12:32, 7.21s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 54%|█████▍ | 12000/22095 [20:20:44<18:00:41, 6.42s/it] {'loss': 0.3266, 'grad_norm': 0.5945012448408755, 'learning_rate': 4.546214250625969e-06, 'epoch': 0.54} 54%|█████▍ | 12000/22095 [20:20:44<18:00:41, 6.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49217 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46272 > 40960). Running this sequence through the model will result in indexing errors /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( 54%|█████▍ | 12001/22095 [20:21:40<59:46:15, 21.32s/it] {'loss': 0.3621, 'grad_norm': 0.6330897895361995, 'learning_rate': 4.54548435921078e-06, 'epoch': 0.54} 54%|█████▍ | 12001/22095 [20:21:40<59:46:15, 21.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (63614 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49657 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57903 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 12002/22095 [20:21:49<49:52:12, 17.79s/it] {'loss': 0.474, 'grad_norm': 0.3276877083329625, 'learning_rate': 4.544754477561739e-06, 'epoch': 0.54} 54%|█████▍ | 12002/22095 [20:21:49<49:52:12, 17.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45581 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101557 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 12003/22095 [20:21:53<38:09:55, 13.61s/it] {'loss': 0.3111, 'grad_norm': 0.6783486670274087, 'learning_rate': 4.544024605694524e-06, 'epoch': 0.54} 54%|█████▍ | 12003/22095 [20:21:53<38:09:55, 13.61s/it]VC:s3://gui-agent/data_20250421/Android/bilibilicn/Cycle_0_Iter_10_1/images/screenshot-161-1745200250.479557-before.png 2025-08-28 12:19:52.017597 load time: 1047.3 ms 54%|█████▍ | 12004/22095 [20:21:56<29:12:22, 10.42s/it] {'loss': 0.3129, 'grad_norm': 0.6288752587564099, 'learning_rate': 4.54329474362482e-06, 'epoch': 0.54} 54%|█████▍ | 12004/22095 [20:21:56<29:12:22, 10.42s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_775808.png 2025-08-28 12:19:54.975390 load time: 1020.41 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://st2pj/20250222/images/multi_modal/agent_data/ui_data/ui_home_screen_phone/ui_homescreen_phone_20240416_v7/homescreen-217.jpg 2025-08-28 12:19:54.975491 load time: 1041.65 ms 54%|█████▍ | 12005/22095 [20:22:00<23:23:44, 8.35s/it] {'loss': 0.2944, 'grad_norm': 0.6338755715175465, 'learning_rate': 4.542564891368311e-06, 'epoch': 0.54} 54%|█████▍ | 12005/22095 [20:22:00<23:23:44, 8.35s/it] 54%|█████▍ | 12006/22095 [20:22:03<19:14:56, 6.87s/it] {'loss': 0.3262, 'grad_norm': 0.7239986062645524, 'learning_rate': 4.541835048940675e-06, 'epoch': 0.54} 54%|█████▍ | 12006/22095 [20:22:03<19:14:56, 6.87s/it] 54%|█████▍ | 12007/22095 [20:22:07<17:00:33, 6.07s/it] {'loss': 0.3215, 'grad_norm': 0.6078227665196527, 'learning_rate': 4.5411052163575986e-06, 'epoch': 0.54} 54%|█████▍ | 12007/22095 [20:22:07<17:00:33, 6.07s/it] 54%|█████▍ | 12008/22095 [20:22:11<15:08:37, 5.40s/it] {'loss': 0.343, 'grad_norm': 0.6332652552296952, 'learning_rate': 4.540375393634762e-06, 'epoch': 0.54} 54%|█████▍ | 12008/22095 [20:22:11<15:08:37, 5.40s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 54%|█████▍ | 12009/22095 [20:22:19<16:56:10, 6.05s/it] {'loss': 0.4671, 'grad_norm': 0.29713938921676397, 'learning_rate': 4.539645580787845e-06, 'epoch': 0.54} 54%|█████▍ | 12009/22095 [20:22:19<16:56:10, 6.05s/it] 54%|█████▍ | 12010/22095 [20:22:28<19:59:31, 7.14s/it] {'loss': 0.4816, 'grad_norm': 0.30752786135562743, 'learning_rate': 4.538915777832531e-06, 'epoch': 0.54} 54%|█████▍ | 12010/22095 [20:22:28<19:59:31, 7.14s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047574 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 4.5\nB. 7\nC. 2\nD. 2.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 54%|█████▍ | 12011/22095 [20:22:39<22:50:46, 8.16s/it] {'loss': 0.4879, 'grad_norm': 0.30628064591608223, 'learning_rate': 4.538185984784501e-06, 'epoch': 0.54} 54%|█████▍ | 12011/22095 [20:22:39<22:50:46, 8.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (41329 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 12012/22095 [20:22:42<18:52:55, 6.74s/it] {'loss': 0.3286, 'grad_norm': 0.6034792349851319, 'learning_rate': 4.537456201659437e-06, 'epoch': 0.54} 54%|█████▍ | 12012/22095 [20:22:42<18:52:55, 6.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41931 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84618 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42975 > 40960). Running this sequence through the model will result in indexing errors 54%|█████▍ | 12013/22095 [20:22:46<16:25:44, 5.87s/it] {'loss': 0.3018, 'grad_norm': 0.6460022175329486, 'learning_rate': 4.536726428473017e-06, 'epoch': 0.54} 54%|█████▍ | 12013/22095 [20:22:46<16:25:44, 5.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58775 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45520 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (53503 > 40960) for 4 sample(s). Truncating to 37083 with 2 samples. 
54%|█████▍ | 12014/22095 [20:22:50<14:41:42, 5.25s/it] {'loss': 0.3531, 'grad_norm': 0.6330256849552708, 'learning_rate': 4.535996665240923e-06, 'epoch': 0.54} 54%|█████▍ | 12014/22095 [20:22:50<14:41:42, 5.25s/it] 54%|█████▍ | 12015/22095 [20:22:53<13:01:37, 4.65s/it] {'loss': 0.3037, 'grad_norm': 1.0554318969867602, 'learning_rate': 4.535266911978838e-06, 'epoch': 0.54} 54%|█████▍ | 12015/22095 [20:22:53<13:01:37, 4.65s/it] 54%|█████▍ | 12016/22095 [20:22:58<12:53:03, 4.60s/it] {'loss': 0.3666, 'grad_norm': 1.0406055103027143, 'learning_rate': 4.534537168702437e-06, 'epoch': 0.54} 54%|█████▍ | 12016/22095 [20:22:58<12:53:03, 4.60s/it] 54%|█████▍ | 12017/22095 [20:23:01<12:01:32, 4.30s/it] {'loss': 0.318, 'grad_norm': 0.6586450980212927, 'learning_rate': 4.533807435427404e-06, 'epoch': 0.54} 54%|█████▍ | 12017/22095 [20:23:01<12:01:32, 4.30s/it] 54%|█████▍ | 12018/22095 [20:23:05<11:07:24, 3.97s/it] {'loss': 0.3146, 'grad_norm': 0.6321733642129694, 'learning_rate': 4.533077712169418e-06, 'epoch': 0.54} 54%|█████▍ | 12018/22095 [20:23:05<11:07:24, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70025 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48009 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45433 > 40960). 
Running this sequence through the model will result in indexing errors 54%|█████▍ | 12019/22095 [20:23:07<10:13:52, 3.66s/it] {'loss': 0.2886, 'grad_norm': 0.5850842470544956, 'learning_rate': 4.532347998944158e-06, 'epoch': 0.54} 54%|█████▍ | 12019/22095 [20:23:07<10:13:52, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 54%|█████▍ | 12020/22095 [20:23:15<13:25:09, 4.80s/it] {'loss': 0.4809, 'grad_norm': 0.356016927067565, 'learning_rate': 4.531618295767301e-06, 'epoch': 0.54} 54%|█████▍ | 12020/22095 [20:23:15<13:25:09, 4.80s/it] 54%|█████▍ | 12021/22095 [20:23:23<16:02:51, 5.73s/it] {'loss': 0.4538, 'grad_norm': 0.30494498859019264, 'learning_rate': 4.53088860265453e-06, 'epoch': 0.54} 54%|█████▍ | 12021/22095 [20:23:23<16:02:51, 5.73s/it] 54%|█████▍ | 12022/22095 [20:23:32<19:19:38, 6.91s/it] {'loss': 0.4459, 'grad_norm': 0.27791137503165086, 'learning_rate': 4.5301589196215214e-06, 'epoch': 0.54} 54%|█████▍ | 12022/22095 [20:23:33<19:19:38, 6.91s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 54%|█████▍ | 12023/22095 [20:23:36<16:50:29, 6.02s/it] {'loss': 0.3609, 'grad_norm': 0.7059627342048436, 'learning_rate': 4.529429246683956e-06, 'epoch': 0.54} 54%|█████▍ | 12023/22095 [20:23:36<16:50:29, 6.02s/it] 54%|█████▍ | 12024/22095 [20:23:40<14:40:08, 5.24s/it] {'loss': 0.333, 'grad_norm': 0.6517913855189322, 'learning_rate': 4.52869958385751e-06, 'epoch': 0.54} 54%|█████▍ | 12024/22095 [20:23:40<14:40:08, 5.24s/it] 54%|█████▍ | 12025/22095 [20:23:43<12:45:40, 4.56s/it] {'loss': 0.3143, 'grad_norm': 0.6495883155524725, 'learning_rate': 4.527969931157863e-06, 'epoch': 0.54} 54%|█████▍ | 12025/22095 [20:23:43<12:45:40, 4.56s/it] 54%|█████▍ | 12026/22095 [20:23:47<12:12:00, 4.36s/it] {'loss': 0.3674, 'grad_norm': 0.633776785633181, 'learning_rate': 4.5272402886006904e-06, 'epoch': 0.54} 54%|█████▍ | 12026/22095 [20:23:47<12:12:00, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
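The run also keeps failing with `ValueError: Image size [..., 100, 100] is too small. Minimum size is 28.` for samples whose recorded `image_wh` has a side below 28 px (e.g. `[[0, 0]]`, `[[14, 23]]`, `[[132, 18]]`). The exact check inside `data_qwen_2.py` is not visible in this log, and the messages do not make clear whether width, height, or a crop box is tested, so the following is only a hypothetical dataset pre-filter under the assumption that any side below 28 px should be dropped before training; `is_valid_sample` and `MIN_SIDE` are invented names.

```python
# Hypothetical pre-filter for the "Image size ... is too small" errors.
# Assumption (not confirmed by the log): a sample is unusable when any
# recorded image dimension is below the 28 px minimum the error reports.
MIN_SIDE = 28

def is_valid_sample(sample, min_side=MIN_SIDE):
    """Return False for samples whose image_wh would likely trip the
    minimum-size ValueError seen repeatedly in this run."""
    for wh in sample.get("image_wh") or []:
        w, h = wh
        if w < min_side or h < min_side:
            return False
    return True
```

Filtering the index with this predicate before training (rather than catching the `ValueError` per fetch, as the `[Try #0] Failed to fetch sample …` retries above do) would avoid the repeated mid-epoch fetch failures.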
54%|█████▍ | 12027/22095 [20:23:55<15:46:38, 5.64s/it] {'loss': 0.4945, 'grad_norm': 0.3472350589231652, 'learning_rate': 4.526510656201673e-06, 'epoch': 0.54}
54%|█████▍ | 12028/22095 [20:23:59<14:07:36, 5.05s/it] {'loss': 0.3017, 'grad_norm': 0.682612710667583, 'learning_rate': 4.525781033976489e-06, 'epoch': 0.54}
54%|█████▍ | 12029/22095 [20:24:02<12:19:20, 4.41s/it] {'loss': 0.3625, 'grad_norm': 0.7569396141176389, 'learning_rate': 4.525051421940813e-06, 'epoch': 0.54}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
54%|█████▍ | 12030/22095 [20:24:05<11:33:23, 4.13s/it] {'loss': 0.3546, 'grad_norm': 0.7007330874287369, 'learning_rate': 4.524321820110322e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (55399 > 40960). Running this sequence through the model will result in indexing errors
54%|█████▍ | 12031/22095 [20:24:08<10:24:54, 3.73s/it] {'loss': 0.3169, 'grad_norm': 0.624317023029584, 'learning_rate': 4.523592228500696e-06, 'epoch': 0.54}
54%|█████▍ | 12032/22095 [20:24:11<9:52:10, 3.53s/it] {'loss': 0.2608, 'grad_norm': 0.6360197728590014, 'learning_rate': 4.522862647127609e-06, 'epoch': 0.54}
54%|█████▍ | 12033/22095 [20:24:14<9:20:35, 3.34s/it] {'loss': 0.2985, 'grad_norm': 0.622608362495375, 'learning_rate': 4.5221330760067386e-06, 'epoch': 0.54}
54%|█████▍ | 12034/22095 [20:24:17<8:59:33, 3.22s/it] {'loss': 0.2987, 'grad_norm': 0.6777283117689842, 'learning_rate': 4.521403515153762e-06, 'epoch': 0.54}
54%|█████▍ | 12035/22095 [20:24:21<9:35:28, 3.43s/it] {'loss': 0.312, 'grad_norm': 0.6558982457517104, 'learning_rate': 4.520673964584351e-06, 'epoch': 0.54}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367290 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34038, 'image': 'vrdu_table_final_2/astro-ph.CO/21451a65-6455-4b52-9129-25a95dacfcfc.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946049 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
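Editor's note: the "Image size ... is too small. Minimum size is 28." failures above all originate from samples whose annotated `image_wh` has a side below 28 px. A minimal sketch of an offline pre-filter for such samples follows — this is an assumption about a possible fix, not part of the training code; `is_large_enough`, `filter_samples`, and `MIN_IMAGE_SIDE` are hypothetical names, and the sample-dict shape is taken from the "Problematic sample" records in this log.

```python
# Hypothetical pre-filter (not from the training code): drop samples whose
# annotated image size falls below the 28-px minimum reported in the log,
# instead of letting the dataloader raise mid-epoch.
MIN_IMAGE_SIDE = 28  # minimum side length, per the ValueError messages above

def is_large_enough(sample, min_side=MIN_IMAGE_SIDE):
    """True when every [width, height] pair in sample['image_wh'] meets min_side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def filter_samples(samples):
    """Split a sample list into (kept, dropped) by the minimum-size check."""
    kept, dropped = [], []
    for s in samples:
        (kept if is_large_enough(s) else dropped).append(s)
    return kept, dropped
```

Running such a filter once over the annotation files would also remove the repeated `[Try #0] Failed to fetch sample ...` retries seen throughout this log.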
Problematic sample: {'id': 69202, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
54%|█████▍ | 12036/22095 [20:24:24<9:09:25, 3.28s/it] {'loss': 0.273, 'grad_norm': 0.5762145784272449, 'learning_rate': 4.519944424314186e-06, 'epoch': 0.54}
54%|█████▍ | 12037/22095 [20:24:27<8:51:27, 3.17s/it] {'loss': 0.3186, 'grad_norm': 0.7052212045652171, 'learning_rate': 4.519214894358942e-06, 'epoch': 0.54}
54%|█████▍ | 12038/22095 [20:24:31<9:39:34, 3.46s/it] {'loss': 0.3102, 'grad_norm': 0.6014994041697003, 'learning_rate': 4.5184853747342926e-06, 'epoch': 0.54}
Invalidate trace cache @ step 2: expected module 1, but got module 364
54%|█████▍ | 12039/22095 [20:24:40<14:31:08, 5.20s/it] {'loss': 0.4806, 'grad_norm': 0.3189094508896656, 'learning_rate': 4.517755865455912e-06, 'epoch': 0.54}
54%|█████▍ | 12040/22095 [20:24:44<13:08:04, 4.70s/it] {'loss': 0.3474, 'grad_norm': 0.6859969141969133, 'learning_rate': 4.517026366539477e-06, 'epoch': 0.54}
Token indices sequence length is longer than the specified maximum sequence length for this model (48143, 63864, 62538, 43914 > 40960). Running these sequences through the model will result in indexing errors
54%|█████▍ | 12041/22095 [20:24:47<11:59:48, 4.30s/it] {'loss': 0.323, 'grad_norm': 0.6115840685105562, 'learning_rate': 4.516296878000664e-06, 'epoch': 0.54}
55%|█████▍ | 12042/22095 [20:24:51<11:27:59, 4.11s/it] {'loss': 0.3155, 'grad_norm': 0.6546584864922931, 'learning_rate': 4.515567399855145e-06, 'epoch': 0.55}
55%|█████▍ | 12043/22095 [20:24:54<10:33:42, 3.78s/it] {'loss': 0.3146, 'grad_norm': 0.600731391425279, 'learning_rate': 4.514837932118593e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (65438, 49804 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12044/22095 [20:24:57<9:59:37, 3.58s/it] {'loss': 0.295, 'grad_norm': 0.5894286013671077, 'learning_rate': 4.514108474806687e-06, 'epoch': 0.55}
55%|█████▍ | 12045/22095 [20:25:00<9:21:00, 3.35s/it] {'loss': 0.3349, 'grad_norm': 0.7288286128601018, 'learning_rate': 4.513379027935094e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12046/22095 [20:25:03<8:57:22, 3.21s/it] {'loss': 0.3451, 'grad_norm': 0.5718145121685675, 'learning_rate': 4.5126495915194936e-06, 'epoch': 0.55}
55%|█████▍ | 12047/22095 [20:25:06<9:00:15, 3.23s/it] {'loss': 0.3341, 'grad_norm': 0.6346709284490044, 'learning_rate': 4.5119201655755565e-06, 'epoch': 0.55}
55%|█████▍ | 12048/22095 [20:25:09<9:02:55, 3.24s/it] {'loss': 0.3268, 'grad_norm': 0.6521556040336637, 'learning_rate': 4.511190750118955e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12049/22095 [20:25:19<14:13:45, 5.10s/it] {'loss': 0.4741, 'grad_norm': 0.3027426900568299, 'learning_rate': 4.510461345165362e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (60255, 74332, 63269, 58448, 42416 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12050/22095 [20:25:23<13:19:49, 4.78s/it] {'loss': 0.3516, 'grad_norm': 0.649681449175025, 'learning_rate': 4.509731950730454e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12051/22095 [20:25:26<11:52:28, 4.26s/it] {'loss': 0.3321, 'grad_norm': 0.7190082909611529, 'learning_rate': 4.509002566829899e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12052/22095 [20:25:35<16:23:55, 5.88s/it] {'loss': 0.4746, 'grad_norm': 0.30540548048002153, 'learning_rate': 4.508273193479371e-06, 'epoch': 0.55}
55%|█████▍ | 12053/22095 [20:25:39<14:46:16, 5.30s/it] {'loss': 0.3647, 'grad_norm': 0.6481182817233865, 'learning_rate': 4.507543830694543e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (91720, 107976, 59216, 93102 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12054/22095 [20:25:43<13:23:03, 4.80s/it] {'loss': 0.308, 'grad_norm': 0.6334004009278218, 'learning_rate': 4.506814478491084e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (86368, 47204 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12055/22095 [20:25:46<11:52:08, 4.26s/it] {'loss': 0.3341, 'grad_norm': 0.6768165181731143, 'learning_rate': 4.506085136884667e-06, 'epoch': 0.55}
55%|█████▍ | 12056/22095 [20:25:49<10:40:45, 3.83s/it] {'loss': 0.3679, 'grad_norm': 0.6072803233190442, 'learning_rate': 4.505355805890964e-06, 'epoch': 0.55}
55%|█████▍ | 12057/22095 [20:25:52<9:59:27, 3.58s/it] {'loss': 0.2941, 'grad_norm': 0.6449685267219651, 'learning_rate': 4.504626485525647e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12058/22095 [20:26:00<13:37:40, 4.89s/it] {'loss': 0.4699, 'grad_norm': 0.3259735884598887, 'learning_rate': 4.503897175804383e-06, 'epoch': 0.55}
55%|█████▍ | 12059/22095 [20:26:03<12:30:36, 4.49s/it] {'loss': 0.3379, 'grad_norm': 0.622368199343704, 'learning_rate': 4.503167876742846e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (43406 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▍ | 12060/22095 [20:26:06<11:16:11, 4.04s/it] {'loss': 0.3012, 'grad_norm': 0.6202468893832425, 'learning_rate': 4.502438588356707e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (48009 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▍ | 12061/22095 [20:26:09<10:16:47, 3.69s/it] {'loss': 0.297, 'grad_norm': 0.6416966220784046, 'learning_rate': 4.501709310661632e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12062/22095 [20:26:17<13:57:15, 5.01s/it] {'loss': 0.5102, 'grad_norm': 0.4889086177698067, 'learning_rate': 4.500980043673295e-06, 'epoch': 0.55}
55%|█████▍ | 12063/22095 [20:26:28<18:38:40, 6.69s/it] {'loss': 0.4713, 'grad_norm': 0.25808571648322653, 'learning_rate': 4.5002507874073655e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 364, but got module 1
55%|█████▍ | 12064/22095 [20:26:32<16:24:27, 5.89s/it] {'loss': 0.3067, 'grad_norm': 0.6337181113011422, 'learning_rate': 4.499521541879508e-06, 'epoch': 0.55}
55%|█████▍ | 12065/22095 [20:26:36<15:05:26, 5.42s/it] {'loss': 0.3187, 'grad_norm': 0.5806705729297652, 'learning_rate': 4.498792307105398e-06, 'epoch': 0.55}
55%|█████▍ | 12066/22095 [20:26:40<13:24:30, 4.81s/it] {'loss': 0.3391, 'grad_norm': 0.6180438362672532, 'learning_rate': 4.498063083100703e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12067/22095 [20:26:43<12:23:49, 4.45s/it] {'loss': 0.3072, 'grad_norm': 0.6576832425490626, 'learning_rate': 4.497333869881089e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12068/22095 [20:26:52<16:21:53, 5.88s/it] {'loss': 0.4592, 'grad_norm': 0.32216628821352994, 'learning_rate': 4.496604667462225e-06, 'epoch': 0.55}
55%|█████▍ | 12069/22095 [20:26:56<14:32:53, 5.22s/it] {'loss': 0.3309, 'grad_norm': 0.7037977697870069, 'learning_rate': 4.495875475859783e-06, 'epoch': 0.55}
55%|█████▍ | 12070/22095 [20:27:00<13:40:40, 4.91s/it] {'loss': 0.3421, 'grad_norm': 0.6512723214925447, 'learning_rate': 4.495146295089428e-06, 'epoch': 0.55}
55%|█████▍ | 12071/22095 [20:27:04<12:59:45, 4.67s/it] {'loss': 0.3383, 'grad_norm': 0.6249409724190828, 'learning_rate': 4.49441712516683e-06, 'epoch': 0.55}
55%|█████▍ | 12072/22095 [20:27:08<11:57:28, 4.29s/it] {'loss': 0.3084, 'grad_norm': 0.5988598636905825, 'learning_rate': 4.493687966107652e-06, 'epoch': 0.55}
55%|█████▍ | 12073/22095 [20:27:11<10:39:28, 3.83s/it] {'loss': 0.3239, 'grad_norm': 0.666932683877474, 'learning_rate': 4.492958817927569e-06, 'epoch': 0.55}
55%|█████▍ | 12074/22095 [20:27:14<10:37:11, 3.82s/it] {'loss': 0.3071, 'grad_norm': 0.7087470576939989, 'learning_rate': 4.492229680642239e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12075/22095 [20:27:19<11:07:49, 4.00s/it] {'loss': 0.3713, 'grad_norm': 0.5911853754264186, 'learning_rate': 4.4915005542673365e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (43711, 46352, 90945, 41377 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12076/22095 [20:27:22<10:15:25, 3.69s/it] {'loss': 0.3105, 'grad_norm': 0.6432801875991805, 'learning_rate': 4.490771438818525e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (72989 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▍ | 12077/22095 [20:27:25<10:12:36, 3.67s/it] {'loss': 0.3419, 'grad_norm': 0.6452758095884515, 'learning_rate': 4.490042334311472e-06, 'epoch': 0.55}
55%|█████▍ | 12078/22095 [20:27:28<9:33:42, 3.44s/it] {'loss': 0.269, 'grad_norm': 0.6089284177796053, 'learning_rate': 4.48931324076184e-06, 'epoch': 0.55}
55%|█████▍ | 12079/22095 [20:27:32<9:30:47, 3.42s/it] {'loss': 0.3078, 'grad_norm': 0.6470458332930936, 'learning_rate': 4.488584158185301e-06, 'epoch': 0.55}
55%|█████▍ | 12080/22095 [20:27:35<9:26:09, 3.39s/it] {'loss': 0.2899, 'grad_norm': 0.6071068578697375, 'learning_rate': 4.487855086597517e-06, 'epoch': 0.55}
55%|█████▍ | 12081/22095 [20:27:39<10:17:56, 3.70s/it] {'loss': 0.3827, 'grad_norm': 0.6715335365000962, 'learning_rate': 4.487126026014154e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359849 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26570, 'image': 'vrdu_table_final_2/astro-ph.CO/23241691-1d22-4a16-a0e1-742d8ad29f99.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
55%|█████▍ | 12082/22095 [20:27:43<10:22:20, 3.73s/it] {'loss': 0.3398, 'grad_norm': 0.6113273788313882, 'learning_rate': 4.486396976450876e-06, 'epoch': 0.55}
55%|█████▍ | 12083/22095 [20:27:47<10:33:47, 3.80s/it] {'loss': 0.2983, 'grad_norm': 0.6478775139279586, 'learning_rate': 4.485667937923352e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12084/22095 [20:27:59<16:58:18, 6.10s/it] {'loss': 0.4905, 'grad_norm': 0.34625857371418256, 'learning_rate': 4.4849389104472435e-06, 'epoch': 0.55}
55%|█████▍ | 12085/22095 [20:28:03<15:31:47, 5.59s/it] {'loss': 0.3285, 'grad_norm': 0.6405866789560921, 'learning_rate': 4.4842098940382155e-06, 'epoch': 0.55}
55%|█████▍ | 12086/22095 [20:28:07<14:09:05, 5.09s/it] {'loss': 0.3227, 'grad_norm': 0.5824949965901939, 'learning_rate': 4.483480888711935e-06, 'epoch': 0.55}
55%|█████▍ | 12087/22095 [20:28:10<12:26:57, 4.48s/it] {'loss': 0.3531, 'grad_norm': 0.6195648132210937, 'learning_rate': 4.4827518944840606e-06, 'epoch': 0.55}
55%|█████▍ | 12088/22095 [20:28:13<11:09:05, 4.01s/it] {'loss': 0.3271, 'grad_norm': 0.6462744084113154, 'learning_rate': 4.48202291137026e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [156, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396961 in VC:s3://internvl-moe-sft-data/. Exception: Image size [156, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63814, 'image': 'vrdu_table_final_2/astro-ph.EP/a2d52c59-19e3-4a52-8ec8-2fcb09e953b7.png', 'image_wh': [[156, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}Earth's orbit\\end{tabular}\n```"}]}
55%|█████▍ | 12089/22095 [20:28:16<10:47:24, 3.88s/it] {'loss': 0.3474, 'grad_norm': 0.6415272922809488, 'learning_rate': 4.481293939386198e-06, 'epoch': 0.55}
55%|█████▍ | 12090/22095 [20:28:19<9:50:19, 3.54s/it] {'loss': 0.3007, 'grad_norm': 0.7149404102600072, 'learning_rate': 4.480564978547535e-06, 'epoch': 0.55}
55%|█████▍ | 12091/22095 [20:28:22<9:24:03, 3.38s/it] {'loss': 0.3111, 'grad_norm': 0.7008732021330117, 'learning_rate': 4.479836028869935e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8948333 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 71486, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 6cm\nB. 6.5cm\nC. 5cm\nD. 5.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
55%|█████▍ | 12092/22095 [20:28:26<9:44:29, 3.51s/it] {'loss': 0.3306, 'grad_norm': 0.583088917034706, 'learning_rate': 4.479107090369063e-06, 'epoch': 0.55}
55%|█████▍ | 12093/22095 [20:28:29<9:08:39, 3.29s/it] {'loss': 0.2929, 'grad_norm': 0.6070470620347299, 'learning_rate': 4.478378163060577e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12094/22095 [20:28:32<8:51:02, 3.19s/it] {'loss': 0.292, 'grad_norm': 0.7040018121624776, 'learning_rate': 4.477649246960144e-06, 'epoch': 0.55}
55%|█████▍ | 12095/22095 [20:28:35<8:42:54, 3.14s/it] {'loss': 0.293, 'grad_norm': 0.6153340024288626, 'learning_rate': 4.476920342083425e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (55646, 90797, 107201, 51924, 89619, 111056, 108412 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12096/22095 [20:28:38<9:10:38, 3.30s/it] {'loss': 0.3593, 'grad_norm': 0.6445238387160752, 'learning_rate': 4.47619144844608e-06, 'epoch': 0.55}
55%|█████▍ | 12097/22095 [20:28:41<8:45:50, 3.16s/it] {'loss': 0.3116, 'grad_norm': 0.5952893306035464, 'learning_rate': 4.475462566063771e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (81381, 134521, 43973 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12098/22095 [20:28:45<9:20:38, 3.36s/it] {'loss': 0.3125, 'grad_norm': 0.5764937417401543, 'learning_rate': 4.474733694952162e-06, 'epoch': 0.55}
55%|█████▍ | 12099/22095 [20:28:49<9:45:37, 3.52s/it] {'loss': 0.3149, 'grad_norm': 0.5910636591909206, 'learning_rate': 4.474004835126913e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12100/22095 [20:28:52<9:38:41, 3.47s/it] {'loss': 0.3249, 'grad_norm': 0.6165891092203315, 'learning_rate': 4.4732759866036846e-06, 'epoch': 0.55}
55%|█████▍ | 12101/22095 [20:28:56<9:24:41, 3.39s/it] {'loss': 0.3324, 'grad_norm': 0.6241000733082706, 'learning_rate': 4.472547149398136e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (44524, 50630, 83125, 60586 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12102/22095 [20:28:59<9:30:07, 3.42s/it] {'loss': 0.284, 'grad_norm': 0.6730294073739986, 'learning_rate': 4.471818323525932e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (44249, 41198, 43647, 108847, 70700 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12103/22095 [20:29:02<8:56:15, 3.22s/it] {'loss': 0.301, 'grad_norm': 0.6396666882471538, 'learning_rate': 4.471089509002728e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12104/22095 [20:29:11<14:04:36, 5.07s/it] {'loss': 0.4623, 'grad_norm': 0.33717396214770234, 'learning_rate': 4.470360705844186e-06, 'epoch': 0.55}
55%|█████▍ | 12105/22095 [20:29:15<12:58:04, 4.67s/it] {'loss': 0.2796, 'grad_norm': 0.5946426926801431, 'learning_rate': 4.469631914065967e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [484, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8511826 in VC:s3://internvl-moe-sft-data/. Exception: Image size [484, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32659, 'image': 'vrdu_texteq/astro-ph.CO/e1adc545-7e92-4cd2-bc65-e81ebe5961ea.png', 'image_wh': [[484, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'The concentration index $C$ is defined as'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12106/22095 [20:29:26<18:05:19, 6.52s/it] {'loss': 0.4492, 'grad_norm': 0.3031311469575686, 'learning_rate': 4.468903133683728e-06, 'epoch': 0.55}
55%|█████▍ | 12107/22095 [20:29:30<16:22:54, 5.90s/it] {'loss': 0.3431, 'grad_norm': 0.5894908589290491, 'learning_rate': 4.4681743647131285e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (53207, 106998, 47318 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12108/22095 [20:29:33<13:51:16, 4.99s/it] {'loss': 0.3392, 'grad_norm': 0.6107167467803513, 'learning_rate': 4.4674456071698315e-06, 'epoch': 0.55}
55%|█████▍ | 12109/22095 [20:29:37<12:33:16, 4.53s/it] {'loss': 0.3126, 'grad_norm': 0.7012730961307557, 'learning_rate': 4.466716861069491e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12110/22095 [20:29:46<16:45:15, 6.04s/it] {'loss': 0.4605, 'grad_norm': 0.29570784932734145, 'learning_rate': 4.465988126427767e-06, 'epoch': 0.55}
55%|█████▍ | 12111/22095 [20:29:51<15:52:01, 5.72s/it] {'loss': 0.4735, 'grad_norm': 0.3330322126985907, 'learning_rate': 4.4652594032603174e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 364, but got module 1
55%|█████▍ | 12112/22095 [20:29:55<14:13:45, 5.13s/it] {'loss': 0.3188, 'grad_norm': 0.6404707219224894, 'learning_rate': 4.4645306915828025e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8921712 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 44865, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 6\nB. 10\nC. 8\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
55%|█████▍ | 12113/22095 [20:29:58<12:55:50, 4.66s/it] {'loss': 0.2984, 'grad_norm': 0.6781019519562312, 'learning_rate': 4.463801991410878e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12114/22095 [20:30:03<12:45:31, 4.60s/it] {'loss': 0.3858, 'grad_norm': 0.6193509541981514, 'learning_rate': 4.463073302760202e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (47939, 50730, 59285, 66632 > 40960). Running these sequences through the model will result in indexing errors
55%|█████▍ | 12115/22095 [20:30:07<12:05:33, 4.36s/it] {'loss': 0.3567, 'grad_norm': 0.6195827035062909, 'learning_rate': 4.462344625646433e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▍ | 12116/22095 [20:30:16<16:14:48, 5.86s/it] {'loss': 0.4641, 'grad_norm': 0.29883589279706985, 'learning_rate': 4.461615960085224e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▍ | 12117/22095 [20:30:19<14:06:12, 5.09s/it] {'loss': 0.3359, 'grad_norm': 0.6455256843550773, 'learning_rate': 4.460887306092236e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (41381, 47271, 71299 > 40960).
Running this sequence through the model will result in indexing errors 55%|█████▍ | 12118/22095 [20:30:22<12:29:26, 4.51s/it] {'loss': 0.3094, 'grad_norm': 0.6054056715448799, 'learning_rate': 4.460158663683125e-06, 'epoch': 0.55} 55%|█████▍ | 12118/22095 [20:30:22<12:29:26, 4.51s/it] 55%|█████▍ | 12119/22095 [20:30:25<11:06:30, 4.01s/it] {'loss': 0.3059, 'grad_norm': 0.641684405299334, 'learning_rate': 4.459430032873545e-06, 'epoch': 0.55} 55%|█████▍ | 12119/22095 [20:30:25<11:06:30, 4.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (138080 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95619 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120338 > 40960). Running this sequence through the model will result in indexing errors 55%|█████▍ | 12120/22095 [20:30:29<11:01:27, 3.98s/it] {'loss': 0.3443, 'grad_norm': 0.6473034497692319, 'learning_rate': 4.458701413679152e-06, 'epoch': 0.55} 55%|█████▍ | 12120/22095 [20:30:29<11:01:27, 3.98s/it] 55%|█████▍ | 12121/22095 [20:30:33<10:39:21, 3.85s/it] {'loss': 0.2921, 'grad_norm': 0.578951718971156, 'learning_rate': 4.457972806115607e-06, 'epoch': 0.55} 55%|█████▍ | 12121/22095 [20:30:33<10:39:21, 3.85s/it] 55%|█████▍ | 12122/22095 [20:30:36<10:22:55, 3.75s/it] {'loss': 0.311, 'grad_norm': 0.66634270501242, 'learning_rate': 4.4572442101985584e-06, 'epoch': 0.55} 55%|█████▍ | 12122/22095 [20:30:37<10:22:55, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 55%|█████▍ | 12123/22095 [20:30:46<15:39:19, 5.65s/it] {'loss': 0.4561, 'grad_norm': 0.2952537233831481, 'learning_rate': 4.456515625943666e-06, 'epoch': 0.55} 55%|█████▍ | 12123/22095 [20:30:46<15:39:19, 5.65s/it]Traceback 
(most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [117, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8390930 in VC:s3://internvl-moe-sft-data/. Exception: Image size [117, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57750, 'image': 'vrdu_table_final_2/astro-ph.EP/1050f22e-6c02-4230-b15d-a5b4d3ee2c48.png', 'image_wh': [[117, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}}\\textbf{Solution} \\end{tabular}\n```"}]} 55%|█████▍ | 12124/22095 [20:30:49<13:33:35, 4.90s/it] {'loss': 0.3031, 'grad_norm': 0.6761764609665392, 'learning_rate': 4.455787053366583e-06, 'epoch': 0.55} 55%|█████▍ | 12124/22095 [20:30:50<13:33:35, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41157 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50013 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▍ | 12125/22095 [20:30:53<12:26:35, 4.49s/it] {'loss': 0.331, 'grad_norm': 0.6259223256492764, 'learning_rate': 4.455058492482966e-06, 'epoch': 0.55} 55%|█████▍ | 12125/22095 [20:30:53<12:26:35, 4.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (73157 > 40960). Running this sequence through the model will result in indexing errors Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8917249 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. 
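The recurring `ValueError: Image size [...] is too small. Minimum size is 28` failures above all come from samples whose `image_wh` falls below the loader's 28-pixel minimum edge length. A pre-filter along these lines could drop such samples before training instead of at fetch time. This is only a sketch: the `image_wh` field name follows the "Problematic sample" dicts printed in this log, and the 28-pixel threshold is taken from the error message, not from the repo's actual validation code.

```python
# Sketch: screen out samples whose image is below the loader's minimum
# edge length (28 px, per the ValueError in the log). Field names follow
# the "Problematic sample" dicts shown above; this is an assumption, not
# the training pipeline's real filter.
MIN_SIZE = 28

def is_valid_sample(sample):
    """Return True if every image in the sample meets the minimum size."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIZE or h < MIN_SIZE:
            return False
    return True

# Example shaped like the failing/passing samples in the log:
samples = [
    {"id": 44865, "image_wh": [[164, 26]]},   # 26 px tall -> rejected
    {"id": 32659, "image_wh": [[484, 230]]},  # large enough -> kept
]
kept = [s for s in samples if is_valid_sample(s)]
```

Running such a filter once over the dataset manifest would also surface degenerate entries like the `[0, 0]`-sized image seen later in this log.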
Problematic sample: {'id': 40402, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 55%|█████▍ | 12126/22095 [20:30:59<13:23:26, 4.84s/it] {'loss': 0.458, 'grad_norm': 0.27580866668673604, 'learning_rate': 4.454329943308466e-06, 'epoch': 0.55} 55%|█████▍ | 12126/22095 [20:30:59<13:23:26, 4.84s/it] 55%|█████▍ | 12127/22095 [20:31:02<12:28:22, 4.50s/it] {'loss': 0.3242, 'grad_norm': 0.693568649449279, 'learning_rate': 4.453601405858741e-06, 'epoch': 0.55} 55%|█████▍ | 12127/22095 [20:31:03<12:28:22, 4.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 55%|█████▍ | 12128/22095 [20:31:13<17:07:58, 6.19s/it] {'loss': 0.501, 'grad_norm': 0.29380132883853927, 'learning_rate': 4.4528728801494455e-06, 'epoch': 0.55} 55%|█████▍ | 12128/22095 [20:31:13<17:07:58, 6.19s/it] 55%|█████▍ | 12129/22095 [20:31:16<14:30:55, 5.24s/it] {'loss': 0.297, 'grad_norm': 0.6456773713755553, 'learning_rate': 4.452144366196229e-06, 'epoch': 0.55} 55%|█████▍ | 12129/22095 [20:31:16<14:30:55, 5.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 55%|█████▍ | 12130/22095 [20:31:19<12:51:19, 4.64s/it] {'loss': 0.3567, 'grad_norm': 0.6466053776304937, 'learning_rate': 4.451415864014747e-06, 'epoch': 0.55} 55%|█████▍ | 12130/22095 [20:31:19<12:51:19, 4.64s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_13/images/20250417140108.png 2025-08-28 12:29:17.882004 load time: 1001.99 ms 55%|█████▍ | 12131/22095 [20:31:23<12:38:09, 4.57s/it] {'loss': 0.2942, 'grad_norm': 0.6261122547635842, 'learning_rate': 4.450687373620656e-06, 'epoch': 0.55} 55%|█████▍ | 12131/22095 [20:31:23<12:38:09, 4.57s/it] 55%|█████▍ | 12132/22095 [20:31:27<11:47:31, 4.26s/it] {'loss': 0.3534, 'grad_norm': 
0.6552351948762685, 'learning_rate': 4.449958895029604e-06, 'epoch': 0.55} 55%|█████▍ | 12132/22095 [20:31:27<11:47:31, 4.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (61288 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95973 > 40960). Running this sequence through the model will result in indexing errors 55%|█████▍ | 12133/22095 [20:31:34<14:26:06, 5.22s/it] {'loss': 0.4713, 'grad_norm': 0.28897300839931045, 'learning_rate': 4.449230428257247e-06, 'epoch': 0.55} 55%|█████▍ | 12133/22095 [20:31:34<14:26:06, 5.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [523, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8429970 in VC:s3://internvl-moe-sft-data/. Exception: Image size [523, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 38610, 'image': 'vrdu_texteq/astro-ph.CO/ebff97a1-5a4e-4aa9-87e7-5296d6c08e67.png', 'image_wh': [[523, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $z_i$ is the middle of the redshift bin $i$.'}]} 55%|█████▍ | 12134/22095 [20:31:37<12:40:13, 4.58s/it] {'loss': 0.3511, 'grad_norm': 0.6544743726214356, 'learning_rate': 4.448501973319237e-06, 'epoch': 0.55} 55%|█████▍ | 12134/22095 [20:31:37<12:40:13, 4.58s/it] 55%|█████▍ | 12135/22095 [20:31:41<11:56:50, 4.32s/it] {'loss': 0.3179, 'grad_norm': 0.6922051336111326, 'learning_rate': 4.447773530231225e-06, 'epoch': 0.55} 55%|█████▍ | 12135/22095 [20:31:41<11:56:50, 4.32s/it] 55%|█████▍ | 12136/22095 [20:31:45<11:22:19, 4.11s/it] {'loss': 0.3338, 'grad_norm': 0.7139447536673934, 'learning_rate': 4.447045099008863e-06, 'epoch': 0.55} 55%|█████▍ | 12136/22095 [20:31:45<11:22:19, 4.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 55%|█████▍ | 12137/22095 [20:31:48<10:37:35, 3.84s/it] {'loss': 0.3448, 'grad_norm': 0.6248896889526092, 'learning_rate': 4.446316679667805e-06, 'epoch': 0.55} 55%|█████▍ | 12137/22095 [20:31:48<10:37:35, 3.84s/it] 55%|█████▍ | 12138/22095 [20:31:51<10:01:31, 3.62s/it] {'loss': 0.3224, 'grad_norm': 0.6580828432429198, 'learning_rate': 4.445588272223701e-06, 'epoch': 0.55} 55%|█████▍ | 12138/22095 [20:31:51<10:01:31, 3.62s/it] 55%|█████▍ | 12139/22095 [20:31:54<9:29:48, 3.43s/it] {'loss': 0.3264, 'grad_norm': 0.6255330996021665, 'learning_rate': 4.4448598766922005e-06, 'epoch': 0.55} 55%|█████▍ | 12139/22095 [20:31:54<9:29:48, 3.43s/it] 55%|█████▍ | 12140/22095 [20:31:57<9:22:25, 3.39s/it] {'loss': 0.3109, 'grad_norm': 0.655788026482158, 'learning_rate': 4.444131493088956e-06, 'epoch': 0.55} 55%|█████▍ | 12140/22095 [20:31:57<9:22:25, 3.39s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045968 in VC:s3://multi-modal/UniGeo/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 2\nB. 3\nC. 4\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:根据题意,AC=12cm,CB=\\frac{2}{3}AC,所以CB=8cm,所以AB=AC+CB=20cm,又D、E分别为AC、AB的中点,所以DE=AE-AD=\\frac{1}{2}(AB-AC)=4cm.即DE=4cm.'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946048 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 69201, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 8\nB. 4\nC. 6\nD. 
7.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 55%|█████▍ | 12141/22095 [20:32:04<11:51:17, 4.29s/it] {'loss': 0.4838, 'grad_norm': 0.33868090853487126, 'learning_rate': 4.443403121429621e-06, 'epoch': 0.55} 55%|█████▍ | 12141/22095 [20:32:04<11:51:17, 4.29s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 55%|█████▍ | 12142/22095 [20:32:07<11:00:21, 3.98s/it] {'loss': 0.2962, 'grad_norm': 0.6138962693984958, 'learning_rate': 4.442674761729843e-06, 'epoch': 0.55} 55%|█████▍ | 12142/22095 [20:32:07<11:00:21, 3.98s/it] 55%|█████▍ | 12143/22095 [20:32:10<10:15:28, 3.71s/it] {'loss': 0.3167, 'grad_norm': 0.6139537752754014, 'learning_rate': 4.441946414005272e-06, 'epoch': 0.55} 55%|█████▍ | 12143/22095 [20:32:10<10:15:28, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (141708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53464 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▍ | 12144/22095 [20:32:13<9:43:23, 3.52s/it] {'loss': 0.2908, 'grad_norm': 0.5965832383615192, 'learning_rate': 4.44121807827156e-06, 'epoch': 0.55} 55%|█████▍ | 12144/22095 [20:32:13<9:43:23, 3.52s/it] 55%|█████▍ | 12145/22095 [20:32:17<9:57:06, 3.60s/it] {'loss': 0.2979, 'grad_norm': 0.6017878823620227, 'learning_rate': 4.4404897545443525e-06, 'epoch': 0.55} 55%|█████▍ | 12145/22095 [20:32:17<9:57:06, 3.60s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/finder_3/images/step_0.png 2025-08-28 12:30:16.114619 load time: 1138.0 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 55%|█████▍ | 12146/22095 [20:32:20<9:42:25, 3.51s/it] {'loss': 0.2946, 'grad_norm': 0.6192189887678029, 'learning_rate': 4.439761442839303e-06, 'epoch': 0.55} 55%|█████▍ | 12146/22095 [20:32:20<9:42:25, 3.51s/it] 55%|█████▍ | 12147/22095 [20:32:23<9:12:38, 3.33s/it] {'loss': 0.3149, 'grad_norm': 0.6770136591794844, 'learning_rate': 4.439033143172061e-06, 'epoch': 0.55} 55%|█████▍ | 12147/22095 [20:32:23<9:12:38, 3.33s/it] 55%|█████▍ | 12148/22095 [20:32:26<8:54:23, 3.22s/it] {'loss': 0.2928, 'grad_norm': 0.6328717534830032, 'learning_rate': 4.4383048555582725e-06, 'epoch': 0.55} 55%|█████▍ | 12148/22095 [20:32:26<8:54:23, 3.22s/it] 55%|█████▍ | 12149/22095 [20:32:29<8:35:05, 3.11s/it] {'loss': 0.3373, 'grad_norm': 0.6284658393596169, 'learning_rate': 4.437576580013587e-06, 'epoch': 0.55} 55%|█████▍ | 12149/22095 [20:32:29<8:35:05, 3.11s/it] 55%|█████▍ | 12150/22095 [20:32:32<8:59:48, 3.26s/it] {'loss': 0.352, 'grad_norm': 0.6863147237521107, 'learning_rate': 4.436848316553655e-06, 'epoch': 0.55} 55%|█████▍ | 12150/22095 [20:32:32<8:59:48, 3.26s/it] 55%|█████▍ | 12151/22095 [20:32:36<9:29:11, 3.43s/it] {'loss': 0.3245, 'grad_norm': 0.6818944988875426, 'learning_rate': 4.436120065194121e-06, 'epoch': 0.55} 55%|█████▍ | 
12151/22095 [20:32:36<9:29:11, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 55%|█████▍ | 12152/22095 [20:32:46<14:25:46, 5.22s/it] {'loss': 0.4963, 'grad_norm': 0.3969587015992509, 'learning_rate': 4.435391825950637e-06, 'epoch': 0.55} 55%|█████▍ | 12152/22095 [20:32:46<14:25:46, 5.22s/it] 55%|█████▌ | 12153/22095 [20:32:49<12:46:29, 4.63s/it] {'loss': 0.2811, 'grad_norm': 0.6994559752644615, 'learning_rate': 4.434663598838847e-06, 'epoch': 0.55} 55%|█████▌ | 12153/22095 [20:32:49<12:46:29, 4.63s/it] 55%|█████▌ | 12154/22095 [20:32:52<11:24:15, 4.13s/it] {'loss': 0.2816, 'grad_norm': 0.5782646329030336, 'learning_rate': 4.4339353838744024e-06, 'epoch': 0.55} 55%|█████▌ | 12154/22095 [20:32:52<11:24:15, 4.13s/it] 55%|█████▌ | 12155/22095 [20:32:55<10:45:57, 3.90s/it] {'loss': 0.3159, 'grad_norm': 0.6333818254219354, 'learning_rate': 4.433207181072945e-06, 'epoch': 0.55} 55%|█████▌ | 12155/22095 [20:32:55<10:45:57, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53785 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85083 > 40960). Running this sequence through the model will result in indexing errors 55%|█████▌ | 12156/22095 [20:32:59<10:45:15, 3.90s/it] {'loss': 0.3255, 'grad_norm': 0.6194011645499826, 'learning_rate': 4.432478990450126e-06, 'epoch': 0.55} 55%|█████▌ | 12156/22095 [20:32:59<10:45:15, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41623 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71417 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▌ | 12157/22095 [20:33:03<10:24:04, 3.77s/it] {'loss': 0.2947, 'grad_norm': 0.680557849521606, 'learning_rate': 4.431750812021591e-06, 'epoch': 0.55} 55%|█████▌ | 12157/22095 [20:33:03<10:24:04, 3.77s/it] 55%|█████▌ | 12158/22095 [20:33:05<9:36:32, 3.48s/it] {'loss': 0.3302, 'grad_norm': 0.6530503515211541, 'learning_rate': 4.431022645802985e-06, 'epoch': 0.55} 55%|█████▌ | 12158/22095 [20:33:05<9:36:32, 3.48s/it] 55%|█████▌ | 12159/22095 [20:33:08<9:11:44, 3.33s/it] {'loss': 0.3135, 'grad_norm': 0.6781429736989625, 'learning_rate': 4.430294491809954e-06, 'epoch': 0.55} 55%|█████▌ | 12159/22095 [20:33:08<9:11:44, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 55%|█████▌ | 12160/22095 [20:33:12<9:31:36, 3.45s/it] {'loss': 0.3066, 'grad_norm': 0.6310181148735358, 'learning_rate': 4.429566350058146e-06, 'epoch': 0.55} 55%|█████▌ | 12160/22095 [20:33:12<9:31:36, 3.45s/it] 55%|█████▌ | 12161/22095 [20:33:16<9:55:23, 3.60s/it] {'loss': 0.3766, 'grad_norm': 0.6180237070682488, 'learning_rate': 4.428838220563205e-06, 'epoch': 0.55} 55%|█████▌ | 12161/22095 [20:33:16<9:55:23, 3.60s/it] 55%|█████▌ | 12162/22095 [20:33:19<9:16:42, 3.36s/it] {'loss': 0.2911, 'grad_norm': 0.6238922783779193, 'learning_rate': 4.428110103340776e-06, 'epoch': 0.55} 55%|█████▌ | 12162/22095 [20:33:19<9:16:42, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
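The many `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings above mean some samples tokenize to far more than the model's context window and will cause indexing errors if passed through unmodified. A guard along these lines could truncate or skip over-long samples before batching. This is an illustrative sketch, not the repo's actual handling; the 40960 limit is the figure the tokenizer reports in the log.

```python
# Sketch: keep token id sequences inside the context window reported by
# the tokenizer warnings in the log (40960). Whether to truncate or skip
# is a policy choice; neither is claimed to be what this codebase does.
MAX_LEN = 40960

def fit_to_context(input_ids, max_len=MAX_LEN, policy="truncate"):
    """Return token ids that fit the context window, or None to skip."""
    if len(input_ids) <= max_len:
        return input_ids
    if policy == "truncate":
        return input_ids[:max_len]
    return None  # policy == "skip": caller drops the sample

ok = fit_to_context(list(range(100)))               # already fits
truncated = fit_to_context(list(range(50000)))      # cut to MAX_LEN
skipped = fit_to_context(list(range(50000)), policy="skip")
```

For multimodal chat data, naive truncation can cut through an image placeholder or an assistant turn, so skipping (or splitting the conversation upstream) is often the safer policy.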
[Try #0] Failed to fetch sample 8370238 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 36990, 'image': 'vrdu_table_final_2/astro-ph.CO/13935046-4cc8-4d45-b111-e00a9bee6bdd.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 55%|█████▌ | 12163/22095 [20:33:22<8:59:47, 3.26s/it] {'loss': 0.3012, 'grad_norm': 0.627010311697926, 'learning_rate': 4.427381998406506e-06, 'epoch': 0.55} 55%|█████▌ | 12163/22095 [20:33:22<8:59:47, 3.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (80427 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51970 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84527 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▌ | 12164/22095 [20:33:30<13:00:23, 4.71s/it] {'loss': 0.458, 'grad_norm': 0.3007742329014419, 'learning_rate': 4.426653905776035e-06, 'epoch': 0.55} 55%|█████▌ | 12164/22095 [20:33:30<13:00:23, 4.71s/it] 55%|█████▌ | 12165/22095 [20:33:38<15:49:37, 5.74s/it] {'loss': 0.4722, 'grad_norm': 0.2872082070907317, 'learning_rate': 4.425925825465013e-06, 'epoch': 0.55} 55%|█████▌ | 12165/22095 [20:33:38<15:49:37, 5.74s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 55%|█████▌ | 12166/22095 [20:33:42<14:21:23, 5.21s/it] {'loss': 0.3196, 'grad_norm': 0.6197356908953288, 'learning_rate': 4.425197757489082e-06, 'epoch': 0.55} 55%|█████▌ | 12166/22095 [20:33:42<14:21:23, 5.21s/it] 55%|█████▌ | 12167/22095 [20:33:46<13:07:59, 4.76s/it] {'loss': 0.3273, 'grad_norm': 0.6987742922291102, 'learning_rate': 4.4244697018638845e-06, 'epoch': 0.55} 55%|█████▌ | 12167/22095 [20:33:46<13:07:59, 4.76s/it] 55%|█████▌ | 12168/22095 [20:33:49<11:55:29, 4.32s/it] {'loss': 0.2972, 'grad_norm': 0.6004808697850207, 'learning_rate': 4.423741658605066e-06, 'epoch': 0.55} 55%|█████▌ | 12168/22095 [20:33:49<11:55:29, 4.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79904 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76021 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▌ | 12169/22095 [20:33:52<10:42:45, 3.89s/it] {'loss': 0.3355, 'grad_norm': 0.6460250550153468, 'learning_rate': 4.423013627728269e-06, 'epoch': 0.55} 55%|█████▌ | 12169/22095 [20:33:52<10:42:45, 3.89s/it] 55%|█████▌ | 12170/22095 [20:33:56<10:28:50, 3.80s/it] {'loss': 0.3276, 'grad_norm': 0.6257553030413675, 'learning_rate': 4.422285609249139e-06, 'epoch': 0.55} 55%|█████▌ | 12170/22095 [20:33:56<10:28:50, 3.80s/it] 55%|█████▌ | 12171/22095 [20:34:00<11:02:12, 4.00s/it] {'loss': 0.3384, 'grad_norm': 0.6327753745132836, 'learning_rate': 4.4215576031833155e-06, 'epoch': 0.55} 55%|█████▌ | 12171/22095 [20:34:00<11:02:12, 4.00s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 55%|█████▌ | 12172/22095 [20:34:06<12:45:56, 4.63s/it] {'loss': 0.4806, 'grad_norm': 0.31764744050962074, 'learning_rate': 4.420829609546442e-06, 'epoch': 0.55} 55%|█████▌ | 12172/22095 [20:34:06<12:45:56, 4.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63722 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61709 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86242 > 40960). 
Running this sequence through the model will result in indexing errors 55%|█████▌ | 12173/22095 [20:34:10<12:08:41, 4.41s/it] {'loss': 0.3133, 'grad_norm': 0.6285208469040747, 'learning_rate': 4.420101628354164e-06, 'epoch': 0.55} 55%|█████▌ | 12173/22095 [20:34:10<12:08:41, 4.41s/it] 55%|█████▌ | 12174/22095 [20:34:13<11:09:38, 4.05s/it] {'loss': 0.3227, 'grad_norm': 0.5957967636997421, 'learning_rate': 4.419373659622117e-06, 'epoch': 0.55} 55%|█████▌ | 12174/22095 [20:34:13<11:09:38, 4.05s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8301436 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1D_2pd2MTUeJjSZFKXXagopXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否将这幅图像中的所有文字详细地列出来?'}, {'from': 'gpt', 'value': '图中所有文字:\n笔画\n练字轩人教版课本同步\n®\n练\n字\n轩\n偏旁部首\n槽\n学\n凹\n生\n练字帖\n数字\n轻松\n练成\n一手好字\n21天\n同步(1-2)年级\n拼音\n(1-\n常用成语\n年级'}]} 55%|█████▌ | 12175/22095 [20:34:16<10:16:43, 3.73s/it] {'loss': 0.3315, 'grad_norm': 0.6882976139091561, 'learning_rate': 4.418645703365949e-06, 'epoch': 0.55} 55%|█████▌ | 12175/22095 [20:34:16<10:16:43, 3.73s/it] 55%|█████▌ | 12176/22095 [20:34:19<9:49:46, 3.57s/it] {'loss': 0.3295, 'grad_norm': 0.7370220979424742, 'learning_rate': 4.4179177596013005e-06, 'epoch': 0.55} 55%|█████▌ | 12176/22095 [20:34:19<9:49:46, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 55%|█████▌ | 12177/22095 [20:34:29<14:39:09, 5.32s/it] {'loss': 0.4769, 'grad_norm': 0.30805795099873884, 'learning_rate': 4.4171898283438104e-06, 'epoch': 0.55} 55%|█████▌ | 12177/22095 [20:34:29<14:39:09, 5.32s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8351819 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 18498, 'image': 'vrdu_table_final_2/astro-ph.CO/e25e499a-cb97-4aa0-a997-aa416958435f.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} 55%|█████▌ | 12178/22095 [20:34:33<13:30:11, 4.90s/it] {'loss': 0.3468, 'grad_norm': 0.6003256657053226, 'learning_rate': 4.416461909609119e-06, 'epoch': 0.55} 55%|█████▌ | 12178/22095 [20:34:33<13:30:11, 4.90s/it] 55%|█████▌ | 12179/22095 [20:34:36<12:17:44, 4.46s/it] {'loss': 0.2972, 'grad_norm': 0.6286414600373732, 'learning_rate': 4.415734003412873e-06, 'epoch': 0.55} 55%|█████▌ | 12179/22095 [20:34:36<12:17:44, 4.46s/it] 55%|█████▌ | 12180/22095 [20:34:39<11:10:46, 4.06s/it] {'loss': 0.3631, 'grad_norm': 0.6598533584898012, 'learning_rate': 4.415006109770706e-06, 'epoch': 0.55} 55%|█████▌ | 12180/22095 [20:34:39<11:10:46, 4.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8404835 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 7021, 'image': 'vrdu_table_final_2/astro-ph.CO/9742f02a-39c7-4844-8887-bb1f2ef54172.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
55%|█████▌ | 12181/22095 [20:34:49<15:41:13, 5.70s/it] {'loss': 0.4777, 'grad_norm': 0.2659973937123042, 'learning_rate': 4.414278228698261e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [828, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8446189 in VC:s3://internvl-moe-sft-data/. Exception: Image size [828, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11937, 'image': 'vrdu_texteq/astro-ph.CO/6b7aed1f-5683-4c51-a826-e936e0f415af.png', 'image_wh': [[828, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'and $r=k_1+k_2+\\dots+k_m$. The ``reduced cumulants" are defined as'}]}
55%|█████▌ | 12182/22095 [20:34:53<14:15:19, 5.18s/it] {'loss': 0.3176, 'grad_norm': 2.6385411080370003, 'learning_rate': 4.413550360211177e-06, 'epoch': 0.55}
55%|█████▌ | 12183/22095 [20:34:56<12:33:12, 4.56s/it] {'loss': 0.3038, 'grad_norm': 0.6190630777781309, 'learning_rate': 4.412822504325099e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8371250 in VC:s3://internvl-moe-sft-data/. Exception: Image size [109, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38011, 'image': 'vrdu_table_final_2/astro-ph.CO/8a620cc8-029b-4c83-a833-ebcac4088f6b.png', 'image_wh': [[109, 20]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha - \\alpha_{\\rm true}$\\end{tabular}\n```"}]}
55%|█████▌ | 12184/22095 [20:35:00<11:48:14, 4.29s/it] {'loss': 0.3428, 'grad_norm': 0.6594277253803893, 'learning_rate': 4.412094661055658e-06, 'epoch': 0.55}
55%|█████▌ | 12185/22095 [20:35:02<10:35:52, 3.85s/it] {'loss': 0.3288, 'grad_norm': 0.6978013596534337, 'learning_rate': 4.411366830418498e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12186/22095 [20:35:05<9:56:03, 3.61s/it] {'loss': 0.3011, 'grad_norm': 0.6448886516709758, 'learning_rate': 4.410639012429259e-06, 'epoch': 0.55}
55%|█████▌ | 12187/22095 [20:35:08<9:22:08, 3.40s/it] {'loss': 0.3257, 'grad_norm': 0.6144338416530524, 'learning_rate': 4.409911207103576e-06, 'epoch': 0.55}
55%|█████▌ | 12188/22095 [20:35:13<10:07:01, 3.68s/it] {'loss': 0.3444, 'grad_norm': 0.6649078543814887, 'learning_rate': 4.409183414457086e-06, 'epoch': 0.55}
55%|█████▌ | 12189/22095 [20:35:16<9:26:29, 3.43s/it] {'loss': 0.3016, 'grad_norm': 0.6712539994119634, 'learning_rate': 4.408455634505435e-06, 'epoch': 0.55}
55%|█████▌ | 12190/22095 [20:35:18<8:53:43, 3.23s/it] {'loss': 0.3021, 'grad_norm': 0.5761806251973871, 'learning_rate': 4.407727867264253e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12191/22095 [20:35:25<11:31:11, 4.19s/it] {'loss': 0.474, 'grad_norm': 0.31372664794374205, 'learning_rate': 4.407000112749179e-06, 'epoch': 0.55}
55%|█████▌ | 12192/22095 [20:35:29<11:17:54, 4.11s/it] {'loss': 0.3507, 'grad_norm': 0.7033017381026658, 'learning_rate': 4.406272370975854e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12193/22095 [20:35:32<11:00:14, 4.00s/it] {'loss': 0.3767, 'grad_norm': 0.8187659702589716, 'learning_rate': 4.40554464195991e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12194/22095 [20:35:36<10:17:46, 3.74s/it] {'loss': 0.332, 'grad_norm': 0.7055026439228711, 'learning_rate': 4.404816925716987e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12195/22095 [20:35:43<13:32:29, 4.92s/it] {'loss': 0.4674, 'grad_norm': 0.30227345454402327, 'learning_rate': 4.404089222262721e-06, 'epoch': 0.55}
55%|█████▌ | 12196/22095 [20:35:47<12:09:26, 4.42s/it] {'loss': 0.2831, 'grad_norm': 0.6038295544645176, 'learning_rate': 4.4033615316127466e-06, 'epoch': 0.55}
55%|█████▌ | 12197/22095 [20:35:50<11:04:16, 4.03s/it] {'loss': 0.312, 'grad_norm': 0.6109965842943273, 'learning_rate': 4.402633853782699e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [209, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8468595 in VC:s3://internvl-moe-sft-data/. Exception: Image size [209, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 124030, 'image': 'vrdu_texteq/astro-ph.CO/10dea9cc-6c32-4725-a091-520e37d716ed.png', 'image_wh': [[209, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': '$5000$ realizations.'}]}
55%|█████▌ | 12198/22095 [20:35:52<10:00:03, 3.64s/it] {'loss': 0.2787, 'grad_norm': 0.6117511805063619, 'learning_rate': 4.401906188788216e-06, 'epoch': 0.55}
55%|█████▌ | 12199/22095 [20:35:56<10:06:39, 3.68s/it] {'loss': 0.2971, 'grad_norm': 0.6615939112863005, 'learning_rate': 4.401178536644934e-06, 'epoch': 0.55}
55%|█████▌ | 12200/22095 [20:35:59<9:32:03, 3.47s/it] {'loss': 0.3102, 'grad_norm': 0.7905339578895, 'learning_rate': 4.4004508973684844e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (54134 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78220 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44907 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12201/22095 [20:36:02<9:06:26, 3.31s/it] {'loss': 0.2908, 'grad_norm': 0.6298379390373061, 'learning_rate': 4.399723270974503e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8888455 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
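The tokenizer warnings above (`Token indices sequence length is longer than the specified maximum sequence length for this model (54134 > 40960)`) are only warnings; without an explicit guard the oversized sequence proceeds toward the indexing errors the message predicts. A minimal sketch of such a guard, under the assumption that the pipeline can truncate or skip at the sample level (the function name and policy are illustrative, not the repo's actual code):

```python
# Guard against sequences that exceed the model's 40960-token context window,
# matching the warning threshold seen in the log above.
MAX_SEQ_LEN = 40960

def clamp_input_ids(input_ids: list, max_len: int = MAX_SEQ_LEN) -> list:
    """Truncate a token-id sequence to the context window (keep the prefix)."""
    return input_ids if len(input_ids) <= max_len else input_ids[:max_len]

ids = list(range(54134))  # one of the offending lengths from the log
print(len(clamp_input_ids(ids)))  # 40960
```

For multimodal samples, skipping the sample entirely may be safer than truncation, since cutting mid-conversation can orphan image placeholders; either way the decision should happen before the batch reaches the model.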
Problematic sample: {'id': 11608, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
55%|█████▌ | 12202/22095 [20:36:05<9:05:45, 3.31s/it] {'loss': 0.3327, 'grad_norm': 1.0137912656423052, 'learning_rate': 4.398995657478628e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12203/22095 [20:36:09<9:18:27, 3.39s/it] {'loss': 0.3398, 'grad_norm': 0.642412628400293, 'learning_rate': 4.398268056896488e-06, 'epoch': 0.55}
55%|█████▌ | 12204/22095 [20:36:13<9:57:03, 3.62s/it] {'loss': 0.2802, 'grad_norm': 0.6056875907816118, 'learning_rate': 4.397540469243719e-06, 'epoch': 0.55}
55%|█████▌ | 12205/22095 [20:36:17<9:59:41, 3.64s/it] {'loss': 0.3177, 'grad_norm': 0.623085870979715, 'learning_rate': 4.396812894535957e-06, 'epoch': 0.55}
55%|█████▌ | 12206/22095 [20:36:21<10:19:54, 3.76s/it] {'loss': 0.3342, 'grad_norm': 0.6998316003144387, 'learning_rate': 4.396085332788832e-06, 'epoch': 0.55}
55%|█████▌ | 12207/22095 [20:36:24<10:10:29, 3.70s/it] {'loss': 0.2944, 'grad_norm': 0.692349791540142, 'learning_rate': 4.395357784017977e-06, 'epoch': 0.55}
55%|█████▌ | 12208/22095 [20:36:28<9:45:39, 3.55s/it] {'loss': 0.2867, 'grad_norm': 0.5964649889517802, 'learning_rate': 4.394630248239029e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8393035 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 59866, 'image': 'vrdu_table_final_2/astro-ph.EP/effdfbdd-7b49-4538-b399-a8737303cbd8.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
55%|█████▌ | 12209/22095 [20:36:36<13:26:26, 4.89s/it] {'loss': 0.491, 'grad_norm': 0.36281497979131894, 'learning_rate': 4.393902725467616e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12210/22095 [20:36:41<13:28:09, 4.91s/it] {'loss': 0.3744, 'grad_norm': 0.6925980131373757, 'learning_rate': 4.3931752157193725e-06, 'epoch': 0.55}
55%|█████▌ | 12211/22095 [20:36:45<12:44:40, 4.64s/it] {'loss': 0.3029, 'grad_norm': 0.5824848786527309, 'learning_rate': 4.3924477190099286e-06, 'epoch': 0.55}
55%|█████▌ | 12212/22095 [20:36:47<11:09:58, 4.07s/it] {'loss': 0.3234, 'grad_norm': 0.6192125272134775, 'learning_rate': 4.391720235354921e-06, 'epoch': 0.55}
55%|█████▌ | 12213/22095 [20:36:50<10:15:21, 3.74s/it] {'loss': 0.3189, 'grad_norm': 0.6198155454511541, 'learning_rate': 4.390992764769974e-06, 'epoch': 0.55}
55%|█████▌ | 12214/22095 [20:36:55<10:57:41, 3.99s/it] {'loss': 0.3609, 'grad_norm': 0.6843133179389448, 'learning_rate': 4.390265307270722e-06, 'epoch': 0.55}
55%|█████▌ | 12215/22095 [20:36:58<10:11:11, 3.71s/it] {'loss': 0.3014, 'grad_norm': 0.6429985571485314, 'learning_rate': 4.389537862872798e-06, 'epoch': 0.55}
55%|█████▌ | 12216/22095 [20:37:01<9:29:32, 3.46s/it] {'loss': 0.2934, 'grad_norm': 0.6993967541463431, 'learning_rate': 4.388810431591829e-06, 'epoch': 0.55}
55%|█████▌ | 12217/22095 [20:37:04<9:40:57, 3.53s/it] {'loss': 0.3322, 'grad_norm': 0.5852124802333049, 'learning_rate': 4.388083013443445e-06, 'epoch': 0.55}
55%|█████▌ | 12218/22095 [20:37:26<24:47:40, 9.04s/it] {'loss': 0.3433, 'grad_norm': 0.6098756681057482, 'learning_rate': 4.387355608443281e-06, 'epoch': 0.55}
55%|█████▌ | 12219/22095 [20:37:30<20:16:24, 7.39s/it] {'loss': 0.3231, 'grad_norm': 0.5960790982391869, 'learning_rate': 4.386628216606962e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (55993 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127796 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54230 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61096 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12220/22095 [20:37:35<18:00:54, 6.57s/it] {'loss': 0.2942, 'grad_norm': 0.5563541033680087, 'learning_rate': 4.385900837950119e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (77233 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81178 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55712 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12221/22095 [20:37:38<15:04:41, 5.50s/it] {'loss': 0.2754, 'grad_norm': 0.6190208956368316, 'learning_rate': 4.385173472488382e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12222/22095 [20:37:59<28:02:42, 10.23s/it] {'loss': 0.2881, 'grad_norm': 0.6148730927860822, 'learning_rate': 4.384446120237375e-06, 'epoch': 0.55}
55%|█████▌ | 12223/22095 [20:38:03<22:41:17, 8.27s/it] {'loss': 0.2961, 'grad_norm': 0.6017200305214885, 'learning_rate': 4.3837187812127335e-06, 'epoch': 0.55}
55%|█████▌ | 12224/22095 [20:38:05<18:15:53, 6.66s/it] {'loss': 0.3047, 'grad_norm': 0.6199969656357945, 'learning_rate': 4.382991455430082e-06, 'epoch': 0.55}
55%|█████▌ | 12225/22095 [20:38:08<15:09:10, 5.53s/it] {'loss': 0.3139, 'grad_norm': 0.6884288422848379, 'learning_rate': 4.38226414290505e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8923672 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
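The repeated pair `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` suggests a repair pass that inserts a missing image placeholder into the conversation text. A minimal sketch of what such a fix might look like, assuming a `<image>` placeholder token and a prepend policy (both are my assumptions, not the repo's verified behaviour):

```python
# Hypothetical repair matching the "Fixed image tokens" log lines above:
# if the conversation text references fewer images than the sample provides,
# prepend placeholders until the counts agree.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(text: str, num_images: int) -> str:
    """Prepend placeholders until the text references every image once."""
    missing = num_images - text.count(IMAGE_TOKEN)
    return (IMAGE_TOKEN + "\n") * max(missing, 0) + text

print(fix_image_tokens("Describe the picture.", 1))
```

Where the placeholder should land mid-conversation (rather than at the start) the real loader would need the conversation structure, but the count check itself is this simple.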
Problematic sample: {'id': 46825, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 5\nB. 4\nC. 3\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
55%|█████▌ | 12226/22095 [20:38:11<13:10:04, 4.80s/it] {'loss': 0.2918, 'grad_norm': 0.574458671702428, 'learning_rate': 4.381536843653262e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884877 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8030, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 7\nB. 6\nC. 10\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
55%|█████▌ | 12227/22095 [20:38:16<12:48:13, 4.67s/it] {'loss': 0.3011, 'grad_norm': 0.6292783706079346, 'learning_rate': 4.380809557690349e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12228/22095 [20:38:26<17:00:50, 6.21s/it] {'loss': 0.4775, 'grad_norm': 0.3549734484788748, 'learning_rate': 4.380082285031938e-06, 'epoch': 0.55}
55%|█████▌ | 12229/22095 [20:38:33<18:15:37, 6.66s/it] {'loss': 0.4714, 'grad_norm': 0.3279537547182673, 'learning_rate': 4.379355025693654e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 364, but got module 1
55%|█████▌ | 12230/22095 [20:38:37<15:57:06, 5.82s/it] {'loss': 0.309, 'grad_norm': 0.6157546861625464, 'learning_rate': 4.378627779691123e-06, 'epoch': 0.55}
55%|█████▌ | 12231/22095 [20:38:40<13:51:12, 5.06s/it] {'loss': 0.2888, 'grad_norm': 0.6530262504499489, 'learning_rate': 4.377900547039976e-06, 'epoch': 0.55}
55%|█████▌ | 12232/22095 [20:38:44<12:20:13, 4.50s/it] {'loss': 0.3317, 'grad_norm': 0.6055986541476903, 'learning_rate': 4.377173327755832e-06, 'epoch': 0.55}
55%|█████▌ | 12233/22095 [20:38:47<11:15:08, 4.11s/it] {'loss': 0.3026, 'grad_norm': 0.6338147844885054, 'learning_rate': 4.376446121854322e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12234/22095 [20:38:51<11:10:50, 4.08s/it] {'loss': 0.357, 'grad_norm': 0.6080275418862028, 'learning_rate': 4.3757189293510696e-06, 'epoch': 0.55}
55%|█████▌ | 12235/22095 [20:38:55<10:56:25, 3.99s/it] {'loss': 0.3525, 'grad_norm': 0.5928751762361252, 'learning_rate': 4.3749917502617e-06, 'epoch': 0.55}
55%|█████▌ | 12236/22095 [20:38:58<10:39:31, 3.89s/it] {'loss': 0.3303, 'grad_norm': 0.698941683211696, 'learning_rate': 4.374264584601837e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (57855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49010 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12237/22095 [20:39:01<9:46:09, 3.57s/it] {'loss': 0.3464, 'grad_norm': 0.6663961230327661, 'learning_rate': 4.3735374323871084e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (84258 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12238/22095 [20:39:05<9:50:02, 3.59s/it] {'loss': 0.2846, 'grad_norm': 0.6377326823601144, 'learning_rate': 4.372810293633135e-06, 'epoch': 0.55}
55%|█████▌ | 12239/22095 [20:39:08<9:26:15, 3.45s/it] {'loss': 0.2871, 'grad_norm': 0.5987760173833161, 'learning_rate': 4.372083168355543e-06, 'epoch': 0.55}
55%|█████▌ | 12240/22095 [20:39:12<9:51:17, 3.60s/it] {'loss': 0.3175, 'grad_norm': 0.582142971841127, 'learning_rate': 4.371356056569953e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12241/22095 [20:39:16<10:30:39, 3.84s/it] {'loss': 0.2885, 'grad_norm': 0.6951289154150705, 'learning_rate': 4.370628958291993e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [292, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465359 in VC:s3://internvl-moe-sft-data/. Exception: Image size [292, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27064, 'image': 'vrdu_texteq/astro-ph.CO/a6814e44-dd1c-4b85-a5e9-baa62a658980.png', 'image_wh': [[292, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': 'where ${\\bf s}_{12} \\equiv{\\bf s}_1 - {\\bf s}_2$ and'}]}
55%|█████▌ | 12242/22095 [20:39:19<9:37:11, 3.51s/it] {'loss': 0.3195, 'grad_norm': 0.691357756254213, 'learning_rate': 4.369901873537283e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12243/22095 [20:39:22<9:12:05, 3.36s/it] {'loss': 0.3092, 'grad_norm': 0.6281731913361734, 'learning_rate': 4.369174802321447e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [681, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8433311 in VC:s3://internvl-moe-sft-data/. Exception: Image size [681, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 127898, 'image': 'vrdu_texteq/astro-ph.CO/c89cd220-30fc-48bf-bd1b-acb797091254.png', 'image_wh': [[681, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': '\\ between the $z=0.5-1$ and $z=1-2$ bins and a factor'}]}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30201.png 2025-08-28 12:37:22.445754 load time: 1001.47 ms
55%|█████▌ | 12244/22095 [20:39:26<9:41:57, 3.54s/it] {'loss': 0.3712, 'grad_norm': 0.7427270951811668, 'learning_rate': 4.368447744660107e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12245/22095 [20:39:53<28:42:18, 10.49s/it] {'loss': 0.487, 'grad_norm': 0.40689852775381286, 'learning_rate': 4.367720700568885e-06, 'epoch': 0.55}
55%|█████▌ | 12246/22095 [20:39:56<22:42:40, 8.30s/it] {'loss': 0.2849, 'grad_norm': 0.7117771895581163, 'learning_rate': 4.366993670063402e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (44778 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58548 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12247/22095 [20:40:00<18:59:20, 6.94s/it] {'loss': 0.2953, 'grad_norm': 0.6206158885079222, 'learning_rate': 4.366266653159283e-06, 'epoch': 0.55}
55%|█████▌ | 12248/22095 [20:40:03<15:50:48, 5.79s/it] {'loss': 0.3379, 'grad_norm': 0.6255779764365972, 'learning_rate': 4.365539649872146e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [764, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8406261 in VC:s3://internvl-moe-sft-data/. Exception: Image size [764, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8448, 'image': 'vrdu_table_final_2/astro-ph.CO/93c87edf-30a9-417f-8008-1fc60b84e142.png', 'image_wh': [[764, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n16&17&18&19&20\n\\end{tabular}\n```"}]}
55%|█████▌ | 12249/22095 [20:40:06<13:51:12, 5.07s/it] {'loss': 0.3351, 'grad_norm': 0.6808302225014655, 'learning_rate': 4.364812660217614e-06, 'epoch': 0.55}
Token indices sequence length is longer than the specified maximum sequence length for this model (54736 > 40960). Running this sequence through the model will result in indexing errors
55%|█████▌ | 12250/22095 [20:40:09<12:11:29, 4.46s/it] {'loss': 0.3615, 'grad_norm': 0.6666912810231732, 'learning_rate': 4.364085684211307e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047748 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1.5\nB. 2\nC. 0.5\nD. 1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
55%|█████▌ | 12251/22095 [20:40:13<11:26:54, 4.19s/it] {'loss': 0.3224, 'grad_norm': 0.6315922128061487, 'learning_rate': 4.363358721868844e-06, 'epoch': 0.55}
55%|█████▌ | 12252/22095 [20:40:17<11:33:24, 4.23s/it] {'loss': 0.354, 'grad_norm': 0.6674302959256082, 'learning_rate': 4.362631773205848e-06, 'epoch': 0.55}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
55%|█████▌ | 12253/22095 [20:40:20<10:26:08, 3.82s/it] {'loss': 0.3141, 'grad_norm': 0.6301333820021424, 'learning_rate': 4.361904838237938e-06, 'epoch': 0.55}
55%|█████▌ | 12254/22095 [20:40:43<25:54:46, 9.48s/it] {'loss': 0.3001, 'grad_norm': 0.6182147024876224, 'learning_rate': 4.3611779169807335e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8299775 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1.K45onvI8KJjSspjXXcgjXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nI require the transcribed text from this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n书包暴力测试\n好书包质量过硬\n书包承重测试\n好书包才能\n承受34斤\n超强测试\n16.92\n0.00'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12255/22095 [20:40:52<26:04:31, 9.54s/it] {'loss': 0.5037, 'grad_norm': 0.3287120848947762, 'learning_rate': 4.360451009449852e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924585 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47738, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 16cm\nB. 32cm\nC. 4cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
55%|█████▌ | 12256/22095 [20:40:56<20:57:34, 7.67s/it] {'loss': 0.3193, 'grad_norm': 0.5805587844522027, 'learning_rate': 4.359724115660915e-06, 'epoch': 0.55}
55%|█████▌ | 12257/22095 [20:40:59<17:07:34, 6.27s/it] {'loss': 0.2737, 'grad_norm': 0.5753460065480046, 'learning_rate': 4.3589972356295415e-06, 'epoch': 0.55}
Invalidate trace cache @ step 2: expected module 1, but got module 364
55%|█████▌ | 12258/22095 [20:41:09<20:46:21, 7.60s/it] {'loss': 0.4724, 'grad_norm': 0.27834192808116914, 'learning_rate': 4.3582703693713475e-06, 'epoch': 0.55}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    if not self.coord_norm and new_image_size is not None:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    sources[0]["conversations"][first_user_input_idx]["value"] = msg
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8944624 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 67777, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 15cm\nB. 13cm\nC. 11cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
55%|█████▌ | 12259/22095 [20:41:13<17:40:02, 6.47s/it] {'loss': 0.3138, 'grad_norm': 0.5975988206476961, 'learning_rate': 4.357543516901951e-06, 'epoch': 0.55}
55%|█████▌ | 12260/22095 [20:41:34<29:37:19, 10.84s/it] {'loss': 0.3372, 'grad_norm': 0.66133209006719, 'learning_rate': 4.356816678236975e-06, 'epoch': 0.55}
55%|█████▌ | 12261/22095 [20:41:38<23:46:36, 8.70s/it] {'loss': 0.3399, 'grad_norm': 0.5894943538014618, 'learning_rate': 4.35608985339203e-06, 'epoch': 0.55}
55%|█████▌ | 12262/22095 [20:41:42<19:53:30, 7.28s/it] {'loss': 0.3074, 'grad_norm': 0.5781218574299265, 'learning_rate': 4.355363042382737e-06, 'epoch': 0.55}
56%|█████▌ | 12263/22095 [20:42:24<48:51:59, 17.89s/it] {'loss': 0.3386, 'grad_norm': 0.6363791850426851, 'learning_rate': 4.3546362452247135e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (108860 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64985 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45686 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63292 > 40960).
Running this sequence through the model will result in indexing errors 56%|█████▌ | 12264/22095 [20:43:04<66:22:26, 24.31s/it] {'loss': 0.3461, 'grad_norm': 0.6129669070303816, 'learning_rate': 4.3539094619335746e-06, 'epoch': 0.56} 56%|█████▌ | 12264/22095 [20:43:04<66:22:26, 24.31s/it] 56%|█████▌ | 12265/22095 [20:43:26<64:33:47, 23.64s/it] {'loss': 0.2904, 'grad_norm': 0.5914749739899136, 'learning_rate': 4.3531826925249355e-06, 'epoch': 0.56} 56%|█████▌ | 12265/22095 [20:43:26<64:33:47, 23.64s/it] 56%|█████▌ | 12266/22095 [20:43:29<48:06:15, 17.62s/it] {'loss': 0.3281, 'grad_norm': 0.6209594996658195, 'learning_rate': 4.352455937014414e-06, 'epoch': 0.56} 56%|█████▌ | 12266/22095 [20:43:29<48:06:15, 17.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12267/22095 [20:43:51<51:33:36, 18.89s/it] {'loss': 0.3305, 'grad_norm': 0.6419748858543516, 'learning_rate': 4.351729195417627e-06, 'epoch': 0.56} 56%|█████▌ | 12267/22095 [20:43:51<51:33:36, 18.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12268/22095 [20:44:15<55:16:31, 20.25s/it] {'loss': 0.3338, 'grad_norm': 0.6433578417970277, 'learning_rate': 4.351002467750189e-06, 'epoch': 0.56} 56%|█████▌ | 12268/22095 [20:44:15<55:16:31, 20.25s/it]VC:s3://gui-agent/data_20250707/android/images/all/Broccoli_RecipeAddSingleRecipe_1/images/001_start_1752223192133.png 2025-08-28 12:42:13.448480 load time: 1019.33 ms VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/TB1g2RNLXXXXXaDXpXXunYpLFXX.jpg 2025-08-28 12:42:13.448282 load time: 1062.05 ms 56%|█████▌ | 12269/22095 [20:44:54<70:34:04, 25.85s/it] {'loss': 0.3219, 'grad_norm': 0.6542378892446439, 'learning_rate': 4.350275754027713e-06, 'epoch': 0.56} 56%|█████▌ | 12269/22095 
[20:44:54<70:34:04, 25.85s/it]VC:s3://gui-agent/data_20250612/mac/images/weather/658042b3-9032-4df6-8e53-038608662ce0/images/step_3.png 2025-08-28 12:42:52.377208 load time: 1045.79 ms 56%|█████▌ | 12270/22095 [20:45:15<67:01:29, 24.56s/it] {'loss': 0.3169, 'grad_norm': 0.6360447076266809, 'learning_rate': 4.349549054265817e-06, 'epoch': 0.56} 56%|█████▌ | 12270/22095 [20:45:15<67:01:29, 24.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46142 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65474 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84820 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44284 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▌ | 12271/22095 [20:45:39<66:23:09, 24.33s/it] {'loss': 0.3268, 'grad_norm': 0.5615509706467567, 'learning_rate': 4.348822368480113e-06, 'epoch': 0.56} 56%|█████▌ | 12271/22095 [20:45:39<66:23:09, 24.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▌ | 12272/22095 [20:45:49<54:20:56, 19.92s/it] {'loss': 0.4908, 'grad_norm': 0.3815671050635444, 'learning_rate': 4.348095696686217e-06, 'epoch': 0.56} 56%|█████▌ | 12272/22095 [20:45:49<54:20:56, 19.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12273/22095 [20:45:52<41:03:37, 15.05s/it] {'loss': 0.2881, 'grad_norm': 0.6375360091018338, 'learning_rate': 4.347369038899744e-06, 'epoch': 0.56} 56%|█████▌ | 12273/22095 [20:45:52<41:03:37, 15.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12274/22095 [20:46:15<46:57:51, 17.22s/it] {'loss': 0.31, 'grad_norm': 0.5722597760505286, 'learning_rate': 4.346642395136303e-06, 'epoch': 0.56} 56%|█████▌ | 12274/22095 [20:46:15<46:57:51, 17.22s/it]VC:s3://internvl2/datasets/VCR-wiki-en-easy/images/0013875.jpg 2025-08-28 12:44:13.283440 load time: 1038.88 ms VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/1279.jpg 2025-08-28 12:44:13.283242 load time: 1035.18 ms VC:s3://st2pj/20250222/images/multi_modal/agent_data/AndroidUI/20240321/20240321_filtered/jingdong/screen_00000142.jpg 2025-08-28 12:44:13.283789 load time: 1047.81 ms VC:s3://gui-agent/data_20250714/windows/images/adobe_illustrator/free_task_20250714_160504/images/20250714_160534_15.png 2025-08-28 12:44:13.285195 load time: 1055.23 ms 56%|█████▌ | 12275/22095 [20:46:18<35:55:57, 13.17s/it] {'loss': 0.337, 'grad_norm': 0.633575672379051, 'learning_rate': 4.345915765411511e-06, 'epoch': 0.56} 56%|█████▌ | 12275/22095 [20:46:18<35:55:57, 
13.17s/it] 56%|█████▌ | 12276/22095 [20:47:35<88:11:15, 32.33s/it] {'loss': 0.3364, 'grad_norm': 0.6112490004503777, 'learning_rate': 4.345189149740982e-06, 'epoch': 0.56} 56%|█████▌ | 12276/22095 [20:47:35<88:11:15, 32.33s/it] 56%|█████▌ | 12277/22095 [20:47:58<80:10:23, 29.40s/it] {'loss': 0.2906, 'grad_norm': 0.5802223407166853, 'learning_rate': 4.344462548140325e-06, 'epoch': 0.56} 56%|█████▌ | 12277/22095 [20:47:58<80:10:23, 29.40s/it] 56%|█████▌ | 12278/22095 [20:49:44<142:58:53, 52.43s/it] {'loss': 0.3308, 'grad_norm': 0.5860357053241917, 'learning_rate': 4.343735960625156e-06, 'epoch': 0.56} 56%|█████▌ | 12278/22095 [20:49:44<142:58:53, 52.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▌ | 12279/22095 [20:49:54<107:51:38, 39.56s/it] {'loss': 0.4983, 'grad_norm': 0.3729925473292314, 'learning_rate': 4.343009387211086e-06, 'epoch': 0.56} 56%|█████▌ | 12279/22095 [20:49:54<107:51:38, 39.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76203 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▌ | 12280/22095 [20:49:57<78:19:13, 28.73s/it] {'loss': 0.3084, 'grad_norm': 0.5933383937464417, 'learning_rate': 4.3422828279137245e-06, 'epoch': 0.56} 56%|█████▌ | 12280/22095 [20:49:57<78:19:13, 28.73s/it] 56%|█████▌ | 12281/22095 [20:51:01<106:56:05, 39.23s/it] {'loss': 0.349, 'grad_norm': 0.8911220942079052, 'learning_rate': 4.341556282748685e-06, 'epoch': 0.56} 56%|█████▌ | 12281/22095 [20:51:01<106:56:05, 39.23s/it] 56%|█████▌ | 12282/22095 [20:51:21<91:42:16, 33.64s/it] {'loss': 0.289, 'grad_norm': 0.6579449341515171, 'learning_rate': 4.34082975173158e-06, 'epoch': 0.56} 56%|█████▌ | 12282/22095 [20:51:21<91:42:16, 33.64s/it] 56%|█████▌ | 12283/22095 [20:52:23<114:44:09, 42.10s/it] {'loss': 0.3467, 'grad_norm': 0.64297167758931, 'learning_rate': 4.34010323487802e-06, 'epoch': 0.56} 56%|█████▌ | 12283/22095 [20:52:23<114:44:09, 42.10s/it]VC:s3://gui/visual_inputs/multi_modal_2024/gui_data/ui_data/OpenApp/image/45272.jpg 2025-08-28 12:50:21.925996 load time: 1028.89 ms 56%|█████▌ | 12284/22095 [20:52:27<83:18:04, 30.57s/it] {'loss': 0.3273, 'grad_norm': 0.6358433470685065, 'learning_rate': 4.3393767322036125e-06, 'epoch': 0.56} 56%|█████▌ | 12284/22095 [20:52:27<83:18:04, 30.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047940 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. 
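The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)` lines are tokenizer warnings, not errors: the overlong sample still flows downstream and only fails later as indexing errors. A minimal length-guard sketch, assuming token IDs are available as plain lists before batching (the 40,960 limit is read off the warnings; the function name is illustrative):

```python
# Sketch: guard against overlong samples before they reach the model.
# The 40960 cap mirrors the "(... > 40960)" tokenizer warnings in the log;
# whether to truncate or drop is a policy choice left to the caller.

MAX_LEN = 40960  # context limit reported in the tokenizer warnings

def clip_or_skip(token_ids, max_len=MAX_LEN, truncate=False):
    """Return token_ids if within the limit; truncate or drop (None) otherwise."""
    if len(token_ids) <= max_len:
        return token_ids
    if truncate:
        return token_ids[:max_len]
    return None  # caller can resample, mirroring the dataset's retry loop
```

Dropping (rather than truncating) is usually safer for multimodal samples, since truncation can cut through image placeholder tokens.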
Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 8\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 56%|█████▌ | 12285/22095 [20:53:29<108:58:30, 39.99s/it] {'loss': 0.2801, 'grad_norm': 0.5989624689504114, 'learning_rate': 4.338650243723971e-06, 'epoch': 0.56} 56%|█████▌ | 12285/22095 [20:53:29<108:58:30, 39.99s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [14, 6, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348850 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 6, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 15520, 'image': 'vrdu_table_final_2/astro-ph.CO/311c014a-29f0-4653-ae9c-56212bdcb1a2.png', 'image_wh': [[14, 6]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}\n...\n\\end{tabular}\n```"}]} 56%|█████▌ | 12286/22095 [20:53:50<93:39:38, 34.37s/it] {'loss': 0.383, 'grad_norm': 0.6361718365706649, 'learning_rate': 4.337923769454706e-06, 'epoch': 0.56} 56%|█████▌ | 12286/22095 [20:53:50<93:39:38, 34.37s/it] 56%|█████▌ | 12287/22095 [20:53:53<67:53:44, 24.92s/it] {'loss': 0.3281, 'grad_norm': 0.5717929766259564, 'learning_rate': 4.337197309411424e-06, 'epoch': 0.56} 56%|█████▌ | 12287/22095 [20:53:53<67:53:44, 24.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (58453 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45078 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▌ | 12288/22095 [20:54:38<84:15:52, 30.93s/it] {'loss': 0.4839, 'grad_norm': 0.31954508426963363, 'learning_rate': 4.336470863609736e-06, 'epoch': 0.56} 56%|█████▌ | 12288/22095 [20:54:38<84:15:52, 30.93s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ if not self.coord_norm and new_image_size is not None: File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item sources[0]["conversations"][first_user_input_idx]["value"] = msg ValueError: Image size [204, 26, 100, 100] is too small. 
Minimum size is 28. [Try #0] Failed to fetch sample 8915462 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 38615, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5'}, {'from': 'gpt', 'value': '【解答】解:∵点D是AC的中点,如果CD=4,∴AC=2CD=8∵AB=14∴BC=AB-AC=6'}]} 56%|█████▌ | 12289/22095 [20:54:42<61:56:55, 22.74s/it] {'loss': 0.3484, 'grad_norm': 0.7010522726012361, 'learning_rate': 4.335744432065254e-06, 'epoch': 0.56} 56%|█████▌ | 12289/22095 [20:54:42<61:56:55, 22.74s/it] 56%|█████▌ | 12290/22095 [20:55:38<89:50:19, 32.99s/it] {'loss': 0.3233, 'grad_norm': 0.6413260103451215, 'learning_rate': 4.33501801479358e-06, 'epoch': 0.56} 56%|█████▌ | 12290/22095 [20:55:38<89:50:19, 32.99s/it] 56%|█████▌ | 12291/22095 [20:56:04<83:32:31, 30.68s/it] {'loss': 0.26, 'grad_norm': 0.6260547273063286, 'learning_rate': 4.334291611810329e-06, 'epoch': 0.56} 56%|█████▌ | 12291/22095 [20:56:04<83:32:31, 30.68s/it] 56%|█████▌ | 12292/22095 [20:56:45<92:31:46, 33.98s/it] {'loss': 0.3036, 'grad_norm': 0.5989004568803357, 'learning_rate': 4.333565223131107e-06, 'epoch': 0.56} 56%|█████▌ | 12292/22095 [20:56:45<92:31:46, 33.98s/it] 56%|█████▌ | 12293/22095 [20:57:25<97:02:28, 35.64s/it] {'loss': 0.2733, 'grad_norm': 0.6397236265829002, 'learning_rate': 4.332838848771521e-06, 'epoch': 0.56} 56%|█████▌ | 12293/22095 [20:57:25<97:02:28, 35.64s/it] 56%|█████▌ | 12294/22095 [20:57:46<85:16:26, 31.32s/it] {'loss': 0.3241, 'grad_norm': 0.6092452917225999, 'learning_rate': 4.332112488747178e-06, 'epoch': 0.56} 56%|█████▌ | 12294/22095 [20:57:46<85:16:26, 31.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47753 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▌ | 12295/22095 [20:58:07<77:04:39, 28.31s/it] {'loss': 0.3309, 'grad_norm': 0.6211793412226144, 'learning_rate': 4.331386143073687e-06, 'epoch': 0.56} 56%|█████▌ | 12295/22095 [20:58:07<77:04:39, 28.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12296/22095 [20:58:53<91:12:09, 33.51s/it] {'loss': 0.4651, 'grad_norm': 0.2849571156764159, 'learning_rate': 4.330659811766655e-06, 'epoch': 0.56} 56%|█████▌ | 12296/22095 [20:58:53<91:12:09, 33.51s/it] 56%|█████▌ | 12297/22095 [20:59:36<98:35:07, 36.22s/it] {'loss': 0.3299, 'grad_norm': 0.6398917116737811, 'learning_rate': 4.329933494841689e-06, 'epoch': 0.56} 56%|█████▌ | 12297/22095 [20:59:36<98:35:07, 36.22s/it] 56%|█████▌ | 12298/22095 [20:59:58<87:25:20, 32.12s/it] {'loss': 0.3257, 'grad_norm': 0.6752901353544469, 'learning_rate': 4.3292071923143905e-06, 'epoch': 0.56} 56%|█████▌ | 12298/22095 [20:59:58<87:25:20, 32.12s/it] 56%|█████▌ | 12299/22095 [21:00:57<109:33:48, 40.26s/it] {'loss': 0.3306, 'grad_norm': 0.7044712818301637, 'learning_rate': 4.328480904200373e-06, 'epoch': 0.56} 56%|█████▌ | 12299/22095 [21:00:57<109:33:48, 40.26s/it] 56%|█████▌ | 12300/22095 [21:01:20<95:22:21, 35.05s/it] {'loss': 0.3237, 'grad_norm': 0.6004103904041961, 'learning_rate': 4.327754630515236e-06, 'epoch': 0.56} 56%|█████▌ | 12300/22095 [21:01:20<95:22:21, 35.05s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ ), f"Message {msg} does not contain and tags" File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = "" * len(image_file) + msg ValueError: Image size [14, 23, 
100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8333805 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 414, 'image': 'vrdu_table_final_2/astro-ph.CO/6df719ab-f533-45f6-9e4b-f88d3e842108.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 56%|█████▌ | 12301/22095 [21:02:02<100:37:13, 36.99s/it] {'loss': 0.2621, 'grad_norm': 0.6051455344287193, 'learning_rate': 4.3270283712745885e-06, 'epoch': 0.56} 56%|█████▌ | 12301/22095 [21:02:02<100:37:13, 36.99s/it] 56%|█████▌ | 12302/22095 [21:02:44<104:30:52, 38.42s/it] {'loss': 0.3687, 'grad_norm': 0.6325282147607654, 'learning_rate': 4.326302126494035e-06, 'epoch': 0.56} 56%|█████▌ | 12302/22095 [21:02:44<104:30:52, 38.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▌ | 12303/22095 [21:03:53<129:54:58, 47.76s/it] {'loss': 0.4804, 'grad_norm': 0.2778695822780776, 'learning_rate': 4.325575896189178e-06, 'epoch': 0.56} 56%|█████▌ | 12303/22095 [21:03:53<129:54:58, 47.76s/it] 56%|█████▌ | 12304/22095 [21:03:57<93:48:10, 34.49s/it] {'loss': 0.3322, 'grad_norm': 0.6901227211736138, 'learning_rate': 4.324849680375625e-06, 'epoch': 0.56} 56%|█████▌ | 12304/22095 [21:03:57<93:48:10, 34.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41877 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▌ | 12305/22095 [21:05:17<131:31:17, 48.36s/it] {'loss': 0.3157, 'grad_norm': 0.6373310885407559, 'learning_rate': 4.324123479068979e-06, 'epoch': 0.56} 56%|█████▌ | 12305/22095 [21:05:17<131:31:17, 48.36s/it]VC:s3://gui-agent/data_20250428/Android/sohutv/Cycle_0_Iter_39/images/screenshot-590-1745484404.007593-before.png 2025-08-28 13:03:16.193896 load time: 1039.84 ms 56%|█████▌ | 12306/22095 [21:05:39<109:30:14, 40.27s/it] {'loss': 0.3418, 'grad_norm': 0.626480672047507, 'learning_rate': 4.3233972922848435e-06, 'epoch': 0.56} 56%|█████▌ | 12306/22095 [21:05:39<109:30:14, 40.27s/it] 56%|█████▌ | 12307/22095 [21:05:42<79:06:07, 29.09s/it] {'loss': 0.3671, 'grad_norm': 0.6081506214931938, 'learning_rate': 4.32267112003882e-06, 'epoch': 0.56} 56%|█████▌ | 12307/22095 [21:05:42<79:06:07, 29.09s/it] 56%|█████▌ | 12308/22095 [21:05:45<57:52:34, 21.29s/it] {'loss': 0.3599, 'grad_norm': 0.6349245263217774, 'learning_rate': 4.321944962346517e-06, 'epoch': 0.56} 56%|█████▌ | 12308/22095 [21:05:45<57:52:34, 21.29s/it] 56%|█████▌ | 12309/22095 [21:06:28<75:56:37, 27.94s/it] {'loss': 0.279, 'grad_norm': 0.6028597309271571, 'learning_rate': 4.321218819223533e-06, 'epoch': 0.56} 56%|█████▌ | 12309/22095 [21:06:28<75:56:37, 27.94s/it] 56%|█████▌ | 12310/22095 [21:06:51<71:57:34, 26.47s/it] {'loss': 0.347, 'grad_norm': 0.6558489076897699, 'learning_rate': 4.320492690685471e-06, 'epoch': 0.56} 56%|█████▌ | 12310/22095 [21:06:51<71:57:34, 26.47s/it] 56%|█████▌ | 12311/22095 [21:07:13<68:19:53, 25.14s/it] {'loss': 0.29, 'grad_norm': 0.631884906983182, 'learning_rate': 4.319766576747934e-06, 'epoch': 0.56} 56%|█████▌ | 12311/22095 [21:07:13<68:19:53, 25.14s/it] 56%|█████▌ | 12312/22095 [21:07:37<66:57:54, 24.64s/it] {'loss': 0.2948, 'grad_norm': 0.5963251037440932, 'learning_rate': 4.319040477426527e-06, 'epoch': 0.56} 56%|█████▌ | 12312/22095 [21:07:37<66:57:54, 24.64s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (59621 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▌ | 12313/22095 [21:08:02<67:22:03, 24.79s/it] {'loss': 0.4583, 'grad_norm': 0.3178597251797919, 'learning_rate': 4.318314392736845e-06, 'epoch': 0.56} 56%|█████▌ | 12313/22095 [21:08:02<67:22:03, 24.79s/it] 56%|█████▌ | 12314/22095 [21:08:05<49:49:46, 18.34s/it] {'loss': 0.3433, 'grad_norm': 0.5772619349251756, 'learning_rate': 4.317588322694495e-06, 'epoch': 0.56} 56%|█████▌ | 12314/22095 [21:08:05<49:49:46, 18.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (54380 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▌ | 12315/22095 [21:08:33<57:49:00, 21.28s/it] {'loss': 0.4762, 'grad_norm': 0.29671798934162635, 'learning_rate': 4.3168622673150765e-06, 'epoch': 0.56} 56%|█████▌ | 12315/22095 [21:08:33<57:49:00, 21.28s/it] 56%|█████▌ | 12316/22095 [21:09:18<76:40:29, 28.23s/it] {'loss': 0.3671, 'grad_norm': 0.7101392642545054, 'learning_rate': 4.3161362266141895e-06, 'epoch': 0.56} 56%|█████▌ | 12316/22095 [21:09:18<76:40:29, 28.23s/it] 56%|█████▌ | 12317/22095 [21:10:54<132:06:24, 48.64s/it] {'loss': 0.2922, 'grad_norm': 0.5306534955505768, 'learning_rate': 4.315410200607433e-06, 'epoch': 0.56} 56%|█████▌ | 12317/22095 [21:10:54<132:06:24, 48.64s/it] 56%|█████▌ | 12318/22095 [21:10:57<95:00:34, 34.98s/it] {'loss': 0.3151, 'grad_norm': 0.6101963309651014, 'learning_rate': 4.314684189310412e-06, 'epoch': 0.56} 56%|█████▌ | 12318/22095 [21:10:57<95:00:34, 34.98s/it] 56%|█████▌ | 12319/22095 [21:11:40<101:39:13, 37.43s/it] {'loss': 0.3394, 'grad_norm': 0.6203140579627333, 'learning_rate': 4.31395819273872e-06, 'epoch': 0.56} 56%|█████▌ | 12319/22095 [21:11:40<101:39:13, 
37.43s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 56%|█████▌ | 12320/22095 [21:12:01<87:53:20, 32.37s/it] {'loss': 0.3391, 'grad_norm': 0.6078518079041008, 'learning_rate': 4.313232210907959e-06, 'epoch': 0.56} 56%|█████▌ | 12320/22095 [21:12:01<87:53:20, 32.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▌ | 12321/22095 [21:12:46<98:25:29, 36.25s/it] {'loss': 0.4811, 'grad_norm': 0.3124391598086161, 'learning_rate': 4.312506243833732e-06, 'epoch': 0.56} 56%|█████▌ | 12321/22095 [21:12:46<98:25:29, 36.25s/it] 56%|█████▌ | 12322/22095 [21:13:15<91:54:17, 33.85s/it] {'loss': 0.4722, 'grad_norm': 0.31236298450559097, 'learning_rate': 4.311780291531632e-06, 'epoch': 0.56} 56%|█████▌ | 12322/22095 [21:13:15<91:54:17, 33.85s/it] 56%|█████▌ | 12323/22095 [21:14:00<101:24:56, 37.36s/it] {'loss': 0.4677, 'grad_norm': 0.2680955973781369, 'learning_rate': 4.311054354017259e-06, 'epoch': 0.56} 56%|█████▌ | 12323/22095 [21:14:00<101:24:56, 37.36s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 56%|█████▌ | 12324/22095 [21:14:04<73:51:23, 27.21s/it] {'loss': 0.3133, 'grad_norm': 0.6576465320357912, 'learning_rate': 4.310328431306213e-06, 'epoch': 0.56} 56%|█████▌ | 12324/22095 [21:14:04<73:51:23, 27.21s/it] 56%|█████▌ | 12325/22095 [21:14:44<84:19:15, 31.07s/it] {'loss': 0.2899, 'grad_norm': 0.6337519434030606, 'learning_rate': 4.309602523414092e-06, 'epoch': 0.56} 56%|█████▌ | 12325/22095 [21:14:44<84:19:15, 31.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▌ | 12326/22095 [21:15:43<106:57:54, 39.42s/it] {'loss': 0.3153, 'grad_norm': 0.6146377689137756, 'learning_rate': 4.308876630356491e-06, 'epoch': 
0.56} 56%|█████▌ | 12326/22095 [21:15:43<106:57:54, 39.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46878 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44933 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41028 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▌ | 12327/22095 [21:15:46<77:15:11, 28.47s/it] {'loss': 0.293, 'grad_norm': 0.6836536803399664, 'learning_rate': 4.308150752149007e-06, 'epoch': 0.56} 56%|█████▌ | 12327/22095 [21:15:46<77:15:11, 28.47s/it] 56%|█████▌ | 12328/22095 [21:15:49<56:33:49, 20.85s/it] {'loss': 0.3027, 'grad_norm': 0.6250698035311567, 'learning_rate': 4.307424888807242e-06, 'epoch': 0.56} 56%|█████▌ | 12328/22095 [21:15:49<56:33:49, 20.85s/it] 56%|█████▌ | 12329/22095 [21:15:53<42:52:11, 15.80s/it] {'loss': 0.3148, 'grad_norm': 0.7400318887479171, 'learning_rate': 4.306699040346788e-06, 'epoch': 0.56} 56%|█████▌ | 12329/22095 [21:15:53<42:52:11, 15.80s/it] 56%|█████▌ | 12330/22095 [21:15:57<33:37:57, 12.40s/it] {'loss': 0.2921, 'grad_norm': 0.5955651260196211, 'learning_rate': 4.305973206783241e-06, 'epoch': 0.56} 56%|█████▌ | 12330/22095 [21:15:57<33:37:57, 12.40s/it] 56%|█████▌ | 12331/22095 [21:16:19<41:21:18, 15.25s/it] {'loss': 0.2934, 'grad_norm': 0.6247834393219636, 'learning_rate': 4.3052473881322e-06, 'epoch': 0.56} 56%|█████▌ | 12331/22095 [21:16:19<41:21:18, 15.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed 
in bytes should be converted to RGBA images warnings.warn( 56%|█████▌ | 12332/22095 [21:16:41<47:14:48, 17.42s/it] {'loss': 0.2905, 'grad_norm': 1.2878975281808585, 'learning_rate': 4.304521584409257e-06, 'epoch': 0.56} 56%|█████▌ | 12332/22095 [21:16:41<47:14:48, 17.42s/it] 56%|█████▌ | 12333/22095 [21:18:02<98:51:09, 36.45s/it] {'loss': 0.3457, 'grad_norm': 0.6963145804773342, 'learning_rate': 4.30379579563001e-06, 'epoch': 0.56} 56%|█████▌ | 12333/22095 [21:18:02<98:51:09, 36.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [828, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8437392 in VC:s3://internvl-moe-sft-data/. Exception: Image size [828, 23, 100, 100] is too small. Minimum size is 28. 
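The two PIL warnings interleaved above (the `DecompressionBombWarning` for a 117,990,000-pixel image and the palette-transparency `UserWarning`) both fire at image-load time. A minimal loading sketch that handles both cases explicitly, assuming standard Pillow semantics (the pixel cap defaults to Pillow's own `Image.MAX_IMAGE_PIXELS`; the function name is hypothetical, not part of the training code):

```python
# Sketch: load an image while handling the two PIL warnings seen in the log.
# Oversize images are rejected explicitly instead of via the
# DecompressionBombWarning, and palette images with transparency are routed
# through RGBA, which is exactly what the UserWarning asks for.
from PIL import Image

def load_rgb(path, max_pixels=Image.MAX_IMAGE_PIXELS):
    """Open an image, reject oversize files, and normalize palette modes."""
    img = Image.open(path)
    w, h = img.size
    if max_pixels is not None and w * h > max_pixels:
        raise ValueError(f"{path}: {w * h} pixels exceeds cap {max_pixels}")
    if img.mode == "P" and "transparency" in img.info:
        img = img.convert("RGBA")  # avoids the palette-transparency warning
    return img.convert("RGB")
```

Rejecting oversize files with an explicit exception also keeps a single pathological sample from stalling a training step the way the log's slowest iterations do.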
Problematic sample: {'id': 51016, 'image': 'vrdu_texteq/astro-ph.CO/c2efbc1b-3348-4356-a9c3-355421fa17c5.png', 'image_wh': [[828, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where $P \\cup K$ denotes the combination of Planck and KiDS datasets.'}]}
56%|█████▌ | 12334/22095 [21:18:06<71:55:34, 26.53s/it] {'loss': 0.3477, 'grad_norm': 0.6439684461235223, 'learning_rate': 4.303070021810053e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-2_186029943-split-2.jpg 2025-08-28 13:16:04.497902 load time: 1023.97 ms
VC:s3://gui-agent/data_20250612/windows/images/vlc/free_task_20250611_202026/images/20250611_202053_9.png 2025-08-28 13:16:04.497527 load time: 1030.79 ms
56%|█████▌ | 12335/22095 [21:18:15<58:07:20, 21.44s/it] {'loss': 0.4662, 'grad_norm': 0.37819610638538015, 'learning_rate': 4.3023442629649816e-06, 'epoch': 0.56}
56%|█████▌ | 12336/22095 [21:18:19<43:23:07, 16.00s/it] {'loss': 0.3422, 'grad_norm': 0.6274598277275636, 'learning_rate': 4.3016185191103874e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (95665 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115970 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12337/22095 [21:18:43<50:31:34, 18.64s/it] {'loss': 0.3476, 'grad_norm': 0.6752885359439768, 'learning_rate': 4.300892790261867e-06, 'epoch': 0.56}
56%|█████▌ | 12338/22095 [21:18:46<37:47:58, 13.95s/it] {'loss': 0.3122, 'grad_norm': 0.6359580258984199, 'learning_rate': 4.300167076435015e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (65933 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115429 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12339/22095 [21:19:08<44:21:26, 16.37s/it] {'loss': 0.3011, 'grad_norm': 0.6038278153379546, 'learning_rate': 4.2994413776454225e-06, 'epoch': 0.56}
56%|█████▌ | 12340/22095 [21:19:30<48:36:29, 17.94s/it] {'loss': 0.3211, 'grad_norm': 0.6408495486894391, 'learning_rate': 4.298715693908682e-06, 'epoch': 0.56}
56%|█████▌ | 12341/22095 [21:20:13<68:49:51, 25.40s/it] {'loss': 0.2956, 'grad_norm': 0.6594204681087569, 'learning_rate': 4.2979900252403895e-06, 'epoch': 0.56}
56%|█████▌ | 12342/22095 [21:20:54<81:23:47, 30.04s/it] {'loss': 0.2894, 'grad_norm': 0.8247731950449819, 'learning_rate': 4.297264371656133e-06, 'epoch': 0.56}
56%|█████▌ | 12343/22095 [21:20:57<59:17:09, 21.89s/it] {'loss': 0.3281, 'grad_norm': 0.6941235288007112, 'learning_rate': 4.296538733171507e-06, 'epoch': 0.56}
56%|█████▌ | 12344/22095 [21:21:00<44:39:58, 16.49s/it] {'loss': 0.2901, 'grad_norm': 0.6599637003374355, 'learning_rate': 4.295813109802106e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12345/22095 [21:21:29<54:30:57, 20.13s/it] {'loss': 0.4677, 'grad_norm': 0.29949819318411175, 'learning_rate': 4.295087501563516e-06, 'epoch': 0.56}
56%|█████▌ | 12346/22095 [21:21:51<56:13:22, 20.76s/it] {'loss': 0.2863, 'grad_norm': 0.6169216710706968, 'learning_rate': 4.294361908471329e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12347/22095 [21:22:17<60:04:00, 22.18s/it] {'loss': 0.4689, 'grad_norm': 0.27206602819963843, 'learning_rate': 4.293636330541141e-06, 'epoch': 0.56}
56%|█████▌ | 12348/22095 [21:22:20<44:43:15, 16.52s/it] {'loss': 0.3434, 'grad_norm': 0.7291025964436351, 'learning_rate': 4.2929107677885375e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_515437.png 2025-08-28 13:20:18.900888 load time: 1052.1 ms
56%|█████▌ | 12349/22095 [21:22:42<49:21:32, 18.23s/it] {'loss': 0.3006, 'grad_norm': 0.5961323828297462, 'learning_rate': 4.29218522022911e-06, 'epoch': 0.56}
56%|█████▌ | 12350/22095 [21:22:46<37:30:59, 13.86s/it] {'loss': 0.3232, 'grad_norm': 0.6081864008471584, 'learning_rate': 4.291459687878449e-06, 'epoch': 0.56}
56%|█████▌ | 12351/22095 [21:22:49<28:44:13, 10.62s/it] {'loss': 0.3078, 'grad_norm': 0.8359323755964424, 'learning_rate': 4.29073417075214e-06, 'epoch': 0.56}
56%|█████▌ | 12352/22095 [21:22:52<22:46:51, 8.42s/it] {'loss': 0.3308, 'grad_norm': 0.7135344530168528, 'learning_rate': 4.290008668865778e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (46754 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49000 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60465 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102004 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12353/22095 [21:23:15<34:11:52, 12.64s/it] {'loss': 0.2947, 'grad_norm': 0.6491425822680523, 'learning_rate': 4.289283182234948e-06, 'epoch': 0.56}
56%|█████▌ | 12354/22095 [21:23:18<26:50:39, 9.92s/it] {'loss': 0.3435, 'grad_norm': 0.6778886453996668, 'learning_rate': 4.288557710875242e-06, 'epoch': 0.56}
56%|█████▌ | 12355/22095 [21:23:41<37:18:23, 13.79s/it] {'loss': 0.3024, 'grad_norm': 0.6056368355836651, 'learning_rate': 4.287832254802244e-06, 'epoch': 0.56}
56%|█████▌ | 12356/22095 [21:23:44<28:37:31, 10.58s/it] {'loss': 0.2883, 'grad_norm': 0.5971191283513166, 'learning_rate': 4.287106814031542e-06, 'epoch': 0.56}
56%|█████▌ | 12357/22095 [21:24:12<42:13:04, 15.61s/it] {'loss': 0.3402, 'grad_norm': 0.6408366890253473, 'learning_rate': 4.286381388578728e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (61115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55340 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12358/22095 [21:24:15<32:21:34, 11.96s/it] {'loss': 0.3338, 'grad_norm': 0.614472336181477, 'learning_rate': 4.285655978459385e-06, 'epoch': 0.56}
56%|█████▌ | 12359/22095 [21:24:18<24:53:02, 9.20s/it] {'loss': 0.3243, 'grad_norm': 0.645675843719086, 'learning_rate': 4.2849305836891e-06, 'epoch': 0.56}
56%|█████▌ | 12360/22095 [21:24:21<20:03:27, 7.42s/it] {'loss': 0.3116, 'grad_norm': 0.6961527371204147, 'learning_rate': 4.284205204283463e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12361/22095 [21:24:28<19:52:55, 7.35s/it] {'loss': 0.4632, 'grad_norm': 0.37595319417062534, 'learning_rate': 4.283479840258055e-06, 'epoch': 0.56}
56%|█████▌ | 12362/22095 [21:24:31<16:29:22, 6.10s/it] {'loss': 0.3192, 'grad_norm': 0.6705977459084392, 'learning_rate': 4.2827544916284655e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12363/22095 [21:24:35<14:11:14, 5.25s/it] {'loss': 0.3134, 'grad_norm': 0.6716578120239428, 'learning_rate': 4.2820291584102815e-06, 'epoch': 0.56}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882180 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5333, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵AN:MN=1:2,且AN=2,∴2:MN=1:2,∴MN=4cm,∴AM=6cm.∵M是线段AB的中点,∴AB=2AM,∴AB=12cm,故D答案正确.'}]}
56%|█████▌ | 12364/22095 [21:24:38<12:39:42, 4.68s/it] {'loss': 0.2788, 'grad_norm': 0.6678896667477587, 'learning_rate': 4.281303840619083e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12365/22095 [21:24:41<11:12:48, 4.15s/it] {'loss': 0.3372, 'grad_norm': 0.6762181244429644, 'learning_rate': 4.280578538270458e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308122 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2Lnp4bjgy_uJjSZKzXXb_jXXa_!!3365965160.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n您能帮忙解析这张图上记录的文字吗?'}, {'from': 'gpt', 'value': '图中所有文字:\n加粗加厚加大\n可折叠\n乐器通用谱架\nMusic\nMusic\nMusic\nMusic\n买就送谱夹\n【谱夹颜色随机发】'}]}
56%|█████▌ | 12366/22095 [21:24:50<15:27:13, 5.72s/it] {'loss': 0.4595, 'grad_norm': 0.30584590139664086, 'learning_rate': 4.27985325137999e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12367/22095 [21:24:54<13:29:06, 4.99s/it] {'loss': 0.3272, 'grad_norm': 0.6381422081849868, 'learning_rate': 4.279127979963266e-06, 'epoch': 0.56}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914851 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38004, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 3\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
56%|█████▌ | 12368/22095 [21:24:57<11:45:48, 4.35s/it] {'loss': 0.3583, 'grad_norm': 0.7344560788607514, 'learning_rate': 4.278402724035868e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (51732 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45783 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12369/22095 [21:25:00<10:56:33, 4.05s/it] {'loss': 0.3134, 'grad_norm': 0.6260833716254307, 'learning_rate': 4.277677483613377e-06, 'epoch': 0.56}
56%|█████▌ | 12370/22095 [21:25:04<10:49:58, 4.01s/it] {'loss': 0.3312, 'grad_norm': 0.7521739392430035, 'learning_rate': 4.276952258711381e-06, 'epoch': 0.56}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952441 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3276, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
56%|█████▌ | 12371/22095 [21:25:07<9:49:42, 3.64s/it] {'loss': 0.2971, 'grad_norm': 0.599559188266718, 'learning_rate': 4.276227049345458e-06, 'epoch': 0.56}
56%|█████▌ | 12372/22095 [21:25:10<9:30:16, 3.52s/it] {'loss': 0.326, 'grad_norm': 1.1519106873598228, 'learning_rate': 4.2755018555311935e-06, 'epoch': 0.56}
56%|█████▌ | 12373/22095 [21:25:13<9:20:12, 3.46s/it] {'loss': 0.3232, 'grad_norm': 1.1096769841546963, 'learning_rate': 4.2747766772841695e-06, 'epoch': 0.56}
56%|█████▌ | 12374/22095 [21:25:16<8:57:36, 3.32s/it] {'loss': 0.3061, 'grad_norm': 0.628112357109904, 'learning_rate': 4.2740515146199675e-06, 'epoch': 0.56}
56%|█████▌ | 12375/22095 [21:25:19<8:44:30, 3.24s/it] {'loss': 0.3176, 'grad_norm': 0.6317378580340778, 'learning_rate': 4.273326367554167e-06, 'epoch': 0.56}
56%|█████▌ | 12376/22095 [21:25:22<8:39:24, 3.21s/it] {'loss': 0.3638, 'grad_norm': 0.6372257992411345, 'learning_rate': 4.272601236102353e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12377/22095 [21:25:33<14:34:16, 5.40s/it] {'loss': 0.4667, 'grad_norm': 0.32587176034152876, 'learning_rate': 4.271876120280104e-06, 'epoch': 0.56}
56%|█████▌ | 12378/22095 [21:25:37<13:11:42, 4.89s/it] {'loss': 0.3214, 'grad_norm': 0.6291645665666001, 'learning_rate': 4.2711510201030005e-06, 'epoch': 0.56}
56%|█████▌ | 12379/22095 [21:25:40<12:26:00, 4.61s/it] {'loss': 0.2826, 'grad_norm': 0.6139529937097798, 'learning_rate': 4.270425935586624e-06, 'epoch': 0.56}
56%|█████▌ | 12380/22095 [21:25:44<11:24:07, 4.23s/it] {'loss': 0.3625, 'grad_norm': 0.7866626209421049, 'learning_rate': 4.2697008667465515e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12381/22095 [21:25:47<10:55:49, 4.05s/it] {'loss': 0.3586, 'grad_norm': 0.6575234379812429, 'learning_rate': 4.268975813598366e-06, 'epoch': 0.56}
56%|█████▌ | 12382/22095 [21:25:50<10:02:17, 3.72s/it] {'loss': 0.3114, 'grad_norm': 0.6294270608967876, 'learning_rate': 4.268250776157644e-06, 'epoch': 0.56}
56%|█████▌ | 12383/22095 [21:25:53<9:29:19, 3.52s/it] {'loss': 0.3194, 'grad_norm': 0.6302772266526723, 'learning_rate': 4.267525754439967e-06, 'epoch': 0.56}
56%|█████▌ | 12384/22095 [21:25:58<9:57:51, 3.69s/it] {'loss': 0.3264, 'grad_norm': 0.6089326567380705, 'learning_rate': 4.2668007484609106e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12385/22095 [21:26:01<9:32:27, 3.54s/it] {'loss': 0.2781, 'grad_norm': 0.5959077220631366, 'learning_rate': 4.266075758236055e-06, 'epoch': 0.56}
56%|█████▌ | 12386/22095 [21:26:04<9:26:18, 3.50s/it] {'loss': 0.2962, 'grad_norm': 0.6175674183428848, 'learning_rate': 4.265350783780977e-06, 'epoch': 0.56}
56%|█████▌ | 12387/22095 [21:26:07<9:07:47, 3.39s/it] {'loss': 0.2856, 'grad_norm': 0.544309122649723, 'learning_rate': 4.264625825111255e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12388/22095 [21:26:11<9:32:51, 3.54s/it] {'loss': 0.2918, 'grad_norm': 0.6570475858920106, 'learning_rate': 4.2639008822424644e-06, 'epoch': 0.56}
56%|█████▌ | 12389/22095 [21:26:15<9:30:13, 3.52s/it] {'loss': 0.2824, 'grad_norm': 0.5978688068020497, 'learning_rate': 4.2631759551901845e-06, 'epoch': 0.56}
56%|█████▌ | 12390/22095 [21:26:18<9:28:37, 3.52s/it] {'loss': 0.3136, 'grad_norm': 0.689402321038741, 'learning_rate': 4.262451043969988e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12391/22095 [21:26:26<12:39:01, 4.69s/it] {'loss': 0.4657, 'grad_norm': 0.3426201622095635, 'learning_rate': 4.2617261485974545e-06, 'epoch': 0.56}
56%|█████▌ | 12392/22095 [21:26:29<11:24:45, 4.23s/it] {'loss': 0.3229, 'grad_norm': 0.6249086540005816, 'learning_rate': 4.261001269088161e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12393/22095 [21:26:33<11:06:16, 4.12s/it] {'loss': 0.2768, 'grad_norm': 0.6209909771694748, 'learning_rate': 4.260276405457678e-06, 'epoch': 0.56}
56%|█████▌ | 12394/22095 [21:26:36<10:36:43, 3.94s/it] {'loss': 0.3645, 'grad_norm': 0.7335251266441353, 'learning_rate': 4.259551557721582e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12395/22095 [21:26:39<9:54:09, 3.68s/it] {'loss': 0.3363, 'grad_norm': 0.6449609075121965, 'learning_rate': 4.25882672589545e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (53893 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12396/22095 [21:26:43<9:36:23, 3.57s/it] {'loss': 0.2566, 'grad_norm': 0.9284828798571106, 'learning_rate': 4.258101909994857e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (53115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48474 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88803 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12397/22095 [21:26:46<9:50:55, 3.66s/it] {'loss': 0.3242, 'grad_norm': 0.8104997105300392, 'learning_rate': 4.257377110035374e-06, 'epoch': 0.56}
56%|█████▌ | 12398/22095 [21:26:51<10:25:19, 3.87s/it] {'loss': 0.294, 'grad_norm': 0.5587361776941873, 'learning_rate': 4.2566523260325755e-06, 'epoch': 0.56}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881987 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5140, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
56%|█████▌ | 12399/22095 [21:26:54<10:16:46, 3.82s/it] {'loss': 0.317, 'grad_norm': 0.8574056365982415, 'learning_rate': 4.255927558002038e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12400/22095 [21:26:58<10:09:08, 3.77s/it] {'loss': 0.2914, 'grad_norm': 0.6338371685512539, 'learning_rate': 4.2552028059593294e-06, 'epoch': 0.56}
56%|█████▌ | 12401/22095 [21:27:02<10:34:03, 3.92s/it] {'loss': 0.32, 'grad_norm': 0.6167283818031284, 'learning_rate': 4.2544780699200265e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (53689 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45412 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73244 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12402/22095 [21:27:06<10:00:42, 3.72s/it] {'loss': 0.3176, 'grad_norm': 0.5741515398590784, 'learning_rate': 4.2537533498997005e-06, 'epoch': 0.56}
56%|█████▌ | 12403/22095 [21:27:09<9:31:29, 3.54s/it] {'loss': 0.3117, 'grad_norm': 0.6307609620005522, 'learning_rate': 4.253028645913922e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12404/22095 [21:27:18<14:19:00, 5.32s/it] {'loss': 0.4565, 'grad_norm': 0.3266962933317707, 'learning_rate': 4.252303957978263e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (98151 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47623 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12405/22095 [21:27:28<17:53:12, 6.65s/it] {'loss': 0.4909, 'grad_norm': 0.9995670900758218, 'learning_rate': 4.251579286108297e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 364, but got module 1
56%|█████▌ | 12406/22095 [21:27:32<15:33:28, 5.78s/it] {'loss': 0.3343, 'grad_norm': 0.8186606301570003, 'learning_rate': 4.250854630319593e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12407/22095 [21:27:35<13:57:06, 5.18s/it] {'loss': 0.3144, 'grad_norm': 0.6546198346426051, 'learning_rate': 4.2501299906277225e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (71884 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94768 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127406 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45659 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108413 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12408/22095 [21:27:40<13:35:33, 5.05s/it] {'loss': 0.3132, 'grad_norm': 0.661691131812228, 'learning_rate': 4.249405367048254e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (52470 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42575 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12409/22095 [21:27:43<11:54:26, 4.43s/it] {'loss': 0.3229, 'grad_norm': 0.6055607608378684, 'learning_rate': 4.248680759596761e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59439 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44857 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12410/22095 [21:27:49<13:06:59, 4.88s/it] {'loss': 0.4666, 'grad_norm': 0.33701630670332994, 'learning_rate': 4.24795616828881e-06, 'epoch': 0.56}
56%|█████▌ | 12411/22095 [21:27:53<11:58:57, 4.45s/it] {'loss': 0.3406, 'grad_norm': 0.5820937489912922, 'learning_rate': 4.247231593139971e-06, 'epoch': 0.56}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8902702 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25855, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 5cm\nB. 8cm\nC. 9cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
56%|█████▌ | 12412/22095 [21:27:56<10:52:05, 4.04s/it] {'loss': 0.2933, 'grad_norm': 0.6162890639808882, 'learning_rate': 4.246507034165815e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41130 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58226 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12413/22095 [21:28:05<15:18:14, 5.69s/it] {'loss': 0.4581, 'grad_norm': 0.3381713375818819, 'learning_rate': 4.245782491381905e-06, 'epoch': 0.56}
56%|█████▌ | 12414/22095 [21:28:09<13:37:36, 5.07s/it] {'loss': 0.252, 'grad_norm': 0.6219759160425906, 'learning_rate': 4.245057964803815e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (56154 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12415/22095 [21:28:12<11:50:21, 4.40s/it] {'loss': 0.2949, 'grad_norm': 0.6374850105878566, 'learning_rate': 4.244333454447112e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12416/22095 [21:28:21<16:03:18, 5.97s/it] {'loss': 0.4669, 'grad_norm': 0.2702414442270348, 'learning_rate': 4.243608960327361e-06, 'epoch': 0.56}
56%|█████▌ | 12417/22095 [21:28:28<16:36:15, 6.18s/it] {'loss': 0.4573, 'grad_norm': 0.2660859641437605, 'learning_rate': 4.242884482460129e-06, 'epoch': 0.56}
56%|█████▌ | 12418/22095 [21:28:33<16:01:51, 5.96s/it] {'loss': 0.4678, 'grad_norm': 0.3057110766984944, 'learning_rate': 4.242160020860988e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 364, but got module 1
56%|█████▌ | 12419/22095 [21:28:37<14:15:07, 5.30s/it] {'loss': 0.3115, 'grad_norm': 0.6044778611836339, 'learning_rate': 4.241435575545496e-06, 'epoch': 0.56}
56%|█████▌ | 12420/22095 [21:28:41<13:03:46, 4.86s/it] {'loss': 0.331, 'grad_norm': 0.6265272349973912, 'learning_rate': 4.2407111465292265e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12421/22095 [21:28:48<14:52:13, 5.53s/it] {'loss': 0.4752, 'grad_norm': 0.2717995782592673, 'learning_rate': 4.239986733827742e-06, 'epoch': 0.56}
56%|█████▌ | 12422/22095 [21:28:51<13:01:15, 4.85s/it] {'loss': 0.3366, 'grad_norm': 0.586424601557017, 'learning_rate': 4.239262337456609e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▌ | 12423/22095 [21:29:01<16:44:02, 6.23s/it] {'loss': 0.4845, 'grad_norm': 0.30407532265471443, 'learning_rate': 4.238537957431389e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12424/22095 [21:29:06<15:49:01, 5.89s/it] {'loss': 0.467, 'grad_norm': 0.28007290194860684, 'learning_rate': 4.2378135937676515e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
56%|█████▌ | 12425/22095 [21:29:10<14:03:53, 5.24s/it] {'loss': 0.3326, 'grad_norm': 0.6812849363923967, 'learning_rate': 4.23708924648096e-06, 'epoch': 0.56}
56%|█████▌ | 12426/22095 [21:29:13<12:44:30, 4.74s/it] {'loss': 0.3533, 'grad_norm': 0.6931167864204915, 'learning_rate': 4.236364915586877e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (56855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71008 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123537 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86105 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (153536 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▌ | 12427/22095 [21:29:17<12:09:12, 4.53s/it] {'loss': 0.3113, 'grad_norm': 0.616717908059227, 'learning_rate': 4.2356406011009654e-06, 'epoch': 0.56}
56%|█████▌ | 12428/22095 [21:29:21<11:47:34, 4.39s/it] {'loss': 0.3134, 'grad_norm': 0.6588738954730928, 'learning_rate': 4.234916303038793e-06, 'epoch': 0.56}
56%|█████▋ | 12429/22095 [21:29:25<11:13:58, 4.18s/it] {'loss': 0.3124, 'grad_norm': 0.6045150831513721, 'learning_rate': 4.234192021415916e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (43656 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▋ | 12430/22095 [21:29:28<10:20:40, 3.85s/it] {'loss': 0.3182, 'grad_norm': 0.6866858585460003, 'learning_rate': 4.233467756247901e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (58586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42907 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59894 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▋ | 12431/22095 [21:29:31<9:44:38, 3.63s/it] {'loss': 0.3195, 'grad_norm': 0.7354770339037802, 'learning_rate': 4.232743507550311e-06, 'epoch': 0.56}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
56%|█████▋ | 12432/22095 [21:29:34<9:14:22, 3.44s/it] {'loss': 0.2722, 'grad_norm': 0.5548344294691757, 'learning_rate': 4.232019275338706e-06, 'epoch': 0.56}
56%|█████▋ | 12433/22095 [21:29:38<9:08:53, 3.41s/it] {'loss': 0.3577, 'grad_norm': 0.6686252850257488, 'learning_rate': 4.231295059628647e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (54086 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77334 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61014 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▋ | 12434/22095 [21:29:41<9:31:08, 3.55s/it] {'loss': 0.3318, 'grad_norm': 0.6753077640957464, 'learning_rate': 4.230570860435698e-06, 'epoch': 0.56} 56%|█████▋ | 12434/22095 [21:29:41<9:31:08, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▋ | 12435/22095 [21:29:51<14:11:39, 5.29s/it] {'loss': 0.4465, 'grad_norm': 0.37310213050581825, 'learning_rate': 4.2298466777754175e-06, 'epoch': 0.56} 56%|█████▋ | 12435/22095 [21:29:51<14:11:39, 5.29s/it] 56%|█████▋ | 12436/22095 [21:29:55<13:33:15, 5.05s/it] {'loss': 0.3255, 'grad_norm': 0.5783839882048719, 'learning_rate': 4.2291225116633665e-06, 'epoch': 0.56} 56%|█████▋ | 12436/22095 [21:29:55<13:33:15, 5.05s/it] 56%|█████▋ | 12437/22095 [21:29:58<12:01:12, 4.48s/it] {'loss': 0.3002, 'grad_norm': 0.6434521887200615, 'learning_rate': 4.228398362115103e-06, 'epoch': 0.56} 56%|█████▋ | 12437/22095 [21:29:58<12:01:12, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (90420 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77307 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74109 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▋ | 12438/22095 [21:30:09<16:36:02, 6.19s/it] {'loss': 0.4682, 'grad_norm': 0.3290184769604561, 'learning_rate': 4.227674229146193e-06, 'epoch': 0.56} 56%|█████▋ | 12438/22095 [21:30:09<16:36:02, 6.19s/it] 56%|█████▋ | 12439/22095 [21:30:18<19:23:29, 7.23s/it] {'loss': 0.4763, 'grad_norm': 0.27603873378163185, 'learning_rate': 4.226950112772189e-06, 'epoch': 0.56} 56%|█████▋ | 12439/22095 [21:30:18<19:23:29, 7.23s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 56%|█████▋ | 12440/22095 [21:30:22<16:14:15, 6.05s/it] {'loss': 0.2841, 'grad_norm': 0.6221420328633019, 'learning_rate': 4.226226013008654e-06, 'epoch': 0.56} 56%|█████▋ | 12440/22095 [21:30:22<16:14:15, 6.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79617 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43885 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (139847 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▋ | 12441/22095 [21:30:25<14:08:18, 5.27s/it] {'loss': 0.2746, 'grad_norm': 0.5667063179695143, 'learning_rate': 4.225501929871146e-06, 'epoch': 0.56} 56%|█████▋ | 12441/22095 [21:30:25<14:08:18, 5.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui/data_20250328/icon_canva/images/desktop_2560x1440_1743144640_canvas.png 2025-08-28 13:28:23.792566 load time: 1030.06 ms 56%|█████▋ | 12442/22095 [21:30:32<15:48:29, 5.90s/it] {'loss': 0.4623, 'grad_norm': 0.33589286969250104, 'learning_rate': 4.22477786337522e-06, 'epoch': 0.56} 56%|█████▋ | 12442/22095 [21:30:32<15:48:29, 5.90s/it] 56%|█████▋ | 12443/22095 [21:30:36<13:38:07, 5.09s/it] {'loss': 0.3203, 'grad_norm': 0.6587862966320408, 'learning_rate': 4.224053813536439e-06, 'epoch': 0.56} 56%|█████▋ | 12443/22095 [21:30:36<13:38:07, 5.09s/it] 56%|█████▋ | 12444/22095 [21:30:39<12:09:30, 4.54s/it] {'loss': 0.2977, 'grad_norm': 0.6754237354060931, 'learning_rate': 4.223329780370359e-06, 'epoch': 0.56} 56%|█████▋ | 12444/22095 [21:30:39<12:09:30, 4.54s/it] 56%|█████▋ | 12445/22095 [21:30:42<10:58:49, 4.10s/it] {'loss': 0.3681, 'grad_norm': 0.663692464836983, 'learning_rate': 4.222605763892535e-06, 'epoch': 0.56} 56%|█████▋ | 12445/22095 [21:30:42<10:58:49, 4.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▋ | 12446/22095 [21:30:51<15:14:25, 5.69s/it] {'loss': 0.4469, 'grad_norm': 0.33810651602626374, 'learning_rate': 4.221881764118526e-06, 'epoch': 0.56} 56%|█████▋ | 12446/22095 [21:30:51<15:14:25, 5.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▋ | 12447/22095 [21:30:55<13:18:12, 4.96s/it] {'loss': 0.2916, 'grad_norm': 0.6857029521910759, 'learning_rate': 
4.22115778106389e-06, 'epoch': 0.56} 56%|█████▋ | 12447/22095 [21:30:55<13:18:12, 4.96s/it] 56%|█████▋ | 12448/22095 [21:30:58<12:09:37, 4.54s/it] {'loss': 0.3617, 'grad_norm': 0.6359154186799292, 'learning_rate': 4.220433814744179e-06, 'epoch': 0.56} 56%|█████▋ | 12448/22095 [21:30:58<12:09:37, 4.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▋ | 12449/22095 [21:31:06<14:59:26, 5.59s/it] {'loss': 0.4948, 'grad_norm': 0.29578040102195857, 'learning_rate': 4.219709865174951e-06, 'epoch': 0.56} 56%|█████▋ | 12449/22095 [21:31:06<14:59:26, 5.59s/it] 56%|█████▋ | 12450/22095 [21:31:11<14:33:47, 5.44s/it] {'loss': 0.3066, 'grad_norm': 0.6621635262913025, 'learning_rate': 4.218985932371764e-06, 'epoch': 0.56} 56%|█████▋ | 12450/22095 [21:31:11<14:33:47, 5.44s/it] 56%|█████▋ | 12451/22095 [21:31:15<13:01:00, 4.86s/it] {'loss': 0.2919, 'grad_norm': 0.5902543749544588, 'learning_rate': 4.218262016350169e-06, 'epoch': 0.56} 56%|█████▋ | 12451/22095 [21:31:15<13:01:00, 4.86s/it] 56%|█████▋ | 12452/22095 [21:31:18<12:05:43, 4.52s/it] {'loss': 0.3323, 'grad_norm': 0.8094348894797204, 'learning_rate': 4.21753811712572e-06, 'epoch': 0.56} 56%|█████▋ | 12452/22095 [21:31:18<12:05:43, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57514 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49326 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80032 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▋ | 12453/22095 [21:31:22<11:32:08, 4.31s/it] {'loss': 0.3319, 'grad_norm': 0.6149692212115275, 'learning_rate': 4.2168142347139765e-06, 'epoch': 0.56} 56%|█████▋ | 12453/22095 [21:31:22<11:32:08, 4.31s/it] 56%|█████▋ | 12454/22095 [21:31:26<10:50:15, 4.05s/it] {'loss': 0.2577, 'grad_norm': 0.7002479435493842, 'learning_rate': 4.21609036913049e-06, 'epoch': 0.56} 56%|█████▋ | 12454/22095 [21:31:26<10:50:15, 4.05s/it] 56%|█████▋ | 12455/22095 [21:31:29<10:06:01, 3.77s/it] {'loss': 0.2823, 'grad_norm': 0.6264439152159017, 'learning_rate': 4.2153665203908125e-06, 'epoch': 0.56} 56%|█████▋ | 12455/22095 [21:31:29<10:06:01, 3.77s/it] 56%|█████▋ | 12456/22095 [21:31:33<10:34:51, 3.95s/it] {'loss': 0.3785, 'grad_norm': 0.6442835838787396, 'learning_rate': 4.214642688510498e-06, 'epoch': 0.56} 56%|█████▋ | 12456/22095 [21:31:33<10:34:51, 3.95s/it] 56%|█████▋ | 12457/22095 [21:31:36<9:37:34, 3.60s/it] {'loss': 0.3356, 'grad_norm': 0.5754049656613867, 'learning_rate': 4.213918873505103e-06, 'epoch': 0.56} 56%|█████▋ | 12457/22095 [21:31:36<9:37:34, 3.60s/it] 56%|█████▋ | 12458/22095 [21:31:39<9:10:38, 3.43s/it] {'loss': 0.2787, 'grad_norm': 0.6206389488426615, 'learning_rate': 4.213195075390175e-06, 'epoch': 0.56} 56%|█████▋ | 12458/22095 [21:31:39<9:10:38, 3.43s/it] 56%|█████▋ | 12459/22095 [21:31:43<9:16:43, 3.47s/it] {'loss': 0.3161, 'grad_norm': 0.6025820884307962, 'learning_rate': 4.212471294181269e-06, 'epoch': 0.56} 56%|█████▋ | 12459/22095 [21:31:43<9:16:43, 3.47s/it] 56%|█████▋ | 12460/22095 [21:31:46<9:36:20, 3.59s/it] {'loss': 0.3361, 'grad_norm': 0.6599164073600439, 'learning_rate': 4.211747529893936e-06, 'epoch': 0.56} 56%|█████▋ | 12460/22095 [21:31:46<9:36:20, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (57990 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53159 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70572 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81152 > 40960). Running this sequence through the model will result in indexing errors 56%|█████▋ | 12461/22095 [21:31:57<15:15:50, 5.70s/it] {'loss': 0.4476, 'grad_norm': 0.4259348750726592, 'learning_rate': 4.2110237825437275e-06, 'epoch': 0.56} 56%|█████▋ | 12461/22095 [21:31:57<15:15:50, 5.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74816 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52817 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53454 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▋ | 12462/22095 [21:32:01<13:51:34, 5.18s/it] {'loss': 0.3217, 'grad_norm': 0.6265420256719098, 'learning_rate': 4.210300052146194e-06, 'epoch': 0.56} 56%|█████▋ | 12462/22095 [21:32:01<13:51:34, 5.18s/it] 56%|█████▋ | 12463/22095 [21:32:05<13:07:16, 4.90s/it] {'loss': 0.3401, 'grad_norm': 0.6012447982413059, 'learning_rate': 4.2095763387168895e-06, 'epoch': 0.56} 56%|█████▋ | 12463/22095 [21:32:05<13:07:16, 4.90s/it] 56%|█████▋ | 12464/22095 [21:32:08<11:34:21, 4.33s/it] {'loss': 0.2803, 'grad_norm': 0.6426222003181654, 'learning_rate': 4.208852642271359e-06, 'epoch': 0.56} 56%|█████▋ | 12464/22095 [21:32:08<11:34:21, 4.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 56%|█████▋ | 12465/22095 [21:32:11<10:24:44, 3.89s/it] {'loss': 0.2666, 'grad_norm': 2.3473227350320647, 'learning_rate': 4.208128962825157e-06, 'epoch': 0.56} 56%|█████▋ | 12465/22095 [21:32:11<10:24:44, 3.89s/it] 56%|█████▋ | 12466/22095 [21:32:15<10:16:32, 3.84s/it] {'loss': 0.3239, 'grad_norm': 0.6419177385904847, 'learning_rate': 4.2074053003938296e-06, 'epoch': 0.56} 56%|█████▋ | 12466/22095 [21:32:15<10:16:32, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44779 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44357 > 40960). 
Running this sequence through the model will result in indexing errors 56%|█████▋ | 12467/22095 [21:32:18<9:25:12, 3.52s/it] {'loss': 0.2964, 'grad_norm': 0.640389380164466, 'learning_rate': 4.2066816549929315e-06, 'epoch': 0.56} 56%|█████▋ | 12467/22095 [21:32:18<9:25:12, 3.52s/it] 56%|█████▋ | 12468/22095 [21:32:21<9:10:39, 3.43s/it] {'loss': 0.3068, 'grad_norm': 0.5871885642771933, 'learning_rate': 4.205958026638006e-06, 'epoch': 0.56} 56%|█████▋ | 12468/22095 [21:32:21<9:10:39, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 56%|█████▋ | 12469/22095 [21:32:27<11:32:19, 4.32s/it] {'loss': 0.4818, 'grad_norm': 0.36427456424647253, 'learning_rate': 4.2052344153446035e-06, 'epoch': 0.56} 56%|█████▋ | 12469/22095 [21:32:27<11:32:19, 4.32s/it] 56%|█████▋ | 12470/22095 [21:32:37<15:40:41, 5.86s/it] {'loss': 0.5034, 'grad_norm': 0.3353419505539499, 'learning_rate': 4.204510821128274e-06, 'epoch': 0.56} 56%|█████▋ | 12470/22095 [21:32:37<15:40:41, 5.86s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 56%|█████▋ | 12471/22095 [21:32:41<14:14:14, 5.33s/it] {'loss': 0.2994, 'grad_norm': 0.629458993351915, 'learning_rate': 4.2037872440045615e-06, 'epoch': 0.56} 56%|█████▋ | 12471/22095 [21:32:41<14:14:14, 5.33s/it] 56%|█████▋ | 12472/22095 [21:32:44<12:44:17, 4.77s/it] {'loss': 0.3031, 'grad_norm': 0.7561757437539882, 'learning_rate': 4.203063683989017e-06, 'epoch': 0.56} 56%|█████▋ | 12472/22095 [21:32:44<12:44:17, 4.77s/it] 56%|█████▋ | 12473/22095 [21:32:47<11:12:22, 4.19s/it] {'loss': 0.2994, 'grad_norm': 0.6085330300726697, 'learning_rate': 4.202340141097188e-06, 'epoch': 0.56} 56%|█████▋ | 12473/22095 [21:32:47<11:12:22, 4.19s/it] 56%|█████▋ | 12474/22095 [21:32:50<10:07:19, 3.79s/it] {'loss': 0.3275, 'grad_norm': 0.5898340763903122, 'learning_rate': 4.2016166153446174e-06, 'epoch': 0.56} 56%|█████▋ | 12474/22095 [21:32:50<10:07:19, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366978 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33724, 'image': 'vrdu_table_final_2/astro-ph.CO/b3a9c86d-3517-4566-8fac-6d3c5670b0a8.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
56%|█████▋ | 12475/22095 [21:32:56<12:03:21, 4.51s/it] {'loss': 0.5033, 'grad_norm': 0.3621647513055581, 'learning_rate': 4.200893106746853e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (65424 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90764 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46280 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▋ | 12476/22095 [21:33:00<11:41:01, 4.37s/it] {'loss': 0.3175, 'grad_norm': 0.6416906160818326, 'learning_rate': 4.2001696153194445e-06, 'epoch': 0.56}
56%|█████▋ | 12477/22095 [21:33:03<10:31:43, 3.94s/it] {'loss': 0.3422, 'grad_norm': 0.6275430528310015, 'learning_rate': 4.199446141077932e-06, 'epoch': 0.56}
56%|█████▋ | 12478/22095 [21:33:07<10:39:19, 3.99s/it] {'loss': 0.2949, 'grad_norm': 0.66606714845945, 'learning_rate': 4.198722684037864e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (48188 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87928 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83506 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▋ | 12479/22095 [21:33:11<10:23:21, 3.89s/it] {'loss': 0.3036, 'grad_norm': 0.6124342570104664, 'learning_rate': 4.197999244214783e-06, 'epoch': 0.56}
56%|█████▋ | 12480/22095 [21:33:14<9:40:00, 3.62s/it] {'loss': 0.3299, 'grad_norm': 0.8367497963384407, 'learning_rate': 4.197275821624239e-06, 'epoch': 0.56}
Invalidate trace cache @ step 2: expected module 1, but got module 364
56%|█████▋ | 12481/22095 [21:33:22<13:12:04, 4.94s/it] {'loss': 0.482, 'grad_norm': 0.3321969996370139, 'learning_rate': 4.196552416281768e-06, 'epoch': 0.56}
56%|█████▋ | 12482/22095 [21:33:25<12:01:06, 4.50s/it] {'loss': 0.3089, 'grad_norm': 0.5723278264837929, 'learning_rate': 4.19582902820292e-06, 'epoch': 0.56}
Token indices sequence length is longer than the specified maximum sequence length for this model (58713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84121 > 40960). Running this sequence through the model will result in indexing errors
56%|█████▋ | 12483/22095 [21:33:29<11:04:32, 4.15s/it] {'loss': 0.3272, 'grad_norm': 0.6631719342411454, 'learning_rate': 4.195105657403236e-06, 'epoch': 0.56}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12484/22095 [21:33:39<15:42:20, 5.88s/it] {'loss': 0.4605, 'grad_norm': 0.2974743436661958, 'learning_rate': 4.19438230389826e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (107049 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41448 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46051 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44195 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63831 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43572 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12485/22095 [21:33:42<13:50:47, 5.19s/it] {'loss': 0.2911, 'grad_norm': 2.4532523574610137, 'learning_rate': 4.193658967703532e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881047 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4200, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
57%|█████▋ | 12486/22095 [21:33:45<12:05:47, 4.53s/it] {'loss': 0.3498, 'grad_norm': 0.6637453590331593, 'learning_rate': 4.192935648834599e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12487/22095 [21:33:55<15:58:09, 5.98s/it] {'loss': 0.4973, 'grad_norm': 0.2940291784258796, 'learning_rate': 4.192212347306999e-06, 'epoch': 0.57}
57%|█████▋ | 12488/22095 [21:33:58<13:39:59, 5.12s/it] {'loss': 0.3016, 'grad_norm': 0.5755812778287472, 'learning_rate': 4.191489063136274e-06, 'epoch': 0.57}
57%|█████▋ | 12489/22095 [21:34:01<11:54:43, 4.46s/it] {'loss': 0.3179, 'grad_norm': 0.6427057193532089, 'learning_rate': 4.190765796337968e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (106597 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44892 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53121 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41734 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70144 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125765 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12490/22095 [21:34:03<10:37:12, 3.98s/it] {'loss': 0.3303, 'grad_norm': 0.6591412378938449, 'learning_rate': 4.190042546927618e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12491/22095 [21:34:13<14:39:35, 5.50s/it] {'loss': 0.4698, 'grad_norm': 0.32428203390183824, 'learning_rate': 4.189319314920766e-06, 'epoch': 0.57}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
57%|█████▋ | 12492/22095 [21:34:16<12:47:18, 4.79s/it] {'loss': 0.3052, 'grad_norm': 0.6346047768285362, 'learning_rate': 4.188596100332953e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12493/22095 [21:34:25<16:30:55, 6.19s/it] {'loss': 0.4703, 'grad_norm': 0.5467035608516998, 'learning_rate': 4.1878729031797165e-06, 'epoch': 0.57}
57%|█████▋ | 12494/22095 [21:34:28<14:06:07, 5.29s/it] {'loss': 0.3133, 'grad_norm': 0.6546999103912576, 'learning_rate': 4.187149723476597e-06, 'epoch': 0.57}
57%|█████▋ | 12495/22095 [21:34:31<12:25:46, 4.66s/it] {'loss': 0.3387, 'grad_norm': 0.6745645431237851, 'learning_rate': 4.186426561239134e-06, 'epoch': 0.57}
57%|█████▋ | 12496/22095 [21:34:35<11:41:35, 4.39s/it] {'loss': 0.2982, 'grad_norm': 0.6720264368361668, 'learning_rate': 4.185703416482867e-06, 'epoch': 0.57}
57%|█████▋ | 12497/22095 [21:34:39<10:51:33, 4.07s/it] {'loss': 0.3212, 'grad_norm': 0.5966460736841118, 'learning_rate': 4.184980289223331e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12498/22095 [21:34:44<12:09:50, 4.56s/it] {'loss': 0.4905, 'grad_norm': 0.5941486030441155, 'learning_rate': 4.184257179476065e-06, 'epoch': 0.57}
57%|█████▋ | 12499/22095 [21:34:48<11:24:49, 4.28s/it] {'loss': 0.283, 'grad_norm': 0.6087554097990314, 'learning_rate': 4.183534087256609e-06, 'epoch': 0.57}
57%|█████▋ | 12500/22095 [21:34:51<10:30:44, 3.94s/it] {'loss': 0.3618, 'grad_norm': 0.6923120727230525, 'learning_rate': 4.182811012580495e-06, 'epoch': 0.57}
57%|█████▋ | 12501/22095 [21:34:56<11:03:02, 4.15s/it] {'loss': 0.3, 'grad_norm': 0.6157486806061441, 'learning_rate': 4.182087955463264e-06, 'epoch': 0.57}
57%|█████▋ | 12502/22095 [21:34:59<10:31:46, 3.95s/it] {'loss': 0.316, 'grad_norm': 0.5838262062234353, 'learning_rate': 4.181364915920453e-06, 'epoch': 0.57}
57%|█████▋ | 12503/22095 [21:35:02<9:56:13, 3.73s/it] {'loss': 0.3127, 'grad_norm': 0.7193124084515455, 'learning_rate': 4.180641893967593e-06, 'epoch': 0.57}
57%|█████▋ | 12504/22095 [21:35:06<9:37:14, 3.61s/it] {'loss': 0.3122, 'grad_norm': 0.655315477496645, 'learning_rate': 4.179918889620221e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (76082 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12505/22095 [21:35:09<9:24:19, 3.53s/it] {'loss': 0.3616, 'grad_norm': 0.645451805585538, 'learning_rate': 4.179195902893878e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12506/22095 [21:35:13<9:54:33, 3.72s/it] {'loss': 0.2893, 'grad_norm': 0.6778970618048709, 'learning_rate': 4.17847293380409e-06, 'epoch': 0.57}
57%|█████▋ | 12507/22095 [21:35:17<9:42:56, 3.65s/it] {'loss': 0.3205, 'grad_norm': 0.5948215804243057, 'learning_rate': 4.177749982366397e-06, 'epoch': 0.57}
57%|█████▋ | 12508/22095 [21:35:20<9:46:40, 3.67s/it] {'loss': 0.3061, 'grad_norm': 0.6467076040360541, 'learning_rate': 4.17702704859633e-06, 'epoch': 0.57}
57%|█████▋ | 12509/22095 [21:35:24<9:28:37, 3.56s/it] {'loss': 0.3304, 'grad_norm': 0.5704107704943451, 'learning_rate': 4.176304132509428e-06, 'epoch': 0.57}
57%|█████▋ | 12510/22095 [21:35:28<9:55:21, 3.73s/it] {'loss': 0.3336, 'grad_norm': 0.6390073058823326, 'learning_rate': 4.175581234121216e-06, 'epoch': 0.57}
57%|█████▋ | 12511/22095 [21:35:32<10:14:33, 3.85s/it] {'loss': 0.311, 'grad_norm': 0.6960136454544098, 'learning_rate': 4.174858353447234e-06, 'epoch': 0.57}
57%|█████▋ | 12512/22095 [21:35:35<9:40:10, 3.63s/it] {'loss': 0.3133, 'grad_norm': 0.6104350462181477, 'learning_rate': 4.1741354905030115e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12513/22095 [21:35:41<11:46:32, 4.42s/it] {'loss': 0.4926, 'grad_norm': 0.2917322371481019, 'learning_rate': 4.17341264530408e-06, 'epoch': 0.57}
57%|█████▋ | 12514/22095 [21:35:45<10:51:24, 4.08s/it] {'loss': 0.3293, 'grad_norm': 0.7144016324699435, 'learning_rate': 4.1726898178659714e-06, 'epoch': 0.57}
57%|█████▋ | 12515/22095 [21:35:48<9:56:41, 3.74s/it] {'loss': 0.2988, 'grad_norm': 0.7909838933014116, 'learning_rate': 4.1719670082042194e-06, 'epoch': 0.57}
57%|█████▋ | 12516/22095 [21:35:51<9:33:11, 3.59s/it] {'loss': 0.2753, 'grad_norm': 0.5857732475977157, 'learning_rate': 4.171244216334353e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11245602 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8709, 'image': 'MMMU/Pharmacy/test_112_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: In a blast furnace, iron(lll) oxide is used to produce iron by the following (unbalanced) reaction: ,How many moles of each product are formed\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': "### Introduction to the Chemical Reaction\n\nThe chemical reaction given is:\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + \\text{CO}(g) \\rightarrow \\text{Fe}(s) + \\text{CO}_2(g) \\]\n\nThis is a redox reaction occurring in a blast furnace, where iron(III) oxide (hematite) is reduced to iron, and carbon monoxide is oxidized to carbon dioxide. This reaction is central to iron extraction and metallurgical processes.\n\n### Iron(III) Oxide (\\( \\text{Fe}_2\\text{O}_3 \\))\n\n#### Composition and Properties\nIron(III) oxide, or ferric oxide, is one of the iron oxides with a chemical formula Fe₂O₃. It is a reddish-brown solid that is prevalent in nature as hematite, a key iron ore.
\n\n- **Molecular Weight**: 159.69 g/mol\n- **Occurrence**: Naturally found as hematite, a principal ore of iron.\n- **Properties**: Fe₂O₃ is insoluble in water and exhibits magnetic properties.\n\n#### Applications\n- **Iron Production**: It is primarily used in steel manufacturing.\n- **Pigments**: Used as a pigment in cosmetics, paints, and coatings.\n- **Magnetic Applications**: Some forms are used in magnetic recording materials.\n\n#### Related Reactions\n- **Reduction in Blast Furnace**:\n \\[ \\text{Fe}_2\\text{O}_3 + 3\\text{CO} \\rightarrow 2\\text{Fe} + 3\\text{CO}_2 \\]\n This is the complete balanced equation for reduction in a blast furnace.\n\n### Carbon Monoxide (\\( \\text{CO} \\))\n\n#### Composition and Properties\nCarbon monoxide is a colorless, odorless, and tasteless gas with the formula CO. It is lighter than air.\n\n- **Molecular Weight**: 28.01 g/mol\n- **Properties**: It is a toxic gas with a strong affinity for hemoglobin, forming carboxyhemoglobin, which prevents blood from carrying oxygen.\n\n#### Applications\n- **Metallurgy**: Used as a reducing agent in metal extraction.\n- **Chemical Industry**: Building block for various chemicals.\n- **Fuel**: Part of syngas (a mixture of carbon monoxide and hydrogen).\n\n#### Related Reactions\n- **Combustion**:\n \\[ 2\\text{CO} + \\text{O}_2 \\rightarrow 2\\text{CO}_2 \\]\n CO is oxidized to CO₂ in excess oxygen.\n\n### Iron (\\( \\text{Fe} \\))\n\n#### Composition and Properties\nIron is a metallic element with the symbol Fe. 
It is essential in construction and manufacturing.\n\n- **Molecular Weight**: 55.85 g/mol\n- **Occurrence**: Found in ores like hematite (Fe₂O₃), magnetite (Fe₃O₄), and more.\n- **Properties**: Ferromagnetic, malleable, and ductile.\n\n#### Applications\n- **Construction**: Primary component in steel and alloys.\n- **Manufacturing**: Machinery, automotive parts, and tools.\n- **Biological**: Vital for blood production in the form of hemoglobin.\n\n#### Related Reactions\n- **Rust Formation**:\n \\[ 4\\text{Fe} + 3\\text{O}_2 \\rightarrow 2\\text{Fe}_2\\text{O}_3 \\]\n Fe is oxidized to Fe₂O₃ in the presence of oxygen.\n\n### Carbon Dioxide (\\( \\text{CO}_2 \\))\n\n#### Composition and Properties\nCarbon dioxide is a colorless, odorless gas with the formula CO₂. It is slightly acidic and soluble in water, forming carbonic acid.\n\n- **Molecular Weight**: 44.01 g/mol\n- **Occurrence**: Found in Earth's atmosphere at low concentrations.\n\n#### Applications\n- **Carbonation**: Used in soft drinks and sparkling water.\n- **Refrigeration**: Used as a cooling agent (dry ice).\n- **Fire Extinguishing**: Utilized in fire suppression systems.\n\n#### Related Reactions\n- **Photosynthesis**:\n \\[ 6\\text{CO}_2 + 6\\text{H}_2\\text{O} \\rightarrow \\text{C}_6\\text{H}_{12}\\text{O}_6 + 6\\text{O}_2 \\]\n Plants convert CO₂ into glucose using sunlight.\n\n### Metallurgical Processes\n\nIron extraction in a blast furnace involves several steps:\n\n1. **Charge Preparation**: Iron ore, coke, and limestone are added to the furnace.\n2. **Reduction Zone**: CO reduces Fe₂O₃ to Fe.\n3. **Slag Formation**: Limestone forms calcium silicate slag, removing impurities.\n4. 
**Iron Tapping**: Molten iron is tapped at the furnace base.\n\n### Calculating Moles of Products\n\nThe moles of Fe and CO₂ produced can be determined by the stoichiometry of the balanced chemical reaction:\n\n\\[ \\text{Fe}_2\\text{O}_3 + 3\\text{CO} \\rightarrow 2\\text{Fe} + 3\\text{CO}_2 \\]\n\nHere, 1 mole of Fe₂O₃ produces 2 moles of Fe and 3 moles of CO₂.\n\n#### Application of Stoichiometry\nTo find the number of moles of Fe and CO₂ produced, stoichiometric calculations are performed:\n\n- **Stoichiometric Ratios**: Derived from balanced equation coefficients.\n\n### Environmental and Safety Considerations\n\n#### Emissions and Pollution\n- **Carbon Emissions**: CO and CO₂ are greenhouse gases.\n- **Air Quality**: CO is toxic. Proper ventilation and emissions controls are essential.\n\n#### Safety Protocols\n- **Handling**: Use protective equipment and ensure proper ventilation.\n- **Monitoring**: Regular air quality monitoring is necessary in industrial settings.\n\n### Conclusion\n\nUnderstanding the chemistry of iron extraction involves mastering several concepts, including oxidation-reduction reactions, stoichiometry, and environmental impacts. Iron plays a crucial role in numerous industries, and its production from iron(III) oxide through the blast furnace process exemplifies the integration of chemical knowledge with practical applications. 
Proper management of by-products and emissions is vital for environmentally sustainable practices in metallurgy."}]} 57%|█████▋ | 12517/22095 [21:35:55<9:44:41, 3.66s/it] {'loss': 0.2952, 'grad_norm': 0.7735522212373134, 'learning_rate': 4.1705214422719024e-06, 'epoch': 0.57} 57%|█████▋ | 12517/22095 [21:35:55<9:44:41, 3.66s/it] 57%|█████▋ | 12518/22095 [21:35:58<9:30:06, 3.57s/it] {'loss': 0.2912, 'grad_norm': 0.5934958361037524, 'learning_rate': 4.1697986860324e-06, 'epoch': 0.57} 57%|█████▋ | 12518/22095 [21:35:58<9:30:06, 3.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [689, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8448236 in VC:s3://internvl-moe-sft-data/. Exception: Image size [689, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4497, 'image': 'vrdu_texteq/astro-ph.CO/d3097677-08eb-4d6a-843e-6fa7a0eac283.png', 'image_wh': [[689, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'The absence of a correlation is excluded at the $15\\sigma$ level.'}]} 57%|█████▋ | 12519/22095 [21:36:01<8:57:19, 3.37s/it] {'loss': 0.2762, 'grad_norm': 0.5350188760957395, 'learning_rate': 4.169075947631371e-06, 'epoch': 0.57} 57%|█████▋ | 12519/22095 [21:36:01<8:57:19, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45881 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12520/22095 [21:36:05<9:18:26, 3.50s/it] {'loss': 0.3131, 'grad_norm': 0.86593002964982, 'learning_rate': 4.1683532270843505e-06, 'epoch': 0.57} 57%|█████▋ | 12520/22095 [21:36:05<9:18:26, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8303230 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1SUSobv2H8KJjy1zkXXXr7pXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否帮忙识别这张图片里的全部文本并告诉我?'}, {'from': 'gpt', 'value': '图中所有文字:\n我能行\nIcandoit!\n离成功近一点\n对自己狠一点\n梦\n把握生命中的每一天,全力以赴我们心中的\nHeartofthedream'}]} 57%|█████▋ | 12521/22095 [21:36:07<8:42:38, 3.28s/it] {'loss': 0.2886, 'grad_norm': 0.8766304309226985, 'learning_rate': 4.1676305244068645e-06, 'epoch': 0.57} 57%|█████▋ | 12521/22095 [21:36:08<8:42:38, 3.28s/it] 57%|█████▋ | 12522/22095 [21:36:10<8:27:09, 3.18s/it] {'loss': 0.2739, 'grad_norm': 0.6122471541271316, 'learning_rate': 4.166907839614442e-06, 'epoch': 0.57} 57%|█████▋ | 12522/22095 [21:36:10<8:27:09, 3.18s/it] 57%|█████▋ | 12523/22095 [21:36:14<8:53:37, 3.34s/it] {'loss': 0.3459, 'grad_norm': 0.6411446012105898, 'learning_rate': 4.16618517272261e-06, 'epoch': 0.57} 57%|█████▋ | 12523/22095 [21:36:14<8:53:37, 3.34s/it] 57%|█████▋ | 12524/22095 [21:36:18<9:26:03, 3.55s/it] {'loss': 0.3422, 'grad_norm': 0.6081462210575242, 'learning_rate': 
4.165462523746899e-06, 'epoch': 0.57} 57%|█████▋ | 12524/22095 [21:36:18<9:26:03, 3.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047942 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 57%|█████▋ | 12525/22095 [21:36:22<9:25:03, 3.54s/it] {'loss': 0.2996, 'grad_norm': 0.5875629432466725, 'learning_rate': 4.164739892702836e-06, 'epoch': 0.57} 57%|█████▋ | 12525/22095 [21:36:22<9:25:03, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12526/22095 [21:36:29<12:10:53, 4.58s/it] {'loss': 0.4936, 'grad_norm': 0.3287040692234207, 'learning_rate': 4.164017279605946e-06, 'epoch': 0.57} 57%|█████▋ | 12526/22095 [21:36:29<12:10:53, 4.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12527/22095 [21:36:32<11:17:32, 4.25s/it] {'loss': 0.3036, 'grad_norm': 0.6252197324600116, 'learning_rate': 4.163294684471757e-06, 'epoch': 0.57} 57%|█████▋ | 12527/22095 [21:36:32<11:17:32, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12528/22095 [21:36:41<14:43:19, 5.54s/it] {'loss': 0.475, 'grad_norm': 
0.301561350521929, 'learning_rate': 4.162572107315798e-06, 'epoch': 0.57} 57%|█████▋ | 12528/22095 [21:36:41<14:43:19, 5.54s/it] 57%|█████▋ | 12529/22095 [21:36:44<12:49:52, 4.83s/it] {'loss': 0.2777, 'grad_norm': 0.6671373326093292, 'learning_rate': 4.161849548153589e-06, 'epoch': 0.57} 57%|█████▋ | 12529/22095 [21:36:44<12:49:52, 4.83s/it] 57%|█████▋ | 12530/22095 [21:36:47<11:31:37, 4.34s/it] {'loss': 0.324, 'grad_norm': 0.6620647807278043, 'learning_rate': 4.161127007000662e-06, 'epoch': 0.57} 57%|█████▋ | 12530/22095 [21:36:47<11:31:37, 4.34s/it] 57%|█████▋ | 12531/22095 [21:36:50<10:23:22, 3.91s/it] {'loss': 0.3413, 'grad_norm': 0.6390365926440836, 'learning_rate': 4.160404483872538e-06, 'epoch': 0.57} 57%|█████▋ | 12531/22095 [21:36:50<10:23:22, 3.91s/it] 57%|█████▋ | 12532/22095 [21:36:53<9:46:11, 3.68s/it] {'loss': 0.3346, 'grad_norm': 0.6515037937373498, 'learning_rate': 4.159681978784743e-06, 'epoch': 0.57} 57%|█████▋ | 12532/22095 [21:36:53<9:46:11, 3.68s/it] 57%|█████▋ | 12533/22095 [21:36:57<9:30:27, 3.58s/it] {'loss': 0.2838, 'grad_norm': 0.70976713212594, 'learning_rate': 4.1589594917528006e-06, 'epoch': 0.57} 57%|█████▋ | 12533/22095 [21:36:57<9:30:27, 3.58s/it] 57%|█████▋ | 12534/22095 [21:37:00<9:09:08, 3.45s/it] {'loss': 0.3601, 'grad_norm': 0.6151749329986993, 'learning_rate': 4.158237022792237e-06, 'epoch': 0.57} 57%|█████▋ | 12534/22095 [21:37:00<9:09:08, 3.45s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12535/22095 [21:37:03<8:59:52, 3.39s/it] {'loss': 0.3204, 'grad_norm': 0.5838365082606324, 'learning_rate': 4.157514571918574e-06, 'epoch': 0.57} 57%|█████▋ | 12535/22095 [21:37:03<8:59:52, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59316 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12536/22095 [21:37:07<9:35:58, 3.62s/it] {'loss': 0.3476, 'grad_norm': 0.6441399562644541, 'learning_rate': 4.156792139147336e-06, 'epoch': 0.57} 57%|█████▋ | 12536/22095 [21:37:07<9:35:58, 3.62s/it] 57%|█████▋ | 12537/22095 [21:37:11<9:49:54, 3.70s/it] {'loss': 0.3006, 'grad_norm': 0.5794311590688679, 'learning_rate': 4.156069724494043e-06, 'epoch': 0.57} 57%|█████▋ | 12537/22095 [21:37:11<9:49:54, 3.70s/it] 57%|█████▋ | 12538/22095 [21:37:14<9:34:32, 3.61s/it] {'loss': 0.3577, 'grad_norm': 0.6359896469226998, 'learning_rate': 4.155347327974223e-06, 'epoch': 0.57} 57%|█████▋ | 12538/22095 [21:37:14<9:34:32, 3.61s/it] 57%|█████▋ | 12539/22095 [21:37:18<9:40:18, 3.64s/it] {'loss': 0.3332, 'grad_norm': 0.9274568004702104, 'learning_rate': 4.154624949603391e-06, 'epoch': 0.57} 57%|█████▋ | 12539/22095 [21:37:18<9:40:18, 3.64s/it] 57%|█████▋ | 12540/22095 [21:37:22<9:32:01, 3.59s/it] {'loss': 0.3121, 'grad_norm': 0.6045661204640542, 'learning_rate': 4.153902589397075e-06, 'epoch': 0.57} 57%|█████▋ | 12540/22095 [21:37:22<9:32:01, 3.59s/it] 57%|█████▋ | 12541/22095 [21:37:25<9:04:45, 3.42s/it] {'loss': 0.2814, 'grad_norm': 0.5885564599166756, 'learning_rate': 4.153180247370794e-06, 'epoch': 0.57} 57%|█████▋ | 12541/22095 [21:37:25<9:04:45, 3.42s/it] 57%|█████▋ | 12542/22095 [21:37:28<8:42:14, 3.28s/it] {'loss': 0.3642, 'grad_norm': 0.6470512546656807, 'learning_rate': 4.152457923540068e-06, 'epoch': 0.57} 57%|█████▋ | 12542/22095 [21:37:28<8:42:14, 3.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44163 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12543/22095 [21:37:30<8:23:19, 3.16s/it] {'loss': 0.3252, 'grad_norm': 0.6405301017985353, 'learning_rate': 4.151735617920417e-06, 'epoch': 0.57} 57%|█████▋ | 12543/22095 [21:37:30<8:23:19, 3.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12544/22095 [21:37:40<13:36:37, 5.13s/it] {'loss': 0.5038, 'grad_norm': 0.44958018256330395, 'learning_rate': 4.151013330527364e-06, 'epoch': 0.57} 57%|█████▋ | 12544/22095 [21:37:40<13:36:37, 5.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (95809 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50696 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12545/22095 [21:37:45<13:08:55, 4.96s/it] {'loss': 0.2978, 'grad_norm': 0.6490291798847689, 'learning_rate': 4.150291061376426e-06, 'epoch': 0.57} 57%|█████▋ | 12545/22095 [21:37:45<13:08:55, 4.96s/it] 57%|█████▋ | 12546/22095 [21:37:49<12:32:46, 4.73s/it] {'loss': 0.3462, 'grad_norm': 0.7109462139773439, 'learning_rate': 4.149568810483124e-06, 'epoch': 0.57} 57%|█████▋ | 12546/22095 [21:37:49<12:32:46, 4.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52644 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12547/22095 [21:37:52<11:02:45, 4.16s/it] {'loss': 0.3077, 'grad_norm': 0.6038537660170973, 'learning_rate': 4.148846577862977e-06, 'epoch': 0.57} 57%|█████▋ | 12547/22095 [21:37:52<11:02:45, 4.16s/it] 57%|█████▋ | 12548/22095 [21:37:56<10:43:57, 4.05s/it] {'loss': 0.324, 'grad_norm': 0.628410020280818, 'learning_rate': 4.148124363531501e-06, 'epoch': 0.57} 57%|█████▋ | 12548/22095 [21:37:56<10:43:57, 4.05s/it] 57%|█████▋ | 12549/22095 [21:37:59<10:12:31, 3.85s/it] {'loss': 0.3144, 'grad_norm': 0.6446328523395594, 'learning_rate': 4.147402167504218e-06, 'epoch': 0.57} 57%|█████▋ | 12549/22095 [21:37:59<10:12:31, 3.85s/it] 57%|█████▋ | 12550/22095 [21:38:02<9:33:31, 3.61s/it] {'loss': 0.3268, 'grad_norm': 0.5789304471638873, 'learning_rate': 4.146679989796643e-06, 'epoch': 0.57} 57%|█████▋ | 12550/22095 [21:38:02<9:33:31, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12551/22095 [21:38:09<12:23:48, 4.68s/it] {'loss': 0.486, 'grad_norm': 0.2817358866934816, 'learning_rate': 4.145957830424294e-06, 'epoch': 0.57} 57%|█████▋ | 12551/22095 [21:38:09<12:23:48, 4.68s/it] 57%|█████▋ | 12552/22095 [21:38:13<11:57:23, 4.51s/it] {'loss': 0.3289, 'grad_norm': 0.6622630583860681, 'learning_rate': 4.145235689402688e-06, 'epoch': 0.57} 57%|█████▋ | 12552/22095 [21:38:13<11:57:23, 4.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50978 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12553/22095 [21:38:23<16:21:21, 6.17s/it] {'loss': 0.4641, 'grad_norm': 0.3054074301809164, 'learning_rate': 4.144513566747342e-06, 'epoch': 0.57} 57%|█████▋ | 12553/22095 [21:38:23<16:21:21, 6.17s/it] 57%|█████▋ | 12554/22095 [21:38:27<14:01:59, 5.30s/it] {'loss': 0.2911, 'grad_norm': 0.6051067622157187, 'learning_rate': 4.143791462473774e-06, 'epoch': 0.57} 57%|█████▋ | 12554/22095 [21:38:27<14:01:59, 5.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12555/22095 [21:38:36<17:29:05, 6.60s/it] {'loss': 0.4848, 'grad_norm': 0.27926213530322874, 'learning_rate': 4.143069376597496e-06, 'epoch': 0.57} 57%|█████▋ | 12555/22095 [21:38:36<17:29:05, 6.60s/it] 57%|█████▋ | 12556/22095 [21:38:39<14:44:55, 5.57s/it] {'loss': 0.3453, 'grad_norm': 0.6382366524036927, 'learning_rate': 4.142347309134024e-06, 'epoch': 0.57} 57%|█████▋ | 12556/22095 [21:38:39<14:44:55, 5.57s/it] 57%|█████▋ | 12557/22095 [21:38:42<12:38:40, 4.77s/it] {'loss': 0.2998, 'grad_norm': 0.6275300895206359, 'learning_rate': 4.141625260098878e-06, 'epoch': 0.57} 57%|█████▋ | 12557/22095 [21:38:42<12:38:40, 4.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50594 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106425 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68317 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58286 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12558/22095 [21:38:46<11:55:32, 4.50s/it] {'loss': 0.2994, 'grad_norm': 0.5958461166824709, 'learning_rate': 4.140903229507566e-06, 'epoch': 0.57} 57%|█████▋ | 12558/22095 [21:38:46<11:55:32, 4.50s/it] 57%|█████▋ | 12559/22095 [21:38:49<10:42:32, 4.04s/it] {'loss': 0.2883, 'grad_norm': 0.6190751677743475, 'learning_rate': 4.1401812173756055e-06, 'epoch': 0.57} 57%|█████▋ | 12559/22095 [21:38:49<10:42:32, 4.04s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8367288 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 34036, 'image': 'vrdu_table_final_2/astro-ph.CO/23cecfb4-77b9-434f-bca7-ea30524c25f8.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]} 57%|█████▋ | 12560/22095 [21:38:53<10:13:36, 3.86s/it] {'loss': 0.2983, 'grad_norm': 0.7678343018745986, 'learning_rate': 4.139459223718511e-06, 'epoch': 0.57} 57%|█████▋ | 12560/22095 [21:38:53<10:13:36, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12561/22095 [21:39:00<12:45:00, 4.81s/it] {'loss': 0.4877, 'grad_norm': 0.3230824047445227, 'learning_rate': 4.138737248551793e-06, 'epoch': 0.57} 57%|█████▋ | 12561/22095 [21:39:00<12:45:00, 4.81s/it] 57%|█████▋ | 12562/22095 [21:39:03<11:30:30, 4.35s/it] {'loss': 0.295, 'grad_norm': 0.6874374038803182, 'learning_rate': 4.1380152918909665e-06, 'epoch': 0.57} 57%|█████▋ | 12562/22095 [21:39:03<11:30:30, 4.35s/it] 57%|█████▋ | 12563/22095 [21:39:06<10:42:36, 4.04s/it] {'loss': 0.3186, 'grad_norm': 0.6884284518616037, 'learning_rate': 4.137293353751546e-06, 'epoch': 0.57} 57%|█████▋ | 12563/22095 [21:39:06<10:42:36, 4.04s/it] 57%|█████▋ | 12564/22095 [21:39:10<10:40:04, 4.03s/it] {'loss': 0.3564, 'grad_norm': 0.5863243128631148, 'learning_rate': 4.13657143414904e-06, 'epoch': 0.57} 57%|█████▋ | 12564/22095 [21:39:10<10:40:04, 4.03s/it] 57%|█████▋ | 12565/22095 [21:39:13<9:52:40, 3.73s/it] {'loss': 0.294, 'grad_norm': 0.627551766252621, 'learning_rate': 4.1358495330989625e-06, 'epoch': 0.57} 57%|█████▋ | 12565/22095 [21:39:13<9:52:40, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (65801 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91710 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76254 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12566/22095 [21:39:23<14:24:51, 5.45s/it] {'loss': 0.4816, 'grad_norm': 0.27480827574499894, 'learning_rate': 4.1351276506168235e-06, 'epoch': 0.57} 57%|█████▋ | 12566/22095 [21:39:23<14:24:51, 5.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [478, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8478789 in VC:s3://internvl-moe-sft-data/. Exception: Image size [478, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 110618, 'image': 'vrdu_texteq/astro-ph.CO/054d7af3-1c29-4d0e-8993-92e1ee9e7243.png', 'image_wh': [[478, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'where $N_{\\rm E}$ is the number of realizations.'}]} 57%|█████▋ | 12567/22095 [21:39:27<13:35:21, 5.13s/it] {'loss': 0.3285, 'grad_norm': 0.6537997783814505, 'learning_rate': 4.134405786718138e-06, 'epoch': 0.57} 57%|█████▋ | 12567/22095 [21:39:27<13:35:21, 5.13s/it] 57%|█████▋ | 12568/22095 [21:39:30<11:41:13, 4.42s/it] {'loss': 0.3106, 'grad_norm': 0.6606229899697229, 'learning_rate': 4.133683941418411e-06, 'epoch': 0.57} 57%|█████▋ | 12568/22095 [21:39:30<11:41:13, 4.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12569/22095 [21:39:33<10:23:24, 3.93s/it] {'loss': 0.3025, 'grad_norm': 0.6383340830097654, 'learning_rate': 4.132962114733156e-06, 'epoch': 0.57} 57%|█████▋ | 12569/22095 [21:39:33<10:23:24, 3.93s/it] 57%|█████▋ | 12570/22095 [21:39:36<10:05:12, 3.81s/it] {'loss': 0.2986, 'grad_norm': 0.6340031462442014, 'learning_rate': 4.132240306677883e-06, 'epoch': 0.57} 57%|█████▋ | 12570/22095 [21:39:36<10:05:12, 3.81s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8385325 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 52129, 'image': 'vrdu_table_final_2/astro-ph.CO/2fa9ad82-edf5-4b9f-b0d4-129c66b75ff5.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{#3}#4\\end{tabular}\n```'}]} 57%|█████▋ | 12571/22095 [21:39:39<9:34:30, 3.62s/it] {'loss': 0.3112, 'grad_norm': 0.7355684662490959, 'learning_rate': 4.1315185172681e-06, 'epoch': 0.57} 57%|█████▋ | 12571/22095 [21:39:39<9:34:30, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56252 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51340 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12572/22095 [21:39:43<9:29:05, 3.59s/it] {'loss': 0.3361, 'grad_norm': 0.6237544027549345, 'learning_rate': 4.130796746519316e-06, 'epoch': 0.57} 57%|█████▋ | 12572/22095 [21:39:43<9:29:05, 3.59s/it] 57%|█████▋ | 12573/22095 [21:39:47<9:36:02, 3.63s/it] {'loss': 0.3462, 'grad_norm': 0.628615082490157, 'learning_rate': 4.130074994447042e-06, 'epoch': 0.57} 57%|█████▋ | 12573/22095 [21:39:47<9:36:02, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12574/22095 [21:39:56<14:28:48, 5.48s/it] {'loss': 0.4807, 'grad_norm': 0.3102732059637665, 'learning_rate': 4.129353261066784e-06, 'epoch': 0.57} 57%|█████▋ | 12574/22095 [21:39:56<14:28:48, 5.48s/it] 57%|█████▋ | 12575/22095 [21:40:00<13:15:35, 5.01s/it] {'loss': 0.287, 'grad_norm': 0.5653306933787046, 'learning_rate': 4.12863154639405e-06, 'epoch': 0.57} 57%|█████▋ | 12575/22095 
[21:40:00<13:15:35, 5.01s/it] 57%|█████▋ | 12576/22095 [21:40:03<11:40:15, 4.41s/it] {'loss': 0.2981, 'grad_norm': 0.6455550118701715, 'learning_rate': 4.127909850444349e-06, 'epoch': 0.57} 57%|█████▋ | 12576/22095 [21:40:03<11:40:15, 4.41s/it] 57%|█████▋ | 12577/22095 [21:40:07<10:46:47, 4.08s/it] {'loss': 0.3185, 'grad_norm': 0.6443727050940745, 'learning_rate': 4.127188173233185e-06, 'epoch': 0.57} 57%|█████▋ | 12577/22095 [21:40:07<10:46:47, 4.08s/it] 57%|█████▋ | 12578/22095 [21:40:11<11:18:13, 4.28s/it] {'loss': 0.3418, 'grad_norm': 0.6621726582166643, 'learning_rate': 4.126466514776067e-06, 'epoch': 0.57} 57%|█████▋ | 12578/22095 [21:40:11<11:18:13, 4.28s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [198, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8470174 in VC:s3://internvl-moe-sft-data/. Exception: Image size [198, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 155410, 'image': 'vrdu_texteq/astro-ph.CO/57802e7e-ffca-4914-b1d4-820c8543cad6.png', 'image_wh': [[198, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'at over $95\\%$ CL.'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12579/22095 [21:40:14<10:20:47, 3.91s/it] {'loss': 0.3396, 'grad_norm': 0.6077376379773018, 'learning_rate': 4.125744875088502e-06, 'epoch': 0.57} 57%|█████▋ | 12579/22095 [21:40:14<10:20:47, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12580/22095 [21:40:24<14:40:39, 5.55s/it] {'loss': 0.4544, 'grad_norm': 0.28837605872185323, 'learning_rate': 4.125023254185995e-06, 'epoch': 0.57} 57%|█████▋ | 12580/22095 [21:40:24<14:40:39, 5.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11245568 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8675, 'image': 'MMMU/Pharmacy/test_78_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. 
You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: In a blast furnace, iron(lll) oxide is used to produce iron by the following (unbalanced) reaction: ,If 4.00 kg Fe2O3 are available to react, how many moles of CO are needed?\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': "## Introduction to the Reaction\n\n### Chemical Equation\nThe unbalanced chemical reaction between iron(III) oxide (\\( \\text{Fe}_2\\text{O}_3 \\)) and carbon monoxide (\\( \\text{CO} \\)) in a blast furnace is used to reduce iron ore to produce metallic iron (\\( \\text{Fe} \\)). This process is essential in the steel-making industry. The chemical equation for this reaction is:\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + \\text{CO}(g) \\rightarrow \\text{Fe}(s) + \\text{CO}_2(g) \\]\n\n### Balancing Chemical Equations\nBalancing the chemical equation ensures that matter is conserved in accordance with the law of conservation of mass. Each reactant and product must have the same quantity of each type of atom. Here, the balanced form of the equation is:\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\]\n\nThis equation indicates that one mole of iron(III) oxide reacts with three moles of carbon monoxide to produce two moles of iron and three moles of carbon dioxide.\n\n### Stoichiometry\nStoichiometry deals with the quantitative relationships of the reactants and products in a chemical reaction. It helps in calculating the amount of reactants required or products formed in a reaction. 
The stoichiometric coefficients in a balanced equation, like 1:3:2:3 in this case, show the molar relationships.\n\n### Applications of the Reaction\n1. **Iron Production**: \n - This reaction is pivotal in the extraction of iron in the steel industry. Iron, extracted from iron ore using carbon monoxide, serves as a primary material in building structures, machinery, vehicles, and many other forms of infrastructure.\n \n2. **Blast Furnace Operation**:\n - A blast furnace is the primary machine used in this reduction process. It operates at high temperatures and facilitates the conversion of iron oxide to pure iron. The blast furnace also helps in the removal of impurities from the iron ore through chemical reactions and physical processes.\n\n3. **Role of Carbon Monoxide**:\n - Carbon monoxide acts as a reducing agent in this reaction, forming carbon dioxide as a byproduct. Its ability to convert metal oxides to metals by removing oxygen is a critical chemical property utilized not only in iron extraction but also in other metallurgical processes.\n\n### Detailed Explanation of Key Concepts\n\n#### Iron(III) Oxide (\\( \\text{Fe}_2\\text{O}_3 \\))\n- **Properties**:\n - A reddish-brown, crystalline or powdery substance, iron(III) oxide is a compound of iron and oxygen.\n - It is naturally found as the mineral hematite and serves as an important ore of iron.\n\n- **Industrial Uses**:\n - Beyond metallurgy, \\( \\text{Fe}_2\\text{O}_3 \\) is used as a pigment in coatings, paints, and varnishes due to its color.\n - It is also used in magnetic applications and as a polishing agent.\n\n#### Carbon Monoxide (\\( \\text{CO} \\))\n- **Properties**:\n - A colorless, odorless, and tasteless gas, carbon monoxide is slightly lighter than air.\n - It is highly flammable and toxic to humans and animals when present in the air.\n\n- **Production**:\n - Produced industrially through the incomplete combustion of carbon-based fuels, it is a byproduct of many combustion 
engines.\n\n- **Applications**:\n - Beyond its metallurgical use, carbon monoxide is a precursor to many industrial chemicals like acetic acid.\n - It's also used in the production of synthetic chemicals through processes like oxo synthesis.\n\n#### Iron (\\( \\text{Fe} \\))\n- **Properties**:\n - A lustrous, ductile, and malleable metal, iron is prone to corrosion in moist air, forming rust.\n\n- **Industrial Importance**:\n - As a component of steel, it becomes a versatile material used worldwide in construction, automotive, and manufacturing industries.\n - Iron is also an essential element in biological systems, serving as a key component of hemoglobin in blood.\n\n### Calculating Moles\n\nTo solve problems involving stoichiometry, one must often calculate moles, molar mass, and apply conversion factors derived from balanced equations.\n\n#### Molar Mass\n- **Iron(III) Oxide (\\( \\text{Fe}_2\\text{O}_3 \\))**:\n - Calculate using atomic masses: \n - Iron (Fe) = 55.85 g/mol, \n - Oxygen (O) = 16.00 g/mol.\n - Molar mass = 2(55.85) + 3(16.00) = 159.70 g/mol.\n\n- **Use in Calculations**:\n - Knowing the molar mass helps convert a given mass of a substance to moles, facilitating the use of stoichiometry.\n\n#### Stoichiometric Calculations\n- Initial steps involve finding the moles of \\( \\text{Fe}_2\\text{O}_3 \\) present by dividing its provided mass by its molar mass.\n- Use the balanced equation to determine the moles of \\( \\text{CO} \\) required, considering the stoichiometric ratio of 1:3 between \\( \\text{Fe}_2\\text{O}_3 \\) and \\( \\text{CO} \\).\n\n### Real-World Implications\n\n#### Environmental Considerations\n- **Carbon Emissions**:\n - The combustion of carbon-based fuels in the production of \\( \\text{CO} \\) contributes to carbon emissions.\n - Efficient and sustainable practices are crucial to minimize the environmental impact of such industrial activities.\n\n- **Pollution Control**:\n - Technologies like carbon capture and storage (CCS) 
are increasingly employed in industries to mitigate the release of \\( \\text{CO}_2 \\) from processes like this one.\n\n#### Economic Impact\n- **Steel Industry**:\n - Primary input material, iron ore processing plays a key role in the global economy.\n - Fluctuations in iron and steel prices can significantly impact manufacturing industries worldwide.\n\n#### Future Trends in Iron Production\n- As the steel industry seeks to reduce its carbon footprint, alternative reducing agents like hydrogen are being explored.\n- Innovations in blast furnace technology focus on improving energy efficiency and reducing raw material consumption.\n\n### Conclusion\n\nUnderstanding the detailed chemistry and application of the reduction of iron(III) oxide is essential for students and professionals in chemistry and materials science. This reaction, played out on a massive scale in blast furnaces, symbolizes a blend of fundamental chemical principles and practical engineering solutions to meet global demand for iron and steel, while continuously adapting to environmental and economic challenges. 
The comprehensive grasp of stoichiometry, chemical properties, and industrial implications forms the backbone of advancements in metallurgical processes.\n\nThese intertwined aspects of chemistry and industry underscore the importance of mastering foundational concepts and applying them in real-world scenarios, advancing both technological and sustainable goals in iron production."}]} 57%|█████▋ | 12581/22095 [21:40:27<13:01:47, 4.93s/it] {'loss': 0.3333, 'grad_norm': 0.5750559154445537, 'learning_rate': 4.124301652084049e-06, 'epoch': 0.57} 57%|█████▋ | 12581/22095 [21:40:27<13:01:47, 4.93s/it] 57%|█████▋ | 12582/22095 [21:40:31<12:09:49, 4.60s/it] {'loss': 0.2996, 'grad_norm': 0.6627604804141155, 'learning_rate': 4.123580068798171e-06, 'epoch': 0.57} 57%|█████▋ | 12582/22095 [21:40:31<12:09:49, 4.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41590 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56191 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45980 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12583/22095 [21:40:35<11:23:02, 4.31s/it] {'loss': 0.3424, 'grad_norm': 0.7667761592962654, 'learning_rate': 4.122858504343868e-06, 'epoch': 0.57} 57%|█████▋ | 12583/22095 [21:40:35<11:23:02, 4.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52086 > 40960). 
Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304635 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1cizkhMMPMeJjy1XbXXcwxVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is the all word in the image?'}, {'from': 'gpt', 'value': 'All words in the image:\n®\nMIYN\n米雅诗\n1瓶\n液体”黄金“\n万用橄榄油\n补水\n保湿\n护肤\n护法\n卸妆\n护唇\n®\nMIYN\n米维诗\nDIY\n壬辰纹\n橄榄油\n买1\n送1'}]}
 57%|█████▋ | 12584/22095 [21:40:38<10:48:49, 4.09s/it] {'loss': 0.3573, 'grad_norm': 0.7305511014596607, 'learning_rate': 4.1221369587366395e-06, 'epoch': 0.57}
 57%|█████▋ | 12584/22095 [21:40:38<10:48:49, 4.09s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (48744 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58196 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46049 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44561 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51761 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55011 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120956 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12585/22095 [21:40:42<10:20:29, 3.91s/it] {'loss': 0.3094, 'grad_norm': 0.6430158353794494, 'learning_rate': 4.121415431991991e-06, 'epoch': 0.57} 57%|█████▋ | 12585/22095 [21:40:42<10:20:29, 3.91s/it] 57%|█████▋ | 12586/22095 [21:40:46<10:19:10, 3.91s/it] {'loss': 0.3755, 'grad_norm': 0.7005562455378558, 'learning_rate': 4.12069392412543e-06, 'epoch': 0.57} 57%|█████▋ | 12586/22095 [21:40:46<10:19:10, 3.91s/it] 57%|█████▋ | 12587/22095 [21:40:49<9:56:16, 3.76s/it] {'loss': 0.3079, 'grad_norm': 0.5827775448153438, 'learning_rate': 4.119972435152453e-06, 'epoch': 0.57} 57%|█████▋ | 12587/22095 [21:40:49<9:56:16, 3.76s/it] 57%|█████▋ | 12588/22095 [21:40:52<9:16:12, 3.51s/it] {'loss': 0.367, 'grad_norm': 0.7272577668064656, 'learning_rate': 4.119250965088566e-06, 'epoch': 0.57} 57%|█████▋ | 12588/22095 [21:40:52<9:16:12, 3.51s/it] 57%|█████▋ | 12589/22095 [21:40:56<9:17:31, 3.52s/it] {'loss': 0.32, 'grad_norm': 0.6866287625304002, 'learning_rate': 4.118529513949272e-06, 'epoch': 0.57} 57%|█████▋ | 12589/22095 [21:40:56<9:17:31, 3.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
57%|█████▋ | 12590/22095 [21:40:59<9:20:04, 3.54s/it] {'loss': 0.3211, 'grad_norm': 0.6199082633100993, 'learning_rate': 4.11780808175007e-06, 'epoch': 0.57} 57%|█████▋ | 12590/22095 [21:40:59<9:20:04, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12591/22095 [21:41:08<13:15:28, 5.02s/it] {'loss': 0.4791, 'grad_norm': 0.3132223313660452, 'learning_rate': 4.1170866685064625e-06, 'epoch': 0.57} 57%|█████▋ | 12591/22095 [21:41:08<13:15:28, 5.02s/it] 57%|█████▋ | 12592/22095 [21:41:11<12:06:01, 4.58s/it] {'loss': 0.3395, 'grad_norm': 0.6191028735269731, 'learning_rate': 4.116365274233952e-06, 'epoch': 0.57} 57%|█████▋ | 12592/22095 [21:41:11<12:06:01, 4.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12593/22095 [21:41:18<13:38:31, 5.17s/it] {'loss': 0.4736, 'grad_norm': 0.3177343220045825, 'learning_rate': 4.115643898948039e-06, 'epoch': 0.57} 57%|█████▋ | 12593/22095 [21:41:18<13:38:31, 5.17s/it] 57%|█████▋ | 12594/22095 [21:41:21<11:59:27, 4.54s/it] {'loss': 0.3509, 'grad_norm': 0.662381067071977, 'learning_rate': 4.114922542664221e-06, 'epoch': 0.57} 57%|█████▋ | 12594/22095 [21:41:21<11:59:27, 4.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12595/22095 [21:41:24<10:44:01, 4.07s/it] {'loss': 0.3642, 'grad_norm': 0.6767235362430225, 'learning_rate': 4.114201205397998e-06, 'epoch': 0.57} 57%|█████▋ | 12595/22095 [21:41:24<10:44:01, 4.07s/it] 57%|█████▋ | 12596/22095 [21:41:28<10:32:20, 3.99s/it] {'loss': 0.3442, 'grad_norm': 0.6522887209274827, 'learning_rate': 4.113479887164873e-06, 'epoch': 0.57} 57%|█████▋ | 12596/22095 [21:41:28<10:32:20, 3.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12597/22095 [21:41:32<10:30:13, 3.98s/it] {'loss': 0.3241, 'grad_norm': 0.6629478414681544, 'learning_rate': 
4.112758587980342e-06, 'epoch': 0.57}
 57%|█████▋ | 12597/22095 [21:41:32<10:30:13, 3.98s/it]
 57%|█████▋ | 12598/22095 [21:41:35<10:23:59, 3.94s/it] {'loss': 0.3091, 'grad_norm': 0.5835930500962448, 'learning_rate': 4.112037307859903e-06, 'epoch': 0.57}
 57%|█████▋ | 12598/22095 [21:41:35<10:23:59, 3.94s/it]
 57%|█████▋ | 12599/22095 [21:41:39<9:56:30, 3.77s/it] {'loss': 0.3226, 'grad_norm': 0.6259458536357286, 'learning_rate': 4.111316046819057e-06, 'epoch': 0.57}
 57%|█████▋ | 12599/22095 [21:41:39<9:56:30, 3.77s/it]
 57%|█████▋ | 12600/22095 [21:41:42<9:34:09, 3.63s/it] {'loss': 0.2821, 'grad_norm': 0.6550185899599036, 'learning_rate': 4.110594804873297e-06, 'epoch': 0.57}
 57%|█████▋ | 12600/22095 [21:41:42<9:34:09, 3.63s/it]
 57%|█████▋ | 12601/22095 [21:41:45<9:09:41, 3.47s/it] {'loss': 0.3065, 'grad_norm': 0.6371420170637637, 'learning_rate': 4.1098735820381244e-06, 'epoch': 0.57}
 57%|█████▋ | 12601/22095 [21:41:45<9:09:41, 3.47s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8305071 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1fvbXgwMPMeJjy1XdXXasrXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n识别文字,但不要输出坐标框。'}, {'from': 'gpt', 'value': '图中所有文字:\n常熟货架\n4皇冠6年老店\n产地快速发货\n省市包邮\n19\n品质\nBFC\n保证\n承重靠谱\nFRONT\nFRONT\n配套全套配件三角片扳手'}]}
 57%|█████▋ | 12602/22095 [21:41:49<9:20:13, 3.54s/it] {'loss': 0.3473, 'grad_norm': 0.7330960754239171, 'learning_rate': 4.109152378329036e-06, 'epoch': 0.57}
 57%|█████▋ | 12602/22095 [21:41:49<9:20:13, 3.54s/it]
 57%|█████▋ | 12603/22095 [21:41:52<9:19:37, 3.54s/it] {'loss': 0.3019, 'grad_norm': 0.662731391985183, 'learning_rate': 4.108431193761525e-06, 'epoch': 0.57}
 57%|█████▋ | 12603/22095 [21:41:52<9:19:37, 3.54s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8381086 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47875, 'image': 'vrdu_table_final_2/astro-ph.CO/55d93d28-0210-49a6-a405-9caeafa16117.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 57%|█████▋ | 12604/22095 [21:41:55<9:00:06, 3.41s/it] {'loss': 0.3158, 'grad_norm': 0.5955175335506525, 'learning_rate': 4.107710028351089e-06, 'epoch': 0.57} 57%|█████▋ | 12604/22095 [21:41:56<9:00:06, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70146 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65439 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45234 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12605/22095 [21:41:59<8:59:13, 3.41s/it] {'loss': 0.3779, 'grad_norm': 0.5912044747365863, 'learning_rate': 4.106988882113228e-06, 'epoch': 0.57} 57%|█████▋ | 12605/22095 [21:41:59<8:59:13, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90156 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51778 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101340 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12606/22095 [21:42:03<9:40:33, 3.67s/it] {'loss': 0.3616, 'grad_norm': 0.7742260903363052, 'learning_rate': 4.106267755063429e-06, 'epoch': 0.57} 57%|█████▋ | 12606/22095 [21:42:03<9:40:33, 3.67s/it] 57%|█████▋ | 12607/22095 [21:42:06<8:57:48, 3.40s/it] {'loss': 0.3191, 'grad_norm': 0.6187984756118377, 'learning_rate': 4.105546647217192e-06, 'epoch': 0.57} 57%|█████▋ | 12607/22095 [21:42:06<8:57:48, 3.40s/it] 57%|█████▋ | 12608/22095 [21:42:09<8:33:47, 3.25s/it] {'loss': 0.2965, 'grad_norm': 0.6393811844148454, 'learning_rate': 4.104825558590011e-06, 'epoch': 0.57} 57%|█████▋ | 12608/22095 [21:42:09<8:33:47, 3.25s/it] 57%|█████▋ | 12609/22095 [21:42:12<8:24:56, 3.19s/it] {'loss': 0.3399, 'grad_norm': 0.7597578452839173, 'learning_rate': 4.104104489197381e-06, 'epoch': 0.57} 57%|█████▋ | 12609/22095 [21:42:12<8:24:56, 3.19s/it] 57%|█████▋ | 12610/22095 [21:42:15<8:17:10, 3.15s/it] {'loss': 0.2856, 'grad_norm': 0.5876493739667094, 'learning_rate': 4.1033834390547905e-06, 'epoch': 0.57} 57%|█████▋ | 12610/22095 [21:42:15<8:17:10, 3.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 57%|█████▋ | 12611/22095 [21:42:19<8:39:59, 3.29s/it] {'loss': 0.3119, 'grad_norm': 0.6053375291576475, 'learning_rate': 4.102662408177738e-06, 'epoch': 0.57} 57%|█████▋ | 12611/22095 [21:42:19<8:39:59, 3.29s/it] 57%|█████▋ | 12612/22095 [21:42:22<8:28:57, 3.22s/it] {'loss': 0.3265, 'grad_norm': 0.6556433596751344, 'learning_rate': 4.1019413965817154e-06, 'epoch': 0.57} 57%|█████▋ | 12612/22095 [21:42:22<8:28:57, 3.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 57%|█████▋ | 12613/22095 [21:42:31<13:39:49, 5.19s/it] {'loss': 0.4635, 'grad_norm': 0.4186174147926591, 'learning_rate': 4.101220404282213e-06, 'epoch': 0.57} 57%|█████▋ | 12613/22095 [21:42:31<13:39:49, 5.19s/it] 57%|█████▋ | 12614/22095 
[21:42:37<14:12:51, 5.40s/it] {'loss': 0.4867, 'grad_norm': 0.3760300246390405, 'learning_rate': 4.100499431294722e-06, 'epoch': 0.57}
 57%|█████▋ | 12614/22095 [21:42:37<14:12:51, 5.40s/it]
Invalidate trace cache @ step 2: expected module 364, but got module 1
 57%|█████▋ | 12615/22095 [21:42:42<13:24:23, 5.09s/it] {'loss': 0.3504, 'grad_norm': 0.6868071595945288, 'learning_rate': 4.099778477634739e-06, 'epoch': 0.57}
 57%|█████▋ | 12615/22095 [21:42:42<13:24:23, 5.09s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 57%|█████▋ | 12616/22095 [21:42:46<12:28:35, 4.74s/it] {'loss': 0.2872, 'grad_norm': 0.6522276286304554, 'learning_rate': 4.099057543317749e-06, 'epoch': 0.57}
 57%|█████▋ | 12616/22095 [21:42:46<12:28:35, 4.74s/it]
 57%|█████▋ | 12617/22095 [21:42:49<11:35:24, 4.40s/it] {'loss': 0.3311, 'grad_norm': 0.611826672633969, 'learning_rate': 4.098336628359247e-06, 'epoch': 0.57}
 57%|█████▋ | 12617/22095 [21:42:49<11:35:24, 4.40s/it]
 57%|█████▋ | 12618/22095 [21:42:53<11:14:58, 4.27s/it] {'loss': 0.3015, 'grad_norm': 0.5757073653381102, 'learning_rate': 4.097615732774722e-06, 'epoch': 0.57}
 57%|█████▋ | 12618/22095 [21:42:53<11:14:58, 4.27s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965711 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16546, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
 57%|█████▋ | 12619/22095 [21:42:56<10:30:21, 3.99s/it] {'loss': 0.3205, 'grad_norm': 0.6543449155667355, 'learning_rate': 4.096894856579662e-06, 'epoch': 0.57}
 57%|█████▋ | 12619/22095 [21:42:57<10:30:21, 3.99s/it]
 57%|█████▋ | 12620/22095 [21:42:59<9:34:55, 3.64s/it] {'loss': 0.3432, 'grad_norm': 0.6499685402810728, 'learning_rate': 4.096173999789558e-06, 'epoch': 0.57}
 57%|█████▋ | 12620/22095 [21:42:59<9:34:55, 3.64s/it]
 57%|█████▋ | 12621/22095 [21:43:02<9:04:32, 3.45s/it] {'loss': 0.286, 'grad_norm': 0.6168195605009977, 'learning_rate': 4.095453162419898e-06, 'epoch': 0.57}
 57%|█████▋ | 12621/22095 [21:43:02<9:04:32, 3.45s/it]
 57%|█████▋ | 12622/22095 [21:43:05<8:42:51, 3.31s/it] {'loss': 0.2918, 'grad_norm': 0.6798988113354032, 'learning_rate': 4.094732344486174e-06, 'epoch': 0.57}
 57%|█████▋ | 12622/22095 [21:43:05<8:42:51, 3.31s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (71312 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43194 > 40960).
Running this sequence through the model will result in indexing errors
 57%|█████▋ | 12623/22095 [21:43:08<8:25:31, 3.20s/it] {'loss': 0.3076, 'grad_norm': 0.6054918063775061, 'learning_rate': 4.0940115460038695e-06, 'epoch': 0.57}
 57%|█████▋ | 12623/22095 [21:43:08<8:25:31, 3.20s/it]
 57%|█████▋ | 12624/22095 [21:43:12<8:40:59, 3.30s/it] {'loss': 0.3144, 'grad_norm': 0.6838615401522805, 'learning_rate': 4.093290766988474e-06, 'epoch': 0.57}
 57%|█████▋ | 12624/22095 [21:43:12<8:40:59, 3.30s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 57%|█████▋ | 12625/22095 [21:43:15<8:45:17, 3.33s/it] {'loss': 0.3133, 'grad_norm': 0.6405000541011384, 'learning_rate': 4.092570007455477e-06, 'epoch': 0.57}
 57%|█████▋ | 12625/22095 [21:43:15<8:45:17, 3.33s/it]
 57%|█████▋ | 12626/22095 [21:43:18<8:43:38, 3.32s/it] {'loss': 0.3394, 'grad_norm': 0.6336666469050273, 'learning_rate': 4.0918492674203634e-06, 'epoch': 0.57}
 57%|█████▋ | 12626/22095 [21:43:18<8:43:38, 3.32s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 57%|█████▋ | 12627/22095 [21:43:21<8:27:03, 3.21s/it] {'loss': 0.2751, 'grad_norm': 0.6549913219931577, 'learning_rate': 4.091128546898619e-06, 'epoch': 0.57}
 57%|█████▋ | 12627/22095 [21:43:21<8:27:03, 3.21s/it]
 57%|█████▋ | 12628/22095 [21:43:25<8:27:57, 3.22s/it] {'loss': 0.3164, 'grad_norm': 0.6524068949012286, 'learning_rate': 4.090407845905732e-06, 'epoch': 0.57}
 57%|█████▋ | 12628/22095 [21:43:25<8:27:57, 3.22s/it]
 57%|█████▋ | 12629/22095 [21:43:28<8:45:23, 3.33s/it] {'loss': 0.3177, 'grad_norm': 0.6281430874277494, 'learning_rate': 4.089687164457184e-06, 'epoch': 0.57}
 57%|█████▋ | 12629/22095 [21:43:28<8:45:23, 3.33s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [495, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8443747 in VC:s3://internvl-moe-sft-data/. Exception: Image size [495, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55989, 'image': 'vrdu_texteq/astro-ph.CO/f365b6c3-02e5-44eb-a74a-f5c4109992be.png', 'image_wh': [[495, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $v_k$ is the mode function defined as'}]}
 57%|█████▋ | 12630/22095 [21:43:32<9:04:21, 3.45s/it] {'loss': 0.2746, 'grad_norm': 0.6065271029200794, 'learning_rate': 4.088966502568465e-06, 'epoch': 0.57}
 57%|█████▋ | 12630/22095 [21:43:32<9:04:21, 3.45s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
 57%|█████▋ | 12631/22095 [21:43:41<13:15:28, 5.04s/it] {'loss': 0.4791, 'grad_norm': 0.5427963923567827, 'learning_rate': 4.0882458602550586e-06, 'epoch': 0.57}
 57%|█████▋ | 12631/22095 [21:43:41<13:15:28, 5.04s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (74296 > 40960).
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12632/22095 [21:43:44<12:06:55, 4.61s/it] {'loss': 0.3448, 'grad_norm': 0.6819911887506003, 'learning_rate': 4.087525237532447e-06, 'epoch': 0.57} 57%|█████▋ | 12632/22095 [21:43:44<12:06:55, 4.61s/it] 57%|█████▋ | 12633/22095 [21:43:47<10:56:43, 4.16s/it] {'loss': 0.3352, 'grad_norm': 0.655494482755223, 'learning_rate': 4.086804634416115e-06, 'epoch': 0.57} 57%|█████▋ | 12633/22095 [21:43:47<10:56:43, 4.16s/it] 57%|█████▋ | 12634/22095 [21:43:50<9:58:37, 3.80s/it] {'loss': 0.3025, 'grad_norm': 0.6626172355189169, 'learning_rate': 4.08608405092155e-06, 'epoch': 0.57} 57%|█████▋ | 12634/22095 [21:43:50<9:58:37, 3.80s/it] 57%|█████▋ | 12635/22095 [21:43:54<9:26:01, 3.59s/it] {'loss': 0.3052, 'grad_norm': 0.6266465106621113, 'learning_rate': 4.085363487064228e-06, 'epoch': 0.57} 57%|█████▋ | 12635/22095 [21:43:54<9:26:01, 3.59s/it] 57%|█████▋ | 12636/22095 [21:43:57<9:10:31, 3.49s/it] {'loss': 0.287, 'grad_norm': 0.6048553709958095, 'learning_rate': 4.084642942859638e-06, 'epoch': 0.57} 57%|█████▋ | 12636/22095 [21:43:57<9:10:31, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71869 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12637/22095 [21:44:00<9:11:07, 3.50s/it] {'loss': 0.3331, 'grad_norm': 0.6391854257326505, 'learning_rate': 4.083922418323257e-06, 'epoch': 0.57} 57%|█████▋ | 12637/22095 [21:44:00<9:11:07, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85065 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46193 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106964 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12638/22095 [21:44:04<9:26:57, 3.60s/it] {'loss': 0.2752, 'grad_norm': 0.6385617384850278, 'learning_rate': 4.083201913470574e-06, 'epoch': 0.57} 57%|█████▋ | 12638/22095 [21:44:04<9:26:57, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (143959 > 40960). Running this sequence through the model will result in indexing errors 57%|█████▋ | 12639/22095 [21:44:07<8:58:20, 3.42s/it] {'loss': 0.2886, 'grad_norm': 0.6224515114391435, 'learning_rate': 4.082481428317063e-06, 'epoch': 0.57} 57%|█████▋ | 12639/22095 [21:44:07<8:58:20, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (78776 > 40960). 
Running this sequence through the model will result in indexing errors 57%|█████▋ | 12640/22095 [21:44:16<13:04:00, 4.98s/it] {'loss': 0.4714, 'grad_norm': 0.31548837794382356, 'learning_rate': 4.081760962878209e-06, 'epoch': 0.57} 57%|█████▋ | 12640/22095 [21:44:16<13:04:00, 4.98s/it] 57%|█████▋ | 12641/22095 [21:44:19<12:06:16, 4.61s/it] {'loss': 0.3219, 'grad_norm': 0.6290456500635456, 'learning_rate': 4.081040517169493e-06, 'epoch': 0.57} 57%|█████▋ | 12641/22095 [21:44:20<12:06:16, 4.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [437, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8429469 in VC:s3://internvl-moe-sft-data/. Exception: Image size [437, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 55403, 'image': 'vrdu_texteq/astro-ph.CO/cb060002-0a68-4736-a930-f363a8a78479.png', 'image_wh': [[437, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image'}, {'from': 'gpt', 'value': "where $F_{\\it ll'}$ is the $C_{\\it l}$'s Fisher matrix."}]}
57%|█████▋ | 12642/22095 [21:44:24<11:43:03, 4.46s/it] {'loss': 0.3539, 'grad_norm': 0.6123838167027538, 'learning_rate': 4.080320091206392e-06, 'epoch': 0.57}
57%|█████▋ | 12643/22095 [21:44:28<11:20:46, 4.32s/it] {'loss': 0.2924, 'grad_norm': 0.7026155094801178, 'learning_rate': 4.079599685004388e-06, 'epoch': 0.57}
57%|█████▋ | 12644/22095 [21:44:31<10:41:06, 4.07s/it] {'loss': 0.311, 'grad_norm': 0.6244310496732987, 'learning_rate': 4.078879298578961e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42687 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110315 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66920 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109506 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12645/22095 [21:44:40<14:20:08, 5.46s/it] {'loss': 0.4704, 'grad_norm': 0.3315480626927533, 'learning_rate': 4.078158931945588e-06, 'epoch': 0.57}
57%|█████▋ | 12646/22095 [21:44:49<17:39:28, 6.73s/it] {'loss': 0.485, 'grad_norm': 0.28464159746166984, 'learning_rate': 4.077438585119748e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 364, but got module 1
57%|█████▋ | 12647/22095 [21:44:53<14:52:28, 5.67s/it] {'loss': 0.3401, 'grad_norm': 0.6443006669304278, 'learning_rate': 4.076718258116922e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (54830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64751 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12648/22095 [21:44:59<15:45:45, 6.01s/it] {'loss': 0.4848, 'grad_norm': 0.3044131293795776, 'learning_rate': 4.0759979509525826e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12649/22095 [21:45:03<13:52:36, 5.29s/it] {'loss': 0.3524, 'grad_norm': 0.630328672719413, 'learning_rate': 4.075277663642208e-06, 'epoch': 0.57}
57%|█████▋ | 12650/22095 [21:45:06<12:12:01, 4.65s/it] {'loss': 0.317, 'grad_norm': 0.6414834546317076, 'learning_rate': 4.074557396201279e-06, 'epoch': 0.57}
57%|█████▋ | 12651/22095 [21:45:10<11:37:01, 4.43s/it] {'loss': 0.2713, 'grad_norm': 0.5825287557417055, 'learning_rate': 4.073837148645269e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8932780 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55933, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, point C lies on segment AB, P is the midpoint of AC, and Q is the midpoint of BC. Given AC = 8cm and BC = 4cm, the length of PQ is ()\nA. 4cm\nB. 6cm\nC. 12cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
57%|█████▋ | 12652/22095 [21:45:13<10:28:40, 3.99s/it] {'loss': 0.2846, 'grad_norm': 0.5882568662766171, 'learning_rate': 4.073116920989653e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12653/22095 [21:45:23<14:46:06, 5.63s/it] {'loss': 0.4546, 'grad_norm': 0.3948042084638724, 'learning_rate': 4.072396713249907e-06, 'epoch': 0.57}
57%|█████▋ | 12654/22095 [21:45:26<13:13:56, 5.05s/it] {'loss': 0.3553, 'grad_norm': 0.6078405818525381, 'learning_rate': 4.071676525441509e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881052 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4205, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, given segment AB = 6cm, there is a point C on the extension of AB with BC = 4cm. If M and N are the midpoints of AB and BC respectively, the distance between M and N is ()\nA. 1cm\nB. 4cm\nC. 5cm\nD. Cannot be determined'}, {'from': 'gpt', 'value': '[Solution] Since M and N are the midpoints of segments AB and BC respectively, MB = 0.5AB = 3cm and NB = 0.5BC = 2cm, so MN = MB + NB = 3 + 2 = 5 (cm).'}]}
57%|█████▋ | 12655/22095 [21:45:30<11:54:12, 4.54s/it] {'loss': 0.308, 'grad_norm': 0.7194445472363838, 'learning_rate': 4.07095635757993e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12656/22095 [21:45:40<16:21:00, 6.24s/it] {'loss': 0.4775, 'grad_norm': 0.3168071964095616, 'learning_rate': 4.070236209680646e-06, 'epoch': 0.57}
57%|█████▋ | 12657/22095 [21:45:43<14:14:06, 5.43s/it] {'loss': 0.2968, 'grad_norm': 0.6062929801973241, 'learning_rate': 4.069516081759131e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12658/22095 [21:45:47<12:30:21, 4.77s/it] {'loss': 0.3192, 'grad_norm': 0.6174877429131435, 'learning_rate': 4.068795973830856e-06, 'epoch': 0.57}
57%|█████▋ | 12659/22095 [21:45:50<11:13:23, 4.28s/it] {'loss': 0.3188, 'grad_norm': 0.614924387600183, 'learning_rate': 4.068075885911295e-06, 'epoch': 0.57}
57%|█████▋ | 12660/22095 [21:45:53<10:11:20, 3.89s/it] {'loss': 0.3344, 'grad_norm': 0.7104822249231385, 'learning_rate': 4.067355818015925e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (46076 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12661/22095 [21:45:56<9:32:39, 3.64s/it] {'loss': 0.2727, 'grad_norm': 0.7277396852893528, 'learning_rate': 4.0666357701602105e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [570, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8436301 in VC:s3://internvl-moe-sft-data/. Exception: Image size [570, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55725, 'image': 'vrdu_texteq/astro-ph.CO/1a335535-a75d-4544-887f-66ab44b4386d.png', 'image_wh': [[570, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $J$ is the Bessel function of the first kind.'}]}
57%|█████▋ | 12662/22095 [21:46:00<9:59:01, 3.81s/it] {'loss': 0.3058, 'grad_norm': 0.5984528852078446, 'learning_rate': 4.0659157423596265e-06, 'epoch': 0.57}
57%|█████▋ | 12663/22095 [21:46:04<9:57:58, 3.80s/it] {'loss': 0.3135, 'grad_norm': 0.5822935757880945, 'learning_rate': 4.065195734629646e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (54858 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51294 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (141463 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12664/22095 [21:46:07<9:18:36, 3.55s/it] {'loss': 0.3307, 'grad_norm': 0.7260328271628357, 'learning_rate': 4.064475746985738e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (97707 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12665/22095 [21:46:10<8:46:18, 3.35s/it] {'loss': 0.3185, 'grad_norm': 0.6039575429945476, 'learning_rate': 4.063755779443372e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (50135 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64687 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120466 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12666/22095 [21:46:13<8:33:16, 3.27s/it] {'loss': 0.3056, 'grad_norm': 0.6447887050870809, 'learning_rate': 4.063035832018018e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8379309 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46094, 'image': 'vrdu_table_final_2/astro-ph.CO/ecaf3901-e3a8-4f23-8a8f-0c574e0a44bd.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (138937 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12667/22095 [21:46:16<8:41:21, 3.32s/it] {'loss': 0.3142, 'grad_norm': 0.636991093146034, 'learning_rate': 4.06231590472515e-06, 'epoch': 0.57}
57%|█████▋ | 12668/22095 [21:46:20<8:57:06, 3.42s/it] {'loss': 0.3491, 'grad_norm': 0.6648637483903794, 'learning_rate': 4.06159599758023e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12669/22095 [21:46:29<13:18:52, 5.09s/it] {'loss': 0.4871, 'grad_norm': 0.3602955931210812, 'learning_rate': 4.060876110598731e-06, 'epoch': 0.57}
57%|█████▋ | 12670/22095 [21:46:32<11:57:49, 4.57s/it] {'loss': 0.3092, 'grad_norm': 0.5996667931514984, 'learning_rate': 4.0601562437961215e-06, 'epoch': 0.57}
57%|█████▋ | 12671/22095 [21:46:36<11:15:16, 4.30s/it] {'loss': 0.3432, 'grad_norm': 0.591590722739638, 'learning_rate': 4.059436397187866e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12672/22095 [21:46:46<15:30:52, 5.93s/it] {'loss': 0.4563, 'grad_norm': 0.42572110447408806, 'learning_rate': 4.0587165707894326e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (58369 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63790 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48663 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12673/22095 [21:46:49<13:25:08, 5.13s/it] {'loss': 0.3394, 'grad_norm': 0.6530519024731599, 'learning_rate': 4.0579967646162915e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12674/22095 [21:46:55<14:38:31, 5.60s/it] {'loss': 0.4838, 'grad_norm': 0.29068368801514916, 'learning_rate': 4.057276978683906e-06, 'epoch': 0.57}
57%|█████▋ | 12675/22095 [21:46:59<12:58:57, 4.96s/it] {'loss': 0.2792, 'grad_norm': 0.6793392061769342, 'learning_rate': 4.056557213007743e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [45, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358146 in VC:s3://internvl-moe-sft-data/. Exception: Image size [45, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24857, 'image': 'vrdu_table_final_2/astro-ph.CO/6d4b3a73-6874-427c-9c9e-a29fd85d33f7.png', 'image_wh': [[45, 25]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$Y_{tot}$\\end{tabular}\n```"}]}
57%|█████▋ | 12676/22095 [21:47:03<12:07:04, 4.63s/it] {'loss': 0.3201, 'grad_norm': 0.6032629284496717, 'learning_rate': 4.055837467603268e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (60719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74200 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12677/22095 [21:47:07<11:26:00, 4.37s/it] {'loss': 0.3243, 'grad_norm': 0.6227563593182059, 'learning_rate': 4.055117742485944e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12678/22095 [21:47:10<10:45:15, 4.11s/it] {'loss': 0.3283, 'grad_norm': 0.6140441774350899, 'learning_rate': 4.05439803767124e-06, 'epoch': 0.57}
57%|█████▋ | 12679/22095 [21:47:13<9:55:31, 3.79s/it] {'loss': 0.3038, 'grad_norm': 0.6493695835053662, 'learning_rate': 4.053678353174616e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12680/22095 [21:47:17<9:44:06, 3.72s/it] {'loss': 0.2927, 'grad_norm': 0.674659064650729, 'learning_rate': 4.05295868901154e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12681/22095 [21:47:24<12:25:27, 4.75s/it] {'loss': 0.4449, 'grad_norm': 0.3355983023504748, 'learning_rate': 4.052239045197472e-06, 'epoch': 0.57}
57%|█████▋ | 12682/22095 [21:47:27<11:16:13, 4.31s/it] {'loss': 0.2783, 'grad_norm': 0.6013667689517775, 'learning_rate': 4.051519421747876e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (48417 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60932 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64350 > 40960). Running this sequence through the model will result in indexing errors
57%|█████▋ | 12683/22095 [21:47:31<10:38:08, 4.07s/it] {'loss': 0.3296, 'grad_norm': 0.6710995569903319, 'learning_rate': 4.050799818678216e-06, 'epoch': 0.57}
57%|█████▋ | 12684/22095 [21:47:34<10:11:05, 3.90s/it] {'loss': 0.3149, 'grad_norm': 0.6503624710261977, 'learning_rate': 4.050080236003952e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12685/22095 [21:47:41<12:35:24, 4.82s/it] {'loss': 0.4615, 'grad_norm': 0.29002163267180253, 'learning_rate': 4.049360673740545e-06, 'epoch': 0.57}
57%|█████▋ | 12686/22095 [21:47:44<11:23:04, 4.36s/it] {'loss': 0.357, 'grad_norm': 0.6306557530330095, 'learning_rate': 4.04864113190346e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12687/22095 [21:47:51<13:28:05, 5.15s/it] {'loss': 0.4879, 'grad_norm': 0.2837338449412986, 'learning_rate': 4.047921610508152e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12688/22095 [21:47:55<12:24:06, 4.75s/it] {'loss': 0.2996, 'grad_norm': 0.6031911633754032, 'learning_rate': 4.047202109570086e-06, 'epoch': 0.57}
57%|█████▋ | 12689/22095 [21:47:58<11:10:25, 4.28s/it] {'loss': 0.3194, 'grad_norm': 0.7921926294430719, 'learning_rate': 4.046482629104722e-06, 'epoch': 0.57}
57%|█████▋ | 12690/22095 [21:48:02<10:25:06, 3.99s/it] {'loss': 0.3046, 'grad_norm': 0.6333884473496525, 'learning_rate': 4.045763169127516e-06, 'epoch': 0.57}
Invalidate trace cache @ step 2: expected module 1, but got module 364
57%|█████▋ | 12691/22095 [21:48:11<14:46:14, 5.65s/it] {'loss': 0.4862, 'grad_norm': 0.2641626700740687, 'learning_rate': 4.045043729653927e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12692/22095 [21:48:14<12:50:50, 4.92s/it] {'loss': 0.3217, 'grad_norm': 0.6023775403806684, 'learning_rate': 4.044324310699418e-06, 'epoch': 0.57}
57%|█████▋ | 12693/22095 [21:48:18<11:53:36, 4.55s/it] {'loss': 0.3261, 'grad_norm': 0.6682577862608416, 'learning_rate': 4.043604912279444e-06, 'epoch': 0.57}
57%|█████▋ | 12694/22095 [21:48:21<10:45:53, 4.12s/it] {'loss': 0.3143, 'grad_norm': 0.6677978891771651, 'learning_rate': 4.0428855344094635e-06, 'epoch': 0.57}
57%|█████▋ | 12695/22095 [21:48:25<10:17:11, 3.94s/it] {'loss': 0.3123, 'grad_norm': 0.5927057380531942, 'learning_rate': 4.042166177104932e-06, 'epoch': 0.57}
57%|█████▋ | 12696/22095 [21:48:28<9:24:15, 3.60s/it] {'loss': 0.3114, 'grad_norm': 0.6226581838813324, 'learning_rate': 4.041446840381309e-06, 'epoch': 0.57}
57%|█████▋ | 12697/22095 [21:48:30<8:50:24, 3.39s/it] {'loss': 0.3143, 'grad_norm': 0.6481231373661983, 'learning_rate': 4.040727524254048e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8353851 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20536, 'image': 'vrdu_table_final_2/astro-ph.CO/3dd5ca6d-0bf9-4170-825e-44fccaef9dca.png', 'image_wh': [[1975, 6]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{p{\\textwidth}}\\hline\\ \\end{tabular}\n```"}]}
57%|█████▋ | 12698/22095 [21:48:33<8:28:33, 3.25s/it] {'loss': 0.3287, 'grad_norm': 0.6117393568444289, 'learning_rate': 4.040008228738607e-06, 'epoch': 0.57}
57%|█████▋ | 12699/22095 [21:48:37<8:37:37, 3.31s/it] {'loss': 0.3346, 'grad_norm': 0.6422235183124816, 'learning_rate': 4.039288953850442e-06, 'epoch': 0.57}
57%|█████▋ | 12700/22095 [21:48:40<8:31:26, 3.27s/it] {'loss': 0.2951, 'grad_norm': 0.6381619210328913, 'learning_rate': 4.038569699605005e-06, 'epoch': 0.57}
Token indices sequence length is longer than the specified maximum sequence length for this model (43913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48231 > 40960).
Running this sequence through the model will result in indexing errors
57%|█████▋ | 12701/22095 [21:48:44<8:52:02, 3.40s/it] {'loss': 0.3187, 'grad_norm': 0.6660502247869946, 'learning_rate': 4.037850466017752e-06, 'epoch': 0.57}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396941 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63794, 'image': 'vrdu_table_final_2/astro-ph.EP/15eb11d3-7444-4257-9dc5-c49d9221750f.png', 'image_wh': [[14, 20]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}y\\end{tabular}\n```"}]}
57%|█████▋ | 12702/22095 [21:48:47<8:25:52, 3.23s/it] {'loss': 0.3356, 'grad_norm': 0.6603447028048267, 'learning_rate': 4.03713125310414e-06, 'epoch': 0.57}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
57%|█████▋ | 12703/22095 [21:48:49<8:06:57, 3.11s/it] {'loss': 0.3152, 'grad_norm': 0.5892533315864354, 'learning_rate': 4.036412060879618e-06, 'epoch': 0.57}
57%|█████▋ | 12704/22095 [21:48:52<8:06:52, 3.11s/it] {'loss': 0.346, 'grad_norm': 0.6621800751010352, 'learning_rate': 4.035692889359642e-06, 'epoch': 0.57}
58%|█████▊ | 12705/22095 [21:48:56<8:16:41, 3.17s/it] {'loss': 0.3314, 'grad_norm': 0.6373051961133144, 'learning_rate': 4.034973738559664e-06, 'epoch': 0.58}
58%|█████▊ | 12706/22095 [21:48:59<8:06:14, 3.11s/it] {'loss': 0.325, 'grad_norm': 0.6217914074195996, 'learning_rate': 4.034254608495136e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (42481 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75621 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44446 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
58%|█████▊ | 12707/22095 [21:49:02<8:08:52, 3.12s/it] {'loss': 0.3181, 'grad_norm': 0.6660384422033511, 'learning_rate': 4.03353549918151e-06, 'epoch': 0.58}
58%|█████▊ | 12708/22095 [21:49:05<8:03:50, 3.09s/it] {'loss': 0.3084, 'grad_norm': 0.6239159549463238, 'learning_rate': 4.032816410634239e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
58%|█████▊ | 12709/22095 [21:49:15<13:12:13, 5.06s/it] {'loss': 0.4744, 'grad_norm': 0.37417081939858304, 'learning_rate': 4.032097342868774e-06, 'epoch': 0.58}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965708 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16543, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, if segment AB = 10cm, M is the midpoint of AB, N is a point on AB with NB = 2cm, then the length of MN is ()\nA. 4cm\nB. 3cm\nC. 2cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
58%|█████▊ | 12710/22095 [21:49:18<11:45:15, 4.51s/it] {'loss': 0.2774, 'grad_norm': 0.6254837344720027, 'learning_rate': 4.031378295900562e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (42474 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48253 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55143 > 40960). Running this sequence through the model will result in indexing errors
58%|█████▊ | 12711/22095 [21:49:21<10:32:47, 4.05s/it] {'loss': 0.3565, 'grad_norm': 0.7112272484871678, 'learning_rate': 4.030659269745057e-06, 'epoch': 0.58}
58%|█████▊ | 12712/22095 [21:49:24<9:44:23, 3.74s/it] {'loss': 0.3082, 'grad_norm': 0.5642715316156801, 'learning_rate': 4.029940264417708e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (95763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66208 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78609 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98676 > 40960).
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12713/22095 [21:49:28<10:01:32, 3.85s/it] {'loss': 0.3061, 'grad_norm': 0.918628603221368, 'learning_rate': 4.0292212799339615e-06, 'epoch': 0.58} 58%|█████▊ | 12713/22095 [21:49:28<10:01:32, 3.85s/it] 58%|█████▊ | 12714/22095 [21:49:32<10:13:13, 3.92s/it] {'loss': 0.3714, 'grad_norm': 0.6357959836387956, 'learning_rate': 4.028502316309268e-06, 'epoch': 0.58} 58%|█████▊ | 12714/22095 [21:49:32<10:13:13, 3.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12715/22095 [21:49:41<14:34:39, 5.59s/it] {'loss': 0.4591, 'grad_norm': 0.2789898995520453, 'learning_rate': 4.0277833735590785e-06, 'epoch': 0.58} 58%|█████▊ | 12715/22095 [21:49:41<14:34:39, 5.59s/it] 58%|█████▊ | 12716/22095 [21:49:45<12:55:56, 4.96s/it] {'loss': 0.3302, 'grad_norm': 0.6074162663412183, 'learning_rate': 4.027064451698836e-06, 'epoch': 0.58} 58%|█████▊ | 12716/22095 [21:49:45<12:55:56, 4.96s/it] 58%|█████▊ | 12717/22095 [21:49:48<11:22:49, 4.37s/it] {'loss': 0.2989, 'grad_norm': 0.6881224733338933, 'learning_rate': 4.026345550743991e-06, 'epoch': 0.58} 58%|█████▊ | 12717/22095 [21:49:48<11:22:49, 4.37s/it] 58%|█████▊ | 12718/22095 [21:49:51<10:29:27, 4.03s/it] {'loss': 0.3411, 'grad_norm': 0.6765735227591337, 'learning_rate': 4.02562667070999e-06, 'epoch': 0.58} 58%|█████▊ | 12718/22095 [21:49:51<10:29:27, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47907 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12719/22095 [21:50:01<14:44:24, 5.66s/it] {'loss': 0.4767, 'grad_norm': 0.28841045085364697, 'learning_rate': 4.024907811612279e-06, 'epoch': 0.58} 58%|█████▊ | 12719/22095 [21:50:01<14:44:24, 5.66s/it] 58%|█████▊ | 12720/22095 [21:50:04<13:03:51, 5.02s/it] {'loss': 0.3265, 'grad_norm': 0.6353278290575984, 'learning_rate': 4.024188973466304e-06, 'epoch': 0.58} 58%|█████▊ | 12720/22095 [21:50:04<13:03:51, 5.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [573, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8420421 in VC:s3://internvl-moe-sft-data/. Exception: Image size [573, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 60200, 'image': 'vrdu_texteq/astro-ph.CO/8b91ad67-56e2-4f24-a357-1fec42cf0098.png', 'image_wh': [[573, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'The $B$-mode rms at $\\ell=100$ scales with $R_{\\rm eff}$\nas'}]} 58%|█████▊ | 12721/22095 [21:50:14<16:30:21, 6.34s/it] {'loss': 0.4648, 'grad_norm': 0.26528212661842693, 'learning_rate': 4.023470156287511e-06, 'epoch': 0.58} 58%|█████▊ | 12721/22095 [21:50:14<16:30:21, 6.34s/it] 58%|█████▊ | 12722/22095 [21:50:18<14:42:48, 5.65s/it] {'loss': 0.3239, 'grad_norm': 0.7287600120998405, 'learning_rate': 4.022751360091347e-06, 'epoch': 0.58} 58%|█████▊ | 12722/22095 [21:50:18<14:42:48, 5.65s/it] 58%|█████▊ | 12723/22095 [21:50:21<12:58:14, 4.98s/it] {'loss': 0.308, 'grad_norm': 0.7113149968104155, 'learning_rate': 4.022032584893253e-06, 'epoch': 0.58} 58%|█████▊ | 12723/22095 [21:50:21<12:58:14, 4.98s/it] 58%|█████▊ | 12724/22095 [21:50:24<11:42:02, 4.49s/it] {'loss': 0.3078, 'grad_norm': 0.581184386254656, 'learning_rate': 4.021313830708675e-06, 'epoch': 0.58} 58%|█████▊ | 12724/22095 [21:50:24<11:42:02, 4.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12725/22095 [21:50:34<15:43:17, 6.04s/it] {'loss': 0.4655, 'grad_norm': 0.2746754520286916, 'learning_rate': 4.0205950975530596e-06, 'epoch': 0.58} 58%|█████▊ | 12725/22095 [21:50:34<15:43:17, 6.04s/it] 58%|█████▊ | 12726/22095 [21:50:37<13:27:11, 5.17s/it] {'loss': 0.272, 'grad_norm': 0.5908541244857715, 'learning_rate': 4.019876385441844e-06, 'epoch': 0.58} 58%|█████▊ | 12726/22095 [21:50:37<13:27:11, 5.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12727/22095 [21:50:45<15:42:22, 6.04s/it] {'loss': 0.4895, 'grad_norm': 0.2643314381459348, 'learning_rate': 4.019157694390477e-06, 'epoch': 0.58} 58%|█████▊ | 12727/22095 [21:50:45<15:42:22, 6.04s/it]Rank 0: Number of image tokens 0 does not match 
number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12728/22095 [21:50:49<13:37:46, 5.24s/it] {'loss': 0.3155, 'grad_norm': 0.650686614819478, 'learning_rate': 4.018439024414399e-06, 'epoch': 0.58} 58%|█████▊ | 12728/22095 [21:50:49<13:37:46, 5.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49897 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43694 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12729/22095 [21:50:52<12:19:31, 4.74s/it] {'loss': 0.3354, 'grad_norm': 0.6030405524006986, 'learning_rate': 4.0177203755290496e-06, 'epoch': 0.58} 58%|█████▊ | 12729/22095 [21:50:52<12:19:31, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42785 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12730/22095 [21:50:55<10:57:41, 4.21s/it] {'loss': 0.3462, 'grad_norm': 0.6695106687422865, 'learning_rate': 4.017001747749873e-06, 'epoch': 0.58} 58%|█████▊ | 12730/22095 [21:50:55<10:57:41, 4.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85990 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75156 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12731/22095 [21:50:58<9:58:44, 3.84s/it] {'loss': 0.2662, 'grad_norm': 0.5929157274011767, 'learning_rate': 4.016283141092311e-06, 'epoch': 0.58} 58%|█████▊ | 12731/22095 [21:50:58<9:58:44, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (98347 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73328 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84639 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12732/22095 [21:51:02<9:57:34, 3.83s/it] {'loss': 0.3397, 'grad_norm': 0.669110331844809, 'learning_rate': 4.015564555571802e-06, 'epoch': 0.58} 58%|█████▊ | 12732/22095 [21:51:02<9:57:34, 3.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12733/22095 [21:51:05<9:33:36, 3.68s/it] {'loss': 0.3736, 'grad_norm': 0.6443987887172116, 'learning_rate': 4.014845991203787e-06, 'epoch': 0.58} 58%|█████▊ | 12733/22095 [21:51:05<9:33:36, 3.68s/it] 58%|█████▊ | 12734/22095 [21:51:09<9:46:06, 3.76s/it] {'loss': 0.3354, 'grad_norm': 0.5731768790903239, 'learning_rate': 4.0141274480037065e-06, 'epoch': 0.58} 58%|█████▊ | 12734/22095 [21:51:09<9:46:06, 3.76s/it] 58%|█████▊ | 12735/22095 [21:51:12<9:11:13, 3.53s/it] {'loss': 0.3162, 'grad_norm': 0.6561540012298605, 'learning_rate': 4.0134089259870005e-06, 'epoch': 0.58} 58%|█████▊ | 12735/22095 [21:51:12<9:11:13, 3.53s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8345323 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11977, 'image': 'vrdu_table_final_2/astro-ph.CO/cadcff51-d710-4e6b-9e6f-bdc4134dde33.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]} 58%|█████▊ | 12736/22095 [21:51:16<9:12:06, 3.54s/it] {'loss': 0.2947, 'grad_norm': 0.6221184811404515, 'learning_rate': 4.012690425169104e-06, 'epoch': 0.58} 58%|█████▊ | 12736/22095 [21:51:16<9:12:06, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12737/22095 [21:51:25<13:48:22, 5.31s/it] {'loss': 0.459, 'grad_norm': 0.32495705128941155, 'learning_rate': 4.011971945565461e-06, 'epoch': 0.58} 58%|█████▊ | 12737/22095 [21:51:25<13:48:22, 5.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75646 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45722 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49262 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12738/22095 [21:51:35<17:01:15, 6.55s/it] {'loss': 0.4768, 'grad_norm': 0.2992483251964859, 'learning_rate': 4.011253487191505e-06, 'epoch': 0.58} 58%|█████▊ | 12738/22095 [21:51:35<17:01:15, 6.55s/it] 58%|█████▊ | 12739/22095 [21:51:44<19:22:57, 7.46s/it] {'loss': 0.4937, 'grad_norm': 0.2760061214392678, 'learning_rate': 4.0105350500626735e-06, 'epoch': 0.58} 58%|█████▊ | 12739/22095 [21:51:44<19:22:57, 7.46s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 58%|█████▊ | 12740/22095 [21:51:52<19:39:28, 7.56s/it] {'loss': 0.476, 'grad_norm': 0.3164653998547672, 'learning_rate': 4.009816634194405e-06, 'epoch': 0.58} 58%|█████▊ | 12740/22095 [21:51:52<19:39:28, 7.56s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (55671 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12741/22095 [21:51:55<16:07:54, 6.21s/it] {'loss': 0.2865, 'grad_norm': 0.6132393217954801, 'learning_rate': 4.009098239602139e-06, 'epoch': 0.58} 58%|█████▊ | 12741/22095 [21:51:55<16:07:54, 6.21s/it] 58%|█████▊ | 12742/22095 [21:51:58<13:54:18, 5.35s/it] {'loss': 0.3452, 'grad_norm': 0.6912522265863241, 'learning_rate': 4.008379866301307e-06, 'epoch': 0.58} 58%|█████▊ | 12742/22095 [21:51:58<13:54:18, 5.35s/it] 58%|█████▊ | 12743/22095 [21:52:02<12:47:40, 4.93s/it] {'loss': 0.3405, 'grad_norm': 0.688102334677657, 'learning_rate': 4.007661514307344e-06, 'epoch': 0.58} 58%|█████▊ | 12743/22095 [21:52:02<12:47:40, 4.93s/it] 58%|█████▊ | 12744/22095 [21:52:05<11:18:24, 4.35s/it] {'loss': 0.2986, 'grad_norm': 0.5962739726272968, 'learning_rate': 4.006943183635691e-06, 'epoch': 0.58} 58%|█████▊ | 12744/22095 [21:52:05<11:18:24, 4.35s/it] 58%|█████▊ | 12745/22095 [21:52:10<11:31:01, 4.43s/it] {'loss': 0.3043, 'grad_norm': 0.6144464910937142, 'learning_rate': 4.006224874301776e-06, 'epoch': 0.58} 58%|█████▊ | 12745/22095 [21:52:10<11:31:01, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42119 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111849 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12746/22095 [21:52:20<15:45:38, 6.07s/it] {'loss': 0.4581, 'grad_norm': 0.34017827111284143, 'learning_rate': 4.0055065863210365e-06, 'epoch': 0.58} 58%|█████▊ | 12746/22095 [21:52:20<15:45:38, 6.07s/it] 58%|█████▊ | 12747/22095 [21:52:25<14:44:27, 5.68s/it] {'loss': 0.3251, 'grad_norm': 0.6265824965179838, 'learning_rate': 4.004788319708908e-06, 'epoch': 0.58} 58%|█████▊ | 12747/22095 [21:52:25<14:44:27, 5.68s/it] 58%|█████▊ | 12748/22095 [21:52:28<12:56:21, 4.98s/it] {'loss': 0.3459, 'grad_norm': 0.7435574824893756, 'learning_rate': 4.004070074480821e-06, 'epoch': 0.58} 58%|█████▊ | 12748/22095 [21:52:28<12:56:21, 4.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12749/22095 [21:52:34<14:01:34, 5.40s/it] {'loss': 0.4624, 'grad_norm': 0.31812262154286036, 'learning_rate': 4.003351850652208e-06, 'epoch': 0.58} 58%|█████▊ | 12749/22095 [21:52:34<14:01:34, 5.40s/it] 58%|█████▊ | 12750/22095 [21:52:39<13:00:20, 5.01s/it] {'loss': 0.3289, 'grad_norm': 0.7157534177543335, 'learning_rate': 4.002633648238504e-06, 'epoch': 0.58} 58%|█████▊ | 12750/22095 [21:52:39<13:00:20, 5.01s/it] 58%|█████▊ | 12751/22095 [21:52:42<11:44:51, 4.53s/it] {'loss': 0.2835, 'grad_norm': 0.5694676418437684, 'learning_rate': 4.00191546725514e-06, 'epoch': 0.58} 58%|█████▊ | 12751/22095 [21:52:42<11:44:51, 4.53s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12752/22095 [21:52:46<11:05:06, 4.27s/it] {'loss': 0.3197, 'grad_norm': 0.6275407189140688, 'learning_rate': 4.001197307717547e-06, 'epoch': 0.58} 58%|█████▊ | 12752/22095 [21:52:46<11:05:06, 4.27s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ 
| 12753/22095 [21:52:48<9:52:32, 3.81s/it] {'loss': 0.2917, 'grad_norm': 0.6405241813082845, 'learning_rate': 4.000479169641155e-06, 'epoch': 0.58} 58%|█████▊ | 12753/22095 [21:52:48<9:52:32, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12754/22095 [21:52:54<11:28:32, 4.42s/it] {'loss': 0.485, 'grad_norm': 0.30821742585046996, 'learning_rate': 3.999761053041398e-06, 'epoch': 0.58} 58%|█████▊ | 12754/22095 [21:52:54<11:28:32, 4.42s/it] 58%|█████▊ | 12755/22095 [21:52:58<11:03:57, 4.27s/it] {'loss': 0.2894, 'grad_norm': 0.5630410441773159, 'learning_rate': 3.999042957933703e-06, 'epoch': 0.58} 58%|█████▊ | 12755/22095 [21:52:58<11:03:57, 4.27s/it] 58%|█████▊ | 12756/22095 [21:53:02<10:25:04, 4.02s/it] {'loss': 0.3302, 'grad_norm': 0.6669897014151894, 'learning_rate': 3.9983248843335e-06, 'epoch': 0.58} 58%|█████▊ | 12756/22095 [21:53:02<10:25:04, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46105 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78573 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54502 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85394 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12757/22095 [21:53:05<10:12:50, 3.94s/it] {'loss': 0.3323, 'grad_norm': 0.7217996992980149, 'learning_rate': 3.997606832256221e-06, 'epoch': 0.58} 58%|█████▊ | 12757/22095 [21:53:05<10:12:50, 3.94s/it] 58%|█████▊ | 12758/22095 [21:53:09<9:55:58, 3.83s/it] {'loss': 0.3612, 'grad_norm': 0.7234170943206248, 'learning_rate': 3.9968888017172905e-06, 'epoch': 0.58} 58%|█████▊ | 12758/22095 [21:53:09<9:55:58, 3.83s/it] 58%|█████▊ | 12759/22095 [21:53:13<9:48:17, 3.78s/it] {'loss': 0.3308, 'grad_norm': 0.6280443839476121, 'learning_rate': 3.996170792732139e-06, 'epoch': 0.58} 58%|█████▊ | 12759/22095 [21:53:13<9:48:17, 3.78s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [214, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8346213 in VC:s3://internvl-moe-sft-data/. Exception: Image size [214, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 12870, 'image': 'vrdu_table_final_2/astro-ph.CO/4b2b60e7-a29f-41a4-82c3-95f9b909fe7f.png', 'image_wh': [[214, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\small #1 \\today\n\\end{tabular}\n```"}]} 58%|█████▊ | 12760/22095 [21:53:16<9:56:13, 3.83s/it] {'loss': 0.3524, 'grad_norm': 0.6410420830552802, 'learning_rate': 3.995452805316195e-06, 'epoch': 0.58} 58%|█████▊ | 12760/22095 [21:53:16<9:56:13, 3.83s/it] 58%|█████▊ | 12761/22095 [21:53:20<9:20:30, 3.60s/it] {'loss': 0.3039, 'grad_norm': 0.6175844970036328, 'learning_rate': 3.994734839484884e-06, 'epoch': 0.58} 58%|█████▊ | 12761/22095 [21:53:20<9:20:30, 3.60s/it] 58%|█████▊ | 12762/22095 [21:53:23<9:27:24, 3.65s/it] {'loss': 0.3257, 'grad_norm': 0.6249105358043234, 'learning_rate': 3.994016895253635e-06, 'epoch': 0.58} 58%|█████▊ | 12762/22095 [21:53:23<9:27:24, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52877 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55154 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12763/22095 [21:53:28<9:58:48, 3.85s/it] {'loss': 0.349, 'grad_norm': 0.6208124682102306, 'learning_rate': 3.9932989726378705e-06, 'epoch': 0.58} 58%|█████▊ | 12763/22095 [21:53:28<9:58:48, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71632 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12764/22095 [21:53:33<10:51:23, 4.19s/it] {'loss': 0.3263, 'grad_norm': 0.6246253614348443, 'learning_rate': 3.992581071653023e-06, 'epoch': 0.58} 58%|█████▊ | 12764/22095 [21:53:33<10:51:23, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42547 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44189 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56823 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (134534 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12765/22095 [21:53:42<14:58:40, 5.78s/it] {'loss': 0.4966, 'grad_norm': 0.35917458120348167, 'learning_rate': 3.991863192314512e-06, 'epoch': 0.58} 58%|█████▊ | 12765/22095 [21:53:42<14:58:40, 5.78s/it] 58%|█████▊ | 12766/22095 [21:53:46<13:20:06, 5.15s/it] {'loss': 0.3089, 'grad_norm': 0.5729151681896888, 'learning_rate': 3.991145334637765e-06, 'epoch': 0.58} 58%|█████▊ | 12766/22095 [21:53:46<13:20:06, 5.15s/it] 58%|█████▊ | 12767/22095 [21:53:49<11:42:57, 4.52s/it] {'loss': 0.2974, 'grad_norm': 0.6572528307452108, 'learning_rate': 3.990427498638208e-06, 'epoch': 0.58} 58%|█████▊ | 12767/22095 [21:53:49<11:42:57, 4.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12768/22095 [21:53:52<10:24:28, 4.02s/it] {'loss': 0.264, 'grad_norm': 0.8428472380687402, 'learning_rate': 3.98970968433126e-06, 'epoch': 0.58} 58%|█████▊ | 12768/22095 [21:53:52<10:24:28, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55985 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47465 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89549 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12769/22095 [21:53:55<9:46:26, 3.77s/it] {'loss': 0.3117, 'grad_norm': 0.6736949965479246, 'learning_rate': 3.98899189173235e-06, 'epoch': 0.58} 58%|█████▊ | 12769/22095 [21:53:55<9:46:26, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12770/22095 [21:53:59<9:43:30, 3.75s/it] {'loss': 0.3536, 'grad_norm': 0.6581459249727647, 'learning_rate': 3.988274120856901e-06, 'epoch': 0.58} 58%|█████▊ | 12770/22095 [21:53:59<9:43:30, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12771/22095 [21:54:08<14:04:59, 5.44s/it] {'loss': 0.482, 'grad_norm': 0.28315764861161824, 'learning_rate': 3.987556371720331e-06, 'epoch': 0.58} 58%|█████▊ | 12771/22095 [21:54:08<14:04:59, 5.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12772/22095 [21:54:17<17:06:29, 6.61s/it] {'loss': 0.4758, 'grad_norm': 0.28487596963205375, 'learning_rate': 3.986838644338066e-06, 'epoch': 0.58} 58%|█████▊ | 12772/22095 [21:54:17<17:06:29, 6.61s/it] 58%|█████▊ | 12773/22095 [21:54:27<19:20:16, 7.47s/it] {'loss': 0.4713, 'grad_norm': 0.2829952282231626, 'learning_rate': 3.986120938725529e-06, 'epoch': 0.58} 58%|█████▊ | 12773/22095 [21:54:27<19:20:16, 7.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 58%|█████▊ | 12774/22095 [21:54:30<16:09:35, 6.24s/it] {'loss': 0.313, 'grad_norm': 0.621195865422692, 'learning_rate': 3.9854032548981354e-06, 'epoch': 0.58} 58%|█████▊ | 12774/22095 [21:54:30<16:09:35, 6.24s/it]Rank 0: Number of image tokens 0 does not match 
number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12775/22095 [21:54:34<14:05:59, 5.45s/it] {'loss': 0.3479, 'grad_norm': 0.611176870304872, 'learning_rate': 3.984685592871311e-06, 'epoch': 0.58} 58%|█████▊ | 12775/22095 [21:54:34<14:05:59, 5.45s/it] 58%|█████▊ | 12776/22095 [21:54:37<12:24:09, 4.79s/it] {'loss': 0.3335, 'grad_norm': 0.6438316659957177, 'learning_rate': 3.983967952660477e-06, 'epoch': 0.58} 58%|█████▊ | 12776/22095 [21:54:37<12:24:09, 4.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12777/22095 [21:54:46<15:30:16, 5.99s/it] {'loss': 0.4994, 'grad_norm': 0.29361965736641604, 'learning_rate': 3.983250334281049e-06, 'epoch': 0.58} 58%|█████▊ | 12777/22095 [21:54:46<15:30:16, 5.99s/it] 58%|█████▊ | 12778/22095 [21:54:49<13:44:08, 5.31s/it] {'loss': 0.303, 'grad_norm': 0.7576573224162063, 'learning_rate': 3.982532737748448e-06, 'epoch': 0.58} 58%|█████▊ | 12778/22095 [21:54:49<13:44:08, 5.31s/it] 58%|█████▊ | 12779/22095 [21:54:53<12:39:07, 4.89s/it] {'loss': 0.2755, 'grad_norm': 0.5893669556988376, 'learning_rate': 3.9818151630780945e-06, 'epoch': 0.58} 58%|█████▊ | 12779/22095 [21:54:53<12:39:07, 4.89s/it] 58%|█████▊ | 12780/22095 [21:54:57<11:47:45, 4.56s/it] {'loss': 0.2798, 'grad_norm': 0.6434397060345102, 'learning_rate': 3.981097610285407e-06, 'epoch': 0.58} 58%|█████▊ | 12780/22095 [21:54:57<11:47:45, 4.56s/it] 58%|█████▊ | 12781/22095 [21:55:00<10:36:17, 4.10s/it] {'loss': 0.3416, 'grad_norm': 0.6010668767369041, 'learning_rate': 3.980380079385802e-06, 'epoch': 0.58} 58%|█████▊ | 12781/22095 [21:55:00<10:36:17, 4.10s/it] 58%|█████▊ | 12782/22095 [21:55:04<10:02:29, 3.88s/it] {'loss': 0.3412, 'grad_norm': 0.6356394138357916, 'learning_rate': 
3.979662570394696e-06, 'epoch': 0.58} 58%|█████▊ | 12782/22095 [21:55:04<10:02:29, 3.88s/it] 58%|█████▊ | 12783/22095 [21:55:07<9:56:00, 3.84s/it] {'loss': 0.3152, 'grad_norm': 0.5965268149709056, 'learning_rate': 3.97894508332751e-06, 'epoch': 0.58} 58%|█████▊ | 12783/22095 [21:55:07<9:56:00, 3.84s/it] 58%|█████▊ | 12784/22095 [21:55:11<9:31:23, 3.68s/it] {'loss': 0.3035, 'grad_norm': 0.6397490509031601, 'learning_rate': 3.978227618199657e-06, 'epoch': 0.58} 58%|█████▊ | 12784/22095 [21:55:11<9:31:23, 3.68s/it] 58%|█████▊ | 12785/22095 [21:55:14<8:53:45, 3.44s/it] {'loss': 0.3658, 'grad_norm': 0.7026293236806275, 'learning_rate': 3.977510175026555e-06, 'epoch': 0.58} 58%|█████▊ | 12785/22095 [21:55:14<8:53:45, 3.44s/it] 58%|█████▊ | 12786/22095 [21:55:16<8:27:46, 3.27s/it] {'loss': 0.3371, 'grad_norm': 0.647604276207456, 'learning_rate': 3.976792753823619e-06, 'epoch': 0.58} 58%|█████▊ | 12786/22095 [21:55:16<8:27:46, 3.27s/it] 58%|█████▊ | 12787/22095 [21:55:20<8:32:07, 3.30s/it] {'loss': 0.2776, 'grad_norm': 1.547699319518536, 'learning_rate': 3.976075354606263e-06, 'epoch': 0.58} 58%|█████▊ | 12787/22095 [21:55:20<8:32:07, 3.30s/it] 58%|█████▊ | 12788/22095 [21:55:23<8:35:46, 3.33s/it] {'loss': 0.3261, 'grad_norm': 0.5939460075902443, 'learning_rate': 3.975357977389903e-06, 'epoch': 0.58} 58%|█████▊ | 12788/22095 [21:55:23<8:35:46, 3.33s/it] 58%|█████▊ | 12789/22095 [21:55:26<8:21:54, 3.24s/it] {'loss': 0.2983, 'grad_norm': 0.6238851371558091, 'learning_rate': 3.974640622189955e-06, 'epoch': 0.58} 58%|█████▊ | 12789/22095 [21:55:26<8:21:54, 3.24s/it] 58%|█████▊ | 12790/22095 [21:55:30<8:33:06, 3.31s/it] {'loss': 0.3346, 'grad_norm': 0.6823620323587973, 'learning_rate': 3.973923289021829e-06, 'epoch': 0.58} 58%|█████▊ | 12790/22095 [21:55:30<8:33:06, 3.31s/it] 58%|█████▊ | 12791/22095 [21:55:33<8:18:40, 3.22s/it] {'loss': 0.3021, 'grad_norm': 0.6663588433491106, 'learning_rate': 3.97320597790094e-06, 'epoch': 0.58} 58%|█████▊ | 12791/22095 [21:55:33<8:18:40, 
3.22s/it] 58%|█████▊ | 12792/22095 [21:55:37<8:53:27, 3.44s/it] {'loss': 0.3425, 'grad_norm': 0.7458479916275186, 'learning_rate': 3.972488688842701e-06, 'epoch': 0.58} 58%|█████▊ | 12792/22095 [21:55:37<8:53:27, 3.44s/it] 58%|█████▊ | 12793/22095 [21:55:41<9:37:40, 3.73s/it] {'loss': 0.3797, 'grad_norm': 0.6802867122290038, 'learning_rate': 3.971771421862527e-06, 'epoch': 0.58} 58%|█████▊ | 12793/22095 [21:55:41<9:37:40, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43908 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74140 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12794/22095 [21:55:46<10:36:54, 4.11s/it] {'loss': 0.3342, 'grad_norm': 0.6712055248027057, 'learning_rate': 3.971054176975825e-06, 'epoch': 0.58} 58%|█████▊ | 12794/22095 [21:55:46<10:36:54, 4.11s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047849 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 3cm\nB. 4cm\nC. 6cm\nD. 
2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 58%|█████▊ | 12795/22095 [21:55:51<11:00:44, 4.26s/it] {'loss': 0.3158, 'grad_norm': 0.7196814387097701, 'learning_rate': 3.970336954198008e-06, 'epoch': 0.58} 58%|█████▊ | 12795/22095 [21:55:51<11:00:44, 4.26s/it] 58%|█████▊ | 12796/22095 [21:55:54<10:24:42, 4.03s/it] {'loss': 0.2616, 'grad_norm': 0.6351901479408457, 'learning_rate': 3.969619753544491e-06, 'epoch': 0.58} 58%|█████▊ | 12796/22095 [21:55:54<10:24:42, 4.03s/it] 58%|█████▊ | 12797/22095 [21:55:57<9:32:16, 3.69s/it] {'loss': 0.3328, 'grad_norm': 0.655342732680527, 'learning_rate': 3.968902575030676e-06, 'epoch': 0.58} 58%|█████▊ | 12797/22095 [21:55:57<9:32:16, 3.69s/it] 58%|█████▊ | 12798/22095 [21:56:01<9:22:34, 3.63s/it] {'loss': 0.3492, 'grad_norm': 0.770406743505147, 'learning_rate': 3.968185418671981e-06, 'epoch': 0.58} 58%|█████▊ | 12798/22095 [21:56:01<9:22:34, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12799/22095 [21:56:09<12:57:58, 5.02s/it] {'loss': 0.468, 'grad_norm': 0.3541400721543433, 'learning_rate': 3.967468284483812e-06, 'epoch': 0.58} 58%|█████▊ | 12799/22095 [21:56:09<12:57:58, 5.02s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [625, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8436479 in VC:s3://internvl-moe-sft-data/. Exception: Image size [625, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 53186, 'image': 'vrdu_texteq/astro-ph.CO/47402bbd-b083-4599-be0c-13268974b92e.png', 'image_wh': [[625, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $\\Delta_{N}$ is the observed fluctuation at the source:'}]} 58%|█████▊ | 12800/22095 [21:56:18<16:21:23, 6.34s/it] {'loss': 0.4574, 'grad_norm': 0.3190653247968186, 'learning_rate': 3.966751172481577e-06, 'epoch': 0.58} 58%|█████▊ | 12800/22095 [21:56:18<16:21:23, 6.34s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 58%|█████▊ | 12801/22095 [21:56:23<14:49:20, 5.74s/it] {'loss': 0.2742, 'grad_norm': 0.6770977156420795, 'learning_rate': 3.966034082680686e-06, 'epoch': 0.58} 58%|█████▊ | 12801/22095 [21:56:23<14:49:20, 5.74s/it] 58%|█████▊ | 12802/22095 [21:56:34<18:55:32, 7.33s/it] {'loss': 0.4658, 'grad_norm': 0.28384315591545206, 'learning_rate': 3.9653170150965494e-06, 'epoch': 0.58} 58%|█████▊ | 12802/22095 [21:56:34<18:55:32, 7.33s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 58%|█████▊ | 12803/22095 [21:56:37<15:55:12, 6.17s/it] {'loss': 0.3507, 'grad_norm': 0.6288247634755264, 'learning_rate': 3.96459996974457e-06, 'epoch': 0.58} 58%|█████▊ | 12803/22095 [21:56:37<15:55:12, 6.17s/it] 58%|█████▊ | 12804/22095 [21:56:41<14:03:14, 5.45s/it] {'loss': 0.3339, 'grad_norm': 0.6554554918173014, 'learning_rate': 3.963882946640158e-06, 'epoch': 0.58} 58%|█████▊ | 12804/22095 [21:56:41<14:03:14, 5.45s/it] 58%|█████▊ | 12805/22095 [21:56:44<12:41:22, 4.92s/it] {'loss': 0.3093, 'grad_norm': 0.6005424677853866, 'learning_rate': 3.963165945798718e-06, 'epoch': 0.58} 58%|█████▊ | 12805/22095 [21:56:44<12:41:22, 4.92s/it] 58%|█████▊ | 12806/22095 [21:56:47<11:11:58, 4.34s/it] {'loss': 0.3165, 'grad_norm': 0.6177179456099513, 'learning_rate': 3.9624489672356605e-06, 'epoch': 0.58} 58%|█████▊ | 12806/22095 [21:56:47<11:11:58, 
4.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50279 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89828 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12807/22095 [21:56:51<10:25:30, 4.04s/it] {'loss': 0.3249, 'grad_norm': 0.623730570206981, 'learning_rate': 3.961732010966385e-06, 'epoch': 0.58} 58%|█████▊ | 12807/22095 [21:56:51<10:25:30, 4.04s/it] 58%|█████▊ | 12808/22095 [21:56:54<10:00:08, 3.88s/it] {'loss': 0.3376, 'grad_norm': 0.5890721069191882, 'learning_rate': 3.961015077006301e-06, 'epoch': 0.58} 58%|█████▊ | 12808/22095 [21:56:54<10:00:08, 3.88s/it] 58%|█████▊ | 12809/22095 [21:56:58<9:38:55, 3.74s/it] {'loss': 0.2701, 'grad_norm': 0.5774276192360328, 'learning_rate': 3.960298165370814e-06, 'epoch': 0.58} 58%|█████▊ | 12809/22095 [21:56:58<9:38:55, 3.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48033 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108543 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12810/22095 [21:57:02<9:50:12, 3.81s/it] {'loss': 0.3298, 'grad_norm': 0.6945883497703088, 'learning_rate': 3.959581276075324e-06, 'epoch': 0.58} 58%|█████▊ | 12810/22095 [21:57:02<9:50:12, 3.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59582 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80879 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12811/22095 [21:57:05<9:32:23, 3.70s/it] {'loss': 0.3123, 'grad_norm': 0.5855370075421497, 'learning_rate': 3.958864409135236e-06, 'epoch': 0.58} 58%|█████▊ | 12811/22095 [21:57:05<9:32:23, 3.70s/it] 58%|█████▊ | 12812/22095 [21:57:08<8:54:50, 3.46s/it] {'loss': 0.3212, 'grad_norm': 0.6266850936855771, 'learning_rate': 3.9581475645659565e-06, 'epoch': 0.58} 58%|█████▊ | 12812/22095 [21:57:08<8:54:50, 3.46s/it] 58%|█████▊ | 12813/22095 [21:57:12<9:18:37, 3.61s/it] {'loss': 0.2849, 'grad_norm': 0.6205072222388495, 'learning_rate': 3.957430742382885e-06, 'epoch': 0.58} 58%|█████▊ | 12813/22095 [21:57:12<9:18:37, 3.61s/it] 58%|█████▊ | 12814/22095 [21:57:15<9:00:51, 3.50s/it] {'loss': 0.2991, 'grad_norm': 0.6942275103026728, 'learning_rate': 3.956713942601425e-06, 'epoch': 0.58} 58%|█████▊ | 12814/22095 [21:57:15<9:00:51, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (81735 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12815/22095 [21:57:25<13:37:53, 5.29s/it] {'loss': 0.4859, 'grad_norm': 0.4025010941475067, 'learning_rate': 3.955997165236979e-06, 'epoch': 0.58} 58%|█████▊ | 12815/22095 [21:57:25<13:37:53, 5.29s/it] 58%|█████▊ | 12816/22095 [21:57:28<11:59:37, 4.65s/it] {'loss': 0.3484, 'grad_norm': 0.6681984766651937, 'learning_rate': 3.955280410304945e-06, 'epoch': 0.58} 58%|█████▊ | 12816/22095 [21:57:28<11:59:37, 4.65s/it] 58%|█████▊ | 12817/22095 [21:57:31<10:53:46, 4.23s/it] {'loss': 0.3124, 'grad_norm': 0.6516308948976135, 'learning_rate': 3.954563677820729e-06, 'epoch': 0.58} 58%|█████▊ | 12817/22095 [21:57:31<10:53:46, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57292 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115732 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47751 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89410 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12818/22095 [21:57:38<12:44:29, 4.94s/it] {'loss': 0.4557, 'grad_norm': 0.31077237574827565, 'learning_rate': 3.953846967799728e-06, 'epoch': 0.58} 58%|█████▊ | 12818/22095 [21:57:38<12:44:29, 4.94s/it] 58%|█████▊ | 12819/22095 [21:57:44<13:33:26, 5.26s/it] {'loss': 0.4849, 'grad_norm': 0.2932590904326027, 'learning_rate': 3.953130280257342e-06, 'epoch': 0.58} 58%|█████▊ | 12819/22095 [21:57:44<13:33:26, 5.26s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (57634 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43653 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44049 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46263 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12820/22095 [21:57:47<11:51:22, 4.60s/it] {'loss': 0.2986, 'grad_norm': 0.6527721169835451, 'learning_rate': 3.95241361520897e-06, 'epoch': 0.58} 58%|█████▊ | 12820/22095 [21:57:47<11:51:22, 4.60s/it] 58%|█████▊ | 12821/22095 [21:57:51<11:21:29, 4.41s/it] {'loss': 0.2966, 'grad_norm': 0.6206449598878321, 'learning_rate': 3.9516969726700135e-06, 'epoch': 0.58} 58%|█████▊ | 12821/22095 [21:57:51<11:21:29, 4.41s/it] 58%|█████▊ | 12822/22095 [21:57:55<10:56:58, 4.25s/it] {'loss': 0.3191, 'grad_norm': 0.6256445807776387, 'learning_rate': 3.950980352655871e-06, 'epoch': 0.58} 58%|█████▊ | 12822/22095 [21:57:55<10:56:58, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8888451 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11604, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 
2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 58%|█████▊ | 12823/22095 [21:58:04<15:00:06, 5.82s/it] {'loss': 0.459, 'grad_norm': 0.32108889209316294, 'learning_rate': 3.950263755181937e-06, 'epoch': 0.58} 58%|█████▊ | 12823/22095 [21:58:04<15:00:06, 5.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44785 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12824/22095 [21:58:07<13:04:58, 5.08s/it] {'loss': 0.3327, 'grad_norm': 0.6527273268980939, 'learning_rate': 3.94954718026361e-06, 'epoch': 0.58} 58%|█████▊ | 12824/22095 [21:58:07<13:04:58, 5.08s/it] 58%|█████▊ | 12825/22095 [21:58:11<11:33:02, 4.49s/it] {'loss': 0.3324, 'grad_norm': 0.6044083467629512, 'learning_rate': 3.948830627916291e-06, 'epoch': 0.58} 58%|█████▊ | 12825/22095 [21:58:11<11:33:02, 4.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (66078 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45909 > 40960). 
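
The recurring warning "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" means over-long tokenized samples reach the model untruncated. A minimal guard, assuming a plain list of token ids and the 40960 limit reported in the log (the function name is ours, not part of the training code):

```python
# Clamp tokenized samples to the model's maximum context length (40960 in
# this log) so over-long sequences cannot cause indexing errors downstream.
MAX_MODEL_LEN = 40960

def clamp_token_ids(token_ids, max_len=MAX_MODEL_LEN):
    """Truncate a token-id sequence to at most `max_len` entries."""
    return token_ids[:max_len] if len(token_ids) > max_len else token_ids

ids = list(range(43908))          # one of the offending lengths from the log
print(len(clamp_token_ids(ids)))  # 40960
```

Whether truncating (versus dropping the sample) is acceptable depends on where the overflow sits in the conversation; the sketch only shows the length guard itself.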
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12826/22095 [21:58:17<12:42:24, 4.94s/it] {'loss': 0.4796, 'grad_norm': 0.4489799411707293, 'learning_rate': 3.94811409815537e-06, 'epoch': 0.58} 58%|█████▊ | 12826/22095 [21:58:17<12:42:24, 4.94s/it] 58%|█████▊ | 12827/22095 [21:58:20<11:55:01, 4.63s/it] {'loss': 0.3054, 'grad_norm': 0.6309711058364544, 'learning_rate': 3.9473975909962484e-06, 'epoch': 0.58} 58%|█████▊ | 12827/22095 [21:58:20<11:55:01, 4.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12828/22095 [21:58:32<17:11:14, 6.68s/it] {'loss': 0.4489, 'grad_norm': 0.33958261127758466, 'learning_rate': 3.946681106454319e-06, 'epoch': 0.58} 58%|█████▊ | 12828/22095 [21:58:32<17:11:14, 6.68s/it] 58%|█████▊ | 12829/22095 [21:58:36<15:26:26, 6.00s/it] {'loss': 0.318, 'grad_norm': 0.6527249012356833, 'learning_rate': 3.9459646445449785e-06, 'epoch': 0.58} 58%|█████▊ | 12829/22095 [21:58:36<15:26:26, 6.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43889 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (140637 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12830/22095 [21:58:40<13:59:10, 5.43s/it] {'loss': 0.2796, 'grad_norm': 0.5899912082073602, 'learning_rate': 3.945248205283618e-06, 'epoch': 0.58} 58%|█████▊ | 12830/22095 [21:58:40<13:59:10, 5.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12831/22095 [21:58:45<13:21:59, 5.19s/it] {'loss': 0.3251, 'grad_norm': 0.6648150974363372, 'learning_rate': 3.944531788685637e-06, 'epoch': 0.58} 58%|█████▊ | 12831/22095 [21:58:45<13:21:59, 5.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (52317 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62963 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48811 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56781 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118695 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48165 > 40960). 
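
The pair of messages "Rank 0: Number of image tokens 0 does not match number of images 1" and "Rank 0: Fixed image tokens in the conversation" indicates the trainer patches conversations whose image placeholders are missing. A sketch of one such repair, assuming a literal `<image>` placeholder string in the conversation text (both the placeholder and the helper name are assumptions about the data format, not the trainer's code):

```python
# Sketch of the repair implied by "Number of image tokens 0 does not match
# number of images 1" / "Fixed image tokens in the conversation": prepend
# missing placeholders until their count matches the attached images.
# The "<image>" placeholder string is an assumption about the data format.
PLACEHOLDER = "<image>"

def fix_image_tokens(text, num_images, placeholder=PLACEHOLDER):
    """Prepend placeholders so the text references every attached image."""
    missing = num_images - text.count(placeholder)
    if missing > 0:
        text = (placeholder + "\n") * missing + text
    return text
```

A conversation with one attached image but no placeholder would come back with exactly one `<image>` line prepended, while already-correct conversations pass through unchanged.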
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12832/22095 [21:58:52<14:22:57, 5.59s/it] {'loss': 0.4743, 'grad_norm': 0.33089472321988717, 'learning_rate': 3.943815394766426e-06, 'epoch': 0.58} 58%|█████▊ | 12832/22095 [21:58:52<14:22:57, 5.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49024 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120556 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12833/22095 [21:58:55<12:48:20, 4.98s/it] {'loss': 0.2991, 'grad_norm': 0.5800850848571503, 'learning_rate': 3.943099023541377e-06, 'epoch': 0.58} 58%|█████▊ | 12833/22095 [21:58:55<12:48:20, 4.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74089 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41643 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51723 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (122646 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12834/22095 [21:59:06<17:26:28, 6.78s/it] {'loss': 0.4845, 'grad_norm': 0.33315107133424154, 'learning_rate': 3.942382675025883e-06, 'epoch': 0.58} 58%|█████▊ | 12834/22095 [21:59:06<17:26:28, 6.78s/it] 58%|█████▊ | 12835/22095 [21:59:10<15:22:03, 5.97s/it] {'loss': 0.3057, 'grad_norm': 0.598203060435861, 'learning_rate': 3.941666349235341e-06, 'epoch': 0.58} 58%|█████▊ | 12835/22095 [21:59:10<15:22:03, 5.97s/it] 58%|█████▊ | 12836/22095 [21:59:13<13:16:43, 5.16s/it] {'loss': 0.3694, 'grad_norm': 0.6589915175799442, 'learning_rate': 3.9409500461851355e-06, 'epoch': 0.58} 58%|█████▊ | 12836/22095 [21:59:14<13:16:43, 5.16s/it] 58%|█████▊ | 12837/22095 [21:59:17<12:16:02, 4.77s/it] {'loss': 0.3176, 'grad_norm': 0.7464556060783643, 'learning_rate': 3.9402337658906615e-06, 'epoch': 0.58} 58%|█████▊ | 12837/22095 [21:59:17<12:16:02, 4.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12838/22095 [21:59:25<14:45:37, 5.74s/it] {'loss': 0.4881, 'grad_norm': 0.38661431178489153, 'learning_rate': 3.93951750836731e-06, 'epoch': 0.58} 58%|█████▊ | 12838/22095 [21:59:25<14:45:37, 5.74s/it] 58%|█████▊ | 12839/22095 [21:59:29<13:25:00, 5.22s/it] {'loss': 0.3421, 'grad_norm': 0.6059876360330815, 'learning_rate': 3.93880127363047e-06, 'epoch': 0.58} 58%|█████▊ | 12839/22095 [21:59:29<13:25:00, 5.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12840/22095 [21:59:39<16:51:43, 6.56s/it] {'loss': 0.489, 'grad_norm': 0.3021531130840247, 'learning_rate': 3.938085061695529e-06, 'epoch': 0.58} 58%|█████▊ | 12840/22095 [21:59:39<16:51:43, 6.56s/it] 58%|█████▊ | 12841/22095 [21:59:49<19:24:14, 7.55s/it] {'loss': 0.4935, 'grad_norm': 0.27781242396514616, 'learning_rate': 3.937368872577882e-06, 'epoch': 0.58} 58%|█████▊ | 12841/22095 [21:59:49<19:24:14, 7.55s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (47570 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108364 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107598 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12842/22095 [21:59:58<20:58:53, 8.16s/it] {'loss': 0.4707, 'grad_norm': 0.44269264539934433, 'learning_rate': 3.9366527062929126e-06, 'epoch': 0.58} 58%|█████▊ | 12842/22095 [21:59:59<20:58:53, 8.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 58%|█████▊ | 12843/22095 [22:00:02<17:13:14, 6.70s/it] {'loss': 0.3245, 'grad_norm': 0.6714753451841458, 'learning_rate': 3.935936562856011e-06, 'epoch': 0.58} 58%|█████▊ | 12843/22095 [22:00:02<17:13:14, 6.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12844/22095 [22:00:05<14:35:42, 5.68s/it] {'loss': 0.3028, 'grad_norm': 0.5905751013141842, 'learning_rate': 3.935220442282565e-06, 'epoch': 0.58} 58%|█████▊ | 12844/22095 [22:00:05<14:35:42, 5.68s/it] 58%|█████▊ | 12845/22095 [22:00:08<12:39:31, 4.93s/it] {'loss': 0.3371, 'grad_norm': 0.5738321032246363, 'learning_rate': 3.93450434458796e-06, 'epoch': 0.58} 58%|█████▊ | 12845/22095 [22:00:08<12:39:31, 4.93s/it] 58%|█████▊ | 12846/22095 [22:00:11<11:01:42, 4.29s/it] {'loss': 0.3246, 'grad_norm': 0.6601455637263614, 'learning_rate': 3.933788269787585e-06, 'epoch': 0.58} 58%|█████▊ | 12846/22095 [22:00:11<11:01:42, 4.29s/it] 58%|█████▊ | 12847/22095 [22:00:14<10:10:19, 3.96s/it] {'loss': 0.2964, 'grad_norm': 0.5894179831252712, 'learning_rate': 3.9330722178968275e-06, 'epoch': 0.58} 58%|█████▊ | 12847/22095 [22:00:14<10:10:19, 3.96s/it] 58%|█████▊ | 12848/22095 
[22:00:17<9:35:18, 3.73s/it] {'loss': 0.2994, 'grad_norm': 0.6887398419556391, 'learning_rate': 3.932356188931069e-06, 'epoch': 0.58} 58%|█████▊ | 12848/22095 [22:00:17<9:35:18, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83902 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69929 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12849/22095 [22:00:21<9:10:04, 3.57s/it] {'loss': 0.3156, 'grad_norm': 0.5871548930199265, 'learning_rate': 3.931640182905696e-06, 'epoch': 0.58} 58%|█████▊ | 12849/22095 [22:00:21<9:10:04, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (118357 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49106 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12850/22095 [22:00:29<12:29:05, 4.86s/it] {'loss': 0.4892, 'grad_norm': 0.37197516532945846, 'learning_rate': 3.930924199836096e-06, 'epoch': 0.58} 58%|█████▊ | 12850/22095 [22:00:29<12:29:05, 4.86s/it] 58%|█████▊ | 12851/22095 [22:00:32<11:12:01, 4.36s/it] {'loss': 0.3032, 'grad_norm': 0.572034524958866, 'learning_rate': 3.930208239737651e-06, 'epoch': 0.58} 58%|█████▊ | 12851/22095 [22:00:32<11:12:01, 4.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90602 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108852 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12852/22095 [22:00:36<10:52:53, 4.24s/it] {'loss': 0.285, 'grad_norm': 1.2407798640954382, 'learning_rate': 3.929492302625746e-06, 'epoch': 0.58} 58%|█████▊ | 12852/22095 [22:00:36<10:52:53, 4.24s/it] 58%|█████▊ | 12853/22095 [22:00:39<10:29:21, 4.09s/it] {'loss': 0.3773, 'grad_norm': 0.6219601274131884, 'learning_rate': 3.9287763885157625e-06, 'epoch': 0.58} 58%|█████▊ | 12853/22095 [22:00:39<10:29:21, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12854/22095 [22:00:47<13:18:46, 5.19s/it] {'loss': 0.5083, 'grad_norm': 0.3062327908374255, 'learning_rate': 3.928060497423087e-06, 'epoch': 0.58} 58%|█████▊ | 12854/22095 [22:00:47<13:18:46, 5.19s/it] 58%|█████▊ | 12855/22095 [22:00:51<11:59:33, 4.67s/it] {'loss': 0.3089, 'grad_norm': 0.6209752052130844, 'learning_rate': 3.9273446293630956e-06, 'epoch': 0.58} 58%|█████▊ | 12855/22095 [22:00:51<11:59:33, 4.67s/it] 58%|█████▊ | 12856/22095 [22:00:54<10:40:20, 4.16s/it] {'loss': 0.3641, 'grad_norm': 0.6137163222857616, 'learning_rate': 3.926628784351175e-06, 'epoch': 0.58} 58%|█████▊ | 12856/22095 [22:00:54<10:40:20, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12857/22095 [22:01:03<14:36:19, 5.69s/it] {'loss': 0.4472, 'grad_norm': 0.27066161094848945, 'learning_rate': 3.925912962402707e-06, 'epoch': 0.58} 58%|█████▊ | 12857/22095 [22:01:03<14:36:19, 5.69s/it] 58%|█████▊ | 12858/22095 [22:01:13<18:09:29, 7.08s/it] {'loss': 0.4903, 'grad_norm': 0.2924520462926132, 'learning_rate': 3.925197163533069e-06, 'epoch': 0.58} 58%|█████▊ | 12858/22095 [22:01:13<18:09:29, 7.08s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (44256 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12859/22095 [22:01:16<15:14:23, 5.94s/it] {'loss': 0.2943, 'grad_norm': 0.5878082240172025, 'learning_rate': 3.924481387757642e-06, 'epoch': 0.58} 58%|█████▊ | 12859/22095 [22:01:16<15:14:23, 5.94s/it] 58%|█████▊ | 12860/22095 [22:01:26<17:58:27, 7.01s/it] {'loss': 0.4897, 'grad_norm': 0.28326281618130894, 'learning_rate': 3.9237656350918095e-06, 'epoch': 0.58} 58%|█████▊ | 12860/22095 [22:01:26<17:58:27, 7.01s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 58%|█████▊ | 12861/22095 [22:01:29<15:05:19, 5.88s/it] {'loss': 0.2985, 'grad_norm': 0.655266197542611, 'learning_rate': 3.9230499055509454e-06, 'epoch': 0.58} 58%|█████▊ | 12861/22095 [22:01:29<15:05:19, 5.88s/it] 58%|█████▊ | 12862/22095 [22:01:32<13:03:27, 5.09s/it] {'loss': 0.2734, 'grad_norm': 0.6480727118851123, 'learning_rate': 3.922334199150433e-06, 'epoch': 0.58} 58%|█████▊ | 12862/22095 [22:01:32<13:03:27, 5.09s/it] 58%|█████▊ | 12863/22095 [22:01:36<11:59:38, 4.68s/it] {'loss': 0.3413, 'grad_norm': 0.6601708874608262, 'learning_rate': 3.921618515905647e-06, 'epoch': 0.58} 58%|█████▊ | 12863/22095 [22:01:36<11:59:38, 4.68s/it] 58%|█████▊ | 12864/22095 [22:01:40<11:02:32, 4.31s/it] {'loss': 0.2701, 'grad_norm': 0.895556337999452, 'learning_rate': 3.920902855831969e-06, 'epoch': 0.58} 58%|█████▊ | 12864/22095 [22:01:40<11:02:32, 4.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 58%|█████▊ | 12865/22095 [22:01:45<11:54:42, 4.65s/it] {'loss': 0.318, 'grad_norm': 0.6330166938868974, 'learning_rate': 3.920187218944774e-06, 'epoch': 0.58} 58%|█████▊ | 12865/22095 [22:01:45<11:54:42, 4.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047716 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 58%|█████▊ | 12866/22095 [22:01:48<10:34:17, 4.12s/it] {'loss': 0.2955, 'grad_norm': 0.6601310669210659, 'learning_rate': 3.919471605259438e-06, 'epoch': 0.58} 58%|█████▊ | 12866/22095 [22:01:48<10:34:17, 4.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12867/22095 [22:01:55<12:54:07, 5.03s/it] {'loss': 0.4634, 'grad_norm': 0.30908864221676563, 'learning_rate': 3.918756014791341e-06, 'epoch': 0.58} 58%|█████▊ | 12867/22095 [22:01:55<12:54:07, 5.03s/it] 58%|█████▊ | 12868/22095 [22:01:59<12:22:09, 4.83s/it] {'loss': 0.3537, 'grad_norm': 0.5976452518286259, 'learning_rate': 3.9180404475558555e-06, 'epoch': 0.58} 58%|█████▊ | 12868/22095 [22:01:59<12:22:09, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41263 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61524 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12869/22095 [22:02:03<11:17:19, 4.40s/it] {'loss': 0.3309, 'grad_norm': 0.5953767557162067, 'learning_rate': 3.917324903568356e-06, 'epoch': 0.58} 58%|█████▊ | 12869/22095 [22:02:03<11:17:19, 4.40s/it] 58%|█████▊ | 12870/22095 [22:02:07<11:08:30, 4.35s/it] {'loss': 0.3396, 'grad_norm': 0.6054511531383094, 'learning_rate': 3.916609382844221e-06, 'epoch': 0.58} 58%|█████▊ | 12870/22095 [22:02:07<11:08:30, 4.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12871/22095 [22:02:14<13:25:46, 5.24s/it] {'loss': 0.4907, 'grad_norm': 0.2822725519781029, 'learning_rate': 3.915893885398823e-06, 'epoch': 0.58} 58%|█████▊ | 12871/22095 [22:02:14<13:25:46, 5.24s/it] 58%|█████▊ | 12872/22095 [22:02:19<12:48:48, 5.00s/it] {'loss': 0.2795, 'grad_norm': 0.6860614582918673, 'learning_rate': 3.915178411247535e-06, 'epoch': 0.58} 58%|█████▊ | 12872/22095 [22:02:19<12:48:48, 5.00s/it] 58%|█████▊ | 12873/22095 [22:02:22<11:26:52, 4.47s/it] {'loss': 0.3569, 'grad_norm': 0.6228482677721467, 'learning_rate': 3.914462960405733e-06, 'epoch': 0.58} 58%|█████▊ | 12873/22095 [22:02:22<11:26:52, 4.47s/it] 58%|█████▊ | 12874/22095 [22:02:26<10:41:49, 4.18s/it] {'loss': 0.3243, 'grad_norm': 0.6258446491336329, 'learning_rate': 3.913747532888784e-06, 'epoch': 0.58} 58%|█████▊ | 12874/22095 [22:02:26<10:41:49, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60901 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60630 > 40960). 
Running this sequence through the model will result in indexing errors 58%|█████▊ | 12875/22095 [22:02:29<10:18:50, 4.03s/it] {'loss': 0.3357, 'grad_norm': 0.6210560291668852, 'learning_rate': 3.913032128712068e-06, 'epoch': 0.58} 58%|█████▊ | 12875/22095 [22:02:29<10:18:50, 4.03s/it] 58%|█████▊ | 12876/22095 [22:02:32<9:40:44, 3.78s/it] {'loss': 0.307, 'grad_norm': 0.6804612819642828, 'learning_rate': 3.912316747890951e-06, 'epoch': 0.58} 58%|█████▊ | 12876/22095 [22:02:32<9:40:44, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12877/22095 [22:02:42<14:03:09, 5.49s/it] {'loss': 0.4564, 'grad_norm': 0.2943025412568137, 'learning_rate': 3.911601390440809e-06, 'epoch': 0.58} 58%|█████▊ | 12877/22095 [22:02:42<14:03:09, 5.49s/it] 58%|█████▊ | 12878/22095 [22:02:45<12:29:48, 4.88s/it] {'loss': 0.3168, 'grad_norm': 0.6165984329019698, 'learning_rate': 3.910886056377008e-06, 'epoch': 0.58} 58%|█████▊ | 12878/22095 [22:02:45<12:29:48, 4.88s/it] 58%|█████▊ | 12879/22095 [22:02:50<12:07:51, 4.74s/it] {'loss': 0.3258, 'grad_norm': 0.6612618568342109, 'learning_rate': 3.9101707457149216e-06, 'epoch': 0.58} 58%|█████▊ | 12879/22095 [22:02:50<12:07:51, 4.74s/it] 58%|█████▊ | 12880/22095 [22:02:53<11:05:20, 4.33s/it] {'loss': 0.292, 'grad_norm': 0.6777851209953912, 'learning_rate': 3.90945545846992e-06, 'epoch': 0.58} 58%|█████▊ | 12880/22095 [22:02:53<11:05:20, 4.33s/it] 58%|█████▊ | 12881/22095 [22:02:57<10:52:45, 4.25s/it] {'loss': 0.3352, 'grad_norm': 0.6028725455461451, 'learning_rate': 3.908740194657369e-06, 'epoch': 0.58} 58%|█████▊ | 12881/22095 [22:02:57<10:52:45, 4.25s/it] 58%|█████▊ | 12882/22095 [22:03:03<12:00:29, 4.69s/it] {'loss': 0.3178, 'grad_norm': 0.6581191427684837, 'learning_rate': 3.90802495429264e-06, 'epoch': 0.58} 58%|█████▊ | 12882/22095 [22:03:03<12:00:29, 4.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 58%|█████▊ | 12883/22095 [22:03:13<16:28:03, 6.44s/it] {'loss': 0.483, 
'grad_norm': 0.29210513084469514, 'learning_rate': 3.907309737391104e-06, 'epoch': 0.58} 58%|█████▊ | 12883/22095 [22:03:13<16:28:03, 6.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41502 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (132941 > 40960). Running this sequence through the model will result in indexing errors 58%|█████▊ | 12884/22095 [22:03:18<14:48:48, 5.79s/it] {'loss': 0.3101, 'grad_norm': 0.6627943810879815, 'learning_rate': 3.906594543968122e-06, 'epoch': 0.58} 58%|█████▊ | 12884/22095 [22:03:18<14:48:48, 5.79s/it] 58%|█████▊ | 12885/22095 [22:03:21<12:39:49, 4.95s/it] {'loss': 0.3404, 'grad_norm': 0.7161743237685878, 'learning_rate': 3.905879374039066e-06, 'epoch': 0.58} 58%|█████▊ | 12885/22095 [22:03:21<12:39:49, 4.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45961 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48251 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52745 > 40960). 
Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12886/22095 [22:03:25<11:59:01, 4.68s/it] {'loss': 0.3235, 'grad_norm': 0.6400180015090664, 'learning_rate': 3.905164227619303e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 58%|█████▊ | 12887/22095 [22:03:32<14:07:36, 5.52s/it] {'loss': 0.4719, 'grad_norm': 0.266768614980608, 'learning_rate': 3.904449104724198e-06, 'epoch': 0.58}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047784 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 10\nB. 12\nC. 6\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 58%|█████▊ | 12888/22095 [22:03:36<12:44:36, 4.98s/it] {'loss': 0.3302, 'grad_norm': 0.6990613540113876, 'learning_rate': 3.903734005369115e-06, 'epoch': 0.58}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 58%|█████▊ | 12889/22095 [22:03:39<11:26:25, 4.47s/it] {'loss': 0.3261, 'grad_norm': 0.5934812759285606, 'learning_rate': 3.903018929569424e-06, 'epoch': 0.58}
 58%|█████▊ | 12890/22095 [22:03:43<10:31:23, 4.12s/it] {'loss': 0.3165, 'grad_norm': 0.6164190942026797, 'learning_rate': 3.902303877340486e-06, 'epoch': 0.58}
 58%|█████▊ | 12891/22095 [22:03:46<9:46:01, 3.82s/it] {'loss': 0.3159, 'grad_norm': 0.7572763337769141, 'learning_rate': 3.9015888486976666e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 58%|█████▊ | 12892/22095 [22:03:57<15:25:24, 6.03s/it] {'loss': 0.4659, 'grad_norm': 0.31349991781540076, 'learning_rate': 3.900873843656328e-06, 'epoch': 0.58}
 58%|█████▊ | 12893/22095 [22:04:00<13:17:08, 5.20s/it] {'loss': 0.3018, 'grad_norm': 0.6034155507082287, 'learning_rate': 3.900158862231837e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 58%|█████▊ | 12894/22095 [22:04:11<17:36:32, 6.89s/it] {'loss': 0.4543, 'grad_norm': 0.27460524405924486, 'learning_rate': 3.899443904439553e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (91301 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77914 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56377 > 40960). Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12895/22095 [22:04:15<15:21:47, 6.01s/it] {'loss': 0.3148, 'grad_norm': 0.5654824623125475, 'learning_rate': 3.89872897029484e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (107418 > 40960). Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12896/22095 [22:04:18<12:54:16, 5.05s/it] {'loss': 0.3318, 'grad_norm': 0.6280443259816898, 'learning_rate': 3.8980140598130585e-06, 'epoch': 0.58}
 58%|█████▊ | 12897/22095 [22:04:21<11:32:32, 4.52s/it] {'loss': 0.3363, 'grad_norm': 0.6714072933785529, 'learning_rate': 3.89729917300957e-06, 'epoch': 0.58}
 58%|█████▊ | 12898/22095 [22:04:24<10:41:59, 4.19s/it] {'loss': 0.2579, 'grad_norm': 2.789699193483099, 'learning_rate': 3.896584309899736e-06, 'epoch': 0.58}
 58%|█████▊ | 12899/22095 [22:04:27<9:34:08, 3.75s/it] {'loss': 0.2702, 'grad_norm': 0.6012598147773093, 'learning_rate': 3.895869470498917e-06, 'epoch': 0.58}
 58%|█████▊ | 12900/22095 [22:04:31<9:37:19, 3.77s/it] {'loss': 0.3081, 'grad_norm': 0.6348915755057556, 'learning_rate': 3.895154654822471e-06, 'epoch': 0.58}
 58%|█████▊ | 12901/22095 [22:04:35<9:43:01, 3.80s/it] {'loss': 0.2654, 'grad_norm': 0.5786267882485973, 'learning_rate': 3.894439862885758e-06, 'epoch': 0.58}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [206, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350975 in VC:s3://internvl-moe-sft-data/. Exception: Image size [206, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17650, 'image': 'vrdu_table_final_2/astro-ph.CO/1751b4b9-79f0-4b66-adb1-cef112721ae1.png', 'image_wh': [[206, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{l c}\n$^1$ Center of Field\n&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\\\\\n\\end{tabular}\n```"}]}
 58%|█████▊ | 12902/22095 [22:04:39<9:46:09, 3.83s/it] {'loss': 0.2948, 'grad_norm': 0.5707728278791753, 'learning_rate': 3.89372509470414e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 58%|█████▊ | 12903/22095 [22:04:45<11:50:19, 4.64s/it] {'loss': 0.4875, 'grad_norm': 0.2980866435477156, 'learning_rate': 3.893010350292967e-06, 'epoch': 0.58}
 58%|█████▊ | 12904/22095 [22:04:55<15:33:47, 6.10s/it] {'loss': 0.4832, 'grad_norm': 0.272228408390971, 'learning_rate': 3.892295629667604e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 58%|█████▊ | 12905/22095 [22:04:58<13:32:12, 5.30s/it] {'loss': 0.2819, 'grad_norm': 0.6108062446836691, 'learning_rate': 3.891580932843406e-06, 'epoch': 0.58}
 58%|█████▊ | 12906/22095 [22:05:01<11:58:04, 4.69s/it] {'loss': 0.3065, 'grad_norm': 0.7200864095677929, 'learning_rate': 3.890866259835731e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (50347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97618 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88541 > 40960).
Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12907/22095 [22:05:04<10:33:03, 4.13s/it] {'loss': 0.3127, 'grad_norm': 0.6191814565649314, 'learning_rate': 3.890151610659931e-06, 'epoch': 0.58}
 58%|█████▊ | 12908/22095 [22:05:08<9:56:07, 3.89s/it] {'loss': 0.3008, 'grad_norm': 0.6345450883334448, 'learning_rate': 3.8894369853313654e-06, 'epoch': 0.58}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8357834 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24544, 'image': 'vrdu_table_final_2/astro-ph.CO/4e0ddcd6-fd9b-4ec5-a536-27e671fc31d2.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
 58%|█████▊ | 12909/22095 [22:05:11<9:38:40, 3.78s/it] {'loss': 0.3201, 'grad_norm': 0.6054599253457945, 'learning_rate': 3.888722383865389e-06, 'epoch': 0.58}
 58%|█████▊ | 12910/22095 [22:05:14<9:13:35, 3.62s/it] {'loss': 0.2845, 'grad_norm': 0.6019468470687604, 'learning_rate': 3.888007806277355e-06, 'epoch': 0.58}
 58%|█████▊ | 12911/22095 [22:05:17<8:40:42, 3.40s/it] {'loss': 0.282, 'grad_norm': 0.6457389851076811, 'learning_rate': 3.887293252582616e-06, 'epoch': 0.58}
 58%|█████▊ | 12912/22095 [22:05:20<8:25:36, 3.30s/it] {'loss': 0.3555, 'grad_norm': 0.666848565280595, 'learning_rate': 3.886578722796532e-06, 'epoch': 0.58}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [500, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8529159 in VC:s3://internvl-moe-sft-data/. Exception: Image size [500, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 78743, 'image': 'vrdu_texteq/astro-ph.CO/ce757a4b-f154-4c90-80e0-daf27d86b574.png', 'image_wh': [[500, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'In the diluted limit $ \\nu \\ll -1 $ we have that'}]}
 58%|█████▊ | 12913/22095 [22:05:24<8:41:01, 3.40s/it] {'loss': 0.2958, 'grad_norm': 0.7169589320217262, 'learning_rate': 3.885864216934448e-06, 'epoch': 0.58}
 58%|█████▊ | 12914/22095 [22:05:27<8:13:38, 3.23s/it] {'loss': 0.302, 'grad_norm': 0.7171063512367375, 'learning_rate': 3.88514973501172e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 58%|█████▊ | 12915/22095 [22:05:36<12:37:58, 4.95s/it] {'loss': 0.4629, 'grad_norm': 0.35810858882672103, 'learning_rate': 3.884435277043703e-06, 'epoch': 0.58}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929818 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 52971, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nA. 8cm\nB. 5cm\nC. 6cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 58%|█████▊ | 12916/22095 [22:05:42<13:26:26, 5.27s/it] {'loss': 0.4891, 'grad_norm': 0.3523811905750544, 'learning_rate': 3.883720843045744e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 58%|█████▊ | 12917/22095 [22:05:45<11:54:27, 4.67s/it] {'loss': 0.3221, 'grad_norm': 0.6418237068424264, 'learning_rate': 3.883006433033194e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (97443 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79996 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54723 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72002 > 40960).
Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12918/22095 [22:05:49<11:25:43, 4.48s/it] {'loss': 0.3118, 'grad_norm': 0.6018725955131646, 'learning_rate': 3.882292047021407e-06, 'epoch': 0.58}
Token indices sequence length is longer than the specified maximum sequence length for this model (42318 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65340 > 40960). Running this sequence through the model will result in indexing errors
 58%|█████▊ | 12919/22095 [22:05:52<10:14:23, 4.02s/it] {'loss': 0.2973, 'grad_norm': 0.6411292128169545, 'learning_rate': 3.8815776850257325e-06, 'epoch': 0.58}
 58%|█████▊ | 12920/22095 [22:05:56<10:08:47, 3.98s/it] {'loss': 0.329, 'grad_norm': 0.6211094756032005, 'learning_rate': 3.880863347061516e-06, 'epoch': 0.58}
 58%|█████▊ | 12921/22095 [22:06:00<9:57:03, 3.90s/it] {'loss': 0.3365, 'grad_norm': 0.620245134177958, 'learning_rate': 3.88014903314411e-06, 'epoch': 0.58}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 58%|█████▊ | 12922/22095 [22:06:03<9:44:06, 3.82s/it] {'loss': 0.322, 'grad_norm': 0.6868312250884809, 'learning_rate': 3.879434743288863e-06, 'epoch': 0.58}
 58%|█████▊ | 12923/22095 [22:06:06<9:07:46, 3.58s/it] {'loss': 0.303, 'grad_norm': 0.6462710705303839, 'learning_rate': 3.8787204775112e-06, 'epoch': 0.58}
 58%|█████▊ | 12924/22095 [22:06:10<9:12:33, 3.62s/it] {'loss': 0.2956, 'grad_norm': 0.6655929839190928, 'learning_rate': 3.878006235826231e-06, 'epoch': 0.58}
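The recurring `ValueError: Image size ... is too small. Minimum size is 28` failures above all come from samples whose annotated `image_wh` has a side below the 28-pixel minimum the vision preprocessor accepts, so the dataloader raises, logs the problematic sample, and retries. Such samples can be screened out of the annotation list ahead of training instead. A minimal sketch, assuming only the `image_wh: [[width, height]]` layout visible in the logged samples (`MIN_SIDE` and `filter_samples` are illustrative names, not part of qwen-vl-finetune):

```python
# Illustrative pre-filter: drop samples whose annotated image size is below
# the 28-px minimum side length reported in the log above.
# MIN_SIDE and filter_samples are hypothetical names, not the repo's API.
MIN_SIDE = 28

def filter_samples(samples):
    kept, dropped = [], []
    for sample in samples:
        # 'image_wh' is a list of [width, height] pairs, as in the logged samples.
        sizes = sample.get("image_wh", [])
        ok = all(w >= MIN_SIDE and h >= MIN_SIDE for w, h in sizes)
        (kept if ok else dropped).append(sample)
    return kept, dropped

samples = [
    {"image": "calculation_images/5361.png", "image_wh": [[185, 26]]},  # height 26 < 28
    {"image": "ok.png", "image_wh": [[640, 480]]},
]
kept, dropped = filter_samples(samples)
print(len(kept), len(dropped))  # -> 1 1
```

Running a pass like this over the annotation file once is cheaper than paying a failed fetch plus retry for every bad sample on every epoch.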
 58%|█████▊ | 12925/22095 [22:06:14<9:26:40, 3.71s/it] {'loss': 0.3132, 'grad_norm': 0.6818995713564633, 'learning_rate': 3.877292018249543e-06, 'epoch': 0.58}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44623 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42587 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69016 > 40960). Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12926/22095 [22:06:23<13:51:20, 5.44s/it] {'loss': 0.4698, 'grad_norm': 0.38136703826778795, 'learning_rate': 3.8765778247964e-06, 'epoch': 0.59}
 59%|█████▊ | 12927/22095 [22:06:31<15:28:44, 6.08s/it] {'loss': 0.4696, 'grad_norm': 0.3412002803460328, 'learning_rate': 3.875863655482149e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 59%|█████▊ | 12928/22095 [22:06:35<13:35:21, 5.34s/it] {'loss': 0.343, 'grad_norm': 0.5931125850967812, 'learning_rate': 3.875149510322137e-06, 'epoch': 0.59}
Token indices sequence length is longer than the specified maximum sequence length for this model (41366 > 40960).
Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12929/22095 [22:06:38<11:56:22, 4.69s/it] {'loss': 0.3706, 'grad_norm': 0.6466884623426743, 'learning_rate': 3.8744353893317075e-06, 'epoch': 0.59}
 59%|█████▊ | 12930/22095 [22:06:41<10:51:56, 4.27s/it] {'loss': 0.2481, 'grad_norm': 0.6434405477840357, 'learning_rate': 3.873721292526202e-06, 'epoch': 0.59}
 59%|█████▊ | 12931/22095 [22:06:44<9:52:17, 3.88s/it] {'loss': 0.2998, 'grad_norm': 0.7373483580330585, 'learning_rate': 3.8730072199209705e-06, 'epoch': 0.59}
 59%|█████▊ | 12932/22095 [22:06:48<9:59:41, 3.93s/it] {'loss': 0.3174, 'grad_norm': 0.9587743096576387, 'learning_rate': 3.87229317153135e-06, 'epoch': 0.59}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12933/22095 [22:06:52<10:08:23, 3.98s/it] {'loss': 0.3463, 'grad_norm': 0.6359927738078779, 'learning_rate': 3.871579147372685e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (50147 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72217 > 40960).
Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12934/22095 [22:07:02<14:26:30, 5.68s/it] {'loss': 0.4735, 'grad_norm': 0.34809681752623584, 'learning_rate': 3.870865147460319e-06, 'epoch': 0.59}
Token indices sequence length is longer than the specified maximum sequence length for this model (44882 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53662 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84536 > 40960). Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12935/22095 [22:07:05<12:35:02, 4.95s/it] {'loss': 0.3279, 'grad_norm': 0.6740694498628458, 'learning_rate': 3.870151171809596e-06, 'epoch': 0.59}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12936/22095 [22:07:09<11:26:13, 4.50s/it] {'loss': 0.3176, 'grad_norm': 0.6666175522244245, 'learning_rate': 3.869437220435851e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 59%|█████▊ | 12937/22095 [22:07:18<15:12:49, 5.98s/it] {'loss': 0.474, 'grad_norm': 0.31852509391543643, 'learning_rate': 3.868723293354429e-06, 'epoch': 0.59}
Token indices sequence length is longer than the specified maximum sequence length for this model (77565 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107007 > 40960). Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12938/22095 [22:07:22<13:42:32, 5.39s/it] {'loss': 0.3437, 'grad_norm': 0.6416405830095192, 'learning_rate': 3.8680093905806695e-06, 'epoch': 0.59}
 59%|█████▊ | 12939/22095 [22:07:26<12:25:06, 4.88s/it] {'loss': 0.2698, 'grad_norm': 0.608559978948162, 'learning_rate': 3.86729551212991e-06, 'epoch': 0.59}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12940/22095 [22:07:29<11:01:16, 4.33s/it] {'loss': 0.3077, 'grad_norm': 0.6460179817947529, 'learning_rate': 3.866581658017492e-06, 'epoch': 0.59}
 59%|█████▊ | 12941/22095 [22:07:33<10:55:37, 4.30s/it] {'loss': 0.3642, 'grad_norm': 0.6021138283075794, 'learning_rate': 3.865867828258754e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308070 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2JfXEnmFjpuFjSszhXXaBuVXa_!!2819123425.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nDetect the text in the image and tell me the content.'}, {'from': 'gpt', 'value': 'All words in the image:\n展架海报\n定制\n买4送1更优惠\n房屋\n健身俱乐部\n托管\nAHEALTHCLUB\n瘦身健康从这里开始YOURTEXTHERE\n去悟\n空找\n6\n房\n元起\n仅售\n¥\n.8'}]}
 59%|█████▊ | 12942/22095 [22:07:43<15:30:46, 6.10s/it] {'loss': 0.5104, 'grad_norm': 0.3031943483218063, 'learning_rate': 3.865154022869032e-06, 'epoch': 0.59}
 59%|█████▊ | 12943/22095 [22:07:51<16:49:21, 6.62s/it] {'loss': 0.4606, 'grad_norm': 0.32523217420743983, 'learning_rate': 3.864440241863665e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12944/22095 [22:07:55<14:32:25, 5.72s/it] {'loss': 0.337, 'grad_norm': 0.6740941885338018, 'learning_rate': 3.86372648525799e-06, 'epoch': 0.59}
 59%|█████▊ | 12945/22095 [22:07:58<12:52:21, 5.06s/it] {'loss': 0.307, 'grad_norm': 0.7082381442153999, 'learning_rate': 3.863012753067343e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 59%|█████▊ | 12946/22095 [22:08:08<16:18:09, 6.41s/it] {'loss': 0.4816, 'grad_norm': 0.3069995352713124, 'learning_rate': 3.862299045307058e-06, 'epoch': 0.59}
 59%|█████▊ | 12947/22095 [22:08:11<13:48:05, 5.43s/it] {'loss': 0.3188, 'grad_norm': 0.6359988599338807, 'learning_rate': 3.861585361992474e-06, 'epoch': 0.59}
Token indices sequence length is longer than the specified maximum sequence length for this model (80096 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59798 > 40960). Running this sequence through the model will result in indexing errors
 59%|█████▊ | 12948/22095 [22:08:15<12:40:59, 4.99s/it] {'loss': 0.2627, 'grad_norm': 0.6044620031315169, 'learning_rate': 3.860871703138925e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 59%|█████▊ | 12949/22095 [22:08:22<14:35:46, 5.75s/it] {'loss': 0.4874, 'grad_norm': 0.28995056119402646, 'learning_rate': 3.860158068761743e-06, 'epoch': 0.59}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [12, 103, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8379216 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 103, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46000, 'image': 'vrdu_table_final_2/astro-ph.CO/a466fef3-3d47-4138-8558-098d6f4a2712.png', 'image_wh': [[12, 103]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}c@{}c@{}} - \\\\ - \\\\ - \\\\ - \\end{tabular}\n```'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12950/22095 [22:08:26<12:46:18, 5.03s/it] {'loss': 0.3463, 'grad_norm': 0.6203997084408709, 'learning_rate': 3.859444458876264e-06, 'epoch': 0.59}
 59%|█████▊ | 12951/22095 [22:08:29<11:35:28, 4.56s/it] {'loss': 0.2999, 'grad_norm': 0.5766424550001963, 'learning_rate': 3.85873087349782e-06, 'epoch': 0.59}
 59%|█████▊ | 12952/22095 [22:08:32<10:31:09, 4.14s/it] {'loss': 0.3003, 'grad_norm': 0.6454390235375199, 'learning_rate': 3.8580173126417455e-06, 'epoch': 0.59}
 59%|█████▊ | 12953/22095 [22:08:36<9:59:37, 3.94s/it] {'loss': 0.3141, 'grad_norm': 0.6459639638157862, 'learning_rate': 3.857303776323371e-06, 'epoch': 0.59}
 59%|█████▊ | 12954/22095 [22:08:39<9:32:55, 3.76s/it] {'loss': 0.3233, 'grad_norm': 0.674173137586515, 'learning_rate': 3.85659026455803e-06, 'epoch': 0.59}
 59%|█████▊ | 12955/22095 [22:08:43<9:19:01, 3.67s/it] {'loss': 0.3477, 'grad_norm': 0.6182253537794538, 'learning_rate': 3.855876777361051e-06, 'epoch': 0.59}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 59%|█████▊ | 12956/22095 [22:08:52<13:51:29, 5.46s/it] {'loss': 0.4931, 'grad_norm': 0.30382783121496, 'learning_rate': 3.855163314747765e-06, 'epoch': 0.59}
 59%|█████▊ | 12957/22095 [22:08:55<12:02:46, 4.75s/it] {'loss': 0.3316, 'grad_norm': 0.6468381071406778, 'learning_rate': 3.854449876733507e-06, 'epoch': 0.59}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 59%|█████▊ | 12958/22095 [22:08:59<11:00:11, 4.34s/it] {'loss': 0.3195, 'grad_norm': 0.6063408189941445, 'learning_rate': 3.8537364633336e-06, 'epoch': 0.59}
 59%|█████▊ | 12959/22095 [22:09:02<10:20:43, 4.08s/it] {'loss': 0.2697, 'grad_norm': 0.5842726588714356, 'learning_rate': 3.853023074563376e-06, 'epoch': 0.59}
 59%|█████▊ | 12960/22095 [22:09:06<10:18:47, 4.06s/it] {'loss': 0.3199, 'grad_norm': 0.5781391825709464, 'learning_rate': 3.852309710438165e-06, 'epoch': 0.59}
 59%|█████▊ | 12961/22095 [22:09:10<9:58:58, 3.93s/it] {'loss': 0.333, 'grad_norm': 0.6755210754771175, 'learning_rate': 3.851596370973292e-06, 'epoch': 0.59}
 59%|█████▊ | 12962/22095 [22:09:14<10:05:18, 3.98s/it] {'loss': 0.3779, 'grad_norm': 0.6235477776992845, 'learning_rate': 3.850883056184087e-06, 'epoch': 0.59}
 59%|█████▊ | 12963/22095 [22:09:17<9:28:19, 3.73s/it] {'loss': 0.2543, 'grad_norm': 0.5812073917967391, 'learning_rate': 3.850169766085874e-06, 'epoch': 0.59}
 59%|█████▊ | 12964/22095 [22:09:20<9:02:28, 3.56s/it] {'loss': 0.3029, 'grad_norm': 0.6138726206796495, 'learning_rate': 3.849456500693985e-06, 'epoch': 0.59}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    self.list_data_dict[i].get("height", 100),
ValueError: Number of image tokens ['data/ant-design/tree/other_screenshot/original/CustomHierarchicalView_1741951688.260986.png'] does not match number of images None
[Try #0] Failed to fetch sample 1843732 in VC:s3://gui-agent/jedi/images/final_1.5m/final_1.5m_extracted/. Exception: Number of image tokens ['data/ant-design/tree/other_screenshot/original/CustomHierarchicalView_1741951688.260986.png'] does not match number of images None
Problematic sample: {'image': 'data/ant-design/tree/other_screenshot/original/CustomHierarchicalView_1741951688.260986.png', 'conversations': []}
 59%|█████▊ | 12965/22095 [22:09:24<8:58:31, 3.54s/it] {'loss': 0.2815, 'grad_norm': 0.6533874774901375, 'learning_rate': 3.848743260023739e-06, 'epoch': 0.59}
 59%|█████▊ | 12966/22095 [22:09:27<8:46:29, 3.46s/it] {'loss': 0.3062, 'grad_norm': 1.647690739910162, 'learning_rate': 3.848030044090464e-06, 'epoch': 0.59}
Token indices sequence length is longer than the specified maximum sequence length for this model (78028 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61627 > 40960).
Running this sequence through the model will result in indexing errors 59%|█████▊ | 12967/22095 [22:09:31<8:55:27, 3.52s/it] {'loss': 0.2889, 'grad_norm': 0.751645176452105, 'learning_rate': 3.847316852909488e-06, 'epoch': 0.59} 59%|█████▊ | 12967/22095 [22:09:31<8:55:27, 3.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71079 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54452 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51357 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111025 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▊ | 12968/22095 [22:09:34<8:51:14, 3.49s/it] {'loss': 0.3171, 'grad_norm': 0.6473090293190665, 'learning_rate': 3.8466036864961315e-06, 'epoch': 0.59} 59%|█████▊ | 12968/22095 [22:09:34<8:51:14, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45496 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52848 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120969 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89482 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100550 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▊ | 12969/22095 [22:09:38<9:29:40, 3.75s/it] {'loss': 0.3188, 'grad_norm': 0.6242772006515882, 'learning_rate': 3.845890544865718e-06, 'epoch': 0.59} 59%|█████▊ | 12969/22095 [22:09:38<9:29:40, 3.75s/it] 59%|█████▊ | 12970/22095 [22:09:42<9:15:15, 3.65s/it] {'loss': 0.3178, 'grad_norm': 0.5576862503082809, 'learning_rate': 3.845177428033574e-06, 'epoch': 0.59} 59%|█████▊ | 12970/22095 [22:09:42<9:15:15, 3.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366675 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33421, 'image': 'vrdu_table_final_2/astro-ph.CO/f355ef15-2d82-4b43-9fe3-7fb3a2dc1a7e.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 59%|█████▊ | 12971/22095 [22:09:45<9:07:13, 3.60s/it] {'loss': 0.3154, 'grad_norm': 0.6225941302204959, 'learning_rate': 3.84446433601502e-06, 'epoch': 0.59} 59%|█████▊ | 12971/22095 [22:09:45<9:07:13, 3.60s/it] 59%|█████▊ | 12972/22095 [22:09:49<9:23:23, 3.71s/it] {'loss': 0.355, 'grad_norm': 0.7071677053660017, 'learning_rate': 3.843751268825378e-06, 'epoch': 0.59} 59%|█████▊ | 12972/22095 [22:09:49<9:23:23, 3.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▊ | 12973/22095 [22:09:52<8:53:24, 3.51s/it] {'loss': 0.2623, 'grad_norm': 0.6348094595951862, 'learning_rate': 3.843038226479971e-06, 'epoch': 0.59} 59%|█████▊ | 12973/22095 [22:09:52<8:53:24, 3.51s/it] 59%|█████▊ | 12974/22095 [22:09:57<9:40:58, 3.82s/it] {'loss': 0.318, 'grad_norm': 0.6802360931954542, 'learning_rate': 3.842325208994117e-06, 'epoch': 0.59} 59%|█████▊ | 12974/22095 [22:09:57<9:40:58, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▊ | 12975/22095 [22:10:07<14:09:23, 5.59s/it] {'loss': 0.474, 'grad_norm': 0.3432497848458775, 'learning_rate': 3.84161221638314e-06, 'epoch': 0.59} 59%|█████▊ | 12975/22095 [22:10:07<14:09:23, 5.59s/it] 59%|█████▊ | 12976/22095 [22:10:14<15:28:16, 6.11s/it] {'loss': 0.4543, 'grad_norm': 0.3179869040291361, 'learning_rate': 3.840899248662358e-06, 'epoch': 0.59} 59%|█████▊ | 12976/22095 [22:10:14<15:28:16, 6.11s/it] 59%|█████▊ | 12977/22095 [22:10:24<18:07:15, 7.15s/it] {'loss': 0.4646, 'grad_norm': 
0.30280561510928294, 'learning_rate': 3.840186305847094e-06, 'epoch': 0.59} 59%|█████▊ | 12977/22095 [22:10:24<18:07:15, 7.15s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 59%|█████▊ | 12978/22095 [22:10:27<15:12:25, 6.00s/it] {'loss': 0.2744, 'grad_norm': 0.588527362911705, 'learning_rate': 3.839473387952662e-06, 'epoch': 0.59} 59%|█████▊ | 12978/22095 [22:10:27<15:12:25, 6.00s/it] 59%|█████▊ | 12979/22095 [22:10:31<13:26:27, 5.31s/it] {'loss': 0.3408, 'grad_norm': 0.6391647492314123, 'learning_rate': 3.8387604949943816e-06, 'epoch': 0.59} 59%|█████▊ | 12979/22095 [22:10:31<13:26:27, 5.31s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8337287 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3909, 'image': 'vrdu_table_final_2/astro-ph.CO/985b6e65-5c63-45e9-a3ef-18a638b01f55.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 59%|█████▊ | 12980/22095 [22:10:34<11:41:47, 4.62s/it] {'loss': 0.3547, 'grad_norm': 0.6678100391164818, 'learning_rate': 3.8380476269875745e-06, 'epoch': 0.59} 59%|█████▊ | 12980/22095 [22:10:34<11:41:47, 4.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66148 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 12981/22095 [22:10:37<10:37:30, 4.20s/it] {'loss': 0.3443, 'grad_norm': 0.6755665912320579, 'learning_rate': 3.837334783947553e-06, 'epoch': 0.59} 59%|█████▉ | 12981/22095 [22:10:37<10:37:30, 4.20s/it] 59%|█████▉ | 12982/22095 [22:10:41<10:22:17, 4.10s/it] {'loss': 0.2746, 'grad_norm': 0.6287999602413568, 'learning_rate': 3.836621965889637e-06, 'epoch': 0.59} 59%|█████▉ | 12982/22095 [22:10:41<10:22:17, 4.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 12983/22095 [22:10:44<9:39:33, 3.82s/it] {'loss': 0.3053, 'grad_norm': 1.2565627186637056, 'learning_rate': 3.8359091728291426e-06, 'epoch': 0.59} 59%|█████▉ | 12983/22095 [22:10:44<9:39:33, 3.82s/it] 59%|█████▉ | 12984/22095 [22:10:47<8:53:04, 3.51s/it] {'loss': 0.3133, 'grad_norm': 0.6786037263914724, 'learning_rate': 3.835196404781383e-06, 'epoch': 0.59} 59%|█████▉ | 12984/22095 [22:10:47<8:53:04, 3.51s/it] 59%|█████▉ | 12985/22095 [22:10:50<8:26:56, 3.34s/it] {'loss': 0.3334, 'grad_norm': 0.6310320695975805, 'learning_rate': 3.834483661761676e-06, 'epoch': 0.59} 59%|█████▉ | 12985/22095 [22:10:50<8:26:56, 
3.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 12986/22095 [22:10:54<8:55:20, 3.53s/it] {'loss': 0.3574, 'grad_norm': 0.6925419971381628, 'learning_rate': 3.8337709437853365e-06, 'epoch': 0.59} 59%|█████▉ | 12986/22095 [22:10:54<8:55:20, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948712 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71865, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB段上有两个点C和D,AD=\\ frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nA. 2\nB. 3\nC. 4\nD. 
1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 59%|█████▉ | 12987/22095 [22:11:02<12:22:04, 4.89s/it] {'loss': 0.4888, 'grad_norm': 0.4543710896067492, 'learning_rate': 3.833058250867677e-06, 'epoch': 0.59} 59%|█████▉ | 12987/22095 [22:11:02<12:22:04, 4.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 12988/22095 [22:11:05<11:12:54, 4.43s/it] {'loss': 0.3251, 'grad_norm': 0.6252709175360854, 'learning_rate': 3.83234558302401e-06, 'epoch': 0.59} 59%|█████▉ | 12988/22095 [22:11:05<11:12:54, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 12989/22095 [22:11:11<12:45:22, 5.04s/it] {'loss': 0.4555, 'grad_norm': 0.35321906291413185, 'learning_rate': 3.8316329402696524e-06, 'epoch': 0.59} 59%|█████▉ | 12989/22095 [22:11:11<12:45:22, 5.04s/it]Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43368 > 40960) for 4 sample(s). Truncating to 39151 with 3 samples. 
59%|█████▉ | 12990/22095 [22:11:15<11:30:02, 4.55s/it] {'loss': 0.354, 'grad_norm': 0.8580523387406522, 'learning_rate': 3.8309203226199145e-06, 'epoch': 0.59} 59%|█████▉ | 12990/22095 [22:11:15<11:30:02, 4.55s/it] 59%|█████▉ | 12991/22095 [22:11:18<10:15:56, 4.06s/it] {'loss': 0.3289, 'grad_norm': 0.648674379215099, 'learning_rate': 3.830207730090108e-06, 'epoch': 0.59} 59%|█████▉ | 12991/22095 [22:11:18<10:15:56, 4.06s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified cur_len = sum( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image raise ValueError("tcs_loader is required to load image from s3://") File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__ duration = round((time.time() - start_time) * 1000, 2) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader img = Image.open(buff) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert self.load() File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load raise _get_oserror(err_code, encoder=False) OSError: unrecognized data stream contents when reading image file [Try #0] Failed to fetch sample 6777476 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file Problematic sample: {'image': 'autocad/20250508_132635_1/images/before_screenshot_1_id_37_function_0_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': '\nClick the Insert button in the Annotation panel to add a block or drawing to your current project'}, {'from': 'gpt', 'value': '\nclick(x=0.4649, y=0.2042)\n'}], 'width': 3024, 'height': 1964} 59%|█████▉ | 12992/22095 [22:11:22<10:07:29, 4.00s/it] {'loss': 0.3033, 'grad_norm': 0.6421814724828729, 'learning_rate': 3.829495162695543e-06, 'epoch': 0.59} 59%|█████▉ | 12992/22095 [22:11:22<10:07:29, 4.00s/it] 59%|█████▉ | 12993/22095 [22:11:24<9:12:00, 3.64s/it] {'loss': 0.3507, 'grad_norm': 0.6463585445701963, 'learning_rate': 3.828782620451535e-06, 'epoch': 0.59} 59%|█████▉ | 12993/22095 [22:11:24<9:12:00, 3.64s/it] 59%|█████▉ | 12994/22095 [22:11:28<9:18:07, 3.68s/it] {'loss': 0.3406, 'grad_norm': 0.5912068879481209, 'learning_rate': 3.828070103373389e-06, 'epoch': 0.59} 59%|█████▉ | 12994/22095 [22:11:28<9:18:07, 3.68s/it] 59%|█████▉ | 12995/22095 [22:11:31<8:49:15, 3.49s/it] {'loss': 0.2994, 'grad_norm': 0.6465651509972019, 'learning_rate': 3.8273576114764176e-06, 'epoch': 0.59} 59%|█████▉ | 12995/22095 [22:11:31<8:49:15, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85891 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76872 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55361 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 12996/22095 [22:11:34<8:22:59, 3.32s/it] {'loss': 0.2897, 'grad_norm': 0.6432592889234297, 'learning_rate': 3.8266451447759315e-06, 'epoch': 0.59} 59%|█████▉ | 12996/22095 [22:11:34<8:22:59, 3.32s/it] 59%|█████▉ | 12997/22095 [22:11:37<8:00:26, 3.17s/it] {'loss': 0.3524, 'grad_norm': 0.640912875516796, 'learning_rate': 3.825932703287236e-06, 'epoch': 0.59} 59%|█████▉ | 12997/22095 [22:11:37<8:00:26, 3.17s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [364, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8481721 in VC:s3://internvl-moe-sft-data/. Exception: Image size [364, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 30124, 'image': 'vrdu_texteq/astro-ph.CO/91486163-9b08-4267-a28c-b7178b9180e9.png', 'image_wh': [[364, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $\\mathcal{V}$ is a convex closed set'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 12998/22095 [22:11:45<11:40:02, 4.62s/it] {'loss': 0.4835, 'grad_norm': 0.441436437662703, 'learning_rate': 3.8252202870256395e-06, 'epoch': 0.59} 59%|█████▉ | 12998/22095 [22:11:45<11:40:02, 4.62s/it] 59%|█████▉ | 12999/22095 [22:11:53<14:34:09, 5.77s/it] {'loss': 0.4494, 'grad_norm': 0.39010100191764097, 'learning_rate': 3.824507896006454e-06, 'epoch': 0.59} 59%|█████▉ | 12999/22095 [22:11:53<14:34:09, 5.77s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (45204 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81714 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13000/22095 [22:11:57<12:37:52, 5.00s/it] {'loss': 0.2936, 'grad_norm': 0.6478115035019769, 'learning_rate': 3.823795530244982e-06, 'epoch': 0.59} 59%|█████▉ | 13000/22095 [22:11:57<12:37:52, 5.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46332 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13001/22095 [22:12:00<11:24:14, 4.51s/it] {'loss': 0.3461, 'grad_norm': 0.5908031580738624, 'learning_rate': 3.823083189756531e-06, 'epoch': 0.59} 59%|█████▉ | 13001/22095 [22:12:00<11:24:14, 4.51s/it] 59%|█████▉ | 13002/22095 [22:12:05<11:26:46, 4.53s/it] {'loss': 0.3281, 'grad_norm': 0.6325161225022814, 'learning_rate': 3.822370874556408e-06, 'epoch': 0.59} 59%|█████▉ | 13002/22095 [22:12:05<11:26:46, 4.53s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8347880 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 14547, 'image': 'vrdu_table_final_2/astro-ph.CO/54ce0b94-d725-42be-9b7f-9c8f343ad4c5.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 59%|█████▉ | 13003/22095 [22:12:08<10:53:27, 4.31s/it] {'loss': 0.3767, 'grad_norm': 0.6765632817653675, 'learning_rate': 3.821658584659918e-06, 'epoch': 0.59} 59%|█████▉ | 13003/22095 [22:12:08<10:53:27, 4.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13004/22095 [22:12:17<14:02:57, 5.56s/it] {'loss': 0.4932, 'grad_norm': 0.3610735045231856, 'learning_rate': 3.820946320082366e-06, 'epoch': 0.59} 59%|█████▉ | 13004/22095 [22:12:17<14:02:57, 5.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44809 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76278 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13005/22095 [22:12:20<12:32:21, 4.97s/it] {'loss': 0.3134, 'grad_norm': 0.7040019211616559, 'learning_rate': 3.820234080839057e-06, 'epoch': 0.59} 59%|█████▉ | 13005/22095 [22:12:20<12:32:21, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64168 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48647 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13006/22095 [22:12:23<10:55:43, 4.33s/it] {'loss': 0.3553, 'grad_norm': 0.6711299751775798, 'learning_rate': 3.819521866945295e-06, 'epoch': 0.59} 59%|█████▉ | 13006/22095 [22:12:23<10:55:43, 4.33s/it] 59%|█████▉ | 13007/22095 [22:12:26<10:01:36, 3.97s/it] {'loss': 0.3059, 'grad_norm': 0.5931720760249749, 'learning_rate': 3.81880967841638e-06, 'epoch': 0.59} 59%|█████▉ | 13007/22095 [22:12:26<10:01:36, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13008/22095 [22:12:36<14:13:57, 5.64s/it] {'loss': 0.4693, 'grad_norm': 0.36097307950893653, 'learning_rate': 3.818097515267618e-06, 'epoch': 0.59} 59%|█████▉ | 13008/22095 [22:12:36<14:13:57, 5.64s/it] 59%|█████▉ | 13009/22095 [22:12:39<12:24:20, 4.92s/it] {'loss': 0.2972, 'grad_norm': 0.772885550432615, 'learning_rate': 3.817385377514312e-06, 'epoch': 0.59} 59%|█████▉ | 13009/22095 [22:12:39<12:24:20, 4.92s/it] 59%|█████▉ | 13010/22095 [22:12:42<11:01:31, 4.37s/it] {'loss': 0.3071, 'grad_norm': 0.6133863749155195, 'learning_rate': 3.816673265171762e-06, 'epoch': 0.59} 59%|█████▉ | 13010/22095 [22:12:42<11:01:31, 4.37s/it] 59%|█████▉ | 13011/22095 [22:12:45<9:55:48, 3.94s/it] {'loss': 0.3196, 'grad_norm': 0.6605504314425381, 'learning_rate': 3.815961178255267e-06, 'epoch': 0.59} 59%|█████▉ | 13011/22095 [22:12:45<9:55:48, 3.94s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8592981 in VC:s3://mm-dataset/ocrvqa/images/. 
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 22543, 'image': '312155344.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a pharmaceutical book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 59%|█████▉ | 13012/22095 [22:12:49<9:46:14, 3.87s/it] {'loss': 0.297, 'grad_norm': 0.5994473366414182, 'learning_rate': 3.815249116780133e-06, 'epoch': 0.59} 59%|█████▉ | 13012/22095 [22:12:49<9:46:14, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13013/22095 [22:12:58<14:02:31, 5.57s/it] {'loss': 0.4696, 'grad_norm': 0.31147458714389, 'learning_rate': 3.8145370807616545e-06, 'epoch': 0.59} 59%|█████▉ | 13013/22095 [22:12:58<14:02:31, 5.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13014/22095 [22:13:08<16:59:51, 6.74s/it] {'loss': 0.4779, 'grad_norm': 0.27633811696298244, 'learning_rate': 3.8138250702151336e-06, 'epoch': 0.59} 59%|█████▉ | 13014/22095 [22:13:08<16:59:51, 6.74s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 59%|█████▉ | 13015/22095 [22:13:12<14:52:19, 5.90s/it] {'loss': 0.3148, 'grad_norm': 0.6898232002811877, 'learning_rate': 3.8131130851558696e-06, 'epoch': 0.59} 59%|█████▉ | 13015/22095 [22:13:12<14:52:19, 
5.90s/it] 59%|█████▉ | 13016/22095 [22:13:15<12:49:55, 5.09s/it] {'loss': 0.3316, 'grad_norm': 0.6506935341637908, 'learning_rate': 3.81240112559916e-06, 'epoch': 0.59} 59%|█████▉ | 13016/22095 [22:13:15<12:49:55, 5.09s/it] 59%|█████▉ | 13017/22095 [22:13:18<11:15:49, 4.47s/it] {'loss': 0.2769, 'grad_norm': 0.6508169215201881, 'learning_rate': 3.811689191560301e-06, 'epoch': 0.59} 59%|█████▉ | 13017/22095 [22:13:18<11:15:49, 4.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73015 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115788 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13018/22095 [22:13:22<10:34:51, 4.20s/it] {'loss': 0.2914, 'grad_norm': 0.659463492837663, 'learning_rate': 3.8109772830545933e-06, 'epoch': 0.59} 59%|█████▉ | 13018/22095 [22:13:22<10:34:51, 4.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68642 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13019/22095 [22:13:24<9:32:07, 3.78s/it] {'loss': 0.3229, 'grad_norm': 0.6711464673967386, 'learning_rate': 3.8102654000973326e-06, 'epoch': 0.59} 59%|█████▉ | 13019/22095 [22:13:24<9:32:07, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49676 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74468 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62488 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45611 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13020/22095 [22:13:28<9:36:54, 3.81s/it] {'loss': 0.3426, 'grad_norm': 0.6483412067025784, 'learning_rate': 3.8095535427038134e-06, 'epoch': 0.59} 59%|█████▉ | 13020/22095 [22:13:28<9:36:54, 3.81s/it] 59%|█████▉ | 13021/22095 [22:13:32<9:14:19, 3.67s/it] {'loss': 0.2988, 'grad_norm': 0.5998729560447539, 'learning_rate': 3.808841710889332e-06, 'epoch': 0.59} 59%|█████▉ | 13021/22095 [22:13:32<9:14:19, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51313 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110176 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13022/22095 [22:13:35<8:47:26, 3.49s/it] {'loss': 0.3056, 'grad_norm': 0.6400881001941805, 'learning_rate': 3.808129904669186e-06, 'epoch': 0.59} 59%|█████▉ | 13022/22095 [22:13:35<8:47:26, 3.49s/it] 59%|█████▉ | 13023/22095 [22:13:38<8:27:36, 3.36s/it] {'loss': 0.2728, 'grad_norm': 0.7228179397581488, 'learning_rate': 3.807418124058665e-06, 'epoch': 0.59} 59%|█████▉ | 13023/22095 [22:13:38<8:27:36, 3.36s/it] 59%|█████▉ | 13024/22095 [22:13:42<8:57:31, 3.56s/it] {'loss': 0.318, 'grad_norm': 0.646531452310446, 'learning_rate': 3.8067063690730672e-06, 'epoch': 0.59} 59%|█████▉ | 13024/22095 [22:13:42<8:57:31, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45987 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13025/22095 [22:13:45<8:54:04, 3.53s/it] {'loss': 0.3524, 'grad_norm': 0.6952376992340696, 'learning_rate': 3.8059946397276854e-06, 'epoch': 0.59} 59%|█████▉ | 13025/22095 [22:13:45<8:54:04, 3.53s/it]Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307520 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB24S.IXef85uJjSZFtXXa4bVXa_!!3299624543.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide the text from this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n老寒腿\n关节不适\n送\n~\n炭\n炭\n竹\n护膝\n舒适养护\n膝部问题一款搞定'}]} 59%|█████▉ | 13026/22095 [22:13:55<13:14:17, 5.26s/it] {'loss': 0.5017, 'grad_norm': 0.4455158133737101, 'learning_rate': 3.805282936037811e-06, 'epoch': 0.59} 59%|█████▉ | 13026/22095 [22:13:55<13:14:17, 5.26s/it] 59%|█████▉ | 13027/22095 [22:14:03<15:43:50, 6.25s/it] {'loss': 0.4685, 'grad_norm': 0.3768748833292996, 'learning_rate': 3.8045712580187356e-06, 'epoch': 0.59} 59%|█████▉ | 13027/22095 [22:14:03<15:43:50, 6.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51999 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55644 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57913 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50267 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54150 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13028/22095 [22:14:13<18:11:44, 7.22s/it] {'loss': 0.4487, 'grad_norm': 0.2977798946241966, 'learning_rate': 3.803859605685754e-06, 'epoch': 0.59} 59%|█████▉ | 13028/22095 [22:14:13<18:11:44, 7.22s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 59%|█████▉ | 13029/22095 [22:14:16<15:30:38, 6.16s/it] {'loss': 0.3178, 'grad_norm': 0.7536806509562007, 'learning_rate': 3.803147979054155e-06, 'epoch': 0.59} 59%|█████▉ | 13029/22095 [22:14:16<15:30:38, 6.16s/it] 59%|█████▉ | 13030/22095 [22:14:22<15:02:31, 5.97s/it] {'loss': 0.4742, 'grad_norm': 0.34640164397571077, 'learning_rate': 3.8024363781392304e-06, 'epoch': 0.59} 59%|█████▉ | 13030/22095 [22:14:22<15:02:31, 5.97s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 59%|█████▉ | 13031/22095 [22:14:25<13:15:37, 5.27s/it] {'loss': 0.2978, 'grad_norm': 0.6354442806532212, 'learning_rate': 3.8017248029562713e-06, 'epoch': 0.59} 59%|█████▉ | 13031/22095 [22:14:25<13:15:37, 5.27s/it] 59%|█████▉ | 13032/22095 [22:14:29<12:01:42, 4.78s/it] {'loss': 0.3283, 'grad_norm': 0.6308339653666732, 'learning_rate': 3.8010132535205634e-06, 'epoch': 0.59} 59%|█████▉ | 
13032/22095 [22:14:29<12:01:42, 4.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13033/22095 [22:14:37<14:08:52, 5.62s/it] {'loss': 0.4627, 'grad_norm': 0.43622553607956044, 'learning_rate': 3.8003017298474e-06, 'epoch': 0.59} 59%|█████▉ | 13033/22095 [22:14:37<14:08:52, 5.62s/it] 59%|█████▉ | 13034/22095 [22:14:41<13:00:23, 5.17s/it] {'loss': 0.2918, 'grad_norm': 0.5934010485802537, 'learning_rate': 3.7995902319520674e-06, 'epoch': 0.59} 59%|█████▉ | 13034/22095 [22:14:41<13:00:23, 5.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13035/22095 [22:14:44<11:21:52, 4.52s/it] {'loss': 0.2955, 'grad_norm': 0.6677173047396981, 'learning_rate': 3.7988787598498543e-06, 'epoch': 0.59} 59%|█████▉ | 13035/22095 [22:14:44<11:21:52, 4.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13036/22095 [22:14:53<15:01:40, 5.97s/it] {'loss': 0.5038, 'grad_norm': 0.38851060414369165, 'learning_rate': 3.7981673135560464e-06, 'epoch': 0.59} 59%|█████▉ | 13036/22095 [22:14:53<15:01:40, 5.97s/it] 59%|█████▉ | 13037/22095 [22:14:57<13:25:41, 5.34s/it] {'loss': 0.3106, 'grad_norm': 0.657627358931422, 'learning_rate': 3.797455893085933e-06, 'epoch': 0.59} 59%|█████▉ | 13037/22095 [22:14:57<13:25:41, 5.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13038/22095 [22:15:00<12:00:33, 4.77s/it] {'loss': 0.3411, 'grad_norm': 0.6333835119681968, 'learning_rate': 3.7967444984548e-06, 'epoch': 0.59} 59%|█████▉ | 13038/22095 [22:15:00<12:00:33, 4.77s/it] 59%|█████▉ | 13039/22095 [22:15:04<11:21:28, 4.52s/it] {'loss': 0.3323, 'grad_norm': 0.5793072590899354, 'learning_rate': 3.796033129677931e-06, 'epoch': 0.59} 59%|█████▉ | 13039/22095 [22:15:04<11:21:28, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for 
this model (71581 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121122 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13040/22095 [22:15:07<10:07:03, 4.02s/it] {'loss': 0.3203, 'grad_norm': 0.6453468287225503, 'learning_rate': 3.7953217867706106e-06, 'epoch': 0.59} 59%|█████▉ | 13040/22095 [22:15:07<10:07:03, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80944 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13041/22095 [22:15:10<9:29:14, 3.77s/it] {'loss': 0.2964, 'grad_norm': 0.6391441798115033, 'learning_rate': 3.794610469748129e-06, 'epoch': 0.59} 59%|█████▉ | 13041/22095 [22:15:10<9:29:14, 3.77s/it] 59%|█████▉ | 13042/22095 [22:15:14<9:05:51, 3.62s/it] {'loss': 0.3161, 'grad_norm': 0.621375675956063, 'learning_rate': 3.793899178625763e-06, 'epoch': 0.59} 59%|█████▉ | 13042/22095 [22:15:14<9:05:51, 3.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13043/22095 [22:15:17<8:53:10, 3.53s/it] {'loss': 0.2955, 'grad_norm': 0.6274388766689246, 'learning_rate': 3.7931879134188002e-06, 'epoch': 0.59} 59%|█████▉ | 13043/22095 [22:15:17<8:53:10, 3.53s/it] 59%|█████▉ | 13044/22095 [22:15:20<8:23:11, 3.34s/it] {'loss': 0.3132, 'grad_norm': 0.6190550008897789, 'learning_rate': 3.7924766741425247e-06, 'epoch': 0.59} 59%|█████▉ | 13044/22095 [22:15:20<8:23:11, 3.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13045/22095 [22:15:23<8:30:52, 3.39s/it] {'loss': 0.3291, 'grad_norm': 0.616290704979202, 'learning_rate': 3.791765460812215e-06, 'epoch': 0.59} 59%|█████▉ | 13045/22095 [22:15:23<8:30:52, 3.39s/it]Token indices sequence 
length is longer than the specified maximum sequence length for this model (107250 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116146 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53274 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13046/22095 [22:15:27<8:46:15, 3.49s/it] {'loss': 0.3059, 'grad_norm': 0.6969178418384471, 'learning_rate': 3.7910542734431537e-06, 'epoch': 0.59} 59%|█████▉ | 13046/22095 [22:15:27<8:46:15, 3.49s/it] 59%|█████▉ | 13047/22095 [22:15:30<8:19:25, 3.31s/it] {'loss': 0.2953, 'grad_norm': 0.6856688562365002, 'learning_rate': 3.7903431120506247e-06, 'epoch': 0.59} 59%|█████▉ | 13047/22095 [22:15:30<8:19:25, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13048/22095 [22:15:37<11:12:14, 4.46s/it] {'loss': 0.4756, 'grad_norm': 0.4143437058883918, 'learning_rate': 3.7896319766499073e-06, 'epoch': 0.59} 59%|█████▉ | 13048/22095 [22:15:37<11:12:14, 4.46s/it] 59%|█████▉ | 13049/22095 [22:15:41<10:32:56, 4.20s/it] {'loss': 0.2771, 'grad_norm': 0.6607130200060778, 'learning_rate': 3.788920867256281e-06, 'epoch': 0.59} 59%|█████▉ | 13049/22095 [22:15:41<10:32:56, 4.20s/it] 59%|█████▉ | 13050/22095 [22:15:44<10:08:19, 4.04s/it] {'loss': 0.3414, 'grad_norm': 0.5988983320355563, 'learning_rate': 3.788209783885024e-06, 'epoch': 0.59} 59%|█████▉ | 13050/22095 [22:15:44<10:08:19, 4.04s/it] 59%|█████▉ | 13051/22095 [22:15:49<10:21:36, 4.12s/it] {'loss': 0.2825, 'grad_norm': 0.6226333656842887, 'learning_rate': 3.7874987265514197e-06, 'epoch': 0.59} 59%|█████▉ | 13051/22095 [22:15:49<10:21:36, 4.12s/it] 59%|█████▉ | 13052/22095 [22:15:52<9:28:03, 3.77s/it] {'loss': 0.2941, 'grad_norm': 
0.6485822725381329, 'learning_rate': 3.786787695270743e-06, 'epoch': 0.59} 59%|█████▉ | 13052/22095 [22:15:52<9:28:03, 3.77s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13053/22095 [22:15:55<8:45:51, 3.49s/it] {'loss': 0.3147, 'grad_norm': 0.6526722775186214, 'learning_rate': 3.7860766900582716e-06, 'epoch': 0.59} 59%|█████▉ | 13053/22095 [22:15:55<8:45:51, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13054/22095 [22:16:04<13:10:17, 5.24s/it] {'loss': 0.4555, 'grad_norm': 0.2947319369351773, 'learning_rate': 3.785365710929286e-06, 'epoch': 0.59} 59%|█████▉ | 13054/22095 [22:16:04<13:10:17, 5.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44802 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56424 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89438 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13055/22095 [22:16:07<11:35:36, 4.62s/it] {'loss': 0.3248, 'grad_norm': 0.6537057936992485, 'learning_rate': 3.784654757899059e-06, 'epoch': 0.59} 59%|█████▉ | 13055/22095 [22:16:07<11:35:36, 4.62s/it] 59%|█████▉ | 13056/22095 [22:16:10<10:44:10, 4.28s/it] {'loss': 0.3591, 'grad_norm': 0.6367926569809269, 'learning_rate': 3.783943830982868e-06, 'epoch': 0.59} 59%|█████▉ | 13056/22095 [22:16:10<10:44:10, 4.28s/it] 59%|█████▉ | 13057/22095 [22:16:14<10:22:23, 4.13s/it] {'loss': 0.2841, 'grad_norm': 0.6043301757986551, 'learning_rate': 3.7832329301959914e-06, 'epoch': 0.59} 59%|█████▉ | 13057/22095 [22:16:14<10:22:23, 4.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77982 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102344 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113343 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79079 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13058/22095 [22:16:17<9:26:40, 3.76s/it] {'loss': 0.3573, 'grad_norm': 0.6727643037951684, 'learning_rate': 3.7825220555537006e-06, 'epoch': 0.59} 59%|█████▉ | 13058/22095 [22:16:17<9:26:40, 3.76s/it] 59%|█████▉ | 13059/22095 [22:16:20<8:42:44, 3.47s/it] {'loss': 0.3215, 'grad_norm': 0.590524442701709, 'learning_rate': 3.781811207071272e-06, 'epoch': 0.59} 59%|█████▉ | 13059/22095 [22:16:20<8:42:44, 3.47s/it] 59%|█████▉ | 13060/22095 [22:16:24<8:48:12, 3.51s/it] {'loss': 0.3183, 'grad_norm': 0.6354132649387176, 'learning_rate': 3.781100384763978e-06, 'epoch': 0.59} 59%|█████▉ | 13060/22095 [22:16:24<8:48:12, 3.51s/it] 59%|█████▉ | 13061/22095 [22:16:27<8:37:08, 3.43s/it] {'loss': 0.2922, 'grad_norm': 0.6075769280663038, 'learning_rate': 3.7803895886470952e-06, 'epoch': 0.59} 59%|█████▉ | 13061/22095 [22:16:27<8:37:08, 3.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8401744 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3908, 'image': 'vrdu_table_final_2/astro-ph.CO/fd803df4-4ecd-42ca-bd18-401c733136c7.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} 59%|█████▉ | 13062/22095 [22:16:30<8:32:49, 3.41s/it] {'loss': 0.3243, 'grad_norm': 0.592386100884859, 'learning_rate': 3.7796788187358934e-06, 'epoch': 0.59} 59%|█████▉ | 13062/22095 [22:16:30<8:32:49, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13063/22095 [22:16:40<13:06:19, 5.22s/it] {'loss': 0.4747, 'grad_norm': 0.3625514389963376, 'learning_rate': 3.778968075045646e-06, 'epoch': 0.59} 59%|█████▉ | 13063/22095 [22:16:40<13:06:19, 5.22s/it] 59%|█████▉ | 13064/22095 [22:16:43<11:54:14, 4.75s/it] {'loss': 0.3174, 'grad_norm': 0.6902941165882858, 'learning_rate': 3.7782573575916255e-06, 'epoch': 0.59} 59%|█████▉ | 13064/22095 [22:16:43<11:54:14, 4.75s/it] 59%|█████▉ | 13065/22095 [22:16:47<11:15:11, 4.49s/it] {'loss': 0.3421, 'grad_norm': 0.6402809000377081, 'learning_rate': 3.7775466663890997e-06, 'epoch': 0.59} 59%|█████▉ | 13065/22095 [22:16:47<11:15:11, 4.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13066/22095 [22:16:57<14:56:06, 5.95s/it] {'loss': 0.4765, 'grad_norm': 0.30137486000398755, 'learning_rate': 3.7768360014533427e-06, 'epoch': 0.59} 59%|█████▉ | 13066/22095 [22:16:57<14:56:06, 5.95s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (104400000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 59%|█████▉ | 13067/22095 [22:17:00<13:05:42, 5.22s/it] {'loss': 0.3262, 'grad_norm': 0.5848318681138999, 'learning_rate': 3.7761253627996245e-06, 'epoch': 0.59} 59%|█████▉ | 13067/22095 [22:17:00<13:05:42, 5.22s/it] 59%|█████▉ | 13068/22095 [22:17:04<11:46:36, 4.70s/it] {'loss': 0.2678, 'grad_norm': 0.6320255325803715, 'learning_rate': 3.7754147504432128e-06, 'epoch': 0.59} 59%|█████▉ | 13068/22095 [22:17:04<11:46:36, 4.70s/it] 59%|█████▉ | 13069/22095 [22:17:07<11:08:45, 4.45s/it] {'loss': 0.3337, 'grad_norm': 0.6638000769349245, 'learning_rate': 3.7747041643993755e-06, 'epoch': 0.59} 59%|█████▉ | 13069/22095 [22:17:07<11:08:45, 4.45s/it] 59%|█████▉ | 13070/22095 [22:17:10<10:00:28, 3.99s/it] {'loss': 0.3151, 'grad_norm': 0.6680733135489192, 'learning_rate': 3.7739936046833856e-06, 'epoch': 0.59} 59%|█████▉ | 13070/22095 [22:17:10<10:00:28, 3.99s/it] 59%|█████▉ | 13071/22095 [22:17:13<9:14:33, 3.69s/it] {'loss': 0.2493, 'grad_norm': 0.6145310273808362, 'learning_rate': 3.773283071310505e-06, 'epoch': 0.59} 59%|█████▉ | 13071/22095 [22:17:13<9:14:33, 3.69s/it] 59%|█████▉ | 13072/22095 [22:17:16<8:49:32, 3.52s/it] {'loss': 0.3029, 'grad_norm': 0.5646272596608798, 'learning_rate': 3.7725725642960047e-06, 'epoch': 0.59} 59%|█████▉ | 13072/22095 [22:17:16<8:49:32, 3.52s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8894384 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 17537, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm'}, {'from': 'gpt', 'value': '【解答】解:∵CB=3cm,DB=5cm,∴CD=5-3=2cm,∵D是AC的中点,∴AC=2CD=4cm,∴AB=AC+CB=4+3=7cm.'}]} 59%|█████▉ | 13073/22095 [22:17:19<8:25:29, 3.36s/it] {'loss': 0.3276, 'grad_norm': 0.5684147601436456, 'learning_rate': 3.7718620836551512e-06, 'epoch': 0.59} 59%|█████▉ | 13073/22095 [22:17:19<8:25:29, 3.36s/it] 59%|█████▉ | 13074/22095 [22:17:23<8:40:18, 3.46s/it] {'loss': 0.3275, 'grad_norm': 0.6525397914151486, 'learning_rate': 3.7711516294032086e-06, 'epoch': 0.59} 59%|█████▉ | 13074/22095 [22:17:23<8:40:18, 3.46s/it] 59%|█████▉ | 13075/22095 [22:17:26<8:26:48, 3.37s/it] {'loss': 0.3035, 'grad_norm': 0.6030261523641939, 'learning_rate': 3.770441201555442e-06, 'epoch': 0.59} 59%|█████▉ | 13075/22095 [22:17:26<8:26:48, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129222 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67611 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71812 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13076/22095 [22:17:30<8:25:46, 3.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (120863 > 40960). 
Running this sequence through the model will result in indexing errors {'loss': 0.3092, 'grad_norm': 0.658476768697074, 'learning_rate': 3.769730800127119e-06, 'epoch': 0.59} 59%|█████▉ | 13076/22095 [22:17:30<8:25:46, 3.36s/it] 59%|█████▉ | 13077/22095 [22:17:33<8:24:07, 3.35s/it] {'loss': 0.3108, 'grad_norm': 0.752410239829877, 'learning_rate': 3.769020425133503e-06, 'epoch': 0.59} 59%|█████▉ | 13077/22095 [22:17:33<8:24:07, 3.35s/it] 59%|█████▉ | 13078/22095 [22:17:37<8:44:03, 3.49s/it] {'loss': 0.3165, 'grad_norm': 0.8037543463094624, 'learning_rate': 3.7683100765898573e-06, 'epoch': 0.59} 59%|█████▉ | 13078/22095 [22:17:37<8:44:03, 3.49s/it] 59%|█████▉ | 13079/22095 [22:17:41<9:06:41, 3.64s/it] {'loss': 0.3113, 'grad_norm': 0.7767337488831968, 'learning_rate': 3.7675997545114435e-06, 'epoch': 0.59} 59%|█████▉ | 13079/22095 [22:17:41<9:06:41, 3.64s/it] 59%|█████▉ | 13080/22095 [22:17:45<9:17:16, 3.71s/it] {'loss': 0.3304, 'grad_norm': 0.6177110144688015, 'learning_rate': 3.7668894589135284e-06, 'epoch': 0.59} 59%|█████▉ | 13080/22095 [22:17:45<9:17:16, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69564 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13081/22095 [22:17:48<8:57:43, 3.58s/it] {'loss': 0.3054, 'grad_norm': 0.6972389230091802, 'learning_rate': 3.76617918981137e-06, 'epoch': 0.59} 59%|█████▉ | 13081/22095 [22:17:48<8:57:43, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (109789 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13082/22095 [22:17:51<8:50:08, 3.53s/it] {'loss': 0.3142, 'grad_norm': 0.594758697082108, 'learning_rate': 3.7654689472202323e-06, 'epoch': 0.59} 59%|█████▉ | 13082/22095 [22:17:51<8:50:08, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58512 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13083/22095 [22:17:54<8:33:59, 3.42s/it] {'loss': 0.3388, 'grad_norm': 0.6565473637710484, 'learning_rate': 3.7647587311553758e-06, 'epoch': 0.59} 59%|█████▉ | 13083/22095 [22:17:54<8:33:59, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (85510 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113719 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13084/22095 [22:18:04<13:03:24, 5.22s/it] {'loss': 0.4423, 'grad_norm': 0.3632395504641611, 'learning_rate': 3.7640485416320586e-06, 'epoch': 0.59} 59%|█████▉ | 13084/22095 [22:18:04<13:03:24, 5.22s/it] 59%|█████▉ | 13085/22095 [22:18:07<11:28:44, 4.59s/it] {'loss': 0.2714, 'grad_norm': 0.6618035706270688, 'learning_rate': 3.763338378665543e-06, 'epoch': 0.59} 59%|█████▉ | 13085/22095 [22:18:07<11:28:44, 4.59s/it] 59%|█████▉ | 13086/22095 [22:18:10<10:22:15, 4.14s/it] {'loss': 0.315, 'grad_norm': 0.9727610599227895, 'learning_rate': 3.762628242271089e-06, 'epoch': 0.59} 59%|█████▉ | 13086/22095 [22:18:10<10:22:15, 4.14s/it] 59%|█████▉ | 13087/22095 [22:18:13<9:26:27, 3.77s/it] {'loss': 0.2623, 'grad_norm': 0.6265858065970411, 'learning_rate': 3.7619181324639526e-06, 'epoch': 0.59} 59%|█████▉ | 13087/22095 [22:18:13<9:26:27, 3.77s/it] 59%|█████▉ | 13088/22095 [22:18:16<8:57:51, 3.58s/it] {'loss': 0.3221, 'grad_norm': 0.658143947875074, 'learning_rate': 3.761208049259393e-06, 'epoch': 0.59} 59%|█████▉ | 13088/22095 [22:18:16<8:57:51, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13089/22095 [22:18:26<13:36:16, 5.44s/it] {'loss': 0.4734, 'grad_norm': 0.2794168498838751, 'learning_rate': 3.760497992672667e-06, 'epoch': 0.59} 59%|█████▉ | 13089/22095 [22:18:26<13:36:16, 5.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87113 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90877 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56774 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13090/22095 [22:18:35<16:21:38, 6.54s/it] {'loss': 0.4726, 'grad_norm': 0.29591340671507, 'learning_rate': 3.7597879627190337e-06, 'epoch': 0.59} 59%|█████▉ | 13090/22095 [22:18:35<16:21:38, 6.54s/it] 59%|█████▉ | 13091/22095 [22:18:41<15:49:51, 6.33s/it] {'loss': 0.4824, 'grad_norm': 0.2957711298433703, 'learning_rate': 3.7590779594137476e-06, 'epoch': 0.59} 59%|█████▉ | 13091/22095 [22:18:41<15:49:51, 6.33s/it] 59%|█████▉ | 13092/22095 [22:18:51<18:33:07, 7.42s/it] {'loss': 0.51, 'grad_norm': 0.29424958786733124, 'learning_rate': 3.758367982772065e-06, 'epoch': 0.59} 59%|█████▉ | 13092/22095 [22:18:51<18:33:07, 7.42s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 59%|█████▉ | 13093/22095 [22:18:54<15:26:28, 6.18s/it] {'loss': 0.322, 'grad_norm': 0.6516448084403705, 'learning_rate': 3.7576580328092416e-06, 'epoch': 0.59} 59%|█████▉ | 13093/22095 [22:18:54<15:26:28, 6.18s/it] 59%|█████▉ | 13094/22095 [22:18:57<13:21:44, 5.34s/it] {'loss': 0.3076, 'grad_norm': 0.6188868863340756, 'learning_rate': 3.7569481095405297e-06, 'epoch': 0.59} 59%|█████▉ | 13094/22095 [22:18:57<13:21:44, 5.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96974 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13095/22095 [22:19:01<11:58:06, 4.79s/it] {'loss': 0.3304, 'grad_norm': 0.643871355009258, 'learning_rate': 3.7562382129811863e-06, 'epoch': 0.59} 59%|█████▉ | 13095/22095 [22:19:01<11:58:06, 4.79s/it] 59%|█████▉ | 13096/22095 [22:19:04<10:28:04, 4.19s/it] {'loss': 0.3142, 'grad_norm': 0.613491725606984, 'learning_rate': 3.755528343146465e-06, 'epoch': 0.59} 59%|█████▉ | 13096/22095 [22:19:04<10:28:04, 4.19s/it] 59%|█████▉ | 13097/22095 [22:19:07<9:30:03, 3.80s/it] {'loss': 0.3098, 'grad_norm': 0.5718218562017449, 'learning_rate': 3.7548185000516163e-06, 'epoch': 0.59} 59%|█████▉ | 13097/22095 [22:19:07<9:30:03, 3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (102532 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94451 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13098/22095 [22:19:14<12:09:17, 4.86s/it] {'loss': 0.4664, 'grad_norm': 0.32929129744003455, 'learning_rate': 3.7541086837118923e-06, 'epoch': 0.59} 59%|█████▉ | 13098/22095 [22:19:14<12:09:17, 4.86s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13099/22095 [22:19:17<10:52:56, 4.35s/it] {'loss': 0.3681, 'grad_norm': 0.7171791450318256, 'learning_rate': 3.7533988941425497e-06, 'epoch': 0.59} 59%|█████▉ | 13099/22095 [22:19:17<10:52:56, 4.35s/it] 59%|█████▉ | 13100/22095 [22:19:20<9:47:46, 3.92s/it] {'loss': 0.2816, 'grad_norm': 0.7508003949385689, 'learning_rate': 3.7526891313588334e-06, 'epoch': 0.59} 59%|█████▉ | 13100/22095 [22:19:20<9:47:46, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50338 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77330 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54411 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13101/22095 [22:19:23<9:00:47, 3.61s/it] {'loss': 0.342, 'grad_norm': 0.6463903925300547, 'learning_rate': 3.7519793953759976e-06, 'epoch': 0.59} 59%|█████▉ | 13101/22095 [22:19:23<9:00:47, 3.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8368297 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 35045, 'image': 'vrdu_table_final_2/astro-ph.CO/8041c175-f552-4f1a-a46d-9069ed6b4ffb.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 59%|█████▉ | 13102/22095 [22:19:27<9:18:54, 3.73s/it] {'loss': 0.3186, 'grad_norm': 0.6034850507917721, 'learning_rate': 3.7512696862092924e-06, 'epoch': 0.59} 59%|█████▉ | 13102/22095 [22:19:27<9:18:54, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (72350 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45073 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51702 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13103/22095 [22:19:36<13:16:44, 5.32s/it] {'loss': 0.4677, 'grad_norm': 0.323643335339662, 'learning_rate': 3.750560003873965e-06, 'epoch': 0.59} 59%|█████▉ | 13103/22095 [22:19:36<13:16:44, 5.32s/it] 59%|█████▉ | 13104/22095 [22:19:39<11:49:20, 4.73s/it] {'loss': 0.3061, 'grad_norm': 0.5797391881618859, 'learning_rate': 3.7498503483852655e-06, 'epoch': 0.59} 59%|█████▉ | 13104/22095 [22:19:39<11:49:20, 4.73s/it] 59%|█████▉ | 13105/22095 [22:19:43<10:44:39, 4.30s/it] {'loss': 0.2891, 'grad_norm': 0.6533239166183596, 'learning_rate': 3.749140719758444e-06, 'epoch': 0.59} 59%|█████▉ | 13105/22095 [22:19:43<10:44:39, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13106/22095 [22:19:54<15:51:08, 6.35s/it] {'loss': 0.4748, 'grad_norm': 0.28435596156172427, 'learning_rate': 3.748431118008747e-06, 'epoch': 0.59} 59%|█████▉ | 13106/22095 [22:19:54<15:51:08, 6.35s/it] 59%|█████▉ | 13107/22095 [22:19:58<14:30:06, 5.81s/it] {'loss': 0.3061, 'grad_norm': 0.6077688964257816, 'learning_rate': 3.7477215431514203e-06, 'epoch': 0.59} 59%|█████▉ | 13107/22095 [22:19:58<14:30:06, 5.81s/it] 59%|█████▉ | 13108/22095 [22:20:01<12:28:44, 5.00s/it] {'loss': 0.3221, 'grad_norm': 0.8999032205253109, 'learning_rate': 3.74701199520171e-06, 'epoch': 0.59} 59%|█████▉ | 13108/22095 [22:20:01<12:28:44, 5.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13109/22095 [22:20:11<15:51:05, 6.35s/it] {'loss': 0.4725, 'grad_norm': 0.34773585807513885, 'learning_rate': 3.7463024741748665e-06, 'epoch': 0.59} 59%|█████▉ | 13109/22095 [22:20:11<15:51:05, 6.35s/it] 59%|█████▉ | 13110/22095 [22:20:16<14:48:14, 5.93s/it] {'loss': 0.3044, 'grad_norm': 
0.6586318888584578, 'learning_rate': 3.745592980086132e-06, 'epoch': 0.59} 59%|█████▉ | 13110/22095 [22:20:16<14:48:14, 5.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13111/22095 [22:20:23<16:02:16, 6.43s/it] {'loss': 0.4663, 'grad_norm': 0.26404863142989515, 'learning_rate': 3.744883512950751e-06, 'epoch': 0.59} 59%|█████▉ | 13111/22095 [22:20:23<16:02:16, 6.43s/it] 59%|█████▉ | 13112/22095 [22:20:27<13:43:48, 5.50s/it] {'loss': 0.3381, 'grad_norm': 0.6557620068116499, 'learning_rate': 3.7441740727839693e-06, 'epoch': 0.59} 59%|█████▉ | 13112/22095 [22:20:27<13:43:48, 5.50s/it] 59%|█████▉ | 13113/22095 [22:20:30<12:01:46, 4.82s/it] {'loss': 0.3182, 'grad_norm': 0.72527557830192, 'learning_rate': 3.7434646596010284e-06, 'epoch': 0.59} 59%|█████▉ | 13113/22095 [22:20:30<12:01:46, 4.82s/it] 59%|█████▉ | 13114/22095 [22:20:33<10:30:42, 4.21s/it] {'loss': 0.3369, 'grad_norm': 0.6732815876045888, 'learning_rate': 3.742755273417173e-06, 'epoch': 0.59} 59%|█████▉ | 13114/22095 [22:20:33<10:30:42, 4.21s/it] 59%|█████▉ | 13115/22095 [22:20:36<9:36:15, 3.85s/it] {'loss': 0.3187, 'grad_norm': 0.6230884754824806, 'learning_rate': 3.742045914247647e-06, 'epoch': 0.59} 59%|█████▉ | 13115/22095 [22:20:36<9:36:15, 3.85s/it] 59%|█████▉ | 13116/22095 [22:20:39<9:03:53, 3.63s/it] {'loss': 0.3313, 'grad_norm': 0.6512486014953773, 'learning_rate': 3.7413365821076897e-06, 'epoch': 0.59} 59%|█████▉ | 13116/22095 [22:20:39<9:03:53, 3.63s/it] 59%|█████▉ | 13117/22095 [22:20:43<8:58:44, 3.60s/it] {'loss': 0.2908, 'grad_norm': 0.6264029046891488, 'learning_rate': 3.740627277012542e-06, 'epoch': 0.59} 59%|█████▉ | 13117/22095 [22:20:43<8:58:44, 3.60s/it] 59%|█████▉ | 13118/22095 [22:20:46<9:00:37, 3.61s/it] {'loss': 0.2991, 'grad_norm': 0.8811085700085609, 'learning_rate': 3.7399179989774483e-06, 'epoch': 0.59} 59%|█████▉ | 13118/22095 [22:20:46<9:00:37, 3.61s/it] 59%|█████▉ | 13119/22095 [22:20:50<9:02:17, 3.62s/it] {'loss': 0.3208, 'grad_norm': 
0.6397706143848809, 'learning_rate': 3.739208748017647e-06, 'epoch': 0.59} 59%|█████▉ | 13119/22095 [22:20:50<9:02:17, 3.62s/it] 59%|█████▉ | 13120/22095 [22:20:53<8:29:57, 3.41s/it] {'loss': 0.3121, 'grad_norm': 0.6302304728471242, 'learning_rate': 3.7384995241483767e-06, 'epoch': 0.59} 59%|█████▉ | 13120/22095 [22:20:53<8:29:57, 3.41s/it] 59%|█████▉ | 13121/22095 [22:20:57<9:00:58, 3.62s/it] {'loss': 0.3062, 'grad_norm': 0.6752519831545459, 'learning_rate': 3.737790327384876e-06, 'epoch': 0.59} 59%|█████▉ | 13121/22095 [22:20:57<9:00:58, 3.62s/it] 59%|█████▉ | 13122/22095 [22:21:01<9:20:05, 3.75s/it] {'loss': 0.3445, 'grad_norm': 0.6138243669102184, 'learning_rate': 3.7370811577423883e-06, 'epoch': 0.59} 59%|█████▉ | 13122/22095 [22:21:01<9:20:05, 3.75s/it] 59%|█████▉ | 13123/22095 [22:21:04<8:40:22, 3.48s/it] {'loss': 0.3125, 'grad_norm': 0.6637397709643503, 'learning_rate': 3.7363720152361436e-06, 'epoch': 0.59} 59%|█████▉ | 13123/22095 [22:21:04<8:40:22, 3.48s/it] 59%|█████▉ | 13124/22095 [22:21:07<8:11:11, 3.29s/it] {'loss': 0.2955, 'grad_norm': 0.5809437749619315, 'learning_rate': 3.735662899881385e-06, 'epoch': 0.59} 59%|█████▉ | 13124/22095 [22:21:07<8:11:11, 3.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52404 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13125/22095 [22:21:10<8:06:46, 3.26s/it] {'loss': 0.3139, 'grad_norm': 0.6544272698464104, 'learning_rate': 3.734953811693349e-06, 'epoch': 0.59} 59%|█████▉ | 13125/22095 [22:21:10<8:06:46, 3.26s/it] 59%|█████▉ | 13126/22095 [22:21:13<8:05:14, 3.25s/it] {'loss': 0.3094, 'grad_norm': 1.2643189481680515, 'learning_rate': 3.7342447506872686e-06, 'epoch': 0.59} 59%|█████▉ | 13126/22095 [22:21:13<8:05:14, 3.25s/it] 59%|█████▉ | 13127/22095 [22:21:16<8:17:58, 3.33s/it] {'loss': 0.2845, 'grad_norm': 0.6409913588796115, 'learning_rate': 3.7335357168783802e-06, 'epoch': 0.59} 59%|█████▉ | 13127/22095 [22:21:17<8:17:58, 3.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (82728 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43056 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91822 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80849 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (125839 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13128/22095 [22:21:19<8:00:42, 3.22s/it] {'loss': 0.3311, 'grad_norm': 0.7650230050867153, 'learning_rate': 3.732826710281923e-06, 'epoch': 0.59} 59%|█████▉ | 13128/22095 [22:21:19<8:00:42, 3.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50850 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13129/22095 [22:21:23<7:58:48, 3.20s/it] {'loss': 0.2978, 'grad_norm': 0.6220906941303066, 'learning_rate': 3.7321177309131248e-06, 'epoch': 0.59} 59%|█████▉ | 13129/22095 [22:21:23<7:58:48, 3.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44074 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57260 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (53242 > 40960) for 4 sample(s). Truncating to 10193 with 2 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (44232 > 40960). 
Running this sequence through the model will result in indexing errors
 59%|█████▉ | 13130/22095 [22:21:27<8:38:20, 3.47s/it] {'loss': 0.3028, 'grad_norm': 0.6174783888824636, 'learning_rate': 3.7314087787872234e-06, 'epoch': 0.59}
 59%|█████▉ | 13130/22095 [22:21:27<8:38:20, 3.47s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11307487 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1077, 'image': 'chat/937.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n图中的聊天记录在说什么?'}, {'from': 'gpt', 'value': '图中的聊天记录是中国大陆常用的通讯工具微信的界面,显示的是与“妈妈”的聊天记录。具体内容如下:\n\n1. 显示的是微信群发自公众号“LOGO设计软件” 的一条消息:\n 消息内容为:“分享实用的手机摄影诀窍”。\n\n2. 旁边字样说明关于手机摄影学习的公众号已经插入消息里,文字后展现一张小封面图,封面图显示了一位站立的女子。\n\n整体来看,这是一条由公众号发布的消息,分享到与“妈妈”的聊天窗口中。消息分享了一个手机摄影的诀窍链接。'}]}
 59%|█████▉ | 13131/22095 [22:21:30<8:21:07, 3.35s/it] {'loss': 0.2791, 'grad_norm': 0.5795536682728678, 'learning_rate': 3.73069985391945e-06, 'epoch': 0.59}
 59%|█████▉ | 13131/22095 [22:21:30<8:21:07, 3.35s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (74757 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76256 > 40960).
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13132/22095 [22:21:33<8:28:26, 3.40s/it] {'loss': 0.2955, 'grad_norm': 0.5777705970662388, 'learning_rate': 3.7299909563250414e-06, 'epoch': 0.59} 59%|█████▉ | 13132/22095 [22:21:33<8:28:26, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13133/22095 [22:21:41<11:55:07, 4.79s/it] {'loss': 0.4909, 'grad_norm': 0.3486003160911572, 'learning_rate': 3.7292820860192235e-06, 'epoch': 0.59} 59%|█████▉ | 13133/22095 [22:21:41<11:55:07, 4.79s/it] 59%|█████▉ | 13134/22095 [22:21:45<10:52:46, 4.37s/it] {'loss': 0.2864, 'grad_norm': 0.6233231953624803, 'learning_rate': 3.7285732430172327e-06, 'epoch': 0.59} 59%|█████▉ | 13134/22095 [22:21:45<10:52:46, 4.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13135/22095 [22:21:48<10:04:53, 4.05s/it] {'loss': 0.3719, 'grad_norm': 0.6304934226609963, 'learning_rate': 3.7278644273342982e-06, 'epoch': 0.59} 59%|█████▉ | 13135/22095 [22:21:48<10:04:53, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43877 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48966 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56384 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43085 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61893 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13136/22095 [22:21:57<14:03:57, 5.65s/it] {'loss': 0.4767, 'grad_norm': 0.30752683520412194, 'learning_rate': 3.7271556389856493e-06, 'epoch': 0.59} 59%|█████▉ | 13136/22095 [22:21:57<14:03:57, 5.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13137/22095 [22:22:01<12:35:17, 5.06s/it] {'loss': 0.3436, 'grad_norm': 0.6703948879065504, 'learning_rate': 3.726446877986516e-06, 'epoch': 0.59} 59%|█████▉ | 13137/22095 [22:22:01<12:35:17, 5.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 59%|█████▉ | 13138/22095 [22:22:12<16:53:38, 6.79s/it] {'loss': 0.4707, 'grad_norm': 0.29394166362208135, 'learning_rate': 3.725738144352129e-06, 'epoch': 0.59} 59%|█████▉ | 13138/22095 [22:22:12<16:53:38, 6.79s/it] 59%|█████▉ | 13139/22095 [22:22:16<14:41:57, 5.91s/it] {'loss': 0.3421, 'grad_norm': 0.6364131387899856, 'learning_rate': 3.725029438097715e-06, 'epoch': 0.59} 59%|█████▉ | 13139/22095 [22:22:16<14:41:57, 5.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 59%|█████▉ | 13140/22095 [22:22:25<17:20:16, 6.97s/it] {'loss': 0.4662, 'grad_norm': 0.3072016357761098, 'learning_rate': 3.7243207592385034e-06, 'epoch': 0.59} 59%|█████▉ | 13140/22095 [22:22:25<17:20:16, 6.97s/it] 59%|█████▉ | 13141/22095 [22:22:29<14:46:29, 5.94s/it] {'loss': 0.3032, 'grad_norm': 1.065624468877734, 'learning_rate': 3.7236121077897208e-06, 'epoch': 0.59} 59%|█████▉ | 13141/22095 [22:22:29<14:46:29, 5.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71017 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81699 > 40960). Running this sequence through the model will result in indexing errors 59%|█████▉ | 13142/22095 [22:22:32<12:55:02, 5.19s/it] {'loss': 0.329, 'grad_norm': 0.6361755726188255, 'learning_rate': 3.7229034837665923e-06, 'epoch': 0.59} 59%|█████▉ | 13142/22095 [22:22:32<12:55:02, 5.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (61498 > 40960). 
Running this sequence through the model will result in indexing errors 59%|█████▉ | 13143/22095 [22:22:35<11:20:53, 4.56s/it] {'loss': 0.323, 'grad_norm': 0.6633173196046636, 'learning_rate': 3.722194887184346e-06, 'epoch': 0.59} 59%|█████▉ | 13143/22095 [22:22:35<11:20:53, 4.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (136185 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 59%|█████▉ | 13144/22095 [22:22:45<14:58:38, 6.02s/it] {'loss': 0.4561, 'grad_norm': 0.39576080171503325, 'learning_rate': 3.7214863180582085e-06, 'epoch': 0.59} 59%|█████▉ | 13144/22095 [22:22:45<14:58:38, 6.02s/it] 59%|█████▉ | 13145/22095 [22:22:48<13:08:20, 5.28s/it] {'loss': 0.2974, 'grad_norm': 1.0278169141323898, 'learning_rate': 3.7207777764034027e-06, 'epoch': 0.59} 59%|█████▉ | 13145/22095 [22:22:48<13:08:20, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41224 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74892 > 40960). 
Running this sequence through the model will result in indexing errors
 59%|█████▉ | 13146/22095 [22:22:51<11:25:48, 4.60s/it] {'loss': 0.3285, 'grad_norm': 0.6570809526656647, 'learning_rate': 3.720069262235152e-06, 'epoch': 0.59}
 59%|█████▉ | 13146/22095 [22:22:51<11:25:48, 4.60s/it]
 60%|█████▉ | 13147/22095 [22:22:54<10:12:57, 4.11s/it] {'loss': 0.3014, 'grad_norm': 0.6269859847151987, 'learning_rate': 3.7193607755686836e-06, 'epoch': 0.6}
 60%|█████▉ | 13147/22095 [22:22:54<10:12:57, 4.11s/it]
 60%|█████▉ | 13148/22095 [22:22:57<9:23:36, 3.78s/it] {'loss': 0.3585, 'grad_norm': 0.657802326899779, 'learning_rate': 3.718652316419219e-06, 'epoch': 0.6}
 60%|█████▉ | 13148/22095 [22:22:57<9:23:36, 3.78s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8915461 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38614, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6'}]} 60%|█████▉ | 13149/22095 [22:23:01<9:15:22, 3.72s/it] {'loss': 0.3145, 'grad_norm': 0.5563274901862401, 'learning_rate': 3.7179438848019805e-06, 'epoch': 0.6} 60%|█████▉ | 13149/22095 [22:23:01<9:15:22, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13150/22095 [22:23:11<14:19:08, 5.76s/it] {'loss': 0.4512, 'grad_norm': 0.3098645111637703, 'learning_rate': 3.7172354807321894e-06, 'epoch': 0.6} 60%|█████▉ | 13150/22095 [22:23:11<14:19:08, 5.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|█████▉ | 13151/22095 [22:23:24<19:13:16, 7.74s/it] {'loss': 0.4798, 'grad_norm': 0.3088313873530433, 'learning_rate': 3.7165271042250706e-06, 'epoch': 0.6} 60%|█████▉ | 13151/22095 [22:23:24<19:13:16, 7.74s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 60%|█████▉ | 13152/22095 [22:23:27<16:10:31, 6.51s/it] {'loss': 0.2714, 'grad_norm': 0.6343676769371877, 'learning_rate': 3.7158187552958403e-06, 'epoch': 0.6} 60%|█████▉ | 13152/22095 [22:23:27<16:10:31, 6.51s/it] 60%|█████▉ | 13153/22095 [22:23:32<15:01:22, 6.05s/it] {'loss': 0.2857, 'grad_norm': 0.6710306240092611, 'learning_rate': 3.7151104339597212e-06, 'epoch': 0.6} 60%|█████▉ | 13153/22095 [22:23:32<15:01:22, 6.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13154/22095 [22:23:43<18:36:43, 7.49s/it] {'loss': 0.4489, 'grad_norm': 0.27054839637529265, 'learning_rate': 3.7144021402319334e-06, 'epoch': 0.6} 60%|█████▉ | 13154/22095 [22:23:43<18:36:43, 7.49s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in 
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7322204 in VC:s3://gui-agent/data_20250623/windows_augment/images.
Exception: unrecognized data stream contents when reading image file Problematic sample: {'image': 'autocad/20250508_161646_1/images/before_screenshot_1_id_73_internvl_element-caption_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nOnly respond with the point inside: The 'Block Editor' button in the AutoCAD ribbon interface that allows users to edit block definitions in the drawing."}, {'from': 'gpt', 'value': "The 'Block Editor' button in the AutoCAD ribbon interface that allows users to edit block definitions in the drawing.[[584, 97]]"}], 'width': 3600, 'height': 2338} 60%|█████▉ | 13155/22095 [22:23:47<15:50:19, 6.38s/it] {'loss': 0.2968, 'grad_norm': 0.6816851121029694, 'learning_rate': 3.713693874127695e-06, 'epoch': 0.6} 60%|█████▉ | 13155/22095 [22:23:47<15:50:19, 6.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53244 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43505 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72055 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87490 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94395 > 40960). 
Running this sequence through the model will result in indexing errors 60%|█████▉ | 13156/22095 [22:23:50<13:18:47, 5.36s/it] {'loss': 0.3018, 'grad_norm': 0.6553860649813443, 'learning_rate': 3.712985635662223e-06, 'epoch': 0.6} 60%|█████▉ | 13156/22095 [22:23:50<13:18:47, 5.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47837 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48920 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56244 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91951 > 40960). Running this sequence through the model will result in indexing errors 60%|█████▉ | 13157/22095 [22:23:53<11:53:44, 4.79s/it] {'loss': 0.304, 'grad_norm': 0.6782676736421867, 'learning_rate': 3.7122774248507386e-06, 'epoch': 0.6} 60%|█████▉ | 13157/22095 [22:23:53<11:53:44, 4.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13158/22095 [22:24:01<13:50:41, 5.58s/it] {'loss': 0.4485, 'grad_norm': 0.2797039203100852, 'learning_rate': 3.7115692417084574e-06, 'epoch': 0.6} 60%|█████▉ | 13158/22095 [22:24:01<13:50:41, 5.58s/it] 60%|█████▉ | 13159/22095 [22:24:05<12:28:53, 5.03s/it] {'loss': 0.291, 'grad_norm': 0.6062200687096395, 'learning_rate': 3.7108610862505955e-06, 'epoch': 0.6} 60%|█████▉ | 13159/22095 [22:24:05<12:28:53, 5.03s/it] 60%|█████▉ | 13160/22095 [22:24:08<11:13:15, 4.52s/it] {'loss': 0.3194, 'grad_norm': 0.5928153101844938, 'learning_rate': 3.710152958492369e-06, 'epoch': 0.6} 60%|█████▉ | 13160/22095 [22:24:08<11:13:15, 4.52s/it]Rank 0: Number of image tokens 0 does not 
match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|█████▉ | 13161/22095 [22:24:11<10:12:39, 4.11s/it] {'loss': 0.3465, 'grad_norm': 0.6381956679356376, 'learning_rate': 3.7094448584489955e-06, 'epoch': 0.6} 60%|█████▉ | 13161/22095 [22:24:11<10:12:39, 4.11s/it] 60%|█████▉ | 13162/22095 [22:24:15<10:02:22, 4.05s/it] {'loss': 0.2943, 'grad_norm': 0.6051622048176114, 'learning_rate': 3.708736786135687e-06, 'epoch': 0.6} 60%|█████▉ | 13162/22095 [22:24:15<10:02:22, 4.05s/it] 60%|█████▉ | 13163/22095 [22:24:18<9:37:40, 3.88s/it] {'loss': 0.2923, 'grad_norm': 0.7120118076247374, 'learning_rate': 3.70802874156766e-06, 'epoch': 0.6} 60%|█████▉ | 13163/22095 [22:24:18<9:37:40, 3.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13164/22095 [22:24:29<14:33:09, 5.87s/it] {'loss': 0.4615, 'grad_norm': 0.3312549098297892, 'learning_rate': 3.7073207247601285e-06, 'epoch': 0.6} 60%|█████▉ | 13164/22095 [22:24:29<14:33:09, 5.87s/it] 60%|█████▉ | 13165/22095 [22:24:38<17:03:54, 6.88s/it] {'loss': 0.4679, 'grad_norm': 0.29657193989846825, 'learning_rate': 3.7066127357283026e-06, 'epoch': 0.6} 60%|█████▉ | 13165/22095 [22:24:38<17:03:54, 6.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 60%|█████▉ | 13166/22095 [22:24:42<14:52:30, 6.00s/it] {'loss': 0.3242, 'grad_norm': 0.5841555972131169, 'learning_rate': 3.705904774487396e-06, 'epoch': 0.6} 60%|█████▉ | 13166/22095 [22:24:42<14:52:30, 6.00s/it] 60%|█████▉ | 13167/22095 [22:24:49<15:47:33, 6.37s/it] {'loss': 0.4993, 'grad_norm': 0.28481066257332377, 'learning_rate': 3.7051968410526236e-06, 'epoch': 0.6} 60%|█████▉ | 13167/22095 [22:24:49<15:47:33, 6.37s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (79034 > 40960). 
Running this sequence through the model will result in indexing errors 60%|█████▉ | 13168/22095 [22:24:53<14:02:36, 5.66s/it] {'loss': 0.3452, 'grad_norm': 0.6477331403565503, 'learning_rate': 3.7044889354391934e-06, 'epoch': 0.6} 60%|█████▉ | 13168/22095 [22:24:53<14:02:36, 5.66s/it] 60%|█████▉ | 13169/22095 [22:24:57<12:16:47, 4.95s/it] {'loss': 0.2652, 'grad_norm': 0.6739577213503118, 'learning_rate': 3.703781057662317e-06, 'epoch': 0.6} 60%|█████▉ | 13169/22095 [22:24:57<12:16:47, 4.95s/it] 60%|█████▉ | 13170/22095 [22:25:00<11:15:40, 4.54s/it] {'loss': 0.3061, 'grad_norm': 0.6052741040937766, 'learning_rate': 3.703073207737205e-06, 'epoch': 0.6} 60%|█████▉ | 13170/22095 [22:25:00<11:15:40, 4.54s/it] 60%|█████▉ | 13171/22095 [22:25:04<10:29:56, 4.24s/it] {'loss': 0.2465, 'grad_norm': 0.5757953144746631, 'learning_rate': 3.7023653856790655e-06, 'epoch': 0.6} 60%|█████▉ | 13171/22095 [22:25:04<10:29:56, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56653 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44675 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55796 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54330 > 40960). 
Running this sequence through the model will result in indexing errors 60%|█████▉ | 13172/22095 [22:25:07<9:36:14, 3.87s/it] {'loss': 0.3131, 'grad_norm': 0.6197451940727606, 'learning_rate': 3.7016575915031084e-06, 'epoch': 0.6} 60%|█████▉ | 13172/22095 [22:25:07<9:36:14, 3.87s/it] 60%|█████▉ | 13173/22095 [22:25:10<8:53:22, 3.59s/it] {'loss': 0.3386, 'grad_norm': 0.6442250708090359, 'learning_rate': 3.700949825224544e-06, 'epoch': 0.6} 60%|█████▉ | 13173/22095 [22:25:10<8:53:22, 3.59s/it] 60%|█████▉ | 13174/22095 [22:25:13<8:46:53, 3.54s/it] {'loss': 0.3504, 'grad_norm': 0.6828610388809717, 'learning_rate': 3.700242086858577e-06, 'epoch': 0.6} 60%|█████▉ | 13174/22095 [22:25:13<8:46:53, 3.54s/it] 60%|█████▉ | 13175/22095 [22:25:16<8:27:29, 3.41s/it] {'loss': 0.3109, 'grad_norm': 0.6245904625773707, 'learning_rate': 3.6995343764204143e-06, 'epoch': 0.6} 60%|█████▉ | 13175/22095 [22:25:16<8:27:29, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13176/22095 [22:25:27<13:42:02, 5.53s/it] {'loss': 0.4575, 'grad_norm': 0.29816155391876076, 'learning_rate': 3.6988266939252647e-06, 'epoch': 0.6} 60%|█████▉ | 13176/22095 [22:25:27<13:42:02, 5.53s/it] 60%|█████▉ | 13177/22095 [22:25:30<12:14:04, 4.94s/it] {'loss': 0.3007, 'grad_norm': 0.6163325957250416, 'learning_rate': 3.698119039388335e-06, 'epoch': 0.6} 60%|█████▉ | 13177/22095 [22:25:30<12:14:04, 4.94s/it] 60%|█████▉ | 13178/22095 [22:25:34<11:06:58, 4.49s/it] {'loss': 0.2607, 'grad_norm': 0.5790691504496822, 'learning_rate': 3.6974114128248274e-06, 'epoch': 0.6} 60%|█████▉ | 13178/22095 [22:25:34<11:06:58, 4.49s/it] 60%|█████▉ | 13179/22095 [22:25:37<9:59:40, 4.04s/it] {'loss': 0.3196, 'grad_norm': 0.7598184284764053, 'learning_rate': 3.696703814249947e-06, 'epoch': 0.6} 60%|█████▉ | 13179/22095 [22:25:37<9:59:40, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|█████▉ | 13180/22095 [22:25:46<13:57:37, 5.64s/it] {'loss': 0.4382, 
'grad_norm': 0.2825882685277589, 'learning_rate': 3.695996243678901e-06, 'epoch': 0.6}
 60%|█████▉ | 13180/22095 [22:25:46<13:57:37, 5.64s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 60%|█████▉ | 13181/22095 [22:25:50<12:41:50, 5.13s/it] {'loss': 0.3261, 'grad_norm': 0.6359353171429746, 'learning_rate': 3.6952887011268885e-06, 'epoch': 0.6}
 60%|█████▉ | 13181/22095 [22:25:50<12:41:50, 5.13s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8909863 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33016, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12'}]} 60%|█████▉ | 13182/22095 [22:25:53<11:13:35, 4.53s/it] {'loss': 0.3117, 'grad_norm': 0.6116551042236428, 'learning_rate': 3.6945811866091153e-06, 'epoch': 0.6} 60%|█████▉ | 13182/22095 [22:25:53<11:13:35, 4.53s/it] 60%|█████▉ | 13183/22095 [22:25:57<10:33:42, 4.27s/it] {'loss': 0.2932, 'grad_norm': 0.7821988598175718, 'learning_rate': 3.6938737001407847e-06, 'epoch': 0.6} 60%|█████▉ | 13183/22095 [22:25:57<10:33:42, 4.27s/it] 60%|█████▉ | 13184/22095 [22:26:00<9:30:30, 3.84s/it] {'loss': 0.3058, 'grad_norm': 0.621388684245905, 'learning_rate': 3.6931662417370956e-06, 'epoch': 0.6} 60%|█████▉ | 13184/22095 [22:26:00<9:30:30, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129267 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88303 > 40960). 
Running this sequence through the model will result in indexing errors 60%|█████▉ | 13185/22095 [22:26:03<8:55:43, 3.61s/it] {'loss': 0.2793, 'grad_norm': 0.5882159349528706, 'learning_rate': 3.692458811413249e-06, 'epoch': 0.6} 60%|█████▉ | 13185/22095 [22:26:03<8:55:43, 3.61s/it] 60%|█████▉ | 13186/22095 [22:26:06<8:46:27, 3.55s/it] {'loss': 0.3099, 'grad_norm': 0.5827495491993245, 'learning_rate': 3.6917514091844497e-06, 'epoch': 0.6} 60%|█████▉ | 13186/22095 [22:26:06<8:46:27, 3.55s/it] 60%|█████▉ | 13187/22095 [22:26:10<8:52:59, 3.59s/it] {'loss': 0.3516, 'grad_norm': 0.6505631041125741, 'learning_rate': 3.691044035065893e-06, 'epoch': 0.6} 60%|█████▉ | 13187/22095 [22:26:10<8:52:59, 3.59s/it] 60%|█████▉ | 13188/22095 [22:26:14<8:59:07, 3.63s/it] {'loss': 0.298, 'grad_norm': 0.6166883268007635, 'learning_rate': 3.6903366890727792e-06, 'epoch': 0.6} 60%|█████▉ | 13188/22095 [22:26:14<8:59:07, 3.63s/it] 60%|█████▉ | 13189/22095 [22:26:17<8:59:49, 3.64s/it] {'loss': 0.2935, 'grad_norm': 0.6128453820552738, 'learning_rate': 3.6896293712203075e-06, 'epoch': 0.6} 60%|█████▉ | 13189/22095 [22:26:17<8:59:49, 3.64s/it] 60%|█████▉ | 13190/22095 [22:26:21<8:43:23, 3.53s/it] {'loss': 0.3052, 'grad_norm': 0.6593897806724689, 'learning_rate': 3.6889220815236776e-06, 'epoch': 0.6} 60%|█████▉ | 13190/22095 [22:26:21<8:43:23, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65750 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77995 > 40960). 
Running this sequence through the model will result in indexing errors
 60%|█████▉ | 13191/22095 [22:26:23<8:17:38, 3.35s/it] {'loss': 0.2967, 'grad_norm': 0.6467674494601752, 'learning_rate': 3.688214819998085e-06, 'epoch': 0.6}
 60%|█████▉ | 13191/22095 [22:26:23<8:17:38, 3.35s/it]
 60%|█████▉ | 13192/22095 [22:26:27<8:46:35, 3.55s/it] {'loss': 0.3223, 'grad_norm': 0.9026493624411818, 'learning_rate': 3.687507586658726e-06, 'epoch': 0.6}
 60%|█████▉ | 13192/22095 [22:26:27<8:46:35, 3.55s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
 60%|█████▉ | 13193/22095 [22:26:37<13:09:54, 5.32s/it] {'loss': 0.4837, 'grad_norm': 0.35026014317474924, 'learning_rate': 3.6868003815208003e-06, 'epoch': 0.6}
 60%|█████▉ | 13193/22095 [22:26:37<13:09:54, 5.32s/it]
 60%|█████▉ | 13194/22095 [22:26:40<11:35:29, 4.69s/it] {'loss': 0.3092, 'grad_norm': 0.6215915010677973, 'learning_rate': 3.686093204599499e-06, 'epoch': 0.6}
 60%|█████▉ | 13194/22095 [22:26:40<11:35:29, 4.69s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952529 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3364, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]}
60%|█████▉ | 13195/22095 [22:26:43<10:15:41, 4.15s/it] {'loss': 0.2798, 'grad_norm': 0.6232982673485767, 'learning_rate': 3.68538605591002e-06, 'epoch': 0.6}
60%|█████▉ | 13196/22095 [22:26:46<9:43:35, 3.93s/it] {'loss': 0.3021, 'grad_norm': 0.6129914831636445, 'learning_rate': 3.6846789354675584e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (60093 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78421 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13197/22095 [22:26:49<9:01:32, 3.65s/it] {'loss': 0.2759, 'grad_norm': 0.6216919708011899, 'learning_rate': 3.683971843287305e-06, 'epoch': 0.6}
60%|█████▉ | 13198/22095 [22:26:53<8:50:44, 3.58s/it] {'loss': 0.2936, 'grad_norm': 0.6521665296020159, 'learning_rate': 3.6832647793844557e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13199/22095 [22:27:02<13:13:38, 5.35s/it] {'loss': 0.479, 'grad_norm': 0.2537563545628852, 'learning_rate': 3.6825577437742028e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (62621 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88209 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13200/22095 [22:27:06<11:35:13, 4.69s/it] {'loss': 0.2845, 'grad_norm': 0.5903977732511272, 'learning_rate': 3.681850736471736e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (136830 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13201/22095 [22:27:09<10:25:13, 4.22s/it] {'loss': 0.326, 'grad_norm': 0.639447725295969, 'learning_rate': 3.6811437574922494e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (76933 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108988 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13202/22095 [22:27:12<9:49:15, 3.98s/it] {'loss': 0.36, 'grad_norm': 0.6507427678934059, 'learning_rate': 3.680436806850933e-06, 'epoch': 0.6}
60%|█████▉ | 13203/22095 [22:27:15<9:03:59, 3.67s/it] {'loss': 0.3502, 'grad_norm': 0.6147854529236003, 'learning_rate': 3.6797298845629776e-06, 'epoch': 0.6}
60%|█████▉ | 13204/22095 [22:27:18<8:21:20, 3.38s/it] {'loss': 0.3185, 'grad_norm': 0.6453083421397205, 'learning_rate': 3.6790229906435706e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13205/22095 [22:27:25<11:28:41, 4.65s/it] {'loss': 0.4633, 'grad_norm': 0.2641602153652307, 'learning_rate': 3.6783161251079026e-06, 'epoch': 0.6}
60%|█████▉ | 13206/22095 [22:27:29<10:44:41, 4.35s/it] {'loss': 0.3133, 'grad_norm': 0.6993737385876693, 'learning_rate': 3.677609287971163e-06, 'epoch': 0.6}
60%|█████▉ | 13207/22095 [22:27:32<9:43:07, 3.94s/it] {'loss': 0.3749, 'grad_norm': 0.6268767172580938, 'learning_rate': 3.676902479248538e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (89822 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45439 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47414 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44728 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13208/22095 [22:27:42<14:08:43, 5.73s/it] {'loss': 0.4625, 'grad_norm': 0.27262484330790376, 'learning_rate': 3.6761956989552138e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (53454 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48182 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43495 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13209/22095 [22:27:45<12:21:38, 5.01s/it] {'loss': 0.2837, 'grad_norm': 0.8495143981164707, 'learning_rate': 3.6754889471063814e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13210/22095 [22:27:52<13:52:50, 5.62s/it] {'loss': 0.4842, 'grad_norm': 0.33892822582460136, 'learning_rate': 3.6747822237172204e-06, 'epoch': 0.6}
60%|█████▉ | 13211/22095 [22:27:56<12:12:57, 4.95s/it] {'loss': 0.3122, 'grad_norm': 0.6494562517896424, 'learning_rate': 3.6740755288029206e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|█████▉ | 13212/22095 [22:27:59<10:50:41, 4.40s/it] {'loss': 0.2886, 'grad_norm': 0.6489746897269669, 'learning_rate': 3.6733688623786667e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43012 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13213/22095 [22:28:07<13:23:59, 5.43s/it] {'loss': 0.4729, 'grad_norm': 0.27422493756120675, 'learning_rate': 3.67266222445964e-06, 'epoch': 0.6}
60%|█████▉ | 13214/22095 [22:28:10<11:54:55, 4.83s/it] {'loss': 0.2931, 'grad_norm': 0.5409356168569753, 'learning_rate': 3.6719556150610243e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [148, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367580 in VC:s3://internvl-moe-sft-data/. Exception: Image size [148, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34328, 'image': 'vrdu_table_final_2/astro-ph.CO/cbcd0bb3-8526-4a5e-818b-9eeb3134cb45.png', 'image_wh': [[148, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{lcccccc}\n\\multicolumn{7}{l}{\\footnotesize\n$^1$;\n$^2$;\n$^3$;\n$^4$;\n$^5$;\n$^6$.}\n\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [373, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8463026 in VC:s3://internvl-moe-sft-data/. Exception: Image size [373, 23, 100, 100] is too small. Minimum size is 28.
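The `ValueError: Image size [...] is too small. Minimum size is 28` failures above (and the further ones later in this log) all trace back to annotation records whose stored `image_wh` has a side below the loader's 28-pixel minimum. The loader only discovers this at fetch time and retries, so such records can instead be screened out of the annotation list beforehand. A minimal sketch, assuming records shaped like the `Problematic sample` dumps in this log (`image_wh` as a list of `[width, height]` pairs); the threshold mirrors the error message:

```python
MIN_SIDE = 28  # matches "Minimum size is 28" in the loader's ValueError


def is_image_large_enough(sample, min_side=MIN_SIDE):
    """True if every [w, h] pair in the sample's image_wh meets the minimum side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))


def filter_annotations(samples):
    """Drop records whose image is below the loader's minimum size."""
    return [s for s in samples if is_image_large_enough(s)]
```

Running this once over each annotation JSON and writing back the filtered list would avoid the repeated `[Try #0] Failed to fetch sample ...` retries during training.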
Problematic sample: {'id': 84646, 'image': 'vrdu_texteq/astro-ph.CO/6a9682fc-0095-4abe-af47-f7eb72901cb6.png', 'image_wh': [[373, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'with $\\Theta$ the Heaviside function.'}]}
60%|█████▉ | 13215/22095 [22:28:14<11:01:43, 4.47s/it] {'loss': 0.306, 'grad_norm': 0.6618889061564469, 'learning_rate': 3.6712490341980057e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13216/22095 [22:28:24<15:10:44, 6.15s/it] {'loss': 0.4711, 'grad_norm': 0.27558100516395273, 'learning_rate': 3.6705424818857636e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (87014 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13217/22095 [22:28:33<17:39:30, 7.16s/it] {'loss': 0.4569, 'grad_norm': 0.26551145333875725, 'learning_rate': 3.6698359581394803e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 364, but got module 1
60%|█████▉ | 13218/22095 [22:28:37<15:00:28, 6.09s/it] {'loss': 0.309, 'grad_norm': 0.6406255430648299, 'learning_rate': 3.669129462974337e-06, 'epoch': 0.6}
60%|█████▉ | 13219/22095 [22:28:41<13:46:33, 5.59s/it] {'loss': 0.3046, 'grad_norm': 0.6131297048044994, 'learning_rate': 3.668422996405515e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|█████▉ | 13220/22095 [22:28:45<12:27:14, 5.05s/it] {'loss': 0.3683, 'grad_norm': 0.7123061004163291, 'learning_rate': 3.667716558448192e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13221/22095 [22:28:54<15:14:19, 6.18s/it] {'loss': 0.4673, 'grad_norm': 0.4038625913971144, 'learning_rate': 3.667010149117549e-06, 'epoch': 0.6}
60%|█████▉ | 13222/22095 [22:28:57<13:01:10, 5.28s/it] {'loss': 0.2671, 'grad_norm': 0.635090127562506, 'learning_rate': 3.666303768428765e-06, 'epoch': 0.6}
60%|█████▉ | 13223/22095 [22:29:00<11:14:30, 4.56s/it] {'loss': 0.3068, 'grad_norm': 0.8239338249393553, 'learning_rate': 3.665597416397014e-06, 'epoch': 0.6}
60%|█████▉ | 13224/22095 [22:29:03<10:00:05, 4.06s/it] {'loss': 0.3097, 'grad_norm': 0.6299322564454998, 'learning_rate': 3.6648910930374783e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13225/22095 [22:29:12<14:02:24, 5.70s/it] {'loss': 0.4702, 'grad_norm': 0.28881756200221953, 'learning_rate': 3.6641847983653326e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8898843 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
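The paired `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` messages in this log show the loader repairing human turns that reference an image but carry no image placeholder in the text. The same repair could be applied offline in a preprocessing pass. A sketch under the assumption that the data uses a literal `<image>` placeholder token and that prepending missing placeholders to the first human turn matches the loader's fix (both are assumptions, not confirmed by this log):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder convention for Qwen-VL-style data


def fix_image_tokens(sample):
    """Ensure the first human turn carries one placeholder per image."""
    images = sample.get("image")
    n_images = len(images) if isinstance(images, list) else (1 if images else 0)
    for turn in sample["conversations"]:
        if turn["from"] != "human":
            continue
        missing = n_images - turn["value"].count(IMAGE_TOKEN)
        if missing > 0:
            # mirror the loader's "Fixed image tokens" repair: prepend placeholders
            turn["value"] = (IMAGE_TOKEN + "\n") * missing + turn["value"]
        return sample  # only the first human turn carries the image tokens
    return sample
```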
Problematic sample: {'id': 21996, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nA. 6\nB. 5\nC. 4\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
60%|█████▉ | 13226/22095 [22:29:17<13:00:39, 5.28s/it] {'loss': 0.3299, 'grad_norm': 0.5770655353848592, 'learning_rate': 3.6634785323957522e-06, 'epoch': 0.6}
60%|█████▉ | 13227/22095 [22:29:20<11:47:35, 4.79s/it] {'loss': 0.2902, 'grad_norm': 0.5883840900394046, 'learning_rate': 3.6627722951439125e-06, 'epoch': 0.6}
60%|█████▉ | 13228/22095 [22:29:24<10:41:32, 4.34s/it] {'loss': 0.2909, 'grad_norm': 0.6533817489703067, 'learning_rate': 3.6620660866249922e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (90198 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117642 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51684 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55526 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13229/22095 [22:29:28<10:35:33, 4.30s/it] {'loss': 0.3526, 'grad_norm': 0.6342454950558009, 'learning_rate': 3.66135990685416e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (76883 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49005 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73632 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47982 > 40960) for 4 sample(s). Truncating to 4269 with 1 samples.
60%|█████▉ | 13230/22095 [22:29:31<9:45:29, 3.96s/it] {'loss': 0.3083, 'grad_norm': 0.6688804477128425, 'learning_rate': 3.6606537558465925e-06, 'epoch': 0.6}
60%|█████▉ | 13231/22095 [22:29:34<9:07:28, 3.71s/it] {'loss': 0.3435, 'grad_norm': 0.6621555789786193, 'learning_rate': 3.6599476336174622e-06, 'epoch': 0.6}
60%|█████▉ | 13232/22095 [22:29:38<8:57:03, 3.64s/it] {'loss': 0.3667, 'grad_norm': 0.614671513931057, 'learning_rate': 3.659241540181943e-06, 'epoch': 0.6}
60%|█████▉ | 13233/22095 [22:29:41<9:04:41, 3.69s/it] {'loss': 0.3283, 'grad_norm': 0.5872663880952672, 'learning_rate': 3.6585354755552032e-06, 'epoch': 0.6}
60%|█████▉ | 13234/22095 [22:29:45<9:11:39, 3.74s/it] {'loss': 0.3528, 'grad_norm': 0.6158303410100913, 'learning_rate': 3.6578294397524174e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (51800 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109586 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13235/22095 [22:29:49<9:14:40, 3.76s/it] {'loss': 0.3133, 'grad_norm': 0.6203668634959106, 'learning_rate': 3.657123432788755e-06, 'epoch': 0.6}
60%|█████▉ | 13236/22095 [22:29:52<8:43:03, 3.54s/it] {'loss': 0.3182, 'grad_norm': 0.6394831186748366, 'learning_rate': 3.656417454679385e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13237/22095 [22:30:02<13:22:41, 5.44s/it] {'loss': 0.4717, 'grad_norm': 0.30639042230507724, 'learning_rate': 3.6557115054394764e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047545 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 9cm\nB. 4cm\nC. 5cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
60%|█████▉ | 13238/22095 [22:30:11<16:27:43, 6.69s/it] {'loss': 0.4776, 'grad_norm': 0.29415688444257554, 'learning_rate': 3.655005585084202e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8597976 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20818, 'image': '879725109.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
60%|█████▉ | 13239/22095 [22:30:15<14:26:28, 5.87s/it] {'loss': 0.3387, 'grad_norm': 0.8168793453305554, 'learning_rate': 3.6542996936287233e-06, 'epoch': 0.6}
60%|█████▉ | 13240/22095 [22:30:19<12:47:45, 5.20s/it] {'loss': 0.3146, 'grad_norm': 0.6441091079639482, 'learning_rate': 3.6535938310882124e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (53735 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59820 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86324 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70399 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13241/22095 [22:30:23<11:56:15, 4.85s/it] {'loss': 0.3021, 'grad_norm': 0.6282743875406009, 'learning_rate': 3.6528879974778365e-06, 'epoch': 0.6}
60%|█████▉ | 13242/22095 [22:30:27<10:57:57, 4.46s/it] {'loss': 0.3358, 'grad_norm': 0.6778664796167693, 'learning_rate': 3.6521821928127588e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13243/22095 [22:30:36<14:54:32, 6.06s/it] {'loss': 0.4482, 'grad_norm': 0.27633099760908875, 'learning_rate': 3.6514764171081454e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [506, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465771 in VC:s3://internvl-moe-sft-data/. Exception: Image size [506, 25, 100, 100] is too small. Minimum size is 28.
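The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings, and the `Rank 0: ... Truncating to 4269 with 1 samples` line above, indicate samples that tokenize far past the 40960-token limit. Such records can be located ahead of time by tokenizing each conversation and comparing against the limit. A sketch with the tokenizer injected as a callable (a real run would pass something like `AutoTokenizer.from_pretrained(...).encode`; joining the turn texts below is only a rough proxy for the loader's actual prompt template):

```python
MAX_LEN = 40960  # the model's maximum sequence length reported in the warnings


def conversation_text(sample):
    """Concatenate all turn values; a rough proxy for the tokenized prompt."""
    return "\n".join(turn["value"] for turn in sample["conversations"])


def over_long_samples(samples, encode, max_len=MAX_LEN):
    """Yield (index, token_count) for samples whose token count exceeds max_len.

    `encode` is any callable returning a token list, e.g. a HF tokenizer's
    .encode method (model path left to the caller).
    """
    for i, s in enumerate(samples):
        n = len(encode(conversation_text(s)))
        if n > max_len:
            yield i, n
```

Offending indices can then be dropped from the annotation list, or the truncation left to the loader if losing the tail of those samples is acceptable.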
Problematic sample: {'id': 65585, 'image': 'vrdu_texteq/astro-ph.CO/640ce826-78df-4ade-a596-c20865ff3304.png', 'image_wh': [[506, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'where $N_{\\rm halos}$ is the total number of halos.'}]}
60%|█████▉ | 13244/22095 [22:30:40<13:15:11, 5.39s/it] {'loss': 0.2861, 'grad_norm': 0.6270164059202556, 'learning_rate': 3.6507706703791624e-06, 'epoch': 0.6}
60%|█████▉ | 13245/22095 [22:30:43<11:23:49, 4.64s/it] {'loss': 0.3334, 'grad_norm': 0.5866021898261772, 'learning_rate': 3.650064952640976e-06, 'epoch': 0.6}
60%|█████▉ | 13246/22095 [22:30:47<10:57:19, 4.46s/it] {'loss': 0.2756, 'grad_norm': 0.627212468249608, 'learning_rate': 3.649359263908746e-06, 'epoch': 0.6}
60%|█████▉ | 13247/22095 [22:30:51<10:47:51, 4.39s/it] {'loss': 0.2926, 'grad_norm': 0.6358470020069243, 'learning_rate': 3.6486536041976362e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (50090 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86241 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43596 > 40960). Running this sequence through the model will result in indexing errors
60%|█████▉ | 13248/22095 [22:30:54<9:42:09, 3.95s/it] {'loss': 0.2641, 'grad_norm': 0.6168356613463569, 'learning_rate': 3.6479479735228117e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|█████▉ | 13249/22095 [22:31:04<13:42:22, 5.58s/it] {'loss': 0.4825, 'grad_norm': 0.3012527809032249, 'learning_rate': 3.6472423718994326e-06, 'epoch': 0.6}
60%|█████▉ | 13250/22095 [22:31:08<12:27:22, 5.07s/it] {'loss': 0.3112, 'grad_norm': 0.6151714677223123, 'learning_rate': 3.6465367993426603e-06, 'epoch': 0.6}
60%|█████▉ | 13251/22095 [22:31:11<11:09:30, 4.54s/it] {'loss': 0.2875, 'grad_norm': 0.6661811042765926, 'learning_rate': 3.6458312558676555e-06, 'epoch': 0.6}
60%|█████▉ | 13252/22095 [22:31:14<10:10:35, 4.14s/it] {'loss': 0.3149, 'grad_norm': 0.6218647748077784, 'learning_rate': 3.6451257414895767e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047145 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 6\nB. 3\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
60%|█████▉ | 13253/22095 [22:31:17<9:27:22, 3.85s/it] {'loss': 0.3269, 'grad_norm': 0.5646580107053886, 'learning_rate': 3.6444202562235854e-06, 'epoch': 0.6}
60%|█████▉ | 13254/22095 [22:31:20<8:40:47, 3.53s/it] {'loss': 0.3071, 'grad_norm': 0.6234519410493783, 'learning_rate': 3.6437148000848404e-06, 'epoch': 0.6}
60%|█████▉ | 13255/22095 [22:31:23<8:16:42, 3.37s/it] {'loss': 0.3069, 'grad_norm': 0.6258900034740645, 'learning_rate': 3.6430093730884973e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|█████▉ | 13256/22095 [22:31:33<13:01:59, 5.31s/it] {'loss': 0.4885, 'grad_norm': 0.30236203249057025, 'learning_rate': 3.6423039752497146e-06, 'epoch': 0.6}
60%|██████ | 13257/22095 [22:31:36<11:37:56, 4.74s/it] {'loss': 0.3121, 'grad_norm': 0.8056969703403314, 'learning_rate': 3.641598606583653e-06, 'epoch': 0.6}
60%|██████ | 13258/22095 [22:31:40<10:46:11, 4.39s/it] {'loss': 0.2839, 'grad_norm': 0.623021541836843, 'learning_rate': 3.640893267105462e-06, 'epoch': 0.6}
60%|██████ | 13259/22095 [22:31:43<9:48:15, 3.99s/it] {'loss': 0.2896, 'grad_norm': 0.654267609693474, 'learning_rate': 3.6401879568303013e-06, 'epoch': 0.6}
60%|██████ | 13260/22095 [22:31:46<9:08:10, 3.72s/it] {'loss': 0.2611, 'grad_norm': 0.7279223216549868, 'learning_rate': 3.639482675773324e-06, 'epoch': 0.6}
60%|██████ | 13261/22095 [22:31:50<9:11:39, 3.75s/it] {'loss': 0.3153, 'grad_norm': 0.7553416821907419, 'learning_rate': 3.6387774239496893e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|██████ | 13262/22095 [22:31:54<9:15:46, 3.78s/it] {'loss': 0.3622, 'grad_norm': 0.6802481482404504, 'learning_rate': 3.6380722013745434e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (79660 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62558 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46823 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13263/22095 [22:31:57<9:06:49, 3.71s/it] {'loss': 0.3007, 'grad_norm': 0.6082365759513287, 'learning_rate': 3.637367008063044e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (157849 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13264/22095 [22:32:00<8:39:48, 3.53s/it] {'loss': 0.2381, 'grad_norm': 0.5913283821044162, 'learning_rate': 3.6366618440303436e-06, 'epoch': 0.6}
60%|██████ | 13265/22095 [22:32:03<8:13:30, 3.35s/it] {'loss': 0.3526, 'grad_norm': 0.7278794020979359, 'learning_rate': 3.6359567092915928e-06, 'epoch': 0.6}
60%|██████ | 13266/22095 [22:32:07<8:13:56, 3.36s/it] {'loss': 0.3553, 'grad_norm': 0.785940979506602, 'learning_rate': 3.635251603861941e-06, 'epoch': 0.6}
60%|██████ | 13267/22095 [22:32:11<8:38:23, 3.52s/it] {'loss': 0.3146, 'grad_norm': 0.6008767130405239, 'learning_rate': 3.6345465277565427e-06, 'epoch': 0.6}
60%|██████ | 13268/22095 [22:32:14<8:43:06, 3.56s/it] {'loss': 0.3412, 'grad_norm': 0.967970368322063, 'learning_rate': 3.6338414809905453e-06, 'epoch': 0.6}
60%|██████ | 13269/22095 [22:32:17<8:25:18, 3.44s/it] {'loss': 0.287, 'grad_norm': 0.6307483799109639, 'learning_rate': 3.633136463579099e-06, 'epoch': 0.6}
60%|██████ | 13270/22095 [22:32:21<8:42:41, 3.55s/it] {'loss': 0.3575, 'grad_norm': 0.600574822035914, 'learning_rate': 3.6324314755373523e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|██████ | 13271/22095 [22:32:31<13:21:06, 5.45s/it] {'loss': 0.4642, 'grad_norm': 0.29013622581752974, 'learning_rate': 3.6317265168804526e-06, 'epoch': 0.6}
60%|██████ | 13272/22095 [22:32:35<12:05:04, 4.93s/it] {'loss': 0.3476, 'grad_norm': 0.6462805277268697, 'learning_rate': 3.631021587623547e-06, 'epoch': 0.6}
60%|██████ | 13273/22095 [22:32:38<10:45:42, 4.39s/it] {'loss': 0.33, 'grad_norm': 0.6454512431625602, 'learning_rate': 3.630316687781783e-06, 'epoch': 0.6}
60%|██████ | 13274/22095 [22:32:41<9:40:10, 3.95s/it] {'loss': 0.3078, 'grad_norm': 0.6181042514441442, 'learning_rate': 3.6296118173703075e-06, 'epoch': 0.6}
60%|██████ | 13275/22095 [22:32:44<9:14:59, 3.78s/it] {'loss': 0.3061, 'grad_norm': 0.6651883769771438, 'learning_rate': 3.628906976404265e-06, 'epoch': 0.6}
60%|██████ | 13276/22095 [22:32:48<9:12:05, 3.76s/it] {'loss': 0.3268, 'grad_norm': 0.6182735386264937, 'learning_rate': 3.6282021648988e-06, 'epoch': 0.6}
60%|██████ | 13277/22095 [22:32:51<8:52:48, 3.63s/it] {'loss': 0.3354, 'grad_norm': 0.7468317605762024, 'learning_rate': 3.6274973828690584e-06, 'epoch': 0.6}
60%|██████ | 13278/22095 [22:32:55<8:46:14, 3.58s/it] {'loss': 0.3057, 'grad_norm': 0.6376219336618065, 'learning_rate': 3.6267926303301827e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (75449 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44319 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13279/22095 [22:32:58<8:35:26, 3.51s/it] {'loss': 0.3278, 'grad_norm': 0.5961662944166821, 'learning_rate': 3.6260879072973155e-06, 'epoch': 0.6}
60%|██████ | 13280/22095 [22:33:01<8:02:45, 3.29s/it] {'loss': 0.3189, 'grad_norm': 0.590256658116722, 'learning_rate': 3.6253832137856e-06, 'epoch': 0.6}
60%|██████ | 13281/22095 [22:33:04<7:55:47, 3.24s/it] {'loss': 0.3412, 'grad_norm': 0.6627626694142456, 'learning_rate': 3.6246785498101754e-06, 'epoch': 0.6}
60%|██████ | 13282/22095 [22:33:07<8:05:23, 3.30s/it] {'loss': 0.3421, 'grad_norm': 0.7990817075726195, 'learning_rate': 3.6239739153861863e-06, 'epoch': 0.6}
60%|██████ | 13283/22095 [22:33:11<8:30:02, 3.47s/it] {'loss': 0.2819, 'grad_norm': 0.6661198485629423, 'learning_rate': 3.623269310528773e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (99771 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107138 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13284/22095 [22:33:20<12:37:57, 5.16s/it] {'loss': 0.5014, 'grad_norm': 0.3169460283991881, 'learning_rate': 3.622564735253072e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [264, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8513063 in VC:s3://internvl-moe-sft-data/. Exception: Image size [264, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 119026, 'image': 'vrdu_texteq/astro-ph.CO/874fc5e4-b0d4-44a1-a80a-7b338a86631c.png', 'image_wh': [[264, 23]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'where $\\Theta$ is defined as:'}]}
60%|██████ | 13285/22095 [22:33:24<11:14:21, 4.59s/it] {'loss': 0.3107, 'grad_norm': 0.6367798706528928, 'learning_rate': 3.6218601895742234e-06, 'epoch': 0.6}
60%|██████ | 13286/22095 [22:33:28<10:59:16, 4.49s/it] {'loss': 0.3321, 'grad_norm': 0.6489505146493503, 'learning_rate': 3.6211556735073704e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|██████ | 13287/22095 [22:33:34<12:01:28, 4.91s/it] {'loss': 0.4626, 'grad_norm': 0.30221069429949665, 'learning_rate': 3.620451187067644e-06, 'epoch': 0.6}
60%|██████ | 13288/22095 [22:33:37<10:59:14, 4.49s/it] {'loss': 0.2908, 'grad_norm': 0.6338490368445091, 'learning_rate': 3.619746730270185e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|██████ | 13289/22095 [22:33:41<10:06:06, 4.13s/it] {'loss': 0.3412, 'grad_norm': 0.6348180019316083, 'learning_rate': 3.619042303130129e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53129 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41237 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13290/22095 [22:33:46<11:16:01, 4.61s/it] {'loss': 0.4547, 'grad_norm': 0.2727124763310069, 'learning_rate': 3.618337905662616e-06, 'epoch': 0.6}
60%|██████ | 13291/22095 [22:33:50<10:34:58, 4.33s/it] {'loss': 0.3109, 'grad_norm': 0.6508143398730285, 'learning_rate': 3.6176335378827747e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|██████ | 13292/22095 [22:33:59<13:47:36, 5.64s/it] {'loss': 0.4804, 'grad_norm': 0.8292016353801743, 'learning_rate': 3.616929199805744e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg =
sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8933823 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 56976, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 6.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 60%|██████ | 13293/22095 [22:34:02<12:17:05, 5.02s/it] {'loss': 0.2883, 'grad_norm': 0.6238778648347993, 'learning_rate': 3.616224891446658e-06, 'epoch': 0.6} 60%|██████ | 13293/22095 [22:34:02<12:17:05, 5.02s/it] 60%|██████ | 13294/22095 [22:34:05<10:43:38, 4.39s/it] {'loss': 0.354, 'grad_norm': 0.6587510313119597, 'learning_rate': 3.615520612820649e-06, 'epoch': 0.6} 60%|██████ | 13294/22095 [22:34:05<10:43:38, 4.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74832 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79477 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13295/22095 [22:34:09<10:17:10, 4.21s/it] {'loss': 0.3445, 'grad_norm': 0.6311257520628122, 'learning_rate': 3.6148163639428475e-06, 'epoch': 0.6} 60%|██████ | 13295/22095 [22:34:09<10:17:10, 4.21s/it] 60%|██████ | 13296/22095 [22:34:12<9:39:42, 3.95s/it] {'loss': 0.3234, 'grad_norm': 0.6216525729643326, 'learning_rate': 3.6141121448283904e-06, 'epoch': 0.6} 60%|██████ | 13296/22095 [22:34:12<9:39:42, 3.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44339 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83014 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13297/22095 [22:34:17<10:16:10, 4.20s/it] {'loss': 0.3193, 'grad_norm': 0.7357679569482551, 'learning_rate': 3.6134079554924062e-06, 'epoch': 0.6} 60%|██████ | 13297/22095 [22:34:17<10:16:10, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13298/22095 [22:34:24<12:05:53, 4.95s/it] {'loss': 0.4836, 'grad_norm': 0.3342103094602393, 'learning_rate': 3.6127037959500267e-06, 'epoch': 0.6} 60%|██████ | 13298/22095 [22:34:24<12:05:53, 4.95s/it] 60%|██████ | 13299/22095 [22:34:28<11:18:25, 4.63s/it] {'loss': 0.3104, 'grad_norm': 0.6043645496196589, 'learning_rate': 3.6119996662163824e-06, 'epoch': 0.6} 60%|██████ | 13299/22095 [22:34:28<11:18:25, 4.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13300/22095 [22:34:38<15:34:11, 6.37s/it] {'loss': 0.4925, 'grad_norm': 0.32463802624158095, 'learning_rate': 3.6112955663066008e-06, 'epoch': 0.6} 60%|██████ | 13300/22095 [22:34:38<15:34:11, 6.37s/it] 60%|██████ | 13301/22095 [22:34:42<13:55:00, 5.70s/it] {'loss': 0.3298, 'grad_norm': 0.6827651046445397, 'learning_rate': 3.610591496235813e-06, 'epoch': 0.6} 60%|██████ | 13301/22095 [22:34:42<13:55:00, 5.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62057 > 40960). 
Running this sequence through the model will result in indexing errors 60%|██████ | 13302/22095 [22:34:45<11:59:25, 4.91s/it] {'loss': 0.3104, 'grad_norm': 0.6881451561163642, 'learning_rate': 3.6098874560191465e-06, 'epoch': 0.6} 60%|██████ | 13302/22095 [22:34:45<11:59:25, 4.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [250, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8474140 in VC:s3://internvl-moe-sft-data/. Exception: Image size [250, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 141901, 'image': 'vrdu_texteq/astro-ph.CO/dc36f53c-ba54-4de3-b57b-bece7250e140.png', 'image_wh': [[250, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where $r$ is defined as'}]}
60%|██████ | 13303/22095 [22:34:53<14:16:10, 5.84s/it] {'loss': 0.4674, 'grad_norm': 0.2740287878315835, 'learning_rate': 3.609183445671731e-06, 'epoch': 0.6}
60%|██████ | 13304/22095 [22:34:57<12:20:29, 5.05s/it] {'loss': 0.2934, 'grad_norm': 0.6265971913851386, 'learning_rate': 3.6084794652086892e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8932779 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55932, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为直线段AB的上点,P点为AC的中点,Q点为BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
60%|██████ | 13305/22095 [22:35:06<15:38:00, 6.40s/it] {'loss': 0.4862, 'grad_norm': 0.27654597975083506, 'learning_rate': 3.607775514645151e-06, 'epoch': 0.6}
60%|██████ | 13306/22095 [22:35:09<13:21:15, 5.47s/it] {'loss': 0.3091, 'grad_norm': 0.6190244001689355, 'learning_rate': 3.607071593996242e-06, 'epoch': 0.6}
60%|██████ | 13307/22095 [22:35:13<12:14:57, 5.02s/it] {'loss': 0.3189, 'grad_norm': 0.6018265444032312, 'learning_rate': 3.606367703277085e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|██████ | 13308/22095 [22:35:17<11:21:01, 4.65s/it] {'loss': 0.2785, 'grad_norm': 0.6326827208686144, 'learning_rate': 3.6056638425028068e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (41515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52298 > 40960). Running this sequence through the model will result in indexing errors
60%|██████ | 13309/22095 [22:35:20<9:56:57, 4.08s/it] {'loss': 0.3024, 'grad_norm': 1.0066003674078408, 'learning_rate': 3.6049600116885307e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|██████ | 13310/22095 [22:35:29<13:51:29, 5.68s/it] {'loss': 0.4698, 'grad_norm': 0.3089674950974842, 'learning_rate': 3.6042562108493772e-06, 'epoch': 0.6}
60%|██████ | 13311/22095 [22:35:33<12:26:41, 5.10s/it] {'loss': 0.2933, 'grad_norm': 0.6178521392338734, 'learning_rate': 3.603552440000472e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398602 in VC:s3://internvl-moe-sft-data/. Exception: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 754, 'image': 'vrdu_table_final_2/astro-ph.CO/e9f576fe-3750-4b89-a18c-45f296e5725a.png', 'image_wh': [[75, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{l}3C147\\end{tabular}\n```"}]} 60%|██████ | 13312/22095 [22:35:37<11:27:43, 4.70s/it] {'loss': 0.3329, 'grad_norm': 0.6770493087166252, 'learning_rate': 3.6028486991569376e-06, 'epoch': 0.6} 60%|██████ | 13312/22095 [22:35:37<11:27:43, 4.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43298 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80689 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45765 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (123917 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13313/22095 [22:35:41<10:44:50, 4.41s/it] {'loss': 0.3046, 'grad_norm': 0.6601070659367079, 'learning_rate': 3.6021449883338923e-06, 'epoch': 0.6} 60%|██████ | 13313/22095 [22:35:41<10:44:50, 4.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55177 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (134526 > 40960). 
Running this sequence through the model will result in indexing errors 60%|██████ | 13314/22095 [22:35:44<9:54:13, 4.06s/it] {'loss': 0.3644, 'grad_norm': 0.6320469520340588, 'learning_rate': 3.6014413075464573e-06, 'epoch': 0.6} 60%|██████ | 13314/22095 [22:35:44<9:54:13, 4.06s/it] 60%|██████ | 13315/22095 [22:35:47<9:06:04, 3.73s/it] {'loss': 0.2946, 'grad_norm': 0.6056737659973819, 'learning_rate': 3.600737656809754e-06, 'epoch': 0.6} 60%|██████ | 13315/22095 [22:35:47<9:06:04, 3.73s/it] 60%|██████ | 13316/22095 [22:35:50<8:29:38, 3.48s/it] {'loss': 0.2944, 'grad_norm': 0.5849266824543456, 'learning_rate': 3.600034036138902e-06, 'epoch': 0.6} 60%|██████ | 13316/22095 [22:35:50<8:29:38, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13317/22095 [22:35:58<11:58:25, 4.91s/it] {'loss': 0.4546, 'grad_norm': 0.30346221504831905, 'learning_rate': 3.5993304455490173e-06, 'epoch': 0.6} 60%|██████ | 13317/22095 [22:35:58<11:58:25, 4.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54786 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13318/22095 [22:36:01<10:40:01, 4.38s/it] {'loss': 0.3263, 'grad_norm': 0.6478262489689017, 'learning_rate': 3.598626885055219e-06, 'epoch': 0.6} 60%|██████ | 13318/22095 [22:36:01<10:40:01, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71660 > 40960). 
Running this sequence through the model will result in indexing errors Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [337, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8507092 in VC:s3://internvl-moe-sft-data/. Exception: Image size [337, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 156579, 'image': 'vrdu_texteq/astro-ph.CO/abc4a54a-0e51-4c40-ac36-022a11a21225.png', 'image_wh': [[337, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': '$^2$E-mail: maroto@fis.ucm.es \\vfill\\eject'}]} 60%|██████ | 13319/22095 [22:36:05<10:05:17, 4.14s/it] {'loss': 0.2907, 'grad_norm': 0.6031657905281659, 'learning_rate': 3.597923354672628e-06, 'epoch': 0.6} 60%|██████ | 13319/22095 [22:36:05<10:05:17, 4.14s/it] 60%|██████ | 13320/22095 [22:36:08<9:39:56, 3.97s/it] {'loss': 0.3628, 'grad_norm': 0.6356314317536731, 'learning_rate': 3.597219854416355e-06, 'epoch': 0.6} 60%|██████ | 13320/22095 [22:36:08<9:39:56, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size 
[371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047170 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6cm'}]} 60%|██████ | 13321/22095 [22:36:18<13:42:31, 5.62s/it] {'loss': 0.4826, 'grad_norm': 0.2929946520224333, 'learning_rate': 3.59651638430152e-06, 'epoch': 0.6} 60%|██████ | 13321/22095 [22:36:18<13:42:31, 5.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56790 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50563 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83150 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108178 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13322/22095 [22:36:27<16:31:20, 6.78s/it] {'loss': 0.4851, 'grad_norm': 0.2812105966859869, 'learning_rate': 3.595812944343239e-06, 'epoch': 0.6} 60%|██████ | 13322/22095 [22:36:27<16:31:20, 6.78s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (113309 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62726 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82114 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121683 > 40960). Running this sequence through the model will result in indexing errors 60%|██████ | 13323/22095 [22:36:31<14:09:42, 5.81s/it] {'loss': 0.3125, 'grad_norm': 0.6210192622739172, 'learning_rate': 3.5951095345566232e-06, 'epoch': 0.6} 60%|██████ | 13323/22095 [22:36:31<14:09:42, 5.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (88285 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13324/22095 [22:36:34<12:33:49, 5.16s/it] {'loss': 0.2914, 'grad_norm': 0.6331813274922533, 'learning_rate': 3.5944061549567876e-06, 'epoch': 0.6} 60%|██████ | 13324/22095 [22:36:34<12:33:49, 5.16s/it] 60%|██████ | 13325/22095 [22:36:38<11:10:48, 4.59s/it] {'loss': 0.2978, 'grad_norm': 0.6672025406942507, 'learning_rate': 3.59370280555885e-06, 'epoch': 0.6} 60%|██████ | 13325/22095 [22:36:38<11:10:48, 4.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13326/22095 [22:36:47<14:40:50, 6.03s/it] {'loss': 0.4624, 'grad_norm': 0.2833025363843245, 'learning_rate': 3.592999486377918e-06, 'epoch': 0.6} 60%|██████ | 13326/22095 [22:36:47<14:40:50, 6.03s/it]Rank 0: Number of image tokens 
0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13327/22095 [22:36:50<12:32:37, 5.15s/it] {'loss': 0.3058, 'grad_norm': 0.6844554810860389, 'learning_rate': 3.592296197429106e-06, 'epoch': 0.6} 60%|██████ | 13327/22095 [22:36:50<12:32:37, 5.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13328/22095 [22:36:57<14:06:47, 5.80s/it] {'loss': 0.5005, 'grad_norm': 0.2904387392592274, 'learning_rate': 3.591592938727526e-06, 'epoch': 0.6} 60%|██████ | 13328/22095 [22:36:57<14:06:47, 5.80s/it] 60%|██████ | 13329/22095 [22:37:01<12:44:59, 5.24s/it] {'loss': 0.2839, 'grad_norm': 0.6018361904817027, 'learning_rate': 3.5908897102882868e-06, 'epoch': 0.6} 60%|██████ | 13329/22095 [22:37:01<12:44:59, 5.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13330/22095 [22:37:11<15:51:48, 6.52s/it] {'loss': 0.4737, 'grad_norm': 0.2586940970741543, 'learning_rate': 3.5901865121265e-06, 'epoch': 0.6} 60%|██████ | 13330/22095 [22:37:11<15:51:48, 6.52s/it] 60%|██████ | 13331/22095 [22:37:19<17:02:32, 7.00s/it] {'loss': 0.4526, 'grad_norm': 0.27398799440128146, 'learning_rate': 3.5894833442572763e-06, 'epoch': 0.6} 60%|██████ | 13331/22095 [22:37:19<17:02:32, 7.00s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 60%|██████ | 13332/22095 [22:37:23<14:37:34, 6.01s/it] {'loss': 0.3301, 'grad_norm': 0.622998206410414, 'learning_rate': 3.588780206695724e-06, 'epoch': 0.6} 60%|██████ | 13332/22095 [22:37:23<14:37:34, 6.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13333/22095 [22:37:27<13:29:29, 5.54s/it] {'loss': 0.313, 'grad_norm': 0.6808638843999142, 'learning_rate': 3.5880770994569485e-06, 'epoch': 0.6} 60%|██████ | 13333/22095 [22:37:27<13:29:29, 5.54s/it] 60%|██████ | 13334/22095 [22:37:30<11:53:49, 4.89s/it] {'loss': 0.2803, 'grad_norm': 
0.6318067758791139, 'learning_rate': 3.587374022556061e-06, 'epoch': 0.6} 60%|██████ | 13334/22095 [22:37:30<11:53:49, 4.89s/it] 60%|██████ | 13335/22095 [22:37:34<10:57:31, 4.50s/it] {'loss': 0.3366, 'grad_norm': 0.6496949493400798, 'learning_rate': 3.5866709760081684e-06, 'epoch': 0.6} 60%|██████ | 13335/22095 [22:37:34<10:57:31, 4.50s/it] 60%|██████ | 13336/22095 [22:37:37<10:01:25, 4.12s/it] {'loss': 0.33, 'grad_norm': 0.6469781095008104, 'learning_rate': 3.585967959828375e-06, 'epoch': 0.6} 60%|██████ | 13336/22095 [22:37:37<10:01:25, 4.12s/it] 60%|██████ | 13337/22095 [22:37:40<9:07:14, 3.75s/it] {'loss': 0.3142, 'grad_norm': 0.6061078384711173, 'learning_rate': 3.5852649740317858e-06, 'epoch': 0.6} 60%|██████ | 13337/22095 [22:37:40<9:07:14, 3.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13338/22095 [22:37:43<8:37:54, 3.55s/it] {'loss': 0.3582, 'grad_norm': 0.6063965955355926, 'learning_rate': 3.58456201863351e-06, 'epoch': 0.6} 60%|██████ | 13338/22095 [22:37:43<8:37:54, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47191 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49356 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131162 > 40960). 
Running this sequence through the model will result in indexing errors 60%|██████ | 13339/22095 [22:37:47<8:42:10, 3.58s/it] {'loss': 0.2983, 'grad_norm': 0.6221814436406296, 'learning_rate': 3.5838590936486467e-06, 'epoch': 0.6} 60%|██████ | 13339/22095 [22:37:47<8:42:10, 3.58s/it] 60%|██████ | 13340/22095 [22:37:51<8:49:26, 3.63s/it] {'loss': 0.3154, 'grad_norm': 0.6344451011991756, 'learning_rate': 3.583156199092303e-06, 'epoch': 0.6} 60%|██████ | 13340/22095 [22:37:51<8:49:26, 3.63s/it] 60%|██████ | 13341/22095 [22:37:54<8:29:16, 3.49s/it] {'loss': 0.3203, 'grad_norm': 0.5959380289884578, 'learning_rate': 3.582453334979582e-06, 'epoch': 0.6} 60%|██████ | 13341/22095 [22:37:54<8:29:16, 3.49s/it] 60%|██████ | 13342/22095 [22:37:57<8:15:01, 3.39s/it] {'loss': 0.316, 'grad_norm': 0.6284294962048357, 'learning_rate': 3.5817505013255847e-06, 'epoch': 0.6} 60%|██████ | 13342/22095 [22:37:57<8:15:01, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48331 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87111 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69370 > 40960). 
Running this sequence through the model will result in indexing errors 60%|██████ | 13343/22095 [22:38:02<9:35:39, 3.95s/it] {'loss': 0.3196, 'grad_norm': 0.6991259807827724, 'learning_rate': 3.581047698145412e-06, 'epoch': 0.6} 60%|██████ | 13343/22095 [22:38:02<9:35:39, 3.95s/it] 60%|██████ | 13344/22095 [22:38:05<9:02:36, 3.72s/it] {'loss': 0.2832, 'grad_norm': 0.6036458234344553, 'learning_rate': 3.580344925454167e-06, 'epoch': 0.6} 60%|██████ | 13344/22095 [22:38:05<9:02:36, 3.72s/it] 60%|██████ | 13345/22095 [22:38:08<8:25:19, 3.47s/it] {'loss': 0.313, 'grad_norm': 0.6094023062667671, 'learning_rate': 3.5796421832669503e-06, 'epoch': 0.6} 60%|██████ | 13345/22095 [22:38:08<8:25:19, 3.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 60%|██████ | 13346/22095 [22:38:12<8:17:32, 3.41s/it] {'loss': 0.358, 'grad_norm': 0.6116601881054674, 'learning_rate': 3.5789394715988602e-06, 'epoch': 0.6} 60%|██████ | 13346/22095 [22:38:12<8:17:32, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 60%|██████ | 13347/22095 [22:38:21<12:41:13, 5.22s/it] {'loss': 0.4588, 'grad_norm': 0.33519227315021916, 'learning_rate': 3.578236790464995e-06, 'epoch': 0.6} 60%|██████ | 13347/22095 [22:38:21<12:41:13, 5.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (117252 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46695 > 40960). 
Running this sequence through the model will result in indexing errors
60%|██████ | 13348/22095 [22:38:24<11:11:22, 4.61s/it] {'loss': 0.3414, 'grad_norm': 1.018762962468953, 'learning_rate': 3.5775341398804585e-06, 'epoch': 0.6}
60%|██████ | 13349/22095 [22:38:28<10:15:20, 4.22s/it] {'loss': 0.3244, 'grad_norm': 0.6323299652027151, 'learning_rate': 3.576831519860341e-06, 'epoch': 0.6}
60%|██████ | 13350/22095 [22:38:31<9:25:53, 3.88s/it] {'loss': 0.2797, 'grad_norm': 0.6310873421400982, 'learning_rate': 3.576128930419744e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
60%|██████ | 13351/22095 [22:38:40<13:33:14, 5.58s/it] {'loss': 0.4488, 'grad_norm': 0.2841091768586345, 'learning_rate': 3.575426371573764e-06, 'epoch': 0.6}
60%|██████ | 13352/22095 [22:38:44<12:22:18, 5.09s/it] {'loss': 0.2842, 'grad_norm': 0.6497447780283677, 'learning_rate': 3.5747238433374952e-06, 'epoch': 0.6}
60%|██████ | 13353/22095 [22:38:47<10:41:16, 4.40s/it] {'loss': 0.3004, 'grad_norm': 0.6442050756032109, 'learning_rate': 3.5740213457260333e-06, 'epoch': 0.6}
60%|██████ | 13354/22095 [22:38:50<9:36:45, 3.96s/it] {'loss': 0.2854, 'grad_norm': 0.6022216934016352, 'learning_rate': 3.573318878754475e-06, 'epoch': 0.6}
60%|██████ | 13355/22095 [22:38:54<9:34:40, 3.95s/it] {'loss': 0.3581, 'grad_norm': 0.6266066483654207, 'learning_rate': 3.5726164424379106e-06, 'epoch': 0.6}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358145 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24856, 'image': 'vrdu_table_final_2/astro-ph.CO/0cbbb513-23e2-4e7b-92af-04c34bd85d49.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
60%|██████ | 13356/22095 [22:39:00<11:36:25, 4.78s/it] {'loss': 0.4978, 'grad_norm': 0.2971858300753773, 'learning_rate': 3.571914036791435e-06, 'epoch': 0.6}
60%|██████ | 13357/22095 [22:39:04<10:47:19, 4.44s/it] {'loss': 0.316, 'grad_norm': 0.6109185868269074, 'learning_rate': 3.571211661830142e-06, 'epoch': 0.6}
60%|██████ | 13358/22095 [22:39:08<10:01:57, 4.13s/it] {'loss': 0.2798, 'grad_norm': 0.6521278981812285, 'learning_rate': 3.5705093175691195e-06, 'epoch': 0.6}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
60%|██████ | 13359/22095 [22:39:10<9:05:19, 3.75s/it] {'loss': 0.3216, 'grad_norm': 0.6130984616975004, 'learning_rate': 3.5698070040234633e-06, 'epoch': 0.6}
60%|██████ | 13360/22095 [22:39:14<8:56:15, 3.68s/it] {'loss': 0.3053, 'grad_norm': 0.6458711888618369, 'learning_rate': 3.569104721208262e-06, 'epoch': 0.6}
60%|██████ | 13361/22095 [22:39:17<8:27:02, 3.48s/it] {'loss': 0.3134, 'grad_norm': 0.585594980145394, 'learning_rate': 3.5684024691386067e-06, 'epoch': 0.6}
60%|██████ | 13362/22095 [22:39:21<8:31:41, 3.52s/it] {'loss': 0.3302, 'grad_norm': 0.6469463908589563, 'learning_rate': 3.567700247829583e-06, 'epoch': 0.6}
Token indices sequence length is longer than the specified maximum sequence length for this model (44320 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44041 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94372 > 40960).
Running this sequence through the model will result in indexing errors
60%|██████ | 13363/22095 [22:39:24<8:32:22, 3.52s/it] {'loss': 0.2925, 'grad_norm': 0.6154644667480161, 'learning_rate': 3.5669980572962836e-06, 'epoch': 0.6}
60%|██████ | 13364/22095 [22:39:27<8:10:40, 3.37s/it] {'loss': 0.2968, 'grad_norm': 0.6104621738919574, 'learning_rate': 3.5662958975537955e-06, 'epoch': 0.6}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [392, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8508512 in VC:s3://internvl-moe-sft-data/. Exception: Image size [392, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 114821, 'image': 'vrdu_texteq/astro-ph.CO/cb7e632c-d0f6-4166-b27a-4c6c96be3f70.png', 'image_wh': [[392, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'and $X$ can be rewritten more as'}]}
60%|██████ | 13365/22095 [22:39:30<7:50:25, 3.23s/it] {'loss': 0.3088, 'grad_norm': 0.6360070530683783, 'learning_rate': 3.5655937686172037e-06, 'epoch': 0.6}
60%|██████ | 13366/22095 [22:39:33<7:42:07, 3.18s/it] {'loss': 0.3785, 'grad_norm': 0.5661342734041364, 'learning_rate': 3.5648916705015964e-06, 'epoch': 0.6}
60%|██████ | 13367/22095 [22:39:36<7:33:05, 3.11s/it] {'loss': 0.3104, 'grad_norm': 0.5702960130273277, 'learning_rate': 3.5641896032220626e-06, 'epoch': 0.6}
61%|██████ | 13368/22095 [22:39:40<7:50:50, 3.24s/it] {'loss': 0.3048, 'grad_norm': 0.6131253656760342, 'learning_rate': 3.5634875667936803e-06, 'epoch': 0.61}
61%|██████ | 13369/22095 [22:39:43<8:06:05, 3.34s/it] {'loss': 0.3033, 'grad_norm': 0.6174377700558944, 'learning_rate': 3.56278556123154e-06, 'epoch': 0.61}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
61%|██████ | 13370/22095 [22:39:47<8:21:43, 3.45s/it] {'loss': 0.3133, 'grad_norm': 0.6471682654144832, 'learning_rate': 3.562083586550725e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13371/22095 [22:39:56<12:48:07, 5.28s/it] {'loss': 0.4485, 'grad_norm': 0.36123504228980124, 'learning_rate': 3.5613816427663162e-06, 'epoch': 0.61}
61%|██████ | 13372/22095 [22:40:00<11:15:54, 4.65s/it] {'loss': 0.2823, 'grad_norm': 0.6044077284848175, 'learning_rate': 3.5606797298933967e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (43373 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13373/22095 [22:40:03<10:31:37, 4.35s/it] {'loss': 0.295, 'grad_norm': 0.6937280255969518, 'learning_rate': 3.5599778479470498e-06, 'epoch': 0.61}
61%|██████ | 13374/22095 [22:40:08<10:54:01, 4.50s/it] {'loss': 0.2912, 'grad_norm': 0.6140668102863763, 'learning_rate': 3.5592759969423573e-06, 'epoch': 0.61}
61%|██████ | 13375/22095 [22:40:11<9:49:56, 4.06s/it] {'loss': 0.3286, 'grad_norm': 0.6584099771068982, 'learning_rate': 3.5585741768943982e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (42826 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126278 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57545 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41368 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42057 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125824 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118404 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105892 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13376/22095 [22:40:16<10:07:15, 4.18s/it] {'loss': 0.3716, 'grad_norm': 0.613803210549905, 'learning_rate': 3.5578723878182518e-06, 'epoch': 0.61}
61%|██████ | 13377/22095 [22:40:19<9:42:08, 4.01s/it] {'loss': 0.3128, 'grad_norm': 0.7060930148845469, 'learning_rate': 3.557170629729001e-06, 'epoch': 0.61}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
61%|██████ | 13378/22095 [22:40:23<9:48:25, 4.05s/it] {'loss': 0.3283, 'grad_norm': 0.6792705093090348, 'learning_rate': 3.556468902641721e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047215 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 5\nB. 6\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
61%|██████ | 13379/22095 [22:40:27<9:18:05, 3.84s/it] {'loss': 0.3075, 'grad_norm': 0.8428630717752048, 'learning_rate': 3.555767206571491e-06, 'epoch': 0.61}
61%|██████ | 13380/22095 [22:40:30<9:11:53, 3.80s/it] {'loss': 0.3179, 'grad_norm': 0.5752766645066538, 'learning_rate': 3.555065541533389e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946834 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69987, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C和D是AB段上的两点,Cd=3c m,M是AC的中点,N是DB的中点,AB=9.8cm,则Mn段的长度等于()\nA. 7cm\nB. 5.4cm\nC. 6.4cm\nD. 6.8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
61%|██████ | 13381/22095 [22:40:39<12:48:08, 5.29s/it] {'loss': 0.4824, 'grad_norm': 0.3290549372923084, 'learning_rate': 3.5543639075424897e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (70332 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44708 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65873 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13382/22095 [22:40:43<11:40:18, 4.82s/it] {'loss': 0.3344, 'grad_norm': 0.6225255968894837, 'learning_rate': 3.5536623046138685e-06, 'epoch': 0.61}
61%|██████ | 13383/22095 [22:40:46<10:29:56, 4.34s/it] {'loss': 0.2937, 'grad_norm': 0.5772962530922928, 'learning_rate': 3.552960732762605e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13384/22095 [22:40:56<14:22:12, 5.94s/it] {'loss': 0.4464, 'grad_norm': 0.28994310594926126, 'learning_rate': 3.5522591920037698e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (105934 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84168 > 40960).
Running this sequence through the model will result in indexing errors
61%|██████ | 13385/22095 [22:41:01<13:39:41, 5.65s/it] {'loss': 0.3062, 'grad_norm': 0.7948689428509049, 'learning_rate': 3.5515576823524377e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (74021 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13386/22095 [22:41:04<11:38:42, 4.81s/it] {'loss': 0.3117, 'grad_norm': 0.6432547158435971, 'learning_rate': 3.5508562038236817e-06, 'epoch': 0.61}
61%|██████ | 13387/22095 [22:41:07<10:34:42, 4.37s/it] {'loss': 0.3015, 'grad_norm': 0.713155852051197, 'learning_rate': 3.5501547564325777e-06, 'epoch': 0.61}
61%|██████ | 13388/22095 [22:41:11<10:21:43, 4.28s/it] {'loss': 0.3255, 'grad_norm': 0.5769908367074335, 'learning_rate': 3.549453340194194e-06, 'epoch': 0.61}
61%|██████ | 13389/22095 [22:41:14<9:28:44, 3.92s/it] {'loss': 0.3209, 'grad_norm': 0.5942258901578376, 'learning_rate': 3.5487519551236025e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13390/22095 [22:41:24<13:31:37, 5.59s/it] {'loss': 0.4456, 'grad_norm': 0.29522903900146297, 'learning_rate': 3.548050601235876e-06, 'epoch': 0.61}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
61%|██████ | 13391/22095 [22:41:27<12:16:02, 5.07s/it] {'loss': 0.3175, 'grad_norm': 0.599766040405492, 'learning_rate': 3.54734927854608e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (60359 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76992 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13392/22095 [22:41:31<11:21:08, 4.70s/it] {'loss': 0.3344, 'grad_norm': 0.6533666125281052, 'learning_rate': 3.5466479870692883e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13393/22095 [22:41:39<13:30:08, 5.59s/it] {'loss': 0.4758, 'grad_norm': 0.3529903275260208, 'learning_rate': 3.5459467268205683e-06, 'epoch': 0.61}
61%|██████ | 13394/22095 [22:41:47<15:27:00, 6.39s/it] {'loss': 0.4692, 'grad_norm': 0.3486681973081686, 'learning_rate': 3.5452454978149864e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13395/22095 [22:41:52<14:01:08, 5.80s/it] {'loss': 0.3866, 'grad_norm': 0.6403340211996071, 'learning_rate': 3.5445443000676096e-06, 'epoch': 0.61}
61%|██████ | 13396/22095 [22:41:55<12:24:32, 5.14s/it] {'loss': 0.2986, 'grad_norm': 0.6369882531039904, 'learning_rate': 3.543843133593509e-06, 'epoch': 0.61}
61%|██████ | 13397/22095 [22:41:59<11:15:16, 4.66s/it] {'loss': 0.3006, 'grad_norm': 0.6608157083343831, 'learning_rate': 3.5431419984077444e-06, 'epoch': 0.61}
61%|██████ | 13398/22095 [22:42:02<10:00:29, 4.14s/it] {'loss': 0.3066, 'grad_norm': 0.6016818949787632, 'learning_rate': 3.542440894525384e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (45562 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61112 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13399/22095 [22:42:05<9:22:36, 3.88s/it] {'loss': 0.3113, 'grad_norm': 0.6261206840159689, 'learning_rate': 3.541739821961494e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13400/22095 [22:42:14<13:27:28, 5.57s/it] {'loss': 0.4883, 'grad_norm': 0.3348048032808948, 'learning_rate': 3.5410387807311353e-06, 'epoch': 0.61}
61%|██████ | 13401/22095 [22:42:19<12:29:51, 5.17s/it] {'loss': 0.3508, 'grad_norm': 0.6042801979473782, 'learning_rate': 3.5403377708493714e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13402/22095 [22:42:25<13:14:10, 5.48s/it] {'loss': 0.4753, 'grad_norm': 0.30132407700194175, 'learning_rate': 3.539636792331267e-06, 'epoch': 0.61}
61%|██████ | 13403/22095 [22:42:28<11:38:08, 4.82s/it] {'loss': 0.3028, 'grad_norm': 0.645422373017156, 'learning_rate': 3.538935845191884e-06, 'epoch': 0.61}
61%|██████ | 13404/22095 [22:42:31<10:28:48, 4.34s/it] {'loss': 0.3165, 'grad_norm': 0.6487120512379727, 'learning_rate': 3.5382349294462803e-06, 'epoch': 0.61}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
61%|██████ | 13405/22095 [22:42:35<9:45:27, 4.04s/it] {'loss': 0.3097, 'grad_norm': 0.6076635547412993, 'learning_rate': 3.5375340451095186e-06, 'epoch': 0.61}
61%|██████ | 13406/22095 [22:42:38<9:06:45, 3.78s/it] {'loss': 0.2996, 'grad_norm': 0.6358029989813431, 'learning_rate': 3.53683319219666e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (77739 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41606 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108067 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106497 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13407/22095 [22:42:48<13:38:31, 5.65s/it] {'loss': 0.5306, 'grad_norm': 0.2926919659262302, 'learning_rate': 3.536132370722761e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (64535 > 40960).
Running this sequence through the model will result in indexing errors
61%|██████ | 13408/22095 [22:42:55<14:48:59, 6.14s/it] {'loss': 0.4834, 'grad_norm': 0.2741018832720857, 'learning_rate': 3.5354315807028826e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13409/22095 [22:42:58<12:39:03, 5.24s/it] {'loss': 0.2883, 'grad_norm': 0.6091839935589203, 'learning_rate': 3.5347308221520814e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (60213 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110605 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13410/22095 [22:43:01<11:07:31, 4.61s/it] {'loss': 0.2632, 'grad_norm': 0.6006493393550284, 'learning_rate': 3.5340300950854135e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957617 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8452, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1cm'}]}
61%|██████ | 13411/22095 [22:43:04<9:56:34, 4.12s/it] {'loss': 0.281, 'grad_norm': 0.5984574567811332, 'learning_rate': 3.5333293995179362e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (85225 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13412/22095 [22:43:07<9:09:01, 3.79s/it] {'loss': 0.3004, 'grad_norm': 0.5765612346379179, 'learning_rate': 3.5326287354647077e-06, 'epoch': 0.61}
61%|██████ | 13413/22095 [22:43:10<8:28:54, 3.52s/it] {'loss': 0.3161, 'grad_norm': 0.5907166678593426, 'learning_rate': 3.5319281029407793e-06, 'epoch': 0.61}
61%|██████ | 13414/22095 [22:43:14<8:24:25, 3.49s/it] {'loss': 0.3119, 'grad_norm': 0.6486670915591076, 'learning_rate': 3.5312275019612065e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13415/22095 [22:43:23<12:43:30, 5.28s/it] {'loss': 0.4739, 'grad_norm': 0.2855136328850759, 'learning_rate': 3.530526932541045e-06, 'epoch': 0.61}
61%|██████ | 13416/22095 [22:43:32<15:22:56, 6.38s/it] {'loss': 0.4689, 'grad_norm': 0.28702072041527865, 'learning_rate': 3.529826394695347e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13417/22095 [22:43:35<13:07:07, 5.44s/it] {'loss': 0.2641, 'grad_norm': 1.0163123512406775, 'learning_rate': 3.529125888439164e-06, 'epoch': 0.61}
61%|██████ | 13418/22095 [22:43:45<15:56:22, 6.61s/it] {'loss': 0.4725, 'grad_norm': 0.3417303228950468, 'learning_rate': 3.5284254137875472e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13419/22095 [22:43:48<13:29:24, 5.60s/it] {'loss': 0.3404, 'grad_norm': 0.6336573660997611, 'learning_rate': 3.5277249707555507e-06, 'epoch': 0.61}
61%|██████ | 13420/22095 [22:43:57<16:02:44, 6.66s/it] {'loss': 0.4588, 'grad_norm': 0.27356633073337205, 'learning_rate': 3.527024559358221e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13421/22095 [22:44:02<14:31:07, 6.03s/it] {'loss': 0.3182, 'grad_norm': 2.292413323718286, 'learning_rate': 3.5263241796106097e-06, 'epoch': 0.61}
61%|██████ | 13422/22095 [22:44:06<13:11:38, 5.48s/it] {'loss': 0.294, 'grad_norm': 0.6645893638135631, 'learning_rate': 3.525623831527767e-06, 'epoch': 0.61}
61%|██████ | 13423/22095 [22:44:09<11:29:21, 4.77s/it] {'loss': 0.3234, 'grad_norm': 0.6186388894118936, 'learning_rate': 3.5249235151247398e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (58527 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13424/22095 [22:44:13<10:52:11, 4.51s/it] {'loss': 0.3798, 'grad_norm': 0.633921444031, 'learning_rate': 3.5242232304165736e-06, 'epoch': 0.61}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
61%|██████ | 13425/22095 [22:44:16<10:06:54, 4.20s/it] {'loss': 0.3376, 'grad_norm': 0.6255478093227299, 'learning_rate': 3.5235229774183217e-06, 'epoch': 0.61}
61%|██████ | 13426/22095 [22:44:20<9:49:21, 4.08s/it] {'loss': 0.298, 'grad_norm': 0.6235753491898307, 'learning_rate': 3.522822756145022e-06, 'epoch': 0.61}
61%|██████ | 13427/22095 [22:44:24<9:28:30, 3.94s/it] {'loss': 0.3151, 'grad_norm': 0.6644382198804875, 'learning_rate': 3.5221225666117272e-06, 'epoch': 0.61}
61%|██████ | 13428/22095 [22:44:27<8:48:29, 3.66s/it] {'loss': 0.2818, 'grad_norm': 0.6755344353780085, 'learning_rate': 3.52142240883348e-06, 'epoch': 0.61}
61%|██████ | 13429/22095 [22:44:31<8:55:48, 3.71s/it] {'loss': 0.319, 'grad_norm': 0.6135967249537492, 'learning_rate': 3.520722282825323e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (41745 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63329 > 40960).
Running this sequence through the model will result in indexing errors
61%|██████ | 13430/22095 [22:44:34<8:35:23, 3.57s/it] {'loss': 0.314, 'grad_norm': 0.6180751455692161, 'learning_rate': 3.520022188602299e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358101 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24812, 'image': 'vrdu_table_final_2/astro-ph.CO/4649a5bd-cc89-442d-9ef4-e078aac66448.png', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\eftcamb basis\\end{tabular}\n```"}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (92914 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13431/22095 [22:44:37<8:27:55, 3.52s/it] {'loss': 0.3285, 'grad_norm': 0.6419498870555718, 'learning_rate': 3.519322126179455e-06, 'epoch': 0.61}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [278, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8424101 in VC:s3://internvl-moe-sft-data/. Exception: Image size [278, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 66909, 'image': 'vrdu_texteq/astro-ph.CO/0e758ef5-f039-49cc-aa51-f34f64eadf19.png', 'image_wh': [[278, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'with $a$ the scale factor.'}]}
61%|██████ | 13432/22095 [22:44:46<12:03:55, 5.01s/it] {'loss': 0.4893, 'grad_norm': 0.34575186377873107, 'learning_rate': 3.518622095571831e-06, 'epoch': 0.61}
61%|██████ | 13433/22095 [22:44:49<10:46:04, 4.48s/it] {'loss': 0.3333, 'grad_norm': 0.6326462699830524, 'learning_rate': 3.517922096794468e-06, 'epoch': 0.61}
61%|██████ | 13434/22095 [22:44:52<10:02:06, 4.17s/it] {'loss': 0.3326, 'grad_norm': 0.6098311111428941, 'learning_rate': 3.5172221298624067e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████ | 13435/22095 [22:45:03<14:18:45, 5.95s/it] {'loss': 0.4515, 'grad_norm': 0.2651308468352901, 'learning_rate': 3.516522194790689e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366680 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33426, 'image': 'vrdu_table_final_2/astro-ph.CO/57cb199a-66c4-4480-9979-12866ac10d2d.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
61%|██████ | 13436/22095 [22:45:13<17:45:16, 7.38s/it] {'loss': 0.456, 'grad_norm': 0.27729441427110796, 'learning_rate': 3.5158222915943524e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (68814 > 40960). Running this sequence through the model will result in indexing errors
61%|██████ | 13437/22095 [22:45:17<15:25:58, 6.42s/it] {'loss': 0.3203, 'grad_norm': 0.6091409290588269, 'learning_rate': 3.5151224202884364e-06, 'epoch': 0.61}
61%|██████ | 13438/22095 [22:45:27<17:24:48, 7.24s/it] {'loss': 0.4623, 'grad_norm': 0.2652121404589406, 'learning_rate': 3.5144225808879806e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████ | 13439/22095 [22:45:30<14:54:41, 6.20s/it] {'loss': 0.2576, 'grad_norm': 0.6086578023770722, 'learning_rate': 3.513722773408018e-06, 'epoch': 0.61}
61%|██████ | 13440/22095 [22:45:35<13:32:46, 5.63s/it] {'loss': 0.3113, 'grad_norm': 0.7042375101393927, 'learning_rate': 3.51302299786359e-06, 'epoch': 0.61}
61%|██████ | 13441/22095 [22:45:38<11:58:38, 4.98s/it] {'loss': 0.3279, 'grad_norm': 0.6347651648407661, 'learning_rate': 3.512323254269732e-06, 'epoch': 0.61}
61%|██████ | 13441/22095 [22:45:38<11:58:38,
4.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13442/22095 [22:45:42<11:14:05, 4.67s/it] {'loss': 0.2872, 'grad_norm': 0.6096456656656934, 'learning_rate': 3.5116235426414767e-06, 'epoch': 0.61} 61%|██████ | 13442/22095 [22:45:42<11:14:05, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63180 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13443/22095 [22:45:45<10:14:48, 4.26s/it] {'loss': 0.2631, 'grad_norm': 0.6177301460681318, 'learning_rate': 3.51092386299386e-06, 'epoch': 0.61} 61%|██████ | 13443/22095 [22:45:45<10:14:48, 4.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8373193 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
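Annotation: the recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (63180 > 40960)` warnings mean some samples tokenize past the model's context limit; the tokenizer only warns, and the overflow must be handled before batching. A minimal sketch of a length gate, assuming token ids are already available as a Python list; `MAX_LEN` mirrors the 40960 limit in the log, and the helper name is illustrative.

```python
# Guard against over-length samples before they reach the model: truncate the
# token-id list to the context limit and report whether truncation happened,
# so the caller can log or skip the sample. MAX_LEN mirrors the 40960 limit
# reported in the warnings above.
MAX_LEN = 40960

def gate_token_ids(token_ids: list[int], max_len: int = MAX_LEN) -> tuple[list[int], bool]:
    """Truncate token ids to max_len; return (ids, was_truncated)."""
    if len(token_ids) <= max_len:
        return token_ids, False
    return token_ids[:max_len], True
```

The later `Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44215 > 40960) ... Truncating` message suggests the trainer eventually does something similar for packed batches.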
Problematic sample: {'id': 39966, 'image': 'vrdu_table_final_2/astro-ph.CO/eb17cb58-b37a-41eb-8ea5-1b0fe15cddcc.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]} 61%|██████ | 13444/22095 [22:45:50<10:08:56, 4.22s/it] {'loss': 0.3088, 'grad_norm': 0.6562325905779547, 'learning_rate': 3.5102242153419164e-06, 'epoch': 0.61} 61%|██████ | 13444/22095 [22:45:50<10:08:56, 4.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13445/22095 [22:45:57<12:36:59, 5.25s/it] {'loss': 0.4631, 'grad_norm': 0.3103744888178165, 'learning_rate': 3.50952459970068e-06, 'epoch': 0.61} 61%|██████ | 13445/22095 [22:45:57<12:36:59, 5.25s/it] 61%|██████ | 13446/22095 [22:46:01<11:29:35, 4.78s/it] {'loss': 0.3588, 'grad_norm': 0.6626718082982688, 'learning_rate': 3.5088250160851817e-06, 'epoch': 0.61} 61%|██████ | 13446/22095 [22:46:01<11:29:35, 4.78s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8959455 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10290, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 2\nB. 3\nC. 10\nD. 
5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 61%|██████ | 13447/22095 [22:46:04<10:32:48, 4.39s/it] {'loss': 0.3407, 'grad_norm': 0.6568108014435411, 'learning_rate': 3.5081254645104525e-06, 'epoch': 0.61} 61%|██████ | 13447/22095 [22:46:04<10:32:48, 4.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13448/22095 [22:46:13<13:47:35, 5.74s/it] {'loss': 0.4779, 'grad_norm': 0.27975503482648345, 'learning_rate': 3.507425944991529e-06, 'epoch': 0.61} 61%|██████ | 13448/22095 [22:46:13<13:47:35, 5.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46801 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49796 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13449/22095 [22:46:17<12:24:28, 5.17s/it] {'loss': 0.3334, 'grad_norm': 0.6176299986228942, 'learning_rate': 3.506726457543434e-06, 'epoch': 0.61} 61%|██████ | 13449/22095 [22:46:17<12:24:28, 5.17s/it] 61%|██████ | 13450/22095 [22:46:20<10:58:16, 4.57s/it] {'loss': 0.2849, 'grad_norm': 0.701602134769069, 'learning_rate': 3.5060270021812027e-06, 'epoch': 0.61} 61%|██████ | 13450/22095 [22:46:20<10:58:16, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13451/22095 [22:46:30<14:29:54, 6.04s/it] {'loss': 0.4584, 'grad_norm': 0.28129388479860296, 'learning_rate': 3.5053275789198634e-06, 'epoch': 0.61} 61%|██████ | 13451/22095 [22:46:30<14:29:54, 6.04s/it] 61%|██████ | 13452/22095 [22:46:33<12:40:58, 5.28s/it] {'loss': 0.3203, 'grad_norm': 0.7571593717790948, 'learning_rate': 3.5046281877744424e-06, 'epoch': 0.61} 61%|██████ | 13452/22095 [22:46:33<12:40:58, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this 
model (43183 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49028 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43779 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46292 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13453/22095 [22:46:37<11:19:17, 4.72s/it] {'loss': 0.2742, 'grad_norm': 0.6840506216804946, 'learning_rate': 3.503928828759969e-06, 'epoch': 0.61} 61%|██████ | 13453/22095 [22:46:37<11:19:17, 4.72s/it] 61%|██████ | 13454/22095 [22:46:40<10:00:55, 4.17s/it] {'loss': 0.3317, 'grad_norm': 0.6572465584720832, 'learning_rate': 3.503229501891472e-06, 'epoch': 0.61} 61%|██████ | 13454/22095 [22:46:40<10:00:55, 4.17s/it] 61%|██████ | 13455/22095 [22:46:43<9:46:42, 4.07s/it] {'loss': 0.3169, 'grad_norm': 0.6389350839603637, 'learning_rate': 3.5025302071839746e-06, 'epoch': 0.61} 61%|██████ | 13455/22095 [22:46:43<9:46:42, 4.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13456/22095 [22:46:48<9:52:10, 4.11s/it] {'loss': 0.3028, 'grad_norm': 0.5981348804029151, 'learning_rate': 3.501830944652504e-06, 'epoch': 0.61} 61%|██████ | 13456/22095 [22:46:48<9:52:10, 4.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13457/22095 [22:46:57<13:45:31, 5.73s/it] {'loss': 0.4633, 'grad_norm': 0.26849945214336984, 'learning_rate': 3.5011317143120845e-06, 'epoch': 0.61} 61%|██████ | 13457/22095 [22:46:57<13:45:31, 5.73s/it] 61%|██████ | 13458/22095 [22:47:01<12:06:50, 5.05s/it] {'loss': 0.3132, 'grad_norm': 
0.7653660823165149, 'learning_rate': 3.5004325161777437e-06, 'epoch': 0.61} 61%|██████ | 13458/22095 [22:47:01<12:06:50, 5.05s/it] 61%|██████ | 13459/22095 [22:47:05<11:27:51, 4.78s/it] {'loss': 0.3269, 'grad_norm': 0.6098125186287409, 'learning_rate': 3.4997333502644994e-06, 'epoch': 0.61} 61%|██████ | 13459/22095 [22:47:05<11:27:51, 4.78s/it] 61%|██████ | 13460/22095 [22:47:08<10:44:37, 4.48s/it] {'loss': 0.3019, 'grad_norm': 0.5955342692604938, 'learning_rate': 3.499034216587379e-06, 'epoch': 0.61} 61%|██████ | 13460/22095 [22:47:09<10:44:37, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13461/22095 [22:47:17<13:49:57, 5.77s/it] {'loss': 0.4498, 'grad_norm': 0.28986142679869115, 'learning_rate': 3.4983351151614043e-06, 'epoch': 0.61} 61%|██████ | 13461/22095 [22:47:17<13:49:57, 5.77s/it] 61%|██████ | 13462/22095 [22:47:21<12:22:30, 5.16s/it] {'loss': 0.304, 'grad_norm': 0.6228379080550507, 'learning_rate': 3.4976360460015953e-06, 'epoch': 0.61} 61%|██████ | 13462/22095 [22:47:21<12:22:30, 5.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13463/22095 [22:47:24<10:39:39, 4.45s/it] {'loss': 0.3011, 'grad_norm': 0.6021659146296512, 'learning_rate': 3.496937009122972e-06, 'epoch': 0.61} 61%|██████ | 13463/22095 [22:47:24<10:39:39, 4.45s/it] 61%|██████ | 13464/22095 [22:47:27<10:02:27, 4.19s/it] {'loss': 0.3529, 'grad_norm': 1.0426986399623595, 'learning_rate': 3.4962380045405585e-06, 'epoch': 0.61} 61%|██████ | 13464/22095 [22:47:27<10:02:27, 4.19s/it] 61%|██████ | 13465/22095 [22:47:49<22:31:27, 9.40s/it] {'loss': 0.3446, 'grad_norm': 0.7277310854494984, 'learning_rate': 3.4955390322693704e-06, 'epoch': 0.61} 61%|██████ | 13465/22095 [22:47:49<22:31:27, 9.40s/it] 61%|██████ | 13466/22095 [22:47:52<17:41:55, 7.38s/it] {'loss': 0.2836, 'grad_norm': 0.6511462469867378, 'learning_rate': 3.4948400923244286e-06, 'epoch': 0.61} 61%|██████ | 
13466/22095 [22:47:52<17:41:55, 7.38s/it] 61%|██████ | 13467/22095 [22:47:55<14:37:04, 6.10s/it] {'loss': 0.3008, 'grad_norm': 0.6410223484258175, 'learning_rate': 3.4941411847207505e-06, 'epoch': 0.61} 61%|██████ | 13467/22095 [22:47:55<14:37:04, 6.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8949351 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 186, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 61%|██████ | 13468/22095 [22:47:58<12:43:20, 5.31s/it] {'loss': 0.3549, 'grad_norm': 0.6357391756745651, 'learning_rate': 3.4934423094733516e-06, 'epoch': 0.61} 61%|██████ | 13468/22095 [22:47:58<12:43:20, 5.31s/it] 61%|██████ | 13469/22095 [22:48:01<11:13:48, 4.69s/it] {'loss': 0.3126, 'grad_norm': 0.8127095590421107, 'learning_rate': 3.492743466597252e-06, 'epoch': 0.61} 61%|██████ | 13469/22095 [22:48:01<11:13:48, 4.69s/it] 61%|██████ | 13470/22095 [22:48:05<10:13:52, 4.27s/it] {'loss': 0.3239, 'grad_norm': 0.6054784960538481, 'learning_rate': 3.4920446561074673e-06, 'epoch': 0.61} 61%|██████ | 13470/22095 [22:48:05<10:13:52, 4.27s/it] 61%|██████ | 13471/22095 [22:48:08<9:41:04, 4.04s/it] {'loss': 0.3085, 'grad_norm': 0.6112478540856312, 'learning_rate': 3.49134587801901e-06, 'epoch': 0.61} 61%|██████ | 13471/22095 [22:48:08<9:41:04, 4.04s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13472/22095 [22:48:29<21:47:02, 9.09s/it] {'loss': 0.2751, 'grad_norm': 0.655204505443951, 'learning_rate': 3.4906471323468955e-06, 'epoch': 0.61} 61%|██████ | 13472/22095 [22:48:29<21:47:02, 9.09s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887884 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. 
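Annotation: the `Rank 0: Number of image tokens 0 does not match number of images 1 ... Fixed image tokens in the conversation` messages indicate the loader reconciles image placeholders in the first user turn with the number of attached images. The repo's actual fix in `data_qwen_2.py` is not visible in this log; the sketch below is a plausible reconstruction, and the `<image>` placeholder string is an assumption.

```python
# Reconcile the number of <image> placeholders in a user message with the
# number of attached images: prepend missing placeholders, strip surplus ones.
# The '<image>' placeholder string is an assumption; the trainer's real fix
# is not shown in the log.
PLACEHOLDER = "<image>"

def fix_image_tokens(first_user_msg: str, n_images: int) -> str:
    """Make the count of placeholders in the message equal n_images."""
    n_found = first_user_msg.count(PLACEHOLDER)
    if n_found < n_images:
        # Prepend the missing placeholders, one per line, before the text.
        return (PLACEHOLDER + "\n") * (n_images - n_found) + first_user_msg
    for _ in range(n_found - n_images):
        # Drop surplus placeholders from the front, eating a trailing newline
        # when one is present so the text does not gain blank lines.
        if PLACEHOLDER + "\n" in first_user_msg:
            first_user_msg = first_user_msg.replace(PLACEHOLDER + "\n", "", 1)
        else:
            first_user_msg = first_user_msg.replace(PLACEHOLDER, "", 1)
    return first_user_msg
```

This matches both directions seen in the log: missing placeholders (`0` tokens for `1` image) and duplicated ones (`2` tokens for `1` image).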
Problematic sample: {'id': 11037, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 2\nB. 0.5\nC. 1\nD. 1.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 61%|██████ | 13473/22095 [22:48:33<18:18:08, 7.64s/it] {'loss': 0.2883, 'grad_norm': 0.6170945340887124, 'learning_rate': 3.4899484191061394e-06, 'epoch': 0.61} 61%|██████ | 13473/22095 [22:48:33<18:18:08, 7.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13474/22095 [22:48:42<18:54:34, 7.90s/it] {'loss': 0.4773, 'grad_norm': 0.3074866732977803, 'learning_rate': 3.4892497383117553e-06, 'epoch': 0.61} 61%|██████ | 13474/22095 [22:48:42<18:54:34, 7.90s/it] 61%|██████ | 13475/22095 [22:48:46<16:19:12, 6.82s/it] {'loss': 0.3611, 'grad_norm': 0.6455642516141017, 'learning_rate': 3.488551089978753e-06, 'epoch': 0.61} 61%|██████ | 13475/22095 [22:48:46<16:19:12, 6.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64419 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (182905 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13476/22095 [22:48:49<13:41:45, 5.72s/it] {'loss': 0.3316, 'grad_norm': 0.6488672160053081, 'learning_rate': 3.487852474122145e-06, 'epoch': 0.61} 61%|██████ | 13476/22095 [22:48:49<13:41:45, 5.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (120359 > 40960). 
Running this sequence through the model will result in indexing errors Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [220, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8532197 in VC:s3://internvl-moe-sft-data/. Exception: Image size [220, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 149809, 'image': 'vrdu_texteq/astro-ph.CO/e05c50e4-cfff-429d-827b-126ddba4c9d0.png', 'image_wh': [[220, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': '.\n$M_d$ is derived as'}]} 61%|██████ | 13477/22095 [22:48:53<11:59:36, 5.01s/it] {'loss': 0.3207, 'grad_norm': 0.5809931280865803, 'learning_rate': 3.487153890756946e-06, 'epoch': 0.61} 61%|██████ | 13477/22095 [22:48:53<11:59:36, 5.01s/it] 61%|██████ | 13478/22095 [22:48:57<11:21:32, 4.75s/it] {'loss': 0.3279, 'grad_norm': 0.6430702923168584, 'learning_rate': 3.4864553398981606e-06, 'epoch': 0.61} 61%|██████ | 13478/22095 [22:48:57<11:21:32, 4.75s/it] 61%|██████ | 13479/22095 [22:49:00<10:16:32, 4.29s/it] {'loss': 0.2798, 'grad_norm': 0.6790153937287134, 'learning_rate': 3.4857568215608024e-06, 'epoch': 0.61} 61%|██████ | 13479/22095 [22:49:00<10:16:32, 4.29s/it] 61%|██████ | 13480/22095 [22:49:04<10:01:33, 4.19s/it] {'loss': 0.2876, 'grad_norm': 0.5692761648336583, 'learning_rate': 3.4850583357598805e-06, 'epoch': 0.61} 61%|██████ | 13480/22095 [22:49:04<10:01:33, 4.19s/it] 61%|██████ | 13481/22095 [22:49:07<9:31:39, 3.98s/it] {'loss': 0.2788, 'grad_norm': 2.1181254113615373, 
'learning_rate': 3.4843598825104013e-06, 'epoch': 0.61} 61%|██████ | 13481/22095 [22:49:07<9:31:39, 3.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047590 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 61%|██████ | 13482/22095 [22:49:10<8:46:56, 3.67s/it] {'loss': 0.3434, 'grad_norm': 0.6191110811683334, 'learning_rate': 3.483661461827372e-06, 'epoch': 0.61} 61%|██████ | 13482/22095 [22:49:10<8:46:56, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59696 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67036 > 40960). 
Running this sequence through the model will result in indexing errors 61%|██████ | 13483/22095 [22:49:14<8:37:47, 3.61s/it] {'loss': 0.3268, 'grad_norm': 0.6159076449318271, 'learning_rate': 3.482963073725803e-06, 'epoch': 0.61} 61%|██████ | 13483/22095 [22:49:14<8:37:47, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13484/22095 [22:49:23<12:45:13, 5.33s/it] {'loss': 0.457, 'grad_norm': 0.29597245858900834, 'learning_rate': 3.482264718220697e-06, 'epoch': 0.61} 61%|██████ | 13484/22095 [22:49:23<12:45:13, 5.33s/it] 61%|██████ | 13485/22095 [22:49:27<11:23:45, 4.76s/it] {'loss': 0.2979, 'grad_norm': 0.6516268351726187, 'learning_rate': 3.481566395327062e-06, 'epoch': 0.61} 61%|██████ | 13485/22095 [22:49:27<11:23:45, 4.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13486/22095 [22:49:30<10:05:02, 4.22s/it] {'loss': 0.3135, 'grad_norm': 0.5816022423269239, 'learning_rate': 3.480868105059899e-06, 'epoch': 0.61} 61%|██████ | 13486/22095 [22:49:30<10:05:02, 4.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047720 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1cm'}]} 61%|██████ | 13487/22095 [22:49:33<9:14:54, 3.87s/it] {'loss': 0.2927, 'grad_norm': 0.5999535442164554, 'learning_rate': 3.4801698474342176e-06, 'epoch': 0.61} 61%|██████ | 13487/22095 [22:49:33<9:14:54, 3.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67823 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69470 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13488/22095 [22:49:35<8:27:20, 3.54s/it] {'loss': 0.316, 'grad_norm': 0.7022729924679137, 'learning_rate': 3.479471622465017e-06, 'epoch': 0.61} 61%|██████ | 13488/22095 [22:49:35<8:27:20, 3.54s/it] 61%|██████ | 13489/22095 [22:49:39<8:31:17, 3.56s/it] {'loss': 0.3041, 'grad_norm': 0.6212432678891955, 'learning_rate': 3.478773430167302e-06, 'epoch': 0.61} 61%|██████ | 13489/22095 [22:49:39<8:31:17, 3.56s/it] 61%|██████ | 13490/22095 [22:49:42<8:03:42, 3.37s/it] {'loss': 0.2661, 'grad_norm': 0.6285515117991046, 'learning_rate': 3.478075270556075e-06, 'epoch': 0.61} 61%|██████ | 13490/22095 [22:49:42<8:03:42, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13491/22095 [22:49:49<10:56:30, 4.58s/it] {'loss': 0.4593, 'grad_norm': 0.26767567328925884, 'learning_rate': 3.4773771436463346e-06, 'epoch': 0.61} 61%|██████ | 13491/22095 [22:49:49<10:56:30, 4.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65445 > 40960). 
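Annotation: many of the rejected samples are thin formula/table/geometry crops (e.g. 157×19, 223×23) that are otherwise legitimate training data. An alternative to dropping them is to upscale the short side to the 28-pixel minimum before preprocessing. This is a sketch of the size arithmetic only (the resulting `(w, h)` would be passed to an image `resize` call); whether upscaling tiny crops is acceptable is a data-quality judgment this log does not settle.

```python
# Compute an aspect-preserving upscale target for an image whose short side is
# below the loader's minimum. Pure arithmetic: feed the returned (w, h) to the
# image library's resize. MIN_SIDE mirrors the "Minimum size is 28" error.
MIN_SIDE = 28

def upscale_wh(w: int, h: int, min_side: int = MIN_SIDE) -> tuple[int, int]:
    """Return (w, h) with the short side raised to min_side, aspect kept."""
    short = min(w, h)
    if short >= min_side:
        return w, h            # already large enough; leave untouched
    scale = min_side / short
    return (max(min_side, round(w * scale)),
            max(min_side, round(h * scale)))
```

For the 25×23 table crop above this yields 30×28, which clears the minimum while keeping the aspect ratio within rounding error.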
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41275 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44215 > 40960) for 4 sample(s). Truncating to 728 with 1 samples. 61%|██████ | 13492/22095 [22:49:54<10:38:42, 4.45s/it] {'loss': 0.3232, 'grad_norm': 0.6986766791328546, 'learning_rate': 3.4766790494530824e-06, 'epoch': 0.61} 61%|██████ | 13492/22095 [22:49:54<10:38:42, 4.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (105882 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13493/22095 [22:49:57<10:08:48, 4.25s/it] {'loss': 0.3202, 'grad_norm': 0.6301950786058406, 'learning_rate': 3.47598098799132e-06, 'epoch': 0.61} 61%|██████ | 13493/22095 [22:49:57<10:08:48, 4.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13494/22095 [22:50:19<22:30:15, 9.42s/it] {'loss': 0.3065, 'grad_norm': 0.6723265814968592, 'learning_rate': 3.475282959276045e-06, 'epoch': 0.61} 61%|██████ | 13494/22095 [22:50:19<22:30:15, 9.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54229 > 40960). 
Running this sequence through the model will result in indexing errors 61%|██████ | 13495/22095 [22:50:23<18:36:26, 7.79s/it] {'loss': 0.3619, 'grad_norm': 0.7080890105682609, 'learning_rate': 3.4745849633222566e-06, 'epoch': 0.61} 61%|██████ | 13495/22095 [22:50:23<18:36:26, 7.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13496/22095 [22:50:48<31:05:14, 13.01s/it] {'loss': 0.4856, 'grad_norm': 0.2851715591834395, 'learning_rate': 3.4738870001449533e-06, 'epoch': 0.61} 61%|██████ | 13496/22095 [22:50:48<31:05:14, 13.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55727 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13497/22095 [22:50:51<24:13:48, 10.15s/it] {'loss': 0.3401, 'grad_norm': 0.6202568147808254, 'learning_rate': 3.4731890697591297e-06, 'epoch': 0.61} 61%|██████ | 13497/22095 [22:50:51<24:13:48, 10.15s/it] 61%|██████ | 13498/22095 [22:51:13<32:25:37, 13.58s/it] {'loss': 0.2923, 'grad_norm': 0.6091828279663052, 'learning_rate': 3.472491172179784e-06, 'epoch': 0.61} 61%|██████ | 13498/22095 [22:51:13<32:25:37, 13.58s/it] 61%|██████ | 13499/22095 [22:51:34<38:04:30, 15.95s/it] {'loss': 0.324, 'grad_norm': 0.6171615737326103, 'learning_rate': 3.471793307421913e-06, 'epoch': 0.61} 61%|██████ | 13499/22095 [22:51:35<38:04:30, 15.95s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13500/22095 [22:51:41<31:30:53, 13.20s/it] {'loss': 0.4887, 'grad_norm': 0.2717845790814612, 'learning_rate': 3.4710954755005087e-06, 'epoch': 0.61} 61%|██████ | 13500/22095 [22:51:41<31:30:53, 13.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13501/22095 [22:51:50<28:07:27, 11.78s/it] {'loss': 0.48, 'grad_norm': 0.2640603581855504, 'learning_rate': 3.470397676430567e-06, 'epoch': 0.61} 61%|██████ | 13501/22095 
[22:51:50<28:07:27, 11.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 61%|██████ | 13502/22095 [22:51:53<22:11:11, 9.29s/it] {'loss': 0.3468, 'grad_norm': 0.6644426756295154, 'learning_rate': 3.469699910227082e-06, 'epoch': 0.61} 61%|██████ | 13502/22095 [22:51:53<22:11:11, 9.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (82028 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44056 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75187 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13503/22095 [22:52:36<45:47:10, 19.18s/it] {'loss': 0.467, 'grad_norm': 0.25364988424384577, 'learning_rate': 3.4690021769050462e-06, 'epoch': 0.61} 61%|██████ | 13503/22095 [22:52:36<45:47:10, 19.18s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (48838 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96755 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65779 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61543 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110311 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99632 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54268 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56581 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13504/22095 [22:52:39<34:49:56, 14.60s/it] {'loss': 0.2898, 'grad_norm': 0.6486677182552096, 'learning_rate': 3.4683044764794516e-06, 'epoch': 0.61} 61%|██████ | 13504/22095 [22:52:39<34:49:56, 14.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13505/22095 [22:52:49<31:09:55, 13.06s/it] {'loss': 0.4651, 'grad_norm': 0.26813038137037226, 'learning_rate': 3.4676068089652883e-06, 'epoch': 0.61} 61%|██████ | 13505/22095 [22:52:49<31:09:55, 13.06s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (62081 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47139 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90420 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58812 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13506/22095 [22:52:52<24:07:06, 10.11s/it] {'loss': 0.3387, 'grad_norm': 0.671097206847934, 'learning_rate': 3.466909174377551e-06, 'epoch': 0.61} 61%|██████ | 13506/22095 [22:52:52<24:07:06, 10.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60060 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43655 > 40960). 
Running this sequence through the model will result in indexing errors 61%|██████ | 13507/22095 [22:52:55<19:11:39, 8.05s/it] {'loss': 0.2821, 'grad_norm': 0.6126280949793477, 'learning_rate': 3.466211572731224e-06, 'epoch': 0.61} 61%|██████ | 13507/22095 [22:52:55<19:11:39, 8.05s/it] 61%|██████ | 13508/22095 [22:53:35<41:33:06, 17.42s/it] {'loss': 0.3244, 'grad_norm': 0.5705294614602041, 'learning_rate': 3.465514004041301e-06, 'epoch': 0.61} 61%|██████ | 13508/22095 [22:53:35<41:33:06, 17.42s/it] 61%|██████ | 13509/22095 [22:54:15<57:57:20, 24.30s/it] {'loss': 0.3043, 'grad_norm': 0.5805375561356275, 'learning_rate': 3.4648164683227702e-06, 'epoch': 0.61} 61%|██████ | 13509/22095 [22:54:15<57:57:20, 24.30s/it] 61%|██████ | 13510/22095 [22:54:36<55:49:19, 23.41s/it] {'loss': 0.2985, 'grad_norm': 0.6295482157076951, 'learning_rate': 3.464118965590617e-06, 'epoch': 0.61} 61%|██████ | 13510/22095 [22:54:36<55:49:19, 23.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946050 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 69203, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 
8'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]} 61%|██████ | 13511/22095 [22:54:46<45:47:50, 19.21s/it] {'loss': 0.4726, 'grad_norm': 0.3022925327555956, 'learning_rate': 3.46342149585983e-06, 'epoch': 0.61} 61%|██████ | 13511/22095 [22:54:46<45:47:50, 19.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54335 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97107 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70879 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (130219 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13512/22095 [22:55:14<52:01:48, 21.82s/it] {'loss': 0.4703, 'grad_norm': 0.27754398433310995, 'learning_rate': 3.462724059145397e-06, 'epoch': 0.61} 61%|██████ | 13512/22095 [22:55:14<52:01:48, 21.82s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 61%|██████ | 13513/22095 [22:55:55<66:04:29, 27.72s/it] {'loss': 0.3484, 'grad_norm': 0.6260772278890104, 'learning_rate': 3.4620266554623016e-06, 'epoch': 0.61} 61%|██████ | 13513/22095 [22:55:55<66:04:29, 27.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13514/22095 [22:56:05<53:19:15, 22.37s/it] {'loss': 0.4927, 'grad_norm': 0.32133429856181145, 'learning_rate': 3.4613292848255307e-06, 'epoch': 0.61} 61%|██████ | 13514/22095 [22:56:05<53:19:15, 22.37s/it] 61%|██████ | 13515/22095 [22:56:13<43:11:38, 18.12s/it] {'loss': 0.4865, 'grad_norm': 
0.2931450993845942, 'learning_rate': 3.460631947250066e-06, 'epoch': 0.61} 61%|██████ | 13515/22095 [22:56:13<43:11:38, 18.12s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 61%|██████ | 13516/22095 [22:56:18<33:33:09, 14.08s/it] {'loss': 0.326, 'grad_norm': 0.6189908854062645, 'learning_rate': 3.459934642750895e-06, 'epoch': 0.61} 61%|██████ | 13516/22095 [22:56:18<33:33:09, 14.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60818 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13517/22095 [22:56:39<38:54:07, 16.33s/it] {'loss': 0.2923, 'grad_norm': 0.6752200443544591, 'learning_rate': 3.4592373713429984e-06, 'epoch': 0.61} 61%|██████ | 13517/22095 [22:56:39<38:54:07, 16.33s/it] 61%|██████ | 13518/22095 [22:57:05<45:19:51, 19.03s/it] {'loss': 0.3386, 'grad_norm': 0.5882537092542902, 'learning_rate': 3.4585401330413574e-06, 'epoch': 0.61} 61%|██████ | 13518/22095 [22:57:05<45:19:51, 19.03s/it] 61%|██████ | 13519/22095 [22:57:08<34:07:04, 14.32s/it] {'loss': 0.299, 'grad_norm': 0.936107982255461, 'learning_rate': 3.4578429278609566e-06, 'epoch': 0.61} 61%|██████ | 13519/22095 [22:57:08<34:07:04, 14.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13520/22095 [22:57:18<30:40:04, 12.88s/it] {'loss': 0.4628, 'grad_norm': 0.3321527606429436, 'learning_rate': 3.4571457558167727e-06, 'epoch': 0.61} 61%|██████ | 13520/22095 [22:57:18<30:40:04, 12.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13521/22095 [22:57:21<23:59:24, 
10.07s/it] {'loss': 0.3115, 'grad_norm': 0.7063148875935, 'learning_rate': 3.4564486169237888e-06, 'epoch': 0.61} 61%|██████ | 13521/22095 [22:57:21<23:59:24, 10.07s/it] 61%|██████ | 13522/22095 [22:57:25<19:23:14, 8.14s/it] {'loss': 0.3187, 'grad_norm': 0.6443386328125476, 'learning_rate': 3.4557515111969843e-06, 'epoch': 0.61} 61%|██████ | 13522/22095 [22:57:25<19:23:14, 8.14s/it] 61%|██████ | 13523/22095 [22:57:47<29:27:19, 12.37s/it] {'loss': 0.3011, 'grad_norm': 0.6155674561221551, 'learning_rate': 3.4550544386513364e-06, 'epoch': 0.61} 61%|██████ | 13523/22095 [22:57:47<29:27:19, 12.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99853 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46596 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13524/22095 [22:58:47<63:16:24, 26.58s/it] {'loss': 0.3257, 'grad_norm': 0.6732227821436269, 'learning_rate': 3.4543573993018225e-06, 'epoch': 0.61} 61%|██████ | 13524/22095 [22:58:47<63:16:24, 26.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13525/22095 [22:58:56<51:02:29, 21.44s/it] {'loss': 0.4452, 'grad_norm': 0.2799539037248683, 'learning_rate': 3.453660393163424e-06, 'epoch': 0.61} 61%|██████ | 13525/22095 [22:58:56<51:02:29, 21.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████ | 13526/22095 [22:59:00<38:11:12, 16.04s/it] {'loss': 0.3111, 'grad_norm': 0.5963331242645898, 'learning_rate': 3.452963420251112e-06, 'epoch': 0.61} 61%|██████ | 13526/22095 [22:59:00<38:11:12, 16.04s/it] 61%|██████ | 13527/22095 [22:59:38<54:23:24, 22.85s/it] {'loss': 0.3555, 'grad_norm': 0.5948844068033212, 'learning_rate': 3.4522664805798643e-06, 'epoch': 0.61} 61%|██████ | 13527/22095 
[22:59:38<54:23:24, 22.85s/it]VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/standard/test_251_image_1.png 2025-08-28 14:57:37.172769 load time: 1043.35 ms 61%|██████ | 13528/22095 [23:00:38<80:25:15, 33.79s/it] {'loss': 0.3022, 'grad_norm': 0.5394219952798536, 'learning_rate': 3.451569574164658e-06, 'epoch': 0.61} 61%|██████ | 13528/22095 [23:00:38<80:25:15, 33.79s/it] 61%|██████ | 13529/22095 [23:01:56<112:17:57, 47.20s/it] {'loss': 0.2994, 'grad_norm': 0.631473684109706, 'learning_rate': 3.4508727010204663e-06, 'epoch': 0.61} 61%|██████ | 13529/22095 [23:01:56<112:17:57, 47.20s/it] 61%|██████ | 13530/22095 [23:02:36<107:14:07, 45.07s/it] {'loss': 0.3709, 'grad_norm': 0.6854496656754867, 'learning_rate': 3.4501758611622606e-06, 'epoch': 0.61} 61%|██████ | 13530/22095 [23:02:36<107:14:07, 45.07s/it] 61%|██████ | 13531/22095 [23:04:16<146:34:04, 61.61s/it] {'loss': 0.3225, 'grad_norm': 0.6196837073329371, 'learning_rate': 3.449479054605016e-06, 'epoch': 0.61} 61%|██████ | 13531/22095 [23:04:17<146:34:04, 61.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████ | 13532/22095 [23:04:26<109:02:50, 45.84s/it] {'loss': 0.4733, 'grad_norm': 0.2770645560779661, 'learning_rate': 3.448782281363706e-06, 'epoch': 0.61} 61%|██████ | 13532/22095 [23:04:26<109:02:50, 45.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41162 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48188 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54199 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42238 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70206 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (105130 > 40960). Running this sequence through the model will result in indexing errors 61%|██████ | 13533/22095 [23:04:29<79:04:14, 33.25s/it] {'loss': 0.2879, 'grad_norm': 0.5787548892775997, 'learning_rate': 3.4480855414533e-06, 'epoch': 0.61} 61%|██████ | 13533/22095 [23:04:29<79:04:14, 33.25s/it]VC:s3://gui-agent/data_20250526/windows/images/inventor/20250512_162754_1/images/before_screenshot_19_id_67_function_2_crop_1_grounding_instructions_point_o.png 2025-08-28 15:02:28.178487 load time: 1044.24 ms VC:s3://gui-agent/data_20250714/web/images/20250718/e00ecad1-466d-4cf3-a4d5-a1dd6d86c545/images/step_20.png 2025-08-28 15:02:28.180028 load time: 1062.66 ms VC:s3://gui-agent/data_20250421/web/images/wa_forum/trajectory_18/img/step_0.png 2025-08-28 15:02:28.178159 load time: 1062.16 ms 61%|██████▏ | 13534/22095 [23:04:52<71:10:47, 29.93s/it] {'loss': 0.292, 'grad_norm': 0.5992493454284118, 'learning_rate': 3.4473888348887673e-06, 'epoch': 0.61} 61%|██████▏ | 13534/22095 [23:04:52<71:10:47, 29.93s/it] 61%|██████▏ | 13535/22095 [23:05:50<91:22:12, 38.43s/it] {'loss': 0.3218, 'grad_norm': 0.6407299956446944, 'learning_rate': 3.4466921616850847e-06, 'epoch': 0.61} 61%|██████▏ | 13535/22095 [23:05:50<91:22:12, 38.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66149 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77505 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55039 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13536/22095 [23:05:53<66:24:37, 27.93s/it] {'loss': 0.3027, 'grad_norm': 0.6333286620983509, 'learning_rate': 3.445995521857213e-06, 'epoch': 0.61} 61%|██████▏ | 13536/22095 [23:05:53<66:24:37, 27.93s/it] 61%|██████▏ | 13537/22095 [23:05:57<48:53:42, 20.57s/it] {'loss': 0.3287, 'grad_norm': 0.662291156417707, 'learning_rate': 3.4452989154201256e-06, 'epoch': 0.61} 61%|██████▏ | 13537/22095 [23:05:57<48:53:42, 20.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46881 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13538/22095 [23:06:01<37:01:15, 15.58s/it] {'loss': 0.3112, 'grad_norm': 0.7020675293337494, 'learning_rate': 3.4446023423887905e-06, 'epoch': 0.61} 61%|██████▏ | 13538/22095 [23:06:01<37:01:15, 15.58s/it] 61%|██████▏ | 13539/22095 [23:07:00<68:02:42, 28.63s/it] {'loss': 0.3326, 'grad_norm': 0.8046855346438109, 'learning_rate': 3.443905802778173e-06, 'epoch': 0.61} 61%|██████▏ | 13539/22095 [23:07:00<68:02:42, 28.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68738 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61162 > 40960). 
Running this sequence through the model will result in indexing errors 61%|██████▏ | 13540/22095 [23:07:42<77:47:33, 32.74s/it] {'loss': 0.3219, 'grad_norm': 0.6657469550052136, 'learning_rate': 3.4432092966032397e-06, 'epoch': 0.61} 61%|██████▏ | 13540/22095 [23:07:42<77:47:33, 32.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████▏ | 13541/22095 [23:08:11<75:13:08, 31.66s/it] {'loss': 0.4805, 'grad_norm': 0.3111574269257986, 'learning_rate': 3.4425128238789594e-06, 'epoch': 0.61} 61%|██████▏ | 13541/22095 [23:08:11<75:13:08, 31.66s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8937806 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 60959, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. 
\\frac{11}{2}cm'}, {'from': 'gpt', 'value': '【解答】解:设BC=xcm,∵BC=\\frac{1}{2}AB,∴AB=2BC=2x,AC=AB+BC=3xcm,∵D为AC的中点,∴AD=DC=\\frac{1}{2}AC=1.5xcm,∵CD=3cm,∴1.5x=3,解得:x=2,即AB=2xcm=4cm,'}]} VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-1_92716464-split-0.jpg 2025-08-28 15:06:09.926454 load time: 1062.54 ms 61%|██████▏ | 13542/22095 [23:08:15<55:36:17, 23.40s/it] {'loss': 0.3149, 'grad_norm': 0.7803599969425479, 'learning_rate': 3.4418163846202945e-06, 'epoch': 0.61} 61%|██████▏ | 13542/22095 [23:08:15<55:36:17, 23.40s/it] 61%|██████▏ | 13543/22095 [23:09:33<94:06:34, 39.62s/it] {'loss': 0.2993, 'grad_norm': 0.6984892738731218, 'learning_rate': 3.4411199788422093e-06, 'epoch': 0.61} 61%|██████▏ | 13543/22095 [23:09:33<94:06:34, 39.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_536707.png 2025-08-28 15:07:31.518615 load time: 1046.35 ms 61%|██████▏ | 13544/22095 [23:09:42<72:40:30, 30.60s/it] {'loss': 0.4713, 'grad_norm': 0.3298082981291895, 'learning_rate': 3.4404236065596673e-06, 'epoch': 0.61} 61%|██████▏ | 13544/22095 [23:09:42<72:40:30, 30.60s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/GUICourse/guienv/chunk_61/C4web50k-3_277119177-split-0.png 2025-08-28 15:07:41.073918 load time: 1044.94 ms 61%|██████▏ | 13545/22095 [23:09:46<53:39:44, 22.59s/it] {'loss': 0.3007, 'grad_norm': 0.641591555289859, 'learning_rate': 3.439727267787634e-06, 'epoch': 0.61} 61%|██████▏ | 13545/22095 [23:09:46<53:39:44, 22.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████▏ | 13546/22095 [23:09:55<43:59:08, 18.52s/it] {'loss': 0.4702, 'grad_norm': 0.30014997903508966, 'learning_rate': 3.439030962541069e-06, 'epoch': 0.61} 61%|██████▏ | 13546/22095 [23:09:55<43:59:08, 18.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████▏ | 13547/22095 
[23:10:00<34:06:36, 14.37s/it] {'loss': 0.2835, 'grad_norm': 0.7192659293934158, 'learning_rate': 3.438334690834934e-06, 'epoch': 0.61} 61%|██████▏ | 13547/22095 [23:10:00<34:06:36, 14.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74687 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46030 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13548/22095 [23:10:28<43:40:38, 18.40s/it] {'loss': 0.4918, 'grad_norm': 0.27457620479106704, 'learning_rate': 3.4376384526841918e-06, 'epoch': 0.61} 61%|██████▏ | 13548/22095 [23:10:28<43:40:38, 18.40s/it] 61%|██████▏ | 13549/22095 [23:10:50<46:31:34, 19.60s/it] {'loss': 0.3029, 'grad_norm': 0.6625082498934292, 'learning_rate': 3.4369422481037984e-06, 'epoch': 0.61} 61%|██████▏ | 13549/22095 [23:10:50<46:31:34, 19.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42703 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45861 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47454 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88898 > 40960). 
Running this sequence through the model will result in indexing errors 61%|██████▏ | 13550/22095 [23:11:13<49:00:36, 20.65s/it] {'loss': 0.3097, 'grad_norm': 0.6179382329542554, 'learning_rate': 3.4362460771087162e-06, 'epoch': 0.61} 61%|██████▏ | 13550/22095 [23:11:13<49:00:36, 20.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43220 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92192 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80781 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52411 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41150 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (65744 > 40960) for 4 sample(s). Truncating to 23871 with 2 samples. 
61%|██████▏ | 13551/22095 [23:11:53<62:53:41, 26.50s/it] {'loss': 0.3403, 'grad_norm': 0.690846385935376, 'learning_rate': 3.4355499397139047e-06, 'epoch': 0.61} 61%|██████▏ | 13551/22095 [23:11:53<62:53:41, 26.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8931432 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 54585, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nA. 12\nB. 6\nC. 8\nD. 
10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 61%|██████▏ | 13552/22095 [23:12:34<72:40:02, 30.62s/it] {'loss': 0.308, 'grad_norm': 0.6691610028396701, 'learning_rate': 3.4348538359343187e-06, 'epoch': 0.61} 61%|██████▏ | 13552/22095 [23:12:34<72:40:02, 30.62s/it] 61%|██████▏ | 13553/22095 [23:13:33<93:10:32, 39.27s/it] {'loss': 0.2863, 'grad_norm': 0.6484163841141876, 'learning_rate': 3.4341577657849163e-06, 'epoch': 0.61} 61%|██████▏ | 13553/22095 [23:13:33<93:10:32, 39.27s/it] 61%|██████▏ | 13554/22095 [23:14:13<93:30:49, 39.42s/it] {'loss': 0.3299, 'grad_norm': 0.6710404290800567, 'learning_rate': 3.433461729280657e-06, 'epoch': 0.61} 61%|██████▏ | 13554/22095 [23:14:13<93:30:49, 39.42s/it] 61%|██████▏ | 13555/22095 [23:14:34<80:47:46, 34.06s/it] {'loss': 0.3115, 'grad_norm': 0.662797078639147, 'learning_rate': 3.4327657264364913e-06, 'epoch': 0.61} 61%|██████▏ | 13555/22095 [23:14:34<80:47:46, 34.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_85422.png 2025-08-28 15:12:33.144800 load time: 1017.2 ms 61%|██████▏ | 13556/22095 [23:14:44<63:14:18, 26.66s/it] {'loss': 0.4806, 'grad_norm': 0.331292948132087, 'learning_rate': 3.4320697572673774e-06, 'epoch': 0.61} 61%|██████▏ | 13556/22095 [23:14:44<63:14:18, 26.66s/it]VC:s3://internvl2/datasets/ocr/Wired_Table_10w/D/images/border_2841_PMTS2SXPWYXCXMEXSL5H.jpg 2025-08-28 15:12:42.538509 load time: 1022.53 ms 61%|██████▏ | 13557/22095 [23:15:24<72:56:12, 30.75s/it] {'loss': 0.3057, 'grad_norm': 0.6338532740399662, 'learning_rate': 3.4313738217882676e-06, 'epoch': 0.61} 61%|██████▏ | 13557/22095 [23:15:24<72:56:12, 30.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 61%|██████▏ | 13558/22095 
[23:15:34<57:56:28, 24.43s/it] {'loss': 0.4512, 'grad_norm': 0.28146032818850647, 'learning_rate': 3.4306779200141204e-06, 'epoch': 0.61} 61%|██████▏ | 13558/22095 [23:15:34<57:56:28, 24.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8959588 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10423, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 61%|██████▏ | 13559/22095 [23:15:43<47:11:34, 19.90s/it] {'loss': 0.4511, 'grad_norm': 0.28191321203490505, 'learning_rate': 3.4299820519598814e-06, 'epoch': 0.61} 61%|██████▏ | 13559/22095 [23:15:43<47:11:34, 19.90s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 61%|██████▏ | 13560/22095 [23:16:06<49:13:02, 20.76s/it] {'loss': 0.3293, 'grad_norm': 0.7828211842898387, 'learning_rate': 3.4292862176405075e-06, 'epoch': 0.61} 61%|██████▏ | 13560/22095 [23:16:06<49:13:02, 20.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████▏ | 13561/22095 [23:16:28<49:55:17, 21.06s/it] {'loss': 0.3475, 'grad_norm': 0.6837348984558683, 'learning_rate': 3.4285904170709495e-06, 'epoch': 0.61} 61%|██████▏ | 13561/22095 [23:16:28<49:55:17, 21.06s/it] 61%|██████▏ | 13562/22095 [23:17:49<92:59:11, 39.23s/it] {'loss': 0.3363, 'grad_norm': 0.6027187377328831, 'learning_rate': 3.427894650266156e-06, 'epoch': 0.61} 61%|██████▏ | 13562/22095 [23:17:49<92:59:11, 39.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70797 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54651 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13563/22095 [23:18:29<93:19:38, 39.38s/it] {'loss': 0.3355, 'grad_norm': 0.6826603664435904, 'learning_rate': 3.4271989172410768e-06, 'epoch': 0.61} 61%|██████▏ | 13563/22095 [23:18:29<93:19:38, 39.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (88560 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60713 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74342 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47303 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13564/22095 [23:19:14<97:01:37, 40.94s/it] {'loss': 0.3017, 'grad_norm': 0.6563556306672855, 'learning_rate': 3.4265032180106656e-06, 'epoch': 0.61} 61%|██████▏ | 13564/22095 [23:19:14<97:01:37, 40.94s/it]VC:s3://gui/aguvis/aguvis-stage1/widget_captioning/images/38076.jpg 2025-08-28 15:17:12.334058 load time: 1031.46 ms VC:s3://multi-modal/TQA/train/question_images/atomic_mass_number_9011.png 2025-08-28 15:17:12.332362 load time: 1049.81 ms 61%|██████▏ | 13565/22095 [23:19:17<70:01:13, 29.55s/it] {'loss': 0.3064, 'grad_norm': 0.6249553420140989, 'learning_rate': 3.425807552589866e-06, 'epoch': 0.61} 61%|██████▏ | 13565/22095 [23:19:17<70:01:13, 29.55s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_124934.png 2025-08-28 15:17:15.299684 load time: 1024.42 ms VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/59039.jpg 2025-08-28 15:17:15.296999 load time: 1039.41 ms 61%|██████▏ | 13566/22095 [23:19:40<65:41:01, 27.72s/it] {'loss': 0.3024, 'grad_norm': 0.6720765103225914, 'learning_rate': 3.425111920993627e-06, 'epoch': 0.61} 61%|██████▏ | 13566/22095 [23:19:40<65:41:01, 27.72s/it]VC:s3://gui/aguvis/aguvis-stage1/webui350k/images/1657122157176.png 2025-08-28 15:17:38.759872 load time: 1043.98 ms 61%|██████▏ | 13567/22095 [23:20:21<75:23:28, 31.83s/it] {'loss': 0.3224, 'grad_norm': 0.6304178675344458, 
'learning_rate': 3.424416323236897e-06, 'epoch': 0.61} 61%|██████▏ | 13567/22095 [23:20:21<75:23:28, 31.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62191 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73233 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47180 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46466 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47226 > 40960). Running this sequence through the model will result in indexing errors 61%|██████▏ | 13568/22095 [23:20:42<67:25:03, 28.46s/it] {'loss': 0.3275, 'grad_norm': 0.6751915719397409, 'learning_rate': 3.4237207593346207e-06, 'epoch': 0.61} 61%|██████▏ | 13568/22095 [23:20:42<67:25:03, 28.46s/it] 61%|██████▏ | 13569/22095 [23:21:25<77:44:19, 32.82s/it] {'loss': 0.3224, 'grad_norm': 0.6877550526977185, 'learning_rate': 3.423025229301743e-06, 'epoch': 0.61} 61%|██████▏ | 13569/22095 [23:21:25<77:44:19, 32.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 61%|██████▏ | 13570/22095 [23:22:07<84:26:19, 35.66s/it] {'loss': 0.3017, 'grad_norm': 0.65181587448465, 'learning_rate': 3.42232973315321e-06, 'epoch': 0.61} 61%|██████▏ | 13570/22095 [23:22:07<84:26:19, 35.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308532 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2hGktc7.HL1JjSZFlXXaiRFXa_!!1784247747.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read and tell me what is encoded in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n全新TCL电源板\nFLOW\n昊天液晶配件\n认准KKK用心服务\nCQC\n型号:\n40-L4202C-PWI1XG'}]}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_458637.png 2025-08-28 15:20:06.044261 load time: 1032.72 ms
61%|██████▏ | 13571/22095 [23:22:15<64:41:06, 27.32s/it] {'loss': 0.4635, 'grad_norm': 0.39223110812417794, 'learning_rate': 3.4216342709039675e-06, 'epoch': 0.61}
61%|██████▏ | 13572/22095 [23:22:25<52:21:28, 22.12s/it] {'loss': 0.4698, 'grad_norm': 0.33188089678483657, 'learning_rate': 3.4209388425689556e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 364, but got module 1
61%|██████▏ | 13573/22095 [23:22:47<52:26:20, 22.15s/it] {'loss': 0.2885, 'grad_norm': 0.6084418154167035, 'learning_rate': 3.420243448163117e-06, 'epoch': 0.61}
61%|██████▏ | 13574/22095 [23:23:30<67:04:56, 28.34s/it] {'loss': 0.329, 'grad_norm': 0.5984272589446169, 'learning_rate': 3.4195480877013976e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (41089 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64144 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55805 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45928 > 40960). Running this sequence through the model will result in indexing errors
61%|██████▏ | 13575/22095 [23:24:30<89:22:30, 37.76s/it] {'loss': 0.3345, 'grad_norm': 0.5804821195101811, 'learning_rate': 3.4188527611987343e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
61%|██████▏ | 13576/22095 [23:24:56<80:58:18, 34.22s/it] {'loss': 0.4794, 'grad_norm': 0.31348491520577604, 'learning_rate': 3.4181574686700687e-06, 'epoch': 0.61}
61%|██████▏ | 13577/22095 [23:25:00<59:21:40, 25.09s/it] {'loss': 0.3458, 'grad_norm': 0.6456037759163459, 'learning_rate': 3.417462210130342e-06, 'epoch': 0.61}
61%|██████▏ | 13578/22095 [23:25:59<83:44:56, 35.40s/it] {'loss': 0.3037, 'grad_norm': 0.6486113186177295, 'learning_rate': 3.4167669855944905e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887269 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10422, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398226 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 377, 'image': 'vrdu_table_final_2/astro-ph.CO/acc39a41-8397-46e0-b5ad-0b57ec647b79.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]}
61%|██████▏ | 13579/22095 [23:26:39<87:05:09, 36.81s/it] {'loss': 0.3292, 'grad_norm': 0.6203561762672279, 'learning_rate': 3.416071795077455e-06, 'epoch': 0.61}
VC:s3://gui/aguvis/aguvis-stage1/ricoig16k/images/1088.jpg 2025-08-28 15:24:37.952860 load time: 1036.88 ms
61%|██████▏ | 13580/22095 [23:27:02<77:14:07, 32.65s/it] {'loss': 0.3388, 'grad_norm': 0.6063378791570977, 'learning_rate': 3.415376638594172e-06, 'epoch': 0.61}
61%|██████▏ | 13581/22095 [23:27:24<69:29:09, 29.38s/it] {'loss': 0.2982, 'grad_norm': 0.6807597534346058, 'learning_rate': 3.414681516159578e-06, 'epoch': 0.61}
61%|██████▏ | 13582/22095 [23:28:24<91:19:44, 38.62s/it] {'loss': 0.3341, 'grad_norm': 1.0380220289205693, 'learning_rate': 3.4139864277886083e-06, 'epoch': 0.61}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
61%|██████▏ | 13583/22095 [23:28:51<83:08:28, 35.16s/it] {'loss': 0.4856, 'grad_norm': 0.3318394658340282, 'learning_rate': 3.413291373496202e-06, 'epoch': 0.61}
61%|██████▏ | 13584/22095 [23:28:55<60:40:22, 25.66s/it] {'loss': 0.3554, 'grad_norm': 0.6834381237273527, 'learning_rate': 3.4125963532972878e-06, 'epoch': 0.61}
61%|██████▏ | 13585/22095 [23:29:35<70:50:25, 29.97s/it] {'loss': 0.3444, 'grad_norm': 0.661222968496695, 'learning_rate': 3.4119013672068034e-06, 'epoch': 0.61}
61%|██████▏ | 13586/22095 [23:29:38<51:37:27, 21.84s/it] {'loss': 0.2932, 'grad_norm': 0.6150340736437391, 'learning_rate': 3.411206415239681e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (49245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53737 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46279 > 40960). Running this sequence through the model will result in indexing errors
61%|██████▏ | 13587/22095 [23:29:41<38:27:59, 16.28s/it] {'loss': 0.3131, 'grad_norm': 0.5982665467514687, 'learning_rate': 3.4105114974108553e-06, 'epoch': 0.61}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8306201 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1pUXAlz3z9KJjy0FmXXXiwXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read and tell me what is written on this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n良品书苑\n与变频器丛书\n兴旺\n书社\nPLC编程指令\n图快速入门\n赠送赠书签\n電子工業出版社赠送赠书签\n实用\n经典'}]}
61%|██████▏ | 13588/22095 [23:29:44<28:53:38, 12.23s/it] {'loss': 0.3097, 'grad_norm': 0.6062719537081716, 'learning_rate': 3.4098166137352534e-06, 'epoch': 0.61}
Token indices sequence length is longer than the specified maximum sequence length for this model (42343 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13589/22095 [23:30:06<36:04:01, 15.26s/it] {'loss': 0.3013, 'grad_norm': 0.5801537581376082, 'learning_rate': 3.409121764227809e-06, 'epoch': 0.62}
62%|██████▏ | 13590/22095 [23:30:09<27:31:12, 11.65s/it] {'loss': 0.3208, 'grad_norm': 0.6342492363047709, 'learning_rate': 3.408426948903453e-06, 'epoch': 0.62}
VC:s3://gui-agent/data_20250421/web/images/wa_shopping_admin_admin/trajectory_46/img/step_12.png 2025-08-28 15:28:07.943077 load time: 1036.95 ms
62%|██████▏ | 13591/22095 [23:30:35<37:13:40, 15.76s/it] {'loss': 0.2913, 'grad_norm': 0.618670713571157, 'learning_rate': 3.4077321677771137e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (55466 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122316 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43633 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45525 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13592/22095 [23:30:38<28:32:42, 12.09s/it] {'loss': 0.3249, 'grad_norm': 0.6074155228706901, 'learning_rate': 3.4070374208637173e-06, 'epoch': 0.62}
62%|██████▏ | 13593/22095 [23:30:41<22:13:44, 9.41s/it] {'loss': 0.2985, 'grad_norm': 0.6094925796063878, 'learning_rate': 3.4063427081781973e-06, 'epoch': 0.62}
62%|██████▏ | 13594/22095 [23:31:05<32:13:03, 13.64s/it] {'loss': 0.336, 'grad_norm': 0.6737245100747968, 'learning_rate': 3.4056480297354767e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13595/22095 [23:31:14<29:12:45, 12.37s/it] {'loss': 0.486, 'grad_norm': 0.3467582044192305, 'learning_rate': 3.4049533855504835e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (66701 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13596/22095 [23:31:37<36:45:04, 15.57s/it] {'loss': 0.321, 'grad_norm': 0.6309695090828856, 'learning_rate': 3.404258775638144e-06, 'epoch': 0.62}
62%|██████▏ | 13597/22095 [23:32:18<54:31:36, 23.10s/it] {'loss': 0.2694, 'grad_norm': 0.5930313711457792, 'learning_rate': 3.4035642000133806e-06, 'epoch': 0.62}
62%|██████▏ | 13598/22095 [23:32:22<40:55:17, 17.34s/it] {'loss': 0.3151, 'grad_norm': 0.5859657562645595, 'learning_rate': 3.4028696586911203e-06, 'epoch': 0.62}
62%|██████▏ | 13599/22095 [23:32:25<30:55:50, 13.11s/it] {'loss': 0.2716, 'grad_norm': 0.6732209596085635, 'learning_rate': 3.4021751516862856e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [284, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396939 in VC:s3://internvl-moe-sft-data/. Exception: Image size [284, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63792, 'image': 'vrdu_table_final_2/astro-ph.EP/e8915398-cf11-440b-af49-b2c54501a97c.png', 'image_wh': [[284, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}Earth's axis of rotation\\end{tabular}\n```"}]}
62%|██████▏ | 13600/22095 [23:32:55<43:03:07, 18.24s/it] {'loss': 0.4739, 'grad_norm': 0.3069295366174163, 'learning_rate': 3.401480679013801e-06, 'epoch': 0.62}
62%|██████▏ | 13601/22095 [23:33:00<33:14:01, 14.09s/it] {'loss': 0.2899, 'grad_norm': 0.7672622217379496, 'learning_rate': 3.4007862406885863e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (50126 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13602/22095 [23:33:21<38:12:03, 16.19s/it] {'loss': 0.3028, 'grad_norm': 0.6369540019262062, 'learning_rate': 3.400091836725562e-06, 'epoch': 0.62}
62%|██████▏ | 13603/22095 [23:33:24<28:44:34, 12.18s/it] {'loss': 0.2963, 'grad_norm': 0.5712373454878116, 'learning_rate': 3.3993974671396523e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13604/22095 [23:33:30<24:45:11, 10.49s/it] {'loss': 0.4829, 'grad_norm': 0.30381427668797734, 'learning_rate': 3.3987031319457747e-06, 'epoch': 0.62}
62%|██████▏ | 13605/22095 [23:33:52<32:35:45, 13.82s/it] {'loss': 0.3064, 'grad_norm': 0.8279645261570301, 'learning_rate': 3.398008831158849e-06, 'epoch': 0.62}
62%|██████▏ | 13606/22095 [23:33:56<25:38:42, 10.88s/it] {'loss': 0.3673, 'grad_norm': 0.6295116827920167, 'learning_rate': 3.3973145647937935e-06, 'epoch': 0.62}
62%|██████▏ | 13607/22095 [23:34:37<47:02:51, 19.95s/it] {'loss': 0.2947, 'grad_norm': 0.6245798726321056, 'learning_rate': 3.3966203328655244e-06, 'epoch': 0.62}
62%|██████▏ | 13608/22095 [23:34:40<35:11:23, 14.93s/it] {'loss': 0.3124, 'grad_norm': 0.5876837841079967, 'learning_rate': 3.3959261353889605e-06, 'epoch': 0.62}
62%|██████▏ | 13609/22095 [23:34:43<26:40:22, 11.32s/it] {'loss': 0.3449, 'grad_norm': 0.6718860068837423, 'learning_rate': 3.395231972379019e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13610/22095 [23:34:52<25:23:52, 10.78s/it] {'loss': 0.4554, 'grad_norm': 0.327732997048033, 'learning_rate': 3.3945378438506125e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13611/22095 [23:34:57<20:59:14, 8.91s/it] {'loss': 0.2966, 'grad_norm': 0.6726952775245555, 'learning_rate': 3.393843749818656e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (48089 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46290 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122477 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13612/22095 [23:35:19<30:33:02, 12.97s/it] {'loss': 0.2989, 'grad_norm': 0.6939323728620533, 'learning_rate': 3.393149690298067e-06, 'epoch': 0.62}
62%|██████▏ | 13613/22095 [23:35:23<23:44:38, 10.08s/it] {'loss': 0.3446, 'grad_norm': 0.7009716702689694, 'learning_rate': 3.3924556653037533e-06, 'epoch': 0.62}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/clock_1/images/step_0.png 2025-08-28 15:33:22.843757 load time: 1176.53 ms
62%|██████▏ | 13614/22095 [23:35:26<18:44:30, 7.96s/it] {'loss': 0.3226, 'grad_norm': 0.6311392257799732, 'learning_rate': 3.391761674850631e-06, 'epoch': 0.62}
62%|██████▏ | 13615/22095 [23:35:29<15:26:27, 6.56s/it] {'loss': 0.2801, 'grad_norm': 0.6330456115405911, 'learning_rate': 3.39106771895361e-06, 'epoch': 0.62}
62%|██████▏ | 13616/22095 [23:35:33<13:35:05, 5.77s/it] {'loss': 0.3049, 'grad_norm': 0.5927001608548688, 'learning_rate': 3.3903737976276064e-06, 'epoch': 0.62}
62%|██████▏ | 13617/22095 [23:35:36<11:29:11, 4.88s/it] {'loss': 0.3208, 'grad_norm': 0.6139805996534979, 'learning_rate': 3.389679910887522e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (42950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65084 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109652 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41086 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43527 > 40960) for 4 sample(s). Truncating to 93 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (121674 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13618/22095 [23:35:40<10:46:27, 4.58s/it] {'loss': 0.2736, 'grad_norm': 0.7454624269731916, 'learning_rate': 3.3889860587482716e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (79941 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13619/22095 [23:35:43<10:16:40, 4.37s/it] {'loss': 0.3121, 'grad_norm': 0.5839197225597309, 'learning_rate': 3.3882922412247644e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13620/22095 [23:35:50<12:04:49, 5.13s/it] {'loss': 0.4687, 'grad_norm': 0.32865271198484763, 'learning_rate': 3.387598458331906e-06, 'epoch': 0.62}
62%|██████▏ | 13621/22095 [23:35:54<10:44:30, 4.56s/it] {'loss': 0.3113, 'grad_norm': 0.6346643957182696, 'learning_rate': 3.386904710084603e-06, 'epoch': 0.62}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396954 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63807, 'image': 'vrdu_table_final_2/astro-ph.EP/57ae63d8-05d9-46f1-9748-58725f8459b0.png', 'image_wh': [[14, 20]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}y\\end{tabular}\n```"}]}
62%|██████▏ | 13622/22095 [23:35:57<9:47:51, 4.16s/it] {'loss': 0.2919, 'grad_norm': 0.5804358646950067, 'learning_rate': 3.3862109964977665e-06, 'epoch': 0.62}
62%|██████▏ | 13623/22095 [23:36:00<8:52:30, 3.77s/it] {'loss': 0.3376, 'grad_norm': 0.6657660824615055, 'learning_rate': 3.3855173175862976e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (58093 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84888 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62565 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13624/22095 [23:36:04<9:06:14, 3.87s/it] {'loss': 0.3341, 'grad_norm': 0.6726518102362474, 'learning_rate': 3.3848236733651034e-06, 'epoch': 0.62}
62%|██████▏ | 13625/22095 [23:36:07<8:52:29, 3.77s/it] {'loss': 0.2905, 'grad_norm': 0.6619560109542502, 'learning_rate': 3.3841300638490885e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13626/22095 [23:36:17<12:56:32, 5.50s/it] {'loss': 0.4705, 'grad_norm': 0.4164067724363868, 'learning_rate': 3.383436489053154e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (77332 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54029 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13627/22095 [23:36:20<11:25:30, 4.86s/it] {'loss': 0.3422, 'grad_norm': 0.630389271345759, 'learning_rate': 3.3827429489922053e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (50433 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69200 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52504 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96360 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13628/22095 [23:36:23<10:02:42, 4.27s/it] {'loss': 0.2897, 'grad_norm': 0.59583288899598, 'learning_rate': 3.3820494436811435e-06, 'epoch': 0.62}
62%|██████▏ | 13629/22095 [23:36:26<9:04:52, 3.86s/it] {'loss': 0.2902, 'grad_norm': 0.5847224930970383, 'learning_rate': 3.3813559731348716e-06, 'epoch': 0.62}
62%|██████▏ | 13630/22095 [23:36:30<9:05:08, 3.86s/it] {'loss': 0.3259, 'grad_norm': 0.6896455432590854, 'learning_rate': 3.380662537368286e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (44654 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50085 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13631/22095 [23:36:34<9:25:54, 4.01s/it] {'loss': 0.2756, 'grad_norm': 0.5948295445386829, 'learning_rate': 3.3799691363962904e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13632/22095 [23:36:43<12:50:05, 5.46s/it] {'loss': 0.4591, 'grad_norm': 0.2790831913582214, 'learning_rate': 3.379275770233783e-06, 'epoch': 0.62}
62%|██████▏ | 13633/22095 [23:36:47<11:39:27, 4.96s/it] {'loss': 0.32, 'grad_norm': 0.698686932276024, 'learning_rate': 3.3785824388956613e-06, 'epoch': 0.62}
62%|██████▏ | 13634/22095 [23:36:50<10:26:05, 4.44s/it] {'loss': 0.3113, 'grad_norm': 0.7019383457076268, 'learning_rate': 3.377889142396822e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13635/22095 [23:37:01<14:46:23, 6.29s/it] {'loss': 0.4626, 'grad_norm': 0.2976661593405612, 'learning_rate': 3.3771958807521656e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13636/22095 [23:37:12<18:10:26, 7.73s/it] {'loss': 0.464, 'grad_norm': 0.3217511310855456, 'learning_rate': 3.3765026539765832e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 364, but got module 1
62%|██████▏ | 13637/22095 [23:37:16<15:44:45, 6.70s/it] {'loss': 0.3252, 'grad_norm': 0.623664016439237, 'learning_rate': 3.3758094620849737e-06, 'epoch': 0.62}
62%|██████▏ | 13638/22095 [23:37:19<13:11:26, 5.62s/it] {'loss': 0.3349, 'grad_norm': 0.5967522282609519, 'learning_rate': 3.3751163050922307e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13639/22095 [23:37:29<15:52:39, 6.76s/it] {'loss': 0.4745, 'grad_norm': 0.2962581492991003, 'learning_rate': 3.3744231830132473e-06, 'epoch': 0.62}
62%|██████▏ | 13640/22095 [23:37:32<13:35:56, 5.79s/it] {'loss': 0.3707, 'grad_norm': 0.6495079733390845, 'learning_rate': 3.373730095862916e-06, 'epoch': 0.62}
62%|██████▏ | 13641/22095 [23:37:36<11:58:41, 5.10s/it] {'loss': 0.295, 'grad_norm': 0.6280898772506784, 'learning_rate': 3.3730370436561316e-06, 'epoch': 0.62}
62%|██████▏ | 13642/22095 [23:37:40<11:10:37, 4.76s/it] {'loss': 0.3142, 'grad_norm': 0.604642646016628, 'learning_rate': 3.372344026407785e-06, 'epoch': 0.62}
62%|██████▏ | 13643/22095 [23:37:43<10:22:27, 4.42s/it] {'loss': 0.3517, 'grad_norm': 0.6327627082802671, 'learning_rate': 3.3716510441327653e-06, 'epoch': 0.62}
62%|██████▏ | 13644/22095 [23:37:47<9:52:52, 4.21s/it] {'loss': 0.2857, 'grad_norm': 0.6490284830665943, 'learning_rate': 3.3709580968459628e-06, 'epoch': 0.62}
62%|██████▏ | 13645/22095 [23:37:50<8:55:29, 3.80s/it] {'loss': 0.3027, 'grad_norm': 0.6287983386684702, 'learning_rate': 3.3702651845622703e-06, 'epoch': 0.62}
62%|██████▏ | 13646/22095 [23:37:53<8:25:04, 3.59s/it] {'loss': 0.3167, 'grad_norm': 0.6207686356181382, 'learning_rate': 3.3695723072965707e-06, 'epoch': 0.62}
62%|██████▏ | 13647/22095 [23:37:57<8:46:19, 3.74s/it] {'loss': 0.2965, 'grad_norm': 0.5933952431953913, 'learning_rate': 3.3688794650637557e-06, 'epoch': 0.62}
62%|██████▏ | 13648/22095 [23:38:00<8:25:08, 3.59s/it] {'loss': 0.3241, 'grad_norm': 0.6157345800940398, 'learning_rate': 3.3681866578787124e-06, 'epoch': 0.62}
62%|██████▏ | 13649/22095 [23:38:05<9:06:38, 3.88s/it] {'loss': 0.3288, 'grad_norm': 0.5934092362653312, 'learning_rate': 3.3674938857563256e-06, 'epoch': 0.62}
62%|██████▏ | 13650/22095 [23:38:08<8:27:56, 3.61s/it] {'loss': 0.2881, 'grad_norm': 0.6173483257897173, 'learning_rate': 3.3668011487114798e-06, 'epoch': 0.62}
62%|██████▏ | 13651/22095 [23:38:11<8:29:31, 3.62s/it] {'loss': 0.2957, 'grad_norm': 0.5965277040554047, 'learning_rate': 3.3661084467590637e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (46687 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13652/22095 [23:38:14<8:00:50, 3.42s/it] {'loss': 0.3309, 'grad_norm': 0.6436909221396913, 'learning_rate': 3.3654157799139576e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (49208 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13653/22095 [23:38:18<7:51:28, 3.35s/it] {'loss': 0.2989, 'grad_norm': 0.6945090155316245, 'learning_rate': 3.3647231481910464e-06, 'epoch': 0.62}
62%|██████▏ | 13654/22095 [23:38:21<8:09:46, 3.48s/it] {'loss': 0.3137, 'grad_norm': 0.8550100311569186, 'learning_rate': 3.364030551605213e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302134 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1J0zSRpXXXXbqapXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请对图中文字进行OCR识别。'}, {'from': 'gpt', 'value': '图中所有文字:\n房屋卫士\n品质\n服务\n认真\n创新\n高强度\n一贴止漏\n粘力\n荣誉出品上海龙实业有限公司屋顶、玻璃、彩钢等缝隙防水\n房屋卫士\n防水胶带\n•操作简单\n•耐高低温\n•耐老化性\n注意事项\n使用方式'}]}
62%|██████▏ | 13655/22095 [23:38:25<8:26:11, 3.60s/it] {'loss': 0.3127, 'grad_norm': 0.6315291386273997, 'learning_rate': 3.363337990171337e-06, 'epoch': 0.62}
62%|██████▏ | 13656/22095 [23:38:29<8:31:47, 3.64s/it] {'loss': 0.3174, 'grad_norm': 0.6369925272081775, 'learning_rate': 3.3626454639043018e-06, 'epoch': 0.62}
62%|██████▏ | 13657/22095 [23:38:34<9:14:53, 3.95s/it] {'loss': 0.321, 'grad_norm': 0.6373841023921314, 'learning_rate': 3.361952972818987e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (67895 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63828 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13658/22095 [23:38:37<9:11:10, 3.92s/it] {'loss': 0.3059, 'grad_norm': 0.6862637267107752, 'learning_rate': 3.3612605169302724e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (68554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78462 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71044 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13659/22095 [23:38:47<12:48:17, 5.46s/it] {'loss': 0.4704, 'grad_norm': 0.3469364433723692, 'learning_rate': 3.360568096253035e-06, 'epoch': 0.62}
62%|██████▏ | 13660/22095 [23:38:56<15:17:42, 6.53s/it] {'loss': 0.4795, 'grad_norm': 0.33049897548744755, 'learning_rate': 3.3598757108021546e-06, 'epoch': 0.62}
62%|██████▏ | 13661/22095 [23:39:04<16:58:09, 7.24s/it] {'loss': 0.4743, 'grad_norm': 0.3221609743944461, 'learning_rate': 3.359183360592509e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 364, but got module 1
62%|██████▏ | 13662/22095 [23:39:08<14:28:23, 6.18s/it] {'loss': 0.2902, 'grad_norm': 0.6382213127197119, 'learning_rate': 3.3584910456389726e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (116972 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45791 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72567 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13663/22095 [23:39:12<12:34:30, 5.37s/it] {'loss': 0.2928, 'grad_norm': 0.5931313383666015, 'learning_rate': 3.357798765956421e-06, 'epoch': 0.62}
62%|██████▏ | 13664/22095 [23:39:15<11:21:43, 4.85s/it] {'loss': 0.2874, 'grad_norm': 0.6169420726404996, 'learning_rate': 3.357106521559733e-06, 'epoch': 0.62}
62%|██████▏ | 13665/22095 [23:39:18<9:53:51, 4.23s/it] {'loss': 0.2929, 'grad_norm': 0.6228570680089183, 'learning_rate': 3.356414312463778e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41511 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90210 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44902 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49630 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47156 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95919 > 40960).
Running this sequence through the model will result in indexing errors
62%|██████▏ | 13666/22095 [23:39:27<13:01:43, 5.56s/it] {'loss': 0.4908, 'grad_norm': 0.4429317158088941, 'learning_rate': 3.3557221386834323e-06, 'epoch': 0.62}
62%|██████▏ | 13667/22095 [23:39:36<15:36:30, 6.67s/it] {'loss': 0.4649, 'grad_norm': 0.339620418569209, 'learning_rate': 3.3550300002335685e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (67442 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13668/22095 [23:39:39<13:13:57, 5.65s/it] {'loss': 0.3275, 'grad_norm': 0.6310993574186863, 'learning_rate': 3.354337897129057e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (80832 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13669/22095 [23:39:43<11:34:27, 4.95s/it] {'loss': 0.2751, 'grad_norm': 0.6359004375019375, 'learning_rate': 3.3536458293847686e-06, 'epoch': 0.62}
62%|██████▏ | 13670/22095 [23:39:46<10:20:03, 4.42s/it] {'loss': 0.3195, 'grad_norm': 0.6743555355132227, 'learning_rate': 3.3529537970155756e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13671/22095 [23:39:55<13:52:21, 5.93s/it] {'loss': 0.5041, 'grad_norm': 0.31263598762639505, 'learning_rate': 3.3522618000363487e-06, 'epoch': 0.62}
62%|██████▏ | 13672/22095 [23:39:59<12:03:42, 5.16s/it] {'loss': 0.3067, 'grad_norm': 0.6122582855146629, 'learning_rate': 3.3515698384619543e-06, 'epoch': 0.62}
62%|██████▏ | 13673/22095 [23:40:01<10:28:19, 4.48s/it] {'loss': 0.3204, 'grad_norm': 0.6522907156600588, 'learning_rate': 3.35087791230726e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13674/22095 [23:40:11<14:18:23, 6.12s/it] {'loss': 0.4722, 'grad_norm': 0.2912571970881983, 'learning_rate': 3.3501860215871363e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (79858 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13675/22095 [23:40:15<12:26:17, 5.32s/it] {'loss': 0.2829, 'grad_norm': 0.6105368967235455, 'learning_rate': 3.3494941663164465e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13676/22095 [23:40:25<15:51:01, 6.78s/it] {'loss': 0.4621, 'grad_norm': 0.2997500457463555, 'learning_rate': 3.348802346510058e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (67148 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45553 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108997 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104658 > 40960).
Running this sequence through the model will result in indexing errors
62%|██████▏ | 13677/22095 [23:40:28<13:25:26, 5.74s/it] {'loss': 0.2748, 'grad_norm': 0.6123374708352166, 'learning_rate': 3.348110562182838e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13678/22095 [23:40:32<11:41:41, 5.00s/it] {'loss': 0.3066, 'grad_norm': 0.6084595710799209, 'learning_rate': 3.3474188133496466e-06, 'epoch': 0.62}
62%|██████▏ | 13679/22095 [23:40:35<10:17:17, 4.40s/it] {'loss': 0.2863, 'grad_norm': 0.5729332026915518, 'learning_rate': 3.346727100025349e-06, 'epoch': 0.62}
62%|██████▏ | 13680/22095 [23:40:38<9:28:24, 4.05s/it] {'loss': 0.3138, 'grad_norm': 0.6368878796261367, 'learning_rate': 3.34603542222481e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (68977 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57734 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54248 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100237 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107352 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13681/22095 [23:40:41<8:54:29, 3.81s/it] {'loss': 0.3202, 'grad_norm': 0.5823494292580653, 'learning_rate': 3.3453437799628885e-06, 'epoch': 0.62}
62%|██████▏ | 13682/22095 [23:40:44<8:27:00, 3.62s/it] {'loss': 0.3633, 'grad_norm': 0.6531662241634709, 'learning_rate': 3.344652173254448e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13683/22095 [23:40:52<11:31:41, 4.93s/it] {'loss': 0.4781, 'grad_norm': 0.30126341048357147, 'learning_rate': 3.343960602114349e-06, 'epoch': 0.62}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8361299 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 56, 100, 100] is too small. Minimum size is 28.
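The ValueError above (and the similar failures later in this log) comes from samples whose stored image dimensions fall below the trainer's 28-pixel minimum. A minimal pre-filtering sketch, assuming the 'image_wh' layout shown in the "Problematic sample" dumps; the helper name is hypothetical, not the trainer's actual API:

```python
# Hypothetical pre-filter for the "Image size ... is too small. Minimum size
# is 28." failures in this log. Assumes each sample dict carries an
# 'image_wh' list of [width, height] pairs, as in the "Problematic sample"
# dumps; the function name is illustrative only.

MIN_IMAGE_SIDE = 28  # minimum width/height, per the ValueError message


def image_large_enough(sample: dict) -> bool:
    """Return True only if every recorded image side is at least MIN_IMAGE_SIDE."""
    return all(
        w >= MIN_IMAGE_SIDE and h >= MIN_IMAGE_SIDE
        for w, h in sample.get("image_wh", [])
    )
```

Running such a predicate over the dataset manifest once, before training, would surface these samples ahead of time instead of triggering the retry-and-refetch cycles seen mid-epoch here.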
Problematic sample: {'id': 28027, 'image': 'vrdu_table_final_2/astro-ph.CO/247a82e4-9804-4cf7-8c3e-7a8ce66b230d.png', 'image_wh': [[14, 56]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #1\\\\#2\n \\end{tabular}\n```"}]}
62%|██████▏ | 13684/22095 [23:40:56<10:25:55, 4.47s/it] {'loss': 0.2774, 'grad_norm': 0.6433215004630558, 'learning_rate': 3.3432690665574485e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13685/22095 [23:41:05<13:58:50, 5.98s/it] {'loss': 0.4561, 'grad_norm': 0.29760997546260387, 'learning_rate': 3.3425775665986093e-06, 'epoch': 0.62}
62%|██████▏ | 13686/22095 [23:41:08<11:59:12, 5.13s/it] {'loss': 0.2772, 'grad_norm': 0.6400974170920216, 'learning_rate': 3.341886102252687e-06, 'epoch': 0.62}
62%|██████▏ | 13687/22095 [23:41:12<11:00:16, 4.71s/it] {'loss': 0.3839, 'grad_norm': 0.6834423109711578, 'learning_rate': 3.3411946735345412e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13688/22095 [23:41:15<9:54:56, 4.25s/it] {'loss': 0.2889, 'grad_norm': 0.6234401108842078, 'learning_rate': 3.340503280459024e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13689/22095 [23:41:25<13:42:21, 5.87s/it] {'loss': 0.4897, 'grad_norm': 0.2790971693531577, 'learning_rate': 3.3398119230409976e-06, 'epoch': 0.62}
62%|██████▏ | 13690/22095 [23:41:28<11:56:15, 5.11s/it] {'loss': 0.3546, 'grad_norm': 0.6686486132169249, 'learning_rate': 3.339120601295314e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (60691 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77633 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83299 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13691/22095 [23:41:31<10:37:55, 4.55s/it] {'loss': 0.322, 'grad_norm': 0.6426563328291927, 'learning_rate': 3.3384293152368264e-06, 'epoch': 0.62}
62%|██████▏ | 13692/22095 [23:41:34<9:32:11, 4.09s/it] {'loss': 0.2905, 'grad_norm': 0.6026238741279107, 'learning_rate': 3.3377380648803894e-06, 'epoch': 0.62}
62%|██████▏ | 13693/22095 [23:41:37<8:43:40, 3.74s/it] {'loss': 0.3009, 'grad_norm': 0.6161566790579164, 'learning_rate': 3.3370468502408584e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (50071 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49485 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47435 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51559 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13694/22095 [23:41:40<8:05:39, 3.47s/it] {'loss': 0.3089, 'grad_norm': 0.6770974820021, 'learning_rate': 3.3363556713330806e-06, 'epoch': 0.62}
62%|██████▏ | 13695/22095 [23:41:43<7:53:08, 3.38s/it] {'loss': 0.3018, 'grad_norm': 0.5855209014504454, 'learning_rate': 3.3356645281719114e-06, 'epoch': 0.62}
62%|██████▏ | 13696/22095 [23:41:47<7:47:45, 3.34s/it] {'loss': 0.4007, 'grad_norm': 0.594537096198144, 'learning_rate': 3.3349734207722e-06, 'epoch': 0.62}
62%|██████▏ | 13697/22095 [23:41:50<7:54:07, 3.39s/it] {'loss': 0.2955, 'grad_norm': 0.575094066503019, 'learning_rate': 3.334282349148795e-06, 'epoch': 0.62}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [664, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8423354 in VC:s3://internvl-moe-sft-data/. Exception: Image size [664, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 162377, 'image': 'vrdu_texteq/astro-ph.CO/952d7a09-ab86-486f-acdd-1507202d419a.png', 'image_wh': [[664, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': '$C_\\Delta$ is inferred from the covariances of $A$ and $A+B$ as'}]}
62%|██████▏ | 13698/22095 [23:41:53<7:35:15, 3.25s/it] {'loss': 0.3129, 'grad_norm': 0.6381673155404338, 'learning_rate': 3.3335913133165467e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13699/22095 [23:42:01<10:55:31, 4.68s/it] {'loss': 0.4803, 'grad_norm': 0.310952462490026, 'learning_rate': 3.332900313290303e-06, 'epoch': 0.62}
62%|██████▏ | 13700/22095 [23:42:05<10:13:44, 4.39s/it] {'loss': 0.28, 'grad_norm': 0.6186094329233779, 'learning_rate': 3.332209349084913e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (103687 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101608 > 40960).
Running this sequence through the model will result in indexing errors
62%|██████▏ | 13701/22095 [23:42:08<9:43:22, 4.17s/it] {'loss': 0.3011, 'grad_norm': 0.6440602505732157, 'learning_rate': 3.3315184207152208e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13702/22095 [23:42:16<11:57:03, 5.13s/it] {'loss': 0.4786, 'grad_norm': 0.29370766930924735, 'learning_rate': 3.330827528196072e-06, 'epoch': 0.62}
62%|██████▏ | 13703/22095 [23:42:20<10:57:14, 4.70s/it] {'loss': 0.3187, 'grad_norm': 0.5860643236849766, 'learning_rate': 3.330136671542315e-06, 'epoch': 0.62}
62%|██████▏ | 13704/22095 [23:42:23<10:11:34, 4.37s/it] {'loss': 0.3108, 'grad_norm': 0.6053177624397245, 'learning_rate': 3.329445850768792e-06, 'epoch': 0.62}
62%|██████▏ | 13705/22095 [23:42:26<9:11:13, 3.94s/it] {'loss': 0.3194, 'grad_norm': 0.6287777257029172, 'learning_rate': 3.3287550658903466e-06, 'epoch': 0.62}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8310854 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1EWCIhBDH8KJjSspnXXbNAVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease process this image using OCR and provide me with the text.'}, {'from': 'gpt', 'value': 'All words in the image:\n仓库搬迁特价促销\n尺寸定做\n超强承重\n日常价\n日常价\n底价\n跌破\n年中感恩\n回馈到底\n清仓价'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893390 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16543, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 4cm\nB. 3cm\nC. 2cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
62%|██████▏ | 13706/22095 [23:42:29<8:34:21, 3.68s/it] {'loss': 0.2893, 'grad_norm': 0.6483596673339858, 'learning_rate': 3.328064316921823e-06, 'epoch': 0.62}
62%|██████▏ | 13707/22095 [23:42:32<8:12:04, 3.52s/it] {'loss': 0.3129, 'grad_norm': 0.640477653434398, 'learning_rate': 3.3273736038780604e-06, 'epoch': 0.62}
62%|██████▏ | 13708/22095 [23:42:37<8:50:59, 3.80s/it] {'loss': 0.3462, 'grad_norm': 0.6681407434538018, 'learning_rate': 3.3266829267739026e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13709/22095 [23:42:44<10:59:46, 4.72s/it] {'loss': 0.4843, 'grad_norm': 0.29857528906145336, 'learning_rate': 3.325992285624191e-06, 'epoch': 0.62}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358992 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1975, 6, 100, 100] is too small. Minimum size is 28.
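The repeated "Rank 0: Number of image tokens 0 does not match number of images 1 / Fixed image tokens in the conversation" pairs in this log suggest the loader patches conversations whose text is missing an image placeholder. A sketch of such a repair, assuming a '<image>' placeholder string and the conversation layout shown in the sample dumps; the real logic lives in data_qwen_2.py and may differ:

```python
# Hypothetical repair mirroring the "Fixed image tokens in the conversation"
# message. IMAGE_TOKEN is an assumed placeholder; the actual token is defined
# by the training code.

IMAGE_TOKEN = "<image>"


def fix_image_tokens(conversations: list[dict], num_images: int) -> list[dict]:
    """Prepend missing image placeholders to the first human turn so the
    placeholder count matches the number of images attached to the sample."""
    for turn in conversations:
        if turn.get("from") == "human":
            missing = num_images - turn["value"].count(IMAGE_TOKEN)
            if missing > 0:
                turn["value"] = IMAGE_TOKEN * missing + "\n" + turn["value"]
            break
    return conversations
```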
Problematic sample: {'id': 25710, 'image': 'vrdu_table_final_2/astro-ph.CO/0a376e38-849a-4c84-9f0e-ade42c817837.png', 'image_wh': [[1975, 6]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{p{\\textwidth}}\\hline\\ \\end{tabular}\n```"}]}
62%|██████▏ | 13710/22095 [23:42:47<10:11:23, 4.37s/it] {'loss': 0.3636, 'grad_norm': 0.6447706460657421, 'learning_rate': 3.325301680443762e-06, 'epoch': 0.62}
62%|██████▏ | 13711/22095 [23:42:50<9:09:57, 3.94s/it] {'loss': 0.3289, 'grad_norm': 0.6682698026092924, 'learning_rate': 3.3246111112474578e-06, 'epoch': 0.62}
62%|██████▏ | 13712/22095 [23:42:53<8:25:31, 3.62s/it] {'loss': 0.3497, 'grad_norm': 0.6591637616635906, 'learning_rate': 3.3239205780501134e-06, 'epoch': 0.62}
62%|██████▏ | 13713/22095 [23:42:56<8:18:53, 3.57s/it] {'loss': 0.2837, 'grad_norm': 0.6754149109862705, 'learning_rate': 3.3232300808665703e-06, 'epoch': 0.62}
62%|██████▏ | 13714/22095 [23:43:00<8:23:23, 3.60s/it] {'loss': 0.3218, 'grad_norm': 0.6193464763113535, 'learning_rate': 3.3225396197116616e-06, 'epoch': 0.62}
62%|██████▏ | 13715/22095 [23:43:03<8:09:05, 3.50s/it] {'loss': 0.328, 'grad_norm': 0.6162007209511707, 'learning_rate': 3.321849194600225e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914856 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38009, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 3\nB. 4\nC. 5\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AB=12,且BC=\\frac{1}{2}AB∴BC=6,AC=18而点D是线段AC的中点,∴AD=\\frac{1}{2}AC=\\frac{1}{2}×18=9而BD=AB-AD=12-9=3'}]}
62%|██████▏ | 13716/22095 [23:43:07<8:20:38, 3.59s/it] {'loss': 0.2941, 'grad_norm': 0.6253423668804438, 'learning_rate': 3.321158805547096e-06, 'epoch': 0.62}
62%|██████▏ | 13717/22095 [23:43:10<7:51:34, 3.38s/it] {'loss': 0.3408, 'grad_norm': 0.6661802237947486, 'learning_rate': 3.320468452567106e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (85376 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13718/22095 [23:43:13<7:26:24, 3.20s/it] {'loss': 0.3066, 'grad_norm': 0.6118279427263077, 'learning_rate': 3.319778135675092e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (50722 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13719/22095 [23:43:16<7:04:42, 3.04s/it] {'loss': 0.3464, 'grad_norm': 0.6382050988995399, 'learning_rate': 3.3190878548858862e-06, 'epoch': 0.62}
62%|██████▏ | 13720/22095 [23:43:19<7:12:56, 3.10s/it] {'loss': 0.2996, 'grad_norm': 0.6587494163288371, 'learning_rate': 3.318397610214319e-06, 'epoch': 0.62}
62%|██████▏ | 13721/22095 [23:43:22<7:23:04, 3.17s/it] {'loss': 0.2783, 'grad_norm': 0.7209291907841151, 'learning_rate': 3.317707401675221e-06, 'epoch': 0.62}
62%|██████▏ | 13722/22095 [23:43:25<7:26:08, 3.20s/it] {'loss': 0.2755, 'grad_norm': 0.5451826261203058, 'learning_rate': 3.317017229283428e-06, 'epoch': 0.62}
62%|██████▏ | 13723/22095 [23:43:29<8:05:08, 3.48s/it] {'loss': 0.3281, 'grad_norm': 0.6646853793016391, 'learning_rate': 3.3163270930537623e-06, 'epoch': 0.62}
62%|██████▏ | 13724/22095 [23:43:33<8:18:48, 3.58s/it] {'loss': 0.2968, 'grad_norm': 0.7129622897575655, 'learning_rate': 3.3156369930010574e-06, 'epoch': 0.62}
62%|██████▏ | 13725/22095 [23:43:37<8:07:59, 3.50s/it] {'loss': 0.3496, 'grad_norm': 0.6001944567292407, 'learning_rate': 3.3149469291401413e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
62%|██████▏ | 13726/22095 [23:43:45<11:49:12, 5.08s/it] {'loss': 0.4998, 'grad_norm': 0.31675565312106924, 'learning_rate': 3.3142569014858395e-06, 'epoch': 0.62}
62%|██████▏ | 13727/22095 [23:43:49<10:33:38, 4.54s/it] {'loss': 0.2726, 'grad_norm': 0.5826162656953073, 'learning_rate':
3.313566910052979e-06, 'epoch': 0.62}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31416.png 2025-08-28 15:41:46.391591 load time: 1232.51 ms
62%|██████▏ | 13728/22095 [23:43:52<9:39:48, 4.16s/it] {'loss': 0.3287, 'grad_norm': 0.6090598312717707, 'learning_rate': 3.3128769548563864e-06, 'epoch': 0.62}
62%|██████▏ | 13729/22095 [23:43:55<8:41:34, 3.74s/it] {'loss': 0.3374, 'grad_norm': 0.6458666374249951, 'learning_rate': 3.312187035910888e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (63806 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60689 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13730/22095 [23:43:59<8:44:53, 3.76s/it] {'loss': 0.3267, 'grad_norm': 0.583242363800884, 'learning_rate': 3.3114971532313058e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (70961 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13731/22095 [23:44:03<9:09:18, 3.94s/it] {'loss': 0.2917, 'grad_norm': 0.6149879962998592, 'learning_rate': 3.310807306832462e-06, 'epoch': 0.62}
62%|██████▏ | 13732/22095 [23:44:06<8:50:18, 3.80s/it] {'loss': 0.3231, 'grad_norm': 0.622620677794301, 'learning_rate': 3.310117496729184e-06, 'epoch': 0.62}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
62%|██████▏ | 13733/22095 [23:44:10<9:04:03, 3.90s/it] {'loss': 0.3134, 'grad_norm': 0.6379068039069801, 'learning_rate': 3.309427722936289e-06, 'epoch': 0.62}
62%|██████▏ | 13734/22095 [23:44:14<9:04:43, 3.91s/it] {'loss': 0.3594, 'grad_norm': 0.6829768054297026, 'learning_rate': 3.308737985468601e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (63337 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81184 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13735/22095 [23:44:18<9:00:47, 3.88s/it] {'loss': 0.3395, 'grad_norm': 0.6276077247995031, 'learning_rate': 3.3080482843409402e-06, 'epoch': 0.62}
62%|██████▏ | 13736/22095 [23:44:21<8:14:37, 3.55s/it] {'loss': 0.2992, 'grad_norm': 0.5825313407520454, 'learning_rate': 3.307358619568123e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (78472 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13737/22095 [23:44:24<7:51:50, 3.39s/it] {'loss': 0.3046, 'grad_norm': 0.6154254435580296, 'learning_rate': 3.3066689911649714e-06, 'epoch': 0.62}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (121140 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47298 > 40960). Running this sequence through the model will result in indexing errors
62%|██████▏ | 13738/22095 [23:44:34<12:20:41, 5.32s/it] {'loss': 0.4916, 'grad_norm': 0.30001794513072544, 'learning_rate': 3.305979399146304e-06, 'epoch': 0.62}
Token indices sequence length is longer than the specified maximum sequence length for this model (61574 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115380 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50250 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101506 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64533 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75067 > 40960). Running this sequence through the model will result in indexing errors 62%|██████▏ | 13739/22095 [23:44:43<14:50:13, 6.39s/it] {'loss': 0.4988, 'grad_norm': 0.29097572170197145, 'learning_rate': 3.305289843526935e-06, 'epoch': 0.62} 62%|██████▏ | 13739/22095 [23:44:43<14:50:13, 6.39s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 62%|██████▏ | 13740/22095 [23:44:46<12:47:34, 5.51s/it] {'loss': 0.3225, 'grad_norm': 0.6322721344768839, 'learning_rate': 3.304600324321682e-06, 'epoch': 0.62} 62%|██████▏ | 13740/22095 [23:44:46<12:47:34, 5.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54109 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13741/22095 [23:44:50<11:20:15, 4.89s/it] {'loss': 0.2876, 'grad_norm': 0.6406997502751267, 'learning_rate': 3.3039108415453614e-06, 'epoch': 0.62} 62%|██████▏ | 13741/22095 [23:44:50<11:20:15, 4.89s/it] 62%|██████▏ | 13742/22095 [23:44:53<9:58:04, 4.30s/it] {'loss': 0.3173, 'grad_norm': 0.6260918992215052, 'learning_rate': 3.303221395212789e-06, 'epoch': 0.62} 62%|██████▏ | 13742/22095 [23:44:53<9:58:04, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 62%|██████▏ | 13743/22095 [23:45:02<13:54:04, 5.99s/it] {'loss': 0.4688, 'grad_norm': 0.2673962411768744, 'learning_rate': 3.302531985338776e-06, 'epoch': 0.62} 62%|██████▏ | 13743/22095 [23:45:03<13:54:04, 5.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60121 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60998 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44729 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13744/22095 [23:45:06<12:29:21, 5.38s/it] {'loss': 0.3163, 'grad_norm': 0.6301571532531297, 'learning_rate': 3.3018426119381364e-06, 'epoch': 0.62} 62%|██████▏ | 13744/22095 [23:45:06<12:29:21, 5.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 62%|██████▏ | 13745/22095 [23:45:15<14:59:59, 6.47s/it] {'loss': 0.4791, 'grad_norm': 0.28869214485696515, 'learning_rate': 3.3011532750256874e-06, 'epoch': 0.62} 62%|██████▏ | 13745/22095 [23:45:15<14:59:59, 6.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 62%|██████▏ | 13746/22095 [23:45:22<14:47:59, 6.38s/it] {'loss': 0.5083, 'grad_norm': 0.2968194690831498, 'learning_rate': 3.300463974616234e-06, 'epoch': 0.62} 62%|██████▏ | 13746/22095 [23:45:22<14:47:59, 6.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43626 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77395 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13747/22095 [23:45:30<16:13:09, 6.99s/it] {'loss': 0.4796, 'grad_norm': 0.2750714673096283, 'learning_rate': 3.2997747107245898e-06, 'epoch': 0.62} 62%|██████▏ | 13747/22095 [23:45:30<16:13:09, 6.99s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_515682.png 2025-08-28 15:43:30.464140 load time: 1114.66 ms 62%|██████▏ | 13748/22095 [23:45:33<13:42:37, 5.91s/it] {'loss': 0.3564, 'grad_norm': 0.6431241474261704, 'learning_rate': 3.2990854833655674e-06, 'epoch': 0.62} 62%|██████▏ | 13748/22095 [23:45:33<13:42:37, 5.91s/it] 62%|██████▏ | 13749/22095 [23:45:43<15:59:20, 6.90s/it] {'loss': 0.5099, 'grad_norm': 0.3403552256461183, 'learning_rate': 3.298396292553972e-06, 'epoch': 0.62} 62%|██████▏ | 13749/22095 [23:45:43<15:59:20, 6.90s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8921711 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44864, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 7\nB. 6\nC. 10\nD. 
8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 62%|██████▏ | 13750/22095 [23:45:46<13:39:40, 5.89s/it] {'loss': 0.3121, 'grad_norm': 0.5901200168171835, 'learning_rate': 3.2977071383046134e-06, 'epoch': 0.62} 62%|██████▏ | 13750/22095 [23:45:46<13:39:40, 5.89s/it]VC:s3://gui-agent/data_20250707/windows/images/os_windows/free_task_20250710_215113/images/20250710_215120_3.png 2025-08-28 15:43:44.956844 load time: 1285.53 ms VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/aaf88046c6bcc5e7e90b18ab2c829785.png 2025-08-28 15:43:46.211443 load time: 1018.53 ms 62%|██████▏ | 13751/22095 [23:45:49<11:47:36, 5.09s/it] {'loss': 0.3034, 'grad_norm': 0.5680916309917373, 'learning_rate': 3.297018020632304e-06, 'epoch': 0.62} 62%|██████▏ | 13751/22095 [23:45:49<11:47:36, 5.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250630/windows_augment_data_20250630/images/matlab/handmade_annotation_2/images/ML_1_id_36_function_0_crop_1_grounding_instructions_random_paste.png 2025-08-28 15:43:47.349236 load time: 3055.87 ms 62%|██████▏ | 13752/22095 [23:46:00<15:17:57, 6.60s/it] {'loss': 0.4828, 'grad_norm': 0.2810511524918745, 'learning_rate': 3.2963289395518434e-06, 'epoch': 0.62} 62%|██████▏ | 13752/22095 [23:46:00<15:17:57, 6.60s/it] 62%|██████▏ | 13753/22095 [23:46:04<13:31:49, 5.84s/it] {'loss': 0.3447, 'grad_norm': 0.6593233991957315, 'learning_rate': 3.295639895078042e-06, 'epoch': 0.62} 62%|██████▏ | 13753/22095 [23:46:04<13:31:49, 5.84s/it] 62%|██████▏ | 13754/22095 [23:46:07<12:11:02, 5.26s/it] {'loss': 0.2901, 'grad_norm': 0.6151045949479557, 'learning_rate': 3.294950887225707e-06, 'epoch': 0.62} 62%|██████▏ | 13754/22095 [23:46:07<12:11:02, 5.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 62%|██████▏ | 13755/22095 [23:46:14<13:17:48, 5.74s/it] {'loss': 0.478, 'grad_norm': 0.2867113598124026, 'learning_rate': 
3.294261916009639e-06, 'epoch': 0.62} 62%|██████▏ | 13755/22095 [23:46:14<13:17:48, 5.74s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [945, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8507746 in VC:s3://internvl-moe-sft-data/. Exception: Image size [945, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4627, 'image': 'vrdu_texteq/astro-ph.CO/21ff96c0-7702-414e-ac76-61d25b5f5d1a.png', 'image_wh': [[945, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'In this context $ A\\supset B$ denotes that A contains B as well as some other terms.'}]} 62%|██████▏ | 13756/22095 [23:46:18<11:43:14, 5.06s/it] {'loss': 0.3057, 'grad_norm': 0.6284771216660319, 'learning_rate': 3.2935729814446426e-06, 'epoch': 0.62} 62%|██████▏ | 13756/22095 [23:46:18<11:43:14, 5.06s/it] 62%|██████▏ | 13757/22095 [23:46:22<10:48:11, 4.66s/it] {'loss': 0.3069, 'grad_norm': 0.6213561568312568, 'learning_rate': 3.2928840835455233e-06, 'epoch': 0.62} 62%|██████▏ | 13757/22095 [23:46:22<10:48:11, 4.66s/it]VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 15:44:21.890018 load time: 1658.99 ms 62%|██████▏ | 13758/22095 [23:46:25<10:01:46, 4.33s/it] {'loss': 0.3111, 'grad_norm': 0.5948747707066405, 'learning_rate': 3.2921952223270824e-06, 'epoch': 0.62} 62%|██████▏ | 13758/22095 [23:46:25<10:01:46, 4.33s/it] 62%|██████▏ | 13759/22095 [23:46:28<9:21:41, 4.04s/it] {'loss': 0.2965, 'grad_norm': 0.6322336320261707, 
'learning_rate': 3.2915063978041205e-06, 'epoch': 0.62} 62%|██████▏ | 13759/22095 [23:46:29<9:21:41, 4.04s/it] 62%|██████▏ | 13760/22095 [23:46:32<8:44:36, 3.78s/it] {'loss': 0.3032, 'grad_norm': 0.6610317478714497, 'learning_rate': 3.290817609991438e-06, 'epoch': 0.62} 62%|██████▏ | 13760/22095 [23:46:32<8:44:36, 3.78s/it] 62%|██████▏ | 13761/22095 [23:46:36<8:48:56, 3.81s/it] {'loss': 0.3142, 'grad_norm': 0.6406700715494363, 'learning_rate': 3.290128858903837e-06, 'epoch': 0.62} 62%|██████▏ | 13761/22095 [23:46:36<8:48:56, 3.81s/it] 62%|██████▏ | 13762/22095 [23:46:39<8:30:37, 3.68s/it] {'loss': 0.2807, 'grad_norm': 0.5882719121004318, 'learning_rate': 3.2894401445561154e-06, 'epoch': 0.62} 62%|██████▏ | 13762/22095 [23:46:39<8:30:37, 3.68s/it] 62%|██████▏ | 13763/22095 [23:46:42<8:05:11, 3.49s/it] {'loss': 0.2932, 'grad_norm': 0.6216161164157775, 'learning_rate': 3.2887514669630706e-06, 'epoch': 0.62} 62%|██████▏ | 13763/22095 [23:46:42<8:05:11, 3.49s/it] 62%|██████▏ | 13764/22095 [23:46:45<7:54:50, 3.42s/it] {'loss': 0.3113, 'grad_norm': 0.6124949720716838, 'learning_rate': 3.2880628261395033e-06, 'epoch': 0.62} 62%|██████▏ | 13764/22095 [23:46:45<7:54:50, 3.42s/it] 62%|██████▏ | 13765/22095 [23:46:49<8:06:19, 3.50s/it] {'loss': 0.3131, 'grad_norm': 0.5685852505759508, 'learning_rate': 3.287374222100205e-06, 'epoch': 0.62} 62%|██████▏ | 13765/22095 [23:46:49<8:06:19, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8954483 in VC:s3://multi-modal/playground/data/geoqa+/. 
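The `[Try #0] Failed to fetch sample ...` messages above show the dataset recovering from a bad record by retrying instead of aborting the epoch. A minimal sketch of that report-and-skip pattern, assuming a `fetch` callable that raises `ValueError` on bad records; the class and names here are hypothetical, not the actual `data_qwen_2.py` implementation:

```python
class SkipBrokenDataset:
    """Wraps a record list so one bad sample cannot crash the epoch."""

    def __init__(self, samples, fetch, max_tries=3):
        self.samples = samples    # raw records
        self.fetch = fetch        # callable that may raise on bad records
        self.max_tries = max_tries

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        for _ in range(self.max_tries):
            try:
                return self.fetch(self.samples[idx])
            except ValueError:
                # mirror the log's "[Try #N] Failed to fetch sample" recovery:
                # skip to a neighbouring index instead of raising to the loader
                idx = (idx + 1) % len(self.samples)
        raise RuntimeError("no valid sample found after retries")
```

Skipping to a neighbouring index keeps the replacement deterministic; production loaders often resample a random index instead so the same substitute is not over-represented.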
Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5318, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 7cm\nB. 8cm\nC. 5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 62%|██████▏ | 13766/22095 [23:46:52<7:40:38, 3.32s/it] {'loss': 0.2906, 'grad_norm': 0.5748963189048516, 'learning_rate': 3.2866856548599757e-06, 'epoch': 0.62} 62%|██████▏ | 13766/22095 [23:46:52<7:40:38, 3.32s/it] 62%|██████▏ | 13767/22095 [23:46:55<7:34:02, 3.27s/it] {'loss': 0.313, 'grad_norm': 0.6265152952502434, 'learning_rate': 3.2859971244336107e-06, 'epoch': 0.62} 62%|██████▏ | 13767/22095 [23:46:55<7:34:02, 3.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48775 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89004 > 40960). Running this sequence through the model will result in indexing errors 62%|██████▏ | 13768/22095 [23:46:58<7:37:40, 3.30s/it] {'loss': 0.326, 'grad_norm': 0.6210138216125896, 'learning_rate': 3.285308630835903e-06, 'epoch': 0.62} 62%|██████▏ | 13768/22095 [23:46:58<7:37:40, 3.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80145 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13769/22095 [23:47:01<7:20:37, 3.18s/it] {'loss': 0.278, 'grad_norm': 0.6093822225680078, 'learning_rate': 3.2846201740816446e-06, 'epoch': 0.62} 62%|██████▏ | 13769/22095 [23:47:01<7:20:37, 3.18s/it] 62%|██████▏ | 13770/22095 [23:47:05<7:55:15, 3.43s/it] {'loss': 0.3086, 'grad_norm': 0.6318730141083124, 'learning_rate': 3.2839317541856317e-06, 'epoch': 0.62} 62%|██████▏ | 13770/22095 [23:47:05<7:55:15, 3.43s/it] 62%|██████▏ | 13771/22095 [23:47:09<7:55:05, 3.42s/it] {'loss': 0.3258, 'grad_norm': 0.6329790145727102, 'learning_rate': 3.2832433711626562e-06, 'epoch': 0.62} 62%|██████▏ | 13771/22095 [23:47:09<7:55:05, 3.42s/it] 62%|██████▏ | 13772/22095 [23:47:12<8:10:52, 3.54s/it] {'loss': 0.3155, 'grad_norm': 0.5543442506257924, 'learning_rate': 3.282555025027507e-06, 'epoch': 0.62} 62%|██████▏ | 13772/22095 [23:47:12<8:10:52, 3.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81546 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43435 > 40960). Running this sequence through the model will result in indexing errors 62%|██████▏ | 13773/22095 [23:47:15<7:45:25, 3.36s/it] {'loss': 0.3133, 'grad_norm': 0.6117032449228192, 'learning_rate': 3.2818667157949742e-06, 'epoch': 0.62} 62%|██████▏ | 13773/22095 [23:47:15<7:45:25, 3.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86749 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13774/22095 [23:47:18<7:33:29, 3.27s/it] {'loss': 0.314, 'grad_norm': 0.6131769415951153, 'learning_rate': 3.281178443479852e-06, 'epoch': 0.62} 62%|██████▏ | 13774/22095 [23:47:18<7:33:29, 3.27s/it] 62%|██████▏ | 13775/22095 [23:47:22<7:38:21, 3.31s/it] {'loss': 0.3104, 'grad_norm': 0.754090728019248, 'learning_rate': 3.2804902080969233e-06, 'epoch': 0.62} 62%|██████▏ | 13775/22095 [23:47:22<7:38:21, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 62%|██████▏ | 13776/22095 [23:47:32<12:07:34, 5.25s/it] {'loss': 0.4743, 'grad_norm': 0.4622437492790766, 'learning_rate': 3.2798020096609795e-06, 'epoch': 0.62} 62%|██████▏ | 13776/22095 [23:47:32<12:07:34, 5.25s/it] 62%|██████▏ | 13777/22095 [23:47:35<10:43:55, 4.64s/it] {'loss': 0.3231, 'grad_norm': 0.5918545279223919, 'learning_rate': 3.2791138481868084e-06, 'epoch': 0.62} 62%|██████▏ | 13777/22095 [23:47:35<10:43:55, 4.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 62%|██████▏ | 13778/22095 [23:47:38<9:53:29, 4.28s/it] {'loss': 0.3062, 'grad_norm': 0.6049844320425433, 'learning_rate': 3.2784257236891948e-06, 'epoch': 0.62} 62%|██████▏ | 13778/22095 [23:47:38<9:53:29, 4.28s/it] 62%|██████▏ | 13779/22095 [23:47:42<9:32:22, 4.13s/it] {'loss': 0.2907, 'grad_norm': 0.6220890451444148, 'learning_rate': 3.2777376361829237e-06, 'epoch': 0.62} 62%|██████▏ | 13779/22095 [23:47:42<9:32:22, 4.13s/it] 62%|██████▏ | 13780/22095 [23:47:46<9:27:45, 4.10s/it] {'loss': 0.3416, 'grad_norm': 0.5917243166323288, 'learning_rate': 3.2770495856827834e-06, 'epoch': 0.62} 62%|██████▏ | 13780/22095 [23:47:46<9:27:45, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 62%|██████▏ | 13781/22095 [23:47:55<13:07:46, 5.69s/it] {'loss': 0.4654, 'grad_norm': 0.3768603010876834, 'learning_rate': 3.2763615722035548e-06, 'epoch': 0.62} 
62%|██████▏ | 13781/22095 [23:47:55<13:07:46, 5.69s/it]VC:s3://gui-agent/data_20250612/mac/images/terminal/af851dfd-b7ce-4e95-95cf-c0fce6b8bb15/images/step_0.png 2025-08-28 15:45:55.920334 load time: 1289.1 ms 62%|██████▏ | 13782/22095 [23:47:59<11:47:04, 5.10s/it] {'loss': 0.3292, 'grad_norm': 1.1432024816430995, 'learning_rate': 3.275673595760022e-06, 'epoch': 0.62} 62%|██████▏ | 13782/22095 [23:47:59<11:47:04, 5.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [198, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366811 in VC:s3://internvl-moe-sft-data/. Exception: Image size [198, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33557, 'image': 'vrdu_table_final_2/astro-ph.CO/c9b556c7-6cb4-4197-8aba-3aa51ebebbf3.png', 'image_wh': [[198, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}\\let\\\\=\\author@nextline\\@author\n \\end{tabular}\n```"}]} 62%|██████▏ | 13783/22095 [23:48:02<10:20:13, 4.48s/it] {'loss': 0.3377, 'grad_norm': 0.6519811601739103, 'learning_rate': 3.274985656366967e-06, 'epoch': 0.62} 62%|██████▏ | 13783/22095 [23:48:02<10:20:13, 4.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42528 > 40960). 
Running this sequence through the model will result in indexing errors 62%|██████▏ | 13784/22095 [23:48:05<9:10:18, 3.97s/it] {'loss': 0.284, 'grad_norm': 0.6088029234300733, 'learning_rate': 3.2742977540391747e-06, 'epoch': 0.62} 62%|██████▏ | 13784/22095 [23:48:05<9:10:18, 3.97s/it] 62%|██████▏ | 13785/22095 [23:48:09<9:00:43, 3.90s/it] {'loss': 0.3033, 'grad_norm': 0.6060950905983736, 'learning_rate': 3.273609888791422e-06, 'epoch': 0.62} 62%|██████▏ | 13785/22095 [23:48:09<9:00:43, 3.90s/it] 62%|██████▏ | 13786/22095 [23:48:12<8:33:06, 3.71s/it] {'loss': 0.2962, 'grad_norm': 1.1703658968985153, 'learning_rate': 3.2729220606384905e-06, 'epoch': 0.62} 62%|██████▏ | 13786/22095 [23:48:12<8:33:06, 3.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 62%|██████▏ | 13787/22095 [23:48:19<11:10:21, 4.84s/it] {'loss': 0.4635, 'grad_norm': 0.3148542397514561, 'learning_rate': 3.2722342695951612e-06, 'epoch': 0.62} 62%|██████▏ | 13787/22095 [23:48:20<11:10:21, 4.84s/it] 62%|██████▏ | 13788/22095 [23:48:23<10:19:39, 4.48s/it] {'loss': 0.3219, 'grad_norm': 0.5937671301737225, 'learning_rate': 3.2715465156762095e-06, 'epoch': 0.62} 62%|██████▏ | 13788/22095 [23:48:23<10:19:39, 4.48s/it] 62%|██████▏ | 13789/22095 [23:48:27<10:08:25, 4.40s/it] {'loss': 0.3268, 'grad_norm': 0.6898405827371933, 'learning_rate': 3.2708587988964134e-06, 'epoch': 0.62} 62%|██████▏ | 13789/22095 [23:48:27<10:08:25, 4.40s/it] 62%|██████▏ | 13790/22095 [23:48:31<9:30:33, 4.12s/it] {'loss': 0.3403, 'grad_norm': 0.6329187007223473, 'learning_rate': 3.270171119270554e-06, 'epoch': 0.62} 62%|██████▏ | 13790/22095 [23:48:31<9:30:33, 4.12s/it] 62%|██████▏ | 13791/22095 [23:48:35<9:35:48, 4.16s/it] {'loss': 0.3126, 'grad_norm': 0.6249047775111826, 'learning_rate': 3.269483476813403e-06, 'epoch': 0.62} 62%|██████▏ | 13791/22095 [23:48:35<9:35:48, 4.16s/it]Invalidate 
trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 62%|██████▏ | 13792/22095 [23:48:41<10:59:13, 4.76s/it] {'loss': 0.4665, 'grad_norm': 0.287356452626235, 'learning_rate': 3.2687958715397373e-06, 'epoch': 0.62} 62%|██████▏ | 13792/22095 [23:48:41<10:59:13, 4.76s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [17, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8369585 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 23, 100, 100] is too small. Minimum size is 28. 
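The recurring `ValueError: Image size [...] is too small. Minimum size is 28` failures all come from annotations whose recorded height is below the 28-pixel minimum (e.g. `[164, 26]`, `[945, 23]`, `[12, 17]`). A metadata pre-filter would catch these before training; a sketch assuming each record carries the `image_wh` field shown in the problematic samples (the helper name is hypothetical):

```python
MIN_IMAGE_SIDE = 28  # minimum accepted height/width, per the logged error

def filter_by_image_size(samples, min_side=MIN_IMAGE_SIDE):
    """Split records into (usable, rejected) using their recorded image sizes.

    Each record is assumed to carry `image_wh`, a list of [width, height]
    pairs, as in the problematic samples printed in the log; records with
    no `image_wh` (text-only) pass through unchanged.
    """
    usable, rejected = [], []
    for sample in samples:
        sizes = sample.get("image_wh", [])
        if all(w >= min_side and h >= min_side for w, h in sizes):
            usable.append(sample)
        else:
            rejected.append(sample)
    return usable, rejected
```

Running this once over the annotation files would trade a one-off scan for the repeated fetch-retry stalls visible in the step timings above.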
Problematic sample: {'id': 36337, 'image': 'vrdu_table_final_2/astro-ph.CO/36d48d26-c1bb-4650-a12a-83235d75dcc8.png', 'image_wh': [[17, 23]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\gamma$\\end{tabular}\n```"}]} 62%|██████▏ | 13793/22095 [23:48:50<14:00:46, 6.08s/it] {'loss': 0.4719, 'grad_norm': 0.30557783476155476, 'learning_rate': 3.2681083034643323e-06, 'epoch': 0.62} 62%|██████▏ | 13793/22095 [23:48:50<14:00:46, 6.08s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 62%|██████▏ | 13794/22095 [23:48:55<13:03:15, 5.66s/it] {'loss': 0.3011, 'grad_norm': 0.6428013543053296, 'learning_rate': 3.2674207726019586e-06, 'epoch': 0.62} 62%|██████▏ | 13794/22095 [23:48:55<13:03:15, 5.66s/it] 62%|██████▏ | 13795/22095 [23:48:58<11:24:36, 4.95s/it] {'loss': 0.3334, 'grad_norm': 0.6440078992859094, 'learning_rate': 3.2667332789673923e-06, 'epoch': 0.62} 62%|██████▏ | 13795/22095 [23:48:58<11:24:36, 4.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 62%|██████▏ | 13796/22095 [23:49:02<10:45:16, 4.67s/it] {'loss': 0.291, 'grad_norm': 0.6103591359768376, 'learning_rate': 3.2660458225754053e-06, 'epoch': 0.62} 62%|██████▏ | 13796/22095 [23:49:02<10:45:16, 4.67s/it] 62%|██████▏ | 13797/22095 [23:49:07<10:28:54, 4.55s/it] {'loss': 0.3366, 'grad_norm': 0.5711343505430136, 'learning_rate': 3.2653584034407677e-06, 'epoch': 0.62} 62%|██████▏ | 13797/22095 [23:49:07<10:28:54, 4.55s/it] 62%|██████▏ | 13798/22095 [23:49:10<9:31:43, 4.13s/it] {'loss': 0.2947, 'grad_norm': 0.6753677755634805, 'learning_rate': 3.264671021578249e-06, 'epoch': 0.62} 62%|██████▏ | 13798/22095 [23:49:10<9:31:43, 4.13s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [125, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365279 in VC:s3://internvl-moe-sft-data/. Exception: Image size [125, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 32020, 'image': 'vrdu_table_final_2/astro-ph.CO/dbc164d9-b6bd-48d6-a057-3a872d0b658b.png', 'image_wh': [[125, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}Hard SED\\end{tabular}\n```"}]} 62%|██████▏ | 13799/22095 [23:49:13<9:01:14, 3.91s/it] {'loss': 0.3245, 'grad_norm': 0.647574021925651, 'learning_rate': 3.2639836770026215e-06, 'epoch': 0.62} 62%|██████▏ | 13799/22095 [23:49:13<9:01:14, 3.91s/it] 62%|██████▏ | 13800/22095 [23:49:17<8:53:20, 3.86s/it] {'loss': 0.3375, 'grad_norm': 0.6507967950709432, 'learning_rate': 3.2632963697286546e-06, 'epoch': 0.62} 62%|██████▏ | 13800/22095 [23:49:17<8:53:20, 3.86s/it] 62%|██████▏ | 13801/22095 [23:49:21<8:46:05, 3.81s/it] {'loss': 0.3414, 'grad_norm': 0.7578310633233802, 'learning_rate': 3.262609099771113e-06, 'epoch': 0.62} 62%|██████▏ | 13801/22095 [23:49:21<8:46:05, 3.81s/it] 62%|██████▏ | 13802/22095 [23:49:24<8:30:11, 3.69s/it] {'loss': 0.317, 'grad_norm': 0.7524822672827859, 'learning_rate': 3.261921867144765e-06, 'epoch': 0.62} 62%|██████▏ | 13802/22095 [23:49:24<8:30:11, 3.69s/it] 62%|██████▏ | 13803/22095 [23:49:27<8:15:12, 3.58s/it] {'loss': 0.3302, 'grad_norm': 0.6484206522561464, 'learning_rate': 
3.2612346718643818e-06, 'epoch': 0.62} 62%|██████▏ | 13803/22095 [23:49:27<8:15:12, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42764 > 40960). Running this sequence through the model will result in indexing errors 62%|██████▏ | 13804/22095 [23:49:31<7:58:51, 3.47s/it] {'loss': 0.2862, 'grad_norm': 0.6192293017632162, 'learning_rate': 3.2605475139447207e-06, 'epoch': 0.62} 62%|██████▏ | 13804/22095 [23:49:31<7:58:51, 3.47s/it] 62%|██████▏ | 13805/22095 [23:49:34<7:40:54, 3.34s/it] {'loss': 0.3266, 'grad_norm': 0.6352264745867553, 'learning_rate': 3.2598603934005535e-06, 'epoch': 0.62} 62%|██████▏ | 13805/22095 [23:49:34<7:40:54, 3.34s/it] 62%|██████▏ | 13806/22095 [23:49:37<7:26:17, 3.23s/it] {'loss': 0.3212, 'grad_norm': 0.6771239055173687, 'learning_rate': 3.259173310246643e-06, 'epoch': 0.62} 62%|██████▏ | 13806/22095 [23:49:37<7:26:17, 3.23s/it] 62%|██████▏ | 13807/22095 [23:49:39<7:10:20, 3.12s/it] {'loss': 0.3014, 'grad_norm': 0.6036924888487291, 'learning_rate': 3.25848626449775e-06, 'epoch': 0.62} 62%|██████▏ | 13807/22095 [23:49:39<7:10:20, 3.12s/it] 62%|██████▏ | 13808/22095 [23:49:44<7:53:30, 3.43s/it] {'loss': 0.3092, 'grad_norm': 0.5917879949587903, 'learning_rate': 3.2577992561686377e-06, 'epoch': 0.62} 62%|██████▏ | 13808/22095 [23:49:44<7:53:30, 3.43s/it] 62%|██████▏ | 13809/22095 [23:49:46<7:28:42, 3.25s/it] {'loss': 0.3168, 'grad_norm': 0.6359248283898935, 'learning_rate': 3.2571122852740703e-06, 'epoch': 0.62} 62%|██████▏ | 13809/22095 [23:49:46<7:28:42, 3.25s/it] 63%|██████▎ | 13810/22095 [23:49:50<7:31:01, 3.27s/it] {'loss': 0.3212, 'grad_norm': 0.6815532910505805, 'learning_rate': 3.256425351828807e-06, 'epoch': 0.63} 63%|██████▎ | 13810/22095 [23:49:50<7:31:01, 3.27s/it] 63%|██████▎ | 13811/22095 [23:49:53<7:22:29, 3.20s/it] {'loss': 0.3324, 'grad_norm': 0.6523054381738229, 'learning_rate': 3.2557384558476067e-06, 'epoch': 0.63} 63%|██████▎ | 13811/22095 [23:49:53<7:22:29, 
3.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (70835 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118453 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13812/22095 [23:50:02<11:36:03, 5.04s/it] {'loss': 0.4923, 'grad_norm': 0.36420794672238715, 'learning_rate': 3.2550515973452295e-06, 'epoch': 0.63} 63%|██████▎ | 13812/22095 [23:50:02<11:36:03, 5.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55404 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54635 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13813/22095 [23:50:06<10:34:39, 4.60s/it] {'loss': 0.3078, 'grad_norm': 0.6536888411766718, 'learning_rate': 3.2543647763364362e-06, 'epoch': 0.63} 63%|██████▎ | 13813/22095 [23:50:06<10:34:39, 4.60s/it] 63%|██████▎ | 13814/22095 [23:50:09<9:55:17, 4.31s/it] {'loss': 0.2874, 'grad_norm': 0.6278223232996571, 'learning_rate': 3.2536779928359818e-06, 'epoch': 0.63} 63%|██████▎ | 13814/22095 [23:50:09<9:55:17, 4.31s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [317, 25, 100, 100] is too small. Minimum size is 28. 
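The many `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings mean the tokenizer emitted more positions than the model supports; feeding such a sequence through would index past the position embeddings, exactly as the warning says. A minimal length guard, offered as a sketch and not as the training script's actual handling:

```python
MODEL_MAX_LEN = 40960  # maximum sequence length reported in the warnings

def clamp_to_max_len(token_ids, max_len=MODEL_MAX_LEN):
    """Return token_ids unchanged if they fit, else a truncated copy plus a flag."""
    if len(token_ids) <= max_len:
        return token_ids, False
    return token_ids[:max_len], True
```

For multimodal batches a blind tail-truncation can sever image placeholder tokens, so pipelines more often drop or re-chunk oversized samples than truncate them.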
[Try #0] Failed to fetch sample 8510710 in VC:s3://internvl-moe-sft-data/. Exception: Image size [317, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20288, 'image': 'vrdu_texteq/astro-ph.CO/f2cfe195-3b56-414e-a397-bab5de4724a7.png', 'image_wh': [[317, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $b_v$ denotes the bias.'}]}
63%|██████▎ | 13815/22095 [23:50:13<9:13:29, 4.01s/it] {'loss': 0.3187, 'grad_norm': 0.6031600414981914, 'learning_rate': 3.252991246858623e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (41374 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48896 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90458 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13816/22095 [23:50:17<9:30:14, 4.13s/it] {'loss': 0.3119, 'grad_norm': 0.8775349594361912, 'learning_rate': 3.2523045384191186e-06, 'epoch': 0.63}
63%|██████▎ | 13817/22095 [23:50:20<8:32:40, 3.72s/it] {'loss': 0.3461, 'grad_norm': 0.6574777145007359, 'learning_rate': 3.25161786753222e-06, 'epoch': 0.63}
63%|██████▎ | 13818/22095 [23:50:24<8:34:54, 3.73s/it] {'loss': 0.3365, 'grad_norm': 0.6385496332359393, 'learning_rate': 3.2509312342126846e-06, 'epoch': 0.63}
VC:s3://gui-agent/data_20250421/web/images/marriott_com/trajectory_121/img/step_2.png 2025-08-28 15:48:22.811294 load time: 1011.97 ms
63%|██████▎ | 13819/22095 [23:50:27<8:19:14, 3.62s/it] {'loss': 0.319, 'grad_norm': 0.5592285709201028, 'learning_rate': 3.250244638475266e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 1, but got module 364
63%|██████▎ | 13820/22095 [23:50:32<9:29:56, 4.13s/it] {'loss': 0.48, 'grad_norm': 0.3097990064128394, 'learning_rate': 3.249558080334716e-06, 'epoch': 0.63}
63%|██████▎ | 13821/22095 [23:50:42<13:17:18, 5.78s/it] {'loss': 0.4853, 'grad_norm': 0.27962557719584863, 'learning_rate': 3.2488715598057856e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13822/22095 [23:50:45<11:27:15, 4.98s/it] {'loss': 0.3368, 'grad_norm': 0.9387839473662496, 'learning_rate': 3.2481850769032287e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [12, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8361120 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27848, 'image': 'vrdu_table_final_2/astro-ph.CO/20ec57d0-a1c1-4f1f-b1ff-4aa2961a5f95.png', 'image_wh': [[12, 17]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\footnotesize #1\n\\end{tabular}\n```"}]}
63%|██████▎ | 13823/22095 [23:50:48<10:20:26, 4.50s/it] {'loss': 0.317, 'grad_norm': 0.6884232226794563, 'learning_rate': 3.2474986316417923e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
63%|██████▎ | 13824/22095 [23:50:57<13:31:29, 5.89s/it] {'loss': 0.4574, 'grad_norm': 0.2886797804451184, 'learning_rate': 3.2468122240362287e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960773 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11608, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
63%|██████▎ | 13825/22095 [23:51:01<11:40:16, 5.08s/it] {'loss': 0.3144, 'grad_norm': 0.6355643364928295, 'learning_rate': 3.246125854101287e-06, 'epoch': 0.63}
63%|██████▎ | 13826/22095 [23:51:04<10:19:48, 4.50s/it] {'loss': 0.3257, 'grad_norm': 0.6015479752126067, 'learning_rate': 3.2454395218517132e-06, 'epoch': 0.63}
63%|██████▎ | 13827/22095 [23:51:08<10:01:07, 4.36s/it] {'loss': 0.2851, 'grad_norm': 0.6398553600797272, 'learning_rate': 3.2447532273022536e-06, 'epoch': 0.63}
63%|██████▎ | 13828/22095 [23:51:13<10:24:32, 4.53s/it] {'loss': 0.2759, 'grad_norm': 0.6120003353559114, 'learning_rate': 3.244066970467658e-06, 'epoch': 0.63}
63%|██████▎ | 13829/22095 [23:51:16<9:28:36, 4.13s/it] {'loss': 0.2943, 'grad_norm': 0.6475841047237043, 'learning_rate': 3.2433807513626714e-06, 'epoch': 0.63}
63%|██████▎ | 13830/22095 [23:51:20<9:15:50, 4.04s/it] {'loss': 0.321, 'grad_norm': 0.6480570558579203, 'learning_rate': 3.242694570002036e-06, 'epoch': 0.63}
63%|██████▎ | 13831/22095 [23:51:23<9:01:09, 3.93s/it] {'loss': 0.297, 'grad_norm': 0.6847686675984392,
'learning_rate': 3.2420084264004966e-06, 'epoch': 0.63}
63%|██████▎ | 13832/22095 [23:51:27<8:37:59, 3.76s/it] {'loss': 0.3233, 'grad_norm': 0.6184729580512613, 'learning_rate': 3.2413223205727995e-06, 'epoch': 0.63}
63%|██████▎ | 13833/22095 [23:51:30<7:59:24, 3.48s/it] {'loss': 0.285, 'grad_norm': 0.63232383158112, 'learning_rate': 3.240636252533681e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (44180 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73730 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13834/22095 [23:51:33<7:42:11, 3.36s/it] {'loss': 0.328, 'grad_norm': 1.6564727417893674, 'learning_rate': 3.2399502222978875e-06, 'epoch': 0.63}
63%|██████▎ | 13835/22095 [23:51:36<7:51:00, 3.42s/it] {'loss': 0.2903, 'grad_norm': 0.6573502586519927, 'learning_rate': 3.239264229880159e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8368502 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 35250, 'image': 'vrdu_table_final_2/astro-ph.CO/fafd48b2-84f5-419b-a5f4-eaa25845668d.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]}
63%|██████▎ | 13836/22095 [23:51:40<7:57:00, 3.47s/it] {'loss': 0.2902, 'grad_norm': 0.6437835906829803, 'learning_rate': 3.2385782752952336e-06, 'epoch': 0.63}
63%|██████▎ | 13837/22095 [23:51:44<8:27:57, 3.69s/it] {'loss': 0.3426, 'grad_norm': 0.6287223521759137, 'learning_rate': 3.2378923585578504e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (56069 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43404 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53502 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77202 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13838/22095 [23:51:47<7:51:26, 3.43s/it] {'loss': 0.3348, 'grad_norm': 0.6936936100572169, 'learning_rate': 3.237206479682751e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (45383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89145 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47537 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13839/22095 [23:51:50<7:26:00, 3.24s/it] {'loss': 0.3061, 'grad_norm': 0.6140669836485039, 'learning_rate': 3.236520638684668e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (50846 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13840/22095 [23:51:53<7:12:48, 3.15s/it] {'loss': 0.3029, 'grad_norm': 0.6145631143983877, 'learning_rate': 3.235834835578341e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13841/22095 [23:51:56<7:20:56, 3.21s/it] {'loss': 0.3551, 'grad_norm': 0.6446262287679247, 'learning_rate': 3.235149070378504e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54732 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71791 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94572 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13842/22095 [23:52:06<12:05:12, 5.27s/it] {'loss': 0.4625, 'grad_norm': 0.41859033443519317, 'learning_rate': 3.2344633430998955e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [228, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8445184 in VC:s3://internvl-moe-sft-data/. Exception: Image size [228, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 141594, 'image': 'vrdu_texteq/astro-ph.CO/e24090a4-b5a1-4e90-b0a9-10a1dfb2f41c.png', 'image_wh': [[228, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'with $r_\\text{dr}$ defined as'}]}
63%|██████▎ | 13843/22095 [23:52:09<10:39:46, 4.65s/it] {'loss': 0.2925, 'grad_norm': 0.6986814785352065, 'learning_rate': 3.233777653757246e-06, 'epoch': 0.63}
63%|██████▎ | 13844/22095 [23:52:13<10:15:05, 4.47s/it] {'loss': 0.2838, 'grad_norm': 0.6146056230805496, 'learning_rate': 3.2330920023652906e-06, 'epoch': 0.63}
63%|██████▎ | 13845/22095 [23:52:17<10:00:20, 4.37s/it] {'loss': 0.2855, 'grad_norm': 0.6248541826618237, 'learning_rate': 3.2324063889387624e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
63%|██████▎ | 13846/22095 [23:52:27<13:25:40, 5.86s/it] {'loss': 0.4494, 'grad_norm': 0.29127147352835037, 'learning_rate': 3.2317208134923895e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13847/22095 [23:52:36<15:49:29, 6.91s/it] {'loss': 0.4794, 'grad_norm': 0.5426443301129628, 'learning_rate': 3.2310352760409067e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (45295 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65015 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (152487 > 40960).
Running this sequence through the model will result in indexing errors
63%|██████▎ | 13848/22095 [23:52:43<15:45:04, 6.88s/it] {'loss': 0.4425, 'grad_norm': 0.256103861392261, 'learning_rate': 3.2303497765990445e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13849/22095 [23:52:47<13:46:57, 6.02s/it] {'loss': 0.3382, 'grad_norm': 0.6424238921109408, 'learning_rate': 3.229664315181529e-06, 'epoch': 0.63}
63%|██████▎ | 13850/22095 [23:52:56<16:07:48, 7.04s/it] {'loss': 0.4543, 'grad_norm': 0.27190227539063644, 'learning_rate': 3.2289788918030894e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [175, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8464751 in VC:s3://internvl-moe-sft-data/. Exception: Image size [175, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57110, 'image': 'vrdu_texteq/astro-ph.CO/7b0a3838-b19a-4b9e-a5a0-22cfbc55aa41.png', 'image_wh': [[175, 23]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': '$\\mathcal{S}$ is defined as'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
63%|██████▎ | 13851/22095 [23:53:00<13:39:24, 5.96s/it] {'loss': 0.3145, 'grad_norm': 0.6729867835394304, 'learning_rate': 3.228293506478457e-06, 'epoch': 0.63}
63%|██████▎ | 13852/22095 [23:53:04<12:10:49, 5.32s/it] {'loss': 0.3033, 'grad_norm': 0.6188356839840186, 'learning_rate': 3.227608159222353e-06, 'epoch': 0.63}
63%|██████▎ | 13853/22095 [23:53:07<10:41:51, 4.67s/it] {'loss': 0.2733, 'grad_norm': 0.5950567912039953, 'learning_rate': 3.2269228500495066e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (56839 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113486 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70847 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81924 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44114 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13854/22095 [23:53:10<9:23:41, 4.10s/it] {'loss': 0.2744, 'grad_norm': 0.6284604076040228, 'learning_rate': 3.2262375789746426e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045975 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
63%|██████▎ | 13855/22095 [23:53:13<8:57:11, 3.91s/it] {'loss': 0.3253, 'grad_norm': 0.726216221839223, 'learning_rate': 3.225552346012487e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 1, but got module 364
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
63%|██████▎ | 13856/22095 [23:53:23<12:46:48, 5.58s/it] {'loss': 0.4537, 'grad_norm': 0.3194964820786272, 'learning_rate': 3.22486715117776e-06, 'epoch': 0.63}
63%|██████▎ | 13857/22095 [23:53:26<11:27:05, 5.00s/it] {'loss': 0.2999, 'grad_norm': 0.7452714533400248, 'learning_rate': 3.224181994485186e-06, 'epoch': 0.63}
63%|██████▎ | 13858/22095 [23:53:30<10:54:45, 4.77s/it] {'loss': 0.2673, 'grad_norm': 0.5865218640010116, 'learning_rate': 3.2234968759494883e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [509, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8531115 in VC:s3://internvl-moe-sft-data/. Exception: Image size [509, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3475, 'image': 'vrdu_texteq/astro-ph.CO/1e9ff18d-ca1c-4fa0-abe9-0155e0ffefd1.png', 'image_wh': [[509, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'where $m$ is the mass of the fluid elements.'}]}
63%|██████▎ | 13859/22095 [23:53:34<9:53:13, 4.32s/it] {'loss': 0.2728, 'grad_norm': 0.625521131109761, 'learning_rate': 3.2228117955853853e-06, 'epoch': 0.63}
63%|██████▎ | 13860/22095 [23:53:38<9:31:11, 4.16s/it] {'loss': 0.3171, 'grad_norm': 0.6312711646093929, 'learning_rate': 3.2221267534075986e-06, 'epoch': 0.63}
63%|██████▎ | 13861/22095 [23:53:41<9:16:43, 4.06s/it] {'loss': 0.3308, 'grad_norm': 0.6587997158261353, 'learning_rate': 3.221441749430849e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 1, but got module 364
63%|██████▎ | 13862/22095 [23:53:51<13:14:42, 5.79s/it] {'loss': 0.4494, 'grad_norm': 0.3269749147811726, 'learning_rate': 3.220756783669852e-06, 'epoch': 0.63}
63%|██████▎ | 13863/22095 [23:53:55<11:45:33, 5.14s/it] {'loss': 0.3255, 'grad_norm': 0.6300477574521686, 'learning_rate': 3.2200718561393283e-06, 'epoch': 0.63}
63%|██████▎ | 13864/22095 [23:53:58<10:45:03, 4.70s/it] {'loss': 0.3018, 'grad_norm': 0.7105608912769671, 'learning_rate': 3.2193869668539947e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914368 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37521, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nA. 16\nB. 2\nC. 4\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13865/22095 [23:54:02<10:01:27, 4.38s/it] {'loss': 0.3009, 'grad_norm': 0.6095324079681443, 'learning_rate': 3.2187021158285646e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893508 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16661, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 12cm'}, {'from': 'gpt', 'value': '【解答】解:∵点P是AC的中点,点Q是BC的中点,线段AC=8cm,线段BC=4cm,∴CP=4cm,CQ=2cm,∴PQ=4+2=6cm.'}]}
63%|██████▎ | 13866/22095 [23:54:06<9:26:08, 4.13s/it] {'loss': 0.3255, 'grad_norm': 0.585902388220184, 'learning_rate': 3.2180173030777552e-06, 'epoch': 0.63}
63%|██████▎ | 13867/22095 [23:54:09<8:39:23, 3.79s/it] {'loss': 0.324, 'grad_norm': 0.6738114149519168, 'learning_rate': 3.2173325286162825e-06, 'epoch': 0.63}
63%|██████▎ | 13868/22095 [23:54:12<8:06:55, 3.55s/it] {'loss': 0.3191, 'grad_norm': 0.6705044356390757, 'learning_rate': 3.216647792458858e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (75549 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41888 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116728 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13869/22095 [23:54:15<7:53:28, 3.45s/it] {'loss': 0.3322, 'grad_norm': 0.6308960620317975, 'learning_rate': 3.215963094620195e-06, 'epoch': 0.63}
63%|██████▎ | 13870/22095 [23:54:18<7:39:00, 3.35s/it] {'loss': 0.3152, 'grad_norm': 0.6366572702729194, 'learning_rate': 3.215278435115005e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (94241 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13871/22095 [23:54:21<7:30:07, 3.28s/it] {'loss': 0.2958, 'grad_norm': 0.6480624211144015, 'learning_rate': 3.2145938139580015e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13872/22095 [23:54:24<7:28:54, 3.28s/it] {'loss': 0.3296, 'grad_norm': 0.6289197414985261, 'learning_rate': 3.2139092311638932e-06, 'epoch': 0.63}
63%|██████▎ | 13873/22095 [23:54:27<7:16:20, 3.18s/it] {'loss': 0.3154, 'grad_norm': 0.5775930571137048, 'learning_rate': 3.2132246867473892e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (52879 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63809 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13874/22095 [23:54:31<7:26:21, 3.26s/it] {'loss': 0.3109, 'grad_norm': 0.6611953191837328, 'learning_rate': 3.2125401807232008e-06, 'epoch': 0.63}
63%|██████▎ | 13875/22095 [23:54:34<7:14:47, 3.17s/it] {'loss': 0.3177, 'grad_norm': 0.6291090272339493, 'learning_rate': 3.2118557131060323e-06, 'epoch': 0.63}
63%|██████▎ | 13876/22095 [23:54:38<7:56:35, 3.48s/it] {'loss': 0.3307, 'grad_norm': 0.6368641840116436, 'learning_rate': 3.211171283910593e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13877/22095 [23:54:47<12:03:50, 5.28s/it] {'loss': 0.4719, 'grad_norm': 0.2998450724408988, 'learning_rate': 3.21048689315159e-06, 'epoch': 0.63}
63%|██████▎ | 13878/22095 [23:54:54<12:51:17, 5.63s/it] {'loss': 0.4801, 'grad_norm': 0.3263249370296473, 'learning_rate': 3.209802540843727e-06, 'epoch': 0.63}
Invalidate trace cache @ step 2: expected module 364, but got module 1
63%|██████▎ | 13879/22095 [23:54:57<11:28:36, 5.03s/it] {'loss': 0.3182, 'grad_norm': 0.6287527403460311, 'learning_rate': 3.2091182270017073e-06, 'epoch': 0.63}
63%|██████▎ | 13880/22095 [23:55:02<11:23:40, 4.99s/it] {'loss': 0.2956, 'grad_norm': 0.6882364507965004, 'learning_rate': 3.208433951640241e-06, 'epoch': 0.63}
63%|██████▎ | 13881/22095 [23:55:07<11:00:13, 4.82s/it] {'loss': 0.2979, 'grad_norm': 0.6499708052175333, 'learning_rate': 3.207749714774023e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
63%|██████▎ | 13882/22095 [23:55:16<14:06:22, 6.18s/it] {'loss': 0.4612, 'grad_norm': 0.27344950178467453, 'learning_rate': 3.20706551641776e-06, 'epoch': 0.63}
63%|██████▎ | 13883/22095 [23:55:20<12:27:19, 5.46s/it] {'loss': 0.3316, 'grad_norm': 0.6161048425175149, 'learning_rate': 3.206381356586151e-06, 'epoch': 0.63}
63%|██████▎ | 13884/22095 [23:55:23<10:43:50, 4.70s/it] {'loss': 0.3029, 'grad_norm': 0.5950934825653854, 'learning_rate': 3.205697235293902e-06, 'epoch': 0.63}
63%|██████▎ | 13885/22095 [23:55:26<9:32:58, 4.19s/it] {'loss': 0.3429, 'grad_norm': 0.5932586492807993, 'learning_rate': 3.205013152555705e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13886/22095 [23:55:29<9:05:28, 3.99s/it] {'loss': 0.311, 'grad_norm': 0.666508282891372, 'learning_rate': 3.2043291083862636e-06, 'epoch': 0.63}
63%|██████▎ | 13887/22095 [23:55:33<8:34:04, 3.76s/it] {'loss': 0.2759, 'grad_norm': 0.5829000266122752, 'learning_rate': 3.203645102800276e-06, 'epoch': 0.63}
Token indices sequence length is longer than the specified maximum sequence length for this model (44164 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43402 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52549 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86135 > 40960). Running this sequence through the model will result in indexing errors
63%|██████▎ | 13888/22095 [23:55:35<7:58:13, 3.50s/it] {'loss': 0.3163, 'grad_norm': 0.6251673579480885, 'learning_rate': 3.202961135812437e-06, 'epoch': 0.63}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
63%|██████▎ | 13889/22095 [23:55:39<7:43:21, 3.39s/it] {'loss': 0.3294, 'grad_norm': 0.6107779886231521, 'learning_rate': 3.2022772074374424e-06, 'epoch': 0.63}
63%|██████▎ | 13890/22095 [23:55:42<7:43:47, 3.39s/it] {'loss': 0.3132, 'grad_norm': 0.6090162862844448, 'learning_rate': 3.2015933176899915e-06, 'epoch': 0.63}
63%|██████▎ | 13891/22095 [23:55:45<7:22:07, 3.23s/it] {'loss': 0.3301, 'grad_norm': 0.6906544995735248, 'learning_rate': 3.2009094665847763e-06, 'epoch': 0.63}
63%|██████▎ | 13892/22095 [23:55:49<7:38:42, 3.36s/it] {'loss': 0.3585, 'grad_norm': 0.6114037897338339, 'learning_rate': 3.200225654136491e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [381, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8429310 in VC:s3://internvl-moe-sft-data/. Exception: Image size [381, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36757, 'image': 'vrdu_texteq/astro-ph.CO/c5db5466-c8af-4efa-bbce-e711648e3e4a.png', 'image_wh': [[381, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'We set $M_{\\mathrm{min}}$ as the Jeans mass~'}]}
63%|██████▎ | 13893/22095 [23:55:52<7:57:56, 3.50s/it] {'loss': 0.3176, 'grad_norm': 0.6393766558008107, 'learning_rate': 3.19954188035983e-06, 'epoch': 0.63}
63%|██████▎ | 13894/22095 [23:55:56<7:47:39, 3.42s/it] {'loss': 0.3214, 'grad_norm': 0.7565660542118592, 'learning_rate': 3.1988581452694815e-06, 'epoch': 0.63}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11234185 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8676, 'image': 'MMMU/Pharmacy/test_78_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. Here are the content and options of this question: In a blast furnace, iron(lll) oxide is used to produce iron by the following (unbalanced) reaction: ,If 4.00 kg Fe2O3 are available to react, how many moles of CO are needed?\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': "## Introduction to the Blast Furnace and Iron Production\n\nThe production of iron in a blast furnace is a classic example of industrial chemistry, where raw materials undergo chemical reactions at high temperatures to produce metals. The blast furnace is a large, steel stack lined with refractory brick, wherein iron ores, coke, and limestone are combined. The chemical reactions taking place in a blast furnace are a balance of oxidation and reduction reactions, integral to the extraction of iron from its ores.\n\n### Iron(III) Oxide and its Role\n\nIron(III) oxide, or ferric oxide (Fe₂O₃), is a reddish-brown compound identified as a major constituent of rust. It occurs naturally in the mineral hematite, which is one of the main ores used in the iron extraction process. 
Iron(III) oxide is chosen for its role in the blast furnace due to its availability and relatively high iron content.\n\n#### Chemical Properties:\n\n- **Molar Mass:** 159.69 g/mol\n- **Structure:** Iron atoms are in the +3 oxidation state, bonded with three oxygen atoms.\n \n#### Application in Blast Furnaces:\n\nIn these high-temperature furnaces, Fe₂O₃ is reduced to metallic iron through reactions with carbon monoxide, which acts as the reducing agent. \n\n### Carbon Monoxide as a Reducing Agent\n\nCarbon monoxide (CO) is a colorless, odorless gas often produced in combustion processes. In metallurgy, particularly in the blast furnace, carbon monoxide plays an essential role as a reducing agent, facilitating the conversion of metal oxides to their elemental forms.\n\n#### Chemical Properties:\n\n- **Molar Mass:** 28.01 g/mol\n- **Structure:** A simple diatomic molecule with a carbon and an oxygen atom triple-bonded.\n\n#### Application in Iron Extraction:\n\nWithin the context of a blast furnace, carbon monoxide is generated from coke (a form of carbon) and reacts with iron ore (Fe₂O₃) to reduce it to metallic iron while being oxidized to carbon dioxide (CO₂):\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\]\n\n### Stoichiometry and Chemical Reactions\n\nStoichiometry involves the quantitative relationships between reactants and products in a chemical reaction. 
It is a fundamental concept in chemistry that allows us to predict the amounts of substances consumed and produced in a given reaction.\n\n#### Balancing Chemical Equations:\n\nA balanced chemical equation has equal numbers of atoms of each element on both sides of the equation, which is crucial for maintaining the law of conservation of mass.\n\n#### Example: Balancing the Blast Furnace Reaction\n\nThe initial step is to balance the reaction:\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\]\n\nEach side has 2 iron atoms, 3 carbon atoms, and 6 oxygen atoms, maintaining balance and showing the stoichiometry of the reaction.\n\n### Conversion Between Mass and Moles\n\nUnderstanding the relationship between the mass of a substance and the amount in moles is critical for solving stoichiometric problems. \n\n#### Mole Concept:\n\nA mole is a standard scientific unit for measuring large quantities of small entities such as atoms or molecules. One mole is Avogadro's number (approximately \\(6.022 \\times 10^{23}\\)) of entities.\n\n#### Calculating Moles from Mass:\n\nTo determine the number of moles (n) from mass (m), the following formula is often used:\n\n\\[ n = \\frac{m}{M} \\]\n\nwhere \\( M \\) is the molar mass of the substance.\n\n### Application to the Given Problem\n\nGiven that 4.00 kg of Fe₂O₃ are available, we need to determine how many moles of CO are needed for the reaction. The balanced equation reveals a 1:3 molar ratio between Fe₂O₃ and CO.\n\n1. **Convert Mass to Moles:**\n - Calculate the moles of Fe₂O₃ from the given mass:\n \\[\n \\text{Mass of Fe}_2\\text{O}_3 = 4.00 \\, \\text{kg} = 4000 \\, \\text{g}\n \\]\n \\[\n \\text{Moles of Fe}_2\\text{O}_3 = \\frac{4000 \\, \\text{g}}{159.69 \\, \\text{g/mol}} \\approx 25.05 \\, \\text{mol}\n \\]\n\n2. 
**Determine Moles of CO Required:**\n - Utilizing the 1:3 molar ratio in the reaction:\n \\[\n \\text{Moles of CO} = 3 \\times \\text{Moles of Fe}_2\\text{O}_3 = 3 \\times 25.05 \\, \\text{mol} \\approx 75.15 \\, \\text{mol}\n \\]\n\n### Examples and Applications\n\n#### Industrial Applications:\n\n1. **Steel Making:** The produced iron is used in making steel, which involves further refining and alloying to achieve desired mechanical properties.\n \n2. **Automobile Manufacturing:** Iron and steel form essential components in the construction of vehicles, contributing to structural integrity and safety.\n\n#### Environmental Considerations:\n\nThe blast furnace process, while efficient, is a significant source of CO₂ emissions, prompting efforts to find more sustainable methods of iron production, like hydrogen-based reduction processes, which can potentially reduce carbon footprint.\n\n### Conclusion\n\nUnderstanding the stoichiometry of reactions, especially in industrial settings like blast furnaces, is crucial to achieving efficient and effective production processes. It involves grasping concepts of moles, balancing reactions, and applying these to real-world scenarios to calculate reactant and product quantities accurately. This not only ensures the proper functioning of processes but also helps in minimizing waste and optimizing resource use. 
As industries strive towards sustainability, innovations in these processes and their chemistry are ongoing fields of study, with the aim of balancing economic viability with environmental responsibility."}]} 63%|██████▎ | 13895/22095 [23:56:00<8:09:54, 3.58s/it] {'loss': 0.3344, 'grad_norm': 0.624572649998261, 'learning_rate': 3.1981744488801416e-06, 'epoch': 0.63} 63%|██████▎ | 13895/22095 [23:56:00<8:09:54, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13896/22095 [23:56:09<12:07:03, 5.32s/it] {'loss': 0.4628, 'grad_norm': 0.33261469621894185, 'learning_rate': 3.1974907912064986e-06, 'epoch': 0.63} 63%|██████▎ | 13896/22095 [23:56:09<12:07:03, 5.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13897/22095 [23:56:12<10:53:09, 4.78s/it] {'loss': 0.3199, 'grad_norm': 0.7448596327261254, 'learning_rate': 3.1968071722632432e-06, 'epoch': 0.63} 63%|██████▎ | 13897/22095 [23:56:12<10:53:09, 4.78s/it] 63%|██████▎ | 13898/22095 [23:56:16<9:56:08, 4.36s/it] {'loss': 0.3175, 'grad_norm': 0.5851809149186197, 'learning_rate': 3.196123592065063e-06, 'epoch': 0.63} 63%|██████▎ | 13898/22095 [23:56:16<9:56:08, 4.36s/it] 63%|██████▎ | 13899/22095 [23:56:20<9:30:45, 4.18s/it] {'loss': 0.3231, 'grad_norm': 0.6151664808855503, 'learning_rate': 3.1954400506266453e-06, 'epoch': 0.63} 63%|██████▎ | 13899/22095 [23:56:20<9:30:45, 4.18s/it] 63%|██████▎ | 13900/22095 [23:56:23<9:15:54, 4.07s/it] {'loss': 0.3243, 'grad_norm': 0.61384899017604, 'learning_rate': 3.194756547962681e-06, 'epoch': 0.63} 63%|██████▎ | 13900/22095 [23:56:23<9:15:54, 4.07s/it] 63%|██████▎ | 13901/22095 [23:56:27<8:53:17, 3.90s/it] {'loss': 0.2846, 'grad_norm': 0.6155705262342234, 'learning_rate': 3.1940730840878532e-06, 'epoch': 0.63} 63%|██████▎ | 13901/22095 [23:56:27<8:53:17, 3.90s/it] 63%|██████▎ | 13902/22095 [23:56:31<8:41:04, 3.82s/it] {'loss': 0.323, 'grad_norm': 
0.6030867830465274, 'learning_rate': 3.193389659016848e-06, 'epoch': 0.63} 63%|██████▎ | 13902/22095 [23:56:31<8:41:04, 3.82s/it] 63%|██████▎ | 13903/22095 [23:56:34<8:17:41, 3.65s/it] {'loss': 0.2796, 'grad_norm': 0.5851968191163726, 'learning_rate': 3.192706272764351e-06, 'epoch': 0.63} 63%|██████▎ | 13903/22095 [23:56:34<8:17:41, 3.65s/it] 63%|██████▎ | 13904/22095 [23:56:37<7:44:02, 3.40s/it] {'loss': 0.33, 'grad_norm': 0.6389463259250212, 'learning_rate': 3.192022925345044e-06, 'epoch': 0.63} 63%|██████▎ | 13904/22095 [23:56:37<7:44:02, 3.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified ) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess patches, image_grid_thw = self._preprocess( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess resized_height, resized_width = smart_resize( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize raise 
ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}") ValueError: height:21 and width:135 must be larger than factor:28 [Try #0] Failed to fetch sample 2102091 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:21 and width:135 must be larger than factor:28 Problematic sample: {'image': 'b740dccee641dd995e5ce727ca3882efdf31feffa6d5688fe120c85e9c186e93.png', 'conversations': [{'from': 'human', 'value': '\nThis Button is positioned as follows:\nThe button is located in the middle section of the interface, to the right of a green circular play button. It is part of a horizontal control panel that includes other interactive elements. The button is positioned between the play button and a three-dot menu icon.'}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]', 'recipient': 'all', 'end_turn': True}]} 63%|██████▎ | 13905/22095 [23:56:40<8:00:19, 3.52s/it] {'loss': 0.3088, 'grad_norm': 0.6699153846748191, 'learning_rate': 3.191339616773612e-06, 'epoch': 0.63} 63%|██████▎ | 13905/22095 [23:56:40<8:00:19, 3.52s/it] 63%|██████▎ | 13906/22095 [23:56:44<8:12:19, 3.61s/it] {'loss': 0.3375, 'grad_norm': 0.6763927720203098, 'learning_rate': 3.190656347064739e-06, 'epoch': 0.63} 63%|██████▎ | 13906/22095 [23:56:44<8:12:19, 3.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13907/22095 [23:56:48<8:12:20, 3.61s/it] {'loss': 0.3117, 'grad_norm': 0.6410174157777444, 'learning_rate': 3.189973116233103e-06, 'epoch': 0.63} 63%|██████▎ | 13907/22095 [23:56:48<8:12:20, 3.61s/it] 63%|██████▎ | 13908/22095 [23:56:52<8:18:00, 3.65s/it] {'loss': 0.3496, 'grad_norm': 0.613551811374985, 'learning_rate': 3.1892899242933834e-06, 'epoch': 0.63} 63%|██████▎ | 13908/22095 [23:56:52<8:18:00, 3.65s/it] 63%|██████▎ | 13909/22095 [23:56:55<8:25:08, 3.70s/it] {'loss': 0.3513, 'grad_norm': 0.606707201595439, 'learning_rate': 3.1886067712602656e-06, 'epoch': 
0.63} 63%|██████▎ | 13909/22095 [23:56:55<8:25:08, 3.70s/it] 63%|██████▎ | 13910/22095 [23:56:58<7:46:17, 3.42s/it] {'loss': 0.3551, 'grad_norm': 0.6494404568011635, 'learning_rate': 3.1879236571484224e-06, 'epoch': 0.63} 63%|██████▎ | 13910/22095 [23:56:58<7:46:17, 3.42s/it] 63%|██████▎ | 13911/22095 [23:57:01<7:23:33, 3.25s/it] {'loss': 0.2697, 'grad_norm': 0.5912183003606637, 'learning_rate': 3.1872405819725356e-06, 'epoch': 0.63} 63%|██████▎ | 13911/22095 [23:57:01<7:23:33, 3.25s/it] 63%|██████▎ | 13912/22095 [23:57:04<7:12:12, 3.17s/it] {'loss': 0.2802, 'grad_norm': 0.6521680921138964, 'learning_rate': 3.1865575457472797e-06, 'epoch': 0.63} 63%|██████▎ | 13912/22095 [23:57:04<7:12:12, 3.17s/it] 63%|██████▎ | 13913/22095 [23:57:08<7:27:12, 3.28s/it] {'loss': 0.2909, 'grad_norm': 0.5875091213842563, 'learning_rate': 3.1858745484873356e-06, 'epoch': 0.63} 63%|██████▎ | 13913/22095 [23:57:08<7:27:12, 3.28s/it] 63%|██████▎ | 13914/22095 [23:57:11<7:24:12, 3.26s/it] {'loss': 0.3577, 'grad_norm': 0.6159229710750475, 'learning_rate': 3.1851915902073734e-06, 'epoch': 0.63} 63%|██████▎ | 13914/22095 [23:57:11<7:24:12, 3.26s/it] 63%|██████▎ | 13915/22095 [23:57:14<7:08:36, 3.14s/it] {'loss': 0.3101, 'grad_norm': 0.6312091121112716, 'learning_rate': 3.184508670922071e-06, 'epoch': 0.63} 63%|██████▎ | 13915/22095 [23:57:14<7:08:36, 3.14s/it] 63%|██████▎ | 13916/22095 [23:57:17<7:32:34, 3.32s/it] {'loss': 0.3303, 'grad_norm': 0.6328397534853961, 'learning_rate': 3.1838257906461016e-06, 'epoch': 0.63} 63%|██████▎ | 13916/22095 [23:57:17<7:32:34, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81521 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13917/22095 [23:57:21<7:56:47, 3.50s/it] {'loss': 0.2836, 'grad_norm': 0.6220419513305495, 'learning_rate': 3.183142949394138e-06, 'epoch': 0.63} 63%|██████▎ | 13917/22095 [23:57:21<7:56:47, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13918/22095 [23:57:25<8:01:45, 3.53s/it] {'loss': 0.3122, 'grad_norm': 0.6483485814812089, 'learning_rate': 3.1824601471808504e-06, 'epoch': 0.63} 63%|██████▎ | 13918/22095 [23:57:25<8:01:45, 3.53s/it] 63%|██████▎ | 13919/22095 [23:57:29<8:15:00, 3.63s/it] {'loss': 0.3569, 'grad_norm': 0.6153026674344495, 'learning_rate': 3.181777384020915e-06, 'epoch': 0.63} 63%|██████▎ | 13919/22095 [23:57:29<8:15:00, 3.63s/it] 63%|██████▎ | 13920/22095 [23:57:32<8:15:58, 3.64s/it] {'loss': 0.3436, 'grad_norm': 1.1590618909167922, 'learning_rate': 3.1810946599289983e-06, 'epoch': 0.63} 63%|██████▎ | 13920/22095 [23:57:32<8:15:58, 3.64s/it] 63%|██████▎ | 13921/22095 [23:57:36<8:30:35, 3.75s/it] {'loss': 0.3378, 'grad_norm': 0.6600727506975453, 'learning_rate': 3.1804119749197703e-06, 'epoch': 0.63} 63%|██████▎ | 13921/22095 [23:57:36<8:30:35, 3.75s/it] 63%|██████▎ | 13922/22095 [23:57:40<8:11:41, 3.61s/it] {'loss': 0.2901, 'grad_norm': 0.570883949381704, 'learning_rate': 3.179729329007902e-06, 'epoch': 0.63} 63%|██████▎ | 13922/22095 [23:57:40<8:11:41, 3.61s/it] 63%|██████▎ | 13923/22095 [23:57:43<7:51:28, 3.46s/it] {'loss': 0.3714, 'grad_norm': 0.6916262353759627, 'learning_rate': 3.179046722208058e-06, 'epoch': 0.63} 63%|██████▎ | 13923/22095 [23:57:43<7:51:28, 3.46s/it] 63%|██████▎ | 13924/22095 [23:57:46<7:49:06, 3.44s/it] {'loss': 0.3072, 'grad_norm': 0.580028197703851, 'learning_rate': 3.1783641545349074e-06, 'epoch': 0.63} 63%|██████▎ | 13924/22095 [23:57:46<7:49:06, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence 
length is longer than the specified maximum sequence length for this model (43939 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13925/22095 [23:57:53<10:25:12, 4.59s/it] {'loss': 0.4392, 'grad_norm': 0.2973455929085098, 'learning_rate': 3.1776816260031172e-06, 'epoch': 0.63} 63%|██████▎ | 13925/22095 [23:57:53<10:25:12, 4.59s/it] 63%|██████▎ | 13926/22095 [23:57:57<9:24:04, 4.14s/it] {'loss': 0.2949, 'grad_norm': 0.6103130810803539, 'learning_rate': 3.1769991366273533e-06, 'epoch': 0.63} 63%|██████▎ | 13926/22095 [23:57:57<9:24:04, 4.14s/it] 63%|██████▎ | 13927/22095 [23:58:00<8:44:18, 3.85s/it] {'loss': 0.3511, 'grad_norm': 0.6174746681694337, 'learning_rate': 3.1763166864222766e-06, 'epoch': 0.63} 63%|██████▎ | 13927/22095 [23:58:00<8:44:18, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58965 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93600 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13928/22095 [23:58:03<8:23:27, 3.70s/it] {'loss': 0.309, 'grad_norm': 0.6284170932260028, 'learning_rate': 3.175634275402555e-06, 'epoch': 0.63} 63%|██████▎ | 13928/22095 [23:58:03<8:23:27, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47038 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13929/22095 [23:58:06<7:54:35, 3.49s/it] {'loss': 0.3246, 'grad_norm': 0.6609520162027985, 'learning_rate': 3.1749519035828495e-06, 'epoch': 0.63} 63%|██████▎ | 13929/22095 [23:58:06<7:54:35, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111731 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13930/22095 [23:58:09<7:37:31, 3.36s/it] {'loss': 0.325, 'grad_norm': 0.7255600667588301, 'learning_rate': 3.1742695709778222e-06, 'epoch': 0.63} 63%|██████▎ | 13930/22095 [23:58:09<7:37:31, 3.36s/it] 63%|██████▎ | 13931/22095 [23:58:12<7:15:24, 3.20s/it] {'loss': 0.2919, 'grad_norm': 0.5971870426282323, 'learning_rate': 3.1735872776021344e-06, 'epoch': 0.63} 63%|██████▎ | 13931/22095 [23:58:12<7:15:24, 3.20s/it] 63%|██████▎ | 13932/22095 [23:58:16<7:37:00, 3.36s/it] {'loss': 0.3205, 'grad_norm': 0.6410748277342618, 'learning_rate': 3.1729050234704474e-06, 'epoch': 0.63} 63%|██████▎ | 13932/22095 [23:58:16<7:37:00, 3.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13933/22095 [23:58:23<10:15:23, 4.52s/it] {'loss': 0.4587, 'grad_norm': 0.3077461694511978, 'learning_rate': 3.1722228085974183e-06, 'epoch': 0.63} 63%|██████▎ | 13933/22095 [23:58:23<10:15:23, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43272 > 40960). 
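The tokenizer warning repeated throughout this stretch of the log ("Token indices sequence length is longer than the specified maximum sequence length for this model (43272 > 40960)") means individual samples exceed the model's 40960-token context window. A minimal pre-check along these lines could drop over-long samples before they reach the model; the `tokenize` callable and the filtering step are assumptions for illustration, not part of the training code shown here:

```python
# Minimal sketch: skip samples whose token count exceeds the context
# window. MAX_LEN matches the 40960 limit reported in the log; the
# `tokenize` callable stands in for the real tokenizer's encode method.
MAX_LEN = 40960

def filter_overlong(texts, tokenize, max_len=MAX_LEN):
    """Yield only texts whose tokenized length fits within max_len."""
    for text in texts:
        ids = tokenize(text)
        if len(ids) <= max_len:
            yield text
        # else: log and drop, rather than risk indexing errors downstream

# Toy usage with whitespace "tokenization" and a tiny limit:
kept = list(filter_overlong(["a b c", "a b c d e"], str.split, max_len=4))
# kept == ["a b c"]
```

Filtering at data-loading time avoids the "Running this sequence through the model will result in indexing errors" failure mode entirely, at the cost of discarding (or truncating) the longest samples.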
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13934/22095 [23:58:33<13:40:39, 6.03s/it] {'loss': 0.4757, 'grad_norm': 0.2873055720141124, 'learning_rate': 3.1715406329977083e-06, 'epoch': 0.63} 63%|██████▎ | 13934/22095 [23:58:33<13:40:39, 6.03s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 63%|██████▎ | 13935/22095 [23:58:36<11:59:11, 5.29s/it] {'loss': 0.2686, 'grad_norm': 0.6291234455503854, 'learning_rate': 3.1708584966859745e-06, 'epoch': 0.63} 63%|██████▎ | 13935/22095 [23:58:36<11:59:11, 5.29s/it] 63%|██████▎ | 13936/22095 [23:58:44<13:36:31, 6.00s/it] {'loss': 0.4715, 'grad_norm': 0.2676697398113157, 'learning_rate': 3.1701763996768744e-06, 'epoch': 0.63} 63%|██████▎ | 13936/22095 [23:58:44<13:36:31, 6.00s/it] 63%|██████▎ | 13937/22095 [23:58:53<16:00:20, 7.06s/it] {'loss': 0.4649, 'grad_norm': 0.2841129202382441, 'learning_rate': 3.1694943419850616e-06, 'epoch': 0.63} 63%|██████▎ | 13937/22095 [23:58:53<16:00:20, 7.06s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 63%|██████▎ | 13938/22095 [23:58:57<13:28:57, 5.95s/it] {'loss': 0.2485, 'grad_norm': 0.5972951884939902, 'learning_rate': 3.1688123236251967e-06, 'epoch': 0.63} 63%|██████▎ | 13938/22095 [23:58:57<13:28:57, 5.95s/it] 63%|██████▎ | 13939/22095 [23:59:00<11:50:25, 5.23s/it] {'loss': 0.3125, 'grad_norm': 0.6976197308832803, 'learning_rate': 3.1681303446119277e-06, 'epoch': 0.63} 63%|██████▎ | 13939/22095 [23:59:00<11:50:25, 5.23s/it] 63%|██████▎ | 13940/22095 [23:59:03<10:18:19, 4.55s/it] {'loss': 0.3349, 'grad_norm': 0.6712376562482888, 'learning_rate': 3.167448404959913e-06, 'epoch': 0.63} 63%|██████▎ | 13940/22095 [23:59:03<10:18:19, 4.55s/it] 63%|██████▎ | 13941/22095 [23:59:07<9:40:19, 4.27s/it] {'loss': 0.3106, 'grad_norm': 0.581681353721388, 'learning_rate': 3.166766504683802e-06, 'epoch': 0.63} 63%|██████▎ | 13941/22095 [23:59:07<9:40:19, 4.27s/it]Invalidate trace cache @ step 2: expected module 
1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8334377 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 987, 'image': 'vrdu_table_final_2/astro-ph.CO/c49b457c-749a-493c-afaa-41e88078b5f6.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 63%|██████▎ | 13942/22095 [23:59:16<13:18:26, 5.88s/it] {'loss': 0.494, 'grad_norm': 0.3447944047961648, 'learning_rate': 3.166084643798252e-06, 'epoch': 0.63} 63%|██████▎ | 13942/22095 [23:59:16<13:18:26, 5.88s/it] 63%|██████▎ | 13943/22095 [23:59:20<12:04:10, 5.33s/it] {'loss': 0.3649, 'grad_norm': 0.6397469634467333, 'learning_rate': 3.165402822317908e-06, 'epoch': 0.63} 63%|██████▎ | 13943/22095 [23:59:20<12:04:10, 5.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42939 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45483 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44660 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13944/22095 [23:59:23<10:29:02, 4.63s/it] {'loss': 0.2909, 'grad_norm': 1.0218459586067286, 'learning_rate': 3.1647210402574223e-06, 'epoch': 0.63} 63%|██████▎ | 13944/22095 [23:59:23<10:29:02, 4.63s/it] 63%|██████▎ | 13945/22095 [23:59:27<9:42:53, 4.29s/it] {'loss': 0.3024, 'grad_norm': 0.5746655024858726, 'learning_rate': 3.1640392976314472e-06, 'epoch': 0.63} 63%|██████▎ | 13945/22095 [23:59:27<9:42:53, 4.29s/it] 63%|██████▎ | 13946/22095 [23:59:31<9:37:49, 4.25s/it] {'loss': 0.2883, 'grad_norm': 0.6052059717797788, 'learning_rate': 3.1633575944546273e-06, 'epoch': 0.63} 63%|██████▎ | 13946/22095 [23:59:31<9:37:49, 4.25s/it] 63%|██████▎ | 13947/22095 [23:59:35<9:16:05, 4.09s/it] {'loss': 0.302, 'grad_norm': 0.6117422105639195, 'learning_rate': 3.162675930741611e-06, 'epoch': 0.63} 63%|██████▎ | 13947/22095 [23:59:35<9:16:05, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13948/22095 [23:59:44<12:37:24, 5.58s/it] {'loss': 0.457, 'grad_norm': 0.2782623106961924, 'learning_rate': 3.161994306507048e-06, 'epoch': 0.63} 63%|██████▎ | 13948/22095 [23:59:44<12:37:24, 5.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45772 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13949/22095 [23:59:47<11:08:19, 4.92s/it] {'loss': 0.309, 'grad_norm': 0.6013180195685133, 'learning_rate': 3.1613127217655814e-06, 'epoch': 0.63} 63%|██████▎ | 13949/22095 [23:59:47<11:08:19, 4.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53803 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47724 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13950/22095 [23:59:51<10:17:35, 4.55s/it] {'loss': 0.2948, 'grad_norm': 0.6245021539144215, 'learning_rate': 3.160631176531858e-06, 'epoch': 0.63} 63%|██████▎ | 13950/22095 [23:59:51<10:17:35, 4.55s/it] 63%|██████▎ | 13951/22095 [23:59:54<9:31:12, 4.21s/it] {'loss': 0.2789, 'grad_norm': 0.6155792584363897, 'learning_rate': 3.1599496708205212e-06, 'epoch': 0.63} 63%|██████▎ | 13951/22095 [23:59:54<9:31:12, 4.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [481, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8433304 in VC:s3://internvl-moe-sft-data/. Exception: Image size [481, 25, 100, 100] is too small. Minimum size is 28. 
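The ValueError above ("Minimum size is 28"), like the earlier `smart_resize` failure ("height:21 and width:135 must be larger than factor:28"), traces back to the Qwen2-VL image processor rejecting any image side smaller than its 28-pixel patch factor. A minimal pre-filter over the `image_wh` metadata could screen such records out before preprocessing; this is a sketch assuming every record stores [width, height] pairs as in the problematic-sample dumps in this log:

```python
# Minimal sketch: drop samples containing images below Qwen2-VL's
# 28-pixel patch factor, which smart_resize rejects with a ValueError.
PATCH_FACTOR = 28

def images_large_enough(sample, factor=PATCH_FACTOR):
    """True if every [width, height] pair meets the minimum side length."""
    return all(w >= factor and h >= factor
               for w, h in sample.get("image_wh", []))

# Sizes for ids 987 and 129913 are taken from the log; id 1 is a
# hypothetical well-sized sample for contrast.
samples = [
    {"id": 987,    "image_wh": [[14, 23]]},   # 14 < 28 -> dropped
    {"id": 129913, "image_wh": [[481, 25]]},  # 25 < 28 -> dropped
    {"id": 1,      "image_wh": [[381, 250]]}, # kept
]
kept = [s for s in samples if images_large_enough(s)]
# kept retains only the sample with id 1
```

Running such a filter once over the dataset index would also remove the retry overhead visible in the log ("[Try #0] Failed to fetch sample ..."), since undersized images fail deterministically every time they are sampled.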
Problematic sample: {'id': 129913, 'image': 'vrdu_texteq/astro-ph.CO/ed570682-7185-4534-a50e-60d287c198ef.png', 'image_wh': [[481, 25]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where $z_{{\\rm d}}$ is the redshift of the deflector.'}]} 63%|██████▎ | 13952/22095 [24:00:03<12:41:25, 5.61s/it] {'loss': 0.4629, 'grad_norm': 0.5400955697561185, 'learning_rate': 3.159268204646213e-06, 'epoch': 0.63} 63%|██████▎ | 13952/22095 [24:00:03<12:41:25, 5.61s/it] 63%|██████▎ | 13953/22095 [24:00:07<11:38:04, 5.14s/it] {'loss': 0.2852, 'grad_norm': 0.6008466199890503, 'learning_rate': 3.158586778023579e-06, 'epoch': 0.63} 63%|██████▎ | 13953/22095 [24:00:07<11:38:04, 5.14s/it] 63%|██████▎ | 13954/22095 [24:00:11<10:24:30, 4.60s/it] {'loss': 0.2784, 'grad_norm': 0.6247523900052501, 'learning_rate': 3.1579053909672597e-06, 'epoch': 0.63} 63%|██████▎ | 13954/22095 [24:00:11<10:24:30, 4.60s/it] 63%|██████▎ | 13955/22095 [24:00:14<9:50:11, 4.35s/it] {'loss': 0.3079, 'grad_norm': 0.624934184912694, 'learning_rate': 3.1572240434918975e-06, 'epoch': 0.63} 63%|██████▎ | 13955/22095 [24:00:14<9:50:11, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45507 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67526 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48880 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13956/22095 [24:00:18<9:34:43, 4.24s/it] {'loss': 0.3273, 'grad_norm': 0.6361115372253076, 'learning_rate': 3.156542735612128e-06, 'epoch': 0.63} 63%|██████▎ | 13956/22095 [24:00:18<9:34:43, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13957/22095 [24:00:28<13:24:04, 5.93s/it] {'loss': 0.4658, 'grad_norm': 0.2770654450453919, 'learning_rate': 3.1558614673425946e-06, 'epoch': 0.63} 63%|██████▎ | 13957/22095 [24:00:28<13:24:04, 5.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57399 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59670 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43527 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59177 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104806 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53854 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13958/22095 [24:00:38<15:58:51, 7.07s/it] {'loss': 0.4614, 'grad_norm': 0.26875663521487186, 'learning_rate': 3.1551802386979356e-06, 'epoch': 0.63} 63%|██████▎ | 13958/22095 [24:00:38<15:58:51, 7.07s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 63%|██████▎ | 13959/22095 [24:00:41<13:19:56, 5.90s/it] {'loss': 0.2884, 'grad_norm': 0.8911871688377059, 'learning_rate': 3.1544990496927864e-06, 'epoch': 0.63} 63%|██████▎ | 13959/22095 [24:00:41<13:19:56, 5.90s/it] 63%|██████▎ | 13960/22095 [24:00:45<11:40:07, 5.16s/it] {'loss': 0.3352, 'grad_norm': 0.6482525152061249, 'learning_rate': 3.1538179003417836e-06, 'epoch': 0.63} 63%|██████▎ | 13960/22095 [24:00:45<11:40:07, 5.16s/it] 63%|██████▎ | 13961/22095 [24:00:48<10:17:23, 4.55s/it] {'loss': 0.3274, 'grad_norm': 0.6406084975429104, 'learning_rate': 3.1531367906595665e-06, 'epoch': 0.63} 63%|██████▎ | 13961/22095 [24:00:48<10:17:23, 4.55s/it] 63%|██████▎ | 13962/22095 [24:00:51<9:38:28, 4.27s/it] {'loss': 0.3397, 'grad_norm': 0.6568044044364002, 'learning_rate': 3.1524557206607655e-06, 'epoch': 0.63} 63%|██████▎ | 13962/22095 [24:00:51<9:38:28, 4.27s/it] 63%|██████▎ | 13963/22095 [24:00:55<9:22:16, 4.15s/it] {'loss': 0.3109, 'grad_norm': 0.6302651681053412, 'learning_rate': 3.1517746903600173e-06, 'epoch': 0.63} 63%|██████▎ | 13963/22095 [24:00:55<9:22:16, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13964/22095 [24:01:05<12:59:30, 5.75s/it] {'loss': 0.4441, 'grad_norm': 0.27192785144332376, 'learning_rate': 3.1510936997719557e-06, 'epoch': 0.63} 63%|██████▎ | 13964/22095 [24:01:05<12:59:30, 5.75s/it] 63%|██████▎ | 13965/22095 [24:01:08<11:12:48, 4.97s/it] {'loss': 0.3169, 'grad_norm': 0.7875442893777145, 'learning_rate': 3.1504127489112105e-06, 'epoch': 0.63} 63%|██████▎ | 13965/22095 [24:01:08<11:12:48, 4.97s/it]Invalidate trace cache @ step 2: expected 
module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50733 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77932 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91948 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49811 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13966/22095 [24:01:17<14:14:46, 6.31s/it] {'loss': 0.4735, 'grad_norm': 0.2886598711250437, 'learning_rate': 3.149731837792414e-06, 'epoch': 0.63} 63%|██████▎ | 13966/22095 [24:01:17<14:14:46, 6.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13967/22095 [24:01:21<12:14:32, 5.42s/it] {'loss': 0.3278, 'grad_norm': 0.5776455071697102, 'learning_rate': 3.149050966430199e-06, 'epoch': 0.63} 63%|██████▎ | 13967/22095 [24:01:21<12:14:32, 5.42s/it] 63%|██████▎ | 13968/22095 [24:01:25<11:14:52, 4.98s/it] {'loss': 0.3332, 'grad_norm': 0.6388254430484523, 'learning_rate': 3.148370134839195e-06, 'epoch': 0.63} 63%|██████▎ | 13968/22095 [24:01:25<11:14:52, 4.98s/it] 63%|██████▎ | 13969/22095 [24:01:28<10:28:49, 4.64s/it] {'loss': 0.326, 'grad_norm': 0.6266131600997259, 'learning_rate': 3.1476893430340282e-06, 'epoch': 0.63} 63%|██████▎ | 13969/22095 [24:01:28<10:28:49, 4.64s/it] 63%|██████▎ | 13970/22095 [24:01:31<9:23:04, 4.16s/it] {'loss': 0.3019, 'grad_norm': 0.6778133379330554, 'learning_rate': 3.147008591029328e-06, 'epoch': 0.63} 63%|██████▎ | 13970/22095 [24:01:31<9:23:04, 4.16s/it] 63%|██████▎ | 13971/22095 
[24:01:35<8:39:26, 3.84s/it] {'loss': 0.3309, 'grad_norm': 0.5863638792175061, 'learning_rate': 3.1463278788397256e-06, 'epoch': 0.63} 63%|██████▎ | 13971/22095 [24:01:35<8:39:26, 3.84s/it] 63%|██████▎ | 13972/22095 [24:01:37<8:04:52, 3.58s/it] {'loss': 0.2714, 'grad_norm': 0.6172203148817229, 'learning_rate': 3.1456472064798403e-06, 'epoch': 0.63} 63%|██████▎ | 13972/22095 [24:01:38<8:04:52, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13973/22095 [24:01:45<10:43:43, 4.76s/it] {'loss': 0.472, 'grad_norm': 0.2877354389046555, 'learning_rate': 3.144966573964302e-06, 'epoch': 0.63} 63%|██████▎ | 13973/22095 [24:01:45<10:43:43, 4.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13974/22095 [24:01:54<13:54:25, 6.16s/it] {'loss': 0.4476, 'grad_norm': 0.26938771291979924, 'learning_rate': 3.1442859813077364e-06, 'epoch': 0.63} 63%|██████▎ | 13974/22095 [24:01:54<13:54:25, 6.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 63%|██████▎ | 13975/22095 [24:01:58<11:55:32, 5.29s/it] {'loss': 0.3145, 'grad_norm': 0.6809283680606592, 'learning_rate': 3.1436054285247645e-06, 'epoch': 0.63} 63%|██████▎ | 13975/22095 [24:01:58<11:55:32, 5.29s/it] 63%|██████▎ | 13976/22095 [24:02:01<10:50:49, 4.81s/it] {'loss': 0.3005, 'grad_norm': 0.6095048093758612, 'learning_rate': 3.1429249156300094e-06, 'epoch': 0.63} 63%|██████▎ | 13976/22095 [24:02:01<10:50:49, 4.81s/it] 63%|██████▎ | 13977/22095 [24:02:05<10:06:10, 4.48s/it] {'loss': 0.3093, 'grad_norm': 0.5992126505470327, 'learning_rate': 3.1422444426380964e-06, 'epoch': 0.63} 63%|██████▎ | 13977/22095 [24:02:05<10:06:10, 4.48s/it] 63%|██████▎ | 13978/22095 [24:02:08<9:14:10, 4.10s/it] {'loss': 0.2985, 'grad_norm': 0.5853842538818939, 'learning_rate': 
3.1415640095636436e-06, 'epoch': 0.63} 63%|██████▎ | 13978/22095 [24:02:08<9:14:10, 4.10s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8310573 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1C9TcpC3PL1JjSZFxXXcBBVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content hidden in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n老鼠不跑光\n包退\n高效\n声光结合\n模式可调\n自动关闭\n主机防水\n多项专利\n低耗电型\n缤纷日子\n汽车专用\nW218\n12强光灯\n驱属之神'}]} 63%|██████▎ | 13979/22095 [24:02:11<8:36:22, 3.82s/it] {'loss': 0.3389, 'grad_norm': 0.5870067931295567, 'learning_rate': 3.1408836164212724e-06, 'epoch': 0.63} 63%|██████▎ | 13979/22095 [24:02:11<8:36:22, 3.82s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13980/22095 [24:02:15<8:07:33, 3.60s/it] {'loss': 0.2815, 'grad_norm': 0.588258900006323, 'learning_rate': 3.140203263225604e-06, 'epoch': 0.63} 63%|██████▎ | 13980/22095 [24:02:15<8:07:33, 3.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 
1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8403478 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5651, 'image': 'vrdu_table_final_2/astro-ph.CO/4da3c934-8c0a-4727-912b-2703e92489cb.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 63%|██████▎ | 13981/22095 [24:02:18<8:14:44, 3.66s/it] {'loss': 0.2697, 'grad_norm': 0.644122902292735, 'learning_rate': 3.139522949991253e-06, 'epoch': 0.63} 63%|██████▎ | 13981/22095 [24:02:18<8:14:44, 3.66s/it] 63%|██████▎ | 13982/22095 [24:02:21<7:44:25, 3.43s/it] {'loss': 0.26, 'grad_norm': 0.7828996185510263, 'learning_rate': 3.1388426767328408e-06, 'epoch': 0.63} 63%|██████▎ | 13982/22095 [24:02:21<7:44:25, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 13983/22095 [24:02:24<7:31:45, 3.34s/it] {'loss': 0.3584, 'grad_norm': 0.6284030766321109, 'learning_rate': 3.138162443464983e-06, 'epoch': 0.63} 63%|██████▎ | 13983/22095 [24:02:24<7:31:45, 3.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [95, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8370122 in VC:s3://internvl-moe-sft-data/. Exception: Image size [95, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 36874, 'image': 'vrdu_table_final_2/astro-ph.CO/de4d86a1-e442-4321-a92d-ebd45694421e.png', 'image_wh': [[95, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{c}Mean $\\alpha$\\end{tabular}\n```'}]} /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 63%|██████▎ | 13984/22095 [24:02:28<7:41:45, 3.42s/it] {'loss': 0.3107, 'grad_norm': 0.6178985167885237, 'learning_rate': 3.137482250202298e-06, 'epoch': 0.63} 63%|██████▎ | 13984/22095 [24:02:28<7:41:45, 3.42s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38124.png 2025-08-28 16:00:24.943122 load time: 1068.07 ms Token indices sequence length is longer than the specified maximum sequence length for this model (47451 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53570 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60182 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 13985/22095 [24:02:32<7:53:29, 3.50s/it] {'loss': 0.3128, 'grad_norm': 0.6510107780176263, 'learning_rate': 3.1368020969593967e-06, 'epoch': 0.63} 63%|██████▎ | 13985/22095 [24:02:32<7:53:29, 3.50s/it] 63%|██████▎ | 13986/22095 [24:02:35<7:34:28, 3.36s/it] {'loss': 0.3402, 'grad_norm': 0.7311425717620409, 'learning_rate': 3.136121983750897e-06, 'epoch': 0.63} 63%|██████▎ | 13986/22095 [24:02:35<7:34:28, 3.36s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8343906 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10557, 'image': 'vrdu_table_final_2/astro-ph.CO/e2a53cc9-7e7b-44ae-9ef9-b43c75618f97.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]} 63%|██████▎ | 13987/22095 [24:02:38<7:35:57, 3.37s/it] {'loss': 0.3011, 'grad_norm': 0.6064276109335814, 'learning_rate': 3.1354419105914127e-06, 'epoch': 0.63} 63%|██████▎ | 13987/22095 [24:02:38<7:35:57, 3.37s/it] 63%|██████▎ | 13988/22095 [24:02:41<7:19:50, 3.26s/it] {'loss': 0.3333, 'grad_norm': 0.6193044912933445, 'learning_rate': 3.1347618774955534e-06, 'epoch': 0.63} 63%|██████▎ | 13988/22095 [24:02:41<7:19:50, 3.26s/it] 63%|██████▎ | 13989/22095 [24:02:44<7:03:17, 3.13s/it] {'loss': 0.304, 'grad_norm': 0.6514901654955743, 'learning_rate': 3.134081884477932e-06, 'epoch': 0.63} 63%|██████▎ | 13989/22095 [24:02:44<7:03:17, 3.13s/it] 63%|██████▎ | 13990/22095 [24:02:47<7:09:10, 3.18s/it] {'loss': 0.3437, 'grad_norm': 0.7694051174167692, 'learning_rate': 3.133401931553163e-06, 'epoch': 0.63} 63%|██████▎ | 13990/22095 [24:02:47<7:09:10, 3.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49780 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13991/22095 [24:02:50<6:59:37, 3.11s/it] {'loss': 0.2673, 'grad_norm': 0.6834306358109133, 'learning_rate': 3.1327220187358515e-06, 'epoch': 0.63} 63%|██████▎ | 13991/22095 [24:02:50<6:59:37, 3.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58625 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56973 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44360 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67734 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13992/22095 [24:02:53<7:03:19, 3.13s/it] {'loss': 0.3151, 'grad_norm': 0.6285507208508668, 'learning_rate': 3.1320421460406093e-06, 'epoch': 0.63} 63%|██████▎ | 13992/22095 [24:02:53<7:03:19, 3.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13993/22095 [24:03:00<9:44:36, 4.33s/it] {'loss': 0.4871, 'grad_norm': 0.29874642143798624, 'learning_rate': 3.1313623134820454e-06, 'epoch': 0.63} 63%|██████▎ | 13993/22095 [24:03:00<9:44:36, 4.33s/it] 63%|██████▎ | 13994/22095 [24:03:04<9:10:46, 4.08s/it] {'loss': 0.3333, 'grad_norm': 0.6455014552179699, 'learning_rate': 3.1306825210747654e-06, 'epoch': 0.63} 63%|██████▎ | 13994/22095 [24:03:04<9:10:46, 4.08s/it] 63%|██████▎ | 13995/22095 [24:03:08<9:00:29, 4.00s/it] {'loss': 0.3577, 'grad_norm': 0.6898874640279824, 'learning_rate': 3.130002768833376e-06, 'epoch': 0.63} 63%|██████▎ | 13995/22095 [24:03:08<9:00:29, 4.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75562 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44025 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48421 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 13996/22095 [24:03:11<8:20:47, 3.71s/it] {'loss': 0.3018, 'grad_norm': 0.6025726694594395, 'learning_rate': 3.1293230567724843e-06, 'epoch': 0.63} 63%|██████▎ | 13996/22095 [24:03:11<8:20:47, 3.71s/it] 63%|██████▎ | 13997/22095 [24:03:14<7:50:28, 3.49s/it] {'loss': 0.3003, 'grad_norm': 0.6278449707753837, 'learning_rate': 3.1286433849066965e-06, 'epoch': 0.63} 63%|██████▎ | 13997/22095 [24:03:14<7:50:28, 3.49s/it] 63%|██████▎ | 13998/22095 [24:03:18<8:14:58, 3.67s/it] {'loss': 0.2732, 'grad_norm': 0.6014718230979809, 'learning_rate': 3.1279637532506134e-06, 'epoch': 0.63} 63%|██████▎ | 13998/22095 [24:03:18<8:14:58, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 13999/22095 [24:03:24<10:05:17, 4.49s/it] {'loss': 0.478, 'grad_norm': 0.2878032134378391, 'learning_rate': 3.1272841618188388e-06, 'epoch': 0.63} 63%|██████▎ | 13999/22095 [24:03:24<10:05:17, 4.49s/it] 63%|██████▎ | 14000/22095 [24:03:34<13:24:52, 5.97s/it] {'loss': 0.4708, 'grad_norm': 0.38175351117616113, 'learning_rate': 3.1266046106259784e-06, 'epoch': 0.63} 63%|██████▎ | 14000/22095 [24:03:34<13:24:52, 5.97s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( 63%|██████▎ | 14001/22095 [24:04:29<46:31:02, 20.69s/it] {'loss': 0.3235, 'grad_norm': 0.6449141988543913, 'learning_rate': 3.1259250996866296e-06, 'epoch': 0.63} 63%|██████▎ | 14001/22095 [24:04:29<46:31:02, 20.69s/it] 63%|██████▎ | 14002/22095 [24:04:34<35:55:03, 15.98s/it] {'loss': 0.2579, 'grad_norm': 0.7625685410368898, 'learning_rate': 3.1252456290153952e-06, 'epoch': 0.63} 63%|██████▎ | 14002/22095 [24:04:34<35:55:03, 15.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [259, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8475471 in VC:s3://internvl-moe-sft-data/. Exception: Image size [259, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 118213, 'image': 'vrdu_texteq/astro-ph.CO/4dddf8d6-892a-4b9c-a429-9127a1581320.png', 'image_wh': [[259, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $n=2$ or 4 and'}]} 63%|██████▎ | 14003/22095 [24:04:38<27:45:12, 12.35s/it] {'loss': 0.2912, 'grad_norm': 0.6300075430673482, 'learning_rate': 3.124566198626875e-06, 'epoch': 0.63} 63%|██████▎ | 14003/22095 [24:04:38<27:45:12, 12.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 14004/22095 [24:04:44<23:42:12, 10.55s/it] {'loss': 0.4758, 'grad_norm': 0.26417259947224064, 'learning_rate': 3.1238868085356656e-06, 'epoch': 0.63} 63%|██████▎ | 14004/22095 [24:04:44<23:42:12, 10.55s/it] 63%|██████▎ | 14005/22095 [24:04:47<18:47:31, 8.36s/it] {'loss': 0.3039, 'grad_norm': 0.5877373818985486, 'learning_rate': 3.1232074587563667e-06, 'epoch': 0.63} 63%|██████▎ | 14005/22095 [24:04:47<18:47:31, 8.36s/it] 63%|██████▎ | 14006/22095 [24:04:51<15:57:43, 7.10s/it] {'loss': 0.3232, 'grad_norm': 0.6570464396043647, 'learning_rate': 3.1225281493035776e-06, 'epoch': 0.63} 63%|██████▎ | 14006/22095 [24:04:51<15:57:43, 7.10s/it] 63%|██████▎ | 14007/22095 [24:04:55<13:48:50, 6.15s/it] {'loss': 0.2726, 'grad_norm': 0.6492915172805375, 'learning_rate': 3.12184888019189e-06, 'epoch': 0.63} 63%|██████▎ | 14007/22095 [24:04:55<13:48:50, 6.15s/it] 63%|██████▎ | 14008/22095 [24:04:59<12:29:40, 5.56s/it] {'loss': 0.319, 'grad_norm': 0.6045436003490696, 'learning_rate': 3.121169651435903e-06, 'epoch': 0.63} 63%|██████▎ | 14008/22095 [24:05:00<12:29:40, 5.56s/it] 63%|██████▎ | 14009/22095 [24:05:04<11:31:06, 5.13s/it] {'loss': 0.3086, 'grad_norm': 0.6003389466902463, 'learning_rate': 3.12049046305021e-06, 'epoch': 0.63} 63%|██████▎ | 14009/22095 [24:05:04<11:31:06, 5.13s/it] 63%|██████▎ | 14010/22095 [24:05:08<10:56:01, 4.87s/it] {'loss': 0.3019, 'grad_norm': 
0.6248635843347558, 'learning_rate': 3.1198113150494026e-06, 'epoch': 0.63} 63%|██████▎ | 14010/22095 [24:05:08<10:56:01, 4.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 14011/22095 [24:05:11<9:40:30, 4.31s/it] {'loss': 0.278, 'grad_norm': 0.6040295162437355, 'learning_rate': 3.1191322074480766e-06, 'epoch': 0.63} 63%|██████▎ | 14011/22095 [24:05:11<9:40:30, 4.31s/it] 63%|██████▎ | 14012/22095 [24:05:15<9:36:21, 4.28s/it] {'loss': 0.2921, 'grad_norm': 0.5699431480642985, 'learning_rate': 3.118453140260823e-06, 'epoch': 0.63} 63%|██████▎ | 14012/22095 [24:05:15<9:36:21, 4.28s/it] 63%|██████▎ | 14013/22095 [24:05:19<9:02:40, 4.03s/it] {'loss': 0.3305, 'grad_norm': 0.6460627176672942, 'learning_rate': 3.1177741135022334e-06, 'epoch': 0.63} 63%|██████▎ | 14013/22095 [24:05:19<9:02:40, 4.03s/it] 63%|██████▎ | 14014/22095 [24:05:22<8:55:37, 3.98s/it] {'loss': 0.318, 'grad_norm': 0.6028203619114058, 'learning_rate': 3.1170951271868953e-06, 'epoch': 0.63} 63%|██████▎ | 14014/22095 [24:05:22<8:55:37, 3.98s/it] 63%|██████▎ | 14015/22095 [24:05:27<9:07:33, 4.07s/it] {'loss': 0.3098, 'grad_norm': 0.6705489517107884, 'learning_rate': 3.1164161813294014e-06, 'epoch': 0.63} 63%|██████▎ | 14015/22095 [24:05:27<9:07:33, 4.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 14016/22095 [24:05:31<9:08:58, 4.08s/it] {'loss': 0.2868, 'grad_norm': 0.6932377918562581, 'learning_rate': 3.1157372759443396e-06, 'epoch': 0.63} 63%|██████▎ | 14016/22095 [24:05:31<9:08:58, 4.08s/it] 63%|██████▎ | 14017/22095 [24:05:34<8:48:36, 3.93s/it] {'loss': 0.2307, 'grad_norm': 0.5981430645831732, 'learning_rate': 3.1150584110462955e-06, 'epoch': 0.63} 63%|██████▎ | 14017/22095 [24:05:34<8:48:36, 3.93s/it] 63%|██████▎ | 14018/22095 [24:05:38<8:44:23, 3.90s/it] {'loss': 0.3063, 'grad_norm': 0.6158121063213936, 'learning_rate': 
3.114379586649856e-06, 'epoch': 0.63} 63%|██████▎ | 14018/22095 [24:05:38<8:44:23, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42633 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60231 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73625 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 14019/22095 [24:05:42<8:39:18, 3.86s/it] {'loss': 0.3035, 'grad_norm': 0.6415675114557254, 'learning_rate': 3.1137008027696113e-06, 'epoch': 0.63} 63%|██████▎ | 14019/22095 [24:05:42<8:39:18, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (136294 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 14020/22095 [24:05:45<8:06:43, 3.62s/it] {'loss': 0.2814, 'grad_norm': 0.6342444158352719, 'learning_rate': 3.1130220594201395e-06, 'epoch': 0.63} 63%|██████▎ | 14020/22095 [24:05:45<8:06:43, 3.62s/it] 63%|██████▎ | 14021/22095 [24:05:48<7:39:07, 3.41s/it] {'loss': 0.368, 'grad_norm': 0.6398484780157805, 'learning_rate': 3.1123433566160293e-06, 'epoch': 0.63} 63%|██████▎ | 14021/22095 [24:05:48<7:39:07, 3.41s/it] 63%|██████▎ | 14022/22095 [24:05:52<7:53:46, 3.52s/it] {'loss': 0.351, 'grad_norm': 0.6312930733263915, 'learning_rate': 3.1116646943718642e-06, 'epoch': 0.63} 63%|██████▎ | 14022/22095 [24:05:52<7:53:46, 3.52s/it] 63%|██████▎ | 14023/22095 [24:05:55<7:37:23, 3.40s/it] {'loss': 0.2892, 'grad_norm': 0.606712973490306, 'learning_rate': 3.110986072702224e-06, 'epoch': 0.63} 63%|██████▎ | 14023/22095 [24:05:55<7:37:23, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for 
this model (69488 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114271 > 40960). Running this sequence through the model will result in indexing errors 63%|██████▎ | 14024/22095 [24:05:58<7:24:20, 3.30s/it] {'loss': 0.3021, 'grad_norm': 0.6732129537847016, 'learning_rate': 3.1103074916216903e-06, 'epoch': 0.63} 63%|██████▎ | 14024/22095 [24:05:58<7:24:20, 3.30s/it] 63%|██████▎ | 14025/22095 [24:06:02<7:41:11, 3.43s/it] {'loss': 0.32, 'grad_norm': 0.5789095832572656, 'learning_rate': 3.1096289511448464e-06, 'epoch': 0.63} 63%|██████▎ | 14025/22095 [24:06:02<7:41:11, 3.43s/it] 63%|██████▎ | 14026/22095 [24:06:05<7:37:36, 3.40s/it] {'loss': 0.329, 'grad_norm': 0.5983393100348733, 'learning_rate': 3.108950451286271e-06, 'epoch': 0.63} 63%|██████▎ | 14026/22095 [24:06:05<7:37:36, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 14027/22095 [24:06:14<11:40:47, 5.21s/it] {'loss': 0.4729, 'grad_norm': 0.3569180718168922, 'learning_rate': 3.1082719920605413e-06, 'epoch': 0.63} 63%|██████▎ | 14027/22095 [24:06:14<11:40:47, 5.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50641 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77703 > 40960). 
Running this sequence through the model will result in indexing errors 63%|██████▎ | 14028/22095 [24:06:18<10:19:18, 4.61s/it] {'loss': 0.2957, 'grad_norm': 0.6096092009852317, 'learning_rate': 3.107593573482236e-06, 'epoch': 0.63} 63%|██████▎ | 14028/22095 [24:06:18<10:19:18, 4.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 63%|██████▎ | 14029/22095 [24:06:21<9:23:40, 4.19s/it] {'loss': 0.3194, 'grad_norm': 0.5723108604017576, 'learning_rate': 3.106915195565935e-06, 'epoch': 0.63} 63%|██████▎ | 14029/22095 [24:06:21<9:23:40, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 63%|██████▎ | 14030/22095 [24:06:31<13:10:24, 5.88s/it] {'loss': 0.4703, 'grad_norm': 0.29577567478017497, 'learning_rate': 3.1062368583262103e-06, 'epoch': 0.63} 63%|██████▎ | 14030/22095 [24:06:31<13:10:24, 5.88s/it] 64%|██████▎ | 14031/22095 [24:06:34<11:30:44, 5.14s/it] {'loss': 0.3194, 'grad_norm': 0.6064904924551691, 'learning_rate': 3.1055585617776397e-06, 'epoch': 0.64} 64%|██████▎ | 14031/22095 [24:06:34<11:30:44, 5.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (51570 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (134711 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▎ | 14032/22095 [24:06:43<14:00:38, 6.26s/it] {'loss': 0.4598, 'grad_norm': 0.2820866429678342, 'learning_rate': 3.104880305934799e-06, 'epoch': 0.64} 64%|██████▎ | 14032/22095 [24:06:43<14:00:38, 6.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14033/22095 [24:06:46<12:08:56, 5.43s/it] {'loss': 0.2889, 'grad_norm': 0.7375791329775107, 'learning_rate': 3.104202090812257e-06, 'epoch': 0.64} 64%|██████▎ | 14033/22095 [24:06:46<12:08:56, 5.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45194 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▎ | 14034/22095 [24:06:50<10:35:56, 4.73s/it] {'loss': 0.2739, 'grad_norm': 0.6573603364676751, 'learning_rate': 3.1035239164245913e-06, 'epoch': 0.64} 64%|██████▎ | 14034/22095 [24:06:50<10:35:56, 4.73s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8949356 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 191, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 
5cm'}, {'from': 'gpt', 'value': '【解答】解:∵点C是线段AB的中点,AB=20cm,∴BC=\\frac{1}{2}AB=\\frac{1}{2}×20cm=10cm,∵点D是线段BC的中点,∴BD=\\frac{1}{2}BC=\\frac{1}{2}×10cm=5cm,∴AD=AB-BD=20cm-5cm=15cm.'}]} 64%|██████▎ | 14035/22095 [24:06:53<9:38:21, 4.31s/it] {'loss': 0.2995, 'grad_norm': 0.6032982927863696, 'learning_rate': 3.1028457827863723e-06, 'epoch': 0.64} 64%|██████▎ | 14035/22095 [24:06:53<9:38:21, 4.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14036/22095 [24:07:03<13:17:42, 5.94s/it] {'loss': 0.4676, 'grad_norm': 0.28255894706465395, 'learning_rate': 3.1021676899121703e-06, 'epoch': 0.64} 64%|██████▎ | 14036/22095 [24:07:03<13:17:42, 5.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44193 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88242 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56514 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▎ | 14037/22095 [24:07:06<11:33:45, 5.17s/it] {'loss': 0.2935, 'grad_norm': 0.6202280264903082, 'learning_rate': 3.101489637816555e-06, 'epoch': 0.64} 64%|██████▎ | 14037/22095 [24:07:06<11:33:45, 5.17s/it] 64%|██████▎ | 14038/22095 [24:07:10<10:50:01, 4.84s/it] {'loss': 0.2959, 'grad_norm': 1.0338433813847048, 'learning_rate': 3.1008116265140974e-06, 'epoch': 0.64} 64%|██████▎ | 14038/22095 [24:07:10<10:50:01, 4.84s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14039/22095 [24:07:13<9:26:05, 4.22s/it] {'loss': 0.2817, 'grad_norm': 0.6175401370396773, 'learning_rate': 3.100133656019366e-06, 'epoch': 0.64} 64%|██████▎ | 14039/22095 [24:07:13<9:26:05, 4.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14040/22095 [24:07:20<11:15:10, 5.03s/it] {'loss': 0.4735, 'grad_norm': 0.28522691456631716, 'learning_rate': 3.0994557263469267e-06, 'epoch': 0.64} 64%|██████▎ | 14040/22095 [24:07:20<11:15:10, 5.03s/it] 64%|██████▎ | 14041/22095 [24:07:23<10:16:11, 4.59s/it] {'loss': 0.286, 'grad_norm': 0.5962872012390195, 'learning_rate': 3.0987778375113464e-06, 'epoch': 0.64} 64%|██████▎ | 14041/22095 [24:07:23<10:16:11, 4.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14042/22095 [24:07:31<12:25:04, 5.55s/it] {'loss': 0.4837, 'grad_norm': 0.28706614162773536, 'learning_rate': 3.0980999895271923e-06, 'epoch': 0.64} 64%|██████▎ | 14042/22095 [24:07:31<12:25:04, 5.55s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14043/22095 [24:07:34<10:50:06, 4.84s/it] {'loss': 0.3296, 'grad_norm': 0.6489585933566904, 'learning_rate': 
3.0974221824090263e-06, 'epoch': 0.64} 64%|██████▎ | 14043/22095 [24:07:34<10:50:06, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64205 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44319 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55290 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56408 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▎ | 14044/22095 [24:07:38<10:05:24, 4.51s/it] {'loss': 0.31, 'grad_norm': 0.5938771744151462, 'learning_rate': 3.096744416171415e-06, 'epoch': 0.64} 64%|██████▎ | 14044/22095 [24:07:38<10:05:24, 4.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14045/22095 [24:07:45<11:37:42, 5.20s/it] {'loss': 0.4782, 'grad_norm': 0.295895117425673, 'learning_rate': 3.0960666908289217e-06, 'epoch': 0.64} 64%|██████▎ | 14045/22095 [24:07:45<11:37:42, 5.20s/it] 64%|██████▎ | 14046/22095 [24:07:49<10:39:19, 4.77s/it] {'loss': 0.3508, 'grad_norm': 0.7077041445755813, 'learning_rate': 3.095389006396107e-06, 'epoch': 0.64} 64%|██████▎ | 14046/22095 [24:07:49<10:39:19, 4.77s/it] 64%|██████▎ | 14047/22095 [24:07:52<9:48:25, 4.39s/it] {'loss': 0.3049, 'grad_norm': 0.6472413942804193, 'learning_rate': 3.0947113628875327e-06, 'epoch': 0.64} 64%|██████▎ | 14047/22095 [24:07:52<9:48:25, 4.39s/it] 64%|██████▎ | 14048/22095 [24:07:56<9:40:31, 4.33s/it] {'loss': 0.2803, 'grad_norm': 0.6015905003099128, 'learning_rate': 3.094033760317761e-06, 'epoch': 0.64} 64%|██████▎ | 14048/22095 [24:07:56<9:40:31, 4.33s/it]Rank 0: 
Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14049/22095 [24:08:00<9:00:37, 4.03s/it] {'loss': 0.3106, 'grad_norm': 0.6162874944280102, 'learning_rate': 3.0933561987013484e-06, 'epoch': 0.64} 64%|██████▎ | 14049/22095 [24:08:00<9:00:37, 4.03s/it] 64%|██████▎ | 14050/22095 [24:08:03<8:32:02, 3.82s/it] {'loss': 0.3219, 'grad_norm': 0.5711546807757878, 'learning_rate': 3.092678678052855e-06, 'epoch': 0.64} 64%|██████▎ | 14050/22095 [24:08:03<8:32:02, 3.82s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8481437 in VC:s3://internvl-moe-sft-data/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 159587, 'image': 'vrdu_texteq/astro-ph.CO/132d0ea0-7b61-4e82-98ae-46a3e9bfee1c.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'and for $z \\sim10$ are'}]} 64%|██████▎ | 14051/22095 [24:08:06<8:03:39, 3.61s/it] {'loss': 0.3423, 'grad_norm': 0.6153270001605267, 'learning_rate': 3.0920011983868413e-06, 'epoch': 0.64} 64%|██████▎ | 14051/22095 [24:08:06<8:03:39, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14052/22095 [24:08:16<12:20:01, 5.52s/it] {'loss': 0.4676, 'grad_norm': 0.29725827368371055, 'learning_rate': 3.0913237597178603e-06, 'epoch': 0.64} 64%|██████▎ | 14052/22095 [24:08:16<12:20:01, 5.52s/it] 64%|██████▎ | 14053/22095 [24:08:19<10:55:39, 4.89s/it] {'loss': 0.3056, 'grad_norm': 0.5803809341027548, 'learning_rate': 3.0906463620604688e-06, 'epoch': 0.64} 64%|██████▎ | 14053/22095 [24:08:19<10:55:39, 4.89s/it] 64%|██████▎ | 14054/22095 [24:08:24<10:44:02, 4.81s/it] {'loss': 0.2788, 'grad_norm': 0.5935044897307553, 'learning_rate': 3.089969005429223e-06, 'epoch': 0.64} 64%|██████▎ | 14054/22095 [24:08:24<10:44:02, 4.81s/it] 64%|██████▎ | 14055/22095 [24:08:28<10:25:06, 4.67s/it] {'loss': 0.3317, 'grad_norm': 0.7871930290907165, 'learning_rate': 3.089291689838679e-06, 'epoch': 0.64} 64%|██████▎ | 14055/22095 [24:08:28<10:25:06, 4.67s/it] 64%|██████▎ | 14056/22095 [24:08:32<9:31:30, 4.27s/it] {'loss': 0.2705, 'grad_norm': 0.5927085282311848, 'learning_rate': 3.088614415303387e-06, 'epoch': 0.64} 64%|██████▎ | 14056/22095 [24:08:32<9:31:30, 4.27s/it] 64%|██████▎ | 14057/22095 [24:08:35<8:58:44, 4.02s/it] {'loss': 0.2861, 'grad_norm': 0.5601225069918039, 'learning_rate': 3.0879371818379e-06, 'epoch': 0.64} 64%|██████▎ | 14057/22095 [24:08:35<8:58:44, 4.02s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304703 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1dDJ8iJfJ8KJjy0FeXXXKEXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否请您帮我解析图片中的所有文字内容?'}, {'from': 'gpt', 'value': '图中所有文字:\nESOG满99包邮\n华芯创科技有限公司\n主营:ADIANALOGDEVICESANALOGDEVICESLOGLCES原装正品假一赔\nAN\nDE\n十\nANALOG\nDEVICES\nANALOG'}]} 64%|██████▎ | 14058/22095 [24:08:38<8:07:27, 3.64s/it] {'loss': 0.2878, 'grad_norm': 0.6310018963259992, 'learning_rate': 3.0872599894567723e-06, 'epoch': 0.64} 64%|██████▎ | 14058/22095 [24:08:38<8:07:27, 3.64s/it] 64%|██████▎ | 14059/22095 [24:08:41<8:04:02, 3.61s/it] {'loss': 0.3544, 'grad_norm': 0.6657166888741248, 'learning_rate': 3.0865828381745515e-06, 'epoch': 0.64} 64%|██████▎ | 14059/22095 [24:08:41<8:04:02, 3.61s/it] 64%|██████▎ | 14060/22095 [24:08:45<8:08:35, 3.65s/it] {'loss': 0.3219, 'grad_norm': 0.6768852705295367, 'learning_rate': 3.08590572800579e-06, 'epoch': 0.64} 64%|██████▎ | 14060/22095 [24:08:45<8:08:35, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14061/22095 [24:08:54<11:55:11, 5.34s/it] {'loss': 0.4713, 'grad_norm': 0.30506889420259875, 'learning_rate': 3.085228658965036e-06, 'epoch': 0.64} 64%|██████▎ | 14061/22095 [24:08:54<11:55:11, 5.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ |
14062/22095 [24:08:59<11:08:09, 4.99s/it] {'loss': 0.3135, 'grad_norm': 0.5846094679846036, 'learning_rate': 3.0845516310668348e-06, 'epoch': 0.64} 64%|██████▎ | 14062/22095 [24:08:59<11:08:09, 4.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69629 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▎ | 14063/22095 [24:09:03<10:39:29, 4.78s/it] {'loss': 0.3101, 'grad_norm': 0.5972984714076979, 'learning_rate': 3.0838746443257385e-06, 'epoch': 0.64} 64%|██████▎ | 14063/22095 [24:09:03<10:39:29, 4.78s/it] 64%|██████▎ | 14064/22095 [24:09:06<9:23:19, 4.21s/it] {'loss': 0.3254, 'grad_norm': 0.6280799877614217, 'learning_rate': 3.0831976987562906e-06, 'epoch': 0.64} 64%|██████▎ | 14064/22095 [24:09:06<9:23:19, 4.21s/it] 64%|██████▎ | 14065/22095 [24:09:10<9:06:32, 4.08s/it] {'loss': 0.2823, 'grad_norm': 0.7375942101160714, 'learning_rate': 3.0825207943730375e-06, 'epoch': 0.64} 64%|██████▎ | 14065/22095 [24:09:10<9:06:32, 4.08s/it] 64%|██████▎ | 14066/22095 [24:09:13<8:56:56, 4.01s/it] {'loss': 0.2988, 'grad_norm': 0.6223337925347012, 'learning_rate': 3.081843931190522e-06, 'epoch': 0.64} 64%|██████▎ | 14066/22095 [24:09:13<8:56:56, 4.01s/it] 64%|██████▎ | 14067/22095 [24:09:17<8:45:37, 3.93s/it] {'loss': 0.2809, 'grad_norm': 0.5629945588989438, 'learning_rate': 3.0811671092232896e-06, 'epoch': 0.64} 64%|██████▎ | 14067/22095 [24:09:17<8:45:37, 3.93s/it] 64%|██████▎ | 14068/22095 [24:09:20<8:18:54, 3.73s/it] {'loss': 0.3316, 'grad_norm': 0.5995456341261228, 'learning_rate': 3.0804903284858844e-06, 'epoch': 0.64} 64%|██████▎ | 14068/22095 [24:09:20<8:18:54, 3.73s/it] 64%|██████▎ | 14069/22095 [24:09:23<7:47:11, 3.49s/it] {'loss': 0.3203, 'grad_norm': 0.6028071031391948, 'learning_rate': 3.079813588992846e-06, 'epoch': 0.64} 64%|██████▎ | 14069/22095 [24:09:23<7:47:11, 3.49s/it] 64%|██████▎ | 14070/22095 [24:09:26<7:24:53, 3.33s/it] {'loss': 0.3213, 'grad_norm': 
0.6773538683823238, 'learning_rate': 3.079136890758715e-06, 'epoch': 0.64} 64%|██████▎ | 14070/22095 [24:09:26<7:24:53, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▎ | 14071/22095 [24:09:35<11:06:45, 4.99s/it] {'loss': 0.4667, 'grad_norm': 0.30538410810758704, 'learning_rate': 3.078460233798036e-06, 'epoch': 0.64} 64%|██████▎ | 14071/22095 [24:09:35<11:06:45, 4.99s/it] 64%|██████▎ | 14072/22095 [24:09:39<10:16:20, 4.61s/it] {'loss': 0.306, 'grad_norm': 0.5734688367277772, 'learning_rate': 3.077783618125341e-06, 'epoch': 0.64} 64%|██████▎ | 14072/22095 [24:09:39<10:16:20, 4.61s/it] 64%|██████▎ | 14073/22095 [24:09:42<9:22:00, 4.20s/it] {'loss': 0.2836, 'grad_norm': 0.647864019317746, 'learning_rate': 3.0771070437551743e-06, 'epoch': 0.64} 64%|██████▎ | 14073/22095 [24:09:42<9:22:00, 4.20s/it] 64%|██████▎ | 14074/22095 [24:09:46<9:01:18, 4.05s/it] {'loss': 0.3498, 'grad_norm': 0.6234597688592606, 'learning_rate': 3.076430510702072e-06, 'epoch': 0.64} 64%|██████▎ | 14074/22095 [24:09:46<9:01:18, 4.05s/it] 64%|██████▎ | 14075/22095 [24:09:49<8:19:45, 3.74s/it] {'loss': 0.2771, 'grad_norm': 0.5875273985255427, 'learning_rate': 3.0757540189805695e-06, 'epoch': 0.64} 64%|██████▎ | 14075/22095 [24:09:49<8:19:45, 3.74s/it] 64%|██████▎ | 14076/22095 [24:09:52<7:49:27, 3.51s/it] {'loss': 0.2913, 'grad_norm': 0.5434031054147186, 'learning_rate': 3.0750775686052024e-06, 'epoch': 0.64} 64%|██████▎ | 14076/22095 [24:09:52<7:49:27, 3.51s/it] 64%|██████▎ | 14077/22095 [24:09:55<7:20:37, 3.30s/it] {'loss': 0.3241, 'grad_norm': 0.6519079447295152, 'learning_rate': 3.0744011595905084e-06, 'epoch': 0.64} 64%|██████▎ | 14077/22095 [24:09:55<7:20:37, 3.30s/it] 64%|██████▎ | 14078/22095 [24:09:58<7:30:52, 3.37s/it] {'loss': 0.316, 'grad_norm': 0.6046272817777023, 'learning_rate': 3.0737247919510182e-06, 'epoch': 0.64} 64%|██████▎ | 14078/22095 [24:09:58<7:30:52, 3.37s/it] 64%|██████▎ | 14079/22095 [24:10:02<7:28:22, 3.36s/it] {'loss': 0.2697, 
'grad_norm': 0.5697051594331598, 'learning_rate': 3.073048465701266e-06, 'epoch': 0.64} 64%|██████▎ | 14079/22095 [24:10:02<7:28:22, 3.36s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98312 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▎ | 14080/22095 [24:10:04<7:09:52, 3.22s/it] {'loss': 0.3651, 'grad_norm': 0.6877517308747793, 'learning_rate': 3.0723721808557857e-06, 'epoch': 0.64} 64%|██████▎ | 14080/22095 [24:10:04<7:09:52, 3.22s/it] 64%|██████▎ | 14081/22095 [24:10:08<7:06:34, 3.19s/it] {'loss': 0.3316, 'grad_norm': 0.7527290715522306, 'learning_rate': 3.0716959374291053e-06, 'epoch': 0.64} 64%|██████▎ | 14081/22095 [24:10:08<7:06:34, 3.19s/it] 64%|██████▎ | 14082/22095 [24:10:11<7:25:19, 3.33s/it] {'loss': 0.3241, 'grad_norm': 0.6233015972507545, 'learning_rate': 3.071019735435756e-06, 'epoch': 0.64} 64%|██████▎ | 14082/22095 [24:10:11<7:25:19, 3.33s/it] 64%|██████▎ | 14083/22095 [24:10:15<7:54:40, 3.55s/it] {'loss': 0.3037, 'grad_norm': 0.6072325059612153, 'learning_rate': 3.0703435748902693e-06, 'epoch': 0.64} 64%|██████▎ | 14083/22095 [24:10:15<7:54:40, 3.55s/it] 64%|██████▎ | 14084/22095 [24:10:19<7:56:40, 3.57s/it] {'loss': 0.3355, 'grad_norm': 0.6523962092443791, 'learning_rate': 3.069667455807174e-06, 'epoch': 0.64} 64%|██████▎ | 14084/22095 [24:10:19<7:56:40, 3.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▎ | 14085/22095 [24:10:22<7:35:55, 3.42s/it] {'loss': 0.2885, 'grad_norm': 0.6225076419749288, 'learning_rate': 3.068991378200995e-06, 'epoch': 0.64} 64%|██████▎ | 14085/22095 [24:10:22<7:35:55, 3.42s/it] 64%|██████▍ | 14086/22095 [24:10:25<7:34:11, 3.40s/it] {'loss': 0.3127, 
'grad_norm': 0.6509686915667399, 'learning_rate': 3.06831534208626e-06, 'epoch': 0.64} 64%|██████▍ | 14086/22095 [24:10:25<7:34:11, 3.40s/it] 64%|██████▍ | 14087/22095 [24:10:29<7:29:29, 3.37s/it] {'loss': 0.2725, 'grad_norm': 0.5662595934973575, 'learning_rate': 3.0676393474774972e-06, 'epoch': 0.64} 64%|██████▍ | 14087/22095 [24:10:29<7:29:29, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14088/22095 [24:10:38<11:35:41, 5.21s/it] {'loss': 0.4708, 'grad_norm': 0.29967031884905504, 'learning_rate': 3.0669633943892294e-06, 'epoch': 0.64} 64%|██████▍ | 14088/22095 [24:10:38<11:35:41, 5.21s/it] 64%|██████▍ | 14089/22095 [24:10:43<11:18:32, 5.09s/it] {'loss': 0.313, 'grad_norm': 0.5931619888481149, 'learning_rate': 3.066287482835981e-06, 'epoch': 0.64} 64%|██████▍ | 14089/22095 [24:10:43<11:18:32, 5.09s/it] 64%|██████▍ | 14090/22095 [24:10:47<10:44:43, 4.83s/it] {'loss': 0.3233, 'grad_norm': 0.6794717002108048, 'learning_rate': 3.0656116128322773e-06, 'epoch': 0.64} 64%|██████▍ | 14090/22095 [24:10:47<10:44:43, 4.83s/it] 64%|██████▍ | 14091/22095 [24:10:51<9:52:11, 4.44s/it] {'loss': 0.2784, 'grad_norm': 0.6069205047274574, 'learning_rate': 3.0649357843926365e-06, 'epoch': 0.64} 64%|██████▍ | 14091/22095 [24:10:51<9:52:11, 4.44s/it] 64%|██████▍ | 14092/22095 [24:10:54<9:03:00, 4.07s/it] {'loss': 0.2856, 'grad_norm': 0.5762227454928642, 'learning_rate': 3.0642599975315836e-06, 'epoch': 0.64} 64%|██████▍ | 14092/22095 [24:10:54<9:03:00, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49989 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14093/22095 [24:10:57<8:13:29, 3.70s/it] {'loss': 0.2966, 'grad_norm': 0.8781504213385238, 'learning_rate': 3.0635842522636392e-06, 'epoch': 0.64} 64%|██████▍ | 14093/22095 [24:10:57<8:13:29, 3.70s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [278, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8459962 in VC:s3://internvl-moe-sft-data/. Exception: Image size [278, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 44500, 'image': 'vrdu_texteq/astro-ph.CO/98660479-b8d8-49bc-b5ea-c4868e2d133c.png', 'image_wh': [[278, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'is restricted to $z<3.5$.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14094/22095 [24:11:00<8:00:11, 3.60s/it] {'loss': 0.3098, 'grad_norm': 0.5917374303485107, 'learning_rate': 3.0629085486033217e-06, 'epoch': 0.64} 64%|██████▍ | 14094/22095 [24:11:00<8:00:11, 3.60s/it] 64%|██████▍ | 14095/22095 [24:11:04<7:53:35, 3.55s/it] {'loss': 0.2716, 'grad_norm': 0.6064820929771447, 'learning_rate': 3.0622328865651486e-06, 'epoch': 0.64} 64%|██████▍ | 14095/22095 [24:11:04<7:53:35, 3.55s/it] 64%|██████▍ | 14096/22095 [24:11:06<7:29:10, 3.37s/it] {'loss': 0.3389, 'grad_norm': 0.621100849610007, 'learning_rate': 3.06155726616364e-06, 'epoch': 0.64} 64%|██████▍ | 14096/22095 [24:11:06<7:29:10, 3.37s/it] 64%|██████▍ | 14097/22095 [24:11:10<7:16:21, 3.27s/it]
{'loss': 0.2949, 'grad_norm': 0.6248818555128179, 'learning_rate': 3.0608816874133135e-06, 'epoch': 0.64} 64%|██████▍ | 14097/22095 [24:11:10<7:16:21, 3.27s/it] 64%|██████▍ | 14098/22095 [24:11:12<7:01:38, 3.16s/it] {'loss': 0.2765, 'grad_norm': 0.6350564177824697, 'learning_rate': 3.0602061503286827e-06, 'epoch': 0.64} 64%|██████▍ | 14098/22095 [24:11:12<7:01:38, 3.16s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8339494 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6128, 'image': 'vrdu_table_final_2/astro-ph.CO/9dd28813-b3fb-489f-9db1-98a9978dbf48.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]} 64%|██████▍ | 14099/22095 [24:11:15<6:46:11, 3.05s/it] {'loss': 0.2896, 'grad_norm': 0.6025741385211456, 'learning_rate': 3.0595306549242643e-06, 'epoch': 0.64} 64%|██████▍ | 14099/22095 [24:11:15<6:46:11, 3.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72908 > 40960).
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14100/22095 [24:11:18<6:50:23, 3.08s/it] {'loss': 0.359, 'grad_norm': 0.7198585599741678, 'learning_rate': 3.0588552012145743e-06, 'epoch': 0.64} 64%|██████▍ | 14100/22095 [24:11:18<6:50:23, 3.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14101/22095 [24:11:26<10:05:19, 4.54s/it] {'loss': 0.4593, 'grad_norm': 0.2987848378295094, 'learning_rate': 3.058179789214122e-06, 'epoch': 0.64} 64%|██████▍ | 14101/22095 [24:11:26<10:05:19, 4.54s/it] 64%|██████▍ | 14102/22095 [24:11:30<9:43:22, 4.38s/it] {'loss': 0.3294, 'grad_norm': 0.610061918764689, 'learning_rate': 3.0575044189374225e-06, 'epoch': 0.64} 64%|██████▍ | 14102/22095 [24:11:30<9:43:22, 4.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62992 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90161 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41666 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87061 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14103/22095 [24:11:34<9:20:05, 4.20s/it] {'loss': 0.3345, 'grad_norm': 0.5692385277557372, 'learning_rate': 3.0568290903989885e-06, 'epoch': 0.64} 64%|██████▍ | 14103/22095 [24:11:34<9:20:05, 4.20s/it] 64%|██████▍ | 14104/22095 [24:11:38<9:18:32, 4.19s/it] {'loss': 0.2944, 'grad_norm': 0.6333484754121524, 'learning_rate': 3.0561538036133275e-06, 'epoch': 0.64} 64%|██████▍ | 14104/22095 [24:11:38<9:18:32, 4.19s/it] 64%|██████▍ | 14105/22095 [24:11:41<8:33:21, 3.85s/it] {'loss': 0.2976, 'grad_norm': 0.5669055540397957, 'learning_rate': 3.0554785585949514e-06, 'epoch': 0.64} 64%|██████▍ | 14105/22095 [24:11:41<8:33:21, 3.85s/it] 64%|██████▍ | 14106/22095 [24:11:45<8:12:59, 3.70s/it] {'loss': 0.312, 'grad_norm': 0.6554879825600077, 'learning_rate': 3.0548033553583707e-06, 'epoch': 0.64} 64%|██████▍ | 14106/22095 [24:11:45<8:12:59, 3.70s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [681, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8478354 in VC:s3://internvl-moe-sft-data/. Exception: Image size [681, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12458, 'image': 'vrdu_texteq/astro-ph.CO/5e5354f8-a7c5-487a-af9d-c722f32953ef.png', 'image_wh': [[681, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where the normalization constant $A$ is chosen such that:'}]} 64%|██████▍ | 14107/22095 [24:11:48<8:16:29, 3.73s/it] {'loss': 0.2654, 'grad_norm': 0.5691376303708063, 'learning_rate': 3.05412819391809e-06, 'epoch': 0.64} 64%|██████▍ | 14107/22095 [24:11:48<8:16:29, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (57065 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101306 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14108/22095 [24:11:57<11:11:36, 5.05s/it] {'loss': 0.4849, 'grad_norm': 0.28976649932335347, 'learning_rate': 3.0534530742886187e-06, 'epoch': 0.64} 64%|██████▍ | 14108/22095 [24:11:57<11:11:36, 5.05s/it] 64%|██████▍ | 14109/22095 [24:12:00<9:57:23, 4.49s/it] {'loss': 0.3153, 'grad_norm': 0.6500680482864272, 'learning_rate': 3.052777996484462e-06, 'epoch': 0.64} 64%|██████▍ | 14109/22095 [24:12:00<9:57:23, 4.49s/it] 64%|██████▍ | 14110/22095 [24:12:04<9:32:08, 4.30s/it] {'loss': 0.313, 'grad_norm': 0.6022213605742276, 'learning_rate': 3.052102960520126e-06, 'epoch': 0.64} 64%|██████▍ | 14110/22095 [24:12:04<9:32:08, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14111/22095 [24:12:13<12:54:45, 5.82s/it] {'loss': 0.4745, 'grad_norm': 0.26796570792864116, 'learning_rate': 3.0514279664101153e-06, 'epoch': 0.64} 64%|██████▍ | 14111/22095 [24:12:13<12:54:45, 5.82s/it] 64%|██████▍ | 14112/22095 [24:12:16<11:11:14, 5.04s/it] {'loss': 0.2933, 'grad_norm': 0.5572222325190271, 'learning_rate': 3.0507530141689324e-06, 'epoch': 0.64} 64%|██████▍ | 14112/22095 [24:12:16<11:11:14, 5.04s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8355987 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 22692, 'image': 'vrdu_table_final_2/astro-ph.CO/f4904035-ab35-4406-92a2-8220c6f6a49f.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 64%|██████▍ | 14113/22095 [24:12:20<10:09:02, 4.58s/it] {'loss': 0.3544, 'grad_norm': 0.6644756722110994, 'learning_rate': 3.050078103811082e-06, 'epoch': 0.64} 64%|██████▍ | 14113/22095 [24:12:20<10:09:02, 4.58s/it] 64%|██████▍ | 14114/22095 [24:12:23<9:17:31, 4.19s/it] {'loss': 0.3171, 'grad_norm': 0.5716566187790919, 'learning_rate': 3.0494032353510634e-06, 'epoch': 0.64} 64%|██████▍ | 14114/22095 [24:12:23<9:17:31, 4.19s/it] 64%|██████▍ | 14115/22095 [24:12:27<8:50:30, 3.99s/it] {'loss': 0.3356, 'grad_norm': 0.6707308766695779, 'learning_rate': 3.0487284088033776e-06, 'epoch': 0.64} 64%|██████▍ | 14115/22095 [24:12:27<8:50:30, 3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41165 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78301 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45861 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47269 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14116/22095 [24:12:36<12:30:18, 5.64s/it] {'loss': 0.4822, 'grad_norm': 0.27757105153482103, 'learning_rate': 3.0480536241825263e-06, 'epoch': 0.64} 64%|██████▍ | 14116/22095 [24:12:36<12:30:18, 5.64s/it] 64%|██████▍ | 14117/22095 [24:12:40<11:30:53, 5.20s/it] {'loss': 0.3391, 'grad_norm': 0.7721032612446138, 'learning_rate': 3.047378881503008e-06, 'epoch': 0.64} 64%|██████▍ | 14117/22095 [24:12:40<11:30:53, 5.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14118/22095 [24:12:48<13:07:25, 5.92s/it] {'loss': 0.4815, 'grad_norm': 0.2996677332075844, 'learning_rate': 3.0467041807793198e-06, 'epoch': 0.64} 64%|██████▍ | 14118/22095 [24:12:48<13:07:25, 5.92s/it] 64%|██████▍ | 14119/22095 [24:12:52<11:38:40, 5.26s/it] {'loss': 0.3202, 'grad_norm': 0.633219965755037, 'learning_rate': 3.046029522025961e-06, 'epoch': 0.64} 64%|██████▍ | 14119/22095 [24:12:52<11:38:40, 5.26s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14120/22095 [24:12:56<10:53:37, 4.92s/it] {'loss': 0.2799, 'grad_norm': 0.7161682752114296, 'learning_rate': 3.045354905257425e-06, 'epoch': 0.64} 64%|██████▍ | 14120/22095 [24:12:56<10:53:37, 4.92s/it] 64%|██████▍ | 14121/22095 [24:12:59<9:44:00, 4.39s/it] {'loss': 0.3324, 'grad_norm': 0.5959397462784317, 'learning_rate': 3.044680330488209e-06, 'epoch': 0.64} 64%|██████▍ | 14121/22095 [24:12:59<9:44:00, 4.39s/it] 64%|██████▍ | 14122/22095 [24:13:02<8:59:33, 4.06s/it] {'loss': 0.324, 'grad_norm': 0.629765307972555, 'learning_rate': 3.0440057977328086e-06, 'epoch': 0.64} 64%|██████▍ | 14122/22095 [24:13:02<8:59:33, 4.06s/it] 64%|██████▍ | 14123/22095 
[24:13:05<8:30:36, 3.84s/it] {'loss': 0.3315, 'grad_norm': 0.6622370050010291, 'learning_rate': 3.0433313070057157e-06, 'epoch': 0.64} 64%|██████▍ | 14123/22095 [24:13:05<8:30:36, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14124/22095 [24:13:15<12:21:32, 5.58s/it] {'loss': 0.4666, 'grad_norm': 0.2798673125588423, 'learning_rate': 3.0426568583214224e-06, 'epoch': 0.64} 64%|██████▍ | 14124/22095 [24:13:15<12:21:32, 5.58s/it] 64%|██████▍ | 14125/22095 [24:13:19<11:14:32, 5.08s/it] {'loss': 0.3218, 'grad_norm': 0.6499396878375385, 'learning_rate': 3.041982451694422e-06, 'epoch': 0.64} 64%|██████▍ | 14125/22095 [24:13:19<11:14:32, 5.08s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367152 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33900, 'image': 'vrdu_table_final_2/astro-ph.CO/5e3cc3ac-4a67-41e2-94fb-cb37800423a5.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 64%|██████▍ | 14126/22095 [24:13:23<10:12:58, 4.62s/it] {'loss': 0.3067, 'grad_norm': 0.5736196118655933, 'learning_rate': 3.0413080871392063e-06, 'epoch': 0.64} 64%|██████▍ | 14126/22095 [24:13:23<10:12:58, 4.62s/it] 64%|██████▍ | 14127/22095 [24:13:26<9:12:51, 4.16s/it] {'loss': 0.3067, 'grad_norm': 0.6391268496601686, 'learning_rate': 3.0406337646702638e-06, 'epoch': 0.64} 64%|██████▍ | 14127/22095 [24:13:26<9:12:51, 4.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14128/22095 [24:13:29<9:01:32, 4.08s/it] {'loss': 0.3353, 'grad_norm': 1.020993104950133, 'learning_rate': 3.039959484302083e-06, 'epoch': 0.64} 64%|██████▍ | 14128/22095 [24:13:30<9:01:32, 4.08s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [525, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8444045 in VC:s3://internvl-moe-sft-data/. Exception: Image size [525, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 146251, 'image': 'vrdu_texteq/astro-ph.CO/3d93dcca-34d6-4686-b7dc-c8dde8dcb99f.png', 'image_wh': [[525, 23]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'while $s$ is either $v$ or $c$. It then follows that'}]} 64%|██████▍ | 14129/22095 [24:13:33<8:37:11, 3.90s/it] {'loss': 0.2749, 'grad_norm': 0.6174232971913675, 'learning_rate': 3.039285246049155e-06, 'epoch': 0.64} 64%|██████▍ | 14129/22095 [24:13:33<8:37:11, 3.90s/it] 64%|██████▍ | 14130/22095 [24:13:36<7:56:11, 3.59s/it] {'loss': 0.3148, 'grad_norm': 0.6704099198505801, 'learning_rate': 3.0386110499259635e-06, 'epoch': 0.64} 64%|██████▍ | 14130/22095 [24:13:36<7:56:11, 3.59s/it] 64%|██████▍ | 14131/22095 [24:13:40<8:04:15, 3.65s/it] {'loss': 0.2975, 'grad_norm': 0.6262980876605436, 'learning_rate': 3.0379368959469967e-06, 'epoch': 0.64} 64%|██████▍ | 14131/22095 [24:13:40<8:04:15, 3.65s/it] 64%|██████▍ | 14132/22095 [24:13:43<7:43:48, 3.49s/it] {'loss': 0.2947, 'grad_norm': 0.6167601797243967, 'learning_rate': 3.0372627841267418e-06, 'epoch': 0.64} 64%|██████▍ | 14132/22095 [24:13:43<7:43:48, 3.49s/it] 64%|██████▍ | 14133/22095 [24:13:46<7:47:26, 3.52s/it] {'loss': 0.3014, 'grad_norm': 0.6249766585899085, 'learning_rate': 3.0365887144796796e-06, 'epoch': 0.64} 64%|██████▍ | 14133/22095 [24:13:46<7:47:26, 3.52s/it] 64%|██████▍ | 14134/22095 [24:13:49<7:27:20, 3.37s/it] {'loss': 0.3278, 'grad_norm': 0.5942506404857038, 'learning_rate': 3.0359146870202954e-06, 'epoch': 0.64} 64%|██████▍ | 14134/22095 [24:13:49<7:27:20, 3.37s/it] 64%|██████▍ | 14135/22095 [24:13:53<7:52:16, 3.56s/it] {'loss': 0.2704, 'grad_norm': 0.5785357614177752, 'learning_rate': 3.035240701763074e-06, 'epoch': 0.64} 64%|██████▍ | 14135/22095 [24:13:53<7:52:16, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83168 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14136/22095 [24:13:56<7:34:03, 3.42s/it] {'loss': 0.2888, 'grad_norm': 0.5573281982727035, 'learning_rate': 3.0345667587224946e-06, 'epoch': 0.64} 64%|██████▍ | 14136/22095 [24:13:56<7:34:03, 3.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14137/22095 [24:13:59<7:11:21, 3.25s/it] {'loss': 0.2976, 'grad_norm': 0.6203447435262367, 'learning_rate': 3.03389285791304e-06, 'epoch': 0.64} 64%|██████▍ | 14137/22095 [24:13:59<7:11:21, 3.25s/it] 64%|██████▍ | 14138/22095 [24:14:03<7:18:52, 3.31s/it] {'loss': 0.3066, 'grad_norm': 0.5662763294622898, 'learning_rate': 3.0332189993491877e-06, 'epoch': 0.64} 64%|██████▍ | 14138/22095 [24:14:03<7:18:52, 3.31s/it] 64%|██████▍ | 14139/22095 [24:14:06<7:20:26, 3.32s/it] {'loss': 0.3161, 'grad_norm': 0.6295862196549189, 'learning_rate': 3.0325451830454207e-06, 'epoch': 0.64} 64%|██████▍ | 14139/22095 [24:14:06<7:20:26, 3.32s/it] 64%|██████▍ | 14140/22095 [24:14:09<7:07:13, 3.22s/it] {'loss': 0.3402, 'grad_norm': 0.6594638994579315, 'learning_rate': 3.031871409016214e-06, 'epoch': 0.64} 64%|██████▍ | 14140/22095 [24:14:09<7:07:13, 3.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14141/22095 [24:14:13<7:23:03, 3.34s/it] {'loss': 0.321, 'grad_norm': 0.6413837854225898, 'learning_rate': 3.0311976772760466e-06, 'epoch': 0.64} 64%|██████▍ | 14141/22095 [24:14:13<7:23:03, 3.34s/it] 64%|██████▍ | 14142/22095 [24:14:16<7:13:41, 3.27s/it] {'loss': 0.3008, 'grad_norm': 0.636807890867043, 'learning_rate': 3.0305239878393947e-06, 'epoch': 0.64} 64%|██████▍ | 14142/22095 [24:14:16<7:13:41, 3.27s/it] 64%|██████▍ | 14143/22095 [24:14:19<7:06:57, 3.22s/it] {'loss': 0.2723, 'grad_norm': 0.6562468927418301, 'learning_rate': 3.0298503407207317e-06, 'epoch': 0.64} 64%|██████▍ | 14143/22095 
[24:14:19<7:06:57, 3.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14144/22095 [24:14:26<9:46:55, 4.43s/it] {'loss': 0.4841, 'grad_norm': 0.31525091302289265, 'learning_rate': 3.029176735934536e-06, 'epoch': 0.64} 64%|██████▍ | 14144/22095 [24:14:26<9:46:55, 4.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52556 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14145/22095 [24:14:29<8:59:21, 4.07s/it] {'loss': 0.2699, 'grad_norm': 0.5849769289677633, 'learning_rate': 3.028503173495279e-06, 'epoch': 0.64} 64%|██████▍ | 14145/22095 [24:14:29<8:59:21, 4.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8888453 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11606, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 3cm\nB. 2cm\nC. 5cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} Token indices sequence length is longer than the specified maximum sequence length for this model (148065 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14146/22095 [24:14:33<8:34:51, 3.89s/it] {'loss': 0.2883, 'grad_norm': 0.6111170056322482, 'learning_rate': 3.0278296534174334e-06, 'epoch': 0.64} 64%|██████▍ | 14146/22095 [24:14:33<8:34:51, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14147/22095 [24:14:36<8:08:31, 3.69s/it] {'loss': 0.3315, 'grad_norm': 0.700306479916621, 'learning_rate': 3.0271561757154705e-06, 'epoch': 0.64} 64%|██████▍ | 14147/22095 [24:14:36<8:08:31, 3.69s/it] 64%|██████▍ | 14148/22095 [24:14:39<7:32:42, 3.42s/it] {'loss': 0.3392, 'grad_norm': 0.6045621286992134, 'learning_rate': 3.0264827404038655e-06, 'epoch': 0.64} 64%|██████▍ | 14148/22095 [24:14:39<7:32:42, 3.42s/it] 64%|██████▍ | 14149/22095 [24:14:42<7:21:23, 3.33s/it] {'loss': 0.3427, 'grad_norm': 0.613524506215485, 'learning_rate': 3.0258093474970817e-06, 'epoch': 0.64} 64%|██████▍ | 14149/22095 [24:14:42<7:21:23, 3.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53619 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86058 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14150/22095 [24:14:46<7:29:24, 3.39s/it] {'loss': 0.3591, 'grad_norm': 0.6087397259443168, 'learning_rate': 3.0251359970095927e-06, 'epoch': 0.64} 64%|██████▍ | 14150/22095 [24:14:46<7:29:24, 3.39s/it] 64%|██████▍ | 14151/22095 [24:14:49<7:24:49, 3.36s/it] {'loss': 0.3239, 'grad_norm': 0.7792919719497309, 'learning_rate': 3.024462688955867e-06, 'epoch': 0.64} 64%|██████▍ | 14151/22095 [24:14:49<7:24:49, 3.36s/it] 64%|██████▍ | 14152/22095 [24:14:53<8:11:22, 3.71s/it] {'loss': 0.316, 'grad_norm': 0.6220333276322899, 'learning_rate': 3.0237894233503697e-06, 'epoch': 0.64} 64%|██████▍ | 14152/22095 [24:14:53<8:11:22, 3.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14153/22095 [24:15:03<12:20:02, 5.59s/it] {'loss': 0.4724, 'grad_norm': 0.31641963417420904, 'learning_rate': 3.0231162002075678e-06, 'epoch': 0.64} 64%|██████▍ | 14153/22095 [24:15:03<12:20:02, 5.59s/it] 64%|██████▍ | 14154/22095 [24:15:07<10:44:25, 4.87s/it] {'loss': 0.2947, 'grad_norm': 0.6263995623866292, 'learning_rate': 3.0224430195419274e-06, 'epoch': 0.64} 64%|██████▍ | 14154/22095 [24:15:07<10:44:25, 4.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49281 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87853 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97310 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119058 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14155/22095 [24:15:10<10:05:51, 4.58s/it] {'loss': 0.3623, 'grad_norm': 0.6354504950160101, 'learning_rate': 3.021769881367914e-06, 'epoch': 0.64} 64%|██████▍ | 14155/22095 [24:15:10<10:05:51, 4.58s/it] 64%|██████▍ | 14156/22095 [24:15:13<8:57:39, 4.06s/it] {'loss': 0.2683, 'grad_norm': 0.6123559216921669, 'learning_rate': 3.0210967856999896e-06, 'epoch': 0.64} 64%|██████▍ | 14156/22095 [24:15:13<8:57:39, 4.06s/it] 64%|██████▍ | 14157/22095 [24:15:17<8:38:27, 3.92s/it] {'loss': 0.315, 'grad_norm': 0.6038707224113978, 'learning_rate': 3.0204237325526166e-06, 'epoch': 0.64} 64%|██████▍ | 14157/22095 [24:15:17<8:38:27, 3.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14158/22095 [24:15:24<10:32:21, 4.78s/it] {'loss': 0.4695, 'grad_norm': 0.2684083420193887, 'learning_rate': 3.01975072194026e-06, 'epoch': 0.64} 64%|██████▍ | 14158/22095 [24:15:24<10:32:21, 4.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42760 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83585 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14159/22095 [24:15:27<9:32:20, 4.33s/it] {'loss': 0.2578, 'grad_norm': 0.791181978079724, 'learning_rate': 3.0190777538773763e-06, 'epoch': 0.64} 64%|██████▍ | 14159/22095 [24:15:27<9:32:20, 4.33s/it] 64%|██████▍ | 14160/22095 [24:15:31<9:29:32, 4.31s/it] {'loss': 0.2635, 'grad_norm': 0.5665999544527709, 'learning_rate': 3.0184048283784284e-06, 'epoch': 0.64} 64%|██████▍ | 14160/22095 [24:15:31<9:29:32, 4.31s/it] 64%|██████▍ | 14161/22095 [24:15:35<8:51:48, 4.02s/it] {'loss': 0.3019, 'grad_norm': 0.6494721148390382, 'learning_rate': 3.0177319454578756e-06, 'epoch': 0.64} 64%|██████▍ | 14161/22095 [24:15:35<8:51:48, 4.02s/it] 64%|██████▍ | 14162/22095 [24:15:38<8:38:36, 3.92s/it] {'loss': 0.3231, 'grad_norm': 0.6559197261771749, 'learning_rate': 3.0170591051301746e-06, 'epoch': 0.64} 64%|██████▍ | 14162/22095 [24:15:38<8:38:36, 3.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14163/22095 [24:15:49<13:08:27, 5.96s/it] {'loss': 0.4769, 'grad_norm': 0.29533722371071636, 'learning_rate': 3.0163863074097823e-06, 'epoch': 0.64} 64%|██████▍ | 14163/22095 [24:15:49<13:08:27, 5.96s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31397.png 2025-08-28 16:13:49.276080 load time: 1256.2 ms 64%|██████▍ | 14164/22095 [24:15:55<13:07:28, 5.96s/it] {'loss': 0.2935, 'grad_norm': 0.7705795692488842, 'learning_rate': 3.0157135523111574e-06, 'epoch': 0.64} 64%|██████▍ | 14164/22095 [24:15:55<13:07:28, 5.96s/it] 64%|██████▍ | 14165/22095 [24:15:59<11:43:28, 5.32s/it] {'loss': 0.3477, 'grad_norm': 0.8557306437058548, 'learning_rate': 3.0150408398487536e-06, 'epoch': 0.64} 64%|██████▍ | 14165/22095 [24:15:59<11:43:28, 5.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this 
model (41863 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101469 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14166/22095 [24:16:02<10:26:28, 4.74s/it] {'loss': 0.3246, 'grad_norm': 0.6391934999181982, 'learning_rate': 3.0143681700370253e-06, 'epoch': 0.64} 64%|██████▍ | 14166/22095 [24:16:02<10:26:28, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14167/22095 [24:16:12<13:42:05, 6.22s/it] {'loss': 0.4821, 'grad_norm': 0.27606869359092295, 'learning_rate': 3.013695542890426e-06, 'epoch': 0.64} 64%|██████▍ | 14167/22095 [24:16:12<13:42:05, 6.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46383 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14168/22095 [24:16:15<11:53:48, 5.40s/it] {'loss': 0.2753, 'grad_norm': 0.7056753474223514, 'learning_rate': 3.0130229584234117e-06, 'epoch': 0.64} 64%|██████▍ | 14168/22095 [24:16:15<11:53:48, 5.40s/it] 64%|██████▍ | 14169/22095 [24:16:18<10:18:16, 4.68s/it] {'loss': 0.313, 'grad_norm': 0.6001043987094551, 'learning_rate': 3.0123504166504293e-06, 'epoch': 0.64} 64%|██████▍ | 14169/22095 [24:16:18<10:18:16, 4.68s/it] 64%|██████▍ | 14170/22095 [24:16:21<9:07:01, 4.14s/it] {'loss': 0.3387, 'grad_norm': 0.6753891660126754, 'learning_rate': 3.0116779175859322e-06, 'epoch': 0.64} 64%|██████▍ | 14170/22095 [24:16:21<9:07:01, 4.14s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [37, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8371041 in VC:s3://internvl-moe-sft-data/. Exception: Image size [37, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37802, 'image': 'vrdu_table_final_2/astro-ph.CO/b4842c53-9227-4abc-b612-63e5699888c6.png', 'image_wh': [[37, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}1.0 \\end{tabular}\n```"}]} 64%|██████▍ | 14171/22095 [24:16:24<8:19:43, 3.78s/it] {'loss': 0.3401, 'grad_norm': 0.6610764557175421, 'learning_rate': 3.011005461244372e-06, 'epoch': 0.64} 64%|██████▍ | 14171/22095 [24:16:24<8:19:43, 3.78s/it] 64%|██████▍ | 14172/22095 [24:16:27<7:54:45, 3.60s/it] {'loss': 0.3199, 'grad_norm': 0.6515044167069902, 'learning_rate': 3.010333047640192e-06, 'epoch': 0.64} 64%|██████▍ | 14172/22095 [24:16:27<7:54:45, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41685 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87188 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76565 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14173/22095 [24:16:31<7:54:30, 3.59s/it] {'loss': 0.3511, 'grad_norm': 0.6496929131103447, 'learning_rate': 3.009660676787846e-06, 'epoch': 0.64} 64%|██████▍ | 14173/22095 [24:16:31<7:54:30, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14174/22095 [24:16:37<9:47:44, 4.45s/it] {'loss': 0.4819, 'grad_norm': 0.3124187274189695, 'learning_rate': 3.0089883487017803e-06, 'epoch': 0.64} 64%|██████▍ | 14174/22095 [24:16:37<9:47:44, 4.45s/it] 64%|██████▍ | 14175/22095 [24:16:41<9:33:05, 4.34s/it] {'loss': 0.3403, 'grad_norm': 0.6471645470233871, 'learning_rate': 3.0083160633964385e-06, 'epoch': 0.64} 64%|██████▍ | 14175/22095 [24:16:41<9:33:05, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41160 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46522 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95316 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79382 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14176/22095 [24:16:50<12:02:58, 5.48s/it] {'loss': 0.4696, 'grad_norm': 0.30259448985732956, 'learning_rate': 3.007643820886267e-06, 'epoch': 0.64} 64%|██████▍ | 14176/22095 [24:16:50<12:02:58, 5.48s/it] 64%|██████▍ | 14177/22095 [24:16:53<10:45:34, 4.89s/it] {'loss': 0.2916, 'grad_norm': 0.6346368867233724, 'learning_rate': 3.0069716211857137e-06, 'epoch': 0.64} 64%|██████▍ | 14177/22095 [24:16:53<10:45:34, 4.89s/it] 64%|██████▍ | 14178/22095 [24:16:56<9:39:56, 4.40s/it] {'loss': 0.2947, 'grad_norm': 0.8301621225683237, 'learning_rate': 3.006299464309216e-06, 'epoch': 0.64} 64%|██████▍ | 14178/22095 [24:16:56<9:39:56, 4.40s/it] 64%|██████▍ | 14179/22095 [24:17:00<9:25:53, 4.29s/it] {'loss': 0.329, 'grad_norm': 0.6255210650197129, 'learning_rate': 3.0056273502712203e-06, 'epoch': 0.64} 64%|██████▍ | 14179/22095 [24:17:00<9:25:53, 4.29s/it] 64%|██████▍ | 14180/22095 [24:17:04<9:11:52, 4.18s/it] {'loss': 0.3265, 'grad_norm': 0.5838680131848357, 'learning_rate': 3.004955279086167e-06, 'epoch': 0.64} 64%|██████▍ | 14180/22095 [24:17:04<9:11:52, 4.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14181/22095 [24:17:11<10:48:32, 4.92s/it] {'loss': 0.4702, 'grad_norm': 0.31067328632448243, 'learning_rate': 3.0042832507685005e-06, 'epoch': 0.64} 64%|██████▍ | 14181/22095 [24:17:11<10:48:32, 4.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44146 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112672 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (133943 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14182/22095 [24:17:14<9:38:37, 4.39s/it] {'loss': 0.3028, 'grad_norm': 0.6004743983019462, 'learning_rate': 3.0036112653326544e-06, 'epoch': 0.64} 64%|██████▍ | 14182/22095 [24:17:14<9:38:37, 4.39s/it] 64%|██████▍ | 14183/22095 [24:17:17<8:57:28, 4.08s/it] {'loss': 0.2985, 'grad_norm': 0.551667150339416, 'learning_rate': 3.0029393227930712e-06, 'epoch': 0.64} 64%|██████▍ | 14183/22095 [24:17:17<8:57:28, 4.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (82807 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46341 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14184/22095 [24:17:27<12:30:09, 5.69s/it] {'loss': 0.4687, 'grad_norm': 0.2957063245086027, 'learning_rate': 3.0022674231641903e-06, 'epoch': 0.64} 64%|██████▍ | 14184/22095 [24:17:27<12:30:09, 5.69s/it] 64%|██████▍ | 14185/22095 [24:17:31<11:27:33, 5.22s/it] {'loss': 0.3463, 'grad_norm': 0.624071202751652, 'learning_rate': 3.001595566460446e-06, 'epoch': 0.64} 64%|██████▍ | 14185/22095 [24:17:31<11:27:33, 5.22s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14186/22095 [24:17:34<9:59:45, 4.55s/it] {'loss': 0.3186, 'grad_norm': 0.6038896558156434, 'learning_rate': 3.0009237526962735e-06, 'epoch': 0.64} 64%|██████▍ | 14186/22095 [24:17:34<9:59:45, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78464 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47444 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41324 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70874 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80139 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78004 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14187/22095 [24:17:37<9:13:01, 4.20s/it] {'loss': 0.3022, 'grad_norm': 0.5568839175147954, 'learning_rate': 3.0002519818861126e-06, 'epoch': 0.64} 64%|██████▍ | 14187/22095 [24:17:37<9:13:01, 4.20s/it] 64%|██████▍ | 14188/22095 [24:17:41<8:35:32, 3.91s/it] {'loss': 0.3807, 'grad_norm': 0.6025428150532423, 'learning_rate': 2.999580254044393e-06, 'epoch': 0.64} 64%|██████▍ | 14188/22095 [24:17:41<8:35:32, 3.91s/it] 64%|██████▍ | 14189/22095 [24:17:45<8:36:12, 3.92s/it] {'loss': 0.2849, 'grad_norm': 0.5803168353423076, 'learning_rate': 2.9989085691855513e-06, 'epoch': 0.64} 64%|██████▍ | 14189/22095 [24:17:45<8:36:12, 3.92s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item 
msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908199 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 31352, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nA. 4.5\nB. 7\nC. 2\nD. 2.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14190/22095 [24:17:54<12:13:04, 5.56s/it] {'loss': 0.4689, 'grad_norm': 0.2851186215276721, 'learning_rate': 2.9982369273240186e-06, 'epoch': 0.64} 64%|██████▍ | 14190/22095 [24:17:54<12:13:04, 5.56s/it] 64%|██████▍ | 14191/22095 [24:17:57<10:41:19, 4.87s/it] {'loss': 0.3065, 'grad_norm': 0.5989723155273211, 'learning_rate': 2.9975653284742257e-06, 'epoch': 0.64} 64%|██████▍ | 14191/22095 [24:17:57<10:41:19, 4.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (83043 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14192/22095 [24:18:07<13:43:14, 6.25s/it] {'loss': 0.4629, 'grad_norm': 0.26671375041777434, 'learning_rate': 2.996893772650602e-06, 'epoch': 0.64} 64%|██████▍ | 14192/22095 [24:18:07<13:43:14, 6.25s/it] 64%|██████▍ | 14193/22095 [24:18:10<11:58:48, 5.46s/it] {'loss': 0.3266, 'grad_norm': 0.6129745855779944, 'learning_rate': 2.996222259867582e-06, 'epoch': 0.64} 64%|██████▍ | 14193/22095 [24:18:10<11:58:48, 5.46s/it] 64%|██████▍ | 14194/22095 [24:18:14<10:47:20, 4.92s/it] {'loss': 0.3478, 'grad_norm': 0.5675197444532076, 'learning_rate': 2.9955507901395908e-06, 'epoch': 0.64} 64%|██████▍ | 14194/22095 [24:18:14<10:47:20, 4.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49568 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41159 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42444 > 40960) for 4 sample(s). Truncating to 1369 with 2 samples. 64%|██████▍ | 14195/22095 [24:18:23<13:40:57, 6.24s/it] {'loss': 0.4871, 'grad_norm': 0.28583493374778624, 'learning_rate': 2.994879363481056e-06, 'epoch': 0.64} 64%|██████▍ | 14195/22095 [24:18:23<13:40:57, 6.24s/it] 64%|██████▍ | 14196/22095 [24:18:27<11:55:03, 5.43s/it] {'loss': 0.3435, 'grad_norm': 0.60330637142904, 'learning_rate': 2.994207979906405e-06, 'epoch': 0.64} 64%|██████▍ | 14196/22095 [24:18:27<11:55:03, 5.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79062 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51108 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54190 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (156161 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14197/22095 [24:18:31<10:49:21, 4.93s/it] {'loss': 0.3244, 'grad_norm': 0.6031206315634002, 'learning_rate': 2.993536639430066e-06, 'epoch': 0.64} 64%|██████▍ | 14197/22095 [24:18:31<10:49:21, 4.93s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (94609968 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 64%|██████▍ | 14198/22095 [24:18:34<9:32:44, 4.35s/it] {'loss': 0.2869, 'grad_norm': 0.6196682127212395, 'learning_rate': 2.992865342066461e-06, 'epoch': 0.64} 64%|██████▍ | 14198/22095 [24:18:34<9:32:44, 4.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30024.png 2025-08-28 16:16:29.856113 load time: 1125.55 ms 64%|██████▍ | 14199/22095 [24:18:41<11:29:00, 5.24s/it] {'loss': 0.4426, 'grad_norm': 0.2655158263326608, 'learning_rate': 2.992194087830016e-06, 'epoch': 0.64} 64%|██████▍ | 14199/22095 [24:18:41<11:29:00, 5.24s/it] 64%|██████▍ | 14200/22095 [24:18:47<12:15:58, 5.59s/it] {'loss': 0.4621, 'grad_norm': 0.2960483703553002, 'learning_rate': 2.991522876735154e-06, 'epoch': 0.64} 64%|██████▍ | 14200/22095 [24:18:47<12:15:58, 5.59s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 64%|██████▍ | 14201/22095 [24:18:51<10:56:35, 4.99s/it] {'loss': 0.302, 'grad_norm': 0.6015521179132051, 'learning_rate': 2.990851708796295e-06, 'epoch': 0.64} 64%|██████▍ | 14201/22095 [24:18:51<10:56:35, 4.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78277 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109301 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71694 > 40960). 
Running this sequence through the model will result in indexing errors 64%|██████▍ | 14202/22095 [24:18:55<10:17:20, 4.69s/it] {'loss': 0.3565, 'grad_norm': 0.5963876428881998, 'learning_rate': 2.990180584027863e-06, 'epoch': 0.64} 64%|██████▍ | 14202/22095 [24:18:55<10:17:20, 4.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42068 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93451 > 40960). Running this sequence through the model will result in indexing errors 64%|██████▍ | 14203/22095 [24:19:01<11:25:56, 5.21s/it] {'loss': 0.4862, 'grad_norm': 0.2691629477578797, 'learning_rate': 2.989509502444279e-06, 'epoch': 0.64} 64%|██████▍ | 14203/22095 [24:19:01<11:25:56, 5.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307763 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2ABTjicrI8KJjy0FhXXbfnpXa_!!3476548650.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n提取这张照片中的文本信息,但不要输出文字所在位置。'}, {'from': 'gpt', 'value': '图中所有文字:\n买3件\n送1只\n送抱枕\n送2只\n买5件\n高密度海绵\n防滑\n可固定\n可拆洗'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8938305 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 61458, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 
11cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 64%|██████▍ | 14204/22095 [24:19:05<10:23:53, 4.74s/it] {'loss': 0.34, 'grad_norm': 0.7691217666066809, 'learning_rate': 2.98883846405996e-06, 'epoch': 0.64} 64%|██████▍ | 14204/22095 [24:19:05<10:23:53, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14205/22095 [24:19:14<13:26:32, 6.13s/it] {'loss': 0.4938, 'grad_norm': 0.28863810537045187, 'learning_rate': 2.988167468889324e-06, 'epoch': 0.64} 64%|██████▍ | 14205/22095 [24:19:14<13:26:32, 6.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 64%|██████▍ | 14206/22095 [24:19:18<12:08:37, 5.54s/it] {'loss': 0.2781, 'grad_norm': 0.662039036583885, 'learning_rate': 2.9874965169467934e-06, 'epoch': 0.64} 64%|██████▍ | 14206/22095 [24:19:18<12:08:37, 5.54s/it] 64%|██████▍ | 14207/22095 [24:19:21<10:23:49, 4.75s/it] {'loss': 0.2576, 'grad_norm': 0.6085765306963504, 'learning_rate': 2.986825608246779e-06, 'epoch': 0.64} 64%|██████▍ | 14207/22095 [24:19:21<10:23:49, 4.75s/it] 64%|██████▍ | 14208/22095 [24:19:25<9:23:27, 4.29s/it] {'loss': 0.3167, 'grad_norm': 0.6022563110175009, 'learning_rate': 2.9861547428037003e-06, 'epoch': 0.64} 64%|██████▍ | 14208/22095 [24:19:25<9:23:27, 4.29s/it] 64%|██████▍ | 14209/22095 [24:19:28<9:06:15, 4.16s/it] {'loss': 0.2836, 'grad_norm': 0.5465923287275317, 'learning_rate': 2.9854839206319697e-06, 'epoch': 0.64} 64%|██████▍ | 14209/22095 [24:19:28<9:06:15, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 64%|██████▍ | 14210/22095 [24:19:38<12:31:09, 5.72s/it] {'loss': 0.4954, 'grad_norm': 0.28547996312858637, 'learning_rate': 2.984813141746006e-06, 'epoch': 0.64} 64%|██████▍ | 14210/22095 [24:19:38<12:31:09, 5.72s/it] 64%|██████▍ | 14211/22095 [24:19:46<13:58:13, 6.38s/it] {'loss': 0.4616, 'grad_norm': 0.2592260567682769, 'learning_rate': 
2.9841424061602153e-06, 'epoch': 0.64} 64%|██████▍ | 14211/22095 [24:19:46<13:58:13, 6.38s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 64%|██████▍ | 14212/22095 [24:19:49<12:09:41, 5.55s/it] {'loss': 0.275, 'grad_norm': 0.5807378407030216, 'learning_rate': 2.9834717138890145e-06, 'epoch': 0.64} 64%|██████▍ | 14212/22095 [24:19:49<12:09:41, 5.55s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [45, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366676 in VC:s3://internvl-moe-sft-data/. Exception: Image size [45, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33422, 'image': 'vrdu_table_final_2/astro-ph.CO/6ca83451-dc3e-4336-92b1-2b34daef753b.png', 'image_wh': [[45, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{c}$Y_{tot}$\\end{tabular}\n```'}]}
 64%|██████▍ | 14213/22095 [24:19:53<11:01:06, 5.03s/it] {'loss': 0.284, 'grad_norm': 0.5648631166659934, 'learning_rate': 2.9828010649468144e-06, 'epoch': 0.64}
 64%|██████▍ | 14214/22095 [24:19:56<9:52:35, 4.51s/it] {'loss': 0.3002, 'grad_norm': 0.5817751136785265, 'learning_rate': 2.982130459348022e-06, 'epoch': 0.64}
 64%|██████▍ | 14215/22095 [24:20:01<9:35:38, 4.38s/it] {'loss': 0.3609, 'grad_norm': 0.6588689318366263, 'learning_rate': 2.9814598971070487e-06, 'epoch': 0.64}
Token indices sequence length is longer than the specified maximum sequence length for this model (42146 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51753 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61513 > 40960). Running this sequence through the model will result in indexing errors
 64%|██████▍ | 14216/22095 [24:20:04<8:42:12, 3.98s/it] {'loss': 0.3373, 'grad_norm': 0.6150071152737294, 'learning_rate': 2.980789378238305e-06, 'epoch': 0.64}
 64%|██████▍ | 14217/22095 [24:20:07<8:16:06, 3.78s/it] {'loss': 0.3161, 'grad_norm': 0.6178255211284963, 'learning_rate': 2.980118902756194e-06, 'epoch': 0.64}
 64%|██████▍ | 14218/22095 [24:20:10<7:34:45, 3.46s/it] {'loss': 0.3345, 'grad_norm': 0.6240835861392467, 'learning_rate': 2.9794484706751243e-06, 'epoch': 0.64}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 64%|██████▍ | 14219/22095 [24:20:13<7:14:28, 3.31s/it] {'loss': 0.2888, 'grad_norm': 0.6383863142520622, 'learning_rate': 2.9787780820095025e-06, 'epoch': 0.64}
 64%|██████▍ | 14220/22095 [24:20:16<7:09:43, 3.27s/it] {'loss': 0.2687, 'grad_norm': 0.6012313492309478, 'learning_rate': 2.97810773677373e-06, 'epoch': 0.64}
 64%|██████▍ | 14221/22095 [24:20:20<7:42:43, 3.53s/it] {'loss': 0.2902, 'grad_norm': 0.6116343534165997, 'learning_rate': 2.977437434982214e-06, 'epoch': 0.64}
 64%|██████▍ | 14222/22095 [24:20:23<7:32:58, 3.45s/it] {'loss': 0.2817, 'grad_norm': 0.5824607060502492, 'learning_rate': 2.976767176649356e-06, 'epoch': 0.64}
 64%|██████▍ | 14223/22095 [24:20:26<7:18:33, 3.34s/it] {'loss': 0.3218, 'grad_norm': 0.6720034791116571, 'learning_rate': 2.9760969617895567e-06, 'epoch': 0.64}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 64%|██████▍ | 14224/22095 [24:20:37<12:00:38, 5.49s/it] {'loss': 0.4402, 'grad_norm': 0.30218453334348905, 'learning_rate': 2.975426790417218e-06, 'epoch': 0.64}
 64%|██████▍ | 14225/22095 [24:20:40<10:33:25, 4.83s/it] {'loss': 0.3273, 'grad_norm': 0.618944995342216, 'learning_rate': 2.974756662546738e-06, 'epoch': 0.64}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44781 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55949 > 40960). Running this sequence through the model will result in indexing errors
 64%|██████▍ | 14226/22095 [24:20:46<11:14:59, 5.15s/it] {'loss': 0.4503, 'grad_norm': 0.27838162273474976, 'learning_rate': 2.97408657819252e-06, 'epoch': 0.64}
 64%|██████▍ | 14227/22095 [24:20:49<10:04:15, 4.61s/it] {'loss': 0.2749, 'grad_norm': 0.6088025550095602, 'learning_rate': 2.9734165373689577e-06, 'epoch': 0.64}
 64%|██████▍ | 14228/22095 [24:20:53<9:37:51, 4.41s/it] {'loss': 0.3192, 'grad_norm': 0.6119993938372056, 'learning_rate': 2.97274654009045e-06, 'epoch': 0.64}
 64%|██████▍ | 14229/22095 [24:20:57<9:26:09, 4.32s/it] {'loss': 0.3238, 'grad_norm': 0.6109517436523327, 'learning_rate': 2.972076586371394e-06, 'epoch': 0.64}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 64%|██████▍ | 14230/22095 [24:21:01<8:49:09, 4.04s/it] {'loss': 0.3165, 'grad_norm': 0.6097190740331956, 'learning_rate': 2.9714066762261825e-06, 'epoch': 0.64}
 64%|██████▍ | 14231/22095 [24:21:04<8:06:59, 3.72s/it] {'loss': 0.3748, 'grad_norm': 0.6508118809135099, 'learning_rate': 2.9707368096692113e-06, 'epoch': 0.64}
 64%|██████▍ | 14232/22095 [24:21:07<8:08:01, 3.72s/it] {'loss': 0.2921, 'grad_norm': 0.5701642957641045, 'learning_rate': 2.9700669867148747e-06, 'epoch': 0.64}
 64%|██████▍ | 14233/22095 [24:21:10<7:36:51, 3.49s/it] {'loss': 0.2968, 'grad_norm': 0.5970835905768958, 'learning_rate': 2.9693972073775633e-06, 'epoch': 0.64}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 64%|██████▍ | 14234/22095 [24:21:18<10:17:58, 4.72s/it] {'loss': 0.4623, 'grad_norm': 0.2789643423704904, 'learning_rate': 2.9687274716716686e-06, 'epoch': 0.64}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396928 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63781, 'image': 'vrdu_table_final_2/astro-ph.EP/d39668df-112a-42a3-bda3-4051c37ad6e6.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$e_x$\\end{tabular}\n```"}]}
 64%|██████▍ | 14235/22095 [24:21:22<9:36:34, 4.40s/it] {'loss': 0.2662, 'grad_norm': 0.6103176597058867, 'learning_rate': 2.968057779611585e-06, 'epoch': 0.64}
 64%|██████▍ | 14236/22095 [24:21:25<8:50:24, 4.05s/it] {'loss': 0.291, 'grad_norm': 0.5945299475385736, 'learning_rate': 2.967388131211696e-06, 'epoch': 0.64}
 64%|██████▍ | 14237/22095 [24:21:29<8:50:49, 4.05s/it] {'loss': 0.3026, 'grad_norm': 0.6105465442782668, 'learning_rate': 2.966718526486394e-06, 'epoch': 0.64}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8337490 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4112, 'image': 'vrdu_table_final_2/astro-ph.CO/bac8bb57-ceea-4e5a-9cae-277d540ff36c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
 64%|██████▍ | 14238/22095 [24:21:32<8:23:01, 3.84s/it] {'loss': 0.3068, 'grad_norm': 0.7439019656806879, 'learning_rate': 2.966048965450065e-06, 'epoch': 0.64}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8934541 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57694, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
 64%|██████▍ | 14239/22095 [24:21:37<8:48:54, 4.04s/it] {'loss': 0.3215, 'grad_norm': 0.6234767585672157, 'learning_rate': 2.9653794481171006e-06, 'epoch': 0.64}
Token indices sequence length is longer than the specified maximum sequence length for this model (43191 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55222 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98731 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53180 > 40960). Running this sequence through the model will result in indexing errors
 64%|██████▍ | 14240/22095 [24:21:40<8:34:48, 3.93s/it] {'loss': 0.355, 'grad_norm': 0.6511768410489892, 'learning_rate': 2.9647099745018794e-06, 'epoch': 0.64}
 64%|██████▍ | 14241/22095 [24:21:44<8:10:15, 3.75s/it] {'loss': 0.2834, 'grad_norm': 0.5577217240585588, 'learning_rate': 2.9640405446187915e-06, 'epoch': 0.64}
 64%|██████▍ | 14242/22095 [24:21:47<8:09:51, 3.74s/it] {'loss': 0.2845, 'grad_norm': 0.6636043203449777, 'learning_rate': 2.96337115848222e-06, 'epoch': 0.64}
 64%|██████▍ | 14243/22095 [24:21:52<9:00:18, 4.13s/it] {'loss': 0.2784, 'grad_norm': 0.5674048522001951, 'learning_rate': 2.9627018161065456e-06, 'epoch': 0.64}
 64%|██████▍ | 14244/22095 [24:21:55<8:10:51, 3.75s/it] {'loss': 0.3145, 'grad_norm': 0.6702110182418993, 'learning_rate': 2.962032517506152e-06, 'epoch': 0.64}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 64%|██████▍ | 14245/22095 [24:21:58<7:46:56, 3.57s/it] {'loss': 0.3155, 'grad_norm': 0.6525622517016991, 'learning_rate': 2.9613632626954226e-06, 'epoch': 0.64}
 64%|██████▍ | 14246/22095 [24:22:02<7:44:40, 3.55s/it] {'loss': 0.2913, 'grad_norm': 0.7183003943697874, 'learning_rate': 2.960694051688734e-06, 'epoch': 0.64}
 64%|██████▍ | 14247/22095 [24:22:05<7:16:30, 3.34s/it] {'loss': 0.2935, 'grad_norm': 0.6211610036592395, 'learning_rate': 2.960024884500467e-06, 'epoch': 0.64}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 64%|██████▍ | 14248/22095 [24:22:15<11:24:42, 5.24s/it] {'loss': 0.4627, 'grad_norm': 0.2738736290787737, 'learning_rate': 2.959355761145001e-06, 'epoch': 0.64}
Token indices sequence length is longer than the specified maximum sequence length for this model (70559 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95385 > 40960). Running this sequence through the model will result in indexing errors
 64%|██████▍ | 14249/22095 [24:22:18<10:11:36, 4.68s/it] {'loss': 0.303, 'grad_norm': 0.640655623750578, 'learning_rate': 2.9586866816367104e-06, 'epoch': 0.64}
 64%|██████▍ | 14250/22095 [24:22:21<9:02:52, 4.15s/it] {'loss': 0.3239, 'grad_norm': 0.6000781467813328, 'learning_rate': 2.9580176459899747e-06, 'epoch': 0.64}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 64%|██████▍ | 14251/22095 [24:22:28<11:07:55, 5.11s/it] {'loss': 0.4631, 'grad_norm': 0.26561936790025625, 'learning_rate': 2.9573486542191682e-06, 'epoch': 0.64}
 65%|██████▍ | 14252/22095 [24:22:32<10:12:32, 4.69s/it] {'loss': 0.2882, 'grad_norm': 0.9501142452737259, 'learning_rate': 2.9566797063386665e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8603646 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19765, 'image': '679435824.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Education & Teaching? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
 65%|██████▍ | 14253/22095 [24:22:36<9:33:12, 4.39s/it] {'loss': 0.3373, 'grad_norm': 0.6272983545080569, 'learning_rate': 2.9560108023628403e-06, 'epoch': 0.65}
 65%|██████▍ | 14254/22095 [24:22:39<8:55:30, 4.10s/it] {'loss': 0.3377, 'grad_norm': 0.6390746721660203, 'learning_rate': 2.955341942306066e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307929 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2FAk9b0zJ8KJjSspkXXbF7VXa_!!1616985162.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text does this image contain?'}, {'from': 'gpt', 'value': 'All words in the image:\n送\n七\n买\n飞机原装\n华太电池\n价值5元\nUSB数据线\n遥控器电池\n螺丝刀一把\n充电器\n风叶AB配件\n尾翼'}]}
 65%|██████▍ | 14255/22095 [24:22:42<8:05:10, 3.71s/it] {'loss': 0.2609, 'grad_norm': 0.6762475747292197, 'learning_rate': 2.9546731261827135e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (53459 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88840 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14256/22095 [24:22:46<8:12:53, 3.77s/it] {'loss': 0.3167, 'grad_norm': 0.6354579445660217, 'learning_rate': 2.9540043540071535e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▍ | 14257/22095 [24:22:55<11:59:43, 5.51s/it] {'loss': 0.4601, 'grad_norm': 0.2658740095150606, 'learning_rate': 2.953335625793755e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [78, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390926 in VC:s3://internvl-moe-sft-data/. Exception: Image size [78, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57746, 'image': 'vrdu_table_final_2/astro-ph.EP/fb064445-f5a2-4846-a0d8-9e8d880d2215.png', 'image_wh': [[78, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}}\\textbf{Label} \\end{tabular}\n```"}]}
 65%|██████▍ | 14258/22095 [24:22:58<10:26:40, 4.80s/it] {'loss': 0.3051, 'grad_norm': 0.6089096635417536, 'learning_rate': 2.952666941556891e-06, 'epoch': 0.65}
 65%|██████▍ | 14259/22095 [24:23:02<9:26:33, 4.34s/it] {'loss': 0.3246, 'grad_norm': 0.6345912634128601, 'learning_rate': 2.9519983013109233e-06, 'epoch': 0.65}
 65%|██████▍ | 14260/22095 [24:23:05<8:33:10, 3.93s/it] {'loss': 0.3671, 'grad_norm': 0.6944907784932608, 'learning_rate': 2.9513297050702238e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047831 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5.5cm\nB. 6cm\nC. 6.5cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 65%|██████▍ | 14261/22095 [24:23:09<8:59:16, 4.13s/it] {'loss': 0.3174, 'grad_norm': 0.5834899490919657, 'learning_rate': 2.9506611528491574e-06, 'epoch': 0.65}
 65%|██████▍ | 14262/22095 [24:23:12<8:09:17, 3.75s/it] {'loss': 0.3125, 'grad_norm': 0.6705937698241765, 'learning_rate': 2.949992644662088e-06, 'epoch': 0.65}
 65%|██████▍ | 14263/22095 [24:23:15<7:36:31, 3.50s/it] {'loss': 0.3252, 'grad_norm': 0.5996420799570138, 'learning_rate': 2.9493241805233795e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 65%|██████▍ | 14264/22095 [24:23:23<10:29:25, 4.82s/it] {'loss': 0.4378, 'grad_norm': 0.26899489308008584, 'learning_rate': 2.9486557604473993e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (51108 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42426 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55029 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79395 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14265/22095 [24:23:26<9:39:23, 4.44s/it] {'loss': 0.3032, 'grad_norm': 0.7507072498117943, 'learning_rate': 2.947987384448503e-06, 'epoch': 0.65}
 65%|██████▍ | 14266/22095 [24:23:30<8:52:27, 4.08s/it] {'loss': 0.3055, 'grad_norm': 0.6252518968544177, 'learning_rate': 2.9473190525410573e-06, 'epoch': 0.65}
 65%|██████▍ | 14267/22095 [24:23:33<8:26:54, 3.89s/it] {'loss': 0.3365, 'grad_norm': 0.6217862203131317, 'learning_rate': 2.9466507647394193e-06, 'epoch': 0.65}
 65%|██████▍ | 14268/22095 [24:23:37<8:22:05, 3.85s/it] {'loss': 0.2922, 'grad_norm': 0.6136071409273623, 'learning_rate': 2.9459825210579534e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877035 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 188, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 10cm\nB. 5cm\nC. 15cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 65%|██████▍ | 14269/22095 [24:23:40<8:08:27, 3.74s/it] {'loss': 0.2907, 'grad_norm': 0.6190482145623376, 'learning_rate': 2.9453143215110113e-06, 'epoch': 0.65}
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42420 > 40960) for 4 sample(s). Truncating to 37428 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (54383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80201 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14270/22095 [24:23:43<7:34:45, 3.49s/it] {'loss': 0.2979, 'grad_norm': 0.6836419932946133, 'learning_rate': 2.9446461661129553e-06, 'epoch': 0.65}
 65%|██████▍ | 14271/22095 [24:23:47<7:42:38, 3.55s/it] {'loss': 0.2856, 'grad_norm': 0.6098661425702997, 'learning_rate': 2.9439780548781414e-06, 'epoch': 0.65}
 65%|██████▍ | 14272/22095 [24:23:51<7:49:23, 3.60s/it] {'loss': 0.3018, 'grad_norm': 0.6647327199400282, 'learning_rate': 2.9433099878209238e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▍ | 14273/22095 [24:24:00<11:36:51, 5.35s/it] {'loss': 0.4411, 'grad_norm': 0.40327286634865583, 'learning_rate': 2.9426419649556566e-06, 'epoch': 0.65}
 65%|██████▍ | 14274/22095 [24:24:03<10:17:58, 4.74s/it] {'loss': 0.3332, 'grad_norm': 0.6334619344167035, 'learning_rate': 2.941973986296697e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8923675 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46828, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
 65%|██████▍ | 14275/22095 [24:24:06<9:05:22, 4.18s/it] {'loss': 0.2829, 'grad_norm': 0.6411940748488537, 'learning_rate': 2.9413060518583948e-06, 'epoch': 0.65}
 65%|██████▍ | 14276/22095 [24:24:09<8:23:07, 3.86s/it] {'loss': 0.2762, 'grad_norm': 0.5965215268418078, 'learning_rate': 2.9406381616551026e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (54747 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14277/22095 [24:24:12<7:51:28, 3.62s/it] {'loss': 0.3011, 'grad_norm': 0.6653536857292324, 'learning_rate': 2.939970315701173e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▍ | 14278/22095 [24:24:20<10:06:06, 4.65s/it] {'loss': 0.4659, 'grad_norm': 0.2752597094240355, 'learning_rate': 2.939302514010951e-06, 'epoch': 0.65}
 65%|██████▍ | 14279/22095 [24:24:23<9:23:22, 4.32s/it] {'loss': 0.3371, 'grad_norm': 0.6272241354713876, 'learning_rate': 2.9386347565987917e-06, 'epoch': 0.65}
 65%|██████▍ | 14280/22095 [24:24:27<9:09:06, 4.22s/it] {'loss': 0.3134, 'grad_norm': 0.64838854296624, 'learning_rate': 2.937967043479039e-06, 'epoch': 0.65}
 65%|██████▍ | 14281/22095 [24:24:31<8:50:43, 4.08s/it] {'loss': 0.3058, 'grad_norm': 0.628894173057687, 'learning_rate': 2.937299374666044e-06, 'epoch': 0.65}
 65%|██████▍ | 14282/22095 [24:24:34<8:06:50, 3.74s/it] {'loss': 0.3215, 'grad_norm': 0.6724655897300117, 'learning_rate': 2.936631750174147e-06, 'epoch': 0.65}
 65%|██████▍ | 14283/22095 [24:24:37<7:31:14, 3.47s/it] {'loss': 0.2922, 'grad_norm': 0.5889457910829546, 'learning_rate': 2.9359641700176977e-06, 'epoch': 0.65}
 65%|██████▍ | 14284/22095 [24:24:40<7:12:34, 3.32s/it] {'loss': 0.2842, 'grad_norm': 0.7019494698064456, 'learning_rate': 2.935296634211041e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▍ | 14285/22095 [24:24:47<9:52:59, 4.56s/it] {'loss': 0.469, 'grad_norm': 0.30655246499787175, 'learning_rate': 2.934629142768517e-06, 'epoch': 0.65}
 65%|██████▍ | 14286/22095 [24:24:50<9:04:44, 4.19s/it] {'loss': 0.285, 'grad_norm': 0.6274803319498268, 'learning_rate': 2.9339616957044683e-06, 'epoch': 0.65}
 65%|██████▍ | 14287/22095 [24:24:54<8:42:54, 4.02s/it] {'loss': 0.3083, 'grad_norm': 0.5830271213214427, 'learning_rate': 2.9332942930332404e-06, 'epoch': 0.65}
 65%|██████▍ | 14288/22095 [24:24:58<8:43:53, 4.03s/it] {'loss': 0.2746, 'grad_norm': 0.5883275567298492, 'learning_rate': 2.9326269347691675e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (75766 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45623 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86899 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14289/22095 [24:25:01<8:04:49, 3.73s/it] {'loss': 0.3313, 'grad_norm': 0.640035380891334, 'learning_rate': 2.931959620926594e-06, 'epoch': 0.65}
 65%|██████▍ | 14290/22095 [24:25:05<8:02:06, 3.71s/it] {'loss': 0.2949, 'grad_norm': 0.6397116725595818, 'learning_rate': 2.9312923515198577e-06, 'epoch': 0.65}
 65%|██████▍ | 14291/22095 [24:25:09<8:05:31, 3.73s/it] {'loss': 0.2905, 'grad_norm': 0.8665629168087222, 'learning_rate': 2.9306251265632932e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (74379 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78669 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46981 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45433 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43344 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14292/22095 [24:25:11<7:30:32, 3.46s/it] {'loss': 0.3104, 'grad_norm': 0.6764450931946282, 'learning_rate': 2.929957946071239e-06, 'epoch': 0.65}
 65%|██████▍ | 14293/22095 [24:25:15<7:22:05, 3.40s/it] {'loss': 0.2939, 'grad_norm': 0.5662457813127475, 'learning_rate': 2.929290810058032e-06, 'epoch': 0.65}
 65%|██████▍ | 14294/22095 [24:25:18<7:34:56, 3.50s/it] {'loss': 0.3027, 'grad_norm': 0.6583252374043029, 'learning_rate': 2.928623718538006e-06, 'epoch': 0.65}
 65%|██████▍ | 14295/22095 [24:25:21<7:08:54, 3.30s/it] {'loss': 0.3758, 'grad_norm': 0.6466550381257198, 'learning_rate': 2.9279566715254944e-06, 'epoch': 0.65}
 65%|██████▍ | 14296/22095 [24:25:24<6:52:59, 3.18s/it] {'loss': 0.3053, 'grad_norm': 0.7036033115749474, 'learning_rate': 2.9272896690348283e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 65%|██████▍ | 14297/22095 [24:25:34<11:11:36, 5.17s/it] {'loss': 0.4613, 'grad_norm': 0.3131935441913814, 'learning_rate': 2.926622711080345e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (66870 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122410 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14298/22095 [24:25:37<9:51:41, 4.55s/it] {'loss': 0.2803, 'grad_norm': 0.6368356188862032, 'learning_rate': 2.9259557976763686e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (81204 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123898 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41606 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104317 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14299/22095 [24:25:41<9:18:39, 4.30s/it] {'loss': 0.3066, 'grad_norm': 0.629431616289445, 'learning_rate': 2.9252889288372335e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (52020 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48313 > 40960).
Running this sequence through the model will result in indexing errors 65%|██████▍ | 14300/22095 [24:25:51<12:59:00, 6.00s/it] {'loss': 0.444, 'grad_norm': 0.3006838383038752, 'learning_rate': 2.9246221045772683e-06, 'epoch': 0.65} 65%|██████▍ | 14300/22095 [24:25:51<12:59:00, 6.00s/it] 65%|██████▍ | 14301/22095 [24:25:54<11:33:37, 5.34s/it] {'loss': 0.3333, 'grad_norm': 1.038397028608156, 'learning_rate': 2.9239553249107985e-06, 'epoch': 0.65} 65%|██████▍ | 14301/22095 [24:25:54<11:33:37, 5.34s/it] 65%|██████▍ | 14302/22095 [24:25:58<10:35:42, 4.89s/it] {'loss': 0.3064, 'grad_norm': 0.613277700934174, 'learning_rate': 2.9232885898521516e-06, 'epoch': 0.65} 65%|██████▍ | 14302/22095 [24:25:58<10:35:42, 4.89s/it] 65%|██████▍ | 14303/22095 [24:26:02<9:33:58, 4.42s/it] {'loss': 0.3362, 'grad_norm': 0.6310216092316894, 'learning_rate': 2.9226218994156574e-06, 'epoch': 0.65} 65%|██████▍ | 14303/22095 [24:26:02<9:33:58, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43310 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43870 > 40960). 
Running this sequence through the model will result in indexing errors 65%|██████▍ | 14304/22095 [24:26:06<9:32:55, 4.41s/it] {'loss': 0.2963, 'grad_norm': 0.6578581401449011, 'learning_rate': 2.921955253615637e-06, 'epoch': 0.65} 65%|██████▍ | 14304/22095 [24:26:06<9:32:55, 4.41s/it] 65%|██████▍ | 14305/22095 [24:26:10<9:03:26, 4.19s/it] {'loss': 0.3143, 'grad_norm': 0.6242737955561755, 'learning_rate': 2.9212886524664164e-06, 'epoch': 0.65} 65%|██████▍ | 14305/22095 [24:26:10<9:03:26, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 65%|██████▍ | 14306/22095 [24:26:20<13:18:39, 6.15s/it] {'loss': 0.4529, 'grad_norm': 0.28960963861197336, 'learning_rate': 2.9206220959823183e-06, 'epoch': 0.65} 65%|██████▍ | 14306/22095 [24:26:20<13:18:39, 6.15s/it] 65%|██████▍ | 14307/22095 [24:26:29<14:51:18, 6.87s/it] {'loss': 0.4931, 'grad_norm': 0.3077247899182825, 'learning_rate': 2.9199555841776637e-06, 'epoch': 0.65} 65%|██████▍ | 14307/22095 [24:26:29<14:51:18, 6.87s/it] 65%|██████▍ | 14308/22095 [24:26:36<15:17:11, 7.07s/it] {'loss': 0.4699, 'grad_norm': 0.2732012975551414, 'learning_rate': 2.919289117066777e-06, 'epoch': 0.65} 65%|██████▍ | 14308/22095 [24:26:36<15:17:11, 7.07s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 65%|██████▍ | 14309/22095 [24:26:41<13:52:34, 6.42s/it] {'loss': 0.3195, 'grad_norm': 0.6653553447247057, 'learning_rate': 2.918622694663975e-06, 'epoch': 0.65} 65%|██████▍ | 14309/22095 [24:26:41<13:52:34, 6.42s/it] 65%|██████▍ | 14310/22095 [24:26:45<11:58:08, 5.53s/it] {'loss': 0.3181, 'grad_norm': 0.6963381989202885, 'learning_rate': 2.9179563169835808e-06, 'epoch': 0.65} 65%|██████▍ | 14310/22095 [24:26:45<11:58:08, 5.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53983 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60621 > 40960). Running this sequence through the model will result in indexing errors 65%|██████▍ | 14311/22095 [24:26:48<10:25:55, 4.82s/it] {'loss': 0.3637, 'grad_norm': 0.642825853602677, 'learning_rate': 2.9172899840399106e-06, 'epoch': 0.65} 65%|██████▍ | 14311/22095 [24:26:48<10:25:55, 4.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 65%|██████▍ | 14312/22095 [24:26:57<13:25:01, 6.21s/it] {'loss': 0.4687, 'grad_norm': 0.2854289444108715, 'learning_rate': 2.9166236958472805e-06, 'epoch': 0.65} 65%|██████▍ | 14312/22095 [24:26:57<13:25:01, 6.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45404 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64121 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (126016 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73031 > 40960). 
Running this sequence through the model will result in indexing errors
 65%|██████▍ | 14313/22095 [24:27:01<11:28:59, 5.31s/it] {'loss': 0.2866, 'grad_norm': 0.7019237006351507, 'learning_rate': 2.9159574524200105e-06, 'epoch': 0.65}
 65%|██████▍ | 14313/22095 [24:27:01<11:28:59, 5.31s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8588329 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7065, 'image': '446396001.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this an exam preparation book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a pharmaceutical book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 65%|██████▍ | 14314/22095 [24:27:04<10:11:03, 4.71s/it] {'loss': 0.3105, 'grad_norm': 0.6214376307211359, 'learning_rate': 2.915291253772412e-06, 'epoch': 0.65} 65%|██████▍ | 14314/22095 [24:27:04<10:11:03, 4.71s/it] 65%|██████▍ | 14315/22095 [24:27:07<9:10:41, 4.25s/it] {'loss': 0.3209, 'grad_norm': 0.6254524080494197, 'learning_rate': 2.9146250999188043e-06, 'epoch': 0.65} 65%|██████▍ | 14315/22095 [24:27:07<9:10:41, 4.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▍ | 14316/22095 [24:27:10<8:12:16, 3.80s/it] {'loss': 0.3289, 'grad_norm': 0.57960723110441, 'learning_rate': 2.9139589908734977e-06, 'epoch': 0.65} 65%|██████▍ | 14316/22095 [24:27:10<8:12:16, 3.80s/it] 65%|██████▍ | 14317/22095 [24:27:13<7:35:11, 3.51s/it] {'loss': 0.2834, 'grad_norm': 0.8486329736280915, 'learning_rate': 2.9132929266508043e-06, 'epoch': 0.65} 65%|██████▍ | 14317/22095 [24:27:13<7:35:11, 3.51s/it] 65%|██████▍ | 14318/22095 [24:27:16<7:14:09, 3.35s/it] {'loss': 0.3208, 'grad_norm': 0.5990172383692677, 'learning_rate': 2.912626907265037e-06, 'epoch': 0.65} 65%|██████▍ | 14318/22095 [24:27:16<7:14:09, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045997 in VC:s3://multi-modal/UniGeo/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. 
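Note on the recurring `ValueError: Image size … is too small. Minimum size is 28.` failures above: each problematic-sample dump carries an `image_wh` field, so these samples could be screened out before `__getitem__` ever raises. A minimal sketch, assuming samples use the same `image_wh` layout shown in the dumps; the function names and `MIN_SIDE` constant are illustrative, not part of the actual `data_qwen_2.py` API:

```python
# Hedged sketch: reject samples whose recorded image width or height is below
# the 28-pixel minimum quoted by the ValueError in this log. `image_wh` mirrors
# the problematic-sample dumps; names here are illustrative assumptions.
MIN_SIDE = 28

def has_valid_images(sample, min_side=MIN_SIDE):
    """True if every (width, height) pair in the sample meets the minimum."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def prefilter(samples, min_side=MIN_SIDE):
    """Split a dataset into usable samples and ones that would raise."""
    good = [s for s in samples if has_valid_images(s, min_side)]
    bad = [s for s in samples if not has_valid_images(s, min_side)]
    return good, bad
```

With this filter, a sample like the `'image_wh': [[190, 22]]` one above would be routed to `bad` at dataset-build time instead of triggering a retry inside the dataloader.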
Problematic sample: {'image': 'calculation_images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]} 65%|██████▍ | 14319/22095 [24:27:24<10:13:24, 4.73s/it] {'loss': 0.4633, 'grad_norm': 0.31174741569254, 'learning_rate': 2.91196093273051e-06, 'epoch': 0.65} 65%|██████▍ | 14319/22095 [24:27:24<10:13:24, 4.73s/it] 65%|██████▍ | 14320/22095 [24:27:27<9:17:00, 4.30s/it] {'loss': 0.2711, 'grad_norm': 0.5803472112943121, 'learning_rate': 2.911295003061526e-06, 'epoch': 0.65} 65%|██████▍ | 14320/22095 [24:27:27<9:17:00, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (138026 > 40960). Running this sequence through the model will result in indexing errors 65%|██████▍ | 14321/22095 [24:27:36<12:35:59, 5.83s/it] {'loss': 0.4638, 'grad_norm': 0.2820889514644517, 'learning_rate': 2.910629118272398e-06, 'epoch': 0.65} 65%|██████▍ | 14321/22095 [24:27:36<12:35:59, 5.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50656 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103024 > 40960). 
Running this sequence through the model will result in indexing errors 65%|██████▍ | 14322/22095 [24:27:40<10:58:27, 5.08s/it] {'loss': 0.3118, 'grad_norm': 0.6109816105803477, 'learning_rate': 2.9099632783774325e-06, 'epoch': 0.65} 65%|██████▍ | 14322/22095 [24:27:40<10:58:27, 5.08s/it] 65%|██████▍ | 14323/22095 [24:27:43<9:56:02, 4.60s/it] {'loss': 0.2832, 'grad_norm': 0.6306317742637505, 'learning_rate': 2.909297483390941e-06, 'epoch': 0.65} 65%|██████▍ | 14323/22095 [24:27:43<9:56:02, 4.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398235 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28. 
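The warning "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" recurs throughout this log. A hedged pre-filter sketch for dropping over-long samples before they reach the model; the `tokenize` callable is a stand-in for the real tokenizer's encode step, and nothing here is the actual qwen-vl-finetune pipeline:

```python
# Hedged sketch: keep only samples whose token count fits the 40960-token
# context limit quoted in the warnings above. `tokenize` is an assumed
# stand-in for the tokenizer; the helper name is illustrative.
MAX_LEN = 40960

def drop_overlong(samples, tokenize, max_len=MAX_LEN):
    """Return (kept, dropped_count) after filtering by tokenized length."""
    kept, dropped = [], 0
    for s in samples:
        if len(tokenize(s)) <= max_len:
            kept.append(s)
        else:
            dropped += 1
    return kept, dropped
```

Filtering up front avoids the "indexing errors" the warning threatens, at the cost of discarding some long samples; truncation would be the alternative trade-off.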
Problematic sample: {'id': 386, 'image': 'vrdu_table_final_2/astro-ph.CO/5f3bc9e2-a6ee-4526-8939-7664cbd0b5fa.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]} 65%|██████▍ | 14324/22095 [24:27:47<9:20:09, 4.32s/it] {'loss': 0.3358, 'grad_norm': 0.6263910342435784, 'learning_rate': 2.9086317333272218e-06, 'epoch': 0.65} 65%|██████▍ | 14324/22095 [24:27:47<9:20:09, 4.32s/it] 65%|██████▍ | 14325/22095 [24:27:50<8:21:46, 3.87s/it] {'loss': 0.2469, 'grad_norm': 0.9421587567712512, 'learning_rate': 2.9079660282005833e-06, 'epoch': 0.65} 65%|██████▍ | 14325/22095 [24:27:50<8:21:46, 3.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8353032 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 19717, 'image': 'vrdu_table_final_2/astro-ph.CO/18e5f889-651d-449f-9685-2dc8c89ca09b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{l} #1 \\end{tabular}\n```"}]} 65%|██████▍ | 14326/22095 [24:27:53<7:59:13, 3.70s/it] {'loss': 0.2902, 'grad_norm': 0.6924369118078506, 'learning_rate': 2.907300368025332e-06, 'epoch': 0.65} 65%|██████▍ | 14326/22095 [24:27:53<7:59:13, 3.70s/it] 65%|██████▍ | 14327/22095 [24:27:56<7:50:48, 3.64s/it] {'loss': 0.3156, 'grad_norm': 0.6439189029583597, 'learning_rate': 2.906634752815768e-06, 'epoch': 0.65} 65%|██████▍ | 14327/22095 [24:27:56<7:50:48, 3.64s/it] 65%|██████▍ | 14328/22095 [24:27:59<7:23:22, 3.43s/it] {'loss': 0.294, 'grad_norm': 0.6731846534259659, 'learning_rate': 2.9059691825861926e-06, 'epoch': 0.65} 65%|██████▍ | 14328/22095 [24:27:59<7:23:22, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 65%|██████▍ | 14329/22095 [24:28:09<11:35:04, 5.37s/it] {'loss': 0.4662, 'grad_norm': 0.3488924832775132, 'learning_rate': 2.9053036573509096e-06, 'epoch': 0.65} 65%|██████▍ | 14329/22095 [24:28:09<11:35:04, 5.37s/it] 65%|██████▍ | 14330/22095 [24:28:19<14:12:30, 6.59s/it] {'loss': 0.4322, 'grad_norm': 0.3336693302384131, 'learning_rate': 2.904638177124216e-06, 'epoch': 0.65} 65%|██████▍ | 14330/22095 [24:28:19<14:12:30, 6.59s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (42130 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (139343 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94749 > 40960). Running this sequence through the model will result in indexing errors 65%|██████▍ | 14331/22095 [24:28:23<12:28:12, 5.78s/it] {'loss': 0.2927, 'grad_norm': 0.6285069632218658, 'learning_rate': 2.9039727419204146e-06, 'epoch': 0.65} 65%|██████▍ | 14331/22095 [24:28:23<12:28:12, 5.78s/it] 65%|██████▍ | 14332/22095 [24:28:26<10:46:09, 4.99s/it] {'loss': 0.2927, 'grad_norm': 0.6141502723029155, 'learning_rate': 2.9033073517538008e-06, 'epoch': 0.65} 65%|██████▍ | 14332/22095 [24:28:26<10:46:09, 4.99s/it] 65%|██████▍ | 14333/22095 [24:28:29<9:24:24, 4.36s/it] {'loss': 0.3329, 'grad_norm': 0.6560939059690654, 'learning_rate': 2.9026420066386705e-06, 'epoch': 0.65} 65%|██████▍ | 14333/22095 [24:28:29<9:24:24, 4.36s/it] 65%|██████▍ | 14334/22095 [24:28:32<8:32:42, 3.96s/it] {'loss': 0.2866, 'grad_norm': 0.6558394371129095, 'learning_rate': 2.9019767065893227e-06, 'epoch': 0.65} 65%|██████▍ | 14334/22095 [24:28:32<8:32:42, 3.96s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38248.png 2025-08-28 16:26:28.810942 load time: 1090.48 ms 65%|██████▍ | 14335/22095 [24:28:36<8:28:48, 3.93s/it] {'loss': 0.2918, 'grad_norm': 0.6485885431882231, 'learning_rate': 2.9013114516200537e-06, 'epoch': 0.65} 65%|██████▍ | 14335/22095 [24:28:36<8:28:48, 3.93s/it] 65%|██████▍ | 14336/22095 [24:28:39<8:03:23, 3.74s/it] {'loss': 0.3494, 'grad_norm': 0.6142572990614329, 'learning_rate': 2.900646241745156e-06, 'epoch': 0.65} 65%|██████▍ | 14336/22095 [24:28:39<8:03:23, 3.74s/it] 65%|██████▍ | 14337/22095 [24:28:43<8:05:42, 3.76s/it] {'loss': 0.3, 'grad_norm': 0.6310608157762924, 'learning_rate': 2.8999810769789204e-06, 'epoch': 0.65} 65%|██████▍ | 14337/22095 [24:28:43<8:05:42, 3.76s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens 
in the conversation 65%|██████▍ | 14338/22095 [24:28:46<7:32:01, 3.50s/it] {'loss': 0.2935, 'grad_norm': 0.6451028508599663, 'learning_rate': 2.899315957335642e-06, 'epoch': 0.65} 65%|██████▍ | 14338/22095 [24:28:46<7:32:01, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 65%|██████▍ | 14339/22095 [24:28:55<11:40:18, 5.42s/it] {'loss': 0.4867, 'grad_norm': 0.38797774658659645, 'learning_rate': 2.8986508828296144e-06, 'epoch': 0.65} 65%|██████▍ | 14339/22095 [24:28:55<11:40:18, 5.42s/it] 65%|██████▍ | 14340/22095 [24:28:59<10:18:46, 4.79s/it] {'loss': 0.2877, 'grad_norm': 0.7113345408661017, 'learning_rate': 2.897985853475125e-06, 'epoch': 0.65} 65%|██████▍ | 14340/22095 [24:28:59<10:18:46, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87611 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55230 > 40960). 
Running this sequence through the model will result in indexing errors 65%|██████▍ | 14341/22095 [24:29:03<9:40:23, 4.49s/it] {'loss': 0.2924, 'grad_norm': 0.5809836155986944, 'learning_rate': 2.8973208692864623e-06, 'epoch': 0.65} 65%|██████▍ | 14341/22095 [24:29:03<9:40:23, 4.49s/it] 65%|██████▍ | 14342/22095 [24:29:06<8:48:45, 4.09s/it] {'loss': 0.3202, 'grad_norm': 0.8393594686255713, 'learning_rate': 2.896655930277918e-06, 'epoch': 0.65} 65%|██████▍ | 14342/22095 [24:29:06<8:48:45, 4.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▍ | 14343/22095 [24:29:09<8:22:18, 3.89s/it] {'loss': 0.2833, 'grad_norm': 0.6353389822032426, 'learning_rate': 2.8959910364637755e-06, 'epoch': 0.65} 65%|██████▍ | 14343/22095 [24:29:09<8:22:18, 3.89s/it] 65%|██████▍ | 14344/22095 [24:29:13<8:05:01, 3.75s/it] {'loss': 0.3251, 'grad_norm': 0.6497489518809756, 'learning_rate': 2.8953261878583263e-06, 'epoch': 0.65} 65%|██████▍ | 14344/22095 [24:29:13<8:05:01, 3.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957193 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8028, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 10\nB. 8\nC. 7\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 65%|██████▍ | 14345/22095 [24:29:16<7:38:20, 3.55s/it] {'loss': 0.2892, 'grad_norm': 0.6217565026702535, 'learning_rate': 2.8946613844758526e-06, 'epoch': 0.65} 65%|██████▍ | 14345/22095 [24:29:16<7:38:20, 3.55s/it] 65%|██████▍ | 14346/22095 [24:29:18<7:04:24, 3.29s/it] {'loss': 0.3525, 'grad_norm': 0.682100611004778, 'learning_rate': 2.893996626330638e-06, 'epoch': 0.65} 65%|██████▍ | 14346/22095 [24:29:18<7:04:24, 3.29s/it] 65%|██████▍ | 14347/22095 [24:29:21<6:47:57, 3.16s/it] {'loss': 0.3273, 'grad_norm': 0.6481556737222688, 'learning_rate': 2.8933319134369677e-06, 'epoch': 0.65} 65%|██████▍ | 14347/22095 [24:29:21<6:47:57, 3.16s/it] 65%|██████▍ | 14348/22095 [24:29:25<7:04:34, 3.29s/it] {'loss': 0.3111, 'grad_norm': 0.5924432907301626, 'learning_rate': 2.8926672458091265e-06, 'epoch': 0.65} 65%|██████▍ | 14348/22095 [24:29:25<7:04:34, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50605 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98740 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52581 > 40960). 
Running this sequence through the model will result in indexing errors 65%|██████▍ | 14349/22095 [24:29:34<10:46:53, 5.01s/it] {'loss': 0.4781, 'grad_norm': 0.35012226221446846, 'learning_rate': 2.892002623461394e-06, 'epoch': 0.65} 65%|██████▍ | 14349/22095 [24:29:34<10:46:53, 5.01s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8374923 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 41699, 'image': 'vrdu_table_final_2/astro-ph.CO/5bb80af0-ebe1-4e64-b29b-1aae78516b3f.png', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}} NoIS\n \\end{tabular}\n```"}]} 65%|██████▍ | 14350/22095 [24:29:37<9:38:41, 4.48s/it] {'loss': 0.311, 'grad_norm': 0.7342121078319623, 'learning_rate': 2.8913380464080487e-06, 'epoch': 0.65} 65%|██████▍ | 14350/22095 [24:29:37<9:38:41, 4.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] 
ValueError: Image size [139, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8523115 in VC:s3://internvl-moe-sft-data/. Exception: Image size [139, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 91408, 'image': 'vrdu_texteq/astro-ph.CO/96f4ca8e-1ca1-40a4-9a64-0acc3e6db2af.png', 'image_wh': [[139, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'at $68\\%$ CL.'}]} 65%|██████▍ | 14351/22095 [24:29:41<9:07:28, 4.24s/it] {'loss': 0.3, 'grad_norm': 0.6000927905765256, 'learning_rate': 2.890673514663373e-06, 'epoch': 0.65} 65%|██████▍ | 14351/22095 [24:29:41<9:07:28, 4.24s/it] 65%|██████▍ | 14352/22095 [24:29:44<8:33:06, 3.98s/it] {'loss': 0.2875, 'grad_norm': 1.7539158633690095, 'learning_rate': 2.890009028241647e-06, 'epoch': 0.65} 65%|██████▍ | 14352/22095 [24:29:44<8:33:06, 3.98s/it] 65%|██████▍ | 14353/22095 [24:29:47<8:04:10, 3.75s/it] {'loss': 0.3334, 'grad_norm': 0.6715777521885723, 'learning_rate': 2.8893445871571463e-06, 'epoch': 0.65} 65%|██████▍ | 14353/22095 [24:29:47<8:04:10, 3.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▍ | 14354/22095 [24:29:50<7:32:14, 3.51s/it] {'loss': 0.2926, 'grad_norm': 0.6676157791442822, 'learning_rate': 2.8886801914241465e-06, 'epoch': 0.65} 65%|██████▍ | 14354/22095 [24:29:50<7:32:14, 3.51s/it] 65%|██████▍ | 14355/22095 [24:29:54<7:23:59, 3.44s/it] {'loss': 0.3457, 'grad_norm': 0.6263183832848004, 'learning_rate': 2.8880158410569264e-06, 'epoch': 0.65} 65%|██████▍ | 14355/22095 [24:29:54<7:23:59, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50768 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48523 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▍ | 14356/22095 [24:29:57<7:07:03, 3.31s/it] {'loss': 0.2787, 'grad_norm': 0.675025958023004, 'learning_rate': 2.88735153606976e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▍ | 14357/22095 [24:30:03<8:55:04, 4.15s/it] {'loss': 0.4732, 'grad_norm': 0.30587303186604464, 'learning_rate': 2.8866872764769183e-06, 'epoch': 0.65}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
65%|██████▍ | 14358/22095 [24:30:06<8:17:29, 3.86s/it] {'loss': 0.3298, 'grad_norm': 0.6729939352178069, 'learning_rate': 2.8860230622926787e-06, 'epoch': 0.65}
65%|██████▍ | 14359/22095 [24:30:09<7:56:05, 3.69s/it] {'loss': 0.3025, 'grad_norm': 0.596371834667015, 'learning_rate': 2.885358893531308e-06, 'epoch': 0.65}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
65%|██████▍ | 14360/22095 [24:30:13<8:16:12, 3.85s/it] {'loss': 0.33, 'grad_norm': 0.8082889237237667, 'learning_rate': 2.884694770207083e-06, 'epoch': 0.65}
65%|██████▍ | 14361/22095 [24:30:17<8:21:14, 3.89s/it] {'loss': 0.3092, 'grad_norm': 0.5788032355268777, 'learning_rate': 2.8840306923342707e-06, 'epoch': 0.65}
65%|██████▌ | 14362/22095 [24:30:20<7:37:34, 3.55s/it] {'loss': 0.3058, 'grad_norm': 0.599923225178742, 'learning_rate': 2.883366659927138e-06, 'epoch': 0.65}
65%|██████▌ | 14363/22095 [24:30:23<7:21:35, 3.43s/it] {'loss': 0.3054, 'grad_norm': 0.6499818118745128, 'learning_rate': 2.8827026729999553e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14364/22095 [24:30:30<9:33:34, 4.45s/it] {'loss': 0.4747, 'grad_norm': 0.2773497872558239, 'learning_rate': 2.882038731566991e-06, 'epoch': 0.65}
65%|██████▌ | 14365/22095 [24:30:34<8:53:50, 4.14s/it] {'loss': 0.2914, 'grad_norm': 0.6571747074412797, 'learning_rate': 2.881374835642509e-06, 'epoch': 0.65}
65%|██████▌ | 14366/22095 [24:30:37<8:11:55, 3.82s/it] {'loss': 0.3449, 'grad_norm': 0.7022076125813193, 'learning_rate': 2.880710985240774e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908010 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31163, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nA. 6\nB. 2\nC. 8\nD. 4'}, {'from': 'gpt', 'value': '【解答】解:∵BC=AB-AC=4,点D是线段BC的中点,∴CD=DB=\\frac{1}{2}BC=2,∴AD=AC+CD=6+2=8;'}]}
65%|██████▌ | 14367/22095 [24:30:40<7:59:37, 3.72s/it] {'loss': 0.3139, 'grad_norm': 0.6016931902558902, 'learning_rate': 2.8800471803760504e-06, 'epoch': 0.65}
65%|██████▌ | 14368/22095 [24:30:44<8:00:52, 3.73s/it] {'loss': 0.2877, 'grad_norm': 0.5979079268440192, 'learning_rate': 2.8793834210626036e-06, 'epoch': 0.65}
65%|██████▌ | 14369/22095 [24:30:47<7:44:37, 3.61s/it] {'loss': 0.312, 'grad_norm': 0.5774696254930354, 'learning_rate': 2.878719707314695e-06, 'epoch': 0.65}
65%|██████▌ | 14370/22095 [24:30:51<7:43:23, 3.60s/it] {'loss': 0.2982, 'grad_norm': 0.6664975643935593, 'learning_rate': 2.8780560391465828e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [317, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8510744 in VC:s3://internvl-moe-sft-data/. Exception: Image size [317, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 124231, 'image': 'vrdu_texteq/astro-ph.CO/621662e8-4c46-4955-9f9f-0554429f048c.png', 'image_wh': [[317, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'and the final value of $Y$ is'}]}
65%|██████▌ | 14371/22095 [24:30:54<7:12:33, 3.36s/it] {'loss': 0.338, 'grad_norm': 0.6311890521309541, 'learning_rate': 2.877392416572531e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (51143 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58630 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14372/22095 [24:30:57<7:06:00, 3.31s/it] {'loss': 0.2884, 'grad_norm': 0.6102114402960955, 'learning_rate': 2.876728839606795e-06, 'epoch': 0.65}
65%|██████▌ | 14373/22095 [24:31:00<6:58:36, 3.25s/it] {'loss': 0.2693, 'grad_norm': 0.6672385468388162, 'learning_rate': 2.876065308263637e-06, 'epoch': 0.65}
65%|██████▌ | 14374/22095 [24:31:03<6:40:48, 3.11s/it] {'loss': 0.2921, 'grad_norm': 0.6171736644178366, 'learning_rate': 2.875401822557312e-06, 'epoch': 0.65}
65%|██████▌ | 14375/22095 [24:31:06<6:43:45, 3.14s/it] {'loss': 0.3198, 'grad_norm': 0.6134210902409978, 'learning_rate': 2.8747383825020753e-06, 'epoch': 0.65}
65%|██████▌ | 14376/22095 [24:31:10<7:13:58, 3.37s/it] {'loss': 0.3354, 'grad_norm': 0.6485596815035234, 'learning_rate': 2.874074988112183e-06, 'epoch': 0.65}
65%|██████▌ | 14377/22095 [24:31:13<7:10:06, 3.34s/it] {'loss': 0.2816, 'grad_norm': 0.6538269871511858, 'learning_rate': 2.873411639401893e-06, 'epoch': 0.65}
65%|██████▌ | 14378/22095 [24:31:16<6:54:48, 3.23s/it] {'loss': 0.3235, 'grad_norm': 0.6694802347127378, 'learning_rate': 2.8727483363854547e-06, 'epoch': 0.65}
65%|██████▌ | 14379/22095 [24:31:20<7:11:54, 3.36s/it] {'loss': 0.3212, 'grad_norm': 0.619858970538202, 'learning_rate': 2.872085079077119e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
65%|██████▌ | 14380/22095 [24:31:29<11:09:36, 5.21s/it] {'loss': 0.4462, 'grad_norm': 0.2800973148325299, 'learning_rate': 2.8714218674911397e-06, 'epoch': 0.65}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
65%|██████▌ | 14381/22095 [24:31:33<10:03:30, 4.69s/it] {'loss': 0.3097, 'grad_norm': 0.61955442849468, 'learning_rate': 2.8707587016417695e-06, 'epoch': 0.65}
65%|██████▌ | 14382/22095 [24:31:36<9:09:51, 4.28s/it] {'loss': 0.3206, 'grad_norm': 0.5888297171014283, 'learning_rate': 2.870095581543255e-06, 'epoch': 0.65}
65%|██████▌ | 14383/22095 [24:31:40<8:46:28, 4.10s/it] {'loss': 0.3221, 'grad_norm': 0.5962007118260905, 'learning_rate': 2.8694325072098434e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (70556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55838 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14384/22095 [24:31:43<8:21:56, 3.91s/it] {'loss': 0.2852, 'grad_norm': 0.5353932920407563, 'learning_rate': 2.868769478655785e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (57095 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87774 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14385/22095 [24:31:46<7:52:34, 3.68s/it] {'loss': 0.2848, 'grad_norm': 0.6399218831839524, 'learning_rate': 2.868106495895323e-06, 'epoch': 0.65}
65%|██████▌ | 14386/22095 [24:31:50<8:06:20, 3.79s/it] {'loss': 0.312, 'grad_norm': 0.6031502416346952, 'learning_rate': 2.8674435589427075e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (74559 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14387/22095 [24:31:53<7:27:37, 3.48s/it] {'loss': 0.3196, 'grad_norm': 0.6202017345451638, 'learning_rate': 2.86678066781218e-06, 'epoch': 0.65}
65%|██████▌ | 14388/22095 [24:31:57<7:32:48, 3.53s/it] {'loss': 0.2961, 'grad_norm': 0.6092919726405652, 'learning_rate': 2.866117822517982e-06, 'epoch': 0.65}
65%|██████▌ | 14389/22095 [24:32:00<7:27:54, 3.49s/it] {'loss': 0.2981, 'grad_norm': 0.7608931850326427, 'learning_rate': 2.8654550230743605e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (55973 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79086 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59296 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14390/22095 [24:32:03<7:09:34, 3.35s/it] {'loss': 0.3183, 'grad_norm': 0.7392695622420434, 'learning_rate': 2.8647922694955544e-06, 'epoch': 0.65}
65%|██████▌ | 14391/22095 [24:32:07<7:37:49, 3.57s/it] {'loss': 0.322, 'grad_norm': 0.704571196624388, 'learning_rate': 2.8641295617958033e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14392/22095 [24:32:15<10:39:34, 4.98s/it] {'loss': 0.4592, 'grad_norm': 0.282888891059829, 'learning_rate': 2.8634668999893477e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (45774 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42940 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14393/22095 [24:32:25<13:43:33, 6.42s/it] {'loss': 0.4833, 'grad_norm': 0.2980961398662729, 'learning_rate': 2.862804284090428e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 364, but got module 1
65%|██████▌ | 14394/22095 [24:32:29<12:10:48, 5.69s/it] {'loss': 0.2558, 'grad_norm': 1.271083687124575, 'learning_rate': 2.8621417141132813e-06, 'epoch': 0.65}
65%|██████▌ | 14395/22095 [24:32:39<14:29:04, 6.77s/it] {'loss': 0.4753, 'grad_norm': 0.43418125969404187, 'learning_rate': 2.8614791900721407e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 364, but got module 1
65%|██████▌ | 14396/22095 [24:32:42<12:12:54, 5.71s/it] {'loss': 0.3046, 'grad_norm': 0.5849742285716373, 'learning_rate': 2.860816711981245e-06, 'epoch': 0.65}
65%|██████▌ | 14397/22095 [24:32:45<10:35:54, 4.96s/it] {'loss': 0.2849, 'grad_norm': 0.6102146219858934, 'learning_rate': 2.8601542798548295e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54218 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14398/22095 [24:32:54<13:32:41, 6.34s/it] {'loss': 0.4796, 'grad_norm': 0.2841653912168513, 'learning_rate': 2.8594918937071264e-06, 'epoch': 0.65}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
65%|██████▌ | 14399/22095 [24:32:58<11:53:22, 5.56s/it] {'loss': 0.3134, 'grad_norm': 0.6324213646858495, 'learning_rate': 2.8588295535523667e-06, 'epoch': 0.65}
65%|██████▌ | 14400/22095 [24:33:02<10:57:59, 5.13s/it] {'loss': 0.3675, 'grad_norm': 0.6129634366352824, 'learning_rate': 2.858167259404786e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14401/22095 [24:33:10<12:39:58, 5.93s/it] {'loss': 0.4834, 'grad_norm': 0.28533615643166216, 'learning_rate': 2.85750501127861e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387775 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54587, 'image': 'vrdu_table_final_2/astro-ph.CO/4580e3fb-a16e-4cfc-930e-0736f0409209.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
65%|██████▌ | 14402/22095 [24:33:14<11:07:54, 5.21s/it] {'loss': 0.3116, 'grad_norm': 0.699339220550563, 'learning_rate': 2.856842809188074e-06, 'epoch': 0.65}
65%|██████▌ | 14403/22095 [24:33:17<9:52:23, 4.62s/it] {'loss': 0.2572, 'grad_norm': 0.6195956920362005, 'learning_rate': 2.8561806531474035e-06, 'epoch': 0.65}
65%|██████▌ | 14404/22095 [24:33:20<8:49:09, 4.13s/it] {'loss': 0.2432, 'grad_norm': 0.5413467945492126, 'learning_rate': 2.855518543170824e-06, 'epoch': 0.65}
65%|██████▌ | 14405/22095 [24:33:23<8:19:20, 3.90s/it] {'loss': 0.3013, 'grad_norm': 0.600232623190571, 'learning_rate': 2.8548564792725652e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (73073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68775 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14406/22095 [24:33:33<11:57:34, 5.60s/it] {'loss': 0.4688, 'grad_norm': 0.2723010923448121, 'learning_rate': 2.8541944614668548e-06, 'epoch': 0.65}
65%|██████▌ | 14407/22095 [24:33:37<10:51:20, 5.08s/it] {'loss': 0.3261, 'grad_norm': 0.6251866065262676, 'learning_rate': 2.8535324897679153e-06, 'epoch': 0.65}
65%|██████▌ | 14408/22095 [24:33:41<10:01:09, 4.69s/it] {'loss': 0.3214, 'grad_norm': 0.6008352127614647, 'learning_rate': 2.852870564189967e-06, 'epoch': 0.65}
65%|██████▌ | 14409/22095 [24:33:44<9:00:35, 4.22s/it] {'loss': 0.2951, 'grad_norm': 0.7247530449061969, 'learning_rate': 2.8522086847472365e-06, 'epoch': 0.65}
65%|██████▌ | 14410/22095 [24:33:47<8:10:48, 3.83s/it] {'loss': 0.2495, 'grad_norm': 0.5944063153735807, 'learning_rate': 2.851546851453947e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (68782 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14411/22095 [24:33:49<7:28:37, 3.50s/it] {'loss': 0.2665, 'grad_norm': 0.6242000460070781, 'learning_rate': 2.8508850643243168e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8353598 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20283, 'image': 'vrdu_table_final_2/astro-ph.CO/c1f06b84-4eee-409a-a1cd-007a64884c4c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nAdapt the table from the image into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll adapt the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
65%|██████▌ | 14412/22095 [24:33:52<7:02:30, 3.30s/it] {'loss': 0.3021, 'grad_norm': 0.6220752224649899, 'learning_rate': 2.8502233233725647e-06, 'epoch': 0.65}
65%|██████▌ | 14413/22095 [24:33:56<7:07:49, 3.34s/it] {'loss': 0.2813, 'grad_norm': 0.6201982821878654, 'learning_rate': 2.8495616286129125e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14414/22095 [24:34:05<11:00:59, 5.16s/it] {'loss': 0.4717, 'grad_norm': 0.2602797614505162, 'learning_rate': 2.848899980059574e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11303662 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 480, 'image': 'airplane_app/235.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这是什么'}, {'from': 'gpt', 'value': '这是一份中文版的产品对比和宣传背景报告,主要介绍了中国民航信息网络股份有限公司推出的移动服务产品“航旅纵横”的相关信息和同类产品对比。这页幻灯片的具体内容如下:\n\n1. **传递背景**:\n - 通过这份幻灯片的内容,可以了解一些关于“航旅纵横”的基本情况。\n\n2. **友商对比·产品定位**:\n - **航旅纵横**:\n - “航旅纵横”是中国民航信息网络股份有限公司在2012年推出的首款基于出行的移动服务产品,目标用户是坐飞机的旅客。\n - **航班管家**:\n - 航班管家是一款受欢迎的旅行类手机应用,其功能是提供完美行程和伴你同行的服务。\n - **飞常准**:\n - 飞常准是一款覆盖全球的实时动态和数据的手机应用。\n\n3. **小结**:\n - 提到“航旅纵横”的品牌定位和标签化的重要性:\n\n4. **图示**:\n - 幻灯片右侧展示了一些应用程序的图标和信息,分别是“航班管家”,“飞常准”和“航旅纵横”。\n\n5. **公司标志**:\n - 幻灯片左下角是“联华世纪传媒”的公司标志,表明这份报告是由该公司制作的。\n\n这份内容的主要目的是对比几款旅行类的移动服务产品,并突出“航旅纵横”的特点和优点。'}]}
65%|██████▌ | 14415/22095 [24:34:08<9:55:16, 4.65s/it] {'loss': 0.3093, 'grad_norm': 0.5564924797920047, 'learning_rate': 2.8482383777267707e-06, 'epoch': 0.65}
65%|██████▌ | 14416/22095 [24:34:12<8:54:43, 4.18s/it] {'loss': 0.2676, 'grad_norm': 0.5661725215802511, 'learning_rate': 2.847576821628716e-06, 'epoch': 0.65}
65%|██████▌ | 14417/22095 [24:34:15<8:16:02, 3.88s/it] {'loss': 0.2867, 'grad_norm': 0.605395525670404, 'learning_rate': 2.8469153117796226e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (52054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47776 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64321 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41751 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14418/22095 [24:34:18<8:10:09, 3.83s/it] {'loss': 0.2992, 'grad_norm': 0.5962137534020132, 'learning_rate': 2.8462538481937067e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14419/22095 [24:34:27<10:54:26, 5.12s/it] {'loss': 0.465, 'grad_norm': 0.27038535192728275, 'learning_rate': 2.8455924308851843e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (81648 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (135789 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100791 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95091 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52207 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14420/22095 [24:34:30<10:01:06, 4.70s/it] {'loss': 0.3161, 'grad_norm': 0.6366746385318536, 'learning_rate': 2.844931059868261e-06, 'epoch': 0.65}
65%|██████▌ | 14421/22095 [24:34:33<8:58:05, 4.21s/it] {'loss': 0.3016, 'grad_norm': 0.6479535834948442, 'learning_rate': 2.8442697351571496e-06, 'epoch': 0.65}
65%|██████▌ | 14422/22095 [24:34:36<8:16:26, 3.88s/it] {'loss': 0.3387, 'grad_norm': 0.7228641491464504, 'learning_rate': 2.8436084567660604e-06, 'epoch': 0.65}
65%|██████▌ | 14423/22095 [24:34:40<7:51:29, 3.69s/it] {'loss': 0.3561, 'grad_norm': 0.6542739679281547, 'learning_rate': 2.8429472247092077e-06, 'epoch': 0.65}
65%|██████▌ | 14424/22095 [24:34:44<7:58:13, 3.74s/it] {'loss': 0.3671, 'grad_norm': 0.7079313320065974, 'learning_rate': 2.8422860390007896e-06, 'epoch': 0.65}
65%|██████▌ | 14425/22095 [24:34:47<7:45:04, 3.64s/it] {'loss': 0.2878, 'grad_norm': 0.6326576281591828, 'learning_rate': 2.8416248996550176e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14426/22095 [24:34:57<12:07:00, 5.69s/it] {'loss': 0.459, 'grad_norm': 0.2668548599523566, 'learning_rate': 2.8409638066860994e-06, 'epoch': 0.65}
65%|██████▌ | 14427/22095 [24:35:02<11:33:03, 5.42s/it] {'loss': 0.3585, 'grad_norm': 0.6751405890400213, 'learning_rate': 2.8403027601082385e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
65%|██████▌ | 14428/22095 [24:35:12<14:38:52, 6.88s/it] {'loss': 0.4647, 'grad_norm': 0.2812480485423244, 'learning_rate': 2.8396417599356363e-06, 'epoch': 0.65}
65%|██████▌ | 14429/22095 [24:35:17<12:55:09, 6.07s/it] {'loss': 0.3163, 'grad_norm': 0.6065274845901797, 'learning_rate': 2.838980806182499e-06, 'epoch': 0.65}
65%|██████▌ | 14430/22095 [24:35:21<11:54:52, 5.60s/it] {'loss': 0.2846, 'grad_norm': 0.6540976748783749, 'learning_rate': 2.8383198988630257e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (85425 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75719 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14431/22095 [24:35:24<10:10:49, 4.78s/it] {'loss': 0.3243, 'grad_norm': 0.5858231834276961, 'learning_rate': 2.83765903799142e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (77074 > 40960). Running this sequence through the model will result in indexing errors
65%|██████▌ | 14432/22095 [24:35:27<8:54:33, 4.19s/it] {'loss': 0.3672, 'grad_norm': 0.6227041477821702, 'learning_rate': 2.8369982235818817e-06, 'epoch': 0.65}
65%|██████▌ | 14433/22095 [24:35:30<7:58:14, 3.75s/it] {'loss': 0.2545, 'grad_norm': 1.1696453604041717, 'learning_rate': 2.836337455648605e-06, 'epoch': 0.65}
65%|██████▌ | 14434/22095 [24:35:33<7:57:36, 3.74s/it] {'loss': 0.3471, 'grad_norm': 0.6102567853373334, 'learning_rate': 2.835676734205792e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365301 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
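The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` lines above are tokenizer warnings only; nothing is truncated by the warning itself. A length gate could flag such samples before they are batched. This is a hedged sketch, not the training code's actual handling: the `tokenize` callable stands in for the real tokenizer, which is not shown in the log.

```python
MAX_LEN = 40960  # model_max_length reported by the warnings in the log

def within_context(text, tokenize, max_len=MAX_LEN):
    """Return (ok, n_tokens); ok is False when the tokenized text
    would overflow the model's context window."""
    n = len(tokenize(text))
    return n <= max_len, n
```

For example, with a toy whitespace tokenizer, `within_context("a b c d", str.split, max_len=3)` returns `(False, 4)`, which a dataloader could use to skip or truncate the sample.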
Problematic sample: {'id': 32042, 'image': 'vrdu_table_final_2/astro-ph.CO/1415c420-f362-4aa5-96f8-8bf657315b21.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}0.05 \\end{tabular}\n```"}]} 65%|██████▌ | 14435/22095 [24:35:37<7:57:20, 3.74s/it] {'loss': 0.3261, 'grad_norm': 0.617072658570023, 'learning_rate': 2.8350160592676407e-06, 'epoch': 0.65} 65%|██████▌ | 14435/22095 [24:35:37<7:57:20, 3.74s/it] 65%|██████▌ | 14436/22095 [24:35:41<7:51:49, 3.70s/it] {'loss': 0.3148, 'grad_norm': 0.640343205295688, 'learning_rate': 2.8343554308483444e-06, 'epoch': 0.65} 65%|██████▌ | 14436/22095 [24:35:41<7:51:49, 3.70s/it] 65%|██████▌ | 14437/22095 [24:35:43<7:12:07, 3.39s/it] {'loss': 0.3159, 'grad_norm': 0.5843377197812158, 'learning_rate': 2.8336948489620973e-06, 'epoch': 0.65} 65%|██████▌ | 14437/22095 [24:35:43<7:12:07, 3.39s/it] 65%|██████▌ | 14438/22095 [24:35:46<6:51:56, 3.23s/it] {'loss': 0.2863, 'grad_norm': 0.6244745426145994, 'learning_rate': 2.833034313623095e-06, 'epoch': 0.65} 65%|██████▌ | 14438/22095 [24:35:46<6:51:56, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 65%|██████▌ | 14439/22095 [24:35:55<10:14:53, 4.82s/it] {'loss': 0.481, 'grad_norm': 0.27814110294909966, 'learning_rate': 2.8323738248455313e-06, 'epoch': 0.65} 65%|██████▌ | 14439/22095 [24:35:55<10:14:53, 4.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▌ | 14440/22095 [24:35:58<9:16:58, 4.37s/it] {'loss': 0.2858, 'grad_norm': 0.6083305305579919, 'learning_rate': 2.8317133826435968e-06, 'epoch': 0.65} 65%|██████▌ | 14440/22095 [24:35:58<9:16:58, 4.37s/it] 65%|██████▌ | 14441/22095 [24:36:01<8:16:56, 3.90s/it] {'loss': 0.3011, 'grad_norm': 
0.6288104167954419, 'learning_rate': 2.8310529870314805e-06, 'epoch': 0.65} 65%|██████▌ | 14441/22095 [24:36:01<8:16:56, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76631 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70217 > 40960). Running this sequence through the model will result in indexing errors 65%|██████▌ | 14442/22095 [24:36:04<8:10:58, 3.85s/it] {'loss': 0.2908, 'grad_norm': 0.6484817765709348, 'learning_rate': 2.830392638023376e-06, 'epoch': 0.65} 65%|██████▌ | 14442/22095 [24:36:05<8:10:58, 3.85s/it] 65%|██████▌ | 14443/22095 [24:36:08<8:12:26, 3.86s/it] {'loss': 0.3, 'grad_norm': 0.6555705986626023, 'learning_rate': 2.8297323356334683e-06, 'epoch': 0.65} 65%|██████▌ | 14443/22095 [24:36:08<8:12:26, 3.86s/it] 65%|██████▌ | 14444/22095 [24:36:12<7:48:54, 3.68s/it] {'loss': 0.3228, 'grad_norm': 0.6426833166344065, 'learning_rate': 2.829072079875949e-06, 'epoch': 0.65} 65%|██████▌ | 14444/22095 [24:36:12<7:48:54, 3.68s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▌ | 14445/22095 [24:36:15<7:34:40, 3.57s/it] {'loss': 0.3043, 'grad_norm': 0.6407112303345188, 'learning_rate': 2.8284118707650033e-06, 'epoch': 0.65} 65%|██████▌ | 14445/22095 [24:36:15<7:34:40, 3.57s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 65%|██████▌ | 14446/22095 [24:36:18<7:06:25, 3.34s/it] {'loss': 0.3129, 'grad_norm': 0.5983310459118458, 'learning_rate': 2.8277517083148155e-06, 'epoch': 0.65} 65%|██████▌ | 14446/22095 [24:36:18<7:06:25, 3.34s/it] 65%|██████▌ | 14447/22095 [24:36:21<7:00:30, 3.30s/it] {'loss': 0.3132, 'grad_norm': 0.6640789563914657, 'learning_rate': 2.8270915925395714e-06, 'epoch': 0.65} 65%|██████▌ | 14447/22095 
[24:36:21<7:00:30, 3.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50241 > 40960). Running this sequence through the model will result in indexing errors 65%|██████▌ | 14448/22095 [24:36:24<7:08:06, 3.36s/it] {'loss': 0.2707, 'grad_norm': 0.6133590414182064, 'learning_rate': 2.8264315234534594e-06, 'epoch': 0.65} 65%|██████▌ | 14448/22095 [24:36:24<7:08:06, 3.36s/it] 65%|██████▌ | 14449/22095 [24:36:28<7:09:41, 3.37s/it] {'loss': 0.3189, 'grad_norm': 0.6340453316708997, 'learning_rate': 2.8257715010706544e-06, 'epoch': 0.65} 65%|██████▌ | 14449/22095 [24:36:28<7:09:41, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (157042 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112760 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101127 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42152 > 40960). 
Running this sequence through the model will result in indexing errors 65%|██████▌ | 14450/22095 [24:36:36<10:20:47, 4.87s/it] {'loss': 0.4638, 'grad_norm': 0.27372905646296936, 'learning_rate': 2.8251115254053426e-06, 'epoch': 0.65} 65%|██████▌ | 14450/22095 [24:36:36<10:20:47, 4.87s/it] 65%|██████▌ | 14451/22095 [24:36:40<9:31:12, 4.48s/it] {'loss': 0.3543, 'grad_norm': 0.5895567568955655, 'learning_rate': 2.824451596471704e-06, 'epoch': 0.65} 65%|██████▌ | 14451/22095 [24:36:40<9:31:12, 4.48s/it] 65%|██████▌ | 14452/22095 [24:36:43<8:29:03, 4.00s/it] {'loss': 0.3237, 'grad_norm': 0.6503694689675438, 'learning_rate': 2.823791714283923e-06, 'epoch': 0.65} 65%|██████▌ | 14452/22095 [24:36:43<8:29:03, 4.00s/it] 65%|██████▌ | 14453/22095 [24:36:46<7:50:37, 3.70s/it] {'loss': 0.3103, 'grad_norm': 0.5918394190940748, 'learning_rate': 2.8231318788561702e-06, 'epoch': 0.65} 65%|██████▌ | 14453/22095 [24:36:46<7:50:37, 3.70s/it] 65%|██████▌ | 14454/22095 [24:36:49<7:51:26, 3.70s/it] {'loss': 0.2933, 'grad_norm': 0.6474765844945153, 'learning_rate': 2.8224720902026283e-06, 'epoch': 0.65} 65%|██████▌ | 14454/22095 [24:36:49<7:51:26, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76881 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41036 > 40960). 
Running this sequence through the model will result in indexing errors
 65%|██████▌ | 14455/22095 [24:36:52<7:20:26, 3.46s/it] {'loss': 0.3107, 'grad_norm': 0.5915692141412208, 'learning_rate': 2.821812348337475e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047665 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 16cm\nB. 4cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 65%|██████▌ | 14456/22095 [24:37:00<10:22:06, 4.89s/it] {'loss': 0.4804, 'grad_norm': 0.2727139188700424, 'learning_rate': 2.821152653274884e-06, 'epoch': 0.65}
 65%|██████▌ | 14457/22095 [24:37:04<9:22:31, 4.42s/it] {'loss': 0.3106, 'grad_norm': 0.6330423123605577, 'learning_rate': 2.820493005029029e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▌ | 14458/22095 [24:37:13<12:32:33, 5.91s/it] {'loss': 0.467, 'grad_norm': 0.26587214402432924, 'learning_rate': 2.8198334036140873e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8895852 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19005, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段上的两点,CB=3cm,DB=5cm,D为AC的中点,则AB段长度为()\nA. 13cm\nB. 7cm\nC. 8cm\nD. 1lcm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 65%|██████▌ | 14459/22095 [24:37:17<10:54:40, 5.14s/it] {'loss': 0.3307, 'grad_norm': 0.612997939811741, 'learning_rate': 2.819173849044229e-06, 'epoch': 0.65}
 65%|██████▌ | 14460/22095 [24:37:21<10:14:26, 4.83s/it] {'loss': 0.2693, 'grad_norm': 0.6129959000307879, 'learning_rate': 2.8185143413336272e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (126450 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57915 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▌ | 14461/22095 [24:37:25<9:47:05, 4.61s/it] {'loss': 0.3163, 'grad_norm': 0.6254208228940998, 'learning_rate': 2.8178548804964536e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [575, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8431147 in VC:s3://internvl-moe-sft-data/. Exception: Image size [575, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 80954, 'image': 'vrdu_texteq/astro-ph.CO/bcf15ab4-eeaa-4dd7-a7fa-44ae75f19704.png', 'image_wh': [[575, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'We find that the fractional error of $\\alpha_{1}$ scales as'}]}
 65%|██████▌ | 14462/22095 [24:37:28<8:47:07, 4.14s/it] {'loss': 0.3117, 'grad_norm': 0.9923324801370447, 'learning_rate': 2.817195466546874e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (127735 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118080 > 40960). Running this sequence through the model will result in indexing errors
 65%|██████▌ | 14463/22095 [24:37:37<12:14:29, 5.77s/it] {'loss': 0.4408, 'grad_norm': 0.3034442698917206, 'learning_rate': 2.8165360994990598e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (54772 > 40960).
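The repeated `ValueError: Image size [...] is too small. Minimum size is 28.` entries above come from a per-sample minimum-size check in the data pipeline. A minimal sketch of such a check, assuming a `[width, height]` pair as in the logged `image_wh` fields (the function name and interface are illustrative, not the actual `data_qwen_2.py` code):

```python
# Illustrative sketch of the minimum-image-size validation seen in the log.
# The 28px floor matches the error message; everything else is an assumption.

MIN_SIDE = 28


def validate_image_size(image_wh, min_side=MIN_SIDE):
    """Raise ValueError when either side of the image is below min_side."""
    width, height = image_wh
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_side}."
        )
    return image_wh
```

Running this check over the dataset offline would flag samples like `[[163, 21]]` or `[[129, 20]]` before training, avoiding the repeated fetch-and-retry loops recorded above.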
Running this sequence through the model will result in indexing errors
 65%|██████▌ | 14464/22095 [24:37:41<10:38:15, 5.02s/it] {'loss': 0.3154, 'grad_norm': 0.6557756043198308, 'learning_rate': 2.815876779367181e-06, 'epoch': 0.65}
 65%|██████▌ | 14465/22095 [24:37:44<9:22:55, 4.43s/it] {'loss': 0.308, 'grad_norm': 0.6200793549760715, 'learning_rate': 2.8152175061654017e-06, 'epoch': 0.65}
 65%|██████▌ | 14466/22095 [24:37:48<9:17:46, 4.39s/it] {'loss': 0.2597, 'grad_norm': 0.541491605820331, 'learning_rate': 2.8145582799078873e-06, 'epoch': 0.65}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960200 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11035, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1\nB. 1.5\nC. 2\nD. 0.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 65%|██████▌ | 14467/22095 [24:37:52<8:55:03, 4.21s/it] {'loss': 0.3003, 'grad_norm': 0.6239722423207902, 'learning_rate': 2.8138991006088024e-06, 'epoch': 0.65}
 65%|██████▌ | 14468/22095 [24:37:56<8:42:05, 4.11s/it] {'loss': 0.3497, 'grad_norm': 0.6460007049209576, 'learning_rate': 2.813239968282314e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▌ | 14469/22095 [24:38:05<12:01:12, 5.67s/it] {'loss': 0.4872, 'grad_norm': 0.2721666372049071, 'learning_rate': 2.812580882942583e-06, 'epoch': 0.65}
 65%|██████▌ | 14470/22095 [24:38:09<10:47:16, 5.09s/it] {'loss': 0.3248, 'grad_norm': 0.673252288948146, 'learning_rate': 2.811921844603768e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 65%|██████▌ | 14471/22095 [24:38:18<13:38:58, 6.45s/it] {'loss': 0.4691, 'grad_norm': 0.2721882793861817, 'learning_rate': 2.8112628532800345e-06, 'epoch': 0.65}
Token indices sequence length is longer than the specified maximum sequence length for this model (51963 > 40960).
Running this sequence through the model will result in indexing errors
 65%|██████▌ | 14472/22095 [24:38:26<14:22:04, 6.79s/it] {'loss': 0.4552, 'grad_norm': 0.2922565844858254, 'learning_rate': 2.8106039089855385e-06, 'epoch': 0.65}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 66%|██████▌ | 14473/22095 [24:38:29<12:05:47, 5.71s/it] {'loss': 0.2839, 'grad_norm': 0.735896164955326, 'learning_rate': 2.809945011734442e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 66%|██████▌ | 14474/22095 [24:38:33<10:59:14, 5.19s/it] {'loss': 0.3003, 'grad_norm': 0.6297271221726509, 'learning_rate': 2.8092861615409004e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [298, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8530297 in VC:s3://internvl-moe-sft-data/. Exception: Image size [298, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20753, 'image': 'vrdu_texteq/astro-ph.CO/042c9432-669c-4df0-84c0-185a5e1cde45.png', 'image_wh': [[298, 23]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': '$\\bullet$ {\\bf Parametrization III}'}]}
 66%|██████▌ | 14475/22095 [24:38:37<9:52:05, 4.66s/it] {'loss': 0.3047, 'grad_norm': 0.6394901774218247, 'learning_rate': 2.8086273584190704e-06, 'epoch': 0.66}
 66%|██████▌ | 14476/22095 [24:38:41<9:27:21, 4.47s/it] {'loss': 0.3283, 'grad_norm': 0.6439208339140995, 'learning_rate': 2.807968602383107e-06, 'epoch': 0.66}
 66%|██████▌ | 14477/22095 [24:38:44<8:30:19, 4.02s/it] {'loss': 0.3227, 'grad_norm': 0.5642350700947738, 'learning_rate': 2.8073098934471703e-06, 'epoch': 0.66}
 66%|██████▌ | 14478/22095 [24:38:46<7:49:49, 3.70s/it] {'loss': 0.2996, 'grad_norm': 0.6685770730535802, 'learning_rate': 2.806651231625406e-06, 'epoch': 0.66}
 66%|██████▌ | 14479/22095 [24:38:50<7:40:54, 3.63s/it] {'loss': 0.2653, 'grad_norm': 0.6824815601127436, 'learning_rate': 2.8059926169319694e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 66%|██████▌ | 14480/22095 [24:38:59<11:26:40, 5.41s/it] {'loss': 0.4775, 'grad_norm': 0.30267560974840607, 'learning_rate': 2.8053340493810143e-06, 'epoch': 0.66}
 66%|██████▌ | 14481/22095 [24:39:04<10:47:50, 5.11s/it] {'loss': 0.3277, 'grad_norm': 0.671206729091289, 'learning_rate': 2.804675528986693e-06, 'epoch': 0.66}
 66%|██████▌ | 14482/22095 [24:39:07<9:49:25, 4.65s/it] {'loss': 0.3015, 'grad_norm': 0.5600886177378184, 'learning_rate': 2.804017055763149e-06, 'epoch': 0.66}
 66%|██████▌ | 14483/22095 [24:39:11<8:53:23, 4.20s/it] {'loss': 0.3511, 'grad_norm': 0.6033961960418963, 'learning_rate': 2.8033586297245336e-06, 'epoch': 0.66}
 66%|██████▌ | 14484/22095 [24:39:15<9:02:35, 4.28s/it] {'loss': 0.3397, 'grad_norm': 0.5938406866774515, 'learning_rate': 2.8027002508849967e-06, 'epoch': 0.66}
 66%|██████▌ | 14485/22095 [24:39:19<8:37:16, 4.08s/it] {'loss': 0.2954, 'grad_norm': 0.6218836501499261, 'learning_rate': 2.8020419192586836e-06, 'epoch': 0.66}
 66%|██████▌ | 14486/22095 [24:39:22<8:09:07, 3.86s/it] {'loss': 0.3092, 'grad_norm': 0.6338006210940533, 'learning_rate': 2.801383634859737e-06, 'epoch': 0.66}
 66%|██████▌ | 14487/22095 [24:39:25<7:36:33, 3.60s/it] {'loss': 0.2932, 'grad_norm': 0.5836146648909326, 'learning_rate': 2.8007253977023045e-06, 'epoch': 0.66}
 66%|██████▌ | 14488/22095 [24:39:29<7:36:29, 3.60s/it] {'loss': 0.3117, 'grad_norm': 0.735478218662718, 'learning_rate': 2.8000672078005277e-06, 'epoch': 0.66}
 66%|██████▌ | 14489/22095 [24:39:32<7:41:42, 3.64s/it] {'loss': 0.2882, 'grad_norm': 0.5864591443386192, 'learning_rate': 2.799409065168551e-06, 'epoch': 0.66}
 66%|██████▌ | 14490/22095 [24:39:36<7:42:26, 3.65s/it] {'loss': 0.3331, 'grad_norm': 0.6412764443291205, 'learning_rate': 2.7987509698205163e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [345, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8500336 in VC:s3://internvl-moe-sft-data/. Exception: Image size [345, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69075, 'image': 'vrdu_texteq/astro-ph.CO/c299e0de-743c-4e5f-b9b3-0d5eda6abe1d.png', 'image_wh': [[345, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where we have set $c=1$ and'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 66%|██████▌ | 14491/22095 [24:39:41<8:27:28, 4.00s/it] {'loss': 0.3269, 'grad_norm': 0.6238632744185769, 'learning_rate': 2.79809292177056e-06, 'epoch': 0.66}
 66%|██████▌ | 14492/22095 [24:39:46<8:51:58, 4.20s/it] {'loss': 0.3298, 'grad_norm': 0.6110432933489035, 'learning_rate': 2.7974349210328234e-06, 'epoch': 0.66}
 66%|██████▌ | 14493/22095 [24:39:49<8:20:47, 3.95s/it] {'loss': 0.3514, 'grad_norm': 0.6878994627966224, 'learning_rate': 2.7967769676214486e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (43423 > 40960).
Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14494/22095 [24:39:53<8:15:15, 3.91s/it] {'loss': 0.3398, 'grad_norm': 0.6380556906955277, 'learning_rate': 2.7961190615505695e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 66%|██████▌ | 14495/22095 [24:39:56<7:51:13, 3.72s/it] {'loss': 0.2999, 'grad_norm': 0.6014487945413114, 'learning_rate': 2.7954612028343218e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [420, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8453171 in VC:s3://internvl-moe-sft-data/. Exception: Image size [420, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 134980, 'image': 'vrdu_texteq/astro-ph.CO/acff258e-ab2f-4a4a-af67-1307a524ae6a.png', 'image_wh': [[420, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'The $G_{1-4}$ functions are defined as:'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884879 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8032, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
 66%|██████▌ | 14496/22095 [24:39:59<7:26:24, 3.52s/it] {'loss': 0.3078, 'grad_norm': 0.678185293870894, 'learning_rate': 2.7948033914868415e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (97210 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124790 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46823 > 40960). Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14497/22095 [24:40:02<7:16:30, 3.45s/it] {'loss': 0.2989, 'grad_norm': 0.577372925066938, 'learning_rate': 2.7941456275222658e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365005 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31746, 'image': 'vrdu_table_final_2/astro-ph.CO/5aa2642b-b741-496e-8508-0ea1efcd03fc.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
 66%|██████▌ | 14498/22095 [24:40:06<7:31:40, 3.57s/it] {'loss': 0.3397, 'grad_norm': 0.5706492885380627, 'learning_rate': 2.793487910954726e-06, 'epoch': 0.66}
 66%|██████▌ | 14499/22095 [24:40:09<7:17:28, 3.46s/it] {'loss': 0.2975, 'grad_norm': 0.6273409052262731, 'learning_rate': 2.7928302417983524e-06, 'epoch': 0.66}
 66%|██████▌ | 14500/22095 [24:40:12<7:05:08, 3.36s/it] {'loss': 0.2611, 'grad_norm': 0.5952325031578662, 'learning_rate': 2.7921726200672793e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047602 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 66%|██████▌ | 14501/22095 [24:40:16<6:56:59, 3.29s/it] {'loss': 0.2996, 'grad_norm': 0.6492498565560609, 'learning_rate': 2.791515045775634e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 66%|██████▌ | 14502/22095 [24:40:25<10:49:48, 5.13s/it] {'loss': 0.4532, 'grad_norm': 0.2819515413997284, 'learning_rate': 2.79085751893755e-06, 'epoch': 0.66}
 66%|██████▌ | 14503/22095 [24:40:28<9:37:58, 4.57s/it] {'loss': 0.3138, 'grad_norm': 0.6220399803635098, 'learning_rate': 2.7902000395671523e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (76827 > 40960).
Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14504/22095 [24:40:31<8:34:30, 4.07s/it] {'loss': 0.3258, 'grad_norm': 0.625456312620898, 'learning_rate': 2.7895426076785676e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (135568 > 40960). Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14505/22095 [24:40:41<12:00:27, 5.70s/it] {'loss': 0.4833, 'grad_norm': 0.28459561254149424, 'learning_rate': 2.788885223285923e-06, 'epoch': 0.66}
 66%|██████▌ | 14506/22095 [24:40:44<10:32:52, 5.00s/it] {'loss': 0.2799, 'grad_norm': 1.8729040089538573, 'learning_rate': 2.7882278864033465e-06, 'epoch': 0.66}
 66%|██████▌ | 14507/22095 [24:40:48<9:43:20, 4.61s/it] {'loss': 0.3439, 'grad_norm': 0.6141294598649133, 'learning_rate': 2.787570597044959e-06, 'epoch': 0.66}
 66%|██████▌ | 14508/22095 [24:40:51<8:43:41, 4.14s/it] {'loss': 0.2964, 'grad_norm': 0.6304622492167625, 'learning_rate': 2.786913355224883e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 66%|██████▌ | 14509/22095 [24:40:55<8:36:54, 4.09s/it] {'loss': 0.321, 'grad_norm': 0.6245041485695481, 'learning_rate': 2.7862561609572414e-06, 'epoch': 0.66}
 66%|██████▌ | 14510/22095 [24:40:59<8:26:16, 4.00s/it] {'loss': 0.3239, 'grad_norm': 0.6741403524459103, 'learning_rate': 2.7855990142561606e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (50327 > 40960). Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14511/22095 [24:41:01<7:42:57, 3.66s/it] {'loss': 0.3053, 'grad_norm': 0.5674773642493651, 'learning_rate': 2.7849419151357513e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47757 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42885 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55109 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84126 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47195 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71490 > 40960). Running this sequence through the model will result in indexing errors
 66%|██████▌ | 14512/22095 [24:41:07<9:06:25, 4.32s/it] {'loss': 0.4902, 'grad_norm': 0.28760962113666705, 'learning_rate': 2.784284863610138e-06, 'epoch': 0.66}
 66%|██████▌ | 14513/22095 [24:41:11<8:23:29, 3.98s/it] {'loss': 0.3242, 'grad_norm': 0.6315742915634163, 'learning_rate': 2.7836278596934395e-06, 'epoch': 0.66}
 66%|██████▌ | 14514/22095 [24:41:14<8:01:55, 3.81s/it] {'loss': 0.2842, 'grad_norm': 0.7371476507524212, 'learning_rate': 2.782970903399771e-06, 'epoch': 0.66}
 66%|██████▌ | 14515/22095 [24:41:17<7:29:14, 3.56s/it] {'loss': 0.2426, 'grad_norm': 0.7316752146725508, 'learning_rate': 2.782313994743247e-06, 'epoch': 0.66}
 66%|██████▌ | 14516/22095 [24:41:20<7:18:05, 3.47s/it] {'loss': 0.299, 'grad_norm': 0.6650642757049506, 'learning_rate': 2.781657133737986e-06, 'epoch': 0.66}
 66%|██████▌ | 14517/22095 [24:41:23<7:04:16, 3.36s/it] {'loss': 0.2837, 'grad_norm': 0.5989291909812126, 'learning_rate': 2.7810003203980983e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (94609968 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn(
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30024.png 2025-08-28 16:39:22.596316 load time: 1168.66 ms
66%|██████▌ | 14518/22095 [24:41:27<7:29:15, 3.56s/it] {'loss': 0.3266, 'grad_norm': 0.6512623588879672, 'learning_rate': 2.7803435547377006e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42599 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14519/22095 [24:41:35<10:09:23, 4.83s/it] {'loss': 0.4504, 'grad_norm': 0.2594526263361003, 'learning_rate': 2.779686836770903e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14520/22095 [24:41:44<13:01:44, 6.19s/it] {'loss': 0.4825, 'grad_norm': 0.3290550773122507, 'learning_rate': 2.7790301665118137e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 364, but got module 1
66%|██████▌ | 14521/22095 [24:41:48<11:35:38, 5.51s/it] {'loss': 0.2996, 'grad_norm': 0.6135740857375592, 'learning_rate': 2.7783735439745447e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14522/22095 [24:41:52<10:13:55, 4.86s/it] {'loss': 0.295, 'grad_norm': 0.6624231097263641, 'learning_rate': 2.7777169691732074e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8334408 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1018, 'image': 'vrdu_table_final_2/astro-ph.CO/13c136f6-e7d3-45f8-9220-71b8458b132d.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}{#2}\\\\{#3}\\end{tabular}\n```"}]}
66%|██████▌ | 14523/22095 [24:41:56<9:50:20, 4.68s/it] {'loss': 0.302, 'grad_norm': 0.6008466558370399, 'learning_rate': 2.777060442121907e-06, 'epoch': 0.66}
66%|██████▌ | 14524/22095 [24:42:00<9:11:41, 4.37s/it] {'loss': 0.3174, 'grad_norm': 0.6499752987880355, 'learning_rate': 2.7764039628347484e-06, 'epoch': 0.66}
66%|██████▌ | 14525/22095 [24:42:03<8:19:22, 3.96s/it] {'loss': 0.3008, 'grad_norm': 0.5845108030775217, 'learning_rate': 2.7757475313258397e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [639, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8475522 in VC:s3://internvl-moe-sft-data/. Exception: Image size [639, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 98979, 'image': 'vrdu_texteq/astro-ph.CO/c8326776-1fc0-42b4-b9d9-8288135c4f89.png', 'image_wh': [[639, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'while the first correction for the matter is $x^2P$ times'}]}
66%|██████▌ | 14526/22095 [24:42:06<8:10:44, 3.89s/it] {'loss': 0.3072, 'grad_norm': 0.6367031103099358, 'learning_rate': 2.775091147609287e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14527/22095 [24:42:16<11:46:00, 5.60s/it] {'loss': 0.4913, 'grad_norm': 0.29519705624979603, 'learning_rate': 2.7744348116991925e-06, 'epoch': 0.66}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (104400000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
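Editor's note: the repeated `ValueError: Image size ... is too small. Minimum size is 28.` failures above all come from samples whose recorded `image_wh` has a side below 28 px. A minimal pre-filtering sketch that would drop such samples before training; the helper names (`is_loadable`, `prefilter`) are hypothetical and not part of `data_qwen_2.py`:

```python
# Hypothetical pre-filter for sample records shaped like the "Problematic
# sample" dicts logged above. Drops any sample whose recorded image_wh has
# a side below the 28-px minimum the loader enforces.

MIN_SIDE = 28  # minimum width/height the loader accepts

def is_loadable(sample: dict) -> bool:
    """Return False for samples the loader would reject as too small."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIDE or h < MIN_SIDE:
            return False
    return True

def prefilter(samples: list[dict]) -> list[dict]:
    """Keep only samples that pass the minimum-size check."""
    return [s for s in samples if is_loadable(s)]
```

Running this once over the dataset index would avoid the `[Try #0] Failed to fetch sample ...` retries at training time.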
warnings.warn(
66%|██████▌ | 14528/22095 [24:42:20<10:51:13, 5.16s/it] {'loss': 0.3021, 'grad_norm': 0.9076162605324154, 'learning_rate': 2.7737785236096563e-06, 'epoch': 0.66}
66%|██████▌ | 14529/22095 [24:42:23<9:34:11, 4.55s/it] {'loss': 0.3272, 'grad_norm': 0.7232551838129118, 'learning_rate': 2.7731222833547842e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14530/22095 [24:42:27<8:58:41, 4.27s/it] {'loss': 0.3128, 'grad_norm': 0.665210978002039, 'learning_rate': 2.7724660909486732e-06, 'epoch': 0.66}
66%|██████▌ | 14531/22095 [24:42:30<8:12:18, 3.91s/it] {'loss': 0.2759, 'grad_norm': 0.6183998088090225, 'learning_rate': 2.771809946405427e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14532/22095 [24:42:33<7:54:38, 3.77s/it] {'loss': 0.2959, 'grad_norm': 0.6421908889548431, 'learning_rate': 2.771153849739141e-06, 'epoch': 0.66}
66%|██████▌ | 14533/22095 [24:42:37<7:32:12, 3.59s/it] {'loss': 0.3348, 'grad_norm': 0.6180099869793233, 'learning_rate': 2.7704978009639117e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14534/22095 [24:42:40<7:35:34, 3.62s/it] {'loss': 0.3214, 'grad_norm': 0.6045601813827167, 'learning_rate': 2.7698418000938374e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882176 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5329, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nA. 8cm\nB. 10cm\nC. 12cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
66%|██████▌ | 14535/22095 [24:42:43<7:11:37, 3.43s/it] {'loss': 0.3022, 'grad_norm': 0.6655538988058615, 'learning_rate': 2.7691858471430157e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (50820 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14536/22095 [24:42:47<7:23:09, 3.52s/it] {'loss': 0.2832, 'grad_norm': 0.6071011266492075, 'learning_rate': 2.7685299421255373e-06, 'epoch': 0.66}
66%|██████▌ | 14537/22095 [24:42:50<7:14:05, 3.45s/it] {'loss': 0.3455, 'grad_norm': 0.6744160545583696, 'learning_rate': 2.7678740850554965e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14538/22095 [24:42:54<7:10:57, 3.42s/it] {'loss': 0.2837, 'grad_norm': 0.5592789821618848, 'learning_rate': 2.7672182759469857e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (78138 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47966 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57437 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14539/22095 [24:42:57<7:01:31, 3.35s/it] {'loss': 0.2734, 'grad_norm': 0.5920929266921543, 'learning_rate': 2.7665625148141e-06, 'epoch': 0.66}
66%|██████▌ | 14540/22095 [24:43:00<6:59:33, 3.33s/it] {'loss': 0.2813, 'grad_norm': 0.5622962505052304, 'learning_rate': 2.7659068016709234e-06, 'epoch': 0.66}
66%|██████▌ | 14541/22095 [24:43:03<6:47:55, 3.24s/it] {'loss': 0.3048, 'grad_norm': 0.7655350616626647, 'learning_rate': 2.7652511365315473e-06, 'epoch': 0.66}
66%|██████▌ | 14542/22095 [24:43:07<7:07:44, 3.40s/it] {'loss': 0.2995, 'grad_norm': 0.6174261956501719, 'learning_rate': 2.764595519410063e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14543/22095 [24:43:11<7:29:55, 3.57s/it] {'loss': 0.3058, 'grad_norm': 0.6266561514215079, 'learning_rate': 2.763939950320556e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14544/22095 [24:43:16<8:15:32, 3.94s/it] {'loss': 0.4655, 'grad_norm': 0.28002556277807183, 'learning_rate': 2.7632844292771094e-06, 'epoch': 0.66}
66%|██████▌ | 14545/22095 [24:43:19<7:51:19, 3.75s/it] {'loss': 0.3146, 'grad_norm': 0.6485738125198387, 'learning_rate': 2.762628956293813e-06, 'epoch': 0.66}
66%|██████▌ | 14546/22095 [24:43:23<7:55:13, 3.78s/it] {'loss': 0.3156, 'grad_norm': 0.7066485212378323, 'learning_rate': 2.7619735313847467e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14547/22095 [24:43:32<11:29:17, 5.48s/it] {'loss': 0.4787, 'grad_norm': 0.2891774709824707, 'learning_rate': 2.761318154563998e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14548/22095 [24:43:36<10:18:09, 4.91s/it] {'loss': 0.283, 'grad_norm': 0.6208970598323403, 'learning_rate': 2.7606628258456457e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (87351 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69462 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61279 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14549/22095 [24:43:39<9:01:57, 4.31s/it] {'loss': 0.2955, 'grad_norm': 0.5981822572032913, 'learning_rate': 2.760007545243771e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (43509 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14550/22095 [24:43:42<8:27:10, 4.03s/it] {'loss': 0.3219, 'grad_norm': 0.629617347306717, 'learning_rate': 2.759352312772454e-06, 'epoch': 0.66}
66%|██████▌ | 14551/22095 [24:43:45<7:54:02, 3.77s/it] {'loss': 0.2668, 'grad_norm': 0.6254752051909233, 'learning_rate': 2.7586971284457753e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14552/22095 [24:43:48<7:28:40, 3.57s/it] {'loss': 0.3469, 'grad_norm': 0.6137413177807575, 'learning_rate': 2.7580419922778124e-06, 'epoch': 0.66}
66%|██████▌ | 14553/22095 [24:43:51<7:10:45, 3.43s/it] {'loss': 0.3049, 'grad_norm': 0.714623058271599, 'learning_rate': 2.7573869042826396e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14554/22095 [24:43:56<7:55:10, 3.78s/it] {'loss': 0.305, 'grad_norm': 0.5973950768771438, 'learning_rate': 2.7567318644743344e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8940293 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63446, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nA. 4\nB. 6\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
66%|██████▌ | 14555/22095 [24:43:59<7:43:44, 3.69s/it] {'loss': 0.3131, 'grad_norm': 0.5703781422935155, 'learning_rate': 2.756076872866974e-06, 'epoch': 0.66}
66%|██████▌ | 14556/22095 [24:44:03<7:42:17, 3.68s/it] {'loss': 0.3368, 'grad_norm': 0.6563924959666125, 'learning_rate': 2.755421929474629e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14557/22095 [24:44:10<9:37:22, 4.60s/it] {'loss': 0.4752, 'grad_norm': 0.314269215921467, 'learning_rate': 2.7547670343113718e-06, 'epoch': 0.66}
66%|██████▌ | 14558/22095 [24:44:19<12:46:12, 6.10s/it] {'loss': 0.436, 'grad_norm': 0.267282794090128, 'learning_rate': 2.7541121873912774e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (76932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72564 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52418 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63000 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14559/22095 [24:44:23<10:59:36, 5.25s/it] {'loss': 0.3465, 'grad_norm': 0.638997590884908, 'learning_rate': 2.7534573887284123e-06, 'epoch': 0.66}
66%|██████▌ | 14560/22095 [24:44:27<10:16:55, 4.91s/it] {'loss': 0.3123, 'grad_norm': 0.8136242615153971, 'learning_rate': 2.75280263833685e-06, 'epoch': 0.66}
66%|██████▌ | 14561/22095 [24:44:30<9:01:05, 4.31s/it] {'loss': 0.333, 'grad_norm': 0.5918204058084633, 'learning_rate': 2.7521479362306574e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (72504 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79172 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43075 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14562/22095 [24:44:33<8:09:54, 3.90s/it] {'loss': 0.3035, 'grad_norm': 0.6079493814701858, 'learning_rate': 2.7514932824239e-06, 'epoch': 0.66}
66%|██████▌ | 14563/22095 [24:44:36<7:31:16, 3.59s/it] {'loss': 0.3094, 'grad_norm': 0.6118825012765209, 'learning_rate': 2.7508386769306462e-06, 'epoch': 0.66}
66%|██████▌ | 14564/22095 [24:44:39<7:24:11, 3.54s/it] {'loss': 0.2915, 'grad_norm': 0.7444506793187203, 'learning_rate': 2.7501841197649627e-06, 'epoch': 0.66}
66%|██████▌ | 14565/22095 [24:44:43<7:51:44, 3.76s/it] {'loss': 0.3388, 'grad_norm': 0.6609723903965469, 'learning_rate': 2.7495296109409136e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (57523 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76853 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86139 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14566/22095 [24:44:48<8:15:37, 3.95s/it] {'loss': 0.3072, 'grad_norm': 0.5776648608785567, 'learning_rate': 2.7488751504725587e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308593 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2ixCid_TI8KJjSsphXXcFppXa_!!3496794657.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n请将图片中的所有文字输出出来。'}, {'from': 'gpt', 'value': '图中所有文字:\n特惠\n9..9\n15天练好字\n30支褪色大笔芯\n2个笔套2个握笔器30支褪色笔芯\n(买三套送一套)'}]}
66%|██████▌ | 14567/22095 [24:44:57<11:41:27, 5.59s/it] {'loss': 0.4886, 'grad_norm': 0.2930781439292153, 'learning_rate': 2.7482207383739636e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14568/22095 [24:45:01<10:41:34, 5.11s/it] {'loss': 0.3029, 'grad_norm': 0.6247275615918767, 'learning_rate': 2.7475663746591906e-06, 'epoch': 0.66}
66%|██████▌ | 14569/22095 [24:45:04<9:31:18, 4.55s/it] {'loss': 0.3377, 'grad_norm': 0.6498235375519393, 'learning_rate': 2.746912059342299e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14570/22095 [24:45:14<12:37:52, 6.04s/it] {'loss': 0.4664, 'grad_norm': 0.29449251536630294, 'learning_rate': 2.7462577924373448e-06, 'epoch': 0.66}
66%|██████▌ | 14571/22095 [24:45:18<11:15:40, 5.39s/it] {'loss': 0.2717, 'grad_norm': 0.572784277483015, 'learning_rate': 2.745603573958391e-06, 'epoch': 0.66}
66%|██████▌ | 14572/22095 [24:45:21<10:11:30, 4.88s/it] {'loss': 0.3143, 'grad_norm': 0.728629363071293, 'learning_rate': 2.74494940391949e-06, 'epoch': 0.66}
66%|██████▌ | 14573/22095 [24:45:25<9:13:20, 4.41s/it] {'loss': 0.3188, 'grad_norm': 0.6563748386276432, 'learning_rate': 2.7442952823347035e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (50405 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59333 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107942 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14574/22095 [24:45:31<10:06:03, 4.83s/it] {'loss': 0.4824, 'grad_norm': 0.28586724962853566, 'learning_rate': 2.743641209218083e-06, 'epoch': 0.66}
66%|██████▌ | 14575/22095 [24:45:34<9:21:01, 4.48s/it] {'loss': 0.2562, 'grad_norm': 0.6000616689940764, 'learning_rate': 2.742987184583681e-06, 'epoch': 0.66}
66%|██████▌ | 14576/22095 [24:45:38<8:59:01, 4.30s/it] {'loss': 0.3191, 'grad_norm': 0.6180064457315072, 'learning_rate': 2.7423332084455543e-06, 'epoch': 0.66}
66%|██████▌ | 14577/22095 [24:45:41<8:18:23, 3.98s/it] {'loss': 0.32, 'grad_norm': 0.5672784538288332, 'learning_rate': 2.7416792808177516e-06, 'epoch': 0.66}
66%|██████▌ | 14578/22095 [24:45:44<7:43:25, 3.70s/it] {'loss': 0.3216, 'grad_norm': 0.6002952662222252, 'learning_rate': 2.741025401714327e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14579/22095 [24:45:47<7:20:04, 3.51s/it] {'loss': 0.3128, 'grad_norm': 0.7181766236652201, 'learning_rate': 2.7403715711493264e-06, 'epoch': 0.66}
66%|██████▌ | 14580/22095 [24:45:51<7:13:38, 3.46s/it] {'loss': 0.3064, 'grad_norm': 0.6649504513424246, 'learning_rate': 2.7397177891368033e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (42495 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103822 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74108 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14581/22095 [24:45:54<6:48:23, 3.26s/it] {'loss': 0.2912, 'grad_norm': 0.5940222820479465, 'learning_rate': 2.7390640556908023e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14582/22095 [24:46:04<11:07:34, 5.33s/it] {'loss': 0.4343, 'grad_norm': 0.28298728873634355, 'learning_rate': 2.7384103708253697e-06, 'epoch': 0.66}
66%|██████▌ | 14583/22095 [24:46:07<10:01:19, 4.80s/it] {'loss': 0.2864, 'grad_norm': 0.6560700005915628, 'learning_rate': 2.7377567345545514e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (52350 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (40984 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104657 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14584/22095 [24:46:11<9:09:08, 4.39s/it] {'loss': 0.357, 'grad_norm': 0.6090951121414581, 'learning_rate': 2.737103146892395e-06, 'epoch': 0.66}
66%|██████▌ | 14585/22095 [24:46:15<9:03:46, 4.34s/it] {'loss': 0.317, 'grad_norm': 0.6028122135930079, 'learning_rate': 2.7364496078529425e-06, 'epoch': 0.66}
66%|██████▌ | 14586/22095 [24:46:18<8:26:58, 4.05s/it] {'loss': 0.2849, 'grad_norm': 0.5996556745656106, 'learning_rate': 2.7357961174502335e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14587/22095 [24:46:22<8:06:28, 3.89s/it] {'loss': 0.2917, 'grad_norm': 0.5778079712939874, 'learning_rate': 2.7351426756983145e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14588/22095 [24:46:25<7:31:56, 3.61s/it] {'loss': 0.4023, 'grad_norm': 0.6911837816561222, 'learning_rate': 2.734489282611221e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▌ | 14589/22095 [24:46:34<11:18:49, 5.43s/it] {'loss': 0.4774, 'grad_norm': 0.2729175501123961, 'learning_rate': 2.733835938202997e-06, 'epoch': 0.66}
66%|██████▌ | 14590/22095 [24:46:38<9:55:16, 4.76s/it] {'loss': 0.2917, 'grad_norm': 0.5936874447342413, 'learning_rate': 2.7331826424876782e-06, 'epoch': 0.66}
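Editor's note: the recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings mean some packed conversations exceed the model's maximum context. A minimal sketch of clamping such sequences before they reach the model; the helper name `clamp_input_ids` is hypothetical and not the training code's API:

```python
# Hypothetical guard against the over-length sequences reported above.
# Sequences longer than the model maximum are truncated; shorter ones
# pass through unchanged.

MODEL_MAX_LEN = 40960  # maximum sequence length cited in the warnings

def clamp_input_ids(input_ids: list[int], max_len: int = MODEL_MAX_LEN) -> list[int]:
    """Truncate a token-id sequence that would otherwise cause indexing errors."""
    return input_ids[:max_len] if len(input_ids) > max_len else input_ids
```

Whether to truncate or to drop such samples entirely is a training-data decision; truncation silently cuts off the tail of the conversation.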
66%|██████▌ | 14591/22095 [24:46:41<8:51:10, 4.25s/it] {'loss': 0.3421, 'grad_norm': 0.6315346424071278, 'learning_rate': 2.7325293954793013e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47468 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14592/22095 [24:46:50<12:06:35, 5.81s/it] {'loss': 0.4755, 'grad_norm': 0.302728605590915, 'learning_rate': 2.7318761971919034e-06, 'epoch': 0.66}
66%|██████▌ | 14593/22095 [24:46:54<10:53:16, 5.22s/it] {'loss': 0.2912, 'grad_norm': 0.6244775173123975, 'learning_rate': 2.731223047639522e-06, 'epoch': 0.66}
66%|██████▌ | 14594/22095 [24:46:58<10:14:15, 4.91s/it] {'loss': 0.329, 'grad_norm': 0.60301323677301, 'learning_rate': 2.730569946836189e-06, 'epoch': 0.66}
66%|██████▌ | 14595/22095 [24:47:02<9:31:58, 4.58s/it] {'loss': 0.3207, 'grad_norm': 0.7322230548298987, 'learning_rate': 2.7299168947959365e-06, 'epoch': 0.66}
66%|██████▌ | 14596/22095 [24:47:06<9:25:58, 4.53s/it] {'loss': 0.3251, 'grad_norm': 0.6134897267742255, 'learning_rate': 2.7292638915327975e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14597/22095 [24:47:10<8:42:35, 4.18s/it] {'loss': 0.2786, 'grad_norm': 0.5731490592855168, 'learning_rate': 2.728610937060805e-06, 'epoch': 0.66}
66%|██████▌ | 14598/22095 [24:47:14<8:50:21, 4.24s/it] {'loss': 0.3816, 'grad_norm': 0.6774614477652757, 'learning_rate': 2.727958031393988e-06, 'epoch': 0.66}
66%|██████▌ | 14599/22095 [24:47:17<7:56:32, 3.81s/it] {'loss': 0.2787, 'grad_norm': 0.6043544377177085, 'learning_rate': 2.727305174546372e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [659, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8489374 in VC:s3://internvl-moe-sft-data/. Exception: Image size [659, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 75866, 'image': 'vrdu_texteq/astro-ph.CO/a7e87796-7d8f-47f9-ad34-431c1ef81b07.png', 'image_wh': [[659, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'We use the distance radio $d_{z}$ at $z=0.2$ and $z=0.35$ -'}]}
66%|██████▌ | 14600/22095 [24:47:26<10:56:17, 5.25s/it] {'loss': 0.4921, 'grad_norm': 0.30317006861177326, 'learning_rate': 2.7266523665319904e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14601/22095 [24:47:30<10:40:26, 5.13s/it] {'loss': 0.2868, 'grad_norm': 0.5828626242746316, 'learning_rate': 2.725999607364865e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14602/22095 [24:47:34<9:51:41, 4.74s/it] {'loss': 0.3035, 'grad_norm': 0.6287623493610928, 'learning_rate': 2.725346897059027e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (56693 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72135 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14603/22095 [24:47:38<9:18:25, 4.47s/it] {'loss': 0.3314, 'grad_norm': 0.6667618633210299, 'learning_rate': 2.724694235628498e-06, 'epoch': 0.66}
66%|██████▌ | 14604/22095 [24:47:41<8:23:36, 4.03s/it] {'loss': 0.2995, 'grad_norm': 0.6039172989770963, 'learning_rate': 2.724041623087299e-06, 'epoch': 0.66}
66%|██████▌ | 14605/22095 [24:47:45<8:17:02, 3.98s/it] {'loss': 0.3158, 'grad_norm': 0.6335276592368326, 'learning_rate': 2.723389059449455e-06, 'epoch': 0.66}
66%|██████▌ | 14606/22095 [24:47:49<8:21:28, 4.02s/it] {'loss': 0.3307, 'grad_norm': 0.6461064971619715, 'learning_rate': 2.722736544728991e-06, 'epoch': 0.66}
66%|██████▌ | 14607/22095 [24:47:53<8:22:25, 4.03s/it] {'loss': 0.3228, 'grad_norm': 0.5979661875521738, 'learning_rate': 2.7220840789399243e-06, 'epoch': 0.66}
66%|██████▌ | 14608/22095 [24:47:56<7:32:47, 3.63s/it] {'loss': 0.2936, 'grad_norm': 0.650606256072474, 'learning_rate': 2.7214316620962727e-06, 'epoch': 0.66}
66%|██████▌ | 14609/22095 [24:47:59<7:03:05, 3.39s/it] {'loss': 0.3279, 'grad_norm': 0.6703756569986283, 'learning_rate': 2.720779294212059e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14610/22095 [24:48:02<7:01:22, 3.38s/it] {'loss': 0.3444, 'grad_norm': 0.7022940751613966, 'learning_rate': 2.720126975301297e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (78883 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14611/22095 [24:48:12<10:59:40, 5.29s/it] {'loss': 0.4496, 'grad_norm': 0.27846024025290605, 'learning_rate': 2.7194747053780037e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (78855 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14612/22095 [24:48:15<9:53:00, 4.75s/it] {'loss': 0.3225, 'grad_norm': 0.6336076704471227, 'learning_rate': 2.718822484456194e-06, 'epoch': 0.66}
66%|██████▌ | 14613/22095 [24:48:19<9:09:51, 4.41s/it] {'loss': 0.2723, 'grad_norm': 0.6162067561278477, 'learning_rate': 2.718170312549885e-06, 'epoch': 0.66}
66%|██████▌ | 14614/22095 [24:48:22<8:38:37, 4.16s/it] {'loss': 0.334, 'grad_norm': 0.5828372659832474, 'learning_rate': 2.717518189673088e-06, 'epoch': 0.66}
66%|██████▌ | 14615/22095 [24:48:26<7:57:20, 3.83s/it] {'loss': 0.2896, 'grad_norm': 0.6171211495647955, 'learning_rate': 2.716866115839813e-06, 'epoch': 0.66}
66%|██████▌ | 14616/22095 [24:48:30<8:20:25, 4.01s/it] {'loss': 0.2921, 'grad_norm': 0.6540331289151489, 'learning_rate': 2.716214091064075e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (53950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42745 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14617/22095 [24:48:33<7:53:41, 3.80s/it] {'loss': 0.2761, 'grad_norm': 0.5574949221745757, 'learning_rate': 2.71556211535988e-06, 'epoch': 0.66}
66%|██████▌ | 14618/22095 [24:48:36<7:19:14, 3.52s/it] {'loss': 0.3316, 'grad_norm': 0.6485906746506321, 'learning_rate': 2.714910188741241e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44624 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14619/22095 [24:48:46<11:17:46, 5.44s/it] {'loss': 0.4668, 'grad_norm': 0.27752831386377874, 'learning_rate': 2.714258311222162e-06, 'epoch': 0.66}
66%|██████▌ | 14620/22095 [24:48:55<13:29:04, 6.49s/it] {'loss': 0.475, 'grad_norm': 0.2789879546078319, 'learning_rate': 2.7136064828166543e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 364, but got module 1
66%|██████▌ | 14621/22095 [24:48:58<11:29:01, 5.53s/it] {'loss': 0.3271, 'grad_norm': 0.7431375370864788, 'learning_rate': 2.7129547035387187e-06, 'epoch': 0.66}
Token indices sequence length is longer than the specified maximum sequence length for this model (86079 > 40960). Running this sequence through the model will result in indexing errors
66%|██████▌ | 14622/22095 [24:49:02<10:11:09, 4.91s/it] {'loss': 0.3225, 'grad_norm': 0.6110256512004637, 'learning_rate': 2.7123029734023643e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▌ | 14623/22095 [24:49:05<8:55:42, 4.30s/it] {'loss': 0.2673, 'grad_norm': 0.5927904023046602, 'learning_rate': 2.711651292421593e-06, 'epoch': 0.66}
66%|██████▌ | 14624/22095 [24:49:08<8:21:50, 4.03s/it] {'loss': 0.3119, 'grad_norm': 0.6309109945227651, 'learning_rate': 2.7109996606104054e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48637 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50733 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109424 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66495 >
Running this sequence through the model will result in indexing errors 66%|██████▌ | 14625/22095 [24:49:18<12:00:49, 5.79s/it] {'loss': 0.4362, 'grad_norm': 0.27083279929004345, 'learning_rate': 2.710348077982805e-06, 'epoch': 0.66} 66%|██████▌ | 14625/22095 [24:49:18<12:00:49, 5.79s/it] 66%|██████▌ | 14626/22095 [24:49:22<10:39:43, 5.14s/it] {'loss': 0.2773, 'grad_norm': 0.6326640005584911, 'learning_rate': 2.7096965445527947e-06, 'epoch': 0.66} 66%|██████▌ | 14626/22095 [24:49:22<10:39:43, 5.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (48371 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45756 > 40960). Running this sequence through the model will result in indexing errors 66%|██████▌ | 14627/22095 [24:49:30<12:49:47, 6.18s/it] {'loss': 0.4829, 'grad_norm': 0.26875623419781375, 'learning_rate': 2.7090450603343703e-06, 'epoch': 0.66} 66%|██████▌ | 14627/22095 [24:49:30<12:49:47, 6.18s/it] 66%|██████▌ | 14628/22095 [24:49:34<11:10:03, 5.38s/it] {'loss': 0.3308, 'grad_norm': 0.5908815928580272, 'learning_rate': 2.70839362534153e-06, 'epoch': 0.66} 66%|██████▌ | 14628/22095 [24:49:34<11:10:03, 5.38s/it] 66%|██████▌ | 14629/22095 [24:49:37<9:43:47, 4.69s/it] {'loss': 0.2937, 'grad_norm': 0.591809248196907, 'learning_rate': 2.7077422395882745e-06, 'epoch': 0.66} 66%|██████▌ | 14629/22095 [24:49:37<9:43:47, 4.69s/it]Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", 
line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8937802 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 60955, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nA. \\frac{9}{2}cm\nB. 5cm\nC. \\frac{11}{2}cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 66%|██████▌ | 14630/22095 [24:49:44<11:31:07, 5.55s/it] {'loss': 0.4638, 'grad_norm': 0.25957882080678385, 'learning_rate': 2.7070909030885967e-06, 'epoch': 0.66} 66%|██████▌ | 14630/22095 [24:49:44<11:31:07, 5.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41560 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53472 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66705 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65543 > 40960). 
Running this sequence through the model will result in indexing errors 66%|██████▌ | 14631/22095 [24:49:49<10:52:02, 5.24s/it] {'loss': 0.3007, 'grad_norm': 0.5857590862321028, 'learning_rate': 2.706439615856495e-06, 'epoch': 0.66} 66%|██████▌ | 14631/22095 [24:49:49<10:52:02, 5.24s/it] 66%|██████▌ | 14632/22095 [24:49:52<9:36:59, 4.64s/it] {'loss': 0.2595, 'grad_norm': 0.6156546064522539, 'learning_rate': 2.705788377905961e-06, 'epoch': 0.66} 66%|██████▌ | 14632/22095 [24:49:52<9:36:59, 4.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 66%|██████▌ | 14633/22095 [24:50:02<12:37:31, 6.09s/it] {'loss': 0.4496, 'grad_norm': 0.27266227794839076, 'learning_rate': 2.705137189250988e-06, 'epoch': 0.66} 66%|██████▌ | 14633/22095 [24:50:02<12:37:31, 6.09s/it] 66%|██████▌ | 14634/22095 [24:50:06<11:30:13, 5.55s/it] {'loss': 0.3283, 'grad_norm': 0.7641074447659439, 'learning_rate': 2.7044860499055682e-06, 'epoch': 0.66} 66%|██████▌ | 14634/22095 [24:50:06<11:30:13, 5.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 66%|██████▌ | 14635/22095 [24:50:13<12:30:17, 6.03s/it] {'loss': 0.4556, 'grad_norm': 0.26066804349701916, 'learning_rate': 2.7038349598836944e-06, 'epoch': 0.66} 66%|██████▌ | 14635/22095 [24:50:13<12:30:17, 6.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41213 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52808 > 40960). 
Running this sequence through the model will result in indexing errors 66%|██████▌ | 14636/22095 [24:50:16<10:43:49, 5.18s/it] {'loss': 0.2849, 'grad_norm': 0.5907273354900149, 'learning_rate': 2.703183919199356e-06, 'epoch': 0.66} 66%|██████▌ | 14636/22095 [24:50:16<10:43:49, 5.18s/it] 66%|██████▌ | 14637/22095 [24:50:20<9:33:20, 4.61s/it] {'loss': 0.2869, 'grad_norm': 0.631603567997779, 'learning_rate': 2.702532927866538e-06, 'epoch': 0.66} 66%|██████▌ | 14637/22095 [24:50:20<9:33:20, 4.61s/it] 66%|██████▋ | 14638/22095 [24:50:23<8:48:48, 4.25s/it] {'loss': 0.2978, 'grad_norm': 0.61739783806948, 'learning_rate': 2.7018819858992323e-06, 'epoch': 0.66} 66%|██████▋ | 14638/22095 [24:50:23<8:48:48, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 66%|██████▋ | 14639/22095 [24:50:32<12:02:53, 5.82s/it] {'loss': 0.5085, 'grad_norm': 0.2919577399403899, 'learning_rate': 2.7012310933114283e-06, 'epoch': 0.66} 66%|██████▋ | 14639/22095 [24:50:32<12:02:53, 5.82s/it] 66%|██████▋ | 14640/22095 [24:50:36<10:34:30, 5.11s/it] {'loss': 0.2701, 'grad_norm': 0.6630811709636194, 'learning_rate': 2.7005802501171037e-06, 'epoch': 0.66} 66%|██████▋ | 14640/22095 [24:50:36<10:34:30, 5.11s/it] 66%|██████▋ | 14641/22095 [24:50:39<9:35:17, 4.63s/it] {'loss': 0.3429, 'grad_norm': 0.6682876965040512, 'learning_rate': 2.6999294563302474e-06, 'epoch': 0.66} 66%|██████▋ | 14641/22095 [24:50:39<9:35:17, 4.63s/it] 66%|██████▋ | 14642/22095 [24:50:42<8:36:41, 4.16s/it] {'loss': 0.3079, 'grad_norm': 0.6592272286289782, 'learning_rate': 2.6992787119648456e-06, 'epoch': 0.66} 66%|██████▋ | 14642/22095 [24:50:42<8:36:41, 4.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 66%|██████▋ | 14643/22095 [24:50:45<7:53:01, 3.81s/it] {'loss': 0.2948, 'grad_norm': 0.6500740029003796, 
'learning_rate': 2.698628017034877e-06, 'epoch': 0.66} 66%|██████▋ | 14643/22095 [24:50:45<7:53:01, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 66%|██████▋ | 14644/22095 [24:50:54<10:54:56, 5.27s/it] {'loss': 0.4846, 'grad_norm': 0.3547010827896515, 'learning_rate': 2.6979773715543234e-06, 'epoch': 0.66} 66%|██████▋ | 14644/22095 [24:50:54<10:54:56, 5.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49797 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49196 > 40960). Running this sequence through the model will result in indexing errors 66%|██████▋ | 14645/22095 [24:50:59<10:37:09, 5.13s/it] {'loss': 0.3246, 'grad_norm': 0.8248138219132292, 'learning_rate': 2.697326775537167e-06, 'epoch': 0.66} 66%|██████▋ | 14645/22095 [24:50:59<10:37:09, 5.13s/it] 66%|██████▋ | 14646/22095 [24:51:03<9:51:53, 4.77s/it] {'loss': 0.2961, 'grad_norm': 0.6486771166763444, 'learning_rate': 2.696676228997385e-06, 'epoch': 0.66} 66%|██████▋ | 14646/22095 [24:51:03<9:51:53, 4.77s/it] 66%|██████▋ | 14647/22095 [24:51:06<9:05:58, 4.40s/it] {'loss': 0.2985, 'grad_norm': 0.5916296106868336, 'learning_rate': 2.696025731948958e-06, 'epoch': 0.66} 66%|██████▋ | 14647/22095 [24:51:06<9:05:58, 4.40s/it] 66%|██████▋ | 14648/22095 [24:51:10<8:48:14, 4.26s/it] {'loss': 0.2885, 'grad_norm': 0.5843377254274962, 'learning_rate': 2.69537528440586e-06, 'epoch': 0.66} 66%|██████▋ | 14648/22095 [24:51:10<8:48:14, 4.26s/it] 66%|██████▋ | 14649/22095 [24:51:14<8:41:41, 4.20s/it] {'loss': 0.3126, 'grad_norm': 0.6058458147622786, 'learning_rate': 2.6947248863820712e-06, 'epoch': 0.66} 66%|██████▋ | 14649/22095 [24:51:14<8:41:41, 4.20s/it] 66%|██████▋ | 14650/22095 [24:51:18<8:09:37, 3.95s/it] {'loss': 0.324, 'grad_norm': 0.7042748783059469, 'learning_rate': 2.6940745378915623e-06, 'epoch': 
0.66} 66%|██████▋ | 14650/22095 [24:51:18<8:09:37, 3.95s/it] 66%|██████▋ | 14651/22095 [24:51:22<8:19:46, 4.03s/it] {'loss': 0.3391, 'grad_norm': 0.7634106275562195, 'learning_rate': 2.6934242389483118e-06, 'epoch': 0.66} 66%|██████▋ | 14651/22095 [24:51:22<8:19:46, 4.03s/it] 66%|██████▋ | 14652/22095 [24:51:25<7:39:51, 3.71s/it] {'loss': 0.3108, 'grad_norm': 0.6186184423505748, 'learning_rate': 2.6927739895662897e-06, 'epoch': 0.66} 66%|██████▋ | 14652/22095 [24:51:25<7:39:51, 3.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 66%|██████▋ | 14653/22095 [24:51:35<11:41:19, 5.65s/it] {'loss': 0.4476, 'grad_norm': 0.28352438648346373, 'learning_rate': 2.692123789759467e-06, 'epoch': 0.66} 66%|██████▋ | 14653/22095 [24:51:35<11:41:19, 5.65s/it] 66%|██████▋ | 14654/22095 [24:51:39<10:45:17, 5.20s/it] {'loss': 0.3498, 'grad_norm': 0.6628832546732235, 'learning_rate': 2.6914736395418162e-06, 'epoch': 0.66} 66%|██████▋ | 14654/22095 [24:51:39<10:45:17, 5.20s/it] 66%|██████▋ | 14655/22095 [24:51:43<9:47:46, 4.74s/it] {'loss': 0.2811, 'grad_norm': 0.6349727134291555, 'learning_rate': 2.6908235389273086e-06, 'epoch': 0.66} 66%|██████▋ | 14655/22095 [24:51:43<9:47:46, 4.74s/it] 66%|██████▋ | 14656/22095 [24:51:47<9:39:52, 4.68s/it] {'loss': 0.3177, 'grad_norm': 0.5571570233509476, 'learning_rate': 2.69017348792991e-06, 'epoch': 0.66} 66%|██████▋ | 14656/22095 [24:51:47<9:39:52, 4.68s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 66%|██████▋ | 14657/22095 [24:51:50<8:36:16, 4.16s/it] {'loss': 0.33, 'grad_norm': 0.587265023201979, 'learning_rate': 2.6895234865635883e-06, 'epoch': 0.66} 66%|██████▋ | 14657/22095 [24:51:50<8:36:16, 4.16s/it] 66%|██████▋ | 14658/22095 [24:51:55<8:44:55, 4.23s/it] {'loss': 0.3366, 'grad_norm': 0.6182820774276857, 'learning_rate': 2.688873534842312e-06, 'epoch': 0.66} 66%|██████▋ | 14658/22095 [24:51:55<8:44:55, 4.23s/it] 66%|██████▋ | 14659/22095 
[24:51:58<7:57:05, 3.85s/it] {'loss': 0.2685, 'grad_norm': 0.7652304431041801, 'learning_rate': 2.688223632780044e-06, 'epoch': 0.66} 66%|██████▋ | 14659/22095 [24:51:58<7:57:05, 3.85s/it] 66%|██████▋ | 14660/22095 [24:52:02<8:05:44, 3.92s/it] {'loss': 0.2825, 'grad_norm': 0.6361782325815011, 'learning_rate': 2.687573780390752e-06, 'epoch': 0.66} 66%|██████▋ | 14660/22095 [24:52:02<8:05:44, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86666 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66033 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74338 > 40960). Running this sequence through the model will result in indexing errors 66%|██████▋ | 14661/22095 [24:52:05<7:27:44, 3.61s/it] {'loss': 0.2845, 'grad_norm': 0.5523403925336813, 'learning_rate': 2.686923977688397e-06, 'epoch': 0.66} 66%|██████▋ | 14661/22095 [24:52:05<7:27:44, 3.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [95, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8504981 in VC:s3://internvl-moe-sft-data/. Exception: Image size [95, 23, 100, 100] is too small. Minimum size is 28. 
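The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` retries above all come from samples whose logged `image_wh` falls below a 28-pixel floor. A minimal, hypothetical pre-filter sketch (the helper name, the sample dict shape, and the 28-px constant are taken from the log output, not from the actual qwen-vl-finetune code):

```python
# Hypothetical dataset pre-filter: drop samples whose recorded image
# width/height is below the 28-px minimum reported in the log.
MIN_IMAGE_SIDE = 28  # "Minimum size is 28." in the ValueError messages

def has_valid_images(sample, min_side=MIN_IMAGE_SIDE):
    """True if every (width, height) pair in 'image_wh' meets the minimum."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

# Shapes mirror the "Problematic sample" dicts dumped in the log.
samples = [
    {"id": 60955, "image_wh": [[210, 22]]},   # height 22 < 28 -> dropped
    {"id": 124981, "image_wh": [[95, 23]]},   # height 23 < 28 -> dropped
    {"id": 1, "image_wh": [[640, 480]]},      # kept
]
kept = [s for s in samples if has_valid_images(s)]  # -> only id 1 survives
```

Screening once at dataset-build time would avoid the per-step `[Try #0] Failed to fetch sample ...` retry loop visible throughout this log.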
Problematic sample: {'id': 124981, 'image': 'vrdu_texteq/astro-ph.CO/be4638f8-cf23-402f-852a-16fbead46d68.png', 'image_wh': [[95, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'If $k=l$:'}]}
66%|██████▋ | 14662/22095 [24:52:08<7:01:25, 3.40s/it] {'loss': 0.2967, 'grad_norm': 0.5916077285971831, 'learning_rate': 2.68627422468694e-06, 'epoch': 0.66}
66%|██████▋ | 14663/22095 [24:52:11<6:59:23, 3.39s/it] {'loss': 0.3592, 'grad_norm': 0.6534276462669631, 'learning_rate': 2.685624521400344e-06, 'epoch': 0.66}
66%|██████▋ | 14664/22095 [24:52:15<7:11:33, 3.48s/it] {'loss': 0.3006, 'grad_norm': 0.5755201570950208, 'learning_rate': 2.68497486784257e-06, 'epoch': 0.66}
66%|██████▋ | 14665/22095 [24:52:18<7:10:34, 3.48s/it] {'loss': 0.3319, 'grad_norm': 0.6065498661857095, 'learning_rate': 2.684325264027577e-06, 'epoch': 0.66}
66%|██████▋ | 14666/22095 [24:52:22<7:28:39, 3.62s/it] {'loss': 0.2887, 'grad_norm': 0.7403589657336263, 'learning_rate': 2.68367570996932e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▋ | 14667/22095 [24:52:26<7:39:31, 3.71s/it] {'loss': 0.2969, 'grad_norm': 0.9069088530531536, 'learning_rate': 2.6830262056817574e-06, 'epoch': 0.66}
Invalidate trace cache @ step 2: expected module 1, but got module 364
66%|██████▋ | 14668/22095 [24:52:34<10:03:56, 4.88s/it] {'loss': 0.4687, 'grad_norm': 0.37938075410330124, 'learning_rate': 2.68237675117885e-06, 'epoch': 0.66}
66%|██████▋ | 14669/22095 [24:52:44<13:43:14, 6.65s/it] {'loss': 0.4758, 'grad_norm': 0.36833643317394965, 'learning_rate': 2.6817273464745443e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047656 in VC:s3://multi-modal/UniGeo/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Invalidate trace cache @ step 2: expected module 364, but got module 1
66%|██████▋ | 14670/22095 [24:52:49<12:27:19, 6.04s/it] {'loss': 0.3122, 'grad_norm': 0.6321562537827897, 'learning_rate': 2.681077991582797e-06, 'epoch': 0.66}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
66%|██████▋ | 14671/22095 [24:52:54<11:39:59, 5.66s/it] {'loss': 0.3505, 'grad_norm': 0.6201058925027416, 'learning_rate': 2.6804286865175645e-06, 'epoch': 0.66}
66%|██████▋ | 14672/22095 [24:52:58<10:35:22, 5.14s/it] {'loss': 0.3678, 'grad_norm': 0.7259966491374608, 'learning_rate': 2.679779431292795e-06, 'epoch': 0.66}
66%|██████▋ | 14673/22095 [24:53:01<9:22:09, 4.54s/it] {'loss': 0.285, 'grad_norm': 0.5941258634548173, 'learning_rate': 2.6791302259224385e-06, 'epoch': 0.66}
66%|██████▋ | 14674/22095 [24:53:04<8:17:25, 4.02s/it] {'loss': 0.3586, 'grad_norm': 0.6272193636667374, 'learning_rate': 2.678481070420446e-06, 'epoch': 0.66}
66%|██████▋ | 14675/22095 [24:53:07<7:54:55, 3.84s/it] {'loss': 0.3205, 'grad_norm': 0.5736449544025902, 'learning_rate': 2.6778319648007645e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8497074 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39979, 'image': 'vrdu_texteq/astro-ph.CO/0e4b188f-ac1d-4bd3-84ba-2fc76c4b97bb.png', 'image_wh': [[350, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'Redshift slice $0.55 < z < 0.7$:'}]}
66%|██████▋ | 14676/22095 [24:53:12<8:22:25, 4.06s/it] {'loss': 0.3317, 'grad_norm': 0.7449192943128308, 'learning_rate': 2.677182909077343e-06, 'epoch': 0.66}
66%|██████▋ | 14677/22095 [24:53:15<8:08:49, 3.95s/it] {'loss': 0.295, 'grad_norm': 0.6651381927164225, 'learning_rate': 2.6765339032641256e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387777 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54589, 'image': 'vrdu_table_final_2/astro-ph.CO/eec36f8b-c719-4d68-bc52-9e53cf8f49ac.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8359933 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26654, 'image': 'vrdu_table_final_2/astro-ph.CO/8dc7ba16-29b2-4539-a000-282815c58b41.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
66%|██████▋ | 14678/22095 [24:53:19<7:37:22, 3.70s/it] {'loss': 0.3309, 'grad_norm': 0.6301906237222805, 'learning_rate': 2.6758849473750605e-06, 'epoch': 0.66}
66%|██████▋ | 14679/22095 [24:53:22<7:23:38, 3.59s/it] {'loss': 0.3705, 'grad_norm': 0.685974008499579, 'learning_rate': 2.6752360414240874e-06, 'epoch': 0.66}
66%|██████▋ | 14680/22095 [24:53:26<7:30:05, 3.64s/it] {'loss': 0.3604, 'grad_norm': 0.7277368945564235, 'learning_rate': 2.674587185425155e-06, 'epoch': 0.66}
66%|██████▋ | 14681/22095 [24:53:29<7:35:19, 3.68s/it] {'loss': 0.336, 'grad_norm': 0.668819004807301, 'learning_rate': 2.6739383793922007e-06, 'epoch': 0.66}
66%|██████▋ | 14682/22095 [24:53:35<8:50:38, 4.29s/it] {'loss': 0.3389, 'grad_norm': 0.7308638274786853, 'learning_rate': 2.673289623339165e-06, 'epoch': 0.66}
66%|██████▋ | 14683/22095 [24:53:39<8:24:49, 4.09s/it] {'loss': 0.3391, 'grad_norm': 0.5964153150842396, 'learning_rate': 2.67264091727999e-06, 'epoch': 0.66}
66%|██████▋ | 14684/22095 [24:53:43<8:30:24, 4.13s/it] {'loss': 0.3304, 'grad_norm': 0.6417431207623295, 'learning_rate': 2.6719922612286152e-06, 'epoch': 0.66}
66%|██████▋ | 14685/22095 [24:53:47<8:41:27, 4.22s/it] {'loss': 0.2997, 'grad_norm': 0.5823163333153432, 'learning_rate': 2.6713436551989767e-06, 'epoch': 0.66}
66%|██████▋ | 14686/22095 [24:53:50<7:55:17, 3.85s/it] {'loss': 0.273, 'grad_norm': 0.6141979768525521, 'learning_rate': 2.6706950992050097e-06, 'epoch': 0.66}
66%|██████▋ | 14687/22095 [24:53:53<7:18:49, 3.55s/it] {'loss': 0.2546, 'grad_norm': 0.6255508535122599, 'learning_rate': 2.670046593260652e-06, 'epoch': 0.66}
66%|██████▋ | 14688/22095 [24:53:56<6:50:19, 3.32s/it] {'loss': 0.2532, 'grad_norm': 0.5823902037514233, 'learning_rate': 2.669398137379837e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308038 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2Ibf9aRLN8KJjSZPhXXc.spXa_!!387492340.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text is hidden in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n17*30到70*90\n韧性强\n粘度高\n全尺寸现货\n不爆边\n多达58中规格\n一捆包邮\n支持订做任意尺寸'}]}
66%|██████▋ | 14689/22095 [24:54:00<7:07:57, 3.47s/it] {'loss': 0.2779, 'grad_norm': 0.6167108534057749, 'learning_rate': 2.6687497315764987e-06, 'epoch': 0.66}
66%|██████▋ | 14690/22095 [24:54:03<7:03:42, 3.43s/it] {'loss': 0.2896, 'grad_norm': 0.7864013107862138, 'learning_rate': 2.668101375864567e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396958 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63811, 'image': 'vrdu_table_final_2/astro-ph.EP/6916172e-b69a-40a6-8ea0-4473ad57f91f.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$e_z$\\end{tabular}\n```"}]}
66%|██████▋ | 14691/22095 [24:54:06<6:35:20, 3.20s/it] {'loss': 0.3622, 'grad_norm': 0.7672884746572013, 'learning_rate': 2.667453070257977e-06, 'epoch': 0.66}
66%|██████▋ | 14692/22095 [24:54:09<6:47:19, 3.30s/it] {'loss': 0.3145, 'grad_norm': 0.6104806892085198, 'learning_rate': 2.666804814770654e-06, 'epoch': 0.66}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8379499 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
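Alongside the small-image failures, the log repeatedly warns `Token indices sequence length is longer than the specified maximum sequence length for this model (NNNNN > 40960)`. A minimal, hypothetical guard sketch (the function name and the plain-list truncation are assumptions; only the 40960 limit and the warning text come from the log):

```python
# Hypothetical length guard for tokenized samples. The tokenizer's warning in
# the log is advisory only; actually feeding >40960 ids would index past the
# model's position range, so over-length sequences must be truncated or dropped.
MAX_LEN = 40960  # the "> 40960" limit reported in the warnings

def fit_to_max_len(input_ids, max_len=MAX_LEN):
    """Return (ids, was_truncated), clipping the id list to max_len tokens."""
    if len(input_ids) <= max_len:
        return input_ids, False
    return input_ids[:max_len], True

# e.g. a 78883-token sample like the one flagged at step 14611
ids, truncated = fit_to_max_len(list(range(78883)))
```

For multimodal SFT data, dropping the sample is often safer than truncating, since clipping mid-conversation can strip the assistant turn the loss is computed on.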
Problematic sample: {'id': 46284, 'image': 'vrdu_table_final_2/astro-ph.CO/cb12238c-7155-4fcd-bec6-018000f8fcbc.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{#1}#2\\end{tabular}\n```'}]}
66%|██████▋ | 14693/22095 [24:54:32<18:35:14, 9.04s/it] {'loss': 0.3222, 'grad_norm': 0.6513842948804288, 'learning_rate': 2.6661566094165327e-06, 'epoch': 0.66}
67%|██████▋ | 14694/22095 [24:54:36<15:19:49, 7.46s/it] {'loss': 0.3105, 'grad_norm': 0.6239399644369519, 'learning_rate': 2.665508454209538e-06, 'epoch': 0.67}
67%|██████▋ | 14695/22095 [24:54:40<13:11:54, 6.42s/it] {'loss': 0.3106, 'grad_norm': 0.6439218172036969, 'learning_rate': 2.664860349163594e-06, 'epoch': 0.67}
67%|██████▋ | 14696/22095 [24:54:44<11:41:44, 5.69s/it] {'loss': 0.3386, 'grad_norm': 0.6535200562376338, 'learning_rate': 2.6642122942926297e-06, 'epoch': 0.67}
67%|██████▋ | 14697/22095 [24:55:06<21:52:15, 10.64s/it] {'loss': 0.2991, 'grad_norm': 0.6673961251279874, 'learning_rate': 2.663564289610573e-06, 'epoch': 0.67}
67%|██████▋ | 14698/22095 [24:55:09<17:12:17, 8.37s/it] {'loss': 0.3338, 'grad_norm': 0.6856669351238862, 'learning_rate': 2.66291633513134e-06, 'epoch': 0.67}
Token indices sequence length is longer than the specified maximum sequence length for this model (61503 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61231 > 40960). Running this sequence through the model will result in indexing errors
67%|██████▋ | 14699/22095 [24:55:12<14:09:04, 6.89s/it] {'loss': 0.292, 'grad_norm': 0.6393130628186594, 'learning_rate': 2.6622684308688575e-06, 'epoch': 0.67}
67%|██████▋ | 14700/22095 [24:55:16<12:06:45, 5.90s/it] {'loss': 0.3133, 'grad_norm': 0.6061214167728383, 'learning_rate': 2.6616205768370483e-06, 'epoch': 0.67}
67%|██████▋ | 14701/22095 [24:55:20<10:48:46, 5.26s/it] {'loss': 0.3005, 'grad_norm': 0.6467006444397589, 'learning_rate': 2.660972773049831e-06, 'epoch': 0.67}
67%|██████▋ | 14702/22095 [24:55:23<9:29:44, 4.62s/it] {'loss': 0.3107, 'grad_norm': 0.6369249651369691, 'learning_rate': 2.6603250195211235e-06, 'epoch': 0.67}
Token indices sequence length is longer than the specified maximum sequence length for this model (57395 > 40960). Running this sequence through the model will result in indexing errors
67%|██████▋ | 14703/22095 [24:55:26<8:56:15, 4.35s/it] {'loss': 0.3086, 'grad_norm': 0.70653484849908, 'learning_rate': 2.659677316264847e-06, 'epoch': 0.67}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
67%|██████▋ | 14704/22095 [24:55:30<8:14:55, 4.02s/it] {'loss': 0.2752, 'grad_norm': 0.6565328879057835, 'learning_rate': 2.6590296632949157e-06, 'epoch': 0.67}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41318 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92484 > 40960). Running this sequence through the model will result in indexing errors
67%|██████▋ | 14705/22095 [24:55:38<10:40:01, 5.20s/it] {'loss': 0.4851, 'grad_norm': 0.35181142116798614, 'learning_rate': 2.658382060625249e-06, 'epoch': 0.67}
67%|██████▋ | 14706/22095 [24:55:41<9:34:36, 4.67s/it] {'loss': 0.3556, 'grad_norm': 0.630447726301422, 'learning_rate': 2.657734508269758e-06, 'epoch': 0.67}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884875 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8028, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 10\nB. 8\nC. 7\nD.
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 67%|██████▋ | 14707/22095 [24:55:51<12:43:24, 6.20s/it] {'loss': 0.4654, 'grad_norm': 0.3048831867135513, 'learning_rate': 2.6570870062423616e-06, 'epoch': 0.67} 67%|██████▋ | 14707/22095 [24:55:51<12:43:24, 6.20s/it] 67%|██████▋ | 14708/22095 [24:55:55<11:21:04, 5.53s/it] {'loss': 0.3463, 'grad_norm': 0.7057071837671293, 'learning_rate': 2.6564395545569667e-06, 'epoch': 0.67} 67%|██████▋ | 14708/22095 [24:55:55<11:21:04, 5.53s/it] 67%|██████▋ | 14709/22095 [24:55:58<9:57:17, 4.85s/it] {'loss': 0.3569, 'grad_norm': 0.5558048212112113, 'learning_rate': 2.65579215322749e-06, 'epoch': 0.67} 67%|██████▋ | 14709/22095 [24:55:58<9:57:17, 4.85s/it] 67%|██████▋ | 14710/22095 [24:56:02<9:08:28, 4.46s/it] {'loss': 0.3089, 'grad_norm': 0.6253333082889162, 'learning_rate': 2.6551448022678406e-06, 'epoch': 0.67} 67%|██████▋ | 14710/22095 [24:56:02<9:08:28, 4.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14711/22095 [24:56:09<10:46:44, 5.26s/it] {'loss': 0.461, 'grad_norm': 0.2712330226795423, 'learning_rate': 2.6544975016919263e-06, 'epoch': 0.67} 67%|██████▋ | 14711/22095 [24:56:09<10:46:44, 5.26s/it] 67%|██████▋ | 14712/22095 [24:56:12<9:30:43, 4.64s/it] {'loss': 0.3745, 'grad_norm': 0.6592651233475737, 'learning_rate': 2.653850251513656e-06, 'epoch': 0.67} 67%|██████▋ | 14712/22095 [24:56:12<9:30:43, 4.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14713/22095 [24:56:40<24:10:16, 11.79s/it] {'loss': 0.4944, 'grad_norm': 0.28287393808975175, 'learning_rate': 2.6532030517469408e-06, 'epoch': 0.67} 67%|██████▋ | 14713/22095 [24:56:40<24:10:16, 11.79s/it] 67%|██████▋ | 14714/22095 [24:56:44<18:50:57, 9.19s/it] {'loss': 0.3081, 'grad_norm': 0.655168003309896, 'learning_rate': 
2.652555902405684e-06, 'epoch': 0.67} 67%|██████▋ | 14714/22095 [24:56:44<18:50:57, 9.19s/it] 67%|██████▋ | 14715/22095 [24:56:46<14:58:38, 7.31s/it] {'loss': 0.3165, 'grad_norm': 0.6746233358903633, 'learning_rate': 2.651908803503789e-06, 'epoch': 0.67} 67%|██████▋ | 14715/22095 [24:56:46<14:58:38, 7.31s/it] 67%|██████▋ | 14716/22095 [24:56:50<12:55:15, 6.30s/it] {'loss': 0.2922, 'grad_norm': 0.6332929557042597, 'learning_rate': 2.651261755055165e-06, 'epoch': 0.67} 67%|██████▋ | 14716/22095 [24:56:50<12:55:15, 6.30s/it] 67%|██████▋ | 14717/22095 [24:57:13<22:37:54, 11.04s/it] {'loss': 0.3304, 'grad_norm': 0.6089090989543154, 'learning_rate': 2.6506147570737094e-06, 'epoch': 0.67} 67%|██████▋ | 14717/22095 [24:57:13<22:37:54, 11.04s/it] 67%|██████▋ | 14718/22095 [24:57:15<17:36:43, 8.59s/it] {'loss': 0.3131, 'grad_norm': 0.6286355370023886, 'learning_rate': 2.64996780957333e-06, 'epoch': 0.67} 67%|██████▋ | 14718/22095 [24:57:15<17:36:43, 8.59s/it] 67%|██████▋ | 14719/22095 [24:57:19<14:34:50, 7.12s/it] {'loss': 0.3271, 'grad_norm': 0.7411944904056007, 'learning_rate': 2.649320912567922e-06, 'epoch': 0.67} 67%|██████▋ | 14719/22095 [24:57:19<14:34:50, 7.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14720/22095 [24:57:40<23:12:00, 11.32s/it] {'loss': 0.3336, 'grad_norm': 0.6004250448542339, 'learning_rate': 2.6486740660713904e-06, 'epoch': 0.67} 67%|██████▋ | 14720/22095 [24:57:40<23:12:00, 11.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45066 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76295 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14721/22095 [24:57:44<18:29:14, 9.03s/it] {'loss': 0.3026, 'grad_norm': 0.6536229703714422, 'learning_rate': 2.64802727009763e-06, 'epoch': 0.67} 67%|██████▋ | 14721/22095 [24:57:44<18:29:14, 9.03s/it] 67%|██████▋ | 14722/22095 [24:57:47<14:55:44, 7.29s/it] {'loss': 0.3047, 'grad_norm': 0.6600307770153987, 'learning_rate': 2.6473805246605416e-06, 'epoch': 0.67} 67%|██████▋ | 14722/22095 [24:57:47<14:55:44, 7.29s/it] 67%|██████▋ | 14723/22095 [24:57:51<12:53:11, 6.29s/it] {'loss': 0.3154, 'grad_norm': 0.6713346700277442, 'learning_rate': 2.64673382977402e-06, 'epoch': 0.67} 67%|██████▋ | 14723/22095 [24:57:51<12:53:11, 6.29s/it] 67%|██████▋ | 14724/22095 [24:57:55<11:15:59, 5.50s/it] {'loss': 0.2783, 'grad_norm': 0.6903849972604962, 'learning_rate': 2.6460871854519594e-06, 'epoch': 0.67} 67%|██████▋ | 14724/22095 [24:57:55<11:15:59, 5.50s/it] 67%|██████▋ | 14725/22095 [24:58:16<21:14:30, 10.38s/it] {'loss': 0.3169, 'grad_norm': 0.581681796442678, 'learning_rate': 2.6454405917082556e-06, 'epoch': 0.67} 67%|██████▋ | 14725/22095 [24:58:17<21:14:30, 10.38s/it] 67%|██████▋ | 14726/22095 [24:58:38<28:13:54, 13.79s/it] {'loss': 0.2909, 'grad_norm': 0.6353521186320087, 'learning_rate': 2.6447940485568057e-06, 'epoch': 0.67} 67%|██████▋ | 14726/22095 [24:58:38<28:13:54, 13.79s/it] 67%|██████▋ | 14727/22095 [24:59:19<44:38:21, 21.81s/it] {'loss': 0.3011, 'grad_norm': 0.6398498279203021, 'learning_rate': 2.6441475560114938e-06, 'epoch': 0.67} 67%|██████▋ | 14727/22095 [24:59:19<44:38:21, 21.81s/it] 67%|██████▋ | 14728/22095 [24:59:58<55:17:48, 27.02s/it] {'loss': 0.3179, 'grad_norm': 0.6425698529167011, 'learning_rate': 2.6435011140862167e-06, 'epoch': 0.67} 67%|██████▋ | 14728/22095 [24:59:58<55:17:48, 27.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14729/22095 [25:00:26<56:13:06, 27.48s/it] {'loss': 0.4648, 'grad_norm': 0.3115862301345666, 
'learning_rate': 2.642854722794864e-06, 'epoch': 0.67} 67%|██████▋ | 14729/22095 [25:00:27<56:13:06, 27.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81903 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84538 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14730/22095 [25:00:48<52:36:34, 25.72s/it] {'loss': 0.2429, 'grad_norm': 0.5943414758115273, 'learning_rate': 2.6422083821513246e-06, 'epoch': 0.67} 67%|██████▋ | 14730/22095 [25:00:48<52:36:34, 25.72s/it] 67%|██████▋ | 14731/22095 [25:02:06<84:20:32, 41.23s/it] {'loss': 0.3025, 'grad_norm': 0.5934254282346557, 'learning_rate': 2.6415620921694836e-06, 'epoch': 0.67} 67%|██████▋ | 14731/22095 [25:02:06<84:20:32, 41.23s/it] 67%|██████▋ | 14732/22095 [25:02:09<61:10:54, 29.91s/it] {'loss': 0.2915, 'grad_norm': 0.5653671869192441, 'learning_rate': 2.6409158528632315e-06, 'epoch': 0.67} 67%|██████▋ | 14732/22095 [25:02:09<61:10:54, 29.91s/it] 67%|██████▋ | 14733/22095 [25:03:10<80:06:51, 39.18s/it] {'loss': 0.3329, 'grad_norm': 0.7070394724449415, 'learning_rate': 2.640269664246451e-06, 'epoch': 0.67} 67%|██████▋ | 14733/22095 [25:03:10<80:06:51, 39.18s/it] 67%|██████▋ | 14734/22095 [25:03:50<80:39:33, 39.45s/it] {'loss': 0.3314, 'grad_norm': 0.6389565087574908, 'learning_rate': 2.6396235263330293e-06, 'epoch': 0.67} 67%|██████▋ | 14734/22095 [25:03:50<80:39:33, 39.45s/it] 67%|██████▋ | 14735/22095 [25:04:11<69:38:58, 34.07s/it] {'loss': 0.3354, 'grad_norm': 0.6245585495817344, 'learning_rate': 2.638977439136847e-06, 'epoch': 0.67} 67%|██████▋ | 14735/22095 [25:04:11<69:38:58, 34.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49898 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54912 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56179 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79332 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53733 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109119 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57476 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144700 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14736/22095 [25:04:38<65:05:53, 31.85s/it] {'loss': 0.4906, 'grad_norm': 0.28605449137789823, 'learning_rate': 2.6383314026717903e-06, 'epoch': 0.67} 67%|██████▋ | 14736/22095 [25:04:38<65:05:53, 31.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53838 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14737/22095 [25:05:00<58:53:50, 28.82s/it] {'loss': 0.2988, 'grad_norm': 0.6119075839120809, 'learning_rate': 2.637685416951736e-06, 'epoch': 0.67} 67%|██████▋ | 14737/22095 [25:05:00<58:53:50, 28.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8306514 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
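The repeated `ValueError: Image size ... is too small. Minimum size is 28` failures above come from samples whose recorded `image_wh` falls below the loader's 28-pixel minimum. A minimal pre-filter over the annotation list could drop such samples before training starts; this is a sketch only — the `image_wh` field layout is taken from the "Problematic sample" dumps in this log, `MIN_IMAGE_SIZE = 28` from the error message, and the `id` 999 sample is a hypothetical valid example for illustration:

```python
MIN_IMAGE_SIZE = 28  # minimum side length enforced by the data loader (from the ValueError above)

def is_valid_sample(sample: dict) -> bool:
    """Return False for samples whose recorded image size would trip the
    'Image size ... is too small' ValueError seen in this log."""
    for width, height in sample.get("image_wh", []):
        if width < MIN_IMAGE_SIZE or height < MIN_IMAGE_SIZE:
            return False
    return True

# The two geoqa+ samples below mirror dumps from this log ([[164, 26]] and
# [[371, 27]] both have a side < 28); the id-999 sample is hypothetical.
samples = [
    {"id": 8028, "image": "images/4915.png", "image_wh": [[164, 26]]},
    {"id": 16659, "image": "images/5024.png", "image_wh": [[371, 27]]},
    {"id": 999, "image": "images/ok.png", "image_wh": [[448, 448]]},
]
kept = [s for s in samples if is_valid_sample(s)]  # only the id-999 sample survives
```

Running such a filter once over the dataset manifest would avoid the repeated fetch-retry cycles (`[Try #0] Failed to fetch sample ...`) visible in the log.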
Problematic sample: {'image': 'TB1sgFALXXXXXajXVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nI require the text from this image transcribed.'}, {'from': 'gpt', 'value': 'All words in the image:\n原质海信电视机高压包\n®\nCQC\nCDE\nC\nUS\nBSC\n25-0201H/ROH\n20080323\n10\n天乐电子配件\n现货\nBSC25-0201H'}]} VC:s3://gui/aguvis/aguvis-stage1/widget_captioning/images/41454.jpg 2025-08-28 17:02:58.610558 load time: 1042.08 ms 67%|██████▋ | 14738/22095 [25:05:09<46:40:49, 22.84s/it] {'loss': 0.4716, 'grad_norm': 0.3082055751433085, 'learning_rate': 2.6370394819905698e-06, 'epoch': 0.67} 67%|██████▋ | 14738/22095 [25:05:09<46:40:49, 22.84s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14739/22095 [25:05:13<35:17:14, 17.27s/it] {'loss': 0.3388, 'grad_norm': 0.6307907284535162, 'learning_rate': 2.636393597802167e-06, 'epoch': 0.67} 67%|██████▋ | 14739/22095 [25:05:13<35:17:14, 17.27s/it] 67%|██████▋ | 14740/22095 [25:05:35<38:20:32, 18.77s/it] {'loss': 0.2721, 'grad_norm': 0.6078912974944578, 'learning_rate': 2.635747764400405e-06, 'epoch': 0.67} 67%|██████▋ | 14740/22095 [25:05:35<38:20:32, 18.77s/it] 67%|██████▋ | 14741/22095 [25:05:58<41:00:03, 20.07s/it] {'loss': 0.3125, 'grad_norm': 0.5925676623298787, 'learning_rate': 2.635101981799162e-06, 'epoch': 0.67} 67%|██████▋ | 14741/22095 [25:05:58<41:00:03, 20.07s/it] 67%|██████▋ | 14742/22095 [25:07:00<66:09:09, 32.39s/it] {'loss': 0.2931, 'grad_norm': 0.5533639263255467, 'learning_rate': 2.634456250012316e-06, 'epoch': 0.67} 67%|██████▋ | 14742/22095 [25:07:00<66:09:09, 32.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42258 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14743/22095 [25:07:27<62:53:48, 30.80s/it] {'loss': 0.4449, 'grad_norm': 0.29181393082657847, 'learning_rate': 2.6338105690537402e-06, 'epoch': 0.67} 67%|██████▋ | 14743/22095 [25:07:27<62:53:48, 30.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14744/22095 [25:08:11<70:57:32, 34.75s/it] {'loss': 0.512, 'grad_norm': 0.3115948826921394, 'learning_rate': 2.633164938937306e-06, 'epoch': 0.67} 67%|██████▋ | 14744/22095 [25:08:11<70:57:32, 34.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43719 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51303 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48755 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14745/22095 [25:08:57<78:17:54, 38.35s/it] {'loss': 0.4772, 'grad_norm': 0.28064957458736633, 'learning_rate': 2.6325193596768905e-06, 'epoch': 0.67} 67%|██████▋ | 14745/22095 [25:08:57<78:17:54, 38.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8893506 in VC:s3://multi-modal/playground/data/geoqa+/. 
Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16659, 'image': 'images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 12cm\nB. 2cm\nC. 4cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14746/22095 [25:09:39<80:17:29, 39.33s/it] {'loss': 0.2879, 'grad_norm': 0.7221727407308424, 'learning_rate': 2.63187383128636e-06, 'epoch': 0.67} 67%|██████▋ | 14746/22095 [25:09:39<80:17:29, 39.33s/it] 67%|██████▋ | 14747/22095 [25:10:01<69:29:55, 34.05s/it] {'loss': 0.2931, 'grad_norm': 0.6131893416410314, 'learning_rate': 2.6312283537795902e-06, 'epoch': 0.67} 67%|██████▋ | 14747/22095 [25:10:01<69:29:55, 34.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14748/22095 [25:10:04<50:49:10, 24.90s/it] {'loss': 0.3419, 'grad_norm': 0.6204251169985142, 'learning_rate': 2.630582927170446e-06, 'epoch': 0.67} 67%|██████▋ | 14748/22095 [25:10:04<50:49:10, 24.90s/it] 67%|██████▋ | 14749/22095 [25:11:03<71:17:46, 34.94s/it] {'loss': 0.3098, 'grad_norm': 0.6119163871445997, 'learning_rate': 2.6299375514727998e-06, 'epoch': 0.67} 67%|██████▋ | 14749/22095 [25:11:03<71:17:46, 34.94s/it] 67%|██████▋ | 14750/22095 [25:11:24<63:03:26, 30.91s/it] {'loss': 0.3163, 'grad_norm': 0.5971973040823447, 'learning_rate': 2.629292226700514e-06, 'epoch': 0.67} 67%|██████▋ | 14750/22095 [25:11:24<63:03:26, 30.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14751/22095 [25:11:34<49:56:49, 24.48s/it] {'loss': 0.4745, 'grad_norm': 0.3046693253865641, 'learning_rate': 2.6286469528674598e-06, 'epoch': 0.67} 67%|██████▋ | 14751/22095 [25:11:34<49:56:49, 24.48s/it] 67%|██████▋ | 
14752/22095 [25:11:37<36:52:10, 18.08s/it] {'loss': 0.2506, 'grad_norm': 0.5983885615746021, 'learning_rate': 2.6280017299874984e-06, 'epoch': 0.67} 67%|██████▋ | 14752/22095 [25:11:37<36:52:10, 18.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14753/22095 [25:11:45<30:41:07, 15.05s/it] {'loss': 0.4827, 'grad_norm': 0.35135599929684164, 'learning_rate': 2.6273565580744942e-06, 'epoch': 0.67} 67%|██████▋ | 14753/22095 [25:11:45<30:41:07, 15.05s/it] 67%|██████▋ | 14754/22095 [25:11:48<23:26:33, 11.50s/it] {'loss': 0.2888, 'grad_norm': 0.6258238245008194, 'learning_rate': 2.6267114371423097e-06, 'epoch': 0.67} 67%|██████▋ | 14754/22095 [25:11:48<23:26:33, 11.50s/it] 67%|██████▋ | 14755/22095 [25:12:50<54:20:15, 26.65s/it] {'loss': 0.2864, 'grad_norm': 0.6635996374843116, 'learning_rate': 2.6260663672048094e-06, 'epoch': 0.67} 67%|██████▋ | 14755/22095 [25:12:50<54:20:15, 26.65s/it] 67%|██████▋ | 14756/22095 [25:13:32<63:58:38, 31.38s/it] {'loss': 0.2592, 'grad_norm': 0.6056793405065296, 'learning_rate': 2.6254213482758518e-06, 'epoch': 0.67} 67%|██████▋ | 14756/22095 [25:13:32<63:58:38, 31.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69643 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103074 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14757/22095 [25:13:53<57:36:25, 28.26s/it] {'loss': 0.2924, 'grad_norm': 0.6187894917066431, 'learning_rate': 2.624776380369295e-06, 'epoch': 0.67} 67%|██████▋ | 14757/22095 [25:13:53<57:36:25, 28.26s/it]VC:s3://gui-agent/agentnet/win_mac_images/b223ec83-209c-4fcc-953a-89a95ca2ea53.png 2025-08-28 17:11:52.080109 load time: 1047.72 ms VC:s3://gui-agent/data_20250526/windows/images/spotify/20250515_130610_1/images/before_screenshot_69.png 2025-08-28 17:11:52.079507 load time: 1042.56 ms 67%|██████▋ | 14758/22095 [25:14:15<53:33:50, 26.28s/it] {'loss': 0.3077, 'grad_norm': 0.6193570671741726, 'learning_rate': 2.6241314634990005e-06, 'epoch': 0.67} 67%|██████▋ | 14758/22095 [25:14:15<53:33:50, 26.28s/it] 67%|██████▋ | 14759/22095 [25:14:36<50:13:14, 24.64s/it] {'loss': 0.3027, 'grad_norm': 0.7064378066940625, 'learning_rate': 2.6234865976788236e-06, 'epoch': 0.67} 67%|██████▋ | 14759/22095 [25:14:36<50:13:14, 24.64s/it] 67%|██████▋ | 14760/22095 [25:14:57<48:13:40, 23.67s/it] {'loss': 0.2954, 'grad_norm': 0.7637733039531868, 'learning_rate': 2.6228417829226195e-06, 'epoch': 0.67} 67%|██████▋ | 14760/22095 [25:14:57<48:13:40, 23.67s/it]VC:s3://mm-dataset/ocr_data/TextVQA/train_images/00e942ce767e4d58.jpg 2025-08-28 17:12:55.962771 load time: 1036.22 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240828_145645_before_screenshot_sub0.png 2025-08-28 17:12:55.960890 load time: VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_060545_before_screenshot_sub0.png 2025-08-28 17:12:55.963037 load time: 1047.78 ms VC:s3://gui/aguvis/aguvis-stage2/android_control/images/1137/screenshot_2.png 1043.12 ms 2025-08-28 17:12:55.960833 load time: 1043.62 ms 67%|██████▋ | 14761/22095 [25:15:01<35:54:27, 17.63s/it] {'loss': 0.2501, 'grad_norm': 0.6751147642440454, 'learning_rate': 2.622197019244245e-06, 'epoch': 0.67} 67%|██████▋ | 14761/22095 [25:15:01<35:54:27, 17.63s/it] 67%|██████▋ | 
14762/22095 [25:15:44<51:28:57, 25.27s/it] {'loss': 0.3243, 'grad_norm': 0.622694810711134, 'learning_rate': 2.6215523066575542e-06, 'epoch': 0.67} 67%|██████▋ | 14762/22095 [25:15:44<51:28:57, 25.27s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_771353.png 2025-08-28 17:13:42.607975 load time: 1056.83 ms VC:s3://gui-agent/data_20250609/pc_agent_e/images/screenshot/f31f_33109747_2.png 2025-08-28 17:13:42.608179 load time: 1051.53 ms 67%|██████▋ | 14763/22095 [25:16:26<61:34:21, 30.23s/it] {'loss': 0.313, 'grad_norm': 0.7096323697849127, 'learning_rate': 2.6209076451764004e-06, 'epoch': 0.67} 67%|██████▋ | 14763/22095 [25:16:26<61:34:21, 30.23s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14764/22095 [25:17:04<66:50:19, 32.82s/it] {'loss': 0.3191, 'grad_norm': 0.575045414444866, 'learning_rate': 2.6202630348146323e-06, 'epoch': 0.67} 67%|██████▋ | 14764/22095 [25:17:05<66:50:19, 32.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14765/22095 [25:17:14<52:26:18, 25.75s/it] {'loss': 0.4933, 'grad_norm': 0.3627578595993997, 'learning_rate': 2.6196184755861054e-06, 'epoch': 0.67} 67%|██████▋ | 14765/22095 [25:17:14<52:26:18, 25.75s/it] 67%|██████▋ | 14766/22095 [25:17:35<49:47:48, 24.46s/it] {'loss': 0.3007, 'grad_norm': 0.6621699630985428, 'learning_rate': 2.618973967504664e-06, 'epoch': 0.67} 67%|██████▋ | 14766/22095 [25:17:35<49:47:48, 24.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52397 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14767/22095 [25:18:15<59:10:16, 29.07s/it] {'loss': 0.2904, 'grad_norm': 0.6963598038335564, 'learning_rate': 2.618329510584161e-06, 'epoch': 0.67} 67%|██████▋ | 14767/22095 [25:18:15<59:10:16, 29.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50843 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14768/22095 [25:18:18<43:20:21, 21.29s/it] {'loss': 0.2846, 'grad_norm': 0.6530712366145673, 'learning_rate': 2.617685104838443e-06, 'epoch': 0.67} 67%|██████▋ | 14768/22095 [25:18:18<43:20:21, 21.29s/it] 67%|██████▋ | 14769/22095 [25:19:21<68:50:34, 33.83s/it] {'loss': 0.2612, 'grad_norm': 0.551809083567866, 'learning_rate': 2.617040750281352e-06, 'epoch': 0.67} 67%|██████▋ | 14769/22095 [25:19:21<68:50:34, 33.83s/it]VC:s3://gui-agent/data_20250505/android/images/calendar/Cycle_3_Iter_2/images/screenshot-50-1746132409.386944-before.png 2025-08-28 17:17:20.030508 load time: 1045.37 ms 67%|██████▋ | 14770/22095 [25:19:43<61:36:12, 30.28s/it] {'loss': 0.3178, 'grad_norm': 0.5585593830608696, 'learning_rate': 2.616396446926738e-06, 'epoch': 0.67} 67%|██████▋ | 14770/22095 [25:19:43<61:36:12, 30.28s/it]VC:s3://multi-modal/playground/data/geoqa+/images/2410.png 2025-08-28 17:17:42.012784 load time: 1033.89 ms 67%|██████▋ | 14771/22095 [25:20:06<56:45:18, 27.90s/it] {'loss': 0.3394, 'grad_norm': 0.6878872213137671, 'learning_rate': 2.615752194788445e-06, 'epoch': 0.67} 67%|██████▋ | 14771/22095 [25:20:06<56:45:18, 27.90s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14772/22095 [25:20:27<52:30:03, 25.81s/it] {'loss': 0.3154, 'grad_norm': 0.6736646003467953, 'learning_rate': 2.615107993880315e-06, 'epoch': 0.67} 67%|██████▋ | 14772/22095 [25:20:27<52:30:03, 
25.81s/it]VC:s3://gui-agent/data_20250421/web/images/wa_shopping_admin_admin/trajectory_119/img/step_2.png 2025-08-28 17:18:25.300442 load time: 1072.1 ms 67%|██████▋ | 14773/22095 [25:20:51<51:39:06, 25.40s/it] {'loss': 0.2896, 'grad_norm': 0.577778634752951, 'learning_rate': 2.614463844216187e-06, 'epoch': 0.67} 67%|██████▋ | 14773/22095 [25:20:51<51:39:06, 25.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42826 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56834 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14774/22095 [25:21:31<60:31:02, 29.76s/it] {'loss': 0.2919, 'grad_norm': 0.5888540087169923, 'learning_rate': 2.613819745809907e-06, 'epoch': 0.67} 67%|██████▋ | 14774/22095 [25:21:31<60:31:02, 29.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14775/22095 [25:22:19<71:47:16, 35.31s/it] {'loss': 0.4709, 'grad_norm': 0.3142271462539952, 'learning_rate': 2.6131756986753097e-06, 'epoch': 0.67} 67%|██████▋ | 14775/22095 [25:22:19<71:47:16, 35.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41941 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65726 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14776/22095 [25:22:47<67:07:24, 33.02s/it] {'loss': 0.4692, 'grad_norm': 0.2816735923390981, 'learning_rate': 2.6125317028262383e-06, 'epoch': 0.67} 67%|██████▋ | 14776/22095 [25:22:47<67:07:24, 33.02s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 VC:s3://gui-agent/data_20250623/windows/images/autocad/20250509_114138_1/images/before_screenshot_1.png 2025-08-28 17:20:45.588926 load time: 1032.84 ms VC:s3://gui-agent/data_20250623/windows/images/inventor/20250514_094522_1/images/before_screenshot_1_id_42_function_0_crop_1_grounding_instructions_point_o.png 2025-08-28 17:20:45.588821 VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_414454.png 2025-08-28 17:20:45.588928 load time: 1046.84 ms load time: 1053.37 ms 67%|██████▋ | 14777/22095 [25:22:51<49:18:09, 24.25s/it] {'loss': 0.3335, 'grad_norm': 0.5990464043433165, 'learning_rate': 2.6118877582765255e-06, 'epoch': 0.67} 67%|██████▋ | 14777/22095 [25:22:51<49:18:09, 24.25s/it] 67%|██████▋ | 14778/22095 [25:22:54<36:49:40, 18.12s/it] {'loss': 0.3245, 'grad_norm': 0.6297307102369214, 'learning_rate': 2.611243865040013e-06, 'epoch': 0.67} 67%|██████▋ | 14778/22095 [25:22:54<36:49:40, 18.12s/it] 67%|██████▋ | 14779/22095 [25:23:57<63:53:33, 31.44s/it] {'loss': 0.3014, 'grad_norm': 0.6290230209495843, 'learning_rate': 2.6106000231305306e-06, 'epoch': 0.67} 67%|██████▋ | 14779/22095 [25:23:57<63:53:33, 31.44s/it] 67%|██████▋ | 14780/22095 [25:24:58<81:52:24, 40.29s/it] {'loss': 0.294, 'grad_norm': 0.8288293083591458, 'learning_rate': 2.6099562325619175e-06, 'epoch': 0.67} 67%|██████▋ | 14780/22095 [25:24:58<81:52:24, 40.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50739 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83487 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14781/22095 [25:25:21<71:14:56, 35.07s/it] {'loss': 0.2819, 'grad_norm': 0.6020139200786987, 'learning_rate': 2.6093124933480052e-06, 'epoch': 0.67} 67%|██████▋ | 14781/22095 [25:25:21<71:14:56, 35.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14782/22095 [25:25:50<67:51:04, 33.40s/it] {'loss': 0.4852, 'grad_norm': 0.27612277737532254, 'learning_rate': 2.608668805502622e-06, 'epoch': 0.67} 67%|██████▋ | 14782/22095 [25:25:50<67:51:04, 33.40s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_339639.png 2025-08-28 17:23:49.066906 load time: 1035.92 ms 67%|██████▋ | 14783/22095 [25:26:32<72:42:16, 35.80s/it] {'loss': 0.2924, 'grad_norm': 0.6359528732903538, 'learning_rate': 2.6080251690396026e-06, 'epoch': 0.67} 67%|██████▋ | 14783/22095 [25:26:32<72:42:16, 35.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250707/ubuntu/images/libreoffice_impress/4b55358b-d107-4cbc-904f-920291abc7a4/images/step_6.png 2025-08-28 17:24:30.450346 load time: 1043.52 ms 67%|██████▋ | 14784/22095 [25:26:59<67:49:01, 33.39s/it] {'loss': 0.4634, 'grad_norm': 0.2736896880795578, 'learning_rate': 2.607381583972777e-06, 'epoch': 0.67} 67%|██████▋ | 14784/22095 [25:26:59<67:49:01, 33.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42299 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45978 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14785/22095 [25:27:23<61:47:29, 30.43s/it] {'loss': 0.3164, 'grad_norm': 0.9012747453630279, 'learning_rate': 2.6067380503159735e-06, 'epoch': 0.67} 67%|██████▋ | 14785/22095 [25:27:23<61:47:29, 30.43s/it] 67%|██████▋ | 14786/22095 [25:27:45<56:27:49, 27.81s/it] {'loss': 0.3233, 'grad_norm': 0.6229379860976303, 'learning_rate': 2.606094568083017e-06, 'epoch': 0.67} 67%|██████▋ | 14786/22095 [25:27:45<56:27:49, 27.81s/it] 67%|██████▋ | 14787/22095 [25:28:43<75:13:05, 37.05s/it] {'loss': 0.3263, 'grad_norm': 0.6330120866522541, 'learning_rate': 2.605451137287738e-06, 'epoch': 0.67} 67%|██████▋ | 14787/22095 [25:28:43<75:13:05, 37.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14788/22095 [25:29:05<65:59:18, 32.51s/it] {'loss': 0.3047, 'grad_norm': 0.6707506830477721, 'learning_rate': 2.604807757943957e-06, 'epoch': 0.67} 67%|██████▋ | 14788/22095 [25:29:05<65:59:18, 32.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/web/images/yang_0527174255/10_140_52_49_0527200650/img/16.png 2025-08-28 17:27:03.981063 load time: 1037.78 ms VC:s3://gui-agent/data_20250707/ubuntu/images/libreoffice_impress/f1aadd72-b7bf-4cb0-9356-f020d7f38c5f/images/step_7.png 2025-08-28 17:27:03.983276 load time: 1059.78 ms 67%|██████▋ | 14789/22095 [25:29:35<64:12:00, 31.63s/it] {'loss': 0.4905, 'grad_norm': 0.301561285535464, 'learning_rate': 2.6041644300655035e-06, 'epoch': 0.67} 67%|██████▋ | 14789/22095 [25:29:35<64:12:00, 31.63s/it] 67%|██████▋ | 14790/22095 [25:29:38<46:48:55, 23.07s/it] {'loss': 0.3398, 'grad_norm': 0.5884118503855394, 'learning_rate': 2.6035211536661966e-06, 'epoch': 0.67} 67%|██████▋ | 14790/22095 [25:29:38<46:48:55, 23.07s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
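The repeated "Number of image tokens 0 does not match number of images 1 ... Fixed image tokens in the conversation" messages above indicate samples whose conversation text does not carry one image placeholder per attached image, which the loader repairs on the fly. A minimal sketch of such a repair pass is below; the function name `fix_image_tokens`, the `"<image>"` placeholder string, and the sample schema (`image`, `conversations`) are assumptions inferred from the log, not the repository's actual code.

```python
# Hypothetical sketch of the "Fixed image tokens in the conversation" repair
# seen in the log: make the first human turn carry exactly one "<image>"
# placeholder per attached image. Schema and names are assumed, not verified.
def fix_image_tokens(sample: dict, placeholder: str = "<image>") -> dict:
    images = sample.get("image")
    n_images = len(images) if isinstance(images, list) else int(images is not None)
    first_human = next(m for m in sample["conversations"] if m["from"] == "human")
    if first_human["value"].count(placeholder) != n_images:
        # Strip stray placeholders, then prepend the correct number.
        text = first_human["value"].replace(placeholder, "").lstrip("\n")
        prefix = placeholder * n_images
        first_human["value"] = prefix + ("\n" if n_images else "") + text
    return sample
```

In the log the mismatch is usually 0 placeholders for 1 image, so the repair amounts to prepending a single placeholder before tokenization.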
67%|██████▋ | 14791/22095 [25:30:02<47:16:51, 23.30s/it] {'loss': 0.2668, 'grad_norm': 0.5863125088203237, 'learning_rate': 2.6028779287598606e-06, 'epoch': 0.67} 67%|██████▋ | 14791/22095 [25:30:02<47:16:51, 23.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44864 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79174 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66402 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51112 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14792/22095 [25:30:24<46:50:57, 23.09s/it] {'loss': 0.3191, 'grad_norm': 0.6488902625627246, 'learning_rate': 2.6022347553603145e-06, 'epoch': 0.67} 67%|██████▋ | 14792/22095 [25:30:24<46:50:57, 23.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14793/22095 [25:30:51<49:01:18, 24.17s/it] {'loss': 0.4875, 'grad_norm': 0.27688118481367496, 'learning_rate': 2.6015916334813818e-06, 'epoch': 0.67} 67%|██████▋ | 14793/22095 [25:30:51<49:01:18, 24.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14794/22095 [25:30:54<36:23:35, 17.94s/it] {'loss': 0.2703, 'grad_norm': 0.6130194968066307, 'learning_rate': 2.600948563136878e-06, 'epoch': 0.67} 67%|██████▋ | 14794/22095 [25:30:54<36:23:35, 17.94s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68847 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14795/22095 [25:31:15<38:13:08, 18.85s/it] {'loss': 0.2429, 'grad_norm': 0.5475132725839402, 'learning_rate': 2.60030554434062e-06, 'epoch': 0.67} 67%|██████▋ | 14795/22095 [25:31:15<38:13:08, 18.85s/it] 67%|██████▋ | 14796/22095 [25:31:37<39:57:51, 19.71s/it] {'loss': 0.2978, 'grad_norm': 0.609963564618995, 'learning_rate': 2.599662577106427e-06, 'epoch': 0.67} 67%|██████▋ | 14796/22095 [25:31:37<39:57:51, 19.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14797/22095 [25:32:00<41:36:28, 20.52s/it] {'loss': 0.2818, 'grad_norm': 0.5612618651319895, 'learning_rate': 2.5990196614481135e-06, 'epoch': 0.67} 67%|██████▋ | 14797/22095 [25:32:00<41:36:28, 20.52s/it] 67%|██████▋ | 14798/22095 [25:32:40<53:44:26, 26.51s/it] {'loss': 0.3169, 'grad_norm': 0.6252181494789513, 'learning_rate': 2.5983767973794915e-06, 'epoch': 0.67} 67%|██████▋ | 14798/22095 [25:32:40<53:44:26, 26.51s/it] 67%|██████▋ | 14799/22095 [25:32:44<40:11:14, 19.83s/it] {'loss': 0.3206, 'grad_norm': 0.6255870501678163, 'learning_rate': 2.597733984914377e-06, 'epoch': 0.67} 67%|██████▋ | 14799/22095 [25:32:44<40:11:14, 19.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (86810 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104475 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14800/22095 [25:33:12<44:44:32, 22.08s/it] {'loss': 0.4794, 'grad_norm': 0.2880044890391729, 'learning_rate': 2.5970912240665815e-06, 'epoch': 0.67} 67%|██████▋ | 14800/22095 [25:33:12<44:44:32, 22.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75126 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48905 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110850 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85331 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93570 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14801/22095 [25:33:15<33:16:15, 16.42s/it] {'loss': 0.2681, 'grad_norm': 0.7775456388447965, 'learning_rate': 2.5964485148499165e-06, 'epoch': 0.67} 67%|██████▋ | 14801/22095 [25:33:15<33:16:15, 16.42s/it] 67%|██████▋ | 14802/22095 [25:34:34<71:21:50, 35.23s/it] {'loss': 0.2694, 'grad_norm': 0.8829594166813511, 'learning_rate': 2.595805857278189e-06, 'epoch': 0.67} 67%|██████▋ | 14802/22095 [25:34:34<71:21:50, 35.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (126887 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14803/22095 [25:34:44<56:16:42, 27.78s/it] {'loss': 0.4689, 'grad_norm': 0.28253780830597774, 'learning_rate': 2.5951632513652113e-06, 'epoch': 0.67} 67%|██████▋ | 14803/22095 [25:34:44<56:16:42, 27.78s/it] 67%|██████▋ | 14804/22095 [25:35:06<52:32:17, 25.94s/it] {'loss': 0.3272, 'grad_norm': 0.6075988341196107, 'learning_rate': 2.594520697124788e-06, 'epoch': 0.67} 67%|██████▋ | 14804/22095 [25:35:06<52:32:17, 25.94s/it]VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/f0f651b54ff02c0ff80873408f1f1b0c80d42c8b464fda66cebb4698257391fd.png 2025-08-28 17:33:04.754743 load time: 1030.6 ms VC:s3://gui-agent/data_20250421/web/images/budget_com/trajectory_62/img/step_1.png 2025-08-28 17:33:04.754623 load time: 1076.29 ms 67%|██████▋ | 14805/22095 [25:35:09<38:35:53, 19.06s/it] {'loss': 0.3277, 'grad_norm': 0.6036323472055383, 'learning_rate': 2.5938781945707293e-06, 'epoch': 0.67} 67%|██████▋ | 14805/22095 [25:35:09<38:35:53, 19.06s/it]VC:s3://gui-agent/data_20250714/web/images/20250716/07669033-8158-4c6d-8d81-c34559ceee87/images/step_60.png 2025-08-28 17:33:07.757022 load time: 1042.69 ms 67%|██████▋ | 14806/22095 [25:35:34<42:21:20, 20.92s/it] {'loss': 0.3211, 'grad_norm': 0.5779008475528601, 'learning_rate': 2.5932357437168353e-06, 'epoch': 0.67} 67%|██████▋ | 14806/22095 [25:35:34<42:21:20, 20.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14807/22095 [25:35:59<44:44:05, 22.10s/it] {'loss': 0.3489, 'grad_norm': 0.6180057462429976, 'learning_rate': 2.592593344576916e-06, 'epoch': 0.67} 67%|██████▋ | 14807/22095 [25:35:59<44:44:05, 22.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74110 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75501 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68594 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44876 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14808/22095 [25:36:05<35:09:13, 17.37s/it] {'loss': 0.4964, 'grad_norm': 0.2792284287855959, 'learning_rate': 2.59195099716477e-06, 'epoch': 0.67} 67%|██████▋ | 14808/22095 [25:36:05<35:09:13, 17.37s/it] 67%|██████▋ | 14809/22095 [25:36:52<52:44:42, 26.06s/it] {'loss': 0.4862, 'grad_norm': 0.2840852019818451, 'learning_rate': 2.591308701494203e-06, 'epoch': 0.67} 67%|██████▋ | 14809/22095 [25:36:52<52:44:42, 26.06s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14810/22095 [25:36:56<39:14:26, 19.39s/it] {'loss': 0.3202, 'grad_norm': 0.6465522456857574, 'learning_rate': 2.590666457579014e-06, 'epoch': 0.67} 67%|██████▋ | 14810/22095 [25:36:56<39:14:26, 19.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74851 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50379 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53203 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14811/22095 [25:37:43<56:26:16, 27.89s/it] {'loss': 0.4471, 'grad_norm': 0.2608260318289237, 'learning_rate': 2.590024265433002e-06, 'epoch': 0.67} 67%|██████▋ | 14811/22095 [25:37:43<56:26:16, 27.89s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14812/22095 [25:37:47<41:44:49, 20.64s/it] {'loss': 0.3108, 'grad_norm': 0.6305923595751279, 'learning_rate': 2.589382125069967e-06, 'epoch': 0.67} 67%|██████▋ | 14812/22095 [25:37:47<41:44:49, 20.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14813/22095 [25:37:50<31:06:07, 15.38s/it] {'loss': 0.33, 'grad_norm': 1.0727394971790696, 'learning_rate': 2.5887400365037075e-06, 'epoch': 0.67} 67%|██████▋ | 14813/22095 [25:37:50<31:06:07, 15.38s/it] 67%|██████▋ | 14814/22095 [25:38:30<45:57:02, 22.72s/it] {'loss': 0.3479, 'grad_norm': 0.8948768742495324, 'learning_rate': 2.5880979997480193e-06, 'epoch': 0.67} 67%|██████▋ | 14814/22095 [25:38:30<45:57:02, 22.72s/it]VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_127409.png 2025-08-28 17:36:28.755457 load time: 1042.36 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14815/22095 [25:38:33<34:06:10, 16.86s/it] {'loss': 0.3509, 'grad_norm': 0.6213365279867468, 'learning_rate': 2.5874560148166953e-06, 'epoch': 0.67} 67%|██████▋ | 14815/22095 [25:38:33<34:06:10, 16.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14816/22095 [25:39:01<40:56:38, 20.25s/it] {'loss': 0.4971, 'grad_norm': 0.2882151533209925, 'learning_rate': 2.5868140817235344e-06, 'epoch': 0.67} 67%|██████▋ | 14816/22095 [25:39:01<40:56:38, 20.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (127772 
> 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41148 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14817/22095 [25:39:05<30:56:44, 15.31s/it] {'loss': 0.2446, 'grad_norm': 0.6089141964028402, 'learning_rate': 2.5861722004823254e-06, 'epoch': 0.67} 67%|██████▋ | 14817/22095 [25:39:05<30:56:44, 15.31s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14818/22095 [25:39:27<35:11:09, 17.41s/it] {'loss': 0.3552, 'grad_norm': 0.6515023495329259, 'learning_rate': 2.585530371106864e-06, 'epoch': 0.67} 67%|██████▋ | 14818/22095 [25:39:27<35:11:09, 17.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14819/22095 [25:39:38<30:52:03, 15.27s/it] {'loss': 0.4446, 'grad_norm': 0.26172932451240744, 'learning_rate': 2.5848885936109382e-06, 'epoch': 0.67} 67%|██████▋ | 14819/22095 [25:39:38<30:52:03, 15.27s/it] 67%|██████▋ | 14820/22095 [25:39:41<23:41:31, 11.72s/it] {'loss': 0.3255, 'grad_norm': 0.6545152496550372, 'learning_rate': 2.58424686800834e-06, 'epoch': 0.67} 67%|██████▋ | 14820/22095 [25:39:41<23:41:31, 11.72s/it] 67%|██████▋ | 14821/22095 [25:40:05<31:18:08, 15.49s/it] {'loss': 0.3356, 'grad_norm': 0.6269818687343297, 'learning_rate': 2.583605194312856e-06, 'epoch': 0.67} 67%|██████▋ | 14821/22095 [25:40:05<31:18:08, 15.49s/it] 67%|██████▋ | 14822/22095 [25:40:09<23:47:00, 11.77s/it] {'loss': 0.2754, 'grad_norm': 0.7356448283387264, 'learning_rate': 2.5829635725382764e-06, 'epoch': 0.67} 67%|██████▋ | 14822/22095 [25:40:09<23:47:00, 11.77s/it] 67%|██████▋ | 14823/22095 [25:40:33<31:15:48, 15.48s/it] {'loss': 0.3226, 'grad_norm': 0.6411225363370119, 'learning_rate': 2.5823220026983865e-06, 'epoch': 0.67} 67%|██████▋ | 14823/22095 [25:40:33<31:15:48, 15.48s/it] 67%|██████▋ 
| 14824/22095 [25:40:36<23:38:25, 11.70s/it] {'loss': 0.3262, 'grad_norm': 0.6612202553526849, 'learning_rate': 2.5816804848069693e-06, 'epoch': 0.67} 67%|██████▋ | 14824/22095 [25:40:36<23:38:25, 11.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14825/22095 [25:40:45<22:00:22, 10.90s/it] {'loss': 0.4557, 'grad_norm': 0.29496916689818664, 'learning_rate': 2.581039018877811e-06, 'epoch': 0.67} 67%|██████▋ | 14825/22095 [25:40:45<22:00:22, 10.90s/it] 67%|██████▋ | 14826/22095 [25:40:48<17:25:23, 8.63s/it] {'loss': 0.3078, 'grad_norm': 0.5952504723621468, 'learning_rate': 2.580397604924699e-06, 'epoch': 0.67} 67%|██████▋ | 14826/22095 [25:40:48<17:25:23, 8.63s/it] 67%|██████▋ | 14827/22095 [25:40:51<14:06:34, 6.99s/it] {'loss': 0.3059, 'grad_norm': 0.6380343137276523, 'learning_rate': 2.5797562429614075e-06, 'epoch': 0.67} 67%|██████▋ | 14827/22095 [25:40:51<14:06:34, 6.99s/it] 67%|██████▋ | 14828/22095 [25:40:54<11:40:50, 5.79s/it] {'loss': 0.2787, 'grad_norm': 0.6199335184293526, 'learning_rate': 2.579114933001722e-06, 'epoch': 0.67} 67%|██████▋ | 14828/22095 [25:40:54<11:40:50, 5.79s/it] 67%|██████▋ | 14829/22095 [25:40:57<10:09:53, 5.04s/it] {'loss': 0.2945, 'grad_norm': 1.0874228926103768, 'learning_rate': 2.5784736750594218e-06, 'epoch': 0.67} 67%|██████▋ | 14829/22095 [25:40:57<10:09:53, 5.04s/it] 67%|██████▋ | 14830/22095 [25:41:00<8:56:52, 4.43s/it] {'loss': 0.3024, 'grad_norm': 0.6125779118419148, 'learning_rate': 2.577832469148286e-06, 'epoch': 0.67} 67%|██████▋ | 14830/22095 [25:41:00<8:56:52, 4.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] 
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366272 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33018, 'image': 'vrdu_table_final_2/astro-ph.CO/e3ba235f-540f-4da3-ac70-3e448ad8c375.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14831/22095 [25:41:03<8:03:38, 3.99s/it] {'loss': 0.2795, 'grad_norm': 0.7788844335689945, 'learning_rate': 2.5771913152820895e-06, 'epoch': 0.67} 67%|██████▋ | 14831/22095 [25:41:03<8:03:38, 3.99s/it] 67%|██████▋ | 14832/22095 [25:41:06<7:20:42, 3.64s/it] {'loss': 0.3204, 'grad_norm': 0.6314433647143567, 'learning_rate': 2.57655021347461e-06, 'epoch': 0.67} 67%|██████▋ | 14832/22095 [25:41:06<7:20:42, 3.64s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8918289 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 41442, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC的中点,如果Cd=4,AB=14,则BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 67%|██████▋ | 14833/22095 [25:41:09<6:52:42, 3.41s/it] {'loss': 0.287, 'grad_norm': 0.6141001508362471, 'learning_rate': 2.5759091637396254e-06, 'epoch': 0.67} 67%|██████▋ | 14833/22095 [25:41:09<6:52:42, 3.41s/it] 67%|██████▋ | 14834/22095 [25:41:13<7:11:06, 3.56s/it] {'loss': 0.2765, 'grad_norm': 0.6273161713818961, 'learning_rate': 2.575268166090908e-06, 'epoch': 0.67} 67%|██████▋ | 14834/22095 [25:41:13<7:11:06, 3.56s/it] 67%|██████▋ | 14835/22095 [25:41:16<6:57:34, 3.45s/it] {'loss': 0.3148, 'grad_norm': 0.6215596863860496, 'learning_rate': 2.5746272205422285e-06, 'epoch': 0.67} 67%|██████▋ | 14835/22095 [25:41:16<6:57:34, 3.45s/it] 67%|██████▋ | 14836/22095 [25:41:20<6:58:48, 3.46s/it] {'loss': 0.3477, 'grad_norm': 0.6978944273632913, 'learning_rate': 2.5739863271073634e-06, 'epoch': 0.67} 67%|██████▋ | 14836/22095 [25:41:20<6:58:48, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138719 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44951 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76307 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84777 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14837/22095 [25:41:23<6:49:01, 3.38s/it] {'loss': 0.2893, 'grad_norm': 0.6040212388799695, 'learning_rate': 2.5733454858000795e-06, 'epoch': 0.67} 67%|██████▋ | 14837/22095 [25:41:23<6:49:01, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57244 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48297 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118768 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98946 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14838/22095 [25:41:26<6:28:53, 3.22s/it] {'loss': 0.2562, 'grad_norm': 0.6080583118847751, 'learning_rate': 2.5727046966341495e-06, 'epoch': 0.67} 67%|██████▋ | 14838/22095 [25:41:26<6:28:53, 3.22s/it] 67%|██████▋ | 14839/22095 [25:41:29<6:27:20, 3.20s/it] {'loss': 0.2728, 'grad_norm': 0.5755508530350106, 'learning_rate': 2.572063959623341e-06, 'epoch': 0.67} 67%|██████▋ | 14839/22095 [25:41:29<6:27:20, 3.20s/it] 67%|██████▋ | 14840/22095 [25:41:52<18:24:38, 9.14s/it] {'loss': 0.3187, 'grad_norm': 0.6491815330242983, 'learning_rate': 2.5714232747814192e-06, 'epoch': 0.67} 67%|██████▋ | 14840/22095 [25:41:52<18:24:38, 9.14s/it] 67%|██████▋ | 14841/22095 [25:41:55<14:48:34, 7.35s/it] {'loss': 0.3104, 'grad_norm': 0.5995387960306222, 'learning_rate': 2.5707826421221527e-06, 'epoch': 0.67} 67%|██████▋ | 14841/22095 [25:41:55<14:48:34, 7.35s/it] 67%|██████▋ | 14842/22095 [25:41:59<12:37:27, 6.27s/it] {'loss': 0.3251, 'grad_norm': 0.9448323782002909, 'learning_rate': 2.5701420616593078e-06, 'epoch': 0.67} 67%|██████▋ | 14842/22095 [25:41:59<12:37:27, 6.27s/it] 67%|██████▋ | 14843/22095 [25:42:02<10:35:10, 5.26s/it] {'loss': 0.2946, 'grad_norm': 0.6522763308065314, 'learning_rate': 2.5695015334066475e-06, 'epoch': 0.67} 67%|██████▋ | 14843/22095 [25:42:02<10:35:10, 5.26s/it] 67%|██████▋ | 14844/22095 [25:42:05<9:38:10, 4.78s/it] {'loss': 0.3095, 'grad_norm': 0.5770132512230949, 'learning_rate': 2.5688610573779327e-06, 'epoch': 0.67} 67%|██████▋ | 14844/22095 [25:42:05<9:38:10, 4.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55824 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48757 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81646 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107745 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14845/22095 [25:42:10<9:45:25, 4.84s/it] {'loss': 0.4789, 'grad_norm': 0.29970308751832503, 'learning_rate': 2.568220633586929e-06, 'epoch': 0.67} 67%|██████▋ | 14845/22095 [25:42:10<9:45:25, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (82721 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68288 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14846/22095 [25:42:20<12:37:58, 6.27s/it] {'loss': 0.4513, 'grad_norm': 0.30084471159795745, 'learning_rate': 2.567580262047393e-06, 'epoch': 0.67} 67%|██████▋ | 14846/22095 [25:42:20<12:37:58, 6.27s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (66754 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51547 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111091 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14847/22095 [25:42:24<11:21:09, 5.64s/it] {'loss': 0.3568, 'grad_norm': 0.6722694384592166, 'learning_rate': 2.566939942773089e-06, 'epoch': 0.67} 67%|██████▋ | 14847/22095 [25:42:24<11:21:09, 5.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42462 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14848/22095 [25:42:28<10:08:26, 5.04s/it] {'loss': 0.2789, 'grad_norm': 0.6594993876155704, 'learning_rate': 2.5662996757777716e-06, 'epoch': 0.67} 67%|██████▋ | 14848/22095 [25:42:28<10:08:26, 5.04s/it] 67%|██████▋ | 14849/22095 [25:42:31<9:08:07, 4.54s/it] {'loss': 0.2859, 'grad_norm': 0.9012927279021385, 'learning_rate': 2.5656594610752005e-06, 'epoch': 0.67} 67%|██████▋ | 14849/22095 [25:42:31<9:08:07, 4.54s/it] 67%|██████▋ | 14850/22095 [25:42:35<8:29:02, 4.22s/it] {'loss': 0.3334, 'grad_norm': 0.8042398329352978, 'learning_rate': 2.5650192986791293e-06, 'epoch': 0.67} 67%|██████▋ | 14850/22095 [25:42:35<8:29:02, 4.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42388 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62019 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14851/22095 [25:42:38<7:53:15, 3.92s/it] {'loss': 0.3362, 'grad_norm': 0.5570366902178083, 'learning_rate': 2.5643791886033177e-06, 'epoch': 0.67} 67%|██████▋ | 14851/22095 [25:42:38<7:53:15, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111963 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117109 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14852/22095 [25:42:41<7:38:55, 3.80s/it] {'loss': 0.2737, 'grad_norm': 0.6377163244803467, 'learning_rate': 2.5637391308615155e-06, 'epoch': 0.67} 67%|██████▋ | 14852/22095 [25:42:41<7:38:55, 3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047553 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 5cm'}, {'from': 'gpt', 'value': '【解答】解:∵点C是线段AB的中点,AB=20cm,∴BC=\\frac{1}{2}AB=\\frac{1}{2}×20cm=10cm,∵点D是线段BC的中点,∴BD=\\frac{1}{2}BC=\\frac{1}{2}×10cm=5cm,∴AD=AB-BD=20cm-5cm=15cm.'}]} 67%|██████▋ | 14853/22095 [25:42:51<11:21:40, 5.65s/it] {'loss': 0.4655, 'grad_norm': 0.3154837118667572, 'learning_rate': 2.5630991254674764e-06, 'epoch': 0.67} 67%|██████▋ | 14853/22095 [25:42:51<11:21:40, 5.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44448 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45539 > 40960) for 4 sample(s). Truncating to 39812 with 2 samples. 67%|██████▋ | 14854/22095 [25:43:01<13:43:32, 6.82s/it] {'loss': 0.5063, 'grad_norm': 0.3066986197362134, 'learning_rate': 2.562459172434952e-06, 'epoch': 0.67} 67%|██████▋ | 14854/22095 [25:43:01<13:43:32, 6.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62802 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63624 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41612 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14855/22095 [25:43:10<15:20:32, 7.63s/it] {'loss': 0.4417, 'grad_norm': 0.2595637218282012, 'learning_rate': 2.561819271777698e-06, 'epoch': 0.67} 67%|██████▋ | 14855/22095 [25:43:10<15:20:32, 7.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62805 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14856/22095 [25:43:17<14:42:13, 7.31s/it] {'loss': 0.465, 'grad_norm': 0.27092358982107523, 'learning_rate': 2.5611794235094545e-06, 'epoch': 0.67} 67%|██████▋ | 14856/22095 [25:43:17<14:42:13, 7.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14857/22095 [25:43:21<12:39:06, 6.29s/it] {'loss': 0.2826, 'grad_norm': 0.6557612974703131, 'learning_rate': 2.5605396276439764e-06, 'epoch': 0.67} 67%|██████▋ | 14857/22095 [25:43:21<12:39:06, 6.29s/it] 67%|██████▋ | 14858/22095 [25:43:25<11:31:14, 5.73s/it] {'loss': 0.2981, 'grad_norm': 0.5703719087363823, 'learning_rate': 2.5598998841950105e-06, 'epoch': 0.67} 67%|██████▋ | 14858/22095 [25:43:25<11:31:14, 5.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (79261 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60027 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91083 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14859/22095 [25:43:34<13:20:12, 6.64s/it] {'loss': 0.466, 'grad_norm': 0.28514248737355297, 'learning_rate': 2.5592601931763024e-06, 'epoch': 0.67} 67%|██████▋ | 14859/22095 [25:43:34<13:20:12, 6.64s/it] 67%|██████▋ | 14860/22095 [25:43:44<15:31:49, 7.73s/it] {'loss': 0.4683, 'grad_norm': 0.2741781805340848, 'learning_rate': 2.558620554601594e-06, 'epoch': 0.67} 67%|██████▋ | 14860/22095 [25:43:44<15:31:49, 7.73s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14861/22095 [25:43:48<13:09:01, 6.54s/it] {'loss': 0.2875, 'grad_norm': 0.6923206061416766, 'learning_rate': 2.5579809684846323e-06, 'epoch': 0.67} 67%|██████▋ | 14861/22095 [25:43:48<13:09:01, 6.54s/it] 67%|██████▋ | 14862/22095 [25:43:59<15:34:56, 7.76s/it] {'loss': 0.4465, 'grad_norm': 0.26899865757289887, 'learning_rate': 2.5573414348391613e-06, 'epoch': 0.67} 67%|██████▋ | 14862/22095 [25:43:59<15:34:56, 7.76s/it] 67%|██████▋ | 14863/22095 [25:44:09<17:17:51, 8.61s/it] {'loss': 0.4642, 'grad_norm': 0.2803036034475733, 'learning_rate': 2.5567019536789204e-06, 'epoch': 0.67} 67%|██████▋ | 14863/22095 [25:44:09<17:17:51, 8.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14864/22095 [25:44:17<16:59:07, 8.46s/it] {'loss': 0.478, 'grad_norm': 0.2722707798580064, 'learning_rate': 2.5560625250176495e-06, 'epoch': 0.67} 67%|██████▋ | 14864/22095 [25:44:17<16:59:07, 8.46s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14865/22095 [25:44:21<14:20:04, 7.14s/it] {'loss': 0.2754, 'grad_norm': 0.5717873551481465, 'learning_rate': 2.5554231488690908e-06, 'epoch': 0.67} 67%|██████▋ | 14865/22095 [25:44:21<14:20:04, 7.14s/it] 67%|██████▋ | 14866/22095 
[25:44:25<12:11:08, 6.07s/it] {'loss': 0.2943, 'grad_norm': 0.7573567287495568, 'learning_rate': 2.554783825246978e-06, 'epoch': 0.67} 67%|██████▋ | 14866/22095 [25:44:25<12:11:08, 6.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14867/22095 [25:44:34<13:45:58, 6.86s/it] {'loss': 0.4947, 'grad_norm': 0.310794806996194, 'learning_rate': 2.5541445541650536e-06, 'epoch': 0.67} 67%|██████▋ | 14867/22095 [25:44:34<13:45:58, 6.86s/it] 67%|██████▋ | 14868/22095 [25:44:38<12:30:19, 6.23s/it] {'loss': 0.3025, 'grad_norm': 0.5633981967848383, 'learning_rate': 2.55350533563705e-06, 'epoch': 0.67} 67%|██████▋ | 14868/22095 [25:44:38<12:30:19, 6.23s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8382979 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 49774, 'image': 'vrdu_table_final_2/astro-ph.CO/33db1c31-9a98-444f-8d45-1ddaa2bbaf35.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8903166 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 26319, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. \\frac{11}{2}cm\nB. 4cm\nC. \\frac{9}{2}cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 67%|██████▋ | 14869/22095 [25:44:42<10:55:29, 5.44s/it] {'loss': 0.3401, 'grad_norm': 0.6291189797545162, 'learning_rate': 2.552866169676701e-06, 'epoch': 0.67} 67%|██████▋ | 14869/22095 [25:44:42<10:55:29, 5.44s/it] 67%|██████▋ | 14870/22095 [25:44:46<9:46:02, 4.87s/it] {'loss': 0.3003, 'grad_norm': 0.5796014872041827, 'learning_rate': 2.5522270562977424e-06, 'epoch': 0.67} 67%|██████▋ | 14870/22095 [25:44:46<9:46:02, 4.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8380435 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
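The "Image size ... is too small. Minimum size is 28." failures above recur for tiny crops such as the 14×23 px table renders. A minimal sketch of a pre-filter that drops such samples before training, assuming the sample schema shown in the "Problematic sample" dumps (`image_wh` as a list of `[width, height]` pairs); the helper names are hypothetical and not part of `data_qwen_2.py`:

```python
# Hedged sketch: drop samples whose recorded image dimensions fall below
# the model's minimum side length, mirroring the "Minimum size is 28."
# exceptions in this log. MIN_SIDE and the 'image_wh' field come from the
# log; the function names are illustrative.
MIN_SIDE = 28  # both width and height must reach this many pixels

def is_image_large_enough(sample, min_side=MIN_SIDE):
    """Return True if every (w, h) pair recorded for the sample meets min_side."""
    for w, h in sample.get("image_wh", []):
        if w < min_side or h < min_side:
            return False
    return True

def filter_samples(samples, min_side=MIN_SIDE):
    """Split samples into (kept, dropped) by the minimum-size rule."""
    kept, dropped = [], []
    for s in samples:
        (kept if is_image_large_enough(s, min_side) else dropped).append(s)
    return kept, dropped
```

Under this sketch, the 14×23 `vrdu_table_final_2` crops would land in `dropped` ahead of time instead of raising mid-epoch.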
Problematic sample: {'id': 47224, 'image': 'vrdu_table_final_2/astro-ph.CO/a7a7e062-020e-45a9-9e2f-a917604f43a1.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 67%|██████▋ | 14871/22095 [25:44:49<8:57:38, 4.47s/it] {'loss': 0.3058, 'grad_norm': 0.5882744559978718, 'learning_rate': 2.551587995513909e-06, 'epoch': 0.67} 67%|██████▋ | 14871/22095 [25:44:49<8:57:38, 4.47s/it] 67%|██████▋ | 14872/22095 [25:44:53<8:29:47, 4.23s/it] {'loss': 0.302, 'grad_norm': 0.5832979545729652, 'learning_rate': 2.550948987338929e-06, 'epoch': 0.67} 67%|██████▋ | 14872/22095 [25:44:53<8:29:47, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72844 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101198 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14873/22095 [25:44:59<9:35:11, 4.78s/it] {'loss': 0.3275, 'grad_norm': 0.6154236592404919, 'learning_rate': 2.5503100317865324e-06, 'epoch': 0.67} 67%|██████▋ | 14873/22095 [25:44:59<9:35:11, 4.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45615 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79914 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51731 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14874/22095 [25:45:02<8:44:58, 4.36s/it] {'loss': 0.3201, 'grad_norm': 0.6174955768741429, 'learning_rate': 2.549671128870452e-06, 'epoch': 0.67} 67%|██████▋ | 14874/22095 [25:45:02<8:44:58, 4.36s/it] 67%|██████▋ | 14875/22095 [25:45:05<8:02:15, 4.01s/it] {'loss': 0.269, 'grad_norm': 0.6488605190840145, 'learning_rate': 2.549032278604411e-06, 'epoch': 0.67} 67%|██████▋ | 14875/22095 [25:45:05<8:02:15, 4.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14876/22095 [25:45:09<7:46:27, 3.88s/it] {'loss': 0.326, 'grad_norm': 0.649150724749603, 'learning_rate': 2.54839348100214e-06, 'epoch': 0.67} 67%|██████▋ | 14876/22095 [25:45:09<7:46:27, 3.88s/it] 67%|██████▋ | 14877/22095 [25:45:12<7:30:00, 3.74s/it] {'loss': 0.2756, 'grad_norm': 0.684597473992833, 'learning_rate': 2.5477547360773626e-06, 'epoch': 0.67} 67%|██████▋ | 14877/22095 [25:45:12<7:30:00, 3.74s/it] 67%|██████▋ | 14878/22095 [25:45:16<7:20:09, 3.66s/it] {'loss': 0.2814, 'grad_norm': 0.6649204591550062, 'learning_rate': 2.5471160438438058e-06, 'epoch': 0.67} 67%|██████▋ | 14878/22095 [25:45:16<7:20:09, 3.66s/it] 67%|██████▋ | 14879/22095 [25:45:21<8:00:18, 3.99s/it] {'loss': 0.2824, 'grad_norm': 0.6315462929374838, 'learning_rate': 2.5464774043151897e-06, 'epoch': 0.67} 67%|██████▋ | 14879/22095 [25:45:21<8:00:18, 3.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48669 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72243 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72012 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14880/22095 [25:45:24<7:45:12, 3.87s/it] {'loss': 0.3113, 'grad_norm': 0.6376100767180388, 'learning_rate': 2.5458388175052407e-06, 'epoch': 0.67} 67%|██████▋ | 14880/22095 [25:45:24<7:45:12, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14881/22095 [25:45:34<11:12:44, 5.60s/it] {'loss': 0.4864, 'grad_norm': 0.3157602884109542, 'learning_rate': 2.5452002834276784e-06, 'epoch': 0.67} 67%|██████▋ | 14881/22095 [25:45:34<11:12:44, 5.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8885299 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8452, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1cm'}]} 67%|██████▋ | 14882/22095 [25:45:43<13:31:59, 6.75s/it] {'loss': 0.4724, 'grad_norm': 0.3159061388350569, 'learning_rate': 2.5445618020962203e-06, 'epoch': 0.67} 67%|██████▋ | 14882/22095 [25:45:43<13:31:59, 6.75s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14883/22095 [25:45:46<11:21:59, 5.67s/it] {'loss': 0.2934, 'grad_norm': 0.6535128707460278, 'learning_rate': 2.543923373524588e-06, 'epoch': 0.67} 67%|██████▋ | 14883/22095 [25:45:46<11:21:59, 5.67s/it] 67%|██████▋ | 14884/22095 [25:45:50<10:09:51, 5.07s/it] {'loss': 0.3202, 'grad_norm': 0.6120965324245942, 'learning_rate': 2.543284997726504e-06, 'epoch': 0.67} 67%|██████▋ | 14884/22095 [25:45:50<10:09:51, 5.07s/it] 67%|██████▋ | 14885/22095 [25:45:53<9:00:36, 4.50s/it] {'loss': 0.3191, 'grad_norm': 0.6501927712827466, 'learning_rate': 2.542646674715675e-06, 'epoch': 0.67} 67%|██████▋ | 14885/22095 [25:45:53<9:00:36, 4.50s/it] 67%|██████▋ | 14886/22095 [25:45:56<8:12:45, 4.10s/it] {'loss': 0.2935, 'grad_norm': 0.6882156235163461, 'learning_rate': 2.5420084045058226e-06, 'epoch': 0.67} 67%|██████▋ | 14886/22095 [25:45:56<8:12:45, 4.10s/it] 67%|██████▋ | 14887/22095 [25:46:00<7:57:24, 3.97s/it] {'loss': 0.3092, 'grad_norm': 0.6085103868276374, 'learning_rate': 2.5413701871106618e-06, 'epoch': 0.67} 67%|██████▋ | 14887/22095 [25:46:00<7:57:24, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14888/22095 [25:46:10<11:27:25, 5.72s/it] {'loss': 0.4571, 'grad_norm': 0.32797325748302764, 'learning_rate': 2.540732022543905e-06, 'epoch': 0.67} 67%|██████▋ | 14888/22095 [25:46:10<11:27:25, 5.72s/it] 67%|██████▋ | 14889/22095 [25:46:13<10:01:31, 5.01s/it] 
{'loss': 0.259, 'grad_norm': 0.5767157669986146, 'learning_rate': 2.5400939108192615e-06, 'epoch': 0.67} 67%|██████▋ | 14889/22095 [25:46:13<10:01:31, 5.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (103560 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14890/22095 [25:46:16<8:41:46, 4.35s/it] {'loss': 0.32, 'grad_norm': 0.6497609474966436, 'learning_rate': 2.539455851950445e-06, 'epoch': 0.67} 67%|██████▋ | 14890/22095 [25:46:16<8:41:46, 4.35s/it] 67%|██████▋ | 14891/22095 [25:46:19<8:02:21, 4.02s/it] {'loss': 0.3382, 'grad_norm': 0.6661441669420962, 'learning_rate': 2.5388178459511676e-06, 'epoch': 0.67} 67%|██████▋ | 14891/22095 [25:46:19<8:02:21, 4.02s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8305889 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
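The repeated "Rank 0: Number of image tokens 0 does not match number of images 1 / Fixed image tokens in the conversation" messages suggest the loader repairs conversations whose first user turn lacks an image placeholder. A minimal sketch of such a repair, assuming a `<image>` placeholder string and the `{'from': ..., 'value': ...}` message schema visible in the sample dumps; this is an illustration, not the actual `data_qwen_2.py` logic:

```python
# Hedged sketch: if a conversation references images but the first human
# turn carries no "<image>" placeholder, prepend one per missing image.
# The placeholder string and schema are assumptions based on common
# Qwen-VL data formats.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversations, num_images):
    """Ensure the first human turn contains at least num_images placeholders."""
    for msg in conversations:
        if msg["from"] != "human":
            continue
        present = msg["value"].count(IMAGE_TOKEN)
        missing = num_images - present
        if missing > 0:
            msg["value"] = (IMAGE_TOKEN + "\n") * missing + msg["value"]
        return conversations  # only the first human turn is inspected
    return conversations
```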
Problematic sample: {'image': 'TB1mcQRRXXXXXaLaXXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to read and tell me what is written on this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n赠送运费险\n零风险购物\n台式机专用\nDDR3升级30天包退三年包换\n1600\n4G\n包邮\nADATA\nPOST\n超强兼容闪电提速'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 67%|██████▋ | 14892/22095 [25:46:23<7:50:41, 3.92s/it] {'loss': 0.3246, 'grad_norm': 0.6545120640034809, 'learning_rate': 2.5381798928351355e-06, 'epoch': 0.67} 67%|██████▋ | 14892/22095 [25:46:23<7:50:41, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66760 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48644 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82889 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14893/22095 [25:46:26<7:29:26, 3.74s/it] {'loss': 0.3176, 'grad_norm': 0.7178892402201152, 'learning_rate': 2.537541992616055e-06, 'epoch': 0.67} 67%|██████▋ | 14893/22095 [25:46:26<7:29:26, 3.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74202 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61590 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91820 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84382 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78600 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14894/22095 [25:46:29<7:00:06, 3.50s/it] {'loss': 0.3053, 'grad_norm': 0.5834186467443114, 'learning_rate': 2.5369041453076355e-06, 'epoch': 0.67} 67%|██████▋ | 14894/22095 [25:46:29<7:00:06, 3.50s/it] 67%|██████▋ | 14895/22095 [25:46:32<6:45:14, 3.38s/it] {'loss': 0.316, 'grad_norm': 0.637241992326166, 'learning_rate': 2.5362663509235796e-06, 'epoch': 0.67} 67%|██████▋ | 14895/22095 [25:46:32<6:45:14, 3.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [131, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365304 in VC:s3://internvl-moe-sft-data/. Exception: Image size [131, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 32045, 'image': 'vrdu_table_final_2/astro-ph.CO/500a796a-1439-4ff9-8843-67bb4873af8a.png', 'image_wh': [[131, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$0.06-0.11$\\end{tabular}\n```"}]} 67%|██████▋ | 14896/22095 [25:46:36<6:55:20, 3.46s/it] {'loss': 0.2712, 'grad_norm': 0.6823378248252496, 'learning_rate': 2.5356286094775943e-06, 'epoch': 0.67} 67%|██████▋ | 14896/22095 [25:46:36<6:55:20, 3.46s/it] 67%|██████▋ | 14897/22095 [25:46:39<6:55:47, 3.47s/it] {'loss': 0.2777, 'grad_norm': 0.6170334175678358, 'learning_rate': 2.5349909209833823e-06, 'epoch': 0.67} 67%|██████▋ | 14897/22095 [25:46:39<6:55:47, 3.47s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44480 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72862 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46882 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56473 > 40960). 
Running this sequence through the model will result in indexing errors 67%|██████▋ | 14898/22095 [25:46:47<9:13:05, 4.61s/it] {'loss': 0.4658, 'grad_norm': 0.30346031839056486, 'learning_rate': 2.5343532854546425e-06, 'epoch': 0.67} 67%|██████▋ | 14898/22095 [25:46:47<9:13:05, 4.61s/it] 67%|██████▋ | 14899/22095 [25:46:50<8:25:34, 4.22s/it] {'loss': 0.3181, 'grad_norm': 0.5985682336587592, 'learning_rate': 2.533715702905078e-06, 'epoch': 0.67} 67%|██████▋ | 14899/22095 [25:46:50<8:25:34, 4.22s/it] 67%|██████▋ | 14900/22095 [25:46:53<7:40:40, 3.84s/it] {'loss': 0.3288, 'grad_norm': 0.6128606460509117, 'learning_rate': 2.53307817334839e-06, 'epoch': 0.67} 67%|██████▋ | 14900/22095 [25:46:53<7:40:40, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14901/22095 [25:47:00<9:50:10, 4.92s/it] {'loss': 0.4709, 'grad_norm': 0.2968114827100891, 'learning_rate': 2.5324406967982764e-06, 'epoch': 0.67} 67%|██████▋ | 14901/22095 [25:47:00<9:50:10, 4.92s/it] 67%|██████▋ | 14902/22095 [25:47:10<12:21:43, 6.19s/it] {'loss': 0.4621, 'grad_norm': 0.38491508192950186, 'learning_rate': 2.5318032732684306e-06, 'epoch': 0.67} 67%|██████▋ | 14902/22095 [25:47:10<12:21:43, 6.19s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 67%|██████▋ | 14903/22095 [25:47:13<10:54:23, 5.46s/it] {'loss': 0.3006, 'grad_norm': 0.6463875398224902, 'learning_rate': 2.5311659027725523e-06, 'epoch': 0.67} 67%|██████▋ | 14903/22095 [25:47:13<10:54:23, 5.46s/it] 67%|██████▋ | 14904/22095 [25:47:18<10:12:31, 5.11s/it] {'loss': 0.2863, 'grad_norm': 0.7427266610588816, 'learning_rate': 2.530528585324339e-06, 'epoch': 0.67} 67%|██████▋ | 14904/22095 [25:47:18<10:12:31, 5.11s/it] 67%|██████▋ | 14905/22095 [25:47:21<8:58:03, 4.49s/it] {'loss': 0.2886, 'grad_norm': 0.6320123767632743, 'learning_rate': 2.529891320937481e-06, 'epoch': 0.67} 67%|██████▋ | 14905/22095 [25:47:21<8:58:03, 4.49s/it]Invalidate trace cache @ step 2: expected module 1, but 
got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (54724 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14906/22095 [25:47:30<11:53:09, 5.95s/it] {'loss': 0.4613, 'grad_norm': 0.282299480515388, 'learning_rate': 2.5292541096256706e-06, 'epoch': 0.67} 67%|██████▋ | 14906/22095 [25:47:30<11:53:09, 5.95s/it] 67%|██████▋ | 14907/22095 [25:47:33<10:15:15, 5.14s/it] {'loss': 0.3129, 'grad_norm': 0.5895777919573687, 'learning_rate': 2.528616951402603e-06, 'epoch': 0.67} 67%|██████▋ | 14907/22095 [25:47:33<10:15:15, 5.14s/it] 67%|██████▋ | 14908/22095 [25:47:37<9:13:17, 4.62s/it] {'loss': 0.3194, 'grad_norm': 0.5958128659036863, 'learning_rate': 2.5279798462819647e-06, 'epoch': 0.67} 67%|██████▋ | 14908/22095 [25:47:37<9:13:17, 4.62s/it] 67%|██████▋ | 14909/22095 [25:47:39<8:04:48, 4.05s/it] {'loss': 0.3085, 'grad_norm': 0.6247947020635181, 'learning_rate': 2.52734279427745e-06, 'epoch': 0.67} 67%|██████▋ | 14909/22095 [25:47:39<8:04:48, 4.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79610 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50832 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14910/22095 [25:47:43<7:40:59, 3.85s/it] {'loss': 0.3221, 'grad_norm': 0.6390878216793594, 'learning_rate': 2.5267057954027437e-06, 'epoch': 0.67} 67%|██████▋ | 14910/22095 [25:47:43<7:40:59, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80474 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49278 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41093 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14911/22095 [25:47:47<7:38:15, 3.83s/it] {'loss': 0.2886, 'grad_norm': 0.6288651134205863, 'learning_rate': 2.5260688496715318e-06, 'epoch': 0.67} 67%|██████▋ | 14911/22095 [25:47:47<7:38:15, 3.83s/it] 67%|██████▋ | 14912/22095 [25:47:50<7:17:07, 3.65s/it] {'loss': 0.3176, 'grad_norm': 0.6489868403838829, 'learning_rate': 2.5254319570975026e-06, 'epoch': 0.67} 67%|██████▋ | 14912/22095 [25:47:50<7:17:07, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74233 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51364 > 40960). Running this sequence through the model will result in indexing errors 67%|██████▋ | 14913/22095 [25:47:54<7:22:20, 3.70s/it] {'loss': 0.3026, 'grad_norm': 0.6122035663289863, 'learning_rate': 2.524795117694344e-06, 'epoch': 0.67} 67%|██████▋ | 14913/22095 [25:47:54<7:22:20, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 67%|██████▋ | 14914/22095 [25:48:02<9:58:18, 5.00s/it] {'loss': 0.4706, 'grad_norm': 0.277427821508405, 'learning_rate': 2.5241583314757327e-06, 'epoch': 0.67} 67%|██████▋ | 14914/22095 [25:48:02<9:58:18, 5.00s/it] 68%|██████▊ | 14915/22095 [25:48:05<9:12:23, 4.62s/it] {'loss': 0.3289, 'grad_norm': 0.6597618750541544, 'learning_rate': 2.523521598455355e-06, 'epoch': 0.68} 68%|██████▊ | 14915/22095 [25:48:05<9:12:23, 4.62s/it] 68%|██████▊ | 14916/22095 [25:48:09<8:40:12, 4.35s/it] {'loss': 0.3565, 'grad_norm': 0.5947017395599469, 'learning_rate': 2.522884918646894e-06, 'epoch': 0.68} 68%|██████▊ | 14916/22095 [25:48:09<8:40:12, 4.35s/it]Token 
indices sequence length is longer than the specified maximum sequence length for this model (87598 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82248 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48855 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14917/22095 [25:48:12<8:01:49, 4.03s/it] {'loss': 0.2782, 'grad_norm': 0.5862091108400317, 'learning_rate': 2.5222482920640285e-06, 'epoch': 0.68} 68%|██████▊ | 14917/22095 [25:48:12<8:01:49, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14918/22095 [25:48:22<11:20:41, 5.69s/it] {'loss': 0.4796, 'grad_norm': 0.2960613611617704, 'learning_rate': 2.5216117187204346e-06, 'epoch': 0.68} 68%|██████▊ | 14918/22095 [25:48:22<11:20:41, 5.69s/it] 68%|██████▊ | 14919/22095 [25:48:31<13:38:13, 6.84s/it] {'loss': 0.4825, 'grad_norm': 0.2674426621288423, 'learning_rate': 2.520975198629794e-06, 'epoch': 0.68} 68%|██████▊ | 14919/22095 [25:48:31<13:38:13, 6.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 68%|██████▊ | 14920/22095 [25:48:36<11:57:53, 6.00s/it] {'loss': 0.2978, 'grad_norm': 0.6306769854950436, 'learning_rate': 2.520338731805785e-06, 'epoch': 0.68} 68%|██████▊ | 14920/22095 [25:48:36<11:57:53, 6.00s/it] 68%|██████▊ | 14921/22095 [25:48:45<14:12:30, 7.13s/it] {'loss': 0.4577, 'grad_norm': 0.26766885441575, 'learning_rate': 2.5197023182620795e-06, 'epoch': 0.68} 68%|██████▊ | 14921/22095 [25:48:45<14:12:30, 7.13s/it]Invalidate trace cache @ step 2: 
expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 14922/22095 [25:48:50<12:43:11, 6.38s/it] {'loss': 0.3197, 'grad_norm': 0.6303183429655842, 'learning_rate': 2.5190659580123524e-06, 'epoch': 0.68} 68%|██████▊ | 14922/22095 [25:48:50<12:43:11, 6.38s/it] 68%|██████▊ | 14923/22095 [25:48:53<10:57:45, 5.50s/it] {'loss': 0.2715, 'grad_norm': 0.624135549266601, 'learning_rate': 2.51842965107028e-06, 'epoch': 0.68} 68%|██████▊ | 14923/22095 [25:48:53<10:57:45, 5.50s/it] 68%|██████▊ | 14924/22095 [25:48:57<10:02:38, 5.04s/it] {'loss': 0.3257, 'grad_norm': 0.6602309754641182, 'learning_rate': 2.517793397449531e-06, 'epoch': 0.68} 68%|██████▊ | 14924/22095 [25:48:57<10:02:38, 5.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14925/22095 [25:49:07<12:58:56, 6.52s/it] {'loss': 0.4771, 'grad_norm': 0.43063048420359085, 'learning_rate': 2.5171571971637805e-06, 'epoch': 0.68} 68%|██████▊ | 14925/22095 [25:49:07<12:58:56, 6.52s/it] 68%|██████▊ | 14926/22095 [25:49:11<10:59:50, 5.52s/it] {'loss': 0.2835, 'grad_norm': 0.6352771016933549, 'learning_rate': 2.5165210502266964e-06, 'epoch': 0.68} 68%|██████▊ | 14926/22095 [25:49:11<10:59:50, 5.52s/it] 68%|██████▊ | 14927/22095 [25:49:14<9:53:27, 4.97s/it] {'loss': 0.3086, 'grad_norm': 0.7075697434121344, 'learning_rate': 2.515884956651945e-06, 'epoch': 0.68} 68%|██████▊ | 14927/22095 [25:49:14<9:53:27, 4.97s/it] 68%|██████▊ | 14928/22095 [25:49:18<9:09:46, 4.60s/it] {'loss': 0.2733, 'grad_norm': 0.6011383335968612, 'learning_rate': 2.515248916453197e-06, 'epoch': 0.68} 68%|██████▊ | 14928/22095 [25:49:18<9:09:46, 4.60s/it] 68%|██████▊ | 14929/22095 [25:49:21<8:06:31, 4.07s/it] {'loss': 0.2945, 'grad_norm': 0.5922423977368473, 'learning_rate': 2.51461292964412e-06, 'epoch': 0.68} 68%|██████▊ | 14929/22095 [25:49:21<8:06:31, 4.07s/it] 68%|██████▊ | 14930/22095 [25:49:24<7:21:45, 
3.70s/it] {'loss': 0.2941, 'grad_norm': 1.2989889900263096, 'learning_rate': 2.5139769962383788e-06, 'epoch': 0.68} 68%|██████▊ | 14930/22095 [25:49:24<7:21:45, 3.70s/it] 68%|██████▊ | 14931/22095 [25:49:27<7:05:32, 3.56s/it] {'loss': 0.2791, 'grad_norm': 0.5764395400902315, 'learning_rate': 2.5133411162496335e-06, 'epoch': 0.68} 68%|██████▊ | 14931/22095 [25:49:27<7:05:32, 3.56s/it] 68%|██████▊ | 14932/22095 [25:49:31<7:29:30, 3.77s/it] {'loss': 0.3288, 'grad_norm': 0.6428344078064796, 'learning_rate': 2.512705289691551e-06, 'epoch': 0.68} 68%|██████▊ | 14932/22095 [25:49:31<7:29:30, 3.77s/it] 68%|██████▊ | 14933/22095 [25:49:35<7:37:45, 3.83s/it] {'loss': 0.2842, 'grad_norm': 0.6114776529509702, 'learning_rate': 2.5120695165777946e-06, 'epoch': 0.68} 68%|██████▊ | 14933/22095 [25:49:36<7:37:45, 3.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14934/22095 [25:49:43<10:18:17, 5.18s/it] {'loss': 0.4543, 'grad_norm': 0.28320462128242, 'learning_rate': 2.5114337969220233e-06, 'epoch': 0.68} 68%|██████▊ | 14934/22095 [25:49:43<10:18:17, 5.18s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8370443 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 37195, 'image': 'vrdu_table_final_2/astro-ph.CO/de14c4ac-12db-49cc-9214-f378ce54548a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 68%|██████▊ | 14935/22095 [25:49:47<9:06:10, 4.58s/it] {'loss': 0.3156, 'grad_norm': 0.6086254527757627, 'learning_rate': 2.510798130737895e-06, 'epoch': 0.68} 68%|██████▊ | 14935/22095 [25:49:47<9:06:10, 4.58s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8354292 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 20977, 'image': 'vrdu_table_final_2/astro-ph.CO/f0194cb0-1fb4-47bd-880c-c569973d6999.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 14936/22095 [25:49:50<8:28:26, 4.26s/it] {'loss': 0.3279, 'grad_norm': 0.6228585294371244, 'learning_rate': 2.510162518039071e-06, 'epoch': 0.68} 68%|██████▊ | 14936/22095 [25:49:50<8:28:26, 4.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45297 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51401 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14937/22095 [25:49:53<7:37:57, 3.84s/it] {'loss': 0.3093, 'grad_norm': 0.62743672614483, 'learning_rate': 2.5095269588392055e-06, 'epoch': 0.68} 68%|██████▊ | 14937/22095 [25:49:53<7:37:57, 3.84s/it]VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/safari_4/images/step_0.png 2025-08-28 17:47:52.568188 load time: 1319.12 ms 68%|██████▊ | 14938/22095 [25:49:57<7:37:01, 3.83s/it] {'loss': 0.3188, 'grad_norm': 0.637625832238023, 'learning_rate': 2.50889145315196e-06, 'epoch': 0.68} 68%|██████▊ | 14938/22095 [25:49:57<7:37:01, 3.83s/it] 68%|██████▊ | 14939/22095 [25:50:01<7:38:51, 3.85s/it] {'loss': 0.2864, 'grad_norm': 0.7949598540965647, 'learning_rate': 2.508256000990985e-06, 'epoch': 0.68} 68%|██████▊ | 14939/22095 [25:50:01<7:38:51, 3.85s/it] 68%|██████▊ | 14940/22095 [25:50:05<8:09:44, 4.11s/it] {'loss': 0.2753, 'grad_norm': 0.606394208948491, 'learning_rate': 2.5076206023699344e-06, 'epoch': 0.68} 68%|██████▊ | 14940/22095 [25:50:05<8:09:44, 4.11s/it] 68%|██████▊ | 14941/22095 [25:50:09<8:09:42, 4.11s/it] {'loss': 0.3026, 'grad_norm': 0.5686953213430728, 'learning_rate': 2.5069852573024624e-06, 'epoch': 0.68} 68%|██████▊ | 14941/22095 [25:50:09<8:09:42, 4.11s/it] 68%|██████▊ | 14942/22095 [25:50:13<7:44:52, 3.90s/it] {'loss': 0.3118, 'grad_norm': 0.6984108573696061, 'learning_rate': 2.5063499658022227e-06, 'epoch': 0.68} 68%|██████▊ | 14942/22095 [25:50:13<7:44:52, 3.90s/it] 68%|██████▊ | 14943/22095 [25:50:17<7:55:39, 3.99s/it] {'loss': 0.2925, 'grad_norm': 0.6034097239048911, 'learning_rate': 2.505714727882863e-06, 'epoch': 0.68} 68%|██████▊ | 14943/22095 [25:50:17<7:55:39, 3.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56506 > 40960). 
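The tokenizer warning repeated above ("Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)") means some samples exceed the model's 40960-token context window and would cause the indexing errors it predicts. A minimal, hypothetical guard is to clip sequences at collation time; the function name and call site below are assumptions, not taken from `data_qwen_2.py`, which may handle this differently.

```python
# Hypothetical guard against over-long samples; a sketch only,
# not the behavior of the actual training code.
MAX_MODEL_LEN = 40960  # context limit reported in the log

def truncate_ids(input_ids, max_len=MAX_MODEL_LEN):
    """Clip a token-id sequence to the model's context window.

    Sequences longer than max_len trigger the indexing errors the
    tokenizer warns about in the log above.
    """
    if len(input_ids) <= max_len:
        return input_ids
    return input_ids[:max_len]
```

An alternative to truncation is dropping such samples outright, since truncating mid-conversation can orphan image tokens.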
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51862 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14944/22095 [25:50:20<7:18:56, 3.68s/it] {'loss': 0.3273, 'grad_norm': 0.6563126248198178, 'learning_rate': 2.505079543558031e-06, 'epoch': 0.68} 68%|██████▊ | 14944/22095 [25:50:20<7:18:56, 3.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (60468 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54227 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94288 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14945/22095 [25:50:30<10:56:41, 5.51s/it] {'loss': 0.4755, 'grad_norm': 0.2716205123914153, 'learning_rate': 2.504444412841378e-06, 'epoch': 0.68} 68%|██████▊ | 14945/22095 [25:50:30<10:56:41, 5.51s/it] 68%|██████▊ | 14946/22095 [25:50:33<9:47:32, 4.93s/it] {'loss': 0.3471, 'grad_norm': 0.628172470558325, 'learning_rate': 2.503809335746553e-06, 'epoch': 0.68} 68%|██████▊ | 14946/22095 [25:50:33<9:47:32, 4.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (154426 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48535 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116226 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14947/22095 [25:50:37<8:48:09, 4.43s/it] {'loss': 0.2969, 'grad_norm': 0.6236765157508242, 'learning_rate': 2.5031743122871954e-06, 'epoch': 0.68} 68%|██████▊ | 14947/22095 [25:50:37<8:48:09, 4.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 14948/22095 [25:50:40<8:04:13, 4.07s/it] {'loss': 0.3084, 'grad_norm': 0.7013063988857922, 'learning_rate': 2.502539342476953e-06, 'epoch': 0.68} 68%|██████▊ | 14948/22095 [25:50:40<8:04:13, 4.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41039 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (132181 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44003 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48398 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41084 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14949/22095 [25:50:43<7:41:04, 3.87s/it] {'loss': 0.2788, 'grad_norm': 0.630089781026667, 'learning_rate': 2.5019044263294724e-06, 'epoch': 0.68} 68%|██████▊ | 14949/22095 [25:50:43<7:41:04, 3.87s/it] 68%|██████▊ | 14950/22095 [25:50:47<7:23:46, 3.73s/it] {'loss': 0.2592, 'grad_norm': 0.5768425223257683, 'learning_rate': 2.5012695638583933e-06, 'epoch': 0.68} 68%|██████▊ | 14950/22095 [25:50:47<7:23:46, 3.73s/it] 68%|██████▊ | 14951/22095 [25:50:50<7:17:00, 3.67s/it] {'loss': 0.2956, 'grad_norm': 0.669027420317218, 'learning_rate': 2.5006347550773547e-06, 'epoch': 0.68} 68%|██████▊ | 14951/22095 [25:50:50<7:17:00, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 68%|██████▊ | 14952/22095 [25:50:59<10:26:06, 5.26s/it] {'loss': 0.4678, 'grad_norm': 0.3014969538495637, 'learning_rate': 2.5000000000000015e-06, 'epoch': 0.68} 68%|██████▊ | 14952/22095 [25:50:59<10:26:06, 5.26s/it] 68%|██████▊ | 14953/22095 [25:51:09<12:54:24, 6.51s/it] {'loss': 0.4667, 'grad_norm': 0.28886597729116914, 'learning_rate': 2.4993652986399675e-06, 'epoch': 0.68} 68%|██████▊ | 14953/22095 [25:51:09<12:54:24, 6.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (68052 > 40960). 
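The `DecompressionBombWarning` above fires because a 117,990,000-pixel image exceeds Pillow's default cap of 89,478,485 pixels. If the oversized images in this dataset are trusted, the cap can be raised via `PIL.Image.MAX_IMAGE_PIXELS` (or disabled entirely by setting it to `None`); the specific threshold below is an assumption, not from the training code.

```python
from PIL import Image

# The log's largest offender is 117,990,000 pixels; Pillow's default
# limit is 89,478,485. Raise the cap only for trusted data sources:
# the warning exists to flag decompression-bomb DoS attacks.
Image.MAX_IMAGE_PIXELS = 200_000_000  # assumed threshold
```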
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14954/22095 [25:51:14<11:59:04, 6.04s/it] {'loss': 0.3097, 'grad_norm': 0.5987131742441169, 'learning_rate': 2.4987306510108956e-06, 'epoch': 0.68} 68%|██████▊ | 14954/22095 [25:51:14<11:59:04, 6.04s/it] 68%|██████▊ | 14955/22095 [25:51:25<14:58:04, 7.55s/it] {'loss': 0.491, 'grad_norm': 0.2877713827912253, 'learning_rate': 2.4980960571264195e-06, 'epoch': 0.68} 68%|██████▊ | 14955/22095 [25:51:25<14:58:04, 7.55s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 68%|██████▊ | 14956/22095 [25:51:29<13:06:49, 6.61s/it] {'loss': 0.2975, 'grad_norm': 0.7496311178592051, 'learning_rate': 2.497461517000173e-06, 'epoch': 0.68} 68%|██████▊ | 14956/22095 [25:51:29<13:06:49, 6.61s/it] 68%|██████▊ | 14957/22095 [25:51:40<15:39:23, 7.90s/it] {'loss': 0.4762, 'grad_norm': 0.3042963939527462, 'learning_rate': 2.496827030645793e-06, 'epoch': 0.68} 68%|██████▊ | 14957/22095 [25:51:40<15:39:23, 7.90s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 68%|██████▊ | 14958/22095 [25:51:45<13:58:33, 7.05s/it] {'loss': 0.2719, 'grad_norm': 0.6794376426236794, 'learning_rate': 2.4961925980769144e-06, 'epoch': 0.68} 68%|██████▊ | 14958/22095 [25:51:45<13:58:33, 7.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 14959/22095 [25:51:49<12:19:04, 6.21s/it] {'loss': 0.318, 'grad_norm': 0.682983746007749, 'learning_rate': 2.4955582193071664e-06, 'epoch': 0.68} 68%|██████▊ | 14959/22095 [25:51:49<12:19:04, 6.21s/it] 68%|██████▊ | 14960/22095 [25:51:53<10:38:50, 5.37s/it] {'loss': 0.3331, 'grad_norm': 0.5837459735848904, 'learning_rate': 2.494923894350179e-06, 'epoch': 0.68} 68%|██████▊ | 14960/22095 [25:51:53<10:38:50, 5.37s/it] 68%|██████▊ | 14961/22095 [25:51:56<9:39:40, 4.88s/it] {'loss': 0.3305, 'grad_norm': 0.6577176249026865, 'learning_rate': 2.494289623219583e-06, 'epoch': 
0.68} 68%|██████▊ | 14961/22095 [25:51:56<9:39:40, 4.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14962/22095 [25:52:03<10:29:38, 5.30s/it] {'loss': 0.4543, 'grad_norm': 0.3025775812822773, 'learning_rate': 2.4936554059290095e-06, 'epoch': 0.68} 68%|██████▊ | 14962/22095 [25:52:03<10:29:38, 5.30s/it] 68%|██████▊ | 14963/22095 [25:52:06<9:31:01, 4.80s/it] {'loss': 0.3092, 'grad_norm': 0.6238240104432574, 'learning_rate': 2.4930212424920837e-06, 'epoch': 0.68} 68%|██████▊ | 14963/22095 [25:52:06<9:31:01, 4.80s/it] 68%|██████▊ | 14964/22095 [25:52:10<8:34:48, 4.33s/it] {'loss': 0.2957, 'grad_norm': 0.5594267201562955, 'learning_rate': 2.49238713292243e-06, 'epoch': 0.68} 68%|██████▊ | 14964/22095 [25:52:10<8:34:48, 4.33s/it] 68%|██████▊ | 14965/22095 [25:52:13<7:45:54, 3.92s/it] {'loss': 0.3335, 'grad_norm': 0.6158262707830824, 'learning_rate': 2.491753077233676e-06, 'epoch': 0.68} 68%|██████▊ | 14965/22095 [25:52:13<7:45:54, 3.92s/it] 68%|██████▊ | 14966/22095 [25:52:15<7:07:56, 3.60s/it] {'loss': 0.3059, 'grad_norm': 0.6131637951233143, 'learning_rate': 2.4911190754394445e-06, 'epoch': 0.68} 68%|██████▊ | 14966/22095 [25:52:15<7:07:56, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46915 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14967/22095 [25:52:19<7:04:10, 3.57s/it] {'loss': 0.2906, 'grad_norm': 0.5816517091849465, 'learning_rate': 2.49048512755336e-06, 'epoch': 0.68} 68%|██████▊ | 14967/22095 [25:52:19<7:04:10, 3.57s/it] 68%|██████▊ | 14968/22095 [25:52:22<6:35:01, 3.33s/it] {'loss': 0.3485, 'grad_norm': 0.6345875687869158, 'learning_rate': 2.4898512335890425e-06, 'epoch': 0.68} 68%|██████▊ | 14968/22095 [25:52:22<6:35:01, 3.33s/it] 68%|██████▊ | 14969/22095 [25:52:25<6:43:48, 3.40s/it] {'loss': 0.3308, 'grad_norm': 0.6298797929591657, 'learning_rate': 2.4892173935601112e-06, 'epoch': 0.68} 68%|██████▊ | 14969/22095 [25:52:25<6:43:48, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified ) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess patches, image_grid_thw = self._preprocess( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in 
_preprocess resized_height, resized_width = smart_resize( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}") ValueError: height:21 and width:135 must be larger than factor:28 [Try #0] Failed to fetch sample 2092006 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:21 and width:135 must be larger than factor:28 Problematic sample: {'image': 'b740dccee641dd995e5ce727ca3882efdf31feffa6d5688fe120c85e9c186e93.png', 'conversations': [{'from': 'human', 'value': "\nThe visual attributes of this Button are:\nThe element is a circular button with a gray background and a black plus sign in the center. It has a minimalist design with no additional text or icons. The button is visually distinct due to its simple geometric shape and contrasting colors.\n\nThe spatial layout of this Button:\nThe button is located in the middle section of the interface, to the right of a green circular play button. It is part of a horizontal control panel that includes other interactive elements. The button is positioned between the play button and a three-dot menu icon.\n\nFunctional description of the Button:\nThe primary function of this button is likely to add or save the current item, such as a song or playlist, to a user's library or a specific list. 
Users can interact with it by clicking or tapping, which would typically result in the item being added to their collection."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]', 'recipient': 'all', 'end_turn': True}]} 68%|██████▊ | 14970/22095 [25:52:28<6:24:25, 3.24s/it] {'loss': 0.3261, 'grad_norm': 0.6800828759636052, 'learning_rate': 2.488583607480186e-06, 'epoch': 0.68} 68%|██████▊ | 14970/22095 [25:52:28<6:24:25, 3.24s/it] 68%|██████▊ | 14971/22095 [25:52:31<6:28:22, 3.27s/it] {'loss': 0.3023, 'grad_norm': 0.6050569028834316, 'learning_rate': 2.4879498753628885e-06, 'epoch': 0.68} 68%|██████▊ | 14971/22095 [25:52:31<6:28:22, 3.27s/it] 68%|██████▊ | 14972/22095 [25:52:34<6:13:10, 3.14s/it] {'loss': 0.3389, 'grad_norm': 0.6514948521987356, 'learning_rate': 2.487316197221833e-06, 'epoch': 0.68} 68%|██████▊ | 14972/22095 [25:52:34<6:13:10, 3.14s/it] 68%|██████▊ | 14973/22095 [25:52:38<6:21:50, 3.22s/it] {'loss': 0.2832, 'grad_norm': 0.6305047865132518, 'learning_rate': 2.486682573070633e-06, 'epoch': 0.68} 68%|██████▊ | 14973/22095 [25:52:38<6:21:50, 3.22s/it] 68%|██████▊ | 14974/22095 [25:52:40<6:06:57, 3.09s/it] {'loss': 0.3041, 'grad_norm': 0.642479019505682, 'learning_rate': 2.4860490029229056e-06, 'epoch': 0.68} 68%|██████▊ | 14974/22095 [25:52:40<6:06:57, 3.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14975/22095 [25:52:47<8:12:39, 4.15s/it] {'loss': 0.4724, 'grad_norm': 0.2966655752974817, 'learning_rate': 2.485415486792266e-06, 'epoch': 0.68} 68%|██████▊ | 14975/22095 [25:52:47<8:12:39, 4.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 14976/22095 [25:52:51<7:48:09, 3.95s/it] {'loss': 0.3296, 'grad_norm': 0.654410954345375, 'learning_rate': 2.4847820246923244e-06, 'epoch': 0.68} 68%|██████▊ | 14976/22095 [25:52:51<7:48:09, 3.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this 
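Both fetch failures above reduce to the same cause: images smaller than the 28-pixel patch factor that `smart_resize` enforces (the [14, 23] table crops, and the 21×135 figma crop rejected with "height:21 and width:135 must be larger than factor:28"). One hypothetical mitigation is a pre-filter over the `image_wh` metadata visible in the "Problematic sample" dumps, skipping such samples before they reach the image processor; the function name and the pass-through behavior for samples without `image_wh` are assumptions.

```python
MIN_SIDE = 28  # patch factor reported by smart_resize in the log

def is_processable(sample, min_side=MIN_SIDE):
    """Return True if every image in the sample meets the minimum size.

    `image_wh` is a list of [width, height] pairs, as seen in the
    "Problematic sample" dumps; samples lacking the field are passed
    through for the loader to handle.
    """
    sizes = sample.get("image_wh")
    if not sizes:
        return True
    return all(w >= min_side and h >= min_side for w, h in sizes)
```

Filtering at dataset-build time would also avoid the retry churn ("[Try #0] Failed to fetch sample …") these samples cause during training.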
model (79006 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60058 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111908 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59429 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14977/22095 [25:52:54<7:39:15, 3.87s/it] {'loss': 0.2785, 'grad_norm': 0.7127968738118841, 'learning_rate': 2.4841486166366908e-06, 'epoch': 0.68} 68%|██████▊ | 14977/22095 [25:52:54<7:39:15, 3.87s/it] 68%|██████▊ | 14978/22095 [25:52:58<7:17:41, 3.69s/it] {'loss': 0.2813, 'grad_norm': 0.6122340196596452, 'learning_rate': 2.483515262638978e-06, 'epoch': 0.68} 68%|██████▊ | 14978/22095 [25:52:58<7:17:41, 3.69s/it] 68%|██████▊ | 14979/22095 [25:53:01<7:08:07, 3.61s/it] {'loss': 0.2824, 'grad_norm': 0.6249973534876662, 'learning_rate': 2.482881962712794e-06, 'epoch': 0.68} 68%|██████▊ | 14979/22095 [25:53:01<7:08:07, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46623 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14980/22095 [25:53:04<6:50:43, 3.46s/it] {'loss': 0.2977, 'grad_norm': 0.6260218678357067, 'learning_rate': 2.4822487168717437e-06, 'epoch': 0.68} 68%|██████▊ | 14980/22095 [25:53:04<6:50:43, 3.46s/it] 68%|██████▊ | 14981/22095 [25:53:07<6:25:26, 3.25s/it] {'loss': 0.2551, 'grad_norm': 0.6507315239786348, 'learning_rate': 2.481615525129437e-06, 'epoch': 0.68} 68%|██████▊ | 14981/22095 [25:53:07<6:25:26, 3.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76370 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14982/22095 [25:53:10<6:06:27, 3.09s/it] {'loss': 0.3069, 'grad_norm': 0.5892910468708393, 'learning_rate': 2.480982387499477e-06, 'epoch': 0.68} 68%|██████▊ | 14982/22095 [25:53:10<6:06:27, 3.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48477 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47261 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55938 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45077 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14983/22095 [25:53:13<6:33:01, 3.32s/it] {'loss': 0.3318, 'grad_norm': 1.8074788988332977, 'learning_rate': 2.480349303995471e-06, 'epoch': 0.68} 68%|██████▊ | 14983/22095 [25:53:13<6:33:01, 3.32s/it] 68%|██████▊ | 14984/22095 [25:53:17<6:47:19, 3.44s/it] {'loss': 0.2766, 'grad_norm': 0.5737467735703359, 'learning_rate': 2.4797162746310193e-06, 'epoch': 0.68} 68%|██████▊ | 14984/22095 [25:53:17<6:47:19, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (54488 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48823 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44619 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72188 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43179 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 14985/22095 [25:53:27<10:21:02, 5.24s/it] {'loss': 0.4643, 'grad_norm': 0.3156381897305302, 'learning_rate': 2.479083299419723e-06, 'epoch': 0.68} 68%|██████▊ | 14985/22095 [25:53:27<10:21:02, 5.24s/it] 68%|██████▊ | 14986/22095 [25:53:31<9:35:50, 4.86s/it] {'loss': 0.3306, 'grad_norm': 0.7418544758899195, 'learning_rate': 2.4784503783751834e-06, 'epoch': 0.68} 68%|██████▊ | 14986/22095 [25:53:31<9:35:50, 4.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41784 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91209 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14987/22095 [25:53:35<9:14:38, 4.68s/it] {'loss': 0.2876, 'grad_norm': 0.5487600317511814, 'learning_rate': 2.477817511511003e-06, 'epoch': 0.68} 68%|██████▊ | 14987/22095 [25:53:35<9:14:38, 4.68s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (51152 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76192 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57936 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53310 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61178 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14988/22095 [25:53:42<10:31:02, 5.33s/it] {'loss': 0.4652, 'grad_norm': 0.2603885575345069, 'learning_rate': 2.477184698840779e-06, 'epoch': 0.68} 68%|██████▊ | 14988/22095 [25:53:42<10:31:02, 5.33s/it] 68%|██████▊ | 14989/22095 [25:53:45<9:24:09, 4.76s/it] {'loss': 0.3801, 'grad_norm': 0.6194122431720369, 'learning_rate': 2.4765519403781048e-06, 'epoch': 0.68} 68%|██████▊ | 14989/22095 [25:53:45<9:24:09, 4.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 14990/22095 [25:53:54<12:07:40, 6.15s/it] {'loss': 0.4808, 'grad_norm': 0.27762558728478176, 'learning_rate': 2.475919236136579e-06, 'epoch': 0.68} 68%|██████▊ | 14990/22095 [25:53:54<12:07:40, 6.15s/it] 68%|██████▊ | 14991/22095 [25:54:05<14:36:07, 7.40s/it] {'loss': 0.4834, 'grad_norm': 0.31156010455713856, 'learning_rate': 2.4752865861297994e-06, 'epoch': 0.68} 68%|██████▊ | 14991/22095 [25:54:05<14:36:07, 7.40s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 68%|██████▊ | 14992/22095 [25:54:09<12:34:15, 6.37s/it] {'loss': 0.3135, 'grad_norm': 0.6896153422630451, 'learning_rate': 2.474653990371356e-06, 'epoch': 0.68} 68%|██████▊ | 14992/22095 [25:54:09<12:34:15, 6.37s/it] 68%|██████▊ | 14993/22095 [25:54:19<14:35:05, 7.39s/it] {'loss': 0.4519, 'grad_norm': 0.2979982913756971, 'learning_rate': 2.474021448874841e-06, 'epoch': 0.68} 68%|██████▊ | 14993/22095 [25:54:19<14:35:05, 7.39s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 68%|██████▊ | 14994/22095 [25:54:22<12:31:04, 6.35s/it] {'loss': 0.307, 'grad_norm': 0.6265256108940277, 'learning_rate': 2.4733889616538493e-06, 'epoch': 0.68} 68%|██████▊ | 14994/22095 [25:54:22<12:31:04, 6.35s/it] 68%|██████▊ 
| 14995/22095 [25:54:32<14:22:16, 7.29s/it] {'loss': 0.4919, 'grad_norm': 0.292726559882182, 'learning_rate': 2.472756528721966e-06, 'epoch': 0.68} 68%|██████▊ | 14995/22095 [25:54:32<14:22:16, 7.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81289 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 14996/22095 [25:54:38<13:25:12, 6.81s/it] {'loss': 0.4561, 'grad_norm': 0.29246799357364633, 'learning_rate': 2.4721241500927863e-06, 'epoch': 0.68} 68%|██████▊ | 14996/22095 [25:54:38<13:25:12, 6.81s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 68%|██████▊ | 14997/22095 [25:54:41<11:41:14, 5.93s/it] {'loss': 0.2792, 'grad_norm': 0.6261086562541617, 'learning_rate': 2.4714918257798936e-06, 'epoch': 0.68} 68%|██████▊ | 14997/22095 [25:54:41<11:41:14, 5.93s/it] 68%|██████▊ | 14998/22095 [25:54:45<10:34:10, 5.36s/it] {'loss': 0.3236, 'grad_norm': 0.6299828393508288, 'learning_rate': 2.470859555796875e-06, 'epoch': 0.68} 68%|██████▊ | 14998/22095 [25:54:45<10:34:10, 5.36s/it] 68%|██████▊ | 14999/22095 [25:54:49<9:12:11, 4.67s/it] {'loss': 0.254, 'grad_norm': 0.6333138654289642, 'learning_rate': 2.470227340157316e-06, 'epoch': 0.68} 68%|██████▊ | 14999/22095 [25:54:49<9:12:11, 4.67s/it] 68%|██████▊ | 15000/22095 [25:54:52<8:14:59, 4.19s/it] {'loss': 0.291, 'grad_norm': 0.6989859571225095, 'learning_rate': 2.4695951788748047e-06, 'epoch': 0.68} 68%|██████▊ | 15000/22095 [25:54:52<8:14:59, 4.19s/it] 68%|██████▊ | 15001/22095 [25:54:55<7:53:01, 4.00s/it] {'loss': 0.2595, 'grad_norm': 0.6215116559144774, 'learning_rate': 2.4689630719629206e-06, 'epoch': 0.68} 68%|██████▊ | 
15001/22095 [25:54:55<7:53:01, 4.00s/it] 68%|██████▊ | 15002/22095 [25:54:59<7:40:14, 3.89s/it] {'loss': 0.2968, 'grad_norm': 0.5904808625350783, 'learning_rate': 2.468331019435245e-06, 'epoch': 0.68} 68%|██████▊ | 15002/22095 [25:54:59<7:40:14, 3.89s/it] 68%|██████▊ | 15003/22095 [25:55:02<7:03:21, 3.58s/it] {'loss': 0.2764, 'grad_norm': 0.575496997947114, 'learning_rate': 2.4676990213053603e-06, 'epoch': 0.68} 68%|██████▊ | 15003/22095 [25:55:02<7:03:21, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 68%|██████▊ | 15004/22095 [25:55:09<9:30:22, 4.83s/it] {'loss': 0.463, 'grad_norm': 0.31230031547698267, 'learning_rate': 2.467067077586848e-06, 'epoch': 0.68} 68%|██████▊ | 15004/22095 [25:55:09<9:30:22, 4.83s/it] 68%|██████▊ | 15005/22095 [25:55:13<9:01:52, 4.59s/it] {'loss': 0.2692, 'grad_norm': 0.7635158903039766, 'learning_rate': 2.466435188293286e-06, 'epoch': 0.68} 68%|██████▊ | 15005/22095 [25:55:13<9:01:52, 4.59s/it] 68%|██████▊ | 15006/22095 [25:55:17<8:39:36, 4.40s/it] {'loss': 0.3226, 'grad_norm': 1.3168042258773374, 'learning_rate': 2.4658033534382476e-06, 'epoch': 0.68} 68%|██████▊ | 15006/22095 [25:55:17<8:39:36, 4.40s/it] 68%|██████▊ | 15007/22095 [25:55:20<7:45:59, 3.94s/it] {'loss': 0.2941, 'grad_norm': 0.5862584713732727, 'learning_rate': 2.465171573035314e-06, 'epoch': 0.68} 68%|██████▊ | 15007/22095 [25:55:20<7:45:59, 3.94s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8301350 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. 
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1Ci0yLXXXXXcFXVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text information from this photo, but do not output the coordinates.'}, {'from': 'gpt', 'value': 'All words in the image:\n软件购买\n旺旺ID\n:whzywj\n体重/weight\n48.2\n公斤\nWelevas韦乐雅斯\n强大的批量处理,能避免你想要的效果!!!'}]}
68%|██████▊ | 15008/22095 [25:55:24<7:49:50, 3.98s/it] {'loss': 0.2765, 'grad_norm': 0.5971792872764682, 'learning_rate': 2.4645398470980564e-06, 'epoch': 0.68}
68%|██████▊ | 15009/22095 [25:55:28<7:44:50, 3.94s/it] {'loss': 0.2697, 'grad_norm': 0.6321924431964658, 'learning_rate': 2.463908175640052e-06, 'epoch': 0.68}
68%|██████▊ | 15010/22095 [25:55:32<7:54:40, 4.02s/it] {'loss': 0.2953, 'grad_norm': 0.643688802392928, 'learning_rate': 2.463276558674872e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15011/22095 [25:55:36<7:28:01, 3.79s/it] {'loss': 0.2685, 'grad_norm': 0.9119352682482706, 'learning_rate': 2.462644996216086e-06, 'epoch': 0.68}
68%|██████▊ | 15012/22095 [25:55:38<6:53:09, 3.50s/it] {'loss': 0.2431, 'grad_norm': 0.925526246168888, 'learning_rate': 2.4620134882772683e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (84829 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41629 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62029 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69682 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15013/22095 [25:55:44<8:14:56, 4.19s/it] {'loss': 0.4957, 'grad_norm': 0.32426128200660725, 'learning_rate': 2.461382034871986e-06, 'epoch': 0.68}
68%|██████▊ | 15014/22095 [25:55:49<8:17:01, 4.21s/it] {'loss': 0.3386, 'grad_norm': 0.6005725209893311, 'learning_rate': 2.4607506360138044e-06, 'epoch': 0.68}
68%|██████▊ | 15015/22095 [25:55:53<8:19:14, 4.23s/it] {'loss': 0.3316, 'grad_norm': 0.5949138426379929, 'learning_rate': 2.460119291716293e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8357034 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23743, 'image': 'vrdu_table_final_2/astro-ph.CO/4b7ec940-c7f2-4846-96eb-b9c91c3a2780.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{#1} #2 \\end{tabular}\n```"}]}
68%|██████▊ | 15016/22095 [25:55:56<7:43:12, 3.93s/it] {'loss': 0.3183, 'grad_norm': 0.6225411278253692, 'learning_rate': 2.4594880019930194e-06, 'epoch': 0.68}
68%|██████▊ | 15017/22095 [25:55:59<7:04:11, 3.60s/it] {'loss': 0.3276, 'grad_norm': 0.6325554520778243, 'learning_rate': 2.4588567668575463e-06, 'epoch': 0.68}
68%|██████▊ | 15018/22095 [25:56:02<6:57:34, 3.54s/it] {'loss': 0.3176, 'grad_norm': 0.6760739235968206, 'learning_rate': 2.458225586323435e-06, 'epoch': 0.68}
68%|██████▊ | 15019/22095 [25:56:06<6:57:26, 3.54s/it] {'loss': 0.3023, 'grad_norm': 0.5645343967429904, 'learning_rate': 2.457594460404249e-06, 'epoch': 0.68}
68%|██████▊ | 15020/22095 [25:56:09<6:39:04, 3.38s/it] {'loss': 0.2849, 'grad_norm': 0.6337538050357513, 'learning_rate': 2.456963389113552e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43855 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61214 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65931 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55176 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15021/22095 [25:56:18<9:47:43, 4.98s/it] {'loss': 0.4676, 'grad_norm': 0.2906828686861893, 'learning_rate': 2.4563323724649006e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (53231 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15022/22095 [25:56:21<8:55:27, 4.54s/it] {'loss': 0.3091, 'grad_norm': 0.5591360322721972, 'learning_rate': 2.4557014104718536e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (42401 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45851 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59795 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56716 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15023/22095 [25:56:25<8:37:16, 4.39s/it] {'loss': 0.3012, 'grad_norm': 0.6327486766443987, 'learning_rate': 2.4550705031479697e-06, 'epoch': 0.68}
68%|██████▊ | 15024/22095 [25:56:29<8:18:59, 4.23s/it] {'loss': 0.297, 'grad_norm': 0.6229533038465822, 'learning_rate': 2.4544396505068037e-06, 'epoch': 0.68}
68%|██████▊ | 15025/22095 [25:56:33<8:05:54, 4.12s/it] {'loss': 0.3375, 'grad_norm': 0.7074763849772877, 'learning_rate': 2.4538088525619124e-06, 'epoch': 0.68}
68%|██████▊ | 15026/22095 [25:56:37<7:54:50, 4.03s/it] {'loss': 0.3504, 'grad_norm': 0.641807895663935, 'learning_rate': 2.453178109326849e-06, 'epoch': 0.68}
68%|██████▊ | 15027/22095 [25:56:40<7:46:32, 3.96s/it] {'loss': 0.3808, 'grad_norm': 0.6876442627979192, 'learning_rate': 2.452547420815165e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1334, 12, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8410065 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1334, 12, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12263, 'image': 'vrdu_table_final_2/astro-ph.CO/8f33418b-d929-47b3-8c04-56a22b2c66f0.png', 'image_wh': [[1334, 12]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{llllllllllllllllllllllllllllllllllllllllllllllll}\n & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & \\\\\n\\hline \\hline\n\\end{tabular}\n```"}]}
68%|██████▊ | 15028/22095 [25:56:44<7:21:58, 3.75s/it] {'loss': 0.2731, 'grad_norm': 0.6633112177907754, 'learning_rate': 2.4519167870404126e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15029/22095 [25:56:52<9:50:06, 5.01s/it] {'loss': 0.4741, 'grad_norm': 0.3050394725195347, 'learning_rate': 2.451286208016144e-06, 'epoch': 0.68}
68%|██████▊ | 15030/22095 [25:57:01<12:22:40, 6.31s/it] {'loss': 0.4441, 'grad_norm': 0.3037036497796297, 'learning_rate': 2.4506556837559074e-06, 'epoch': 0.68}
68%|██████▊ | 15031/22095 [25:57:07<12:06:11, 6.17s/it] {'loss': 0.4828, 'grad_norm': 0.2735878686555206, 'learning_rate': 2.450025214273249e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (114900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59454 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51790 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15032/22095 [25:57:10<10:34:51, 5.39s/it] {'loss': 0.3022, 'grad_norm': 0.6187834499050971, 'learning_rate': 2.4493947995817165e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15033/22095 [25:57:14<9:41:46, 4.94s/it] {'loss': 0.2988, 'grad_norm': 0.5779258372319492, 'learning_rate': 2.4487644396948584e-06, 'epoch': 0.68}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
68%|██████▊ | 15034/22095 [25:57:17<8:40:05, 4.42s/it] {'loss': 0.2988, 'grad_norm': 0.6103992843675301, 'learning_rate': 2.448134134626217e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [887, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8423561 in VC:s3://internvl-moe-sft-data/. Exception: Image size [887, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 85414, 'image': 'vrdu_texteq/astro-ph.CO/7e6ab3c3-cb61-4af7-bc14-ec7fd5d75f39.png', 'image_wh': [[887, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $ R$ is the virial radius of the halo. The second halo mass relation is'}]}
68%|██████▊ | 15035/22095 [25:57:20<7:49:46, 3.99s/it] {'loss': 0.2891, 'grad_norm': 0.6238415603043627, 'learning_rate': 2.4475038843893327e-06, 'epoch': 0.68}
68%|██████▊ | 15036/22095 [25:57:24<7:32:52, 3.85s/it] {'loss': 0.3173, 'grad_norm': 0.6368674523002104, 'learning_rate': 2.4468736889977536e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15037/22095 [25:57:32<10:05:56, 5.15s/it] {'loss': 0.468, 'grad_norm': 0.29612709938276416, 'learning_rate': 2.4462435484650156e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (47627 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44822 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75523 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15038/22095 [25:57:36<9:08:15, 4.66s/it] {'loss': 0.3427, 'grad_norm': 0.6270179992072349, 'learning_rate': 2.4456134628046617e-06, 'epoch': 0.68}
68%|██████▊ | 15039/22095 [25:57:40<8:39:28, 4.42s/it] {'loss': 0.2887, 'grad_norm': 0.6686654576794504, 'learning_rate': 2.4449834320302297e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15040/22095 [25:57:49<11:40:18, 5.96s/it] {'loss': 0.4802, 'grad_norm': 0.3245521884159804, 'learning_rate': 2.4443534561552543e-06, 'epoch': 0.68}
68%|██████▊ | 15041/22095 [25:57:52<10:09:53, 5.19s/it] {'loss': 0.3216, 'grad_norm': 0.6097813124642308, 'learning_rate': 2.4437235351932746e-06, 'epoch': 0.68}
68%|██████▊ | 15042/22095 [25:57:55<8:49:47, 4.51s/it] {'loss': 0.3034, 'grad_norm': 0.5942174037417574, 'learning_rate': 2.4430936691578287e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (107900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119413 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79218 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50753 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15043/22095 [25:57:59<8:32:48, 4.36s/it] {'loss': 0.3415, 'grad_norm': 0.6237027666111084, 'learning_rate': 2.442463858062444e-06, 'epoch': 0.68}
68%|██████▊ | 15044/22095 [25:58:03<8:05:31, 4.13s/it] {'loss': 0.3145, 'grad_norm': 0.6188913554656105, 'learning_rate': 2.441834101920655e-06, 'epoch': 0.68}
68%|██████▊ | 15045/22095 [25:58:07<7:45:26, 3.96s/it] {'loss': 0.291, 'grad_norm': 0.6309016051437135, 'learning_rate': 2.4412044007459945e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15046/22095 [25:58:16<11:00:09, 5.62s/it] {'loss': 0.468, 'grad_norm': 0.26462806540351497, 'learning_rate': 2.4405747545519966e-06, 'epoch': 0.68}
68%|██████▊ | 15047/22095 [25:58:25<12:45:04, 6.51s/it] {'loss': 0.4632, 'grad_norm': 0.2661694906424091, 'learning_rate': 2.4399451633521825e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 364, but got module 1
68%|██████▊ | 15048/22095 [25:58:28<10:50:38, 5.54s/it] {'loss': 0.3349, 'grad_norm': 0.6641368818274791, 'learning_rate': 2.4393156271600847e-06, 'epoch': 0.68}
68%|██████▊ | 15049/22095 [25:58:31<9:27:20, 4.83s/it] {'loss': 0.2653, 'grad_norm': 0.5960234629586615, 'learning_rate': 2.4386861459892312e-06, 'epoch': 0.68}
68%|██████▊ | 15050/22095 [25:58:34<8:22:51, 4.28s/it] {'loss': 0.3426, 'grad_norm': 0.62521763613905, 'learning_rate': 2.4380567198531462e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15051/22095 [25:58:44<11:44:46, 6.00s/it] {'loss': 0.4493, 'grad_norm': 0.2717915731520093, 'learning_rate': 2.4374273487653517e-06, 'epoch': 0.68}
68%|██████▊ | 15052/22095 [25:58:48<10:17:16, 5.26s/it] {'loss': 0.3155, 'grad_norm': 0.5922236013507167, 'learning_rate': 2.4367980327393752e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (66227 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (207497 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15053/22095 [25:58:51<9:08:26, 4.67s/it] {'loss': 0.3052, 'grad_norm': 0.6491962951976494, 'learning_rate': 2.4361687717887346e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (42557 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74128 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15054/22095 [25:58:54<8:18:36, 4.25s/it] {'loss': 0.2897, 'grad_norm': 0.6055216662373926, 'learning_rate': 2.435539565926955e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15055/22095 [25:58:58<7:59:50, 4.09s/it] {'loss': 0.2912, 'grad_norm': 0.62030495638086, 'learning_rate': 2.434910415167554e-06, 'epoch': 0.68}
68%|██████▊ | 15056/22095 [25:59:01<7:33:04, 3.86s/it] {'loss': 0.2882, 'grad_norm': 0.5848255587798645, 'learning_rate': 2.4342813195240477e-06, 'epoch': 0.68}
68%|██████▊ | 15057/22095 [25:59:05<7:32:11, 3.85s/it] {'loss': 0.2777, 'grad_norm': 1.0867145517086927, 'learning_rate': 2.4336522790099563e-06, 'epoch': 0.68}
68%|██████▊ | 15058/22095 [25:59:08<7:03:30, 3.61s/it] {'loss': 0.292, 'grad_norm': 0.616768628832872, 'learning_rate': 2.4330232936387975e-06, 'epoch': 0.68}
68%|██████▊ | 15059/22095 [25:59:12<7:12:26, 3.69s/it] {'loss': 0.2836, 'grad_norm': 0.6299643341596347, 'learning_rate': 2.4323943634240838e-06, 'epoch': 0.68}
68%|██████▊ | 15060/22095 [25:59:16<7:26:28, 3.81s/it] {'loss': 0.3279, 'grad_norm': 0.6101124308071156, 'learning_rate': 2.431765488379328e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15061/22095 [25:59:26<11:02:06, 5.65s/it] {'loss': 0.4191, 'grad_norm': 0.29891723094065137, 'learning_rate': 2.4311366685180436e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952446 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3281, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由题意得,EC+FD=EF-CD=8-4=4,∵E是AC的中点,F是BD的中点,∴AE+FB=EC+FD=4,∴AB=AE+FB+EF=4+8=12.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_2/images/20250417140039.png 2025-08-28 17:57:26.012435 load time: 1023.09 ms
68%|██████▊ | 15062/22095 [25:59:37<14:03:45, 7.20s/it] {'loss': 0.4834, 'grad_norm': 0.31111970321204335, 'learning_rate': 2.430507903853745e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 364, but got module 1
68%|██████▊ | 15063/22095 [25:59:41<12:12:10, 6.25s/it] {'loss': 0.2647, 'grad_norm': 0.5747182195248783, 'learning_rate': 2.42987919439994e-06, 'epoch': 0.68}
68%|██████▊ | 15064/22095 [25:59:45<10:45:58, 5.51s/it] {'loss': 0.2548, 'grad_norm': 0.6957102220606519, 'learning_rate': 2.429250540170135e-06, 'epoch': 0.68}
68%|██████▊ | 15065/22095 [25:59:48<9:11:42, 4.71s/it] {'loss': 0.2803, 'grad_norm': 0.5184383773596717, 'learning_rate': 2.428621941177843e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047604 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 12\nB. 16\nC. 9\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
68%|██████▊ | 15066/22095 [25:59:51<8:20:16, 4.27s/it] {'loss': 0.3289, 'grad_norm': 0.649242266265955, 'learning_rate': 2.4279933974365662e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15067/22095 [25:59:54<7:55:38, 4.06s/it] {'loss': 0.2894, 'grad_norm': 0.6386315571308373, 'learning_rate': 2.4273649089598133e-06, 'epoch': 0.68}
68%|██████▊ | 15068/22095 [25:59:57<7:18:18, 3.74s/it] {'loss': 0.2738, 'grad_norm': 0.6135005713400414, 'learning_rate': 2.4267364757610878e-06, 'epoch': 0.68}
68%|██████▊ | 15069/22095 [26:00:00<6:45:43, 3.46s/it] {'loss': 0.3396, 'grad_norm': 0.6213338889164568, 'learning_rate': 2.4261080978538897e-06, 'epoch': 0.68}
68%|██████▊ | 15070/22095 [26:00:03<6:26:45, 3.30s/it] {'loss': 0.2587, 'grad_norm': 0.5964228446691353, 'learning_rate': 2.425479775251724e-06, 'epoch': 0.68}
68%|██████▊ | 15071/22095 [26:00:07<6:56:56, 3.56s/it] {'loss': 0.3102, 'grad_norm': 0.5767022668088903, 'learning_rate': 2.4248515079680945e-06, 'epoch': 0.68}
68%|██████▊ | 15072/22095 [26:00:10<6:36:12, 3.38s/it] {'loss': 0.2785, 'grad_norm': 0.6267563690965127, 'learning_rate': 2.4242232960164937e-06, 'epoch': 0.68}
68%|██████▊ | 15073/22095 [26:00:13<6:27:54, 3.31s/it] {'loss': 0.2771, 'grad_norm': 0.6050276519979759, 'learning_rate': 2.423595139410423e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8903167 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26320, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]}
68%|██████▊ | 15074/22095 [26:00:17<6:37:25, 3.40s/it] {'loss': 0.2796, 'grad_norm': 0.5967407347994533, 'learning_rate': 2.4229670381633804e-06, 'epoch': 0.68}
68%|██████▊ | 15075/22095 [26:00:21<7:02:14, 3.61s/it] {'loss': 0.3188, 'grad_norm': 0.5878917183116571, 'learning_rate': 2.4223389922888646e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (74556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85916 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101367 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15076/22095 [26:00:24<6:42:11, 3.44s/it] {'loss': 0.3001, 'grad_norm': 0.5176462985315065, 'learning_rate': 2.4217110018003636e-06, 'epoch': 0.68}
Token indices sequence length is longer than the specified maximum sequence length for this model (54400 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92351 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89928 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93306 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15077/22095 [26:00:29<7:33:27, 3.88s/it] {'loss': 0.2778, 'grad_norm': 0.6936683358693684, 'learning_rate': 2.4210830667113745e-06, 'epoch': 0.68}
68%|██████▊ | 15078/22095 [26:00:33<7:27:46, 3.83s/it] {'loss': 0.3505, 'grad_norm': 0.6189248083875724, 'learning_rate': 2.4204551870353917e-06, 'epoch': 0.68}
68%|██████▊ | 15079/22095 [26:00:36<6:55:35, 3.55s/it] {'loss': 0.3475, 'grad_norm': 0.7311188825467586, 'learning_rate': 2.4198273627859043e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [484, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8515691 in VC:s3://internvl-moe-sft-data/. Exception: Image size [484, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 194, 'image': 'vrdu_texteq/astro-ph.CO/042ae6de-924c-45ce-a98a-ac231fd9ab89.png', 'image_wh': [[484, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $ \\hbar$ is the reduced Planck constant.'}]}
68%|██████▊ | 15080/22095 [26:00:39<6:31:36, 3.35s/it] {'loss': 0.2967, 'grad_norm': 0.6536316463000047, 'learning_rate': 2.419199593976401e-06, 'epoch': 0.68}
68%|██████▊ | 15081/22095 [26:00:42<6:32:26, 3.36s/it] {'loss': 0.2991, 'grad_norm': 0.6168381295450212, 'learning_rate': 2.4185718806203738e-06, 'epoch': 0.68}
68%|██████▊ | 15082/22095 [26:00:45<6:21:55, 3.27s/it] {'loss': 0.2927, 'grad_norm': 0.5989253129012235, 'learning_rate': 2.4179442227313065e-06, 'epoch': 0.68}
68%|██████▊ | 15083/22095 [26:00:49<6:34:34, 3.38s/it] {'loss': 0.3425, 'grad_norm': 0.6387303771408718, 'learning_rate': 2.41731662032269e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15084/22095 [26:00:58<10:12:20, 5.24s/it] {'loss': 0.4635, 'grad_norm': 0.3572420858535034, 'learning_rate': 2.4166890734080066e-06, 'epoch': 0.68}
68%|██████▊ | 15085/22095 [26:01:04<10:44:29, 5.52s/it] {'loss': 0.4804, 'grad_norm': 0.3106526202462484, 'learning_rate': 2.41606158200074e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (91601 > 40960). Running this sequence through the model will result in indexing errors
68%|██████▊ | 15086/22095 [26:01:08<9:49:10, 5.04s/it] {'loss': 0.2942, 'grad_norm': 0.6606184577508364, 'learning_rate': 2.4154341461143734e-06, 'epoch': 0.68}
68%|██████▊ | 15087/22095 [26:01:12<8:55:31, 4.58s/it] {'loss': 0.3086, 'grad_norm': 0.6548534213992573, 'learning_rate': 2.4148067657623907e-06, 'epoch': 0.68}
68%|██████▊ | 15088/22095 [26:01:16<8:51:23, 4.55s/it] {'loss': 0.2829, 'grad_norm': 0.6151721435049017, 'learning_rate': 2.4141794409582713e-06, 'epoch': 0.68}
68%|██████▊ | 15089/22095 [26:01:20<8:34:33, 4.41s/it] {'loss': 0.3002, 'grad_norm': 0.70234226813624, 'learning_rate': 2.413552171715492e-06, 'epoch': 0.68}
68%|██████▊ | 15090/22095 [26:01:24<7:59:36, 4.11s/it] {'loss': 0.3023, 'grad_norm': 0.5181581808026793, 'learning_rate': 2.412924958047533e-06, 'epoch': 0.68}
68%|██████▊ | 15091/22095 [26:01:27<7:37:38, 3.92s/it] {'loss': 0.3185, 'grad_norm': 0.655670861689387, 'learning_rate': 2.4122977999678727e-06, 'epoch': 0.68}
68%|██████▊ | 15092/22095 [26:01:32<7:55:50, 4.08s/it] {'loss': 0.3472, 'grad_norm': 0.5946982122245101, 'learning_rate': 2.4116706974899857e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15093/22095 [26:01:36<7:52:45, 4.05s/it] {'loss': 0.3462, 'grad_norm': 0.7419455436996496, 'learning_rate': 2.411043650627343e-06, 'epoch': 0.68}
Invalidate trace cache @ step 2: expected module 1, but got module 364
68%|██████▊ | 15094/22095 [26:01:45<10:59:53, 5.66s/it] {'loss': 0.4753, 'grad_norm': 0.33988968281405163, 'learning_rate': 2.4104166593934237e-06, 'epoch': 0.68}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
68%|██████▊ | 15095/22095 [26:01:49<9:53:25, 5.09s/it] {'loss': 0.3085, 'grad_norm': 0.643007225023249, 'learning_rate': 2.409789723801695e-06, 'epoch': 0.68}
68%|██████▊ | 15096/22095 [26:01:53<9:08:52, 4.71s/it] {'loss': 0.2933, 'grad_norm': 0.7106285604036804, 'learning_rate': 2.409162843865632e-06, 'epoch': 0.68}
68%|██████▊ | 15097/22095 [26:01:56<8:06:56, 4.17s/it] {'loss': 0.2589, 'grad_norm': 0.6235195273871946, 'learning_rate': 2.4085360195987017e-06, 'epoch': 0.68}
68%|██████▊ | 15098/22095 [26:02:00<8:11:01, 4.21s/it] {'loss': 0.3071, 'grad_norm': 0.5841690154702869, 'learning_rate': 2.4079092510143712e-06, 'epoch': 0.68}
68%|██████▊ | 15099/22095 [26:02:03<7:35:21, 3.91s/it] {'loss': 0.2403, 'grad_norm': 0.5636313308066683, 'learning_rate': 2.407282538126111e-06, 'epoch': 0.68}
68%|██████▊ | 15100/22095 [26:02:07<7:50:24, 4.03s/it] {'loss': 0.2854, 'grad_norm': 0.6407811380404763, 'learning_rate': 2.4066558809473896e-06, 'epoch': 0.68}
68%|██████▊ | 15101/22095 [26:02:10<7:15:20, 3.73s/it] {'loss': 0.3146, 'grad_norm': 0.6759491624865307, 'learning_rate': 2.406029279491664e-06, 'epoch': 0.68}
68%|██████▊ | 15102/22095 [26:02:13<6:47:13, 3.49s/it] {'loss': 0.3136, 'grad_norm': 0.6259445406121261, 'learning_rate': 2.405402733772403e-06, 'epoch': 0.68}
68%|██████▊ | 15103/22095 [26:02:16<6:24:17, 3.30s/it] {'loss': 0.2622, 'grad_norm': 0.6051948786321726, 'learning_rate': 2.404776243803068e-06, 'epoch': 0.68}
68%|██████▊ | 15104/22095 [26:02:20<6:38:16, 3.42s/it] {'loss': 0.2993, 'grad_norm': 0.5876119935476364, 'learning_rate': 2.4041498095971253e-06, 'epoch': 0.68}
68%|██████▊ | 15105/22095 [26:02:23<6:32:52, 3.37s/it] {'loss': 0.3371, 'grad_norm': 0.5800757727860213, 'learning_rate': 2.4035234311680267e-06, 'epoch': 0.68}
68%|██████▊ | 15106/22095 [26:02:27<6:35:07, 3.39s/it] {'loss': 0.2724, 'grad_norm': 0.6108119594459906, 'learning_rate': 2.402897108529235e-06, 'epoch': 0.68}
68%|██████▊ | 15107/22095 [26:02:30<6:51:41, 3.53s/it] {'loss': 0.3746, 'grad_norm': 0.6520504016494019, 'learning_rate': 2.40227084169421e-06, 'epoch': 0.68}
68%|██████▊ | 15108/22095 [26:02:34<6:35:01, 3.39s/it] {'loss': 0.3151, 'grad_norm': 0.7391233436868181, 'learning_rate': 2.401644630676406e-06, 'epoch': 0.68}
68%|██████▊ | 15109/22095 [26:02:37<6:44:26, 3.47s/it] {'loss': 0.2969, 'grad_norm': 0.5998437164037514, 'learning_rate': 2.4010184754892773e-06, 'epoch': 0.68}
68%|██████▊ | 15110/22095 [26:02:41<6:42:17, 3.46s/it] {'loss': 0.3113, 'grad_norm': 0.612633443932278, 'learning_rate': 2.400392376146281e-06, 'epoch': 0.68}
68%|██████▊ | 15111/22095 [26:02:44<6:52:06, 3.54s/it] {'loss': 0.3705, 'grad_norm': 0.6737683247307297, 'learning_rate': 2.3997663326608663e-06, 'epoch': 0.68}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929820 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 52973, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 
8cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BD=4cm,∴AD=AB-BD=10-4=6(cm),∵点C是AD中点,∴CD=\\frac{1}{2}AD=3cm,则BC=CD+BD=7cm,'}]} 68%|██████▊ | 15112/22095 [26:02:47<6:31:30, 3.36s/it] {'loss': 0.2848, 'grad_norm': 0.5984500289040087, 'learning_rate': 2.3991403450464896e-06, 'epoch': 0.68} 68%|██████▊ | 15112/22095 [26:02:47<6:31:30, 3.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 15113/22095 [26:02:51<6:57:09, 3.58s/it] {'loss': 0.3334, 'grad_norm': 0.685521554319672, 'learning_rate': 2.398514413316598e-06, 'epoch': 0.68} 68%|██████▊ | 15113/22095 [26:02:51<6:57:09, 3.58s/it] 68%|██████▊ | 15114/22095 [26:02:55<6:41:39, 3.45s/it] {'loss': 0.3064, 'grad_norm': 0.6089728266060771, 'learning_rate': 2.397888537484641e-06, 'epoch': 0.68} 68%|██████▊ | 15114/22095 [26:02:55<6:41:39, 3.45s/it] 68%|██████▊ | 15115/22095 [26:02:58<6:34:33, 3.39s/it] {'loss': 0.3223, 'grad_norm': 0.5548942735671424, 'learning_rate': 2.397262717564067e-06, 'epoch': 0.68} 68%|██████▊ | 15115/22095 [26:02:58<6:34:33, 3.39s/it] 68%|██████▊ | 15116/22095 [26:03:01<6:24:43, 3.31s/it] {'loss': 0.3467, 'grad_norm': 0.6383884701744859, 'learning_rate': 2.3966369535683254e-06, 'epoch': 0.68} 68%|██████▊ | 15116/22095 [26:03:01<6:24:43, 3.31s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8903163 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 26316, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. \\frac{11}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 68%|██████▊ | 15117/22095 [26:03:04<6:20:11, 3.27s/it] {'loss': 0.2759, 'grad_norm': 0.5859354192074316, 'learning_rate': 2.3960112455108604e-06, 'epoch': 0.68} 68%|██████▊ | 15117/22095 [26:03:04<6:20:11, 3.27s/it] 68%|██████▊ | 15118/22095 [26:03:07<6:22:34, 3.29s/it] {'loss': 0.2808, 'grad_norm': 0.6497286961086137, 'learning_rate': 2.3953855934051135e-06, 'epoch': 0.68} 68%|██████▊ | 15118/22095 [26:03:07<6:22:34, 3.29s/it] 68%|██████▊ | 15119/22095 [26:03:11<6:23:52, 3.30s/it] {'loss': 0.3897, 'grad_norm': 0.6832885488840652, 'learning_rate': 2.3947599972645313e-06, 'epoch': 0.68} 68%|██████▊ | 15119/22095 [26:03:11<6:23:52, 3.30s/it] 68%|██████▊ | 15120/22095 [26:03:14<6:10:14, 3.18s/it] {'loss': 0.2846, 'grad_norm': 0.5990897995013262, 'learning_rate': 2.3941344571025575e-06, 'epoch': 0.68} 68%|██████▊ | 15120/22095 [26:03:14<6:10:14, 3.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93493 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 15121/22095 [26:03:17<6:13:46, 3.22s/it] {'loss': 0.295, 'grad_norm': 0.6285928770397525, 'learning_rate': 2.3935089729326307e-06, 'epoch': 0.68} 68%|██████▊ | 15121/22095 [26:03:17<6:13:46, 3.22s/it] 68%|██████▊ | 15122/22095 [26:03:20<6:16:44, 3.24s/it] {'loss': 0.3387, 'grad_norm': 0.637133212579721, 'learning_rate': 2.3928835447681886e-06, 'epoch': 0.68} 68%|██████▊ | 15122/22095 [26:03:20<6:16:44, 3.24s/it] 68%|██████▊ | 15123/22095 [26:03:23<6:16:35, 3.24s/it] {'loss': 0.2953, 'grad_norm': 0.5783934039329125, 'learning_rate': 2.392258172622674e-06, 'epoch': 0.68} 68%|██████▊ | 15123/22095 [26:03:24<6:16:35, 3.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 15124/22095 [26:03:27<6:30:03, 3.36s/it] {'loss': 0.2819, 'grad_norm': 0.5715727546399437, 'learning_rate': 2.391632856509521e-06, 'epoch': 0.68} 68%|██████▊ | 15124/22095 [26:03:27<6:30:03, 3.36s/it] 68%|██████▊ | 15125/22095 [26:03:30<6:28:00, 3.34s/it] {'loss': 0.3124, 'grad_norm': 0.6303769605021657, 'learning_rate': 2.3910075964421682e-06, 'epoch': 0.68} 68%|██████▊ | 15125/22095 [26:03:30<6:28:00, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55563 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80770 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47393 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 15126/22095 [26:03:34<6:38:19, 3.43s/it] {'loss': 0.3138, 'grad_norm': 0.6382169114047389, 'learning_rate': 2.390382392434049e-06, 'epoch': 0.68} 68%|██████▊ | 15126/22095 [26:03:34<6:38:19, 3.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (146604 > 40960). Running this sequence through the model will result in indexing errors 68%|██████▊ | 15127/22095 [26:03:37<6:26:31, 3.33s/it] {'loss': 0.3024, 'grad_norm': 1.884026468527679, 'learning_rate': 2.389757244498596e-06, 'epoch': 0.68} 68%|██████▊ | 15127/22095 [26:03:37<6:26:31, 3.33s/it] 68%|██████▊ | 15128/22095 [26:03:40<6:23:15, 3.30s/it] {'loss': 0.304, 'grad_norm': 0.6009302379190262, 'learning_rate': 2.389132152649243e-06, 'epoch': 0.68} 68%|██████▊ | 15128/22095 [26:03:40<6:23:15, 3.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 68%|██████▊ | 15129/22095 [26:03:50<9:55:28, 5.13s/it] {'loss': 0.4527, 'grad_norm': 0.2907404985498884, 'learning_rate': 2.3885071168994245e-06, 'epoch': 0.68} 68%|██████▊ | 15129/22095 [26:03:50<9:55:28, 5.13s/it] 68%|██████▊ | 15130/22095 [26:03:54<9:19:37, 4.82s/it] {'loss': 0.3325, 'grad_norm': 0.6375852995803484, 'learning_rate': 2.3878821372625645e-06, 'epoch': 0.68} 68%|██████▊ | 15130/22095 [26:03:54<9:19:37, 4.82s/it] 68%|██████▊ | 15131/22095 [26:03:58<8:39:26, 4.48s/it] {'loss': 0.2996, 'grad_norm': 0.6255821180704887, 'learning_rate': 2.3872572137520942e-06, 'epoch': 0.68} 68%|██████▊ | 15131/22095 [26:03:58<8:39:26, 4.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62540 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 15132/22095 [26:04:02<8:23:22, 4.34s/it] {'loss': 0.3524, 'grad_norm': 0.6084577807492569, 'learning_rate': 2.3866323463814426e-06, 'epoch': 0.68} 68%|██████▊ | 15132/22095 [26:04:02<8:23:22, 4.34s/it] 68%|██████▊ | 15133/22095 [26:04:06<8:25:13, 4.35s/it] {'loss': 0.2679, 'grad_norm': 0.6140593026995164, 'learning_rate': 2.386007535164039e-06, 'epoch': 0.68} 68%|██████▊ | 15133/22095 [26:04:06<8:25:13, 4.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41807 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60201 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43529 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87270 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118960 > 40960). 
Running this sequence through the model will result in indexing errors 68%|██████▊ | 15134/22095 [26:04:15<11:22:40, 5.88s/it] {'loss': 0.4455, 'grad_norm': 0.37943621133603733, 'learning_rate': 2.3853827801133015e-06, 'epoch': 0.68} 68%|██████▊ | 15134/22095 [26:04:15<11:22:40, 5.88s/it] 68%|██████▊ | 15135/22095 [26:04:20<10:25:09, 5.39s/it] {'loss': 0.32, 'grad_norm': 0.6303980113130074, 'learning_rate': 2.384758081242658e-06, 'epoch': 0.68} 68%|██████▊ | 15135/22095 [26:04:20<10:25:09, 5.39s/it] 69%|██████▊ | 15136/22095 [26:04:23<9:17:28, 4.81s/it] {'loss': 0.3279, 'grad_norm': 0.6468168375303229, 'learning_rate': 2.384133438565533e-06, 'epoch': 0.69} 69%|██████▊ | 15136/22095 [26:04:23<9:17:28, 4.81s/it] 69%|██████▊ | 15137/22095 [26:04:26<8:11:32, 4.24s/it] {'loss': 0.3224, 'grad_norm': 0.6264960194763265, 'learning_rate': 2.383508852095346e-06, 'epoch': 0.69} 69%|██████▊ | 15137/22095 [26:04:26<8:11:32, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (102492 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75711 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▊ | 15138/22095 [26:04:30<7:45:43, 4.02s/it] {'loss': 0.3099, 'grad_norm': 0.6333601588370803, 'learning_rate': 2.382884321845516e-06, 'epoch': 0.69} 69%|██████▊ | 15138/22095 [26:04:30<7:45:43, 4.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▊ | 15139/22095 [26:04:32<7:07:40, 3.69s/it] {'loss': 0.3354, 'grad_norm': 0.6282786241901707, 'learning_rate': 2.382259847829467e-06, 'epoch': 0.69} 69%|██████▊ | 15139/22095 [26:04:32<7:07:40, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366723 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33469, 'image': 'vrdu_table_final_2/astro-ph.CO/747e3ef3-1472-4322-987d-3242ebfa49e1.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{#1}#2\\end{tabular}\n```"}]} 69%|██████▊ | 15140/22095 [26:04:41<9:49:50, 5.09s/it] {'loss': 0.4853, 'grad_norm': 0.2730465512823068, 'learning_rate': 2.381635430060611e-06, 'epoch': 0.69} 69%|██████▊ | 15140/22095 [26:04:41<9:49:50, 5.09s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [367, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8527492 in VC:s3://internvl-moe-sft-data/. Exception: Image size [367, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 79644, 'image': 'vrdu_texteq/astro-ph.CO/b34bbdf1-52c4-41c8-a6c5-97ac0e3aca70.png', 'image_wh': [[367, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'where the kernel function ${\\mathcal D}$ is'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [295, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8416981 in VC:s3://internvl-moe-sft-data/. Exception: Image size [295, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 146600, 'image': 'vrdu_texteq/astro-ph.CO/6c935d1f-a2e5-4066-9306-23b00ae2acc7.png', 'image_wh': [[295, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': '${\\cal R}$ which we define to be'}]} 69%|██████▊ | 15141/22095 [26:04:45<9:10:07, 4.75s/it] {'loss': 0.2762, 'grad_norm': 0.5941869212174515, 'learning_rate': 2.38101106855237e-06, 'epoch': 0.69} 69%|██████▊ | 15141/22095 [26:04:45<9:10:07, 4.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15142/22095 [26:04:54<11:50:45, 6.13s/it] {'loss': 0.4624, 'grad_norm': 0.28845968455259635, 'learning_rate': 2.3803867633181575e-06, 'epoch': 0.69} 69%|██████▊ | 15142/22095 [26:04:54<11:50:45, 6.13s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047178 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. \\frac{11}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 69%|██████▊ | 15143/22095 [26:04:58<10:37:57, 5.51s/it] {'loss': 0.2775, 'grad_norm': 0.6890545300013281, 'learning_rate': 2.3797625143713865e-06, 'epoch': 0.69} 69%|██████▊ | 15143/22095 [26:04:58<10:37:57, 5.51s/it] 69%|██████▊ | 15144/22095 [26:05:02<9:41:11, 5.02s/it] {'loss': 0.3612, 'grad_norm': 0.6849086964199705, 'learning_rate': 2.3791383217254717e-06, 'epoch': 0.69} 69%|██████▊ | 15144/22095 [26:05:02<9:41:11, 5.02s/it] 69%|██████▊ | 15145/22095 [26:05:05<8:32:47, 4.43s/it] {'loss': 0.2985, 'grad_norm': 0.6329796736475822, 'learning_rate': 2.3785141853938266e-06, 'epoch': 0.69} 69%|██████▊ | 15145/22095 [26:05:05<8:32:47, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15146/22095 [26:05:15<11:37:23, 6.02s/it] {'loss': 0.4552, 'grad_norm': 0.2658507084788963, 'learning_rate': 2.37789010538986e-06, 'epoch': 0.69} 69%|██████▊ | 15146/22095 [26:05:15<11:37:23, 6.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77926 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43317 > 40960) for 4 sample(s). Truncating to 20918 with 3 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (135844 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15147/22095 [26:05:18<9:56:01, 5.15s/it] {'loss': 0.2875, 'grad_norm': 0.6106735785669123, 'learning_rate': 2.3772660817269806e-06, 'epoch': 0.69} 69%|██████▊ | 15147/22095 [26:05:18<9:56:01, 5.15s/it] 69%|██████▊ | 15148/22095 [26:05:21<8:53:30, 4.61s/it] {'loss': 0.2948, 'grad_norm': 0.5513527402067496, 'learning_rate': 2.3766421144185977e-06, 'epoch': 0.69} 69%|██████▊ | 15148/22095 [26:05:21<8:53:30, 4.61s/it] 69%|██████▊ | 15149/22095 [26:05:24<7:58:15, 4.13s/it] {'loss': 0.2645, 'grad_norm': 0.5290145509732991, 'learning_rate': 2.3760182034781203e-06, 'epoch': 0.69} 69%|██████▊ | 15149/22095 [26:05:24<7:58:15, 4.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81317 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15150/22095 [26:05:28<7:53:09, 4.09s/it] {'loss': 0.3056, 'grad_norm': 0.6402660303302956, 'learning_rate': 2.3753943489189537e-06, 'epoch': 0.69} 69%|██████▊ | 15150/22095 [26:05:28<7:53:09, 4.09s/it] 69%|██████▊ | 15151/22095 [26:05:31<7:09:12, 3.71s/it] {'loss': 0.3194, 'grad_norm': 0.6428218167659006, 'learning_rate': 2.3747705507544986e-06, 'epoch': 0.69} 69%|██████▊ | 15151/22095 [26:05:31<7:09:12, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70921 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▊ | 15152/22095 [26:05:34<6:53:52, 3.58s/it] {'loss': 0.2862, 'grad_norm': 0.6706866565556091, 'learning_rate': 2.3741468089981646e-06, 'epoch': 0.69} 69%|██████▊ | 15152/22095 [26:05:34<6:53:52, 3.58s/it] 69%|██████▊ | 15153/22095 [26:05:38<6:45:29, 3.50s/it] {'loss': 0.2883, 'grad_norm': 0.5990018968682368, 'learning_rate': 2.3735231236633483e-06, 'epoch': 0.69} 69%|██████▊ | 15153/22095 [26:05:38<6:45:29, 3.50s/it] 69%|██████▊ | 15154/22095 [26:05:41<6:42:28, 3.48s/it] {'loss': 0.293, 'grad_norm': 0.6217163752330904, 'learning_rate': 2.372899494763456e-06, 'epoch': 0.69} 69%|██████▊ | 15154/22095 [26:05:41<6:42:28, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42755 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41183 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94147 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▊ | 15155/22095 [26:05:50<10:06:47, 5.25s/it] {'loss': 0.4643, 'grad_norm': 0.28048717025355663, 'learning_rate': 2.3722759223118846e-06, 'epoch': 0.69} 69%|██████▊ | 15155/22095 [26:05:51<10:06:47, 5.25s/it] 69%|██████▊ | 15156/22095 [26:05:54<9:16:19, 4.81s/it] {'loss': 0.2873, 'grad_norm': 0.6270455720998772, 'learning_rate': 2.371652406322031e-06, 'epoch': 0.69} 69%|██████▊ | 15156/22095 [26:05:54<9:16:19, 4.81s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8401626 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 3790, 'image': 'vrdu_table_final_2/astro-ph.CO/a16bdba6-eb8b-414f-8ec4-9bb0c7cbfd6c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [59, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8374921 in VC:s3://internvl-moe-sft-data/. Exception: Image size [59, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 41697, 'image': 'vrdu_table_final_2/astro-ph.CO/2a1709eb-b86b-44c0-9b06-c1eaf8776914.png', 'image_wh': [[59, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}} IF$_{\\rm CL}$ \\end{tabular}\n```"}]} 69%|██████▊ | 15157/22095 [26:05:57<8:11:38, 4.25s/it] {'loss': 0.3119, 'grad_norm': 0.5866299850272513, 'learning_rate': 2.3710289468072957e-06, 'epoch': 0.69} 69%|██████▊ | 15157/22095 [26:05:57<8:11:38, 4.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15158/22095 [26:06:04<9:52:17, 5.12s/it] {'loss': 0.4836, 'grad_norm': 0.27914664793966554, 'learning_rate': 2.3704055437810754e-06, 'epoch': 0.69} 69%|██████▊ | 15158/22095 [26:06:04<9:52:17, 5.12s/it] 69%|██████▊ | 15159/22095 [26:06:08<8:46:10, 4.55s/it] {'loss': 0.2848, 'grad_norm': 0.5762223082408546, 'learning_rate': 2.3697821972567635e-06, 'epoch': 0.69} 69%|██████▊ | 15159/22095 [26:06:08<8:46:10, 4.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15160/22095 [26:06:17<11:36:16, 6.02s/it] {'loss': 0.4544, 'grad_norm': 0.29689041409226063, 'learning_rate': 2.3691589072477527e-06, 'epoch': 0.69} 69%|██████▊ | 15160/22095 [26:06:17<11:36:16, 6.02s/it] 69%|██████▊ | 15161/22095 [26:06:21<10:19:39, 5.36s/it] {'loss': 0.3022, 'grad_norm': 0.616291195054509, 'learning_rate': 2.3685356737674364e-06, 'epoch': 0.69} 69%|██████▊ | 15161/22095 [26:06:21<10:19:39, 5.36s/it] 69%|██████▊ | 15162/22095 [26:06:25<9:20:12, 4.85s/it] {'loss': 0.2788, 'grad_norm': 0.6195029821454514, 'learning_rate': 2.367912496829211e-06, 'epoch': 0.69} 69%|██████▊ | 15162/22095 [26:06:25<9:20:12, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15163/22095 [26:06:35<12:19:37, 
6.40s/it] {'loss': 0.4686, 'grad_norm': 0.2606837427560347, 'learning_rate': 2.367289376446458e-06, 'epoch': 0.69} 69%|██████▊ | 15163/22095 [26:06:35<12:19:37, 6.40s/it] 69%|██████▊ | 15164/22095 [26:06:39<11:02:51, 5.74s/it] {'loss': 0.2994, 'grad_norm': 0.6483111444061642, 'learning_rate': 2.3666663126325705e-06, 'epoch': 0.69} 69%|██████▊ | 15164/22095 [26:06:39<11:02:51, 5.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129817 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15165/22095 [26:06:42<9:32:08, 4.95s/it] {'loss': 0.2767, 'grad_norm': 0.5808106655048166, 'learning_rate': 2.3660433054009385e-06, 'epoch': 0.69} 69%|██████▊ | 15165/22095 [26:06:42<9:32:08, 4.95s/it] 69%|██████▊ | 15166/22095 [26:06:46<9:01:02, 4.69s/it] {'loss': 0.2759, 'grad_norm': 0.5582665769123862, 'learning_rate': 2.3654203547649463e-06, 'epoch': 0.69} 69%|██████▊ | 15166/22095 [26:06:46<9:01:02, 4.69s/it] 69%|██████▊ | 15167/22095 [26:06:49<8:03:27, 4.19s/it] {'loss': 0.2635, 'grad_norm': 0.6517516887106195, 'learning_rate': 2.364797460737977e-06, 'epoch': 0.69} 69%|██████▊ | 15167/22095 [26:06:49<8:03:27, 4.19s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89604 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73054 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15168/22095 [26:06:52<7:16:27, 3.78s/it] {'loss': 0.2802, 'grad_norm': 0.6199077902775992, 'learning_rate': 2.364174623333419e-06, 'epoch': 0.69} 69%|██████▊ | 15168/22095 [26:06:52<7:16:27, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87278 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50417 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15169/22095 [26:06:55<6:47:59, 3.53s/it] {'loss': 0.3534, 'grad_norm': 0.6461909522002824, 'learning_rate': 2.363551842564651e-06, 'epoch': 0.69} 69%|██████▊ | 15169/22095 [26:06:55<6:47:59, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59719 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43867 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67696 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15170/22095 [26:06:58<6:36:38, 3.44s/it] {'loss': 0.3022, 'grad_norm': 0.6299888083758769, 'learning_rate': 2.362929118445059e-06, 'epoch': 0.69} 69%|██████▊ | 15170/22095 [26:06:58<6:36:38, 3.44s/it] 69%|██████▊ | 15171/22095 [26:07:02<6:44:42, 3.51s/it] {'loss': 0.3375, 'grad_norm': 0.7662971676827985, 'learning_rate': 2.36230645098802e-06, 'epoch': 0.69} 69%|██████▊ | 15171/22095 [26:07:02<6:44:42, 3.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15172/22095 [26:07:10<9:25:05, 4.90s/it] {'loss': 0.4623, 'grad_norm': 0.28553350487559287, 'learning_rate': 2.3616838402069132e-06, 'epoch': 0.69} 69%|██████▊ | 15172/22095 [26:07:10<9:25:05, 4.90s/it] 69%|██████▊ | 15173/22095 [26:07:20<12:18:17, 6.40s/it] {'loss': 0.4512, 'grad_norm': 0.28039720222164843, 'learning_rate': 2.361061286115118e-06, 'epoch': 0.69} 69%|██████▊ | 15173/22095 [26:07:20<12:18:17, 6.40s/it]Invalidate trace cache @ step 2: 
expected module 364, but got module 1 69%|██████▊ | 15174/22095 [26:07:24<11:03:10, 5.75s/it] {'loss': 0.2728, 'grad_norm': 0.6293996693057095, 'learning_rate': 2.3604387887260122e-06, 'epoch': 0.69} 69%|██████▊ | 15174/22095 [26:07:24<11:03:10, 5.75s/it] 69%|██████▊ | 15175/22095 [26:07:29<10:33:52, 5.50s/it] {'loss': 0.2687, 'grad_norm': 0.620450588469678, 'learning_rate': 2.35981634805297e-06, 'epoch': 0.69} 69%|██████▊ | 15175/22095 [26:07:29<10:33:52, 5.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15176/22095 [26:07:39<13:09:11, 6.84s/it] {'loss': 0.4641, 'grad_norm': 0.2611191221662981, 'learning_rate': 2.359193964109364e-06, 'epoch': 0.69} 69%|██████▊ | 15176/22095 [26:07:39<13:09:11, 6.84s/it] 69%|██████▊ | 15177/22095 [26:07:46<13:33:28, 7.06s/it] {'loss': 0.4613, 'grad_norm': 0.2633681409227305, 'learning_rate': 2.3585716369085692e-06, 'epoch': 0.69} 69%|██████▊ | 15177/22095 [26:07:46<13:33:28, 7.06s/it] 69%|██████▊ | 15178/22095 [26:07:56<14:54:53, 7.76s/it] {'loss': 0.4759, 'grad_norm': 0.2854475002845172, 'learning_rate': 2.35794936646396e-06, 'epoch': 0.69} 69%|██████▊ | 15178/22095 [26:07:56<14:54:53, 7.76s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (42322 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (135075 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▊ | 15179/22095 [26:07:59<12:22:36, 6.44s/it] {'loss': 0.3001, 'grad_norm': 0.9651744734923887, 'learning_rate': 2.357327152788903e-06, 'epoch': 0.69} 69%|██████▊ | 15179/22095 [26:07:59<12:22:36, 6.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (134692 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▊ | 15180/22095 [26:08:02<10:29:07, 5.46s/it] {'loss': 0.277, 'grad_norm': 0.5976494048168159, 'learning_rate': 2.356704995896768e-06, 'epoch': 0.69} 69%|██████▊ | 15180/22095 [26:08:02<10:29:07, 5.46s/it] 69%|██████▊ | 15181/22095 [26:08:06<9:35:37, 5.00s/it] {'loss': 0.3234, 'grad_norm': 0.637045182518737, 'learning_rate': 2.3560828958009265e-06, 'epoch': 0.69} 69%|██████▊ | 15181/22095 [26:08:06<9:35:37, 5.00s/it] 69%|██████▊ | 15182/22095 [26:08:10<8:54:19, 4.64s/it] {'loss': 0.3296, 'grad_norm': 0.6255287785575562, 'learning_rate': 2.355460852514741e-06, 'epoch': 0.69} 69%|██████▊ | 15182/22095 [26:08:10<8:54:19, 4.64s/it] 69%|██████▊ | 15183/22095 [26:08:14<8:25:34, 4.39s/it] {'loss': 0.3245, 'grad_norm': 0.6239150838054454, 'learning_rate': 2.354838866051582e-06, 'epoch': 0.69} 69%|██████▊ | 15183/22095 [26:08:14<8:25:34, 4.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62052 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50503 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60723 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89967 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108303 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▊ | 15184/22095 [26:08:17<7:43:47, 4.03s/it] {'loss': 0.2777, 'grad_norm': 0.6051523620230199, 'learning_rate': 2.354216936424812e-06, 'epoch': 0.69} 69%|██████▊ | 15184/22095 [26:08:17<7:43:47, 4.03s/it] 69%|██████▊ | 15185/22095 [26:08:21<7:45:16, 4.04s/it] {'loss': 0.2943, 'grad_norm': 0.6090172591445984, 'learning_rate': 2.3535950636477915e-06, 'epoch': 0.69} 69%|██████▊ | 15185/22095 [26:08:21<7:45:16, 4.04s/it] 69%|██████▊ | 15186/22095 [26:08:25<7:37:01, 3.97s/it] {'loss': 0.2999, 'grad_norm': 0.627645662677417, 'learning_rate': 2.3529732477338857e-06, 'epoch': 0.69} 69%|██████▊ | 15186/22095 [26:08:25<7:37:01, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▊ | 15187/22095 [26:08:34<10:43:57, 5.59s/it] {'loss': 0.4835, 'grad_norm': 0.2991361603487357, 'learning_rate': 2.352351488696457e-06, 'epoch': 0.69} 69%|██████▊ | 15187/22095 [26:08:34<10:43:57, 5.59s/it] 69%|██████▊ | 15188/22095 [26:08:37<9:21:15, 4.88s/it] {'loss': 0.3004, 'grad_norm': 0.636926639668503, 'learning_rate': 2.351729786548863e-06, 'epoch': 0.69} 69%|██████▊ | 15188/22095 [26:08:37<9:21:15, 4.88s/it] 69%|██████▊ | 15189/22095 [26:08:41<8:23:31, 4.37s/it] {'loss': 0.3057, 'grad_norm': 0.5933253379819048, 'learning_rate': 2.3511081413044605e-06, 'epoch': 0.69} 69%|██████▊ | 15189/22095 [26:08:41<8:23:31, 4.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▊ | 15190/22095 [26:08:45<8:07:34, 4.24s/it] {'loss': 0.3628, 'grad_norm': 0.6585305490498075, 'learning_rate': 2.3504865529766084e-06, 'epoch': 0.69} 69%|██████▊ | 15190/22095 [26:08:45<8:07:34, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51621 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58590 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41960 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15191/22095 [26:08:48<7:51:57, 4.10s/it] {'loss': 0.2863, 'grad_norm': 0.791935913806427, 'learning_rate': 2.3498650215786656e-06, 'epoch': 0.69} 69%|██████▉ | 15191/22095 [26:08:48<7:51:57, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15192/22095 [26:08:57<10:38:20, 5.55s/it] {'loss': 0.4629, 'grad_norm': 0.3022278662815868, 'learning_rate': 2.349243547123983e-06, 'epoch': 0.69} 69%|██████▉ | 15192/22095 [26:08:57<10:38:20, 5.55s/it] 69%|██████▉ | 15193/22095 [26:09:02<10:00:47, 5.22s/it] {'loss': 0.3112, 'grad_norm': 0.6842440528203508, 'learning_rate': 2.348622129625914e-06, 'epoch': 0.69} 69%|██████▉ | 15193/22095 [26:09:02<10:00:47, 5.22s/it] 69%|██████▉ | 15194/22095 [26:09:05<8:44:52, 4.56s/it] {'loss': 0.2748, 'grad_norm': 0.598615904191372, 'learning_rate': 2.3480007690978153e-06, 'epoch': 0.69} 69%|██████▉ | 15194/22095 [26:09:05<8:44:52, 4.56s/it] 69%|██████▉ | 15195/22095 [26:09:08<8:15:08, 4.31s/it] {'loss': 0.2794, 'grad_norm': 0.6166353577834562, 'learning_rate': 2.3473794655530317e-06, 'epoch': 0.69} 69%|██████▉ | 15195/22095 [26:09:08<8:15:08, 4.31s/it] 69%|██████▉ | 15196/22095 [26:09:12<7:39:18, 3.99s/it] {'loss': 0.3015, 'grad_norm': 0.6474296155650149, 'learning_rate': 2.3467582190049194e-06, 'epoch': 0.69} 69%|██████▉ | 15196/22095 [26:09:12<7:39:18, 3.99s/it] 69%|██████▉ | 15197/22095 [26:09:16<7:35:16, 3.96s/it] {'loss': 0.3272, 'grad_norm': 0.6586067699851095, 'learning_rate': 2.3461370294668234e-06, 'epoch': 0.69} 69%|██████▉ | 15197/22095 
[26:09:16<7:35:16, 3.96s/it] 69%|██████▉ | 15198/22095 [26:09:19<7:15:51, 3.79s/it] {'loss': 0.3416, 'grad_norm': 0.6412806959305043, 'learning_rate': 2.3455158969520908e-06, 'epoch': 0.69} 69%|██████▉ | 15198/22095 [26:09:19<7:15:51, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [187, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8349608 in VC:s3://internvl-moe-sft-data/. Exception: Image size [187, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16280, 'image': 'vrdu_table_final_2/astro-ph.CO/bf202162-b612-4ae3-905b-eb7803fe2a3b.png', 'image_wh': [[187, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}\n LOGNORMAL\\\\\n\n\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [448, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8413431 in VC:s3://internvl-moe-sft-data/. Exception: Image size [448, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 127839, 'image': 'vrdu_texteq/astro-ph.CO/3555862e-d1d5-4254-a3b9-d22dfb381034.png', 'image_wh': [[448, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $k_B$ is the Boltzmann constant.'}]}
69%|██████▉ | 15199/22095 [26:09:29<10:44:34, 5.61s/it] {'loss': 0.4603, 'grad_norm': 0.29779863556989944, 'learning_rate': 2.3448948214740703e-06, 'epoch': 0.69} 69%|██████▉ | 15199/22095 [26:09:29<10:44:34, 5.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation
69%|██████▉ | 15200/22095 [26:09:33<9:37:48, 5.03s/it] {'loss': 0.3547, 'grad_norm': 0.7199625021107681, 'learning_rate': 2.3442738030461054e-06, 'epoch': 0.69} 69%|██████▉ | 15200/22095 [26:09:33<9:37:48, 5.03s/it]
69%|██████▉ | 15201/22095 [26:09:36<8:41:06, 4.54s/it] {'loss': 0.3109, 'grad_norm': 0.6317413566544562, 'learning_rate': 2.3436528416815384e-06, 'epoch': 0.69} 69%|██████▉ | 15201/22095 [26:09:36<8:41:06, 4.54s/it]
69%|██████▉ | 15202/22095 [26:09:40<8:22:00, 4.37s/it] {'loss': 0.2895, 'grad_norm': 0.5831209937004782, 'learning_rate': 2.343031937393714e-06, 'epoch': 0.69} 69%|██████▉ | 15202/22095 [26:09:40<8:22:00, 4.37s/it]
69%|██████▉ | 15203/22095 [26:09:43<7:44:47, 4.05s/it] {'loss': 0.2532, 'grad_norm': 0.5753754117741635, 'learning_rate': 2.342411090195974e-06, 'epoch': 0.69} 69%|██████▉ | 15203/22095 [26:09:43<7:44:47, 4.05s/it]
69%|██████▉ | 15204/22095 [26:09:46<7:04:35, 3.70s/it] {'loss': 0.3236, 'grad_norm': 0.6494344248438111, 'learning_rate': 2.341790300101658e-06, 'epoch': 0.69} 69%|██████▉ | 15204/22095 [26:09:46<7:04:35, 3.70s/it]
69%|██████▉ | 15205/22095 [26:09:50<7:01:44, 3.67s/it] {'loss': 0.2958, 'grad_norm': 0.6113703342258673, 'learning_rate': 2.3411695671241026e-06, 'epoch': 0.69} 69%|██████▉ |
15205/22095 [26:09:50<7:01:44, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44344 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51159 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46838 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90056 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51002 > 40960). Running this sequence through the model will result in indexing errors
69%|██████▉ | 15206/22095 [26:09:53<6:43:26, 3.51s/it] {'loss': 0.324, 'grad_norm': 0.6102866066688664, 'learning_rate': 2.3405488912766468e-06, 'epoch': 0.69} 69%|██████▉ | 15206/22095 [26:09:53<6:43:26, 3.51s/it]
69%|██████▉ | 15207/22095 [26:09:56<6:22:10, 3.33s/it] {'loss': 0.3133, 'grad_norm': 0.5710834246103191, 'learning_rate': 2.3399282725726297e-06, 'epoch': 0.69} 69%|██████▉ | 15207/22095 [26:09:56<6:22:10, 3.33s/it]
69%|██████▉ | 15208/22095 [26:09:59<6:06:17, 3.19s/it] {'loss': 0.2984, 'grad_norm': 0.5744056006601107, 'learning_rate': 2.3393077110253838e-06, 'epoch': 0.69} 69%|██████▉ | 15208/22095 [26:09:59<6:06:17, 3.19s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358067 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24778, 'image': 'vrdu_table_final_2/astro-ph.CO/9f2d3016-f2b2-4b76-a8c9-e898a9603007.png', 'image_wh': [[20, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}$\\Omega$ \\end{tabular}\n```"}]}
69%|██████▉ | 15209/22095 [26:10:02<6:08:21, 3.21s/it] {'loss': 0.3326, 'grad_norm': 0.6411009454649048, 'learning_rate': 2.338687206648242e-06, 'epoch': 0.69} 69%|██████▉ | 15209/22095 [26:10:02<6:08:21, 3.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50467 > 40960). Running this sequence through the model will result in indexing errors
69%|██████▉ | 15210/22095 [26:10:05<6:03:34, 3.17s/it] {'loss': 0.2735, 'grad_norm': 0.6492584420661419, 'learning_rate': 2.3380667594545402e-06, 'epoch': 0.69} 69%|██████▉ | 15210/22095 [26:10:05<6:03:34, 3.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52800 > 40960).
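[Editor's note] The recurring `ValueError: Image size ... is too small. Minimum size is 28` failures above all come from tiny table/equation crops (e.g. `[187, 23]`, `[20, 23]`) whose shorter side is under 28 px, the smallest side Qwen2.5-VL's vision tower can patchify (14-px patches with a 2x2 spatial merge). A minimal pre-filter sketch, assuming samples carry the same `image_wh` field as the problematic samples logged here; the helper name is hypothetical, not from the repo:

```python
# Hypothetical pre-filter: drop samples whose image is smaller than the
# 28-px minimum side length reported in the "Image size ... is too small" errors.
MIN_SIDE = 28  # Qwen2.5-VL minimum: 14-px vision patch * 2x2 spatial merge

def is_image_large_enough(sample, min_side=MIN_SIDE):
    """Return True iff every (width, height) in sample['image_wh'] meets min_side."""
    return all(w >= min_side and h >= min_side for w, h in sample["image_wh"])

# The four problematic samples from this log, plus one that would pass:
samples = [
    {"id": 16280,  "image_wh": [[187, 23]]},  # rejected: height 23 < 28
    {"id": 127839, "image_wh": [[448, 25]]},  # rejected: height 25 < 28
    {"id": 24778,  "image_wh": [[20, 23]]},   # rejected: both sides too small
    {"id": 15501,  "image_wh": [[28, 25]]},   # rejected: height 25 < 28
    {"id": 99999,  "image_wh": [[448, 64]]},  # kept (hypothetical sample)
]
kept = [s["id"] for s in samples if is_image_large_enough(s)]
print(kept)  # -> [99999]
```

Running such a filter once over the manifest would surface these samples before training instead of burning retries inside `__getitem__`.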
Running this sequence through the model will result in indexing errors 69%|██████▉ | 15211/22095 [26:10:08<5:55:16, 3.10s/it] {'loss': 0.3121, 'grad_norm': 0.6666730619678819, 'learning_rate': 2.337446369457607e-06, 'epoch': 0.69} 69%|██████▉ | 15211/22095 [26:10:08<5:55:16, 3.10s/it] 69%|██████▉ | 15212/22095 [26:10:11<5:53:01, 3.08s/it] {'loss': 0.3143, 'grad_norm': 0.6411538049352037, 'learning_rate': 2.3368260366707745e-06, 'epoch': 0.69} 69%|██████▉ | 15212/22095 [26:10:11<5:53:01, 3.08s/it] 69%|██████▉ | 15213/22095 [26:10:15<6:30:32, 3.40s/it] {'loss': 0.3211, 'grad_norm': 0.5730742986343003, 'learning_rate': 2.3362057611073722e-06, 'epoch': 0.69} 69%|██████▉ | 15213/22095 [26:10:15<6:30:32, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15214/22095 [26:10:24<9:56:40, 5.20s/it] {'loss': 0.4833, 'grad_norm': 0.6724012403402507, 'learning_rate': 2.3355855427807247e-06, 'epoch': 0.69} 69%|██████▉ | 15214/22095 [26:10:24<9:56:40, 5.20s/it] 69%|██████▉ | 15215/22095 [26:10:29<9:21:42, 4.90s/it] {'loss': 0.2862, 'grad_norm': 0.8056592749483618, 'learning_rate': 2.3349653817041607e-06, 'epoch': 0.69} 69%|██████▉ | 15215/22095 [26:10:29<9:21:42, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106627 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46204 > 40960). 
Running this sequence through the model will result in indexing errors
69%|██████▉ | 15216/22095 [26:10:32<8:30:14, 4.45s/it] {'loss': 0.3437, 'grad_norm': 0.6759884149762039, 'learning_rate': 2.3343452778910076e-06, 'epoch': 0.69} 69%|██████▉ | 15216/22095 [26:10:32<8:30:14, 4.45s/it]
69%|██████▉ | 15217/22095 [26:10:36<8:07:54, 4.26s/it] {'loss': 0.3249, 'grad_norm': 0.7560353506301231, 'learning_rate': 2.333725231354588e-06, 'epoch': 0.69} 69%|██████▉ | 15217/22095 [26:10:36<8:07:54, 4.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364
69%|██████▉ | 15218/22095 [26:10:44<10:12:31, 5.34s/it] {'loss': 0.4799, 'grad_norm': 0.3039208009113988, 'learning_rate': 2.333105242108222e-06, 'epoch': 0.69} 69%|██████▉ | 15218/22095 [26:10:44<10:12:31, 5.34s/it]
69%|██████▉ | 15219/22095 [26:10:51<11:25:00, 5.98s/it] {'loss': 0.466, 'grad_norm': 0.29755263599505566, 'learning_rate': 2.332485310165233e-06, 'epoch': 0.69} 69%|██████▉ | 15219/22095 [26:10:51<11:25:00, 5.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1
69%|██████▉ | 15220/22095 [26:10:55<10:07:49, 5.30s/it] {'loss': 0.3045, 'grad_norm': 0.5981605057766334, 'learning_rate': 2.3318654355389448e-06, 'epoch': 0.69} 69%|██████▉ | 15220/22095 [26:10:55<10:07:49, 5.30s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    buff = io.BytesIO(img_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924293 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nWhat is the east to west extent of USA? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'Pacific Ocean to Atlantic Ocean.\nThe texts mention "THE UNITED STATES" several times and also describe the country in various ways such as "LE PAYS ET SES HABITANTS", "HET LAND EN HET", "ABOUT THE UNITED STATES", "MAINE". However, there is no specific mention of the east to west extent of the country. The only clues we can gather are the mentions of the Pacific Ocean and the Atlantic Ocean. As the United States is located between these two oceans, its east to west extent can be stated as being from the Pacific Ocean to the Atlantic Ocean.'}]}
69%|██████▉ | 15221/22095 [26:10:59<9:19:20, 4.88s/it] {'loss': 0.3102, 'grad_norm': 0.7221055998904644, 'learning_rate': 2.3312456182426736e-06, 'epoch': 0.69} 69%|██████▉ | 15221/22095 [26:10:59<9:19:20, 4.88s/it]
69%|██████▉ | 15222/22095 [26:11:03<8:46:42, 4.60s/it] {'loss': 0.3337, 'grad_norm': 0.5825786882913035, 'learning_rate': 2.330625858289736e-06, 'epoch': 0.69} 69%|██████▉ | 15222/22095 [26:11:03<8:46:42, 4.60s/it]
69%|██████▉ | 15223/22095 [26:11:07<8:21:46, 4.38s/it] {'loss': 0.3228, 'grad_norm': 0.6336510446811916, 'learning_rate': 2.330006155693451e-06, 'epoch': 0.69} 69%|██████▉ | 15223/22095 [26:11:07<8:21:46, 4.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation
69%|██████▉ | 15224/22095 [26:11:13<9:42:43, 5.09s/it] {'loss': 0.4962, 'grad_norm': 0.32607904113700137, 'learning_rate': 2.3293865104671324e-06, 'epoch': 0.69} 69%|██████▉ | 15224/22095 [26:11:13<9:42:43, 5.09s/it]
69%|██████▉ | 15225/22095 [26:11:17<8:48:26, 4.62s/it] {'loss': 0.3154, 'grad_norm': 0.7399849731248741, 'learning_rate': 2.328766922624098e-06, 'epoch': 0.69} 69%|██████▉ | 15225/22095 [26:11:17<8:48:26, 4.62s/it]
69%|██████▉ | 15226/22095 [26:11:20<8:05:22, 4.24s/it] {'loss': 0.2894, 'grad_norm': 0.6466307348063,
'learning_rate': 2.3281473921776577e-06, 'epoch': 0.69} 69%|██████▉ | 15226/22095 [26:11:20<8:05:22, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41165 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52304 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46063 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15227/22095 [26:11:24<7:35:36, 3.98s/it] {'loss': 0.3082, 'grad_norm': 0.6474918515256418, 'learning_rate': 2.327527919141122e-06, 'epoch': 0.69} 69%|██████▉ | 15227/22095 [26:11:24<7:35:36, 3.98s/it] 69%|██████▉ | 15228/22095 [26:11:27<7:21:13, 3.86s/it] {'loss': 0.3248, 'grad_norm': 0.6907658263724318, 'learning_rate': 2.3269085035278037e-06, 'epoch': 0.69} 69%|██████▉ | 15228/22095 [26:11:27<7:21:13, 3.86s/it] 69%|██████▉ | 15229/22095 [26:11:31<7:04:36, 3.71s/it] {'loss': 0.302, 'grad_norm': 0.668855494816506, 'learning_rate': 2.326289145351014e-06, 'epoch': 0.69} 69%|██████▉ | 15229/22095 [26:11:31<7:04:36, 3.71s/it] 69%|██████▉ | 15230/22095 [26:11:34<7:00:17, 3.67s/it] {'loss': 0.3008, 'grad_norm': 0.6269732722978408, 'learning_rate': 2.325669844624058e-06, 'epoch': 0.69} 69%|██████▉ | 15230/22095 [26:11:34<7:00:17, 3.67s/it] 69%|██████▉ | 15231/22095 [26:11:37<6:43:12, 3.52s/it] {'loss': 0.2953, 'grad_norm': 0.665790307558888, 'learning_rate': 2.3250506013602425e-06, 'epoch': 0.69} 69%|██████▉ | 15231/22095 [26:11:37<6:43:12, 3.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (91517 > 40960). 
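[Editor's note] The `DecompressionBombError` logged above for the 19000x13393 InfographicsVQA page is Pillow's safety guard, not a corrupt file: Pillow raises the hard error once an image's pixel count exceeds twice `Image.MAX_IMAGE_PIXELS` (default 89,478,485), which is exactly the 178,956,970-pixel limit in the message. A sketch of the arithmetic, using only numbers from the log:

```python
# Pillow's default Image.MAX_IMAGE_PIXELS; the hard DecompressionBombError
# (as opposed to the softer DecompressionBombWarning) fires at twice this value.
MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3   # 89478485
BOMB_ERROR_LIMIT = 2 * MAX_IMAGE_PIXELS           # 178956970, the limit in the log

width, height = 19000, 13393    # 'image_wh' of the failing InfographicsVQA page
pixels = width * height
print(pixels)                     # 254467000, the pixel count PIL reports
print(pixels > BOMB_ERROR_LIMIT)  # True -> PIL.Image.DecompressionBombError
```

The usual mitigations are to raise `PIL.Image.MAX_IMAGE_PIXELS` (or set it to `None`) before decoding trusted data, or to skip/downscale such oversized pages in the loader.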
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45734 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81891 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74538 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15232/22095 [26:11:41<6:59:54, 3.67s/it] {'loss': 0.3252, 'grad_norm': 0.6336689756404082, 'learning_rate': 2.3244314155728758e-06, 'epoch': 0.69} 69%|██████▉ | 15232/22095 [26:11:41<6:59:54, 3.67s/it] 69%|██████▉ | 15233/22095 [26:11:45<7:01:34, 3.69s/it] {'loss': 0.2881, 'grad_norm': 0.6333133047342612, 'learning_rate': 2.3238122872752606e-06, 'epoch': 0.69} 69%|██████▉ | 15233/22095 [26:11:45<7:01:34, 3.69s/it] 69%|██████▉ | 15234/22095 [26:11:48<6:31:22, 3.42s/it] {'loss': 0.2952, 'grad_norm': 0.6604219335095796, 'learning_rate': 2.323193216480698e-06, 'epoch': 0.69} 69%|██████▉ | 15234/22095 [26:11:48<6:31:22, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15235/22095 [26:11:57<9:54:17, 5.20s/it] {'loss': 0.467, 'grad_norm': 0.2641142503850825, 'learning_rate': 2.3225742032024923e-06, 'epoch': 0.69} 69%|██████▉ | 15235/22095 [26:11:57<9:54:17, 5.20s/it] 69%|██████▉ | 15236/22095 [26:12:01<9:15:48, 4.86s/it] {'loss': 0.2812, 'grad_norm': 0.6703900163351237, 'learning_rate': 2.3219552474539452e-06, 'epoch': 0.69} 69%|██████▉ | 15236/22095 [26:12:01<9:15:48, 4.86s/it] 69%|██████▉ | 15237/22095 [26:12:05<8:36:44, 4.52s/it] {'loss': 0.291, 'grad_norm': 0.6644438053059106, 'learning_rate': 2.3213363492483553e-06, 'epoch': 0.69} 69%|██████▉ | 15237/22095 [26:12:05<8:36:44, 4.52s/it] 
69%|██████▉ | 15238/22095 [26:12:09<8:08:23, 4.27s/it] {'loss': 0.2678, 'grad_norm': 0.6069432377435615, 'learning_rate': 2.3207175085990184e-06, 'epoch': 0.69} 69%|██████▉ | 15238/22095 [26:12:09<8:08:23, 4.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72381 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104312 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15239/22095 [26:12:13<7:53:58, 4.15s/it] {'loss': 0.3135, 'grad_norm': 0.5916808380907982, 'learning_rate': 2.3200987255192354e-06, 'epoch': 0.69} 69%|██████▉ | 15239/22095 [26:12:13<7:53:58, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47387 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15240/22095 [26:12:22<11:02:07, 5.80s/it] {'loss': 0.4828, 'grad_norm': 0.3107890007219854, 'learning_rate': 2.3194800000222984e-06, 'epoch': 0.69} 69%|██████▉ | 15240/22095 [26:12:22<11:02:07, 5.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44756 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68779 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50430 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59011 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63253 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43300 > 40960) for 4 sample(s). Truncating to 15274 with 3 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (47484 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15241/22095 [26:12:25<9:26:53, 4.96s/it] {'loss': 0.28, 'grad_norm': 0.6601154188421439, 'learning_rate': 2.3188613321215046e-06, 'epoch': 0.69} 69%|██████▉ | 15241/22095 [26:12:25<9:26:53, 4.96s/it] 69%|██████▉ | 15242/22095 [26:12:29<8:33:07, 4.49s/it] {'loss': 0.2971, 'grad_norm': 0.6937457942077409, 'learning_rate': 2.3182427218301473e-06, 'epoch': 0.69} 69%|██████▉ | 15242/22095 [26:12:29<8:33:07, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42762 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86736 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48817 > 40960). 
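[Editor's note] The `Token indices sequence length is longer than the specified maximum` warnings come from the Hugging Face tokenizer whenever an encoded conversation exceeds the model's 40960-token limit, and the `Rank 0: ... (43300 > 40960) for 4 sample(s). Truncating to 15274 with 3 samples` line shows the loader then shrinking a packed batch until it fits. A minimal sketch of that kind of check; the function name, greedy policy, and the per-sample lengths (chosen only to reproduce the logged totals) are assumptions, not the repo's actual code:

```python
MAX_SEQ_LEN = 40960  # model maximum from the warnings above

def pack_until_fit(sample_lens, max_len=MAX_SEQ_LEN):
    """Greedily keep samples, in order, while the running total stays <= max_len.

    Illustrative assumption mimicking the logged truncation behaviour;
    the real loader's packing policy may differ.
    """
    kept, total = [], 0
    for n in sample_lens:
        if total + n > max_len:
            break
        kept.append(n)
        total += n
    return kept, total

# Hypothetical lengths: four samples totalling 43300 tokens exceed the limit,
# so only the first three (15274 tokens) fit, matching the logged message.
lens = [5000, 5100, 5174, 28026]
kept, total = pack_until_fit(lens)
print(len(kept), total)  # -> 3 15274
```

Any single sample already longer than 40960 tokens can never fit and would need truncation or exclusion upstream, which is why the per-sample warnings keep repeating.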
Running this sequence through the model will result in indexing errors 69%|██████▉ | 15243/22095 [26:12:32<8:09:01, 4.28s/it] {'loss': 0.3301, 'grad_norm': 0.5748328417560244, 'learning_rate': 2.317624169161515e-06, 'epoch': 0.69} 69%|██████▉ | 15243/22095 [26:12:32<8:09:01, 4.28s/it] 69%|██████▉ | 15244/22095 [26:12:35<7:16:30, 3.82s/it] {'loss': 0.307, 'grad_norm': 0.5830720790065382, 'learning_rate': 2.3170056741289015e-06, 'epoch': 0.69} 69%|██████▉ | 15244/22095 [26:12:35<7:16:30, 3.82s/it] 69%|██████▉ | 15245/22095 [26:12:38<6:53:31, 3.62s/it] {'loss': 0.2767, 'grad_norm': 0.6566537590448583, 'learning_rate': 2.3163872367455976e-06, 'epoch': 0.69} 69%|██████▉ | 15245/22095 [26:12:38<6:53:31, 3.62s/it] 69%|██████▉ | 15246/22095 [26:12:41<6:28:06, 3.40s/it] {'loss': 0.262, 'grad_norm': 0.6135232962091322, 'learning_rate': 2.31576885702489e-06, 'epoch': 0.69} 69%|██████▉ | 15246/22095 [26:12:41<6:28:06, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15247/22095 [26:12:51<10:04:10, 5.29s/it] {'loss': 0.4986, 'grad_norm': 0.3635384431340751, 'learning_rate': 2.3151505349800635e-06, 'epoch': 0.69} 69%|██████▉ | 15247/22095 [26:12:51<10:04:10, 5.29s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45638 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49167 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65412 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50750 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73498 > 40960). Running this sequence through the model will result in indexing errors
69%|██████▉ | 15248/22095 [26:13:00<12:30:27, 6.58s/it] {'loss': 0.4831, 'grad_norm': 0.3197510057064505, 'learning_rate': 2.314532270624406e-06, 'epoch': 0.69} 69%|██████▉ | 15248/22095 [26:13:00<12:30:27, 6.58s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348831 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15501, 'image': 'vrdu_table_final_2/astro-ph.CO/87ca8bc7-ee97-4329-b93a-ee78bb39f78c.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{5}$\\end{tabular}\n```"}]}
69%|██████▉ | 15249/22095 [26:13:09<13:36:37, 7.16s/it] {'loss': 0.4696, 'grad_norm': 0.3014046711126635, 'learning_rate': 2.3139140639712045e-06, 'epoch': 0.69} 69%|██████▉ | 15249/22095 [26:13:09<13:36:37, 7.16s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (99389 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127182 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76894 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104429 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15250/22095 [26:13:12<11:18:04, 5.94s/it] {'loss': 0.3149, 'grad_norm': 0.6362004212788522, 'learning_rate': 2.31329591503374e-06, 'epoch': 0.69} 69%|██████▉ | 15250/22095 [26:13:12<11:18:04, 5.94s/it] 69%|██████▉ | 15251/22095 [26:13:22<13:22:36, 7.04s/it] {'loss': 0.4488, 'grad_norm': 0.25358928563314215, 'learning_rate': 2.312677823825292e-06, 'epoch': 0.69} 69%|██████▉ | 15251/22095 [26:13:22<13:22:36, 7.04s/it] 69%|██████▉ | 15252/22095 [26:13:29<13:44:09, 7.23s/it] {'loss': 0.4831, 'grad_norm': 0.26585388827939566, 'learning_rate': 2.312059790359147e-06, 'epoch': 0.69} 69%|██████▉ | 15252/22095 [26:13:29<13:44:09, 7.23s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15253/22095 [26:13:33<11:47:27, 6.20s/it] {'loss': 0.3269, 'grad_norm': 0.6304065372531822, 'learning_rate': 2.3114418146485793e-06, 'epoch': 0.69} 69%|██████▉ | 15253/22095 [26:13:33<11:47:27, 6.20s/it] 69%|██████▉ | 15254/22095 [26:13:36<10:02:20, 5.28s/it] {'loss': 0.282, 'grad_norm': 0.6465249995769476, 'learning_rate': 2.310823896706872e-06, 'epoch': 0.69} 69%|██████▉ | 15254/22095 [26:13:36<10:02:20, 5.28s/it] 69%|██████▉ | 15255/22095 [26:13:39<8:40:26, 4.57s/it] {'loss': 0.3379, 'grad_norm': 
0.6462067809944219, 'learning_rate': 2.3102060365473e-06, 'epoch': 0.69} 69%|██████▉ | 15255/22095 [26:13:39<8:40:26, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15256/22095 [26:13:49<11:26:42, 6.02s/it] {'loss': 0.4506, 'grad_norm': 0.27030553396271884, 'learning_rate': 2.309588234183137e-06, 'epoch': 0.69} 69%|██████▉ | 15256/22095 [26:13:49<11:26:42, 6.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15257/22095 [26:13:52<9:48:44, 5.17s/it] {'loss': 0.3188, 'grad_norm': 0.6383121590737316, 'learning_rate': 2.3089704896276597e-06, 'epoch': 0.69} 69%|██████▉ | 15257/22095 [26:13:52<9:48:44, 5.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15258/22095 [26:14:00<11:30:38, 6.06s/it] {'loss': 0.4898, 'grad_norm': 0.37044181031209134, 'learning_rate': 2.3083528028941444e-06, 'epoch': 0.69} 69%|██████▉ | 15258/22095 [26:14:00<11:30:38, 6.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15259/22095 [26:14:04<10:27:02, 5.50s/it] {'loss': 0.3093, 'grad_norm': 0.6686546014233246, 'learning_rate': 2.30773517399586e-06, 'epoch': 0.69} 69%|██████▉ | 15259/22095 [26:14:04<10:27:02, 5.50s/it] 69%|██████▉ | 15260/22095 [26:14:07<9:00:53, 4.75s/it] {'loss': 0.3016, 'grad_norm': 0.5594686195477221, 'learning_rate': 2.307117602946076e-06, 'epoch': 0.69} 69%|██████▉ | 15260/22095 [26:14:07<9:00:53, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93481 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106831 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▉ | 15261/22095 [26:14:11<8:18:15, 4.37s/it] {'loss': 0.283, 'grad_norm': 0.6476786146664553, 'learning_rate': 2.306500089758065e-06, 'epoch': 0.69} 69%|██████▉ | 15261/22095 [26:14:11<8:18:15, 4.37s/it] 69%|██████▉ | 15262/22095 [26:14:14<7:53:05, 4.15s/it] {'loss': 0.2755, 'grad_norm': 0.6184613319727662, 'learning_rate': 2.3058826344450973e-06, 'epoch': 0.69} 69%|██████▉ | 15262/22095 [26:14:14<7:53:05, 4.15s/it] 69%|██████▉ | 15263/22095 [26:14:18<7:41:19, 4.05s/it] {'loss': 0.267, 'grad_norm': 0.602580473203544, 'learning_rate': 2.3052652370204344e-06, 'epoch': 0.69} 69%|██████▉ | 15263/22095 [26:14:18<7:41:19, 4.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15264/22095 [26:14:21<6:55:17, 3.65s/it] {'loss': 0.308, 'grad_norm': 0.8129775661719012, 'learning_rate': 2.304647897497345e-06, 'epoch': 0.69} 69%|██████▉ | 15264/22095 [26:14:21<6:55:17, 3.65s/it] 69%|██████▉ | 15265/22095 [26:14:24<6:57:06, 3.66s/it] {'loss': 0.3276, 'grad_norm': 0.5952174745127535, 'learning_rate': 2.3040306158890963e-06, 'epoch': 0.69} 69%|██████▉ | 15265/22095 [26:14:25<6:57:06, 3.66s/it] 69%|██████▉ | 15266/22095 [26:14:28<6:39:55, 3.51s/it] {'loss': 0.303, 'grad_norm': 0.6401445111905103, 'learning_rate': 2.3034133922089496e-06, 'epoch': 0.69} 69%|██████▉ | 15266/22095 [26:14:28<6:39:55, 3.51s/it] 69%|██████▉ | 15267/22095 [26:14:31<6:25:10, 3.38s/it] {'loss': 0.3166, 'grad_norm': 0.6491990782272895, 'learning_rate': 2.3027962264701654e-06, 'epoch': 0.69} 69%|██████▉ | 15267/22095 [26:14:31<6:25:10, 3.38s/it] 69%|██████▉ | 15268/22095 [26:14:35<6:39:12, 3.51s/it] {'loss': 0.3303, 'grad_norm': 0.6406130372218273, 'learning_rate': 2.3021791186860078e-06, 'epoch': 0.69} 69%|██████▉ | 15268/22095 [26:14:35<6:39:12, 3.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 
15269/22095 [26:14:44<10:02:29, 5.30s/it] {'loss': 0.4585, 'grad_norm': 0.27208154366542836, 'learning_rate': 2.3015620688697336e-06, 'epoch': 0.69} 69%|██████▉ | 15269/22095 [26:14:44<10:02:29, 5.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70918 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76871 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48737 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (122360 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15270/22095 [26:14:47<8:52:26, 4.68s/it] {'loss': 0.2978, 'grad_norm': 0.6524243568444674, 'learning_rate': 2.300945077034605e-06, 'epoch': 0.69} 69%|██████▉ | 15270/22095 [26:14:47<8:52:26, 4.68s/it] 69%|██████▉ | 15271/22095 [26:14:51<8:20:47, 4.40s/it] {'loss': 0.32, 'grad_norm': 0.6690897724061226, 'learning_rate': 2.300328143193875e-06, 'epoch': 0.69} 69%|██████▉ | 15271/22095 [26:14:51<8:20:47, 4.40s/it] 69%|██████▉ | 15272/22095 [26:14:55<7:59:23, 4.22s/it] {'loss': 0.2847, 'grad_norm': 0.608535062697799, 'learning_rate': 2.2997112673608035e-06, 'epoch': 0.69} 69%|██████▉ | 15272/22095 [26:14:55<7:59:23, 4.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = 
sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957884 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8719, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 3\nB. 10\nC. 5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 69%|██████▉ | 15273/22095 [26:14:58<7:18:58, 3.86s/it] {'loss': 0.3089, 'grad_norm': 0.5510052701718511, 'learning_rate': 2.299094449548642e-06, 'epoch': 0.69} 69%|██████▉ | 15273/22095 [26:14:58<7:18:58, 3.86s/it] 69%|██████▉ | 15274/22095 [26:15:01<6:46:42, 3.58s/it] {'loss': 0.3045, 'grad_norm': 0.62296168685443, 'learning_rate': 2.298477689770648e-06, 'epoch': 0.69} 69%|██████▉ | 15274/22095 [26:15:01<6:46:42, 3.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15275/22095 [26:15:05<6:59:00, 3.69s/it] {'loss': 0.3012, 'grad_norm': 0.6005738358702823, 'learning_rate': 2.2978609880400706e-06, 'epoch': 0.69} 69%|██████▉ | 15275/22095 [26:15:05<6:59:00, 3.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15276/22095 [26:15:08<6:48:37, 3.60s/it] {'loss': 0.3577, 'grad_norm': 0.6278202168048158, 'learning_rate': 2.29724434437016e-06, 'epoch': 0.69} 69%|██████▉ | 15276/22095 [26:15:08<6:48:37, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64430 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76067 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15277/22095 [26:15:12<6:52:54, 3.63s/it] {'loss': 0.2752, 'grad_norm': 0.6911484776265677, 'learning_rate': 2.296627758774167e-06, 'epoch': 0.69} 69%|██████▉ | 15277/22095 [26:15:12<6:52:54, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15278/22095 [26:15:21<10:15:39, 5.42s/it] {'loss': 0.4458, 'grad_norm': 0.2865799014938107, 'learning_rate': 2.296011231265343e-06, 'epoch': 0.69} 69%|██████▉ | 15278/22095 [26:15:21<10:15:39, 5.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121075 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42892 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43654 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43974 > 40960) for 4 sample(s). Truncating to 40290 with 2 samples. 
69%|██████▉ | 15279/22095 [26:15:25<9:00:04, 4.75s/it] {'loss': 0.3224, 'grad_norm': 0.7099050953246633, 'learning_rate': 2.2953947618569335e-06, 'epoch': 0.69} 69%|██████▉ | 15279/22095 [26:15:25<9:00:04, 4.75s/it] 69%|██████▉ | 15280/22095 [26:15:28<8:08:11, 4.30s/it] {'loss': 0.286, 'grad_norm': 0.5985945647670636, 'learning_rate': 2.2947783505621813e-06, 'epoch': 0.69} 69%|██████▉ | 15280/22095 [26:15:28<8:08:11, 4.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15281/22095 [26:15:31<7:26:47, 3.93s/it] {'loss': 0.3016, 'grad_norm': 0.6077100325384621, 'learning_rate': 2.2941619973943363e-06, 'epoch': 0.69} 69%|██████▉ | 15281/22095 [26:15:31<7:26:47, 3.93s/it] 69%|██████▉ | 15282/22095 [26:15:34<6:57:31, 3.68s/it] {'loss': 0.3239, 'grad_norm': 0.7309766071163295, 'learning_rate': 2.2935457023666375e-06, 'epoch': 0.69} 69%|██████▉ | 15282/22095 [26:15:34<6:57:31, 3.68s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38841.png 2025-08-28 18:13:30.833714 load time: 1046.55 ms Token indices sequence length is longer than the specified maximum sequence length for this model (57545 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15283/22095 [26:15:37<6:41:38, 3.54s/it] {'loss': 0.2771, 'grad_norm': 0.5335062188242945, 'learning_rate': 2.2929294654923313e-06, 'epoch': 0.69} 69%|██████▉ | 15283/22095 [26:15:37<6:41:38, 3.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (123213 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55942 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▉ | 15284/22095 [26:15:41<6:48:30, 3.60s/it] {'loss': 0.298, 'grad_norm': 0.6075102502850656, 'learning_rate': 2.2923132867846564e-06, 'epoch': 0.69} 69%|██████▉ | 15284/22095 [26:15:41<6:48:30, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15285/22095 [26:15:50<10:11:25, 5.39s/it] {'loss': 0.4757, 'grad_norm': 0.4341561501175963, 'learning_rate': 2.2916971662568514e-06, 'epoch': 0.69} 69%|██████▉ | 15285/22095 [26:15:50<10:11:25, 5.39s/it] 69%|██████▉ | 15286/22095 [26:15:54<9:09:45, 4.84s/it] {'loss': 0.3227, 'grad_norm': 0.6455224544042446, 'learning_rate': 2.2910811039221564e-06, 'epoch': 0.69} 69%|██████▉ | 15286/22095 [26:15:54<9:09:45, 4.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [159, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8525733 in VC:s3://internvl-moe-sft-data/. Exception: Image size [159, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 113400, 'image': 'vrdu_texteq/astro-ph.CO/f34ec34f-22f8-475f-ae1a-237546924769.png', 'image_wh': [[159, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'whrere $i\\geq 1$.'}]} VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38792.png 2025-08-28 18:13:54.472570 load time: 1035.9 ms 69%|██████▉ | 15287/22095 [26:16:04<12:06:02, 6.40s/it] {'loss': 0.4699, 'grad_norm': 0.28667936098983365, 'learning_rate': 2.2904650997938105e-06, 'epoch': 0.69} 69%|██████▉ | 15287/22095 [26:16:04<12:06:02, 6.40s/it] 69%|██████▉ | 15288/22095 [26:16:11<12:23:40, 6.56s/it] {'loss': 0.4993, 'grad_norm': 0.28509191413429114, 'learning_rate': 2.2898491538850478e-06, 'epoch': 0.69} 69%|██████▉ | 15288/22095 [26:16:11<12:23:40, 6.56s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15289/22095 [26:16:15<10:49:30, 5.73s/it] {'loss': 0.2808, 'grad_norm': 0.6057133211093666, 'learning_rate': 2.2892332662091017e-06, 'epoch': 0.69} 69%|██████▉ | 15289/22095 [26:16:15<10:49:30, 5.73s/it] 69%|██████▉ | 15290/22095 [26:16:18<9:21:04, 4.95s/it] {'loss': 0.2905, 'grad_norm': 0.5920501028162446, 'learning_rate': 2.288617436779207e-06, 'epoch': 0.69} 69%|██████▉ | 15290/22095 [26:16:18<9:21:04, 4.95s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15291/22095 [26:16:28<12:06:34, 6.41s/it] {'loss': 0.4701, 'grad_norm': 0.26799070501983296, 'learning_rate': 2.2880016656085995e-06, 'epoch': 0.69} 69%|██████▉ | 15291/22095 [26:16:28<12:06:34, 6.41s/it] 69%|██████▉ | 15292/22095 [26:16:35<12:33:02, 6.64s/it] {'loss': 0.4876, 'grad_norm': 0.30276461529003007, 'learning_rate': 2.2873859527105037e-06, 
'epoch': 0.69} 69%|██████▉ | 15292/22095 [26:16:35<12:33:02, 6.64s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15293/22095 [26:16:39<10:50:44, 5.74s/it] {'loss': 0.3028, 'grad_norm': 0.5982779481081313, 'learning_rate': 2.286770298098153e-06, 'epoch': 0.69} 69%|██████▉ | 15293/22095 [26:16:39<10:50:44, 5.74s/it] 69%|██████▉ | 15294/22095 [26:16:42<9:30:06, 5.03s/it] {'loss': 0.3119, 'grad_norm': 0.6289253461990337, 'learning_rate': 2.286154701784776e-06, 'epoch': 0.69} 69%|██████▉ | 15294/22095 [26:16:42<9:30:06, 5.03s/it] 69%|██████▉ | 15295/22095 [26:16:45<8:15:56, 4.38s/it] {'loss': 0.3064, 'grad_norm': 0.6455845115964702, 'learning_rate': 2.2855391637836006e-06, 'epoch': 0.69} 69%|██████▉ | 15295/22095 [26:16:45<8:15:56, 4.38s/it] 69%|██████▉ | 15296/22095 [26:16:48<7:39:11, 4.05s/it] {'loss': 0.2721, 'grad_norm': 0.6567945353620316, 'learning_rate': 2.2849236841078496e-06, 'epoch': 0.69} 69%|██████▉ | 15296/22095 [26:16:48<7:39:11, 4.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15297/22095 [26:16:51<6:59:51, 3.71s/it] {'loss': 0.2861, 'grad_norm': 0.5886357694478286, 'learning_rate': 2.2843082627707517e-06, 'epoch': 0.69} 69%|██████▉ | 15297/22095 [26:16:51<6:59:51, 3.71s/it] 69%|██████▉ | 15298/22095 [26:16:54<6:31:04, 3.45s/it] {'loss': 0.3164, 'grad_norm': 0.7481216016016371, 'learning_rate': 2.2836928997855274e-06, 'epoch': 0.69} 69%|██████▉ | 15298/22095 [26:16:54<6:31:04, 3.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 
1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047167 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 4cm\nB. 6cm\nC. 12cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 69%|██████▉ | 15299/22095 [26:16:57<6:29:58, 3.44s/it] {'loss': 0.3323, 'grad_norm': 0.6880285776070417, 'learning_rate': 2.2830775951654018e-06, 'epoch': 0.69} 69%|██████▉ | 15299/22095 [26:16:57<6:29:58, 3.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 69%|██████▉ | 15300/22095 [26:17:01<6:41:51, 3.55s/it] {'loss': 0.3055, 'grad_norm': 0.5801076690074016, 'learning_rate': 2.282462348923592e-06, 'epoch': 0.69} 69%|██████▉ | 15300/22095 [26:17:01<6:41:51, 3.55s/it] 69%|██████▉ | 15301/22095 [26:17:05<7:07:06, 3.77s/it] {'loss': 0.283, 'grad_norm': 0.5879289946383263, 'learning_rate': 2.281847161073322e-06, 'epoch': 0.69} 69%|██████▉ | 15301/22095 [26:17:05<7:07:06, 3.77s/it] 69%|██████▉ | 15302/22095 [26:17:09<6:59:01, 3.70s/it] {'loss': 0.3125, 'grad_norm': 0.6179725409027124, 'learning_rate': 2.2812320316278065e-06, 'epoch': 0.69} 69%|██████▉ | 15302/22095 [26:17:09<6:59:01, 3.70s/it] 69%|██████▉ | 15303/22095 [26:17:12<6:54:17, 3.66s/it] {'loss': 0.279, 'grad_norm': 0.6156506331013871, 'learning_rate': 2.2806169606002663e-06, 'epoch': 0.69} 69%|██████▉ | 15303/22095 [26:17:12<6:54:17, 3.66s/it] 69%|██████▉ | 15304/22095 [26:17:16<6:43:30, 3.57s/it] {'loss': 0.2796, 'grad_norm': 1.2120338640195765, 'learning_rate': 2.280001948003916e-06, 'epoch': 0.69} 69%|██████▉ 
| 15304/22095 [26:17:16<6:43:30, 3.57s/it] 69%|██████▉ | 15305/22095 [26:17:20<7:07:45, 3.78s/it] {'loss': 0.3045, 'grad_norm': 0.6522398121819206, 'learning_rate': 2.279386993851968e-06, 'epoch': 0.69} 69%|██████▉ | 15305/22095 [26:17:20<7:07:45, 3.78s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [350, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8467782 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 56012, 'image': 'vrdu_texteq/astro-ph.CO/1ebf347d-99cc-49db-8911-6d54fd660872.png', 'image_wh': [[350, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'where the wavefuctional $\\Psi$ is'}]} 69%|██████▉ | 15306/22095 [26:17:23<6:45:29, 3.58s/it] {'loss': 0.3168, 'grad_norm': 0.6099418151117931, 'learning_rate': 2.278772098157638e-06, 'epoch': 0.69} 69%|██████▉ | 15306/22095 [26:17:23<6:45:29, 3.58s/it] 69%|██████▉ | 15307/22095 [26:17:26<6:21:16, 3.37s/it] {'loss': 0.3295, 'grad_norm': 0.6312106783246817, 'learning_rate': 2.2781572609341397e-06, 'epoch': 0.69} 69%|██████▉ | 15307/22095 [26:17:26<6:21:16, 3.37s/it] 69%|██████▉ | 15308/22095 [26:17:29<6:16:24, 3.33s/it] {'loss': 0.3372, 'grad_norm': 0.6366029474353617, 'learning_rate': 2.2775424821946824e-06, 'epoch': 0.69} 69%|██████▉ | 15308/22095 [26:17:29<6:16:24, 3.33s/it] 69%|██████▉ | 15309/22095 [26:17:33<6:38:02, 3.52s/it] {'loss': 0.2665, 'grad_norm': 0.6116300351638355, 'learning_rate': 2.2769277619524737e-06, 'epoch': 0.69} 69%|██████▉ | 15309/22095 
[26:17:33<6:38:02, 3.52s/it] 69%|██████▉ | 15310/22095 [26:17:37<6:47:46, 3.61s/it] {'loss': 0.3119, 'grad_norm': 0.5517062110299067, 'learning_rate': 2.276313100220726e-06, 'epoch': 0.69} 69%|██████▉ | 15310/22095 [26:17:37<6:47:46, 3.61s/it] 69%|██████▉ | 15311/22095 [26:17:41<6:41:32, 3.55s/it] {'loss': 0.3067, 'grad_norm': 0.6117086962300063, 'learning_rate': 2.275698497012643e-06, 'epoch': 0.69} 69%|██████▉ | 15311/22095 [26:17:41<6:41:32, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66578 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50595 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93873 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15312/22095 [26:17:43<6:18:58, 3.35s/it] {'loss': 0.2745, 'grad_norm': 0.6382350178222626, 'learning_rate': 2.275083952341434e-06, 'epoch': 0.69} 69%|██████▉ | 15312/22095 [26:17:43<6:18:58, 3.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8955478 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 6313, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2.5\nB. 4.5\nC. 7\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 69%|██████▉ | 15313/22095 [26:17:46<6:08:06, 3.26s/it] {'loss': 0.3727, 'grad_norm': 0.6304432415131873, 'learning_rate': 2.2744694662203022e-06, 'epoch': 0.69} 69%|██████▉ | 15313/22095 [26:17:46<6:08:06, 3.26s/it] 69%|██████▉ | 15314/22095 [26:17:50<6:13:11, 3.30s/it] {'loss': 0.2959, 'grad_norm': 0.7568616098636195, 'learning_rate': 2.273855038662448e-06, 'epoch': 0.69} 69%|██████▉ | 15314/22095 [26:17:50<6:13:11, 3.30s/it] 69%|██████▉ | 15315/22095 [26:17:53<6:19:20, 3.36s/it] {'loss': 0.3061, 'grad_norm': 0.5968925730201458, 'learning_rate': 2.2732406696810773e-06, 'epoch': 0.69} 69%|██████▉ | 15315/22095 [26:17:53<6:19:20, 3.36s/it] 69%|██████▉ | 15316/22095 [26:17:56<6:02:10, 3.21s/it] {'loss': 0.2965, 'grad_norm': 0.6491887672754614, 'learning_rate': 2.2726263592893914e-06, 'epoch': 0.69} 69%|██████▉ | 15316/22095 [26:17:56<6:02:10, 3.21s/it] 69%|██████▉ | 15317/22095 [26:17:59<6:05:36, 3.24s/it] {'loss': 0.3298, 'grad_norm': 0.5861213051717736, 'learning_rate': 2.2720121075005884e-06, 'epoch': 0.69} 69%|██████▉ | 15317/22095 [26:17:59<6:05:36, 3.24s/it] 69%|██████▉ | 15318/22095 [26:18:03<6:05:25, 3.24s/it] {'loss': 0.281, 'grad_norm': 0.639189288756921, 'learning_rate': 2.271397914327865e-06, 'epoch': 0.69} 69%|██████▉ | 15318/22095 [26:18:03<6:05:25, 3.24s/it] 69%|██████▉ | 15319/22095 [26:18:07<6:25:34, 3.41s/it] {'loss': 0.2718, 'grad_norm': 0.5877308010010714, 'learning_rate': 2.2707837797844208e-06, 'epoch': 0.69} 69%|██████▉ | 15319/22095 [26:18:07<6:25:34, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15320/22095 [26:18:16<9:58:03, 5.30s/it] {'loss': 0.4743, 'grad_norm': 0.32706793670548634, 
'learning_rate': 2.2701697038834543e-06, 'epoch': 0.69} 69%|██████▉ | 15320/22095 [26:18:16<9:58:03, 5.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77512 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (126010 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (160015 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (132189 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42007 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15321/22095 [26:18:20<8:50:40, 4.70s/it] {'loss': 0.3242, 'grad_norm': 0.5812374384104216, 'learning_rate': 2.269555686638153e-06, 'epoch': 0.69} 69%|██████▉ | 15321/22095 [26:18:20<8:50:40, 4.70s/it] 69%|██████▉ | 15322/22095 [26:18:24<8:33:13, 4.55s/it] {'loss': 0.3378, 'grad_norm': 0.8841772008999619, 'learning_rate': 2.268941728061714e-06, 'epoch': 0.69} 69%|██████▉ | 15322/22095 [26:18:24<8:33:13, 4.55s/it] 69%|██████▉ | 15323/22095 [26:18:27<7:41:42, 4.09s/it] {'loss': 0.2963, 'grad_norm': 0.6341474894444818, 'learning_rate': 2.2683278281673315e-06, 'epoch': 0.69} 69%|██████▉ | 15323/22095 [26:18:27<7:41:42, 4.09s/it] 69%|██████▉ | 15324/22095 [26:18:31<7:37:55, 4.06s/it] {'loss': 0.3238, 'grad_norm': 0.5919393409921639, 'learning_rate': 2.2677139869681943e-06, 'epoch': 0.69} 69%|██████▉ | 15324/22095 [26:18:31<7:37:55, 4.06s/it] 69%|██████▉ | 15325/22095 [26:18:34<7:20:04, 3.90s/it] {'loss': 0.2791, 'grad_norm': 
0.5823503974483093, 'learning_rate': 2.2671002044774896e-06, 'epoch': 0.69} 69%|██████▉ | 15325/22095 [26:18:34<7:20:04, 3.90s/it] 69%|██████▉ | 15326/22095 [26:18:38<6:59:38, 3.72s/it] {'loss': 0.3359, 'grad_norm': 0.6825871993082148, 'learning_rate': 2.266486480708411e-06, 'epoch': 0.69} 69%|██████▉ | 15326/22095 [26:18:38<6:59:38, 3.72s/it] 69%|██████▉ | 15327/22095 [26:18:41<6:37:29, 3.52s/it] {'loss': 0.3062, 'grad_norm': 0.6033434672171372, 'learning_rate': 2.26587281567414e-06, 'epoch': 0.69} 69%|██████▉ | 15327/22095 [26:18:41<6:37:29, 3.52s/it] 69%|██████▉ | 15328/22095 [26:18:44<6:19:38, 3.37s/it] {'loss': 0.2796, 'grad_norm': 0.5871056512885556, 'learning_rate': 2.265259209387867e-06, 'epoch': 0.69} 69%|██████▉ | 15328/22095 [26:18:44<6:19:38, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 69%|██████▉ | 15329/22095 [26:18:53<9:44:52, 5.19s/it] {'loss': 0.4542, 'grad_norm': 0.2792552271003976, 'learning_rate': 2.2646456618627723e-06, 'epoch': 0.69} 69%|██████▉ | 15329/22095 [26:18:53<9:44:52, 5.19s/it] 69%|██████▉ | 15330/22095 [26:18:56<8:41:44, 4.63s/it] {'loss': 0.3141, 'grad_norm': 0.6157049851124522, 'learning_rate': 2.2640321731120434e-06, 'epoch': 0.69} 69%|██████▉ | 15330/22095 [26:18:56<8:41:44, 4.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41601 > 40960). 
Running this sequence through the model will result in indexing errors 69%|██████▉ | 15331/22095 [26:19:00<7:51:49, 4.19s/it] {'loss': 0.2843, 'grad_norm': 0.6200462231323365, 'learning_rate': 2.2634187431488585e-06, 'epoch': 0.69} 69%|██████▉ | 15331/22095 [26:19:00<7:51:49, 4.19s/it] 69%|██████▉ | 15332/22095 [26:19:03<7:19:25, 3.90s/it] {'loss': 0.3096, 'grad_norm': 0.6690563909113196, 'learning_rate': 2.262805371986402e-06, 'epoch': 0.69} 69%|██████▉ | 15332/22095 [26:19:03<7:19:25, 3.90s/it] 69%|██████▉ | 15333/22095 [26:19:07<7:14:57, 3.86s/it] {'loss': 0.3136, 'grad_norm': 0.5942516202410125, 'learning_rate': 2.2621920596378503e-06, 'epoch': 0.69} 69%|██████▉ | 15333/22095 [26:19:07<7:14:57, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80268 > 40960). Running this sequence through the model will result in indexing errors 69%|██████▉ | 15334/22095 [26:19:09<6:36:52, 3.52s/it] {'loss': 0.2676, 'grad_norm': 0.6065599068867679, 'learning_rate': 2.2615788061163824e-06, 'epoch': 0.69} 69%|██████▉ | 15334/22095 [26:19:09<6:36:52, 3.52s/it] 69%|██████▉ | 15335/22095 [26:19:13<6:51:00, 3.65s/it] {'loss': 0.2857, 'grad_norm': 0.6547981697132818, 'learning_rate': 2.2609656114351745e-06, 'epoch': 0.69} 69%|██████▉ | 15335/22095 [26:19:13<6:51:00, 3.65s/it] 69%|██████▉ | 15336/22095 [26:19:17<6:41:06, 3.56s/it] {'loss': 0.36, 'grad_norm': 0.6390839950958643, 'learning_rate': 2.2603524756074057e-06, 'epoch': 0.69} 69%|██████▉ | 15336/22095 [26:19:17<6:41:06, 3.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [309, 23, 
100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8449318 in VC:s3://internvl-moe-sft-data/. Exception: Image size [309, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 98188, 'image': 'vrdu_texteq/astro-ph.CO/565f752a-001e-42a0-b7e1-3c10706d464c.png', 'image_wh': [[309, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'For shear the $G$ matrix is'}]}
69%|██████▉ | 15337/22095 [26:19:20<6:22:56, 3.40s/it] {'loss': 0.3148, 'grad_norm': 0.6215447818598099, 'learning_rate': 2.2597393986462477e-06, 'epoch': 0.69}
69%|██████▉ | 15338/22095 [26:19:23<6:07:20, 3.26s/it] {'loss': 0.2996, 'grad_norm': 0.6120277468101288, 'learning_rate': 2.2591263805648724e-06, 'epoch': 0.69}
Invalidate trace cache @ step 2: expected module 1, but got module 364
69%|██████▉ | 15339/22095 [26:19:31<8:54:04, 4.74s/it] {'loss': 0.4759, 'grad_norm': 0.2806594600194946, 'learning_rate': 2.258513421376455e-06, 'epoch': 0.69}
69%|██████▉ | 15340/22095 [26:19:34<8:03:11, 4.29s/it] {'loss': 0.3083, 'grad_norm': 0.6209858582212454, 'learning_rate': 2.2579005210941622e-06, 'epoch': 0.69}
69%|██████▉ | 15341/22095 [26:19:37<7:23:57, 3.94s/it] {'loss': 0.3039, 'grad_norm': 0.6114612468150938, 'learning_rate': 2.2572876797311676e-06, 'epoch': 0.69}
Invalidate trace cache @ step 2: expected module 1, but got module 364
69%|██████▉ | 15342/22095 [26:19:46<10:22:19, 5.53s/it] {'loss': 0.4914, 'grad_norm': 0.29673365820543085, 'learning_rate': 2.256674897300635e-06, 'epoch': 0.69}
69%|██████▉ | 15343/22095 [26:19:50<9:17:37, 4.96s/it] {'loss': 0.2851, 'grad_norm': 0.5557606997037232, 'learning_rate': 2.2560621738157357e-06, 'epoch': 0.69}
69%|██████▉ | 15344/22095 [26:19:53<8:20:42, 4.45s/it] {'loss': 0.3078, 'grad_norm': 0.5788703839311166, 'learning_rate': 2.2554495092896306e-06, 'epoch': 0.69}
69%|██████▉ | 15345/22095 [26:19:56<7:24:04, 3.95s/it] {'loss': 0.3457, 'grad_norm': 0.6749206715587992, 'learning_rate': 2.254836903735488e-06, 'epoch': 0.69}
69%|██████▉ | 15346/22095 [26:19:59<7:04:29, 3.77s/it] {'loss': 0.3052, 'grad_norm': 0.6495157356241859, 'learning_rate': 2.25422435716647e-06, 'epoch': 0.69}
69%|██████▉ | 15347/22095 [26:20:04<7:30:55, 4.01s/it] {'loss': 0.3935, 'grad_norm': 0.6655681762682285, 'learning_rate': 2.2536118695957353e-06, 'epoch': 0.69}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954302 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5137, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "<image>\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 2\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [631, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8430592 in VC:s3://internvl-moe-sft-data/. Exception: Image size [631, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33967, 'image': 'vrdu_texteq/astro-ph.CO/6e20c3a7-10b5-4f54-9dd0-6aefc73616c8.png', 'image_wh': [[631, 23]], 'conversations': [{'from': 'human', 'value': '<image>\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where the anticurvature scalar $A$ is the trace of $A^{\\mu\\nu}$'}]}
69%|██████▉ | 15348/22095 [26:20:07<7:03:28, 3.77s/it] {'loss': 0.263, 'grad_norm': 0.6001421976661787, 'learning_rate': 2.252999441036447e-06, 'epoch': 0.69}
Token indices sequence length is longer than the specified maximum sequence length for this model (41835 > 40960).
Running this sequence through the model will result in indexing errors
69%|██████▉ | 15349/22095 [26:20:10<6:36:12, 3.52s/it] {'loss': 0.3224, 'grad_norm': 0.616823676348002, 'learning_rate': 2.252387071501767e-06, 'epoch': 0.69}
69%|██████▉ | 15350/22095 [26:20:13<6:14:29, 3.33s/it] {'loss': 0.3126, 'grad_norm': 0.6300197767755465, 'learning_rate': 2.2517747610048467e-06, 'epoch': 0.69}
69%|██████▉ | 15351/22095 [26:20:16<6:02:44, 3.23s/it] {'loss': 0.2893, 'grad_norm': 0.6020949362258067, 'learning_rate': 2.2511625095588465e-06, 'epoch': 0.69}
Token indices sequence length is longer than the specified maximum sequence length for this model (42641 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100246 > 40960). Running this sequence through the model will result in indexing errors
69%|██████▉ | 15352/22095 [26:20:20<6:16:55, 3.35s/it] {'loss': 0.2813, 'grad_norm': 0.8110079469776496, 'learning_rate': 2.2505503171769233e-06, 'epoch': 0.69}
69%|██████▉ | 15353/22095 [26:20:23<6:16:30, 3.35s/it] {'loss': 0.3671, 'grad_norm': 0.6305017685210226, 'learning_rate': 2.2499381838722296e-06, 'epoch': 0.69}
69%|██████▉ | 15354/22095 [26:20:27<6:49:10, 3.64s/it] {'loss': 0.3377, 'grad_norm': 0.6055438844247768, 'learning_rate': 2.2493261096579163e-06, 'epoch': 0.69}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
69%|██████▉ | 15355/22095 [26:20:31<7:05:53, 3.79s/it] {'loss': 0.3407, 'grad_norm': 0.8846887760337592, 'learning_rate': 2.2487140945471382e-06, 'epoch': 0.69}
69%|██████▉ | 15356/22095 [26:20:35<6:53:16, 3.68s/it] {'loss': 0.3077, 'grad_norm': 0.6130218136946555, 'learning_rate': 2.2481021385530427e-06, 'epoch': 0.69}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15357/22095 [26:20:43<9:19:18, 4.98s/it] {'loss': 0.4541, 'grad_norm': 0.30618512419935756, 'learning_rate': 2.2474902416887824e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0,
100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8306090 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1oZ.Il9YH8KJjSspdXXcRgVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '<image>\nWhat text is visible in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n已升级\n2019年中国民航大学\n考\n料\n研\n资\n833算法与数据结构\n考研必备\n浓缩精华\n直击考点\n重点提炼\n考研精品资料\n[重点难点+高频考点]\n[直击考点+考研必备]\n关注商家微信号获取更多复试资讯\n内容详细完整\n考研必备宝典\n2019最新版'}]}
70%|██████▉ | 15358/22095 [26:20:46<8:27:21, 4.52s/it] {'loss': 0.331, 'grad_norm': 0.6349363257772199, 'learning_rate': 2.246878403967501e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7805923 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '27263', 'image': '51703.jpg', 'image_wh': [[92, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n Here is the caption I wrote for the image.\nThe provided graphic displays a statistical portrayal of an assessment paired with its deviation. The representation applied is \\(720 \\pm 72\\).\n\n### Detailed Explanation\n\n1. 
**Statistical Notation**:\n - **Median Value**: The median value or the midpoint indicated in the graphic stands at 720.\n - **Deviation**: The deviation in the median value is delineated as \\(\\pm72\\).\n\nThis notation is customarily utilized in research and technological settings to articulate an observed statistic together with its deviation. The median value (720) exemplifies the probable value or the midpoint of a sequence of observations. The plus-or-minus sign (\\(\\pm\\)) followed by 72 conveys that the authentic value is likely to be found within 72 units above or below the median value.\n\n### Analysis:\n\n1. **Range Presentation**: The notation \\(720 \\pm 72\\) can be construed as the span [648, 792], inferring that the authentic value is anticipated to be between 648 and 792.\n \\n2. **Deviation Examination**: \n - **Absolute Deviation**: The absolute deviation here is 72, which is a gauge of how significantly the observed values sway from the median value.\n - **Proportional Deviation**: To ascertain the proportional deviation, you divide the absolute deviation by the median value: \\(\\frac{72}{720} \\approx 0.1\\) or roughly 10%. This reveals that the deviation is about 10% of the median value.\n\n### Situational Context:\n\n1. **Application in Studies and Evaluations**:\n - This notation is typically employed in ecological studies, meteorology, and other scientific fields to compactly impart how exact an observation is.\n - For example, when gauging environmental parameters, climate trends, biodiversity, or other metrics, researchers and technologists disclose their findings using this layout to articulate both their statistic and its degree of accuracy.\n \n2. 
**Importance**:\n - **Statistical Accuracy**: The inclusion of a deviation value imparts crucial information about the accuracy and consistency of the observation.\n - **Margin of Error**: It offers a transparent view of the margin of error within which the authentic value of the observation is situated.\n\n### Conclusion:\n\nInterpreting the figure 720 with a deviation of \\(\\pm72\\), one deduces that while 720 is the most plausible observed value, the genuine value might reasonably reside somewhere within the range of 648 to 792. This technique is pivotal in scientific dissemination as it transparently conveys the constraints and accuracy of observations, thereby facilitating more judicious interpretation and judgement-making. \nPlease assess if my caption is accurate, and if it’s not, please point out all the errors.'}, {'from': 'gpt', 'value': 'Your response shows a high level of detail and understanding in interpreting the graphical representation and translating it into a comprehensive description. Here are some subtle points that could be refined:\n\n1. **Terminology Clarity**: In the first instance, you used "median value" when referring to 720, and in the second instance, you used "central value" for 405. It would be more consistent and accurate to use "central value" in both contexts unless the graphic explicitly indicates that 720 is the median of a dataset. The terms "central value" and "mean" are more commonly used in this context.\n\n2. **Precision in Percentage Calculation**: When calculating the proportional deviation (first context) and relative uncertainty (second context), it would be more precise to round the final percentage to the nearest tenth or hundredth, depending on the context. For example, "10.0%" and "9.9%" provide a clearer representation of the percentage.\n\n3. **Consistency in Formatting**: Ensure consistency in the formatting of mathematical expressions. 
For example, in the first instance, you used "\\\\\\(" and "\\\\\\)" to denote mathematical expressions, while in the second, you used "\\\\(" and "\\\\)". Choose one style and use it consistently throughout.\n\n4. **Avoidance of Redundancy**: In the conclusion of the first description, you mentioned that 720 is the "most plausible observed value," which is redundant since earlier you stated that it is the "median value." Consider revising to avoid repetition and improve clarity.\n\n5. **Incorporation of Graphic Context**: While your descriptions are thorough, they lack context from the graphic itself. Without revealing the correct description, ensure that any specific details from the graphic that inform the interpretation are included, such as whether the graphic provides insight into the data\'s distribution or sample size.\n\n6. **Consistency in Range Representation**: In both instances, you accurately represented the range as "[lower limit, upper limit]." However, ensure that you consistently use this format throughout both descriptions for clarity and uniformity.\n\n7. **Use of Technical Terms**: You\'ve used terms like "median value," "deviation," "central value," and "uncertainty" appropriately. However, make sure that these terms are explained or used in a context that is clear to the intended audience, especially when moving from the technical field to more general or interdisciplinary contexts.\n\nOverall, your descriptions are detailed and demonstrate a strong grasp of interpreting statistical representations. 
Minor adjustments in terminology, formatting, and inclusion of specific graphic details would enhance the accuracy and clarity of your explanations.'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [478, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8413842 in VC:s3://internvl-moe-sft-data/. Exception: Image size [478, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36499, 'image': 'vrdu_texteq/astro-ph.CO/2df80f04-373e-4df3-85ae-41b0520168da.png', 'image_wh': [[478, 25]], 'conversations': [{'from': 'human', 'value': '<image>\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $\\sigma_T$ is the Thomson cross section.'}]}
70%|██████▉ | 15359/22095 [26:20:50<7:46:21, 4.15s/it] {'loss': 0.3305, 'grad_norm': 0.5705475206325165, 'learning_rate': 2.2462666254023495e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15360/22095 [26:20:59<10:47:10, 5.77s/it] {'loss': 0.4916, 'grad_norm': 0.2784925673579084, 'learning_rate': 2.2456549060064684e-06, 'epoch': 0.7}
70%|██████▉ | 15361/22095 [26:21:02<9:25:15, 5.04s/it] {'loss': 0.3054, 'grad_norm': 0.6430028863711853, 'learning_rate': 2.245043245793006e-06, 'epoch': 0.7}
70%|██████▉ | 15362/22095 [26:21:06<8:20:02, 4.46s/it] {'loss': 0.3257, 'grad_norm': 0.5952655056608498, 'learning_rate': 2.2444316447751034e-06, 'epoch': 0.7}
70%|██████▉ | 15363/22095 [26:21:08<7:24:35, 3.96s/it] {'loss': 0.3128, 'grad_norm': 0.6920910924343269, 'learning_rate': 2.2438201029658995e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (52978 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15364/22095 [26:21:11<6:47:05, 3.63s/it] {'loss': 0.3147, 'grad_norm': 0.5834875442704955, 'learning_rate': 2.243208620378537e-06, 'epoch': 0.7}
70%|██████▉ | 15365/22095 [26:21:14<6:29:37, 3.47s/it] {'loss': 0.2993, 'grad_norm': 0.6003270635637863, 'learning_rate': 2.2425971970261558e-06, 'epoch': 0.7}
70%|██████▉ | 15366/22095 [26:21:18<6:29:21, 3.47s/it] {'loss': 0.2991, 'grad_norm': 0.6027439237743176, 'learning_rate': 2.2419858329218926e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (41075 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62881 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90308 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15367/22095 [26:21:21<6:14:57, 3.34s/it] {'loss': 0.3246, 'grad_norm': 0.6248205418356746, 'learning_rate': 2.2413745280788806e-06, 'epoch': 0.7}
70%|██████▉ | 15368/22095 [26:21:24<6:11:34, 3.31s/it] {'loss': 0.2491, 'grad_norm': 0.6615877596406995, 'learning_rate': 2.2407632825102605e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15369/22095 [26:21:28<6:29:10, 3.47s/it] {'loss': 0.2661, 'grad_norm': 0.63773715869281, 'learning_rate': 2.24015209622916e-06, 'epoch': 0.7}
70%|██████▉ | 15370/22095 [26:21:31<6:28:10, 3.46s/it] {'loss': 0.2991, 'grad_norm': 0.5830994928670054, 'learning_rate': 2.2395409692487174e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15371/22095 [26:21:40<9:27:16, 5.06s/it] {'loss': 0.4761, 'grad_norm': 0.2917371253576962, 'learning_rate': 2.2389299015820592e-06, 'epoch': 0.7}
70%|██████▉ | 15372/22095 [26:21:44<8:41:32, 4.65s/it] {'loss': 0.2921, 'grad_norm': 0.64376924689846, 'learning_rate': 2.2383188932423192e-06, 'epoch': 0.7}
70%|██████▉ | 15373/22095 [26:21:47<7:51:13, 4.21s/it] {'loss': 0.3158, 'grad_norm': 0.5848437032666506, 'learning_rate': 2.237707944242623e-06, 'epoch': 0.7}
70%|██████▉ | 15374/22095 [26:21:50<7:11:01, 3.85s/it] {'loss': 0.3715, 'grad_norm': 0.5974157437942595, 'learning_rate': 2.2370970545961005e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8513588 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39982, 'image': 'vrdu_texteq/astro-ph.CO/6879212f-433a-4b1a-a518-4c607d9c8968.png', 'image_wh': [[350, 23]], 'conversations': [{'from': 'human', 'value': '<image>\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'Redshift slice $0.55 < z < 0.7$:'}]}
70%|██████▉ | 15375/22095 [26:21:57<8:57:14, 4.80s/it] {'loss': 0.469, 'grad_norm': 0.29087052445388767, 'learning_rate': 2.236486224315877e-06, 'epoch': 0.7}
70%|██████▉ | 15376/22095 [26:22:01<8:33:56, 4.59s/it] {'loss': 0.2879, 'grad_norm': 0.605095854819963, 'learning_rate': 2.2358754534150752e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (80649 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42692 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15377/22095 [26:22:04<7:47:13, 4.17s/it] {'loss': 0.3192, 'grad_norm': 0.6365435641897976, 'learning_rate': 2.2352647419068207e-06, 'epoch': 0.7}
70%|██████▉ | 15378/22095 [26:22:08<7:35:39, 4.07s/it] {'loss': 0.2828, 'grad_norm': 0.6548937754424179, 'learning_rate': 2.2346540898042372e-06, 'epoch': 0.7}
70%|██████▉ | 15379/22095 [26:22:13<7:49:38, 4.20s/it] {'loss': 0.3149, 'grad_norm': 0.5831388735421803, 'learning_rate': 2.2340434971204445e-06, 'epoch': 0.7}
70%|██████▉ | 15380/22095 [26:22:17<7:42:17, 4.13s/it] {'loss': 0.3101, 'grad_norm': 0.6870834276561678, 'learning_rate': 2.2334329638685598e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (47521 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15381/22095 [26:22:20<7:01:02, 3.76s/it] {'loss': 0.3087, 'grad_norm': 0.5934653026487168, 'learning_rate': 2.2328224900617064e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (49904 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110953 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15382/22095 [26:22:23<7:05:27, 3.80s/it] {'loss': 0.2841, 'grad_norm': 0.6883932566836034, 'learning_rate': 2.2322120757129983e-06, 'epoch': 0.7}
70%|██████▉ | 15383/22095 [26:22:27<6:41:56, 3.59s/it] {'loss': 0.3141, 'grad_norm': 0.7244763280980958, 'learning_rate': 2.2316017208355504e-06, 'epoch': 0.7}
70%|██████▉ | 15384/22095 [26:22:30<6:20:44, 3.40s/it] {'loss': 0.2932, 'grad_norm': 0.6129160752040061, 'learning_rate': 2.2309914254424807e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (73737 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106425 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15385/22095 [26:22:37<8:34:00, 4.60s/it] {'loss': 0.4697, 'grad_norm': 0.26808067124315665, 'learning_rate': 2.2303811895468996e-06, 'epoch': 0.7}
70%|██████▉ | 15386/22095 [26:22:41<8:22:49, 4.50s/it] {'loss': 0.3376, 'grad_norm': 0.6290085659815462, 'learning_rate': 2.2297710131619214e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (60858 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79177 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47338 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89027 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15387/22095 [26:22:44<7:41:50, 4.13s/it] {'loss': 0.3676, 'grad_norm': 0.6568816515106904, 'learning_rate': 2.229160896300655e-06, 'epoch': 0.7}
70%|██████▉ | 15388/22095 [26:22:49<7:43:15, 4.14s/it] {'loss': 0.2938, 'grad_norm': 0.610442344706643, 'learning_rate': 2.228550838976213e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41134 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42777 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15389/22095 [26:22:56<9:39:30, 5.19s/it] {'loss': 0.4745, 'grad_norm': 0.2898365808540318, 'learning_rate': 2.227940841201699e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15390/22095 [26:23:05<11:39:51, 6.26s/it] {'loss': 0.4687, 'grad_norm': 0.2822728999278438, 'learning_rate': 2.227330902990225e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|██████▉ | 15391/22095 [26:23:08<10:07:21, 5.44s/it] {'loss': 0.2607, 'grad_norm': 0.5926488550372246, 'learning_rate': 2.2267210243548943e-06, 'epoch': 0.7}
70%|██████▉ | 15392/22095 [26:23:12<8:56:51, 4.81s/it] {'loss': 0.3012, 'grad_norm': 0.6007425928292506, 'learning_rate': 2.226111205308809e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15393/22095 [26:23:21<11:35:24, 6.23s/it] {'loss': 0.4933, 'grad_norm': 0.2771887358013033, 'learning_rate': 2.225501445865075e-06, 'epoch': 0.7}
70%|██████▉ | 15394/22095 [26:23:25<9:58:14, 5.36s/it] {'loss': 0.2889, 'grad_norm': 0.6469024784582776, 'learning_rate': 2.224891746036795e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15395/22095 [26:23:35<12:31:13, 6.73s/it] {'loss': 0.493, 'grad_norm': 0.2972666825183385, 'learning_rate': 2.224282105837069e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15396/22095 [26:23:39<10:57:04, 5.89s/it] {'loss': 0.2966, 'grad_norm': 0.6249787087006309, 'learning_rate': 2.2236725252789933e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15397/22095 [26:23:48<12:57:37, 6.97s/it] {'loss': 0.468, 'grad_norm': 0.28522173700529135, 'learning_rate': 2.22306300437567e-06, 'epoch': 0.7}
70%|██████▉ | 15398/22095 [26:23:52<11:15:08, 6.05s/it] {'loss': 0.3143, 'grad_norm': 0.6156039440420061, 'learning_rate': 2.222453543140192e-06, 'epoch': 0.7}
70%|██████▉ | 15399/22095 [26:23:55<9:30:13, 5.11s/it] {'loss': 0.3107, 'grad_norm': 0.5954022875600979, 'learning_rate': 2.221844141585659e-06, 'epoch': 0.7}
70%|██████▉ | 15400/22095 [26:23:58<8:31:45, 4.59s/it] {'loss': 0.2811, 'grad_norm': 0.6166093898007311, 'learning_rate': 2.221234799725161e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047739 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': '<image>\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BD=4cm,∴AD=AB-BD=10-4=6(cm),∵点C是AD中点,∴CD=\\frac{1}{2}AD=3cm,则BC=CD+BD=7cm,'}]}
70%|██████▉ | 15401/22095 [26:24:03<8:23:21, 4.51s/it] {'loss': 0.3344, 'grad_norm': 0.6630235280814255, 'learning_rate': 2.220625517571795e-06, 'epoch': 0.7}
70%|██████▉ | 15402/22095 [26:24:05<7:29:13, 4.03s/it] {'loss': 0.3113, 'grad_norm': 0.6450303140316446, 'learning_rate': 2.2200162951386477e-06, 'epoch': 0.7}
70%|██████▉ | 15403/22095 [26:24:09<7:29:17, 4.03s/it] {'loss': 0.3176, 'grad_norm': 0.6059302954765791, 'learning_rate': 2.219407132438815e-06, 'epoch': 0.7}
70%|██████▉ | 15404/22095 [26:24:13<7:21:46, 3.96s/it] {'loss': 0.2914, 'grad_norm': 0.6265849775857281, 'learning_rate': 2.2187980294853827e-06, 'epoch': 0.7}
70%|██████▉ | 15405/22095 [26:24:17<6:59:55, 3.77s/it] {'loss': 0.2895, 'grad_norm': 0.6272461141911604, 'learning_rate': 2.2181889862914368e-06, 'epoch': 0.7}
70%|██████▉ | 15406/22095 [26:24:20<6:30:54, 3.51s/it] {'loss': 0.309, 'grad_norm': 0.6329130486899887, 'learning_rate': 2.217580002870066e-06, 'epoch': 0.7}
70%|██████▉ | 15407/22095 [26:24:24<6:53:01, 3.71s/it] {'loss': 0.3516, 'grad_norm': 0.6510575440133307, 'learning_rate': 2.2169710792343574e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (43169 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15408/22095 [26:24:27<6:46:55, 3.65s/it] {'loss': 0.2705, 'grad_norm': 0.6025879494225266, 'learning_rate': 2.216362215397393e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15409/22095 [26:24:31<6:45:22, 3.64s/it] {'loss': 0.3211, 'grad_norm': 0.6229692611246024, 'learning_rate': 2.2157534113722533e-06, 'epoch': 0.7}
70%|██████▉ | 15410/22095 [26:24:35<7:13:26, 3.89s/it] {'loss': 0.3557, 'grad_norm': 0.6336128765611528, 'learning_rate': 2.215144667172023e-06, 'epoch': 0.7}
70%|██████▉ | 15411/22095 [26:24:39<7:20:04, 3.95s/it] {'loss': 0.3174, 'grad_norm': 0.6089725873892945, 'learning_rate': 2.21453598280978e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946832 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69985, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C和D是AB段上的两点,Cd=3c m,M是AC的中点,N是DB的中点,AB=9.8cm,则Mn段的长度等于()\nA. 6.4cm\nB. 6.8cm\nC. 7cm\nD. 5.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 70%|██████▉ | 15412/22095 [26:24:43<6:57:18, 3.75s/it] {'loss': 0.3422, 'grad_norm': 0.6146045841194961, 'learning_rate': 2.213927358298605e-06, 'epoch': 0.7} 70%|██████▉ | 15412/22095 [26:24:43<6:57:18, 3.75s/it] 70%|██████▉ | 15413/22095 [26:24:47<7:32:50, 4.07s/it] {'loss': 0.2705, 'grad_norm': 0.6473314379938416, 'learning_rate': 2.213318793651573e-06, 'epoch': 0.7} 70%|██████▉ | 15413/22095 [26:24:47<7:32:50, 4.07s/it] 70%|██████▉ | 15414/22095 [26:24:51<7:13:47, 3.90s/it] {'loss': 0.2832, 'grad_norm': 0.5548174179736901, 'learning_rate': 2.2127102888817626e-06, 'epoch': 0.7} 70%|██████▉ | 15414/22095 [26:24:51<7:13:47, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42058 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44532 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58682 > 40960). 
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15415/22095 [26:24:54<7:01:26, 3.79s/it] {'loss': 0.3175, 'grad_norm': 0.6272400617526707, 'learning_rate': 2.2121018440022458e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15416/22095 [26:25:01<8:34:22, 4.62s/it] {'loss': 0.4648, 'grad_norm': 0.2872778648761435, 'learning_rate': 2.2114934590261e-06, 'epoch': 0.7}
70%|██████▉ | 15417/22095 [26:25:05<8:08:43, 4.39s/it] {'loss': 0.3486, 'grad_norm': 0.6501873735094622, 'learning_rate': 2.2108851339663956e-06, 'epoch': 0.7}
70%|██████▉ | 15418/22095 [26:25:09<7:52:14, 4.24s/it] {'loss': 0.2585, 'grad_norm': 0.5714597352968046, 'learning_rate': 2.210276868836202e-06, 'epoch': 0.7}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15419/22095 [26:25:12<7:16:17, 3.92s/it] {'loss': 0.2976, 'grad_norm': 0.6699368490383701, 'learning_rate': 2.209668663648592e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (41453 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81253 > 40960).
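The `DecompressionBombWarning` above is Pillow's built-in pixel-count guard; its default budget is `Image.MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3`, i.e. 89478485, exactly the limit quoted in the warning. A plain-Python sketch of the same heuristic (a hypothetical helper, not Pillow's own code):

```python
# Pillow's default pixel budget: a quarter-gigabyte of 3-byte pixels.
# int(1024 * 1024 * 1024 // 4 // 3) == 89478485, matching the warning in the log.
MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3

def exceeds_pixel_limit(width: int, height: int, limit: int = MAX_IMAGE_PIXELS) -> bool:
    """Mirror Pillow's decompression-bomb heuristic: flag when w*h exceeds the budget."""
    return width * height > limit
```

In Pillow itself the threshold is adjustable via `PIL.Image.MAX_IMAGE_PIXELS` (setting it to `None` disables the check entirely, at the cost of losing real decompression-bomb protection).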
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54116 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15420/22095 [26:25:15<6:41:43, 3.61s/it] {'loss': 0.3516, 'grad_norm': 0.5839244450998203, 'learning_rate': 2.2090605184166325e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (104543 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64154 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15421/22095 [26:25:18<6:23:48, 3.45s/it] {'loss': 0.2602, 'grad_norm': 0.6601736114108441, 'learning_rate': 2.208452433153389e-06, 'epoch': 0.7}
70%|██████▉ | 15422/22095 [26:25:22<6:33:37, 3.54s/it] {'loss': 0.2763, 'grad_norm': 1.2962557190320656, 'learning_rate': 2.207844407871929e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15423/22095 [26:25:32<10:14:36, 5.53s/it] {'loss': 0.475, 'grad_norm': 0.3335447019667144, 'learning_rate': 2.2072364425853193e-06, 'epoch': 0.7}
70%|██████▉ | 15424/22095 [26:25:36<9:36:17, 5.18s/it] {'loss': 0.3245, 'grad_norm': 0.6271978005182927, 'learning_rate': 2.206628537306621e-06, 'epoch': 0.7}
70%|██████▉ | 15425/22095 [26:25:39<8:27:47, 4.57s/it] {'loss': 0.2691, 'grad_norm': 0.5811086521395183, 'learning_rate': 2.206020692048895e-06, 'epoch': 0.7}
70%|██████▉ | 15426/22095 [26:25:42<7:34:03, 4.09s/it] {'loss': 0.2459, 'grad_norm': 0.5722795124330649, 'learning_rate': 2.2054129068252037e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957195 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8030, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 7\nB. 6\nC. 10\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
70%|██████▉ | 15427/22095 [26:25:46<7:36:35, 4.11s/it] {'loss': 0.2928, 'grad_norm': 0.6096023565887093, 'learning_rate': 2.2048051816486054e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15428/22095 [26:25:54<9:20:39, 5.05s/it] {'loss': 0.4718, 'grad_norm': 0.294609066488518, 'learning_rate': 2.2041975165321606e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (50766 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15429/22095 [26:26:05<12:33:23, 6.78s/it] {'loss': 0.4586, 'grad_norm': 0.2535641837168702, 'learning_rate': 2.2035899114889226e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (47246 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66408 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78441 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15430/22095 [26:26:08<10:40:20, 5.76s/it] {'loss': 0.272, 'grad_norm': 0.5771085270607175, 'learning_rate': 2.2029823665319504e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15431/22095 [26:26:12<9:28:10, 5.12s/it] {'loss': 0.3222, 'grad_norm': 0.6769738222692228, 'learning_rate': 2.2023748816742955e-06, 'epoch': 0.7}
70%|██████▉ | 15432/22095 [26:26:15<8:33:09, 4.62s/it] {'loss': 0.2761, 'grad_norm': 0.6033882521564669, 'learning_rate': 2.201767456929014e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15433/22095 [26:26:19<8:05:57, 4.38s/it] {'loss': 0.3192, 'grad_norm': 0.5714842277367986, 'learning_rate': 2.2011600923091554e-06, 'epoch': 0.7}
70%|██████▉ | 15434/22095 [26:26:22<7:16:30, 3.93s/it] {'loss': 0.3089, 'grad_norm': 0.6509570167193155, 'learning_rate': 2.200552787827768e-06, 'epoch': 0.7}
70%|██████▉ | 15435/22095 [26:26:25<6:54:11, 3.73s/it] {'loss': 0.2903, 'grad_norm': 0.6189100093640094, 'learning_rate': 2.1999455434979046e-06, 'epoch': 0.7}
70%|██████▉ | 15436/22095 [26:26:28<6:43:27, 3.64s/it] {'loss': 0.3026, 'grad_norm': 0.6344851280611853, 'learning_rate': 2.1993383593326127e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15437/22095 [26:26:34<7:53:00, 4.26s/it] {'loss': 0.4913, 'grad_norm': 0.27780582832440787, 'learning_rate': 2.1987312353449386e-06, 'epoch': 0.7}
70%|██████▉ | 15438/22095 [26:26:44<10:50:43, 5.87s/it] {'loss': 0.4925, 'grad_norm': 0.28615854234003973, 'learning_rate': 2.1981241715479247e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (53589 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99411 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15439/22095 [26:26:47<9:38:49, 5.22s/it] {'loss': 0.2965, 'grad_norm': 0.5812592354372973, 'learning_rate': 2.1975171679546187e-06, 'epoch': 0.7}
70%|██████▉ | 15440/22095 [26:26:51<8:51:51, 4.80s/it] {'loss': 0.3421, 'grad_norm': 0.6557422880930702, 'learning_rate': 2.1969102245780592e-06, 'epoch': 0.7}
70%|██████▉ | 15441/22095 [26:26:55<8:09:27, 4.41s/it] {'loss': 0.269, 'grad_norm': 0.6649927896003363, 'learning_rate': 2.196303341431293e-06, 'epoch': 0.7}
70%|██████▉ | 15442/22095 [26:26:58<7:21:18, 3.98s/it] {'loss': 0.3044, 'grad_norm': 0.5648141574982591, 'learning_rate': 2.1956965185273545e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (57227 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81518 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15443/22095 [26:27:01<6:42:11, 3.63s/it] {'loss': 0.2672, 'grad_norm': 0.6455390831857626, 'learning_rate': 2.1950897558792873e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (47319 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45983 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46695 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15444/22095 [26:27:03<6:12:03, 3.36s/it] {'loss': 0.3157, 'grad_norm': 0.65669426469483, 'learning_rate': 2.1944830535001244e-06, 'epoch': 0.7}
70%|██████▉ | 15445/22095 [26:27:07<6:24:21, 3.47s/it] {'loss': 0.2951, 'grad_norm': 0.6604232285727459, 'learning_rate': 2.193876411402906e-06, 'epoch': 0.7}
70%|██████▉ | 15446/22095 [26:27:10<6:11:00, 3.35s/it] {'loss': 0.3403, 'grad_norm': 0.6278571713730251, 'learning_rate': 2.193269829600665e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15447/22095 [26:27:19<9:30:47, 5.15s/it] {'loss': 0.4824, 'grad_norm': 0.2877594596193233, 'learning_rate': 2.1926633081064336e-06, 'epoch': 0.7}
70%|██████▉ | 15448/22095 [26:27:23<8:22:32, 4.54s/it] {'loss': 0.2951, 'grad_norm': 0.626805222348979, 'learning_rate': 2.1920568469332458e-06, 'epoch': 0.7}
70%|██████▉ | 15449/22095 [26:27:26<7:40:38, 4.16s/it] {'loss': 0.3224, 'grad_norm': 0.6295785599758981, 'learning_rate': 2.191450446094136e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (59562 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124635 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15450/22095 [26:27:30<7:30:14, 4.07s/it] {'loss': 0.283, 'grad_norm': 0.9393405038196936, 'learning_rate': 2.190844105602127e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (52103 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87698 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41942 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81765 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15451/22095 [26:27:39<10:18:56, 5.59s/it] {'loss': 0.4909, 'grad_norm': 0.33121434047560755, 'learning_rate': 2.19023782547025e-06, 'epoch': 0.7}
70%|██████▉ | 15452/22095 [26:27:47<11:39:55, 6.32s/it] {'loss': 0.4659, 'grad_norm': 0.3220401310269192, 'learning_rate': 2.1896316057115343e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (76932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71505 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15453/22095 [26:27:50<9:53:43, 5.36s/it] {'loss': 0.266, 'grad_norm': 0.5645794389594799, 'learning_rate': 2.189025446339004e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (53342 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90696 > 40960).
Running this sequence through the model will result in indexing errors
70%|██████▉ | 15454/22095 [26:27:53<8:41:50, 4.71s/it] {'loss': 0.2731, 'grad_norm': 0.6312858299620019, 'learning_rate': 2.1884193473656824e-06, 'epoch': 0.7}
70%|██████▉ | 15455/22095 [26:27:56<7:51:03, 4.26s/it] {'loss': 0.3496, 'grad_norm': 0.6166425710591585, 'learning_rate': 2.187813308804595e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (41647 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123000 > 40960). Running this sequence through the model will result in indexing errors
70%|██████▉ | 15456/22095 [26:28:00<7:40:40, 4.16s/it] {'loss': 0.3373, 'grad_norm': 0.6094685675555125, 'learning_rate': 2.1872073306687614e-06, 'epoch': 0.7}
70%|██████▉ | 15457/22095 [26:28:04<7:39:28, 4.15s/it] {'loss': 0.3242, 'grad_norm': 0.6105705990820928, 'learning_rate': 2.186601412971205e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|██████▉ | 15458/22095 [26:28:14<10:49:20, 5.87s/it] {'loss': 0.4515, 'grad_norm': 0.2643205914938593, 'learning_rate': 2.185995555724942e-06, 'epoch': 0.7}
70%|██████▉ | 15459/22095 [26:28:24<13:05:59, 7.11s/it] {'loss': 0.4682, 'grad_norm': 0.29166666670093994, 'learning_rate': 2.1853897589429935e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|██████▉ | 15460/22095 [26:28:28<11:05:39, 6.02s/it] {'loss': 0.3121, 'grad_norm': 0.6264830100800826, 'learning_rate': 2.184784022638373e-06, 'epoch': 0.7}
70%|██████▉ | 15461/22095 [26:28:37<13:02:54, 7.08s/it] {'loss': 0.4483, 'grad_norm': 0.25817546316852996, 'learning_rate': 2.184178346824099e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|██████▉ | 15462/22095 [26:28:46<13:57:14, 7.57s/it] {'loss': 0.4954, 'grad_norm': 0.2978882566016645, 'learning_rate': 2.1835727315131842e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|██████▉ | 15463/22095 [26:28:49<11:31:58, 6.26s/it] {'loss': 0.3194, 'grad_norm': 0.5901124855544712, 'learning_rate': 2.18296717671864e-06, 'epoch': 0.7}
70%|██████▉ | 15464/22095 [26:28:59<13:31:09, 7.34s/it] {'loss': 0.4612, 'grad_norm': 0.28291400965652297, 'learning_rate': 2.1823616824534788e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|██████▉ | 15465/22095 [26:29:03<11:25:29, 6.20s/it] {'loss': 0.2949, 'grad_norm': 0.6461112710167152, 'learning_rate': 2.181756248730714e-06, 'epoch': 0.7}
70%|██████▉ | 15466/22095 [26:29:06<9:51:07, 5.35s/it] {'loss': 0.3271, 'grad_norm': 0.5991463451115281, 'learning_rate': 2.1811508755633508e-06, 'epoch': 0.7}
70%|███████ | 15467/22095 [26:29:10<9:02:22, 4.91s/it] {'loss': 0.255, 'grad_norm': 0.6766384769010018, 'learning_rate': 2.1805455629643966e-06, 'epoch': 0.7}
70%|███████ | 15468/22095 [26:29:13<7:52:47, 4.28s/it] {'loss': 0.2896, 'grad_norm': 0.6061024812032464, 'learning_rate': 2.179940310946861e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (47954 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55894 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15469/22095 [26:29:16<7:32:58, 4.10s/it] {'loss': 0.2962, 'grad_norm': 0.6709462130279129, 'learning_rate': 2.179335119523745e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (103403 > 40960).
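The "[Try #0] Failed to fetch sample …" lines show the dataset retrying with a substitute sample whenever one fails validation, so a bad record stalls a step but does not kill the run. A sketch of such a fault-tolerant `__getitem__` (hypothetical class and deterministic next-index fallback; the actual loader may resample randomly):

```python
class RobustDataset:
    """Wrap sample records; on a bad sample, retry with a substitute index."""

    def __init__(self, samples, max_tries: int = 10):
        self.samples = samples
        self.max_tries = max_tries

    def _get_item(self, idx):
        sample = self.samples[idx]
        w, h = sample["image_wh"][0]
        if w < 28 or h < 28:  # same check that raises in the log above
            raise ValueError(f"Image size {[w, h]} is too small. Minimum size is 28.")
        return sample

    def __getitem__(self, idx):
        for attempt in range(self.max_tries):
            try:
                return self._get_item(idx)
            except Exception as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {idx}. Exception: {exc}")
                idx = (idx + 1) % len(self.samples)  # deterministic fallback sample
        raise RuntimeError("too many consecutive bad samples")
```

The trade-off is silent distribution drift: every substituted sample slightly over-weights its neighbors, which is why the log also prints the problematic record for later cleanup.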
Running this sequence through the model will result in indexing errors
70%|███████ | 15470/22095 [26:29:20<7:16:02, 3.95s/it] {'loss': 0.2924, 'grad_norm': 0.581660104951167, 'learning_rate': 2.178729988708056e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8437375 in VC:s3://internvl-moe-sft-data/. Exception: Image size [192, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 109526, 'image': 'vrdu_texteq/astro-ph.CO/a6c27ef6-3cee-4edd-ba73-88dcd0fbf706.png', 'image_wh': [[192, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': '$x_{00}$ components'}]}
70%|███████ | 15471/22095 [26:29:24<7:16:42, 3.96s/it] {'loss': 0.2999, 'grad_norm': 0.6539112408764716, 'learning_rate': 2.178124918512793e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (68410 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15472/22095 [26:29:27<6:43:21, 3.65s/it] {'loss': 0.2721, 'grad_norm': 0.6104589647548964, 'learning_rate': 2.17751990895096e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (69233 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50537 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59943 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60512 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42200 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131078 > 40960).
Running this sequence through the model will result in indexing errors
70%|███████ | 15473/22095 [26:29:30<6:22:57, 3.47s/it] {'loss': 0.2883, 'grad_norm': 0.6344018666873753, 'learning_rate': 2.1769149600355545e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15474/22095 [26:29:34<6:31:09, 3.54s/it] {'loss': 0.2954, 'grad_norm': 0.8703735890057297, 'learning_rate': 2.176310071779577e-06, 'epoch': 0.7}
70%|███████ | 15475/22095 [26:29:37<6:27:30, 3.51s/it] {'loss': 0.2981, 'grad_norm': 0.6420105953615923, 'learning_rate': 2.1757052441960248e-06, 'epoch': 0.7}
70%|███████ | 15476/22095 [26:29:40<6:09:54, 3.35s/it] {'loss': 0.2762, 'grad_norm': 0.5941777634412325, 'learning_rate': 2.17510047729789e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|███████ | 15477/22095 [26:29:48<8:39:36, 4.71s/it] {'loss': 0.4802, 'grad_norm': 0.3041590428798745, 'learning_rate': 2.174495771098171e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15478/22095 [26:29:55<9:55:18, 5.40s/it] {'loss': 0.4989, 'grad_norm': 0.3065766008871142, 'learning_rate': 2.173891125609863e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|███████ | 15479/22095 [26:29:58<8:46:46, 4.78s/it] {'loss': 0.3296, 'grad_norm': 0.6260164462093338, 'learning_rate': 2.1732865408459508e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (84933 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120211 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91634 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15480/22095 [26:30:05<9:59:29, 5.44s/it] {'loss': 0.4473, 'grad_norm': 0.2776070640653133, 'learning_rate': 2.17268201681943e-06, 'epoch': 0.7}
70%|███████ | 15481/22095 [26:30:15<12:18:35, 6.70s/it] {'loss': 0.4879, 'grad_norm': 0.2922294395288578, 'learning_rate': 2.172077553543291e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 364, but got module 1
70%|███████ | 15482/22095 [26:30:19<10:37:32, 5.78s/it] {'loss': 0.2829, 'grad_norm': 0.6019312043691404, 'learning_rate': 2.17147315103052e-06, 'epoch': 0.7}
70%|███████ | 15483/22095 [26:30:22<9:14:42, 5.03s/it] {'loss': 0.3021, 'grad_norm': 0.5946946099216283, 'learning_rate': 2.1708688092941018e-06, 'epoch': 0.7}
70%|███████ | 15484/22095 [26:30:26<8:30:51, 4.64s/it] {'loss': 0.3011, 'grad_norm': 0.6213603603657362, 'learning_rate': 2.1702645283470238e-06, 'epoch': 0.7}
70%|███████ | 15485/22095 [26:30:28<7:34:21, 4.12s/it] {'loss': 0.3158, 'grad_norm': 0.6604124753553283, 'learning_rate': 2.169660308202272e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8301285 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1C9TcpC3PL1JjSZFxXXcBBVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content hidden in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n老鼠不跑光\n包退\n高效\n声光结合\n模式可调\n自动关闭\n主机防水\n多项专利\n低耗电型\n缤纷日子\n汽车专用\nW218\n12强光灯\n驱属之神'}]}
70%|███████ | 15486/22095 [26:30:32<7:18:18, 3.98s/it] {'loss': 0.2895, 'grad_norm': 0.6684013995648608, 'learning_rate': 2.169056148872828e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [278, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8513656 in VC:s3://internvl-moe-sft-data/. Exception: Image size [278, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 156959, 'image': 'vrdu_texteq/astro-ph.CO/c20db4f1-6dee-43a0-ad82-174c4858e6e3.png', 'image_wh': [[278, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'where $C_n$ is a constant'}]}
70%|███████ | 15487/22095 [26:30:36<7:24:19, 4.03s/it] {'loss': 0.2915, 'grad_norm': 0.7528060434075988, 'learning_rate': 2.1684520503716704e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
70%|███████ | 15488/22095 [26:30:44<9:12:32, 5.02s/it] {'loss': 0.4931, 'grad_norm': 0.31181609176172187, 'learning_rate': 2.167848012711784e-06, 'epoch': 0.7}
70%|███████ | 15489/22095 [26:30:47<8:31:59, 4.65s/it] {'loss': 0.3021, 'grad_norm': 0.5794370482748491, 'learning_rate': 2.1672440359061435e-06, 'epoch': 0.7}
70%|███████ | 15490/22095 [26:30:51<7:50:25, 4.27s/it] {'loss': 0.3133, 'grad_norm': 0.6148117834325464, 'learning_rate': 2.16664011996773e-06, 'epoch': 0.7}
70%|███████ | 15491/22095 [26:30:54<7:15:15, 3.95s/it] {'loss': 0.3024, 'grad_norm': 0.7677040014486417, 'learning_rate': 2.166036264909519e-06, 'epoch': 0.7}
70%|███████ | 15492/22095 [26:30:57<6:50:40, 3.73s/it] {'loss': 0.2839, 'grad_norm': 0.6786446701334466, 'learning_rate': 2.165432470744483e-06, 'epoch': 0.7}
70%|███████ | 15493/22095 [26:31:01<6:39:00, 3.63s/it] {'loss': 0.3138, 'grad_norm': 0.8929527677804295, 'learning_rate': 2.164828737485597e-06, 'epoch': 0.7}
70%|███████ | 15494/22095 [26:31:05<6:49:40, 3.72s/it] {'loss': 0.3477, 'grad_norm': 0.6217136165511883, 'learning_rate': 2.164225065145836e-06, 'epoch': 0.7}
70%|███████ | 15495/22095 [26:31:08<6:40:57, 3.65s/it] {'loss': 0.3093, 'grad_norm': 0.6117178180181198, 'learning_rate': 2.163621453738168e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (63238 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15496/22095 [26:31:11<6:33:17, 3.58s/it] {'loss': 0.2741, 'grad_norm': 0.6523515263755088, 'learning_rate': 2.1630179032755632e-06, 'epoch': 0.7}
70%|███████ | 15497/22095 [26:31:15<6:49:19, 3.72s/it] {'loss': 0.3055, 'grad_norm': 0.5729455223348102, 'learning_rate': 2.1624144137709917e-06, 'epoch': 0.7}
70%|███████ | 15498/22095 [26:31:19<6:35:56, 3.60s/it] {'loss': 0.3363, 'grad_norm': 0.6059087224660069, 'learning_rate': 2.161810985237418e-06, 'epoch': 0.7}
70%|███████ | 15499/22095 [26:31:23<7:03:46, 3.85s/it] {'loss': 0.3404, 'grad_norm': 0.7130322049678949, 'learning_rate': 2.1612076176878112e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8368583 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 35331, 'image': 'vrdu_table_final_2/astro-ph.CO/a8801417-1b0b-4523-b888-a40a4af5abc3.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{#1}#2\\end{tabular}\n```"}]}
70%|███████ | 15500/22095 [26:31:26<6:41:16, 3.65s/it] {'loss': 0.2398, 'grad_norm': 0.5802648354575368, 'learning_rate': 2.1606043111351316e-06, 'epoch': 0.7}
70%|███████ | 15501/22095 [26:31:30<6:37:15, 3.61s/it] {'loss': 0.2982, 'grad_norm': 0.7169971190891021, 'learning_rate': 2.160001065592347e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15502/22095 [26:31:34<6:36:55, 3.61s/it] {'loss': 0.3027, 'grad_norm': 0.696332156180076, 'learning_rate': 2.1593978810724152e-06, 'epoch': 0.7}
70%|███████ | 15503/22095 [26:31:37<6:24:36, 3.50s/it] {'loss': 0.2887, 'grad_norm': 0.6148654487309713, 'learning_rate': 2.158794757588301e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8894379 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17532, 'image': 'images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 7cm\nB. 8cm\nC. 1lcm\nD. 13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
70%|███████ | 15504/22095 [26:31:44<8:36:44, 4.70s/it] {'loss': 0.479, 'grad_norm': 0.3234783939738736, 'learning_rate': 2.1581916951529606e-06, 'epoch': 0.7}
70%|███████ | 15505/22095 [26:31:49<8:38:33, 4.72s/it] {'loss': 0.3061, 'grad_norm': 0.5654991950174835, 'learning_rate': 2.1575886937793515e-06, 'epoch': 0.7}
70%|███████ | 15506/22095 [26:31:53<8:06:56, 4.43s/it] {'loss': 0.2862, 'grad_norm': 0.6185639792057311, 'learning_rate': 2.1569857534804317e-06, 'epoch': 0.7}
70%|███████ | 15507/22095 [26:31:56<7:30:54, 4.11s/it] {'loss': 0.2907, 'grad_norm': 0.5872142467018157, 'learning_rate': 2.1563828742691597e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965710 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
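Every "Image size [...] is too small. Minimum size is 28" failure in this log comes from a sample whose logged `image_wh` has a side below the 28-pixel minimum the image processor enforces. A minimal pre-filter sketch, assuming samples carry the same `image_wh` field as the logged records (the helper name and sample dicts are hypothetical):

```python
# Hypothetical dataset pre-filter for the "Minimum size is 28" failures:
# drop any sample whose logged image width or height is below 28 px,
# so the dataloader never hits the ValueError at fetch time.
MIN_SIZE = 28  # minimum side length the image processor accepts

def is_large_enough(sample: dict) -> bool:
    """True if every (width, height) pair in sample['image_wh'] is >= MIN_SIZE."""
    return all(w >= MIN_SIZE and h >= MIN_SIZE
               for w, h in sample.get("image_wh", []))

# The rejected geoqa+ sample above had image_wh [[129, 20]]: height 20 < 28.
samples = [
    {"id": 17532, "image_wh": [[129, 20]]},  # rejected in the log
    {"id": 999, "image_wh": [[448, 336]]},   # hypothetical valid sample
]
kept = [s for s in samples if is_large_enough(s)]  # keeps only id 999
```

Filtering once at dataset-build time avoids the retry loop ("[Try #0] Failed to fetch sample ...") that otherwise fires on every epoch.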
Problematic sample: {'id': 16545, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 2cm\nB. 5cm\nC. 4cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
70%|███████ | 15508/22095 [26:31:59<6:52:17, 3.76s/it] {'loss': 0.2905, 'grad_norm': 0.6251172930466563, 'learning_rate': 2.1557800561584822e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946045 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69198, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
70%|███████ | 15509/22095 [26:32:02<6:30:47, 3.56s/it] {'loss': 0.3231, 'grad_norm': 0.6491345308673889, 'learning_rate': 2.155177299161357e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15510/22095 [26:32:06<6:28:53, 3.54s/it] {'loss': 0.3149, 'grad_norm': 0.6886456035695668, 'learning_rate': 2.154574603290735e-06, 'epoch': 0.7}
70%|███████ | 15511/22095 [26:32:09<6:10:08, 3.37s/it] {'loss': 0.3017, 'grad_norm': 0.6153866555546205, 'learning_rate': 2.1539719685595665e-06, 'epoch': 0.7}
70%|███████ | 15512/22095 [26:32:12<6:17:43, 3.44s/it] {'loss': 0.3578, 'grad_norm': 0.6256755519011062, 'learning_rate': 2.153369394980798e-06, 'epoch': 0.7}
70%|███████ | 15513/22095 [26:32:17<6:43:25, 3.68s/it] {'loss': 0.3478, 'grad_norm': 0.6207121681626878, 'learning_rate': 2.1527668825673777e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (108768 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15514/22095 [26:32:20<6:27:13, 3.53s/it] {'loss': 0.3382, 'grad_norm': 0.634240642136747, 'learning_rate': 2.1521644313322543e-06, 'epoch': 0.7}
70%|███████ | 15515/22095 [26:32:23<6:10:03, 3.37s/it] {'loss': 0.3213, 'grad_norm': 0.6182497417181321, 'learning_rate': 2.151562041288371e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15516/22095 [26:32:27<6:26:31, 3.53s/it] {'loss': 0.2757, 'grad_norm': 0.6069266273627872, 'learning_rate': 2.1509597124486693e-06, 'epoch': 0.7}
70%|███████ | 15517/22095 [26:32:30<6:09:21, 3.37s/it] {'loss': 0.2847, 'grad_norm': 0.568023784096722, 'learning_rate': 2.150357444826095e-06, 'epoch': 0.7}
70%|███████ | 15518/22095 [26:32:33<6:23:21, 3.50s/it] {'loss': 0.3185, 'grad_norm': 0.617320839569645, 'learning_rate': 2.1497552384335858e-06, 'epoch': 0.7}
70%|███████ | 15519/22095 [26:32:37<6:22:02, 3.49s/it] {'loss': 0.3063, 'grad_norm': 0.6363743882498056, 'learning_rate': 2.1491530932840835e-06, 'epoch': 0.7}
70%|███████ | 15520/22095 [26:32:41<6:29:32, 3.55s/it] {'loss': 0.3117, 'grad_norm': 0.6209437669725376, 'learning_rate': 2.1485510093905264e-06, 'epoch': 0.7}
70%|███████ | 15521/22095 [26:32:43<6:07:24, 3.35s/it] {'loss': 0.3298, 'grad_norm': 0.6466039291351509, 'learning_rate': 2.147948986765849e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46964 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15522/22095 [26:32:53<9:15:35, 5.07s/it] {'loss': 0.5006, 'grad_norm': 0.2965267650217191, 'learning_rate': 2.147347025422988e-06, 'epoch': 0.7}
70%|███████ | 15523/22095 [26:32:57<8:50:50, 4.85s/it] {'loss': 0.2877, 'grad_norm': 0.6130321389581915, 'learning_rate': 2.1467451253748797e-06, 'epoch': 0.7}
70%|███████ | 15524/22095 [26:33:01<8:27:30, 4.63s/it] {'loss': 0.2878, 'grad_norm': 0.6827378240286927, 'learning_rate': 2.1461432866344554e-06, 'epoch': 0.7}
70%|███████ | 15525/22095 [26:33:04<7:33:18, 4.14s/it] {'loss': 0.311, 'grad_norm': 0.5873439345678003, 'learning_rate': 2.145541509214646e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15526/22095 [26:33:07<7:03:28, 3.87s/it] {'loss': 0.3204, 'grad_norm': 0.6193308607911314, 'learning_rate': 2.1449397931283838e-06, 'epoch': 0.7}
70%|███████ | 15527/22095 [26:33:11<6:46:13, 3.71s/it] {'loss': 0.3135, 'grad_norm': 0.6274669568283813, 'learning_rate': 2.1443381383885954e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8941397 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64550, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 6.8cm\nB. 7cm\nC. 5.4cm\nD. 6.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
70%|███████ | 15528/22095 [26:33:13<6:16:21, 3.44s/it] {'loss': 0.2914, 'grad_norm': 0.5998583644959756, 'learning_rate': 2.1437365450082114e-06, 'epoch': 0.7}
70%|███████ | 15529/22095 [26:33:17<6:24:39, 3.51s/it] {'loss': 0.2954, 'grad_norm': 0.6134254482027904, 'learning_rate': 2.1431350130001556e-06, 'epoch': 0.7}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (80600 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61277 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76236 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15530/22095 [26:33:24<8:06:43, 4.45s/it] {'loss': 0.4972, 'grad_norm': 0.28329977300242243, 'learning_rate': 2.142533542377355e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
70%|███████ | 15531/22095 [26:33:28<7:55:46, 4.35s/it] {'loss': 0.3163, 'grad_norm': 0.6315642879116592, 'learning_rate': 2.1419321331527317e-06, 'epoch': 0.7}
70%|███████ | 15532/22095 [26:33:32<7:46:33, 4.27s/it] {'loss': 0.3264, 'grad_norm': 0.6288753910599849, 'learning_rate': 2.14133078533921e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11303193 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
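The recurring tokenizer warning "Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)" means some packed samples exceed the model's 40960-token window. A minimal guard sketch (hypothetical; the actual training script may pack, skip, or re-chunk instead of truncating):

```python
# Hypothetical guard for the over-length tokenizer warnings: sequences
# longer than the model maximum (40960 in this run) would index past the
# usable position range, so truncate them before batching.
MAX_LEN = 40960

def clip_to_max(input_ids: list, max_len: int = MAX_LEN) -> list:
    """Truncate a token-id sequence to at most max_len tokens."""
    return input_ids[:max_len] if len(input_ids) > max_len else input_ids
```

Truncation is the bluntest option; dropping or re-chunking the sample preserves supervision better, but either way the guard keeps the warning from becoming an out-of-range indexing error at forward time.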
Problematic sample: {'id': 6, 'image': 'airplane_app/149.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这是什么'}, {'from': 'gpt', 'value': '这是一张中国国际航空公司的登机牌。以下是登机牌上的详细信息:\n\n- 航段信息:中国国航CA1515/HU1223\n- 日期:2017年1月13日\n- 出发地:北京首都机场T1\n- 目的地:上海虹桥机场T1\n- 乘客姓名:张小明\n- 座位号:36A\n- 舱位等级:头等舱\n- 登机时间:20:40\n- 登机口:C15\n- 会员卡号/合约:CA1234567890\n- 航班号:999542123456\n- 会员卡积分:123\n- 登机顺序:15\n\n右下方的“973游戏”标志表明这张登机牌图片可能是从一个游戏或应用程序中获取的。\n\n注意:这只是模拟登机牌的图片,不是实际的登机牌。'}]} 70%|███████ | 15533/22095 [26:33:35<7:01:13, 3.85s/it] {'loss': 0.3117, 'grad_norm': 0.6167214538896445, 'learning_rate': 2.14072949894971e-06, 'epoch': 0.7} 70%|███████ | 15533/22095 [26:33:35<7:01:13, 3.85s/it] 70%|███████ | 15534/22095 [26:33:38<6:46:21, 3.72s/it] {'loss': 0.2891, 'grad_norm': 0.5920196063320219, 'learning_rate': 2.14012827399715e-06, 'epoch': 0.7} 70%|███████ | 15534/22095 [26:33:38<6:46:21, 3.72s/it] 70%|███████ | 15535/22095 [26:33:42<6:35:57, 3.62s/it] {'loss': 0.2853, 'grad_norm': 0.5728835353448037, 'learning_rate': 2.13952711049445e-06, 'epoch': 0.7} 70%|███████ | 15535/22095 [26:33:42<6:35:57, 3.62s/it] 70%|███████ | 15536/22095 [26:33:44<6:06:25, 3.35s/it] {'loss': 0.3132, 'grad_norm': 0.6498149796461337, 'learning_rate': 2.1389260084545305e-06, 'epoch': 0.7} 70%|███████ | 15536/22095 [26:33:44<6:06:25, 3.35s/it] 70%|███████ | 15537/22095 [26:33:48<6:06:21, 3.35s/it] {'loss': 0.3054, 'grad_norm': 0.60638016947381, 'learning_rate': 2.1383249678903006e-06, 'epoch': 0.7} 70%|███████ | 15537/22095 [26:33:48<6:06:21, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (64563 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119665 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43484 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45110 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47119 > 40960). Running this sequence through the model will result in indexing errors 70%|███████ | 15538/22095 [26:33:58<9:55:10, 5.45s/it] {'loss': 0.4889, 'grad_norm': 0.2768250776447379, 'learning_rate': 2.1377239888146785e-06, 'epoch': 0.7} 70%|███████ | 15538/22095 [26:33:58<9:55:10, 5.45s/it] 70%|███████ | 15539/22095 [26:34:02<9:01:14, 4.95s/it] {'loss': 0.3623, 'grad_norm': 0.647992038999156, 'learning_rate': 2.1371230712405783e-06, 'epoch': 0.7} 70%|███████ | 15539/22095 [26:34:02<9:01:14, 4.95s/it] 70%|███████ | 15540/22095 [26:34:05<7:50:53, 4.31s/it] {'loss': 0.3067, 'grad_norm': 0.7498658647713098, 'learning_rate': 2.1365222151809106e-06, 'epoch': 0.7} 70%|███████ | 15540/22095 [26:34:05<7:50:53, 4.31s/it] 70%|███████ | 15541/22095 [26:34:08<7:14:36, 3.98s/it] {'loss': 0.2784, 'grad_norm': 0.5874224222939307, 'learning_rate': 2.1359214206485845e-06, 'epoch': 0.7} 70%|███████ | 15541/22095 [26:34:08<7:14:36, 3.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15542/22095 [26:34:12<7:08:35, 3.92s/it] {'loss': 0.3168, 'grad_norm': 0.6189769199261504, 'learning_rate': 2.135320687656511e-06, 'epoch': 0.7} 70%|███████ | 15542/22095 [26:34:12<7:08:35, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41431 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81026 > 40960). Running this sequence through the model will result in indexing errors 70%|███████ | 15543/22095 [26:34:15<6:49:51, 3.75s/it] {'loss': 0.3384, 'grad_norm': 0.6490191871652556, 'learning_rate': 2.1347200162175984e-06, 'epoch': 0.7} 70%|███████ | 15543/22095 [26:34:15<6:49:51, 3.75s/it] 70%|███████ | 15544/22095 [26:34:18<6:35:42, 3.62s/it] {'loss': 0.335, 'grad_norm': 0.6092680751785646, 'learning_rate': 2.1341194063447533e-06, 'epoch': 0.7} 70%|███████ | 15544/22095 [26:34:18<6:35:42, 3.62s/it] 70%|███████ | 15545/22095 [26:34:22<6:38:49, 3.65s/it] {'loss': 0.3049, 'grad_norm': 0.6465144265028875, 'learning_rate': 2.133518858050879e-06, 'epoch': 0.7} 70%|███████ | 15545/22095 [26:34:22<6:38:49, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 70%|███████ | 15546/22095 [26:34:31<9:47:31, 5.38s/it] {'loss': 0.4656, 'grad_norm': 0.42971251058278587, 'learning_rate': 2.132918371348882e-06, 'epoch': 0.7} 70%|███████ | 15546/22095 [26:34:31<9:47:31, 5.38s/it] 70%|███████ | 15547/22095 [26:34:35<8:52:05, 4.88s/it] {'loss': 0.3116, 'grad_norm': 0.5763015261511008, 'learning_rate': 2.132317946251662e-06, 'epoch': 0.7} 70%|███████ | 15547/22095 [26:34:35<8:52:05, 4.88s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [109, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8341288 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [109, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7933, 'image': 'vrdu_table_final_2/astro-ph.CO/f7c6018e-a885-4370-9434-75c86c1a7d62.png', 'image_wh': [[109, 20]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha - \\alpha_{\\rm true}$\\end{tabular}\n```"}]} 70%|███████ | 15548/22095 [26:34:38<7:47:47, 4.29s/it] {'loss': 0.2937, 'grad_norm': 0.7014396526280077, 'learning_rate': 2.1317175827721238e-06, 'epoch': 0.7} 70%|███████ | 15548/22095 [26:34:38<7:47:47, 4.29s/it] 70%|███████ | 15549/22095 [26:34:41<7:06:08, 3.91s/it] {'loss': 0.3095, 'grad_norm': 0.6105683067560459, 'learning_rate': 2.131117280923165e-06, 'epoch': 0.7} 70%|███████ | 15549/22095 [26:34:41<7:06:08, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48831 > 40960). 
Running this sequence through the model will result in indexing errors 70%|███████ | 15550/22095 [26:34:44<6:51:01, 3.77s/it] {'loss': 0.3185, 'grad_norm': 0.5806980703941906, 'learning_rate': 2.1305170407176836e-06, 'epoch': 0.7} 70%|███████ | 15550/22095 [26:34:44<6:51:01, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 70%|███████ | 15551/22095 [26:34:54<10:15:14, 5.64s/it] {'loss': 0.4511, 'grad_norm': 0.3146629447531045, 'learning_rate': 2.1299168621685775e-06, 'epoch': 0.7} 70%|███████ | 15551/22095 [26:34:54<10:15:14, 5.64s/it] 70%|███████ | 15552/22095 [26:34:58<8:58:31, 4.94s/it] {'loss': 0.3012, 'grad_norm': 0.5856947087391554, 'learning_rate': 2.1293167452887452e-06, 'epoch': 0.7} 70%|███████ | 15552/22095 [26:34:58<8:58:31, 4.94s/it] 70%|███████ | 15553/22095 [26:35:01<7:50:14, 4.31s/it] {'loss': 0.295, 'grad_norm': 0.69512091931565, 'learning_rate': 2.1287166900910796e-06, 'epoch': 0.7} 70%|███████ | 15553/22095 [26:35:01<7:50:14, 4.31s/it] 70%|███████ | 15554/22095 [26:35:04<7:28:39, 4.12s/it] {'loss': 0.3547, 'grad_norm': 0.5998239511118214, 'learning_rate': 2.1281166965884715e-06, 'epoch': 0.7} 70%|███████ | 15554/22095 [26:35:04<7:28:39, 4.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15555/22095 [26:35:08<7:03:45, 3.89s/it] {'loss': 0.3277, 'grad_norm': 0.7292774262951063, 'learning_rate': 2.1275167647938153e-06, 'epoch': 0.7} 70%|███████ | 15555/22095 [26:35:08<7:03:45, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15556/22095 [26:35:11<6:30:48, 3.59s/it] {'loss': 0.2779, 'grad_norm': 0.610099937979236, 'learning_rate': 2.1269168947200043e-06, 'epoch': 0.7} 70%|███████ | 15556/22095 [26:35:11<6:30:48, 3.59s/it] 70%|███████ | 15557/22095 [26:35:14<6:15:32, 3.45s/it] {'loss': 0.3535, 'grad_norm': 0.7000545008934641, 'learning_rate': 
2.126317086379925e-06, 'epoch': 0.7} 70%|███████ | 15557/22095 [26:35:14<6:15:32, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 70%|███████ | 15558/22095 [26:35:23<9:31:07, 5.24s/it] {'loss': 0.4646, 'grad_norm': 0.26531741359570854, 'learning_rate': 2.1257173397864635e-06, 'epoch': 0.7} 70%|███████ | 15558/22095 [26:35:23<9:31:07, 5.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103276 > 40960). Running this sequence through the model will result in indexing errors 70%|███████ | 15559/22095 [26:35:26<8:22:57, 4.62s/it] {'loss': 0.3249, 'grad_norm': 0.6500995480765627, 'learning_rate': 2.1251176549525102e-06, 'epoch': 0.7} 70%|███████ | 15559/22095 [26:35:26<8:22:57, 4.62s/it] 70%|███████ | 15560/22095 [26:35:30<7:44:11, 4.26s/it] {'loss': 0.3052, 'grad_norm': 0.5911001356915474, 'learning_rate': 2.1245180318909482e-06, 'epoch': 0.7} 70%|███████ | 15560/22095 [26:35:30<7:44:11, 4.26s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified ) File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess patches, image_grid_thw = self._preprocess( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess resized_height, resized_width = smart_resize( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}") ValueError: height:32 and width:26 must be larger than factor:28 [Try #0] Failed to fetch sample 2163176 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:32 and width:26 must be larger than factor:28 Problematic sample: {'image': '11c49bcb727e743e8497fa3f8ce1470f7557b6d2b1f6363f08799fec12624ab8.png', 'conversations': [{'from': 'human', 'value': '\nThe position of this The element is a profile picture, commonly used as a button or link to user account settings. can be described as:\nThe profile picture is positioned in the top-right corner of the screen. It is adjacent to a circular button with a compass icon, which is located directly below it. 
The profile picture is separate from the main map area, which occupies the majority of the screen.'}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]', 'recipient': 'all', 'end_turn': True}]} 70%|███████ | 15561/22095 [26:35:33<7:04:02, 3.89s/it] {'loss': 0.3311, 'grad_norm': 0.6448318552907646, 'learning_rate': 2.123918470614663e-06, 'epoch': 0.7} 70%|███████ | 15561/22095 [26:35:33<7:04:02, 3.89s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8358150 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 24861, 'image': 'vrdu_table_final_2/astro-ph.CO/3079ba38-8c1d-4c6e-9dda-1f5b1a958fa3.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]} 70%|███████ | 15562/22095 [26:35:36<6:40:16, 3.68s/it] {'loss': 0.2825, 'grad_norm': 0.5916943962480607, 'learning_rate': 2.1233189711365374e-06, 'epoch': 0.7} 70%|███████ | 15562/22095 [26:35:36<6:40:16, 3.68s/it] 70%|███████ | 15563/22095 [26:35:39<6:17:19, 3.47s/it] {'loss': 0.3075, 'grad_norm': 0.5972384349849474, 'learning_rate': 2.12271953346945e-06, 'epoch': 0.7} 70%|███████ | 15563/22095 [26:35:39<6:17:19, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47915 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54032 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48547 > 40960). Running this sequence through the model will result in indexing errors 70%|███████ | 15564/22095 [26:35:42<5:51:53, 3.23s/it] {'loss': 0.3396, 'grad_norm': 0.7624894797514498, 'learning_rate': 2.1221201576262828e-06, 'epoch': 0.7} 70%|███████ | 15564/22095 [26:35:42<5:51:53, 3.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69347 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15565/22095 [26:35:44<5:37:35, 3.10s/it] {'loss': 0.294, 'grad_norm': 0.5869299786308838, 'learning_rate': 2.121520843619917e-06, 'epoch': 0.7} 70%|███████ | 15565/22095 [26:35:44<5:37:35, 3.10s/it] 70%|███████ | 15566/22095 [26:35:47<5:30:23, 3.04s/it] {'loss': 0.2832, 'grad_norm': 0.5884655200661071, 'learning_rate': 2.1209215914632275e-06, 'epoch': 0.7} 70%|███████ | 15566/22095 [26:35:47<5:30:23, 3.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 70%|███████ | 15567/22095 [26:35:57<8:59:53, 4.96s/it] {'loss': 0.4624, 'grad_norm': 0.27608380802360893, 'learning_rate': 2.120322401169088e-06, 'epoch': 0.7} 70%|███████ | 15567/22095 [26:35:57<8:59:53, 4.96s/it] 70%|███████ | 15568/22095 [26:36:00<8:06:42, 4.47s/it] {'loss': 0.2897, 'grad_norm': 0.6114286440045849, 'learning_rate': 2.119723272750379e-06, 'epoch': 0.7} 70%|███████ | 15568/22095 [26:36:00<8:06:42, 4.47s/it] 70%|███████ | 15569/22095 [26:36:03<7:26:44, 4.11s/it] {'loss': 0.295, 'grad_norm': 0.6040466508722457, 
'learning_rate': 2.1191242062199695e-06, 'epoch': 0.7} 70%|███████ | 15569/22095 [26:36:03<7:26:44, 4.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15570/22095 [26:36:07<7:05:40, 3.91s/it] {'loss': 0.2781, 'grad_norm': 0.7752228122059384, 'learning_rate': 2.118525201590732e-06, 'epoch': 0.7} 70%|███████ | 15570/22095 [26:36:07<7:05:40, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 70%|███████ | 15571/22095 [26:36:10<6:33:29, 3.62s/it] {'loss': 0.3222, 'grad_norm': 0.6012127086414305, 'learning_rate': 2.117926258875538e-06, 'epoch': 0.7} 70%|███████ | 15571/22095 [26:36:10<6:33:29, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (64695 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48465 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79348 > 40960). 
Running this sequence through the model will result in indexing errors
70%|███████ | 15572/22095 [26:36:17<8:28:00, 4.67s/it] {'loss': 0.4625, 'grad_norm': 0.28231026926098374, 'learning_rate': 2.1173273780872584e-06, 'epoch': 0.7}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [448, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8511563 in VC:s3://internvl-moe-sft-data/. Exception: Image size [448, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 158371, 'image': 'vrdu_texteq/astro-ph.CO/94a32d92-de3a-4624-8749-9b8da41eaf56.png', 'image_wh': [[448, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where we define the $\\Gamma$-function ratio:'}]}
70%|███████ | 15573/22095 [26:36:21<8:09:29, 4.50s/it] {'loss': 0.3222, 'grad_norm': 0.6151721387865137, 'learning_rate': 2.11672855923876e-06, 'epoch': 0.7}
70%|███████ | 15574/22095 [26:36:25<7:58:45, 4.41s/it] {'loss': 0.2739, 'grad_norm': 1.0305312559650226, 'learning_rate': 2.1161298023429076e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (41071 > 40960).
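The `ValueError` above is the loader rejecting a sample whose image is below its 28-pixel minimum edge length ([448, 23] fails on the 23-pixel side). One way to avoid paying for these failures at fetch time is to pre-filter the dataset using the `image_wh` metadata that the problematic-sample dumps already carry. A minimal sketch, assuming samples follow the dict layout shown in the log (the helper name is hypothetical, not from the training code):

```python
MIN_SIDE = 28  # minimum edge length enforced by the loader, per the ValueError above

def has_valid_image_sizes(sample, min_side=MIN_SIDE):
    """Return True only if every (width, height) pair meets the minimum edge length."""
    # image_wh is a list of [w, h] pairs in the problematic-sample dumps;
    # samples without the field (text-only) pass trivially.
    return all(w >= min_side and h >= min_side for w, h in sample.get("image_wh", []))
```

Filtering with this predicate before training would skip samples like the [448, 23] OCR crop (and the [0, 0] corrupt GIFs further down) instead of raising inside `__getitem__`.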
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50722 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50510 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92318 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15575/22095 [26:36:29<7:28:58, 4.13s/it] {'loss': 0.2865, 'grad_norm': 0.615711634922465, 'learning_rate': 2.1155311074125713e-06, 'epoch': 0.7}
Token indices sequence length is longer than the specified maximum sequence length for this model (85917 > 40960). Running this sequence through the model will result in indexing errors
70%|███████ | 15576/22095 [26:36:32<7:05:52, 3.92s/it] {'loss': 0.2861, 'grad_norm': 0.6168852774937957, 'learning_rate': 2.1149324744606103e-06, 'epoch': 0.7}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15577/22095 [26:36:36<7:02:11, 3.89s/it] {'loss': 0.3326, 'grad_norm': 0.6119238649528917, 'learning_rate': 2.114333903499891e-06, 'epoch': 0.71}
71%|███████ | 15578/22095 [26:36:39<6:26:31, 3.56s/it] {'loss': 0.3255, 'grad_norm': 0.5819637721539274, 'learning_rate': 2.1137353945432743e-06, 'epoch': 0.71}
71%|███████ | 15579/22095 [26:36:41<6:04:02, 3.35s/it] {'loss': 0.3066, 'grad_norm': 0.6676049772112951, 'learning_rate': 2.1131369476036173e-06, 'epoch': 0.71}
71%|███████ | 15579/22095 [26:36:41<6:04:02,
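The recurring "Token indices sequence length is longer than the specified maximum sequence length" warnings mean some conversations tokenize past the 40960-token context limit, which would index out of range if fed to the model untruncated. A common mitigation is to drop (or truncate) such samples before batching. A minimal sketch with a pluggable tokenizer callable, since wiring up the actual Qwen2.5-VL tokenizer is outside this log; the function name and sample layout are assumptions based on the dumps above:

```python
MAX_TOKENS = 40960  # context limit reported in the warnings (e.g. 54032 > 40960)

def drop_overlong(samples, tokenize_fn, max_tokens=MAX_TOKENS):
    """Keep only samples whose concatenated conversation fits the context window.

    tokenize_fn: any callable mapping a string to a token sequence, e.g.
    lambda t: tokenizer(t, add_special_tokens=False)["input_ids"].
    """
    kept = []
    for sample in samples:
        text = "".join(turn["value"] for turn in sample["conversations"])
        if len(tokenize_fn(text)) <= max_tokens:
            kept.append(sample)
    return kept
```

Running this once over the dataset manifest (rather than at fetch time) would silence the warnings and avoid wasting steps on samples that can only be truncated.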
3.35s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15580/22095 [26:36:51<9:21:24, 5.17s/it] {'loss': 0.4758, 'grad_norm': 0.2675541874957772, 'learning_rate': 2.1125385626937806e-06, 'epoch': 0.71}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11252478 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 776, 'image': ['MMMU/Architecture_and_Engineering/test_108_image_1.png', 'MMMU/Architecture_and_Engineering/test_108_image_2.png', 'MMMU/Architecture_and_Engineering/test_108_image_3.png', 'MMMU/Architecture_and_Engineering/test_108_image_4.png', 'MMMU/Architecture_and_Engineering/test_108_image_5.png', 'MMMU/Architecture_and_Engineering/test_108_image_6.png'], 'image_wh': [[174, 169], [188, 172], [197, 21], [194, 51], [194, 42], [193, 45]], 'conversations': [{'from': 'human', 'value': "Image-1: \nImage-2: \nImage-3: \nImage-4: \nImage-5: \nImage-6: \nPlease analyze all the relevant background knowledge involved in this question and give an in-depth and detailed introduction to each relevant knowledge point. You need to provide a clear explanation of each knowledge point involved in the question and answer, including its concept, definition, practical application, importance and influence in related fields.
Please provide appropriate examples to support the explanation of each knowledge point, just like the detailed explanation in the textbook. Make sure the answer is clearly structured, well-organized, and covers all important aspects of each knowledge point. The answer requires at least 2,000 words, and avoid directly answering the answer to the specific question, and focus on the analysis and explanation of background knowledge.\nHere are the content and options of this question:\nQuestion: Match List I with List II and select the correct answer using the codes given below the lists:\nOptions: ['', '', '', '']"}, {'from': 'gpt', 'value': "To analyze and explain the background knowledge involved in this question, it's essential to break down each component and examine the concepts, definitions, applications, and importance. This question involves understanding materials and symbols related to different types of fencing.\n\n### Hedge\n\n#### Concept and Definition\n- **Hedge**: A hedge is typically a line of closely spaced shrubs or sometimes trees. It is used as a boundary or for decorative purposes.\n \n#### Practical Application\n- **Uses in Landscaping**: Hedges are often used for privacy, wind protection, or to define garden spaces. They are also a popular choice for marking property boundaries.\n- **Environmental Benefits**: Hedges provide habitats for wildlife and can contribute to biodiversity in urban and rural settings.\n \n#### Importance and Influence\n- **Aesthetic Value**: Hedges add visual interest to landscapes.\n- **Ecosystem Services**: They play a role in ecosystems by supporting various forms of wildlife.\n\n#### Example\n- A classic example of a hedge is the English Boxwood used around gardens for its dense growth and ease of shaping.\n\n### Wire Fencing\n\n#### Concept and Definition\n- **Wire Fencing**: This type of fencing involves the use of metal wires to create a barrier. 
It can be made from galvanized steel or other metals.\n \n#### Practical Application\n- **Security**: Wire fences are widely used for security purposes, around properties like farms, industrial sites, and homes.\n- **Animal Containment**: Used in agriculture to contain livestock.\n\n#### Importance and Influence\n- **Cost-Effective**: Wire fencing is relatively inexpensive compared to other types.\n- **Durability and Maintenance**: Requires less maintenance and is durable against weather conditions.\n\n#### Example\n- Chain-link fences are common in both residential and commercial properties for security and demarcation purposes.\n\n### Pipe Fencing\n\n#### Concept and Definition\n- **Pipe Fencing**: Constructed using metal pipes, this type of fencing is known for its strength and durability.\n \n#### Practical Application\n- **Animal Handling**: Often used in ranches and farms for containing large animals like horses and cattle.\n- **Transportation and Industry**: Seen around industrial zones and transport depots for heavy-duty boundary requirements.\n\n#### Importance and Influence\n- **Strength**: Pipe fencing provides a robust solution for demanding environments.\n- **Versatility**: It can be customized in height and structure to suit different needs.\n\n#### Example\n- Steel pipe fencing is popular in areas where animals are frequently corralled, like rodeos or ranches.\n\n### Wood Fencing\n\n#### Concept and Definition\n- **Wood Fencing**: A traditional option, wood fences are made of various types of timber and offer a classic and natural look.\n \n#### Practical Application\n- **Privacy**: Often used in residential areas to provide privacy and security.\n- **Decoration**: Adds rustic charm to gardens and landscapes.\n\n#### Importance and Influence\n- **Aesthetic Appeal**: Wood fencing is an attractive choice for many homeowners.\n- **Sustainability Considerations**: When sourced responsibly, wood can be an eco-friendly option.\n\n#### Example\n- Cedar and 
redwood are popular choices for wood fencing due to their resistance to decay and insects.\n\n### Summary of Symbolism in Fencing\n\nSymbols play a crucial role in facilitating the communication of complex ideas in simpler forms, and this extends into how we perceive and label fencing types. The symbols described in List II use visual abstractions to represent different fencing types, and understanding these can extend into broader discussions about symbolism in landscape design.\n\n### Importance of Matching Symbols\n\nUnderstanding visual symbols is important in fields ranging from engineering and construction to graphic design and cartography. This example of matching fencing types with their symbolic representations taps into our cognitive abilities to process information visually and associate it with real-world objects. This not only aids in practical applications but also enhances cognitive mapping in various disciplines such as:\n\n- **Construction Planning**: Being able to visualize plans using symbols quickly translates into real-world applications.\n- **GIS and Mapping**: Symbols like these are used in geographic information systems to denote various features.\n\n### Broader Implications in Related Fields\n\n1. **Landscape Architecture**: Understanding different materials and their symbolic representations is crucial in designing functional and aesthetic outdoor spaces. This involves not only an appreciation of beauty but also a responsibility toward ecological impacts.\n\n2. **Material Science**: Investigating the properties of materials like metal, wood, and living plants helps in selecting the best options for specific environments and applications. Whether it's the tensile strength of wire or the biodegradability of wood, each choice carries weight in sustainability and performance.\n\n3. **Environmental Science**: The use of each fencing type has environmental implications. 
For instance, planting a hedge offers ecological benefits like improved air quality and wildlife habitats, whereas the production and disposal of metal have more environmental footprints. These factors must be weighed carefully in decision-making.\n\n4. **Urban Planning**: Planners often use symbolic representations of structures and resources in their designs. Understanding the various materials used in urban environments helps planners develop spaces that meet community needs while maintaining sustainability.\n\n5. **Cognitive Psychology**: Symbol recognition and interpretation are studied in cognitive psychology to understand how people perceive, process, and remember information. This understanding is key to designing user-friendly maps and diagrams.\n\n### Examples of Symbol Usage\n\nIn design disciplines, symbols represent not only physical elements but also convey ideas and values. For instance, in landscape architecture, a row of hedges may also symbolize exclusivity or elegance. Similarly, wire fencing often conveys notions of security or boundary enforcement.\n\nIn practice, using these symbols efficiently can lead to better communication, saving time and resources in industries ranging from construction to digital design. They help bridge language barriers and provide universal understanding.\n\n### Conclusion\n\nUnderstanding the elements and their representations in fencing requires a multi-disciplinary approach. It involves recognizing not just the physical properties and uses of materials like hedges, wire, pipes, and wood, but also the cultural and symbolic meanings they embody. This broadens the scope to include discussions on sustainability, design efficacy, user experience, and many other domains impacted by these choices. 
By paying close attention to these various factors, professionals can make more informed decisions that align both with practical needs and ethical standards, reflecting a holistic understanding of the environment in which they operate. \n\nUltimately, education in these areas builds a well-equipped mind capable of navigating both the tangible and intangible elements of design and infrastructure, making the practical applications of such knowledge extensive and profound."}]}
71%|███████ | 15581/22095 [26:36:54<8:16:32, 4.57s/it] {'loss': 0.2666, 'grad_norm': 0.6290051475284912, 'learning_rate': 2.111940239826624e-06, 'epoch': 0.71}
71%|███████ | 15582/22095 [26:36:57<7:34:56, 4.19s/it] {'loss': 0.3403, 'grad_norm': 0.5945427288601011, 'learning_rate': 2.1113419790150017e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15583/22095 [26:37:07<10:26:48, 5.78s/it] {'loss': 0.4439, 'grad_norm': 0.2695891017928327, 'learning_rate': 2.1107437802717667e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (57006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73833 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53870 > 40960).
Running this sequence through the model will result in indexing errors
71%|███████ | 15584/22095 [26:37:11<9:49:51, 5.44s/it] {'loss': 0.4736, 'grad_norm': 0.2815561171450684, 'learning_rate': 2.1101456436097744e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 364, but got module 1
71%|███████ | 15585/22095 [26:37:15<9:02:32, 5.00s/it] {'loss': 0.3111, 'grad_norm': 0.6002777685049623, 'learning_rate': 2.109547569041878e-06, 'epoch': 0.71}
71%|███████ | 15586/22095 [26:37:19<8:29:41, 4.70s/it] {'loss': 0.2802, 'grad_norm': 0.6074427839267478, 'learning_rate': 2.1089495565809274e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15587/22095 [26:37:28<10:41:57, 5.92s/it] {'loss': 0.4811, 'grad_norm': 0.2618171024604309, 'learning_rate': 2.10835160623977e-06, 'epoch': 0.71}
71%|███████ | 15588/22095 [26:37:32<9:17:52, 5.14s/it] {'loss': 0.2957, 'grad_norm': 0.6403282477430244, 'learning_rate': 2.1077537180312568e-06, 'epoch': 0.71}
71%|███████ | 15589/22095 [26:37:34<8:02:37, 4.45s/it] {'loss': 0.2813, 'grad_norm': 0.6931161483217249, 'learning_rate': 2.107155891968232e-06, 'epoch': 0.71}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15590/22095 [26:37:39<7:56:59, 4.40s/it] {'loss': 0.2764, 'grad_norm': 0.5854855835134954, 'learning_rate': 2.106558128063544e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (70998 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42486 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15591/22095 [26:37:42<7:12:47, 3.99s/it] {'loss': 0.2293, 'grad_norm': 0.9400059804734506, 'learning_rate': 2.1059604263300354e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15592/22095 [26:37:50<9:27:33, 5.24s/it] {'loss': 0.448, 'grad_norm': 0.31758819314786835, 'learning_rate': 2.105362786780547e-06, 'epoch': 0.71}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045963 in VC:s3://multi-modal/UniGeo/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, point C lies on segment AB with AC = 12 cm and CB = \\frac{2}{3}AC; D and E are the midpoints of AC and AB respectively. What is the length of DE?\nA. 2\nB. 3\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
71%|███████ | 15593/22095 [26:37:54<8:53:59, 4.93s/it] {'loss': 0.2823, 'grad_norm': 0.59255163939248, 'learning_rate': 2.104765209427922e-06, 'epoch': 0.71}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15594/22095 [26:37:57<8:01:42, 4.45s/it] {'loss': 0.2659, 'grad_norm': 0.6202984057488541, 'learning_rate': 2.1041676942850025e-06, 'epoch': 0.71}
71%|███████ | 15595/22095 [26:38:01<7:39:09, 4.24s/it] {'loss': 0.3219, 'grad_norm': 0.6665940338013063, 'learning_rate': 2.1035702413646257e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15596/22095 [26:38:09<9:21:13, 5.18s/it] {'loss': 0.462, 'grad_norm': 0.2709211805243884, 'learning_rate': 2.1029728506796266e-06, 'epoch': 0.71}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8306275 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1qDCWXjnD8KJjSspbXXbbEXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text does this image contain?'}, {'from': 'gpt', 'value': 'All words in the image:\nSpecializing in all kinds of phone accessories\n6AUO\n6AUO Haojiangyuan Communications\nR9 charging-port connector\nOPPO\nWholesale at the physical store\nBulk discounts'}]}
71%|███████ | 15597/22095 [26:38:12<8:31:24, 4.72s/it] {'loss': 0.3088, 'grad_norm': 0.6551666991759286, 'learning_rate': 2.1023755222428434e-06, 'epoch': 0.71}
71%|███████ | 15598/22095 [26:38:15<7:38:25, 4.23s/it] {'loss': 0.3216, 'grad_norm': 0.6156762722283183, 'learning_rate': 2.1017782560671124e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15599/22095 [26:38:23<9:18:47, 5.16s/it] {'loss': 0.4734, 'grad_norm': 0.4609435977100646, 'learning_rate': 2.101181052165266e-06, 'epoch': 0.71}
71%|███████ | 15600/22095 [26:38:27<8:38:52, 4.79s/it] {'loss': 0.3084, 'grad_norm': 0.6868399069144914, 'learning_rate': 2.1005839105501336e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (48963 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62184 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42650 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47390 > 40960).
Running this sequence through the model will result in indexing errors
71%|███████ | 15601/22095 [26:38:29<7:34:25, 4.20s/it] {'loss': 0.3307, 'grad_norm': 0.6540307934496004, 'learning_rate': 2.09998683123455e-06, 'epoch': 0.71}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15602/22095 [26:38:32<6:47:16, 3.76s/it] {'loss': 0.2532, 'grad_norm': 0.5804650262670074, 'learning_rate': 2.0993898142313428e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (84418 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41162 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15603/22095 [26:38:36<6:40:28, 3.70s/it] {'loss': 0.3121, 'grad_norm': 0.7045137113564987, 'learning_rate': 2.098792859553338e-06, 'epoch': 0.71}
71%|███████ | 15604/22095 [26:38:38<6:11:24, 3.43s/it] {'loss': 0.2671, 'grad_norm': 0.5985304281397166, 'learning_rate': 2.0981959672133663e-06, 'epoch': 0.71}
71%|███████ | 15605/22095 [26:38:41<5:53:48, 3.27s/it] {'loss': 0.3095, 'grad_norm': 0.6932785443502347, 'learning_rate': 2.0975991372242488e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15606/22095 [26:38:49<8:18:48, 4.61s/it] {'loss': 0.4578, 'grad_norm': 0.25596252597725394, 'learning_rate': 2.097002369598814e-06, 'epoch': 0.71}
71%|███████ | 15607/22095 [26:38:53<7:52:26, 4.37s/it] {'loss': 0.3176, 'grad_norm': 0.6023667287025952, 'learning_rate': 2.096405664349882e-06, 'epoch': 0.71}
71%|███████ | 15608/22095 [26:38:58<8:13:27, 4.56s/it] {'loss': 0.3647, 'grad_norm': 0.6669780970769399, 'learning_rate': 2.095809021490273e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████ | 15609/22095 [26:39:06<9:57:02, 5.52s/it] {'loss': 0.4931, 'grad_norm': 0.27753170005461786, 'learning_rate': 2.0952124410328085e-06, 'epoch': 0.71}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15610/22095 [26:39:09<8:51:59, 4.92s/it] {'loss': 0.294, 'grad_norm': 0.5896931641117975, 'learning_rate': 2.094615922990309e-06, 'epoch': 0.71}
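The paired messages "Number of image tokens 0 does not match number of images 1" / "Fixed image tokens in the conversation" indicate the loader repairs conversations whose text lacks a placeholder for each attached image. A sketch of one plausible repair, assuming an `<image>` placeholder convention as in Qwen-VL-style chat templates; the exact token string and the helper name are assumptions, not taken from the training code:

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; the real template token may differ

def fix_image_tokens(sample, image_token=IMAGE_TOKEN):
    """Prepend missing placeholders so their count matches the attached images."""
    images = sample.get("image", [])
    if isinstance(images, str):  # single-image samples store a bare path
        images = [images]
    first_human = next(t for t in sample["conversations"] if t["from"] == "human")
    missing = len(images) - first_human["value"].count(image_token)
    if missing > 0:
        first_human["value"] = (image_token + "\n") * missing + first_human["value"]
    return sample
```

Applying this during dataset preparation (rather than per-step, as the log shows) would eliminate the runtime repair messages entirely.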
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15611/22095 [26:39:19<11:25:57, 6.35s/it] {'loss': 0.4577, 'grad_norm': 0.28245895056221954, 'learning_rate': 2.0940194673755903e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (81469 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92645 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51063 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15612/22095 [26:39:22<9:48:51, 5.45s/it] {'loss': 0.2881, 'grad_norm': 0.5821755169282244, 'learning_rate': 2.0934230742014666e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (55546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45635 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50749 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15613/22095 [26:39:26<8:45:46, 4.87s/it] {'loss': 0.3145, 'grad_norm': 0.6095114697705112, 'learning_rate': 2.0928267434807537e-06, 'epoch': 0.71}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
71%|███████ | 15614/22095 [26:39:29<7:56:55, 4.42s/it] {'loss': 0.3091, 'grad_norm': 0.5966172519027911, 'learning_rate': 2.0922304752262672e-06, 'epoch': 0.71}
71%|███████ | 15615/22095 [26:39:33<7:39:36, 4.26s/it] {'loss': 0.2981, 'grad_norm': 0.5854903933345119, 'learning_rate': 2.0916342694508177e-06, 'epoch': 0.71}
71%|███████ | 15616/22095 [26:39:37<7:22:33, 4.10s/it] {'loss': 0.303, 'grad_norm': 0.6476592475962757, 'learning_rate': 2.0910381261672136e-06, 'epoch': 0.71}
71%|███████ | 15617/22095 [26:39:40<6:54:07, 3.84s/it] {'loss': 0.3085, 'grad_norm': 0.6178026593378579, 'learning_rate': 2.0904420453882675e-06, 'epoch': 0.71}
71%|███████ | 15618/22095 [26:39:43<6:29:09, 3.60s/it] {'loss': 0.3193, 'grad_norm': 0.6411289753975102, 'learning_rate': 2.089846027126784e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (51195 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79565 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45293 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15619/22095 [26:39:46<6:11:22, 3.44s/it] {'loss': 0.2949, 'grad_norm': 0.5891169438856212, 'learning_rate': 2.089250071395573e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (110431 > 40960). Running this sequence through the model will result in indexing errors
71%|███████ | 15620/22095 [26:39:56<9:37:13, 5.35s/it] {'loss': 0.4793, 'grad_norm': 0.2940271428331134, 'learning_rate': 2.088654178207439e-06, 'epoch': 0.71}
71%|███████ | 15621/22095 [26:40:06<12:08:28, 6.75s/it] {'loss': 0.4955, 'grad_norm': 0.3054544662909863, 'learning_rate': 2.088058347575183e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 364, but got module 1
71%|███████ | 15622/22095 [26:40:09<10:16:08, 5.71s/it] {'loss': 0.3594, 'grad_norm': 0.6651374294296094, 'learning_rate': 2.08746257951161e-06, 'epoch': 0.71}
71%|███████ | 15623/22095 [26:40:13<9:00:09, 5.01s/it] {'loss': 0.3299, 'grad_norm': 0.7210944144731016, 'learning_rate': 2.0868668740295223e-06, 'epoch': 0.71}
71%|███████ | 15624/22095 [26:40:17<8:33:44, 4.76s/it] {'loss': 0.3078, 'grad_norm': 0.5831154354291413, 'learning_rate': 2.086271231141719e-06, 'epoch': 0.71}
71%|███████ | 15625/22095 [26:40:20<7:55:16, 4.41s/it] {'loss': 0.3114, 'grad_norm': 0.623507882297109, 'learning_rate':
2.085675650860996e-06, 'epoch': 0.71}
71%|███████ | 15626/22095 [26:40:24<7:41:02, 4.28s/it] {'loss': 0.3274, 'grad_norm': 0.6069100613245061, 'learning_rate': 2.0850801332001526e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11305195 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 645, 'image': 'taobao/235.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is this?'}, {'from': 'gpt', 'value': 'This is a typical Taobao coupon claiming-and-use interface. The image shows the following groups of information:\n\n1. **Top of the interface**\n - At the top there is a Taobao icon and the words "Fan Benefits Purchase", indicating that this is a promotional page. The background uses fairly vivid pink and purple to attract the user\'s attention.\n\n2. **Left image (coupon claim page)**\n - A large red "¥5" indicates a coupon worth 5 yuan.\n - The coupon is valid from October 6, 2016 to October 11, 2016.\n - Below the coupon is a red button labeled "Claim the coupon now (shop follow required)", meaning the user must follow the shop before claiming it.\n - Below that is a product photo: a trimmable, non-slip, absorbent plush household floor mat. The current price is 8.6 yuan; with the 5-yuan coupon it drops to 3.6 yuan.\n - A share button in the upper right makes it easy to share the promotion.\n\n3. **Right image (claim-success page)**\n - After a successful claim, the page pops up a confirmation that the follow and the coupon claim both succeeded.\n - The ¥5 coupon information is shown in the middle.\n - Below it is a red button labeled "Use now", prompting the user to shop with the coupon immediately.\n - The product photo and price information still appear at the bottom of the page.\n\n**Summary**\nThis interface shows how a user claims and uses a shop coupon on Taobao: the user follows the shop (by tapping the button) to claim the coupon; once claimed, it can be used right away for discounted shopping.'}]}
71%|███████ | 15627/22095 [26:40:32<9:28:11, 5.27s/it] {'loss': 0.463, 'grad_norm': 0.2759847739560776, 'learning_rate': 2.0844846781719865e-06, 'epoch': 0.71}
71%|███████ | 15628/22095 [26:40:35<8:28:38, 4.72s/it] {'loss': 0.2949, 'grad_norm': 0.6771389749984982, 'learning_rate': 2.0838892857892908e-06, 'epoch': 0.71}
71%|███████ | 15629/22095 [26:40:38<7:38:49, 4.26s/it] {'loss': 0.2532, 'grad_norm': 0.5745668071147314, 'learning_rate': 2.0832939560648557e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (56609 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42170 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87033 > 40960).
Running this sequence through the model will result in indexing errors 71%|███████ | 15630/22095 [26:40:42<7:25:59, 4.14s/it] {'loss': 0.3186, 'grad_norm': 0.7284067278697884, 'learning_rate': 2.0826986890114775e-06, 'epoch': 0.71} 71%|███████ | 15630/22095 [26:40:42<7:25:59, 4.14s/it] 71%|███████ | 15631/22095 [26:40:45<6:51:21, 3.82s/it] {'loss': 0.3153, 'grad_norm': 0.6221647222093457, 'learning_rate': 2.082103484641943e-06, 'epoch': 0.71} 71%|███████ | 15631/22095 [26:40:45<6:51:21, 3.82s/it] 71%|███████ | 15632/22095 [26:40:48<6:25:31, 3.58s/it] {'loss': 0.3593, 'grad_norm': 0.6522605474785989, 'learning_rate': 2.0815083429690445e-06, 'epoch': 0.71} 71%|███████ | 15632/22095 [26:40:48<6:25:31, 3.58s/it] 71%|███████ | 15633/22095 [26:40:52<6:31:19, 3.63s/it] {'loss': 0.2701, 'grad_norm': 0.6066986056777576, 'learning_rate': 2.0809132640055685e-06, 'epoch': 0.71} 71%|███████ | 15633/22095 [26:40:52<6:31:19, 3.63s/it] 71%|███████ | 15634/22095 [26:40:56<6:33:41, 3.66s/it] {'loss': 0.2929, 'grad_norm': 0.5604974164989961, 'learning_rate': 2.080318247764299e-06, 'epoch': 0.71} 71%|███████ | 15634/22095 [26:40:56<6:33:41, 3.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74692 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15635/22095 [26:40:59<6:10:11, 3.44s/it] {'loss': 0.2509, 'grad_norm': 0.5917143842030397, 'learning_rate': 2.0797232942580238e-06, 'epoch': 0.71} 71%|███████ | 15635/22095 [26:40:59<6:10:11, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76242 > 40960). 
Running this sequence through the model will result in indexing errors 71%|███████ | 15636/22095 [26:41:03<6:37:09, 3.69s/it] {'loss': 0.3029, 'grad_norm': 0.6229136734813554, 'learning_rate': 2.0791284034995296e-06, 'epoch': 0.71} 71%|███████ | 15636/22095 [26:41:03<6:37:09, 3.69s/it] 71%|███████ | 15637/22095 [26:41:07<6:30:56, 3.63s/it] {'loss': 0.3006, 'grad_norm': 0.734163918066207, 'learning_rate': 2.0785335755015913e-06, 'epoch': 0.71} 71%|███████ | 15637/22095 [26:41:07<6:30:56, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15638/22095 [26:41:16<9:22:38, 5.23s/it] {'loss': 0.4719, 'grad_norm': 0.4989460637768117, 'learning_rate': 2.077938810276994e-06, 'epoch': 0.71} 71%|███████ | 15638/22095 [26:41:16<9:22:38, 5.23s/it] 71%|███████ | 15639/22095 [26:41:19<8:15:37, 4.61s/it] {'loss': 0.3453, 'grad_norm': 0.6273674589339869, 'learning_rate': 2.0773441078385194e-06, 'epoch': 0.71} 71%|███████ | 15639/22095 [26:41:19<8:15:37, 4.61s/it] 71%|███████ | 15640/22095 [26:41:22<7:48:00, 4.35s/it] {'loss': 0.3269, 'grad_norm': 0.6310211789960349, 'learning_rate': 2.076749468198943e-06, 'epoch': 0.71} 71%|███████ | 15640/22095 [26:41:22<7:48:00, 4.35s/it] 71%|███████ | 15641/22095 [26:41:25<7:02:26, 3.93s/it] {'loss': 0.2853, 'grad_norm': 0.6198212516711802, 'learning_rate': 2.076154891371041e-06, 'epoch': 0.71} 71%|███████ | 15641/22095 [26:41:25<7:02:26, 3.93s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8965712 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16547, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,M是AB中点,∴BM=\\frac{1}{2}AB=5cm,又∵NB=2cm,∴MN=BM-BN=5-2=3cm.'}]} 71%|███████ | 15642/22095 [26:41:29<6:56:44, 3.87s/it] {'loss': 0.3448, 'grad_norm': 0.6392808620470593, 'learning_rate': 2.0755603773675905e-06, 'epoch': 0.71} 71%|███████ | 15642/22095 [26:41:29<6:56:44, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (83358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41031 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (53541 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (88281 > 40960). 
Running this sequence through the model will result in indexing errors 71%|███████ | 15643/22095 [26:41:36<8:39:55, 4.84s/it] {'loss': 0.4709, 'grad_norm': 0.278762441922395, 'learning_rate': 2.0749659262013676e-06, 'epoch': 0.71} 71%|███████ | 15643/22095 [26:41:36<8:39:55, 4.84s/it] 71%|███████ | 15644/22095 [26:41:43<9:38:02, 5.38s/it] {'loss': 0.4739, 'grad_norm': 0.2769252676380643, 'learning_rate': 2.074371537885143e-06, 'epoch': 0.71} 71%|███████ | 15644/22095 [26:41:43<9:38:02, 5.38s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15645/22095 [26:41:47<8:58:31, 5.01s/it] {'loss': 0.2756, 'grad_norm': 0.5528948171061779, 'learning_rate': 2.0737772124316872e-06, 'epoch': 0.71} 71%|███████ | 15645/22095 [26:41:47<8:58:31, 5.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15646/22095 [26:41:51<8:33:17, 4.78s/it] {'loss': 0.2993, 'grad_norm': 0.6144812745521786, 'learning_rate': 2.0731829498537743e-06, 'epoch': 0.71} 71%|███████ | 15646/22095 [26:41:51<8:33:17, 4.78s/it] 71%|███████ | 15647/22095 [26:41:54<7:43:47, 4.32s/it] {'loss': 0.3328, 'grad_norm': 0.6165657421144701, 'learning_rate': 2.072588750164168e-06, 'epoch': 0.71} 71%|███████ | 15647/22095 [26:41:54<7:43:47, 4.32s/it] 71%|███████ | 15648/22095 [26:41:58<7:24:36, 4.14s/it] {'loss': 0.3008, 'grad_norm': 0.617449276444202, 'learning_rate': 2.071994613375641e-06, 'epoch': 0.71} 71%|███████ | 15648/22095 [26:41:58<7:24:36, 4.14s/it] 71%|███████ | 15649/22095 [26:42:02<7:23:05, 4.12s/it] {'loss': 0.3041, 'grad_norm': 0.6203225191777019, 'learning_rate': 2.0714005395009566e-06, 'epoch': 0.71} 71%|███████ | 15649/22095 [26:42:02<7:23:05, 4.12s/it] 71%|███████ | 15650/22095 [26:42:07<7:26:13, 4.15s/it] {'loss': 0.3275, 'grad_norm': 0.671178645171008, 'learning_rate': 
2.0708065285528784e-06, 'epoch': 0.71} 71%|███████ | 15650/22095 [26:42:07<7:26:13, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15651/22095 [26:42:16<10:15:13, 5.73s/it] {'loss': 0.4803, 'grad_norm': 0.29431578101845934, 'learning_rate': 2.070212580544172e-06, 'epoch': 0.71} 71%|███████ | 15651/22095 [26:42:16<10:15:13, 5.73s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8952442 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3277, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 10\nB. 12\nC. 16\nD. 
9\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 71%|███████ | 15652/22095 [26:42:19<8:58:29, 5.01s/it] {'loss': 0.2571, 'grad_norm': 0.666418692678892, 'learning_rate': 2.0696186954876002e-06, 'epoch': 0.71} 71%|███████ | 15652/22095 [26:42:19<8:58:29, 5.01s/it] 71%|███████ | 15653/22095 [26:42:23<8:12:29, 4.59s/it] {'loss': 0.2972, 'grad_norm': 0.6247469527950302, 'learning_rate': 2.0690248733959235e-06, 'epoch': 0.71} 71%|███████ | 15653/22095 [26:42:23<8:12:29, 4.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047783 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 8\nB. 10\nC. 12\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 71%|███████ | 15654/22095 [26:42:27<7:56:26, 4.44s/it] {'loss': 0.3069, 'grad_norm': 0.623897382602735, 'learning_rate': 2.068431114281898e-06, 'epoch': 0.71} 71%|███████ | 15654/22095 [26:42:27<7:56:26, 4.44s/it] 71%|███████ | 15655/22095 [26:42:31<7:41:19, 4.30s/it] {'loss': 0.2809, 'grad_norm': 0.5754801717126035, 'learning_rate': 2.0678374181582845e-06, 'epoch': 0.71} 71%|███████ | 15655/22095 [26:42:31<7:41:19, 4.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8914365 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 37518, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nA. 2\nB. 4\nC. 8\nD. 
16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15656/22095 [26:42:34<7:04:16, 3.95s/it] {'loss': 0.2959, 'grad_norm': 0.6202131802305537, 'learning_rate': 2.0672437850378414e-06, 'epoch': 0.71} 71%|███████ | 15656/22095 [26:42:34<7:04:16, 3.95s/it] 71%|███████ | 15657/22095 [26:42:37<6:42:38, 3.75s/it] {'loss': 0.3044, 'grad_norm': 0.6962650105072428, 'learning_rate': 2.0666502149333215e-06, 'epoch': 0.71} 71%|███████ | 15657/22095 [26:42:37<6:42:38, 3.75s/it] 71%|███████ | 15658/22095 [26:42:41<6:38:21, 3.71s/it] {'loss': 0.2375, 'grad_norm': 0.5738380545478389, 'learning_rate': 2.066056707857478e-06, 'epoch': 0.71} 71%|███████ | 15658/22095 [26:42:41<6:38:21, 3.71s/it] 71%|███████ | 15659/22095 [26:42:44<6:30:06, 3.64s/it] {'loss': 0.3182, 'grad_norm': 0.6153974552831404, 'learning_rate': 2.0654632638230664e-06, 'epoch': 0.71} 71%|███████ | 15659/22095 [26:42:44<6:30:06, 3.64s/it] 71%|███████ | 15660/22095 [26:42:47<6:11:36, 3.46s/it] {'loss': 0.3539, 'grad_norm': 0.583783207426821, 'learning_rate': 2.064869882842835e-06, 'epoch': 0.71} 71%|███████ | 15660/22095 [26:42:47<6:11:36, 3.46s/it]VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31409.png 2025-08-28 18:40:46.267584 load time: 1167.12 ms 71%|███████ | 15661/22095 [26:42:51<6:13:42, 3.49s/it] {'loss': 0.3274, 'grad_norm': 0.5867962921817134, 'learning_rate': 2.064276564929537e-06, 'epoch': 0.71} 71%|███████ | 15661/22095 [26:42:51<6:13:42, 3.49s/it] 71%|███████ | 15662/22095 [26:42:54<5:53:54, 3.30s/it] {'loss': 0.2961, 'grad_norm': 0.6622066610058321, 'learning_rate': 2.0636833100959198e-06, 'epoch': 0.71} 71%|███████ | 15662/22095 [26:42:54<5:53:54, 3.30s/it] 71%|███████ | 15663/22095 [26:42:58<6:12:46, 3.48s/it] {'loss': 0.3433, 'grad_norm': 0.68034769054094, 'learning_rate': 
2.0630901183547274e-06, 'epoch': 0.71} 71%|███████ | 15663/22095 [26:42:58<6:12:46, 3.48s/it] 71%|███████ | 15664/22095 [26:43:01<5:56:52, 3.33s/it] {'loss': 0.2824, 'grad_norm': 0.6299499173817064, 'learning_rate': 2.0624969897187084e-06, 'epoch': 0.71} 71%|███████ | 15664/22095 [26:43:01<5:56:52, 3.33s/it] 71%|███████ | 15665/22095 [26:43:04<5:56:12, 3.32s/it] {'loss': 0.2903, 'grad_norm': 0.6134710115056257, 'learning_rate': 2.0619039242006117e-06, 'epoch': 0.71} 71%|███████ | 15665/22095 [26:43:04<5:56:12, 3.32s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8576894 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 21249, 'image': '789401851.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Gay & Lesbian? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 71%|███████ | 15666/22095 [26:43:07<5:43:58, 3.21s/it] {'loss': 0.2736, 'grad_norm': 1.1167269850382753, 'learning_rate': 2.0613109218131717e-06, 'epoch': 0.71} 71%|███████ | 15666/22095 [26:43:07<5:43:58, 3.21s/it] 71%|███████ | 15667/22095 [26:43:10<5:43:30, 3.21s/it] {'loss': 0.2655, 'grad_norm': 0.5868181423205776, 'learning_rate': 2.0607179825691344e-06, 'epoch': 0.71} 71%|███████ | 15667/22095 [26:43:10<5:43:30, 3.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15668/22095 [26:43:14<6:08:03, 3.44s/it] {'loss': 0.2824, 'grad_norm': 0.6804760253555822, 'learning_rate': 2.0601251064812407e-06, 'epoch': 0.71} 71%|███████ | 15668/22095 [26:43:14<6:08:03, 3.44s/it] 71%|███████ | 15669/22095 [26:43:18<6:05:11, 3.41s/it] {'loss': 0.3331, 'grad_norm': 0.6415825775100403, 'learning_rate': 2.0595322935622326e-06, 'epoch': 0.71} 71%|███████ | 15669/22095 [26:43:18<6:05:11, 3.41s/it] 71%|███████ | 15670/22095 [26:43:22<6:32:54, 3.67s/it] {'loss': 0.2952, 'grad_norm': 0.9332292806580279, 'learning_rate': 2.058939543824841e-06, 'epoch': 0.71} 71%|███████ | 15670/22095 [26:43:22<6:32:54, 3.67s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8959592 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10427, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC=2MC,BC=2CN,由线段的和差得AC-BC=2MC-2NC=2(MC-NC)=2×2=4cm,'}]} 71%|███████ | 15671/22095 [26:43:25<6:30:23, 3.65s/it] {'loss': 0.3002, 'grad_norm': 0.6043048873822964, 'learning_rate': 2.058346857281806e-06, 'epoch': 0.71} 71%|███████ | 15671/22095 [26:43:25<6:30:23, 3.65s/it] 71%|███████ | 15672/22095 [26:43:30<6:48:48, 3.82s/it] {'loss': 0.3137, 'grad_norm': 0.6590343921393856, 'learning_rate': 2.0577542339458647e-06, 'epoch': 0.71} 71%|███████ | 15672/22095 [26:43:30<6:48:48, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15673/22095 [26:43:38<9:18:20, 5.22s/it] {'loss': 0.4802, 'grad_norm': 0.29385629964947496, 'learning_rate': 2.0571616738297473e-06, 'epoch': 0.71} 71%|███████ | 15673/22095 [26:43:38<9:18:20, 5.22s/it] 71%|███████ | 15674/22095 [26:43:42<8:41:58, 4.88s/it] {'loss': 0.3158, 'grad_norm': 0.6251899858791947, 'learning_rate': 2.0565691769461865e-06, 'epoch': 0.71} 71%|███████ | 15674/22095 [26:43:42<8:41:58, 4.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47387 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15675/22095 [26:43:46<8:06:48, 4.55s/it] {'loss': 0.3026, 'grad_norm': 0.5689479115374646, 'learning_rate': 2.0559767433079154e-06, 'epoch': 0.71} 71%|███████ | 15675/22095 [26:43:46<8:06:48, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (133709 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50906 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52975 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15676/22095 [26:43:50<8:01:06, 4.50s/it] {'loss': 0.2944, 'grad_norm': 0.5731850317292073, 'learning_rate': 2.0553843729276606e-06, 'epoch': 0.71} 71%|███████ | 15676/22095 [26:43:50<8:01:06, 4.50s/it] 71%|███████ | 15677/22095 [26:43:53<7:13:17, 4.05s/it] {'loss': 0.309, 'grad_norm': 0.6013672663083953, 'learning_rate': 2.0547920658181535e-06, 'epoch': 0.71} 71%|███████ | 15677/22095 [26:43:53<7:13:17, 4.05s/it] 71%|███████ | 15678/22095 [26:43:56<6:41:17, 3.75s/it] {'loss': 0.2835, 'grad_norm': 0.5966705722753385, 'learning_rate': 2.0541998219921194e-06, 'epoch': 0.71} 71%|███████ | 15678/22095 [26:43:56<6:41:17, 3.75s/it] 71%|███████ | 15679/22095 [26:44:00<6:22:35, 3.58s/it] {'loss': 0.2486, 'grad_norm': 0.6065622374492098, 'learning_rate': 2.0536076414622824e-06, 'epoch': 0.71} 71%|███████ | 15679/22095 [26:44:00<6:22:35, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121827 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80657 > 40960). 
Running this sequence through the model will result in indexing errors 71%|███████ | 15680/22095 [26:44:02<5:57:37, 3.34s/it] {'loss': 0.3338, 'grad_norm': 0.6093719632820457, 'learning_rate': 2.0530155242413676e-06, 'epoch': 0.71} 71%|███████ | 15680/22095 [26:44:02<5:57:37, 3.34s/it] 71%|███████ | 15681/22095 [26:44:06<6:06:15, 3.43s/it] {'loss': 0.2503, 'grad_norm': 0.7299538787359219, 'learning_rate': 2.0524234703421003e-06, 'epoch': 0.71} 71%|███████ | 15681/22095 [26:44:06<6:06:15, 3.43s/it] 71%|███████ | 15682/22095 [26:44:10<6:17:15, 3.53s/it] {'loss': 0.3176, 'grad_norm': 0.6195740147006733, 'learning_rate': 2.0518314797771993e-06, 'epoch': 0.71} 71%|███████ | 15682/22095 [26:44:10<6:17:15, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15683/22095 [26:44:19<9:23:24, 5.27s/it] {'loss': 0.4632, 'grad_norm': 0.27778375446200076, 'learning_rate': 2.0512395525593842e-06, 'epoch': 0.71} 71%|███████ | 15683/22095 [26:44:19<9:23:24, 5.27s/it] 71%|███████ | 15684/22095 [26:44:27<10:50:36, 6.09s/it] {'loss': 0.4892, 'grad_norm': 0.31267117915988685, 'learning_rate': 2.050647688701374e-06, 'epoch': 0.71} 71%|███████ | 15684/22095 [26:44:27<10:50:36, 6.09s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 71%|███████ | 15685/22095 [26:44:31<9:34:43, 5.38s/it] {'loss': 0.3681, 'grad_norm': 0.5625030089547218, 'learning_rate': 2.050055888215889e-06, 'epoch': 0.71} 71%|███████ | 15685/22095 [26:44:31<9:34:43, 5.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = 
sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8377583 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44366, 'image': 'vrdu_table_final_2/astro-ph.CO/c04438c6-f9eb-4b55-b345-e91ffff34201.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l} #1 \\end{tabular}\n```"}]} 71%|███████ | 15686/22095 [26:44:35<8:50:11, 4.96s/it] {'loss': 0.2921, 'grad_norm': 0.5762043814346973, 'learning_rate': 2.0494641511156426e-06, 'epoch': 0.71} 71%|███████ | 15686/22095 [26:44:35<8:50:11, 4.96s/it] 71%|███████ | 15687/22095 [26:44:38<7:45:18, 4.36s/it] {'loss': 0.3002, 'grad_norm': 0.677688149072792, 'learning_rate': 2.048872477413348e-06, 'epoch': 0.71} 71%|███████ | 15687/22095 [26:44:38<7:45:18, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15688/22095 [26:44:46<9:54:37, 5.57s/it] {'loss': 0.4714, 'grad_norm': 0.2675888295042279, 'learning_rate': 2.048280867121722e-06, 'epoch': 0.71} 71%|███████ | 15688/22095 [26:44:46<9:54:37, 5.57s/it] 71%|███████ | 15689/22095 [26:44:49<8:41:14, 4.88s/it] {'loss': 0.2863, 'grad_norm': 0.610043033065155, 'learning_rate': 2.0476893202534726e-06, 'epoch': 0.71} 71%|███████ | 15689/22095 [26:44:49<8:41:14, 4.88s/it] 71%|███████ | 15690/22095 [26:44:53<7:47:19, 4.38s/it] {'loss': 0.2699, 'grad_norm': 0.6019687341898033, 'learning_rate': 2.0470978368213145e-06, 'epoch': 0.71} 71%|███████ | 15690/22095 [26:44:53<7:47:19, 4.38s/it] 71%|███████ | 15691/22095 [26:44:56<7:08:45, 4.02s/it] {'loss': 0.3047, 'grad_norm': 0.6557212029775732, 'learning_rate': 
2.0465064168379547e-06, 'epoch': 0.71} 71%|███████ | 15691/22095 [26:44:56<7:08:45, 4.02s/it] 71%|███████ | 15692/22095 [26:44:59<6:55:24, 3.89s/it] {'loss': 0.3123, 'grad_norm': 0.5681907011018104, 'learning_rate': 2.0459150603160993e-06, 'epoch': 0.71} 71%|███████ | 15692/22095 [26:44:59<6:55:24, 3.89s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59216 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82840 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15693/22095 [26:45:03<6:37:42, 3.73s/it] {'loss': 0.3232, 'grad_norm': 0.6937559347912634, 'learning_rate': 2.045323767268456e-06, 'epoch': 0.71} 71%|███████ | 15693/22095 [26:45:03<6:37:42, 3.73s/it] 71%|███████ | 15694/22095 [26:45:06<6:18:48, 3.55s/it] {'loss': 0.3281, 'grad_norm': 0.7626075879462596, 'learning_rate': 2.0447325377077344e-06, 'epoch': 0.71} 71%|███████ | 15694/22095 [26:45:06<6:18:48, 3.55s/it] 71%|███████ | 15695/22095 [26:45:09<6:08:50, 3.46s/it] {'loss': 0.3237, 'grad_norm': 0.6171676917923269, 'learning_rate': 2.0441413716466308e-06, 'epoch': 0.71} 71%|███████ | 15695/22095 [26:45:09<6:08:50, 3.46s/it] 71%|███████ | 15696/22095 [26:45:13<6:36:37, 3.72s/it] {'loss': 0.2728, 'grad_norm': 0.5615951852382349, 'learning_rate': 2.0435502690978502e-06, 'epoch': 0.71} 71%|███████ | 15696/22095 [26:45:13<6:36:37, 3.72s/it] 71%|███████ | 15697/22095 [26:45:16<6:07:36, 3.45s/it] {'loss': 0.272, 'grad_norm': 0.6073481684244535, 'learning_rate': 2.0429592300740945e-06, 'epoch': 0.71} 71%|███████ | 15697/22095 [26:45:16<6:07:36, 3.45s/it] 71%|███████ | 15698/22095 [26:45:20<6:18:52, 3.55s/it] {'loss': 0.2885, 'grad_norm': 0.5990557902507999, 'learning_rate': 2.042368254588067e-06, 'epoch': 0.71} 71%|███████ | 15698/22095 [26:45:20<6:18:52, 3.55s/it] 71%|███████ | 15699/22095 
[26:45:24<6:35:00, 3.71s/it] {'loss': 0.2918, 'grad_norm': 0.5806910031766905, 'learning_rate': 2.0417773426524583e-06, 'epoch': 0.71} 71%|███████ | 15699/22095 [26:45:24<6:35:00, 3.71s/it] 71%|███████ | 15700/22095 [26:45:27<6:19:02, 3.56s/it] {'loss': 0.3183, 'grad_norm': 0.6167601568081476, 'learning_rate': 2.0411864942799685e-06, 'epoch': 0.71} 71%|███████ | 15700/22095 [26:45:27<6:19:02, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15701/22095 [26:45:36<8:56:22, 5.03s/it] {'loss': 0.4806, 'grad_norm': 0.2933210594741521, 'learning_rate': 2.0405957094832962e-06, 'epoch': 0.71} 71%|███████ | 15701/22095 [26:45:36<8:56:22, 5.03s/it] 71%|███████ | 15702/22095 [26:45:39<8:01:17, 4.52s/it] {'loss': 0.2852, 'grad_norm': 0.6009201878774669, 'learning_rate': 2.0400049882751327e-06, 'epoch': 0.71} 71%|███████ | 15702/22095 [26:45:39<8:01:17, 4.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15703/22095 [26:45:43<7:26:36, 4.19s/it] {'loss': 0.3211, 'grad_norm': 0.5691961115059543, 'learning_rate': 2.0394143306681692e-06, 'epoch': 0.71} 71%|███████ | 15703/22095 [26:45:43<7:26:36, 4.19s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15704/22095 [26:45:45<6:43:37, 3.79s/it] {'loss': 0.313, 'grad_norm': 0.6433385542199535, 'learning_rate': 2.0388237366751005e-06, 'epoch': 0.71} 71%|███████ | 15704/22095 [26:45:45<6:43:37, 3.79s/it] 71%|███████ | 15705/22095 [26:45:49<6:42:19, 3.78s/it] {'loss': 0.312, 'grad_norm': 0.6882102875117379, 'learning_rate': 2.038233206308614e-06, 'epoch': 0.71} 71%|███████ | 15705/22095 [26:45:49<6:42:19, 3.78s/it] 71%|███████ | 15706/22095 [26:45:52<6:16:13, 3.53s/it] {'loss': 0.2823, 'grad_norm': 0.6308695404554812, 'learning_rate': 2.037642739581401e-06, 'epoch': 0.71} 71%|███████ | 15706/22095 [26:45:52<6:16:13, 3.53s/it] 
71%|███████ | 15707/22095 [26:45:55<5:51:41, 3.30s/it] {'loss': 0.2824, 'grad_norm': 0.6075481401338595, 'learning_rate': 2.0370523365061473e-06, 'epoch': 0.71} 71%|███████ | 15707/22095 [26:45:55<5:51:41, 3.30s/it] 71%|███████ | 15708/22095 [26:45:59<6:05:35, 3.43s/it] {'loss': 0.3136, 'grad_norm': 0.6553470517522016, 'learning_rate': 2.0364619970955373e-06, 'epoch': 0.71} 71%|███████ | 15708/22095 [26:45:59<6:05:35, 3.43s/it] 71%|███████ | 15709/22095 [26:46:02<6:00:30, 3.39s/it] {'loss': 0.3161, 'grad_norm': 0.6008238589049834, 'learning_rate': 2.035871721362257e-06, 'epoch': 0.71} 71%|███████ | 15709/22095 [26:46:02<6:00:30, 3.39s/it] 71%|███████ | 15710/22095 [26:46:05<5:41:52, 3.21s/it] {'loss': 0.2807, 'grad_norm': 0.6521871065838722, 'learning_rate': 2.0352815093189913e-06, 'epoch': 0.71} 71%|███████ | 15710/22095 [26:46:05<5:41:52, 3.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68952 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76498 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79863 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97424 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114208 > 40960). 
Running this sequence through the model will result in indexing errors 71%|███████ | 15711/22095 [26:46:08<5:35:33, 3.15s/it] {'loss': 0.2905, 'grad_norm': 1.0025184314929678, 'learning_rate': 2.0346913609784215e-06, 'epoch': 0.71} 71%|███████ | 15711/22095 [26:46:08<5:35:33, 3.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15712/22095 [26:46:11<5:35:08, 3.15s/it] {'loss': 0.3228, 'grad_norm': 0.7522931147926903, 'learning_rate': 2.0341012763532243e-06, 'epoch': 0.71} 71%|███████ | 15712/22095 [26:46:11<5:35:08, 3.15s/it] 71%|███████ | 15713/22095 [26:46:14<5:30:44, 3.11s/it] {'loss': 0.3222, 'grad_norm': 0.6231508704120658, 'learning_rate': 2.033511255456082e-06, 'epoch': 0.71} 71%|███████ | 15713/22095 [26:46:14<5:30:44, 3.11s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8960769 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11604, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 
2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 71%|███████ | 15714/22095 [26:46:17<5:18:55, 3.00s/it] {'loss': 0.2979, 'grad_norm': 0.6309365311711861, 'learning_rate': 2.032921298299674e-06, 'epoch': 0.71} 71%|███████ | 15714/22095 [26:46:17<5:18:55, 3.00s/it] 71%|███████ | 15715/22095 [26:46:20<5:18:54, 3.00s/it] {'loss': 0.2789, 'grad_norm': 0.624392999154907, 'learning_rate': 2.0323314048966737e-06, 'epoch': 0.71} 71%|███████ | 15715/22095 [26:46:20<5:18:54, 3.00s/it] 71%|███████ | 15716/22095 [26:46:24<5:55:06, 3.34s/it] {'loss': 0.3034, 'grad_norm': 0.579559880412466, 'learning_rate': 2.031741575259756e-06, 'epoch': 0.71} 71%|███████ | 15716/22095 [26:46:24<5:55:06, 3.34s/it] 71%|███████ | 15717/22095 [26:46:27<6:01:20, 3.40s/it] {'loss': 0.3132, 'grad_norm': 0.5873497811127301, 'learning_rate': 2.031151809401597e-06, 'epoch': 0.71} 71%|███████ | 15717/22095 [26:46:27<6:01:20, 3.40s/it] 71%|███████ | 15718/22095 [26:46:31<6:19:33, 3.57s/it] {'loss': 0.3197, 'grad_norm': 0.6213809208434222, 'learning_rate': 2.030562107334866e-06, 'epoch': 0.71} 71%|███████ | 15718/22095 [26:46:31<6:19:33, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8344077 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 10729, 'image': 'vrdu_table_final_2/astro-ph.CO/f64e855e-754d-4d26-aee4-d55d1154a9f1.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15719/22095 [26:46:41<9:28:34, 5.35s/it] {'loss': 0.4771, 'grad_norm': 0.2944410583813787, 'learning_rate': 2.0299724690722367e-06, 'epoch': 0.71} 71%|███████ | 15719/22095 [26:46:41<9:28:34, 5.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15720/22095 [26:46:44<8:16:28, 4.67s/it] {'loss': 0.2686, 'grad_norm': 0.6200291281083014, 'learning_rate': 2.029382894626378e-06, 'epoch': 0.71} 71%|███████ | 15720/22095 [26:46:44<8:16:28, 4.67s/it] 71%|███████ | 15721/22095 [26:46:47<7:22:19, 4.16s/it] {'loss': 0.2802, 'grad_norm': 0.6471888141535328, 'learning_rate': 2.028793384009955e-06, 'epoch': 0.71} 71%|███████ | 15721/22095 [26:46:47<7:22:19, 4.16s/it] 71%|███████ | 15722/22095 [26:46:51<7:08:06, 4.03s/it] {'loss': 0.3292, 'grad_norm': 0.5992679305582995, 'learning_rate': 2.028203937235637e-06, 'epoch': 0.71} 71%|███████ | 15722/22095 [26:46:51<7:08:06, 4.03s/it] 71%|███████ | 15723/22095 [26:46:54<6:35:18, 3.72s/it] {'loss': 0.3402, 'grad_norm': 0.6208222626265416, 'learning_rate': 2.0276145543160923e-06, 'epoch': 0.71} 71%|███████ | 15723/22095 [26:46:54<6:35:18, 3.72s/it] 71%|███████ | 15724/22095 [26:46:58<6:43:58, 3.80s/it] {'loss': 0.2572, 'grad_norm': 0.5213881130170567, 'learning_rate': 2.027025235263979e-06, 'epoch': 0.71} 71%|███████ | 15724/22095 [26:46:58<6:43:58, 3.80s/it] 71%|███████ | 15725/22095 [26:47:02<6:59:01, 
3.95s/it] {'loss': 0.3361, 'grad_norm': 0.6372870238486473, 'learning_rate': 2.0264359800919626e-06, 'epoch': 0.71} 71%|███████ | 15725/22095 [26:47:02<6:59:01, 3.95s/it] 71%|███████ | 15726/22095 [26:47:05<6:46:03, 3.83s/it] {'loss': 0.289, 'grad_norm': 0.5644816426982812, 'learning_rate': 2.0258467888127036e-06, 'epoch': 0.71} 71%|███████ | 15726/22095 [26:47:05<6:46:03, 3.83s/it] 71%|███████ | 15727/22095 [26:47:09<6:31:57, 3.69s/it] {'loss': 0.3023, 'grad_norm': 0.6133526414307099, 'learning_rate': 2.0252576614388668e-06, 'epoch': 0.71} 71%|███████ | 15727/22095 [26:47:09<6:31:57, 3.69s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8940294 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63447, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C被称为AB段的顶点,AC=12cm,CB=\\ frac{2}{3}AC,D和E分别是AC和AB的中点,则的长度为()\nA. 6\nB. 2\nC. 3\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 71%|███████ | 15728/22095 [26:47:12<6:32:20, 3.70s/it] {'loss': 0.3226, 'grad_norm': 0.635948868871442, 'learning_rate': 2.024668597983103e-06, 'epoch': 0.71} 71%|███████ | 15728/22095 [26:47:13<6:32:20, 3.70s/it] 71%|███████ | 15729/22095 [26:47:15<6:07:42, 3.47s/it] {'loss': 0.315, 'grad_norm': 0.6566520390761876, 'learning_rate': 2.0240795984580734e-06, 'epoch': 0.71} 71%|███████ | 15729/22095 [26:47:15<6:07:42, 3.47s/it] 71%|███████ | 15730/22095 [26:47:19<6:18:09, 3.56s/it] {'loss': 0.2813, 'grad_norm': 0.6218129037138104, 'learning_rate': 2.023490662876435e-06, 'epoch': 0.71} 71%|███████ | 15730/22095 [26:47:19<6:18:09, 3.56s/it] 71%|███████ | 15731/22095 [26:47:23<6:30:48, 3.68s/it] {'loss': 0.3203, 'grad_norm': 0.6490607702973659, 'learning_rate': 2.0229017912508403e-06, 'epoch': 0.71} 71%|███████ | 15731/22095 [26:47:23<6:30:48, 3.68s/it] 71%|███████ | 15732/22095 [26:47:27<6:30:34, 3.68s/it] {'loss': 0.3047, 'grad_norm': 0.6511603840546198, 'learning_rate': 2.022312983593941e-06, 'epoch': 0.71} 71%|███████ | 15732/22095 [26:47:27<6:30:34, 3.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65228 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43717 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48546 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55506 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58045 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61390 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60552 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15733/22095 [26:47:30<6:03:40, 3.43s/it] {'loss': 0.2785, 'grad_norm': 0.6033531094960397, 'learning_rate': 2.021724239918392e-06, 'epoch': 0.71} 71%|███████ | 15733/22095 [26:47:30<6:03:40, 3.43s/it] 71%|███████ | 15734/22095 [26:47:33<5:48:02, 3.28s/it] {'loss': 0.3141, 'grad_norm': 0.60674720041964, 'learning_rate': 2.0211355602368404e-06, 'epoch': 0.71} 71%|███████ | 15734/22095 [26:47:33<5:48:02, 3.28s/it] 71%|███████ | 15735/22095 [26:47:36<5:55:08, 3.35s/it] {'loss': 0.3585, 'grad_norm': 0.9254902853327204, 'learning_rate': 2.0205469445619386e-06, 'epoch': 0.71} 71%|███████ | 15735/22095 [26:47:36<5:55:08, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████ | 15736/22095 [26:47:40<6:02:45, 3.42s/it] {'loss': 0.3517, 'grad_norm': 0.5968363646845488, 'learning_rate': 2.019958392906332e-06, 'epoch': 0.71} 71%|███████ | 15736/22095 [26:47:40<6:02:45, 3.42s/it] 71%|███████ | 15737/22095 [26:47:43<6:08:39, 3.48s/it] {'loss': 0.3078, 'grad_norm': 0.6486568418701398, 'learning_rate': 2.0193699052826656e-06, 'epoch': 0.71} 71%|███████ | 15737/22095 [26:47:43<6:08:39, 3.48s/it] 71%|███████ | 15738/22095 [26:47:46<5:57:08, 3.37s/it] {'loss': 0.2922, 'grad_norm': 0.6678475248324102, 'learning_rate': 2.0187814817035855e-06, 'epoch': 0.71} 71%|███████ | 
15738/22095 [26:47:46<5:57:08, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (127373 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15739/22095 [26:47:49<5:39:04, 3.20s/it] {'loss': 0.2679, 'grad_norm': 0.7596394560633785, 'learning_rate': 2.018193122181737e-06, 'epoch': 0.71} 71%|███████ | 15739/22095 [26:47:49<5:39:04, 3.20s/it] 71%|███████ | 15740/22095 [26:47:53<5:56:15, 3.36s/it] {'loss': 0.3269, 'grad_norm': 0.5864552075821262, 'learning_rate': 2.0176048267297603e-06, 'epoch': 0.71} 71%|███████ | 15740/22095 [26:47:53<5:56:15, 3.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████ | 15741/22095 [26:48:03<9:17:23, 5.26s/it] {'loss': 0.4661, 'grad_norm': 1.2647697192982412, 'learning_rate': 2.0170165953602944e-06, 'epoch': 0.71} 71%|███████ | 15741/22095 [26:48:03<9:17:23, 5.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52483 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (137211 > 40960). Running this sequence through the model will result in indexing errors 71%|███████ | 15742/22095 [26:48:06<8:11:13, 4.64s/it] {'loss': 0.3029, 'grad_norm': 0.6058089185740559, 'learning_rate': 2.0164284280859803e-06, 'epoch': 0.71} 71%|███████ | 15742/22095 [26:48:06<8:11:13, 4.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43977 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57162 > 40960). 
Running this sequence through the model will result in indexing errors 71%|███████▏ | 15743/22095 [26:48:09<7:32:39, 4.28s/it] {'loss': 0.3031, 'grad_norm': 0.6800456990515935, 'learning_rate': 2.015840324919458e-06, 'epoch': 0.71} 71%|███████▏ | 15743/22095 [26:48:09<7:32:39, 4.28s/it] 71%|███████▏ | 15744/22095 [26:48:13<7:01:24, 3.98s/it] {'loss': 0.2906, 'grad_norm': 0.6160556600016963, 'learning_rate': 2.0152522858733613e-06, 'epoch': 0.71} 71%|███████▏ | 15744/22095 [26:48:13<7:01:24, 3.98s/it] 71%|███████▏ | 15745/22095 [26:48:16<6:52:49, 3.90s/it] {'loss': 0.315, 'grad_norm': 0.6190587645711407, 'learning_rate': 2.0146643109603247e-06, 'epoch': 0.71} 71%|███████▏ | 15745/22095 [26:48:16<6:52:49, 3.90s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8899862 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 23015, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 
16cm'}, {'from': 'gpt', 'value': '【解答】解:∵点D、E分别是AC和BC的中点,∴DE=DC+CE=\\frac{1}{2}AC+\\frac{1}{2}BC=\\frac{1}{2}AB而AB=16cm,∴DE=\\frac{1}{2}×16=8(cm).'}]} 71%|███████▏ | 15746/22095 [26:48:20<6:37:08, 3.75s/it] {'loss': 0.3072, 'grad_norm': 0.630589774663842, 'learning_rate': 2.0140764001929853e-06, 'epoch': 0.71} 71%|███████▏ | 15746/22095 [26:48:20<6:37:08, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████▏ | 15747/22095 [26:48:26<8:07:10, 4.60s/it] {'loss': 0.4883, 'grad_norm': 0.27009444682988265, 'learning_rate': 2.0134885535839714e-06, 'epoch': 0.71} 71%|███████▏ | 15747/22095 [26:48:26<8:07:10, 4.60s/it] 71%|███████▏ | 15748/22095 [26:48:29<7:21:06, 4.17s/it] {'loss': 0.2955, 'grad_norm': 0.5805411200121087, 'learning_rate': 2.012900771145918e-06, 'epoch': 0.71} 71%|███████▏ | 15748/22095 [26:48:29<7:21:06, 4.17s/it] 71%|███████▏ | 15749/22095 [26:48:32<6:43:10, 3.81s/it] {'loss': 0.2941, 'grad_norm': 0.6036597094340551, 'learning_rate': 2.012313052891453e-06, 'epoch': 0.71} 71%|███████▏ | 15749/22095 [26:48:32<6:43:10, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████▏ | 15750/22095 [26:48:42<9:44:05, 5.52s/it] {'loss': 0.4879, 'grad_norm': 0.29160376342527783, 'learning_rate': 2.0117253988332023e-06, 'epoch': 0.71} 71%|███████▏ | 15750/22095 [26:48:42<9:44:05, 5.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████▏ | 15751/22095 [26:48:51<11:47:06, 6.69s/it] {'loss': 0.4641, 'grad_norm': 0.287614861835318, 'learning_rate': 2.0111378089837958e-06, 'epoch': 0.71} 71%|███████▏ | 15751/22095 [26:48:51<11:47:06, 6.69s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 71%|███████▏ | 15752/22095 [26:48:55<10:12:57, 5.80s/it] {'loss': 0.3494, 'grad_norm': 0.6776791947833849, 'learning_rate': 2.010550283355861e-06, 'epoch': 0.71} 71%|███████▏ | 15752/22095 [26:48:55<10:12:57, 
5.80s/it] 71%|███████▏ | 15753/22095 [26:48:59<9:05:07, 5.16s/it] {'loss': 0.2833, 'grad_norm': 0.6812298987323966, 'learning_rate': 2.009962821962016e-06, 'epoch': 0.71} 71%|███████▏ | 15753/22095 [26:48:59<9:05:07, 5.16s/it] 71%|███████▏ | 15754/22095 [26:49:02<8:03:20, 4.57s/it] {'loss': 0.2879, 'grad_norm': 0.6090574363970508, 'learning_rate': 2.009375424814886e-06, 'epoch': 0.71} 71%|███████▏ | 15754/22095 [26:49:02<8:03:20, 4.57s/it] 71%|███████▏ | 15755/22095 [26:49:05<7:14:03, 4.11s/it] {'loss': 0.3274, 'grad_norm': 0.8689524193398631, 'learning_rate': 2.0087880919270943e-06, 'epoch': 0.71} 71%|███████▏ | 15755/22095 [26:49:05<7:14:03, 4.11s/it] 71%|███████▏ | 15756/22095 [26:49:09<7:05:39, 4.03s/it] {'loss': 0.2807, 'grad_norm': 0.5948898312977335, 'learning_rate': 2.008200823311263e-06, 'epoch': 0.71} 71%|███████▏ | 15756/22095 [26:49:09<7:05:39, 4.03s/it] 71%|███████▏ | 15757/22095 [26:49:12<6:26:50, 3.66s/it] {'loss': 0.3436, 'grad_norm': 0.6558636463613668, 'learning_rate': 2.0076136189800033e-06, 'epoch': 0.71} 71%|███████▏ | 15757/22095 [26:49:12<6:26:50, 3.66s/it] 71%|███████▏ | 15758/22095 [26:49:16<6:37:17, 3.76s/it] {'loss': 0.2648, 'grad_norm': 0.851091129043898, 'learning_rate': 2.0070264789459365e-06, 'epoch': 0.71} 71%|███████▏ | 15758/22095 [26:49:16<6:37:17, 3.76s/it] 71%|███████▏ | 15759/22095 [26:49:18<6:07:42, 3.48s/it] {'loss': 0.299, 'grad_norm': 0.5796179561464531, 'learning_rate': 2.0064394032216807e-06, 'epoch': 0.71} 71%|███████▏ | 15759/22095 [26:49:18<6:07:42, 3.48s/it] 71%|███████▏ | 15760/22095 [26:49:22<6:03:37, 3.44s/it] {'loss': 0.3189, 'grad_norm': 0.5994608703240536, 'learning_rate': 2.0058523918198473e-06, 'epoch': 0.71} 71%|███████▏ | 15760/22095 [26:49:22<6:03:37, 3.44s/it] 71%|███████▏ | 15761/22095 [26:49:25<5:45:22, 3.27s/it] {'loss': 0.3101, 'grad_norm': 0.5952886279712135, 'learning_rate': 2.0052654447530497e-06, 'epoch': 0.71} 71%|███████▏ | 15761/22095 [26:49:25<5:45:22, 3.27s/it] 71%|███████▏ | 15762/22095 
[26:49:29<6:02:31, 3.43s/it] {'loss': 0.2733, 'grad_norm': 0.5411759893169642, 'learning_rate': 2.004678562033901e-06, 'epoch': 0.71} 71%|███████▏ | 15762/22095 [26:49:29<6:02:31, 3.43s/it] 71%|███████▏ | 15763/22095 [26:49:32<6:08:19, 3.49s/it] {'loss': 0.2811, 'grad_norm': 0.6305517774581135, 'learning_rate': 2.004091743675009e-06, 'epoch': 0.71} 71%|███████▏ | 15763/22095 [26:49:32<6:08:19, 3.49s/it] 71%|███████▏ | 15764/22095 [26:49:35<6:01:19, 3.42s/it] {'loss': 0.2973, 'grad_norm': 0.622821270648253, 'learning_rate': 2.0035049896889857e-06, 'epoch': 0.71} 71%|███████▏ | 15764/22095 [26:49:35<6:01:19, 3.42s/it] 71%|███████▏ | 15765/22095 [26:49:39<6:04:46, 3.46s/it] {'loss': 0.3082, 'grad_norm': 0.6579762622647377, 'learning_rate': 2.0029183000884372e-06, 'epoch': 0.71} 71%|███████▏ | 15765/22095 [26:49:39<6:04:46, 3.46s/it] 71%|███████▏ | 15766/22095 [26:49:42<6:01:24, 3.43s/it] {'loss': 0.289, 'grad_norm': 0.5941769016688686, 'learning_rate': 2.0023316748859683e-06, 'epoch': 0.71} 71%|███████▏ | 15766/22095 [26:49:42<6:01:24, 3.43s/it] 71%|███████▏ | 15767/22095 [26:49:45<5:47:20, 3.29s/it] {'loss': 0.299, 'grad_norm': 0.6074341860587897, 'learning_rate': 2.0017451140941848e-06, 'epoch': 0.71} 71%|███████▏ | 15767/22095 [26:49:45<5:47:20, 3.29s/it] 71%|███████▏ | 15768/22095 [26:49:49<5:56:19, 3.38s/it] {'loss': 0.2788, 'grad_norm': 0.5944365416986104, 'learning_rate': 2.001158617725692e-06, 'epoch': 0.71} 71%|███████▏ | 15768/22095 [26:49:49<5:56:19, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46283 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60680 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109548 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (105325 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49644 > 40960). Running this sequence through the model will result in indexing errors 71%|███████▏ | 15769/22095 [26:49:52<5:59:58, 3.41s/it] {'loss': 0.2867, 'grad_norm': 0.6734313146626568, 'learning_rate': 2.0005721857930902e-06, 'epoch': 0.71} 71%|███████▏ | 15769/22095 [26:49:52<5:59:58, 3.41s/it] 71%|███████▏ | 15770/22095 [26:49:56<5:57:49, 3.39s/it] {'loss': 0.2697, 'grad_norm': 0.6120330385710467, 'learning_rate': 1.999985818308979e-06, 'epoch': 0.71} 71%|███████▏ | 15770/22095 [26:49:56<5:57:49, 3.39s/it] 71%|███████▏ | 15771/22095 [26:49:59<5:58:01, 3.40s/it] {'loss': 0.2846, 'grad_norm': 0.6424849521515572, 'learning_rate': 1.9993995152859574e-06, 'epoch': 0.71} 71%|███████▏ | 15771/22095 [26:49:59<5:58:01, 3.40s/it] 71%|███████▏ | 15772/22095 [26:50:04<6:36:45, 3.76s/it] {'loss': 0.3511, 'grad_norm': 0.592428537896021, 'learning_rate': 1.9988132767366274e-06, 'epoch': 0.71} 71%|███████▏ | 15772/22095 [26:50:04<6:36:45, 3.76s/it] 71%|███████▏ | 15773/22095 [26:50:07<6:23:06, 3.64s/it] {'loss': 0.293, 'grad_norm': 0.6274610076103312, 'learning_rate': 1.9982271026735822e-06, 'epoch': 0.71} 71%|███████▏ | 15773/22095 [26:50:07<6:23:06, 3.64s/it] 71%|███████▏ | 15774/22095 [26:50:11<6:43:29, 3.83s/it] {'loss': 0.3029, 'grad_norm': 0.6506363639159314, 'learning_rate': 1.997640993109416e-06, 'epoch': 0.71} 71%|███████▏ | 15774/22095 [26:50:11<6:43:29, 3.83s/it] 71%|███████▏ | 15775/22095 [26:50:15<6:44:24, 3.84s/it] {'loss': 
0.3086, 'grad_norm': 0.6224530901843206, 'learning_rate': 1.9970549480567253e-06, 'epoch': 0.71} 71%|███████▏ | 15775/22095 [26:50:15<6:44:24, 3.84s/it] 71%|███████▏ | 15776/22095 [26:50:19<6:50:23, 3.90s/it] {'loss': 0.2699, 'grad_norm': 0.5956593042207174, 'learning_rate': 1.9964689675280993e-06, 'epoch': 0.71} 71%|███████▏ | 15776/22095 [26:50:19<6:50:23, 3.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [81, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8384896 in VC:s3://internvl-moe-sft-data/. Exception: Image size [81, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 51696, 'image': 'vrdu_table_final_2/astro-ph.CO/fb27a38c-bf28-4d39-a847-258ac79bb656.png', 'image_wh': [[81, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{c}\n \\@author\n \\end{tabular}\n```"}]} 71%|███████▏ | 15777/22095 [26:50:29<9:49:19, 5.60s/it] {'loss': 0.4754, 'grad_norm': 0.308828138671443, 'learning_rate': 1.9958830515361323e-06, 'epoch': 0.71} 71%|███████▏ | 15777/22095 [26:50:29<9:49:19, 5.60s/it] 71%|███████▏ | 15778/22095 [26:50:32<8:35:44, 4.90s/it] {'loss': 0.2883, 'grad_norm': 0.6760202791887676, 'learning_rate': 1.995297200093412e-06, 'epoch': 0.71} 71%|███████▏ | 15778/22095 [26:50:32<8:35:44, 4.90s/it] 71%|███████▏ | 15779/22095 [26:50:36<8:03:36, 4.59s/it] {'loss': 0.335, 'grad_norm': 0.6687713757079061, 'learning_rate': 1.9947114132125243e-06, 'epoch': 0.71} 71%|███████▏ | 15779/22095 [26:50:36<8:03:36, 4.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366678 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33424, 'image': 'vrdu_table_final_2/astro-ph.CO/b2c53a2d-22de-4f1e-8b03-bdaaf4079970.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]} 71%|███████▏ | 15780/22095 [26:50:39<7:20:09, 4.18s/it] {'loss': 0.2982, 'grad_norm': 0.6688648010934541, 'learning_rate': 1.994125690906059e-06, 'epoch': 0.71} 71%|███████▏ | 15780/22095 [26:50:39<7:20:09, 4.18s/it] 71%|███████▏ | 15781/22095 [26:50:42<6:41:25, 3.81s/it] {'loss': 0.263, 'grad_norm': 0.6302850774865625, 'learning_rate': 1.993540033186602e-06, 'epoch': 0.71} 71%|███████▏ | 15781/22095 [26:50:42<6:41:25, 3.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 71%|███████▏ | 15782/22095 [26:50:52<9:40:29, 5.52s/it] {'loss': 0.4811, 'grad_norm': 0.28680344474268504, 'learning_rate': 1.9929544400667366e-06, 'epoch': 0.71} 71%|███████▏ | 15782/22095 [26:50:52<9:40:29, 5.52s/it] 71%|███████▏ | 15783/22095 [26:50:55<8:44:52, 4.99s/it] {'loss': 0.3014, 'grad_norm': 0.6167589499171111, 'learning_rate': 1.9923689115590428e-06, 'epoch': 0.71} 71%|███████▏ | 15783/22095 [26:50:55<8:44:52, 4.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 71%|███████▏ | 15784/22095 [26:50:58<7:42:52, 4.40s/it] {'loss': 0.3096, 'grad_norm': 0.5816952636059396, 'learning_rate': 1.9917834476761037e-06, 'epoch': 0.71} 71%|███████▏ | 15784/22095 [26:50:58<7:42:52, 4.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880124 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3277, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 10\nB. 12\nC. 16\nD. 9\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 71%|███████▏ | 15785/22095 [26:51:02<7:01:49, 4.01s/it] {'loss': 0.3257, 'grad_norm': 0.6636120045872163, 'learning_rate': 1.9911980484305017e-06, 'epoch': 0.71} 71%|███████▏ | 15785/22095 [26:51:02<7:01:49, 4.01s/it] 71%|███████▏ | 15786/22095 [26:51:04<6:27:42, 3.69s/it] {'loss': 0.2956, 'grad_norm': 0.6286370620860274, 'learning_rate': 1.9906127138348123e-06, 'epoch': 0.71} 71%|███████▏ | 15786/22095 [26:51:04<6:27:42, 3.69s/it] 71%|███████▏ | 15787/22095 [26:51:08<6:11:34, 3.53s/it] {'loss': 0.2681, 'grad_norm': 0.6780622703223456, 'learning_rate': 1.9900274439016116e-06, 'epoch': 0.71} 71%|███████▏ | 15787/22095 [26:51:08<6:11:34, 3.53s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [98, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8532126 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [98, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 157984, 'image': 'vrdu_texteq/astro-ph.CO/9227ff9f-0627-445e-8bc5-12e574ce1bc9.png', 'image_wh': [[98, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'for $a_{mn}$.'}]} 71%|███████▏ | 15788/22095 [26:51:11<6:03:48, 3.46s/it] {'loss': 0.3265, 'grad_norm': 0.7062114812695381, 'learning_rate': 1.989442238643478e-06, 'epoch': 0.71} 71%|███████▏ | 15788/22095 [26:51:11<6:03:48, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85355 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113782 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106475 > 40960). 
Running this sequence through the model will result in indexing errors
71%|███████▏ | 15789/22095 [26:51:15<6:33:31, 3.74s/it] {'loss': 0.2714, 'grad_norm': 0.5841561044833753, 'learning_rate': 1.9888570980729847e-06, 'epoch': 0.71}
71%|███████▏ | 15790/22095 [26:51:19<6:20:06, 3.62s/it] {'loss': 0.3009, 'grad_norm': 0.7220036200359434, 'learning_rate': 1.9882720222027026e-06, 'epoch': 0.71}
71%|███████▏ | 15791/22095 [26:51:23<6:34:06, 3.75s/it] {'loss': 0.2785, 'grad_norm': 0.7006671521470127, 'learning_rate': 1.9876870110452066e-06, 'epoch': 0.71}
71%|███████▏ | 15792/22095 [26:51:27<6:46:25, 3.87s/it] {'loss': 0.2899, 'grad_norm': 0.5872154904076039, 'learning_rate': 1.9871020646130633e-06, 'epoch': 0.71}
71%|███████▏ | 15793/22095 [26:51:30<6:36:09, 3.77s/it] {'loss': 0.2957, 'grad_norm': 0.596507452041044, 'learning_rate': 1.9865171829188455e-06, 'epoch': 0.71}
71%|███████▏ | 15794/22095 [26:51:34<6:24:13, 3.66s/it] {'loss': 0.3272, 'grad_norm': 0.6760547872145583, 'learning_rate': 1.9859323659751178e-06, 'epoch': 0.71}
71%|███████▏ | 15795/22095 [26:51:37<6:10:43, 3.53s/it] {'loss': 0.3072, 'grad_norm': 0.6936074449471855, 'learning_rate': 1.985347613794445e-06, 'epoch': 0.71}
Token indices sequence length is longer than the specified maximum sequence length for this model (69890 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70836 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51158 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83949 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105612 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100585 > 40960). Running this sequence through the model will result in indexing errors
71%|███████▏ | 15796/22095 [26:51:40<5:52:27, 3.36s/it] {'loss': 0.3119, 'grad_norm': 0.6141029292821792, 'learning_rate': 1.984762926389393e-06, 'epoch': 0.71}
Invalidate trace cache @ step 2: expected module 1, but got module 364
71%|███████▏ | 15797/22095 [26:51:47<8:00:32, 4.58s/it] {'loss': 0.4784, 'grad_norm': 0.28622782989550266, 'learning_rate': 1.9841783037725264e-06, 'epoch': 0.71}
72%|███████▏ | 15798/22095 [26:51:51<7:36:30, 4.35s/it] {'loss': 0.333, 'grad_norm': 0.603727115110764, 'learning_rate': 1.9835937459564065e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15799/22095 [26:51:59<9:22:57, 5.36s/it] {'loss': 0.4622, 'grad_norm': 0.2584585191662679, 'learning_rate': 1.983009252953591e-06, 'epoch': 0.72}
72%|███████▏ | 15800/22095 [26:52:03<8:33:36, 4.90s/it] {'loss': 0.2999, 'grad_norm': 0.6295600557995713, 'learning_rate': 1.9824248247766404e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15801/22095 [26:52:06<7:32:39, 4.32s/it] {'loss': 0.2997, 'grad_norm': 0.6054018087673412, 'learning_rate': 1.981840461438114e-06, 'epoch': 0.72}
72%|███████▏ | 15802/22095 [26:52:09<7:02:09, 4.03s/it] {'loss': 0.336, 'grad_norm': 0.6474448280391323, 'learning_rate': 1.9812561629505666e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15803/22095 [26:52:13<6:45:15, 3.86s/it] {'loss': 0.3294, 'grad_norm': 0.6423659371952307, 'learning_rate': 1.980671929326551e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15804/22095 [26:52:23<10:02:23, 5.75s/it] {'loss': 0.4523, 'grad_norm': 0.29430593806193195, 'learning_rate': 1.980087760578625e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [348, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8429520 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [348, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 40979, 'image': 'vrdu_texteq/astro-ph.CO/9aa1d7fb-95f1-4a16-94a7-01175e1832a1.png', 'image_wh': [[348, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'between 1.1$\\sigma$ and 2.7$\\sigma$ lower'}]}
72%|███████▏ | 15805/22095 [26:52:26<9:01:50, 5.17s/it] {'loss': 0.2993, 'grad_norm': 0.5765988062290655, 'learning_rate': 1.979503656719336e-06, 'epoch': 0.72}
72%|███████▏ | 15806/22095 [26:52:30<7:54:33, 4.53s/it] {'loss': 0.2718, 'grad_norm': 0.5732845095529945, 'learning_rate': 1.9789196177612384e-06, 'epoch': 0.72}
72%|███████▏ | 15807/22095 [26:52:34<7:46:03, 4.45s/it] {'loss': 0.3432, 'grad_norm': 0.6072815854683996, 'learning_rate': 1.97833564371688e-06, 'epoch': 0.72}
72%|███████▏ | 15808/22095 [26:52:37<7:02:17, 4.03s/it] {'loss': 0.3245, 'grad_norm': 0.599200295872143, 'learning_rate': 1.9777517345988057e-06, 'epoch': 0.72}
72%|███████▏ | 15809/22095 [26:52:41<7:16:56, 4.17s/it] {'loss': 0.2977, 'grad_norm': 0.6179029784376359, 'learning_rate': 1.977167890419565e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (50335 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78365 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61869 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122968 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15810/22095 [26:52:45<7:05:36, 4.06s/it] {'loss': 0.2772, 'grad_norm': 0.619875528901657, 'learning_rate': 1.976584111191704e-06, 'epoch': 0.72}
72%|███████▏ | 15811/22095 [26:52:49<6:46:11, 3.88s/it] {'loss': 0.2653, 'grad_norm': 0.5717346985299682, 'learning_rate': 1.976000396927765e-06, 'epoch': 0.72}
72%|███████▏ | 15812/22095 [26:52:52<6:38:39, 3.81s/it] {'loss': 0.3106, 'grad_norm': 0.6167885903070414, 'learning_rate': 1.975416747640288e-06, 'epoch': 0.72}
72%|███████▏ | 15813/22095 [26:52:55<6:15:16, 3.58s/it] {'loss': 0.3206, 'grad_norm': 0.6860304566379826, 'learning_rate': 1.974833163341816e-06, 'epoch': 0.72}
72%|███████▏ | 15814/22095 [26:52:59<6:03:28, 3.47s/it] {'loss': 0.2664, 'grad_norm': 0.558697239360316, 'learning_rate': 1.9742496440448895e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15815/22095 [26:53:02<6:10:01, 3.54s/it] {'loss': 0.3243, 'grad_norm': 0.6422467254804713, 'learning_rate': 1.973666189762046e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (47560 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41267 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48588 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67961 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15816/22095 [26:53:05<5:44:27, 3.29s/it] {'loss': 0.2976, 'grad_norm': 0.6272481981865834, 'learning_rate': 1.973082800505819e-06, 'epoch': 0.72}
72%|███████▏ | 15817/22095 [26:53:08<5:28:00, 3.13s/it] {'loss': 0.2906, 'grad_norm': 0.6170692813442996, 'learning_rate': 1.9724994762887484e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15818/22095 [26:53:17<8:50:28, 5.07s/it] {'loss': 0.4571, 'grad_norm': 0.3043379788722677, 'learning_rate': 1.9719162171233636e-06, 'epoch': 0.72}
72%|███████▏ | 15819/22095 [26:53:20<7:50:22, 4.50s/it] {'loss': 0.2886, 'grad_norm': 0.7171550794067272, 'learning_rate': 1.9713330230222013e-06, 'epoch': 0.72}
72%|███████▏ | 15820/22095 [26:53:24<7:22:16, 4.23s/it] {'loss': 0.3184, 'grad_norm': 1.193219478002798, 'learning_rate': 1.9707498939977905e-06, 'epoch': 0.72}
72%|███████▏ | 15821/22095 [26:53:28<7:02:49, 4.04s/it] {'loss': 0.2735, 'grad_norm': 0.5752556708479493, 'learning_rate': 1.970166830062659e-06, 'epoch': 0.72}
72%|███████▏ | 15822/22095 [26:53:31<6:36:26, 3.79s/it] {'loss': 0.3245, 'grad_norm': 0.6125335200732254, 'learning_rate': 1.969583831229338e-06, 'epoch': 0.72}
72%|███████▏ | 15823/22095 [26:53:34<6:23:38, 3.67s/it] {'loss': 0.3523, 'grad_norm': 0.5971442106455194, 'learning_rate': 1.969000897510354e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15824/22095 [26:53:40<7:32:11, 4.33s/it] {'loss': 0.4491, 'grad_norm': 0.26319467873710606, 'learning_rate': 1.9684180289182297e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (68034 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15825/22095 [26:53:44<7:09:21, 4.11s/it] {'loss': 0.2929, 'grad_norm': 0.6962210165352554, 'learning_rate': 1.9678352254654914e-06, 'epoch': 0.72}
72%|███████▏ | 15826/22095 [26:53:47<6:45:44, 3.88s/it] {'loss': 0.3234, 'grad_norm': 0.597435716964487, 'learning_rate': 1.967252487164663e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [659, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8472652 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [659, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15194, 'image': 'vrdu_texteq/astro-ph.CO/f82134d4-696f-4659-b895-7472d229f809.png', 'image_wh': [[659, 25]], 'conversations': [{'from': 'human', 'value': '\n请使用OCR技术识别并提取图像中的文本。'}, {'from': 'gpt', 'value': 'We can now define the source functions $S^T$ and $S^P$ as'}]}
72%|███████▏ | 15827/22095 [26:53:50<6:18:36, 3.62s/it] {'loss': 0.3108, 'grad_norm': 0.5774198683188823, 'learning_rate': 1.9666698140282648e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (79223 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41974 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59140 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15828/22095 [26:53:54<6:16:09, 3.60s/it] {'loss': 0.3164, 'grad_norm': 0.6242275328564052, 'learning_rate': 1.966087206068814e-06, 'epoch': 0.72}
72%|███████▏ | 15829/22095 [26:53:57<6:12:32, 3.57s/it] {'loss': 0.2784, 'grad_norm': 0.7541080933452219, 'learning_rate': 1.9655046632988313e-06, 'epoch': 0.72}
72%|███████▏ | 15830/22095 [26:54:01<6:12:09, 3.56s/it] {'loss': 0.2849, 'grad_norm': 0.6206540518782321, 'learning_rate': 1.964922185730835e-06, 'epoch': 0.72}
72%|███████▏ | 15831/22095 [26:54:04<5:49:35, 3.35s/it] {'loss': 0.2744, 'grad_norm': 0.6295019724223111, 'learning_rate': 1.96433977337734e-06, 'epoch': 0.72}
72%|███████▏ | 15832/22095 [26:54:07<6:02:07, 3.47s/it] {'loss': 0.2941, 'grad_norm': 0.5879735551319749, 'learning_rate': 1.963757426250858e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (61689 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15833/22095 [26:54:17<9:07:22, 5.24s/it] {'loss': 0.4766, 'grad_norm': 0.29055395224746566, 'learning_rate': 1.9631751443639054e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952526 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3361, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 5\nB. 6\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
72%|███████▏ | 15834/22095 [26:54:20<7:58:05, 4.58s/it] {'loss': 0.3128, 'grad_norm': 0.7155505457515415, 'learning_rate': 1.96259292772899e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15835/22095 [26:54:23<7:12:49, 4.15s/it] {'loss': 0.3233, 'grad_norm': 0.7041649501282407, 'learning_rate': 1.9620107763586267e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15836/22095 [26:54:32<10:02:34, 5.78s/it] {'loss': 0.4652, 'grad_norm': 0.2991827612162243, 'learning_rate': 1.96142869026532e-06, 'epoch': 0.72}
72%|███████▏ | 15837/22095 [26:54:36<8:40:21, 4.99s/it] {'loss': 0.2683, 'grad_norm': 0.5923121808846988, 'learning_rate': 1.960846669461578e-06, 'epoch': 0.72}
72%|███████▏ | 15838/22095 [26:54:39<7:51:52, 4.52s/it] {'loss': 0.2829, 'grad_norm': 0.630301288602417, 'learning_rate': 1.9602647139599063e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308572 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2iMrxcWLN8KJjSZFKXXb7NVXa_!!2937694048.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nInterpret the sentences content on this picture.'}, {'from': 'gpt', 'value': 'All words in the image:\n正品\n真\n正\n7\n检\n阿吉普\nAgip\nAg13554784628\nAgip\n热卖!\n官方授权经销商\nAgip'}]}
72%|███████▏ | 15839/22095 [26:54:42<7:17:05, 4.19s/it] {'loss': 0.3598, 'grad_norm': 0.6268779453538741, 'learning_rate': 1.959682823772812e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (79328 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42267 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59621 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15840/22095 [26:54:46<6:58:29, 4.01s/it] {'loss': 0.3044, 'grad_norm': 0.7842283739627655, 'learning_rate': 1.9591009989127958e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15841/22095 [26:54:53<8:41:34, 5.00s/it] {'loss': 0.481, 'grad_norm': 0.27938510457142823, 'learning_rate': 1.9585192393923583e-06, 'epoch': 0.72}
72%|███████▏ | 15842/22095 [26:54:57<7:55:16, 4.56s/it] {'loss': 0.2783, 'grad_norm': 0.6064302788309044, 'learning_rate': 1.9579375452240013e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [167, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8425531 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [167, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 114812, 'image': 'vrdu_texteq/astro-ph.CO/736b5fa6-ade1-46f8-a8e4-4e203ebbf9b0.png', 'image_wh': [[167, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'for $r\\geq R$ and'}]}
72%|███████▏ | 15843/22095 [26:55:00<7:12:44, 4.15s/it] {'loss': 0.3212, 'grad_norm': 0.6281747951378225, 'learning_rate': 1.9573559164202248e-06, 'epoch': 0.72}
72%|███████▏ | 15844/22095 [26:55:03<6:24:42, 3.69s/it] {'loss': 0.2711, 'grad_norm': 0.6213000653987445, 'learning_rate': 1.956774352993526e-06, 'epoch': 0.72}
72%|███████▏ | 15845/22095 [26:55:06<5:57:53, 3.44s/it] {'loss': 0.3124, 'grad_norm': 0.6296797112903717, 'learning_rate': 1.956192854956397e-06, 'epoch': 0.72}
72%|███████▏ | 15846/22095 [26:55:08<5:43:49, 3.30s/it] {'loss': 0.2927, 'grad_norm': 0.5755777054173816, 'learning_rate': 1.955611422321337e-06, 'epoch': 0.72}
72%|███████▏ | 15847/22095 [26:55:13<6:08:03, 3.53s/it] {'loss': 0.3052, 'grad_norm': 0.6481445542956137, 'learning_rate': 1.9550300551008357e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8585568 in VC:s3://mm-dataset/ocrvqa/images/.
Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10544, 'image': '898796490.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Mystery, Thriller & Suspense? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Politics & Social Sciences? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
72%|███████▏ | 15848/22095 [26:55:22<9:14:46, 5.33s/it] {'loss': 0.4604, 'grad_norm': 0.2622176815018156, 'learning_rate': 1.9544487533073887e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15849/22095 [26:55:26<8:27:34, 4.88s/it] {'loss': 0.3055, 'grad_norm': 0.6096963530780615, 'learning_rate': 1.9538675169534838e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11249245 in VC:s3://internvl2/datasets/MMMUDataset/.
Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 956, 'image': 'MMMU_Pro/standard/test_955_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words.Note that you do not need to give a specific answer to the question.Here are the content and options of this question: In the following reaction , what mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: ['38.50 grams', '40.00 grams', '35.50 grams', '37.00 grams', '34.78 grams', ' 36.78 grams', '36.50 grams', ' 36.76 grams', ' 36.77 grams']"}, {'from': 'gpt', 'value': "**Introduction to Chemical Reactions**\n\n1. **Chemical Equations**\n - **Concept**: Chemical equations are symbolic representations of chemical reactions. They show the reactants (substances consumed) and products (substances formed) in a reaction.\n - **Components**: A chemical equation consists of reactants, products, and an arrow that indicates the direction of the reaction. For the reaction given:\n \\[ \\text{H}_2\\text{SO}_4 + \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + \\text{H}_2\\text{O} \\]\n\n2. **Balancing Chemical Equations**\n - **Purpose**: Balancing equations ensures that the law of conservation of mass is observed, meaning the number of atoms for each element is the same on both sides of the equation.\n - **Method**: Coefficients are adjusted to balance the number of atoms. For the reaction:\n \\[ \\text{H}_2\\text{SO}_4 + 2 \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2 \\text{H}_2\\text{O} \\]\n\n3. **Types of Chemical Reactions**\n - Acid-base reaction (neutralization): Involves an acid (H\\(_2\\)SO\\(_4\\), sulfuric acid) and a base (NaOH, sodium hydroxide) reacting to form salt (Na\\(_2\\)SO\\(_4\\), sodium sulfate) and water (H\\(_2\\)O).\n\n**Stoichiometry**\n\n4. **Mole Concept**\n - **Definition**: A mole is a unit representing 6.022 × 10\\(^2\\)^3 particles (Avogadro's number), used to measure chemical substances.\n - **Application**: Stoichiometry uses moles to calculate the amounts of reactants and products in a chemical reaction. Here, moles help determine how much H\\(_2\\)SO\\(_4\\) is needed for a given amount of NaOH.\n - **Example**: In our reaction, 1 mole of H\\(_2\\)SO\\(_4\\) reacts with 2 moles of NaOH.\n\n5. **Molar Mass**\n - **Concept**: Molar mass is the mass of one mole of a substance, expressed in grams per mole (g/mol). It is calculated by summing the atomic masses of all atoms in the formula.\n - **Calculation**: \n - H\\(_2\\)SO\\(_4\\): \\(2(1.01) + 32.07 + 4(16.00) = 98.09 \\, \\text{g/mol}\\)\n - NaOH: \\(22.99 + 16.00 + 1.01 = 40.00 \\, \\text{g/mol}\\)\n\n6. **Stoichiometric Calculations**\n - **Objective**: Determine quantities of reactants or products using balanced equations.\n - **Method**: Use mole ratios from the balanced equation to convert between substances. For example, from the balanced equation, 1 mole of H\\(_2\\)SO\\(_4\\) reacts with 2 moles of NaOH.\n\n**Acid-Base Chemistry**\n\n7. **Acids and Bases**\n - **Acid**: A substance that donates protons (H\\(^{+}\\)) in a reaction. Sulfuric acid (\\(\\text{H}_2\\text{SO}_4\\)) is a strong acid that dissociates completely in solution.\n - **Base**: A substance that accepts protons. Sodium hydroxide (\\(\\text{NaOH}\\)) is a strong base that dissociates completely, yielding OH\\(^{-}\\) ions.\n\n8. **Neutralization Reaction**\n - **Definition**: A chemical reaction where an acid and a base react to form water and a salt.\n - **Example**: H\\(_2\\)SO\\(_4\\) + 2 NaOH \\(\\rightarrow\\) Na\\(_2\\)SO\\(_4\\) + 2 H\\(_2\\)O\n - **Significance**: Neutralization reactions are exothermic, often producing heat.\n\n9. **Applications of Neutralization**\n - Industries use neutralization to treat acidic and basic waste, produce salts, and manage pH levels in various processes.\n - Example: Treatment of wastewater to neutralize acidic contaminants.\n\n**Applications of Sulfuric Acid and Sodium Hydroxide**\n\n10. **Sulfuric Acid**\n - **Uses**: Fertilizer production, chemical synthesis, battery acid, and pH regulation.\n - **Safety**: Highly corrosive, handling requires proper safety equipment.\n\n11. **Sodium Hydroxide**\n - **Uses**: Soap making, paper manufacturing, chemical synthesis, and cleaning agents.\n - **Safety**: Can cause chemical burns, handling requires caution.\n\n**Examples and Real-life Applications**\n\n12. **Industrial Neutralization Processes**\n - Neutralizing acids in mining operations to prevent environmental damage.\n - In the food industry, sodium hydroxide is used for food processing.\n\n13. **Educational Demonstrations**\n - Neutralization reactions are demonstrated in chemistry classes to illustrate stoichiometry and acid-base balance.\n\n14. **Environmental Impact**\n - Proper handling and neutralization of acids and bases reduce environmental hazards.\n - Case studies involve spill neutralizations using base substances.\n\n**Conclusion**\n\nIn summary, understanding the stoichiometry and properties of substances involved in chemical reactions is essential for predicting the outcomes and managing the processes in which they are involved. Acid-base reactions play a crucial role in industrial applications and environmental management, highlighting the interconnectedness of chemistry, industry, and ecology. These insights drive advancements in chemical manufacturing and environmental protection, underscoring the importance of chemistry in various fields."}]}
72%|███████▏ | 15850/22095 [26:55:29<7:44:53, 4.47s/it] {'loss': 0.3121, 'grad_norm': 0.5801968716904514, 'learning_rate': 1.9532863460516095e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [298, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8487944 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [298, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 107716, 'image': 'vrdu_texteq/astro-ph.CO/58b293b2-2c75-45f2-aa50-7f8f00bbcea6.png', 'image_wh': [[298, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $ P_{\\rm Pois} $ is defined as'}]}
72%|███████▏ | 15851/22095 [26:55:33<7:04:17, 4.08s/it] {'loss': 0.3132, 'grad_norm': 0.6445991323541399, 'learning_rate': 1.9527052406142534e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047658 in VC:s3://multi-modal/UniGeo/.
Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 3cm\nB. 2cm\nC. 5cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
72%|███████▏ | 15852/22095 [26:55:36<6:36:15, 3.81s/it] {'loss': 0.3084, 'grad_norm': 0.6252343576574004, 'learning_rate': 1.9521242006539065e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15853/22095 [26:55:39<6:03:50, 3.50s/it] {'loss': 0.2998, 'grad_norm': 0.7262094242239051, 'learning_rate': 1.9515432261830465e-06, 'epoch': 0.72}
72%|███████▏ | 15854/22095 [26:55:41<5:40:34, 3.27s/it] {'loss': 0.2937, 'grad_norm': 0.6183870478720913, 'learning_rate': 1.9509623172141596e-06, 'epoch': 0.72}
72%|███████▏ | 15855/22095 [26:55:44<5:38:10, 3.25s/it] {'loss': 0.2938, 'grad_norm': 0.6979481403695195, 'learning_rate': 1.9503814737597297e-06, 'epoch': 0.72}
72%|███████▏ | 15856/22095 [26:55:48<5:38:47, 3.26s/it] {'loss': 0.2918, 'grad_norm': 0.7083826073473445, 'learning_rate': 1.949800695832236e-06, 'epoch': 0.72}
72%|███████▏ | 15857/22095 [26:55:51<5:52:24, 3.39s/it] {'loss': 0.3071, 'grad_norm': 0.6447679929100643, 'learning_rate': 1.949219983444156e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (92809 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50583 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15858/22095 [26:55:55<5:44:25, 3.31s/it] {'loss': 0.313, 'grad_norm': 0.6497871890283972, 'learning_rate': 1.9486393366079687e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15859/22095 [26:55:58<5:38:02, 3.25s/it] {'loss': 0.2814, 'grad_norm': 0.682234150983536, 'learning_rate': 1.948058755336152e-06, 'epoch': 0.72}
72%|███████▏ | 15860/22095 [26:56:01<5:27:06, 3.15s/it] {'loss': 0.3152, 'grad_norm': 0.6759019114753635, 'learning_rate': 1.947478239641179e-06, 'epoch': 0.72}
72%|███████▏ | 15861/22095 [26:56:04<5:34:52, 3.22s/it] {'loss': 0.3431, 'grad_norm': 0.6525915599032759, 'learning_rate': 1.9468977895355225e-06, 'epoch': 0.72}
72%|███████▏ | 15862/22095 [26:56:07<5:34:07, 3.22s/it] {'loss': 0.2991, 'grad_norm': 0.660095016570737, 'learning_rate': 1.946317405031657e-06, 'epoch': 0.72}
72%|███████▏ | 15863/22095 [26:56:10<5:32:50, 3.20s/it] {'loss': 0.2844, 'grad_norm': 0.6324364409123052, 'learning_rate': 1.94573708614205e-06, 'epoch': 0.72}
72%|███████▏ | 15864/22095 [26:56:13<5:28:38, 3.16s/it] {'loss': 0.2878, 'grad_norm': 0.6335492193236768, 'learning_rate': 1.945156832879174e-06, 'epoch': 0.72}
72%|███████▏ | 15865/22095 [26:56:16<5:19:13, 3.07s/it] {'loss': 0.3329, 'grad_norm': 0.6585628540801011, 'learning_rate': 1.944576645255496e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (70960 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15866/22095 [26:56:25<8:29:05, 4.90s/it] {'loss': 0.4802, 'grad_norm': 0.3185024687658992, 'learning_rate': 1.94399652328348e-06, 'epoch': 0.72}
72%|███████▏ | 15867/22095 [26:56:30<8:16:15, 4.78s/it] {'loss': 0.315, 'grad_norm': 0.5985997255672704, 'learning_rate': 1.9434164669755928e-06, 'epoch': 0.72}
72%|███████▏ | 15868/22095 [26:56:33<7:18:29, 4.23s/it] {'loss': 0.3393, 'grad_norm': 0.6134883826608056, 'learning_rate': 1.9428364763443e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15869/22095 [26:56:36<6:49:28, 3.95s/it] {'loss': 0.3076, 'grad_norm': 0.6659292946147676, 'learning_rate': 1.942256551402062e-06, 'epoch': 0.72}
72%|███████▏ | 15870/22095 [26:56:39<6:24:55, 3.71s/it] {'loss': 0.3074, 'grad_norm': 0.7063404873269847, 'learning_rate': 1.9416766921613375e-06, 'epoch': 0.72}
72%|███████▏ | 15871/22095 [26:56:43<6:31:58, 3.78s/it] {'loss': 0.3203, 'grad_norm': 0.8006880516701447, 'learning_rate': 1.941096898634588e-06, 'epoch': 0.72}
72%|███████▏ | 15872/22095 [26:56:47<6:19:38, 3.66s/it] {'loss': 0.3336, 'grad_norm': 0.6808668128914034, 'learning_rate': 1.9405171708342734e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (57921 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15873/22095 [26:56:50<6:06:36, 3.54s/it] {'loss': 0.2706, 'grad_norm': 0.6046078305349775, 'learning_rate': 1.9399375087728485e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49843 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93567 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15874/22095 [26:56:59<8:59:08, 5.20s/it] {'loss': 0.4806, 'grad_norm': 0.28548787155681826, 'learning_rate': 1.939357912462766e-06, 'epoch': 0.72}
72%|███████▏ | 15875/22095 [26:57:04<8:41:17, 5.03s/it] {'loss': 0.2869, 'grad_norm': 0.5994086971826666, 'learning_rate': 1.938778381916484e-06, 'epoch': 0.72}
72%|███████▏ | 15876/22095 [26:57:07<7:54:38, 4.58s/it] {'loss': 0.3169, 'grad_norm': 0.6159373405288157, 'learning_rate': 1.938198917146451e-06, 'epoch': 0.72}
72%|███████▏ | 15877/22095 [26:57:10<7:02:10, 4.07s/it] {'loss': 0.3013, 'grad_norm': 0.6221266268026915, 'learning_rate': 1.937619518165121e-06, 'epoch': 0.72}
72%|███████▏ | 15878/22095 [26:57:14<7:00:46, 4.06s/it] {'loss': 0.2803, 'grad_norm': 0.6939394072074498, 'learning_rate': 1.937040184984943e-06, 'epoch': 0.72}
72%|███████▏ | 15879/22095 [26:57:18<6:56:22, 4.02s/it] {'loss': 0.2946, 'grad_norm': 0.6745653728912595, 'learning_rate': 1.936460917618362e-06, 'epoch': 0.72}
15879/22095 [26:57:18<6:56:22, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46710 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108407 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42753 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46389 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107252 > 40960). 
Running this sequence through the model will result in indexing errors 72%|███████▏ | 15880/22095 [26:57:21<6:33:02, 3.79s/it] {'loss': 0.3579, 'grad_norm': 0.6351055310352166, 'learning_rate': 1.9358817160778272e-06, 'epoch': 0.72} 72%|███████▏ | 15880/22095 [26:57:21<6:33:02, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 72%|███████▏ | 15881/22095 [26:57:27<7:46:54, 4.51s/it] {'loss': 0.4696, 'grad_norm': 0.2773305345910911, 'learning_rate': 1.935302580375785e-06, 'epoch': 0.72} 72%|███████▏ | 15881/22095 [26:57:28<7:46:54, 4.51s/it] 72%|███████▏ | 15882/22095 [26:57:37<10:15:13, 5.94s/it] {'loss': 0.4589, 'grad_norm': 0.28853986338486576, 'learning_rate': 1.9347235105246783e-06, 'epoch': 0.72} 72%|███████▏ | 15882/22095 [26:57:37<10:15:13, 5.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 72%|███████▏ | 15883/22095 [26:57:41<9:08:26, 5.30s/it] {'loss': 0.3329, 'grad_norm': 0.6801749109228717, 'learning_rate': 1.934144506536946e-06, 'epoch': 0.72} 72%|███████▏ | 15883/22095 [26:57:41<9:08:26, 5.30s/it] 72%|███████▏ | 15884/22095 [26:57:44<8:11:19, 4.75s/it] {'loss': 0.3068, 'grad_norm': 0.6160835970036759, 'learning_rate': 1.9335655684250335e-06, 'epoch': 0.72} 72%|███████▏ | 15884/22095 [26:57:44<8:11:19, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55070 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69601 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43109 > 40960). 
Running this sequence through the model will result in indexing errors 72%|███████▏ | 15885/22095 [26:57:47<7:22:26, 4.27s/it] {'loss': 0.3009, 'grad_norm': 0.6493036262924194, 'learning_rate': 1.9329866962013825e-06, 'epoch': 0.72} 72%|███████▏ | 15885/22095 [26:57:47<7:22:26, 4.27s/it] 72%|███████▏ | 15886/22095 [26:57:51<7:01:16, 4.07s/it] {'loss': 0.2646, 'grad_norm': 0.6042260802325438, 'learning_rate': 1.9324078898784245e-06, 'epoch': 0.72} 72%|███████▏ | 15886/22095 [26:57:51<7:01:16, 4.07s/it] 72%|███████▏ | 15887/22095 [26:57:54<6:23:08, 3.70s/it] {'loss': 0.317, 'grad_norm': 0.6283814205090501, 'learning_rate': 1.9318291494685986e-06, 'epoch': 0.72} 72%|███████▏ | 15887/22095 [26:57:54<6:23:08, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8949355 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 190, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
 72%|███████▏ | 15888/22095 [26:58:03<9:24:00, 5.45s/it] {'loss': 0.4805, 'grad_norm': 0.27051702087871426, 'learning_rate': 1.9312504749843435e-06, 'epoch': 0.72}
 72%|███████▏ | 15889/22095 [26:58:06<8:17:16, 4.81s/it] {'loss': 0.3057, 'grad_norm': 0.6903824750745414, 'learning_rate': 1.9306718664380907e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 72%|███████▏ | 15890/22095 [26:58:10<7:24:06, 4.29s/it] {'loss': 0.2872, 'grad_norm': 0.5720509988148191, 'learning_rate': 1.930093323842271e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (72417 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15891/22095 [26:58:14<7:19:29, 4.25s/it] {'loss': 0.3056, 'grad_norm': 0.6374653633908585, 'learning_rate': 1.929514847209319e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (58673 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86942 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15892/22095 [26:58:17<6:34:13, 3.81s/it] {'loss': 0.309, 'grad_norm': 0.641860689442778, 'learning_rate': 1.928936436551661e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15893/22095 [26:58:26<9:24:34, 5.46s/it] {'loss': 0.4849, 'grad_norm': 0.33271062511421506, 'learning_rate': 1.9283580918817284e-06, 'epoch': 0.72}
 72%|███████▏ | 15894/22095 [26:58:30<8:30:28, 4.94s/it] {'loss': 0.3101, 'grad_norm': 0.5551185896909835, 'learning_rate': 1.927779813211947e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (41064 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80953 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53457 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (150216 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15895/22095 [26:58:33<7:55:09, 4.60s/it] {'loss': 0.2824, 'grad_norm': 0.6580061730478527, 'learning_rate': 1.92720160055474e-06, 'epoch': 0.72}
 72%|███████▏ | 15896/22095 [26:58:37<7:24:29, 4.30s/it] {'loss': 0.3185, 'grad_norm': 0.603955540037126, 'learning_rate': 1.926623453922533e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (61762 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54173 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54051 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15897/22095 [26:58:40<6:57:09, 4.04s/it] {'loss': 0.2553, 'grad_norm': 0.5711212663065297, 'learning_rate': 1.9260453733277505e-06, 'epoch': 0.72}
 72%|███████▏ | 15898/22095 [26:58:44<6:36:18, 3.84s/it] {'loss': 0.3484, 'grad_norm': 0.6889052549277278, 'learning_rate': 1.925467358782812e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15899/22095 [26:58:52<9:04:54, 5.28s/it] {'loss': 0.4684, 'grad_norm': 0.2783497457159708, 'learning_rate': 1.924889410300136e-06, 'epoch': 0.72}
 72%|███████▏ | 15900/22095 [26:59:02<11:17:58, 6.57s/it] {'loss': 0.4618, 'grad_norm': 0.25821795988046475, 'learning_rate': 1.9243115278921416e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 72%|███████▏ | 15901/22095 [26:59:06<9:57:39, 5.79s/it] {'loss': 0.297, 'grad_norm': 0.611423173817838, 'learning_rate': 1.923733711571248e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (61397 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97570 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15902/22095 [26:59:09<8:44:05, 5.08s/it] {'loss': 0.2693, 'grad_norm': 0.5840177696674929, 'learning_rate': 1.923155961349869e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (92347 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15903/22095 [26:59:18<10:35:09, 6.15s/it] {'loss': 0.4751, 'grad_norm': 0.265000607498201, 'learning_rate': 1.9225782772404166e-06, 'epoch': 0.72}
 72%|███████▏ | 15904/22095 [26:59:21<9:08:35, 5.32s/it] {'loss': 0.3015, 'grad_norm': 0.64824568095004, 'learning_rate': 1.9220006592553075e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15905/22095 [26:59:29<10:30:59, 6.12s/it] {'loss': 0.4871, 'grad_norm': 0.31206766498139427, 'learning_rate': 1.921423107406949e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8903168 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26321, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. \\frac{11}{2}cm'}, {'from': 'gpt', 'value': '【解答】解:设BC=xcm,∵BC=\\frac{1}{2}AB,∴AB=2BC=2x,AC=AB+BC=3xcm,∵D为AC的中点,∴AD=DC=\\frac{1}{2}AC=1.5xcm,∵CD=3cm,∴1.5x=3,解得:x=2,即AB=2xcm=4cm,'}]}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/10226.png 2025-08-28 18:57:28.130364 load time: 1007.11 ms
 72%|███████▏ | 15906/22095 [26:59:33<9:16:59, 5.40s/it] {'loss': 0.271, 'grad_norm': 0.6588422898892048, 'learning_rate': 1.920845621707755e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 72%|███████▏ | 15907/22095 [26:59:37<8:21:21, 4.86s/it] {'loss': 0.3178, 'grad_norm': 0.5959150073512096, 'learning_rate': 1.920268202170131e-06, 'epoch': 0.72}
 72%|███████▏ | 15908/22095 [26:59:40<7:39:16, 4.45s/it] {'loss': 0.2759, 'grad_norm': 0.5610244219911166, 'learning_rate': 1.9196908488064832e-06, 'epoch': 0.72}
 72%|███████▏ | 15909/22095 [26:59:44<7:05:20, 4.13s/it] {'loss': 0.2924, 'grad_norm': 0.6057227446694428, 'learning_rate': 1.9191135616292184e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15910/22095 [26:59:53<9:50:05, 5.72s/it] {'loss': 0.4753, 'grad_norm': 0.25465578536998, 'learning_rate': 1.918536340650743e-06, 'epoch': 0.72}
 72%|███████▏ | 15911/22095 [26:59:56<8:35:53, 5.01s/it] {'loss': 0.2771, 'grad_norm': 0.6162224428012438, 'learning_rate': 1.9179591858834572e-06, 'epoch': 0.72}
 72%|███████▏ | 15912/22095 [26:59:59<7:34:51, 4.41s/it] {'loss': 0.2924, 'grad_norm': 0.6236076685705199, 'learning_rate': 1.9173820973397617e-06, 'epoch': 0.72}
 72%|███████▏ | 15913/22095 [27:00:02<6:54:05, 4.02s/it] {'loss': 0.2964, 'grad_norm': 0.6610076407431865, 'learning_rate': 1.916805075032057e-06, 'epoch': 0.72}
 72%|███████▏ | 15914/22095 [27:00:06<6:27:52, 3.77s/it] {'loss': 0.2768, 'grad_norm': 0.8042526061411998, 'learning_rate': 1.9162281189727455e-06, 'epoch': 0.72}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240828_195552_before_screenshot.png 2025-08-28 18:58:05.713945 load time: 1175.37 ms
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_6/images/20250417140200.png 2025-08-28 18:58:05.082278 load time: 2689.52 ms
 72%|███████▏ | 15915/22095 [27:00:11<7:03:39, 4.11s/it] {'loss': 0.316, 'grad_norm': 0.6371717003262385, 'learning_rate': 1.915651229174217e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390148 in VC:s3://internvl-moe-sft-data/. Exception: Image size [117, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56967, 'image': 'vrdu_table_final_2/astro-ph.EP/08923ac0-d98d-4446-8ec3-bb33bac6ff88.png', 'image_wh': [[117, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c} Comment \\\\ \\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8364941 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31682, 'image': 'vrdu_table_final_2/astro-ph.CO/71940017-8a5a-4bdc-b786-5791d0a0a39a.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
VC:s3://gui/aguvis/aguvis-stage2/amex/images/5fc78e6d22b648e2b64dd71cea63a050step24.png 2025-08-28 18:58:09.968097 load time: 1391.82 ms
 72%|███████▏ | 15916/22095 [27:00:32<15:57:04, 9.29s/it] {'loss': 0.3211, 'grad_norm': 0.630859574891062, 'learning_rate': 1.9150744056488708e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [450, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8442449 in VC:s3://internvl-moe-sft-data/. Exception: Image size [450, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 141924, 'image': 'vrdu_texteq/astro-ph.CO/ac03c2af-4835-4616-a672-fa8449e22237.png', 'image_wh': [[450, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'Existe un estimador $R$ definido como'}]}
 72%|███████▏ | 15917/22095 [27:00:54<22:44:37, 13.25s/it] {'loss': 0.3136, 'grad_norm': 0.692485725729248, 'learning_rate': 1.9144976484091025e-06, 'epoch': 0.72}
 72%|███████▏ | 15918/22095 [27:00:58<17:46:47, 10.36s/it] {'loss': 0.2925, 'grad_norm': 0.5700622958020366, 'learning_rate': 1.913920957467304e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (49593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106954 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98089 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15919/22095 [27:01:01<14:11:10, 8.27s/it] {'loss': 0.2976, 'grad_norm': 0.5472061376592867, 'learning_rate': 1.913344332835864e-06, 'epoch': 0.72}
 72%|███████▏ | 15920/22095 [27:01:05<11:56:07, 6.96s/it] {'loss': 0.275, 'grad_norm': 0.7371868821920786, 'learning_rate': 1.9127677745271754e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (47511 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44415 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15921/22095 [27:01:09<9:59:38, 5.83s/it] {'loss': 0.2919, 'grad_norm': 0.9859973646941241, 'learning_rate': 1.912191282553624e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15922/22095 [27:01:17<11:22:24, 6.63s/it] {'loss': 0.4891, 'grad_norm': 0.3129194570815802, 'learning_rate': 1.911614856927601e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908200 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31353, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nA. 7\nB. 2\nC. 2.5\nD. 4.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 72%|███████▏ | 15923/22095 [27:01:21<9:51:26, 5.75s/it] {'loss': 0.3079, 'grad_norm': 0.7252536527570467, 'learning_rate': 1.911038497661487e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (119396 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55061 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127939 > 40960). Running this sequence through the model will result in indexing errors
 72%|███████▏ | 15924/22095 [27:01:25<8:57:44, 5.23s/it] {'loss': 0.3388, 'grad_norm': 0.6356784735211609, 'learning_rate': 1.910462204767671e-06, 'epoch': 0.72}
 72%|███████▏ | 15925/22095 [27:01:28<7:59:09, 4.66s/it] {'loss': 0.3187, 'grad_norm': 0.8678659339354213, 'learning_rate': 1.9098859782585313e-06, 'epoch': 0.72}
 72%|███████▏ | 15926/22095 [27:01:50<17:03:59, 9.96s/it] {'loss': 0.2709, 'grad_norm': 0.615283773322787, 'learning_rate': 1.909309818146453e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15927/22095 [27:02:00<16:46:59, 9.80s/it] {'loss': 0.4684, 'grad_norm': 0.2776919225343176, 'learning_rate': 1.9087337244438147e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 72%|███████▏ | 15928/22095 [27:02:03<13:35:01, 7.93s/it] {'loss': 0.3262, 'grad_norm': 0.6257009861394178, 'learning_rate': 1.908157697162993e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 72%|███████▏ | 15929/22095 [27:02:13<14:23:26, 8.40s/it] {'loss': 0.4589, 'grad_norm': 0.2599198789267447, 'learning_rate': 1.9075817363163655e-06, 'epoch': 0.72}
 72%|███████▏ | 15930/22095 [27:02:16<11:43:43, 6.85s/it] {'loss': 0.2479, 'grad_norm': 0.6840195491615668, 'learning_rate': 1.9070058419163118e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 72%|███████▏ | 15931/22095 [27:02:20<10:01:06, 5.85s/it] {'loss': 0.3021, 'grad_norm': 0.6094399970096488, 'learning_rate': 1.9064300139752024e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922957 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46110, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 1cm\nB. 4cm\nC. 5cm\nD. 无法确定\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
 72%|███████▏ | 15932/22095 [27:02:29<11:47:55, 6.89s/it] {'loss': 0.466, 'grad_norm': 0.25754235454591273, 'learning_rate': 1.9058542525054096e-06, 'epoch': 0.72}
VC:s3://gui-agent/data_20250714/windows/images/adobe_illustrator/free_task_20250714_163642/images/20250714_163746_32.png 2025-08-28 19:00:27.732924 load time: 1263.46 ms
VC:s3://gui-agent/data_20250612/windows/images/calculator/free_task_20250607_232451/images/20250607_232459_3.png 2025-08-28 19:00:28.340631 load time: 1124.8 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/data_20250414/mac/images/mac_desktop_handmade/handmade_annotation_8/images/20250417135922.png 2025-08-28 19:00:28.952006 load time: 1169.71 ms
 72%|███████▏ | 15933/22095 [27:02:33<10:28:09, 6.12s/it] {'loss': 0.3106, 'grad_norm': 0.6092321613812091, 'learning_rate': 1.9052785575193072e-06, 'epoch': 0.72}
 72%|███████▏ | 15934/22095 [27:02:37<9:27:28, 5.53s/it] {'loss': 0.2786, 'grad_norm': 0.6228942019247349, 'learning_rate': 1.9047029290292623e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (57425 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47595 > 40960).
Running this sequence through the model will result in indexing errors 72%|███████▏ | 15935/22095 [27:02:40<8:05:43, 4.73s/it] {'loss': 0.3071, 'grad_norm': 0.6296667281726287, 'learning_rate': 1.9041273670476468e-06, 'epoch': 0.72} 72%|███████▏ | 15935/22095 [27:02:40<8:05:43, 4.73s/it] 72%|███████▏ | 15936/22095 [27:02:43<7:10:28, 4.19s/it] {'loss': 0.2921, 'grad_norm': 0.6188099089168433, 'learning_rate': 1.9035518715868262e-06, 'epoch': 0.72} 72%|███████▏ | 15936/22095 [27:02:43<7:10:28, 4.19s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396929 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 63782, 'image': 'vrdu_table_final_2/astro-ph.EP/b93fac58-e289-472a-91b7-8e1dee5f3c86.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_y$\\end{tabular}\n```"}]}
72%|███████▏ | 15937/22095 [27:02:46<6:41:55, 3.92s/it] {'loss': 0.3299, 'grad_norm': 0.7346168096084281, 'learning_rate': 1.9029764426591641e-06, 'epoch': 0.72}
72%|███████▏ | 15938/22095 [27:03:08<15:53:31, 9.29s/it] {'loss': 0.2967, 'grad_norm': 0.6566646675703187, 'learning_rate': 1.902401080277026e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8955481 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6316, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, given that segment AB=9 and BC=5, and point D is the midpoint of segment AC, the length of segment AD is ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7'}]}
72%|███████▏ | 15939/22095 [27:03:29<21:54:52, 12.82s/it] {'loss': 0.3422, 'grad_norm': 0.6763430559004452, 'learning_rate': 1.901825784452777e-06, 'epoch': 0.72}
72%|███████▏ | 15940/22095 [27:03:32<16:55:06, 9.90s/it] {'loss': 0.3176, 'grad_norm': 0.5829262688664122, 'learning_rate': 1.9012505551987764e-06, 'epoch': 0.72}
72%|███████▏ | 15941/22095 [27:03:55<23:09:11, 13.54s/it] {'loss': 0.355, 'grad_norm': 0.6464037491427786, 'learning_rate': 1.900675392527383e-06, 'epoch': 0.72}
72%|███████▏ | 15942/22095 [27:03:57<17:42:17, 10.36s/it] {'loss': 0.3419, 'grad_norm': 0.6426142657790407, 'learning_rate': 1.9001002964509564e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15943/22095 [27:04:07<17:17:03, 10.11s/it] {'loss': 0.4996, 'grad_norm': 0.3018870085983446, 'learning_rate': 1.8995252669818577e-06, 'epoch': 0.72}
72%|███████▏ | 15944/22095 [27:04:29<23:24:52, 13.70s/it] {'loss': 0.2671, 'grad_norm': 0.6567595177042254, 'learning_rate': 1.8989503041324341e-06, 'epoch': 0.72}
72%|███████▏ | 15945/22095 [27:04:51<27:25:29, 16.05s/it] {'loss': 0.3004, 'grad_norm': 1.1565325384593736, 'learning_rate': 1.8983754079150452e-06, 'epoch': 0.72}
72%|███████▏ | 15946/22095 [27:05:30<39:33:59, 23.16s/it] {'loss': 0.2855, 'grad_norm': 0.6058733951545143, 'learning_rate': 1.8978005783420444e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (83006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44312 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68411 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45400 > 40960) for 4 sample(s). Truncating to 2759 with 1 samples.
72%|███████▏ | 15947/22095 [27:05:52<38:57:10, 22.81s/it] {'loss': 0.3139, 'grad_norm': 0.6461791664712543, 'learning_rate': 1.8972258154257816e-06, 'epoch': 0.72}
72%|███████▏ | 15948/22095 [27:05:55<28:51:44, 16.90s/it] {'loss': 0.2739, 'grad_norm': 0.6570032567439259, 'learning_rate': 1.8966511191786047e-06, 'epoch': 0.72}
72%|███████▏ | 15949/22095 [27:06:37<41:41:53, 24.42s/it] {'loss': 0.2943, 'grad_norm': 0.5894999645744051, 'learning_rate': 1.896076489612866e-06, 'epoch': 0.72}
72%|███████▏ | 15950/22095 [27:07:36<59:21:18, 34.77s/it] {'loss': 0.2833, 'grad_norm': 0.5993238627006856, 'learning_rate': 1.895501926740908e-06, 'epoch': 0.72}
72%|███████▏ | 15951/22095 [27:07:57<52:16:39, 30.63s/it] {'loss': 0.2601, 'grad_norm': 0.6305668222337719, 'learning_rate': 1.8949274305750814e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (54506 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15952/22095 [27:08:19<47:32:11, 27.86s/it] {'loss': 0.2872, 'grad_norm': 0.617752552570206, 'learning_rate': 1.8943530011277261e-06, 'epoch': 0.72}
72%|███████▏ | 15953/22095 [27:08:22<34:43:03, 20.35s/it] {'loss': 0.3434, 'grad_norm': 0.626608894837736, 'learning_rate': 1.893778638411188e-06, 'epoch': 0.72}
72%|███████▏ | 15954/22095 [27:08:43<35:31:55, 20.83s/it] {'loss': 0.3371, 'grad_norm': 0.733790000471714, 'learning_rate': 1.8932043424378049e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15955/22095 [27:08:53<29:42:47, 17.42s/it] {'loss': 0.4619, 'grad_norm': 0.294562872158215, 'learning_rate': 1.892630113219921e-06, 'epoch': 0.72}
72%|███████▏ | 15956/22095 [27:09:16<32:36:11, 19.12s/it] {'loss': 0.3145, 'grad_norm': 0.6323404370467263, 'learning_rate': 1.8920559507698722e-06, 'epoch': 0.72}
72%|███████▏ | 15957/22095 [27:09:58<44:32:23, 26.12s/it] {'loss': 0.2851, 'grad_norm': 0.6202866373164587, 'learning_rate': 1.891481855099994e-06, 'epoch': 0.72}
72%|███████▏ | 15958/22095 [27:10:44<54:15:20, 31.83s/it] {'loss': 0.3197, 'grad_norm': 0.6657534635544718, 'learning_rate': 1.8909078262226237e-06, 'epoch': 0.72}
VC:s3://gui-agent/data_20250612/windows/images/calculator/free_task_20250608_105438/images/20250608_105448_5.png 2025-08-28 19:08:42.407812 load time: 1071.13 ms
72%|███████▏ | 15959/22095 [27:11:07<49:52:46, 29.26s/it] {'loss': 0.3175, 'grad_norm': 0.5681651570767629, 'learning_rate': 1.8903338641500967e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15960/22095 [27:11:16<39:40:55, 23.29s/it] {'loss': 0.4748, 'grad_norm': 0.2866490398022034, 'learning_rate': 1.889759968894745e-06, 'epoch': 0.72}
72%|███████▏ | 15961/22095 [27:11:19<29:23:55, 17.25s/it] {'loss': 0.3252, 'grad_norm': 0.6383201904904624, 'learning_rate': 1.889186140468897e-06, 'epoch': 0.72}
72%|███████▏ | 15962/22095 [27:11:23<22:18:27, 13.09s/it] {'loss': 0.2795, 'grad_norm': 0.5736735821452259, 'learning_rate': 1.8886123788848864e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15963/22095 [27:11:44<26:25:52, 15.52s/it] {'loss': 0.2914, 'grad_norm': 0.6218336748590672, 'learning_rate': 1.8880386841550385e-06, 'epoch': 0.72}
72%|███████▏ | 15964/22095 [27:12:26<39:51:22, 23.40s/it] {'loss': 0.327, 'grad_norm': 0.6702019724144967, 'learning_rate': 1.887465056291683e-06, 'epoch': 0.72}
72%|███████▏ | 15965/22095 [27:12:29<29:32:47, 17.35s/it] {'loss': 0.3026, 'grad_norm': 1.4128354899466162, 'learning_rate': 1.8868914953071444e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15966/22095 [27:13:28<50:49:08, 29.85s/it] {'loss': 0.2937, 'grad_norm': 0.619592327990379, 'learning_rate': 1.886318001213744e-06, 'epoch': 0.72}
72%|███████▏ | 15967/22095 [27:14:10<56:47:56, 33.37s/it] {'loss': 0.3358, 'grad_norm': 0.6439866145775998, 'learning_rate': 1.8857445740238073e-06, 'epoch': 0.72}
72%|███████▏ | 15968/22095 [27:14:14<41:51:48, 24.60s/it] {'loss': 0.3288, 'grad_norm': 0.6418633396022916, 'learning_rate': 1.8851712137496564e-06, 'epoch': 0.72}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
72%|███████▏ | 15969/22095 [27:14:37<41:08:08, 24.17s/it] {'loss': 0.2891, 'grad_norm': 0.5714028641956019, 'learning_rate': 1.8845979204036101e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (108853 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15970/22095 [27:15:17<49:06:52, 28.87s/it] {'loss': 0.3185, 'grad_norm': 0.5926099561173371, 'learning_rate': 1.8840246939979846e-06, 'epoch': 0.72}
72%|███████▏ | 15971/22095 [27:15:58<55:10:37, 32.44s/it] {'loss': 0.2876, 'grad_norm': 0.5924911218056633, 'learning_rate': 1.8834515345450977e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365009 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31750, 'image': 'vrdu_table_final_2/astro-ph.CO/d17bbbcc-2437-4525-9aed-f14fb507e2cb.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
72%|███████▏ | 15972/22095 [27:16:55<67:50:45, 39.89s/it] {'loss': 0.3069, 'grad_norm': 0.6234909875094362, 'learning_rate': 1.88287844205727e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15973/22095 [27:17:03<51:27:06, 30.26s/it] {'loss': 0.4633, 'grad_norm': 0.2745665548547292, 'learning_rate': 1.882305416546807e-06, 'epoch': 0.72}
72%|███████▏ | 15974/22095 [27:17:06<37:45:20, 22.21s/it] {'loss': 0.3584, 'grad_norm': 0.6551699035958258, 'learning_rate': 1.8817324580260254e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (66083 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105972 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15975/22095 [27:17:31<39:21:59, 23.16s/it] {'loss': 0.4772, 'grad_norm': 0.27671999389208507, 'learning_rate': 1.881159566507238e-06, 'epoch': 0.72}
VC:s3://gui/data_20250328/icon_canva/images/desktop_3840x2160_1743152929_canvas.png 2025-08-28 19:15:30.147624 load time: 1033.49 ms
72%|███████▏ | 15976/22095 [27:17:53<38:42:36, 22.77s/it] {'loss': 0.2879, 'grad_norm': 0.9831127283266593, 'learning_rate': 1.8805867420027529e-06, 'epoch': 0.72}
72%|███████▏ | 15977/22095 [27:18:14<37:31:54, 22.08s/it] {'loss': 0.3036, 'grad_norm': 0.6164403804911328, 'learning_rate': 1.880013984524876e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
72%|███████▏ | 15978/22095 [27:18:41<40:13:14, 23.67s/it] {'loss': 0.4742, 'grad_norm': 0.2706023179382082, 'learning_rate': 1.8794412940859186e-06, 'epoch': 0.72}
72%|███████▏ | 15979/22095 [27:19:24<49:54:01, 29.37s/it] {'loss': 0.294, 'grad_norm': 0.813015049690279, 'learning_rate': 1.8788686706981813e-06, 'epoch': 0.72}
72%|███████▏ | 15980/22095 [27:20:04<55:39:41, 32.77s/it] {'loss': 0.3002, 'grad_norm': 0.617435460194587, 'learning_rate': 1.8782961143739724e-06, 'epoch': 0.72}
72%|███████▏ | 15981/22095 [27:20:46<60:18:10, 35.51s/it] {'loss': 0.3329, 'grad_norm': 0.6091669825529811, 'learning_rate': 1.877723625125591e-06, 'epoch': 0.72}
72%|███████▏ | 15982/22095 [27:21:29<63:48:16, 37.58s/it] {'loss': 0.2673, 'grad_norm': 0.5936151512194228, 'learning_rate': 1.877151202965341e-06, 'epoch': 0.72}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396940 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63793, 'image': 'vrdu_table_final_2/astro-ph.EP/08b7d8c8-e9f8-480e-93b6-5808e7b8d918.png', 'image_wh': [[12, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}[t]{l}z\\end{tabular}\n```"}]}
72%|███████▏ | 15983/22095 [27:21:53<57:14:36, 33.72s/it] {'loss': 0.5012, 'grad_norm': 0.26325091324160144, 'learning_rate': 1.876578847905519e-06, 'epoch': 0.72}
72%|███████▏ | 15984/22095 [27:21:57<41:47:53, 24.62s/it] {'loss': 0.3388, 'grad_norm': 0.6278205249319658, 'learning_rate': 1.8760065599584266e-06, 'epoch': 0.72}
72%|███████▏ | 15985/22095 [27:22:38<50:22:27, 29.68s/it] {'loss': 0.3194, 'grad_norm': 0.6902459735771315, 'learning_rate': 1.8754343391363584e-06, 'epoch': 0.72}
72%|███████▏ | 15986/22095 [27:23:20<56:14:21, 33.14s/it] {'loss': 0.3023, 'grad_norm': 0.5960812307850535, 'learning_rate': 1.874862185451608e-06, 'epoch': 0.72}
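The tokenizer warnings above ("Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)") mean some packed samples exceed the 40960-token context; the "Rank 0: ... Truncating to 2759 with 1 samples." message shows the collator then cuts them down. A minimal sketch of that guard (a hypothetical helper, not the project's actual collator):

```python
def truncate_to_max_len(input_ids, max_len=40960):
    """Clamp a token-id sequence to the model's maximum context length.

    A sequence longer than max_len would index past the model's context
    limit (the 'will result in indexing errors' warning above), so it is
    cut before being fed to the model. Hypothetical helper; a real
    packing collator may instead drop whole samples from the batch to
    preserve conversation boundaries.
    """
    if len(input_ids) <= max_len:
        return input_ids
    return input_ids[:max_len]
```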
72%|███████▏ | 15987/22095 [27:23:59<59:40:31, 35.17s/it] {'loss': 0.2792, 'grad_norm': 0.6296054589179261, 'learning_rate': 1.8742900989164713e-06, 'epoch': 0.72}
Token indices sequence length is longer than the specified maximum sequence length for this model (45444 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44428 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52493 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80456 > 40960). Running this sequence through the model will result in indexing errors
72%|███████▏ | 15988/22095 [27:24:59<72:07:27, 42.52s/it] {'loss': 0.3203, 'grad_norm': 0.9689884059413011, 'learning_rate': 1.8737180795432425e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11223034 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
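The recurring pair "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" indicates the loader repairs conversations whose image-placeholder count disagrees with the number of attached images. A minimal sketch of such a repair (hypothetical helper and `<image>` placeholder string; the actual placeholder token and repair logic in data_qwen_2.py are not shown in this log):

```python
def fix_image_tokens(message, num_images, image_token="<image>"):
    """Make the placeholder count in a user message match its image count.

    Missing placeholders are prepended; surplus ones are removed from the
    end of the text. Hypothetical repair, mirroring the 'Fixed image
    tokens in the conversation' messages logged above.
    """
    count = message.count(image_token)
    if count < num_images:
        # Prepend the missing placeholders before the text.
        message = image_token * (num_images - count) + "\n" + message
    elif count > num_images:
        # Remove surplus placeholders, starting from the last one.
        for _ in range(count - num_images):
            idx = message.rfind(image_token)
            message = message[:idx] + message[idx + len(image_token):]
    return message
```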
Problematic sample: {'id': 956, 'image': 'MMMU_Pro/standard/test_955_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words.Note that you do not need to give a specific answer to the question.Here are the content and options of this question: In the following reaction , what mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: ['38.50 grams', '40.00 grams', '35.50 grams', '37.00 grams', '34.78 grams', ' 36.78 grams', '36.50 grams', ' 36.76 grams', ' 36.77 grams']"}, {'from': 'gpt', 'value': "## Introduction to Chemical Reactions and Stoichiometry\n\nChemical reactions involve the transformation of reactants into products. They are represented by balanced chemical equations that show the ratio in which substances react and form products. Understanding chemical reactions is crucial in many fields, including chemistry, biology, and environmental science.\n\n### Concepts of Chemical Equations\n\nA chemical equation is a symbolic representation of a chemical reaction. It consists of reactants (the starting substances) and products (the substances formed) separated by an arrow. 
\n\n#### Example:\n\\[ \\text{H}_2\\text{SO}_4 + \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + \\text{H}_2\\text{O} \\]\n\nHere, sulfuric acid (\\(\\text{H}_2\\text{SO}_4\\)) reacts with sodium hydroxide (\\(\\text{NaOH}\\)) to form sodium sulfate (\\(\\text{Na}_2\\text{SO}_4\\)) and water (\\(\\text{H}_2\\text{O}\\)).\n\n#### Balancing Chemical Equations\n\nTo obey the law of conservation of mass, chemical equations must be balanced. This implies that the number of atoms of each element must be the same on both sides of the equation. Balancing involves adjusting coefficients (the numbers before molecules or atoms) to achieve this equality.\n\n### Stoichiometry\n\nStoichiometry is the quantitative study of reactants and products in a chemical reaction. It allows us to predict how much of each substance is needed or produced in a reaction. \n\n#### Moles and Molar Mass\n\n- **Moles**: A mole is a unit that measures the amount of substance. It is equivalent to \\(6.022 \\times 10^{23}\\) entities (Avogadro's number).\n- **Molar Mass**: The mass of one mole of a substance, usually in grams per mole (g/mol). For instance, the molar mass of \\(\\text{H}_2\\text{SO}_4\\) is 98.08 g/mol and for \\(\\text{NaOH}\\), it is 40.00 g/mol.\n\n### Stoichiometric Calculations\n\nStoichiometric calculations involve using balanced equations to determine the mass or volume of reactants or products. Let’s look at an example based on the equation given above:\n\n1. **Balance the Equation**:\n \\[ \\text{H}_2\\text{SO}_4 + 2\\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2\\text{H}_2\\text{O} \\]\n\n2. **Identify Molar Ratios**:\n From the balanced equation, 1 mole of \\(\\text{H}_2\\text{SO}_4\\) reacts with 2 moles of \\(\\text{NaOH}\\).\n\n3. 
**Calculate Required Mass**:\n To find out how much \\(\\text{H}_2\\text{SO}_4\\) is needed to react with a certain amount of \\(\\text{NaOH}\\), we use the relation:\n \\[ \\text{Mass of substance} = \\text{Moles} \\times \\text{Molar Mass} \\]\n\n### Application in Real-Life and Industry\n\nUnderstanding stoichiometry and chemical reactions is crucial in fields such as pharmaceuticals for drug synthesis, agriculture for fertilizer manufacturing, and environmental science for pollution management.\n\n### Example of Stoichiometric Calculation\n\nTo illustrate, if we have 0.75 moles of \\(\\text{NaOH}\\):\n\n1. According to the reaction, 0.75 moles of \\(\\text{NaOH}\\) require 0.375 moles of \\(\\text{H}_2\\text{SO}_4\\) (because it takes half as many moles of \\(\\text{H}_2\\text{SO}_4\\) to completely react with \\(\\text{NaOH}\\)).\n\n2. Calculating the mass of \\(\\text{H}_2\\text{SO}_4\\):\n \\[ \\text{Mass} = 0.375 \\, \\text{moles} \\times 98.08 \\, \\text{g/mol} = 36.78 \\, \\text{grams} \\]\n\n### Acids, Bases, and Neutralization Reactions\n\n#### Acid-Base Reactions\n\n- **Acids**: Substances that donate protons (\\(H^+\\)) in a reaction, such as \\(\\text{H}_2\\text{SO}_4\\).\n- **Bases**: Substances that accept protons, such as \\(\\text{NaOH}\\).\n\nThe process of neutralization involves an acid and a base reacting to form water and a salt. In our case, \\(\\text{H}_2\\text{SO}_4\\) neutralizes \\(\\text{NaOH}\\).\n\n### Properties of Sulfuric Acid and Sodium Hydroxide\n\n#### Sulfuric Acid (\\(\\text{H}_2\\text{SO}_4\\))\n\n- **Properties**: Dense, oily liquid, strong acid, corrosive.\n- **Uses**: Manufacture of fertilizers, cleaning agents, and in petroleum refining.\n\n#### Sodium Hydroxide (\\(\\text{NaOH}\\))\n\n- **Properties**: White, crystalline solid, very soluble in water, strong base.\n- **Uses**: Paper production, soap manufacturing, water treatment.\n\n### Safety and Handling\n\nBoth substances are hazardous and require careful handling. 
Proper protective equipment and protocols must be followed to prevent skin burns and inhalation injuries.\n\n### Advanced Concepts\n\n#### Limiting Reactants\n\nIn a reaction, the limiting reactant is the substance that is completely consumed first, determining the amount of product formed. Understanding limiting reactants is essential for optimizing yields in industrial processes.\n\n#### Titration\n\nA laboratory technique that involves the gradual addition of one reactant from a burette to another until the reaction is complete. This technique is often used to determine the concentration of an unknown solution.\n\n### Examples and Practice Problems\n\n##### Example Problem:\nSuppose you have 50.0 grams of \\(\\text{NaOH}\\). How much \\(\\text{H}_2\\text{SO}_4\\) would be required for complete reaction?\n\n1. Convert grams of \\(\\text{NaOH}\\) to moles: 50.0 g \\(\\times \\frac{1 \\, \\text{mol}}{40.00 \\, \\text{g}}\\) = 1.25 mol.\n\n2. Use the mole ratio to find moles of \\(\\text{H}_2\\text{SO}_4\\): \n \\(\\frac{1.25 \\, \\text{mol NaOH}}{2} = 0.625 \\, \\text{mol H}_2\\text{SO}_4\\).\n\n3. Calculate mass of \\(\\text{H}_2\\text{SO}_4\\): 0.625 mol \\(\\times 98.08 \\, \\text{g/mol} = 61.30 \\, \\text{g}\\).\n\n### Conclusion\n\nUnderstanding the principles of chemical equations, stoichiometry, and acid-base reactions is fundamental to mastering chemistry. These concepts not only allow us to predict and quantify the results of reactions but also enable the practical application of chemistry in various industries and scientific research fields. 
As demonstrated in the example of \\(\\text{H}_2\\text{SO}_4\\) and \\(\\text{NaOH}\\), these principles form the cornerstone of chemical processes that drive technological advancements and innovation."}]}
72%|███████▏ | 15989/22095 [27:25:45<73:46:49, 43.50s/it] {'loss': 0.3106, 'grad_norm': 0.6232037537471353, 'learning_rate': 1.8731461273442097e-06, 'epoch': 0.72}
72%|███████▏ | 15990/22095 [27:26:10<64:15:43, 37.89s/it] {'loss': 0.2677, 'grad_norm': 0.6087724636557703, 'learning_rate': 1.8725742423316623e-06, 'epoch': 0.72}
VC:s3://ocr/coco/train2014/COCO_train2014_000000267463.jpg 2025-08-28 19:24:08.534716 load time: 1047.03 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/b5e5d00a58c45ed7edfd576c61e9f08e201b24b78ea05c2ca502f3cb56f04b08.png 2025-08-28 19:24:08.536033 load time: 1054.14 ms
72%|███████▏ | 15991/22095 [27:26:52<66:35:31, 39.27s/it] {'loss': 0.3131, 'grad_norm': 0.6207701067127781, 'learning_rate': 1.872002424517891e-06, 'epoch': 0.72}
72%|███████▏ | 15992/22095 [27:27:32<66:46:15, 39.39s/it] {'loss': 0.2783, 'grad_norm': 0.6289892487218232, 'learning_rate': 1.8714306739151782e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957887 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8722, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, C is a point on segment AB, AC=4, BC=6, and points M and N are the midpoints of segments AC and BC respectively. MN = ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]}
72%|███████▏ | 15993/22095 [27:28:15<68:28:12, 40.40s/it] {'loss': 0.2943, 'grad_norm': 0.6020901857016302, 'learning_rate': 1.8708589905358138e-06, 'epoch': 0.72}
72%|███████▏ | 15994/22095 [27:28:57<69:31:55, 41.03s/it] {'loss': 0.3065, 'grad_norm': 0.6692875016527511, 'learning_rate': 1.8702873743920774e-06, 'epoch': 0.72}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11259243 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7640, 'image': 'MMMU/Pharmacy/test_19_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze all the relevant background knowledge involved in this question and give an in-depth and detailed introduction to each relevant knowledge point. You need to provide a clear explanation of each knowledge point involved in the question and answer, including its concept, definition, practical application, importance and influence in related fields. 
Please provide appropriate examples to support the explanation of each knowledge point, just like the detailed explanation in the textbook. Make sure the answer is clearly structured, well-organized, and covers all important aspects of each knowledge point. The answer requires at least 2,000 words, and avoid directly answering the answer to the specific question, and focus on the analysis and explanation of background knowledge.\nHere are the content and options of this question:\nQuestion: Sulfuric acid reacts with sodium hydroxide according to the following:,What mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: [' 36.76 grams', ' 36.77 grams', ' 36.78 grams']"}, {'from': 'gpt', 'value': '### Introduction\n\nSulfuric acid (H₂SO₄) reacting with sodium hydroxide (NaOH) is an example of a neutralization reaction, a fundamental type of chemical reaction where an acid and a base react to form water and a salt. This example encapsulates several essential topics within the field of chemistry, including acid-base chemistry, stoichiometry, reaction types, and the practical applications of these reactions in various industrial and laboratory settings.\n\n---\n\n### 1. Acid-Base Chemistry\n\n**Concept and Definition:**\nAcid-base chemistry is a branch of chemistry that studies acids, bases, their properties, and how they interact in solutions. Acids are substances that donate hydrogen ions (H⁺) in aqueous solutions, while bases accept hydrogen ions. The reaction between an acid and a base often leads to the formation of water and a salt, known as a neutralization reaction.\n\n**Practical Application:**\nNeutralization reactions are fundamental in various industries. For example, in wastewater treatment, acids or bases are added to adjust the pH of the water to make it safe for discharge into the environment. 
In pharmaceuticals, maintaining the correct pH of solutions is crucial for drug stability and efficacy.\n\n**Importance and Influence:**\nUnderstanding acid-base reactions is vital for many scientific fields, including biochemistry, environmental science, and materials science. Acids and bases play critical roles in biological systems, from enzyme function to cellular respiration.\n\n**Example:**\nA classic example of an acid-base reaction is the neutralization of hydrochloric acid (HCl) with sodium hydroxide (NaOH) to form sodium chloride (NaCl) and water:\n\n\\[ \\text{HCl} + \\text{NaOH} \\rightarrow \\text{NaCl} + \\text{H}_2\\text{O} \\]\n\n---\n\n### 2. Stoichiometry\n\n**Concept and Definition:**\nStoichiometry is the area of chemistry that deals with the quantitative aspects of chemical reactions. It involves calculations that relate the quantities of reactants and products in a chemical reaction based on the balanced chemical equation.\n\n**Practical Application:**\nStoichiometry is used to determine how much of each reactant is needed to react completely without any leftover, which is critical in industrial processes to optimize efficiency and reduce waste. \n\n**Importance and Influence:**\nStoichiometry provides the foundation for understanding chemical reactivity and is fundamental in the lab for tasks such as calculating reagent concentrations, preparing solutions, and conducting titrations.\n\n**Example:**\nIn the reaction of hydrogen gas with oxygen gas to form water:\n\n\\[ 2\\text{H}_2 + \\text{O}_2 \\rightarrow 2\\text{H}_2\\text{O} \\]\n\nThe stoichiometry indicates that two moles of hydrogen gas react with one mole of oxygen gas to produce two moles of water.\n\n---\n\n### 3. Reaction Types\n\n**Concept and Definition:**\nChemical reactions can be classified into different types, such as synthesis, decomposition, single replacement, double replacement, and combustion reactions. 
The reaction between sulfuric acid and sodium hydroxide is a double replacement reaction, specifically a neutralization reaction.\n\n**Practical Application:**\nEach type of reaction has specific applications; for example, combustion reactions are harnessed to release energy in engines, while decomposition reactions are used in recycling industries to break down materials.\n\n**Importance and Influence:**\nIdentifying the type of reaction helps predict the products of unknown reactions and informs the conditions needed for the reaction to proceed efficiently.\n\n**Example:**\nA decomposition reaction is exemplified by the breakdown of potassium chlorate (KClO₃) to form potassium chloride (KCl) and oxygen gas (O₂):\n\n\\[ 2\\text{KClO}_3 \\rightarrow 2\\text{KCl} + 3\\text{O}_2 \\]\n\n---\n\n### 4. Neutralization Reactions\n\n**Concept and Definition:**\nA neutralization reaction occurs when an acid and a base react to form water and a salt. The process involves the transfer of protons from the acid to the base and is often exothermic, releasing heat.\n\n**Practical Application:**\nNeutralization reactions are used in everyday applications such as antacid tablets, which neutralize excess stomach acid. Additionally, they are applied in agriculture to adjust soil acidity.\n\n**Importance and Influence:**\nUnderstanding neutralization is crucial for designing reactions in industrial laboratories and for tasks like titrations in analytical chemistry.\n\n**Example:**\nThe reaction between acetic acid (CH₃COOH) and sodium bicarbonate (NaHCO₃) is a neutralization that produces sodium acetate, water, and carbon dioxide:\n\n\\[ \\text{CH}_3\\text{COOH} + \\text{NaHCO}_3 \\rightarrow \\text{CH}_3\\text{COONa} + \\text{H}_2\\text{O} + \\text{CO}_2 \\]\n\n---\n\n### 5. The Role of Sulfuric Acid\n\n**Concept and Definition:**\nSulfuric acid is a strong acid known for its high reactivity and diverse industrial uses. 
It is classified as a strong acid because it completely dissociates in water, releasing hydrogen ions.\n\n**Practical Application:**\nSulfuric acid is used in manufacturing fertilizers, refining petroleum, processing metals, and producing chemicals such as hydrochloric acid and nitric acid.\n\n**Importance and Influence:**\nSulfuric acid is crucial in the chemical industry as an intermediate in producing hundreds of different products. \n\n**Example:**\nThe production of phosphoric acid, used in fertilizers, involves reacting sulfuric acid with rock phosphate.\n\n---\n\n### 6. The Role of Sodium Hydroxide\n\n**Concept and Definition:**\nSodium hydroxide, also known as lye or caustic soda, is a strong base with significant industrial importance. It dissociates completely in water to produce hydroxide ions.\n\n**Practical Application:**\nSodium hydroxide is used in soap making, paper manufacturing, water treatment, and as a drain cleaner.\n\n**Importance and Influence:**\nAs a caustic reagent, sodium hydroxide is critical in many synthesis and purification processes in chemistry.\n\n**Example:**\nIn the saponification process, sodium hydroxide reacts with fats or oils to produce soap:\n\n\\[ \\text{fat/oil} + \\text{NaOH} \\rightarrow \\text{soap} + \\text{glycerol} \\]\n\n---\n\n### 7. 
Balanced Chemical Equations\n\n**Concept and Definition:**\nA balanced chemical equation accurately represents the conservation of mass by showing an equal number of each type of atom on both sides of the equation.\n\n**Practical Application:**\nBalancing equations is essential for stoichiometric calculations, ensuring that the proportions of reactants and products are correct in chemical reactions.\n\n**Importance and Influence:**\nBalanced equations are fundamental to understanding reaction mechanisms and calculating yields in chemical processes.\n\n**Example:**\nThe balanced equation for the reaction of sulfuric acid with sodium hydroxide is:\n\n\\[ \\text{H}_2\\text{SO}_4 + 2\\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2\\text{H}_2\\text{O} \\]\n\nIn this equation, the atoms of each element are balanced, maintaining the conservation of mass.\n\n---\n\n### Summary\n\nThe reaction between sulfuric acid and sodium hydroxide is not only an example of a neutralization reaction but also a vital demonstration of various core principles in chemistry. It ties together concepts like acid-base interaction, stoichiometry, balanced equations, and the practical roles of acids and bases in industrial and laboratory settings. Understanding this reaction provides insights into broader applications and conveys essential knowledge that underlies numerous scientific and industrial processes. 
\n\nA careful study of each knowledge point discussed is crucial for students and professionals who wish to gain a deeper comprehension of chemical reactions and their significance in the real world.'}]} 72%|███████▏ | 15995/22095 [27:29:19<59:54:40, 35.36s/it] {'loss': 0.3056, 'grad_norm': 0.7282381530703047, 'learning_rate': 1.869715825496255e-06, 'epoch': 0.72} 72%|███████▏ | 15995/22095 [27:29:19<59:54:40, 35.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (67626 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47235 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117616 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88425 > 40960). 
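The chemistry sample logged above explains the stoichiometry of H2SO4 + 2NaOH → Na2SO4 + 2H2O but deliberately stops short of the arithmetic. As a minimal worked sketch (illustrative function names, standard atomic masses assumed; not part of the logged sample), the mole ratio and molar mass give the required acid mass directly:

```python
# Hedged sketch of the stoichiometry behind the logged question.
# Balanced equation (from the sample): H2SO4 + 2 NaOH -> Na2SO4 + 2 H2O

ATOMIC_MASS = {"H": 1.008, "S": 32.06, "O": 16.00}  # g/mol, standard values

def molar_mass_h2so4():
    # H2SO4 = 2 H + 1 S + 4 O
    return 2 * ATOMIC_MASS["H"] + ATOMIC_MASS["S"] + 4 * ATOMIC_MASS["O"]

def mass_h2so4_for_naoh(mol_naoh):
    # 1 mol H2SO4 neutralizes 2 mol NaOH, so the acid amount is half.
    return (mol_naoh / 2) * molar_mass_h2so4()

print(round(mass_h2so4_for_naoh(0.75), 2))  # → 36.78
```

With M(H₂SO₄) ≈ 98.08 g/mol, 0.75 mol NaOH requires 0.375 mol of acid, i.e. about 36.78 g, consistent with the last option listed in the question.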
Running this sequence through the model will result in indexing errors 72%|███████▏ | 15996/22095 [27:29:47<55:48:34, 32.94s/it] {'loss': 0.4801, 'grad_norm': 0.29601027516313794, 'learning_rate': 1.8691443438606239e-06, 'epoch': 0.72} 72%|███████▏ | 15996/22095 [27:29:47<55:48:34, 32.94s/it] 72%|███████▏ | 15997/22095 [27:30:10<50:51:26, 30.02s/it] {'loss': 0.3043, 'grad_norm': 0.6945105051331881, 'learning_rate': 1.8685729294974668e-06, 'epoch': 0.72} 72%|███████▏ | 15997/22095 [27:30:10<50:51:26, 30.02s/it] 72%|███████▏ | 15998/22095 [27:30:31<46:09:57, 27.26s/it] {'loss': 0.3075, 'grad_norm': 0.618491726853251, 'learning_rate': 1.86800158241906e-06, 'epoch': 0.72} 72%|███████▏ | 15998/22095 [27:30:31<46:09:57, 27.26s/it]VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_51198.png 2025-08-28 19:28:29.382176 load time: 1030.14 ms VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_771342.png 2025-08-28 19:28:29.380320 load time: 1041.99 ms Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 72%|███████▏ | 15999/22095 [27:30:52<43:22:01, 25.61s/it] {'loss': 0.2814, 'grad_norm': 0.6025137434539652, 'learning_rate': 1.8674303026376783e-06, 'epoch': 0.72} 72%|███████▏ | 15999/22095 [27:30:52<43:22:01, 25.61s/it] 72%|███████▏ | 16000/22095 [27:31:14<41:19:59, 24.41s/it] {'loss': 0.251, 'grad_norm': 0.59573508665197, 'learning_rate': 1.866859090165598e-06, 'epoch': 0.72} 72%|███████▏ | 16000/22095 [27:31:14<41:19:59, 24.41s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. 
Gradients will be None warnings.warn( 72%|███████▏ | 16001/22095 [27:33:44<105:18:56, 62.21s/it] {'loss': 0.329, 'grad_norm': 0.7315932204963735, 'learning_rate': 1.8662879450150956e-06, 'epoch': 0.72} 72%|███████▏ | 16001/22095 [27:33:44<105:18:56, 62.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11307296 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 874, 'image': 'chat/476.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is the chat content in the image referring to?'}, {'from': 'gpt', 'value': 'The chat content in the image shows a conversation about how to change the text on a meme sticker.\n\nThe conversation goes as follows:\n\n1. One person sends an image containing a green frog graphic (which looks like an internet meme sticker) with the added caption "你点开" ("you click it open").\n2. The other person replies with three dots "...", indicating confusion about what is shown.\n3. The person who sent the sticker explains, "So this is just him having saved a few stickers in advance...?"\n4. The other person explains, "No, the text can be changed however you like", and then says, "look".\n5. 
The sticker sender sends another sticker containing the frog graphic, with the text changed to "可以随便定" ("it can be set freely").\n\nSummary:\nOne side of the conversation demonstrates a sticker app in which the text on a sticker image can be customized and edited, so that the same sticker can carry whatever text the conversation requires.'}]} 72%|███████▏ | 16002/22095 [27:34:27<95:18:58, 56.32s/it] {'loss': 0.4074, 'grad_norm': 0.6784928963482864, 'learning_rate': 1.8657168671984404e-06, 'epoch': 0.72} 72%|███████▏ | 16002/22095 [27:34:27<95:18:58, 56.32s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [198, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8523232 in VC:s3://internvl-moe-sft-data/. Exception: Image size [198, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 79479, 'image': 'vrdu_texteq/astro-ph.CO/3c0f3baf-9402-4ee8-8d25-a2d4deb1ef9c.png', 'image_wh': [[198, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and $0$ elsewhere.'}]} 72%|███████▏ | 16003/22095 [27:35:09<88:05:32, 52.06s/it] {'loss': 0.2876, 'grad_norm': 0.6790960623785016, 'learning_rate': 1.8651458567279018e-06, 'epoch': 0.72} 72%|███████▏ | 16003/22095 [27:35:09<88:05:32, 52.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50487 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67348 > 40960). 
Running this sequence through the model will result in indexing errors 72%|███████▏ | 16004/22095 [27:35:36<75:06:18, 44.39s/it] {'loss': 0.4789, 'grad_norm': 0.2862915989405643, 'learning_rate': 1.8645749136157526e-06, 'epoch': 0.72} 72%|███████▏ | 16004/22095 [27:35:36<75:06:18, 44.39s/it] 72%|███████▏ | 16005/22095 [27:35:40<54:37:48, 32.29s/it] {'loss': 0.285, 'grad_norm': 0.6123843036858047, 'learning_rate': 1.8640040378742585e-06, 'epoch': 0.72} 72%|███████▏ | 16005/22095 [27:35:40<54:37:48, 32.29s/it] 72%|███████▏ | 16006/22095 [27:35:44<40:24:19, 23.89s/it] {'loss': 0.3159, 'grad_norm': 0.6121730728790465, 'learning_rate': 1.8634332295156848e-06, 'epoch': 0.72} 72%|███████▏ | 16006/22095 [27:35:44<40:24:19, 23.89s/it] 72%|███████▏ | 16007/22095 [27:36:44<58:56:10, 34.85s/it] {'loss': 0.2564, 'grad_norm': 0.5802479016860167, 'learning_rate': 1.8628624885522994e-06, 'epoch': 0.72} 72%|███████▏ | 16007/22095 [27:36:44<58:56:10, 34.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 72%|███████▏ | 16008/22095 [27:36:54<46:22:05, 27.42s/it] {'loss': 0.4929, 'grad_norm': 0.26500677475920026, 'learning_rate': 1.8622918149963626e-06, 'epoch': 0.72} 72%|███████▏ | 16008/22095 [27:36:54<46:22:05, 27.42s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/1873367841954866_20.png 2025-08-28 19:34:53.240949 load time: 1115.13 ms 72%|███████▏ | 16009/22095 [27:37:17<43:41:20, 25.84s/it] {'loss': 0.3143, 'grad_norm': 0.6523572564394401, 'learning_rate': 1.8617212088601395e-06, 'epoch': 0.72} 72%|███████▏ | 16009/22095 [27:37:17<43:41:20, 25.84s/it] 72%|███████▏ | 16010/22095 [27:37:20<32:23:35, 19.16s/it] {'loss': 0.2882, 'grad_norm': 0.6026168399191, 'learning_rate': 1.8611506701558874e-06, 'epoch': 0.72} 72%|███████▏ | 16010/22095 [27:37:20<32:23:35, 19.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 
Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957197 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8032, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, C is a point on segment AB, D is the midpoint of segment BC, AB=20, AD=14; the length of AC is ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]} 72%|███████▏ | 16011/22095 [27:37:42<33:48:42, 20.01s/it] {'loss': 0.3491, 'grad_norm': 0.6382442474781357, 'learning_rate': 1.8605801988958688e-06, 'epoch': 0.72} 72%|███████▏ | 16011/22095 [27:37:42<33:48:42, 20.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 72%|███████▏ | 16012/22095 [27:38:09<37:19:28, 22.09s/it] {'loss': 0.4603, 'grad_norm': 0.26119721971741644, 'learning_rate': 1.8600097950923379e-06, 'epoch': 0.72} 72%|███████▏ | 16012/22095 [27:38:09<37:19:28, 22.09s/it] 72%|███████▏ | 16013/22095 [27:38:13<28:08:30, 16.66s/it] {'loss': 0.3025, 'grad_norm': 0.6042488032958495, 'learning_rate': 1.8594394587575548e-06, 'epoch': 0.72} 72%|███████▏ | 16013/22095 [27:38:13<28:08:30, 16.66s/it] 72%|███████▏ | 16014/22095 [27:38:54<40:27:49, 23.95s/it] {'loss': 0.2698, 'grad_norm': 0.5995169075539502, 'learning_rate': 1.858869189903772e-06, 'epoch': 0.72} 72%|███████▏ | 16014/22095 [27:38:54<40:27:49, 23.95s/it] 72%|███████▏ | 16015/22095 [27:39:18<40:30:02, 23.98s/it] {'loss': 
0.313, 'grad_norm': 0.5959838329113539, 'learning_rate': 1.8582989885432412e-06, 'epoch': 0.72} 72%|███████▏ | 16015/22095 [27:39:18<40:30:02, 23.98s/it]VC:s3://internvl-moe-sft-data/vrdu_table_final_2/astro-ph.CO/05131766-15fe-4ab5-ad22-8ec826177241.png 2025-08-28 19:37:16.892032 load time: 1032.08 ms 72%|███████▏ | 16016/22095 [27:39:22<30:07:53, 17.84s/it] {'loss': 0.3019, 'grad_norm': 0.6811871875801533, 'learning_rate': 1.8577288546882167e-06, 'epoch': 0.72} 72%|███████▏ | 16016/22095 [27:39:22<30:07:53, 17.84s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [442, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8491466 in VC:s3://internvl-moe-sft-data/. Exception: Image size [442, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 135063, 'image': 'vrdu_texteq/astro-ph.CO/be13e843-db3a-4bed-b0ab-b93cb2dea548.png', 'image_wh': [[442, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $\\tau$ denotes the conformal time.'}]} VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_653777.png 2025-08-28 19:37:20.426183 load time: 1033.6 ms 72%|███████▏ | 16017/22095 [27:39:44<32:21:43, 19.17s/it] {'loss': 0.2613, 'grad_norm': 0.598166448273423, 'learning_rate': 1.8571587883509495e-06, 'epoch': 0.72} 72%|███████▏ | 16017/22095 [27:39:44<32:21:43, 19.17s/it]VC:s3://st2pj/20250222/images/multi_modal/agent_data/rico/dataset/image/24572.jpg 2025-08-28 19:37:42.674894 load time: 1022.5 ms 72%|███████▏ | 16018/22095 [27:39:48<24:30:59, 14.52s/it] {'loss': 0.2858, 'grad_norm': 0.6119582544966328, 'learning_rate': 1.8565887895436874e-06, 'epoch': 0.72} 72%|███████▏ | 16018/22095 [27:39:48<24:30:59, 14.52s/it] 73%|███████▎ | 16019/22095 [27:39:50<18:36:36, 11.03s/it] {'loss': 0.3108, 'grad_norm': 0.6892323236298378, 'learning_rate': 1.856018858278677e-06, 'epoch': 0.73} 73%|███████▎ | 16019/22095 [27:39:50<18:36:36, 11.03s/it] 73%|███████▎ | 16020/22095 [27:40:12<23:58:16, 14.21s/it] {'loss': 0.2651, 'grad_norm': 0.5374278359807803, 'learning_rate': 1.8554489945681663e-06, 'epoch': 0.73} 73%|███████▎ | 16020/22095 [27:40:12<23:58:16, 14.21s/it] 73%|███████▎ | 16021/22095 [27:40:34<27:44:36, 16.44s/it] {'loss': 0.2739, 'grad_norm': 0.6362418049415203, 'learning_rate': 1.8548791984243975e-06, 'epoch': 0.73} 73%|███████▎ | 16021/22095 [27:40:34<27:44:36, 16.44s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8372316 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 39086, 'image': 'vrdu_table_final_2/astro-ph.CO/2cf54fce-e109-4598-a818-b20232e01756.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 73%|███████▎ | 16022/22095 [27:40:57<30:59:23, 18.37s/it] {'loss': 0.2955, 'grad_norm': 0.6069602912118967, 'learning_rate': 1.854309469859617e-06, 'epoch': 0.73} 73%|███████▎ | 16022/22095 [27:40:57<30:59:23, 18.37s/it] 73%|███████▎ | 16023/22095 [27:41:19<33:06:16, 19.63s/it] {'loss': 0.3109, 'grad_norm': 0.7050148269586889, 'learning_rate': 1.853739808886063e-06, 'epoch': 0.73} 73%|███████▎ | 16023/22095 [27:41:19<33:06:16, 19.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/windows_7/ieframe_42024.png 2025-08-28 19:39:17.942801 load time: 1045.6 ms VC:s3://multi-modal/laion_gpt4v/images/d8fff47215bcc884a6a8f4ae0214084b.jpg 2025-08-28 19:39:17.943203 load time: 1038.01 ms 73%|███████▎ | 16024/22095 [27:42:04<45:49:11, 27.17s/it] {'loss': 0.3129, 'grad_norm': 1.9981279830274465, 'learning_rate': 1.8531702155159792e-06, 'epoch': 0.73} 73%|███████▎ | 16024/22095 [27:42:04<45:49:11, 27.17s/it] 73%|███████▎ | 16025/22095 [27:43:03<61:46:54, 36.64s/it] {'loss': 0.3675, 'grad_norm': 
0.7003608172407041, 'learning_rate': 1.8526006897616011e-06, 'epoch': 0.73} 73%|███████▎ | 16025/22095 [27:43:03<61:46:54, 36.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49449 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77057 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52133 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120469 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16026/22095 [27:43:25<54:31:02, 32.34s/it] {'loss': 0.272, 'grad_norm': 0.6301314495559056, 'learning_rate': 1.8520312316351692e-06, 'epoch': 0.73} 73%|███████▎ | 16026/22095 [27:43:25<54:31:02, 32.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50335 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16027/22095 [27:43:28<39:51:17, 23.64s/it] {'loss': 0.2866, 'grad_norm': 0.6483469398299987, 'learning_rate': 1.8514618411489176e-06, 'epoch': 0.73} 73%|███████▎ | 16027/22095 [27:43:28<39:51:17, 23.64s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 9047938 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\nAs shown in the figure, point C is a point on segment AB, point D is the midpoint of segment BC, AB=10, AC=6; the length of segment AD is ()\nA. 6\nB. 2\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 73%|███████▎ | 16028/22095 [27:43:31<29:25:49, 17.46s/it] {'loss': 0.3393, 'grad_norm': 0.6696378094528813, 'learning_rate': 1.85089251831508e-06, 'epoch': 0.73} 73%|███████▎ | 16028/22095 [27:43:31<29:25:49, 17.46s/it] 73%|███████▎ | 16029/22095 [27:43:54<31:49:45, 18.89s/it] {'loss': 0.2847, 'grad_norm': 0.5718861514539335, 'learning_rate': 1.85032326314589e-06, 'epoch': 0.73} 73%|███████▎ | 16029/22095 [27:43:54<31:49:45, 18.89s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946835 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 69988, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\nAs shown in the figure, C and D are two points on segment AB, CD=3cm, M is the midpoint of AC, N is the midpoint of DB, AB=9.8cm; the length of segment MN equals ()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6.4cm'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16030/22095 [27:44:18<34:40:32, 20.58s/it] {'loss': 0.3192, 'grad_norm': 0.6306886641099659, 'learning_rate': 1.8497540756535814e-06, 'epoch': 0.73} 73%|███████▎ | 16030/22095 [27:44:18<34:40:32, 20.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16031/22095 [27:44:22<26:06:53, 15.50s/it] {'loss': 0.2775, 'grad_norm': 0.6723825164117041, 'learning_rate': 1.8491849558503827e-06, 'epoch': 0.73} 73%|███████▎ | 16031/22095 [27:44:22<26:06:53, 15.50s/it] 73%|███████▎ | 16032/22095 [27:44:25<19:55:50, 11.83s/it] {'loss': 0.3138, 'grad_norm': 0.589946999511413, 'learning_rate': 1.8486159037485202e-06, 'epoch': 0.73} 73%|███████▎ | 16032/22095 [27:44:25<19:55:50, 11.83s/it] 73%|███████▎ | 16033/22095 [27:44:29<16:00:19, 9.50s/it] {'loss': 0.2836, 'grad_norm': 0.6564239072718545, 'learning_rate': 1.848046919360225e-06, 'epoch': 0.73} 73%|███████▎ | 16033/22095 [27:44:29<16:00:19, 9.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (85919 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67350 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48875 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131080 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16034/22095 [27:44:57<25:28:24, 15.13s/it] {'loss': 0.4662, 'grad_norm': 0.30486315624182836, 'learning_rate': 1.8474780026977196e-06, 'epoch': 0.73} 73%|███████▎ | 16034/22095 [27:44:57<25:28:24, 15.13s/it] 73%|███████▎ | 16035/22095 [27:45:07<22:34:03, 13.41s/it] {'loss': 0.4488, 'grad_norm': 0.28477761396096907, 'learning_rate': 1.8469091537732315e-06, 'epoch': 0.73} 73%|███████▎ | 16035/22095 [27:45:07<22:34:03, 13.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 364, but got module 1 73%|███████▎ | 16036/22095 [27:45:11<17:43:21, 10.53s/it] {'loss': 0.2711, 'grad_norm': 0.6452670364264114, 'learning_rate': 1.846340372598981e-06, 'epoch': 0.73} 73%|███████▎ | 16036/22095 [27:45:11<17:43:21, 10.53s/it] 73%|███████▎ | 16037/22095 [27:45:36<25:07:55, 14.93s/it] {'loss': 0.2993, 'grad_norm': 0.6356012564902356, 'learning_rate': 1.8457716591871887e-06, 'epoch': 0.73} 73%|███████▎ | 16037/22095 [27:45:36<25:07:55, 14.93s/it] 73%|███████▎ | 16038/22095 [27:46:17<38:17:54, 22.76s/it] {'loss': 0.3009, 'grad_norm': 0.7041584132129752, 'learning_rate': 1.8452030135500765e-06, 'epoch': 0.73} 73%|███████▎ | 16038/22095 [27:46:17<38:17:54, 22.76s/it] 73%|███████▎ | 16039/22095 [27:46:57<46:50:25, 27.84s/it] {'loss': 0.2898, 'grad_norm': 0.606881481776386, 'learning_rate': 1.8446344356998635e-06, 'epoch': 0.73} 73%|███████▎ | 16039/22095 [27:46:57<46:50:25, 27.84s/it] 73%|███████▎ | 16040/22095 [27:46:59<34:11:55, 20.33s/it] {'loss': 0.2707, 'grad_norm': 0.6058541708204863, 'learning_rate': 1.8440659256487658e-06, 'epoch': 0.73} 73%|███████▎ | 16040/22095 [27:46:59<34:11:55, 20.33s/it] 
73%|███████▎ | 16041/22095 [27:47:03<25:36:05, 15.22s/it] {'loss': 0.2898, 'grad_norm': 0.6760264236093753, 'learning_rate': 1.843497483408997e-06, 'epoch': 0.73} 73%|███████▎ | 16041/22095 [27:47:03<25:36:05, 15.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (89975 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16042/22095 [27:47:12<22:41:14, 13.49s/it] {'loss': 0.4867, 'grad_norm': 0.28486960915330306, 'learning_rate': 1.8429291089927742e-06, 'epoch': 0.73} 73%|███████▎ | 16042/22095 [27:47:12<22:41:14, 13.49s/it] 73%|███████▎ | 16043/22095 [27:47:15<17:34:14, 10.45s/it] {'loss': 0.3212, 'grad_norm': 0.6270785078332332, 'learning_rate': 1.8423608024123086e-06, 'epoch': 0.73} 73%|███████▎ | 16043/22095 [27:47:15<17:34:14, 10.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396945 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 63798, 'image': 'vrdu_table_final_2/astro-ph.EP/9c3f5a5c-1896-4798-b26e-1e997584de5d.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[t]{l}$e_z$\\end{tabular}\n```"}]} 73%|███████▎ | 16044/22095 [27:47:40<24:55:48, 14.83s/it] {'loss': 0.2915, 'grad_norm': 0.560317095729978, 'learning_rate': 1.8417925636798101e-06, 'epoch': 0.73} 73%|███████▎ | 16044/22095 [27:47:41<24:55:48, 14.83s/it] 73%|███████▎ | 16045/22095 [27:47:44<19:00:10, 11.31s/it] {'loss': 0.3181, 'grad_norm': 0.9136672675320224, 'learning_rate': 1.8412243928074897e-06, 'epoch': 0.73} 73%|███████▎ | 16045/22095 [27:47:44<19:00:10, 11.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58326 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16046/22095 [27:47:46<14:43:09, 8.76s/it] {'loss': 0.2882, 'grad_norm': 0.6204588826654109, 'learning_rate': 1.840656289807557e-06, 'epoch': 0.73} 73%|███████▎ | 16046/22095 [27:47:46<14:43:09, 8.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75718 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16047/22095 [27:48:12<23:18:25, 13.87s/it] {'loss': 0.306, 'grad_norm': 0.620459957704693, 'learning_rate': 1.8400882546922177e-06, 'epoch': 0.73} 73%|███████▎ | 16047/22095 [27:48:12<23:18:25, 13.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60292 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65740 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16048/22095 [27:48:15<17:44:42, 10.56s/it] {'loss': 0.3312, 'grad_norm': 0.6024461681337429, 'learning_rate': 1.8395202874736752e-06, 'epoch': 0.73} 73%|███████▎ | 16048/22095 [27:48:15<17:44:42, 10.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72988 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16049/22095 [27:48:18<13:49:32, 8.23s/it] {'loss': 0.299, 'grad_norm': 0.5795091380255545, 'learning_rate': 1.8389523881641363e-06, 'epoch': 0.73} 73%|███████▎ | 16049/22095 [27:48:18<13:49:32, 8.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16050/22095 [27:48:28<14:36:20, 8.70s/it] {'loss': 0.5161, 'grad_norm': 0.3210511536301892, 'learning_rate': 1.8383845567758008e-06, 'epoch': 0.73} 73%|███████▎ | 16050/22095 [27:48:28<14:36:20, 8.70s/it] 73%|███████▎ | 16051/22095 [27:48:32<12:15:31, 7.30s/it] {'loss': 0.2658, 'grad_norm': 0.5599891840744048, 'learning_rate': 1.8378167933208729e-06, 'epoch': 0.73} 73%|███████▎ | 16051/22095 [27:48:32<12:15:31, 7.30s/it] 73%|███████▎ | 16052/22095 [27:48:35<10:20:37, 6.16s/it] {'loss': 0.2641, 'grad_norm': 0.6129156924252082, 'learning_rate': 1.837249097811548e-06, 'epoch': 0.73} 73%|███████▎ | 16052/22095 [27:48:35<10:20:37, 6.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16053/22095 [27:48:39<9:00:32, 5.37s/it] {'loss': 0.3307, 'grad_norm': 0.7396036670714075, 'learning_rate': 1.8366814702600288e-06, 'epoch': 0.73} 73%|███████▎ | 16053/22095 [27:48:39<9:00:32, 5.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16054/22095 [27:48:45<9:36:15, 5.72s/it] 
{'loss': 0.4751, 'grad_norm': 0.30804189245711033, 'learning_rate': 1.836113910678507e-06, 'epoch': 0.73} 73%|███████▎ | 16054/22095 [27:48:45<9:36:15, 5.72s/it] 73%|███████▎ | 16055/22095 [27:48:48<8:17:06, 4.94s/it] {'loss': 0.3435, 'grad_norm': 0.6450496615316457, 'learning_rate': 1.835546419079182e-06, 'epoch': 0.73} 73%|███████▎ | 16055/22095 [27:48:48<8:17:06, 4.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16056/22095 [27:48:58<10:33:21, 6.29s/it] {'loss': 0.4613, 'grad_norm': 0.2796091456412559, 'learning_rate': 1.8349789954742459e-06, 'epoch': 0.73} 73%|███████▎ | 16056/22095 [27:48:58<10:33:21, 6.29s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16057/22095 [27:49:02<9:17:32, 5.54s/it] {'loss': 0.3306, 'grad_norm': 0.64890080958883, 'learning_rate': 1.8344116398758888e-06, 'epoch': 0.73} 73%|███████▎ | 16057/22095 [27:49:02<9:17:32, 5.54s/it] 73%|███████▎ | 16058/22095 [27:49:05<8:12:49, 4.90s/it] {'loss': 0.3081, 'grad_norm': 0.575461620410372, 'learning_rate': 1.8338443522963028e-06, 'epoch': 0.73} 73%|███████▎ | 16058/22095 [27:49:05<8:12:49, 4.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16059/22095 [27:49:13<9:50:38, 5.87s/it] {'loss': 0.4966, 'grad_norm': 0.2883379797548653, 'learning_rate': 1.8332771327476795e-06, 'epoch': 0.73} 73%|███████▎ | 16059/22095 [27:49:13<9:50:38, 5.87s/it] 73%|███████▎ | 16060/22095 [27:49:16<8:34:23, 5.11s/it] {'loss': 0.3059, 'grad_norm': 0.5764107107006161, 'learning_rate': 1.832709981242205e-06, 'epoch': 0.73} 73%|███████▎ | 16060/22095 [27:49:16<8:34:23, 5.11s/it] 73%|███████▎ | 16061/22095 [27:49:40<17:41:50, 10.56s/it] {'loss': 0.3088, 'grad_norm': 0.6613462650278219, 'learning_rate': 1.8321428977920635e-06, 'epoch': 0.73} 73%|███████▎ | 16061/22095 [27:49:40<17:41:50, 10.56s/it] 73%|███████▎ | 16062/22095 [27:49:42<13:45:10, 8.21s/it] 
{'loss': 0.2816, 'grad_norm': 0.5971963076408902, 'learning_rate': 1.8315758824094432e-06, 'epoch': 0.73} 73%|███████▎ | 16062/22095 [27:49:42<13:45:10, 8.21s/it] 73%|███████▎ | 16063/22095 [27:49:47<11:56:03, 7.12s/it] {'loss': 0.3588, 'grad_norm': 0.6458655637611862, 'learning_rate': 1.8310089351065246e-06, 'epoch': 0.73} 73%|███████▎ | 16063/22095 [27:49:47<11:56:03, 7.12s/it] 73%|███████▎ | 16064/22095 [27:49:50<9:43:49, 5.81s/it] {'loss': 0.2893, 'grad_norm': 0.6029614831205884, 'learning_rate': 1.8304420558954933e-06, 'epoch': 0.73} 73%|███████▎ | 16064/22095 [27:49:50<9:43:49, 5.81s/it] 73%|███████▎ | 16065/22095 [27:49:54<8:49:35, 5.27s/it] {'loss': 0.3295, 'grad_norm': 0.6523860978116632, 'learning_rate': 1.8298752447885254e-06, 'epoch': 0.73} 73%|███████▎ | 16065/22095 [27:49:54<8:49:35, 5.27s/it] 73%|███████▎ | 16066/22095 [27:49:57<7:43:45, 4.62s/it] {'loss': 0.3268, 'grad_norm': 0.6085824327664999, 'learning_rate': 1.829308501797804e-06, 'epoch': 0.73} 73%|███████▎ | 16066/22095 [27:49:57<7:43:45, 4.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047102 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 16cm\nB. 32cm\nC. 4cm\nD. 
8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 73%|███████▎ | 16067/22095 [27:50:00<7:09:18, 4.27s/it] {'loss': 0.3163, 'grad_norm': 0.5839321712913418, 'learning_rate': 1.8287418269355035e-06, 'epoch': 0.73} 73%|███████▎ | 16067/22095 [27:50:00<7:09:18, 4.27s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73642 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70960 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83554 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121756 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42795 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16068/22095 [27:50:04<6:41:49, 4.00s/it] {'loss': 0.2845, 'grad_norm': 0.5934730562217898, 'learning_rate': 1.8281752202138032e-06, 'epoch': 0.73} 73%|███████▎ | 16068/22095 [27:50:04<6:41:49, 4.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44773 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16069/22095 [27:50:08<6:40:49, 3.99s/it] {'loss': 0.3397, 'grad_norm': 0.5862163177911442, 'learning_rate': 1.8276086816448751e-06, 'epoch': 0.73} 73%|███████▎ | 16069/22095 [27:50:08<6:40:49, 3.99s/it] 73%|███████▎ | 16070/22095 [27:50:11<6:08:06, 3.67s/it] {'loss': 0.3131, 'grad_norm': 0.6138676841077318, 'learning_rate': 1.8270422112408919e-06, 'epoch': 0.73} 73%|███████▎ | 16070/22095 [27:50:11<6:08:06, 3.67s/it] 73%|███████▎ | 16071/22095 [27:50:13<5:43:17, 3.42s/it] {'loss': 0.2898, 'grad_norm': 0.6599671578492269, 'learning_rate': 1.8264758090140267e-06, 'epoch': 0.73} 73%|███████▎ | 16071/22095 [27:50:13<5:43:17, 3.42s/it] 73%|███████▎ | 16072/22095 [27:50:17<5:37:53, 3.37s/it] {'loss': 0.2862, 'grad_norm': 0.6756821861159864, 'learning_rate': 1.8259094749764532e-06, 'epoch': 0.73} 73%|███████▎ | 16072/22095 [27:50:17<5:37:53, 3.37s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8941396 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 64549, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 6.4cm\nB. 6.8cm\nC. 7cm\nD. 
5.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 73%|███████▎ | 16073/22095 [27:50:21<5:57:23, 3.56s/it] {'loss': 0.2819, 'grad_norm': 0.5798011713799885, 'learning_rate': 1.8253432091403329e-06, 'epoch': 0.73} 73%|███████▎ | 16073/22095 [27:50:21<5:57:23, 3.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348838 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 15508, 'image': 'vrdu_table_final_2/astro-ph.CO/f6d1a78d-67a9-44de-9518-d51915a69f7d.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$S_{5}$\\end{tabular}\n```"}]} 73%|███████▎ | 16074/22095 [27:50:25<6:07:17, 3.66s/it] {'loss': 0.3195, 'grad_norm': 0.5905087514687498, 'learning_rate': 1.824777011517837e-06, 'epoch': 0.73} 73%|███████▎ | 16074/22095 [27:50:25<6:07:17, 3.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16075/22095 [27:50:28<5:56:56, 3.56s/it] {'loss': 0.259, 'grad_norm': 0.6026175779356564, 'learning_rate': 1.8242108821211324e-06, 'epoch': 0.73} 73%|███████▎ | 16075/22095 [27:50:28<5:56:56, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, 
but got module 364 73%|███████▎ | 16076/22095 [27:50:37<8:54:07, 5.32s/it] {'loss': 0.4743, 'grad_norm': 0.2949314560163198, 'learning_rate': 1.8236448209623825e-06, 'epoch': 0.73} 73%|███████▎ | 16076/22095 [27:50:37<8:54:07, 5.32s/it] 73%|███████▎ | 16077/22095 [27:50:47<11:09:26, 6.67s/it] {'loss': 0.4569, 'grad_norm': 0.28458791237527803, 'learning_rate': 1.8230788280537487e-06, 'epoch': 0.73} 73%|███████▎ | 16077/22095 [27:50:47<11:09:26, 6.67s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 73%|███████▎ | 16078/22095 [27:50:51<9:42:31, 5.81s/it] {'loss': 0.2971, 'grad_norm': 0.5643217470901566, 'learning_rate': 1.8225129034073951e-06, 'epoch': 0.73} 73%|███████▎ | 16078/22095 [27:50:51<9:42:31, 5.81s/it] 73%|███████▎ | 16079/22095 [27:50:55<8:52:18, 5.31s/it] {'loss': 0.3062, 'grad_norm': 0.9389791664087129, 'learning_rate': 1.8219470470354784e-06, 'epoch': 0.73} 73%|███████▎ | 16079/22095 [27:50:55<8:52:18, 5.31s/it] 73%|███████▎ | 16080/22095 [27:50:59<8:06:56, 4.86s/it] {'loss': 0.2948, 'grad_norm': 0.6886383299953388, 'learning_rate': 1.8213812589501611e-06, 'epoch': 0.73} 73%|███████▎ | 16080/22095 [27:50:59<8:06:56, 4.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (97457 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46545 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55227 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16081/22095 [27:51:02<7:09:07, 4.28s/it] {'loss': 0.2632, 'grad_norm': 0.5850857135648151, 'learning_rate': 1.8208155391635963e-06, 'epoch': 0.73} 73%|███████▎ | 16081/22095 [27:51:02<7:09:07, 4.28s/it] 73%|███████▎ | 16082/22095 [27:51:05<6:39:45, 3.99s/it] {'loss': 0.3238, 'grad_norm': 0.6042575374580038, 'learning_rate': 1.8202498876879432e-06, 'epoch': 0.73} 73%|███████▎ | 16082/22095 [27:51:05<6:39:45, 3.99s/it] 73%|███████▎ | 16083/22095 [27:51:08<6:14:48, 3.74s/it] {'loss': 0.2896, 'grad_norm': 0.6928404697901881, 'learning_rate': 1.8196843045353519e-06, 'epoch': 0.73} 73%|███████▎ | 16083/22095 [27:51:08<6:14:48, 3.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46794 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43269 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47156 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (118017 > 40960). 
Running this sequence through the model will result in indexing errors
73%|███████▎ | 16084/22095 [27:51:12<6:04:33, 3.64s/it] {'loss': 0.2536, 'grad_norm': 0.5341234808814107, 'learning_rate': 1.8191187897179796e-06, 'epoch': 0.73}
73%|███████▎ | 16085/22095 [27:51:16<6:10:20, 3.70s/it] {'loss': 0.3156, 'grad_norm': 0.7459069229430887, 'learning_rate': 1.8185533432479751e-06, 'epoch': 0.73}
73%|███████▎ | 16086/22095 [27:51:18<5:46:24, 3.46s/it] {'loss': 0.3109, 'grad_norm': 0.6091769615205567, 'learning_rate': 1.8179879651374866e-06, 'epoch': 0.73}
73%|███████▎ | 16087/22095 [27:51:22<5:52:06, 3.52s/it] {'loss': 0.3079, 'grad_norm': 0.7380141827131348, 'learning_rate': 1.8174226553986635e-06, 'epoch': 0.73}
73%|███████▎ | 16088/22095 [27:51:26<5:59:17, 3.59s/it] {'loss': 0.3053, 'grad_norm': 0.714612464179002, 'learning_rate': 1.816857414043655e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
73%|███████▎ | 16089/22095 [27:51:32<7:27:38, 4.47s/it] {'loss': 0.4745, 'grad_norm': 0.2860785773149628, 'learning_rate': 1.8162922410846046e-06, 'epoch': 0.73}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
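Each "Image size [...] is too small. Minimum size is 28." failure comes from a sample whose `image_wh` has a side under 28 pixels (e.g. `[[14, 23]]` here, or `[[973, 23]]` below, where only the height is too small). A sketch of that validation, assuming the minimum applies to the shorter side — an illustrative re-creation, not the repo's actual check:

```python
MIN_SIDE = 28  # minimum image side length enforced by the data loader, per the log

def check_image_size(width, height, min_side=MIN_SIDE):
    """Reject images too small for the vision encoder to patchify.

    Raises ValueError in the same style as the log's
    "Image size [...] is too small. Minimum size is 28." failures.
    """
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_side}."
        )
```

Running such a filter once over the dataset (and dropping or upscaling offending samples) would avoid the repeated `[Try #0] Failed to fetch sample` retries seen during training.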
[Try #0] Failed to fetch sample 8343717 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10367, 'image': 'vrdu_table_final_2/astro-ph.CO/7e030d42-dd5c-4679-9581-2aa63a5454e9.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 73%|███████▎ | 16090/22095 [27:51:36<7:02:27, 4.22s/it] {'loss': 0.303, 'grad_norm': 0.6039689558987191, 'learning_rate': 1.8157271365336536e-06, 'epoch': 0.73} 73%|███████▎ | 16090/22095 [27:51:36<7:02:27, 4.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (127833 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51654 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68135 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16091/22095 [27:51:40<6:47:45, 4.07s/it] {'loss': 0.2985, 'grad_norm': 0.6083549302228266, 'learning_rate': 1.815162100402949e-06, 'epoch': 0.73} 73%|███████▎ | 16091/22095 [27:51:40<6:47:45, 4.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8917250 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 40403, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 6\nB. 8\nC. 10\nD. 12'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AC=\\frac{1}{4}AB,AD=\\frac{1}{2}AB而CD=AD-AC∴CD=\\frac{1}{2}AB-\\frac{1}{4}AB=2∴\\frac{1}{4}AB=2∴AB=8'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [973, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8457405 in VC:s3://internvl-moe-sft-data/. Exception: Image size [973, 23, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 84729, 'image': 'vrdu_texteq/astro-ph.CO/b1c49f33-a1eb-4683-87e7-bbdd9763f9a6.png', 'image_wh': [[973, 23]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'The mean values are $\\alpha = -1.13\\pm0.21$\n for haloes and $\\alpha = -1.2\\pm0.19$ for relics.'}]} 73%|███████▎ | 16092/22095 [27:51:49<9:28:10, 5.68s/it] {'loss': 0.465, 'grad_norm': 0.2660101982454626, 'learning_rate': 1.8145971327046274e-06, 'epoch': 0.73} 73%|███████▎ | 16092/22095 [27:51:49<9:28:10, 5.68s/it] 73%|███████▎ | 16093/22095 [27:51:53<8:24:09, 5.04s/it] {'loss': 0.2818, 'grad_norm': 0.6313514296218542, 'learning_rate': 1.814032233450832e-06, 'epoch': 0.73} 73%|███████▎ | 16093/22095 [27:51:53<8:24:09, 5.04s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63456 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49784 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16094/22095 [27:51:57<7:51:36, 4.72s/it] {'loss': 0.3437, 'grad_norm': 0.6567990816362825, 'learning_rate': 1.8134674026536968e-06, 'epoch': 0.73} 73%|███████▎ | 16094/22095 [27:51:57<7:51:36, 4.72s/it] 73%|███████▎ | 16095/22095 [27:52:00<7:08:11, 4.28s/it] {'loss': 0.2579, 'grad_norm': 0.5475313855383052, 'learning_rate': 1.8129026403253624e-06, 'epoch': 0.73} 73%|███████▎ | 16095/22095 [27:52:00<7:08:11, 4.28s/it] 73%|███████▎ | 16096/22095 [27:52:03<6:44:08, 4.04s/it] {'loss': 0.3002, 'grad_norm': 0.5950986141428766, 'learning_rate': 1.8123379464779606e-06, 'epoch': 0.73} 73%|███████▎ | 16096/22095 [27:52:03<6:44:08, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16097/22095 [27:52:08<7:02:51, 4.23s/it] {'loss': 0.4713, 'grad_norm': 0.27592219794883777, 'learning_rate': 1.8117733211236277e-06, 'epoch': 0.73} 73%|███████▎ | 16097/22095 [27:52:08<7:02:51, 4.23s/it] 73%|███████▎ | 16098/22095 [27:52:12<6:41:13, 4.01s/it] {'loss': 0.3383, 'grad_norm': 0.6526013033225897, 'learning_rate': 1.811208764274494e-06, 'epoch': 0.73} 73%|███████▎ | 16098/22095 [27:52:12<6:41:13, 4.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16099/22095 [27:52:21<9:26:40, 5.67s/it] {'loss': 0.4811, 'grad_norm': 0.3223904043897807, 'learning_rate': 1.8106442759426884e-06, 'epoch': 0.73} 73%|███████▎ | 16099/22095 [27:52:21<9:26:40, 5.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74815 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16100/22095 [27:52:25<8:38:10, 5.19s/it] {'loss': 0.2741, 'grad_norm': 0.5550383719292121, 'learning_rate': 1.8100798561403426e-06, 'epoch': 0.73} 73%|███████▎ | 16100/22095 [27:52:25<8:38:10, 5.19s/it] 73%|███████▎ | 16101/22095 [27:52:29<7:42:07, 4.63s/it] {'loss': 0.2984, 'grad_norm': 0.6392507722711632, 'learning_rate': 1.8095155048795865e-06, 'epoch': 0.73} 73%|███████▎ | 16101/22095 [27:52:29<7:42:07, 4.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16102/22095 [27:52:35<8:41:45, 5.22s/it] {'loss': 0.4739, 'grad_norm': 0.2879523619249691, 'learning_rate': 1.8089512221725402e-06, 'epoch': 0.73} 73%|███████▎ | 16102/22095 [27:52:35<8:41:45, 5.22s/it] 73%|███████▎ | 16103/22095 [27:52:38<7:42:40, 4.63s/it] {'loss': 0.2789, 'grad_norm': 0.6498409838546698, 'learning_rate': 1.8083870080313315e-06, 'epoch': 0.73} 73%|███████▎ | 16103/22095 [27:52:38<7:42:40, 4.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41010 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68110 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88234 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65011 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16104/22095 [27:52:42<7:20:53, 4.42s/it] {'loss': 0.3114, 'grad_norm': 0.6382210882940718, 'learning_rate': 1.8078228624680854e-06, 'epoch': 0.73} 73%|███████▎ | 16104/22095 [27:52:42<7:20:53, 4.42s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [375, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8479093 in VC:s3://internvl-moe-sft-data/. Exception: Image size [375, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 79129, 'image': 'vrdu_texteq/astro-ph.CO/041096e8-bd58-4e2b-aa66-06706e5b53b5.png', 'image_wh': [[375, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'where $\\tau$ is the normalized time'}]} 73%|███████▎ | 16105/22095 [27:52:47<7:15:14, 4.36s/it] {'loss': 0.277, 'grad_norm': 0.6468102133095937, 'learning_rate': 1.807258785494922e-06, 'epoch': 0.73} 73%|███████▎ | 16105/22095 [27:52:47<7:15:14, 4.36s/it] 73%|███████▎ | 16106/22095 [27:52:49<6:31:27, 3.92s/it] {'loss': 0.2999, 'grad_norm': 0.6125188426779778, 'learning_rate': 1.8066947771239597e-06, 'epoch': 0.73} 73%|███████▎ | 16106/22095 [27:52:49<6:31:27, 3.92s/it] 73%|███████▎ | 16107/22095 [27:52:53<6:08:54, 3.70s/it] {'loss': 0.2838, 'grad_norm': 0.6198896839628587, 'learning_rate': 1.8061308373673208e-06, 'epoch': 0.73} 73%|███████▎ | 16107/22095 [27:52:53<6:08:54, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16108/22095 [27:53:02<9:08:44, 5.50s/it] {'loss': 0.492, 
'grad_norm': 0.28230401094755525, 'learning_rate': 1.8055669662371194e-06, 'epoch': 0.73} 73%|███████▎ | 16108/22095 [27:53:02<9:08:44, 5.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8882179 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 5332, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,已知点M为AB段中点,N为AM段中点,满足AN:Mn=1:2,若AN=2cm,则AB段=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12cm'}]} 73%|███████▎ | 16109/22095 [27:53:06<8:14:23, 4.96s/it] {'loss': 0.286, 'grad_norm': 0.6089025119666203, 'learning_rate': 1.8050031637454746e-06, 'epoch': 0.73} 73%|███████▎ | 16109/22095 [27:53:06<8:14:23, 4.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65019 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121832 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66208 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16110/22095 [27:53:10<7:33:43, 4.55s/it] {'loss': 0.2955, 'grad_norm': 0.6294693049743567, 'learning_rate': 1.8044394299044976e-06, 'epoch': 0.73} 73%|███████▎ | 16110/22095 [27:53:10<7:33:43, 4.55s/it] 73%|███████▎ | 16111/22095 [27:53:13<6:57:07, 4.18s/it] {'loss': 0.2899, 'grad_norm': 0.7226863549811137, 'learning_rate': 1.8038757647263045e-06, 'epoch': 0.73} 73%|███████▎ | 16111/22095 [27:53:13<6:57:07, 4.18s/it]Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8409781 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 11977, 'image': 'vrdu_table_final_2/astro-ph.CO/cadcff51-d710-4e6b-9e6f-bdc4134dde33.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]}
73%|███████▎ | 16112/22095 [27:53:22<9:08:25, 5.50s/it] {'loss': 0.4737, 'grad_norm': 0.31058843361683514, 'learning_rate': 1.803312168223003e-06, 'epoch': 0.73}
73%|███████▎ | 16113/22095 [27:53:25<8:10:35, 4.92s/it] {'loss': 0.3407, 'grad_norm': 0.6455846866744789, 'learning_rate': 1.8027486404067075e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16114/22095 [27:53:28<7:18:15, 4.40s/it] {'loss': 0.2939, 'grad_norm': 0.5926280243786574, 'learning_rate': 1.8021851812895235e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16115/22095 [27:53:31<6:31:19, 3.93s/it] {'loss': 0.269, 'grad_norm': 0.6670643403778769, 'learning_rate': 1.8016217908835575e-06, 'epoch': 0.73}
73%|███████▎ | 16116/22095 [27:53:35<6:26:23, 3.88s/it] {'loss': 0.316, 'grad_norm': 0.5872139058201896, 'learning_rate': 1.8010584692009158e-06, 'epoch': 0.73}
73%|███████▎ | 16117/22095 [27:53:38<5:55:50, 3.57s/it] {'loss': 0.3087, 'grad_norm': 0.6360687494862899, 'learning_rate': 1.8004952162537043e-06, 'epoch': 0.73}
Invalidate trace cache @ step
2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047594 in VC:s3://multi-modal/UniGeo/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]} 73%|███████▎ | 16118/22095 [27:53:43<6:53:04, 4.15s/it] {'loss': 0.4696, 'grad_norm': 0.27153954274586384, 'learning_rate': 1.7999320320540242e-06, 'epoch': 0.73} 73%|███████▎ | 16118/22095 [27:53:43<6:53:04, 4.15s/it] 73%|███████▎ | 16119/22095 [27:53:47<6:50:38, 4.12s/it] {'loss': 0.3101, 'grad_norm': 0.7052951096372572, 'learning_rate': 1.799368916613975e-06, 'epoch': 0.73} 73%|███████▎ | 16119/22095 [27:53:47<6:50:38, 4.12s/it] 73%|███████▎ | 16120/22095 [27:53:51<6:26:01, 3.88s/it] {'loss': 0.3768, 'grad_norm': 0.7958306372804045, 'learning_rate': 1.7988058699456596e-06, 'epoch': 0.73} 73%|███████▎ | 16120/22095 [27:53:51<6:26:01, 3.88s/it] 73%|███████▎ | 16121/22095 [27:53:54<6:07:27, 3.69s/it] {'loss': 0.2838, 'grad_norm': 0.5730055696897941, 'learning_rate': 1.7982428920611722e-06, 'epoch': 0.73} 73%|███████▎ | 16121/22095 [27:53:54<6:07:27, 3.69s/it] 73%|███████▎ | 16122/22095 [27:53:57<5:51:37, 3.53s/it] {'loss': 0.2892, 'grad_norm': 0.5679127683681903, 'learning_rate': 1.7976799829726138e-06, 'epoch': 0.73} 
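The "Rank 0: Number of image tokens 0 does not match number of images 1 / Rank 0: Fixed image tokens in the conversation" pairs indicate the loader reconciling placeholder tokens in the conversation text with the number of attached images. A sketch of that repair, assuming a `<image>` placeholder string (the dumped samples above appear to have the placeholder stripped, so both the token and the fix-up logic here are assumptions, not the repo's actual code):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; not visible in the stripped dumps above

def fix_image_tokens(text, num_images):
    """Make the placeholder count in a conversation match its image count.

    Hypothetical version of the repair behind the log's
    "Fixed image tokens in the conversation" messages: strip any existing
    placeholders, then prepend exactly one per attached image.
    """
    if text.count(IMAGE_TOKEN) == num_images:
        return text
    stripped = text.replace(IMAGE_TOKEN, "").lstrip("\n")
    return (IMAGE_TOKEN + "\n") * num_images + stripped
```

This handles both mismatch directions seen in the log (0 tokens with 1 image, and 2 tokens with 1 image) by normalizing rather than patching incrementally.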
73%|███████▎ | 16122/22095 [27:53:57<5:51:37, 3.53s/it] 73%|███████▎ | 16123/22095 [27:54:00<5:46:28, 3.48s/it] {'loss': 0.2859, 'grad_norm': 0.6729493635220757, 'learning_rate': 1.7971171426920753e-06, 'epoch': 0.73} 73%|███████▎ | 16123/22095 [27:54:00<5:46:28, 3.48s/it] 73%|███████▎ | 16124/22095 [27:54:04<5:59:13, 3.61s/it] {'loss': 0.3034, 'grad_norm': 0.6069928623620314, 'learning_rate': 1.796554371231654e-06, 'epoch': 0.73} 73%|███████▎ | 16124/22095 [27:54:04<5:59:13, 3.61s/it] 73%|███████▎ | 16125/22095 [27:54:07<5:37:12, 3.39s/it] {'loss': 0.2854, 'grad_norm': 0.5807258567124326, 'learning_rate': 1.7959916686034395e-06, 'epoch': 0.73} 73%|███████▎ | 16125/22095 [27:54:07<5:37:12, 3.39s/it] 73%|███████▎ | 16126/22095 [27:54:11<5:52:44, 3.55s/it] {'loss': 0.257, 'grad_norm': 0.8648799347898458, 'learning_rate': 1.7954290348195248e-06, 'epoch': 0.73} 73%|███████▎ | 16126/22095 [27:54:11<5:52:44, 3.55s/it] 73%|███████▎ | 16127/22095 [27:54:15<5:51:13, 3.53s/it] {'loss': 0.3324, 'grad_norm': 0.620623586958228, 'learning_rate': 1.7948664698919987e-06, 'epoch': 0.73} 73%|███████▎ | 16127/22095 [27:54:15<5:51:13, 3.53s/it] 73%|███████▎ | 16128/22095 [27:54:18<5:47:20, 3.49s/it] {'loss': 0.2806, 'grad_norm': 0.6149741888961061, 'learning_rate': 1.794303973832946e-06, 'epoch': 0.73} 73%|███████▎ | 16128/22095 [27:54:18<5:47:20, 3.49s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8953368 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 4203, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 无法确定\nB. 1cm\nC. 4cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 73%|███████▎ | 16129/22095 [27:54:21<5:37:07, 3.39s/it] {'loss': 0.3367, 'grad_norm': 0.627170473824644, 'learning_rate': 1.7937415466544556e-06, 'epoch': 0.73} 73%|███████▎ | 16129/22095 [27:54:21<5:37:07, 3.39s/it] 73%|███████▎ | 16130/22095 [27:54:24<5:36:44, 3.39s/it] {'loss': 0.2592, 'grad_norm': 0.6527986768715455, 'learning_rate': 1.7931791883686155e-06, 'epoch': 0.73} 73%|███████▎ | 16130/22095 [27:54:24<5:36:44, 3.39s/it] 73%|███████▎ | 16131/22095 [27:54:29<5:59:00, 3.61s/it] {'loss': 0.3038, 'grad_norm': 0.6122081032264333, 'learning_rate': 1.7926168989875027e-06, 'epoch': 0.73} 73%|███████▎ | 16131/22095 [27:54:29<5:59:00, 3.61s/it] 73%|███████▎ | 16132/22095 [27:54:31<5:36:37, 3.39s/it] {'loss': 0.3087, 'grad_norm': 0.8831182658752256, 'learning_rate': 1.7920546785232013e-06, 'epoch': 0.73} 73%|███████▎ | 16132/22095 [27:54:31<5:36:37, 3.39s/it] 73%|███████▎ | 16133/22095 [27:54:35<5:29:17, 3.31s/it] {'loss': 0.2888, 'grad_norm': 1.3518827731634866, 'learning_rate': 1.7914925269877947e-06, 'epoch': 0.73} 73%|███████▎ | 16133/22095 [27:54:35<5:29:17, 3.31s/it] 73%|███████▎ | 16134/22095 [27:54:38<5:34:42, 3.37s/it] {'loss': 0.2892, 'grad_norm': 0.6225243673416847, 'learning_rate': 1.790930444393359e-06, 'epoch': 0.73} 73%|███████▎ | 16134/22095 [27:54:38<5:34:42, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75943 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57383 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16135/22095 [27:54:42<5:36:08, 3.38s/it] {'loss': 0.284, 'grad_norm': 0.8839277255860258, 'learning_rate': 1.790368430751971e-06, 'epoch': 0.73} 73%|███████▎ | 16135/22095 [27:54:42<5:36:08, 3.38s/it] 73%|███████▎ | 16136/22095 [27:54:44<5:23:50, 3.26s/it] {'loss': 0.2914, 'grad_norm': 0.6256720305902387, 'learning_rate': 1.789806486075707e-06, 'epoch': 0.73} 73%|███████▎ | 16136/22095 [27:54:45<5:23:50, 3.26s/it] 73%|███████▎ | 16137/22095 [27:54:48<5:28:59, 3.31s/it] {'loss': 0.2669, 'grad_norm': 0.929858129804423, 'learning_rate': 1.7892446103766448e-06, 'epoch': 0.73} 73%|███████▎ | 16137/22095 [27:54:48<5:28:59, 3.31s/it] 73%|███████▎ | 16138/22095 [27:54:51<5:11:37, 3.14s/it] {'loss': 0.2845, 'grad_norm': 0.613979034701072, 'learning_rate': 1.7886828036668541e-06, 'epoch': 0.73} 73%|███████▎ | 16138/22095 [27:54:51<5:11:37, 3.14s/it] 73%|███████▎ | 16139/22095 [27:54:54<5:30:17, 3.33s/it] {'loss': 0.3454, 'grad_norm': 0.7011755943506092, 'learning_rate': 1.7881210659584059e-06, 'epoch': 0.73} 73%|███████▎ | 16139/22095 [27:54:54<5:30:17, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16140/22095 [27:55:02<7:39:58, 4.63s/it] {'loss': 0.4831, 'grad_norm': 0.28666942994097444, 'learning_rate': 1.787559397263373e-06, 'epoch': 0.73} 73%|███████▎ | 16140/22095 [27:55:02<7:39:58, 4.63s/it] 73%|███████▎ | 16141/22095 [27:55:07<7:35:53, 4.59s/it] {'loss': 0.3135, 'grad_norm': 0.5808081092214722, 'learning_rate': 1.7869977975938207e-06, 'epoch': 0.73} 73%|███████▎ | 16141/22095 [27:55:07<7:35:53, 4.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 
1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8360817 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 27545, 'image': 'vrdu_table_final_2/astro-ph.CO/45ac1912-0466-4f85-ab60-bdad93e00871.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 73%|███████▎ | 16142/22095 [27:55:10<7:04:24, 4.28s/it] {'loss': 0.2843, 'grad_norm': 0.6514388854146006, 'learning_rate': 1.7864362669618197e-06, 'epoch': 0.73} 73%|███████▎ | 16142/22095 [27:55:10<7:04:24, 4.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16143/22095 [27:55:20<10:04:41, 6.10s/it] {'loss': 0.45, 'grad_norm': 0.27221879864624515, 'learning_rate': 1.7858748053794334e-06, 'epoch': 0.73} 73%|███████▎ | 16143/22095 [27:55:21<10:04:41, 6.10s/it] 73%|███████▎ | 16144/22095 [27:55:25<9:26:08, 5.71s/it] {'loss': 0.2486, 'grad_norm': 0.6348420393974701, 'learning_rate': 1.7853134128587246e-06, 'epoch': 0.73} 73%|███████▎ | 16144/22095 [27:55:25<9:26:08, 5.71s/it] 73%|███████▎ | 16145/22095 [27:55:28<8:04:18, 4.88s/it] {'loss': 0.2917, 'grad_norm': 0.6297590527294329, 'learning_rate': 1.7847520894117571e-06, 'epoch': 0.73} 73%|███████▎ | 16145/22095 [27:55:28<8:04:18, 4.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87753 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103166 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16146/22095 [27:55:32<7:29:14, 4.53s/it] {'loss': 0.247, 'grad_norm': 0.6166517237990424, 'learning_rate': 1.7841908350505938e-06, 'epoch': 0.73} 73%|███████▎ | 16146/22095 [27:55:32<7:29:14, 4.53s/it] 73%|███████▎ | 16147/22095 [27:55:36<7:19:18, 4.43s/it] {'loss': 0.3184, 'grad_norm': 0.6476230729976626, 'learning_rate': 1.7836296497872934e-06, 'epoch': 0.73} 73%|███████▎ | 16147/22095 [27:55:36<7:19:18, 4.43s/it] 73%|███████▎ | 16148/22095 [27:55:39<6:28:06, 3.92s/it] {'loss': 0.3087, 'grad_norm': 0.6107349964845618, 'learning_rate': 1.7830685336339114e-06, 'epoch': 0.73} 73%|███████▎ | 16148/22095 [27:55:39<6:28:06, 3.92s/it] 73%|███████▎ | 16149/22095 [27:55:43<6:21:58, 3.85s/it] {'loss': 0.3058, 'grad_norm': 0.6243457657114342, 'learning_rate': 1.7825074866025089e-06, 'epoch': 0.73} 73%|███████▎ | 16149/22095 [27:55:43<6:21:58, 3.85s/it] 73%|███████▎ | 16150/22095 [27:55:46<6:09:50, 3.73s/it] {'loss': 0.2918, 'grad_norm': 0.609189981925863, 'learning_rate': 1.7819465087051363e-06, 'epoch': 0.73} 73%|███████▎ | 16150/22095 [27:55:46<6:09:50, 3.73s/it] 73%|███████▎ | 16151/22095 [27:55:49<5:51:07, 3.54s/it] {'loss': 0.2945, 'grad_norm': 0.7824741621821414, 'learning_rate': 1.7813855999538516e-06, 'epoch': 0.73} 73%|███████▎ | 16151/22095 [27:55:49<5:51:07, 3.54s/it] 73%|███████▎ | 16152/22095 [27:55:52<5:43:42, 3.47s/it] {'loss': 0.3111, 'grad_norm': 0.6048509705744186, 'learning_rate': 1.7808247603607037e-06, 'epoch': 0.73} 73%|███████▎ | 16152/22095 [27:55:52<5:43:42, 3.47s/it] 73%|███████▎ | 16153/22095 [27:55:55<5:27:47, 3.31s/it] {'loss': 0.3529, 'grad_norm': 0.6128880269982051, 'learning_rate': 1.780263989937746e-06, 'epoch': 0.73} 73%|███████▎ | 16153/22095 [27:55:55<5:27:47, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16154/22095 [27:56:06<9:03:32, 5.49s/it] {'loss': 0.4761, 'grad_norm': 0.2896886702152505, 
'learning_rate': 1.7797032886970255e-06, 'epoch': 0.73} 73%|███████▎ | 16154/22095 [27:56:06<9:03:32, 5.49s/it] 73%|███████▎ | 16155/22095 [27:56:09<7:53:51, 4.79s/it] {'loss': 0.2465, 'grad_norm': 0.5650019294174237, 'learning_rate': 1.779142656650592e-06, 'epoch': 0.73} 73%|███████▎ | 16155/22095 [27:56:09<7:53:51, 4.79s/it] 73%|███████▎ | 16156/22095 [27:56:13<7:22:24, 4.47s/it] {'loss': 0.2917, 'grad_norm': 0.6835006757504737, 'learning_rate': 1.7785820938104908e-06, 'epoch': 0.73} 73%|███████▎ | 16156/22095 [27:56:13<7:22:24, 4.47s/it] 73%|███████▎ | 16157/22095 [27:56:16<6:40:35, 4.05s/it] {'loss': 0.2823, 'grad_norm': 0.6490444548301334, 'learning_rate': 1.778021600188765e-06, 'epoch': 0.73} 73%|███████▎ | 16157/22095 [27:56:16<6:40:35, 4.05s/it] 73%|███████▎ | 16158/22095 [27:56:20<6:29:53, 3.94s/it] {'loss': 0.3177, 'grad_norm': 0.5865118446140698, 'learning_rate': 1.7774611757974597e-06, 'epoch': 0.73} 73%|███████▎ | 16158/22095 [27:56:20<6:29:53, 3.94s/it] 73%|███████▎ | 16159/22095 [27:56:22<5:58:02, 3.62s/it] {'loss': 0.3035, 'grad_norm': 0.5935785923704926, 'learning_rate': 1.7769008206486198e-06, 'epoch': 0.73} 73%|███████▎ | 16159/22095 [27:56:22<5:58:02, 3.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 7792302 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': '13642', 'image': '29581.jpg', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\nI am responding to the following question using the image as a reference: \nCan you describe the background and setting of the image in detail? \nHere is my answer: \nThe image is a graphical representation of a geometric line segment labeled as "**m**." On this line segment, there are four distinct points marked with black dots which divide the line into parts.\n\nThe following points are labeled along the line:\n1. **A**: This is the leftmost point.\n2. **D**: This is the next point to the right of A.\n3. **E**: This follows D along the line towards the right.\n4. **F**: This is positioned to the right of E and is likely the rightmost point.\n\nEach of these labeled points (A, D, E, and F) appears to be equidistant from its neighboring points, forming what looks like a linear arrangement.\n\nSince the image represents a geometric line with points labeled, let’s determine the relationships and properties among these points on the line.\n\nChain of Thought (CoT) Analysis:\n1. **Linear Arrangement**: The points are arranged on a single straight line, suggesting that they are collinear.\n2. 
**Distance**: If the segments A-D, D-E, and E-F are of equal length, the image provides a visual demonstration of a line subdivided into equal segments.\n\n### Geometry Insight:\n- **Collinearity**: All points lying on line segment **m** imply they share a common straight line.\n- **Intervals** (Potential Equal Segmentation): The regular intervals between points indicate that distances AD, DE, and EF may be equal, but this cannot be confirmed solely through visual analysis without explicit measurements.\n- **Origin**: The point labeled **D** might be considered significant, potentially acting as a reference or midpoint in certain contexts if AD = DE = EF.\n\n### Possible Question and Answer Scenario:\n**Question**: If the length of the line segment A to F is 18 units and D represents the midpoint of A and E, what are the lengths of each segment AD, DE, and EF?\n\n**Answer using CoT**:\n1. The total line segment AF = 18 units.\n2. Assuming D is the midpoint between A and E:\n - Segment AD = Segment DE since D is the midpoint.\n3. Since line segment DE is equal to segment AF and there are three divisions from A to F (AD, DE, EF), each segment would be equal.\n4. Thus, dividing the total length by the number of equal segments:\n - \\\\( AD = DE = EF = \\\\frac{18 \\\\text{ units}}{3} = 6 \\\\text{ units} \\\\).\n5. Therefore, each segment AD, DE, and EF is 6 units long.\n\nThis coherent description and logical analysis provide a detailed understanding of the geometry depicted in the image and follow the requirements to assist in image-related questions. \nKindly correct my answer based on the image, and if it\'s wrong, please explain the reason.'}, {'from': 'gpt', 'value': 'Your description and analysis are thorough and demonstrate a clear understanding of geometric concepts related to line segments, collinearity, and equal segmentation. However, there are discrepancies between the answer you wrote and the image description you provided later. 
The points and line segment label differ in the two descriptions.\n\nIn the answer you wrote:\n- Line segment is labeled as "**m**."\n- Points are labeled as "**A**," "**D**," "**E**," and "**F**."\n\nIn the image description provided later:\n- Line segment is labeled as "**l**."\n- Points are labeled as "**A**," "**O**," "**B**," and "**C**."\n\nSince your request is to point out errors without revealing the correct description, here are the subtle aspects that indicate there are inconsistencies:\n- The label of the line segment differs; ensure it matches the image.\n- The labels of the points do not align with those in the answer you wrote; check for accuracy in point labels.\n- Confirm that the relationships and properties you\'ve described match the actual labels in the image.\n\nEnsure that all labels (line segment and points) in your description correspond to those present in the image for accuracy. The principles of geometry you\'ve applied are correct, so the focus should be on accurately reflecting the visual information.'}]} 73%|███████▎ | 16160/22095 [27:56:25<5:32:15, 3.36s/it] {'loss': 0.2603, 'grad_norm': 0.6843858349341361, 'learning_rate': 1.7763405347542783e-06, 'epoch': 0.73} 73%|███████▎ | 16160/22095 [27:56:25<5:32:15, 3.36s/it] 73%|███████▎ | 16161/22095 [27:56:28<5:23:49, 3.27s/it] {'loss': 0.3195, 'grad_norm': 0.6532083072580005, 'learning_rate': 1.7757803181264787e-06, 'epoch': 0.73} 73%|███████▎ | 16161/22095 [27:56:28<5:23:49, 3.27s/it] 73%|███████▎ | 16162/22095 [27:56:32<5:40:32, 3.44s/it] {'loss': 0.2925, 'grad_norm': 0.6233717325091536, 'learning_rate': 1.7752201707772593e-06, 'epoch': 0.73} 73%|███████▎ | 16162/22095 [27:56:32<5:40:32, 3.44s/it] 73%|███████▎ | 16163/22095 [27:56:36<5:41:08, 3.45s/it] {'loss': 0.3321, 'grad_norm': 1.1668439049554575, 'learning_rate': 1.7746600927186537e-06, 'epoch': 0.73} 73%|███████▎ | 16163/22095 [27:56:36<5:41:08, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence 
length for this model (43191 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16164/22095 [27:56:39<5:52:49, 3.57s/it] {'loss': 0.289, 'grad_norm': 0.5655046089812515, 'learning_rate': 1.7741000839626954e-06, 'epoch': 0.73} 73%|███████▎ | 16164/22095 [27:56:39<5:52:49, 3.57s/it] 73%|███████▎ | 16165/22095 [27:56:43<5:53:19, 3.57s/it] {'loss': 0.2901, 'grad_norm': 0.5983977680445043, 'learning_rate': 1.773540144521419e-06, 'epoch': 0.73} 73%|███████▎ | 16165/22095 [27:56:43<5:53:19, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41155 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48086 > 40960). Running this sequence through the model will result in indexing errors 73%|███████▎ | 16166/22095 [27:56:47<6:00:13, 3.65s/it] {'loss': 0.3212, 'grad_norm': 0.5956927504390058, 'learning_rate': 1.7729802744068568e-06, 'epoch': 0.73} 73%|███████▎ | 16166/22095 [27:56:47<6:00:13, 3.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49820 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43920 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56798 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16167/22095 [27:56:50<5:35:23, 3.39s/it] {'loss': 0.2913, 'grad_norm': 0.6098619134893986, 'learning_rate': 1.772420473631038e-06, 'epoch': 0.73} 73%|███████▎ | 16167/22095 [27:56:50<5:35:23, 3.39s/it] 73%|███████▎ | 16168/22095 [27:56:53<5:32:07, 3.36s/it] {'loss': 0.3132, 'grad_norm': 0.5879445773626746, 'learning_rate': 1.771860742205988e-06, 'epoch': 0.73} 73%|███████▎ | 16168/22095 [27:56:53<5:32:07, 3.36s/it] 73%|███████▎ | 16169/22095 [27:56:56<5:31:57, 3.36s/it] {'loss': 0.3341, 'grad_norm': 0.5972963244648661, 'learning_rate': 1.7713010801437385e-06, 'epoch': 0.73} 73%|███████▎ | 16169/22095 [27:56:56<5:31:57, 3.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16170/22095 [27:57:01<5:58:25, 3.63s/it] {'loss': 0.3067, 'grad_norm': 0.6732092232271325, 'learning_rate': 1.7707414874563105e-06, 'epoch': 0.73} 73%|███████▎ | 16170/22095 [27:57:01<5:58:25, 3.63s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16171/22095 [27:57:04<5:55:27, 3.60s/it] {'loss': 0.2702, 'grad_norm': 0.5417037764108821, 'learning_rate': 1.7701819641557321e-06, 'epoch': 0.73} 73%|███████▎ | 16171/22095 [27:57:04<5:55:27, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43152 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62449 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (112623 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16172/22095 [27:57:13<8:46:33, 5.33s/it] {'loss': 0.482, 'grad_norm': 0.31958280675188766, 'learning_rate': 1.7696225102540238e-06, 'epoch': 0.73} 73%|███████▎ | 16172/22095 [27:57:13<8:46:33, 5.33s/it] 73%|███████▎ | 16173/22095 [27:57:17<8:04:27, 4.91s/it] {'loss': 0.2677, 'grad_norm': 0.5808850557217945, 'learning_rate': 1.769063125763204e-06, 'epoch': 0.73} 73%|███████▎ | 16173/22095 [27:57:17<8:04:27, 4.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 73%|███████▎ | 16174/22095 [27:57:27<10:37:50, 6.46s/it] {'loss': 0.4763, 'grad_norm': 0.32794153591687286, 'learning_rate': 1.7685038106952952e-06, 'epoch': 0.73} 73%|███████▎ | 16174/22095 [27:57:27<10:37:50, 6.46s/it] 73%|███████▎ | 16175/22095 [27:57:37<12:08:13, 7.38s/it] {'loss': 0.463, 'grad_norm': 0.27041347084527767, 'learning_rate': 1.7679445650623162e-06, 'epoch': 0.73} 73%|███████▎ | 16175/22095 [27:57:37<12:08:13, 7.38s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 73%|███████▎ | 16176/22095 [27:57:40<10:07:01, 6.15s/it] {'loss': 0.2688, 'grad_norm': 0.6774948881437475, 'learning_rate': 1.767385388876282e-06, 'epoch': 0.73} 73%|███████▎ | 16176/22095 [27:57:40<10:07:01, 6.15s/it] 73%|███████▎ | 16177/22095 [27:57:45<9:12:46, 5.60s/it] {'loss': 0.3656, 'grad_norm': 0.6307995198758111, 'learning_rate': 1.7668262821492061e-06, 'epoch': 0.73} 73%|███████▎ | 16177/22095 [27:57:45<9:12:46, 5.60s/it] 73%|███████▎ | 16178/22095 [27:57:48<8:03:49, 4.91s/it] {'loss': 0.2587, 'grad_norm': 0.645933753279425, 'learning_rate': 1.7662672448931045e-06, 'epoch': 0.73} 73%|███████▎ | 16178/22095 [27:57:48<8:03:49, 4.91s/it] 73%|███████▎ | 16179/22095 [27:57:52<7:32:27, 4.59s/it] {'loss': 0.3034, 'grad_norm': 0.6571171710246331, 'learning_rate': 1.7657082771199875e-06, 'epoch': 0.73} 73%|███████▎ | 16179/22095 [27:57:52<7:32:27, 4.59s/it]Invalidate trace cache @ step 2: expected 
module 1, but got module 364 73%|███████▎ | 16180/22095 [27:58:00<9:11:44, 5.60s/it] {'loss': 0.4493, 'grad_norm': 0.27135675583410107, 'learning_rate': 1.7651493788418671e-06, 'epoch': 0.73} 73%|███████▎ | 16180/22095 [27:58:00<9:11:44, 5.60s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 73%|███████▎ | 16181/22095 [27:58:03<7:59:33, 4.87s/it] {'loss': 0.2961, 'grad_norm': 0.6301878641061803, 'learning_rate': 1.76459055007075e-06, 'epoch': 0.73} 73%|███████▎ | 16181/22095 [27:58:03<7:59:33, 4.87s/it] 73%|███████▎ | 16182/22095 [27:58:06<7:21:34, 4.48s/it] {'loss': 0.2704, 'grad_norm': 0.5330125153875653, 'learning_rate': 1.7640317908186466e-06, 'epoch': 0.73} 73%|███████▎ | 16182/22095 [27:58:06<7:21:34, 4.48s/it] 73%|███████▎ | 16183/22095 [27:58:10<7:08:31, 4.35s/it] {'loss': 0.3365, 'grad_norm': 0.6052744126769056, 'learning_rate': 1.7634731010975603e-06, 'epoch': 0.73} 73%|███████▎ | 16183/22095 [27:58:10<7:08:31, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (117871 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16184/22095 [27:58:13<6:29:51, 3.96s/it] {'loss': 0.2946, 'grad_norm': 0.7142932836569383, 'learning_rate': 1.7629144809194982e-06, 'epoch': 0.73} 73%|███████▎ | 16184/22095 [27:58:14<6:29:51, 3.96s/it] 73%|███████▎ | 16185/22095 [27:58:16<6:01:13, 3.67s/it] {'loss': 0.3449, 'grad_norm': 0.637528726658327, 'learning_rate': 1.762355930296462e-06, 'epoch': 0.73} 73%|███████▎ | 16185/22095 [27:58:16<6:01:13, 3.67s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 73%|███████▎ | 16186/22095 [27:58:19<5:35:04, 3.40s/it] {'loss': 0.3179, 'grad_norm': 0.6385320594102418, 'learning_rate': 1.7617974492404517e-06, 'epoch': 0.73} 73%|███████▎ | 16186/22095 [27:58:19<5:35:04, 3.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (47669 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94398 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79407 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16187/22095 [27:58:29<8:36:16, 5.24s/it] {'loss': 0.487, 'grad_norm': 0.3100847434791019, 'learning_rate': 1.7612390377634685e-06, 'epoch': 0.73} 73%|███████▎ | 16187/22095 [27:58:29<8:36:16, 5.24s/it] 73%|███████▎ | 16188/22095 [27:58:32<7:47:01, 4.74s/it] {'loss': 0.3023, 'grad_norm': 0.6732555146506458, 'learning_rate': 1.7606806958775135e-06, 'epoch': 0.73} 73%|███████▎ | 16188/22095 [27:58:32<7:47:01, 4.74s/it] 73%|███████▎ | 16189/22095 [27:58:35<6:56:55, 4.24s/it] {'loss': 0.2677, 'grad_norm': 1.0851260304440182, 'learning_rate': 1.7601224235945814e-06, 'epoch': 0.73} 73%|███████▎ | 16189/22095 [27:58:35<6:56:55, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99586 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78231 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80282 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65391 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78331 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93791 > 40960). 
Running this sequence through the model will result in indexing errors 73%|███████▎ | 16190/22095 [27:58:39<6:43:28, 4.10s/it] {'loss': 0.358, 'grad_norm': 0.5790100028298947, 'learning_rate': 1.7595642209266656e-06, 'epoch': 0.73} 73%|███████▎ | 16190/22095 [27:58:39<6:43:28, 4.10s/it] 73%|███████▎ | 16191/22095 [27:58:43<6:21:32, 3.88s/it] {'loss': 0.2689, 'grad_norm': 0.5527251623525239, 'learning_rate': 1.7590060878857646e-06, 'epoch': 0.73} 73%|███████▎ | 16191/22095 [27:58:43<6:21:32, 3.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8915458 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 38611, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 5\nB. 6\nC. 6.5\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 73%|███████▎ | 16192/22095 [27:58:52<9:07:25, 5.56s/it] {'loss': 0.4709, 'grad_norm': 0.31178955263418023, 'learning_rate': 1.7584480244838687e-06, 'epoch': 0.73} 73%|███████▎ | 16192/22095 [27:58:52<9:07:25, 5.56s/it] 73%|███████▎ | 16193/22095 [27:58:56<8:17:15, 5.06s/it] {'loss': 0.3121, 'grad_norm': 0.645658883416608, 'learning_rate': 1.7578900307329677e-06, 'epoch': 0.73} 73%|███████▎ | 16193/22095 [27:58:56<8:17:15, 5.06s/it] 73%|███████▎ | 16194/22095 [27:58:59<7:23:48, 4.51s/it] {'loss': 0.32, 'grad_norm': 0.6602342215994514, 'learning_rate': 1.7573321066450521e-06, 'epoch': 0.73} 73%|███████▎ | 16194/22095 [27:58:59<7:23:48, 4.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8923674 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 46827, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 3\nB. 6\nC. 5\nD. 
4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 73%|███████▎ | 16195/22095 [27:59:09<9:48:05, 5.98s/it] {'loss': 0.4472, 'grad_norm': 0.28188324345028737, 'learning_rate': 1.7567742522321125e-06, 'epoch': 0.73} 73%|███████▎ | 16195/22095 [27:59:09<9:48:05, 5.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8400552 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 2714, 'image': 'vrdu_table_final_2/astro-ph.CO/dfc03797-0c57-45ae-99d4-3d5f6c6fc10c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l} #1 \\end{tabular}\n```"}]} 73%|███████▎ | 16196/22095 [27:59:12<8:35:51, 5.25s/it] {'loss': 0.3067, 'grad_norm': 0.5970984254245914, 'learning_rate': 1.7562164675061332e-06, 'epoch': 0.73} 73%|███████▎ | 16196/22095 [27:59:12<8:35:51, 5.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44519 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56790 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113956 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59272 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59438 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123539 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16197/22095 [27:59:15<7:39:47, 4.68s/it] {'loss': 0.3314, 'grad_norm': 0.5885991002760158, 'learning_rate': 1.755658752479098e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (57560 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (56458 > 40960) for 4 sample(s). Truncating to 15498 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (97763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109417 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16198/22095 [27:59:25<9:59:35, 6.10s/it] {'loss': 0.4673, 'grad_norm': 0.27455054719754596, 'learning_rate': 1.7551011071629937e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (116345 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16199/22095 [27:59:28<8:32:17, 5.21s/it] {'loss': 0.2751, 'grad_norm': 0.564345589903344, 'learning_rate': 1.7545435315697984e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16200/22095 [27:59:32<7:40:50, 4.69s/it] {'loss': 0.2864, 'grad_norm': 0.5819176536099305, 'learning_rate': 1.7539860257114972e-06, 'epoch': 0.73}
73%|███████▎ | 16201/22095 [27:59:34<6:49:03, 4.16s/it] {'loss': 0.344, 'grad_norm': 0.6644071129619711, 'learning_rate': 1.7534285896000668e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16202/22095 [27:59:38<6:28:58, 3.96s/it] {'loss': 0.302, 'grad_norm': 0.9543976113779502, 'learning_rate': 1.7528712232474832e-06, 'epoch': 0.73}
73%|███████▎ | 16203/22095 [27:59:42<6:32:29, 4.00s/it] {'loss': 0.2844, 'grad_norm': 0.6011566578844129, 'learning_rate': 1.7523139266657241e-06, 'epoch': 0.73}
73%|███████▎ | 16204/22095 [27:59:47<7:15:43, 4.44s/it] {'loss': 0.2943, 'grad_norm': 0.6951890360401429, 'learning_rate': 1.7517566998667661e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350703 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17377, 'image': 'vrdu_table_final_2/astro-ph.CO/aa60f7db-8e2f-4307-b8de-74fd5ddb0814.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
73%|███████▎ | 16205/22095 [27:59:55<8:36:49, 5.26s/it] {'loss': 0.4641, 'grad_norm': 0.2817462091680761, 'learning_rate': 1.7511995428625805e-06, 'epoch': 0.73}
73%|███████▎ | 16206/22095 [28:00:01<9:04:24, 5.55s/it] {'loss': 0.4649, 'grad_norm': 0.28614243110327414, 'learning_rate': 1.7506424556651368e-06, 'epoch': 0.73}
73%|███████▎ | 16207/22095 [28:00:07<9:35:28, 5.86s/it] {'loss': 0.4686, 'grad_norm': 0.2573959389343955, 'learning_rate': 1.7500854382864073e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16208/22095 [28:00:12<8:55:05, 5.45s/it] {'loss': 0.3207, 'grad_norm': 0.6144018276945281, 'learning_rate': 1.749528490738362e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (53782 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46590 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16209/22095 [28:00:15<7:49:36, 4.79s/it] {'loss': 0.2789, 'grad_norm': 0.5710859253017799, 'learning_rate': 1.7489716130329665e-06, 'epoch': 0.73}
73%|███████▎ | 16210/22095 [28:00:19<7:16:42, 4.45s/it] {'loss': 0.2682, 'grad_norm': 0.6540083390363107, 'learning_rate': 1.7484148051821842e-06, 'epoch': 0.73}
73%|███████▎ | 16211/22095 [28:00:22<6:32:26, 4.00s/it] {'loss': 0.307, 'grad_norm': 0.6419187298562828, 'learning_rate': 1.7478580671979834e-06, 'epoch': 0.73}
73%|███████▎ | 16212/22095 [28:00:25<6:13:09, 3.81s/it] {'loss': 0.2631, 'grad_norm': 0.552366230549885, 'learning_rate': 1.7473013990923226e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16213/22095 [28:00:28<5:50:57, 3.58s/it] {'loss': 0.2808, 'grad_norm': 0.6651943539308403, 'learning_rate': 1.7467448008771664e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (62055 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67674 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110293 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43644 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115733 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61429 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16214/22095 [28:00:31<5:37:18, 3.44s/it] {'loss': 0.3085, 'grad_norm': 0.6485848113037442, 'learning_rate': 1.746188272564473e-06, 'epoch': 0.73}
73%|███████▎ | 16215/22095 [28:00:35<5:51:59, 3.59s/it] {'loss': 0.2773, 'grad_norm': 0.6131481732266474, 'learning_rate': 1.7456318141661987e-06, 'epoch': 0.73}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16216/22095 [28:00:39<5:57:40, 3.65s/it] {'loss': 0.318, 'grad_norm': 0.5674989271994176, 'learning_rate': 1.7450754256943014e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
73%|███████▎ | 16217/22095 [28:00:48<8:46:34, 5.38s/it] {'loss': 0.4632, 'grad_norm': 0.2680277654815759, 'learning_rate': 1.7445191071607386e-06, 'epoch': 0.73}
73%|███████▎ | 16218/22095 [28:00:52<7:54:45, 4.85s/it] {'loss': 0.3311, 'grad_norm': 0.5726163088083188, 'learning_rate': 1.7439628585774614e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (59161 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42317 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50976 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16219/22095 [28:00:56<7:20:01, 4.49s/it] {'loss': 0.2926, 'grad_norm': 0.5971453091407026, 'learning_rate': 1.7434066799564204e-06, 'epoch': 0.73}
73%|███████▎ | 16220/22095 [28:00:59<6:51:51, 4.21s/it] {'loss': 0.2949, 'grad_norm': 0.5955186786844018, 'learning_rate': 1.74285057130957e-06, 'epoch': 0.73}
73%|███████▎ | 16221/22095 [28:01:02<6:12:58, 3.81s/it] {'loss': 0.2825, 'grad_norm': 0.5848454136822139, 'learning_rate': 1.7422945326488555e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
73%|███████▎ | 16222/22095 [28:01:12<9:11:18, 5.63s/it] {'loss': 0.4347, 'grad_norm': 0.28955425891968584, 'learning_rate': 1.7417385639862278e-06, 'epoch': 0.73}
73%|███████▎ | 16223/22095 [28:01:16<8:07:48, 4.98s/it] {'loss': 0.2977, 'grad_norm': 0.5873018904151689, 'learning_rate': 1.7411826653336294e-06, 'epoch': 0.73}
73%|███████▎ | 16224/22095 [28:01:19<7:21:23, 4.51s/it] {'loss': 0.2365, 'grad_norm': 0.7384682886218776, 'learning_rate': 1.7406268367030094e-06, 'epoch': 0.73}
73%|███████▎ | 16225/22095 [28:01:22<6:42:41, 4.12s/it] {'loss': 0.312, 'grad_norm': 0.5949250394015495, 'learning_rate': 1.7400710781063073e-06, 'epoch': 0.73}
73%|███████▎ | 16226/22095 [28:01:25<6:19:45, 3.88s/it] {'loss': 0.354, 'grad_norm': 0.6304591743920028, 'learning_rate': 1.7395153895554646e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45478 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64276 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46193 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16227/22095 [28:01:35<9:06:34, 5.59s/it] {'loss': 0.4543, 'grad_norm': 0.2623909581641417, 'learning_rate': 1.7389597710624234e-06, 'epoch': 0.73}
73%|███████▎ | 16228/22095 [28:01:38<8:02:58, 4.94s/it] {'loss': 0.2858, 'grad_norm': 0.5992424361294328, 'learning_rate': 1.73840422263912e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (42353 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16229/22095 [28:01:41<7:04:29, 4.34s/it] {'loss': 0.3059, 'grad_norm': 0.6152299382781466, 'learning_rate': 1.7378487442974946e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (77074 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79347 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16230/22095 [28:01:45<6:48:50, 4.18s/it] {'loss': 0.2761, 'grad_norm': 0.6201435136046975, 'learning_rate': 1.7372933360494803e-06, 'epoch': 0.73}
73%|███████▎ | 16231/22095 [28:01:49<6:23:29, 3.92s/it] {'loss': 0.2514, 'grad_norm': 0.5975736264648502, 'learning_rate': 1.7367379979070098e-06, 'epoch': 0.73}
73%|███████▎ | 16232/22095 [28:01:51<5:48:52, 3.57s/it] {'loss': 0.2821, 'grad_norm': 0.6407013057308736, 'learning_rate': 1.7361827298820177e-06, 'epoch': 0.73}
73%|███████▎ | 16233/22095 [28:01:55<5:46:36, 3.55s/it] {'loss': 0.2691, 'grad_norm': 0.5952618697201847, 'learning_rate': 1.7356275319864363e-06, 'epoch': 0.73}
73%|███████▎ | 16234/22095 [28:01:59<5:56:06, 3.65s/it] {'loss': 0.2683, 'grad_norm': 0.7453342568149233, 'learning_rate': 1.735072404232193e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
73%|███████▎ | 16235/22095 [28:02:09<9:14:06, 5.67s/it] {'loss': 0.467, 'grad_norm': 0.3013045199708562, 'learning_rate': 1.7345173466312154e-06, 'epoch': 0.73}
73%|███████▎ | 16236/22095 [28:02:13<8:37:03, 5.30s/it] {'loss': 0.3008, 'grad_norm': 0.6086860644721339, 'learning_rate': 1.7339623591954302e-06, 'epoch': 0.73}
Invalidate trace cache @ step 2: expected module 1, but got module 364
73%|███████▎ | 16237/22095 [28:02:21<9:30:44, 5.85s/it] {'loss': 0.4836, 'grad_norm': 0.2812574722109745, 'learning_rate': 1.7334074419367653e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (128647 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16238/22095 [28:02:24<8:28:53, 5.21s/it] {'loss': 0.33, 'grad_norm': 0.6199272597355754, 'learning_rate': 1.7328525948671415e-06, 'epoch': 0.73}
Token indices sequence length is longer than the specified maximum sequence length for this model (56045 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50590 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86968 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96916 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44963 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127784 > 40960). Running this sequence through the model will result in indexing errors
73%|███████▎ | 16239/22095 [28:02:29<8:02:05, 4.94s/it] {'loss': 0.3248, 'grad_norm': 0.666636524721789, 'learning_rate': 1.7322978179984794e-06, 'epoch': 0.73}
74%|███████▎ | 16240/22095 [28:02:32<7:08:12, 4.39s/it] {'loss': 0.2746, 'grad_norm': 0.5992083829293151, 'learning_rate': 1.731743111342703e-06, 'epoch': 0.74}
74%|███████▎ | 16241/22095 [28:02:35<6:34:08, 4.04s/it] {'loss': 0.315, 'grad_norm': 0.6130098920143602, 'learning_rate': 1.731188474911728e-06, 'epoch': 0.74}
74%|███████▎ | 16242/22095 [28:02:39<6:24:44, 3.94s/it] {'loss': 0.3267, 'grad_norm': 0.6134321032151977, 'learning_rate': 1.7306339087174746e-06, 'epoch': 0.74}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
74%|███████▎ | 16243/22095 [28:02:42<6:03:33, 3.73s/it] {'loss': 0.3262, 'grad_norm': 0.6233762873878859, 'learning_rate': 1.7300794127718573e-06, 'epoch': 0.74}
74%|███████▎ | 16244/22095 [28:02:46<6:09:15, 3.79s/it] {'loss': 0.2899, 'grad_norm': 0.6017139780144443, 'learning_rate': 1.7295249870867898e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (41201 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68887 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61250 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45790 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16245/22095 [28:02:49<5:44:33, 3.53s/it] {'loss': 0.2636, 'grad_norm': 0.6050681339465913, 'learning_rate': 1.728970631674185e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (101274 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16246/22095 [28:02:58<8:28:36, 5.22s/it] {'loss': 0.4481, 'grad_norm': 0.24791910992335017, 'learning_rate': 1.7284163465459568e-06, 'epoch': 0.74}
74%|███████▎ | 16247/22095 [28:03:08<10:46:54, 6.64s/it] {'loss': 0.4439, 'grad_norm': 0.24852137458762985, 'learning_rate': 1.7278621317140138e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (45756 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16248/22095 [28:03:11<9:14:38, 5.69s/it] {'loss': 0.2865, 'grad_norm': 0.6095744267748847, 'learning_rate': 1.727307987190262e-06, 'epoch': 0.74}
74%|███████▎ | 16249/22095 [28:03:15<8:02:09, 4.95s/it] {'loss': 0.2863, 'grad_norm': 0.6017360995143893, 'learning_rate': 1.7267539129866107e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (55843 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16250/22095 [28:03:17<7:01:23, 4.33s/it] {'loss': 0.3323, 'grad_norm': 0.6757623548504654, 'learning_rate': 1.7261999091149662e-06, 'epoch': 0.74}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [56, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369924 in VC:s3://internvl-moe-sft-data/. Exception: Image size [56, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36676, 'image': 'vrdu_table_final_2/astro-ph.CO/38c60b95-7b44-46ee-a450-f76c14972f61.png', 'image_wh': [[56, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{@{}l@{}} 2015 \\\\ $\\,$ \\end{tabular}\n```"}]}
Token indices sequence length is longer than the specified maximum sequence length for this model (71808 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107640 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16251/22095 [28:03:21<6:41:20, 4.12s/it] {'loss': 0.2743, 'grad_norm': 0.5997681786802973, 'learning_rate': 1.7256459755872306e-06, 'epoch': 0.74}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
74%|███████▎ | 16252/22095 [28:03:24<6:16:15, 3.86s/it] {'loss': 0.3206, 'grad_norm': 0.6532107189741281, 'learning_rate': 1.7250921124153057e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16253/22095 [28:03:34<9:18:02, 5.73s/it] {'loss': 0.4662, 'grad_norm': 0.3912425490997418, 'learning_rate': 1.7245383196110944e-06, 'epoch': 0.74}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
74%|███████▎ | 16254/22095 [28:03:38<8:15:33, 5.09s/it] {'loss': 0.2901, 'grad_norm': 0.6560478513281396, 'learning_rate': 1.7239845971864932e-06, 'epoch': 0.74}
74%|███████▎ | 16255/22095 [28:03:41<7:22:00, 4.54s/it] {'loss': 0.2927, 'grad_norm': 0.6498356637789432, 'learning_rate': 1.7234309451534032e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16256/22095 [28:03:51<9:42:35, 5.99s/it] {'loss': 0.4435, 'grad_norm': 0.27313198293460317, 'learning_rate': 1.7228773635237183e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (50551 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16257/22095 [28:03:54<8:23:46, 5.18s/it] {'loss': 0.3067, 'grad_norm': 0.6455232477977734, 'learning_rate': 1.7223238523093334e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16258/22095 [28:04:01<9:14:05, 5.70s/it] {'loss': 0.4774, 'grad_norm': 0.2594817494985136, 'learning_rate': 1.7217704115221417e-06, 'epoch': 0.74}
74%|███████▎ | 16259/22095 [28:04:06<8:46:15, 5.41s/it] {'loss': 0.2811, 'grad_norm': 0.6079411337663703, 'learning_rate': 1.7212170411740386e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (43716 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47930 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16260/22095 [28:04:09<8:01:28, 4.95s/it] {'loss': 0.2982, 'grad_norm': 0.6617218118446594, 'learning_rate': 1.7206637412769084e-06, 'epoch': 0.74}
74%|███████▎ | 16261/22095 [28:04:13<7:28:29, 4.61s/it] {'loss': 0.3082, 'grad_norm': 0.6300539343632511, 'learning_rate': 1.7201105118426425e-06, 'epoch': 0.74}
74%|███████▎ | 16262/22095 [28:04:17<7:09:39, 4.42s/it] {'loss': 0.2977, 'grad_norm': 0.7164263661192096, 'learning_rate': 1.71955735288313e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (42666 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84538 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16263/22095 [28:04:20<6:32:39, 4.04s/it] {'loss': 0.2782, 'grad_norm': 0.6183776534457194, 'learning_rate': 1.719004264410255e-06, 'epoch': 0.74}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929817 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 52970, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nA. 7cm\nB. 8cm\nC. 5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
74%|███████▎ | 16264/22095 [28:04:24<6:09:57, 3.81s/it] {'loss': 0.2975, 'grad_norm': 0.6105854472134217, 'learning_rate': 1.7184512464358998e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (67503 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16265/22095 [28:04:33<8:52:23, 5.48s/it] {'loss': 0.4445, 'grad_norm': 0.27064198134083034, 'learning_rate': 1.717898298971949e-06, 'epoch': 0.74}
74%|███████▎ | 16266/22095 [28:04:36<7:46:28, 4.80s/it] {'loss': 0.3312, 'grad_norm': 0.6112261479619383, 'learning_rate': 1.717345422030285e-06, 'epoch': 0.74}
74%|███████▎ | 16267/22095 [28:04:40<7:04:48, 4.37s/it] {'loss': 0.2999, 'grad_norm': 0.5993561678079502, 'learning_rate': 1.7167926156227854e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047718 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 2cm\nB. 4cm\nC. 1cm\nD. 1.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
74%|███████▎ | 16268/22095 [28:04:49<9:35:09, 5.92s/it] {'loss': 0.4422, 'grad_norm': 0.2625199804522994, 'learning_rate': 1.7162398797613284e-06, 'epoch': 0.74}
74%|███████▎ | 16269/22095 [28:04:52<8:16:58, 5.12s/it] {'loss': 0.3353, 'grad_norm': 0.6146855148656558, 'learning_rate': 1.7156872144577918e-06, 'epoch': 0.74}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
74%|███████▎ | 16270/22095 [28:04:56<7:22:48, 4.56s/it] {'loss': 0.2922, 'grad_norm': 0.5542848996872033, 'learning_rate': 1.7151346197240486e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16271/22095 [28:05:03<8:39:52, 5.36s/it] {'loss': 0.4661, 'grad_norm': 0.2699452576598286, 'learning_rate': 1.7145820955719755e-06, 'epoch': 0.74}
74%|███████▎ | 16272/22095 [28:05:07<8:04:58, 5.00s/it] {'loss': 0.2892, 'grad_norm': 0.6130827450264918, 'learning_rate': 1.7140296420134428e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16273/22095 [28:05:16<10:11:15, 6.30s/it] {'loss': 0.459, 'grad_norm': 0.2513562381783873, 'learning_rate': 1.7134772590603193e-06, 'epoch': 0.74}
74%|███████▎ | 16274/22095 [28:05:21<9:14:45, 5.72s/it] {'loss': 0.281, 'grad_norm': 0.5855918035070893, 'learning_rate': 1.7129249467244758e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (44884 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113480 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97129 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16275/22095 [28:05:24<8:03:15, 4.98s/it] {'loss': 0.3348, 'grad_norm': 0.6899883521801347, 'learning_rate': 1.7123727050177808e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (46437 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16276/22095 [28:05:28<7:24:16, 4.58s/it] {'loss': 0.3188, 'grad_norm': 0.6239946463347905, 'learning_rate': 1.7118205339520999e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16277/22095 [28:05:35<8:44:28, 5.41s/it] {'loss': 0.4644, 'grad_norm': 0.27240169811472975, 'learning_rate': 1.7112684335392948e-06, 'epoch': 0.74}
74%|███████▎ | 16278/22095 [28:05:39<8:08:11, 5.04s/it] {'loss': 0.3083, 'grad_norm': 0.6059417313635661, 'learning_rate': 1.7107164037912305e-06, 'epoch': 0.74}
74%|███████▎ | 16279/22095 [28:05:42<7:12:31, 4.46s/it] {'loss': 0.3065, 'grad_norm': 0.6361738820039944, 'learning_rate': 1.7101644447197702e-06, 'epoch': 0.74}
74%|███████▎ | 16280/22095 [28:05:45<6:33:06, 4.06s/it] {'loss': 0.2608, 'grad_norm': 0.6360595017871099, 'learning_rate': 1.7096125563367722e-06, 'epoch': 0.74}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
74%|███████▎ | 16281/22095 [28:05:49<6:19:50, 3.92s/it] {'loss': 0.3589, 'grad_norm': 0.6382952432671731, 'learning_rate': 1.709060738654093e-06, 'epoch': 0.74}
74%|███████▎ | 16282/22095 [28:05:53<6:29:56, 4.02s/it] {'loss': 0.2809, 'grad_norm': 0.6057322972207433, 'learning_rate': 1.7085089916835924e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
74%|███████▎ | 16283/22095 [28:06:03<9:20:34, 5.79s/it] {'loss': 0.4449, 'grad_norm': 0.2639576838682394, 'learning_rate': 1.7079573154371233e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (51405 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126239 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16284/22095 [28:06:13<11:11:29, 6.93s/it] {'loss': 0.4703, 'grad_norm': 0.259451887808222, 'learning_rate': 1.7074057099265422e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 364, but got module 1
74%|███████▎ | 16285/22095 [28:06:18<10:25:59, 6.46s/it] {'loss': 0.3055, 'grad_norm': 0.6382920149509974, 'learning_rate': 1.7068541751637001e-06, 'epoch': 0.74}
74%|███████▎ | 16286/22095 [28:06:22<9:03:34, 5.61s/it] {'loss': 0.3105, 'grad_norm': 0.6298061371883908, 'learning_rate': 1.7063027111604457e-06, 'epoch': 0.74}
74%|███████▎ | 16287/22095 [28:06:25<7:49:57, 4.85s/it] {'loss': 0.2964, 'grad_norm': 0.5901058915221602, 'learning_rate': 1.7057513179286305e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (53858 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16288/22095 [28:06:29<7:23:00, 4.58s/it] {'loss': 0.3239, 'grad_norm': 0.6692352016470466, 'learning_rate': 1.7051999954801058e-06, 'epoch': 0.74}
Token indices sequence length is longer than the specified maximum sequence length for this model (50396 > 40960). Running this sequence through the model will result in indexing errors
74%|███████▎ | 16289/22095 [28:06:32<6:49:50, 4.24s/it] {'loss': 0.2878, 'grad_norm': 0.5637542321209367, 'learning_rate': 1.7046487438267101e-06, 'epoch': 0.74}
74%|███████▎ | 16290/22095 [28:06:35<6:14:08, 3.87s/it] {'loss': 0.3092, 'grad_norm': 0.6137312238539702, 'learning_rate': 1.704097562980292e-06, 'epoch': 0.74}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8959457 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10292, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 10\nB. 5\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
74%|███████▎ | 16291/22095 [28:06:38<5:48:32, 3.60s/it] {'loss': 0.2703, 'grad_norm': 0.6104309860060394, 'learning_rate': 1.7035464529526963e-06, 'epoch': 0.74}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (53974 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79305 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53851 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42288 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▎ | 16292/22095 [28:06:44<6:56:49, 4.31s/it] {'loss': 0.4814, 'grad_norm': 0.2693494755988427, 'learning_rate': 1.702995413755763e-06, 'epoch': 0.74} 74%|███████▎ | 16292/22095 [28:06:44<6:56:49, 4.31s/it] 74%|███████▎ | 16293/22095 [28:06:48<6:42:27, 4.16s/it] {'loss': 0.2872, 'grad_norm': 0.6943479505825775, 'learning_rate': 1.7024444454013305e-06, 'epoch': 0.74} 74%|███████▎ | 16293/22095 [28:06:48<6:42:27, 4.16s/it] 74%|███████▎ | 16294/22095 [28:06:51<6:13:59, 3.87s/it] {'loss': 0.2772, 'grad_norm': 0.5915976030744456, 'learning_rate': 1.7018935479012394e-06, 'epoch': 0.74} 74%|███████▎ | 16294/22095 [28:06:51<6:13:59, 3.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8369310 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 36062, 'image': 'vrdu_table_final_2/astro-ph.CO/44117651-c128-4c8e-a239-97122c50e9f5.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 74%|███████▎ | 16295/22095 [28:06:54<5:46:46, 3.59s/it] {'loss': 0.262, 'grad_norm': 0.6959850127902959, 'learning_rate': 1.7013427212673285e-06, 'epoch': 0.74} 74%|███████▎ | 16295/22095 [28:06:54<5:46:46, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16296/22095 [28:07:02<8:02:09, 4.99s/it] {'loss': 0.4919, 'grad_norm': 0.2816393378161216, 'learning_rate': 1.7007919655114314e-06, 'epoch': 0.74} 74%|███████▍ | 16296/22095 [28:07:02<8:02:09, 4.99s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16297/22095 [28:07:10<9:06:42, 5.66s/it] {'loss': 0.4763, 'grad_norm': 0.28639396589527855, 'learning_rate': 1.7002412806453799e-06, 'epoch': 0.74} 74%|███████▍ | 16297/22095 [28:07:10<9:06:42, 5.66s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 74%|███████▍ | 16298/22095 [28:07:14<8:21:50, 5.19s/it] {'loss': 0.2929, 'grad_norm': 0.610734186823621, 'learning_rate': 1.6996906666810116e-06, 'epoch': 0.74} 74%|███████▍ | 16298/22095 [28:07:14<8:21:50, 5.19s/it] 74%|███████▍ | 16299/22095 [28:07:17<7:29:24, 4.65s/it] {'loss': 0.2951, 'grad_norm': 0.650120453481213, 'learning_rate': 1.699140123630152e-06, 'epoch': 0.74} 74%|███████▍ | 16299/22095 [28:07:17<7:29:24, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16300/22095 [28:07:27<9:49:44, 6.11s/it] {'loss': 0.4629, 'grad_norm': 0.27258678959918653, 'learning_rate': 1.6985896515046357e-06, 
'epoch': 0.74} 74%|███████▍ | 16300/22095 [28:07:27<9:49:44, 6.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16301/22095 [28:07:31<8:47:52, 5.47s/it] {'loss': 0.2871, 'grad_norm': 0.6129395204170821, 'learning_rate': 1.698039250316288e-06, 'epoch': 0.74} 74%|███████▍ | 16301/22095 [28:07:31<8:47:52, 5.47s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16302/22095 [28:07:34<7:35:16, 4.72s/it] {'loss': 0.2952, 'grad_norm': 0.5771135278717086, 'learning_rate': 1.697488920076934e-06, 'epoch': 0.74} 74%|███████▍ | 16302/22095 [28:07:34<7:35:16, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16303/22095 [28:07:37<6:45:10, 4.20s/it] {'loss': 0.3341, 'grad_norm': 0.6090117336444172, 'learning_rate': 1.6969386607984e-06, 'epoch': 0.74} 74%|███████▍ | 16303/22095 [28:07:37<6:45:10, 4.20s/it] 74%|███████▍ | 16304/22095 [28:07:40<6:21:45, 3.96s/it] {'loss': 0.3434, 'grad_norm': 0.6698203949048338, 'learning_rate': 1.6963884724925116e-06, 'epoch': 0.74} 74%|███████▍ | 16304/22095 [28:07:40<6:21:45, 3.96s/it] 74%|███████▍ | 16305/22095 [28:07:44<6:37:05, 4.12s/it] {'loss': 0.2917, 'grad_norm': 0.5727007389889912, 'learning_rate': 1.6958383551710888e-06, 'epoch': 0.74} 74%|███████▍ | 16305/22095 [28:07:44<6:37:05, 4.12s/it] 74%|███████▍ | 16306/22095 [28:07:48<6:33:35, 4.08s/it] {'loss': 0.2911, 'grad_norm': 0.6287283582218592, 'learning_rate': 1.6952883088459498e-06, 'epoch': 0.74} 74%|███████▍ | 16306/22095 [28:07:48<6:33:35, 4.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66942 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69594 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102202 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16307/22095 [28:07:52<6:15:49, 3.90s/it] {'loss': 0.3169, 'grad_norm': 0.6010885159430064, 'learning_rate': 1.6947383335289152e-06, 'epoch': 0.74} 74%|███████▍ | 16307/22095 [28:07:52<6:15:49, 3.90s/it] 74%|███████▍ | 16308/22095 [28:07:55<5:53:32, 3.67s/it] {'loss': 0.339, 'grad_norm': 0.5694010188110677, 'learning_rate': 1.6941884292318044e-06, 'epoch': 0.74} 74%|███████▍ | 16308/22095 [28:07:55<5:53:32, 3.67s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8937803 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 60956, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nA. 5cm\nB. \\frac{11}{2}cm\nC. 4cm\nD. 
\\frac{9}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16309/22095 [28:07:58<5:38:32, 3.51s/it] {'loss': 0.2999, 'grad_norm': 0.6348745967305999, 'learning_rate': 1.6936385959664315e-06, 'epoch': 0.74} 74%|███████▍ | 16309/22095 [28:07:58<5:38:32, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (121832 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138041 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46036 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16310/22095 [28:08:01<5:21:56, 3.34s/it] {'loss': 0.3054, 'grad_norm': 0.609076028085305, 'learning_rate': 1.6930888337446082e-06, 'epoch': 0.74} 74%|███████▍ | 16310/22095 [28:08:01<5:21:56, 3.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16311/22095 [28:08:04<5:20:20, 3.32s/it] {'loss': 0.2698, 'grad_norm': 0.6158342661208451, 'learning_rate': 1.6925391425781519e-06, 'epoch': 0.74} 74%|███████▍ | 16311/22095 [28:08:04<5:20:20, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47086 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54310 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85361 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16312/22095 [28:08:08<5:14:45, 3.27s/it] {'loss': 0.2937, 'grad_norm': 0.6013018213101735, 'learning_rate': 1.691989522478869e-06, 'epoch': 0.74} 74%|███████▍ | 16312/22095 [28:08:08<5:14:45, 3.27s/it] 74%|███████▍ | 16313/22095 [28:08:12<5:49:00, 3.62s/it] {'loss': 0.298, 'grad_norm': 0.5464993799583493, 'learning_rate': 1.6914399734585735e-06, 'epoch': 0.74} 74%|███████▍ | 16313/22095 [28:08:12<5:49:00, 3.62s/it] 74%|███████▍ | 16314/22095 [28:08:16<5:50:18, 3.64s/it] {'loss': 0.3046, 'grad_norm': 0.5666297376449007, 'learning_rate': 1.690890495529071e-06, 'epoch': 0.74} 74%|███████▍ | 16314/22095 [28:08:16<5:50:18, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16315/22095 [28:08:25<8:35:08, 5.35s/it] {'loss': 0.4429, 'grad_norm': 0.26592206593075024, 'learning_rate': 1.6903410887021676e-06, 'epoch': 0.74} 74%|███████▍ | 16315/22095 [28:08:25<8:35:08, 5.35s/it] 74%|███████▍ | 16316/22095 [28:08:30<8:16:21, 5.15s/it] {'loss': 0.2991, 'grad_norm': 0.6022086141416868, 'learning_rate': 1.6897917529896691e-06, 'epoch': 0.74} 74%|███████▍ | 16316/22095 [28:08:30<8:16:21, 5.15s/it] 74%|███████▍ | 16317/22095 [28:08:33<7:29:27, 4.67s/it] {'loss': 0.2886, 'grad_norm': 0.5292050781665859, 'learning_rate': 1.6892424884033825e-06, 'epoch': 0.74} 74%|███████▍ | 16317/22095 [28:08:33<7:29:27, 4.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16318/22095 [28:08:41<8:55:05, 5.56s/it] {'loss': 0.4588, 'grad_norm': 0.31019800261565705, 'learning_rate': 1.6886932949551032e-06, 'epoch': 0.74} 74%|███████▍ | 16318/22095 [28:08:41<8:55:05, 5.56s/it]Token indices sequence length is longer than the specified maximum sequence 
length for this model (72486 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50995 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16319/22095 [28:08:44<7:47:37, 4.86s/it] {'loss': 0.2842, 'grad_norm': 0.5384154654615427, 'learning_rate': 1.6881441726566355e-06, 'epoch': 0.74} 74%|███████▍ | 16319/22095 [28:08:44<7:47:37, 4.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16320/22095 [28:08:53<9:58:12, 6.22s/it] {'loss': 0.4539, 'grad_norm': 0.25599225115318863, 'learning_rate': 1.6875951215197779e-06, 'epoch': 0.74} 74%|███████▍ | 16320/22095 [28:08:53<9:58:12, 6.22s/it] 74%|███████▍ | 16321/22095 [28:08:57<8:45:03, 5.46s/it] {'loss': 0.2836, 'grad_norm': 0.6232868132447051, 'learning_rate': 1.6870461415563311e-06, 'epoch': 0.74} 74%|███████▍ | 16321/22095 [28:08:57<8:45:03, 5.46s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047214 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 
8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 74%|███████▍ | 16322/22095 [28:09:01<7:51:18, 4.90s/it] {'loss': 0.273, 'grad_norm': 0.5879990085252694, 'learning_rate': 1.6864972327780842e-06, 'epoch': 0.74} 74%|███████▍ | 16322/22095 [28:09:01<7:51:18, 4.90s/it] 74%|███████▍ | 16323/22095 [28:09:04<7:14:58, 4.52s/it] {'loss': 0.2917, 'grad_norm': 0.6348301993005881, 'learning_rate': 1.6859483951968353e-06, 'epoch': 0.74} 74%|███████▍ | 16323/22095 [28:09:04<7:14:58, 4.52s/it] 74%|███████▍ | 16324/22095 [28:09:07<6:29:31, 4.05s/it] {'loss': 0.2771, 'grad_norm': 0.6169572468167954, 'learning_rate': 1.6853996288243785e-06, 'epoch': 0.74} 74%|███████▍ | 16324/22095 [28:09:07<6:29:31, 4.05s/it] 74%|███████▍ | 16325/22095 [28:09:10<5:56:30, 3.71s/it] {'loss': 0.2983, 'grad_norm': 0.5949852175498666, 'learning_rate': 1.6848509336725039e-06, 'epoch': 0.74} 74%|███████▍ | 16325/22095 [28:09:10<5:56:30, 3.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16326/22095 [28:09:13<5:37:46, 3.51s/it] {'loss': 0.2544, 'grad_norm': 0.5970094415996532, 'learning_rate': 1.6843023097529993e-06, 'epoch': 0.74} 74%|███████▍ | 16326/22095 [28:09:13<5:37:46, 3.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16327/22095 [28:09:23<8:29:27, 5.30s/it] {'loss': 0.4461, 'grad_norm': 0.2664461704857231, 'learning_rate': 1.6837537570776563e-06, 'epoch': 0.74} 74%|███████▍ | 16327/22095 [28:09:23<8:29:27, 5.30s/it] 74%|███████▍ | 16328/22095 [28:09:27<7:45:27, 4.84s/it] {'loss': 0.3353, 'grad_norm': 0.6167237480226018, 'learning_rate': 1.6832052756582583e-06, 'epoch': 0.74} 74%|███████▍ | 16328/22095 [28:09:27<7:45:27, 4.84s/it] 74%|███████▍ | 16329/22095 [28:09:30<6:54:00, 4.31s/it] {'loss': 0.3018, 'grad_norm': 0.6651092051598921, 'learning_rate': 1.682656865506594e-06, 'epoch': 0.74} 74%|███████▍ | 16329/22095 
[28:09:30<6:54:00, 4.31s/it] 74%|███████▍ | 16330/22095 [28:09:32<6:07:31, 3.83s/it] {'loss': 0.2956, 'grad_norm': 0.5786650897321024, 'learning_rate': 1.682108526634445e-06, 'epoch': 0.74} 74%|███████▍ | 16330/22095 [28:09:32<6:07:31, 3.83s/it] 74%|███████▍ | 16331/22095 [28:09:35<5:43:27, 3.58s/it] {'loss': 0.3015, 'grad_norm': 0.7224670401316967, 'learning_rate': 1.6815602590535923e-06, 'epoch': 0.74} 74%|███████▍ | 16331/22095 [28:09:35<5:43:27, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66732 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63669 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91157 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16332/22095 [28:09:38<5:27:15, 3.41s/it] {'loss': 0.3501, 'grad_norm': 0.590046763536563, 'learning_rate': 1.6810120627758176e-06, 'epoch': 0.74} 74%|███████▍ | 16332/22095 [28:09:38<5:27:15, 3.41s/it] 74%|███████▍ | 16333/22095 [28:09:42<5:32:20, 3.46s/it] {'loss': 0.2926, 'grad_norm': 0.6788892856512094, 'learning_rate': 1.6804639378129017e-06, 'epoch': 0.74} 74%|███████▍ | 16333/22095 [28:09:42<5:32:20, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48175 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48538 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89938 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16334/22095 [28:09:45<5:25:29, 3.39s/it] {'loss': 0.2937, 'grad_norm': 0.5922790130333325, 'learning_rate': 1.6799158841766206e-06, 'epoch': 0.74} 74%|███████▍ | 16334/22095 [28:09:45<5:25:29, 3.39s/it] 74%|███████▍ | 16335/22095 [28:09:48<5:23:38, 3.37s/it] {'loss': 0.2815, 'grad_norm': 0.5851526631177592, 'learning_rate': 1.679367901878749e-06, 'epoch': 0.74} 74%|███████▍ | 16335/22095 [28:09:48<5:23:38, 3.37s/it] 74%|███████▍ | 16336/22095 [28:09:52<5:30:07, 3.44s/it] {'loss': 0.3148, 'grad_norm': 0.6265131905703484, 'learning_rate': 1.6788199909310626e-06, 'epoch': 0.74} 74%|███████▍ | 16336/22095 [28:09:52<5:30:07, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48780 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41514 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16337/22095 [28:09:56<5:32:20, 3.46s/it] {'loss': 0.309, 'grad_norm': 0.5723849009706529, 'learning_rate': 1.6782721513453353e-06, 'epoch': 0.74} 74%|███████▍ | 16337/22095 [28:09:56<5:32:20, 3.46s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8885566 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8719, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 3\nB. 10\nC. 5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 74%|███████▍ | 16338/22095 [28:09:59<5:26:50, 3.41s/it] {'loss': 0.3172, 'grad_norm': 0.6314188202897429, 'learning_rate': 1.6777243831333383e-06, 'epoch': 0.74} 74%|███████▍ | 16338/22095 [28:09:59<5:26:50, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65977 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16339/22095 [28:10:02<5:24:39, 3.38s/it] {'loss': 0.2789, 'grad_norm': 0.5981352108047535, 'learning_rate': 1.6771766863068389e-06, 'epoch': 0.74} 74%|███████▍ | 16339/22095 [28:10:02<5:24:39, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16340/22095 [28:10:11<8:07:03, 5.08s/it] {'loss': 0.4823, 'grad_norm': 1.6972236802070124, 'learning_rate': 1.6766290608776093e-06, 'epoch': 0.74} 74%|███████▍ | 16340/22095 [28:10:11<8:07:03, 5.08s/it] 74%|███████▍ | 16341/22095 [28:10:15<7:33:24, 4.73s/it] {'loss': 0.3213, 'grad_norm': 0.6993751371813014, 'learning_rate': 1.6760815068574116e-06, 'epoch': 0.74} 74%|███████▍ | 16341/22095 [28:10:15<7:33:24, 4.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", 
line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [500, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8464646 in VC:s3://internvl-moe-sft-data/. Exception: Image size [500, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 119197, 'image': 'vrdu_texteq/astro-ph.CO/b6474ded-fdfe-4901-9714-3d51f194482b.png', 'image_wh': [[500, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and the unit matrix $\\mathcal{I}_N$ in $N$ dimensions.'}]} 74%|███████▍ | 16342/22095 [28:10:25<9:50:25, 6.16s/it] {'loss': 0.4798, 'grad_norm': 0.31186821878598653, 'learning_rate': 1.6755340242580158e-06, 'epoch': 0.74} 74%|███████▍ | 16342/22095 [28:10:25<9:50:25, 6.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43897 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83756 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76021 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16343/22095 [28:10:28<8:39:33, 5.42s/it] {'loss': 0.3498, 'grad_norm': 0.647280432904068, 'learning_rate': 1.674986613091184e-06, 'epoch': 0.74} 74%|███████▍ | 16343/22095 [28:10:28<8:39:33, 5.42s/it] 74%|███████▍ | 16344/22095 [28:10:31<7:26:52, 4.66s/it] {'loss': 0.2766, 'grad_norm': 0.9307783766001247, 'learning_rate': 1.6744392733686754e-06, 'epoch': 0.74} 74%|███████▍ | 16344/22095 [28:10:31<7:26:52, 4.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16345/22095 [28:10:41<9:47:10, 6.13s/it] {'loss': 0.4809, 'grad_norm': 0.3135811131714723, 'learning_rate': 1.673892005102254e-06, 'epoch': 0.74} 74%|███████▍ | 16345/22095 [28:10:41<9:47:10, 6.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16346/22095 [28:10:44<8:17:33, 5.19s/it] {'loss': 0.2704, 'grad_norm': 0.7435748301186856, 'learning_rate': 1.6733448083036806e-06, 'epoch': 0.74} 74%|███████▍ | 16346/22095 [28:10:44<8:17:33, 5.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16347/22095 [28:10:51<9:03:25, 5.67s/it] {'loss': 0.4669, 'grad_norm': 0.28471121810758254, 'learning_rate': 1.6727976829847075e-06, 'epoch': 0.74} 74%|███████▍ | 16347/22095 [28:10:51<9:03:25, 5.67s/it] 74%|███████▍ | 16348/22095 [28:10:54<8:09:28, 5.11s/it] {'loss': 0.2602, 'grad_norm': 0.590816121161494, 'learning_rate': 1.6722506291570929e-06, 'epoch': 0.74} 74%|███████▍ | 16348/22095 [28:10:54<8:09:28, 5.11s/it] 74%|███████▍ | 16349/22095 [28:10:59<8:05:26, 5.07s/it] {'loss': 0.269, 'grad_norm': 0.5922176862394455, 'learning_rate': 1.671703646832592e-06, 'epoch': 0.74} 74%|███████▍ | 16349/22095 [28:10:59<8:05:26, 5.07s/it] 74%|███████▍ | 16350/22095 [28:11:02<6:57:48, 4.36s/it] {'loss': 0.2886, 'grad_norm': 0.633108164986941, 'learning_rate': 1.6711567360229613e-06, 'epoch': 
0.74} 74%|███████▍ | 16350/22095 [28:11:02<6:57:48, 4.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16351/22095 [28:11:12<9:25:12, 5.90s/it] {'loss': 0.4559, 'grad_norm': 0.25987291362248266, 'learning_rate': 1.6706098967399454e-06, 'epoch': 0.74} 74%|███████▍ | 16351/22095 [28:11:12<9:25:12, 5.90s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16352/22095 [28:11:15<8:08:08, 5.10s/it] {'loss': 0.2582, 'grad_norm': 0.672889853871863, 'learning_rate': 1.6700631289952967e-06, 'epoch': 0.74} 74%|███████▍ | 16352/22095 [28:11:15<8:08:08, 5.10s/it] 74%|███████▍ | 16353/22095 [28:11:18<7:04:37, 4.44s/it] {'loss': 0.307, 'grad_norm': 0.6149642362793232, 'learning_rate': 1.6695164328007663e-06, 'epoch': 0.74} 74%|███████▍ | 16353/22095 [28:11:18<7:04:37, 4.44s/it] 74%|███████▍ | 16354/22095 [28:11:21<6:20:06, 3.97s/it] {'loss': 0.298, 'grad_norm': 0.637888793193678, 'learning_rate': 1.6689698081680988e-06, 'epoch': 0.74} 74%|███████▍ | 16354/22095 [28:11:21<6:20:06, 3.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41422 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52645 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16355/22095 [28:11:24<5:59:05, 3.75s/it] {'loss': 0.3063, 'grad_norm': 0.629594169843332, 'learning_rate': 1.6684232551090385e-06, 'epoch': 0.74} 74%|███████▍ | 16355/22095 [28:11:24<5:59:05, 3.75s/it] 74%|███████▍ | 16356/22095 [28:11:28<6:01:39, 3.78s/it] {'loss': 0.2921, 'grad_norm': 0.5993736230807224, 'learning_rate': 1.6678767736353313e-06, 'epoch': 0.74} 74%|███████▍ | 16356/22095 [28:11:28<6:01:39, 3.78s/it] 74%|███████▍ | 16357/22095 [28:11:31<5:47:29, 3.63s/it] {'loss': 0.2944, 'grad_norm': 0.6533709936406462, 'learning_rate': 1.6673303637587169e-06, 'epoch': 0.74} 74%|███████▍ | 16357/22095 [28:11:31<5:47:29, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (116355 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119262 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16358/22095 [28:11:34<5:25:36, 3.41s/it] {'loss': 0.2746, 'grad_norm': 0.6279231132277909, 'learning_rate': 1.6667840254909395e-06, 'epoch': 0.74} 74%|███████▍ | 16358/22095 [28:11:34<5:25:36, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99526 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16359/22095 [28:11:37<5:15:51, 3.30s/it] {'loss': 0.3198, 'grad_norm': 0.6895537865960341, 'learning_rate': 1.6662377588437356e-06, 'epoch': 0.74} 74%|███████▍ | 16359/22095 [28:11:37<5:15:51, 3.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16360/22095 [28:11:40<5:17:21, 3.32s/it] {'loss': 0.2509, 'grad_norm': 0.5613014329500297, 'learning_rate': 1.6656915638288423e-06, 'epoch': 0.74} 74%|███████▍ | 16360/22095 [28:11:40<5:17:21, 3.32s/it] 74%|███████▍ | 16361/22095 [28:11:43<5:07:39, 3.22s/it] {'loss': 0.2772, 'grad_norm': 0.5244703265777766, 'learning_rate': 1.6651454404579965e-06, 'epoch': 0.74} 74%|███████▍ | 16361/22095 [28:11:43<5:07:39, 3.22s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (111125556 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 74%|███████▍ | 16362/22095 [28:11:47<5:20:34, 3.35s/it] {'loss': 0.2939, 'grad_norm': 0.6434737155875516, 'learning_rate': 1.6645993887429345e-06, 'epoch': 0.74} 74%|███████▍ | 16362/22095 [28:11:47<5:20:34, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (99704 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16363/22095 [28:11:56<8:15:24, 5.19s/it] {'loss': 0.4551, 'grad_norm': 0.27442691120208346, 'learning_rate': 1.664053408695388e-06, 'epoch': 0.74} 74%|███████▍ | 16363/22095 [28:11:56<8:15:24, 5.19s/it] 74%|███████▍ | 16364/22095 [28:12:00<7:27:41, 4.69s/it] {'loss': 0.3343, 'grad_norm': 0.6432461688130403, 'learning_rate': 1.6635075003270861e-06, 'epoch': 0.74} 74%|███████▍ | 16364/22095 [28:12:00<7:27:41, 4.69s/it] 74%|███████▍ | 16365/22095 [28:12:04<7:03:43, 4.44s/it] {'loss': 0.306, 'grad_norm': 0.6585033459543134, 'learning_rate': 1.6629616636497615e-06, 'epoch': 0.74} 74%|███████▍ | 16365/22095 [28:12:04<7:03:43, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16366/22095 [28:12:13<9:28:33, 5.95s/it] {'loss': 0.4308, 'grad_norm': 0.28741188531833256, 'learning_rate': 1.6624158986751427e-06, 'epoch': 0.74} 74%|███████▍ | 16366/22095 [28:12:13<9:28:33, 5.95s/it] 74%|███████▍ | 16367/22095 [28:12:17<8:17:48, 5.21s/it] {'loss': 0.2924, 'grad_norm': 0.6272621824628382, 'learning_rate': 1.661870205414956e-06, 'epoch': 0.74} 74%|███████▍ | 16367/22095 [28:12:17<8:17:48, 5.21s/it] 74%|███████▍ | 16368/22095 [28:12:21<7:48:06, 4.90s/it] {'loss': 0.3293, 'grad_norm': 0.6357340295514764, 'learning_rate': 1.6613245838809244e-06, 'epoch': 0.74} 74%|███████▍ | 16368/22095 [28:12:21<7:48:06, 4.90s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16369/22095 [28:12:24<6:58:23, 4.38s/it] {'loss': 0.2847, 'grad_norm': 0.586649855741785, 'learning_rate': 1.6607790340847757e-06, 'epoch': 0.74} 74%|███████▍ | 16369/22095 [28:12:24<6:58:23, 4.38s/it] 74%|███████▍ | 16370/22095 [28:12:28<6:38:21, 4.17s/it] {'loss': 0.3355, 'grad_norm': 0.6519309610805597, 'learning_rate': 1.6602335560382276e-06, 'epoch': 0.74} 74%|███████▍ | 16370/22095 [28:12:28<6:38:21, 
4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16371/22095 [28:12:37<8:59:51, 5.66s/it] {'loss': 0.4759, 'grad_norm': 0.27314375144628994, 'learning_rate': 1.6596881497530054e-06, 'epoch': 0.74} 74%|███████▍ | 16371/22095 [28:12:37<8:59:51, 5.66s/it] 74%|███████▍ | 16372/22095 [28:12:41<8:28:09, 5.33s/it] {'loss': 0.3212, 'grad_norm': 0.6334847302860123, 'learning_rate': 1.6591428152408256e-06, 'epoch': 0.74} 74%|███████▍ | 16372/22095 [28:12:41<8:28:09, 5.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68896 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16373/22095 [28:12:45<7:33:13, 4.75s/it] {'loss': 0.2976, 'grad_norm': 0.6054026655609173, 'learning_rate': 1.6585975525134041e-06, 'epoch': 0.74} 74%|███████▍ | 16373/22095 [28:12:45<7:33:13, 4.75s/it] 74%|███████▍ | 16374/22095 [28:12:48<6:58:54, 4.39s/it] {'loss': 0.3441, 'grad_norm': 0.6606486067332609, 'learning_rate': 1.658052361582459e-06, 'epoch': 0.74} 74%|███████▍ | 16374/22095 [28:12:48<6:58:54, 4.39s/it] 74%|███████▍ | 16375/22095 [28:12:52<6:23:11, 4.02s/it] {'loss': 0.3342, 'grad_norm': 0.7073574769713497, 'learning_rate': 1.6575072424597083e-06, 'epoch': 0.74} 74%|███████▍ | 16375/22095 [28:12:52<6:23:11, 4.02s/it] 74%|███████▍ | 16376/22095 [28:12:55<6:01:03, 3.79s/it] {'loss': 0.2961, 'grad_norm': 0.6335308387328569, 'learning_rate': 1.6569621951568575e-06, 'epoch': 0.74} 74%|███████▍ | 16376/22095 [28:12:55<6:01:03, 3.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63145 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16377/22095 [28:12:58<5:51:32, 3.69s/it] {'loss': 0.2878, 'grad_norm': 0.5610231362990775, 'learning_rate': 1.6564172196856222e-06, 'epoch': 0.74} 74%|███████▍ | 16377/22095 [28:12:58<5:51:32, 3.69s/it] 74%|███████▍ | 16378/22095 [28:13:02<5:44:12, 3.61s/it] {'loss': 0.2611, 'grad_norm': 0.5774696662726435, 'learning_rate': 1.6558723160577118e-06, 'epoch': 0.74} 74%|███████▍ | 16378/22095 [28:13:02<5:44:12, 3.61s/it] 74%|███████▍ | 16379/22095 [28:13:05<5:48:48, 3.66s/it] {'loss': 0.3021, 'grad_norm': 0.6121664560415999, 'learning_rate': 1.655327484284837e-06, 'epoch': 0.74} 74%|███████▍ | 16379/22095 [28:13:05<5:48:48, 3.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [350, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8444064 in VC:s3://internvl-moe-sft-data/. Exception: Image size [350, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 155415, 'image': 'vrdu_texteq/astro-ph.CO/4ee9bc3e-1c21-4a2a-8364-4a5ee3c22590.png', 'image_wh': [[350, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $N_m$ is defined as such.'}]} 74%|███████▍ | 16380/22095 [28:13:09<5:33:47, 3.50s/it] {'loss': 0.2902, 'grad_norm': 0.6178927095424993, 'learning_rate': 1.6547827243787002e-06, 'epoch': 0.74} 74%|███████▍ | 16380/22095 [28:13:09<5:33:47, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (74048 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16381/22095 [28:13:19<9:03:18, 5.71s/it] {'loss': 0.4395, 'grad_norm': 0.28287751021161356, 'learning_rate': 1.654238036351008e-06, 'epoch': 0.74} 74%|███████▍ | 16381/22095 [28:13:19<9:03:18, 5.71s/it] 74%|███████▍ | 16382/22095 [28:13:24<8:27:10, 5.33s/it] {'loss': 0.3038, 'grad_norm': 0.598779619002262, 'learning_rate': 1.6536934202134663e-06, 'epoch': 0.74} 74%|███████▍ | 16382/22095 [28:13:24<8:27:10, 5.33s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [645, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8524250 in VC:s3://internvl-moe-sft-data/. Exception: Image size [645, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 60425, 'image': 'vrdu_texteq/astro-ph.CO/0d690dd0-4b67-47e0-bc62-4de64b0b1e3b.png', 'image_wh': [[645, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where the flux relation is now written in terms of $\\bm\\theta_h$.'}]} 74%|███████▍ | 16383/22095 [28:13:27<7:39:03, 4.82s/it] {'loss': 0.2914, 'grad_norm': 0.5969691615599192, 'learning_rate': 1.6531488759777753e-06, 'epoch': 0.74} 74%|███████▍ | 16383/22095 [28:13:28<7:39:03, 4.82s/it] 74%|███████▍ | 16384/22095 [28:13:31<7:14:57, 4.57s/it] {'loss': 0.2809, 'grad_norm': 0.667321425427489, 'learning_rate': 1.6526044036556349e-06, 'epoch': 0.74} 74%|███████▍ | 16384/22095 [28:13:31<7:14:57, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16385/22095 [28:13:41<9:27:49, 5.97s/it] {'loss': 0.4745, 'grad_norm': 0.2795974705144027, 'learning_rate': 1.6520600032587464e-06, 'epoch': 0.74} 74%|███████▍ | 16385/22095 [28:13:41<9:27:49, 5.97s/it] 74%|███████▍ | 16386/22095 [28:13:45<8:27:56, 5.34s/it] {'loss': 0.2814, 'grad_norm': 0.7550919144344835, 'learning_rate': 1.6515156747988043e-06, 'epoch': 0.74} 74%|███████▍ | 16386/22095 [28:13:45<8:27:56, 5.34s/it] 74%|███████▍ | 16387/22095 [28:13:48<7:40:53, 4.84s/it] {'loss': 0.2722, 'grad_norm': 0.6126708197070092, 'learning_rate': 1.650971418287508e-06, 'epoch': 0.74} 74%|███████▍ | 16387/22095 [28:13:48<7:40:53, 4.84s/it] 74%|███████▍ | 16388/22095 [28:13:52<6:57:37, 4.39s/it] {'loss': 0.3413, 'grad_norm': 0.6412752477864133, 'learning_rate': 1.6504272337365501e-06, 'epoch': 0.74} 74%|███████▍ | 16388/22095 [28:13:52<6:57:37, 4.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in 
__getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047936 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6'}]} 74%|███████▍ | 16389/22095 [28:14:01<9:32:53, 6.02s/it] {'loss': 0.5012, 'grad_norm': 0.2807606489818299, 'learning_rate': 1.6498831211576222e-06, 'epoch': 0.74} 74%|███████▍ | 16389/22095 [28:14:01<9:32:53, 6.02s/it] 74%|███████▍ | 16390/22095 [28:14:10<10:31:53, 6.65s/it] {'loss': 0.4903, 'grad_norm': 0.27978365250473713, 'learning_rate': 1.6493390805624165e-06, 'epoch': 0.74} 74%|███████▍ | 16390/22095 [28:14:10<10:31:53, 6.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 74%|███████▍ | 16391/22095 [28:14:14<9:15:37, 5.84s/it] {'loss': 0.3449, 'grad_norm': 0.5874638050564923, 'learning_rate': 1.648795111962625e-06, 'epoch': 0.74} 74%|███████▍ | 16391/22095 [28:14:14<9:15:37, 5.84s/it]VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/3f0da374b929eea01ea24b61f298cfa1.png 2025-08-28 20:12:12.290915 load time: 1036.19 ms VC:s3://gui-agent/data_20250630/web/images/yang_0704161042/10_140_52_49_0704171614/img/0.png 2025-08-28 20:12:12.292474 load time: 1050.55 ms 74%|███████▍ | 16392/22095 [28:14:23<10:51:21, 6.85s/it] {'loss': 0.4518, 'grad_norm': 0.24334727858073832, 'learning_rate': 1.6482512153699344e-06, 'epoch': 0.74} 74%|███████▍ | 16392/22095 [28:14:23<10:51:21, 6.85s/it]Invalidate trace cache @ step 
2: expected module 364, but got module 1 74%|███████▍ | 16393/22095 [28:14:27<9:50:38, 6.22s/it] {'loss': 0.3001, 'grad_norm': 0.6706376008021124, 'learning_rate': 1.647707390796029e-06, 'epoch': 0.74} 74%|███████▍ | 16393/22095 [28:14:27<9:50:38, 6.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348843 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 15513, 'image': 'vrdu_table_final_2/astro-ph.CO/abe22ada-845a-40b5-a241-77733201d370.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{c}$S_{4}$\\end{tabular}\n```"}]} 74%|███████▍ | 16394/22095 [28:14:31<8:37:26, 5.45s/it] {'loss': 0.3018, 'grad_norm': 0.581696236394703, 'learning_rate': 1.6471636382525963e-06, 'epoch': 0.74} 74%|███████▍ | 16394/22095 [28:14:31<8:37:26, 5.45s/it] 74%|███████▍ | 16395/22095 [28:14:34<7:21:17, 4.65s/it] {'loss': 0.2851, 'grad_norm': 0.6585204969144176, 'learning_rate': 1.6466199577513209e-06, 'epoch': 0.74} 74%|███████▍ | 16395/22095 [28:14:34<7:21:17, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (116388 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16396/22095 [28:14:42<8:49:18, 5.57s/it] {'loss': 0.4776, 'grad_norm': 0.2582143407608148, 'learning_rate': 1.646076349303884e-06, 'epoch': 0.74} 74%|███████▍ | 16396/22095 [28:14:42<8:49:18, 5.57s/it] 74%|███████▍ | 16397/22095 [28:14:45<7:46:43, 4.91s/it] {'loss': 0.2621, 'grad_norm': 0.5990806798374867, 'learning_rate': 1.6455328129219634e-06, 'epoch': 0.74} 74%|███████▍ | 16397/22095 [28:14:45<7:46:43, 4.91s/it] 74%|███████▍ | 16398/22095 [28:14:49<7:07:28, 4.50s/it] {'loss': 0.3084, 'grad_norm': 0.5858539733747705, 'learning_rate': 1.6449893486172418e-06, 'epoch': 0.74} 74%|███████▍ | 16398/22095 [28:14:49<7:07:28, 4.50s/it] 74%|███████▍ | 16399/22095 [28:14:52<6:43:07, 4.25s/it] {'loss': 0.3453, 'grad_norm': 0.6423462532100619, 'learning_rate': 1.6444459564013938e-06, 'epoch': 0.74} 74%|███████▍ | 16399/22095 [28:14:52<6:43:07, 4.25s/it] 74%|███████▍ | 16400/22095 [28:14:55<6:08:04, 3.88s/it] {'loss': 0.3212, 'grad_norm': 0.6999659736527946, 'learning_rate': 1.6439026362860977e-06, 'epoch': 0.74} 74%|███████▍ | 16400/22095 [28:14:55<6:08:04, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70559 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16401/22095 [28:14:59<5:59:51, 3.79s/it] {'loss': 0.2644, 'grad_norm': 0.6350926523497435, 'learning_rate': 1.6433593882830262e-06, 'epoch': 0.74} 74%|███████▍ | 16401/22095 [28:14:59<5:59:51, 3.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16402/22095 [28:15:02<5:40:53, 3.59s/it] {'loss': 0.2781, 'grad_norm': 0.6700380593625176, 'learning_rate': 1.642816212403851e-06, 'epoch': 0.74} 74%|███████▍ | 16402/22095 [28:15:02<5:40:53, 3.59s/it] 74%|███████▍ | 16403/22095 [28:15:05<5:20:46, 3.38s/it] {'loss': 0.3064, 'grad_norm': 0.5725905632375857, 'learning_rate': 1.642273108660245e-06, 'epoch': 0.74} 74%|███████▍ | 16403/22095 [28:15:05<5:20:46, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (73451 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51083 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (131711 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89958 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43984 > 40960) for 4 sample(s). Truncating to 3024 with 3 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (43256 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16404/22095 [28:15:14<8:10:58, 5.18s/it] {'loss': 0.4693, 'grad_norm': 0.30964321741467477, 'learning_rate': 1.6417300770638784e-06, 'epoch': 0.74} 74%|███████▍ | 16404/22095 [28:15:14<8:10:58, 5.18s/it] 74%|███████▍ | 16405/22095 [28:15:17<7:13:17, 4.57s/it] {'loss': 0.3223, 'grad_norm': 0.6382625480112772, 'learning_rate': 1.6411871176264188e-06, 'epoch': 0.74} 74%|███████▍ | 16405/22095 [28:15:17<7:13:17, 4.57s/it] 74%|███████▍ | 16406/22095 [28:15:20<6:25:09, 4.06s/it] {'loss': 0.2886, 'grad_norm': 0.7372807946732123, 'learning_rate': 1.6406442303595305e-06, 'epoch': 0.74} 74%|███████▍ | 16406/22095 [28:15:20<6:25:09, 4.06s/it] 74%|███████▍ | 16407/22095 [28:15:23<5:51:44, 3.71s/it] {'loss': 0.249, 'grad_norm': 0.6382685159478572, 'learning_rate': 1.6401014152748801e-06, 'epoch': 0.74} 74%|███████▍ | 16407/22095 [28:15:23<5:51:44, 3.71s/it] 74%|███████▍ | 16408/22095 [28:15:26<5:30:47, 3.49s/it] {'loss': 0.3053, 'grad_norm': 0.5808086404162707, 'learning_rate': 1.6395586723841328e-06, 'epoch': 0.74} 74%|███████▍ | 16408/22095 [28:15:26<5:30:47, 3.49s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16409/22095 [28:15:35<8:08:42, 5.16s/it] {'loss': 0.5011, 'grad_norm': 0.30327634396753367, 'learning_rate': 1.6390160016989487e-06, 'epoch': 0.74} 74%|███████▍ | 16409/22095 [28:15:35<8:08:42, 5.16s/it] 74%|███████▍ | 16410/22095 [28:15:39<7:22:13, 4.67s/it] {'loss': 0.2838, 'grad_norm': 0.5749949314930741, 'learning_rate': 1.6384734032309868e-06, 'epoch': 0.74} 74%|███████▍ | 16410/22095 [28:15:39<7:22:13, 4.67s/it] 74%|███████▍ | 16411/22095 [28:15:42<6:34:59, 4.17s/it] {'loss': 0.2751, 'grad_norm': 0.8120256752960093, 'learning_rate': 1.6379308769919084e-06, 'epoch': 0.74} 74%|███████▍ | 16411/22095 [28:15:42<6:34:59, 
4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55894 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87495 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41643 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16412/22095 [28:15:51<9:06:31, 5.77s/it] {'loss': 0.4333, 'grad_norm': 0.2966374383743913, 'learning_rate': 1.63738842299337e-06, 'epoch': 0.74} 74%|███████▍ | 16412/22095 [28:15:51<9:06:31, 5.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59814 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56085 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46674 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16413/22095 [28:15:55<8:14:29, 5.22s/it] {'loss': 0.2531, 'grad_norm': 0.817417467052321, 'learning_rate': 1.6368460412470255e-06, 'epoch': 0.74} 74%|███████▍ | 16413/22095 [28:15:55<8:14:29, 5.22s/it] 74%|███████▍ | 16414/22095 [28:15:58<7:17:08, 4.62s/it] {'loss': 0.2897, 'grad_norm': 0.6579780976381642, 'learning_rate': 1.636303731764532e-06, 'epoch': 0.74} 74%|███████▍ | 16414/22095 [28:15:58<7:17:08, 4.62s/it] 74%|███████▍ | 16415/22095 [28:16:02<7:03:34, 4.47s/it] {'loss': 0.2566, 'grad_norm': 0.532435367077427, 'learning_rate': 1.635761494557539e-06, 'epoch': 0.74} 74%|███████▍ | 16415/22095 [28:16:02<7:03:34, 4.47s/it] 74%|███████▍ | 16416/22095 [28:16:06<6:38:06, 4.21s/it] {'loss': 0.3194, 'grad_norm': 0.573111185664909, 'learning_rate': 1.6352193296377006e-06, 'epoch': 0.74} 74%|███████▍ | 16416/22095 [28:16:06<6:38:06, 4.21s/it] 74%|███████▍ | 16417/22095 [28:16:09<6:09:51, 3.91s/it] {'loss': 0.2936, 'grad_norm': 0.6146707134736058, 'learning_rate': 1.6346772370166646e-06, 'epoch': 0.74} 74%|███████▍ | 16417/22095 [28:16:09<6:09:51, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16418/22095 [28:16:13<5:51:44, 3.72s/it] {'loss': 0.266, 'grad_norm': 0.6074305357766798, 'learning_rate': 1.634135216706077e-06, 'epoch': 0.74} 74%|███████▍ | 16418/22095 [28:16:13<5:51:44, 3.72s/it] 74%|███████▍ | 16419/22095 [28:16:16<5:58:32, 3.79s/it] {'loss': 0.3097, 'grad_norm': 0.5672575122908394, 'learning_rate': 1.6335932687175865e-06, 'epoch': 0.74} 74%|███████▍ | 16419/22095 [28:16:16<5:58:32, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16420/22095 [28:16:22<6:37:42, 4.20s/it] {'loss': 0.4583, 'grad_norm': 0.27175533404918323, 'learning_rate': 1.6330513930628389e-06, 'epoch': 0.74} 74%|███████▍ | 16420/22095 [28:16:22<6:37:42, 4.20s/it] 
74%|███████▍ | 16421/22095 [28:16:31<9:07:56, 5.79s/it] {'loss': 0.4705, 'grad_norm': 0.27517315541514614, 'learning_rate': 1.6325095897534765e-06, 'epoch': 0.74} 74%|███████▍ | 16421/22095 [28:16:31<9:07:56, 5.79s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 74%|███████▍ | 16422/22095 [28:16:35<8:04:32, 5.12s/it] {'loss': 0.3075, 'grad_norm': 0.6187333387559553, 'learning_rate': 1.6319678588011385e-06, 'epoch': 0.74} 74%|███████▍ | 16422/22095 [28:16:35<8:04:32, 5.12s/it] 74%|███████▍ | 16423/22095 [28:16:38<7:22:37, 4.68s/it] {'loss': 0.2521, 'grad_norm': 0.5779202963904356, 'learning_rate': 1.6314262002174674e-06, 'epoch': 0.74} 74%|███████▍ | 16423/22095 [28:16:38<7:22:37, 4.68s/it] 74%|███████▍ | 16424/22095 [28:16:42<7:02:26, 4.47s/it] {'loss': 0.2957, 'grad_norm': 0.7523310961274872, 'learning_rate': 1.6308846140141027e-06, 'epoch': 0.74} 74%|███████▍ | 16424/22095 [28:16:42<7:02:26, 4.47s/it] 74%|███████▍ | 16425/22095 [28:16:45<6:22:49, 4.05s/it] {'loss': 0.2476, 'grad_norm': 0.5847693172280757, 'learning_rate': 1.630343100202681e-06, 'epoch': 0.74} 74%|███████▍ | 16425/22095 [28:16:45<6:22:49, 4.05s/it] 74%|███████▍ | 16426/22095 [28:16:49<6:10:40, 3.92s/it] {'loss': 0.3456, 'grad_norm': 0.6238932100164696, 'learning_rate': 1.6298016587948345e-06, 'epoch': 0.74} 74%|███████▍ | 16426/22095 [28:16:49<6:10:40, 3.92s/it] 74%|███████▍ | 16427/22095 [28:16:52<5:45:54, 3.66s/it] {'loss': 0.2797, 'grad_norm': 0.5644616877249133, 'learning_rate': 1.6292602898022015e-06, 'epoch': 0.74} 74%|███████▍ | 16427/22095 [28:16:52<5:45:54, 3.66s/it] 74%|███████▍ | 16428/22095 [28:16:56<5:41:32, 3.62s/it] {'loss': 0.287, 'grad_norm': 0.6019491578971776, 'learning_rate': 1.6287189932364106e-06, 'epoch': 0.74} 74%|███████▍ | 16428/22095 [28:16:56<5:41:32, 3.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" 
in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 89, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8410449 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 89, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 12648, 'image': 'vrdu_table_final_2/astro-ph.CO/576925fe-7cb5-48d1-b9ae-d5f1bc5ff1f2.png', 'image_wh': [[23, 89]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}{c}\n $A$\\\\\n $B$\\\\\n $C$\\\\\n \\end{tabular}\n```"}]} 74%|███████▍ | 16429/22095 [28:16:59<5:28:56, 3.48s/it] {'loss': 0.3096, 'grad_norm': 0.5979609372657156, 'learning_rate': 1.6281777691090966e-06, 'epoch': 0.74} 74%|███████▍ | 16429/22095 [28:16:59<5:28:56, 3.48s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16430/22095 [28:17:02<5:26:29, 3.46s/it] {'loss': 0.2774, 'grad_norm': 0.5681120904040521, 'learning_rate': 1.6276366174318865e-06, 'epoch': 0.74} 74%|███████▍ | 16430/22095 [28:17:02<5:26:29, 3.46s/it] 74%|███████▍ | 16431/22095 [28:17:06<5:42:31, 3.63s/it] {'loss': 0.2919, 'grad_norm': 0.6291721807245989, 'learning_rate': 1.627095538216406e-06, 'epoch': 0.74} 74%|███████▍ | 16431/22095 [28:17:06<5:42:31, 3.63s/it] 74%|███████▍ | 16432/22095 [28:17:10<6:01:34, 3.83s/it] {'loss': 0.2693, 'grad_norm': 0.6375260230038594, 'learning_rate': 1.6265545314742838e-06, 'epoch': 0.74} 74%|███████▍ | 16432/22095 [28:17:10<6:01:34, 3.83s/it] 74%|███████▍ | 16433/22095 [28:17:14<5:48:24, 3.69s/it] {'loss': 0.3493, 'grad_norm': 0.6516953368882744, 'learning_rate': 
1.6260135972171448e-06, 'epoch': 0.74} 74%|███████▍ | 16433/22095 [28:17:14<5:48:24, 3.69s/it] 74%|███████▍ | 16434/22095 [28:17:17<5:25:10, 3.45s/it] {'loss': 0.2763, 'grad_norm': 0.6326681888223729, 'learning_rate': 1.625472735456612e-06, 'epoch': 0.74} 74%|███████▍ | 16434/22095 [28:17:17<5:25:10, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96050 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49506 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16435/22095 [28:17:20<5:08:21, 3.27s/it] {'loss': 0.2825, 'grad_norm': 0.6176613716364594, 'learning_rate': 1.6249319462043039e-06, 'epoch': 0.74} 74%|███████▍ | 16435/22095 [28:17:20<5:08:21, 3.27s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16436/22095 [28:17:23<5:05:51, 3.24s/it] {'loss': 0.2937, 'grad_norm': 0.6120209281474143, 'learning_rate': 1.6243912294718428e-06, 'epoch': 0.74} 74%|███████▍ | 16436/22095 [28:17:23<5:05:51, 3.24s/it] 74%|███████▍ | 16437/22095 [28:17:26<4:56:18, 3.14s/it] {'loss': 0.28, 'grad_norm': 0.5795229600915072, 'learning_rate': 1.6238505852708481e-06, 'epoch': 0.74} 74%|███████▍ | 16437/22095 [28:17:26<4:56:18, 3.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46463 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60124 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16438/22095 [28:17:28<4:45:55, 3.03s/it] {'loss': 0.3315, 'grad_norm': 0.617044457533017, 'learning_rate': 1.623310013612936e-06, 'epoch': 0.74} 74%|███████▍ | 16438/22095 [28:17:28<4:45:55, 3.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67043 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53106 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16439/22095 [28:17:31<4:41:57, 2.99s/it] {'loss': 0.2776, 'grad_norm': 0.6151415354863083, 'learning_rate': 1.622769514509719e-06, 'epoch': 0.74} 74%|███████▍ | 16439/22095 [28:17:31<4:41:57, 2.99s/it] 74%|███████▍ | 16440/22095 [28:17:35<4:47:36, 3.05s/it] {'loss': 0.2804, 'grad_norm': 0.6251360148196247, 'learning_rate': 1.6222290879728142e-06, 'epoch': 0.74} 74%|███████▍ | 16440/22095 [28:17:35<4:47:36, 3.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16441/22095 [28:17:37<4:41:59, 2.99s/it] {'loss': 0.2605, 'grad_norm': 0.6917015257469193, 'learning_rate': 1.6216887340138304e-06, 'epoch': 0.74} 74%|███████▍ | 16441/22095 [28:17:37<4:41:59, 2.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49117 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43747 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41972 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48850 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42502 > 40960) for 4 sample(s). Truncating to 945 with 2 samples. 74%|███████▍ | 16442/22095 [28:17:41<4:52:40, 3.11s/it] {'loss': 0.3526, 'grad_norm': 0.616749212638954, 'learning_rate': 1.621148452644382e-06, 'epoch': 0.74} 74%|███████▍ | 16442/22095 [28:17:41<4:52:40, 3.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16443/22095 [28:17:48<6:54:46, 4.40s/it] {'loss': 0.4795, 'grad_norm': 0.2708142105961081, 'learning_rate': 1.6206082438760762e-06, 'epoch': 0.74} 74%|███████▍ | 16443/22095 [28:17:48<6:54:46, 4.40s/it] 74%|███████▍ | 16444/22095 [28:17:52<6:32:33, 4.17s/it] {'loss': 0.2995, 'grad_norm': 0.5801112566416491, 'learning_rate': 1.6200681077205182e-06, 'epoch': 0.74} 74%|███████▍ | 16444/22095 [28:17:52<6:32:33, 4.17s/it] 74%|███████▍ | 16445/22095 [28:17:56<6:42:58, 4.28s/it] {'loss': 0.2847, 'grad_norm': 0.5712979072266995, 'learning_rate': 1.619528044189318e-06, 'epoch': 0.74} 74%|███████▍ | 16445/22095 [28:17:56<6:42:58, 4.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 74%|███████▍ | 16446/22095 [28:18:02<7:34:49, 4.83s/it] {'loss': 0.4951, 'grad_norm': 0.29290842989103355, 'learning_rate': 1.6189880532940772e-06, 'epoch': 0.74} 74%|███████▍ | 16446/22095 [28:18:02<7:34:49, 4.83s/it] 74%|███████▍ | 16447/22095 [28:18:12<9:42:55, 6.19s/it] {'loss': 0.455, 'grad_norm': 0.2558401143062855, 'learning_rate': 1.6184481350463976e-06, 'epoch': 0.74} 74%|███████▍ | 16447/22095 [28:18:12<9:42:55, 6.19s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 74%|███████▍ | 16448/22095 [28:18:16<8:37:57, 5.50s/it] {'loss': 
0.3241, 'grad_norm': 0.7829357141843611, 'learning_rate': 1.6179082894578824e-06, 'epoch': 0.74} 74%|███████▍ | 16448/22095 [28:18:16<8:37:57, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46291 > 40960). Running this sequence through the model will result in indexing errors 74%|███████▍ | 16449/22095 [28:18:19<7:45:03, 4.94s/it] {'loss': 0.2901, 'grad_norm': 0.6261295834024594, 'learning_rate': 1.617368516540132e-06, 'epoch': 0.74} 74%|███████▍ | 16449/22095 [28:18:19<7:45:03, 4.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8557329 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 20675, 'image': '520081854.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this christianity book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 74%|███████▍ | 16450/22095 [28:18:29<9:51:08, 6.28s/it] {'loss': 0.4512, 'grad_norm': 0.26736031489859946, 'learning_rate': 1.6168288163047434e-06, 'epoch': 0.74} 74%|███████▍ | 16450/22095 [28:18:29<9:51:08, 6.28s/it] 74%|███████▍ | 16451/22095 [28:18:32<8:23:54, 5.36s/it] {'loss': 0.2953, 'grad_norm': 0.6085865074985845, 'learning_rate': 1.6162891887633114e-06, 'epoch': 0.74} 74%|███████▍ | 16451/22095 [28:18:32<8:23:54, 5.36s/it] 74%|███████▍ | 16452/22095 [28:18:35<7:20:37, 4.69s/it] {'loss': 0.2951, 'grad_norm': 0.5997255472378802, 'learning_rate': 1.615749633927432e-06, 'epoch': 0.74} 74%|███████▍ | 16452/22095 [28:18:35<7:20:37, 4.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (82126 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83877 > 40960). 
Running this sequence through the model will result in indexing errors 74%|███████▍ | 16453/22095 [28:18:38<6:43:52, 4.30s/it] {'loss': 0.2781, 'grad_norm': 0.6290365174173874, 'learning_rate': 1.615210151808701e-06, 'epoch': 0.74} 74%|███████▍ | 16453/22095 [28:18:38<6:43:52, 4.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16454/22095 [28:18:42<6:18:33, 4.03s/it] {'loss': 0.2618, 'grad_norm': 0.6093195173585996, 'learning_rate': 1.6146707424187086e-06, 'epoch': 0.74} 74%|███████▍ | 16454/22095 [28:18:42<6:18:33, 4.03s/it] 74%|███████▍ | 16455/22095 [28:18:45<5:44:52, 3.67s/it] {'loss': 0.3281, 'grad_norm': 0.6563387860435451, 'learning_rate': 1.6141314057690426e-06, 'epoch': 0.74} 74%|███████▍ | 16455/22095 [28:18:45<5:44:52, 3.67s/it] 74%|███████▍ | 16456/22095 [28:18:48<5:27:04, 3.48s/it] {'loss': 0.2966, 'grad_norm': 0.5755936718531313, 'learning_rate': 1.6135921418712959e-06, 'epoch': 0.74} 74%|███████▍ | 16456/22095 [28:18:48<5:27:04, 3.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8584048 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 14993, 'image': '671041312.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Romance? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Education & Teaching? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8959587 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10422, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 
6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 74%|███████▍ | 16457/22095 [28:18:54<6:59:10, 4.46s/it] {'loss': 0.4748, 'grad_norm': 0.26943398549730296, 'learning_rate': 1.6130529507370513e-06, 'epoch': 0.74} 74%|███████▍ | 16457/22095 [28:18:55<6:59:10, 4.46s/it] 74%|███████▍ | 16458/22095 [28:18:58<6:24:16, 4.09s/it] {'loss': 0.3227, 'grad_norm': 0.6445979510026212, 'learning_rate': 1.6125138323778983e-06, 'epoch': 0.74} 74%|███████▍ | 16458/22095 [28:18:58<6:24:16, 4.09s/it] 74%|███████▍ | 16459/22095 [28:19:02<6:15:36, 4.00s/it] {'loss': 0.3094, 'grad_norm': 0.6014424196667216, 'learning_rate': 1.6119747868054193e-06, 'epoch': 0.74} 74%|███████▍ | 16459/22095 [28:19:02<6:15:36, 4.00s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 74%|███████▍ | 16460/22095 [28:19:06<6:30:14, 4.16s/it] {'loss': 0.2948, 'grad_norm': 0.5965822463164119, 'learning_rate': 1.6114358140311948e-06, 'epoch': 0.74} 74%|███████▍ | 16460/22095 [28:19:06<6:30:14, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16461/22095 [28:19:13<7:39:24, 4.89s/it] {'loss': 0.4665, 'grad_norm': 0.27711896806631753, 'learning_rate': 1.610896914066808e-06, 'epoch': 0.75} 75%|███████▍ | 16461/22095 [28:19:13<7:39:24, 4.89s/it] 75%|███████▍ | 16462/22095 [28:19:16<6:53:09, 4.40s/it] {'loss': 0.2882, 'grad_norm': 0.5903313173991954, 'learning_rate': 1.6103580869238388e-06, 'epoch': 0.75} 75%|███████▍ | 16462/22095 [28:19:16<6:53:09, 4.40s/it] 75%|███████▍ | 16463/22095 [28:19:19<6:06:51, 3.91s/it] {'loss': 0.2436, 'grad_norm': 0.5993926220789347, 'learning_rate': 1.609819332613864e-06, 'epoch': 0.75} 75%|███████▍ | 16463/22095 [28:19:19<6:06:51, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16464/22095 [28:19:21<5:35:23, 3.57s/it] {'loss': 0.2978, 
'grad_norm': 0.6501241995643003, 'learning_rate': 1.6092806511484576e-06, 'epoch': 0.75} 75%|███████▍ | 16464/22095 [28:19:21<5:35:23, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45736 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52307 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16465/22095 [28:19:25<5:28:11, 3.50s/it] {'loss': 0.3198, 'grad_norm': 0.6725764205319074, 'learning_rate': 1.6087420425391964e-06, 'epoch': 0.75} 75%|███████▍ | 16465/22095 [28:19:25<5:28:11, 3.50s/it] 75%|███████▍ | 16466/22095 [28:19:28<5:16:34, 3.37s/it] {'loss': 0.2942, 'grad_norm': 0.6499926682600154, 'learning_rate': 1.6082035067976553e-06, 'epoch': 0.75} 75%|███████▍ | 16466/22095 [28:19:28<5:16:34, 3.37s/it] 75%|███████▍ | 16467/22095 [28:19:33<6:03:30, 3.88s/it] {'loss': 0.3156, 'grad_norm': 0.7833476590078484, 'learning_rate': 1.6076650439354035e-06, 'epoch': 0.75} 75%|███████▍ | 16467/22095 [28:19:33<6:03:30, 3.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16468/22095 [28:19:36<5:53:57, 3.77s/it] {'loss': 0.3058, 'grad_norm': 0.6036145375496808, 'learning_rate': 1.6071266539640095e-06, 'epoch': 0.75} 75%|███████▍ | 16468/22095 [28:19:36<5:53:57, 3.77s/it] 75%|███████▍ | 16469/22095 [28:19:40<5:52:45, 3.76s/it] {'loss': 0.3243, 'grad_norm': 0.6395409785445747, 'learning_rate': 1.6065883368950447e-06, 'epoch': 0.75} 75%|███████▍ | 16469/22095 [28:19:40<5:52:45, 3.76s/it] 75%|███████▍ | 16470/22095 [28:19:43<5:37:44, 3.60s/it] {'loss': 0.3002, 'grad_norm': 0.607747457083173, 'learning_rate': 1.606050092740073e-06, 'epoch': 0.75} 75%|███████▍ | 16470/22095 [28:19:43<5:37:44, 3.60s/it]Rank 0: Number of image tokens 0 does not match number of images 
1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16471/22095 [28:19:46<5:14:06, 3.35s/it] {'loss': 0.2663, 'grad_norm': 0.6562502783928543, 'learning_rate': 1.6055119215106629e-06, 'epoch': 0.75} 75%|███████▍ | 16471/22095 [28:19:46<5:14:06, 3.35s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8333777 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 386, 'image': 'vrdu_table_final_2/astro-ph.CO/5f3bc9e2-a6ee-4526-8939-7664cbd0b5fa.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]} 75%|███████▍ | 16472/22095 [28:19:49<5:00:39, 3.21s/it] {'loss': 0.3166, 'grad_norm': 0.6645214066337757, 'learning_rate': 1.604973823218376e-06, 'epoch': 0.75} 75%|███████▍ | 16472/22095 [28:19:49<5:00:39, 3.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16473/22095 [28:19:59<8:03:11, 5.16s/it] {'loss': 0.4609, 'grad_norm': 0.26599225456602565, 'learning_rate': 1.6044357978747733e-06, 'epoch': 0.75} 75%|███████▍ | 16473/22095 [28:19:59<8:03:11, 5.16s/it]Rank 0: Number of image tokens 0 does not match number 
of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16474/22095 [28:20:02<7:11:22, 4.60s/it] {'loss': 0.2608, 'grad_norm': 0.6025592209553091, 'learning_rate': 1.603897845491416e-06, 'epoch': 0.75} 75%|███████▍ | 16474/22095 [28:20:02<7:11:22, 4.60s/it] 75%|███████▍ | 16475/22095 [28:20:05<6:24:48, 4.11s/it] {'loss': 0.2616, 'grad_norm': 0.596383754180155, 'learning_rate': 1.6033599660798676e-06, 'epoch': 0.75} 75%|███████▍ | 16475/22095 [28:20:05<6:24:48, 4.11s/it] 75%|███████▍ | 16476/22095 [28:20:08<5:52:45, 3.77s/it] {'loss': 0.3297, 'grad_norm': 0.6375212264875807, 'learning_rate': 1.6028221596516779e-06, 'epoch': 0.75} 75%|███████▍ | 16476/22095 [28:20:08<5:52:45, 3.77s/it] 75%|███████▍ | 16477/22095 [28:20:11<5:28:12, 3.51s/it] {'loss': 0.2497, 'grad_norm': 0.5940876054246925, 'learning_rate': 1.6022844262184061e-06, 'epoch': 0.75} 75%|███████▍ | 16477/22095 [28:20:11<5:28:12, 3.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [345, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8427143 in VC:s3://internvl-moe-sft-data/. Exception: Image size [345, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 106972, 'image': 'vrdu_texteq/astro-ph.CO/b1f7a674-8348-417e-abd3-e3a3fff4d6e3.png', 'image_wh': [[345, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'Where we have used $\\nu \\lambda=c$.'}]} 75%|███████▍ | 16478/22095 [28:20:14<5:21:10, 3.43s/it] {'loss': 0.276, 'grad_norm': 0.6218890417536425, 'learning_rate': 1.6017467657916075e-06, 'epoch': 0.75} 75%|███████▍ | 16478/22095 [28:20:14<5:21:10, 3.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16479/22095 [28:20:19<6:04:19, 3.89s/it] {'loss': 0.3192, 'grad_norm': 0.6154177745187074, 'learning_rate': 1.6012091783828365e-06, 'epoch': 0.75} 75%|███████▍ | 16479/22095 [28:20:19<6:04:19, 3.89s/it] 75%|███████▍ | 16480/22095 [28:20:22<5:37:50, 3.61s/it] {'loss': 0.3104, 'grad_norm': 0.5586980717106319, 'learning_rate': 1.600671664003639e-06, 'epoch': 0.75} 75%|███████▍ | 16480/22095 [28:20:22<5:37:50, 3.61s/it] 75%|███████▍ | 16481/22095 [28:20:25<5:31:35, 3.54s/it] {'loss': 0.2703, 'grad_norm': 0.6361626035580967, 'learning_rate': 1.600134222665567e-06, 'epoch': 0.75} 75%|███████▍ | 16481/22095 [28:20:25<5:31:35, 3.54s/it] 75%|███████▍ | 16482/22095 [28:20:28<5:15:45, 3.38s/it] {'loss': 0.3597, 'grad_norm': 0.6284177067454095, 'learning_rate': 1.59959685438017e-06, 'epoch': 0.75} 75%|███████▍ | 16482/22095 [28:20:28<5:15:45, 3.38s/it] 75%|███████▍ | 16483/22095 [28:20:32<5:23:58, 3.46s/it] {'loss': 0.3308, 'grad_norm': 0.7516775332368926, 'learning_rate': 1.599059559158993e-06, 'epoch': 0.75} 75%|███████▍ | 16483/22095 [28:20:32<5:23:58, 3.46s/it] 75%|███████▍ | 16484/22095 [28:20:35<5:22:09, 3.44s/it] {'loss': 0.2719, 'grad_norm': 0.5991664336958192, 'learning_rate': 1.5985223370135795e-06, 'epoch': 0.75} 75%|███████▍ | 16484/22095 [28:20:36<5:22:09, 3.44s/it] 75%|███████▍ | 
16485/22095 [28:20:39<5:26:15, 3.49s/it] {'loss': 0.2858, 'grad_norm': 0.5970637698076308, 'learning_rate': 1.5979851879554758e-06, 'epoch': 0.75} 75%|███████▍ | 16485/22095 [28:20:39<5:26:15, 3.49s/it] 75%|███████▍ | 16486/22095 [28:20:42<5:08:54, 3.30s/it] {'loss': 0.3179, 'grad_norm': 0.5938806151950049, 'learning_rate': 1.5974481119962203e-06, 'epoch': 0.75} 75%|███████▍ | 16486/22095 [28:20:42<5:08:54, 3.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48516 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54872 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51806 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16487/22095 [28:20:46<5:21:39, 3.44s/it] {'loss': 0.3068, 'grad_norm': 0.6206591591584846, 'learning_rate': 1.596911109147356e-06, 'epoch': 0.75} 75%|███████▍ | 16487/22095 [28:20:46<5:21:39, 3.44s/it] 75%|███████▍ | 16488/22095 [28:20:50<5:35:58, 3.60s/it] {'loss': 0.3092, 'grad_norm': 0.5958474881442908, 'learning_rate': 1.5963741794204207e-06, 'epoch': 0.75} 75%|███████▍ | 16488/22095 [28:20:50<5:35:58, 3.60s/it] 75%|███████▍ | 16489/22095 [28:20:53<5:15:46, 3.38s/it] {'loss': 0.2846, 'grad_norm': 0.6318170853705819, 'learning_rate': 1.595837322826949e-06, 'epoch': 0.75} 75%|███████▍ | 16489/22095 [28:20:53<5:15:46, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51521 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51785 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48900 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90220 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16490/22095 [28:20:56<5:28:45, 3.52s/it] {'loss': 0.2862, 'grad_norm': 0.5792372877921103, 'learning_rate': 1.5953005393784782e-06, 'epoch': 0.75} 75%|███████▍ | 16490/22095 [28:20:56<5:28:45, 3.52s/it] 75%|███████▍ | 16491/22095 [28:21:00<5:18:50, 3.41s/it] {'loss': 0.2985, 'grad_norm': 0.6654866867136477, 'learning_rate': 1.5947638290865436e-06, 'epoch': 0.75} 75%|███████▍ | 16491/22095 [28:21:00<5:18:50, 3.41s/it] 75%|███████▍ | 16492/22095 [28:21:04<5:36:06, 3.60s/it] {'loss': 0.2709, 'grad_norm': 0.6909000728983873, 'learning_rate': 1.5942271919626762e-06, 'epoch': 0.75} 75%|███████▍ | 16492/22095 [28:21:04<5:36:06, 3.60s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [606, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8460080 in VC:s3://internvl-moe-sft-data/. Exception: Image size [606, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 37847, 'image': 'vrdu_texteq/astro-ph.CO/aee59ff0-9b1b-4490-a1d2-bfd241eec599.png', 'image_wh': [[606, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $\\Delta z_i$ denotes the width of $i$-th redshift slice.'}]} 75%|███████▍ | 16493/22095 [28:21:07<5:34:16, 3.58s/it] {'loss': 0.287, 'grad_norm': 0.62095556360136, 'learning_rate': 1.5936906280184045e-06, 'epoch': 0.75} 75%|███████▍ | 16493/22095 [28:21:07<5:34:16, 3.58s/it] 75%|███████▍ | 16494/22095 [28:21:10<5:20:15, 3.43s/it] {'loss': 0.3418, 'grad_norm': 0.6299401063232833, 'learning_rate': 1.5931541372652592e-06, 'epoch': 0.75} 75%|███████▍ | 16494/22095 [28:21:10<5:20:15, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16495/22095 [28:21:20<8:08:50, 5.24s/it] {'loss': 0.4905, 'grad_norm': 0.3067155177006923, 'learning_rate': 1.5926177197147702e-06, 'epoch': 0.75} 75%|███████▍ | 16495/22095 [28:21:20<8:08:50, 5.24s/it] 75%|███████▍ | 16496/22095 [28:21:23<7:21:19, 4.73s/it] {'loss': 0.322, 'grad_norm': 0.6525451405318027, 'learning_rate': 1.5920813753784614e-06, 'epoch': 0.75} 75%|███████▍ | 16496/22095 [28:21:23<7:21:19, 4.73s/it] 75%|███████▍ | 16497/22095 [28:21:26<6:32:00, 4.20s/it] {'loss': 0.2977, 'grad_norm': 0.6263061006818194, 'learning_rate': 1.5915451042678558e-06, 'epoch': 0.75} 75%|███████▍ | 16497/22095 [28:21:26<6:32:00, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16498/22095 [28:21:36<9:00:39, 5.80s/it] {'loss': 0.4502, 'grad_norm': 0.2618271933779362, 'learning_rate': 1.591008906394479e-06, 'epoch': 0.75} 75%|███████▍ | 16498/22095 [28:21:36<9:00:39, 5.80s/it] 75%|███████▍ | 16499/22095 [28:21:40<8:07:53, 5.23s/it] {'loss': 0.2815, 'grad_norm': 0.5847023317077064, 'learning_rate': 1.5904727817698495e-06, 'epoch': 0.75} 75%|███████▍ | 
16499/22095 [28:21:40<8:07:53, 5.23s/it] 75%|███████▍ | 16500/22095 [28:21:43<7:16:02, 4.68s/it] {'loss': 0.3401, 'grad_norm': 0.7444545205585212, 'learning_rate': 1.5899367304054898e-06, 'epoch': 0.75} 75%|███████▍ | 16500/22095 [28:21:43<7:16:02, 4.68s/it] 75%|███████▍ | 16501/22095 [28:21:47<6:49:38, 4.39s/it] {'loss': 0.2957, 'grad_norm': 0.627059815960014, 'learning_rate': 1.5894007523129162e-06, 'epoch': 0.75} 75%|███████▍ | 16501/22095 [28:21:47<6:49:38, 4.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58606 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99019 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16502/22095 [28:21:51<6:38:30, 4.28s/it] {'loss': 0.3021, 'grad_norm': 0.650345947630085, 'learning_rate': 1.5888648475036445e-06, 'epoch': 0.75} 75%|███████▍ | 16502/22095 [28:21:51<6:38:30, 4.28s/it] 75%|███████▍ | 16503/22095 [28:21:54<6:13:46, 4.01s/it] {'loss': 0.3422, 'grad_norm': 2.417059302876411, 'learning_rate': 1.5883290159891907e-06, 'epoch': 0.75} 75%|███████▍ | 16503/22095 [28:21:54<6:13:46, 4.01s/it] 75%|███████▍ | 16504/22095 [28:21:57<5:41:36, 3.67s/it] {'loss': 0.3189, 'grad_norm': 0.6539258850224872, 'learning_rate': 1.5877932577810712e-06, 'epoch': 0.75} 75%|███████▍ | 16504/22095 [28:21:57<5:41:36, 3.67s/it] 75%|███████▍ | 16505/22095 [28:22:00<5:16:37, 3.40s/it] {'loss': 0.2911, 'grad_norm': 0.6039525924981889, 'learning_rate': 1.5872575728907914e-06, 'epoch': 0.75} 75%|███████▍ | 16505/22095 [28:22:00<5:16:37, 3.40s/it] 75%|███████▍ | 16506/22095 [28:22:03<5:20:59, 3.45s/it] {'loss': 0.3224, 'grad_norm': 0.5788256798294671, 'learning_rate': 1.586721961329865e-06, 'epoch': 0.75} 75%|███████▍ | 16506/22095 [28:22:03<5:20:59, 3.45s/it] 75%|███████▍ | 16507/22095 [28:22:07<5:19:54, 3.43s/it] {'loss': 
0.3178, 'grad_norm': 0.7105780907921888, 'learning_rate': 1.5861864231098006e-06, 'epoch': 0.75} 75%|███████▍ | 16507/22095 [28:22:07<5:19:54, 3.43s/it] 75%|███████▍ | 16508/22095 [28:22:11<5:35:51, 3.61s/it] {'loss': 0.3351, 'grad_norm': 0.7063708263989005, 'learning_rate': 1.5856509582421086e-06, 'epoch': 0.75} 75%|███████▍ | 16508/22095 [28:22:11<5:35:51, 3.61s/it] 75%|███████▍ | 16509/22095 [28:22:14<5:27:59, 3.52s/it] {'loss': 0.369, 'grad_norm': 0.6510998279016662, 'learning_rate': 1.585115566738288e-06, 'epoch': 0.75} 75%|███████▍ | 16509/22095 [28:22:14<5:27:59, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16510/22095 [28:22:24<8:15:15, 5.32s/it] {'loss': 0.4612, 'grad_norm': 0.277973618515596, 'learning_rate': 1.5845802486098461e-06, 'epoch': 0.75} 75%|███████▍ | 16510/22095 [28:22:24<8:15:15, 5.32s/it] 75%|███████▍ | 16511/22095 [28:22:27<7:23:09, 4.76s/it] {'loss': 0.3621, 'grad_norm': 0.642679572704552, 'learning_rate': 1.584045003868286e-06, 'epoch': 0.75} 75%|███████▍ | 16511/22095 [28:22:27<7:23:09, 4.76s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948332 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71485, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5.5cm\nB. 6cm\nC. 6.5cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 75%|███████▍ | 16512/22095 [28:22:30<6:32:00, 4.21s/it] {'loss': 0.2871, 'grad_norm': 0.6128790270686905, 'learning_rate': 1.5835098325251075e-06, 'epoch': 0.75} 75%|███████▍ | 16512/22095 [28:22:30<6:32:00, 4.21s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16513/22095 [28:22:34<6:34:25, 4.24s/it] {'loss': 0.312, 'grad_norm': 0.5868380908965802, 'learning_rate': 1.5829747345918083e-06, 'epoch': 0.75} 75%|███████▍ | 16513/22095 [28:22:34<6:34:25, 4.24s/it] 75%|███████▍ | 16514/22095 [28:22:37<5:58:28, 3.85s/it] {'loss': 0.2839, 'grad_norm': 0.5961565717006754, 'learning_rate': 1.5824397100798893e-06, 'epoch': 0.75} 75%|███████▍ | 16514/22095 [28:22:37<5:58:28, 3.85s/it] 75%|███████▍ | 16515/22095 [28:22:41<5:56:35, 3.83s/it] {'loss': 0.3001, 'grad_norm': 0.5754325439828039, 'learning_rate': 1.5819047590008429e-06, 'epoch': 0.75} 75%|███████▍ | 16515/22095 [28:22:41<5:56:35, 3.83s/it] 75%|███████▍ | 16516/22095 [28:22:45<6:04:55, 3.92s/it] {'loss': 0.3086, 'grad_norm': 0.7193337762494555, 'learning_rate': 1.5813698813661672e-06, 'epoch': 0.75} 75%|███████▍ | 16516/22095 [28:22:45<6:04:55, 3.92s/it] 75%|███████▍ | 16517/22095 [28:22:48<5:43:37, 3.70s/it] {'loss': 0.3474, 'grad_norm': 0.6679149437502885, 'learning_rate': 1.5808350771873527e-06, 'epoch': 0.75} 75%|███████▍ | 16517/22095 [28:22:48<5:43:37, 3.70s/it] 75%|███████▍ | 16518/22095 [28:22:52<5:38:15, 3.64s/it] {'loss': 0.3043, 'grad_norm': 0.5970275979081476, 'learning_rate': 1.58030034647589e-06, 'epoch': 0.75} 75%|███████▍ | 16518/22095 [28:22:52<5:38:15, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90733 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▍ | 16519/22095 [28:22:55<5:30:05, 3.55s/it] {'loss': 0.268, 'grad_norm': 0.5749082872452106, 'learning_rate': 1.57976568924327e-06, 'epoch': 0.75} 75%|███████▍ | 16519/22095 [28:22:55<5:30:05, 3.55s/it] 75%|███████▍ | 16520/22095 [28:22:58<5:20:44, 3.45s/it] {'loss': 0.2956, 'grad_norm': 0.6548085932399869, 'learning_rate': 1.5792311055009824e-06, 'epoch': 0.75} 75%|███████▍ | 16520/22095 [28:22:58<5:20:44, 3.45s/it] 75%|███████▍ | 16521/22095 [28:23:02<5:31:24, 3.57s/it] {'loss': 0.2868, 'grad_norm': 0.623026618245212, 'learning_rate': 1.578696595260512e-06, 'epoch': 0.75} 75%|███████▍ | 16521/22095 [28:23:02<5:31:24, 3.57s/it] 75%|███████▍ | 16522/22095 [28:23:05<5:22:45, 3.47s/it] {'loss': 0.2832, 'grad_norm': 0.5916390234323907, 'learning_rate': 1.578162158533343e-06, 'epoch': 0.75} 75%|███████▍ | 16522/22095 [28:23:05<5:22:45, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70581 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16523/22095 [28:23:09<5:21:15, 3.46s/it] {'loss': 0.2819, 'grad_norm': 0.6014859725372366, 'learning_rate': 1.57762779533096e-06, 'epoch': 0.75} 75%|███████▍ | 16523/22095 [28:23:09<5:21:15, 3.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16524/22095 [28:23:12<5:04:38, 3.28s/it] {'loss': 0.2898, 'grad_norm': 0.7435152017578733, 'learning_rate': 1.5770935056648456e-06, 'epoch': 0.75} 75%|███████▍ | 16524/22095 [28:23:12<5:04:38, 3.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58154 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▍ | 16525/22095 [28:23:16<5:21:55, 3.47s/it] {'loss': 0.3335, 'grad_norm': 0.6084117400456398, 'learning_rate': 1.5765592895464793e-06, 'epoch': 0.75} 75%|███████▍ | 16525/22095 [28:23:16<5:21:55, 3.47s/it] 75%|███████▍ | 16526/22095 [28:23:19<5:28:01, 3.53s/it] {'loss': 0.3038, 'grad_norm': 0.6107025095627568, 'learning_rate': 1.5760251469873378e-06, 'epoch': 0.75} 75%|███████▍ | 16526/22095 [28:23:19<5:28:01, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16527/22095 [28:23:29<8:20:09, 5.39s/it] {'loss': 0.4819, 'grad_norm': 0.25637382721289265, 'learning_rate': 1.5754910779989018e-06, 'epoch': 0.75} 75%|███████▍ | 16527/22095 [28:23:29<8:20:09, 5.39s/it] 75%|███████▍ | 16528/22095 [28:23:32<7:24:51, 4.79s/it] {'loss': 0.3256, 'grad_norm': 0.6661645367906474, 'learning_rate': 1.5749570825926437e-06, 'epoch': 0.75} 75%|███████▍ | 16528/22095 [28:23:32<7:24:51, 4.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▍ | 16529/22095 [28:23:37<7:11:21, 4.65s/it] {'loss': 0.3045, 'grad_norm': 0.5931956952900801, 'learning_rate': 1.5744231607800397e-06, 'epoch': 0.75} 75%|███████▍ | 16529/22095 [28:23:37<7:11:21, 4.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52880 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68369 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51818 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▍ | 16530/22095 [28:23:40<6:41:29, 4.33s/it] {'loss': 0.2963, 'grad_norm': 0.6488485046622263, 'learning_rate': 1.5738893125725613e-06, 'epoch': 0.75} 75%|███████▍ | 16530/22095 [28:23:40<6:41:29, 4.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16531/22095 [28:23:48<8:22:27, 5.42s/it] {'loss': 0.4647, 'grad_norm': 0.27494362095273467, 'learning_rate': 1.5733555379816773e-06, 'epoch': 0.75} 75%|███████▍ | 16531/22095 [28:23:48<8:22:27, 5.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54167 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56714 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▍ | 16532/22095 [28:23:52<7:29:49, 4.85s/it] {'loss': 0.2794, 'grad_norm': 0.6009411757917738, 'learning_rate': 1.572821837018859e-06, 'epoch': 0.75} 75%|███████▍ | 16532/22095 [28:23:52<7:29:49, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▍ | 16533/22095 [28:24:02<9:48:34, 6.35s/it] {'loss': 0.4698, 'grad_norm': 0.311812204997518, 'learning_rate': 1.5722882096955748e-06, 'epoch': 0.75} 75%|███████▍ | 16533/22095 [28:24:02<9:48:34, 6.35s/it] 75%|███████▍ | 16534/22095 [28:24:05<8:23:43, 5.43s/it] {'loss': 0.2724, 'grad_norm': 0.5832169580997869, 'learning_rate': 1.5717546560232904e-06, 'epoch': 0.75} 75%|███████▍ | 16534/22095 [28:24:05<8:23:43, 5.43s/it] 75%|███████▍ | 16535/22095 [28:24:08<7:12:08, 4.66s/it] {'loss': 0.2928, 'grad_norm': 0.5453719989550989, 'learning_rate': 1.5712211760134672e-06, 'epoch': 0.75} 75%|███████▍ | 16535/22095 [28:24:08<7:12:08, 4.66s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation
75%|███████▍ | 16536/22095 [28:24:12<6:56:58, 4.50s/it] {'loss': 0.2585, 'grad_norm': 0.6266500321574185, 'learning_rate': 1.5706877696775703e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▍ | 16537/22095 [28:24:15<6:10:51, 4.00s/it] {'loss': 0.2831, 'grad_norm': 0.6208100270151411, 'learning_rate': 1.5701544370270638e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▍ | 16538/22095 [28:24:21<7:24:03, 4.79s/it] {'loss': 0.4833, 'grad_norm': 0.2689173965747566, 'learning_rate': 1.5696211780734017e-06, 'epoch': 0.75}
75%|███████▍ | 16539/22095 [28:24:25<6:58:10, 4.52s/it] {'loss': 0.2706, 'grad_norm': 0.6334801681171852, 'learning_rate': 1.569087992828045e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▍ | 16540/22095 [28:24:32<7:54:44, 5.13s/it] {'loss': 0.4996, 'grad_norm': 0.28267318715463907, 'learning_rate': 1.5685548813024516e-06, 'epoch': 0.75}
75%|███████▍ | 16541/22095 [28:24:35<7:06:44, 4.61s/it] {'loss': 0.3117, 'grad_norm': 0.5544036599356525, 'learning_rate': 1.5680218435080747e-06, 'epoch': 0.75}
75%|███████▍ | 16542/22095 [28:24:38<6:19:37, 4.10s/it] {'loss': 0.2904, 'grad_norm': 0.5937306440955723, 'learning_rate': 1.5674888794563663e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▍ | 16543/22095 [28:24:41<5:43:50, 3.72s/it] {'loss': 0.2802, 'grad_norm': 0.5710840182569125, 'learning_rate': 1.566955989158781e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (59713 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62226 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51220 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16544/22095 [28:24:44<5:18:21, 3.44s/it] {'loss': 0.2877, 'grad_norm': 0.6156900161661162, 'learning_rate': 1.5664231726267664e-06, 'epoch': 0.75}
75%|███████▍ | 16545/22095 [28:24:47<5:23:41, 3.50s/it] {'loss': 0.3083, 'grad_norm': 1.9108896254674193, 'learning_rate': 1.5658904298717742e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8310638 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1Ci0yLXXXXXcFXVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text information from this photo, but do not output the coordinates.'}, {'from': 'gpt', 'value': 'All words in the image:\n软件购买\n旺旺ID\n:whzywj\n体重/weight\n48.2\n公斤\nWelevas韦乐雅斯\n强大的批量处理,能避免你想要的效果!!!'}]}
75%|███████▍ | 16546/22095 [28:24:50<5:05:40, 3.31s/it] {'loss': 0.2686, 'grad_norm': 0.6082886537469275, 'learning_rate': 1.5653577609052495e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (42091 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16547/22095 [28:24:53<5:02:00, 3.27s/it] {'loss': 0.2715, 'grad_norm': 0.5797560507570885, 'learning_rate': 1.5648251657386366e-06, 'epoch': 0.75}
75%|███████▍ | 16548/22095 [28:24:57<5:08:14, 3.33s/it] {'loss': 0.3317, 'grad_norm': 0.6528100426591587, 'learning_rate': 1.56429264438338e-06, 'epoch': 0.75}
75%|███████▍ | 16549/22095 [28:25:00<5:12:26, 3.38s/it] {'loss': 0.325, 'grad_norm': 0.6272924481051566, 'learning_rate': 1.5637601968509242e-06, 'epoch': 0.75}
75%|███████▍ | 16550/22095 [28:25:04<5:10:25, 3.36s/it] {'loss': 0.2886, 'grad_norm': 0.6446741475507854, 'learning_rate': 1.5632278231527081e-06, 'epoch': 0.75}
75%|███████▍ | 16551/22095 [28:25:08<5:28:49, 3.56s/it] {'loss': 0.2752, 'grad_norm': 0.7003736284823379, 'learning_rate': 1.5626955233001695e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▍ | 16552/22095 [28:25:17<8:14:18, 5.35s/it] {'loss': 0.4716, 'grad_norm': 0.2720333118535126, 'learning_rate': 1.5621632973047468e-06, 'epoch': 0.75}
75%|███████▍ | 16553/22095 [28:25:21<7:26:30, 4.83s/it] {'loss': 0.3199, 'grad_norm': 0.6303202358609897, 'learning_rate': 1.5616311451778782e-06, 'epoch': 0.75}
75%|███████▍ | 16554/22095 [28:25:25<7:17:40, 4.74s/it] {'loss': 0.3115, 'grad_norm': 0.6152455382833465, 'learning_rate': 1.5610990669309961e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▍ | 16555/22095 [28:25:33<8:42:14, 5.66s/it] {'loss': 0.4601, 'grad_norm': 0.27440801803982967, 'learning_rate': 1.560567062575532e-06, 'epoch': 0.75}
75%|███████▍ | 16556/22095 [28:25:36<7:33:02, 4.91s/it] {'loss': 0.2785, 'grad_norm': 0.6169645054117495, 'learning_rate': 1.5600351321229196e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [384, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8442669 in VC:s3://internvl-moe-sft-data/. Exception: Image size [384, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38458, 'image': 'vrdu_texteq/astro-ph.CO/f268447c-a4df-4935-aa10-35a8c44d2b93.png', 'image_wh': [[384, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $\\delta_{\\rm D}$ is the Dirac delta and'}]}
75%|███████▍ | 16557/22095 [28:25:40<6:42:10, 4.36s/it] {'loss': 0.2773, 'grad_norm': 0.6401479659716349, 'learning_rate': 1.5595032755845857e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (60099 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (159023 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16558/22095 [28:25:43<6:11:06, 4.02s/it] {'loss': 0.3114, 'grad_norm': 0.6396815110074804, 'learning_rate': 1.5589714929719614e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (49845 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16559/22095 [28:25:47<6:06:20, 3.97s/it] {'loss': 0.3343, 'grad_norm': 0.5781596757272333, 'learning_rate': 1.558439784296471e-06, 'epoch': 0.75}
75%|███████▍ | 16560/22095 [28:25:50<5:57:06, 3.87s/it] {'loss': 0.2718, 'grad_norm': 0.5997425540319526, 'learning_rate': 1.5579081495695381e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (48125 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88036 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16561/22095 [28:25:53<5:36:40, 3.65s/it] {'loss': 0.3065, 'grad_norm': 0.6039137344409807, 'learning_rate': 1.5573765888025877e-06, 'epoch': 0.75}
75%|███████▍ | 16562/22095 [28:25:56<5:16:13, 3.43s/it] {'loss': 0.2898, 'grad_norm': 0.5809519922124259, 'learning_rate': 1.556845102007043e-06, 'epoch': 0.75}
75%|███████▍ | 16563/22095 [28:26:00<5:27:40, 3.55s/it] {'loss': 0.3315, 'grad_norm': 0.6782488781990339, 'learning_rate': 1.556313689194322e-06, 'epoch': 0.75}
75%|███████▍ | 16564/22095 [28:26:03<5:20:36, 3.48s/it] {'loss': 0.3036, 'grad_norm': 0.6110668027970159, 'learning_rate': 1.5557823503758418e-06, 'epoch': 0.75}
75%|███████▍ | 16565/22095 [28:26:07<5:10:42, 3.37s/it] {'loss': 0.2489, 'grad_norm': 0.6452243181347994, 'learning_rate': 1.555251085563021e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▍ | 16566/22095 [28:26:17<8:16:03, 5.38s/it] {'loss': 0.473, 'grad_norm': 0.3017016378995432, 'learning_rate': 1.5547198947672777e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (90593 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43786 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16567/22095 [28:26:25<9:45:36, 6.36s/it] {'loss': 0.4459, 'grad_norm': 0.26271291805718217, 'learning_rate': 1.5541887780000187e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▍ | 16568/22095 [28:26:29<8:24:20, 5.47s/it] {'loss': 0.3114, 'grad_norm': 0.5799561207828952, 'learning_rate': 1.5536577352726607e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (67318 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▍ | 16569/22095 [28:26:37<9:45:07, 6.35s/it] {'loss': 0.4627, 'grad_norm': 0.2854953151376636, 'learning_rate': 1.5531267665966143e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
75%|███████▍ | 16570/22095 [28:26:41<8:33:31, 5.58s/it] {'loss': 0.2798, 'grad_norm': 0.5779329050400861, 'learning_rate': 1.5525958719832879e-06, 'epoch': 0.75}
75%|███████▍ | 16571/22095 [28:26:45<7:41:36, 5.01s/it] {'loss': 0.2901, 'grad_norm': 0.6120806007444967, 'learning_rate': 1.5520650514440866e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8303717 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1XM4zLXXXXXa_XVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nRecognize text without outputting bounding boxes.'}, {'from': 'gpt', 'value': 'All words in the image:\nR\nNiSi\n影像·随心镜\n72mmw\nҰ\n100\nND2000\nMC+WPC\n多膜减光11档\nNISI\n超薄减光镜\n72mm\n12.12\n100\n大促\n品牌盛典\n领\n提前购\n元'}]}
75%|███████▌ | 16572/22095 [28:26:48<7:08:16, 4.65s/it] {'loss': 0.3147, 'grad_norm': 0.6358208723438779, 'learning_rate': 1.5515343049904191e-06, 'epoch': 0.75}
75%|███████▌ | 16573/22095 [28:26:52<6:43:01, 4.38s/it] {'loss': 0.3193, 'grad_norm': 0.6218208817316578, 'learning_rate': 1.5510036326336868e-06, 'epoch': 0.75}
75%|███████▌ | 16574/22095 [28:26:56<6:27:10, 4.21s/it] {'loss': 0.3067, 'grad_norm': 0.6407424514409737, 'learning_rate': 1.5504730343852952e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45062 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60774 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50208 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16575/22095 [28:27:05<8:54:48, 5.81s/it] {'loss': 0.4935, 'grad_norm': 0.27408098422727273, 'learning_rate': 1.5499425102566423e-06, 'epoch': 0.75}
75%|███████▌ | 16576/22095 [28:27:09<7:42:52, 5.03s/it] {'loss': 0.2392, 'grad_norm': 0.5948893953322856, 'learning_rate': 1.5494120602591305e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▌ | 16577/22095 [28:27:18<9:36:56, 6.27s/it] {'loss': 0.4424, 'grad_norm': 0.2785590734503924, 'learning_rate': 1.5488816844041537e-06, 'epoch': 0.75}
75%|███████▌ | 16578/22095 [28:27:27<10:51:50, 7.09s/it] {'loss': 0.4678, 'grad_norm': 0.2694455060170366, 'learning_rate': 1.5483513827031122e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
75%|███████▌ | 16579/22095 [28:27:30<9:04:45, 5.93s/it] {'loss': 0.2833, 'grad_norm': 0.611838194162662, 'learning_rate': 1.547821155167399e-06, 'epoch': 0.75}
75%|███████▌ | 16580/22095 [28:27:33<7:50:01, 5.11s/it] {'loss': 0.3155, 'grad_norm': 0.6544908570921042, 'learning_rate': 1.5472910018084043e-06, 'epoch': 0.75}
75%|███████▌ | 16581/22095 [28:27:37<7:25:12, 4.84s/it] {'loss': 0.312, 'grad_norm': 0.6395240466759102, 'learning_rate': 1.546760922637522e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (43080 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94223 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47832 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76063 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16582/22095 [28:27:41<6:34:49, 4.30s/it] {'loss': 0.3063, 'grad_norm': 0.600114842632944, 'learning_rate': 1.5462309176661433e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71701 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16583/22095 [28:27:50<9:04:09, 5.92s/it] {'loss': 0.4876, 'grad_norm': 0.25831747528691457, 'learning_rate': 1.5457009869056545e-06, 'epoch': 0.75}
75%|███████▌ | 16584/22095 [28:27:54<8:01:44, 5.24s/it] {'loss': 0.2753, 'grad_norm': 0.6005233821159032, 'learning_rate': 1.5451711303674411e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (60707 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80072 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16585/22095 [28:27:57<7:00:09, 4.58s/it] {'loss': 0.3225, 'grad_norm': 0.6294182622576255, 'learning_rate': 1.5446413480628908e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (85866 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▌ | 16586/22095 [28:28:00<6:28:07, 4.23s/it] {'loss': 0.3342, 'grad_norm': 0.6474888224795824, 'learning_rate': 1.5441116400033846e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047785 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 12\nB. 6\nC. 8\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
75%|███████▌ | 16587/22095 [28:28:04<6:08:42, 4.02s/it] {'loss': 0.2893, 'grad_norm': 0.738721455533006, 'learning_rate': 1.543582006200306e-06, 'epoch': 0.75}
75%|███████▌ | 16588/22095 [28:28:07<5:49:34, 3.81s/it] {'loss': 0.2991, 'grad_norm': 0.5786158399123434, 'learning_rate': 1.5430524466650354e-06, 'epoch': 0.75}
75%|███████▌ | 16589/22095 [28:28:11<5:48:00, 3.79s/it] {'loss': 0.3466, 'grad_norm': 0.596568758570493, 'learning_rate': 1.5425229614089482e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63960 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44543 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16590/22095 [28:28:17<6:52:09, 4.49s/it] {'loss': 0.4615, 'grad_norm': 0.2613221091049353, 'learning_rate': 1.5419935504434242e-06, 'epoch': 0.75}
75%|███████▌ | 16591/22095 [28:28:24<8:02:08, 5.26s/it] {'loss': 0.4633, 'grad_norm': 0.377820860939105, 'learning_rate': 1.5414642137798396e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
75%|███████▌ | 16592/22095 [28:28:27<7:10:28, 4.69s/it] {'loss': 0.2742, 'grad_norm': 0.5871596830036833, 'learning_rate': 1.5409349514295674e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (78173 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16593/22095 [28:28:37<9:20:08, 6.11s/it] {'loss': 0.4966, 'grad_norm': 0.27193806045907737, 'learning_rate': 1.540405763403977e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
75%|███████▌ | 16594/22095 [28:28:41<8:13:59, 5.39s/it] {'loss': 0.2994, 'grad_norm': 0.5923123067358512, 'learning_rate': 1.5398766497144424e-06, 'epoch': 0.75}
75%|███████▌ | 16595/22095 [28:28:44<7:19:23, 4.79s/it] {'loss': 0.3565, 'grad_norm': 0.6217953147336326, 'learning_rate': 1.5393476103723342e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8301805 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1GHxxLXXXXXcSXVXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nIdentify and read out the text content in the photo, there is no need to output location information.'}, {'from': 'gpt', 'value': 'All words in the image:\n日本\n美之容颜第一坊\n代购\nGOO.N\nPremium\n天使の產着\nL\nM\nXL\n6~11kg\n9~14kg\n38\n46\n12~20kg\n30\n枚'}]}
75%|███████▌ | 16596/22095 [28:28:48<6:46:52, 4.44s/it] {'loss': 0.27, 'grad_norm': 0.6456203570155561, 'learning_rate': 1.5388186453890142e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▌ | 16597/22095 [28:28:51<6:26:52, 4.22s/it] {'loss': 0.283, 'grad_norm': 0.5788306938991927, 'learning_rate': 1.5382897547758513e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▌ | 16598/22095 [28:29:00<8:41:43, 5.69s/it] {'loss': 0.4624, 'grad_norm': 0.2657948767174147, 'learning_rate': 1.5377609385442116e-06, 'epoch': 0.75}
75%|███████▌ | 16599/22095 [28:29:04<7:36:54, 4.99s/it] {'loss': 0.3256, 'grad_norm': 0.6724230533015774, 'learning_rate': 1.5372321967054554e-06, 'epoch': 0.75}
75%|███████▌ | 16600/22095 [28:29:07<6:46:01, 4.43s/it] {'loss': 0.3744, 'grad_norm': 0.6915387200679524, 'learning_rate': 1.5367035292709432e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▌ | 16601/22095 [28:29:13<7:22:14, 4.83s/it] {'loss': 0.4691, 'grad_norm': 0.2619534121431575, 'learning_rate': 1.5361749362520363e-06, 'epoch': 0.75}
75%|███████▌ | 16602/22095 [28:29:16<6:45:41, 4.43s/it] {'loss': 0.2918, 'grad_norm': 0.6145861080376424, 'learning_rate': 1.5356464176600905e-06, 'epoch': 0.75}
75%|███████▌ | 16603/22095 [28:29:19<6:11:16, 4.06s/it] {'loss': 0.2846, 'grad_norm': 0.6530194354447144, 'learning_rate': 1.5351179735064647e-06, 'epoch': 0.75}
75%|███████▌ | 16604/22095 [28:29:24<6:21:19, 4.17s/it] {'loss': 0.2863, 'grad_norm': 0.6849799194550324, 'learning_rate': 1.534589603802511e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▌ | 16605/22095 [28:29:27<6:01:00, 3.95s/it] {'loss': 0.3173, 'grad_norm': 0.6193998756496838, 'learning_rate': 1.5340613085595846e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
75%|███████▌ | 16606/22095 [28:29:31<5:46:48, 3.79s/it] {'loss': 0.3018, 'grad_norm': 0.616145686650051, 'learning_rate': 1.5335330877890341e-06, 'epoch': 0.75}
75%|███████▌ | 16607/22095 [28:29:34<5:44:24, 3.77s/it] {'loss': 0.3318, 'grad_norm': 0.6140848804103951, 'learning_rate': 1.533004941502213e-06, 'epoch': 0.75}
75%|███████▌ | 16608/22095 [28:29:37<5:21:15, 3.51s/it] {'loss': 0.3211, 'grad_norm': 0.6719963364193847, 'learning_rate': 1.5324768697104681e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (58782 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16609/22095 [28:29:40<5:11:35, 3.41s/it] {'loss': 0.3511, 'grad_norm': 0.6257389026054484, 'learning_rate': 1.5319488724251436e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8358148 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24859, 'image': 'vrdu_table_final_2/astro-ph.CO/5ff28922-eb0b-45e9-bdc2-987be45f1b57.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]}
75%|███████▌ | 16610/22095 [28:29:44<5:03:08, 3.32s/it] {'loss': 0.3059, 'grad_norm': 0.6387686337466676, 'learning_rate': 1.5314209496575861e-06, 'epoch': 0.75}
75%|███████▌ | 16611/22095 [28:29:47<4:55:07, 3.23s/it] {'loss': 0.2943, 'grad_norm': 0.7750573942287288, 'learning_rate': 1.5308931014191414e-06, 'epoch': 0.75}
75%|███████▌ | 16612/22095 [28:29:50<4:52:48, 3.20s/it] {'loss': 0.2929, 'grad_norm': 0.6321905375975513, 'learning_rate': 1.5303653277211493e-06, 'epoch': 0.75}
Token indices sequence length is longer than the specified maximum sequence length for this model (45199 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65502 > 40960). Running this sequence through the model will result in indexing errors
75%|███████▌ | 16613/22095 [28:29:53<5:01:01, 3.29s/it] {'loss': 0.3131, 'grad_norm': 0.6112993269416406, 'learning_rate': 1.5298376285749489e-06, 'epoch': 0.75}
75%|███████▌ | 16614/22095 [28:29:57<5:19:12, 3.49s/it] {'loss': 0.3148, 'grad_norm': 0.6284143210745861, 'learning_rate': 1.5293100039918812e-06, 'epoch': 0.75}
75%|███████▌ | 16615/22095 [28:30:01<5:16:22, 3.46s/it] {'loss': 0.3209, 'grad_norm': 0.5935461720492932, 'learning_rate': 1.5287824539832808e-06, 'epoch': 0.75}
75%|███████▌ | 16616/22095 [28:30:04<5:08:06, 3.37s/it] {'loss': 0.3211, 'grad_norm': 0.6039730603185649, 'learning_rate': 1.5282549785604861e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▌ | 16617/22095 [28:30:13<7:54:33, 5.20s/it] {'loss': 0.4479, 'grad_norm': 0.30584223112382614, 'learning_rate': 1.5277275777348294e-06, 'epoch': 0.75}
75%|███████▌ | 16618/22095 [28:30:17<7:08:34, 4.69s/it] {'loss': 0.2825, 'grad_norm': 0.5795522532688792, 'learning_rate': 1.5272002515176404e-06, 'epoch': 0.75}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
VC:s3://gui-agent/data_20250616/windows_paste/images/word/20250430_193816_4/images/before_screenshot_44_id_192_function_1_crop_1_grounding_instructions_random_paste.png 2025-08-28 20:28:17.220838 load time: 1034.97 ms
75%|███████▌ | 16619/22095 [28:30:20<6:26:13, 4.23s/it] {'loss': 0.2639, 'grad_norm': 0.5665869525523641, 'learning_rate': 1.526672999920253e-06, 'epoch': 0.75}
75%|███████▌ | 16620/22095 [28:30:24<6:10:18, 4.06s/it] {'loss': 0.3083, 'grad_norm': 0.5847304350030971, 'learning_rate': 1.5261458229539966e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 1, but got module 364
75%|███████▌ | 16621/22095 [28:30:31<7:57:49, 5.24s/it] {'loss': 0.4373, 'grad_norm': 0.27271189698999354, 'learning_rate': 1.525618720630197e-06, 'epoch': 0.75}
75%|███████▌ | 16622/22095 [28:30:37<8:12:26, 5.40s/it] {'loss': 0.4454, 'grad_norm': 0.2711721727141647, 'learning_rate': 1.525091692960179e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
75%|███████▌ | 16623/22095 [28:30:41<7:30:43, 4.94s/it] {'loss': 0.3096, 'grad_norm': 0.6319203562525041, 'learning_rate': 1.5245647399552682e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8916664 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39817, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nA. 1\nB. 1.5\nC. 2\nD. 0.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 75%|███████▌ | 16624/22095 [28:30:44<6:42:02, 4.41s/it] {'loss': 0.3046, 'grad_norm': 0.5365429777613504, 'learning_rate': 1.5240378616267887e-06, 'epoch': 0.75}
 75%|███████▌ | 16625/22095 [28:30:47<6:08:22, 4.04s/it] {'loss': 0.3068, 'grad_norm': 0.5917964105538425, 'learning_rate': 1.5235110579860602e-06, 'epoch': 0.75}
 75%|███████▌ | 16626/22095 [28:30:50<5:37:34, 3.70s/it] {'loss': 0.2773, 'grad_norm': 0.6203010656794513, 'learning_rate': 1.5229843290443996e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950722 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1557, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 6\nB. 2\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 75%|███████▌ | 16627/22095 [28:30:54<5:27:11, 3.59s/it] {'loss': 0.3062, 'grad_norm': 0.6404234014576459, 'learning_rate': 1.5224576748131292e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 75%|███████▌ | 16628/22095 [28:30:57<5:04:54, 3.35s/it] {'loss': 0.3381, 'grad_norm': 0.5958364086183551, 'learning_rate': 1.521931095303561e-06, 'epoch': 0.75}
 75%|███████▌ | 16629/22095 [28:31:02<5:52:46, 3.87s/it] {'loss': 0.3265, 'grad_norm': 0.6663594972658775, 'learning_rate': 1.521404590527013e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8934539 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57692, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nA. 7.5\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 75%|███████▌ | 16630/22095 [28:31:05<5:31:30, 3.64s/it] {'loss': 0.2932, 'grad_norm': 0.6487574124977519, 'learning_rate': 1.520878160494797e-06, 'epoch': 0.75}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 75%|███████▌ | 16631/22095 [28:31:08<5:29:12, 3.62s/it] {'loss': 0.3152, 'grad_norm': 0.5654488136672179, 'learning_rate': 1.520351805218222e-06, 'epoch': 0.75}
 75%|███████▌ | 16632/22095 [28:31:11<5:16:27, 3.48s/it] {'loss': 0.2959, 'grad_norm': 0.6411086584970878, 'learning_rate': 1.5198255247086018e-06, 'epoch': 0.75}
 75%|███████▌ | 16633/22095 [28:31:14<4:59:19, 3.29s/it] {'loss': 0.3323, 'grad_norm': 0.6128062221240184, 'learning_rate': 1.5192993189772408e-06, 'epoch': 0.75}
 75%|███████▌ | 16634/22095 [28:31:17<4:55:40, 3.25s/it] {'loss': 0.3243, 'grad_norm': 0.617088053959374, 'learning_rate': 1.5187731880354489e-06, 'epoch': 0.75}
 75%|███████▌ | 16635/22095 [28:31:20<4:46:28, 3.15s/it] {'loss': 0.3403, 'grad_norm': 0.7494997082124705, 'learning_rate': 1.5182471318945275e-06, 'epoch': 0.75}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 21, 100, 100] is too small.
Minimum size is 28. [Try #0] Failed to fetch sample 8910449 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33602, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 10cm\nB. 16cm\nC. 4cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 75%|███████▌ | 16636/22095 [28:31:24<4:59:51, 3.30s/it] {'loss': 0.2955, 'grad_norm': 0.6297086295505582, 'learning_rate': 1.517721150565784e-06, 'epoch': 0.75} 75%|███████▌ | 16636/22095 [28:31:24<4:59:51, 3.30s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (100014728 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 75%|███████▌ | 16637/22095 [28:31:27<4:44:33, 3.13s/it] {'loss': 0.3205, 'grad_norm': 0.6379063471545178, 'learning_rate': 1.5171952440605175e-06, 'epoch': 0.75} 75%|███████▌ | 16637/22095 [28:31:27<4:44:33, 3.13s/it] 75%|███████▌ | 16638/22095 [28:31:30<5:00:41, 3.31s/it] {'loss': 0.2886, 'grad_norm': 0.5753119543744168, 'learning_rate': 1.5166694123900271e-06, 'epoch': 0.75} 75%|███████▌ | 16638/22095 [28:31:30<5:00:41, 3.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▌ | 16639/22095 [28:31:33<4:52:13, 3.21s/it] {'loss': 0.3017, 'grad_norm': 0.657485752082828, 'learning_rate': 1.5161436555656129e-06, 'epoch': 0.75} 75%|███████▌ | 16639/22095 [28:31:33<4:52:13, 3.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81637 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85664 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51522 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54332 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▌ | 16640/22095 [28:31:37<4:57:45, 3.28s/it] {'loss': 0.3474, 'grad_norm': 0.6017532634783226, 'learning_rate': 1.5156179735985732e-06, 'epoch': 0.75} 75%|███████▌ | 16640/22095 [28:31:37<4:57:45, 3.28s/it] 75%|███████▌ | 16641/22095 [28:31:40<4:53:10, 3.23s/it] {'loss': 0.2847, 'grad_norm': 0.6199333396182094, 'learning_rate': 1.5150923665002021e-06, 'epoch': 0.75} 75%|███████▌ | 16641/22095 [28:31:40<4:53:10, 3.23s/it] 75%|███████▌ | 16642/22095 [28:31:43<4:47:45, 3.17s/it] {'loss': 0.2791, 'grad_norm': 0.6428709789975856, 'learning_rate': 1.514566834281791e-06, 'epoch': 0.75} 75%|███████▌ | 16642/22095 [28:31:43<4:47:45, 3.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▌ | 16643/22095 [28:31:47<5:08:37, 3.40s/it] {'loss': 0.2884, 'grad_norm': 0.6080445266538872, 'learning_rate': 1.5140413769546353e-06, 'epoch': 0.75} 75%|███████▌ | 16643/22095 [28:31:47<5:08:37, 3.40s/it] 75%|███████▌ | 16644/22095 [28:31:51<5:19:45, 3.52s/it] {'loss': 0.2953, 'grad_norm': 0.6450629264084641, 'learning_rate': 1.5135159945300232e-06, 'epoch': 0.75} 75%|███████▌ | 16644/22095 [28:31:51<5:19:45, 3.52s/it] 75%|███████▌ | 16645/22095 [28:31:55<5:28:05, 3.61s/it] {'loss': 0.2754, 'grad_norm': 0.7004210719059599, 'learning_rate': 1.5129906870192456e-06, 'epoch': 0.75} 
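The repeated "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" warnings above mean some samples exceed the context window before any truncation happens. A minimal pre-filter sketch (hypothetical helper, not part of data_qwen_2.py) that partitions samples by tokenized length up front, assuming `tokenize` is any callable mapping text to token ids (with Hugging Face tokenizers this could be `tokenizer.encode`):

```python
# 40960 is the maximum sequence length reported in the log above.
MAX_SEQ_LEN = 40960

def split_by_length(samples, tokenize, max_len=MAX_SEQ_LEN):
    """Partition samples into (fits, too_long) by tokenized length.

    `tokenize` is any callable mapping text -> list of token ids;
    `samples` are dicts with a "text" field (hypothetical schema).
    """
    fits, too_long = [], []
    for sample in samples:
        n_tokens = len(tokenize(sample["text"]))
        (fits if n_tokens <= max_len else too_long).append(sample)
    return fits, too_long
```

Running a pass like this once over the dataset avoids paying the warning (and silent truncation) at every epoch.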
75%|███████▌ | 16645/22095 [28:31:55<5:28:05, 3.61s/it] 75%|███████▌ | 16646/22095 [28:31:58<5:11:52, 3.43s/it] {'loss': 0.2697, 'grad_norm': 0.6067616219779681, 'learning_rate': 1.512465454433587e-06, 'epoch': 0.75} 75%|███████▌ | 16646/22095 [28:31:58<5:11:52, 3.43s/it] 75%|███████▌ | 16647/22095 [28:32:02<5:29:53, 3.63s/it] {'loss': 0.2953, 'grad_norm': 0.593391764471863, 'learning_rate': 1.5119402967843361e-06, 'epoch': 0.75} 75%|███████▌ | 16647/22095 [28:32:02<5:29:53, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▌ | 16648/22095 [28:32:11<8:09:50, 5.40s/it] {'loss': 0.4766, 'grad_norm': 0.26635515223260847, 'learning_rate': 1.5114152140827744e-06, 'epoch': 0.75} 75%|███████▌ | 16648/22095 [28:32:11<8:09:50, 5.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90222 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▌ | 16649/22095 [28:32:14<7:11:14, 4.75s/it] {'loss': 0.2911, 'grad_norm': 0.5983568795117736, 'learning_rate': 1.5108902063401865e-06, 'epoch': 0.75} 75%|███████▌ | 16649/22095 [28:32:14<7:11:14, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (133783 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87668 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (118361 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▌ | 16650/22095 [28:32:17<6:16:54, 4.15s/it] {'loss': 0.3202, 'grad_norm': 0.6481268868427346, 'learning_rate': 1.5103652735678525e-06, 'epoch': 0.75} 75%|███████▌ | 16650/22095 [28:32:17<6:16:54, 4.15s/it] 75%|███████▌ | 16651/22095 [28:32:20<5:42:13, 3.77s/it] {'loss': 0.2971, 'grad_norm': 0.5963856117452525, 'learning_rate': 1.509840415777049e-06, 'epoch': 0.75} 75%|███████▌ | 16651/22095 [28:32:20<5:42:13, 3.77s/it] 75%|███████▌ | 16652/22095 [28:32:23<5:26:43, 3.60s/it] {'loss': 0.2684, 'grad_norm': 0.6869703319652534, 'learning_rate': 1.5093156329790564e-06, 'epoch': 0.75} 75%|███████▌ | 16652/22095 [28:32:23<5:26:43, 3.60s/it] 75%|███████▌ | 16653/22095 [28:32:27<5:24:01, 3.57s/it] {'loss': 0.2859, 'grad_norm': 0.5996001305416143, 'learning_rate': 1.5087909251851513e-06, 'epoch': 0.75} 75%|███████▌ | 16653/22095 [28:32:27<5:24:01, 3.57s/it] 75%|███████▌ | 16654/22095 [28:32:31<5:29:16, 3.63s/it] {'loss': 0.353, 'grad_norm': 0.6645363319026046, 'learning_rate': 1.5082662924066067e-06, 'epoch': 0.75} 75%|███████▌ | 16654/22095 [28:32:31<5:29:16, 3.63s/it] 75%|███████▌ | 16655/22095 [28:32:34<5:23:24, 3.57s/it] {'loss': 0.3123, 'grad_norm': 0.6327652610190093, 'learning_rate': 1.5077417346546942e-06, 'epoch': 0.75} 75%|███████▌ | 16655/22095 [28:32:34<5:23:24, 3.57s/it] 75%|███████▌ | 16656/22095 [28:32:37<5:15:58, 3.49s/it] {'loss': 0.3011, 'grad_norm': 0.6546052835923984, 'learning_rate': 1.5072172519406874e-06, 'epoch': 0.75} 75%|███████▌ | 16656/22095 [28:32:37<5:15:58, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▌ | 16657/22095 [28:32:47<7:58:51, 5.28s/it] {'loss': 0.4763, 'grad_norm': 0.26338153529601854, 'learning_rate': 1.5066928442758528e-06, 'epoch': 0.75} 75%|███████▌ | 16657/22095 [28:32:47<7:58:51, 
5.28s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    buff = io.BytesIO(img_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
    _decompression_bomb_check(im.size)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check
    raise DecompressionBombError(msg)
PIL.Image.DecompressionBombError: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
[Try #0] Failed to fetch sample 7924298 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (254467000 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/39041.png', 'image_wh': [[19000, 13393]], 'conversations': [{'from': 'human', 'value': '\nWhich crop is grown in the state of Louisiana? Answer the question with detailed explanation.'}, {'from': 'gpt', 'value': 'SUGARCANE is grown in the state of Louisiana.\nThe text does not directly mention the crop grown in Louisiana. However, by observing the location of Louisiana in the United States and looking for agricultural products that are commonly associated with the state, it can be inferred that sugarcane is one of the major crops grown there. Additionally, Louisiana is known for its sugarcane industry and is one of the leading states in sugarcane production in the United States.'}]}
 75%|███████▌ | 16658/22095 [28:32:52<7:56:30, 5.26s/it] {'loss': 0.4702, 'grad_norm': 0.2735645155703924, 'learning_rate': 1.506168511671462e-06, 'epoch': 0.75}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 75%|███████▌ | 16659/22095 [28:32:55<7:08:26, 4.73s/it] {'loss': 0.3219, 'grad_norm': 0.6256308592453239, 'learning_rate': 1.5056442541387794e-06, 'epoch': 0.75}
 75%|███████▌ | 16660/22095 [28:32:59<6:48:53, 4.51s/it] {'loss': 0.3154, 'grad_norm': 0.6069178482632577, 'learning_rate': 1.5051200716890686e-06, 'epoch': 0.75}
 75%|███████▌ | 16661/22095 [28:33:03<6:25:34, 4.26s/it] {'loss': 0.326, 'grad_norm': 0.5788701325657072, 'learning_rate':
1.5045959643335928e-06, 'epoch': 0.75} 75%|███████▌ | 16661/22095 [28:33:03<6:25:34, 4.26s/it] 75%|███████▌ | 16662/22095 [28:33:07<6:16:02, 4.15s/it] {'loss': 0.3219, 'grad_norm': 0.6112234127041374, 'learning_rate': 1.5040719320836167e-06, 'epoch': 0.75} 75%|███████▌ | 16662/22095 [28:33:07<6:16:02, 4.15s/it] 75%|███████▌ | 16663/22095 [28:33:11<6:05:59, 4.04s/it] {'loss': 0.2864, 'grad_norm': 0.5851015832556286, 'learning_rate': 1.5035479749503973e-06, 'epoch': 0.75} 75%|███████▌ | 16663/22095 [28:33:11<6:05:59, 4.04s/it] 75%|███████▌ | 16664/22095 [28:33:15<6:09:38, 4.08s/it] {'loss': 0.3121, 'grad_norm': 0.8292918564880299, 'learning_rate': 1.5030240929451922e-06, 'epoch': 0.75} 75%|███████▌ | 16664/22095 [28:33:15<6:09:38, 4.08s/it] 75%|███████▌ | 16665/22095 [28:33:18<5:48:37, 3.85s/it] {'loss': 0.2842, 'grad_norm': 1.198695686104617, 'learning_rate': 1.5025002860792609e-06, 'epoch': 0.75} 75%|███████▌ | 16665/22095 [28:33:18<5:48:37, 3.85s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▌ | 16666/22095 [28:33:22<5:35:02, 3.70s/it] {'loss': 0.2948, 'grad_norm': 0.6179582469616123, 'learning_rate': 1.5019765543638564e-06, 'epoch': 0.75} 75%|███████▌ | 16666/22095 [28:33:22<5:35:02, 3.70s/it] 75%|███████▌ | 16667/22095 [28:33:25<5:33:25, 3.69s/it] {'loss': 0.2801, 'grad_norm': 0.6735381253035728, 'learning_rate': 1.5014528978102311e-06, 'epoch': 0.75} 75%|███████▌ | 16667/22095 [28:33:25<5:33:25, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52794 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61509 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46886 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62966 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64516 > 40960). Running this sequence through the model will result in indexing errors 75%|███████▌ | 16668/22095 [28:33:28<5:14:04, 3.47s/it] {'loss': 0.2803, 'grad_norm': 0.6356498941144694, 'learning_rate': 1.500929316429638e-06, 'epoch': 0.75} 75%|███████▌ | 16668/22095 [28:33:28<5:14:04, 3.47s/it] 75%|███████▌ | 16669/22095 [28:33:32<5:09:30, 3.42s/it] {'loss': 0.3282, 'grad_norm': 0.6436793787689117, 'learning_rate': 1.5004058102333285e-06, 'epoch': 0.75} 75%|███████▌ | 16669/22095 [28:33:32<5:09:30, 3.42s/it] 75%|███████▌ | 16670/22095 [28:33:34<4:52:54, 3.24s/it] {'loss': 0.3088, 'grad_norm': 0.763312169832447, 'learning_rate': 1.49988237923255e-06, 'epoch': 0.75} 75%|███████▌ | 16670/22095 [28:33:34<4:52:54, 3.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42147 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▌ | 16671/22095 [28:33:38<4:59:01, 3.31s/it] {'loss': 0.3191, 'grad_norm': 0.838672560377792, 'learning_rate': 1.499359023438548e-06, 'epoch': 0.75} 75%|███████▌ | 16671/22095 [28:33:38<4:59:01, 3.31s/it] 75%|███████▌ | 16672/22095 [28:33:42<5:32:20, 3.68s/it] {'loss': 0.2931, 'grad_norm': 0.5915128483790769, 'learning_rate': 1.4988357428625711e-06, 'epoch': 0.75} 75%|███████▌ | 16672/22095 [28:33:42<5:32:20, 3.68s/it] 75%|███████▌ | 16673/22095 [28:33:46<5:37:02, 3.73s/it] {'loss': 0.321, 'grad_norm': 0.7560524050943002, 'learning_rate': 1.4983125375158591e-06, 'epoch': 0.75} 75%|███████▌ | 16673/22095 [28:33:46<5:37:02, 3.73s/it] 75%|███████▌ | 16674/22095 [28:33:50<5:34:10, 3.70s/it] {'loss': 0.2892, 'grad_norm': 0.6354801252230485, 'learning_rate': 1.4977894074096576e-06, 'epoch': 0.75} 75%|███████▌ | 16674/22095 [28:33:50<5:34:10, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55289 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72669 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▌ | 16675/22095 [28:33:53<5:15:04, 3.49s/it] {'loss': 0.2678, 'grad_norm': 0.5789119692914134, 'learning_rate': 1.497266352555204e-06, 'epoch': 0.75} 75%|███████▌ | 16675/22095 [28:33:53<5:15:04, 3.49s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▌ | 16676/22095 [28:34:03<8:03:13, 5.35s/it] {'loss': 0.4841, 'grad_norm': 0.298435257825619, 'learning_rate': 1.4967433729637403e-06, 'epoch': 0.75} 75%|███████▌ | 16676/22095 [28:34:03<8:03:13, 5.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 75%|███████▌ | 16677/22095 [28:34:06<7:06:07, 4.72s/it] {'loss': 0.286, 'grad_norm': 0.5903516195540437, 'learning_rate': 1.4962204686465003e-06, 'epoch': 0.75} 75%|███████▌ | 16677/22095 [28:34:06<7:06:07, 4.72s/it] 75%|███████▌ | 16678/22095 [28:34:09<6:14:42, 4.15s/it] {'loss': 0.3419, 'grad_norm': 0.6343168652793674, 'learning_rate': 1.4956976396147233e-06, 'epoch': 0.75} 75%|███████▌ | 16678/22095 [28:34:09<6:14:42, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 75%|███████▌ | 16679/22095 [28:34:17<7:56:41, 5.28s/it] {'loss': 0.4855, 'grad_norm': 0.2934288256714517, 'learning_rate': 1.4951748858796411e-06, 'epoch': 0.75} 75%|███████▌ | 16679/22095 [28:34:17<7:56:41, 5.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46228 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91933 > 40960). 
Running this sequence through the model will result in indexing errors 75%|███████▌ | 16680/22095 [28:34:21<7:21:53, 4.90s/it] {'loss': 0.2848, 'grad_norm': 0.635717850760982, 'learning_rate': 1.494652207452485e-06, 'epoch': 0.75} 75%|███████▌ | 16680/22095 [28:34:21<7:21:53, 4.90s/it] 75%|███████▌ | 16681/22095 [28:34:24<6:31:49, 4.34s/it] {'loss': 0.3075, 'grad_norm': 0.5702053407110187, 'learning_rate': 1.4941296043444869e-06, 'epoch': 0.75} 75%|███████▌ | 16681/22095 [28:34:24<6:31:49, 4.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8946836 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 69989, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C和D是AB段上的两点,Cd=3c m,M是AC的中点,N是DB的中点,AB=9.8cm,则Mn段的长度等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 
7cm'}, {'from': 'gpt', 'value': '【解答】解:∵M是AC的中点,N是DB的中点,CD=3cm,AB=9.8cm,∴MC+DN=\\frac{1}{2}(AB-CD)=3.4cm,∴MN=MC+DN+CD=3.4+3=6.4cm.'}]} 76%|███████▌ | 16682/22095 [28:34:27<5:56:16, 3.95s/it] {'loss': 0.2881, 'grad_norm': 0.6190446810455607, 'learning_rate': 1.493607076566878e-06, 'epoch': 0.76} 76%|███████▌ | 16682/22095 [28:34:27<5:56:16, 3.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 76%|███████▌ | 16683/22095 [28:34:35<7:56:25, 5.28s/it] {'loss': 0.4557, 'grad_norm': 0.26498951155860173, 'learning_rate': 1.4930846241308838e-06, 'epoch': 0.76} 76%|███████▌ | 16683/22095 [28:34:35<7:56:25, 5.28s/it] 76%|███████▌ | 16684/22095 [28:34:39<7:14:40, 4.82s/it] {'loss': 0.2828, 'grad_norm': 0.592932363094059, 'learning_rate': 1.4925622470477291e-06, 'epoch': 0.76} 76%|███████▌ | 16684/22095 [28:34:39<7:14:40, 4.82s/it] 76%|███████▌ | 16685/22095 [28:34:42<6:36:42, 4.40s/it] {'loss': 0.2936, 'grad_norm': 0.6565257146532069, 'learning_rate': 1.4920399453286405e-06, 'epoch': 0.76} 76%|███████▌ | 16685/22095 [28:34:42<6:36:42, 4.40s/it] 76%|███████▌ | 16686/22095 [28:34:45<5:53:28, 3.92s/it] {'loss': 0.2476, 'grad_norm': 0.6645887871402962, 'learning_rate': 1.4915177189848384e-06, 'epoch': 0.76} 76%|███████▌ | 16686/22095 [28:34:45<5:53:28, 3.92s/it] 76%|███████▌ | 16687/22095 [28:34:48<5:35:32, 3.72s/it] {'loss': 0.2285, 'grad_norm': 0.6285093421542468, 'learning_rate': 1.4909955680275462e-06, 'epoch': 0.76} 76%|███████▌ | 16687/22095 [28:34:48<5:35:32, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 76%|███████▌ | 16688/22095 [28:34:56<7:31:56, 5.02s/it] {'loss': 0.4875, 'grad_norm': 0.2759830785118571, 'learning_rate': 1.4904734924679825e-06, 'epoch': 0.76} 76%|███████▌ | 16688/22095 
[28:34:56<7:31:56, 5.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89229 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▌ | 16689/22095 [28:34:59<6:40:55, 4.45s/it] {'loss': 0.3002, 'grad_norm': 0.6342964851974112, 'learning_rate': 1.489951492317363e-06, 'epoch': 0.76} 76%|███████▌ | 16689/22095 [28:34:59<6:40:55, 4.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047791 in VC:s3://multi-modal/UniGeo/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 8\nB. 4\nC. 6\nD. 
7.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 76%|███████▌ | 16690/22095 [28:35:04<6:33:45, 4.37s/it] {'loss': 0.3391, 'grad_norm': 0.6198259499774071, 'learning_rate': 1.4894295675869058e-06, 'epoch': 0.76} 76%|███████▌ | 16690/22095 [28:35:04<6:33:45, 4.37s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8898844 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 21997, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nA. 5\nB. 4\nC. 3\nD. 
6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 76%|███████▌ | 16691/22095 [28:35:06<5:52:43, 3.92s/it] {'loss': 0.2754, 'grad_norm': 0.57067587269311, 'learning_rate': 1.488907718287827e-06, 'epoch': 0.76} 76%|███████▌ | 16691/22095 [28:35:06<5:52:43, 3.92s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 76%|███████▌ | 16692/22095 [28:35:10<5:38:21, 3.76s/it] {'loss': 0.3354, 'grad_norm': 0.600458579427856, 'learning_rate': 1.4883859444313376e-06, 'epoch': 0.76} 76%|███████▌ | 16692/22095 [28:35:10<5:38:21, 3.76s/it] 76%|███████▌ | 16693/22095 [28:35:13<5:26:26, 3.63s/it] {'loss': 0.3409, 'grad_norm': 0.6040714705322814, 'learning_rate': 1.4878642460286474e-06, 'epoch': 0.76} 76%|███████▌ | 16693/22095 [28:35:13<5:26:26, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46276 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70120 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (50083 > 40960) for 4 sample(s). Truncating to 2450 with 1 samples. 
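The recurring "ValueError: Image size […] is too small. Minimum size is 28." failures above all come from samples whose stored `image_wh` has one side below the loader's minimum. A minimal sanity-check sketch (hypothetical helper mirroring that constraint, not the project's actual code) that lets a dataloader skip such samples instead of raising inside `__getitem__`:

```python
# The loader's error messages above report "Minimum size is 28";
# both sides of the image must meet it.
MIN_SIDE = 28

def image_size_ok(image_wh, min_side=MIN_SIDE):
    """True if both sides of (width, height) meet the minimum side length."""
    width, height = image_wh
    return width >= min_side and height >= min_side
```

Filtering on the recorded `image_wh` (present in every "Problematic sample" entry) is cheap because it avoids opening the image at all.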
 76%|███████▌ | 16694/22095 [28:35:16<5:08:48, 3.43s/it] {'loss': 0.3009, 'grad_norm': 0.6057206746979718, 'learning_rate': 1.4873426230909682e-06, 'epoch': 0.76}
 76%|███████▌ | 16695/22095 [28:35:20<5:11:01, 3.46s/it] {'loss': 0.338, 'grad_norm': 0.645707146707286, 'learning_rate': 1.4868210756295109e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16696/22095 [28:35:24<5:36:32, 3.74s/it] {'loss': 0.3085, 'grad_norm': 0.6667021532346543, 'learning_rate': 1.4862996036554756e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16697/22095 [28:35:34<8:11:32, 5.46s/it] {'loss': 0.4542, 'grad_norm': 0.2810025522918448, 'learning_rate': 1.4857782071800697e-06, 'epoch': 0.76}
 76%|███████▌ | 16698/22095 [28:35:37<7:22:54, 4.92s/it] {'loss': 0.3069, 'grad_norm': 0.6814649766957752, 'learning_rate': 1.4852568862144995e-06, 'epoch': 0.76}
 76%|███████▌ | 16699/22095 [28:35:40<6:26:42, 4.30s/it] {'loss': 0.3161, 'grad_norm': 0.5990733271348277, 'learning_rate': 1.4847356407699632e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (45356 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83101 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46881 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16700/22095 [28:35:44<6:13:26, 4.15s/it] {'loss': 0.3418, 'grad_norm': 0.6273544717931, 'learning_rate': 1.4842144708576606e-06, 'epoch': 0.76}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [27, 54, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8799822 in VC:s3://internvl2/datasets/chrome_writting/images/. Exception: Image size [27, 54, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3553, 'image': '0003553.jpg', 'image_wh': [[27, 54]], 'conversations': [{'from': 'human', 'value': '\nPlease write out the expression of the formula in the image using LaTeX format.\nPlease give the latex expression directly without any extra output.'}, {'from': 'gpt', 'value': '$1$'}]}
 76%|███████▌ | 16701/22095 [28:35:48<6:04:17, 4.05s/it] {'loss': 0.2986, 'grad_norm': 0.6326039496529865, 'learning_rate': 1.4836933764887928e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16702/22095 [28:35:57<8:27:47, 5.65s/it] {'loss': 0.4501, 'grad_norm': 0.26567037266128657, 'learning_rate': 1.4831723576745531e-06, 'epoch': 0.76}
 76%|███████▌ | 16703/22095 [28:36:00<7:19:59, 4.90s/it] {'loss': 0.2983, 'grad_norm': 0.6678320036059954, 'learning_rate': 1.48265141442614e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16704/22095 [28:36:04<6:47:25, 4.53s/it] {'loss': 0.2782, 'grad_norm': 0.6033148157639469, 'learning_rate': 1.4821305467547436e-06, 'epoch': 0.76}
 76%|███████▌ | 16705/22095 [28:36:07<6:21:41, 4.25s/it] {'loss': 0.3172, 'grad_norm': 0.6145787945428695, 'learning_rate': 1.481609754671559e-06, 'epoch': 0.76}
 76%|███████▌ | 16706/22095 [28:36:11<6:00:54, 4.02s/it] {'loss': 0.275, 'grad_norm': 0.6623739706543618, 'learning_rate': 1.4810890381877736e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16707/22095 [28:36:14<5:41:39, 3.80s/it] {'loss': 0.33, 'grad_norm': 0.594661372064247, 'learning_rate': 1.4805683973145784e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (64505 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101109 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73686 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92211 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72160 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16708/22095 [28:36:25<8:49:09, 5.89s/it] {'loss': 0.4667, 'grad_norm': 0.2872788418078114, 'learning_rate': 1.4800478320631595e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (58428 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49657 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95951 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16709/22095 [28:36:29<7:44:41, 5.18s/it] {'loss': 0.3092, 'grad_norm': 0.664942448101899, 'learning_rate': 1.4795273424446998e-06, 'epoch': 0.76}
 76%|███████▌ | 16710/22095 [28:36:32<6:50:42, 4.58s/it] {'loss': 0.2566, 'grad_norm': 0.5697394913447789, 'learning_rate': 1.4790069284703863e-06, 'epoch': 0.76}
 76%|███████▌ | 16711/22095 [28:36:35<6:12:18, 4.15s/it] {'loss': 0.3119, 'grad_norm': 0.6307175705577794, 'learning_rate': 1.4784865901514005e-06, 'epoch': 0.76}
 76%|███████▌ | 16712/22095 [28:36:38<5:52:35, 3.93s/it] {'loss': 0.3139, 'grad_norm': 0.6019116928534433, 'learning_rate': 1.4779663274989232e-06, 'epoch': 0.76}
 76%|███████▌ | 16713/22095 [28:36:42<5:51:38, 3.92s/it] {'loss': 0.3392, 'grad_norm': 0.6226910834635931, 'learning_rate': 1.4774461405241303e-06, 'epoch': 0.76}
 76%|███████▌ | 16714/22095 [28:36:45<5:32:45, 3.71s/it] {'loss': 0.2916, 'grad_norm': 0.6134479971109972, 'learning_rate': 1.4769260292382031e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (63231 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46879 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16715/22095 [28:36:50<5:43:55, 3.84s/it] {'loss': 0.2951, 'grad_norm': 0.6155452094594851, 'learning_rate': 1.4764059936523134e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16716/22095 [28:36:59<8:15:50, 5.53s/it] {'loss': 0.4449, 'grad_norm': 0.2636065545551082, 'learning_rate': 1.4758860337776387e-06, 'epoch': 0.76}
 76%|███████▌ | 16717/22095 [28:37:08<10:00:15, 6.70s/it] {'loss': 0.4538, 'grad_norm': 0.2850426046437727, 'learning_rate': 1.475366149625348e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 76%|███████▌ | 16718/22095 [28:37:13<8:54:20, 5.96s/it] {'loss': 0.3183, 'grad_norm': 0.6652238532497509, 'learning_rate': 1.474846341206615e-06, 'epoch': 0.76}
 76%|███████▌ | 16719/22095 [28:37:16<7:49:07, 5.24s/it] {'loss': 0.3575, 'grad_norm': 0.7590976861643215, 'learning_rate': 1.4743266085326062e-06, 'epoch': 0.76}
 76%|███████▌ | 16720/22095 [28:37:20<7:12:55, 4.83s/it] {'loss': 0.3177, 'grad_norm': 0.5542928044334143, 'learning_rate': 1.473806951614492e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (50720 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42993 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57866 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16721/22095 [28:37:24<6:40:27, 4.47s/it] {'loss': 0.291, 'grad_norm': 0.646674522922809, 'learning_rate': 1.4732873704634366e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (86660 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125563 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48734 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16722/22095 [28:37:27<5:56:29, 3.98s/it] {'loss': 0.3181, 'grad_norm': 0.6264009080590869, 'learning_rate': 1.472767865090602e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16723/22095 [28:37:35<8:06:15, 5.43s/it] {'loss': 0.4683, 'grad_norm': 0.279109290941747, 'learning_rate': 1.472248435507153e-06, 'epoch': 0.76}
 76%|███████▌ | 16724/22095 [28:37:39<7:23:59, 4.96s/it] {'loss': 0.3097, 'grad_norm': 0.6090954445924818, 'learning_rate': 1.4717290817242542e-06, 'epoch': 0.76}
 76%|███████▌ | 16725/22095 [28:37:43<6:40:50, 4.48s/it] {'loss': 0.3042, 'grad_norm': 0.73037171924201, 'learning_rate': 1.4712098037530575e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (91931 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16726/22095 [28:37:46<5:59:28, 4.02s/it] {'loss': 0.3073, 'grad_norm': 0.650877793870771, 'learning_rate': 1.4706906016047246e-06, 'epoch': 0.76}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957194 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8029, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 8\nB. 7\nC. 6\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 76%|███████▌ | 16727/22095 [28:37:48<5:28:40, 3.67s/it] {'loss': 0.3051, 'grad_norm': 0.7641152232774284, 'learning_rate': 1.4701714752904123e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16728/22095 [28:37:52<5:23:28, 3.62s/it] {'loss': 0.2979, 'grad_norm': 0.6338668362810962, 'learning_rate': 1.4696524248212746e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16729/22095 [28:37:59<7:05:18, 4.76s/it] {'loss': 0.4609, 'grad_norm': 0.26435423960215104, 'learning_rate': 1.4691334502084614e-06, 'epoch': 0.76}
 76%|███████▌ | 16730/22095 [28:38:08<9:03:26, 6.08s/it] {'loss': 0.4625, 'grad_norm': 0.27102785628931125, 'learning_rate': 1.4686145514631284e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 76%|███████▌ | 16731/22095 [28:38:13<8:32:12, 5.73s/it] {'loss': 0.2541, 'grad_norm': 0.6085883837552643, 'learning_rate': 1.4680957285964208e-06, 'epoch': 0.76}
 76%|███████▌ | 16732/22095 [28:38:17<7:25:21, 4.98s/it] {'loss': 0.2708, 'grad_norm': 0.5566789634371905, 'learning_rate': 1.4675769816194902e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16733/22095 [28:38:20<6:30:59, 4.38s/it] {'loss': 0.2918, 'grad_norm': 0.6355008962027197, 'learning_rate': 1.46705831054348e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16734/22095 [28:38:29<8:50:42, 5.94s/it] {'loss': 0.4723, 'grad_norm': 0.25688814557401, 'learning_rate': 1.4665397153795375e-06, 'epoch': 0.76}
 76%|███████▌ | 16735/22095 [28:38:34<8:07:37, 5.46s/it] {'loss': 0.3298, 'grad_norm': 0.5969059123585657, 'learning_rate': 1.4660211961388027e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16736/22095 [28:38:36<6:59:27, 4.70s/it] {'loss': 0.2872, 'grad_norm': 0.6764990779117397, 'learning_rate': 1.46550275283242e-06, 'epoch': 0.76}
 76%|███████▌ | 16737/22095 [28:38:40<6:20:43, 4.26s/it] {'loss': 0.307, 'grad_norm': 0.5972428709774328, 'learning_rate': 1.464984385471528e-06, 'epoch': 0.76}
 76%|███████▌ | 16738/22095 [28:38:44<6:11:01, 4.16s/it] {'loss': 0.3265, 'grad_norm': 0.6033021842878759, 'learning_rate': 1.4644660940672628e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (45469 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72180 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16739/22095 [28:38:47<5:57:20, 4.00s/it] {'loss': 0.3038, 'grad_norm': 0.6668498063128243, 'learning_rate': 1.4639478786307627e-06, 'epoch': 0.76}
 76%|███████▌ | 16740/22095 [28:38:50<5:32:27, 3.72s/it] {'loss': 0.3057, 'grad_norm': 1.843639898452091, 'learning_rate': 1.4634297391731645e-06, 'epoch': 0.76}
 76%|███████▌ | 16741/22095 [28:38:55<5:50:14, 3.92s/it] {'loss': 0.2588, 'grad_norm': 0.6062164374997798, 'learning_rate': 1.4629116757055989e-06, 'epoch': 0.76}
 76%|███████▌ | 16742/22095 [28:38:58<5:23:22, 3.62s/it] {'loss': 0.2775, 'grad_norm': 0.6141414599099798, 'learning_rate': 1.462393688239197e-06, 'epoch': 0.76}
 76%|███████▌ | 16743/22095 [28:39:02<5:32:28, 3.73s/it] {'loss': 0.3779, 'grad_norm': 0.6209748362392395, 'learning_rate': 1.461875776785091e-06, 'epoch': 0.76}
 76%|███████▌ | 16744/22095 [28:39:05<5:14:18, 3.52s/it] {'loss': 0.2686, 'grad_norm': 0.5652933419403928, 'learning_rate': 1.4613579413544065e-06, 'epoch': 0.76}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16745/22095 [28:39:08<5:22:19, 3.61s/it] {'loss': 0.3364, 'grad_norm': 0.6753961791194718, 'learning_rate': 1.4608401819582734e-06, 'epoch': 0.76}
 76%|███████▌ | 16746/22095 [28:39:12<5:12:44, 3.51s/it] {'loss': 0.3186, 'grad_norm': 0.594061372333826, 'learning_rate': 1.460322498607814e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 76%|███████▌ | 16747/22095 [28:39:19<6:42:03, 4.51s/it] {'loss': 0.4631, 'grad_norm': 0.2855580345783155, 'learning_rate': 1.4598048913141538e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (54763 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133959 > 40960). Running this sequence through the model will result in indexing errors
 76%|███████▌ | 16748/22095 [28:39:22<6:18:52, 4.25s/it] {'loss': 0.3163, 'grad_norm': 0.6084118565420273, 'learning_rate': 1.4592873600884123e-06, 'epoch': 0.76}
 76%|███████▌ | 16749/22095 [28:39:26<6:01:03, 4.05s/it] {'loss': 0.322, 'grad_norm': 0.624547778812951, 'learning_rate': 1.458769904941712e-06, 'epoch': 0.76}
 76%|███████▌ | 16750/22095 [28:39:29<5:43:13, 3.85s/it] {'loss': 0.3053, 'grad_norm': 0.5736452180550441, 'learning_rate': 1.458252525885171e-06, 'epoch': 0.76}
 76%|███████▌ | 16751/22095 [28:39:33<5:30:49, 3.71s/it] {'loss': 0.3067, 'grad_norm': 0.6451197460634303, 'learning_rate': 1.4577352229299036e-06, 'epoch': 0.76}
 76%|███████▌ | 16752/22095 [28:39:37<5:43:10, 3.85s/it] {'loss': 0.2996, 'grad_norm': 0.5835805495229613, 'learning_rate': 1.4572179960870276e-06, 'epoch': 0.76}
 76%|███████▌ | 16753/22095 [28:39:40<5:28:39, 3.69s/it] {'loss': 0.304, 'grad_norm': 0.644994455896996, 'learning_rate': 1.4567008453676584e-06, 'epoch': 0.76}
 76%|███████▌ | 16754/22095 [28:39:43<5:12:23, 3.51s/it] {'loss': 0.2737, 'grad_norm': 0.6242410509122905, 'learning_rate': 1.456183770782903e-06, 'epoch': 0.76}
 76%|███████▌ | 16755/22095 [28:39:46<4:59:40, 3.37s/it] {'loss': 0.2896, 'grad_norm': 0.5967340225522353, 'learning_rate': 1.4556667723438745e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 76%|███████▌ | 16756/22095 [28:39:55<7:12:05, 4.86s/it] {'loss': 0.4955, 'grad_norm': 0.27541432600904664, 'learning_rate': 1.4551498500616823e-06, 'epoch': 0.76}
 76%|███████▌ | 16757/22095 [28:39:58<6:27:52, 4.36s/it] {'loss': 0.2772, 'grad_norm': 0.7689975087348239, 'learning_rate': 1.4546330039474332e-06, 'epoch': 0.76}
 76%|███████▌ | 16758/22095 [28:40:01<5:48:01, 3.91s/it] {'loss': 0.2655, 'grad_norm': 0.6404359902753533, 'learning_rate': 1.4541162340122305e-06, 'epoch': 0.76}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [453, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8510662 in VC:s3://internvl-moe-sft-data/. Exception: Image size [453, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 121179, 'image': 'vrdu_texteq/astro-ph.CO/c3b4a7e5-fa12-45af-86dd-5eda252c51cf.png', 'image_wh': [[453, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'where $t_{*}$ denotes the nucleation time.'}]}
 76%|███████▌ | 16759/22095 [28:40:04<5:43:08, 3.86s/it] {'loss': 0.3307, 'grad_norm': 0.6518976716703694, 'learning_rate': 1.453599540267181e-06, 'epoch': 0.76}
 76%|███████▌ | 16760/22095 [28:40:07<5:22:00, 3.62s/it] {'loss': 0.268, 'grad_norm': 0.6298797588350113, 'learning_rate': 1.453082922723384e-06, 'epoch': 0.76}
 76%|███████▌ | 16761/22095 [28:40:11<5:21:24, 3.62s/it] {'loss': 0.3191, 'grad_norm': 0.6043313496398718, 'learning_rate': 1.4525663813919433e-06, 'epoch': 0.76}
 76%|███████▌ | 16762/22095 [28:40:15<5:30:19, 3.72s/it] {'loss': 0.3195, 'grad_norm': 0.7437923777491514, 'learning_rate': 1.452049916283954e-06, 'epoch': 0.76}
 76%|███████▌ | 16763/22095 [28:40:18<5:08:46, 3.47s/it] {'loss': 0.3604, 'grad_norm': 0.6746630879217013, 'learning_rate': 1.4515335274105168e-06, 'epoch': 0.76}
 76%|███████▌ | 16764/22095 [28:40:21<4:51:04, 3.28s/it] {'loss': 0.2821, 'grad_norm': 0.5701015759486506, 'learning_rate': 1.4510172147827244e-06, 'epoch': 0.76}
Token indices sequence length is longer than the specified maximum sequence length for this model (123281 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61108 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (50883 > 40960) for 4 sample(s). Truncating to 7970 with 2 samples.
 76%|███████▌ | 16765/22095 [28:40:23<4:38:42, 3.14s/it] {'loss': 0.2907, 'grad_norm': 0.6900711667667638, 'learning_rate': 1.4505009784116735e-06, 'epoch': 0.76}
 76%|███████▌ | 16766/22095 [28:40:26<4:27:20, 3.01s/it] {'loss': 0.2687, 'grad_norm': 0.5760290714206665, 'learning_rate': 1.4499848183084558e-06, 'epoch': 0.76}
 76%|███████▌ | 16767/22095 [28:40:29<4:33:42, 3.08s/it] {'loss': 0.3341, 'grad_norm': 0.5925292721443706, 'learning_rate': 1.449468734484159e-06, 'epoch': 0.76}
 76%|███████▌ | 16768/22095 [28:40:33<4:45:50, 3.22s/it] {'loss': 0.2754, 'grad_norm': 0.6251576508171806, 'learning_rate': 1.4489527269498749e-06, 'epoch': 0.76}
 76%|███████▌ | 16769/22095 [28:40:36<4:50:24, 3.27s/it] {'loss': 0.3144, 'grad_norm': 0.5910949194326818, 'learning_rate': 1.4484367957166923e-06, 'epoch': 0.76}
 76%|███████▌ | 16770/22095 [28:40:40<5:12:11, 3.52s/it] {'loss': 0.2814, 'grad_norm': 0.6602577060771069, 'learning_rate': 1.4479209407956946e-06, 'epoch': 0.76}
 76%|███████▌ | 16771/22095 [28:40:44<5:02:15, 3.41s/it] {'loss': 0.306, 'grad_norm': 0.6725839382128941, 'learning_rate': 1.4474051621979651e-06, 'epoch': 0.76}
 76%|███████▌ | 16772/22095 [28:40:47<4:57:34, 3.35s/it] {'loss': 0.2708, 'grad_norm': 0.6187387367824468, 'learning_rate': 1.4468894599345895e-06, 'epoch': 0.76}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6777847 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250509_125727_1/images/before_screenshot_1_id_121_function_1_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nSelect 'Edit customization interface' under the User Interface section to access tools for managing workspaces, toolbars, menus, and other interface elements."}, {'from': 'gpt', 'value': '\nclick(x=0.6055, y=0.1517)\n'}], 'width': 2560, 'height': 1440}
 76%|███████▌ | 16773/22095 [28:40:56<7:39:14, 5.18s/it] {'loss': 0.4706, 'grad_norm': 0.2720563841107742, 'learning_rate': 1.446373834016645e-06, 'epoch': 0.76}
 76%|███████▌ | 16774/22095 [28:41:00<6:54:38, 4.68s/it] {'loss': 0.3396, 'grad_norm': 0.6388025149213199, 'learning_rate': 1.4458582844552144e-06, 'epoch': 0.76}
 76%|███████▌ | 16775/22095 [28:41:04<6:40:51, 4.52s/it] {'loss': 0.2842, 'grad_norm': 0.6594795049049732, 'learning_rate': 1.4453428112613716e-06, 'epoch': 0.76}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396968 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63821, 'image': 'vrdu_table_final_2/astro-ph.EP/74a285b5-f5f3-4ad8-bf5f-77a241681894.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_x$\\end{tabular}\n```"}]} 76%|███████▌ | 16776/22095 [28:41:07<6:08:59, 4.16s/it] {'loss': 0.2865, 'grad_norm': 0.6183436482999873, 'learning_rate': 1.4448274144461965e-06, 'epoch': 0.76} 76%|███████▌ | 16776/22095 [28:41:07<6:08:59, 4.16s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52007 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63702 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▌ | 16777/22095 [28:41:11<5:51:16, 3.96s/it] {'loss': 0.263, 'grad_norm': 0.7109544276087211, 'learning_rate': 1.44431209402076e-06, 'epoch': 0.76} 76%|███████▌ | 16777/22095 [28:41:11<5:51:16, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66890 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43992 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77371 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48648 > 40960). 
Running this sequence through the model will result in indexing errors
76%|███████▌ | 16778/22095 [28:41:14<5:42:45, 3.87s/it] {'loss': 0.2731, 'grad_norm': 0.6297736322139782, 'learning_rate': 1.4437968499961374e-06, 'epoch': 0.76}
76%|███████▌ | 16778/22095 [28:41:14<5:42:45, 3.87s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8953366 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4201, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 4cm\nB. 5cm\nC. 无法确定\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
76%|███████▌ | 16779/22095 [28:41:19<5:53:24, 3.99s/it] {'loss': 0.3162, 'grad_norm': 0.6384738341119297, 'learning_rate': 1.4432816823833983e-06, 'epoch': 0.76}
76%|███████▌ | 16779/22095 [28:41:19<5:53:24, 3.99s/it]
76%|███████▌ | 16780/22095 [28:41:22<5:46:03, 3.91s/it] {'loss': 0.2916, 'grad_norm': 0.582218562638432, 'learning_rate': 1.4427665911936106e-06, 'epoch': 0.76}
76%|███████▌ | 16780/22095 [28:41:22<5:46:03, 3.91s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 108, in __call__
    # img_value_str = self.client.get(fn)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 99, in _get
    if has_tcs_loader:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 517, in get
    data, _ = self.get_with_info(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 514, in get_with_info
    return self._get_local_client().get_with_info(uri, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 478, in get_with_info
    return client.get(filepath), info
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 167, in get
    return self._client.get_object(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 569, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 1023, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
[Try #0] Failed to fetch sample 1110959 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. Exception: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
Problematic sample: {'image': 'bf3459bcaf434803a580fcd36cbe71aestep0.png', 'conversations': [{'from': 'human', 'value': '\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nOpen AP News. Share the link of the first article in the "Business" category\n\nPrevious operations:\nNone'}, {'from': 'gpt', 'value': "\nThe AP News app is visible on the home screen. The goal is to open the AP News app and find the first article in the 'Business' category. Starting by launching the AP News app is the logical first step.\n\n\nTap on the AP News app to open it.\n\n\nterminate(status='success')\n"}]}
76%|███████▌ | 16781/22095 [28:41:26<5:33:59, 3.77s/it] {'loss': 0.294, 'grad_norm': 0.6127931682849966, 'learning_rate': 1.4422515764378443e-06, 'epoch': 0.76}
76%|███████▌ | 16781/22095 [28:41:26<5:33:59, 3.77s/it]
76%|███████▌ | 16782/22095 [28:41:30<5:37:16, 3.81s/it] {'loss': 0.286, 'grad_norm': 0.6399210423841506, 'learning_rate': 1.4417366381271674e-06, 'epoch': 0.76}
76%|███████▌ | 16782/22095 [28:41:30<5:37:16, 3.81s/it]
76%|███████▌ | 16783/22095 [28:41:33<5:30:04, 3.73s/it] {'loss': 0.2846, 'grad_norm': 0.620176767003677, 'learning_rate': 1.4412217762726388e-06, 'epoch': 0.76}
76%|███████▌ | 16783/22095 [28:41:33<5:30:04, 3.73s/it]
76%|███████▌ | 16784/22095 [28:41:37<5:21:25, 3.63s/it] {'loss': 0.2601, 'grad_norm': 0.6019342956194589, 'learning_rate': 1.4407069908853243e-06, 'epoch': 0.76}
76%|███████▌ | 16784/22095 [28:41:37<5:21:25, 3.63s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▌ | 16785/22095 [28:41:42<6:06:49, 4.14s/it] {'loss': 0.4776, 'grad_norm': 0.2856625396067131, 'learning_rate': 1.4401922819762864e-06, 'epoch': 0.76}
76%|███████▌ | 16785/22095 [28:41:42<6:06:49, 4.14s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8346700 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
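The "[Try #0] Failed to fetch sample ... Exception: ..." lines above show the dataset retrying after a bad sample (a too-small image or a missing S3 key) rather than crashing the run. A minimal sketch of that pattern is a `__getitem__` wrapper that catches the loader's exception, logs it, and falls back to a different index; the class name, `MAX_TRIES`, and the random-resample policy are illustrative assumptions, not the actual `data_qwen_2.py` implementation:

```python
import random


class RetryingDataset:
    """Sketch: retry __getitem__ with a resampled index when loading fails.

    `loader` is any callable that raises on a bad sample (e.g. a tiny image
    or a missing object key, as in the log above). Hypothetical API.
    """

    MAX_TRIES = 3  # assumed retry budget, mirroring the "[Try #N]" counter

    def __init__(self, items, loader):
        self.items = items
        self.loader = loader

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        for attempt in range(self.MAX_TRIES):
            try:
                return self.loader(self.items[idx])
            except Exception as exc:
                # Log in the same spirit as the run above, then resample.
                print(f"[Try #{attempt}] Failed to fetch sample {idx}. Exception: {exc}")
                idx = random.randrange(len(self.items))
        raise RuntimeError("exhausted retries while fetching a sample")
```

Resampling keeps the batch pipeline moving at the cost of a slightly skewed sampling distribution; skipping (returning `None` and filtering in `collate_fn`) is the common alternative.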
Problematic sample: {'id': 13363, 'image': 'vrdu_table_final_2/astro-ph.CO/f8ddf088-bab5-406a-ba41-0c1b5ec347fc.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #2 \\\\\n \\end{tabular}\n```"}]}
76%|███████▌ | 16786/22095 [28:41:52<8:28:45, 5.75s/it] {'loss': 0.4881, 'grad_norm': 0.27648760043520637, 'learning_rate': 1.4396776495565833e-06, 'epoch': 0.76}
76%|███████▌ | 16786/22095 [28:41:52<8:28:45, 5.75s/it]
Invalidate trace cache @ step 2: expected module 364, but got module 1
76%|███████▌ | 16787/22095 [28:41:55<7:25:25, 5.04s/it] {'loss': 0.3046, 'grad_norm': 0.625835672946518, 'learning_rate': 1.4391630936372714e-06, 'epoch': 0.76}
76%|███████▌ | 16787/22095 [28:41:55<7:25:25, 5.04s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▌ | 16788/22095 [28:41:59<6:47:54, 4.61s/it] {'loss': 0.2524, 'grad_norm': 0.6146365451550019, 'learning_rate': 1.4386486142294081e-06, 'epoch': 0.76}
76%|███████▌ | 16788/22095 [28:41:59<6:47:54, 4.61s/it]
76%|███████▌ | 16789/22095 [28:42:02<6:08:56, 4.17s/it] {'loss': 0.2684, 'grad_norm': 0.6884840712238033, 'learning_rate': 1.43813421134405e-06, 'epoch': 0.76}
76%|███████▌ | 16789/22095 [28:42:02<6:08:56, 4.17s/it]
76%|███████▌ | 16790/22095 [28:42:06<6:12:13, 4.21s/it] {'loss': 0.2825, 'grad_norm': 0.7172089130658871, 'learning_rate': 1.4376198849922484e-06, 'epoch': 0.76}
76%|███████▌ | 16790/22095 [28:42:06<6:12:13, 4.21s/it]
76%|███████▌ | 16791/22095 [28:42:09<5:53:23, 4.00s/it] {'loss': 0.3419, 'grad_norm': 0.6264937069460348, 'learning_rate': 1.4371056351850525e-06, 'epoch': 0.76}
76%|███████▌ | 16791/22095 [28:42:09<5:53:23, 4.00s/it]
76%|███████▌ | 16792/22095 [28:42:13<5:30:12, 3.74s/it] {'loss': 0.2876, 'grad_norm': 0.6564587162794594, 'learning_rate': 1.4365914619335158e-06, 'epoch': 0.76}
76%|███████▌ | 16792/22095 [28:42:13<5:30:12, 3.74s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (87404 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70584 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88435 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▌ | 16793/22095 [28:42:15<5:03:43, 3.44s/it] {'loss': 0.3208, 'grad_norm': 0.658685267212285, 'learning_rate': 1.4360773652486826e-06, 'epoch': 0.76}
76%|███████▌ | 16793/22095 [28:42:15<5:03:43, 3.44s/it]
76%|███████▌ | 16794/22095 [28:42:18<4:44:52, 3.22s/it] {'loss': 0.3154, 'grad_norm': 0.598486391096725, 'learning_rate': 1.435563345141603e-06, 'epoch': 0.76}
76%|███████▌ | 16794/22095 [28:42:18<4:44:52, 3.22s/it]
76%|███████▌ | 16795/22095 [28:42:21<4:36:49, 3.13s/it] {'loss': 0.2963, 'grad_norm': 0.6100147880854968, 'learning_rate': 1.4350494016233197e-06, 'epoch': 0.76}
76%|███████▌ | 16795/22095 [28:42:21<4:36:49, 3.13s/it]
76%|███████▌ | 16796/22095 [28:42:24<4:31:31, 3.07s/it] {'loss': 0.2693, 'grad_norm': 0.6434704416028074, 'learning_rate': 1.4345355347048739e-06, 'epoch': 0.76}
76%|███████▌ | 16796/22095 [28:42:24<4:31:31, 3.07s/it]
76%|███████▌ | 16797/22095 [28:42:28<4:49:39, 3.28s/it] {'loss': 0.282, 'grad_norm': 0.5765541262298857, 'learning_rate': 1.4340217443973093e-06, 'epoch': 0.76}
76%|███████▌ | 16797/22095 [28:42:28<4:49:39, 3.28s/it]
76%|███████▌ | 16798/22095 [28:42:31<4:46:41, 3.25s/it] {'loss': 0.3015, 'grad_norm': 0.6419387726754483, 'learning_rate': 1.4335080307116667e-06, 'epoch': 0.76}
76%|███████▌ | 16798/22095
[28:42:31<4:46:41, 3.25s/it]
76%|███████▌ | 16799/22095 [28:42:34<4:41:17, 3.19s/it] {'loss': 0.2685, 'grad_norm': 0.5943876804016321, 'learning_rate': 1.432994393658983e-06, 'epoch': 0.76}
76%|███████▌ | 16799/22095 [28:42:34<4:41:17, 3.19s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8344307 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10959, 'image': 'vrdu_table_final_2/astro-ph.CO/35b64eee-535b-4415-9d92-b80c6ad94416.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
76%|███████▌ | 16800/22095 [28:42:37<4:42:08, 3.20s/it] {'loss': 0.2883, 'grad_norm': 0.6019177857531949, 'learning_rate': 1.4324808332502932e-06, 'epoch': 0.76}
76%|███████▌ | 16800/22095 [28:42:37<4:42:08, 3.20s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47505 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97696 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▌ | 16801/22095 [28:42:47<7:29:36, 5.10s/it] {'loss': 0.4577, 'grad_norm': 0.27134340303500054, 'learning_rate': 1.4319673494966345e-06, 'epoch': 0.76}
76%|███████▌ | 16801/22095 [28:42:47<7:29:36, 5.10s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [442, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8514339 in VC:s3://internvl-moe-sft-data/. Exception: Image size [442, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 157678, 'image': 'vrdu_texteq/astro-ph.CO/b8a4986f-26f5-4450-83de-bbd90f039b5e.png', 'image_wh': [[442, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and $\\mathbf{M}$ is the same matrix as before.'}]}
76%|███████▌ | 16802/22095 [28:42:50<6:43:25, 4.57s/it] {'loss': 0.2485, 'grad_norm': 0.7610053429037491, 'learning_rate': 1.431453942409038e-06, 'epoch': 0.76}
76%|███████▌ | 16802/22095 [28:42:50<6:43:25, 4.57s/it]
76%|███████▌ | 16803/22095 [28:42:53<6:11:20, 4.21s/it] {'loss': 0.28, 'grad_norm': 0.6087080165987451, 'learning_rate': 1.430940611998538e-06, 'epoch': 0.76}
76%|███████▌ | 16803/22095 [28:42:53<6:11:20, 4.21s/it]
76%|███████▌ | 16804/22095 [28:42:56<5:42:06, 3.88s/it] {'loss': 0.2827, 'grad_norm': 0.5778916428969291, 'learning_rate': 1.4304273582761607e-06, 'epoch': 0.76}
76%|███████▌ | 16804/22095 [28:42:56<5:42:06, 3.88s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (52746 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▌ | 16805/22095 [28:42:59<5:13:21, 3.55s/it] {'loss': 0.3085, 'grad_norm': 0.6189611531085527, 'learning_rate': 1.4299141812529382e-06, 'epoch': 0.76}
76%|███████▌ | 16805/22095 [28:42:59<5:13:21, 3.55s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8532101 in VC:s3://internvl-moe-sft-data/. Exception: Image size [225, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3621, 'image': 'vrdu_texteq/astro-ph.CO/02cb823a-e5ee-4185-ac1c-347b2cee093a.png', 'image_wh': [[225, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'where $R_A=\\theta_A D_l$.'}]}
76%|███████▌ | 16806/22095 [28:43:02<4:55:28, 3.35s/it] {'loss': 0.2818, 'grad_norm': 0.6054645325134527, 'learning_rate': 1.429401080939894e-06, 'epoch': 0.76}
76%|███████▌ | 16806/22095 [28:43:02<4:55:28, 3.35s/it]
76%|███████▌ | 16807/22095 [28:43:05<4:48:42, 3.28s/it] {'loss': 0.3107, 'grad_norm': 0.7938087806773069, 'learning_rate': 1.4288880573480551e-06, 'epoch': 0.76}
76%|███████▌ | 16807/22095 [28:43:05<4:48:42, 3.28s/it]
76%|███████▌ | 16808/22095 [28:43:09<5:09:21, 3.51s/it] {'loss': 0.309, 'grad_norm': 0.7210383208925405, 'learning_rate': 1.4283751104884446e-06, 'epoch': 0.76}
76%|███████▌ | 16808/22095 [28:43:09<5:09:21, 3.51s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▌ | 16809/22095 [28:43:19<7:44:42, 5.27s/it] {'loss': 0.4726, 'grad_norm': 0.28603461792807805, 'learning_rate': 1.4278622403720816e-06, 'epoch': 0.76}
76%|███████▌ | 16809/22095 [28:43:19<7:44:42, 5.27s/it]
76%|███████▌ | 16810/22095 [28:43:22<6:58:08, 4.75s/it] {'loss': 0.3327, 'grad_norm': 0.6318045524562781, 'learning_rate': 1.4273494470099886e-06, 'epoch': 0.76}
76%|███████▌ | 16810/22095 [28:43:22<6:58:08, 4.75s/it]
76%|███████▌ | 16811/22095 [28:43:26<6:40:45, 4.55s/it] {'loss': 0.2926, 'grad_norm': 0.7677521171288333, 'learning_rate': 1.4268367304131847e-06, 'epoch': 0.76}
76%|███████▌ | 16811/22095 [28:43:26<6:40:45, 4.55s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (42527 > 40960).
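The `ValueError: Image size [...] is too small. Minimum size is 28` reports that recur throughout this run all come from samples whose recorded `image_wh` has a side below 28 pixels. Since every "Problematic sample" dump already carries an `image_wh` field, a cheap pre-filter over the annotation list could drop these before training instead of paying a retry at fetch time. A minimal sketch, assuming sample dicts shaped like the dumps above (the `MIN_SIZE` constant and function name are hypothetical):

```python
# Sketch: drop annotation entries whose recorded width or height falls
# below the 28-pixel minimum reported in the log. The sample structure
# mirrors the "Problematic sample" dicts; this is not the actual
# data_qwen_2.py filtering code.
MIN_SIZE = 28


def filter_tiny_images(samples, min_size=MIN_SIZE):
    """Split samples into (kept, dropped) by their 'image_wh' entries."""
    kept, dropped = [], []
    for sample in samples:
        whs = sample.get("image_wh") or []
        if any(w < min_size or h < min_size for w, h in whs):
            dropped.append(sample)
        else:
            kept.append(sample)
    return kept, dropped


samples = [
    {"id": 3621, "image": "ok.png", "image_wh": [[225, 250]]},
    {"id": 4201, "image": "images/4938.png", "image_wh": [[190, 22]]},  # too small
]
kept, dropped = filter_tiny_images(samples)
```

Note that many of the failing samples here (e.g. `[225, 25]`, `[442, 23]`) are legitimate single-line formula crops; upscaling or padding them to the minimum side, rather than dropping, may be the better remedy.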
Running this sequence through the model will result in indexing errors
76%|███████▌ | 16812/22095 [28:43:30<6:08:38, 4.19s/it] {'loss': 0.2787, 'grad_norm': 0.5626308426564788, 'learning_rate': 1.426324090592685e-06, 'epoch': 0.76}
76%|███████▌ | 16812/22095 [28:43:30<6:08:38, 4.19s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▌ | 16813/22095 [28:43:36<6:57:55, 4.75s/it] {'loss': 0.456, 'grad_norm': 0.2539087849679761, 'learning_rate': 1.4258115275595036e-06, 'epoch': 0.76}
76%|███████▌ | 16813/22095 [28:43:36<6:57:55, 4.75s/it]
76%|███████▌ | 16814/22095 [28:43:39<6:25:11, 4.38s/it] {'loss': 0.3336, 'grad_norm': 0.6223413384815674, 'learning_rate': 1.425299041324657e-06, 'epoch': 0.76}
76%|███████▌ | 16814/22095 [28:43:39<6:25:11, 4.38s/it]
76%|███████▌ | 16815/22095 [28:43:42<5:49:32, 3.97s/it] {'loss': 0.3442, 'grad_norm': 0.6122816121017155, 'learning_rate': 1.424786631899155e-06, 'epoch': 0.76}
76%|███████▌ | 16815/22095 [28:43:42<5:49:32, 3.97s/it]
76%|███████▌ | 16816/22095 [28:43:45<5:25:41, 3.70s/it] {'loss': 0.3565, 'grad_norm': 0.6532982871938595, 'learning_rate': 1.424274299294006e-06, 'epoch': 0.76}
76%|███████▌ | 16816/22095 [28:43:45<5:25:41, 3.70s/it]
76%|███████▌ | 16817/22095 [28:43:49<5:23:29, 3.68s/it] {'loss': 0.2918, 'grad_norm': 0.563622553510085, 'learning_rate': 1.423762043520221e-06, 'epoch': 0.76}
76%|███████▌ | 16817/22095 [28:43:49<5:23:29, 3.68s/it]
76%|███████▌ | 16818/22095 [28:43:52<5:18:47, 3.62s/it] {'loss': 0.316, 'grad_norm': 1.1159220788524196, 'learning_rate': 1.4232498645888071e-06, 'epoch': 0.76}
76%|███████▌ | 16818/22095 [28:43:52<5:18:47, 3.62s/it]
76%|███████▌ | 16819/22095 [28:43:56<5:12:46, 3.56s/it] {'loss': 0.3085, 'grad_norm': 0.5783143028320339, 'learning_rate': 1.4227377625107686e-06, 'epoch': 0.76}
76%|███████▌ | 16819/22095 [28:43:56<5:12:46, 3.56s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8339017 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5648, 'image': 'vrdu_table_final_2/astro-ph.CO/9090a52b-86b4-4c7e-9392-6cbe5a471b6d.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▌ | 16820/22095 [28:44:03<6:55:32, 4.73s/it] {'loss': 0.4805, 'grad_norm': 0.24991075881259164, 'learning_rate': 1.4222257372971072e-06, 'epoch': 0.76}
76%|███████▌ | 16820/22095 [28:44:03<6:55:32, 4.73s/it]
76%|███████▌ | 16821/22095 [28:44:07<6:33:33, 4.48s/it] {'loss': 0.3054, 'grad_norm': 0.6570138851141044, 'learning_rate': 1.4217137889588279e-06, 'epoch': 0.76}
76%|███████▌ | 16821/22095 [28:44:07<6:33:33, 4.48s/it]
76%|███████▌ | 16822/22095 [28:44:10<5:52:49, 4.01s/it] {'loss': 0.2591, 'grad_norm': 0.6034558054078771, 'learning_rate': 1.421201917506928e-06, 'epoch': 0.76}
76%|███████▌ | 16822/22095 [28:44:10<5:52:49, 4.01s/it]
76%|███████▌ | 16823/22095 [28:44:13<5:23:17, 3.68s/it] {'loss': 0.2863, 'grad_norm': 0.5642207860718541, 'learning_rate': 1.4206901229524089e-06, 'epoch': 0.76}
76%|███████▌ | 16823/22095 [28:44:13<5:23:17, 3.68s/it]
76%|███████▌ | 16824/22095 [28:44:16<4:58:17, 3.40s/it] {'loss': 0.2803, 'grad_norm': 0.6288858995612324, 'learning_rate': 1.4201784053062662e-06, 'epoch': 0.76}
76%|███████▌ | 16824/22095 [28:44:16<4:58:17, 3.40s/it]
76%|███████▌ | 16825/22095 [28:44:19<4:48:07, 3.28s/it] {'loss': 0.2858, 'grad_norm': 0.6032890415679383, 'learning_rate': 1.4196667645794932e-06, 'epoch': 0.76}
76%|███████▌ | 16825/22095 [28:44:19<4:48:07, 3.28s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [781, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8429196 in VC:s3://internvl-moe-sft-data/. Exception: Image size [781, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7251, 'image': 'vrdu_texteq/astro-ph.CO/a4ea56cc-e08b-421f-ace2-1a6dc0c24071.png', 'image_wh': [[781, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'where $n$ is the total number of dwarfs in the observable window.'}]}
76%|███████▌ | 16826/22095 [28:44:22<4:43:33, 3.23s/it] {'loss': 0.2414, 'grad_norm': 0.577659202381288, 'learning_rate': 1.4191552007830856e-06, 'epoch': 0.76}
76%|███████▌ | 16826/22095 [28:44:22<4:43:33, 3.23s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (54824 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73234 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119421 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46669 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▌ | 16827/22095 [28:44:26<4:57:47, 3.39s/it] {'loss': 0.3162, 'grad_norm': 0.6491047599613663, 'learning_rate': 1.4186437139280363e-06, 'epoch': 0.76}
76%|███████▌ | 16827/22095 [28:44:26<4:57:47, 3.39s/it]
76%|███████▌ | 16828/22095 [28:44:29<5:01:18, 3.43s/it] {'loss': 0.2882, 'grad_norm': 0.624204633497584, 'learning_rate': 1.4181323040253346e-06, 'epoch': 0.76}
76%|███████▌ | 16828/22095 [28:44:29<5:01:18, 3.43s/it]
76%|███████▌ | 16829/22095 [28:44:33<5:09:28, 3.53s/it] {'loss': 0.2684, 'grad_norm': 0.6492984485836424, 'learning_rate': 1.4176209710859672e-06, 'epoch': 0.76}
76%|███████▌ | 16829/22095 [28:44:33<5:09:28, 3.53s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▌ | 16830/22095 [28:44:36<4:52:30, 3.33s/it] {'loss': 0.265, 'grad_norm': 0.9304262523062551, 'learning_rate': 1.417109715120924e-06, 'epoch': 0.76}
76%|███████▌ | 16830/22095 [28:44:36<4:52:30, 3.33s/it]
76%|███████▌ | 16831/22095 [28:44:40<5:02:27, 3.45s/it] {'loss': 0.3371, 'grad_norm': 0.6239550151772859, 'learning_rate': 1.4165985361411878e-06, 'epoch': 0.76}
76%|███████▌ | 16831/22095 [28:44:40<5:02:27, 3.45s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (46193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42973 > 40960).
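The repeated "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings show tokenized conversations up to ~119k tokens against a 40960-token model maximum. A pre-tokenization length guard can turn the warning into an explicit truncate-or-skip decision before the batch reaches the model. A minimal sketch, assuming the 40960 limit from the warnings above (the function name and truncate-vs-skip policy are illustrative, not the training script's actual handling):

```python
# Sketch: guard tokenized sequences against the model's maximum length
# (40960, per the tokenizer warnings in this log). Illustrative only.
MAX_LEN = 40960


def guard_length(input_ids, max_len=MAX_LEN, truncate=True):
    """Return input_ids if within max_len; otherwise truncate or return None.

    Returning None signals "skip this sample" so a collate_fn can drop it.
    """
    if len(input_ids) <= max_len:
        return input_ids
    print(
        f"Token indices sequence length is longer than the specified maximum "
        f"sequence length for this model ({len(input_ids)} > {max_len})."
    )
    return input_ids[:max_len] if truncate else None
```

Naive tail truncation can cut a multimodal sample mid-image-span, so skipping (or repacking the conversation) is often safer for vision-language data than truncating.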
Running this sequence through the model will result in indexing errors
76%|███████▌ | 16832/22095 [28:44:43<5:09:33, 3.53s/it] {'loss': 0.3229, 'grad_norm': 0.6818527097603084, 'learning_rate': 1.4160874341577447e-06, 'epoch': 0.76}
76%|███████▌ | 16832/22095 [28:44:43<5:09:33, 3.53s/it]
76%|███████▌ | 16833/22095 [28:44:46<4:58:12, 3.40s/it] {'loss': 0.2793, 'grad_norm': 0.6515337635420715, 'learning_rate': 1.4155764091815737e-06, 'epoch': 0.76}
76%|███████▌ | 16833/22095 [28:44:46<4:58:12, 3.40s/it]
76%|███████▌ | 16834/22095 [28:44:50<5:04:37, 3.47s/it] {'loss': 0.3002, 'grad_norm': 0.6006328439560533, 'learning_rate': 1.4150654612236592e-06, 'epoch': 0.76}
76%|███████▌ | 16834/22095 [28:44:50<5:04:37, 3.47s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▌ | 16835/22095 [28:44:58<6:54:42, 4.73s/it] {'loss': 0.4464, 'grad_norm': 0.2696520101700211, 'learning_rate': 1.4145545902949758e-06, 'epoch': 0.76}
76%|███████▌ | 16835/22095 [28:44:58<6:54:42, 4.73s/it]
76%|███████▌ | 16836/22095 [28:45:02<6:46:25, 4.64s/it] {'loss': 0.2875, 'grad_norm': 0.5866097715554073, 'learning_rate': 1.4140437964065034e-06, 'epoch': 0.76}
76%|███████▌ | 16836/22095 [28:45:02<6:46:25, 4.64s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8938307 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61460, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nA. 13cm\nB. 11cm\nC. 12cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
76%|███████▌ | 16837/22095 [28:45:05<5:57:20, 4.08s/it] {'loss': 0.3314, 'grad_norm': 0.6464923450865269, 'learning_rate': 1.413533079569217e-06, 'epoch': 0.76}
76%|███████▌ | 16837/22095 [28:45:05<5:57:20, 4.08s/it]
76%|███████▌ | 16838/22095 [28:45:08<5:29:58, 3.77s/it] {'loss': 0.2522, 'grad_norm': 0.6175031665946599, 'learning_rate': 1.4130224397940883e-06, 'epoch': 0.76}
76%|███████▌ | 16838/22095 [28:45:08<5:29:58, 3.77s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8515032 in VC:s3://internvl-moe-sft-data/. Exception: Image size [192, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 130265, 'image': 'vrdu_texteq/astro-ph.CO/40f4371d-7604-493c-b9bf-ba9af5c0725d.png', 'image_wh': [[192, 25]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'and \\mbox{$z-K_\\mathrm{s}\\simeq 1$}.'}]}
76%|███████▌ | 16839/22095 [28:45:11<5:05:48, 3.49s/it] {'loss': 0.2754, 'grad_norm': 0.6334144130604863, 'learning_rate': 1.4125118770920903e-06, 'epoch': 0.76}
76%|███████▌ | 16839/22095 [28:45:11<5:05:48, 3.49s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▌ | 16840/22095 [28:45:21<8:10:14, 5.60s/it] {'loss': 0.4639, 'grad_norm': 0.2774705809975444, 'learning_rate': 1.412001391474196e-06, 'epoch': 0.76}
76%|███████▌ | 16840/22095 [28:45:21<8:10:14, 5.60s/it]
76%|███████▌ | 16841/22095 [28:45:26<7:54:25, 5.42s/it] {'loss': 0.2797, 'grad_norm': 0.7367744811054106, 'learning_rate': 1.4114909829513718e-06, 'epoch': 0.76}
76%|███████▌ | 16841/22095 [28:45:26<7:54:25, 5.42s/it]
76%|███████▌ | 16842/22095 [28:45:30<7:02:10, 4.82s/it] {'loss': 0.2997, 'grad_norm': 0.6474503842151763, 'learning_rate': 1.4109806515345836e-06, 'epoch': 0.76}
76%|███████▌ | 16842/22095 [28:45:30<7:02:10, 4.82s/it]
76%|███████▌ | 16843/22095 [28:45:33<6:16:47, 4.30s/it] {'loss': 0.2746, 'grad_norm': 0.60544937199488, 'learning_rate': 1.4104703972348e-06, 'epoch': 0.76}
76%|███████▌ | 16843/22095 [28:45:33<6:16:47, 4.30s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (75126 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117810 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44398 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50708 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▌ | 16844/22095 [28:45:36<5:39:58, 3.88s/it] {'loss': 0.2633, 'grad_norm': 0.5857315723739743, 'learning_rate': 1.4099602200629813e-06, 'epoch': 0.76}
76%|███████▌ | 16844/22095 [28:45:36<5:39:58, 3.88s/it]
76%|███████▌ | 16845/22095 [28:45:39<5:30:26, 3.78s/it] {'loss': 0.334, 'grad_norm': 0.6437930937488767, 'learning_rate': 1.4094501200300937e-06, 'epoch': 0.76}
76%|███████▌ | 16845/22095 [28:45:39<5:30:26, 3.78s/it]
76%|███████▌ | 16846/22095 [28:45:42<5:07:00, 3.51s/it] {'loss': 0.2714, 'grad_norm': 0.6409464336461744, 'learning_rate': 1.4089400971470935e-06, 'epoch': 0.76}
76%|███████▌ | 16846/22095 [28:45:42<5:07:00, 3.51s/it]
76%|███████▌ | 16847/22095 [28:45:45<4:56:55, 3.39s/it] {'loss': 0.247, 'grad_norm': 0.6142536853103293, 'learning_rate': 1.4084301514249432e-06, 'epoch': 0.76}
76%|███████▌ | 16847/22095 [28:45:45<4:56:55, 3.39s/it]
76%|███████▋ | 16848/22095 [28:45:48<4:48:28, 3.30s/it] {'loss': 0.3, 'grad_norm': 0.7107969655346827, 'learning_rate': 1.407920282874598e-06, 'epoch': 0.76}
76%|███████▋ | 16848/22095 [28:45:48<4:48:28, 3.30s/it]
76%|███████▋ | 16849/22095 [28:45:53<5:14:43, 3.60s/it] {'loss': 0.3144, 'grad_norm': 0.7072927653548671, 'learning_rate': 1.4074104915070124e-06, 'epoch': 0.76}
76%|███████▋ | 16849/22095 [28:45:53<5:14:43, 3.60s/it]
76%|███████▋ | 16850/22095 [28:45:56<5:03:40, 3.47s/it] {'loss': 0.3365, 'grad_norm': 0.7667570326124314, 'learning_rate': 1.4069007773331433e-06, 'epoch': 0.76}
76%|███████▋ | 16850/22095 [28:45:56<5:03:40, 3.47s/it]
76%|███████▋ | 16851/22095 [28:46:00<5:10:45, 3.56s/it] {'loss': 0.3386, 'grad_norm': 0.6173278493372766, 'learning_rate': 1.4063911403639392e-06, 'epoch': 0.76}
76%|███████▋ | 16851/22095 [28:46:00<5:10:45, 3.56s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▋ | 16852/22095 [28:46:11<8:28:57, 5.82s/it] {'loss': 0.4656, 'grad_norm': 0.28676641994733726, 'learning_rate': 1.4058815806103542e-06, 'epoch': 0.76}
76%|███████▋ | 16852/22095 [28:46:11<8:28:57, 5.82s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (51685 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99406 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▋ | 16853/22095 [28:46:15<7:54:24, 5.43s/it] {'loss': 0.2728, 'grad_norm': 0.5823508981388617, 'learning_rate': 1.4053720980833357e-06, 'epoch': 0.76}
76%|███████▋ | 16853/22095 [28:46:15<7:54:24, 5.43s/it]
76%|███████▋ | 16854/22095 [28:46:20<7:26:40, 5.11s/it] {'loss': 0.3565, 'grad_norm': 0.6438356399181908, 'learning_rate': 1.4048626927938292e-06, 'epoch': 0.76}
76%|███████▋ | 16854/22095 [28:46:20<7:26:40, 5.11s/it]
76%|███████▋ | 16855/22095 [28:46:24<7:00:29, 4.81s/it] {'loss': 0.2896, 'grad_norm': 0.5933897311378838, 'learning_rate': 1.4043533647527813e-06, 'epoch': 0.76}
76%|███████▋ | 16855/22095 [28:46:24<7:00:29, 4.81s/it]
76%|███████▋ | 16856/22095 [28:46:27<6:27:08, 4.43s/it] {'loss': 0.3016, 'grad_norm': 0.5942110961471518, 'learning_rate': 1.4038441139711384e-06, 'epoch': 0.76}
76%|███████▋ | 16856/22095 [28:46:27<6:27:08, 4.43s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▋ | 16857/22095 [28:46:30<5:55:14, 4.07s/it] {'loss': 0.2777, 'grad_norm': 0.5985865760232018, 'learning_rate': 1.4033349404598407e-06, 'epoch': 0.76}
76%|███████▋ | 16857/22095 [28:46:30<5:55:14, 4.07s/it]
76%|███████▋ | 16858/22095 [28:46:34<5:34:43, 3.83s/it] {'loss': 0.326, 'grad_norm': 0.6502313785103132, 'learning_rate':
1.402825844229827e-06, 'epoch': 0.76}
76%|███████▋ | 16858/22095 [28:46:34<5:34:43, 3.83s/it]
76%|███████▋ | 16859/22095 [28:46:38<5:44:02, 3.94s/it] {'loss': 0.267, 'grad_norm': 0.6101212744528433, 'learning_rate': 1.4023168252920384e-06, 'epoch': 0.76}
76%|███████▋ | 16859/22095 [28:46:38<5:44:02, 3.94s/it]
76%|███████▋ | 16860/22095 [28:46:41<5:28:21, 3.76s/it] {'loss': 0.2968, 'grad_norm': 0.6054417882461244, 'learning_rate': 1.4018078836574134e-06, 'epoch': 0.76}
76%|███████▋ | 16860/22095 [28:46:41<5:28:21, 3.76s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
76%|███████▋ | 16861/22095 [28:46:49<7:05:21, 4.88s/it] {'loss': 0.4793, 'grad_norm': 0.2915904665319847, 'learning_rate': 1.401299019336886e-06, 'epoch': 0.76}
76%|███████▋ | 16861/22095 [28:46:49<7:05:21, 4.88s/it]
76%|███████▋ | 16862/22095 [28:46:52<6:24:35, 4.41s/it] {'loss': 0.2823, 'grad_norm': 0.5737394256906462, 'learning_rate': 1.400790232341388e-06, 'epoch': 0.76}
76%|███████▋ | 16862/22095 [28:46:52<6:24:35, 4.41s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8938306 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61459, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nA. 15cm\nB. 13cm\nC. 11cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▋ | 16863/22095 [28:46:55<5:44:43, 3.95s/it] {'loss': 0.2596, 'grad_norm': 0.6102924639830412, 'learning_rate': 1.4002815226818557e-06, 'epoch': 0.76}
76%|███████▋ | 16863/22095 [28:46:55<5:44:43, 3.95s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (79251 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43907 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48657 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67753 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85467 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43276 > 40960). Running this sequence through the model will result in indexing errors
76%|███████▋ | 16864/22095 [28:46:58<5:21:26, 3.69s/it] {'loss': 0.2871, 'grad_norm': 0.6441089802380752, 'learning_rate': 1.3997728903692164e-06, 'epoch': 0.76}
76%|███████▋ | 16864/22095 [28:46:58<5:21:26, 3.69s/it]
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
76%|███████▋ | 16865/22095 [28:47:02<5:21:53, 3.69s/it] {'loss': 0.2683, 'grad_norm': 0.653370143391403, 'learning_rate': 1.3992643354144013e-06, 'epoch': 0.76}
76%|███████▋ | 16865/22095 [28:47:02<5:21:53, 3.69s/it]
76%|███████▋ | 16866/22095 [28:47:05<5:14:48, 3.61s/it] {'loss': 0.2918, 'grad_norm': 0.8143105617909899, 'learning_rate': 1.3987558578283378e-06, 'epoch': 0.76}
76%|███████▋ | 16866/22095 [28:47:05<5:14:48, 3.61s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8883161 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6314, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 4.5\nB. 7\nC. 2\nD.
2.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 76%|███████▋ | 16867/22095 [28:47:08<4:57:03, 3.41s/it] {'loss': 0.2695, 'grad_norm': 0.6276857224416678, 'learning_rate': 1.3982474576219485e-06, 'epoch': 0.76} 76%|███████▋ | 16867/22095 [28:47:08<4:57:03, 3.41s/it] 76%|███████▋ | 16868/22095 [28:47:12<5:05:18, 3.50s/it] {'loss': 0.2982, 'grad_norm': 0.6342212940275045, 'learning_rate': 1.3977391348061592e-06, 'epoch': 0.76} 76%|███████▋ | 16868/22095 [28:47:12<5:05:18, 3.50s/it] 76%|███████▋ | 16869/22095 [28:47:15<4:54:08, 3.38s/it] {'loss': 0.2849, 'grad_norm': 0.6223834130747216, 'learning_rate': 1.397230889391894e-06, 'epoch': 0.76} 76%|███████▋ | 16869/22095 [28:47:15<4:54:08, 3.38s/it] 76%|███████▋ | 16870/22095 [28:47:18<4:46:24, 3.29s/it] {'loss': 0.2476, 'grad_norm': 0.5577136042915022, 'learning_rate': 1.3967227213900725e-06, 'epoch': 0.76} 76%|███████▋ | 16870/22095 [28:47:18<4:46:24, 3.29s/it] 76%|███████▋ | 16871/22095 [28:47:21<4:44:27, 3.27s/it] {'loss': 0.2798, 'grad_norm': 0.7179559302744081, 'learning_rate': 1.3962146308116109e-06, 'epoch': 0.76} 76%|███████▋ | 16871/22095 [28:47:21<4:44:27, 3.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 76%|███████▋ | 16872/22095 [28:47:31<7:49:17, 5.39s/it] {'loss': 0.4514, 'grad_norm': 0.30340564592253993, 'learning_rate': 1.3957066176674306e-06, 'epoch': 0.76} 76%|███████▋ | 16872/22095 [28:47:31<7:49:17, 5.39s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 76%|███████▋ | 16873/22095 [28:47:36<7:29:30, 5.16s/it] {'loss': 0.3246, 'grad_norm': 0.6339792291596237, 'learning_rate': 1.3951986819684432e-06, 'epoch': 0.76} 76%|███████▋ | 16873/22095 [28:47:36<7:29:30, 5.16s/it] 76%|███████▋ | 16874/22095 [28:47:40<6:57:57, 4.80s/it] {'loss': 0.3045, 'grad_norm': 0.5892116521049253, 'learning_rate': 1.3946908237255668e-06, 'epoch': 0.76} 76%|███████▋ | 
16874/22095 [28:47:40<6:57:57, 4.80s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8388098 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 54913, 'image': 'vrdu_table_final_2/astro-ph.CO/b4f4e09c-a0a2-4dfe-8029-715afd6d2d98.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 76%|███████▋ | 16875/22095 [28:47:43<6:06:33, 4.21s/it] {'loss': 0.2898, 'grad_norm': 0.6261217936668318, 'learning_rate': 1.3941830429497105e-06, 'epoch': 0.76} 76%|███████▋ | 16875/22095 [28:47:43<6:06:33, 4.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9045976 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 6.4cm\nB. 6.8cm\nC. 7cm\nD. 5.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 76%|███████▋ | 16876/22095 [28:47:46<5:43:00, 3.94s/it] {'loss': 0.2698, 'grad_norm': 1.5401022245595641, 'learning_rate': 1.3936753396517877e-06, 'epoch': 0.76} 76%|███████▋ | 16876/22095 [28:47:46<5:43:00, 3.94s/it] 76%|███████▋ | 16877/22095 [28:47:49<5:17:27, 3.65s/it] {'loss': 0.317, 'grad_norm': 0.8747644893541968, 'learning_rate': 1.3931677138427035e-06, 'epoch': 0.76} 76%|███████▋ | 16877/22095 [28:47:49<5:17:27, 3.65s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308192 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB2Ob1Ve_J_SKJjSZPiXXb3LpXa_!!789520595.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否帮忙识别这张图片里的全部文本并告诉我?'}, {'from': 'gpt', 'value': '图中所有文字:\n产品名称;追踪器/BB响\n如图\n毛重;5.70g\n擎宇模型\nQINGYUTECHNOLOGY'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [150, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8472738 in VC:s3://internvl-moe-sft-data/. Exception: Image size [150, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 106233, 'image': 'vrdu_texteq/astro-ph.CO/bfc97853-c644-451b-946a-2506d9c91431.png', 'image_wh': [[150, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $r>0$.'}]} 76%|███████▋ | 16878/22095 [28:47:53<5:13:44, 3.61s/it] {'loss': 0.3333, 'grad_norm': 0.6173016787995003, 'learning_rate': 1.39266016553337e-06, 'epoch': 0.76} 76%|███████▋ | 16878/22095 [28:47:53<5:13:44, 3.61s/it] 76%|███████▋ | 16879/22095 [28:47:56<5:00:43, 3.46s/it] {'loss': 0.2761, 'grad_norm': 0.6037204276293361, 'learning_rate': 1.3921526947346902e-06, 'epoch': 0.76} 76%|███████▋ | 16879/22095 [28:47:56<5:00:43, 3.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74585 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63735 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16880/22095 [28:47:59<4:53:49, 3.38s/it] {'loss': 0.2864, 'grad_norm': 0.6201939894233246, 'learning_rate': 1.3916453014575664e-06, 'epoch': 0.76} 76%|███████▋ | 16880/22095 [28:47:59<4:53:49, 3.38s/it] 76%|███████▋ | 16881/22095 [28:48:02<4:47:48, 3.31s/it] {'loss': 0.2896, 'grad_norm': 0.6081279482261308, 'learning_rate': 1.3911379857129037e-06, 'epoch': 0.76} 76%|███████▋ | 16881/22095 [28:48:02<4:47:48, 3.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 76%|███████▋ | 16882/22095 [28:48:06<4:49:02, 3.33s/it] {'loss': 0.3191, 'grad_norm': 0.5885304306737921, 'learning_rate': 1.3906307475116044e-06, 'epoch': 0.76} 76%|███████▋ | 16882/22095 [28:48:06<4:49:02, 3.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41745 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▋ | 16883/22095 [28:48:08<4:38:19, 3.20s/it] {'loss': 0.3063, 'grad_norm': 0.5702277237570906, 'learning_rate': 1.390123586864562e-06, 'epoch': 0.76} 76%|███████▋ | 16883/22095 [28:48:08<4:38:19, 3.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 76%|███████▋ | 16884/22095 [28:48:12<4:40:45, 3.23s/it] {'loss': 0.3084, 'grad_norm': 0.6062661374592121, 'learning_rate': 1.389616503782677e-06, 'epoch': 0.76} 76%|███████▋ | 16884/22095 [28:48:12<4:40:45, 3.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (63855 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68917 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16885/22095 [28:48:15<4:44:10, 3.27s/it] {'loss': 0.2715, 'grad_norm': 0.5808660659853312, 'learning_rate': 1.389109498276846e-06, 'epoch': 0.76} 76%|███████▋ | 16885/22095 [28:48:15<4:44:10, 3.27s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308182 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2O8Ktm0bJ8KJjy1zjXXaqapXa_!!1116067536.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n识别这图中的文本。'}, {'from': 'gpt', 'value': '图中所有文字:\n更耐热料积木采用的是PP材质塑全新塑料更好玩!\n豪华礼盒装\n有\n收藏\n送收纳盒\n惊喜\n宝贝'}]} Token indices sequence length is longer than the specified maximum sequence length for this model (106526 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▋ | 16886/22095 [28:48:19<4:56:47, 3.42s/it] {'loss': 0.3076, 'grad_norm': 0.6005673175985065, 'learning_rate': 1.388602570357962e-06, 'epoch': 0.76} 76%|███████▋ | 16886/22095 [28:48:19<4:56:47, 3.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56718 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16887/22095 [28:48:23<5:14:59, 3.63s/it] {'loss': 0.2916, 'grad_norm': 0.6122167898907246, 'learning_rate': 1.388095720036916e-06, 'epoch': 0.76} 76%|███████▋ | 16887/22095 [28:48:23<5:14:59, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (45369 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▋ | 16888/22095 [28:48:35<8:46:35, 6.07s/it] {'loss': 0.4631, 'grad_norm': 0.2734031758798873, 'learning_rate': 1.3875889473245996e-06, 'epoch': 0.76} 76%|███████▋ | 16888/22095 [28:48:35<8:46:35, 6.07s/it] 76%|███████▋ | 16889/22095 [28:48:39<7:46:32, 5.38s/it] {'loss': 0.3079, 'grad_norm': 0.6061707662293109, 'learning_rate': 1.3870822522319039e-06, 'epoch': 0.76} 76%|███████▋ | 16889/22095 [28:48:39<7:46:32, 5.38s/it] 76%|███████▋ | 16890/22095 [28:48:43<7:13:29, 5.00s/it] {'loss': 0.2763, 'grad_norm': 0.6162602129563666, 'learning_rate': 1.386575634769714e-06, 'epoch': 0.76} 76%|███████▋ | 16890/22095 [28:48:43<7:13:29, 5.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 76%|███████▋ | 16891/22095 [28:48:50<8:02:57, 5.57s/it] {'loss': 0.4764, 'grad_norm': 0.3073469359186612, 'learning_rate': 1.3860690949489141e-06, 'epoch': 0.76} 76%|███████▋ | 16891/22095 [28:48:50<8:02:57, 5.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81382 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16892/22095 [28:48:54<7:28:26, 5.17s/it] {'loss': 0.2817, 'grad_norm': 0.5780747748601645, 'learning_rate': 1.3855626327803923e-06, 'epoch': 0.76} 76%|███████▋ | 16892/22095 [28:48:54<7:28:26, 5.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 76%|███████▋ | 16893/22095 [28:49:03<9:12:16, 6.37s/it] {'loss': 0.4591, 'grad_norm': 0.26548664320126003, 'learning_rate': 1.385056248275027e-06, 'epoch': 0.76} 76%|███████▋ | 16893/22095 [28:49:03<9:12:16, 6.37s/it] 76%|███████▋ | 16894/22095 [28:49:07<8:04:59, 5.60s/it] {'loss': 0.2995, 'grad_norm': 0.6425476158887657, 'learning_rate': 1.3845499414437013e-06, 'epoch': 0.76} 76%|███████▋ | 16894/22095 [28:49:07<8:04:59, 5.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (56580 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104743 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16895/22095 [28:49:16<9:44:44, 6.75s/it] {'loss': 0.4742, 'grad_norm': 0.2543732806748501, 'learning_rate': 1.384043712297294e-06, 'epoch': 0.76} 76%|███████▋ | 16895/22095 [28:49:16<9:44:44, 6.75s/it] 76%|███████▋ | 16896/22095 [28:49:25<10:44:17, 7.44s/it] {'loss': 0.4554, 'grad_norm': 0.27888961787907973, 'learning_rate': 1.38353756084668e-06, 'epoch': 0.76} 76%|███████▋ | 16896/22095 [28:49:25<10:44:17, 7.44s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 76%|███████▋ | 16897/22095 [28:49:29<9:20:50, 6.47s/it] {'loss': 0.322, 'grad_norm': 0.6126119132895413, 'learning_rate': 1.3830314871027367e-06, 'epoch': 0.76} 76%|███████▋ | 16897/22095 [28:49:29<9:20:50, 6.47s/it] 76%|███████▋ | 16898/22095 [28:49:33<8:14:26, 5.71s/it] {'loss': 0.2992, 'grad_norm': 0.6800039236791499, 'learning_rate': 1.3825254910763396e-06, 'epoch': 0.76} 76%|███████▋ | 16898/22095 [28:49:33<8:14:26, 5.71s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 53, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398866 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 53, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 1018, 'image': 'vrdu_table_final_2/astro-ph.CO/13c136f6-e7d3-45f8-9220-71b8458b132d.png', 'image_wh': [[14, 53]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}{#2}\\\\{#3}\\end{tabular}\n```"}]} 76%|███████▋ | 16899/22095 [28:49:36<7:04:39, 4.90s/it] {'loss': 0.3103, 'grad_norm': 0.6701975701935918, 'learning_rate': 1.3820195727783597e-06, 'epoch': 0.76} 76%|███████▋ | 16899/22095 [28:49:36<7:04:39, 4.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (72972 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55194 > 40960). Running this sequence through the model will result in indexing errors 76%|███████▋ | 16900/22095 [28:49:45<8:47:01, 6.09s/it] {'loss': 0.4648, 'grad_norm': 0.2552845699044317, 'learning_rate': 1.3815137322196654e-06, 'epoch': 0.76} 76%|███████▋ | 16900/22095 [28:49:45<8:47:01, 6.09s/it] 76%|███████▋ | 16901/22095 [28:49:50<8:22:17, 5.80s/it] {'loss': 0.2839, 'grad_norm': 0.665911728128072, 'learning_rate': 1.3810079694111295e-06, 'epoch': 0.76} 76%|███████▋ | 16901/22095 [28:49:50<8:22:17, 5.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69268 > 40960). 
Running this sequence through the model will result in indexing errors 76%|███████▋ | 16902/22095 [28:49:54<7:19:01, 5.07s/it] {'loss': 0.3057, 'grad_norm': 0.7525189007952864, 'learning_rate': 1.3805022843636162e-06, 'epoch': 0.76} 76%|███████▋ | 16902/22095 [28:49:54<7:19:01, 5.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16903/22095 [28:50:02<8:49:58, 6.12s/it] {'loss': 0.455, 'grad_norm': 0.2652382252083221, 'learning_rate': 1.3799966770879936e-06, 'epoch': 0.77} 77%|███████▋ | 16903/22095 [28:50:02<8:49:58, 6.12s/it] 77%|███████▋ | 16904/22095 [28:50:06<7:45:25, 5.38s/it] {'loss': 0.3246, 'grad_norm': 0.6639683051494911, 'learning_rate': 1.3794911475951229e-06, 'epoch': 0.77} 77%|███████▋ | 16904/22095 [28:50:06<7:45:25, 5.38s/it] 77%|███████▋ | 16905/22095 [28:50:09<6:42:03, 4.65s/it] {'loss': 0.2624, 'grad_norm': 0.6142888855102578, 'learning_rate': 1.3789856958958692e-06, 'epoch': 0.77} 77%|███████▋ | 16905/22095 [28:50:09<6:42:03, 4.65s/it] 77%|███████▋ | 16906/22095 [28:50:12<6:03:57, 4.21s/it] {'loss': 0.299, 'grad_norm': 0.6536807988841115, 'learning_rate': 1.3784803220010906e-06, 'epoch': 0.77} 77%|███████▋ | 16906/22095 [28:50:12<6:03:57, 4.21s/it] 77%|███████▋ | 16907/22095 [28:50:15<5:27:57, 3.79s/it] {'loss': 0.2996, 'grad_norm': 0.6210388750058032, 'learning_rate': 1.3779750259216484e-06, 'epoch': 0.77} 77%|███████▋ | 16907/22095 [28:50:15<5:27:57, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42925 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16908/22095 [28:50:24<7:56:03, 5.51s/it] {'loss': 0.4469, 'grad_norm': 0.24744556272978027, 'learning_rate': 1.377469807668399e-06, 'epoch': 0.77} 77%|███████▋ | 16908/22095 [28:50:24<7:56:03, 5.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307474 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2356HkgfH8KJjy1zcXXcTzpXa_!!3539334190.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nConvert the text in this image to written form.'}, {'from': 'gpt', 'value': 'All words in the image:\n尚动\nSHANGDONG\n行\n心\n动\n动\n360°\n全方位自由旋转\n冷热龙头\n拒绝生锈\n有质检报告\n形象代言\n39.\n起\n!\n8\n圣诞领券下单\n李湘\n尚动\n李湘'}]} 77%|███████▋ | 16909/22095 [28:50:28<7:15:55, 5.04s/it] {'loss': 0.2728, 'grad_norm': 0.5780801130034201, 'learning_rate': 1.3769646672521964e-06, 'epoch': 0.77} 77%|███████▋ | 16909/22095 [28:50:28<7:15:55, 5.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16910/22095 [28:50:38<9:12:43, 6.40s/it] {'loss': 0.5005, 'grad_norm': 0.2845392317576298, 'learning_rate': 1.3764596046838951e-06, 'epoch': 0.77} 77%|███████▋ | 16910/22095 [28:50:38<9:12:43, 6.40s/it] 77%|███████▋ | 16911/22095 [28:50:41<7:53:46, 5.48s/it] {'loss': 0.2606, 'grad_norm': 0.6439653211486774, 'learning_rate': 1.3759546199743518e-06, 'epoch': 0.77} 77%|███████▋ | 16911/22095 
[28:50:41<7:53:46, 5.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8902703 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 25856, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 8cm\nB. 9cm\nC. 4cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 77%|███████▋ | 16912/22095 [28:50:44<6:48:46, 4.73s/it] {'loss': 0.2423, 'grad_norm': 0.6052871614791017, 'learning_rate': 1.3754497131344097e-06, 'epoch': 0.77} 77%|███████▋ | 16912/22095 [28:50:44<6:48:46, 4.73s/it] 77%|███████▋ | 16913/22095 [28:50:48<6:13:29, 4.32s/it] {'loss': 0.289, 'grad_norm': 0.6858402042648036, 'learning_rate': 1.3749448841749213e-06, 'epoch': 0.77} 77%|███████▋ | 16913/22095 [28:50:48<6:13:29, 4.32s/it] 77%|███████▋ | 16914/22095 [28:50:51<5:41:32, 3.96s/it] {'loss': 0.321, 'grad_norm': 0.6380469222281904, 'learning_rate': 1.3744401331067358e-06, 'epoch': 0.77} 77%|███████▋ | 16914/22095 [28:50:51<5:41:32, 3.96s/it] 77%|███████▋ | 16915/22095 [28:50:55<5:39:01, 3.93s/it] {'loss': 0.2968, 'grad_norm': 0.6318465111490466, 'learning_rate': 1.3739354599406969e-06, 'epoch': 0.77} 77%|███████▋ | 16915/22095 [28:50:55<5:39:01, 3.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified 
maximum sequence length for this model (56150 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49416 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89516 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56119 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 16916/22095 [28:51:00<6:13:17, 4.32s/it] {'loss': 0.4759, 'grad_norm': 0.27806466123117685, 'learning_rate': 1.373430864687646e-06, 'epoch': 0.77} 77%|███████▋ | 16916/22095 [28:51:00<6:13:17, 4.32s/it] 77%|███████▋ | 16917/22095 [28:51:03<5:52:41, 4.09s/it] {'loss': 0.2649, 'grad_norm': 0.559100606385692, 'learning_rate': 1.3729263473584281e-06, 'epoch': 0.77} 77%|███████▋ | 16917/22095 [28:51:03<5:52:41, 4.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60754 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69710 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16918/22095 [28:51:06<5:23:28, 3.75s/it] {'loss': 0.2895, 'grad_norm': 0.5835671908916155, 'learning_rate': 1.372421907963885e-06, 'epoch': 0.77} 77%|███████▋ | 16918/22095 [28:51:06<5:23:28, 3.75s/it] 77%|███████▋ | 16919/22095 [28:51:11<5:37:43, 3.91s/it] {'loss': 0.3081, 'grad_norm': 0.5678136911278053, 'learning_rate': 1.3719175465148538e-06, 'epoch': 0.77} 77%|███████▋ | 16919/22095 [28:51:11<5:37:43, 3.91s/it] 77%|███████▋ | 16920/22095 [28:51:15<5:37:52, 3.92s/it] {'loss': 0.3243, 'grad_norm': 0.6582775454018125, 'learning_rate': 1.3714132630221699e-06, 'epoch': 0.77} 77%|███████▋ | 16920/22095 [28:51:15<5:37:52, 3.92s/it] 77%|███████▋ | 16921/22095 [28:51:18<5:22:24, 3.74s/it] {'loss': 0.288, 'grad_norm': 0.6572142174436094, 'learning_rate': 1.3709090574966726e-06, 'epoch': 0.77} 77%|███████▋ | 16921/22095 [28:51:18<5:22:24, 3.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16922/22095 [28:51:22<5:25:38, 3.78s/it] {'loss': 0.2615, 'grad_norm': 0.5759000279508841, 'learning_rate': 1.3704049299491923e-06, 'epoch': 0.77} 77%|███████▋ | 16922/22095 [28:51:22<5:25:38, 3.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16923/22095 [28:51:25<5:06:54, 3.56s/it] {'loss': 0.2983, 'grad_norm': 0.9477769156689875, 'learning_rate': 1.3699008803905633e-06, 'epoch': 0.77} 77%|███████▋ | 16923/22095 [28:51:25<5:06:54, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16924/22095 [28:51:34<7:39:42, 5.33s/it] {'loss': 0.4754, 'grad_norm': 0.28476611007507796, 'learning_rate': 1.369396908831616e-06, 'epoch': 0.77} 77%|███████▋ | 16924/22095 [28:51:34<7:39:42, 5.33s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047788 in VC:s3://multi-modal/UniGeo/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 77%|███████▋ | 16925/22095 [28:51:38<7:04:31, 4.93s/it] {'loss': 0.3237, 'grad_norm': 0.5858814659736541, 'learning_rate': 1.368893015283177e-06, 'epoch': 0.77} 77%|███████▋ | 16925/22095 [28:51:38<7:04:31, 4.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16926/22095 [28:51:48<8:58:18, 6.25s/it] {'loss': 0.4561, 'grad_norm': 0.2752287362306891, 'learning_rate': 1.368389199756075e-06, 'epoch': 0.77} 77%|███████▋ | 16926/22095 [28:51:48<8:58:18, 6.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41272 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16927/22095 [28:51:52<7:59:18, 5.56s/it] {'loss': 0.2647, 'grad_norm': 0.8509358285265931, 'learning_rate': 1.3678854622611371e-06, 'epoch': 0.77} 77%|███████▋ | 16927/22095 [28:51:52<7:59:18, 5.56s/it] 77%|███████▋ | 16928/22095 [28:51:55<7:03:14, 4.91s/it] {'loss': 0.2603, 'grad_norm': 0.5731027176137438, 'learning_rate': 1.367381802809185e-06, 'epoch': 0.77} 77%|███████▋ | 16928/22095 [28:51:55<7:03:14, 4.91s/it] 77%|███████▋ | 16929/22095 [28:51:58<6:23:50, 4.46s/it] {'loss': 0.3007, 'grad_norm': 0.6125266417761187, 'learning_rate': 1.3668782214110404e-06, 'epoch': 0.77} 77%|███████▋ | 16929/22095 [28:51:58<6:23:50, 4.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42851 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47959 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91836 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51180 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16930/22095 [28:52:01<5:39:11, 3.94s/it] {'loss': 0.3023, 'grad_norm': 0.6363985154525227, 'learning_rate': 1.3663747180775238e-06, 'epoch': 0.77} 77%|███████▋ | 16930/22095 [28:52:01<5:39:11, 3.94s/it] 77%|███████▋ | 16931/22095 [28:52:05<5:32:59, 3.87s/it] {'loss': 0.2762, 'grad_norm': 0.5990897495045766, 'learning_rate': 1.3658712928194567e-06, 'epoch': 0.77} 77%|███████▋ | 16931/22095 [28:52:05<5:32:59, 3.87s/it] 77%|███████▋ | 16932/22095 [28:52:08<5:25:38, 3.78s/it] {'loss': 0.3044, 'grad_norm': 0.5934237035423845, 'learning_rate': 1.3653679456476536e-06, 'epoch': 0.77} 77%|███████▋ | 16932/22095 [28:52:08<5:25:38, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46128 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64743 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 16933/22095 [28:52:12<5:24:48, 3.78s/it] {'loss': 0.334, 'grad_norm': 0.7283000284782619, 'learning_rate': 1.3648646765729295e-06, 'epoch': 0.77} 77%|███████▋ | 16933/22095 [28:52:12<5:24:48, 3.78s/it] 77%|███████▋ | 16934/22095 [28:52:15<5:04:43, 3.54s/it] {'loss': 0.2992, 'grad_norm': 0.609175602263638, 'learning_rate': 1.3643614856061005e-06, 'epoch': 0.77} 77%|███████▋ | 16934/22095 [28:52:15<5:04:43, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43834 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77437 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16935/22095 [28:52:21<6:07:34, 4.27s/it] {'loss': 0.4318, 'grad_norm': 0.2539648191964789, 'learning_rate': 1.3638583727579752e-06, 'epoch': 0.77} 77%|███████▋ | 16935/22095 [28:52:21<6:07:34, 4.27s/it] 77%|███████▋ | 16936/22095 [28:52:25<5:53:12, 4.11s/it] {'loss': 0.3037, 'grad_norm': 0.6069518858266917, 'learning_rate': 1.3633553380393677e-06, 'epoch': 0.77} 77%|███████▋ | 16936/22095 [28:52:25<5:53:12, 4.11s/it] 77%|███████▋ | 16937/22095 [28:52:29<5:47:34, 4.04s/it] {'loss': 0.3155, 'grad_norm': 0.5627562186944901, 'learning_rate': 1.362852381461085e-06, 'epoch': 0.77} 77%|███████▋ | 16937/22095 [28:52:29<5:47:34, 4.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16938/22095 [28:52:39<8:17:24, 5.79s/it] {'loss': 0.4702, 'grad_norm': 0.26406261763895233, 'learning_rate': 1.3623495030339323e-06, 'epoch': 0.77} 77%|███████▋ | 16938/22095 [28:52:39<8:17:24, 5.79s/it] 77%|███████▋ | 16939/22095 [28:52:42<7:27:54, 5.21s/it] {'loss': 0.315, 'grad_norm': 0.7349242877482646, 'learning_rate': 1.3618467027687165e-06, 'epoch': 0.77} 77%|███████▋ | 16939/22095 [28:52:42<7:27:54, 5.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [81, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8333542 in VC:s3://internvl-moe-sft-data/. Exception: Image size [81, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 150, 'image': 'vrdu_table_final_2/astro-ph.CO/4fdbdc30-953e-4dc1-9c7c-fda218db5285.png', 'image_wh': [[81, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{l}MMF1\\end{tabular}\n```'}]} /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16940/22095 [28:52:46<6:32:43, 4.57s/it] {'loss': 0.2913, 'grad_norm': 0.6108496294096064, 'learning_rate': 1.3613439806762447e-06, 'epoch': 0.77} 77%|███████▋ | 16940/22095 [28:52:46<6:32:43, 4.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8924512 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 47665, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nA. 3cm\nB. 4cm\nC. 6cm\nD. 
2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30505.png 2025-08-28 20:50:41.215401 load time: 1388.45 ms 77%|███████▋ | 16941/22095 [28:52:55<8:38:13, 6.03s/it] {'loss': 0.4662, 'grad_norm': 0.2741832860992178, 'learning_rate': 1.3608413367673123e-06, 'epoch': 0.77} 77%|███████▋ | 16941/22095 [28:52:55<8:38:13, 6.03s/it] 77%|███████▋ | 16942/22095 [28:52:59<7:50:27, 5.48s/it] {'loss': 0.2901, 'grad_norm': 0.6248896759210308, 'learning_rate': 1.3603387710527228e-06, 'epoch': 0.77} 77%|███████▋ | 16942/22095 [28:52:59<7:50:27, 5.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16943/22095 [28:53:07<9:02:14, 6.31s/it] {'loss': 0.4719, 'grad_norm': 0.2945483709669181, 'learning_rate': 1.359836283543276e-06, 'epoch': 0.77} 77%|███████▋ | 16943/22095 [28:53:07<9:02:14, 6.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46756 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44748 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16944/22095 [28:53:11<8:02:46, 5.62s/it] {'loss': 0.2947, 'grad_norm': 0.6619790012059299, 'learning_rate': 1.3593338742497675e-06, 'epoch': 0.77} 77%|███████▋ | 16944/22095 [28:53:11<8:02:46, 5.62s/it] 77%|███████▋ | 16945/22095 [28:53:16<7:35:48, 5.31s/it] {'loss': 0.2854, 'grad_norm': 0.6236601589678317, 'learning_rate': 1.3588315431829913e-06, 'epoch': 0.77} 77%|███████▋ | 16945/22095 [28:53:16<7:35:48, 5.31s/it] 77%|███████▋ | 16946/22095 [28:53:20<6:51:48, 4.80s/it] {'loss': 0.2866, 'grad_norm': 0.7724796575015, 'learning_rate': 1.3583292903537427e-06, 'epoch': 0.77} 77%|███████▋ | 16946/22095 [28:53:20<6:51:48, 4.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45125 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93599 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16947/22095 [28:53:23<6:09:27, 4.31s/it] {'loss': 0.2853, 'grad_norm': 1.0154723266293544, 'learning_rate': 1.357827115772814e-06, 'epoch': 0.77} 77%|███████▋ | 16947/22095 [28:53:23<6:09:27, 4.31s/it] 77%|███████▋ | 16948/22095 [28:53:26<5:52:41, 4.11s/it] {'loss': 0.3171, 'grad_norm': 0.7038822773492349, 'learning_rate': 1.3573250194509946e-06, 'epoch': 0.77} 77%|███████▋ | 16948/22095 [28:53:26<5:52:41, 4.11s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 89, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365494 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 89, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 32235, 'image': 'vrdu_table_final_2/astro-ph.CO/4a985ceb-784a-4ea8-91a4-dcd9c43681b4.png', 'image_wh': [[23, 89]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n $A$\\\\\n $B$\\\\\n $C$\\\\\n \\end{tabular}\n```"}]} 77%|███████▋ | 16949/22095 [28:53:31<5:58:10, 4.18s/it] {'loss': 0.265, 'grad_norm': 0.5762294358474576, 'learning_rate': 1.3568230013990713e-06, 'epoch': 0.77} 77%|███████▋ | 16949/22095 [28:53:31<5:58:10, 4.18s/it] 77%|███████▋ | 16950/22095 [28:53:34<5:47:05, 4.05s/it] {'loss': 0.2564, 'grad_norm': 0.5917370820067537, 'learning_rate': 1.3563210616278345e-06, 'epoch': 0.77} 77%|███████▋ | 16950/22095 [28:53:35<5:47:05, 4.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16951/22095 [28:53:43<7:49:30, 5.48s/it] {'loss': 0.4499, 'grad_norm': 0.277251790379558, 'learning_rate': 1.3558192001480652e-06, 'epoch': 0.77} 77%|███████▋ | 16951/22095 [28:53:43<7:49:30, 5.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (97441 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85205 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46002 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41057 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52658 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56450 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59434 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 16952/22095 [28:53:49<8:06:44, 5.68s/it] {'loss': 0.4531, 'grad_norm': 0.26052497713959355, 'learning_rate': 1.3553174169705507e-06, 'epoch': 0.77} 77%|███████▋ | 16952/22095 [28:53:49<8:06:44, 5.68s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 77%|███████▋ | 16953/22095 [28:53:53<7:06:26, 4.98s/it] {'loss': 0.2812, 'grad_norm': 0.6220368259883828, 'learning_rate': 1.3548157121060718e-06, 'epoch': 0.77} 77%|███████▋ | 16953/22095 [28:53:53<7:06:26, 4.98s/it] 77%|███████▋ | 16954/22095 [28:53:57<6:34:33, 4.60s/it] {'loss': 0.257, 'grad_norm': 0.5294918508547679, 'learning_rate': 1.3543140855654058e-06, 'epoch': 0.77} 77%|███████▋ | 16954/22095 [28:53:57<6:34:33, 4.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16955/22095 [28:54:06<8:43:28, 6.11s/it] {'loss': 0.4716, 'grad_norm': 0.2807578134342027, 'learning_rate': 1.3538125373593335e-06, 'epoch': 0.77} 77%|███████▋ | 16955/22095 [28:54:06<8:43:28, 6.11s/it] 77%|███████▋ | 16956/22095 [28:54:10<7:33:04, 5.29s/it] {'loss': 0.2835, 'grad_norm': 0.5845163546613862, 'learning_rate': 1.3533110674986327e-06, 'epoch': 0.77} 77%|███████▋ | 16956/22095 [28:54:10<7:33:04, 5.29s/it] 77%|███████▋ | 16957/22095 [28:54:13<6:44:19, 4.72s/it] {'loss': 0.3021, 'grad_norm': 0.5872242276041237, 'learning_rate': 
1.3528096759940768e-06, 'epoch': 0.77} 77%|███████▋ | 16957/22095 [28:54:13<6:44:19, 4.72s/it] 77%|███████▋ | 16958/22095 [28:54:16<6:10:55, 4.33s/it] {'loss': 0.291, 'grad_norm': 0.6504178575741542, 'learning_rate': 1.3523083628564388e-06, 'epoch': 0.77} 77%|███████▋ | 16958/22095 [28:54:16<6:10:55, 4.33s/it] 77%|███████▋ | 16959/22095 [28:54:20<5:55:43, 4.16s/it] {'loss': 0.2885, 'grad_norm': 0.6391989446126154, 'learning_rate': 1.3518071280964901e-06, 'epoch': 0.77} 77%|███████▋ | 16959/22095 [28:54:20<5:55:43, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16960/22095 [28:54:29<8:06:04, 5.68s/it] {'loss': 0.4722, 'grad_norm': 0.2884959672495194, 'learning_rate': 1.3513059717250037e-06, 'epoch': 0.77} 77%|███████▋ | 16960/22095 [28:54:29<8:06:04, 5.68s/it] 77%|███████▋ | 16961/22095 [28:54:33<7:09:57, 5.02s/it] {'loss': 0.3074, 'grad_norm': 0.569592458656659, 'learning_rate': 1.3508048937527458e-06, 'epoch': 0.77} 77%|███████▋ | 16961/22095 [28:54:33<7:09:57, 5.02s/it] 77%|███████▋ | 16962/22095 [28:54:36<6:29:48, 4.56s/it] {'loss': 0.3544, 'grad_norm': 0.6032703298178237, 'learning_rate': 1.3503038941904818e-06, 'epoch': 0.77} 77%|███████▋ | 16962/22095 [28:54:36<6:29:48, 4.56s/it] 77%|███████▋ | 16963/22095 [28:54:40<6:08:12, 4.30s/it] {'loss': 0.3116, 'grad_norm': 0.5832410428674883, 'learning_rate': 1.3498029730489793e-06, 'epoch': 0.77} 77%|███████▋ | 16963/22095 [28:54:40<6:08:12, 4.30s/it] 77%|███████▋ | 16964/22095 [28:54:43<5:42:05, 4.00s/it] {'loss': 0.2951, 'grad_norm': 0.6220872943704416, 'learning_rate': 1.3493021303389985e-06, 'epoch': 0.77} 77%|███████▋ | 16964/22095 [28:54:43<5:42:05, 4.00s/it] 77%|███████▋ | 16965/22095 [28:54:47<5:30:42, 3.87s/it] {'loss': 0.3117, 'grad_norm': 0.89794621665398, 'learning_rate': 1.348801366071304e-06, 'epoch': 0.77} 77%|███████▋ | 16965/22095 [28:54:47<5:30:42, 3.87s/it] 77%|███████▋ | 16966/22095 [28:54:50<5:19:56, 3.74s/it] {'loss': 0.311, 'grad_norm': 
0.6600777353123349, 'learning_rate': 1.3483006802566546e-06, 'epoch': 0.77} 77%|███████▋ | 16966/22095 [28:54:50<5:19:56, 3.74s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16967/22095 [28:54:54<5:22:27, 3.77s/it] {'loss': 0.3076, 'grad_norm': 0.6373080824566424, 'learning_rate': 1.3478000729058065e-06, 'epoch': 0.77} 77%|███████▋ | 16967/22095 [28:54:54<5:22:27, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16968/22095 [28:55:02<7:14:03, 5.08s/it] {'loss': 0.47, 'grad_norm': 0.2797095518192231, 'learning_rate': 1.3472995440295183e-06, 'epoch': 0.77} 77%|███████▋ | 16968/22095 [28:55:02<7:14:03, 5.08s/it] 77%|███████▋ | 16969/22095 [28:55:06<6:44:22, 4.73s/it] {'loss': 0.2976, 'grad_norm': 0.6633517189792275, 'learning_rate': 1.3467990936385478e-06, 'epoch': 0.77} 77%|███████▋ | 16969/22095 [28:55:06<6:44:22, 4.73s/it] 77%|███████▋ | 16970/22095 [28:55:10<6:23:06, 4.49s/it] {'loss': 0.3041, 'grad_norm': 0.6534031456071727, 'learning_rate': 1.3462987217436412e-06, 'epoch': 0.77} 77%|███████▋ | 16970/22095 [28:55:10<6:23:06, 4.49s/it] 77%|███████▋ | 16971/22095 [28:55:13<5:50:14, 4.10s/it] {'loss': 0.3258, 'grad_norm': 0.6053512025354313, 'learning_rate': 1.3457984283555536e-06, 'epoch': 0.77} 77%|███████▋ | 16971/22095 [28:55:13<5:50:14, 4.10s/it] 77%|███████▋ | 16972/22095 [28:55:17<5:39:38, 3.98s/it] {'loss': 0.3381, 'grad_norm': 0.5653073402346999, 'learning_rate': 1.345298213485035e-06, 'epoch': 0.77} 77%|███████▋ | 16972/22095 [28:55:17<5:39:38, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69363 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16973/22095 [28:55:20<5:26:12, 3.82s/it] {'loss': 0.2806, 'grad_norm': 0.5950416465028642, 'learning_rate': 1.344798077142836e-06, 'epoch': 0.77} 77%|███████▋ | 16973/22095 [28:55:20<5:26:12, 3.82s/it] 77%|███████▋ | 16974/22095 [28:55:23<5:01:24, 3.53s/it] {'loss': 0.2696, 'grad_norm': 0.6699224940747642, 'learning_rate': 1.3442980193396976e-06, 'epoch': 0.77} 77%|███████▋ | 16974/22095 [28:55:23<5:01:24, 3.53s/it] 77%|███████▋ | 16975/22095 [28:55:27<4:56:19, 3.47s/it] {'loss': 0.3147, 'grad_norm': 0.5930106756206643, 'learning_rate': 1.3437980400863671e-06, 'epoch': 0.77} 77%|███████▋ | 16975/22095 [28:55:27<4:56:19, 3.47s/it] 77%|███████▋ | 16976/22095 [28:55:30<4:45:13, 3.34s/it] {'loss': 0.3014, 'grad_norm': 0.6205814856898128, 'learning_rate': 1.3432981393935885e-06, 'epoch': 0.77} 77%|███████▋ | 16976/22095 [28:55:30<4:45:13, 3.34s/it] 77%|███████▋ | 16977/22095 [28:55:33<4:43:17, 3.32s/it] {'loss': 0.3099, 'grad_norm': 0.6479243058495224, 'learning_rate': 1.3427983172721026e-06, 'epoch': 0.77} 77%|███████▋ | 16977/22095 [28:55:33<4:43:17, 3.32s/it] 77%|███████▋ | 16978/22095 [28:55:36<4:47:56, 3.38s/it] {'loss': 0.2738, 'grad_norm': 0.6434276617349629, 'learning_rate': 1.3422985737326471e-06, 'epoch': 0.77} 77%|███████▋ | 16978/22095 [28:55:36<4:47:56, 3.38s/it] 77%|███████▋ | 16979/22095 [28:55:40<4:45:00, 3.34s/it] {'loss': 0.2903, 'grad_norm': 0.6253202570490182, 'learning_rate': 1.3417989087859628e-06, 'epoch': 0.77} 77%|███████▋ | 16979/22095 [28:55:40<4:45:00, 3.34s/it] 77%|███████▋ | 16980/22095 [28:55:43<4:36:47, 3.25s/it] {'loss': 0.2788, 'grad_norm': 0.6205806187921971, 'learning_rate': 1.3412993224427834e-06, 'epoch': 0.77} 77%|███████▋ | 16980/22095 [28:55:43<4:36:47, 3.25s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16981/22095 [28:55:51<6:42:15, 4.72s/it] {'loss': 0.4619, 'grad_norm': 0.28114508109508995, 
'learning_rate': 1.3407998147138462e-06, 'epoch': 0.77} 77%|███████▋ | 16981/22095 [28:55:51<6:42:15, 4.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64684 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78350 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94205 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 16982/22095 [28:55:55<6:36:11, 4.65s/it] {'loss': 0.2637, 'grad_norm': 0.6156844052315181, 'learning_rate': 1.3403003856098823e-06, 'epoch': 0.77} 77%|███████▋ | 16982/22095 [28:55:55<6:36:11, 4.65s/it] 77%|███████▋ | 16983/22095 [28:56:00<6:23:57, 4.51s/it] {'loss': 0.3111, 'grad_norm': 0.5597348557261431, 'learning_rate': 1.339801035141622e-06, 'epoch': 0.77} 77%|███████▋ | 16983/22095 [28:56:00<6:23:57, 4.51s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16984/22095 [28:56:08<7:57:19, 5.60s/it] {'loss': 0.4477, 'grad_norm': 0.2478413630444913, 'learning_rate': 1.3393017633197958e-06, 'epoch': 0.77} 77%|███████▋ | 16984/22095 [28:56:08<7:57:19, 5.60s/it] 77%|███████▋ | 16985/22095 [28:56:12<7:32:23, 5.31s/it] {'loss': 0.2989, 'grad_norm': 0.6558781901080365, 'learning_rate': 1.3388025701551339e-06, 'epoch': 0.77} 77%|███████▋ | 16985/22095 [28:56:12<7:32:23, 5.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45370 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16986/22095 [28:56:16<6:52:58, 4.85s/it] {'loss': 0.2987, 'grad_norm': 0.6440251359091363, 'learning_rate': 1.3383034556583596e-06, 'epoch': 0.77} 77%|███████▋ | 16986/22095 [28:56:16<6:52:58, 4.85s/it] 77%|███████▋ | 16987/22095 [28:56:19<6:06:14, 4.30s/it] {'loss': 0.3121, 'grad_norm': 0.646678875565266, 'learning_rate': 1.3378044198401963e-06, 'epoch': 0.77} 77%|███████▋ | 16987/22095 [28:56:19<6:06:14, 4.30s/it] 77%|███████▋ | 16988/22095 [28:56:23<5:55:27, 4.18s/it] {'loss': 0.2762, 'grad_norm': 0.596718918220146, 'learning_rate': 1.337305462711369e-06, 'epoch': 0.77} 77%|███████▋ | 16988/22095 [28:56:23<5:55:27, 4.18s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16989/22095 [28:56:26<5:27:43, 3.85s/it] {'loss': 0.2831, 'grad_norm': 0.6003635530165137, 'learning_rate': 1.3368065842825994e-06, 'epoch': 0.77} 77%|███████▋ | 16989/22095 [28:56:26<5:27:43, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16990/22095 [28:56:36<7:50:20, 5.53s/it] {'loss': 0.4325, 'grad_norm': 0.25469878483264125, 'learning_rate': 1.3363077845646056e-06, 'epoch': 0.77} 77%|███████▋ | 16990/22095 [28:56:36<7:50:20, 5.53s/it] 77%|███████▋ | 16991/22095 [28:56:39<6:50:47, 4.83s/it] {'loss': 0.244, 'grad_norm': 0.5622020283876651, 'learning_rate': 1.3358090635681043e-06, 'epoch': 0.77} 77%|███████▋ | 16991/22095 [28:56:39<6:50:47, 4.83s/it] 77%|███████▋ | 16992/22095 [28:56:42<6:07:01, 4.32s/it] {'loss': 0.2878, 'grad_norm': 0.6387750462435204, 'learning_rate': 1.335310421303813e-06, 'epoch': 0.77} 77%|███████▋ | 16992/22095 [28:56:42<6:07:01, 4.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 16993/22095 [28:56:46<5:52:50, 4.15s/it] {'loss': 0.2549, 'grad_norm': 0.7494803556084741, 
'learning_rate': 1.3348118577824448e-06, 'epoch': 0.77} 77%|███████▋ | 16993/22095 [28:56:46<5:52:50, 4.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 16994/22095 [28:56:55<8:09:26, 5.76s/it] {'loss': 0.4734, 'grad_norm': 0.2880711269974684, 'learning_rate': 1.3343133730147144e-06, 'epoch': 0.77} 77%|███████▋ | 16994/22095 [28:56:55<8:09:26, 5.76s/it] 77%|███████▋ | 16995/22095 [28:56:58<7:00:17, 4.94s/it] {'loss': 0.2833, 'grad_norm': 0.7187617476015365, 'learning_rate': 1.3338149670113314e-06, 'epoch': 0.77} 77%|███████▋ | 16995/22095 [28:56:58<7:00:17, 4.94s/it] 77%|███████▋ | 16996/22095 [28:57:01<6:07:25, 4.32s/it] {'loss': 0.2975, 'grad_norm': 0.6144067884129695, 'learning_rate': 1.3333166397830033e-06, 'epoch': 0.77} 77%|███████▋ | 16996/22095 [28:57:01<6:07:25, 4.32s/it] 77%|███████▋ | 16997/22095 [28:57:04<5:36:28, 3.96s/it] {'loss': 0.2951, 'grad_norm': 0.6488198429125233, 'learning_rate': 1.3328183913404396e-06, 'epoch': 0.77} 77%|███████▋ | 16997/22095 [28:57:04<5:36:28, 3.96s/it] 77%|███████▋ | 16998/22095 [28:57:07<5:13:34, 3.69s/it] {'loss': 0.2879, 'grad_norm': 0.6572369985639023, 'learning_rate': 1.3323202216943488e-06, 'epoch': 0.77} 77%|███████▋ | 16998/22095 [28:57:07<5:13:34, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (129772 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 16999/22095 [28:57:11<5:16:34, 3.73s/it] {'loss': 0.297, 'grad_norm': 0.5623155598704372, 'learning_rate': 1.3318221308554287e-06, 'epoch': 0.77} 77%|███████▋ | 16999/22095 [28:57:11<5:16:34, 3.73s/it] 77%|███████▋ | 17000/22095 [28:57:14<5:05:05, 3.59s/it] {'loss': 0.282, 'grad_norm': 0.590359266967176, 'learning_rate': 1.3313241188343845e-06, 'epoch': 0.77} 77%|███████▋ | 17000/22095 [28:57:14<5:05:05, 3.59s/it] 77%|███████▋ | 17001/22095 [28:57:17<4:47:55, 3.39s/it] {'loss': 0.2862, 'grad_norm': 0.6139718057304095, 'learning_rate': 1.330826185641918e-06, 'epoch': 0.77} 77%|███████▋ | 17001/22095 [28:57:17<4:47:55, 3.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46356 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 17002/22095 [28:57:20<4:43:38, 3.34s/it] {'loss': 0.2734, 'grad_norm': 0.6207758725197622, 'learning_rate': 1.330328331288731e-06, 'epoch': 0.77} 77%|███████▋ | 17002/22095 [28:57:21<4:43:38, 3.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047796 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 10\nB. 5\nC. 2\nD. 
3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} Token indices sequence length is longer than the specified maximum sequence length for this model (84346 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 17003/22095 [28:57:24<4:43:11, 3.34s/it] {'loss': 0.2743, 'grad_norm': 0.5562841534231531, 'learning_rate': 1.3298305557855146e-06, 'epoch': 0.77} 77%|███████▋ | 17003/22095 [28:57:24<4:43:11, 3.34s/it] 77%|███████▋ | 17004/22095 [28:57:27<4:31:35, 3.20s/it] {'loss': 0.3076, 'grad_norm': 0.6279572767987115, 'learning_rate': 1.329332859142967e-06, 'epoch': 0.77} 77%|███████▋ | 17004/22095 [28:57:27<4:31:35, 3.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 17005/22095 [28:57:30<4:36:09, 3.26s/it] {'loss': 0.3034, 'grad_norm': 0.5915825930214385, 'learning_rate': 1.3288352413717847e-06, 'epoch': 0.77} 77%|███████▋ | 17005/22095 [28:57:30<4:36:09, 3.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46688 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47658 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 17006/22095 [28:57:34<4:43:27, 3.34s/it] {'loss': 0.3188, 'grad_norm': 0.5942031851275636, 'learning_rate': 1.3283377024826576e-06, 'epoch': 0.77} 77%|███████▋ | 17006/22095 [28:57:34<4:43:27, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 17007/22095 [28:57:44<7:30:15, 5.31s/it] {'loss': 0.4766, 'grad_norm': 0.276941718944032, 'learning_rate': 1.3278402424862758e-06, 'epoch': 0.77} 77%|███████▋ | 17007/22095 [28:57:44<7:30:15, 5.31s/it] 77%|███████▋ | 17008/22095 [28:57:47<6:37:35, 4.69s/it] {'loss': 0.2926, 'grad_norm': 0.6329108071302986, 'learning_rate': 1.3273428613933298e-06, 'epoch': 0.77} 77%|███████▋ | 17008/22095 [28:57:47<6:37:35, 4.69s/it] 77%|███████▋ | 17009/22095 [28:57:51<6:15:16, 4.43s/it] {'loss': 0.3074, 'grad_norm': 0.5793166555741998, 'learning_rate': 1.3268455592145047e-06, 'epoch': 0.77} 77%|███████▋ | 17009/22095 [28:57:51<6:15:16, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 17010/22095 [28:57:56<6:42:15, 4.75s/it] {'loss': 0.4525, 'grad_norm': 0.25748776118581745, 'learning_rate': 1.3263483359604884e-06, 'epoch': 0.77} 77%|███████▋ | 17010/22095 [28:57:56<6:42:15, 4.75s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 17011/22095 [28:58:00<6:15:12, 4.43s/it] {'loss': 0.3125, 'grad_norm': 0.5655961201333015, 'learning_rate': 1.3258511916419641e-06, 'epoch': 0.77} 77%|███████▋ | 17011/22095 [28:58:00<6:15:12, 4.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67720 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42547 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89287 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68440 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17012/22095 [28:58:03<5:50:29, 4.14s/it] {'loss': 0.2969, 'grad_norm': 0.5823148605766695, 'learning_rate': 1.3253541262696117e-06, 'epoch': 0.77}
 77%|███████▋ | 17013/22095 [28:58:07<5:41:14, 4.03s/it] {'loss': 0.2965, 'grad_norm': 0.6444920412097761, 'learning_rate': 1.3248571398541138e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17014/22095 [28:58:14<7:04:54, 5.02s/it] {'loss': 0.4673, 'grad_norm': 0.2839524981208011, 'learning_rate': 1.3243602324061495e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17015/22095 [28:58:19<6:52:50, 4.88s/it] {'loss': 0.3205, 'grad_norm': 0.6402337676488344, 'learning_rate': 1.3238634039363952e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924515 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47668, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
 77%|███████▋ | 17016/22095 [28:58:22<6:09:15, 4.36s/it] {'loss': 0.3168, 'grad_norm': 0.6372879682518151, 'learning_rate': 1.3233666544555246e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (42564 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41593 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17017/22095 [28:58:26<6:01:03, 4.27s/it] {'loss': 0.3091, 'grad_norm': 0.6019808709883003, 'learning_rate': 1.3228699839742125e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047790 in VC:s3://multi-modal/UniGeo/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上的一点,点D是线段BC的中点,若AB=10,AC=6,则AD等于()\nA. 7.5\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 77%|███████▋ | 17018/22095 [28:58:30<5:58:42, 4.24s/it] {'loss': 0.3478, 'grad_norm': 0.7257516660275176, 'learning_rate': 1.3223733925031324e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [12, 103, 100, 100] is too small. Minimum size is 28.
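Editor's note on the repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings above: they mean some samples tokenize past the 40960-token limit the tokenizer reports. A minimal sketch of a length pre-filter, assuming samples are already tokenized; `fits_context` is a hypothetical helper, not part of the training code:

```python
# Hypothetical pre-filter for over-long samples. The 40960 limit is taken
# from the tokenizer warnings in this log, not from the model config.
MAX_MODEL_LEN = 40960

def fits_context(token_ids, max_len=MAX_MODEL_LEN):
    """True if the tokenized sample fits within the model's context window."""
    return len(token_ids) <= max_len

# Example: keep only samples that fit (lengths mirror the warnings above).
lengths = [1024, 42547, 89287, 68440]
kept = [n for n in lengths if fits_context(range(n))]
print(kept)  # [1024]
```

Dropping (or truncating) such samples before batching avoids the downstream indexing errors the warning predicts.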
[Try #0] Failed to fetch sample 8353841 in VC:s3://internvl-moe-sft-data/. Exception: Image size [12, 103, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20526, 'image': 'vrdu_table_final_2/astro-ph.CO/9bf600fd-e576-47ae-b041-36cf9fb7e176.png', 'image_wh': [[12, 103]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}} -\\\\-\\\\-\\\\-\\end{tabular}\n```"}]}
 77%|███████▋ | 17019/22095 [28:58:34<5:38:48, 4.00s/it] {'loss': 0.2837, 'grad_norm': 0.6080211785313477, 'learning_rate': 1.321876880052953e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17020/22095 [28:58:43<7:55:55, 5.63s/it] {'loss': 0.4872, 'grad_norm': 0.2720508001144198, 'learning_rate': 1.321380446634342e-06, 'epoch': 0.77}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (160124908 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
 77%|███████▋ | 17021/22095 [28:58:47<7:10:39, 5.09s/it] {'loss': 0.3264, 'grad_norm': 0.5971448911188646, 'learning_rate': 1.3208840922579686e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17022/22095 [28:58:51<6:38:12, 4.71s/it] {'loss': 0.3739, 'grad_norm': 0.6620853507211729, 'learning_rate': 1.3203878169344948e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17023/22095 [28:58:54<5:51:31, 4.16s/it] {'loss': 0.2813, 'grad_norm': 0.6750399270932413, 'learning_rate': 1.3198916206745871e-06, 'epoch': 0.77}
 77%|███████▋ | 17024/22095 [28:58:57<5:24:59, 3.85s/it] {'loss': 0.304, 'grad_norm': 0.6677213392981335, 'learning_rate': 1.3193955034889056e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17025/22095 [28:59:06<7:46:20, 5.52s/it] {'loss': 0.4523, 'grad_norm': 0.26104906765845687, 'learning_rate': 1.31889946538811e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369311 in VC:s3://internvl-moe-sft-data/.
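Editor's note on the `DecompressionBombWarning` above: the 160124908-pixel image exceeds PIL's default pixel budget of 89478485 (1 GiB at 4 bytes per pixel for 3 bands). A small sketch of that threshold check; `triggers_bomb_warning` is a hypothetical helper, and in practice one would either raise `PIL.Image.MAX_IMAGE_PIXELS` for trusted data or reject the sample:

```python
# PIL's default decompression-bomb budget: 1 GiB / 4 bytes per pixel / 3 bands.
PIL_DEFAULT_MAX_PIXELS = 1024 * 1024 * 1024 // 4 // 3  # 89478485

def triggers_bomb_warning(n_pixels, limit=PIL_DEFAULT_MAX_PIXELS):
    """True if PIL would emit DecompressionBombWarning for this pixel count."""
    return n_pixels > limit

print(triggers_bomb_warning(160124908))  # True, matching the warning above
```

PIL escalates to a `DecompressionBombError` only at twice this limit, which is why the run continues with a warning here.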
Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36063, 'image': 'vrdu_table_final_2/astro-ph.CO/7144fbfd-2723-4b22-a8c4-ea673ed66c57.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]}
 77%|███████▋ | 17026/22095 [28:59:10<6:55:06, 4.91s/it] {'loss': 0.3088, 'grad_norm': 0.6332788248079336, 'learning_rate': 1.3184035063828586e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17027/22095 [28:59:13<6:22:16, 4.53s/it] {'loss': 0.2801, 'grad_norm': 0.5982813398175602, 'learning_rate': 1.3179076264838102e-06, 'epoch': 0.77}
 77%|███████▋ | 17028/22095 [28:59:17<5:54:19, 4.20s/it] {'loss': 0.3024, 'grad_norm': 0.6142886930313358, 'learning_rate': 1.3174118257016182e-06, 'epoch': 0.77}
 77%|███████▋ | 17029/22095 [28:59:20<5:31:48, 3.93s/it] {'loss': 0.2989, 'grad_norm': 0.5813833619991972, 'learning_rate': 1.3169161040469347e-06, 'epoch': 0.77}
 77%|███████▋ | 17030/22095 [28:59:24<5:22:22, 3.82s/it] {'loss': 0.2915, 'grad_norm': 0.5807463456591303, 'learning_rate': 1.316420461530412e-06, 'epoch': 0.77}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8412338 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 14547, 'image': 'vrdu_table_final_2/astro-ph.CO/54ce0b94-d725-42be-9b7f-9c8f343ad4c5.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
 77%|███████▋ | 17031/22095 [28:59:27<5:06:48, 3.64s/it] {'loss': 0.3041, 'grad_norm': 0.6038158844407124, 'learning_rate': 1.3159248981627026e-06, 'epoch': 0.77}
 77%|███████▋ | 17032/22095 [28:59:30<5:01:01, 3.57s/it] {'loss': 0.321, 'grad_norm': 0.6905869638067237, 'learning_rate': 1.3154294139544516e-06, 'epoch': 0.77}
 77%|███████▋ | 17033/22095 [28:59:33<4:43:15, 3.36s/it] {'loss': 0.3035, 'grad_norm': 0.587729403920332, 'learning_rate': 1.3149340089163048e-06, 'epoch': 0.77}
 77%|███████▋ | 17034/22095 [28:59:37<4:51:26, 3.46s/it] {'loss': 0.2999, 'grad_norm': 0.5984624721440018, 'learning_rate': 1.3144386830589102e-06, 'epoch': 0.77}
 77%|███████▋ | 17035/22095 [28:59:41<5:02:15, 3.58s/it] {'loss': 0.3004, 'grad_norm': 0.5938178505981546, 'learning_rate': 1.3139434363929088e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (66022 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47882 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88188 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17036/22095 [28:59:44<4:45:50, 3.39s/it] {'loss': 0.2901, 'grad_norm': 0.6211449043197764, 'learning_rate': 1.3134482689289408e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (95966 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17037/22095 [28:59:49<5:25:25, 3.86s/it] {'loss': 0.3086, 'grad_norm': 0.6013448927897931, 'learning_rate': 1.312953180677648e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (66970 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17038/22095 [28:59:52<5:10:09, 3.68s/it] {'loss': 0.2874, 'grad_norm': 0.5856091578020052, 'learning_rate': 1.3124581716496666e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8888454 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11607, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 2cm\nB. 5cm\nC. 4cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 77%|███████▋ | 17039/22095 [28:59:55<5:03:14, 3.60s/it] {'loss': 0.3335, 'grad_norm': 0.606121887689741, 'learning_rate': 1.3119632418556344e-06, 'epoch': 0.77}
 77%|███████▋ | 17040/22095 [28:59:58<4:43:54, 3.37s/it] {'loss': 0.2584, 'grad_norm': 0.610090598311767, 'learning_rate': 1.311468391306186e-06, 'epoch': 0.77}
 77%|███████▋ | 17041/22095 [29:00:01<4:36:56, 3.29s/it] {'loss': 0.3297, 'grad_norm': 0.6136849850789233, 'learning_rate': 1.3109736200119517e-06, 'epoch': 0.77}
 77%|███████▋ | 17042/22095 [29:00:04<4:29:51, 3.20s/it] {'loss': 0.2766, 'grad_norm': 0.6267240457586536, 'learning_rate': 1.310478927983564e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17043/22095 [29:00:07<4:30:58, 3.22s/it] {'loss': 0.293, 'grad_norm': 0.6977661479268513, 'learning_rate': 1.3099843152316543e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17044/22095 [29:00:17<7:08:01, 5.08s/it] {'loss': 0.4703, 'grad_norm': 0.2869785079670655, 'learning_rate': 1.309489781766849e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (47531 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61162 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45293 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17045/22095 [29:00:20<6:25:01, 4.57s/it] {'loss': 0.2609, 'grad_norm': 0.6219557466552592, 'learning_rate': 1.308995327599772e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17046/22095 [29:00:29<8:19:00, 5.93s/it] {'loss': 0.486, 'grad_norm': 0.2754491484448643, 'learning_rate': 1.3085009527410491e-06, 'epoch': 0.77}
 77%|███████▋ | 17047/22095 [29:00:33<7:30:10, 5.35s/it] {'loss': 0.3062, 'grad_norm': 0.6285882729758558, 'learning_rate': 1.3080066572013045e-06, 'epoch': 0.77}
 77%|███████▋ | 17048/22095 [29:00:37<6:54:08, 4.92s/it] {'loss': 0.3307, 'grad_norm': 0.6209444308060075, 'learning_rate': 1.3075124409911584e-06, 'epoch': 0.77}
 77%|███████▋ | 17049/22095 [29:00:42<6:41:24, 4.77s/it] {'loss': 0.3477, 'grad_norm': 0.608458141857825, 'learning_rate': 1.3070183041212276e-06, 'epoch': 0.77}
 77%|███████▋ | 17050/22095 [29:00:46<6:28:08, 4.62s/it] {'loss': 0.2742, 'grad_norm': 0.5640585393302556, 'learning_rate': 1.3065242466021328e-06, 'epoch': 0.77}
 77%|███████▋ | 17051/22095 [29:00:49<5:59:06, 4.27s/it] {'loss': 0.2725, 'grad_norm': 0.6017392265311926, 'learning_rate': 1.3060302684444864e-06, 'epoch': 0.77}
 77%|███████▋ | 17052/22095 [29:00:54<5:55:38, 4.23s/it] {'loss': 0.3202, 'grad_norm': 0.6761743285746282, 'learning_rate': 1.3055363696589062e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17053/22095 [29:01:01<7:05:22, 5.06s/it] {'loss': 0.4524, 'grad_norm': 0.3254544022283287, 'learning_rate': 1.3050425502560028e-06, 'epoch': 0.77}
 77%|███████▋ | 17054/22095 [29:01:04<6:29:25, 4.64s/it] {'loss': 0.3541, 'grad_norm': 0.6596226760850494, 'learning_rate': 1.3045488102463856e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (42115 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95214 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41569 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92429 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42413 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17055/22095 [29:01:07<5:55:12, 4.23s/it] {'loss': 0.3441, 'grad_norm': 0.6610524550640676, 'learning_rate': 1.304055149640664e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17056/22095 [29:01:11<5:28:16, 3.91s/it] {'loss': 0.3042, 'grad_norm': 0.613461336777729, 'learning_rate': 1.303561568449448e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [792, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8476370 in VC:s3://internvl-moe-sft-data/. Exception: Image size [792, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 137544, 'image': 'vrdu_texteq/astro-ph.CO/abeabfec-5897-45d4-aec2-a9cbce57007f.png', 'image_wh': [[792, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'which differs from the conventional relation $\\Omega_k+\\Omega_m+\\Omega_{DE}=1$.'}]}
 77%|███████▋ | 17057/22095 [29:01:14<5:07:40, 3.66s/it] {'loss': 0.3254, 'grad_norm': 0.6389856450153001, 'learning_rate': 1.3030680666833411e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (63434 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102065 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17058/22095 [29:01:18<5:11:55, 3.72s/it] {'loss': 0.2951, 'grad_norm': 0.5882841282729346, 'learning_rate': 1.3025746443529459e-06, 'epoch': 0.77}
 77%|███████▋ | 17059/22095 [29:01:21<5:00:38, 3.58s/it] {'loss': 0.2637, 'grad_norm': 0.5547272609744122, 'learning_rate': 1.302081301468865e-06, 'epoch': 0.77}
 77%|███████▋ | 17060/22095 [29:01:24<4:55:24, 3.52s/it] {'loss': 0.2698, 'grad_norm': 0.5719637599550884, 'learning_rate': 1.3015880380417017e-06, 'epoch': 0.77}
Token indices sequence length is longer than the specified maximum sequence length for this model (50019 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121356 > 40960). Running this sequence through the model will result in indexing errors
 77%|███████▋ | 17061/22095 [29:01:27<4:48:41, 3.44s/it] {'loss': 0.3142, 'grad_norm': 0.6128984698011726, 'learning_rate': 1.3010948540820528e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17062/22095 [29:01:30<4:35:53, 3.29s/it] {'loss': 0.3351, 'grad_norm': 0.7081009445258358, 'learning_rate': 1.3006017496005135e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17063/22095 [29:01:40<7:17:35, 5.22s/it] {'loss': 0.4538, 'grad_norm': 0.280580376477097, 'learning_rate': 1.3001087246076821e-06, 'epoch': 0.77}
 77%|███████▋ | 17064/22095 [29:01:43<6:26:42, 4.61s/it] {'loss': 0.3027, 'grad_norm': 0.7330898678125246, 'learning_rate': 1.29961577911415e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8305425 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1iaMtdzihSKJjy0FfXXbGzFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nExtract the text information from this photo.'}, {'from': 'gpt', 'value': 'All words in the image:\n持久驱蚊\n买3送1\n多款可选\n抢购\n包邮\n赠品数量有限\n时尚大方\n亲\n再不出手\n立即抢购21世纪商贸8\n被抢光了........'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8303450 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1UyuDcBbM8KJjSZFFXXaynpXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease perform OCR on this image and provide the text.'}, {'from': 'gpt', 'value': 'All words in the image:\nⓇ\nPEAKMETER\n表\n华\n谊\n仪\n快速检测地零火线安全\n插座测试仪为用电安全护航\nchildren:Monteandl.GeorgeJr.wouldoftenDadwouldgetoutthismandolinandplayforthefamily.Wethr\n\nee\nHarborLightsandaroundC\n\nsingalong.\n第二件半价\nPEAKMETER\nⓇ\nPM6860CR\n220V-250V50-60Hz\n正确连接\n缺少地线\n火线与地线接反\n火线与零线接反\n缺少零线\n此测试器不能区分地线与零线接反\nRCD测试功能\nl▲n30mA\n无乐金'}]}
 77%|███████▋ | 17065/22095 [29:01:47<6:01:47, 4.32s/it] {'loss': 0.2733, 'grad_norm': 0.5807060894420104, 'learning_rate': 1.2991229131305106e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17066/22095 [29:01:56<8:13:40, 5.89s/it] {'loss': 0.4627, 'grad_norm': 0.27381713925430146, 'learning_rate': 1.298630126667354e-06, 'epoch': 0.77}
 77%|███████▋ | 17067/22095 [29:02:00<7:08:12, 5.11s/it] {'loss': 0.2926, 'grad_norm': 0.70382379688019, 'learning_rate': 1.2981374197352663e-06, 'epoch': 0.77}
 77%|███████▋ | 17068/22095 [29:02:03<6:12:21, 4.44s/it] {'loss': 0.278, 'grad_norm': 0.5869086110499759, 'learning_rate': 1.2976447923448376e-06, 'epoch': 0.77}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 77%|███████▋ | 17069/22095 [29:02:10<7:24:00, 5.30s/it] {'loss': 0.4378, 'grad_norm': 0.26054148489340934, 'learning_rate': 1.2971522445066515e-06, 'epoch': 0.77}
 77%|███████▋ | 17070/22095 [29:02:14<6:41:43, 4.80s/it] {'loss': 0.3465, 'grad_norm': 0.7147306754379935, 'learning_rate': 1.29665977623129e-06, 'epoch': 0.77}
 77%|███████▋ | 17071/22095 [29:02:17<6:12:22, 4.45s/it] {'loss': 0.287, 'grad_norm': 0.6196791744507667, 'learning_rate': 1.2961673875293352e-06, 'epoch': 0.77}
 77%|███████▋ | 17072/22095 [29:02:21<5:54:27, 4.23s/it] {'loss': 0.3465, 'grad_norm': 0.7222106900360292, 'learning_rate': 1.2956750784113698e-06, 'epoch': 0.77}
 77%|███████▋ | 17073/22095 [29:02:24<5:26:08, 3.90s/it] {'loss': 0.3168, 'grad_norm': 0.6263213043582183, 'learning_rate': 1.2951828488879702e-06, 'epoch': 0.77}
 77%|███████▋ | 17074/22095 [29:02:27<5:09:07, 3.69s/it] {'loss': 0.3167, 'grad_norm': 0.601989846349521, 'learning_rate': 1.2946906989697106e-06, 'epoch': 0.77}
 77%|███████▋ | 17075/22095 [29:02:31<5:18:01, 3.80s/it] {'loss': 0.3222, 'grad_norm': 0.6061256602703506, 'learning_rate': 1.2941986286671682e-06, 'epoch': 0.77}
 77%|███████▋ | 17076/22095 [29:02:34<5:01:55, 3.61s/it] {'loss': 0.3075, 'grad_norm': 0.6098812345161176, 'learning_rate': 1.2937066379909174e-06, 'epoch': 0.77}
 77%|███████▋ | 17077/22095 [29:02:38<5:01:32, 3.61s/it] {'loss': 0.2862, 'grad_norm': 0.577058215520092, 'learning_rate': 1.2932147269515278e-06, 'epoch': 0.77}
 77%|███████▋ | 17078/22095 [29:02:41<4:55:51, 3.54s/it] {'loss': 0.2623, 'grad_norm': 0.7320423453045078, 'learning_rate': 1.2927228955595678e-06, 'epoch': 0.77}
 77%|███████▋ | 17079/22095 [29:02:45<5:00:37, 3.60s/it] {'loss': 0.2689, 'grad_norm': 0.634616556189686, 'learning_rate': 1.292231143825608e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887886 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11039, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AC=6,CB=3,∴AB=6+3=9,∵O是线段AB的中点,∴AO=9÷2=4.5,∴OC=AC-AO=6-4.5=1.5.'}]}
 77%|███████▋ | 17080/22095 [29:02:49<5:06:24, 3.67s/it] {'loss': 0.2991, 'grad_norm': 0.6614808934275564, 'learning_rate': 1.2917394717602123e-06, 'epoch': 0.77}
 77%|███████▋ | 17081/22095 [29:02:52<4:44:53, 3.41s/it] {'loss': 0.2944, 'grad_norm': 0.6504693596242206, 'learning_rate': 1.2912478793739474e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 108, in __call__
    # img_value_str = self.client.get(fn)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 99, in _get
    if has_tcs_loader:
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 517, in get
    data, _ = self.get_with_info(*args, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 514, in get_with_info
    return self._get_local_client().get_with_info(uri, **kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 478, in get_with_info
    return client.get(filepath), info
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/utils/s3_fileio.py", line 167, in get
    return self._client.get_object(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 569, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/botocore/client.py", line 1023, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
[Try #0] Failed to fetch sample 1072490 in VC:s3://gui/aguvis/aguvis-stage2/amex/images. Exception: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
Problematic sample: {'image': 'bf3459bcaf434803a580fcd36cbe71aestep0.png', 'conversations': [{'from': 'human', 'value': '\nPlease generate the next move according to the UI screenshot, the task and previous operations.\n\nTask:\nOpen AP News. Share the link of the first article in the "Business" category\n\nPrevious operations:\nNone'}, {'from': 'gpt', 'value': "\nThe goal is to open the AP News app and find the first article in the 'Business' category. Starting by launching the AP News app is the logical first step.\n\n\nTap on the AP News app to open it.\n\n\nterminate(status='success')\n"}]}
 77%|███████▋ | 17082/22095 [29:02:55<4:31:15, 3.25s/it] {'loss': 0.2904, 'grad_norm': 0.6111587590059084, 'learning_rate': 1.2907563666773753e-06, 'epoch': 0.77}
 77%|███████▋ | 17083/22095 [29:02:59<4:49:56, 3.47s/it] {'loss': 0.3246, 'grad_norm': 0.6120474107293098, 'learning_rate': 1.2902649336810553e-06, 'epoch': 0.77}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 77%|███████▋ | 17084/22095 [29:03:02<4:39:17, 3.34s/it] {'loss': 0.2907, 'grad_norm': 0.6841406326328182, 'learning_rate': 1.289773580395548e-06, 'epoch': 0.77}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8305115 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1gCVpLXXXXXX2apXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read out and tell me what is written on this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n店铺装修\n电商品牌.视觉营销服务商!\n始终选择与高质量同行\n羽道电商\n您身边的网络视觉指导专家'}]} 77%|███████▋ | 17085/22095 [29:03:06<4:56:14, 3.55s/it] {'loss': 0.3093, 'grad_norm': 0.6543032317655777, 'learning_rate': 1.289282306831413e-06, 'epoch': 0.77} 77%|███████▋ | 17085/22095 [29:03:06<4:56:14, 3.55s/it] 77%|███████▋ | 17086/22095 [29:03:09<4:38:51, 3.34s/it] {'loss': 0.2928, 'grad_norm': 0.6332293830460849, 'learning_rate': 1.2887911129992047e-06, 'epoch': 0.77} 77%|███████▋ | 17086/22095 [29:03:09<4:38:51, 3.34s/it] 77%|███████▋ | 17087/22095 [29:03:11<4:24:38, 3.17s/it] {'loss': 0.2799, 'grad_norm': 0.6293863668690138, 'learning_rate': 1.2882999989094758e-06, 'epoch': 0.77} 77%|███████▋ | 17087/22095 [29:03:11<4:24:38, 3.17s/it] 77%|███████▋ | 17088/22095 [29:03:15<4:31:35, 3.25s/it] {'loss': 0.3173, 'grad_norm': 0.6591978501859447, 'learning_rate': 1.2878089645727803e-06, 'epoch': 0.77} 77%|███████▋ | 17088/22095 [29:03:15<4:31:35, 3.25s/it] 77%|███████▋ | 17089/22095 [29:03:19<4:50:29, 3.48s/it] {'loss': 0.3301, 'grad_norm': 0.6410972269223547, 'learning_rate': 1.2873180099996701e-06, 'epoch': 0.77} 77%|███████▋ | 17089/22095 [29:03:19<4:50:29, 3.48s/it] 77%|███████▋ | 17090/22095 [29:03:22<4:35:59, 3.31s/it] {'loss': 0.2957, 'grad_norm': 0.9874183972194387, 'learning_rate': 1.2868271352006938e-06, 'epoch': 0.77} 77%|███████▋ | 17090/22095 [29:03:22<4:35:59, 3.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 17091/22095 [29:03:25<4:35:45, 3.31s/it] {'loss': 0.2871, 'grad_norm': 0.6636071316751236, 'learning_rate': 1.2863363401863966e-06, 'epoch': 0.77} 77%|███████▋ | 17091/22095 [29:03:25<4:35:45, 3.31s/it] 77%|███████▋ | 17092/22095 [29:03:29<4:49:58, 
3.48s/it] {'loss': 0.2966, 'grad_norm': 0.5968374672452968, 'learning_rate': 1.2858456249673268e-06, 'epoch': 0.77} 77%|███████▋ | 17092/22095 [29:03:29<4:49:58, 3.48s/it] 77%|███████▋ | 17093/22095 [29:03:33<5:12:50, 3.75s/it] {'loss': 0.3091, 'grad_norm': 0.6203246444497553, 'learning_rate': 1.2853549895540268e-06, 'epoch': 0.77} 77%|███████▋ | 17093/22095 [29:03:33<5:12:50, 3.75s/it] 77%|███████▋ | 17094/22095 [29:03:37<5:10:16, 3.72s/it] {'loss': 0.3399, 'grad_norm': 0.5971734328529894, 'learning_rate': 1.2848644339570403e-06, 'epoch': 0.77} 77%|███████▋ | 17094/22095 [29:03:37<5:10:16, 3.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 77%|███████▋ | 17095/22095 [29:03:40<4:52:33, 3.51s/it] {'loss': 0.3061, 'grad_norm': 0.7651834237145628, 'learning_rate': 1.2843739581869068e-06, 'epoch': 0.77} 77%|███████▋ | 17095/22095 [29:03:40<4:52:33, 3.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47555 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42145 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64841 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61937 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72056 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110205 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (135021 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 17096/22095 [29:03:43<4:34:35, 3.30s/it] {'loss': 0.2688, 'grad_norm': 0.63633869631367, 'learning_rate': 1.283883562254164e-06, 'epoch': 0.77} 77%|███████▋ | 17096/22095 [29:03:43<4:34:35, 3.30s/it] 77%|███████▋ | 17097/22095 [29:03:47<4:56:51, 3.56s/it] {'loss': 0.2692, 'grad_norm': 0.6051754070221738, 'learning_rate': 1.2833932461693504e-06, 'epoch': 0.77} 77%|███████▋ | 17097/22095 [29:03:47<4:56:51, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43442 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44556 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41132 > 40960). Running this sequence through the model will result in indexing errors 77%|███████▋ | 17098/22095 [29:03:51<4:56:30, 3.56s/it] {'loss': 0.3354, 'grad_norm': 0.592598034098679, 'learning_rate': 1.282903009943004e-06, 'epoch': 0.77} 77%|███████▋ | 17098/22095 [29:03:51<4:56:30, 3.56s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56736 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 17099/22095 [29:03:54<4:54:06, 3.53s/it] {'loss': 0.3162, 'grad_norm': 0.624401314189695, 'learning_rate': 1.282412853585653e-06, 'epoch': 0.77} 77%|███████▋ | 17099/22095 [29:03:54<4:54:06, 3.53s/it] 77%|███████▋ | 17100/22095 [29:03:57<4:46:42, 3.44s/it] {'loss': 0.2703, 'grad_norm': 0.7322262936851486, 'learning_rate': 1.2819227771078318e-06, 'epoch': 0.77} 77%|███████▋ | 17100/22095 [29:03:57<4:46:42, 3.44s/it] 77%|███████▋ | 17101/22095 [29:04:00<4:31:26, 3.26s/it] {'loss': 0.2636, 'grad_norm': 0.6266047926837723, 'learning_rate': 1.281432780520071e-06, 'epoch': 0.77} 77%|███████▋ | 17101/22095 [29:04:00<4:31:26, 3.26s/it] 77%|███████▋ | 17102/22095 [29:04:03<4:31:58, 3.27s/it] {'loss': 0.2694, 'grad_norm': 0.6831300671044548, 'learning_rate': 1.280942863832902e-06, 'epoch': 0.77} 77%|███████▋ | 17102/22095 [29:04:03<4:31:58, 3.27s/it] 77%|███████▋ | 17103/22095 [29:04:07<4:37:56, 3.34s/it] {'loss': 0.2572, 'grad_norm': 0.799933249191529, 'learning_rate': 1.280453027056846e-06, 'epoch': 0.77} 77%|███████▋ | 17103/22095 [29:04:07<4:37:56, 3.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8558143 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 22840, 'image': '1566045002.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 17104/22095 [29:04:16<7:10:02, 5.17s/it] {'loss': 0.4234, 'grad_norm': 0.28749663474539183, 'learning_rate': 1.2799632702024307e-06, 'epoch': 0.77} 77%|███████▋ | 17104/22095 [29:04:16<7:10:02, 5.17s/it] 77%|███████▋ | 17105/22095 [29:04:20<6:24:30, 4.62s/it] {'loss': 0.2805, 'grad_norm': 0.6069052169608677, 'learning_rate': 1.2794735932801805e-06, 'epoch': 0.77} 77%|███████▋ | 17105/22095 [29:04:20<6:24:30, 4.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 77%|███████▋ | 17106/22095 [29:04:28<8:00:44, 5.78s/it] {'loss': 0.4686, 'grad_norm': 0.28689733822411023, 'learning_rate': 1.2789839963006161e-06, 'epoch': 0.77} 77%|███████▋ | 17106/22095 [29:04:28<8:00:44, 5.78s/it] 77%|███████▋ | 17107/22095 [29:04:32<7:07:42, 5.14s/it] {'loss': 0.3029, 'grad_norm': 0.5960346646546882, 'learning_rate': 1.278494479274256e-06, 'epoch': 0.77} 77%|███████▋ | 17107/22095 [29:04:32<7:07:42, 5.14s/it] 77%|███████▋ | 17108/22095 [29:04:35<6:17:45, 4.54s/it] {'loss': 0.3479, 'grad_norm': 0.6323977456256337, 'learning_rate': 1.2780050422116214e-06, 'epoch': 0.77} 77%|███████▋ | 17108/22095 [29:04:35<6:17:45, 4.54s/it] 77%|███████▋ | 17109/22095 [29:04:38<5:41:36, 4.11s/it] {'loss': 0.2833, 'grad_norm': 0.6001880736032134, 'learning_rate': 1.2775156851232262e-06, 'epoch': 0.77} 77%|███████▋ | 17109/22095 [29:04:38<5:41:36, 4.11s/it] 77%|███████▋ | 17110/22095 
[29:04:41<5:14:01, 3.78s/it] {'loss': 0.2696, 'grad_norm': 0.6834787514837493, 'learning_rate': 1.277026408019587e-06, 'epoch': 0.77} 77%|███████▋ | 17110/22095 [29:04:41<5:14:01, 3.78s/it] 77%|███████▋ | 17111/22095 [29:04:44<5:01:59, 3.64s/it] {'loss': 0.2604, 'grad_norm': 0.5895092584163092, 'learning_rate': 1.276537210911216e-06, 'epoch': 0.77} 77%|███████▋ | 17111/22095 [29:04:44<5:01:59, 3.64s/it] 77%|███████▋ | 17112/22095 [29:04:48<4:56:07, 3.57s/it] {'loss': 0.3136, 'grad_norm': 0.5966485169919734, 'learning_rate': 1.2760480938086234e-06, 'epoch': 0.77} 77%|███████▋ | 17112/22095 [29:04:48<4:56:07, 3.57s/it] 77%|███████▋ | 17113/22095 [29:04:51<4:55:50, 3.56s/it] {'loss': 0.2904, 'grad_norm': 0.6310195151175784, 'learning_rate': 1.2755590567223203e-06, 'epoch': 0.77} 77%|███████▋ | 17113/22095 [29:04:51<4:55:50, 3.56s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8933824 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 56977, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nA. 5.5cm\nB. 6cm\nC. 6.5cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 77%|███████▋ | 17114/22095 [29:04:55<5:07:27, 3.70s/it] {'loss': 0.3208, 'grad_norm': 0.6517472902360024, 'learning_rate': 1.275070099662815e-06, 'epoch': 0.77} 77%|███████▋ | 17114/22095 [29:04:55<5:07:27, 3.70s/it] 77%|███████▋ | 17115/22095 [29:04:59<4:54:36, 3.55s/it] {'loss': 0.3065, 'grad_norm': 0.6227530110658003, 'learning_rate': 1.274581222640614e-06, 'epoch': 0.77} 77%|███████▋ | 17115/22095 [29:04:59<4:54:36, 3.55s/it] 77%|███████▋ | 17116/22095 [29:05:02<5:00:14, 3.62s/it] {'loss': 0.2588, 'grad_norm': 0.6298855001744631, 'learning_rate': 1.2740924256662185e-06, 'epoch': 0.77} 77%|███████▋ | 17116/22095 [29:05:02<5:00:14, 3.62s/it] 77%|███████▋ | 17117/22095 [29:05:05<4:41:26, 3.39s/it] {'loss': 0.2841, 'grad_norm': 0.610395970451911, 'learning_rate': 1.2736037087501342e-06, 'epoch': 0.77} 77%|███████▋ | 17117/22095 [29:05:05<4:41:26, 3.39s/it] 77%|███████▋ | 17118/22095 [29:05:08<4:30:24, 3.26s/it] {'loss': 0.2988, 'grad_norm': 0.6716630549579216, 'learning_rate': 1.2731150719028622e-06, 'epoch': 0.77} 77%|███████▋ | 17118/22095 [29:05:08<4:30:24, 3.26s/it] 77%|███████▋ | 17119/22095 [29:05:12<4:39:08, 3.37s/it] {'loss': 0.3546, 'grad_norm': 0.6214868094767717, 'learning_rate': 1.2726265151349015e-06, 'epoch': 0.77} 77%|███████▋ | 17119/22095 [29:05:12<4:39:08, 3.37s/it] 77%|███████▋ | 17120/22095 [29:05:15<4:38:55, 3.36s/it] {'loss': 0.3065, 'grad_norm': 0.5840971481190741, 'learning_rate': 1.2721380384567477e-06, 'epoch': 0.77} 77%|███████▋ | 17120/22095 [29:05:15<4:38:55, 3.36s/it] 77%|███████▋ | 17121/22095 [29:05:19<4:42:39, 3.41s/it] {'loss': 0.2748, 'grad_norm': 0.5838822957571945, 'learning_rate': 1.2716496418788998e-06, 'epoch': 0.77} 77%|███████▋ | 17121/22095 [29:05:19<4:42:39, 3.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (89483 > 40960). 
Running this sequence through the model will result in indexing errors 77%|███████▋ | 17122/22095 [29:05:22<4:31:54, 3.28s/it] {'loss': 0.3055, 'grad_norm': 0.6504767470495529, 'learning_rate': 1.2711613254118482e-06, 'epoch': 0.77} 77%|███████▋ | 17122/22095 [29:05:22<4:31:54, 3.28s/it] 77%|███████▋ | 17123/22095 [29:05:25<4:29:23, 3.25s/it] {'loss': 0.291, 'grad_norm': 0.5934252736975144, 'learning_rate': 1.2706730890660896e-06, 'epoch': 0.77} 77%|███████▋ | 17123/22095 [29:05:25<4:29:23, 3.25s/it] 78%|███████▊ | 17124/22095 [29:05:28<4:23:09, 3.18s/it] {'loss': 0.3175, 'grad_norm': 0.6088252559614474, 'learning_rate': 1.2701849328521127e-06, 'epoch': 0.78} 78%|███████▊ | 17124/22095 [29:05:28<4:23:09, 3.18s/it] 78%|███████▊ | 17125/22095 [29:05:32<4:36:45, 3.34s/it] {'loss': 0.3136, 'grad_norm': 0.6474140755106753, 'learning_rate': 1.2696968567804042e-06, 'epoch': 0.78} 78%|███████▊ | 17125/22095 [29:05:32<4:36:45, 3.34s/it] 78%|███████▊ | 17126/22095 [29:05:53<12:15:12, 8.88s/it] {'loss': 0.2631, 'grad_norm': 0.5741135972312951, 'learning_rate': 1.269208860861454e-06, 'epoch': 0.78} 78%|███████▊ | 17126/22095 [29:05:53<12:15:12, 8.88s/it] 78%|███████▊ | 17127/22095 [29:06:17<18:12:37, 13.20s/it] {'loss': 0.3057, 'grad_norm': 0.7014835529371155, 'learning_rate': 1.2687209451057498e-06, 'epoch': 0.78} 78%|███████▊ | 17127/22095 [29:06:17<18:12:37, 13.20s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45127 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17128/22095 [29:06:21<14:25:17, 10.45s/it] {'loss': 0.3084, 'grad_norm': 0.6135637429616104, 'learning_rate': 1.26823310952377e-06, 'epoch': 0.78} 78%|███████▊ | 17128/22095 [29:06:21<14:25:17, 10.45s/it] 78%|███████▊ | 17129/22095 [29:06:24<11:23:30, 8.26s/it] {'loss': 0.3007, 'grad_norm': 0.7016143908046977, 'learning_rate': 1.2677453541259993e-06, 'epoch': 0.78} 78%|███████▊ | 17129/22095 [29:06:24<11:23:30, 8.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 78%|███████▊ | 17130/22095 [29:06:33<11:50:59, 8.59s/it] {'loss': 0.4599, 'grad_norm': 0.29424487408614314, 'learning_rate': 1.2672576789229186e-06, 'epoch': 0.78} 78%|███████▊ | 17130/22095 [29:06:33<11:50:59, 8.59s/it] 78%|███████▊ | 17131/22095 [29:06:36<9:38:51, 7.00s/it] {'loss': 0.2971, 'grad_norm': 0.6311842885262616, 'learning_rate': 1.2667700839250086e-06, 'epoch': 0.78} 78%|███████▊ | 17131/22095 [29:06:36<9:38:51, 7.00s/it] 78%|███████▊ | 17132/22095 [29:06:40<8:09:15, 5.91s/it] {'loss': 0.2783, 'grad_norm': 0.5907732570227133, 'learning_rate': 1.266282569142741e-06, 'epoch': 0.78} 78%|███████▊ | 17132/22095 [29:06:40<8:09:15, 5.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [289, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8518248 in VC:s3://internvl-moe-sft-data/. Exception: Image size [289, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 122800, 'image': 'vrdu_texteq/astro-ph.CO/b21d1047-b4e2-4f82-808c-066a637d7e4a.png', 'image_wh': [[289, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'while at $z=0.5$ we find'}]} 78%|███████▊ | 17133/22095 [29:06:43<6:57:29, 5.05s/it] {'loss': 0.2876, 'grad_norm': 0.6576672305095088, 'learning_rate': 1.2657951345865938e-06, 'epoch': 0.78} 78%|███████▊ | 17133/22095 [29:06:43<6:57:29, 5.05s/it] 78%|███████▊ | 17134/22095 [29:06:46<6:21:18, 4.61s/it] {'loss': 0.3081, 'grad_norm': 0.8345840354859797, 'learning_rate': 1.2653077802670416e-06, 'epoch': 0.78} 78%|███████▊ | 17134/22095 [29:06:46<6:21:18, 4.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8590757 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 23837, 'image': '716713616.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a religious book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]} 78%|███████▊ | 17135/22095 [29:06:50<5:58:19, 4.33s/it] {'loss': 0.3252, 'grad_norm': 0.6008978189685654, 'learning_rate': 1.264820506194555e-06, 'epoch': 0.78} 78%|███████▊ | 17135/22095 [29:06:50<5:58:19, 4.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 78%|███████▊ | 17136/22095 [29:07:17<15:25:56, 11.20s/it] {'loss': 0.4631, 'grad_norm': 0.2687867028456543, 'learning_rate': 1.2643333123796025e-06, 'epoch': 0.78} 78%|███████▊ | 17136/22095 [29:07:17<15:25:56, 11.20s/it] 78%|███████▊ | 17137/22095 [29:07:20<12:05:27, 8.78s/it] {'loss': 0.2904, 'grad_norm': 0.6555826987070684, 'learning_rate': 1.2638461988326556e-06, 'epoch': 0.78} 78%|███████▊ | 17137/22095 [29:07:20<12:05:27, 8.78s/it]Rank 0: Token indices sequence length is longer than the specified maximum sequence length (51164 > 40960) for 4 sample(s). Truncating to 11979 with 2 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (43144 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63823 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17138/22095 [29:07:23<9:40:56, 7.03s/it] {'loss': 0.2812, 'grad_norm': 0.609080407444969, 'learning_rate': 1.263359165564178e-06, 'epoch': 0.78} 78%|███████▊ | 17138/22095 [29:07:23<9:40:56, 7.03s/it] 78%|███████▊ | 17139/22095 [29:07:27<8:24:00, 6.10s/it] {'loss': 0.3462, 'grad_norm': 1.2277354180594564, 'learning_rate': 1.2628722125846365e-06, 'epoch': 0.78} 78%|███████▊ | 17139/22095 [29:07:27<8:24:00, 6.10s/it] 78%|███████▊ | 17140/22095 [29:07:31<7:17:03, 5.29s/it] {'loss': 0.2616, 'grad_norm': 0.5927932744795882, 'learning_rate': 1.2623853399044938e-06, 'epoch': 0.78} 78%|███████▊ | 17140/22095 [29:07:31<7:17:03, 5.29s/it] 78%|███████▊ | 17141/22095 [29:07:53<14:04:28, 10.23s/it] {'loss': 0.2858, 'grad_norm': 0.5984460545486344, 'learning_rate': 1.2618985475342093e-06, 'epoch': 0.78} 78%|███████▊ | 17141/22095 [29:07:53<14:04:28, 10.23s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8379306 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 46091, 'image': 'vrdu_table_final_2/astro-ph.CO/be35b3b7-78d7-4e8a-a86e-263b47aad57e.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 78%|███████▊ | 17142/22095 [29:08:13<18:19:26, 13.32s/it] {'loss': 0.2662, 'grad_norm': 0.59031256302671, 'learning_rate': 1.2614118354842447e-06, 'epoch': 0.78} 78%|███████▊ | 17142/22095 [29:08:13<18:19:26, 13.32s/it] 78%|███████▊ | 17143/22095 [29:08:35<21:45:32, 15.82s/it] {'loss': 0.3013, 'grad_norm': 0.7025175929736768, 'learning_rate': 1.2609252037650587e-06, 'epoch': 0.78} 78%|███████▊ | 17143/22095 [29:08:35<21:45:32, 15.82s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 78%|███████▊ | 17144/22095 [29:08:38<16:30:40, 12.01s/it] {'loss': 0.2853, 'grad_norm': 0.643701224946021, 'learning_rate': 1.2604386523871064e-06, 'epoch': 0.78} 78%|███████▊ | 17144/22095 [29:08:38<16:30:40, 12.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 78%|███████▊ | 17145/22095 [29:08:42<13:12:23, 9.60s/it] {'loss': 0.2928, 'grad_norm': 0.5989556928627308, 'learning_rate': 1.2599521813608412e-06, 'epoch': 0.78} 78%|███████▊ | 17145/22095 [29:08:42<13:12:23, 9.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] 
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8358971 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 25689, 'image': 'vrdu_table_final_2/astro-ph.CO/7030533f-8477-4d0a-9736-829509d663d2.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 78%|███████▊ | 17146/22095 [29:09:11<21:22:20, 15.55s/it] {'loss': 0.4847, 'grad_norm': 0.2845253593714727, 'learning_rate': 1.2594657906967161e-06, 'epoch': 0.78} 78%|███████▊ | 17146/22095 [29:09:11<21:22:20, 15.55s/it] 78%|███████▊ | 17147/22095 [29:09:15<16:30:17, 12.01s/it] {'loss': 0.3145, 'grad_norm': 0.6871974893336779, 'learning_rate': 1.2589794804051852e-06, 'epoch': 0.78} 78%|███████▊ | 17147/22095 [29:09:15<16:30:17, 12.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96668 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80277 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17148/22095 [29:09:36<20:16:46, 14.76s/it] {'loss': 0.3174, 'grad_norm': 0.5784810860714322, 'learning_rate': 1.2584932504966952e-06, 'epoch': 0.78} 78%|███████▊ | 17148/22095 [29:09:36<20:16:46, 14.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67235 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102106 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17149/22095 [29:09:58<23:13:52, 16.91s/it] {'loss': 0.3097, 'grad_norm': 0.6567586672835883, 'learning_rate': 1.258007100981693e-06, 'epoch': 0.78} 78%|███████▊ | 17149/22095 [29:09:58<23:13:52, 16.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45977 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67602 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102974 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17150/22095 [29:10:20<25:11:33, 18.34s/it] {'loss': 0.2711, 'grad_norm': 0.6514163489561462, 'learning_rate': 1.2575210318706266e-06, 'epoch': 0.78} 78%|███████▊ | 17150/22095 [29:10:20<25:11:33, 18.34s/it] 78%|███████▊ | 17151/22095 [29:10:59<33:54:44, 24.69s/it] {'loss': 0.3026, 'grad_norm': 0.6198600650648628, 'learning_rate': 1.2570350431739382e-06, 'epoch': 0.78} 78%|███████▊ | 17151/22095 [29:10:59<33:54:44, 24.69s/it] 78%|███████▊ | 17152/22095 [29:11:39<40:09:40, 29.25s/it] {'loss': 0.2564, 'grad_norm': 0.6841175367625959, 'learning_rate': 1.256549134902072e-06, 'epoch': 0.78} 78%|███████▊ | 17152/22095 [29:11:39<40:09:40, 29.25s/it] 78%|███████▊ | 17153/22095 [29:12:02<37:37:28, 27.41s/it] {'loss': 0.3106, 'grad_norm': 0.6208186416223886, 'learning_rate': 1.2560633070654677e-06, 'epoch': 0.78} 78%|███████▊ | 17153/22095 [29:12:02<37:37:28, 27.41s/it] 78%|███████▊ | 17154/22095 [29:12:42<42:38:29, 31.07s/it] {'loss': 0.2898, 'grad_norm': 0.592246840762376, 'learning_rate': 1.2555775596745628e-06, 'epoch': 0.78} 78%|███████▊ | 17154/22095 [29:12:42<42:38:29, 31.07s/it]Token indices sequence length is longer 
than the specified maximum sequence length for this model (107388 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62012 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17155/22095 [29:13:03<38:38:26, 28.16s/it] {'loss': 0.2798, 'grad_norm': 0.6202578708073626, 'learning_rate': 1.2550918927397965e-06, 'epoch': 0.78} 78%|███████▊ | 17155/22095 [29:13:03<38:38:26, 28.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 78%|███████▊ | 17156/22095 [29:13:13<30:55:25, 22.54s/it] {'loss': 0.4716, 'grad_norm': 0.2673820938547034, 'learning_rate': 1.2546063062716069e-06, 'epoch': 0.78} 78%|███████▊ | 17156/22095 [29:13:13<30:55:25, 22.54s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047606 in VC:s3://multi-modal/UniGeo/. 
Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '12'}]}
78%|███████▊ | 17157/22095 [29:13:22<25:30:32, 18.60s/it] {'loss': 0.4611, 'grad_norm': 0.2625151895796911, 'learning_rate': 1.2541208002804211e-06, 'epoch': 0.78}
78%|███████▊ | 17158/22095 [29:13:51<29:39:30, 21.63s/it] {'loss': 0.4789, 'grad_norm': 0.3058334961048853, 'learning_rate': 1.253635374776675e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17159/22095 [29:13:54<22:06:59, 16.13s/it] {'loss': 0.3129, 'grad_norm': 0.6107794870209484, 'learning_rate': 1.2531500297707987e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (86089 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17160/22095 [29:13:58<16:54:56, 12.34s/it] {'loss': 0.2955, 'grad_norm': 0.5760173758350476, 'learning_rate': 1.2526647652732233e-06, 'epoch': 0.78}
78%|███████▊ | 17161/22095 [29:14:55<35:24:40, 25.84s/it] {'loss': 0.3089, 'grad_norm': 0.6996473534148662, 'learning_rate': 1.2521795812943704e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_040238_before_screenshot_sub0.png 2025-08-28 21:12:53.668622 load time: 1052.74 ms
78%|███████▊ | 17162/22095 [29:14:58<26:02:17, 19.00s/it] {'loss': 0.3142, 'grad_norm': 0.6170363776126002, 'learning_rate': 1.2516944778446676e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17163/22095 [29:15:06<21:34:20, 15.75s/it] {'loss': 0.4601, 'grad_norm': 0.2843270946939196, 'learning_rate': 1.2512094549345399e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (54276 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17164/22095 [29:15:47<31:48:05, 23.22s/it] {'loss': 0.3181, 'grad_norm': 0.6367604001768401, 'learning_rate': 1.2507245125744077e-06, 'epoch': 0.78}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_735206.png 2025-08-28 21:13:45.543375 load time: 1046.48 ms
78%|███████▊ | 17165/22095 [29:16:09<31:29:09, 22.99s/it] {'loss': 0.2979, 'grad_norm': 0.5898394959517429, 'learning_rate': 1.2502396507746889e-06, 'epoch': 0.78}
VC:s3://internvl2/datasets/ocr/Wired_Table_10w/A/images/border_1184_FIKMP7NS50WT2U2AYTAN.jpg 2025-08-28 21:14:07.984415 load time: 1035.55 ms
78%|███████▊ | 17166/22095 [29:16:32<31:27:47, 22.98s/it] {'loss': 0.3047, 'grad_norm': 0.6324469738057149, 'learning_rate': 1.2497548695458051e-06, 'epoch': 0.78}
78%|███████▊ | 17167/22095 [29:16:54<31:03:21, 22.69s/it] {'loss': 0.29, 'grad_norm': 0.5924126772432989, 'learning_rate': 1.24927016889817e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17168/22095 [29:17:21<32:40:23, 23.87s/it] {'loss': 0.4564, 'grad_norm': 0.274682558773357, 'learning_rate': 1.2487855488422007e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (99806 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46014 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79462 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17169/22095 [29:17:24<24:12:50, 17.70s/it] {'loss': 0.2866, 'grad_norm': 0.5996590732612687, 'learning_rate': 1.2483010093883086e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (55676 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42890 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101974 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17170/22095 [29:18:07<34:42:14, 25.37s/it] {'loss': 0.2807, 'grad_norm': 0.6573037277905804, 'learning_rate': 1.2478165505469042e-06, 'epoch': 0.78}
78%|███████▊ | 17171/22095 [29:18:30<33:29:10, 24.48s/it] {'loss': 0.2917, 'grad_norm': 0.6040269001549651, 'learning_rate': 1.2473321723283982e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17172/22095 [29:18:40<27:34:37, 20.17s/it] {'loss': 0.4632, 'grad_norm': 0.263686879267683, 'learning_rate': 1.2468478747432e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (64237 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78257 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42361 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104637 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17173/22095 [29:18:44<20:47:36, 15.21s/it] {'loss': 0.3008, 'grad_norm': 0.6994408235588686, 'learning_rate': 1.2463636578017142e-06, 'epoch': 0.78}
78%|███████▊ | 17174/22095 [29:19:23<30:55:04, 22.62s/it] {'loss': 0.2719, 'grad_norm': 0.6136295462490068, 'learning_rate': 1.2458795215143431e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (44509 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83127 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17175/22095 [29:19:27<23:06:27, 16.91s/it] {'loss': 0.33, 'grad_norm': 0.6700928782014203, 'learning_rate': 1.2453954658914913e-06, 'epoch': 0.78}
78%|███████▊ | 17176/22095 [29:20:08<33:00:29, 24.16s/it] {'loss': 0.2892, 'grad_norm': 0.6453116410368338, 'learning_rate': 1.2449114909435611e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (62830 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17177/22095 [29:20:17<26:57:14, 19.73s/it] {'loss': 0.4958, 'grad_norm': 0.28101865286112065, 'learning_rate': 1.24442759668095e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (52410 > 40960).
Running this sequence through the model will result in indexing errors
78%|███████▊ | 17178/22095 [29:20:24<21:42:14, 15.89s/it] {'loss': 0.4634, 'grad_norm': 0.25113335487181443, 'learning_rate': 1.2439437831140538e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17179/22095 [29:20:46<23:56:15, 17.53s/it] {'loss': 0.2786, 'grad_norm': 0.662903364481204, 'learning_rate': 1.2434600502532717e-06, 'epoch': 0.78}
78%|███████▊ | 17180/22095 [29:21:08<25:42:46, 18.83s/it] {'loss': 0.2952, 'grad_norm': 0.7470827691874483, 'learning_rate': 1.2429763981089938e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17181/22095 [29:21:17<21:52:13, 16.02s/it] {'loss': 0.4501, 'grad_norm': 0.3091443192580121, 'learning_rate': 1.2424928266916164e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (66192 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46709 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113081 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17182/22095 [29:22:16<39:28:05, 28.92s/it] {'loss': 0.2978, 'grad_norm': 0.5766299726257766, 'learning_rate': 1.2420093360115276e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17183/22095 [29:22:24<30:53:25, 22.64s/it] {'loss': 0.4589, 'grad_norm': 0.2973232597197437, 'learning_rate': 1.2415259260791147e-06, 'epoch': 0.78}
VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/d2ff733e92585f0432d489031a5ede31.png 2025-08-28 21:20:22.874875 load time: 1046.01 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8893392 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16545, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 2cm\nB. 5cm\nC. 4cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
78%|███████▊ | 17184/22095 [29:22:28<23:06:16, 16.94s/it] {'loss': 0.2664, 'grad_norm': 0.6270181736822528, 'learning_rate': 1.2410425969047667e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_197555.png 2025-08-28 21:20:26.509515 load time: 1030.32 ms
VC:s3://st2pj/20250222/images/multi_modal/agent_data/AndroidUI/20240327/20240327_filtered/lianlianshouzhang/screen_00000006.jpg 2025-08-28 21:20:26.507500 load time: 1033.97 ms
78%|███████▊ | 17185/22095 [29:23:10<33:28:06, 24.54s/it] {'loss': 0.4854, 'grad_norm': 0.26161742101697577, 'learning_rate': 1.2405593484988697e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369920 in VC:s3://internvl-moe-sft-data/. Exception: Image size [92, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36672, 'image': 'vrdu_table_final_2/astro-ph.CO/fbc546a1-c69d-4b43-bdc8-64b4a5a4f4db.png', 'image_wh': [[92, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}l@{}} 2011-15 \\\\ $\\,$ \\end{tabular}\n```"}]}
VC:s3://internvl2/datasets/MMMUDataset/MMMU/Economics/test_206_image_1.png 2025-08-28 21:21:08.783933 load time: 1039.45 ms
78%|███████▊ | 17186/22095 [29:23:37<34:37:13, 25.39s/it] {'loss': 0.4808, 'grad_norm': 0.2728915388621859, 'learning_rate': 1.2400761808718065e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17187/22095 [29:24:00<33:29:08, 24.56s/it] {'loss': 0.3026, 'grad_norm': 0.6215441103367519, 'learning_rate': 1.2395930940339562e-06, 'epoch': 0.78}
78%|███████▊ | 17188/22095 [29:24:04<25:05:00, 18.40s/it] {'loss': 0.2923, 'grad_norm': 0.5801009208199909, 'learning_rate': 1.2391100879957018e-06, 'epoch': 0.78}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_022253_before_screenshot_sub2.png 2025-08-28 21:22:02.826051 load time: 1028.26 ms
VC:s3://gui-agent/data_20250630/web/images/yang_0708152720/10_140_52_49_0708153932/img/4.png 2025-08-28 21:22:02.826101 load time: 1023.15 ms
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/eac7ba8fc247beda13413e6332bab44a7fbfa8f9a19c72bd7ad05c37b06176cc.png 2025-08-28 21:22:02.824679 load time: 1043.79 ms
78%|███████▊ | 17189/22095 [29:24:49<35:52:25, 26.32s/it] {'loss': 0.3494, 'grad_norm': 0.6763289072826297, 'learning_rate': 1.2386271627674234e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17190/22095 [29:25:14<35:22:21, 25.96s/it] {'loss': 0.4727, 'grad_norm': 0.28140635961391025, 'learning_rate': 1.2381443183594927e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/app_store_ios/331999207782.png 2025-08-28 21:23:12.745559 load time: 1030.96 ms
VC:s3://gui-agent/agentnet/win_mac_images/09ac13dd-3d7c-4ee7-b460-6b83b479ad9a.png 2025-08-28 21:23:12.745403 load time: 1034.23 ms
78%|███████▊ | 17191/22095 [29:25:54<41:04:35, 30.15s/it] {'loss': 0.2701, 'grad_norm': 0.670661369162288, 'learning_rate': 1.2376615547822867e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047787 in VC:s3://multi-modal/UniGeo/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C将线段AB分成1:3的两部分,点D是AB的中点,若CD=2,则线段AB的长为()\nA. 6\nB. 8\nC. 10\nD. 12'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AC=\\frac{1}{4}AB,AD=\\frac{1}{2}AB而CD=AD-AC∴CD=\\frac{1}{2}AB-\\frac{1}{4}AB=2∴\\frac{1}{4}AB=2∴AB=8'}]}
78%|███████▊ | 17192/22095 [29:26:55<53:40:06, 39.41s/it] {'loss': 0.3252, 'grad_norm': 0.8439057856638054, 'learning_rate': 1.2371788720461802e-06, 'epoch': 0.78}
78%|███████▊ | 17193/22095 [29:27:16<46:21:54, 34.05s/it] {'loss': 0.229, 'grad_norm': 1.1001434994464, 'learning_rate': 1.2366962701615431e-06, 'epoch': 0.78}
78%|███████▊ | 17194/22095 [29:27:19<33:35:45, 24.68s/it] {'loss': 0.2615, 'grad_norm': 0.5700448027566635, 'learning_rate': 1.2362137491387433e-06, 'epoch': 0.78}
78%|███████▊ | 17195/22095 [29:27:40<31:58:07, 23.49s/it] {'loss': 0.2852, 'grad_norm': 0.6436858853915678, 'learning_rate': 1.2357313089881524e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8957885 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8720, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 10\nB. 5\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
78%|███████▊ | 17196/22095 [29:28:01<31:07:39, 22.87s/it] {'loss': 0.3068, 'grad_norm': 0.6936044601347898, 'learning_rate': 1.235248949720133e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (98340 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17197/22095 [29:28:05<23:07:17, 16.99s/it] {'loss': 0.3152, 'grad_norm': 0.6631834829042466, 'learning_rate': 1.2347666713450524e-06, 'epoch': 0.78}
78%|███████▊ | 17198/22095 [29:29:03<39:52:05, 29.31s/it] {'loss': 0.3193, 'grad_norm': 0.6094239192280482, 'learning_rate': 1.2342844738732724e-06, 'epoch': 0.78}
78%|███████▊ | 17199/22095 [29:29:46<45:28:34, 33.44s/it] {'loss': 0.3004, 'grad_norm': 0.6329941486039516, 'learning_rate': 1.2338023573151514e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
78%|███████▊ | 17200/22095 [29:30:10<41:30:57, 30.53s/it] {'loss': 0.2855, 'grad_norm': 0.6459641698773567, 'learning_rate': 1.2333203216810514e-06, 'epoch': 0.78}
78%|███████▊ | 17201/22095 [29:30:12<30:11:17, 22.21s/it] {'loss': 0.3281, 'grad_norm': 0.615623943776746, 'learning_rate': 1.2328383669813304e-06, 'epoch': 0.78}
78%|███████▊ | 17202/22095 [29:30:16<22:26:50, 16.52s/it] {'loss': 0.2979, 'grad_norm': 1.4465718273959651, 'learning_rate': 1.2323564932263428e-06, 'epoch': 0.78}
78%|███████▊ | 17203/22095 [29:30:37<24:38:52, 18.14s/it] {'loss': 0.2778, 'grad_norm': 0.6687764650944367, 'learning_rate': 1.2318747004264414e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (47981 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48019 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (139648 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89970 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41169 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76855 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17204/22095 [29:31:17<33:13:34, 24.46s/it] {'loss': 0.2385, 'grad_norm': 0.6070489391281674, 'learning_rate': 1.2313929885919796e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17205/22095 [29:31:26<26:56:12, 19.83s/it] {'loss': 0.4846, 'grad_norm': 0.2771610215799396, 'learning_rate': 1.2309113577333098e-06, 'epoch': 0.78}
78%|███████▊ | 17206/22095 [29:31:29<20:08:09, 14.83s/it] {'loss': 0.2521, 'grad_norm': 0.6130620834328264, 'learning_rate': 1.230429807860779e-06, 'epoch': 0.78}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
78%|███████▊ | 17207/22095 [29:31:53<23:55:13, 17.62s/it] {'loss': 0.3027, 'grad_norm': 0.6405205580826229, 'learning_rate': 1.2299483389847328e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (62467 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (121725 > 40960).
Running this sequence through the model will result in indexing errors
78%|███████▊ | 17208/22095 [29:32:54<41:31:13, 30.59s/it] {'loss': 0.2639, 'grad_norm': 0.5343316133104158, 'learning_rate': 1.2294669511155193e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914367 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37520, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nA. 8\nB. 16\nC. 2\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11303948 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 770, 'image': 'airplane_app/649.png', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这是什么'}, {'from': 'gpt', 'value': '这是一份关于航旅纵横应用软件的用户体验报告,涉及到用户分析和软件使用趋势的图表分析。以下是报告的详细内容:\n\n1. **报告标题**:航旅纵横产品体验报告:一手好牌怎么打?\n\n2. **趋势图表**:\n - 图表展示了2011年1月1日至2015年11月16日的整体趋势、PC趋势和移动趋势。\n - 数据包括搜索指数,提供了航旅纵横、航班管家、飞常准等关键词在不同年份的热度变化情况。\n\n3. **体验环境**:\n - **软件版本**:V3.3.2\n - **手机型号**:iPhone 4S\n - **操作系统版本**:iOS 6.1.3\n\n4. **用户分析**:\n - **分析时间段**:从百度指数10月30日至11月5日的数据来看,北京、上海、广东的人民对航旅纵横的关注程度位居前三甲。\n - **年龄分布**:30-39岁的人群对航旅纵横的关注度最高,占57%;其次是20-29岁的人群,占27%。\n - **性别分布**:其中男性占88%,女性占12%。\n\n5. **用户群分类**:\n - 航旅纵横的用户群大致可分为乘机人和关注者两类。\n - 乘机人可细分为商旅出差人士和休闲旅游人士,关注者可细分为接机人、乘机人亲友等。\n\n总结来看,该报告通过用户体验和数据分析,详细描述了航旅纵横在不同用户群体中的关注度和使用趋势。'}]}
78%|███████▊ | 17209/22095 [29:34:38<71:26:36, 52.64s/it] {'loss': 0.2932, 'grad_norm': 0.6271906841197386, 'learning_rate': 1.2289856442634796e-06, 'epoch': 0.78}
78%|███████▊ | 17210/22095 [29:34:42<51:31:51, 37.98s/it] {'loss': 0.291, 'grad_norm': 0.6349351163411207, 'learning_rate': 1.2285044184389578e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (99030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87199 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17211/22095 [29:36:01<68:22:07, 50.39s/it] {'loss': 0.3204, 'grad_norm': 0.6329715545668408, 'learning_rate': 1.2280232736522928e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17212/22095 [29:36:33<60:40:39, 44.73s/it] {'loss': 0.4897, 'grad_norm': 0.2753888188223901, 'learning_rate': 1.2275422099138213e-06, 'epoch': 0.78}
78%|███████▊ | 17213/22095 [29:37:12<58:39:40, 43.26s/it] {'loss': 0.345, 'grad_norm': 0.6631362683205497, 'learning_rate': 1.2270612272338816e-06, 'epoch': 0.78}
78%|███████▊ | 17214/22095 [29:37:52<57:07:33, 42.13s/it] {'loss': 0.3039, 'grad_norm': 0.6205185522767497, 'learning_rate': 1.2265803256228103e-06, 'epoch': 0.78}
78%|███████▊ | 17215/22095 [29:39:10<71:35:49, 52.82s/it] {'loss': 0.2761, 'grad_norm': 0.607641422679698, 'learning_rate': 1.226099505090938e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17216/22095 [29:39:18<53:40:05, 39.60s/it] {'loss': 0.4918, 'grad_norm': 0.2766637285628353, 'learning_rate': 1.2256187656485957e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
78%|███████▊ | 17217/22095 [29:39:39<46:03:29, 33.99s/it] {'loss': 0.2883, 'grad_norm': 0.6734382404378367, 'learning_rate': 1.2251381073061137e-06, 'epoch': 0.78}
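The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings above mean some serialized conversations exceed the model's 40960-token context; the tokenizer only warns, and the over-long sequence still reaches the model. A minimal pre-filter sketch follows; the `filter_overlong` helper, the `conversations`/`value` layout (taken from the sample dumps in this log), and the injected `tokenize` callable are illustrative assumptions, not this repository's actual API:

```python
MAX_LEN = 40960  # context limit reported by the warnings above


def filter_overlong(samples, tokenize, max_len=MAX_LEN):
    """Drop samples whose serialized conversation exceeds the model context.

    `samples` follows the {'conversations': [{'from': ..., 'value': ...}]}
    layout seen in the log dumps; `tokenize` maps text -> list of token ids.
    Sketch only -- a real pipeline would also account for image tokens and
    chat-template overhead.
    """
    kept = []
    for sample in samples:
        text = "\n".join(turn["value"] for turn in sample["conversations"])
        if len(tokenize(text)) <= max_len:
            kept.append(sample)
    return kept
```

With a real tokenizer, `tokenize` could be something like `lambda t: tokenizer(t, add_special_tokens=False)["input_ids"]`; running such a pass once over the dataset manifest would surface the offending samples before a 29-hour training run instead of mid-epoch.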
78%|███████▊ | 17218/22095 [29:39:43<33:34:09, 24.78s/it] {'loss': 0.293, 'grad_norm': 0.5843294729100404, 'learning_rate': 1.2246575300738234e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17219/22095 [29:39:52<27:25:03, 20.24s/it] {'loss': 0.4569, 'grad_norm': 0.29133487415654197, 'learning_rate': 1.2241770339620446e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [678, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8507071 in VC:s3://internvl-moe-sft-data/. Exception: Image size [678, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 142458, 'image': 'vrdu_texteq/astro-ph.CO/ce55db9c-49cc-4ab0-b6d2-2bc5a6ae77ad.png', 'image_wh': [[678, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'Sources detected at $>$3$\\sigma$ in at least one out of the three'}]}
78%|███████▊ | 17220/22095 [29:39:56<20:45:10, 15.33s/it] {'loss': 0.3347, 'grad_norm': 0.6515322371384752, 'learning_rate': 1.2236966189811045e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396927 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63780, 'image': 'vrdu_table_final_2/astro-ph.EP/a2feaf88-f6df-432e-b14f-22bdd56475bb.png', 'image_wh': [[20, 14]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}x\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
78%|███████▊ | 17221/22095 [29:40:36<30:52:47, 22.81s/it] {'loss': 0.3264, 'grad_norm': 0.6285603746441605, 'learning_rate': 1.2232162851413282e-06, 'epoch': 0.78}
78%|███████▊ | 17222/22095 [29:41:00<30:59:47, 22.90s/it] {'loss': 0.3157, 'grad_norm': 0.614769755240032, 'learning_rate': 1.2227360324530335e-06, 'epoch': 0.78}
78%|███████▊ | 17223/22095 [29:41:04<23:19:01, 17.23s/it] {'loss': 0.3023, 'grad_norm': 0.5887134813600734, 'learning_rate': 1.2222558609265394e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8897250 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20403, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D为CB段中点,Cd=3,AB=11,则AC长度为()\nA. 8\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
78%|███████▊ | 17224/22095 [29:41:25<24:51:12, 18.37s/it] {'loss': 0.2957, 'grad_norm': 0.6053064454054246, 'learning_rate': 1.2217757705721662e-06, 'epoch': 0.78}
78%|███████▊ | 17225/22095 [29:42:05<33:47:22, 24.98s/it] {'loss': 0.2738, 'grad_norm': 0.6885769936709684, 'learning_rate': 1.2212957614002263e-06, 'epoch': 0.78}
78%|███████▊ | 17226/22095 [29:42:27<32:31:44, 24.05s/it] {'loss': 0.2744, 'grad_norm': 0.5805273307300288, 'learning_rate': 1.2208158334210363e-06, 'epoch': 0.78}
VC:s3://gui/aguvis/aguvis-stage2/mind2web/e7fbd3a3-d583-46b9-ad7e-3f7b765fc311-48650296-30f6-4c10-90bc-b65a4f8d92c1.jpg 2025-08-28 21:40:25.608983 load time: 1235.06 ms
78%|███████▊ | 17227/22095 [29:42:48<31:31:01, 23.31s/it] {'loss': 0.2786, 'grad_norm': 0.6466902830249935, 'learning_rate': 1.2203359866449073e-06, 'epoch': 0.78}
78%|███████▊ | 17228/22095 [29:43:29<38:25:08, 28.42s/it] {'loss': 0.3269, 'grad_norm': 0.645014514513907, 'learning_rate': 1.2198562210821474e-06, 'epoch': 0.78}
78%|███████▊ | 17229/22095 [29:43:50<35:28:56, 26.25s/it] {'loss': 0.2917, 'grad_norm': 0.6807809877605372, 'learning_rate': 1.2193765367430683e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17230/22095 [29:43:57<27:49:21, 20.59s/it] {'loss': 0.4527, 'grad_norm': 0.2872760545323733, 'learning_rate': 1.2188969336379775e-06, 'epoch': 0.78}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/rico/dataset/image/62.jpg 2025-08-28 21:41:56.092170 load time: 1046.56 ms
VC:s3://gui-agent/data_20250612/web/images/yang_0611215436/10_140_52_49_0611220710/img/16.png 2025-08-28 21:41:56.092191 load time: 1044.68 ms
78%|███████▊ | 17231/22095 [29:44:19<28:18:41, 20.95s/it] {'loss': 0.322, 'grad_norm': 0.5980345291753337, 'learning_rate': 1.2184174117771786e-06, 'epoch': 0.78}
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_181481.png 2025-08-28 21:42:17.899333 load time: 1050.21 ms
VC:s3://gui-agent/data_20250609/windows/images/autocad/20250508_132635_5/images/before_screenshot_60_concat_right.png 2025-08-28 21:42:17.897243 load time: 1041.52 ms
78%|███████▊ | 17232/22095 [29:45:00<36:20:41, 26.91s/it] {'loss': 0.2945, 'grad_norm': 0.6540469404760676, 'learning_rate': 1.2179379711709738e-06, 'epoch': 0.78}
78%|███████▊ | 17233/22095 [29:45:42<42:39:33, 31.59s/it] {'loss': 0.3381, 'grad_norm': 0.6760927496791808, 'learning_rate': 1.2174586118296665e-06, 'epoch': 0.78}
78%|███████▊ | 17234/22095 [29:46:05<38:55:00, 28.82s/it] {'loss': 0.2636, 'grad_norm': 0.9163450526486921, 'learning_rate': 1.2169793337635577e-06, 'epoch': 0.78}
78%|███████▊ | 17235/22095 [29:46:27<36:17:32, 26.88s/it] {'loss': 0.2601, 'grad_norm': 0.607433315490523, 'learning_rate': 1.2165001369829442e-06, 'epoch': 0.78}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240827_144649_before_screenshot_sub0.png 2025-08-28 21:44:25.941247 load time: 1020.42 ms
78%|███████▊ | 17236/22095 [29:47:06<41:10:56, 30.51s/it] {'loss': 0.295, 'grad_norm': 0.6189073530057309, 'learning_rate': 1.2160210214981217e-06, 'epoch': 0.78}
VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_47549.png 2025-08-28 21:45:04.919457 load time: 1053.29 ms
78%|███████▊ | 17237/22095 [29:47:29<38:05:07, 28.22s/it] {'loss': 0.329, 'grad_norm': 0.6835413411402755, 'learning_rate': 1.215541987319387e-06, 'epoch': 0.78}
78%|███████▊ | 17238/22095 [29:47:32<27:49:58, 20.63s/it] {'loss': 0.2517, 'grad_norm': 0.566895279509252, 'learning_rate': 1.2150630344570301e-06, 'epoch': 0.78}
78%|███████▊ | 17239/22095 [29:47:36<21:00:34, 15.58s/it] {'loss': 0.3313, 'grad_norm': 0.6177703234752498, 'learning_rate': 1.2145841629213462e-06, 'epoch': 0.78}
78%|███████▊ | 17240/22095 [29:47:39<15:53:57, 11.79s/it] {'loss': 0.2414, 'grad_norm': 0.6494082399900752, 'learning_rate': 1.2141053727226222e-06, 'epoch': 0.78}
78%|███████▊ | 17241/22095 [29:47:41<12:15:25, 9.09s/it] {'loss': 0.2737, 'grad_norm': 0.6395650775518928, 'learning_rate': 1.2136266638711452e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (46252 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49075 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43871 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97688 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17242/22095 [29:48:04<17:30:27, 12.99s/it] {'loss': 0.3063, 'grad_norm': 0.6178599277507637, 'learning_rate': 1.2131480363772018e-06, 'epoch': 0.78}
78%|███████▊ | 17243/22095 [29:48:07<13:27:48, 9.99s/it] {'loss': 0.3056, 'grad_norm': 0.6273313869636293, 'learning_rate': 1.2126694902510783e-06, 'epoch': 0.78}
78%|███████▊ | 17244/22095 [29:48:10<10:48:08, 8.02s/it] {'loss': 0.2808, 'grad_norm': 1.1095679829939256, 'learning_rate': 1.2121910255030556e-06, 'epoch': 0.78}
78%|███████▊ | 17245/22095 [29:48:31<15:59:35, 11.87s/it] {'loss': 0.2703, 'grad_norm': 0.5816461982748863, 'learning_rate': 1.2117126421434127e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (45999 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69476 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70668 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17246/22095 [29:48:35<12:51:26, 9.55s/it] {'loss': 0.3267, 'grad_norm': 0.6286741517081604, 'learning_rate': 1.2112343401824306e-06, 'epoch': 0.78}
78%|███████▊ | 17247/22095 [29:48:39<10:36:32, 7.88s/it] {'loss': 0.3199, 'grad_norm': 0.6109071921166503, 'learning_rate': 1.2107561196303874e-06, 'epoch': 0.78}
78%|███████▊ | 17248/22095 [29:49:25<25:59:06, 19.30s/it] {'loss': 0.321, 'grad_norm': 0.680967226262785, 'learning_rate': 1.2102779804975574e-06, 'epoch': 0.78}
78%|███████▊ | 17249/22095 [29:49:46<26:40:27, 19.82s/it] {'loss': 0.3042, 'grad_norm': 0.6266455172496049, 'learning_rate': 1.209799227942213e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17250/22095 [29:49:55<22:28:37, 16.70s/it] {'loss': 0.4413, 'grad_norm': 0.2809296741128752, 'learning_rate': 1.2093219465306289e-06, 'epoch': 0.78}
78%|███████▊ | 17251/22095 [29:49:59<17:03:42, 12.68s/it] {'loss': 0.3074, 'grad_norm': 0.6475967871995113, 'learning_rate': 1.2088440517170729e-06, 'epoch': 0.78}
78%|███████▊ | 17252/22095 [29:50:02<13:09:16, 9.78s/it] {'loss': 0.2676, 'grad_norm': 0.5713740728066894, 'learning_rate': 1.2083662383638156e-06, 'epoch': 0.78}
78%|███████▊ | 17253/22095 [29:50:06<10:49:17, 8.05s/it] {'loss': 0.28, 'grad_norm': 0.5782149452659897, 'learning_rate': 1.207888506481123e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (51225 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44677 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73541 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17254/22095 [29:50:15<11:27:00, 8.51s/it] {'loss': 0.4498, 'grad_norm': 0.2789851147253102, 'learning_rate': 1.2074108560792586e-06, 'epoch': 0.78}
78%|███████▊ | 17255/22095 [29:50:37<16:42:08, 12.42s/it] {'loss': 0.2967, 'grad_norm': 0.6078179929177168, 'learning_rate': 1.2069332871684875e-06, 'epoch': 0.78}
78%|███████▊ | 17256/22095 [29:51:00<20:51:10, 15.51s/it] {'loss': 0.2738, 'grad_norm': 0.5688470930722148, 'learning_rate': 1.2064557997590697e-06, 'epoch': 0.78}
78%|███████▊ | 17257/22095 [29:51:02<15:46:47, 11.74s/it] {'loss': 0.2825, 'grad_norm': 0.5808970913291032, 'learning_rate': 1.2059783938612674e-06, 'epoch': 0.78}
78%|███████▊ | 17258/22095 [29:51:27<20:59:36, 15.62s/it] {'loss': 0.2928, 'grad_norm': 0.60264982018552, 'learning_rate': 1.2055010694853347e-06, 'epoch': 0.78}
78%|███████▊ | 17259/22095 [29:51:30<15:57:25, 11.88s/it] {'loss': 0.3223, 'grad_norm': 0.630800348885263, 'learning_rate': 1.2050238266415325e-06, 'epoch': 0.78}
78%|███████▊ | 17260/22095 [29:51:33<12:25:11, 9.25s/it] {'loss': 0.2849, 'grad_norm': 0.6324553607490628, 'learning_rate': 1.2045466653401122e-06, 'epoch': 0.78}
78%|███████▊ | 17261/22095 [29:51:37<10:18:39, 7.68s/it] {'loss': 0.2833, 'grad_norm': 0.6422699549315195, 'learning_rate': 1.204069585591326e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17262/22095 [29:51:47<11:08:21, 8.30s/it] {'loss': 0.437, 'grad_norm': 0.25655613677725086, 'learning_rate': 1.203592587405426e-06, 'epoch': 0.78}
78%|███████▊ | 17263/22095 [29:52:13<18:05:17, 13.48s/it] {'loss': 0.3006, 'grad_norm': 0.5916781393995728, 'learning_rate': 1.2031156707926632e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (134799 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17264/22095 [29:52:16<13:52:52, 10.34s/it] {'loss': 0.2649, 'grad_norm': 0.5927202272252555, 'learning_rate': 1.2026388357632835e-06, 'epoch': 0.78}
78%|███████▊ | 17265/22095 [29:52:19<11:10:20, 8.33s/it] {'loss': 0.2859, 'grad_norm': 0.6134259430364656, 'learning_rate': 1.202162082327531e-06, 'epoch': 0.78}
78%|███████▊ | 17266/22095 [29:52:22<9:01:51, 6.73s/it] {'loss': 0.289, 'grad_norm': 0.598847689996696, 'learning_rate': 1.2016854104956522e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (88995 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58714 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17267/22095 [29:52:26<7:43:16, 5.76s/it] {'loss': 0.2622, 'grad_norm': 0.6069934691634431, 'learning_rate': 1.201208820277887e-06, 'epoch': 0.78}
78%|███████▊ | 17268/22095 [29:52:29<6:42:00, 5.00s/it] {'loss': 0.2764, 'grad_norm': 0.6079465311713511, 'learning_rate': 1.2007323116844789e-06, 'epoch': 0.78}
78%|███████▊ | 17269/22095 [29:52:32<5:52:34, 4.38s/it] {'loss': 0.2586, 'grad_norm': 0.5959024824792365, 'learning_rate': 1.2002558847256652e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (113136 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86608 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57119 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104738 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119957 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17270/22095 [29:52:36<5:42:44, 4.26s/it] {'loss': 0.3159, 'grad_norm': 0.6086991031251796, 'learning_rate': 1.1997795394116802e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (123813 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55884 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17271/22095 [29:52:39<5:07:30, 3.82s/it] {'loss': 0.3057, 'grad_norm': 0.6123399712260394, 'learning_rate': 1.1993032757527618e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17272/22095 [29:53:10<15:56:30, 11.90s/it] {'loss': 0.4743, 'grad_norm': 0.29205414473219377, 'learning_rate': 1.1988270937591446e-06, 'epoch': 0.78}
78%|███████▊ | 17273/22095 [29:53:13<12:42:21, 9.49s/it] {'loss': 0.2693, 'grad_norm': 0.5817875667486778, 'learning_rate': 1.1983509934410586e-06, 'epoch': 0.78}
78%|███████▊ | 17274/22095 [29:53:17<10:13:50, 7.64s/it] {'loss': 0.2755, 'grad_norm': 0.6029411095590721, 'learning_rate': 1.1978749748087325e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/data_20250612/web/images/yang_0527174255/10_140_52_49_0527210823/img/6.png 2025-08-28 21:51:15.507969 load time: 1032.87 ms
VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_153413.png 2025-08-28 21:51:15.507837 load time: 1049.09 ms
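The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)" warnings above mean some samples tokenize past the 40960-token context window. A minimal screening sketch, assuming the 40960 limit from the log (the helper names are illustrative, not part of the training code):

```python
MAX_LEN = 40960  # model context limit reported in the warnings above


def fits_context(token_ids: list[int], max_len: int = MAX_LEN) -> bool:
    """True if the tokenized sample fits the model's maximum sequence length."""
    return len(token_ids) <= max_len


def truncate_to_context(token_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Fallback: hard-truncate over-long sequences instead of risking indexing errors."""
    return token_ids[:max_len]


# A 46252-token sample (one of the lengths flagged in the log) is caught:
ids = list(range(46252))
print(fits_context(ids))              # False
print(len(truncate_to_context(ids)))  # 40960
```

Whether to drop or truncate such samples is a judgment call: truncation keeps the data but can cut a conversation (or its image tokens) mid-turn, so dropping is usually safer for multimodal samples.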
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/9b1537dfec4a3e3611d847145502a1eba02e4ed131ada0e79712a9e868127696.png 2025-08-28 21:51:15.509657 load time: 1060.57 ms
78%|███████▊ | 17275/22095 [29:53:27<11:11:02, 8.35s/it] {'loss': 0.4454, 'grad_norm': 0.25139198220891606, 'learning_rate': 1.1973990378723954e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
78%|███████▊ | 17276/22095 [29:53:31<9:35:14, 7.16s/it] {'loss': 0.2676, 'grad_norm': 1.0187806068919762, 'learning_rate': 1.1969231826422762e-06, 'epoch': 0.78}
78%|███████▊ | 17277/22095 [29:53:35<8:24:28, 6.28s/it] {'loss': 0.2568, 'grad_norm': 0.5490177076170151, 'learning_rate': 1.1964474091285976e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17278/22095 [29:53:47<10:42:03, 8.00s/it] {'loss': 0.4767, 'grad_norm': 0.2790758147332501, 'learning_rate': 1.1959717173415807e-06, 'epoch': 0.78}
78%|███████▊ | 17279/22095 [29:53:57<11:28:18, 8.58s/it] {'loss': 0.4783, 'grad_norm': 0.27360207391679175, 'learning_rate': 1.19549610729145e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17280/22095 [29:54:02<9:57:17, 7.44s/it] {'loss': 0.2921, 'grad_norm': 0.6240491325841333, 'learning_rate': 1.1950205789884217e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (41071 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89987 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17281/22095 [29:54:06<8:23:57, 6.28s/it] {'loss': 0.2819, 'grad_norm': 0.5984853106488534, 'learning_rate': 1.1945451324427166e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (46301 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17282/22095 [29:54:09<7:10:04, 5.36s/it] {'loss': 0.2744, 'grad_norm': 0.5494188922806639, 'learning_rate': 1.194069767664549e-06, 'epoch': 0.78}
78%|███████▊ | 17283/22095 [29:54:12<6:15:30, 4.68s/it] {'loss': 0.2958, 'grad_norm': 0.5946047717941394, 'learning_rate': 1.1935944846641318e-06, 'epoch': 0.78}
78%|███████▊ | 17284/22095 [29:54:15<5:30:17, 4.12s/it] {'loss': 0.298, 'grad_norm': 0.6009843995686562, 'learning_rate': 1.1931192834516787e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (65938 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17285/22095 [29:54:19<5:30:04, 4.12s/it] {'loss': 0.2896, 'grad_norm': 0.5980249673659658, 'learning_rate': 1.1926441640374015e-06, 'epoch': 0.78}
78%|███████▊ | 17286/22095 [29:54:22<5:04:13, 3.80s/it] {'loss': 0.2637, 'grad_norm': 1.0807125893259095, 'learning_rate': 1.1921691264315078e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348839 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15509, 'image': 'vrdu_table_final_2/astro-ph.CO/babc92ff-3404-4d0d-a57f-86bc0385bb48.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
78%|███████▊ | 17287/22095 [29:54:25<4:48:06, 3.60s/it] {'loss': 0.304, 'grad_norm': 0.6771671739260511, 'learning_rate': 1.191694170644203e-06, 'epoch': 0.78}
78%|███████▊ | 17288/22095 [29:54:28<4:37:43, 3.47s/it] {'loss': 0.3074, 'grad_norm': 0.6605418925307608, 'learning_rate': 1.191219296685696e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17289/22095 [29:54:38<7:07:46, 5.34s/it] {'loss': 0.5121, 'grad_norm': 0.2795546897923235, 'learning_rate': 1.1907445045661885e-06, 'epoch': 0.78}
78%|███████▊ | 17290/22095 [29:54:47<8:44:45, 6.55s/it] {'loss': 0.4615, 'grad_norm': 0.2621604055121253, 'learning_rate': 1.1902697942958806e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17291/22095 [29:54:52<8:05:28, 6.06s/it] {'loss': 0.2757, 'grad_norm': 0.5772421413586936, 'learning_rate': 1.189795165884975e-06, 'epoch': 0.78}
78%|███████▊ | 17292/22095 [29:54:55<6:55:29, 5.19s/it] {'loss': 0.299, 'grad_norm': 0.7082060921982903, 'learning_rate': 1.1893206193436696e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17293/22095 [29:55:02<7:40:14, 5.75s/it] {'loss': 0.467, 'grad_norm': 0.2718666477866807, 'learning_rate': 1.188846154682161e-06, 'epoch': 0.78}
78%|███████▊ | 17294/22095 [29:55:06<6:37:27, 4.97s/it] {'loss': 0.3106, 'grad_norm': 0.5893728826172965, 'learning_rate': 1.1883717719106419e-06, 'epoch': 0.78}
78%|███████▊ | 17295/22095 [29:55:08<5:46:55, 4.34s/it] {'loss': 0.2885, 'grad_norm': 0.773709814014488, 'learning_rate': 1.1878974710393082e-06, 'epoch': 0.78}
78%|███████▊ | 17296/22095 [29:55:11<5:15:01, 3.94s/it] {'loss': 0.2649, 'grad_norm': 0.6151307002494818, 'learning_rate': 1.1874232520783486e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [328, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8454136 in VC:s3://internvl-moe-sft-data/. Exception: Image size [328, 25, 100, 100] is too small. Minimum size is 28.
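The "Rank 0: Number of image tokens ... does not match number of images ..." messages in this log show the loader repairing conversations whose image placeholder count disagrees with the number of attached images. A minimal sketch of such a repair, assuming a literal `<image>` placeholder (the actual tag is not visible here because angle-bracket tokens were stripped from the log) and an illustrative helper name:

```python
PLACEHOLDER = "<image>"  # assumed placeholder token; the literal tag is stripped in this log


def fix_image_tokens(msg: str, num_images: int) -> str:
    """Ensure the user turn carries exactly one placeholder per attached image."""
    found = msg.count(PLACEHOLDER)
    if found < num_images:
        # Prepend the missing placeholders, one per line, as a conservative repair.
        msg = (PLACEHOLDER + "\n") * (num_images - found) + msg
    elif found > num_images:
        # Drop surplus placeholders, starting from the last occurrence.
        for _ in range(found - num_images):
            i = msg.rfind(PLACEHOLDER)
            msg = msg[:i] + msg[i + len(PLACEHOLDER):]
    return msg


fixed = fix_image_tokens("Describe the screenshot.", 1)
print(fixed.count(PLACEHOLDER))  # 1
```

Covering both directions matters here: the log shows mismatches of 0-vs-1 (missing placeholder) as well as 2-vs-1 (surplus placeholder) on the same run.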
Problematic sample: {'id': 120388, 'image': 'vrdu_texteq/astro-ph.CO/4913c4a8-de07-446f-90db-11f25d52bbc8.png', 'image_wh': [[328, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where $V_\\text{rms}$ is estimated as:'}]} 78%|███████▊ | 17297/22095 [29:55:15<4:53:51, 3.67s/it] {'loss': 0.2729, 'grad_norm': 0.6510029108292334, 'learning_rate': 1.1869491150379553e-06, 'epoch': 0.78} 78%|███████▊ | 17297/22095 [29:55:15<4:53:51, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49907 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83122 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17298/22095 [29:55:18<4:40:54, 3.51s/it] {'loss': 0.2947, 'grad_norm': 0.6532810746624844, 'learning_rate': 1.1864750599283132e-06, 'epoch': 0.78} 78%|███████▊ | 17298/22095 [29:55:18<4:40:54, 3.51s/it] 78%|███████▊ | 17299/22095 [29:55:21<4:31:30, 3.40s/it] {'loss': 0.3081, 'grad_norm': 0.6156139091235419, 'learning_rate': 1.1860010867596112e-06, 'epoch': 0.78} 78%|███████▊ | 17299/22095 [29:55:21<4:31:30, 3.40s/it] 78%|███████▊ | 17300/22095 [29:55:24<4:21:17, 3.27s/it] {'loss': 0.3007, 'grad_norm': 0.636874452229096, 'learning_rate': 1.1855271955420306e-06, 'epoch': 0.78} 78%|███████▊ | 17300/22095 [29:55:24<4:21:17, 3.27s/it] 78%|███████▊ | 17301/22095 [29:55:27<4:17:53, 3.23s/it] {'loss': 0.2929, 'grad_norm': 0.6506900226776479, 'learning_rate': 1.1850533862857567e-06, 'epoch': 0.78} 78%|███████▊ | 17301/22095 [29:55:27<4:17:53, 3.23s/it] 78%|███████▊ | 17302/22095 [29:55:31<4:30:07, 3.38s/it] {'loss': 0.3279, 'grad_norm': 0.645473079213239, 'learning_rate': 1.1845796590009684e-06, 'epoch': 0.78} 78%|███████▊ | 17302/22095 [29:55:31<4:30:07, 3.38s/it]Traceback (most recent 
call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [371, 27, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047166 in VC:s3://multi-modal/UniGeo/. Exception: Image size [371, 27, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5024.png', 'image_wh': [[371, 27]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上一点,点P是AC的中点,点Q是BC的中点,已知线段AC=8cm,线段BC=4cm,则线段PQ为()\nA. 2cm\nB. 4cm\nC. 6cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 78%|███████▊ | 17303/22095 [29:55:34<4:20:36, 3.26s/it] {'loss': 0.2731, 'grad_norm': 0.6372074891495952, 'learning_rate': 1.1841060136978443e-06, 'epoch': 0.78} 78%|███████▊ | 17303/22095 [29:55:34<4:20:36, 3.26s/it] 78%|███████▊ | 17304/22095 [29:55:38<4:46:26, 3.59s/it] {'loss': 0.2679, 'grad_norm': 0.560182761230861, 'learning_rate': 1.183632450386562e-06, 'epoch': 0.78} 78%|███████▊ | 17304/22095 [29:55:38<4:46:26, 3.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51527 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57229 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17305/22095 [29:55:42<4:51:18, 3.65s/it] {'loss': 0.2748, 'grad_norm': 0.6097224018444345, 'learning_rate': 1.1831589690772988e-06, 'epoch': 0.78} 78%|███████▊ | 17305/22095 [29:55:42<4:51:18, 3.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 78%|███████▊ | 17306/22095 [29:55:51<7:11:14, 5.40s/it] {'loss': 0.4553, 'grad_norm': 0.2578450905562074, 'learning_rate': 1.1826855697802264e-06, 'epoch': 0.78} 78%|███████▊ | 17306/22095 [29:55:51<7:11:14, 5.40s/it] 78%|███████▊ | 17307/22095 [29:55:55<6:24:40, 4.82s/it] {'loss': 0.3061, 'grad_norm': 0.6094006990832728, 'learning_rate': 1.1822122525055163e-06, 'epoch': 0.78} 78%|███████▊ | 17307/22095 [29:55:55<6:24:40, 4.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64011 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97036 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17308/22095 [29:55:58<5:52:07, 4.41s/it] {'loss': 0.2967, 'grad_norm': 0.6032761051944457, 'learning_rate': 1.1817390172633402e-06, 'epoch': 0.78} 78%|███████▊ | 17308/22095 [29:55:58<5:52:07, 4.41s/it] 78%|███████▊ | 17309/22095 [29:56:03<5:56:45, 4.47s/it] {'loss': 0.3423, 'grad_norm': 0.6550332039805137, 'learning_rate': 1.1812658640638653e-06, 'epoch': 0.78} 78%|███████▊ | 17309/22095 [29:56:03<5:56:45, 4.47s/it] 78%|███████▊ | 17310/22095 [29:56:06<5:17:33, 3.98s/it] {'loss': 0.2907, 'grad_norm': 0.6853703330648068, 'learning_rate': 1.180792792917259e-06, 'epoch': 0.78} 78%|███████▊ | 17310/22095 [29:56:06<5:17:33, 3.98s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 78%|███████▊ | 17311/22095 [29:56:09<4:52:04, 3.66s/it] {'loss': 0.2822, 'grad_norm': 0.6279731879623388, 'learning_rate': 1.1803198038336866e-06, 'epoch': 0.78} 78%|███████▊ | 17311/22095 [29:56:09<4:52:04, 3.66s/it] 78%|███████▊ | 17312/22095 [29:56:12<4:47:22, 3.60s/it] {'loss': 0.3088, 'grad_norm': 0.601357385134243, 'learning_rate': 1.1798468968233084e-06, 'epoch': 0.78} 78%|███████▊ | 17312/22095 [29:56:12<4:47:22, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 78%|███████▊ | 17313/22095 [29:56:21<6:47:31, 5.11s/it] {'loss': 0.4758, 'grad_norm': 0.27019008623071317, 'learning_rate': 1.179374071896288e-06, 'epoch': 0.78} 78%|███████▊ | 17313/22095 [29:56:21<6:47:31, 5.11s/it] 78%|███████▊ | 17314/22095 [29:56:24<6:06:50, 4.60s/it] {'loss': 0.265, 'grad_norm': 0.5989830670091586, 'learning_rate': 1.178901329062786e-06, 'epoch': 0.78} 78%|███████▊ | 17314/22095 [29:56:24<6:06:50, 4.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57770 > 
40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44210 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84529 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17315/22095 [29:56:28<5:49:07, 4.38s/it] {'loss': 0.2784, 'grad_norm': 0.6033427541616095, 'learning_rate': 1.1784286683329587e-06, 'epoch': 0.78} 78%|███████▊ | 17315/22095 [29:56:28<5:49:07, 4.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 78%|███████▊ | 17316/22095 [29:56:38<7:53:21, 5.94s/it] {'loss': 0.4445, 'grad_norm': 0.261811394535706, 'learning_rate': 1.1779560897169611e-06, 'epoch': 0.78} 78%|███████▊ | 17316/22095 [29:56:38<7:53:21, 5.94s/it] 78%|███████▊ | 17317/22095 [29:56:41<6:47:43, 5.12s/it] {'loss': 0.2336, 'grad_norm': 0.5609501858635176, 'learning_rate': 1.1774835932249485e-06, 'epoch': 0.78} 78%|███████▊ | 17317/22095 [29:56:41<6:47:43, 5.12s/it] 78%|███████▊ | 17318/22095 [29:56:44<6:01:35, 4.54s/it] {'loss': 0.2644, 'grad_norm': 0.6105280956973805, 'learning_rate': 1.1770111788670763e-06, 'epoch': 0.78} 78%|███████▊ | 17318/22095 [29:56:44<6:01:35, 4.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47627 > 40960). 
Running this sequence through the model will result in indexing errors 78%|███████▊ | 17319/22095 [29:56:47<5:25:30, 4.09s/it] {'loss': 0.2569, 'grad_norm': 0.6030722422612644, 'learning_rate': 1.1765388466534895e-06, 'epoch': 0.78} 78%|███████▊ | 17319/22095 [29:56:47<5:25:30, 4.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (125148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50777 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107001 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (144483 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72171 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101880 > 40960). Running this sequence through the model will result in indexing errors 78%|███████▊ | 17320/22095 [29:56:50<5:02:34, 3.80s/it] {'loss': 0.3065, 'grad_norm': 0.6215687069796155, 'learning_rate': 1.1760665965943402e-06, 'epoch': 0.78} 78%|███████▊ | 17320/22095 [29:56:50<5:02:34, 3.80s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (58003 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109782 > 40960). 
Running this sequence through the model will result in indexing errors
78%|███████▊ | 17321/22095 [29:56:53<4:52:22, 3.67s/it] {'loss': 0.2529, 'grad_norm': 0.612128101161067, 'learning_rate': 1.1755944286997766e-06, 'epoch': 0.78}
78%|███████▊ | 17322/22095 [29:56:57<4:47:05, 3.61s/it] {'loss': 0.2851, 'grad_norm': 0.5852209019344554, 'learning_rate': 1.175122342979943e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (44585 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53117 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17323/22095 [29:57:06<7:07:49, 5.38s/it] {'loss': 0.4541, 'grad_norm': 0.2576637423810731, 'learning_rate': 1.174650339444982e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (63765 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (40976 > 40960).
Running this sequence through the model will result in indexing errors
78%|███████▊ | 17324/22095 [29:57:10<6:25:24, 4.85s/it] {'loss': 0.2818, 'grad_norm': 0.5993989103121361, 'learning_rate': 1.1741784181050376e-06, 'epoch': 0.78}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
78%|███████▊ | 17325/22095 [29:57:14<5:56:41, 4.49s/it] {'loss': 0.2974, 'grad_norm': 0.6645192315616409, 'learning_rate': 1.1737065789702473e-06, 'epoch': 0.78}
78%|███████▊ | 17326/22095 [29:57:17<5:21:26, 4.04s/it] {'loss': 0.2586, 'grad_norm': 0.6193089282975309, 'learning_rate': 1.1732348220507529e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 1, but got module 364
78%|███████▊ | 17327/22095 [29:57:25<7:12:13, 5.44s/it] {'loss': 0.4753, 'grad_norm': 0.2807318580083904, 'learning_rate': 1.1727631473566875e-06, 'epoch': 0.78}
78%|███████▊ | 17328/22095 [29:57:35<8:47:59, 6.65s/it] {'loss': 0.4746, 'grad_norm': 0.28674929824535034, 'learning_rate': 1.1722915548981896e-06, 'epoch': 0.78}
78%|███████▊ | 17329/22095 [29:57:41<8:38:53, 6.53s/it] {'loss': 0.4742, 'grad_norm': 0.23822296288756592, 'learning_rate': 1.1718200446853877e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg =
sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1217, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8386730 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1217, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 53540, 'image': 'vrdu_table_final_2/astro-ph.CO/722f21f6-b82a-4daa-bc31-77b1ad5af33f.png', 'image_wh': [[1217, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{ccc}\n\\multicolumn{1}{c}{\\footnotesize * To be observed with Hubble Space Telescope (HST)/Space Telescope Imaging Spectrograph (STIS) during cycle 24.}\n\\end{tabular}\n```"}]}
78%|███████▊ | 17330/22095 [29:57:44<7:20:36, 5.55s/it] {'loss': 0.2915, 'grad_norm': 0.6216948625679023, 'learning_rate': 1.1713486167284183e-06, 'epoch': 0.78}
78%|███████▊ | 17331/22095 [29:57:52<8:17:00, 6.26s/it] {'loss': 0.4857, 'grad_norm': 0.2684113634588021, 'learning_rate': 1.1708772710374078e-06, 'epoch': 0.78}
78%|███████▊ | 17332/22095 [29:58:02<9:36:27, 7.26s/it] {'loss': 0.4521, 'grad_norm': 0.2603748463730575, 'learning_rate': 1.1704060076224827e-06, 'epoch': 0.78}
Invalidate trace cache @ step 2: expected module 364, but got module 1
78%|███████▊ | 17333/22095 [29:58:06<8:13:59, 6.22s/it] {'loss': 0.2767, 'grad_norm': 0.6779421800191525, 'learning_rate': 1.169934826493771e-06, 'epoch': 0.78}
78%|███████▊ | 17334/22095 [29:58:10<7:20:17, 5.55s/it] {'loss': 0.3272, 'grad_norm': 0.7134792482230873, 'learning_rate': 1.1694637276613985e-06, 'epoch': 0.78}
78%|███████▊ | 17335/22095 [29:58:14<6:41:00, 5.05s/it] {'loss': 0.2796, 'grad_norm': 0.6114288019983596, 'learning_rate': 1.168992711135486e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [342, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8482852 in VC:s3://internvl-moe-sft-data/. Exception: Image size [342, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 145690, 'image': 'vrdu_texteq/astro-ph.CO/8be17ca0-c7df-4ce3-94b4-00a2ed561bc3.png', 'image_wh': [[342, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'which has dimension $n\\times m$.'}]}
78%|███████▊ | 17336/22095 [29:58:17<5:57:40, 4.51s/it] {'loss': 0.2972, 'grad_norm': 0.6899832877009546, 'learning_rate': 1.1685217769261519e-06, 'epoch': 0.78}
78%|███████▊ | 17337/22095 [29:58:20<5:25:47, 4.11s/it] {'loss': 0.2816, 'grad_norm': 0.5679586106893408, 'learning_rate': 1.1680509250435195e-06, 'epoch': 0.78}
78%|███████▊ | 17338/22095 [29:58:23<5:06:22, 3.86s/it] {'loss': 0.3232, 'grad_norm': 0.6080534926128125, 'learning_rate': 1.1675801554977017e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (119889 > 40960).
Running this sequence through the model will result in indexing errors
78%|███████▊ | 17339/22095 [29:58:26<4:42:31, 3.56s/it] {'loss': 0.2831, 'grad_norm': 0.5998955037981734, 'learning_rate': 1.1671094682988182e-06, 'epoch': 0.78}
78%|███████▊ | 17340/22095 [29:58:29<4:26:38, 3.36s/it] {'loss': 0.2993, 'grad_norm': 0.5728524362999736, 'learning_rate': 1.1666388634569798e-06, 'epoch': 0.78}
78%|███████▊ | 17341/22095 [29:58:32<4:26:48, 3.37s/it] {'loss': 0.3322, 'grad_norm': 0.672290396497668, 'learning_rate': 1.1661683409822976e-06, 'epoch': 0.78}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified
    )
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess
    patches, image_grid_thw = self._preprocess(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess
    resized_height, resized_width = smart_resize(
  File
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize
    raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}")
ValueError: height:24 and width:1660 must be larger than factor:28
[Try #0] Failed to fetch sample 2210305 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:24 and width:1660 must be larger than factor:28
Problematic sample: {'image': '993930d933fd08dcb69c5d39e23e8664c9d8f9f42fe3808805d989b0e5022b9e.png', 'conversations': [{'from': 'human', 'value': "\nThe visual appearance of this Menu Item is as follows:\n\nVisual Description: The 'Version Control' text appears in white on a dark toolbar background, matching the macOS menu bar styling. The text uses a system font consistent with macOS design guidelines.\n\nRegarding the Menu Item's position:\nLocated in the main application menu bar between 'Run' and 'Tools' menu items. It is positioned in the upper portion of the window in a horizontal menu layout.\n\nUsage and purpose of this Menu Item:\nThis is a clickable menu item that opens a dropdown menu containing version control related options for managing source code and repository operations."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[349, 0, 423, 1000]]'}, {'from': 'human', 'value': 'This Menu Item allows users to:\nThis is a clickable menu item that opens a dropdown menu containing version control related options for managing source code and repository operations.'}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[349, 0, 423, 1000]]'}, {'from': 'human', 'value': "The Menu Item's visual characteristics are as follows:\nThe 'Version Control' text appears in white on a dark toolbar background, matching the macOS menu bar styling.
The text uses a system font consistent with macOS design guidelines.\n\nLocation details of the Menu Item:\nLocated in the main application menu bar between 'Run' and 'Tools' menu items. It is positioned in the upper portion of the window in a horizontal menu layout."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[349, 0, 423, 1000]]'}, {'from': 'human', 'value': "Here's where the Menu Item is located:\nLocated at the far right end of the main menu bar, after 'Window' and other system menu items. It's part of the primary horizontal navigation strip at the very top of the application window."}, {'from': 'gpt', 'value': '[[507, 0, 542, 1000]]'}, {'from': 'human', 'value': "Positional details of the Menu Item:\nLocated at the far right end of the main menu bar, after 'Window' and other system menu items. It's part of the primary horizontal navigation strip at the very top of the application window.\n\nRegarding the Menu Item's functionality:\nClicking this menu item reveals a dropdown menu containing help-related options, documentation, and support resources typical of macOS applications. It's a standard system menu that provides access to assistance features."}, {'from': 'gpt', 'value': '[[507, 0, 542, 1000]]'}, {'from': 'human', 'value': "This Menu Item element can be described as follows:\n\nVisual Description: The 'Help' menu item appears in the application's menu bar with white text on a dark gray/black background. It's a standard macOS menu bar item with clear typography and contrasting colors.\n\nThe Menu Item's placement in the interface:\nLocated at the far right end of the main menu bar, after 'Window' and other system menu items. It's part of the primary horizontal navigation strip at the very top of the application window."}, {'from': 'gpt', 'value': '[[507, 0, 542, 1000]]'}, {'from': 'human', 'value': "Spatial context of the Menu Item:\nLocated in the top menu bar between 'File' and 'View' menu items. 
It's positioned directly below the Apple menu icon and is part of the Visual Studio application's main menu structure.\n\nFunctional capabilities of the Menu Item:\nOpens a dropdown menu containing editing-related commands and operations when clicked. This is a standard menu item that provides access to common editing functions like cut, copy, paste, and other text manipulation operations."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[126, 0, 158, 1000]]'}, {'from': 'human', 'value': "Visually, this Menu Item can be described as:\nThe 'Edit' menu item appears in the main menu bar with white text on a dark background. It's rendered in the standard system font and follows macOS menu styling conventions."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[126, 0, 158, 1000]]'}, {'from': 'human', 'value': "The spatial layout of this Menu Item:\nLocated in the top menu bar between 'File' and 'View' menu items. It's positioned directly below the Apple menu icon and is part of the Visual Studio application's main menu structure."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[126, 0, 158, 1000]]'}, {'from': 'human', 'value': 'This Menu Bar is positioned as follows:\nLocated at the very top of the application window, spanning the full width of the screen. The menu items are arranged horizontally in a single row, with the Apple menu and application name anchored to the left side.\n\nThe functionality of this Menu Bar:\nServes as the main navigation menu bar for the Visual Studio application, providing access to all major application functions and features through dropdown menus. 
Each label represents a clickable menu that reveals additional options when selected.'}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}, {'from': 'human', 'value': "Here's what this Menu Bar looks like:\nA dark-themed menu bar with white text labels, displaying standard application menu items including 'File', 'Edit', 'View', 'Search', 'Project', 'Build', 'Run', 'Version Control', 'Tools', 'Window', and 'Help'. The leftmost item is an Apple logo, followed by 'Visual Studio' as the application name.\n\nIn terms of the Menu Bar's positioning:\nLocated at the very top of the application window, spanning the full width of the screen. The menu items are arranged horizontally in a single row, with the Apple menu and application name anchored to the left side.\n\nThe role of this Menu Bar:\nServes as the main navigation menu bar for the Visual Studio application, providing access to all major application functions and features through dropdown menus. Each label represents a clickable menu that reveals additional options when selected."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}, {'from': 'human', 'value': "The visual attributes of this Menu Bar are:\nA dark-themed menu bar with white text labels, displaying standard application menu items including 'File', 'Edit', 'View', 'Search', 'Project', 'Build', 'Run', 'Version Control', 'Tools', 'Window', and 'Help'. The leftmost item is an Apple logo, followed by 'Visual Studio' as the application name.\n\nPositional details of the Menu Bar:\nLocated at the very top of the application window, spanning the full width of the screen. The menu items are arranged horizontally in a single row, with the Apple menu and application name anchored to the left side."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}, {'from': 'human', 'value': "Here's a detailed visual description of the Menu Item:\nThe 'Tools' menu item appears in the main application menu bar with white text on a dark background. 
It matches the styling of other menu items like 'Build', 'Run', and 'Window'.\n\nThe Menu Item's intended function:\nWhen clicked, the Tools menu item reveals a dropdown menu containing various tool-related options and commands for the Visual Studio environment."}, {'from': 'gpt', 'value': '[[423, 0, 460, 1000]]'}, {'from': 'human', 'value': "The visual appearance of this Menu Item is as follows:\n\nVisual Description: The 'Tools' menu item appears in the main application menu bar with white text on a dark background. It matches the styling of other menu items like 'Build', 'Run', and 'Window'.\n\nLocation and arrangement of this Menu Item:\nLocated in the main menu bar at the top of the window, positioned between 'Version Control' and 'Window'. It's part of a horizontal list of menu options that spans the width of the application window."}, {'from': 'gpt', 'value': '[[423, 0, 460, 1000]]'}, {'from': 'human', 'value': "Visual features of the Menu Item:\nThe 'Tools' menu item appears in the main application menu bar with white text on a dark background. It matches the styling of other menu items like 'Build', 'Run', and 'Window'.\n\nThe spatial layout of this Menu Item:\nLocated in the main menu bar at the top of the window, positioned between 'Version Control' and 'Window'. It's part of a horizontal list of menu options that spans the width of the application window.\n\nThis Menu Item allows users to:\nWhen clicked, the Tools menu item reveals a dropdown menu containing various tool-related options and commands for the Visual Studio environment."}, {'from': 'gpt', 'value': '[[423, 0, 460, 1000]]'}, {'from': 'human', 'value': "Spatial context of the Menu Item:\nLocated in the main menu bar at the top of the window, positioned between 'Search' and 'Build' menu items. 
The menu item is sixth from the left in the horizontal menu bar sequence, following 'File', 'Edit', 'View', 'Search', and preceding 'Build', 'Run', 'Version Control'."}, {'from': 'gpt', 'value': '[[237, 0, 281, 1000]]'}, {'from': 'human', 'value': "Visual appearance details of the Menu Item:\nA dark gray tab labeled 'Project' in white text, appearing as part of the main application menu bar. The tab has a slightly darker background than its neighboring menu items, with clear typography that matches the macOS system font.\n\nThis Menu Item serves the following purpose:\nServes as a clickable menu item that reveals a dropdown menu containing project-related commands and options when clicked. Part of the primary navigation system for the IDE interface."}, {'from': 'gpt', 'value': '[[237, 0, 281, 1000]]'}, {'from': 'human', 'value': "Let me describe the visual characteristics of this Menu Item:\nA dark gray tab labeled 'Project' in white text, appearing as part of the main application menu bar. The tab has a slightly darker background than its neighboring menu items, with clear typography that matches the macOS system font.\n\nSpatial context of the Menu Item:\nLocated in the main menu bar at the top of the window, positioned between 'Search' and 'Build' menu items. The menu item is sixth from the left in the horizontal menu bar sequence, following 'File', 'Edit', 'View', 'Search', and preceding 'Build', 'Run', 'Version Control'.\n\nFunctional description of the Menu Item:\nServes as a clickable menu item that reveals a dropdown menu containing project-related commands and options when clicked. Part of the primary navigation system for the IDE interface."}, {'from': 'gpt', 'value': '[[237, 0, 281, 1000]]'}, {'from': 'human', 'value': "This Menu Item element can be described as follows:\n\nVisual Description: A 'Run' menu item in the main application menu bar, styled in the standard macOS menu appearance with white text on a dark background. 
The text appears in a system font typical of macOS menu items.\n\nThis Menu Item's purpose and usage:\nOpens a dropdown menu containing run-related commands and options for executing the application or code within Visual Studio."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[317, 0, 349, 1000]]'}, {'from': 'human', 'value': "The visual attributes of this Menu Item are:\nA 'Run' menu item in the main application menu bar, styled in the standard macOS menu appearance with white text on a dark background. The text appears in a system font typical of macOS menu items.\n\nLocation and arrangement of this Menu Item:\nLocated in the main menu bar between the 'Build' and 'Version Control' menu items, positioned in the upper portion of the application window. This menu item follows the standard macOS menu bar layout pattern.\n\nThis Menu Item allows users to:\nOpens a dropdown menu containing run-related commands and options for executing the application or code within Visual Studio."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[317, 0, 349, 1000]]'}, {'from': 'human', 'value': "Regarding the Menu Item's functionality:\nOpens a dropdown menu containing run-related commands and options for executing the application or code within Visual Studio."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[317, 0, 349, 1000]]'}, {'from': 'human', 'value': "Here's a detailed visual description of the Menu Bar:\nA dark-themed menu bar displaying primary navigation items including 'File', 'Edit', 'View', 'Search', 'Project', 'Build', 'Run', 'Version Control', 'Tools', 'Window', and 'Help'. The menu bar features white text on a dark gray background, following the standard macOS menu bar design pattern.\n\nLocation and arrangement of this Menu Bar:\nLocated at the very top of the application window, spanning the entire width. 
The Apple menu icon appears at the far left, followed by 'Visual Studio' and the rest of the menu items arranged horizontally from left to right.\n\nThe functionality of this Menu Bar:\nServes as the main navigation menu for the Visual Studio application, providing access to core functionality and features through dropdown menus. Users can click on any menu item to reveal additional options and commands."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[0, 0, 550, 1000]]'}, {'from': 'human', 'value': "Spatial context of the Menu Bar:\nLocated at the very top of the application window, spanning the entire width. The Apple menu icon appears at the far left, followed by 'Visual Studio' and the rest of the menu items arranged horizontally from left to right.\n\nFunctional capabilities of the Menu Bar:\nServes as the main navigation menu for the Visual Studio application, providing access to core functionality and features through dropdown menus. Users can click on any menu item to reveal additional options and commands."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[0, 0, 550, 1000]]'}, {'from': 'human', 'value': "Here's a detailed visual description of the Menu Bar:\nA dark-themed menu bar displaying primary navigation items including 'File', 'Edit', 'View', 'Search', 'Project', 'Build', 'Run', 'Version Control', 'Tools', 'Window', and 'Help'. The menu bar features white text on a dark gray background, following the standard macOS menu bar design pattern."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[0, 0, 550, 1000]]'}, {'from': 'human', 'value': "The Menu Item's visual characteristics are as follows:\nA dark gray menu item labeled 'Build' in white text, positioned in the main menu bar. The text appears to be using a system standard font, and has a consistent height and padding with neighboring menu items.\n\nThe functionality of this Menu Item:\nOpens a dropdown menu containing build-related commands and options for compiling and constructing the project. 
This is a standard menu item that users can click to access build functionality in Visual Studio."}, {'from': 'gpt', 'value': '[[281, 0, 317, 1000]]'}, {'from': 'human', 'value': "The Menu Item's visual characteristics are as follows:\nA dark gray menu item labeled 'Build' in white text, positioned in the main menu bar. The text appears to be using a system standard font, and has a consistent height and padding with neighboring menu items.\n\nSpatial context of the Menu Item:\nLocated in the main menu bar at the top of the Visual Studio window, positioned between 'Project' and 'Run' menu items. It's part of a horizontal list of menu options that spans the width of the application window."}, {'from': 'gpt', 'value': '[[281, 0, 317, 1000]]'}, {'from': 'human', 'value': "Visual appearance details of the Menu Item:\nA dark gray menu item labeled 'Build' in white text, positioned in the main menu bar. The text appears to be using a system standard font, and has a consistent height and padding with neighboring menu items."}, {'from': 'gpt', 'value': '[[281, 0, 317, 1000]]'}, {'from': 'human', 'value': 'This Menu Button with Icon allows users to:\nThis is the primary system menu that provides access to core macOS functions like System Preferences, App Store, Force Quit, Sleep, Restart, and Shut Down options. Clicking this icon reveals a dropdown menu with these system-level controls.'}, {'from': 'gpt', 'value': '[[8, 0, 27, 1000]]'}, {'from': 'human', 'value': "The visual attributes of this Menu Button with Icon are:\nThis is the Apple menu icon represented by the classic Apple logo in white, located in the macOS menu bar. It maintains the standard operating system menu style and is monochrome white against the dark menu bar background.\n\nRegarding the Menu Button with Icon's position:\nThe Apple menu icon is positioned at the far left end of the menu bar, appearing as the first item before 'Visual Studio' and other menu items. 
It is aligned vertically with other menu bar items in the standard macOS menu bar height.\n\nWhat this Menu Button with Icon does:\nThis is the primary system menu that provides access to core macOS functions like System Preferences, App Store, Force Quit, Sleep, Restart, and Shut Down options. Clicking this icon reveals a dropdown menu with these system-level controls."}, {'from': 'gpt', 'value': '[[8, 0, 27, 1000]]'}, {'from': 'human', 'value': "Here's a detailed visual description of the Menu Button with Icon:\nThis is the Apple menu icon represented by the classic Apple logo in white, located in the macOS menu bar. It maintains the standard operating system menu style and is monochrome white against the dark menu bar background."}, {'from': 'gpt', 'value': '[[8, 0, 27, 1000]]'}, {'from': 'human', 'value': "Let me describe the visual characteristics of this Menu item:\nThe word 'Window' appears in white text against a dark toolbar background. It's styled as a standard macOS menu item with a clear, readable font typical of system menus.\n\nThe spatial layout of this Menu item:\nLocated in the main menu bar at the top of the screen, positioned between 'Tools' and 'Help'. The menu item appears in the standard macOS menu position, near the right side of the top menu bar.\n\nFunctional description of the Menu item:\nClicking this menu item reveals a dropdown menu containing window management options, such as minimizing, maximizing, or arranging windows within the Visual Studio application."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[460, 0, 507, 1000]]'}, {'from': 'human', 'value': "The visual attributes of this Menu item are:\nThe word 'Window' appears in white text against a dark toolbar background. It's styled as a standard macOS menu item with a clear, readable font typical of system menus.\n\nPositional details of the Menu item:\nLocated in the main menu bar at the top of the screen, positioned between 'Tools' and 'Help'. 
The menu item appears in the standard macOS menu position, near the right side of the top menu bar."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[460, 0, 507, 1000]]'}, {'from': 'human', 'value': 'Usage and purpose of this Menu item:\nClicking this menu item reveals a dropdown menu containing window management options, such as minimizing, maximizing, or arranging windows within the Visual Studio application.'}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[460, 0, 507, 1000]]'}, {'from': 'human', 'value': "The visual appearance of this Menu Item is as follows:\n\nVisual Description: The word 'View' appears as a menu item in the top application menu bar, styled in white text against a dark gray background. The text is clear and legible, maintaining the standard macOS menu appearance.\n\nIn terms of the Menu Item's positioning:\nLocated in the main application menu bar at the top, positioned between 'Edit' and 'Search' menu items. It's the fourth item from the left, following 'Visual Studio', 'File', and 'Edit'."}, {'from': 'gpt', 'value': '[[158, 0, 193, 1000]]'}, {'from': 'human', 'value': "The visual attributes of this Menu Item are:\nThe word 'View' appears as a menu item in the top application menu bar, styled in white text against a dark gray background. The text is clear and legible, maintaining the standard macOS menu appearance.\n\nPositional details of the Menu Item:\nLocated in the main application menu bar at the top, positioned between 'Edit' and 'Search' menu items. 
It's the fourth item from the left, following 'Visual Studio', 'File', and 'Edit'.\n\nThe role of this Menu Item:\nWhen clicked, this menu item reveals a dropdown with view-related options and commands for controlling the application's visual interface and layout settings, following standard macOS menu conventions."}, {'from': 'gpt', 'value': '[[158, 0, 193, 1000]]'}, {'from': 'human', 'value': "This Menu Item is positioned as follows:\nLocated in the main application menu bar at the top, positioned between 'Edit' and 'Search' menu items. It's the fourth item from the left, following 'Visual Studio', 'File', and 'Edit'."}, {'from': 'gpt', 'value': '[[158, 0, 193, 1000]]'}, {'from': 'human', 'value': "Looking at this Menu Item, we can observe:\nThe 'Search' menu item appears in the main application menu bar. It has white text against a dark gray/black background menu bar, consistent with macOS styling. The text uses the system default menu font."}, {'from': 'gpt', 'value': '[[193, 0, 237, 1000]]'}, {'from': 'human', 'value': "The Menu Item's intended function:\nClicking this menu item reveals a dropdown menu with search-related commands and options. It provides access to search functionality within the Visual Studio application."}, {'from': 'gpt', 'value': '[[193, 0, 237, 1000]]'}, {'from': 'human', 'value': "Regarding the Menu Item's position:\nLocated in the main menu bar between 'View' and 'Project' menu items, positioned in the upper portion of the application window. It's part of a standard horizontal menu layout common to macOS applications."}, {'from': 'gpt', 'value': '[[193, 0, 237, 1000]]'}]}
78%|███████▊ | 17342/22095 [29:58:37<4:59:38, 3.78s/it] {'loss': 0.2962, 'grad_norm': 0.650360992582781, 'learning_rate': 1.1656979008848834e-06, 'epoch': 0.78}
Token indices sequence length is longer than the specified maximum sequence length for this model (64604 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49506 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61931 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102825 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (90420 > 40960). Running this sequence through the model will result in indexing errors
78%|███████▊ | 17343/22095 [29:58:40<4:47:13, 3.63s/it] {'loss': 0.2957, 'grad_norm': 0.5724326349339708, 'learning_rate': 1.1652275431748462e-06, 'epoch': 0.78}
78%|███████▊ | 17344/22095 [29:58:44<4:45:10, 3.60s/it] {'loss': 0.2927, 'grad_norm': 1.5135642866782946, 'learning_rate': 1.164757267862292e-06, 'epoch': 0.78}
79%|███████▊ | 17345/22095 [29:58:47<4:29:20, 3.40s/it] {'loss': 0.26, 'grad_norm': 0.6320185461540719, 'learning_rate': 1.1642870749573231e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (52527 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114288 > 40960).
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17346/22095 [29:58:50<4:25:17, 3.35s/it] {'loss': 0.3077, 'grad_norm': 0.6058476301879185, 'learning_rate': 1.1638169644700447e-06, 'epoch': 0.79} 79%|███████▊ | 17346/22095 [29:58:50<4:25:17, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▊ | 17347/22095 [29:58:59<6:46:26, 5.14s/it] {'loss': 0.4534, 'grad_norm': 0.3120894027493385, 'learning_rate': 1.1633469364105604e-06, 'epoch': 0.79} 79%|███████▊ | 17347/22095 [29:58:59<6:46:26, 5.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (137734 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103904 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17348/22095 [29:59:04<6:34:20, 4.98s/it] {'loss': 0.3339, 'grad_norm': 0.5984657291673785, 'learning_rate': 1.1628769907889643e-06, 'epoch': 0.79} 79%|███████▊ | 17348/22095 [29:59:04<6:34:20, 4.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51687 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53973 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46526 > 40960). 
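The "Token indices sequence length" warnings repeated throughout this log come from the tokenizer: some samples tokenize to far more than the 40960-token maximum (e.g. 64604, 102825, 114288 above), which would index past the model's position range. A minimal pre-filtering sketch; `encode_len` is a hypothetical stand-in for a real tokenizer's length count (with Hugging Face tokenizers it would be `len(tokenizer(text)["input_ids"])`), not part of the training code:

```python
# Sketch: drop samples that exceed the model's context window before
# training, instead of letting them trigger the warning above.
MAX_LEN = 40960  # maximum sequence length reported in this log

def encode_len(text: str) -> int:
    # Stand-in tokenizer for illustration only: counts whitespace tokens.
    # A real run would use the model's tokenizer here.
    return len(text.split())

def filter_overlong(samples, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by tokenized length."""
    kept, dropped = [], []
    for s in samples:
        (dropped if encode_len(s["text"]) > max_len else kept).append(s)
    return kept, dropped

kept, dropped = filter_overlong(
    [{"text": "short sample"}, {"text": "w " * 50000}]
)
```

Truncation (`truncation=True, max_length=...` at tokenization time) is the other common remedy, at the cost of cutting conversations mid-turn.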
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17349/22095 [29:59:07<5:56:20, 4.50s/it] {'loss': 0.2648, 'grad_norm': 0.6255476437590982, 'learning_rate': 1.162407127615357e-06, 'epoch': 0.79} 79%|███████▊ | 17349/22095 [29:59:07<5:56:20, 4.50s/it] 79%|███████▊ | 17350/22095 [29:59:10<5:21:44, 4.07s/it] {'loss': 0.3146, 'grad_norm': 0.6902499963136173, 'learning_rate': 1.1619373468998357e-06, 'epoch': 0.79} 79%|███████▊ | 17350/22095 [29:59:10<5:21:44, 4.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▊ | 17351/22095 [29:59:20<7:36:54, 5.78s/it] {'loss': 0.4615, 'grad_norm': 0.2817907502818888, 'learning_rate': 1.1614676486524927e-06, 'epoch': 0.79} 79%|███████▊ | 17351/22095 [29:59:20<7:36:54, 5.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 79%|███████▊ | 17352/22095 [29:59:24<6:48:36, 5.17s/it] {'loss': 0.2971, 'grad_norm': 0.6043644113354228, 'learning_rate': 1.1609980328834196e-06, 'epoch': 0.79} 79%|███████▊ | 17352/22095 [29:59:24<6:48:36, 5.17s/it] 79%|███████▊ | 17353/22095 [29:59:28<6:10:43, 4.69s/it] {'loss': 0.2657, 'grad_norm': 0.5668065899841754, 'learning_rate': 1.16052849960271e-06, 'epoch': 0.79} 79%|███████▊ | 17353/22095 [29:59:28<6:10:43, 4.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▊ | 17354/22095 [29:59:37<8:04:54, 6.14s/it] {'loss': 0.4568, 'grad_norm': 0.26723005660984916, 'learning_rate': 1.1600590488204495e-06, 'epoch': 0.79} 79%|███████▊ | 17354/22095 [29:59:37<8:04:54, 6.14s/it] 79%|███████▊ | 17355/22095 [29:59:40<6:56:14, 5.27s/it] {'loss': 0.2744, 'grad_norm': 0.5662361819226162, 'learning_rate': 1.159589680546727e-06, 'epoch': 0.79} 79%|███████▊ | 17355/22095 [29:59:40<6:56:14, 5.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length 
for this model (43862 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (162480 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17356/22095 [29:59:49<8:05:28, 6.15s/it] {'loss': 0.4581, 'grad_norm': 0.26301807410907907, 'learning_rate': 1.159120394791627e-06, 'epoch': 0.79} 79%|███████▊ | 17356/22095 [29:59:49<8:05:28, 6.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60764 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17357/22095 [29:59:57<8:54:40, 6.77s/it] {'loss': 0.4793, 'grad_norm': 0.26376801379454134, 'learning_rate': 1.1586511915652343e-06, 'epoch': 0.79} 79%|███████▊ | 17357/22095 [29:59:57<8:54:40, 6.77s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 79%|███████▊ | 17358/22095 [30:00:00<7:34:11, 5.75s/it] {'loss': 0.3042, 'grad_norm': 0.6552962676442267, 'learning_rate': 1.1581820708776282e-06, 'epoch': 0.79} 79%|███████▊ | 17358/22095 [30:00:00<7:34:11, 5.75s/it] 79%|███████▊ | 17359/22095 [30:00:04<6:56:50, 5.28s/it] {'loss': 0.2622, 'grad_norm': 0.749149067682797, 'learning_rate': 1.1577130327388918e-06, 'epoch': 0.79} 79%|███████▊ | 17359/22095 [30:00:04<6:56:50, 5.28s/it] 79%|███████▊ | 17360/22095 [30:00:08<6:08:06, 4.66s/it] {'loss': 0.3008, 'grad_norm': 0.5841610216480839, 'learning_rate': 1.1572440771591014e-06, 'epoch': 0.79} 79%|███████▊ | 17360/22095 [30:00:08<6:08:06, 4.66s/it] 79%|███████▊ | 17361/22095 [30:00:11<5:51:32, 4.46s/it] {'loss': 0.3048, 'grad_norm': 0.6299373407052793, 'learning_rate': 1.1567752041483328e-06, 'epoch': 0.79} 79%|███████▊ | 17361/22095 [30:00:12<5:51:32, 4.46s/it] 79%|███████▊ | 17362/22095 [30:00:14<5:16:31, 4.01s/it] {'loss': 0.2657, 'grad_norm': 0.7158363775054724, 'learning_rate': 1.1563064137166607e-06, 
'epoch': 0.79} 79%|███████▊ | 17362/22095 [30:00:15<5:16:31, 4.01s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (72437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70859 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42245 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (40962 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42041 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69266 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17363/22095 [30:00:22<6:41:33, 5.09s/it] {'loss': 0.4914, 'grad_norm': 0.27321909472327843, 'learning_rate': 1.1558377058741605e-06, 'epoch': 0.79} 79%|███████▊ | 17363/22095 [30:00:22<6:41:33, 5.09s/it] 79%|███████▊ | 17364/22095 [30:00:32<8:33:05, 6.51s/it] {'loss': 0.4868, 'grad_norm': 0.27974658066057784, 'learning_rate': 1.1553690806309015e-06, 'epoch': 0.79} 79%|███████▊ | 17364/22095 [30:00:32<8:33:05, 6.51s/it] 79%|███████▊ | 17365/22095 [30:00:39<8:47:20, 6.69s/it] {'loss': 0.4575, 'grad_norm': 0.26357198631483486, 'learning_rate': 1.154900537996952e-06, 'epoch': 0.79} 79%|███████▊ | 17365/22095 [30:00:39<8:47:20, 6.69s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (51591 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17366/22095 [30:00:43<7:44:47, 5.90s/it] {'loss': 0.2967, 'grad_norm': 0.8363162554904776, 'learning_rate': 1.154432077982382e-06, 'epoch': 0.79} 79%|███████▊ | 17366/22095 [30:00:43<7:44:47, 5.90s/it] 79%|███████▊ | 17367/22095 [30:00:47<6:59:58, 5.33s/it] {'loss': 0.2601, 'grad_norm': 0.5646840841086878, 'learning_rate': 1.1539637005972543e-06, 'epoch': 0.79} 79%|███████▊ | 17367/22095 [30:00:47<6:59:58, 5.33s/it] 79%|███████▊ | 17368/22095 [30:00:51<6:32:22, 4.98s/it] {'loss': 0.2893, 'grad_norm': 0.5884686071005956, 'learning_rate': 1.1534954058516357e-06, 'epoch': 0.79} 79%|███████▊ | 17368/22095 [30:00:51<6:32:22, 4.98s/it] 79%|███████▊ | 17369/22095 [30:00:55<6:05:13, 4.64s/it] {'loss': 0.3212, 'grad_norm': 0.6341286607495344, 'learning_rate': 1.1530271937555859e-06, 'epoch': 0.79} 79%|███████▊ | 17369/22095 [30:00:55<6:05:13, 4.64s/it] 79%|███████▊ | 17370/22095 [30:00:58<5:25:18, 4.13s/it] {'loss': 0.2998, 'grad_norm': 0.653173806007168, 'learning_rate': 1.152559064319168e-06, 
'epoch': 0.79} 79%|███████▊ | 17370/22095 [30:00:58<5:25:18, 4.13s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▊ | 17371/22095 [30:01:08<7:32:42, 5.75s/it] {'loss': 0.4712, 'grad_norm': 0.28527858241156007, 'learning_rate': 1.152091017552438e-06, 'epoch': 0.79} 79%|███████▊ | 17371/22095 [30:01:08<7:32:42, 5.75s/it] 79%|███████▊ | 17372/22095 [30:01:11<6:39:21, 5.07s/it] {'loss': 0.3352, 'grad_norm': 0.7963908930873177, 'learning_rate': 1.1516230534654554e-06, 'epoch': 0.79} 79%|███████▊ | 17372/22095 [30:01:11<6:39:21, 5.07s/it] 79%|███████▊ | 17373/22095 [30:01:14<5:54:16, 4.50s/it] {'loss': 0.2846, 'grad_norm': 0.6102186650838729, 'learning_rate': 1.151155172068274e-06, 'epoch': 0.79} 79%|███████▊ | 17373/22095 [30:01:14<5:54:16, 4.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43085 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51257 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46773 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44182 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17374/22095 [30:01:20<6:20:30, 4.84s/it] {'loss': 0.4701, 'grad_norm': 0.27518292768602326, 'learning_rate': 1.1506873733709457e-06, 'epoch': 0.79} 79%|███████▊ | 17374/22095 [30:01:20<6:20:30, 4.84s/it] 79%|███████▊ | 17375/22095 [30:01:23<5:51:51, 4.47s/it] {'loss': 0.2709, 'grad_norm': 0.6106521447355924, 'learning_rate': 1.1502196573835239e-06, 'epoch': 0.79} 79%|███████▊ | 17375/22095 [30:01:23<5:51:51, 4.47s/it] 79%|███████▊ | 17376/22095 [30:01:27<5:28:46, 4.18s/it] {'loss': 0.2618, 'grad_norm': 0.6243633846315203, 'learning_rate': 1.1497520241160603e-06, 'epoch': 0.79} 79%|███████▊ | 17376/22095 [30:01:27<5:28:46, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79242 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103599 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63991 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46768 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17377/22095 [30:01:30<5:08:00, 3.92s/it] {'loss': 0.3185, 'grad_norm': 0.9340048826982438, 'learning_rate': 1.1492844735785979e-06, 'epoch': 0.79} 79%|███████▊ | 17377/22095 [30:01:30<5:08:00, 3.92s/it] 79%|███████▊ | 17378/22095 [30:01:34<5:09:34, 3.94s/it] {'loss': 0.3282, 'grad_norm': 0.6010856148363889, 'learning_rate': 1.1488170057811853e-06, 'epoch': 0.79} 79%|███████▊ | 17378/22095 [30:01:34<5:09:34, 3.94s/it] 79%|███████▊ | 17379/22095 [30:01:38<5:00:07, 3.82s/it] {'loss': 0.2786, 'grad_norm': 0.650174914655158, 'learning_rate': 1.148349620733869e-06, 'epoch': 0.79} 79%|███████▊ | 17379/22095 [30:01:38<5:00:07, 3.82s/it] 79%|███████▊ | 17380/22095 [30:01:41<4:34:20, 3.49s/it] {'loss': 0.2876, 'grad_norm': 0.6267531884417445, 'learning_rate': 1.1478823184466897e-06, 'epoch': 0.79} 79%|███████▊ | 17380/22095 [30:01:41<4:34:20, 3.49s/it] 79%|███████▊ | 17381/22095 [30:01:44<4:38:24, 3.54s/it] {'loss': 0.2996, 'grad_norm': 0.651212363619707, 'learning_rate': 1.1474150989296872e-06, 'epoch': 0.79} 79%|███████▊ | 17381/22095 [30:01:44<4:38:24, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8956350 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 7185, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 4cm\nB. 1cm\nC. 1.5cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 79%|███████▊ | 17382/22095 [30:01:54<7:08:16, 5.45s/it] {'loss': 0.4627, 'grad_norm': 0.3385550548730938, 'learning_rate': 1.1469479621929036e-06, 'epoch': 0.79} 79%|███████▊ | 17382/22095 [30:01:54<7:08:16, 5.45s/it] 79%|███████▊ | 17383/22095 [30:01:57<6:16:53, 4.80s/it] {'loss': 0.293, 'grad_norm': 1.1355741833386386, 'learning_rate': 1.146480908246373e-06, 'epoch': 0.79} 79%|███████▊ | 17383/22095 [30:01:57<6:16:53, 4.80s/it] 79%|███████▊ | 17384/22095 [30:02:00<5:35:27, 4.27s/it] {'loss': 0.3296, 'grad_norm': 0.5953511777939547, 'learning_rate': 1.1460139371001339e-06, 'epoch': 0.79} 79%|███████▊ | 17384/22095 [30:02:00<5:35:27, 4.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▊ | 17385/22095 [30:02:10<7:36:45, 5.82s/it] {'loss': 0.4861, 'grad_norm': 0.28265491438938334, 'learning_rate': 1.1455470487642167e-06, 'epoch': 0.79} 79%|███████▊ | 17385/22095 [30:02:10<7:36:45, 5.82s/it] 79%|███████▊ | 17386/22095 [30:02:14<6:55:15, 5.29s/it] {'loss': 0.2921, 'grad_norm': 0.6026682293599088, 'learning_rate': 1.1450802432486574e-06, 'epoch': 0.79} 79%|███████▊ | 17386/22095 [30:02:14<6:55:15, 5.29s/it] 79%|███████▊ | 17387/22095 [30:02:17<6:06:24, 4.67s/it] {'loss': 0.2722, 'grad_norm': 0.6458133985405689, 'learning_rate': 1.1446135205634829e-06, 'epoch': 0.79} 79%|███████▊ | 17387/22095 [30:02:17<6:06:24, 4.67s/it] 79%|███████▊ | 17388/22095 [30:02:20<5:24:45, 4.14s/it] {'loss': 0.2813, 'grad_norm': 0.6222737944884161, 'learning_rate': 1.144146880718724e-06, 'epoch': 0.79} 79%|███████▊ | 17388/22095 [30:02:20<5:24:45, 4.14s/it] 79%|███████▊ | 17389/22095 [30:02:24<5:10:15, 3.96s/it] {'loss': 0.3041, 
'grad_norm': 0.6055298101392338, 'learning_rate': 1.1436803237244065e-06, 'epoch': 0.79} 79%|███████▊ | 17389/22095 [30:02:24<5:10:15, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (76552 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17390/22095 [30:02:27<4:49:39, 3.69s/it] {'loss': 0.3363, 'grad_norm': 0.6225200984515917, 'learning_rate': 1.1432138495905531e-06, 'epoch': 0.79} 79%|███████▊ | 17390/22095 [30:02:27<4:49:39, 3.69s/it] 79%|███████▊ | 17391/22095 [30:02:30<4:38:13, 3.55s/it] {'loss': 0.2508, 'grad_norm': 0.630228041297407, 'learning_rate': 1.1427474583271896e-06, 'epoch': 0.79} 79%|███████▊ | 17391/22095 [30:02:30<4:38:13, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69312 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58657 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▊ | 17392/22095 [30:02:33<4:30:21, 3.45s/it] {'loss': 0.32, 'grad_norm': 0.8153673652545924, 'learning_rate': 1.1422811499443375e-06, 'epoch': 0.79} 79%|███████▊ | 17392/22095 [30:02:33<4:30:21, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85009 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17393/22095 [30:02:36<4:21:51, 3.34s/it] {'loss': 0.3029, 'grad_norm': 0.6295439380761974, 'learning_rate': 1.1418149244520155e-06, 'epoch': 0.79} 79%|███████▊ | 17393/22095 [30:02:36<4:21:51, 3.34s/it] 79%|███████▊ | 17394/22095 [30:02:40<4:25:10, 3.38s/it] {'loss': 0.3676, 'grad_norm': 0.6664844860170986, 'learning_rate': 1.1413487818602397e-06, 'epoch': 0.79} 79%|███████▊ | 17394/22095 [30:02:40<4:25:10, 3.38s/it] 79%|███████▊ | 17395/22095 [30:02:44<4:39:42, 3.57s/it] {'loss': 0.2806, 'grad_norm': 0.6075032187161125, 'learning_rate': 1.1408827221790297e-06, 'epoch': 0.79} 79%|███████▊ | 17395/22095 [30:02:44<4:39:42, 3.57s/it] 79%|███████▊ | 17396/22095 [30:02:47<4:33:18, 3.49s/it] {'loss': 0.3159, 'grad_norm': 0.6965414154938917, 'learning_rate': 1.1404167454183957e-06, 'epoch': 0.79} 79%|███████▊ | 17396/22095 [30:02:47<4:33:18, 3.49s/it] 79%|███████▊ | 17397/22095 [30:02:50<4:17:33, 3.29s/it] {'loss': 0.272, 'grad_norm': 0.6003957067092349, 'learning_rate': 1.1399508515883533e-06, 'epoch': 0.79} 79%|███████▊ | 17397/22095 [30:02:50<4:17:33, 3.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (61691 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▊ | 17398/22095 [30:02:59<6:39:04, 5.10s/it] {'loss': 0.4759, 'grad_norm': 0.28195165676181216, 'learning_rate': 1.1394850406989106e-06, 'epoch': 0.79} 79%|███████▊ | 17398/22095 [30:02:59<6:39:04, 5.10s/it] 79%|███████▊ | 17399/22095 [30:03:02<5:55:55, 4.55s/it] {'loss': 0.2927, 'grad_norm': 1.5650898904360315, 'learning_rate': 1.139019312760079e-06, 'epoch': 0.79} 79%|███████▊ | 17399/22095 [30:03:02<5:55:55, 4.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▉ | 17400/22095 [30:03:12<7:56:32, 6.09s/it] {'loss': 0.429, 'grad_norm': 0.34002607274237956, 'learning_rate': 1.1385536677818632e-06, 'epoch': 0.79} 79%|███████▉ | 17400/22095 [30:03:12<7:56:32, 6.09s/it] 79%|███████▉ | 17401/22095 [30:03:15<6:52:15, 5.27s/it] {'loss': 0.2949, 'grad_norm': 0.6491468227013686, 'learning_rate': 1.138088105774271e-06, 'epoch': 0.79} 79%|███████▉ | 17401/22095 [30:03:15<6:52:15, 5.27s/it] 79%|███████▉ | 17402/22095 [30:03:18<5:59:48, 4.60s/it] {'loss': 0.2673, 'grad_norm': 0.6195093709848404, 'learning_rate': 1.137622626747304e-06, 'epoch': 0.79} 79%|███████▉ | 17402/22095 [30:03:18<5:59:48, 4.60s/it] 79%|███████▉ | 17403/22095 [30:03:22<5:35:03, 4.28s/it] {'loss': 0.3477, 'grad_norm': 0.5960717594699569, 'learning_rate': 1.1371572307109634e-06, 'epoch': 0.79} 79%|███████▉ | 17403/22095 [30:03:22<5:35:03, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65486 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49885 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100847 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58326 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17404/22095 [30:03:25<5:13:05, 4.00s/it] {'loss': 0.2689, 'grad_norm': 0.5759243199990242, 'learning_rate': 1.13669191767525e-06, 'epoch': 0.79} 79%|███████▉ | 17404/22095 [30:03:25<5:13:05, 4.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79514 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17405/22095 [30:03:28<4:42:43, 3.62s/it] {'loss': 0.2394, 'grad_norm': 0.6149926025008491, 'learning_rate': 1.1362266876501649e-06, 'epoch': 0.79} 79%|███████▉ | 17405/22095 [30:03:28<4:42:43, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▉ | 17406/22095 [30:03:37<6:56:47, 5.33s/it] {'loss': 0.4622, 'grad_norm': 0.262255617987434, 'learning_rate': 1.1357615406456985e-06, 'epoch': 0.79} 79%|███████▉ | 17406/22095 [30:03:37<6:56:47, 5.33s/it] 79%|███████▉ | 17407/22095 [30:03:41<6:08:13, 4.71s/it] {'loss': 0.3056, 'grad_norm': 0.6810866592082769, 'learning_rate': 1.1352964766718488e-06, 'epoch': 0.79} 79%|███████▉ | 17407/22095 [30:03:41<6:08:13, 4.71s/it] 79%|███████▉ | 17408/22095 [30:03:44<5:30:53, 4.24s/it] {'loss': 0.2777, 'grad_norm': 0.8782494286183904, 'learning_rate': 1.1348314957386093e-06, 'epoch': 0.79} 79%|███████▉ | 17408/22095 [30:03:44<5:30:53, 4.24s/it] 79%|███████▉ | 17409/22095 [30:03:47<5:10:23, 3.97s/it] {'loss': 0.2746, 'grad_norm': 0.5896235487727614, 'learning_rate': 1.1343665978559704e-06, 'epoch': 0.79} 79%|███████▉ | 17409/22095 [30:03:47<5:10:23, 3.97s/it] 79%|███████▉ | 17410/22095 [30:03:51<4:57:57, 3.82s/it] {'loss': 0.2693, 'grad_norm': 0.6212552351887514, 'learning_rate': 1.1339017830339195e-06, 'epoch': 0.79} 79%|███████▉ | 
17410/22095 [30:03:51<4:57:57, 3.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (127335 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44332 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17411/22095 [30:03:54<4:55:45, 3.79s/it] {'loss': 0.2959, 'grad_norm': 0.6423392372429871, 'learning_rate': 1.1334370512824466e-06, 'epoch': 0.79} 79%|███████▉ | 17411/22095 [30:03:54<4:55:45, 3.79s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▉ | 17412/22095 [30:04:04<7:08:27, 5.49s/it] {'loss': 0.4557, 'grad_norm': 0.3022873126019532, 'learning_rate': 1.1329724026115345e-06, 'epoch': 0.79} 79%|███████▉ | 17412/22095 [30:04:04<7:08:27, 5.49s/it] 79%|███████▉ | 17413/22095 [30:04:14<8:51:13, 6.81s/it] {'loss': 0.4582, 'grad_norm': 0.27724362715051737, 'learning_rate': 1.132507837031171e-06, 'epoch': 0.79} 79%|███████▉ | 17413/22095 [30:04:14<8:51:13, 6.81s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 79%|███████▉ | 17414/22095 [30:04:18<7:49:41, 6.02s/it] {'loss': 0.3247, 'grad_norm': 0.5863026360362641, 'learning_rate': 1.1320433545513342e-06, 'epoch': 0.79} 79%|███████▉ | 17414/22095 [30:04:18<7:49:41, 6.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (110048 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91060 > 40960). 
Running this sequence through the model will result in indexing errors 79%|███████▉ | 17415/22095 [30:04:21<6:42:28, 5.16s/it] {'loss': 0.2866, 'grad_norm': 0.6152221198174394, 'learning_rate': 1.1315789551820078e-06, 'epoch': 0.79} 79%|███████▉ | 17415/22095 [30:04:21<6:42:28, 5.16s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 6, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8400244 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 6, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 2402, 'image': 'vrdu_table_final_2/astro-ph.CO/2978d904-03a0-4bfd-9758-9c806df47e04.png', 'image_wh': [[14, 6]], 'conversations': [{'from': 'human', 'value': '\nChange the table in the given image to LaTeX format, please.'}, {'from': 'gpt', 'value': "I'll change the table in the given image to LaTeX format for you.\n```latex\n\\begin{tabular}\n...\n\\end{tabular}\n```"}]} 79%|███████▉ | 17416/22095 [30:04:24<6:03:10, 4.66s/it] {'loss': 0.3318, 'grad_norm': 0.6148723884806709, 'learning_rate': 1.1311146389331667e-06, 'epoch': 0.79} 79%|███████▉ | 17416/22095 [30:04:24<6:03:10, 4.66s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [792, 23, 
100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8498032 in VC:s3://internvl-moe-sft-data/. Exception: Image size [792, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 130517, 'image': 'vrdu_texteq/astro-ph.CO/07292995-b183-44bc-973d-b5bec3266291.png', 'image_wh': [[792, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where we have assumed $Y=0.24$ for the mass fraction of helium.'}]} 79%|███████▉ | 17417/22095 [30:04:28<5:41:42, 4.38s/it] {'loss': 0.3178, 'grad_norm': 0.5900675655471992, 'learning_rate': 1.1306504058147915e-06, 'epoch': 0.79} 79%|███████▉ | 17417/22095 [30:04:28<5:41:42, 4.38s/it] 79%|███████▉ | 17418/22095 [30:04:32<5:29:40, 4.23s/it] {'loss': 0.3106, 'grad_norm': 0.6071007835312443, 'learning_rate': 1.1301862558368554e-06, 'epoch': 0.79} 79%|███████▉ | 17418/22095 [30:04:32<5:29:40, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 79%|███████▉ | 17419/22095 [30:04:41<7:29:09, 5.76s/it] {'loss': 0.463, 'grad_norm': 0.24889869987412316, 'learning_rate': 1.1297221890093302e-06, 'epoch': 0.79} 79%|███████▉ | 17419/22095 [30:04:41<7:29:09, 5.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69660 > 40960). 
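The ValueError tracebacks above ("Image size [...] is too small. Minimum size is 28.") show the dataloader rejecting samples whose image has an edge shorter than 28 px, e.g. the [223, 23], [14, 6], and [792, 23] samples logged here. A minimal sketch of screening such samples up front; the function name and the sample list are illustrative, not taken from the training code:

```python
# Sketch: pre-validate image sizes so undersized samples are filtered
# out before the loader raises "Image size ... is too small".
MIN_SIDE = 28  # minimum edge length enforced by the loader in this log

def is_valid_size(width: int, height: int, min_side: int = MIN_SIDE) -> bool:
    """True if both image edges meet the loader's minimum."""
    return min(width, height) >= min_side

# The failing samples above had (width, height) of (223, 23), (14, 6),
# and (792, 23); only images with both sides >= 28 survive the check.
sizes = [(223, 23), (14, 6), (792, 23), (640, 480)]
valid = [wh for wh in sizes if is_valid_size(*wh)]
```

Running this once over the dataset manifest (`image_wh` is already stored per sample) would avoid the retry-and-skip churn visible in the log.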
Running this sequence through the model will result in indexing errors 79%|███████▉ | 17420/22095 [30:04:45<6:28:08, 4.98s/it] {'loss': 0.319, 'grad_norm': 0.6513774161982812, 'learning_rate': 1.129258205342188e-06, 'epoch': 0.79} 79%|███████▉ | 17420/22095 [30:04:45<6:28:08, 4.98s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8366683 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33429, 'image': 'vrdu_table_final_2/astro-ph.CO/9f89ae11-c351-467a-9e98-ab803eea660e.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{1}$\\end{tabular}\n```"}]} 79%|███████▉ | 17421/22095 [30:04:48<5:46:56, 4.45s/it] {'loss': 0.314, 'grad_norm': 0.6863154751764883, 'learning_rate': 1.1287943048454003e-06, 'epoch': 0.79} 79%|███████▉ | 17421/22095 [30:04:48<5:46:56, 4.45s/it] 79%|███████▉ | 17422/22095 [30:04:51<5:14:01, 4.03s/it] {'loss': 0.2905, 'grad_norm': 0.6930197690759812, 'learning_rate': 1.1283304875289335e-06, 'epoch': 0.79} 79%|███████▉ | 17422/22095 [30:04:51<5:14:01, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50694 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108915 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94427 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17423/22095 [30:05:00<7:24:07, 5.70s/it] {'loss': 0.4794, 'grad_norm': 0.27159987742847796, 'learning_rate': 1.1278667534027525e-06, 'epoch': 0.79} 79%|███████▉ | 17423/22095 [30:05:00<7:24:07, 5.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86816 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65065 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17424/22095 [30:05:05<6:48:28, 5.25s/it] {'loss': 0.3424, 'grad_norm': 0.544502579307501, 'learning_rate': 1.1274031024768239e-06, 'epoch': 0.79} 79%|███████▉ | 17424/22095 [30:05:05<6:48:28, 5.25s/it] 79%|███████▉ | 17425/22095 [30:05:09<6:22:37, 4.92s/it] {'loss': 0.3051, 'grad_norm': 0.6749145704048627, 'learning_rate': 1.1269395347611074e-06, 'epoch': 0.79} 79%|███████▉ | 17425/22095 [30:05:09<6:22:37, 4.92s/it] 79%|███████▉ | 17426/22095 [30:05:12<5:52:48, 4.53s/it] {'loss': 0.3006, 'grad_norm': 0.6598064622660003, 'learning_rate': 1.126476050265567e-06, 'epoch': 0.79} 79%|███████▉ | 17426/22095 [30:05:12<5:52:48, 4.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44459 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62673 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101284 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17427/22095 [30:05:16<5:20:35, 4.12s/it] {'loss': 0.2898, 'grad_norm': 0.6273926410342063, 'learning_rate': 1.1260126490001577e-06, 'epoch': 0.79} 79%|███████▉ | 17427/22095 [30:05:16<5:20:35, 4.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (78095 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82611 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17428/22095 [30:05:23<6:33:33, 5.06s/it] {'loss': 0.4581, 'grad_norm': 0.26148859512293166, 'learning_rate': 1.12554933097484e-06, 'epoch': 0.79} 79%|███████▉ | 17428/22095 [30:05:23<6:33:33, 5.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54031 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47470 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43546 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53930 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68422 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53187 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17429/22095 [30:05:26<5:54:57, 4.56s/it] {'loss': 0.2734, 'grad_norm': 0.6653921917108375, 'learning_rate': 1.1250860961995663e-06, 'epoch': 0.79}
 79%|███████▉ | 17430/22095 [30:05:31<5:53:07, 4.54s/it] {'loss': 0.2815, 'grad_norm': 0.5786418872287515, 'learning_rate': 1.1246229446842927e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17431/22095 [30:05:38<7:01:14, 5.42s/it] {'loss': 0.4784, 'grad_norm': 0.282184982960806, 'learning_rate': 1.1241598764389699e-06, 'epoch': 0.79}
 79%|███████▉ | 17432/22095 [30:05:48<8:42:12, 6.72s/it] {'loss': 0.4798, 'grad_norm': 0.27743184880991817, 'learning_rate': 1.1236968914735462e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 79%|███████▉ | 17433/22095 [30:05:52<7:48:36, 6.03s/it] {'loss': 0.2717, 'grad_norm': 0.6517786801029701, 'learning_rate': 1.1232339897979716e-06, 'epoch': 0.79}
 79%|███████▉ | 17434/22095 [30:05:57<7:12:48, 5.57s/it] {'loss': 0.2987, 'grad_norm': 0.5885837605798061, 'learning_rate': 1.1227711714221928e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45900 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43003 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51616 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17435/22095 [30:06:06<8:24:57, 6.50s/it] {'loss': 0.4636, 'grad_norm': 0.2615744958143664, 'learning_rate': 1.1223084363561538e-06, 'epoch': 0.79}
 79%|███████▉ | 17436/22095 [30:06:13<8:54:20, 6.88s/it] {'loss': 0.4644, 'grad_norm': 0.2574679366280398, 'learning_rate': 1.1218457846097958e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (51070 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103703 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17437/22095 [30:06:23<9:53:10, 7.64s/it] {'loss': 0.4656, 'grad_norm': 0.2889475935321559, 'learning_rate': 1.1213832161930622e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 79%|███████▉ | 17438/22095 [30:06:26<8:10:49, 6.32s/it] {'loss': 0.3093, 'grad_norm': 0.6560240655089297, 'learning_rate': 1.120920731115891e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (75940 > 40960).
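The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)" lines are tokenizer warnings: some samples tokenize to far more than the 40960-token limit, and feeding them through unchanged would cause exactly the indexing errors the message predicts. A minimal sketch of guarding against this before batching; `MAX_LEN` and `filter_or_truncate` are hypothetical names, since the training script's actual handling is not visible in this log:

```python
# Hypothetical guard against over-long samples, mirroring the log's
# "(... > 40960)" tokenizer warnings. Names are illustrative; the real
# fine-tuning code's handling is not shown in the log.
MAX_LEN = 40960  # maximum sequence length reported in the warnings

def filter_or_truncate(token_ids, max_len=MAX_LEN, truncate=True):
    """Keep a sample as-is if it fits; otherwise truncate it (or drop it)."""
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len] if truncate else None

short = filter_or_truncate(list(range(100)))                        # unchanged
clipped = filter_or_truncate(list(range(108915)))                   # cut to 40960 ids
dropped = filter_or_truncate(list(range(108915)), truncate=False)   # None
```

Whether to truncate or drop is a data-quality tradeoff: blind truncation can cut off the assistant turn entirely, so pipelines often drop or re-chunk over-long samples instead.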
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17439/22095 [30:06:29<6:55:29, 5.35s/it] {'loss': 0.3272, 'grad_norm': 0.6313307722903918, 'learning_rate': 1.1204583293882181e-06, 'epoch': 0.79}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11305774 in VC:s3://internvl2/datasets/crawler_data2/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1230, 'image': 'taobao/1513.gif', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n这是什么'}, {'from': 'gpt', 'value': '这张图片是一张宣传横幅,内容与消费电子和科技相关。横幅上显示了中文标题“消费电子与科技同在”,并附有行动提示“点击上方蓝色关注”。\n\n图像细节如下:\n- 背景颜色主要为蓝色和白色。\n- 中间为一个电脑窗口样式的图案,里面有文字内容,文字上方有“CE消費電子”的标志。\n- 文字内容为:“消费电子与科技同在”。\n- 鼓励用户点击的提示文字为:“点击上方蓝色关注”。\n- 周围有多个与科技和电子消费品相关的图标,比如新闻、放大镜、对话、灯泡、扬声器、图标、鼠标等。\n\n这幅横幅可能用于社交媒体或网站上,旨在吸引用户关注与消费电子和科技相关的内容或平台。这类横幅常用于科技新闻平台、科技论坛或社交媒体页面,鼓励用户进一步互动和关注。'}]}
 79%|███████▉ | 17440/22095 [30:06:32<6:08:55, 4.76s/it] {'loss': 0.3384, 'grad_norm': 0.6278866803927241, 'learning_rate': 1.119996011019981e-06, 'epoch': 0.79}
 79%|███████▉ | 17441/22095 [30:06:36<5:31:46, 4.28s/it] {'loss': 0.3399, 'grad_norm': 0.6354112550807469, 'learning_rate': 1.119533776021114e-06, 'epoch': 0.79}
 79%|███████▉ | 17442/22095 [30:06:38<5:00:25, 3.87s/it] {'loss': 0.2715, 'grad_norm': 0.6007230130425893, 'learning_rate': 1.1190716244015487e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047719 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 4cm\nB. 1cm\nC. 1.5cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 79%|███████▉ | 17443/22095 [30:06:46<6:24:51, 4.96s/it] {'loss': 0.4671, 'grad_norm': 0.2898353552609453, 'learning_rate': 1.118609556171213e-06, 'epoch': 0.79}
 79%|███████▉ | 17444/22095 [30:06:49<5:46:58, 4.48s/it] {'loss': 0.3023, 'grad_norm': 0.6571880434843874, 'learning_rate': 1.118147571340039e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17445/22095 [30:06:58<7:31:51, 5.83s/it] {'loss': 0.4866, 'grad_norm': 0.37409319570832594, 'learning_rate': 1.11768566991795e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (80022 > 40960).
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17446/22095 [30:07:01<6:30:02, 5.03s/it] {'loss': 0.3239, 'grad_norm': 0.621756727887467, 'learning_rate': 1.1172238519148732e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (120160 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83150 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41063 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17447/22095 [30:07:11<8:14:12, 6.38s/it] {'loss': 0.4652, 'grad_norm': 0.273835122139672, 'learning_rate': 1.1167621173407312e-06, 'epoch': 0.79}
 79%|███████▉ | 17448/22095 [30:07:14<7:02:42, 5.46s/it] {'loss': 0.3353, 'grad_norm': 0.5843208882446077, 'learning_rate': 1.1163004662054434e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 79%|███████▉ | 17449/22095 [30:07:17<6:05:29, 4.72s/it] {'loss': 0.2871, 'grad_norm': 0.5808348307996056, 'learning_rate': 1.1158388985189312e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 79%|███████▉ | 17450/22095 [30:07:21<5:43:53, 4.44s/it] {'loss': 0.3018, 'grad_norm': 0.5875845384179548, 'learning_rate': 1.1153774142911123e-06, 'epoch': 0.79}
 79%|███████▉ | 17451/22095 [30:07:24<5:11:16, 4.02s/it] {'loss': 0.2962, 'grad_norm': 0.594981041595839, 'learning_rate': 1.1149160135319027e-06, 'epoch': 0.79}
 79%|███████▉ | 17452/22095 [30:07:27<4:51:28, 3.77s/it] {'loss': 0.2997, 'grad_norm': 0.6065672910891571, 'learning_rate': 1.1144546962512144e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (68040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58487 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71193 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17453/22095 [30:07:31<4:50:35, 3.76s/it] {'loss': 0.2902, 'grad_norm': 0.6603432041018994, 'learning_rate': 1.113993462458962e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (77810 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17454/22095 [30:07:41<7:08:48, 5.54s/it] {'loss': 0.4582, 'grad_norm': 0.30835422814081065, 'learning_rate': 1.1135323121650542e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (49737 > 40960).
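The "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" pairs indicate samples whose conversation text lacks an image placeholder even though an image is attached, which the loader repairs on the fly. A sketch of the kind of consistency check involved; the placeholder string and function name are assumptions, since the actual placeholder token is not visible in this log:

```python
# Hypothetical consistency check behind messages like
# "Number of image tokens 0 does not match number of images 1".
# The placeholder string is an assumption; the real token used by the
# training code is not shown in the log.
PLACEHOLDER = "<image>"

def image_tokens_match(conversation_text, num_images, placeholder=PLACEHOLDER):
    """True when the text references exactly as many images as are attached."""
    return conversation_text.count(placeholder) == num_images

ok = image_tokens_match("<image>\nWhat is shown here?", 1)    # True
mismatch = image_tokens_match("What is shown here?", 1)       # False: loader must repair
```

A loader that "fixes" such samples typically prepends the missing placeholder rather than dropping the sample, which matches the log proceeding normally after each repair message.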
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17455/22095 [30:07:47<7:26:48, 5.78s/it] {'loss': 0.4902, 'grad_norm': 0.2869527199386691, 'learning_rate': 1.113071245379402e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 79%|███████▉ | 17456/22095 [30:07:52<6:54:48, 5.37s/it] {'loss': 0.2943, 'grad_norm': 0.6115552347964449, 'learning_rate': 1.1126102621119095e-06, 'epoch': 0.79}
 79%|███████▉ | 17457/22095 [30:07:55<6:06:35, 4.74s/it] {'loss': 0.287, 'grad_norm': 0.6205860996932847, 'learning_rate': 1.1121493623724845e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (61779 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80353 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47732 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17458/22095 [30:07:59<5:47:20, 4.49s/it] {'loss': 0.3194, 'grad_norm': 0.6428203422179108, 'learning_rate': 1.111688546171028e-06, 'epoch': 0.79}
 79%|███████▉ | 17459/22095 [30:08:02<5:11:00, 4.03s/it] {'loss': 0.278, 'grad_norm': 0.6574805982745042, 'learning_rate': 1.1112278135174438e-06, 'epoch': 0.79}
 79%|███████▉ | 17460/22095 [30:08:04<4:43:08, 3.67s/it] {'loss': 0.2562, 'grad_norm': 0.6510734470303903, 'learning_rate': 1.1107671644216305e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17461/22095 [30:08:14<7:08:30, 5.55s/it] {'loss': 0.4615, 'grad_norm': 0.2731843402504672, 'learning_rate': 1.1103065988934842e-06, 'epoch': 0.79}
 79%|███████▉ | 17462/22095 [30:08:19<6:37:31, 5.15s/it] {'loss': 0.3162, 'grad_norm': 0.60979356487696, 'learning_rate': 1.109846116942903e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17463/22095 [30:08:26<7:21:48, 5.72s/it] {'loss': 0.4731, 'grad_norm': 0.26468973035547944, 'learning_rate': 1.109385718579783e-06, 'epoch': 0.79}
 79%|███████▉ | 17464/22095 [30:08:29<6:28:42, 5.04s/it] {'loss': 0.3443, 'grad_norm': 0.7844896219515496, 'learning_rate': 1.1089254038140141e-06, 'epoch': 0.79}
 79%|███████▉ | 17465/22095 [30:08:33<6:00:36, 4.67s/it] {'loss': 0.3096, 'grad_norm': 0.6988599850950089, 'learning_rate': 1.1084651726554868e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17466/22095 [30:08:40<7:04:42, 5.50s/it] {'loss': 0.4939, 'grad_norm': 0.25101069040293145, 'learning_rate': 1.1080050251140923e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 79%|███████▉ | 17467/22095 [30:08:45<6:39:04, 5.17s/it] {'loss': 0.3558, 'grad_norm': 0.6044654536372189, 'learning_rate': 1.1075449611997153e-06, 'epoch': 0.79}
 79%|███████▉ | 17468/22095 [30:08:49<6:10:51, 4.81s/it] {'loss': 0.3099, 'grad_norm': 0.6321237819686049, 'learning_rate': 1.1070849809222428e-06, 'epoch': 0.79}
 79%|███████▉ | 17469/22095 [30:08:52<5:30:44, 4.29s/it] {'loss': 0.2565, 'grad_norm': 0.6163219336638721, 'learning_rate': 1.106625084291557e-06, 'epoch': 0.79}
 79%|███████▉ | 17470/22095 [30:08:55<5:12:27, 4.05s/it] {'loss': 0.2767, 'grad_norm': 0.6160762324173575, 'learning_rate': 1.1061652713175425e-06, 'epoch': 0.79}
 79%|███████▉ | 17471/22095 [30:08:59<5:10:39, 4.03s/it] {'loss': 0.2694, 'grad_norm': 0.572659054651402, 'learning_rate': 1.1057055420100755e-06, 'epoch': 0.79}
 79%|███████▉ | 17472/22095 [30:09:03<5:05:59, 3.97s/it] {'loss': 0.3049, 'grad_norm': 0.6128638406139482, 'learning_rate': 1.1052458963790374e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (139791 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76954 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42578 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17473/22095 [30:09:08<5:21:16, 4.17s/it] {'loss': 0.3289, 'grad_norm': 0.6937107123304866, 'learning_rate': 1.104786334434303e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 79%|███████▉ | 17474/22095 [30:09:12<5:16:05, 4.10s/it] {'loss': 0.3183, 'grad_norm': 0.5975880810758326, 'learning_rate': 1.1043268561857456e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (69193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47548 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67273 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17475/22095 [30:09:15<4:53:49, 3.82s/it] {'loss': 0.2738, 'grad_norm': 0.5915920011910971, 'learning_rate': 1.103867461643241e-06, 'epoch': 0.79}
 79%|███████▉ | 17476/22095 [30:09:18<4:38:31, 3.62s/it] {'loss': 0.2447, 'grad_norm': 0.6566936994613086, 'learning_rate': 1.1034081508166588e-06, 'epoch': 0.79}
 79%|███████▉ | 17477/22095 [30:09:21<4:28:01, 3.48s/it] {'loss': 0.3233, 'grad_norm': 0.6188551895462027, 'learning_rate': 1.1029489237158663e-06, 'epoch': 0.79}
 79%|███████▉ | 17478/22095 [30:09:25<4:24:16, 3.43s/it] {'loss': 0.2977, 'grad_norm': 0.6344862698039115, 'learning_rate': 1.1024897803507322e-06, 'epoch': 0.79}
 79%|███████▉ | 17479/22095 [30:09:28<4:22:44, 3.42s/it] {'loss': 0.3067, 'grad_norm': 0.6057029705000123, 'learning_rate': 1.1020307207311244e-06, 'epoch': 0.79}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/31397.png 2025-08-28 22:07:26.659208 load time: 1143.02 ms
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908197 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31350, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nA. 2\nB. 2.5\nC. 4.5\nD. 7\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
 79%|███████▉ | 17480/22095 [30:09:31<4:09:10, 3.24s/it] {'loss': 0.2907, 'grad_norm': 0.6705464585220849, 'learning_rate': 1.1015717448669045e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17481/22095 [30:09:35<4:33:00, 3.55s/it] {'loss': 0.478, 'grad_norm': 0.26897717434855983, 'learning_rate': 1.1011128527679332e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (90503 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110212 > 40960).
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17482/22095 [30:09:39<4:34:47, 3.57s/it] {'loss': 0.2897, 'grad_norm': 0.5829298651200642, 'learning_rate': 1.1006540444440738e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17483/22095 [30:09:48<6:52:53, 5.37s/it] {'loss': 0.4589, 'grad_norm': 0.27077008808618885, 'learning_rate': 1.100195319905182e-06, 'epoch': 0.79}
 79%|███████▉ | 17484/22095 [30:09:52<6:15:09, 4.88s/it] {'loss': 0.3433, 'grad_norm': 0.6846489277337233, 'learning_rate': 1.0997366791611165e-06, 'epoch': 0.79}
 79%|███████▉ | 17485/22095 [30:09:55<5:29:30, 4.29s/it] {'loss': 0.3462, 'grad_norm': 0.6362951825644277, 'learning_rate': 1.0992781222217291e-06, 'epoch': 0.79}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8364311 in VC:s3://internvl-moe-sft-data/. Exception: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31051, 'image': 'vrdu_table_final_2/astro-ph.CO/3a6de9b2-4c4c-49f7-a281-d3fcc0757ef5.png', 'image_wh': [[231, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{c}Number of clusters\\end{tabular}\n```"}]}
 79%|███████▉ | 17486/22095 [30:09:58<5:06:06, 3.98s/it] {'loss': 0.2928, 'grad_norm': 0.5888260862616873, 'learning_rate': 1.0988196490968766e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 79%|███████▉ | 17487/22095 [30:10:01<4:38:46, 3.63s/it] {'loss': 0.2874, 'grad_norm': 0.6074798482109431, 'learning_rate': 1.0983612597964065e-06, 'epoch': 0.79}
 79%|███████▉ | 17488/22095 [30:10:05<4:55:17, 3.85s/it] {'loss': 0.2783, 'grad_norm': 0.6282389175937101, 'learning_rate': 1.0979029543301718e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (42172 > 40960).
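The repeated "ValueError: Image size […] is too small. Minimum size is 28." failures all involve samples whose image has at least one side below 28 px (e.g. the 0x0 GIF, the 223x23 and 231x23 table crops). A minimal sketch of pre-filtering such samples offline so they never reach the dataloader; `image_is_usable` is an illustrative name, not the training code's actual check:

```python
# Hypothetical offline filter for the "Image size ... is too small.
# Minimum size is 28." failures seen in the log: both image dimensions
# must be at least 28 px before the sample is kept.
MIN_SIDE = 28  # minimum size reported in the ValueError messages

def image_is_usable(width, height, min_side=MIN_SIDE):
    """True when both dimensions meet the minimum size."""
    return width >= min_side and height >= min_side

# Width/height pairs taken from failures in this log:
print(image_is_usable(0, 0))      # False: the 0x0 taobao/1513.gif sample
print(image_is_usable(223, 23))   # False: the 223x23 UniGeo sample
print(image_is_usable(640, 480))  # True
```

Filtering once at dataset-preparation time avoids the retry machinery ("[Try #0] Failed to fetch sample …") paying the cost of loading and rejecting the same bad samples every epoch.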
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17489/22095 [30:10:09<4:42:25, 3.68s/it] {'loss': 0.319, 'grad_norm': 0.5872783503668951, 'learning_rate': 1.0974447327080185e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17490/22095 [30:10:15<5:50:07, 4.56s/it] {'loss': 0.4732, 'grad_norm': 0.27033500104217156, 'learning_rate': 1.0969865949397902e-06, 'epoch': 0.79}
 79%|███████▉ | 17491/22095 [30:10:19<5:28:28, 4.28s/it] {'loss': 0.2963, 'grad_norm': 0.5968647069455049, 'learning_rate': 1.0965285410353326e-06, 'epoch': 0.79}
 79%|███████▉ | 17492/22095 [30:10:22<5:12:11, 4.07s/it] {'loss': 0.2741, 'grad_norm': 0.5431438516873746, 'learning_rate': 1.09607057100449e-06, 'epoch': 0.79}
 79%|███████▉ | 17493/22095 [30:10:25<4:47:56, 3.75s/it] {'loss': 0.3069, 'grad_norm': 0.5890407122521515, 'learning_rate': 1.0956126848571004e-06, 'epoch': 0.79}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [909, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350482 in VC:s3://internvl-moe-sft-data/. Exception: Image size [909, 6, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17156, 'image': 'vrdu_table_final_2/astro-ph.CO/039de011-9137-42ce-8dad-011ba9829303.png', 'image_wh': [[909, 6]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.45\\textwidth}}\n\\hline \\\\\n\\end{tabular}\n```'}]}
 79%|███████▉ | 17494/22095 [30:10:29<4:47:42, 3.75s/it] {'loss': 0.311, 'grad_norm': 0.6542280291407335, 'learning_rate': 1.0951548826030018e-06, 'epoch': 0.79}
 79%|███████▉ | 17495/22095 [30:10:33<4:46:16, 3.73s/it] {'loss': 0.2977, 'grad_norm': 0.5326352594264989, 'learning_rate': 1.0946971642520327e-06, 'epoch': 0.79}
 79%|███████▉ | 17496/22095 [30:10:36<4:43:36, 3.70s/it] {'loss': 0.367, 'grad_norm': 0.6390072525495386, 'learning_rate': 1.0942395298140262e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17497/22095 [30:10:46<7:06:45, 5.57s/it] {'loss': 0.4745, 'grad_norm': 0.25659664659858095, 'learning_rate': 1.0937819792988186e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (57606 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52862 > 40960).
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17498/22095 [30:10:50<6:11:59, 4.86s/it] {'loss': 0.2549, 'grad_norm': 0.6051771049858189, 'learning_rate': 1.0933245127162373e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (44234 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48297 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54803 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41165 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93798 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17499/22095 [30:10:53<5:48:10, 4.55s/it] {'loss': 0.2758, 'grad_norm': 0.6346089264271125, 'learning_rate': 1.0928671300761152e-06, 'epoch': 0.79}
 79%|███████▉ | 17500/22095 [30:10:57<5:19:02, 4.17s/it] {'loss': 0.3607, 'grad_norm': 0.7071202798055578, 'learning_rate': 1.092409831388277e-06, 'epoch': 0.79}
 79%|███████▉ | 17501/22095 [30:11:00<4:55:28, 3.86s/it] {'loss': 0.2941, 'grad_norm': 0.6110020712418186, 'learning_rate': 1.091952616662552e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (66051 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75578 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73747 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17502/22095 [30:11:03<4:47:13, 3.75s/it] {'loss': 0.2926, 'grad_norm': 0.8158655254030276, 'learning_rate': 1.0914954859087629e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (41127 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53258 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61216 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82891 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17503/22095 [30:11:06<4:30:35, 3.54s/it] {'loss': 0.2774, 'grad_norm': 0.6683921375881807, 'learning_rate': 1.0910384391367296e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (42403 > 40960).
Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17504/22095 [30:11:10<4:30:29, 3.54s/it] {'loss': 0.3091, 'grad_norm': 0.5978056609012776, 'learning_rate': 1.0905814763562755e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (45394 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78151 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77762 > 40960). Running this sequence through the model will result in indexing errors
 79%|███████▉ | 17505/22095 [30:11:13<4:21:15, 3.42s/it] {'loss': 0.2897, 'grad_norm': 0.6267850613795837, 'learning_rate': 1.0901245975772207e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17506/22095 [30:11:22<6:38:51, 5.21s/it] {'loss': 0.4773, 'grad_norm': 0.25613541359831193, 'learning_rate': 1.0896678028093777e-06, 'epoch': 0.79}
 79%|███████▉ | 17507/22095 [30:11:32<8:18:01, 6.51s/it] {'loss': 0.4721, 'grad_norm': 0.28159218026235666, 'learning_rate': 1.0892110920625643e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [62, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8365273 in VC:s3://internvl-moe-sft-data/. Exception: Image size [62, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32014, 'image': 'vrdu_table_final_2/astro-ph.CO/1772b57e-b7fd-4a05-93b7-e129c0de3411.png', 'image_wh': [[62, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}{\\bf SED} \\end{tabular}\n```"}]}
 79%|███████▉ | 17508/22095 [30:11:37<7:35:47, 5.96s/it] {'loss': 0.3366, 'grad_norm': 0.681314188020368, 'learning_rate': 1.0887544653465942e-06, 'epoch': 0.79}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [489, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8416367 in VC:s3://internvl-moe-sft-data/. Exception: Image size [489, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17331, 'image': 'vrdu_texteq/astro-ph.CO/b3d06ff3-9cf4-4405-83c3-0d30a554e718.png', 'image_wh': [[489, 23]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'where the flux must be $K$\\,-\\,corrected as\\,:'}]}
 79%|███████▉ | 17509/22095 [30:11:40<6:44:42, 5.29s/it] {'loss': 0.2951, 'grad_norm': 0.7454404200334943, 'learning_rate': 1.0882979226712782e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 79%|███████▉ | 17510/22095 [30:11:50<8:26:57, 6.63s/it] {'loss': 0.4393, 'grad_norm': 0.2699858245041074, 'learning_rate': 1.0878414640464247e-06, 'epoch': 0.79}
 79%|███████▉ | 17511/22095 [30:12:00<9:31:24, 7.48s/it] {'loss': 0.455, 'grad_norm': 0.27894602885659225, 'learning_rate': 1.0873850894818433e-06, 'epoch': 0.79}
 79%|███████▉ | 17512/22095 [30:12:09<10:26:45, 8.21s/it] {'loss': 0.4873, 'grad_norm': 0.29535618246704853, 'learning_rate': 1.0869287989873406e-06, 'epoch': 0.79}
 79%|███████▉ | 17513/22095 [30:12:19<10:54:28, 8.57s/it] {'loss': 0.4604, 'grad_norm': 0.27670213394183, 'learning_rate': 1.0864725925727198e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 79%|███████▉ | 17514/22095 [30:12:22<8:50:32, 6.95s/it] {'loss': 0.2957, 'grad_norm': 1.046633322989338, 'learning_rate': 1.0860164702477826e-06, 'epoch': 0.79}
 79%|███████▉ | 17515/22095 [30:12:25<7:28:46, 5.88s/it] {'loss': 0.3571, 'grad_norm': 0.6662828314797252, 'learning_rate': 1.0855604320223317e-06, 'epoch': 0.79}
Token indices sequence length is longer
than the specified maximum sequence length for this model (44043 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119852 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86884 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17516/22095 [30:12:29<6:39:25, 5.23s/it] {'loss': 0.2822, 'grad_norm': 0.6028625071373115, 'learning_rate': 1.085104477906163e-06, 'epoch': 0.79} 79%|███████▉ | 17516/22095 [30:12:29<6:39:25, 5.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (130308 > 40960). Running this sequence through the model will result in indexing errors 79%|███████▉ | 17517/22095 [30:12:33<6:07:18, 4.81s/it] {'loss': 0.3176, 'grad_norm': 0.7617858158323627, 'learning_rate': 1.0846486079090773e-06, 'epoch': 0.79} 79%|███████▉ | 17517/22095 [30:12:33<6:07:18, 4.81s/it] 79%|███████▉ | 17518/22095 [30:12:36<5:30:20, 4.33s/it] {'loss': 0.2881, 'grad_norm': 0.6198032078988815, 'learning_rate': 1.0841928220408682e-06, 'epoch': 0.79} 79%|███████▉ | 17518/22095 [30:12:36<5:30:20, 4.33s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365008 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 31749, 'image': 'vrdu_table_final_2/astro-ph.CO/fe73279a-7c06-46da-b113-7f494d0ef61c.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 79%|███████▉ | 17519/22095 [30:12:39<4:59:35, 3.93s/it] {'loss': 0.2502, 'grad_norm': 0.595029698665207, 'learning_rate': 1.0837371203113266e-06, 'epoch': 0.79} 79%|███████▉ | 17519/22095 [30:12:39<4:59:35, 3.93s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 79%|███████▉ | 17520/22095 [30:12:42<4:32:31, 3.57s/it] {'loss': 0.2761, 'grad_norm': 0.6069144135300447, 'learning_rate': 1.0832815027302473e-06, 'epoch': 0.79} 79%|███████▉ | 17520/22095 [30:12:42<4:32:31, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (40995 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46444 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79654 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90936 > 40960). 
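The repeated `ValueError: Image size [...] is too small. Minimum size is 28.` failures above come from a minimum-resolution check on training images. A minimal sketch of such a check follows; the helper names are hypothetical, and the 28-px floor reflects that Qwen2.5-VL-style vision encoders merge 2x2 groups of 14-px patches, so anything smaller cannot produce a single merged vision token:

```python
def check_min_image_size(width: int, height: int, min_size: int = 28) -> None:
    # Raise the same kind of error the training log shows for undersized images.
    if min(width, height) < min_size:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_size}."
        )


def upscale_to_min_size(width: int, height: int, min_size: int = 28) -> tuple[int, int]:
    # Alternative to rejecting: scale the short side up to the minimum,
    # preserving the aspect ratio.
    scale = max(1.0, min_size / min(width, height))
    return round(width * scale), round(height * scale)
```

For the 62x23 table crop logged above, the check raises, while the upscaling variant would return a 75x28 image instead of discarding the sample.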
Running this sequence through the model will result in indexing errors
79%|███████▉ | 17521/22095 [30:12:45<4:24:52, 3.47s/it] {'loss': 0.3101, 'grad_norm': 0.626882704408546, 'learning_rate': 1.08282596930742e-06, 'epoch': 0.79}
79%|███████▉ | 17522/22095 [30:12:49<4:33:52, 3.59s/it] {'loss': 0.2865, 'grad_norm': 0.5670749951186008, 'learning_rate': 1.0823705200526325e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (85911 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48977 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79907 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17523/22095 [30:12:53<4:38:55, 3.66s/it] {'loss': 0.3193, 'grad_norm': 0.5826516551992915, 'learning_rate': 1.0819151549756685e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17524/22095 [30:13:01<6:22:49, 5.02s/it] {'loss': 0.4747, 'grad_norm': 0.27305265116969735, 'learning_rate': 1.081459874086316e-06, 'epoch': 0.79}
79%|███████▉ | 17525/22095 [30:13:05<5:53:20, 4.64s/it] {'loss': 0.3022, 'grad_norm': 0.7086693801004199, 'learning_rate': 1.0810046773943544e-06, 'epoch': 0.79}
79%|███████▉ | 17526/22095 [30:13:09<5:34:47, 4.40s/it] {'loss': 0.2586, 'grad_norm': 0.5747501021359275, 'learning_rate': 1.0805495649095676e-06, 'epoch': 0.79}
79%|███████▉ | 17527/22095 [30:13:13<5:42:51, 4.50s/it] {'loss': 0.277, 'grad_norm': 0.6506436875014426, 'learning_rate': 1.0800945366417316e-06, 'epoch': 0.79}
79%|███████▉ | 17528/22095 [30:13:18<5:41:24, 4.49s/it] {'loss': 0.3037, 'grad_norm': 0.6422115703988123, 'learning_rate': 1.0796395926006258e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (59320 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68329 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17529/22095 [30:13:27<7:33:32, 5.96s/it] {'loss': 0.452, 'grad_norm': 0.25125905669770676, 'learning_rate': 1.0791847327960236e-06, 'epoch': 0.79}
79%|███████▉ | 17530/22095 [30:13:31<6:33:29, 5.17s/it] {'loss': 0.2964, 'grad_norm': 0.719021657172756, 'learning_rate': 1.0787299572377015e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17531/22095 [30:13:37<7:00:23, 5.53s/it] {'loss': 0.4507, 'grad_norm': 0.28278646840808924, 'learning_rate': 1.078275265935429e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (63613 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17532/22095 [30:13:40<6:05:17, 4.80s/it] {'loss': 0.268, 'grad_norm': 0.6185083956022757, 'learning_rate': 1.0778206588989748e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
79%|███████▉ | 17533/22095 [30:13:44<5:47:30, 4.57s/it] {'loss': 0.3169, 'grad_norm': 0.6212316194701377, 'learning_rate': 1.0773661361381088e-06, 'epoch': 0.79}
79%|███████▉ | 17534/22095 [30:13:47<5:12:51, 4.12s/it] {'loss': 0.3188, 'grad_norm': 0.6590947742085103, 'learning_rate': 1.0769116976625998e-06, 'epoch': 0.79}
79%|███████▉ | 17535/22095 [30:13:50<4:52:38, 3.85s/it] {'loss': 0.2602, 'grad_norm': 0.5734631130915846, 'learning_rate': 1.0764573434822067e-06, 'epoch': 0.79}
79%|███████▉ | 17536/22095 [30:13:55<5:06:17, 4.03s/it] {'loss': 0.2786, 'grad_norm': 0.5968917546790733, 'learning_rate': 1.0760030736066952e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17537/22095 [30:14:05<7:36:25, 6.01s/it] {'loss': 0.4527, 'grad_norm': 0.2466211171565022, 'learning_rate': 1.075548888045827e-06, 'epoch': 0.79}
79%|███████▉ | 17538/22095 [30:14:09<6:50:43, 5.41s/it] {'loss': 0.3244, 'grad_norm': 0.6092165463869262, 'learning_rate': 1.0750947868093608e-06, 'epoch': 0.79}
79%|███████▉ | 17539/22095 [30:14:13<6:03:46, 4.79s/it] {'loss': 0.2905, 'grad_norm': 0.6165602362370749, 'learning_rate': 1.0746407699070516e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17540/22095 [30:14:22<7:48:39, 6.17s/it] {'loss': 0.4509, 'grad_norm': 0.25864134317846904, 'learning_rate': 1.0741868373486564e-06, 'epoch': 0.79}
79%|███████▉ | 17541/22095 [30:14:26<6:53:47, 5.45s/it] {'loss': 0.3143, 'grad_norm': 0.6255255495885236, 'learning_rate': 1.0737329891439303e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (88836 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86548 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106078 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41742 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17542/22095 [30:14:29<6:02:32, 4.78s/it] {'loss': 0.2957, 'grad_norm': 0.5879664093941299, 'learning_rate': 1.0732792253026231e-06, 'epoch': 0.79}
79%|███████▉ | 17543/22095 [30:14:33<5:40:04, 4.48s/it] {'loss': 0.3047, 'grad_norm': 0.7776972902240687, 'learning_rate': 1.0728255458344843e-06, 'epoch': 0.79}
79%|███████▉ | 17544/22095 [30:14:36<5:07:49, 4.06s/it] {'loss': 0.3092, 'grad_norm': 0.6120374867402096, 'learning_rate': 1.0723719507492648e-06, 'epoch': 0.79}
79%|███████▉ | 17545/22095 [30:14:40<5:03:31, 4.00s/it] {'loss': 0.3109, 'grad_norm': 0.6156840091395118, 'learning_rate': 1.0719184400567078e-06, 'epoch': 0.79}
79%|███████▉ | 17546/22095 [30:14:43<4:39:48, 3.69s/it] {'loss': 0.3113, 'grad_norm': 0.6428936730218494, 'learning_rate': 1.0714650137665604e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (103196 > 40960).
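The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings mean some samples tokenize past the model's context window. One way to surface these offenders before training is a pre-filter; `tokenize` and the sample schema below are illustrative assumptions, not the training script's actual API:

```python
def filter_overlong_samples(samples, tokenize, max_len=40960):
    # Partition samples by tokenized length. `tokenize` is any callable
    # returning a list of token ids (a stand-in for the real tokenizer call).
    kept, dropped = [], []
    for sample in samples:
        n_tokens = len(tokenize(sample["text"]))
        if n_tokens > max_len:
            dropped.append((sample["id"], n_tokens))  # cf. "(45394 > 40960)"
        else:
            kept.append(sample)
    return kept, dropped
```

Dropping (or truncating) up front avoids the indexing errors the warning predicts once positions beyond the model's maximum are embedded.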
Running this sequence through the model will result in indexing errors
79%|███████▉ | 17547/22095 [30:14:46<4:37:29, 3.66s/it] {'loss': 0.3354, 'grad_norm': 0.6156742053106594, 'learning_rate': 1.071011671888565e-06, 'epoch': 0.79}
79%|███████▉ | 17548/22095 [30:14:50<4:24:43, 3.49s/it] {'loss': 0.3165, 'grad_norm': 0.7783029452965077, 'learning_rate': 1.07055841443246e-06, 'epoch': 0.79}
79%|███████▉ | 17549/22095 [30:14:53<4:28:48, 3.55s/it] {'loss': 0.2703, 'grad_norm': 0.6133884955674895, 'learning_rate': 1.070105241407986e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17550/22095 [30:15:03<6:41:30, 5.30s/it] {'loss': 0.4519, 'grad_norm': 0.26731785140084197, 'learning_rate': 1.0696521528248822e-06, 'epoch': 0.79}
79%|███████▉ | 17551/22095 [30:15:07<6:09:25, 4.88s/it] {'loss': 0.2993, 'grad_norm': 0.7314874059177867, 'learning_rate': 1.0691991486928826e-06, 'epoch': 0.79}
79%|███████▉ | 17552/22095 [30:15:10<5:27:29, 4.33s/it] {'loss': 0.3224, 'grad_norm': 0.5885684214964538, 'learning_rate': 1.0687462290217193e-06, 'epoch': 0.79}
79%|███████▉ | 17553/22095 [30:15:13<5:03:46, 4.01s/it] {'loss': 0.3066, 'grad_norm': 0.6464653570859419, 'learning_rate': 1.0682933938211272e-06, 'epoch': 0.79}
79%|███████▉ | 17554/22095 [30:15:16<4:49:18, 3.82s/it] {'loss': 0.2899, 'grad_norm': 0.6038104443874369, 'learning_rate': 1.067840643100833e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (111216 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110352 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74331 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17555/22095 [30:15:23<6:02:38, 4.79s/it] {'loss': 0.4692, 'grad_norm': 0.26682934326948565, 'learning_rate': 1.0673879768705681e-06, 'epoch': 0.79}
79%|███████▉ | 17556/22095 [30:15:27<5:37:55, 4.47s/it] {'loss': 0.31, 'grad_norm': 0.582118110369442, 'learning_rate': 1.0669353951400563e-06, 'epoch': 0.79}
Token indices sequence length is longer than the specified maximum sequence length for this model (87896 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64345 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17557/22095 [30:15:30<5:03:09, 4.01s/it] {'loss': 0.2951, 'grad_norm': 1.0647840750997166, 'learning_rate': 1.066482897919025e-06, 'epoch': 0.79}
79%|███████▉ | 17558/22095 [30:15:34<5:00:17, 3.97s/it] {'loss': 0.3115, 'grad_norm': 0.569722159501484, 'learning_rate': 1.0660304852171932e-06, 'epoch': 0.79}
79%|███████▉ | 17559/22095 [30:15:38<5:00:16, 3.97s/it] {'loss': 0.2882, 'grad_norm': 0.5872530009817197, 'learning_rate': 1.0655781570442864e-06, 'epoch': 0.79}
79%|███████▉ | 17560/22095 [30:15:41<4:50:02, 3.84s/it] {'loss': 0.2829, 'grad_norm': 0.6712127830923466, 'learning_rate': 1.0651259134100205e-06, 'epoch': 0.79}
79%|███████▉ | 17561/22095 [30:15:45<4:50:12, 3.84s/it] {'loss': 0.2982, 'grad_norm': 0.6292811854227739, 'learning_rate': 1.0646737543241125e-06, 'epoch': 0.79}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348832 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15502, 'image': 'vrdu_table_final_2/astro-ph.CO/2d315550-949f-4d13-b24f-062aeafeb155.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
79%|███████▉ | 17562/22095 [30:15:49<4:39:49, 3.70s/it] {'loss': 0.2927, 'grad_norm': 0.5772369823628817, 'learning_rate': 1.0642216797962795e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
79%|███████▉ | 17563/22095 [30:15:58<6:48:44, 5.41s/it] {'loss': 0.4438, 'grad_norm': 0.270961324115532, 'learning_rate': 1.063769689836237e-06, 'epoch': 0.79}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
79%|███████▉ | 17564/22095 [30:16:02<6:09:35, 4.89s/it] {'loss': 0.3384, 'grad_norm': 0.8262430223157452, 'learning_rate': 1.0633177844536924e-06, 'epoch': 0.79}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42324 > 40960). Running this sequence through the model will result in indexing errors
79%|███████▉ | 17565/22095 [30:16:11<7:52:47, 6.26s/it] {'loss': 0.4954, 'grad_norm': 0.3042669338076642, 'learning_rate': 1.0628659636583577e-06, 'epoch': 0.79}
80%|███████▉ | 17566/22095 [30:16:15<7:06:14, 5.65s/it] {'loss': 0.2735, 'grad_norm': 0.6161863340896231, 'learning_rate': 1.0624142274599425e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8944625 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 67778, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 13cm\nB. 11cm\nC. 12cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
80%|███████▉ | 17567/22095 [30:16:19<6:24:15, 5.09s/it] {'loss': 0.2773, 'grad_norm': 1.033689895662228, 'learning_rate': 1.061962575868153e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|███████▉ | 17568/22095 [30:16:28<7:54:17, 6.29s/it] {'loss': 0.4553, 'grad_norm': 0.3166101438817103, 'learning_rate': 1.061511008892691e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (92374 > 40960). Running this sequence through the model will result in indexing errors
80%|███████▉ | 17569/22095 [30:16:33<7:15:58, 5.78s/it] {'loss': 0.2797, 'grad_norm': 0.6893218205777645, 'learning_rate': 1.0610595265432615e-06, 'epoch': 0.8}
80%|███████▉ | 17570/22095 [30:16:36<6:26:56, 5.13s/it] {'loss': 0.2648, 'grad_norm': 0.65132750453131, 'learning_rate': 1.0606081288295666e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8934538 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 57691, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nA. 6\nB. 7.5\nC. 8\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
80%|███████▉ | 17571/22095 [30:16:39<5:39:47, 4.51s/it] {'loss': 0.2626, 'grad_norm': 0.700838971389305, 'learning_rate': 1.060156815761304e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (63252 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100352 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59947 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67913 > 40960).
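The `[Try #0] Failed to fetch sample ...` lines followed by a `Problematic sample: {...}` dump suggest the dataset catches `__getitem__` failures and substitutes another sample rather than crashing the run. A sketch of one common pattern follows (sequential fallback to the next index; the actual retry policy in data_qwen_2.py is not shown in the log and may differ):

```python
class SkipBadSamplesDataset:
    # Wraps a dataset whose __getitem__ may raise (e.g. "Image size ... is
    # too small"): log a "[Try #N] Failed to fetch sample" line and fall
    # back to the next index, up to max_tries attempts.
    def __init__(self, base, max_tries=10):
        self.base = base
        self.max_tries = max_tries

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        for attempt in range(self.max_tries):
            try:
                return self.base[idx]
            except ValueError as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {idx}. Exception: {exc}")
                idx = (idx + 1) % len(self.base)
        raise RuntimeError(f"gave up after {self.max_tries} failed samples")
```

The trade-off is a silently shifted data distribution: persistently bad samples are replaced by their neighbors, which is why logging the problematic sample is valuable.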
Running this sequence through the model will result in indexing errors
80%|███████▉ | 17572/22095 [30:16:43<5:25:53, 4.32s/it] {'loss': 0.2909, 'grad_norm': 0.7335130980432949, 'learning_rate': 1.05970558734817e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|███████▉ | 17573/22095 [30:16:53<7:27:19, 5.94s/it] {'loss': 0.4322, 'grad_norm': 0.25704225033621225, 'learning_rate': 1.05925443599862e-06, 'epoch': 0.8}
80%|███████▉ | 17574/22095 [30:16:57<6:38:16, 5.29s/it] {'loss': 0.2754, 'grad_norm': 0.6633779095086854, 'learning_rate': 1.058803384526072e-06, 'epoch': 0.8}
80%|███████▉ | 17575/22095 [30:17:01<6:04:07, 4.83s/it] {'loss': 0.2911, 'grad_norm': 0.5887178006824763, 'learning_rate': 1.0583524101364945e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (52641 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63378 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55067 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82287 > 40960). Running this sequence through the model will result in indexing errors
80%|███████▉ | 17576/22095 [30:17:04<5:24:58, 4.31s/it] {'loss': 0.2468, 'grad_norm': 0.6230379356817728, 'learning_rate': 1.0579015204408172e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|███████▉ | 17577/22095 [30:17:11<6:39:50, 5.31s/it] {'loss': 0.4821, 'grad_norm': 0.27580698258235353, 'learning_rate': 1.0574507154487279e-06, 'epoch': 0.8}
80%|███████▉ | 17578/22095 [30:17:15<6:00:02, 4.78s/it] {'loss': 0.355, 'grad_norm': 0.7505786064339385, 'learning_rate': 1.0569999951699145e-06, 'epoch': 0.8}
80%|███████▉ | 17579/22095 [30:17:18<5:24:12, 4.31s/it] {'loss': 0.2797, 'grad_norm': 0.5648279984944123, 'learning_rate': 1.056549359614062e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|███████▉ | 17580/22095 [30:17:29<7:52:12, 6.28s/it] {'loss': 0.4642, 'grad_norm': 0.2638095897738326, 'learning_rate': 1.0560988087908525e-06, 'epoch': 0.8}
80%|███████▉ | 17581/22095 [30:17:32<6:43:25, 5.36s/it] {'loss': 0.2654, 'grad_norm': 0.6005269717461627, 'learning_rate': 1.0556483427099656e-06, 'epoch': 0.8}
80%|███████▉ | 17582/22095 [30:17:35<5:53:27, 4.70s/it] {'loss': 0.2949, 'grad_norm': 0.6777576064315107, 'learning_rate': 1.0551979613810814e-06, 'epoch': 0.8}
80%|███████▉ | 17583/22095 [30:17:39<5:36:28, 4.47s/it] {'loss': 0.2811, 'grad_norm': 0.599855625140221, 'learning_rate': 1.0547476648138794e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|███████▉ | 17584/22095 [30:17:45<6:14:26, 4.98s/it] {'loss': 0.45, 'grad_norm': 0.25174656246019217, 'learning_rate': 1.0542974530180327e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|███████▉ | 17585/22095 [30:17:50<6:00:46, 4.80s/it] {'loss': 0.2858, 'grad_norm': 0.7033764647057604, 'learning_rate': 1.053847326003214e-06, 'epoch': 0.8}
80%|███████▉ | 17586/22095 [30:17:53<5:22:56, 4.30s/it] {'loss': 0.2377, 'grad_norm': 0.5999488290277657, 'learning_rate': 1.0533972837790985e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (41137 > 40960). Running this sequence through the model will result in indexing errors
80%|███████▉ | 17587/22095 [30:17:56<4:56:48, 3.95s/it] {'loss': 0.2866, 'grad_norm': 0.5964618331830029, 'learning_rate': 1.0529473263553524e-06, 'epoch': 0.8}
80%|███████▉ | 17588/22095 [30:18:00<4:52:22, 3.89s/it] {'loss': 0.2431, 'grad_norm': 0.6255752090349693, 'learning_rate': 1.052497453741647e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8393046 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 59877, 'image': 'vrdu_table_final_2/astro-ph.EP/efe7aaaa-e8a0-42fd-ab62-cafff85b8c69.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llllcll}\n :\n \\end{tabular}\n```"}]}
80%|███████▉ | 17589/22095 [30:18:04<4:59:39, 3.99s/it] {'loss': 0.2714, 'grad_norm': 0.6280639343828609, 'learning_rate': 1.052047665947648e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1034, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8521838 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1034, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 124534, 'image': 'vrdu_texteq/astro-ph.CO/49b56124-4010-4da2-bf36-3ab663dd2f55.png', 'image_wh': [[1034, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'where $H_i$ are the values of the Hubble function at redshift $z_i$ measured with error $\\sigma_i$.'}]}
80%|███████▉ | 17590/22095 [30:18:07<4:44:27, 3.79s/it] {'loss': 0.3195, 'grad_norm': 0.6175233941700501, 'learning_rate': 1.051597962983018e-06, 'epoch': 0.8}
80%|███████▉ | 17591/22095 [30:18:10<4:24:22, 3.52s/it] {'loss': 0.2878, 'grad_norm': 0.595813377188827, 'learning_rate': 1.0511483448574212e-06, 'epoch': 0.8}
80%|███████▉ | 17592/22095 [30:18:13<4:11:34, 3.35s/it] {'loss': 0.2934, 'grad_norm': 0.6555253013213158, 'learning_rate': 1.0506988115805212e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (81925 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83310 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79768 > 40960).
Running this sequence through the model will result in indexing errors 80%|███████▉ | 17593/22095 [30:18:16<4:01:47, 3.22s/it] {'loss': 0.2639, 'grad_norm': 0.6763322039481978, 'learning_rate': 1.0502493631619715e-06, 'epoch': 0.8} 80%|███████▉ | 17593/22095 [30:18:16<4:01:47, 3.22s/it] 80%|███████▉ | 17594/22095 [30:18:21<4:32:28, 3.63s/it] {'loss': 0.2782, 'grad_norm': 0.6351978850263815, 'learning_rate': 1.0497999996114322e-06, 'epoch': 0.8} 80%|███████▉ | 17594/22095 [30:18:21<4:32:28, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|███████▉ | 17595/22095 [30:18:26<5:07:30, 4.10s/it] {'loss': 0.4738, 'grad_norm': 0.4986814228547615, 'learning_rate': 1.0493507209385606e-06, 'epoch': 0.8} 80%|███████▉ | 17595/22095 [30:18:26<5:07:30, 4.10s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17596/22095 [30:18:35<7:08:34, 5.72s/it] {'loss': 0.4641, 'grad_norm': 0.2834511240637836, 'learning_rate': 1.0489015271530084e-06, 'epoch': 0.8} 80%|███████▉ | 17596/22095 [30:18:35<7:08:34, 5.72s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 80%|███████▉ | 17597/22095 [30:18:39<6:27:59, 5.18s/it] {'loss': 0.3247, 'grad_norm': 0.6958528444526049, 'learning_rate': 1.0484524182644257e-06, 'epoch': 0.8} 80%|███████▉ | 17597/22095 [30:18:39<6:27:59, 5.18s/it] 80%|███████▉ | 17598/22095 [30:18:43<5:52:12, 4.70s/it] {'loss': 0.293, 'grad_norm': 0.6465101177250668, 'learning_rate': 1.0480033942824647e-06, 'epoch': 0.8} 80%|███████▉ | 17598/22095 [30:18:43<5:52:12, 4.70s/it] 80%|███████▉ | 17599/22095 [30:18:47<5:28:28, 4.38s/it] {'loss': 0.3043, 'grad_norm': 0.6326649070370953, 'learning_rate': 1.0475544552167744e-06, 'epoch': 0.8} 80%|███████▉ | 17599/22095 [30:18:47<5:28:28, 4.38s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", 
line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [575, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8490964 in VC:s3://internvl-moe-sft-data/. Exception: Image size [575, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11196, 'image': 'vrdu_texteq/astro-ph.CO/f999682e-63b2-4982-b4eb-e52dd354d901.png', 'image_wh': [[575, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': "We can then make the summation in $m$ and $m'$"}]} 80%|███████▉ | 17600/22095 [30:18:50<5:06:20, 4.09s/it] {'loss': 0.3071, 'grad_norm': 0.5894944985589974, 'learning_rate': 1.0471056010769997e-06, 'epoch': 0.8} 80%|███████▉ | 17600/22095 [30:18:50<5:06:20, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (75447 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42984 > 40960). 
Running this sequence through the model will result in indexing errors 80%|███████▉ | 17601/22095 [30:18:59<7:04:06, 5.66s/it] {'loss': 0.458, 'grad_norm': 0.2670228912312689, 'learning_rate': 1.0466568318727837e-06, 'epoch': 0.8} 80%|███████▉ | 17601/22095 [30:18:59<7:04:06, 5.66s/it] 80%|███████▉ | 17602/22095 [30:19:03<6:22:18, 5.11s/it] {'loss': 0.3272, 'grad_norm': 0.5992312412861361, 'learning_rate': 1.0462081476137726e-06, 'epoch': 0.8} 80%|███████▉ | 17602/22095 [30:19:03<6:22:18, 5.11s/it] 80%|███████▉ | 17603/22095 [30:19:06<5:32:20, 4.44s/it] {'loss': 0.2947, 'grad_norm': 0.7801764524156193, 'learning_rate': 1.0457595483096033e-06, 'epoch': 0.8} 80%|███████▉ | 17603/22095 [30:19:06<5:32:20, 4.44s/it] 80%|███████▉ | 17604/22095 [30:19:09<5:07:40, 4.11s/it] {'loss': 0.3329, 'grad_norm': 0.608868802735606, 'learning_rate': 1.0453110339699184e-06, 'epoch': 0.8} 80%|███████▉ | 17604/22095 [30:19:09<5:07:40, 4.11s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48042 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50549 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127681 > 40960). 
Running this sequence through the model will result in indexing errors 80%|███████▉ | 17605/22095 [30:19:13<4:49:45, 3.87s/it] {'loss': 0.2705, 'grad_norm': 0.7777214442910269, 'learning_rate': 1.0448626046043536e-06, 'epoch': 0.8} 80%|███████▉ | 17605/22095 [30:19:13<4:49:45, 3.87s/it] 80%|███████▉ | 17606/22095 [30:19:16<4:42:54, 3.78s/it] {'loss': 0.29, 'grad_norm': 0.6140735157977182, 'learning_rate': 1.0444142602225426e-06, 'epoch': 0.8} 80%|███████▉ | 17606/22095 [30:19:16<4:42:54, 3.78s/it] 80%|███████▉ | 17607/22095 [30:19:19<4:22:33, 3.51s/it] {'loss': 0.3553, 'grad_norm': 0.6711582640372278, 'learning_rate': 1.0439660008341208e-06, 'epoch': 0.8} 80%|███████▉ | 17607/22095 [30:19:19<4:22:33, 3.51s/it] 80%|███████▉ | 17608/22095 [30:19:23<4:30:21, 3.62s/it] {'loss': 0.3375, 'grad_norm': 0.6043826392582125, 'learning_rate': 1.0435178264487205e-06, 'epoch': 0.8} 80%|███████▉ | 17608/22095 [30:19:23<4:30:21, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8361509 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 28239, 'image': 'vrdu_table_final_2/astro-ph.CO/6eecf25a-4669-49ac-bb37-de69b5feec28.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 80%|███████▉ | 17609/22095 [30:19:33<6:51:44, 5.51s/it] {'loss': 0.4748, 'grad_norm': 0.2688407526676726, 'learning_rate': 1.0430697370759706e-06, 'epoch': 0.8} 80%|███████▉ | 17609/22095 [30:19:33<6:51:44, 5.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047830 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 
6.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 80%|███████▉ | 17610/22095 [30:19:36<6:04:56, 4.88s/it] {'loss': 0.257, 'grad_norm': 0.6964940237730056, 'learning_rate': 1.0426217327254984e-06, 'epoch': 0.8} 80%|███████▉ | 17610/22095 [30:19:36<6:04:56, 4.88s/it] 80%|███████▉ | 17611/22095 [30:19:39<5:19:55, 4.28s/it] {'loss': 0.2833, 'grad_norm': 0.6460807926911792, 'learning_rate': 1.0421738134069309e-06, 'epoch': 0.8} 80%|███████▉ | 17611/22095 [30:19:39<5:19:55, 4.28s/it] 80%|███████▉ | 17612/22095 [30:19:43<5:01:09, 4.03s/it] {'loss': 0.3434, 'grad_norm': 0.6468028147962669, 'learning_rate': 1.041725979129894e-06, 'epoch': 0.8} 80%|███████▉ | 17612/22095 [30:19:43<5:01:09, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|███████▉ | 17613/22095 [30:19:52<7:00:58, 5.64s/it] {'loss': 0.4458, 'grad_norm': 0.2694178792341854, 'learning_rate': 1.0412782299040086e-06, 'epoch': 0.8} 80%|███████▉ | 17613/22095 [30:19:52<7:00:58, 5.64s/it] 80%|███████▉ | 17614/22095 [30:20:01<8:17:57, 6.67s/it] {'loss': 0.4712, 'grad_norm': 0.296431835651895, 'learning_rate': 1.040830565738895e-06, 'epoch': 0.8} 80%|███████▉ | 17614/22095 [30:20:01<8:17:57, 6.67s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 80%|███████▉ | 17615/22095 [30:20:05<7:22:41, 5.93s/it] {'loss': 0.2945, 'grad_norm': 0.6637816879674655, 'learning_rate': 1.0403829866441734e-06, 'epoch': 0.8} 80%|███████▉ | 17615/22095 [30:20:05<7:22:41, 5.93s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17616/22095 [30:20:10<6:47:09, 5.45s/it] {'loss': 0.2797, 'grad_norm': 0.5937931470412802, 'learning_rate': 1.0399354926294596e-06, 'epoch': 0.8} 80%|███████▉ | 17616/22095 [30:20:10<6:47:09, 5.45s/it] 80%|███████▉ | 17617/22095 [30:20:14<6:13:57, 5.01s/it] {'loss': 0.2686, 'grad_norm': 0.5951980187062453, 'learning_rate': 
1.0394880837043708e-06, 'epoch': 0.8} 80%|███████▉ | 17617/22095 [30:20:14<6:13:57, 5.01s/it] 80%|███████▉ | 17618/22095 [30:20:18<5:49:18, 4.68s/it] {'loss': 0.2934, 'grad_norm': 0.6100560066934261, 'learning_rate': 1.0390407598785196e-06, 'epoch': 0.8} 80%|███████▉ | 17618/22095 [30:20:18<5:49:18, 4.68s/it] 80%|███████▉ | 17619/22095 [30:20:21<5:29:41, 4.42s/it] {'loss': 0.3198, 'grad_norm': 0.6433777122358736, 'learning_rate': 1.0385935211615156e-06, 'epoch': 0.8} 80%|███████▉ | 17619/22095 [30:20:21<5:29:41, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47568 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78400 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68901 > 40960). 
Running this sequence through the model will result in indexing errors 80%|███████▉ | 17620/22095 [30:20:25<5:07:51, 4.13s/it] {'loss': 0.3051, 'grad_norm': 0.5906931436439543, 'learning_rate': 1.0381463675629705e-06, 'epoch': 0.8} 80%|███████▉ | 17620/22095 [30:20:25<5:07:51, 4.13s/it] 80%|███████▉ | 17621/22095 [30:20:29<5:00:04, 4.02s/it] {'loss': 0.2657, 'grad_norm': 0.6337864263053438, 'learning_rate': 1.0376992990924934e-06, 'epoch': 0.8} 80%|███████▉ | 17621/22095 [30:20:29<5:00:04, 4.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17622/22095 [30:20:32<4:38:59, 3.74s/it] {'loss': 0.3196, 'grad_norm': 0.6088029034611621, 'learning_rate': 1.0372523157596892e-06, 'epoch': 0.8} 80%|███████▉ | 17622/22095 [30:20:32<4:38:59, 3.74s/it] 80%|███████▉ | 17623/22095 [30:20:35<4:38:31, 3.74s/it] {'loss': 0.2832, 'grad_norm': 0.599341414845403, 'learning_rate': 1.0368054175741605e-06, 'epoch': 0.8} 80%|███████▉ | 17623/22095 [30:20:35<4:38:31, 3.74s/it] 80%|███████▉ | 17624/22095 [30:20:39<4:36:06, 3.71s/it] {'loss': 0.3151, 'grad_norm': 0.6090217413328994, 'learning_rate': 1.0363586045455116e-06, 'epoch': 0.8} 80%|███████▉ | 17624/22095 [30:20:39<4:36:06, 3.71s/it] 80%|███████▉ | 17625/22095 [30:20:43<4:33:15, 3.67s/it] {'loss': 0.2896, 'grad_norm': 0.7199767068371429, 'learning_rate': 1.0359118766833449e-06, 'epoch': 0.8} 80%|███████▉ | 17625/22095 [30:20:43<4:33:15, 3.67s/it] 80%|███████▉ | 17626/22095 [30:20:46<4:19:28, 3.48s/it] {'loss': 0.2849, 'grad_norm': 0.6920834954483712, 'learning_rate': 1.0354652339972554e-06, 'epoch': 0.8} 80%|███████▉ | 17626/22095 [30:20:46<4:19:28, 3.48s/it] 80%|███████▉ | 17627/22095 [30:20:49<4:14:29, 3.42s/it] {'loss': 0.2696, 'grad_norm': 0.6620269494371919, 'learning_rate': 1.0350186764968412e-06, 'epoch': 0.8} 80%|███████▉ | 17627/22095 [30:20:49<4:14:29, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 
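The recurring `ValueError: Image size [...] is too small. Minimum size is 28` tracebacks all come from samples whose `image_wh` has a side shorter than 28 pixels; the loader raises mid-epoch and retries another sample. Such samples could instead be screened out up front. A minimal sketch, assuming samples shaped like the `Problematic sample` dicts in this log; `MIN_SIDE`, `is_valid_sample`, and `filter_samples` are hypothetical names, not part of the training code, and the 28-pixel threshold is taken from the error message:

```python
MIN_SIDE = 28  # minimum image side length, per the ValueError messages above


def is_valid_sample(sample: dict) -> bool:
    """Return False when any image in the sample has a side below MIN_SIDE."""
    for w, h in sample.get("image_wh", []):
        if min(w, h) < MIN_SIDE:
            return False
    return True


def filter_samples(samples: list) -> list:
    """Drop undersized-image samples before training instead of raising mid-epoch."""
    return [s for s in samples if is_valid_sample(s)]
```

Run once over the dataset index, this turns a per-step exception-and-retry into a one-time filtering pass.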
80%|███████▉ | 17628/22095 [30:20:58<6:28:21, 5.22s/it] {'loss': 0.4844, 'grad_norm': 0.43858767360734446, 'learning_rate': 1.0345722041917e-06, 'epoch': 0.8} 80%|███████▉ | 17628/22095 [30:20:58<6:28:21, 5.22s/it] 80%|███████▉ | 17629/22095 [30:21:02<5:46:23, 4.65s/it] {'loss': 0.2617, 'grad_norm': 0.6894559615869037, 'learning_rate': 1.0341258170914232e-06, 'epoch': 0.8} 80%|███████▉ | 17629/22095 [30:21:02<5:46:23, 4.65s/it] 80%|███████▉ | 17630/22095 [30:21:05<5:20:03, 4.30s/it] {'loss': 0.3109, 'grad_norm': 0.605112420834483, 'learning_rate': 1.0336795152056006e-06, 'epoch': 0.8} 80%|███████▉ | 17630/22095 [30:21:05<5:20:03, 4.30s/it] 80%|███████▉ | 17631/22095 [30:21:08<4:48:17, 3.87s/it] {'loss': 0.3024, 'grad_norm': 0.7878539249551214, 'learning_rate': 1.0332332985438248e-06, 'epoch': 0.8} 80%|███████▉ | 17631/22095 [30:21:08<4:48:17, 3.87s/it] 80%|███████▉ | 17632/22095 [30:21:11<4:34:32, 3.69s/it] {'loss': 0.2722, 'grad_norm': 0.5933180263637753, 'learning_rate': 1.0327871671156814e-06, 'epoch': 0.8} 80%|███████▉ | 17632/22095 [30:21:11<4:34:32, 3.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17633/22095 [30:21:14<4:17:52, 3.47s/it] {'loss': 0.2742, 'grad_norm': 0.5777220334047763, 'learning_rate': 1.0323411209307587e-06, 'epoch': 0.8} 80%|███████▉ | 17633/22095 [30:21:14<4:17:52, 3.47s/it] 80%|███████▉ | 17634/22095 [30:21:17<4:11:16, 3.38s/it] {'loss': 0.2725, 'grad_norm': 0.5305555011830864, 'learning_rate': 1.03189515999864e-06, 'epoch': 0.8} 80%|███████▉ | 17634/22095 [30:21:17<4:11:16, 3.38s/it] 80%|███████▉ | 17635/22095 [30:21:21<4:11:00, 3.38s/it] {'loss': 0.3186, 'grad_norm': 0.6397495079379393, 'learning_rate': 1.0314492843289053e-06, 'epoch': 0.8} 80%|███████▉ | 17635/22095 [30:21:21<4:11:00, 3.38s/it] 80%|███████▉ | 17636/22095 [30:21:24<4:07:30, 3.33s/it] {'loss': 0.2468, 'grad_norm': 0.5854544091507691, 'learning_rate': 1.0310034939311376e-06, 'epoch': 
0.8} 80%|███████▉ | 17636/22095 [30:21:24<4:07:30, 3.33s/it] 80%|███████▉ | 17637/22095 [30:21:28<4:22:16, 3.53s/it] {'loss': 0.2811, 'grad_norm': 0.6268437136490664, 'learning_rate': 1.030557788814916e-06, 'epoch': 0.8} 80%|███████▉ | 17637/22095 [30:21:28<4:22:16, 3.53s/it] 80%|███████▉ | 17638/22095 [30:21:32<4:36:08, 3.72s/it] {'loss': 0.3128, 'grad_norm': 0.61366665863228, 'learning_rate': 1.0301121689898158e-06, 'epoch': 0.8} 80%|███████▉ | 17638/22095 [30:21:32<4:36:08, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52282 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43571 > 40960). Running this sequence through the model will result in indexing errors 80%|███████▉ | 17639/22095 [30:21:35<4:28:02, 3.61s/it] {'loss': 0.3215, 'grad_norm': 0.6636046824576902, 'learning_rate': 1.0296666344654115e-06, 'epoch': 0.8} 80%|███████▉ | 17639/22095 [30:21:35<4:28:02, 3.61s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17640/22095 [30:21:39<4:28:26, 3.62s/it] {'loss': 0.312, 'grad_norm': 0.6466243191249332, 'learning_rate': 1.029221185251278e-06, 'epoch': 0.8} 80%|███████▉ | 17640/22095 [30:21:39<4:28:26, 3.62s/it] 80%|███████▉ | 17641/22095 [30:21:44<5:01:17, 4.06s/it] {'loss': 0.2783, 'grad_norm': 0.6267688365230768, 'learning_rate': 1.0287758213569865e-06, 'epoch': 0.8} 80%|███████▉ | 17641/22095 [30:21:44<5:01:17, 4.06s/it] 80%|███████▉ | 17642/22095 [30:21:47<4:37:42, 3.74s/it] {'loss': 0.2886, 'grad_norm': 0.6067358900754252, 'learning_rate': 1.0283305427921058e-06, 'epoch': 0.8} 80%|███████▉ | 17642/22095 [30:21:47<4:37:42, 3.74s/it] 80%|███████▉ | 17643/22095 [30:21:51<4:28:17, 3.62s/it] {'loss': 0.269, 'grad_norm': 0.6130856239014243, 'learning_rate': 1.0278853495662028e-06, 'epoch': 
0.8} 80%|███████▉ | 17643/22095 [30:21:51<4:28:17, 3.62s/it] 80%|███████▉ | 17644/22095 [30:21:54<4:22:11, 3.53s/it] {'loss': 0.2817, 'grad_norm': 0.6277865285234768, 'learning_rate': 1.0274402416888452e-06, 'epoch': 0.8} 80%|███████▉ | 17644/22095 [30:21:54<4:22:11, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|███████▉ | 17645/22095 [30:22:00<5:24:44, 4.38s/it] {'loss': 0.4845, 'grad_norm': 0.26982262436094706, 'learning_rate': 1.0269952191695948e-06, 'epoch': 0.8} 80%|███████▉ | 17645/22095 [30:22:00<5:24:44, 4.38s/it] 80%|███████▉ | 17646/22095 [30:22:10<7:15:55, 5.88s/it] {'loss': 0.486, 'grad_norm': 0.29022274123693, 'learning_rate': 1.0265502820180167e-06, 'epoch': 0.8} 80%|███████▉ | 17646/22095 [30:22:10<7:15:55, 5.88s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8938308 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 61461, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nA. 11cm\nB. 12cm\nC. 15cm\nD. 
13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 80%|███████▉ | 17647/22095 [30:22:17<7:54:29, 6.40s/it] {'loss': 0.472, 'grad_norm': 0.27151041749692006, 'learning_rate': 1.026105430243669e-06, 'epoch': 0.8} 80%|███████▉ | 17647/22095 [30:22:17<7:54:29, 6.40s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 80%|███████▉ | 17648/22095 [30:22:21<6:48:53, 5.52s/it] {'loss': 0.2523, 'grad_norm': 0.6086692202481577, 'learning_rate': 1.0256606638561094e-06, 'epoch': 0.8} 80%|███████▉ | 17648/22095 [30:22:21<6:48:53, 5.52s/it] 80%|███████▉ | 17649/22095 [30:22:25<6:26:13, 5.21s/it] {'loss': 0.3177, 'grad_norm': 0.6459863488738338, 'learning_rate': 1.0252159828648961e-06, 'epoch': 0.8} 80%|███████▉ | 17649/22095 [30:22:25<6:26:13, 5.21s/it] 80%|███████▉ | 17650/22095 [30:22:28<5:36:28, 4.54s/it] {'loss': 0.3043, 'grad_norm': 0.6423110726927362, 'learning_rate': 1.024771387279585e-06, 'epoch': 0.8} 80%|███████▉ | 17650/22095 [30:22:28<5:36:28, 4.54s/it] 80%|███████▉ | 17651/22095 [30:22:31<5:00:09, 4.05s/it] {'loss': 0.29, 'grad_norm': 0.6897569820488314, 'learning_rate': 1.024326877109728e-06, 'epoch': 0.8} 80%|███████▉ | 17651/22095 [30:22:31<5:00:09, 4.05s/it] 80%|███████▉ | 17652/22095 [30:22:35<4:51:14, 3.93s/it] {'loss': 0.2848, 'grad_norm': 0.6781956667845884, 'learning_rate': 1.0238824523648744e-06, 'epoch': 0.8} 80%|███████▉ | 17652/22095 [30:22:35<4:51:14, 3.93s/it] 80%|███████▉ | 17653/22095 [30:22:38<4:36:11, 3.73s/it] {'loss': 0.2834, 'grad_norm': 0.6070597055985785, 'learning_rate': 1.0234381130545757e-06, 'epoch': 0.8} 80%|███████▉ | 17653/22095 [30:22:38<4:36:11, 3.73s/it] 80%|███████▉ | 17654/22095 [30:22:42<4:42:06, 3.81s/it] {'loss': 0.2665, 'grad_norm': 0.6186120877052891, 'learning_rate': 1.0229938591883798e-06, 'epoch': 0.8} 80%|███████▉ | 17654/22095 [30:22:42<4:42:06, 3.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this 
model (62155 > 40960). Running this sequence through the model will result in indexing errors 80%|███████▉ | 17655/22095 [30:22:46<4:36:08, 3.73s/it] {'loss': 0.2771, 'grad_norm': 0.6134788953343151, 'learning_rate': 1.0225496907758314e-06, 'epoch': 0.8} 80%|███████▉ | 17655/22095 [30:22:46<4:36:08, 3.73s/it] 80%|███████▉ | 17656/22095 [30:22:48<4:11:56, 3.41s/it] {'loss': 0.3166, 'grad_norm': 0.6517754167771941, 'learning_rate': 1.022105607826473e-06, 'epoch': 0.8} 80%|███████▉ | 17656/22095 [30:22:48<4:11:56, 3.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [784, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8485032 in VC:s3://internvl-moe-sft-data/. Exception: Image size [784, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 75429, 'image': 'vrdu_texteq/astro-ph.CO/0cef0346-0fcb-4c78-ba02-41f905964efd.png', 'image_wh': [[784, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where $ a_{\\mathrm{end}} $ is the value of the scale factor at the end of inf\\mbox{}lation.'}]} 80%|███████▉ | 17657/22095 [30:22:51<3:59:26, 3.24s/it] {'loss': 0.2806, 'grad_norm': 0.672306692935697, 'learning_rate': 1.0216616103498494e-06, 'epoch': 0.8} 80%|███████▉ | 17657/22095 [30:22:51<3:59:26, 3.24s/it] 80%|███████▉ | 17658/22095 [30:22:54<3:56:40, 3.20s/it] {'loss': 0.3117, 'grad_norm': 0.6391489481315076, 'learning_rate': 1.021217698355499e-06, 'epoch': 0.8} 80%|███████▉ | 17658/22095 [30:22:54<3:56:40, 3.20s/it] 80%|███████▉ | 17659/22095 [30:22:57<3:58:27, 3.23s/it] {'loss': 0.2954, 'grad_norm': 0.6034580647484019, 'learning_rate': 1.0207738718529592e-06, 'epoch': 0.8} 80%|███████▉ | 17659/22095 [30:22:57<3:58:27, 3.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (88525 > 40960). 
Running this sequence through the model will result in indexing errors 80%|███████▉ | 17660/22095 [30:23:00<3:50:29, 3.12s/it] {'loss': 0.2504, 'grad_norm': 0.601464046082302, 'learning_rate': 1.0203301308517687e-06, 'epoch': 0.8} 80%|███████▉ | 17660/22095 [30:23:00<3:50:29, 3.12s/it] 80%|███████▉ | 17661/22095 [30:23:04<3:53:32, 3.16s/it] {'loss': 0.2727, 'grad_norm': 0.6401166761859084, 'learning_rate': 1.0198864753614602e-06, 'epoch': 0.8} 80%|███████▉ | 17661/22095 [30:23:04<3:53:32, 3.16s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17662/22095 [30:23:07<4:08:49, 3.37s/it] {'loss': 0.2765, 'grad_norm': 0.6225955998793017, 'learning_rate': 1.0194429053915683e-06, 'epoch': 0.8} 80%|███████▉ | 17662/22095 [30:23:07<4:08:49, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|███████▉ | 17663/22095 [30:23:17<6:22:35, 5.18s/it] {'loss': 0.4749, 'grad_norm': 0.26537395003035563, 'learning_rate': 1.0189994209516234e-06, 'epoch': 0.8} 80%|███████▉ | 17663/22095 [30:23:17<6:22:35, 5.18s/it] 80%|███████▉ | 17664/22095 [30:23:20<5:40:45, 4.61s/it] {'loss': 0.2843, 'grad_norm': 0.6540042865726453, 'learning_rate': 1.0185560220511525e-06, 'epoch': 0.8} 80%|███████▉ | 17664/22095 [30:23:20<5:40:45, 4.61s/it] 80%|███████▉ | 17665/22095 [30:23:23<5:01:50, 4.09s/it] {'loss': 0.3233, 'grad_norm': 0.6185663619453415, 'learning_rate': 1.018112708699685e-06, 'epoch': 0.8} 80%|███████▉ | 17665/22095 [30:23:23<5:01:50, 4.09s/it] 80%|███████▉ | 17666/22095 [30:23:26<4:48:28, 3.91s/it] {'loss': 0.2752, 'grad_norm': 0.647486381785589, 'learning_rate': 1.0176694809067471e-06, 'epoch': 0.8} 80%|███████▉ | 17666/22095 [30:23:26<4:48:28, 3.91s/it] 80%|███████▉ | 17667/22095 [30:23:30<4:49:55, 3.93s/it] {'loss': 0.3064, 'grad_norm': 0.6502279765653171, 'learning_rate': 1.0172263386818615e-06, 'epoch': 0.8} 80%|███████▉ | 17667/22095 [30:23:30<4:49:55, 3.93s/it]Invalidate 
trace cache @ step 2: expected module 1, but got module 364 80%|███████▉ | 17668/22095 [30:23:38<6:00:04, 4.88s/it] {'loss': 0.471, 'grad_norm': 0.2757197327489894, 'learning_rate': 1.016783282034548e-06, 'epoch': 0.8} 80%|███████▉ | 17668/22095 [30:23:38<6:00:04, 4.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|███████▉ | 17669/22095 [30:23:41<5:22:37, 4.37s/it] {'loss': 0.3186, 'grad_norm': 0.7585339264168486, 'learning_rate': 1.0163403109743287e-06, 'epoch': 0.8} 80%|███████▉ | 17669/22095 [30:23:41<5:22:37, 4.37s/it] 80%|███████▉ | 17670/22095 [30:23:44<5:04:47, 4.13s/it] {'loss': 0.3192, 'grad_norm': 0.5495025188913338, 'learning_rate': 1.0158974255107223e-06, 'epoch': 0.8} 80%|███████▉ | 17670/22095 [30:23:44<5:04:47, 4.13s/it] 80%|███████▉ | 17671/22095 [30:23:48<4:48:24, 3.91s/it] {'loss': 0.295, 'grad_norm': 0.6535560242829833, 'learning_rate': 1.0154546256532438e-06, 'epoch': 0.8} 80%|███████▉ | 17671/22095 [30:23:48<4:48:24, 3.91s/it] 80%|███████▉ | 17672/22095 [30:23:51<4:32:38, 3.70s/it] {'loss': 0.3458, 'grad_norm': 0.6595014318560279, 'learning_rate': 1.0150119114114066e-06, 'epoch': 0.8} 80%|███████▉ | 17672/22095 [30:23:51<4:32:38, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (94814 > 40960). 
Running this sequence through the model will result in indexing errors
80%|███████▉ | 17673/22095 [30:23:55<4:37:16, 3.76s/it] {'loss': 0.3189, 'grad_norm': 0.5929165171952434, 'learning_rate': 1.0145692827947256e-06, 'epoch': 0.8}
80%|███████▉ | 17674/22095 [30:23:58<4:29:59, 3.66s/it] {'loss': 0.3333, 'grad_norm': 0.6466146823306345, 'learning_rate': 1.0141267398127098e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|███████▉ | 17675/22095 [30:24:01<4:11:09, 3.41s/it] {'loss': 0.303, 'grad_norm': 0.7863825353875846, 'learning_rate': 1.0136842824748694e-06, 'epoch': 0.8}
80%|████████ | 17676/22095 [30:24:05<4:23:38, 3.58s/it] {'loss': 0.3102, 'grad_norm': 0.6008716630456132, 'learning_rate': 1.013241910790711e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (61896 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87612 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95619 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17677/22095 [30:24:08<4:12:41, 3.43s/it] {'loss': 0.2713, 'grad_norm': 0.5469202165957461, 'learning_rate': 1.012799624769738e-06, 'epoch': 0.8}
80%|████████ | 17678/22095 [30:24:12<4:19:09, 3.52s/it] {'loss': 0.2669, 'grad_norm': 0.6538977964631789, 'learning_rate': 1.0123574244214552e-06, 'epoch': 0.8}
80%|████████ | 17679/22095 [30:24:15<4:14:09, 3.45s/it] {'loss': 0.2752, 'grad_norm': 0.6411931463000914, 'learning_rate': 1.0119153097553657e-06, 'epoch': 0.8}
80%|████████ | 17680/22095 [30:24:19<4:27:00, 3.63s/it] {'loss': 0.3539, 'grad_norm': 0.6266810385085103, 'learning_rate': 1.011473280780968e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17681/22095 [30:24:27<5:55:13, 4.83s/it] {'loss': 0.4766, 'grad_norm': 0.2999474320900906, 'learning_rate': 1.011031337507758e-06, 'epoch': 0.8}
80%|████████ | 17682/22095 [30:24:32<5:54:09, 4.82s/it] {'loss': 0.3193, 'grad_norm': 0.7185236225481391, 'learning_rate': 1.0105894799452337e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8933828 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56981, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 6.5cm'}, {'from': 'gpt', 'value': '【解答】解:如图,∵点D、E分别是线段AC、CB的中点,∴DC=\\frac{1}{2}AC,CE=\\frac{1}{2}BC,∴DE=DC+CE=\\frac{1}{2}(AC+BC)=\\frac{1}{2}AB.又∵AB=10cm,∴DE=5cm;'}]}
80%|████████ | 17683/22095 [30:24:35<5:28:16, 4.46s/it] {'loss': 0.282, 'grad_norm': 0.6359176456441936, 'learning_rate': 1.0101477081028899e-06, 'epoch': 0.8}
80%|████████ | 17684/22095 [30:24:38<4:57:08, 4.04s/it] {'loss': 0.2772, 'grad_norm': 0.6174531661683802, 'learning_rate': 1.0097060219902183e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17685/22095 [30:24:48<6:58:05, 5.69s/it] {'loss': 0.454, 'grad_norm': 0.26616502028666844, 'learning_rate': 1.0092644216167076e-06, 'epoch': 0.8}
80%|████████ | 17686/22095 [30:24:51<6:06:48, 4.99s/it] {'loss': 0.2659, 'grad_norm': 0.6308937760368052, 'learning_rate': 1.0088229069918488e-06, 'epoch': 0.8}
80%|████████ | 17687/22095 [30:24:54<5:29:36, 4.49s/it] {'loss': 0.2866, 'grad_norm': 0.5928485973391232, 'learning_rate': 1.0083814781251266e-06, 'epoch': 0.8}
80%|████████ | 17688/22095 [30:24:58<5:03:32, 4.13s/it] {'loss': 0.2788, 'grad_norm': 0.6026542050506672, 'learning_rate': 1.0079401350260288e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [364, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8459426 in VC:s3://internvl-moe-sft-data/. Exception: Image size [364, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 119497, 'image': 'vrdu_texteq/astro-ph.CO/8554792d-72f0-4cb7-82b4-ca70389082f0.png', 'image_wh': [[364, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'with the size noise $\\sigma_\\mathrm{size} = 0.8$.'}]}
80%|████████ | 17689/22095 [30:25:01<4:48:14, 3.93s/it] {'loss': 0.3273, 'grad_norm': 0.6765753998077264, 'learning_rate': 1.0074988777040368e-06, 'epoch': 0.8}
80%|████████ | 17690/22095 [30:25:04<4:28:07, 3.65s/it] {'loss': 0.3075, 'grad_norm': 0.7063825924900976, 'learning_rate': 1.0070577061686305e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (71092 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50551 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17691/22095 [30:25:07<4:18:24, 3.52s/it] {'loss': 0.3028, 'grad_norm': 0.6579882369889726, 'learning_rate': 1.0066166204292915e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17692/22095 [30:25:12<4:32:09, 3.71s/it] {'loss': 0.282, 'grad_norm': 0.6188920702015304, 'learning_rate': 1.006175620495497e-06, 'epoch': 0.8}
80%|████████ | 17693/22095 [30:25:15<4:33:25, 3.73s/it] {'loss': 0.284, 'grad_norm': 0.7598314254003824, 'learning_rate': 1.005734706376721e-06, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8403474 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5647, 'image': 'vrdu_table_final_2/astro-ph.CO/74faceb1-f946-4f9a-8165-b4070a453ad0.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the table in the given image into LaTeX code.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in the given image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
80%|████████ | 17694/22095 [30:25:18<4:14:15, 3.47s/it] {'loss': 0.2815, 'grad_norm': 0.6781608359130874, 'learning_rate': 1.005293878082439e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17695/22095 [30:25:21<4:07:17, 3.37s/it] {'loss': 0.3225, 'grad_norm': 0.6378508227692651, 'learning_rate': 1.0048531356221235e-06, 'epoch': 0.8}
80%|████████ | 17696/22095 [30:25:25<4:15:12, 3.48s/it] {'loss': 0.2734, 'grad_norm': 0.5678500878789736, 'learning_rate': 1.0044124790052445e-06, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17697/22095 [30:25:29<4:14:24, 3.47s/it] {'loss': 0.324, 'grad_norm': 0.5803032201494306, 'learning_rate': 1.003971908241268e-06, 'epoch': 0.8}
80%|████████ | 17698/22095 [30:25:32<4:17:12, 3.51s/it] {'loss': 0.3237, 'grad_norm': 0.612816553250393, 'learning_rate': 1.0035314233396625e-06, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17699/22095 [30:25:40<5:45:41, 4.72s/it] {'loss': 0.4817, 'grad_norm': 0.30486019227051825, 'learning_rate': 1.003091024309894e-06, 'epoch': 0.8}
80%|████████ | 17700/22095 [30:25:44<5:34:50, 4.57s/it] {'loss': 0.3197, 'grad_norm': 0.5718601567690808, 'learning_rate': 1.0026507111614237e-06, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (47026 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17701/22095 [30:25:48<5:14:15, 4.29s/it] {'loss': 0.3026, 'grad_norm': 0.6975670743286432, 'learning_rate': 1.0022104839037117e-06, 'epoch': 0.8}
80%|████████ | 17702/22095 [30:25:51<4:51:30, 3.98s/it] {'loss': 0.2704, 'grad_norm': 0.698243244761055, 'learning_rate': 1.0017703425462188e-06, 'epoch': 0.8}
80%|████████ | 17703/22095 [30:25:54<4:29:15, 3.68s/it] {'loss': 0.2848, 'grad_norm': 0.6034277810329546, 'learning_rate': 1.001330287098401e-06, 'epoch': 0.8}
80%|████████ | 17704/22095 [30:25:58<4:35:01, 3.76s/it] {'loss': 0.2948, 'grad_norm': 0.6413270440601546, 'learning_rate': 1.000890317569715e-06, 'epoch': 0.8}
80%|████████ | 17705/22095 [30:26:01<4:19:17, 3.54s/it] {'loss': 0.2834, 'grad_norm': 0.5807050571774788, 'learning_rate': 1.0004504339696142e-06, 'epoch': 0.8}
80%|████████ | 17706/22095 [30:26:04<4:21:03, 3.57s/it] {'loss': 0.2843, 'grad_norm': 0.5554722470951098, 'learning_rate': 1.0000106363075486e-06, 'epoch': 0.8}
80%|████████ | 17707/22095 [30:26:09<4:38:56, 3.81s/it] {'loss': 0.3053, 'grad_norm': 0.6102523337797331, 'learning_rate': 9.995709245929691e-07, 'epoch': 0.8}
80%|████████ | 17708/22095 [30:26:12<4:25:01, 3.62s/it] {'loss': 0.297, 'grad_norm': 0.5993098468757605, 'learning_rate': 9.991312988353252e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17709/22095 [30:26:21<6:33:07, 5.38s/it] {'loss': 0.4618, 'grad_norm': 0.25248078783050365, 'learning_rate': 9.986917590440626e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (50610 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17710/22095 [30:26:25<5:50:38, 4.80s/it] {'loss': 0.3025, 'grad_norm': 0.6236113679626196, 'learning_rate': 9.98252305228623e-07, 'epoch': 0.8}
80%|████████ | 17711/22095 [30:26:28<5:22:58, 4.42s/it] {'loss': 0.2469, 'grad_norm': 0.652868395328871, 'learning_rate': 9.978129373984513e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (55779 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107121 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44187 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72971 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116488 > 40960).
Running this sequence through the model will result in indexing errors
80%|████████ | 17712/22095 [30:26:32<5:07:59, 4.22s/it] {'loss': 0.3167, 'grad_norm': 0.8312087621953048, 'learning_rate': 9.973736555629894e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (44295 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17713/22095 [30:26:36<4:49:27, 3.96s/it] {'loss': 0.2976, 'grad_norm': 0.5410252483694974, 'learning_rate': 9.969344597316737e-07, 'epoch': 0.8}
80%|████████ | 17714/22095 [30:26:39<4:43:04, 3.88s/it] {'loss': 0.265, 'grad_norm': 0.5911874079384539, 'learning_rate': 9.964953499139412e-07, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17715/22095 [30:26:42<4:20:44, 3.57s/it] {'loss': 0.2728, 'grad_norm': 0.6134724463121841, 'learning_rate': 9.96056326119229e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (52869 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (135835 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54845 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (137370 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17716/22095 [30:26:45<4:05:00, 3.36s/it] {'loss': 0.3108, 'grad_norm': 0.6130417169184403, 'learning_rate': 9.95617388356968e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (90244 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17717/22095 [30:26:48<4:04:41, 3.35s/it] {'loss': 0.2989, 'grad_norm': 0.5780465373609368, 'learning_rate': 9.951785366365924e-07, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333762 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 371, 'image': 'vrdu_table_final_2/astro-ph.CO/9299fc94-b352-4306-8ece-a4ad1bd19435.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17718/22095 [30:26:55<5:09:35, 4.24s/it] {'loss': 0.4656, 'grad_norm': 0.2673010955946115, 'learning_rate': 9.9473977096753e-07, 'epoch': 0.8}
80%|████████ | 17719/22095 [30:26:58<4:58:14, 4.09s/it] {'loss': 0.296, 'grad_norm': 0.6280599241854697, 'learning_rate': 9.943010913592072e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17720/22095 [30:27:08<6:59:33, 5.75s/it] {'loss': 0.4676, 'grad_norm': 0.2905817135983755, 'learning_rate': 9.938624978210514e-07, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17721/22095 [30:27:17<8:18:35, 6.84s/it] {'loss': 0.4627, 'grad_norm': 0.2699402741418339, 'learning_rate': 9.934239903624893e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 364, but got module 1
80%|████████ | 17722/22095 [30:27:21<7:01:43, 5.79s/it] {'loss': 0.2849, 'grad_norm': 0.5640631750848742, 'learning_rate': 9.929855689929374e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (115727 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96462 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80790 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17723/22095 [30:27:26<6:41:04, 5.50s/it] {'loss': 0.2839, 'grad_norm': 0.6868637680081962, 'learning_rate': 9.925472337218194e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17724/22095 [30:27:36<8:27:14, 6.96s/it] {'loss': 0.4631, 'grad_norm': 0.26501366463231285, 'learning_rate': 9.921089845585536e-07, 'epoch': 0.8}
80%|████████ | 17725/22095 [30:27:41<7:44:20, 6.38s/it] {'loss': 0.2915, 'grad_norm': 0.6741465898250314, 'learning_rate': 9.916708215125586e-07, 'epoch': 0.8}
80%|████████ | 17726/22095 [30:27:44<6:43:04, 5.54s/it] {'loss': 0.299, 'grad_norm': 0.6280940510386683, 'learning_rate': 9.912327445932446e-07, 'epoch': 0.8}
80%|████████ | 17727/22095 [30:27:49<6:10:56, 5.10s/it] {'loss': 0.2883, 'grad_norm': 0.6069141990952394, 'learning_rate': 9.907947538100265e-07, 'epoch': 0.8}
80%|████████ | 17728/22095 [30:27:52<5:32:10, 4.56s/it] {'loss': 0.33, 'grad_norm': 0.6037705382049574, 'learning_rate': 9.903568491723176e-07, 'epoch': 0.8}
80%|████████ | 17729/22095 [30:27:56<5:14:16, 4.32s/it] {'loss': 0.2898, 'grad_norm': 0.6288078967755846, 'learning_rate': 9.899190306895257e-07, 'epoch': 0.8}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
80%|████████ | 17730/22095 [30:27:59<4:49:26, 3.98s/it] {'loss': 0.2756, 'grad_norm': 0.555905713304811, 'learning_rate': 9.894812983710556e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (103019 > 40960). Running this sequence through the model will result in indexing errors
80%|████████ | 17731/22095 [30:28:05<5:35:37, 4.61s/it] {'loss': 0.4919, 'grad_norm': 0.2952776150612605, 'learning_rate': 9.89043652226317e-07, 'epoch': 0.8}
80%|████████ | 17732/22095 [30:28:08<5:07:32, 4.23s/it] {'loss': 0.2967, 'grad_norm': 0.5651942536592243, 'learning_rate': 9.8860609226471e-07, 'epoch': 0.8}
80%|████████ | 17733/22095 [30:28:12<5:06:48, 4.22s/it] {'loss': 0.3383, 'grad_norm': 0.6505822448237075, 'learning_rate': 9.881686184956396e-07, 'epoch': 0.8}
80%|████████ | 17734/22095 [30:28:17<5:07:04, 4.22s/it] {'loss': 0.3293, 'grad_norm': 0.5949901731512393, 'learning_rate': 9.877312309285036e-07, 'epoch': 0.8}
80%|████████ | 17735/22095 [30:28:20<4:53:45, 4.04s/it] {'loss': 0.2992, 'grad_norm': 0.6151881571192074, 'learning_rate': 9.872939295726997e-07, 'epoch': 0.8}
80%|████████ | 17736/22095 [30:28:24<4:42:15, 3.89s/it] {'loss': 0.3155, 'grad_norm': 0.621174808148157, 'learning_rate': 9.868567144376256e-07, 'epoch': 0.8}
80%|████████ | 17737/22095 [30:28:27<4:35:10, 3.79s/it] {'loss': 0.2803, 'grad_norm': 0.6111958778982538, 'learning_rate': 9.864195855326764e-07, 'epoch': 0.8}
80%|████████ | 17738/22095 [30:28:31<4:28:26, 3.70s/it] {'loss': 0.2744, 'grad_norm': 0.5843529790163214, 'learning_rate': 9.85982542867243e-07, 'epoch': 0.8}
Invalidate trace cache @ step 2: expected module 1, but got module 364
80%|████████ | 17739/22095 [30:28:41<6:41:54, 5.54s/it] {'loss': 0.4794, 'grad_norm': 0.26551540249461303, 'learning_rate': 9.855455864507157e-07, 'epoch': 0.8}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047547 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 9cm'}, {'from': 'gpt', 'value': '【解答】解:点D是AC的中点,如果CD=4cm,AC=2CD=2×4=8(cm),BC=AB-AC=13-8=5(cm).'}]}
80%|████████ | 17740/22095 [30:28:44<5:49:54, 4.82s/it] {'loss': 0.2857, 'grad_norm': 0.6113881148320287, 'learning_rate': 9.851087162924845e-07, 'epoch': 0.8}
80%|████████ | 17741/22095 [30:28:47<5:06:14, 4.22s/it] {'loss': 0.2742, 'grad_norm': 0.6239502664209545, 'learning_rate': 9.846719324019372e-07, 'epoch': 0.8}
80%|████████ | 17742/22095 [30:28:50<4:46:58, 3.96s/it] {'loss': 0.3235, 'grad_norm': 0.6533691162116929, 'learning_rate': 9.842352347884582e-07, 'epoch': 0.8}
80%|████████ | 17743/22095 [30:28:53<4:29:23, 3.71s/it] {'loss': 0.2605, 'grad_norm': 1.5388800022815832, 'learning_rate': 9.837986234614288e-07, 'epoch': 0.8}
80%|████████ | 17744/22095 [30:28:56<4:21:22, 3.60s/it] {'loss': 0.2682, 'grad_norm': 0.543305453627655, 'learning_rate': 9.833620984302338e-07, 'epoch': 0.8}
80%|████████ | 17745/22095 [30:29:00<4:25:52, 3.67s/it] {'loss': 0.2777, 'grad_norm': 0.6244755921904002, 'learning_rate': 9.829256597042496e-07, 'epoch': 0.8}
80%|████████ | 17746/22095 [30:29:03<4:12:35, 3.48s/it] {'loss': 0.2946, 'grad_norm': 0.6570277246440114, 'learning_rate': 9.824893072928572e-07, 'epoch': 0.8}
80%|████████ | 17747/22095 [30:29:06<4:04:41, 3.38s/it] {'loss': 0.2848, 'grad_norm': 0.6215458936046111, 'learning_rate': 9.820530412054302e-07, 'epoch': 0.8}
80%|████████ | 17748/22095 [30:29:10<4:13:03, 3.49s/it] {'loss': 0.2676, 'grad_norm': 0.6040671769548295, 'learning_rate': 9.816168614513423e-07, 'epoch': 0.8}
80%|████████ | 17749/22095 [30:29:13<4:06:55, 3.41s/it] {'loss': 0.2659, 'grad_norm': 0.6104696414519816, 'learning_rate': 9.81180768039966e-07, 'epoch': 0.8}
80%|████████ | 17750/22095 [30:29:16<3:52:27, 3.21s/it] {'loss': 0.2717, 'grad_norm': 0.5967745873187079, 'learning_rate': 9.807447609806752e-07, 'epoch': 0.8}
80%|████████ | 17751/22095 [30:29:20<4:04:49, 3.38s/it] {'loss': 0.316, 'grad_norm': 0.6607978228742731, 'learning_rate': 9.803088402828326e-07, 'epoch': 0.8}
80%|████████ | 17752/22095 [30:29:24<4:13:09, 3.50s/it] {'loss': 0.2631, 'grad_norm': 0.659657357573865, 'learning_rate': 9.798730059558076e-07, 'epoch': 0.8}
Token indices sequence length is longer than the specified maximum sequence length for this model (45384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60541 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47016 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44867 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82902 > 40960).
Running this sequence through the model will result in indexing errors 80%|████████ | 17753/22095 [30:29:27<4:03:30, 3.36s/it] {'loss': 0.2906, 'grad_norm': 0.6103075212447557, 'learning_rate': 9.794372580089645e-07, 'epoch': 0.8} 80%|████████ | 17753/22095 [30:29:27<4:03:30, 3.36s/it] 80%|████████ | 17754/22095 [30:29:31<4:19:15, 3.58s/it] {'loss': 0.3029, 'grad_norm': 0.6494566495041401, 'learning_rate': 9.790015964516692e-07, 'epoch': 0.8} 80%|████████ | 17754/22095 [30:29:31<4:19:15, 3.58s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [75, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8334143 in VC:s3://internvl-moe-sft-data/. Exception: Image size [75, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 753, 'image': 'vrdu_table_final_2/astro-ph.CO/a220553f-87b5-4ed1-9057-0a100f41724d.png', 'image_wh': [[75, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{l}3C286\\end{tabular}\n```"}]} 80%|████████ | 17755/22095 [30:29:35<4:27:41, 3.70s/it] {'loss': 0.3044, 'grad_norm': 1.0423514431242897, 'learning_rate': 9.785660212932775e-07, 'epoch': 0.8} 80%|████████ | 17755/22095 [30:29:35<4:27:41, 3.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83107 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41867 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79843 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63406 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61480 > 40960). Running this sequence through the model will result in indexing errors 80%|████████ | 17756/22095 [30:29:38<4:23:29, 3.64s/it] {'loss': 0.2676, 'grad_norm': 0.5784660241782171, 'learning_rate': 9.781305325431512e-07, 'epoch': 0.8} 80%|████████ | 17756/22095 [30:29:38<4:23:29, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69970 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58215 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51976 > 40960). Running this sequence through the model will result in indexing errors 80%|████████ | 17757/22095 [30:29:42<4:18:59, 3.58s/it] {'loss': 0.266, 'grad_norm': 0.6102474087109717, 'learning_rate': 9.776951302106485e-07, 'epoch': 0.8} 80%|████████ | 17757/22095 [30:29:42<4:18:59, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43319 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62569 > 40960). Running this sequence through the model will result in indexing errors 80%|████████ | 17758/22095 [30:29:45<4:18:24, 3.57s/it] {'loss': 0.3052, 'grad_norm': 0.6286956970183284, 'learning_rate': 9.772598143051242e-07, 'epoch': 0.8} 80%|████████ | 17758/22095 [30:29:45<4:18:24, 3.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107417 > 40960). Running this sequence through the model will result in indexing errors 80%|████████ | 17759/22095 [30:29:49<4:16:24, 3.55s/it] {'loss': 0.2955, 'grad_norm': 0.6239997771945285, 'learning_rate': 9.768245848359304e-07, 'epoch': 0.8} 80%|████████ | 17759/22095 [30:29:49<4:16:24, 3.55s/it] 80%|████████ | 17760/22095 [30:29:52<4:17:53, 3.57s/it] {'loss': 0.3546, 'grad_norm': 0.6649553793926286, 'learning_rate': 9.763894418124215e-07, 'epoch': 0.8} 80%|████████ | 17760/22095 [30:29:52<4:17:53, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41964 > 40960). 
Running this sequence through the model will result in indexing errors 80%|████████ | 17761/22095 [30:30:02<6:26:18, 5.35s/it] {'loss': 0.4974, 'grad_norm': 0.28664882723210144, 'learning_rate': 9.75954385243944e-07, 'epoch': 0.8} 80%|████████ | 17761/22095 [30:30:02<6:26:18, 5.35s/it] 80%|████████ | 17762/22095 [30:30:06<5:54:31, 4.91s/it] {'loss': 0.3236, 'grad_norm': 0.5878687259601594, 'learning_rate': 9.755194151398494e-07, 'epoch': 0.8} 80%|████████ | 17762/22095 [30:30:06<5:54:31, 4.91s/it] 80%|████████ | 17763/22095 [30:30:09<5:13:03, 4.34s/it] {'loss': 0.26, 'grad_norm': 0.7083215294265206, 'learning_rate': 9.750845315094826e-07, 'epoch': 0.8} 80%|████████ | 17763/22095 [30:30:09<5:13:03, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8338638 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5268, 'image': 'vrdu_table_final_2/astro-ph.CO/7ae129df-d6d7-44c4-adaf-39df5fc4b34a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 80%|████████ | 17764/22095 [30:30:18<7:00:49, 5.83s/it] {'loss': 0.4376, 'grad_norm': 0.2928910819210211, 'learning_rate': 9.746497343621857e-07, 'epoch': 0.8} 80%|████████ | 17764/22095 [30:30:18<7:00:49, 5.83s/it] 80%|████████ | 17765/22095 [30:30:27<8:16:23, 6.88s/it] {'loss': 0.4392, 'grad_norm': 0.24709386014473778, 'learning_rate': 9.74215023707304e-07, 'epoch': 0.8} 80%|████████ | 17765/22095 [30:30:28<8:16:23, 6.88s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [764, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8341803 in VC:s3://internvl-moe-sft-data/. Exception: Image size [764, 23, 100, 100] is too small. Minimum size is 28. 
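The `Image size ... is too small. Minimum size is 28.` failures above could be caught before training rather than in `__getitem__`. A minimal pre-filter sketch, assuming samples carry the `image_wh` metadata field shown in the `Problematic sample` dumps (the field name and the 28 px floor are taken from the log; the function names are hypothetical, not the training script's actual code):

```python
# Hypothetical pre-filter for the "Image size ... is too small" loader errors.
# Assumes each sample records its image dimensions in the 'image_wh' field,
# as the "Problematic sample" dumps do; 28 is the loader's stated minimum side.
MIN_SIDE = 28

def image_sizes_ok(sample, min_side=MIN_SIDE):
    """True if every recorded (width, height) pair meets the minimum side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def split_by_image_size(samples, min_side=MIN_SIDE):
    """Partition samples into (kept, dropped) so undersized images never
    reach the dataset's __getitem__ and trigger the retry loop."""
    kept, dropped = [], []
    for s in samples:
        (kept if image_sizes_ok(s, min_side) else dropped).append(s)
    return kept, dropped
```

Running such a pass once over the dataset index would avoid the `[Try #0] Failed to fetch sample ...` retries at training time; note that both failures logged here have a recorded height of 23 px, below the floor.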
Problematic sample: {'id': 8448, 'image': 'vrdu_table_final_2/astro-ph.CO/93c87edf-30a9-417f-8008-1fc60b84e142.png', 'image_wh': [[764, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n16&17&18&19&20\n\\end{tabular}\n```"}]} 80%|████████ | 17766/22095 [30:30:37<9:11:50, 7.65s/it] {'loss': 0.4614, 'grad_norm': 0.26616740028184244, 'learning_rate': 9.737803995541777e-07, 'epoch': 0.8} 80%|████████ | 17766/22095 [30:30:37<9:11:50, 7.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 80%|████████ | 17767/22095 [30:30:40<7:39:51, 6.38s/it] {'loss': 0.351, 'grad_norm': 0.7103385064900106, 'learning_rate': 9.733458619121449e-07, 'epoch': 0.8} 80%|████████ | 17767/22095 [30:30:40<7:39:51, 6.38s/it] 80%|████████ | 17768/22095 [30:30:44<6:38:51, 5.53s/it] {'loss': 0.2807, 'grad_norm': 0.6863027935741712, 'learning_rate': 9.72911410790542e-07, 'epoch': 0.8} 80%|████████ | 17768/22095 [30:30:44<6:38:51, 5.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (81842 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83940 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92754 > 40960). 
Running this sequence through the model will result in indexing errors 80%|████████ | 17769/22095 [30:30:54<8:06:58, 6.75s/it] {'loss': 0.4491, 'grad_norm': 0.2519538309492859, 'learning_rate': 9.724770461987044e-07, 'epoch': 0.8} 80%|████████ | 17769/22095 [30:30:54<8:06:58, 6.75s/it] 80%|████████ | 17770/22095 [30:30:57<6:50:04, 5.69s/it] {'loss': 0.2903, 'grad_norm': 0.614031610054373, 'learning_rate': 9.720427681459665e-07, 'epoch': 0.8} 80%|████████ | 17770/22095 [30:30:57<6:50:04, 5.69s/it] 80%|████████ | 17771/22095 [30:31:00<5:55:04, 4.93s/it] {'loss': 0.3037, 'grad_norm': 0.6690508718365019, 'learning_rate': 9.71608576641659e-07, 'epoch': 0.8} 80%|████████ | 17771/22095 [30:31:00<5:55:04, 4.93s/it] 80%|████████ | 17772/22095 [30:31:04<5:30:17, 4.58s/it] {'loss': 0.3126, 'grad_norm': 0.7043153878217588, 'learning_rate': 9.711744716951093e-07, 'epoch': 0.8} 80%|████████ | 17772/22095 [30:31:04<5:30:17, 4.58s/it] 80%|████████ | 17773/22095 [30:31:07<4:56:23, 4.11s/it] {'loss': 0.2348, 'grad_norm': 0.5684993411944421, 'learning_rate': 9.707404533156479e-07, 'epoch': 0.8} 80%|████████ | 17773/22095 [30:31:07<4:56:23, 4.11s/it] 80%|████████ | 17774/22095 [30:31:10<4:32:55, 3.79s/it] {'loss': 0.3292, 'grad_norm': 0.625394348488827, 'learning_rate': 9.703065215125978e-07, 'epoch': 0.8} 80%|████████ | 17774/22095 [30:31:10<4:32:55, 3.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50690 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52005 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59211 > 40960). 
Running this sequence through the model will result in indexing errors 80%|████████ | 17775/22095 [30:31:13<4:25:24, 3.69s/it] {'loss': 0.2888, 'grad_norm': 0.612393036000987, 'learning_rate': 9.698726762952859e-07, 'epoch': 0.8} 80%|████████ | 17775/22095 [30:31:13<4:25:24, 3.69s/it] 80%|████████ | 17776/22095 [30:31:17<4:26:14, 3.70s/it] {'loss': 0.3287, 'grad_norm': 0.6443397133260294, 'learning_rate': 9.69438917673033e-07, 'epoch': 0.8} 80%|████████ | 17776/22095 [30:31:17<4:26:14, 3.70s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8933825 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 56978, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nA. 6cm\nB. 6.5cm\nC. 5cm\nD. 
5.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|████████ | 17777/22095 [30:31:27<6:41:49, 5.58s/it] {'loss': 0.4616, 'grad_norm': 0.2731566439388685, 'learning_rate': 9.69005245655157e-07, 'epoch': 0.8} 80%|████████ | 17777/22095 [30:31:27<6:41:49, 5.58s/it] 80%|████████ | 17778/22095 [30:31:30<5:53:33, 4.91s/it] {'loss': 0.3266, 'grad_norm': 0.6635968408950185, 'learning_rate': 9.685716602509782e-07, 'epoch': 0.8} 80%|████████ | 17778/22095 [30:31:30<5:53:33, 4.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|████████ | 17779/22095 [30:31:34<5:21:16, 4.47s/it] {'loss': 0.2842, 'grad_norm': 0.6750096154295964, 'learning_rate': 9.681381614698148e-07, 'epoch': 0.8} 80%|████████ | 17779/22095 [30:31:34<5:21:16, 4.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53100 > 40960). 
Running this sequence through the model will result in indexing errors 80%|████████ | 17780/22095 [30:31:37<4:53:14, 4.08s/it] {'loss': 0.3, 'grad_norm': 0.5740177530113943, 'learning_rate': 9.677047493209775e-07, 'epoch': 0.8} 80%|████████ | 17780/22095 [30:31:37<4:53:14, 4.08s/it] 80%|████████ | 17781/22095 [30:31:41<4:55:13, 4.11s/it] {'loss': 0.3076, 'grad_norm': 0.59785691959512, 'learning_rate': 9.67271423813781e-07, 'epoch': 0.8} 80%|████████ | 17781/22095 [30:31:41<4:55:13, 4.11s/it] 80%|████████ | 17782/22095 [30:31:45<4:47:51, 4.00s/it] {'loss': 0.3088, 'grad_norm': 0.6427667444325058, 'learning_rate': 9.668381849575354e-07, 'epoch': 0.8} 80%|████████ | 17782/22095 [30:31:45<4:47:51, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 80%|████████ | 17783/22095 [30:31:54<6:50:20, 5.71s/it] {'loss': 0.4606, 'grad_norm': 0.27035245054586593, 'learning_rate': 9.664050327615531e-07, 'epoch': 0.8} 80%|████████ | 17783/22095 [30:31:54<6:50:20, 5.71s/it] 80%|████████ | 17784/22095 [30:31:59<6:16:46, 5.24s/it] {'loss': 0.2815, 'grad_norm': 0.5900749386428757, 'learning_rate': 9.659719672351363e-07, 'epoch': 0.8} 80%|████████ | 17784/22095 [30:31:59<6:16:46, 5.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78631 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44088 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55559 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71113 > 40960). 
Running this sequence through the model will result in indexing errors 80%|████████ | 17785/22095 [30:32:02<5:31:33, 4.62s/it] {'loss': 0.2977, 'grad_norm': 0.6792514119123676, 'learning_rate': 9.65538988387592e-07, 'epoch': 0.8} 80%|████████ | 17785/22095 [30:32:02<5:31:33, 4.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 80%|████████ | 17786/22095 [30:32:05<4:57:40, 4.14s/it] {'loss': 0.2898, 'grad_norm': 0.6449374754101008, 'learning_rate': 9.65106096228225e-07, 'epoch': 0.8} 80%|████████ | 17786/22095 [30:32:05<4:57:40, 4.14s/it] 81%|████████ | 17787/22095 [30:32:09<4:52:52, 4.08s/it] {'loss': 0.2707, 'grad_norm': 0.5739704721959326, 'learning_rate': 9.646732907663358e-07, 'epoch': 0.81} 81%|████████ | 17787/22095 [30:32:09<4:52:52, 4.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17788/22095 [30:32:16<5:51:40, 4.90s/it] {'loss': 0.4649, 'grad_norm': 0.27968337547443045, 'learning_rate': 9.64240572011223e-07, 'epoch': 0.81} 81%|████████ | 17788/22095 [30:32:16<5:51:40, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64035 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76415 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54278 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42169 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43026 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110411 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17789/22095 [30:32:19<5:16:40, 4.41s/it] {'loss': 0.2833, 'grad_norm': 0.6508549364566079, 'learning_rate': 9.638079399721866e-07, 'epoch': 0.81} 81%|████████ | 17789/22095 [30:32:19<5:16:40, 4.41s/it] 81%|████████ | 17790/22095 [30:32:23<5:03:44, 4.23s/it] {'loss': 0.2914, 'grad_norm': 0.6567856991621365, 'learning_rate': 9.633753946585201e-07, 'epoch': 0.81} 81%|████████ | 17790/22095 [30:32:23<5:03:44, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42021 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61925 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17791/22095 [30:32:26<4:38:09, 3.88s/it] {'loss': 0.29, 'grad_norm': 0.5991907161446376, 'learning_rate': 9.629429360795201e-07, 'epoch': 0.81} 81%|████████ | 17791/22095 [30:32:26<4:38:09, 3.88s/it] 81%|████████ | 17792/22095 [30:32:29<4:36:52, 3.86s/it] {'loss': 0.3134, 'grad_norm': 0.6436835596344362, 'learning_rate': 9.625105642444777e-07, 'epoch': 0.81} 81%|████████ | 17792/22095 [30:32:29<4:36:52, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17793/22095 [30:32:38<6:10:40, 5.17s/it] {'loss': 0.4442, 'grad_norm': 0.26121420030905806, 'learning_rate': 9.620782791626815e-07, 'epoch': 0.81} 81%|████████ | 17793/22095 [30:32:38<6:10:40, 5.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53561 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59119 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (139194 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50843 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17794/22095 [30:32:41<5:39:40, 4.74s/it] {'loss': 0.3315, 'grad_norm': 0.7029702963957971, 'learning_rate': 9.616460808434213e-07, 'epoch': 0.81} 81%|████████ | 17794/22095 [30:32:41<5:39:40, 4.74s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41544 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93638 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17795/22095 [30:32:44<5:02:05, 4.22s/it] {'loss': 0.2775, 'grad_norm': 0.5785032749920184, 'learning_rate': 9.612139692959859e-07, 'epoch': 0.81} 81%|████████ | 17795/22095 [30:32:44<5:02:05, 4.22s/it] 81%|████████ | 17796/22095 [30:32:48<4:45:26, 3.98s/it] {'loss': 0.323, 'grad_norm': 0.6740378749138689, 'learning_rate': 9.607819445296579e-07, 'epoch': 0.81} 81%|████████ | 17796/22095 [30:32:48<4:45:26, 3.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (52800 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73284 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17797/22095 [30:32:57<6:39:19, 5.57s/it] {'loss': 0.463, 'grad_norm': 0.3714646127639946, 'learning_rate': 9.60350006553719e-07, 'epoch': 0.81} 81%|████████ | 17797/22095 [30:32:57<6:39:19, 5.57s/it] 81%|████████ | 17798/22095 [30:33:00<5:47:12, 4.85s/it] {'loss': 0.2908, 'grad_norm': 0.6722240591447426, 'learning_rate': 9.599181553774517e-07, 'epoch': 0.81} 81%|████████ | 17798/22095 [30:33:00<5:47:12, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17799/22095 [30:33:10<7:26:23, 6.23s/it] {'loss': 0.4378, 'grad_norm': 0.2518733239893241, 'learning_rate': 9.59486391010136e-07, 'epoch': 0.81} 81%|████████ | 17799/22095 [30:33:10<7:26:23, 6.23s/it] 81%|████████ | 17800/22095 [30:33:13<6:20:25, 5.31s/it] {'loss': 0.258, 'grad_norm': 0.5960606271702927, 'learning_rate': 9.59054713461049e-07, 'epoch': 0.81} 81%|████████ | 17800/22095 [30:33:13<6:20:25, 5.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17801/22095 [30:33:17<5:42:22, 4.78s/it] {'loss': 0.2191, 'grad_norm': 0.5225734865181609, 'learning_rate': 9.586231227394632e-07, 'epoch': 0.81} 81%|████████ | 17801/22095 [30:33:17<5:42:22, 4.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [59, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 7804112 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. 
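The recurring `Number of image tokens ... does not match number of images` / `Fixed image tokens in the conversation` messages above imply a repair step that reconciles placeholder count with image count. A sketch of one plausible fix-up, assuming a literal `<image>` placeholder string and a prepend-missing/drop-extras strategy (both assumptions; the log does not show the actual repair code):

```python
# Hypothetical fix-up behind the "Number of image tokens ... does not match
# number of images" log messages. The '<image>' placeholder string and the
# repair strategy (prepend missing tokens, drop leftmost extras) are assumed.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(text, num_images, token=IMAGE_TOKEN):
    """Return text whose placeholder count matches num_images."""
    count = text.count(token)
    if count < num_images:
        # e.g. "0 does not match number of images 1": prepend what is missing.
        text = token * (num_images - count) + text
    elif count > num_images:
        # e.g. "2 does not match number of images 1": drop leftmost extras.
        for _ in range(count - num_images):
            text = text.replace(token, "", 1)
    return text
```

Both mismatch directions appear in the log (0 tokens with 1 image, and 2 tokens with 1 image), so a repair has to handle adding and removing placeholders.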
Exception: Image size [59, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': '25452', 'image': '51997.jpg', 'image_wh': [[59, 23]], 'conversations': [{'from': 'human', 'value': '\nI am providing an answer to the question below based on the image: \nWhat stands out the most in the image, and why do you think that is? \nHere is my response: \nThe image showcases a numerical value "237.3" depicted in a standard font type, possibly Arial or a comparable sans-serif style, given its clean and contemporary appearance. The digits are evenly spaced and aligned horizontally, ensuring impeccable clarity.\n\nThe context of this figure is unspecified, so we will explore various scenarios where it might be pertinent. It could denote a quantity, measurement, value, or identifier across different domains such as technology, economics, education, or daily life.\n\n### Analysis and Potential Contexts of the Number "237.3"\n\n1. **Technological Measurement**: In technological contexts, "237.3" could represent a measured quantity. This could be the resolution of a screen, for instance, 237.3 pixels per inch, indicating a high-definition display. It could also be a signal strength measurement in decibels, a frequency in kilohertz, or a data transfer rate in megabits per second.\n\n2. **Economic Indicator**: In economics, "237.3" might denote a statistical figure. This could be the gross domestic product (GDP) growth rate of a country, an unemployment rate, or an economic growth forecast. It could also represent the exchange rate between two currencies.\n\n3. **Educational Grading**: The figure could symbolize a grade in an academic setting. For example, a student might have received a grade point average (GPA) of 237.3 out of a possible 300 points, showcasing a commendable academic performance.\n\n4. **Daily Life and Miscellaneous Uses**: In routine life, "237.3" could serve various functions. 
It might be part of a product code, such as SKU 237.3, or a weight measurement signifying 237.3 kilograms. Additionally, it could be a temperature setting, such as 237.3 degrees Fahrenheit for cooking, or a time interval, like 237.3 seconds in a sports event.\n\n### Hypothetical Question and Answer (Given the Contexts)\n\n**Question:** If "237.3" is the GDP growth rate of a country, what implications might it have on its economy?\n\n**Answer:** Assuming "237.3" percent is the GDP growth rate of a country, the implications for its economy are substantial and multifaceted.\n\n- **Economic Health**: A high GDP growth rate indicates a robust economy with expanding production and consumption, potentially leading to job creation and income growth.\n- **Investor Confidence**: A 237.3 percent growth rate could attract foreign direct investment, as it signals a favorable business climate and potential for returns.\n- **Policy Decisions**: Such a high growth rate might prompt policymakers to focus on sustainability, ensuring that growth is inclusive and doesn\'t lead to overheating or inflationary pressures.\n\nOverall, the significance of 237.3 in any scenario necessitates additional context to thoroughly grasp its consequences and applications. \nPlease revise my answer using the image, and if there are any errors, kindly explain why it\'s incorrect.'}, {'from': 'gpt', 'value': "Your description is thorough and imaginative, demonstrating an understanding of various contexts where numerical values can be relevant. However, there are a few discrepancies and areas for improvement:\n\n1. **Font Identification**: In the first part, you described the font as possibly Arial or a comparable sans-serif style. However, in the second part, you accurately identified the font as Times New Roman, which is a serif font. Ensure consistency and accuracy in identifying font types.\n\n2. 
**Educational Grading Context**: In the context of educational grading, the GPA or score scenario is plausible. However, the GPA system is typically out of 4.0, not 300 points, which is an uncommon grading scale in standard educational systems. It's important to use common grading scales for accuracy.\n\n3. **GDP Growth Rate**: In the hypothetical question, a GDP growth rate of 237.3% is unrealistic and should be revised to a more plausible figure, such as 2.37% or 3.7%, as such high growth rates are not typically seen in national economies.\n\n4. **Financial Figure Context**: In the second part, mentioning the stock value as 141.5 dollars is reasonable. However, for financial figures, it's crucial to consider decimal places in stock prices. While 141.5 is possible, more precision (e.g., 141.52) is common in real-world scenarios.\n\nOverall, your analysis shows a creative approach to interpreting a number's potential contexts, but attention to detail in realistic scenarios and typical conventions in specific fields would enhance the accuracy and applicability of your interpretations."}]} 81%|████████ | 17802/22095 [30:33:24<6:36:27, 5.54s/it] {'loss': 0.455, 'grad_norm': 0.30134938053213317, 'learning_rate': 9.581916188546563e-07, 'epoch': 0.81} 81%|████████ | 17802/22095 [30:33:24<6:36:27, 5.54s/it] 81%|████████ | 17803/22095 [30:33:30<6:58:06, 5.85s/it] {'loss': 0.462, 'grad_norm': 0.2919711089673619, 'learning_rate': 9.577602018158966e-07, 'epoch': 0.81} 81%|████████ | 17803/22095 [30:33:30<6:58:06, 5.85s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 81%|████████ | 17804/22095 [30:33:34<6:03:12, 5.08s/it] {'loss': 0.2597, 'grad_norm': 0.5348588791690345, 'learning_rate': 9.57328871632457e-07, 'epoch': 0.81} 81%|████████ | 17804/22095 [30:33:34<6:03:12, 5.08s/it] 81%|████████ | 17805/22095 [30:33:43<7:44:36, 6.50s/it] {'loss': 0.4445, 'grad_norm': 0.26899116106159493, 'learning_rate': 9.568976283136033e-07, 'epoch': 0.81} 81%|████████ 
| 17805/22095 [30:33:43<7:44:36, 6.50s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 81%|████████ | 17806/22095 [30:33:47<6:48:27, 5.71s/it] {'loss': 0.2669, 'grad_norm': 0.6010819484837083, 'learning_rate': 9.564664718686006e-07, 'epoch': 0.81} 81%|████████ | 17806/22095 [30:33:47<6:48:27, 5.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17807/22095 [30:33:51<6:10:07, 5.18s/it] {'loss': 0.3217, 'grad_norm': 0.5988338193919135, 'learning_rate': 9.560354023067154e-07, 'epoch': 0.81} 81%|████████ | 17807/22095 [30:33:51<6:10:07, 5.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73071 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42272 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60109 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42067 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17808/22095 [30:33:54<5:23:37, 4.53s/it] {'loss': 0.2926, 'grad_norm': 0.6031911398914606, 'learning_rate': 9.556044196372117e-07, 'epoch': 0.81} 81%|████████ | 17808/22095 [30:33:54<5:23:37, 4.53s/it] 81%|████████ | 17809/22095 [30:33:57<4:47:45, 4.03s/it] {'loss': 0.2715, 'grad_norm': 0.6186691343631564, 'learning_rate': 9.551735238693448e-07, 'epoch': 0.81} 81%|████████ | 17809/22095 [30:33:57<4:47:45, 4.03s/it] 81%|████████ | 17810/22095 [30:34:00<4:22:03, 3.67s/it] {'loss': 0.2874, 'grad_norm': 0.6173799694688191, 'learning_rate': 9.547427150123762e-07, 'epoch': 0.81} 81%|████████ | 17810/22095 [30:34:00<4:22:03, 3.67s/it] 81%|████████ | 17811/22095 [30:34:03<4:05:35, 3.44s/it] {'loss': 0.2956, 'grad_norm': 0.6022156353413017, 'learning_rate': 9.543119930755622e-07, 'epoch': 0.81} 81%|████████ | 17811/22095 [30:34:03<4:05:35, 3.44s/it] 81%|████████ | 17812/22095 [30:34:07<4:10:25, 3.51s/it] {'loss': 0.2715, 'grad_norm': 0.5927472848824709, 'learning_rate': 9.538813580681616e-07, 'epoch': 0.81} 81%|████████ | 17812/22095 [30:34:07<4:10:25, 3.51s/it] 81%|████████ | 17813/22095 [30:34:11<4:24:58, 3.71s/it] {'loss': 0.3106, 'grad_norm': 0.6148306294036522, 'learning_rate': 9.534508099994206e-07, 'epoch': 0.81} 81%|████████ | 17813/22095 [30:34:11<4:24:58, 3.71s/it] 81%|████████ | 17814/22095 [30:34:14<4:19:16, 3.63s/it] {'loss': 0.3033, 'grad_norm': 0.7236385379377256, 'learning_rate': 9.530203488785939e-07, 'epoch': 0.81} 81%|████████ | 17814/22095 [30:34:14<4:19:16, 3.63s/it] 81%|████████ | 17815/22095 [30:34:18<4:16:24, 3.59s/it] {'loss': 0.2908, 'grad_norm': 0.6026043617790225, 'learning_rate': 9.52589974714932e-07, 'epoch': 0.81} 81%|████████ | 17815/22095 [30:34:18<4:16:24, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17816/22095 [30:34:27<6:22:31, 5.36s/it] {'loss': 0.4463, 'grad_norm': 0.2657865844437305, 'learning_rate': 
9.521596875176803e-07, 'epoch': 0.81} 81%|████████ | 17816/22095 [30:34:27<6:22:31, 5.36s/it] 81%|████████ | 17817/22095 [30:34:31<5:38:35, 4.75s/it] {'loss': 0.2808, 'grad_norm': 0.664579620544881, 'learning_rate': 9.517294872960841e-07, 'epoch': 0.81} 81%|████████ | 17817/22095 [30:34:31<5:38:35, 4.75s/it] 81%|████████ | 17818/22095 [30:34:34<5:13:24, 4.40s/it] {'loss': 0.2625, 'grad_norm': 0.62770544763799, 'learning_rate': 9.51299374059389e-07, 'epoch': 0.81} 81%|████████ | 17818/22095 [30:34:34<5:13:24, 4.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94625 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48823 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59155 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69844 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17819/22095 [30:34:37<4:52:27, 4.10s/it] {'loss': 0.265, 'grad_norm': 0.6491865545045638, 'learning_rate': 9.508693478168346e-07, 'epoch': 0.81} 81%|████████ | 17819/22095 [30:34:38<4:52:27, 4.10s/it] 81%|████████ | 17820/22095 [30:34:41<4:36:26, 3.88s/it] {'loss': 0.289, 'grad_norm': 0.6222501394219117, 'learning_rate': 9.504394085776636e-07, 'epoch': 0.81} 81%|████████ | 17820/22095 [30:34:41<4:36:26, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93990 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17821/22095 [30:34:44<4:25:45, 3.73s/it] {'loss': 0.3043, 'grad_norm': 0.5920336723940774, 'learning_rate': 9.500095563511119e-07, 'epoch': 0.81} 81%|████████ | 17821/22095 [30:34:44<4:25:45, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55539 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43381 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42914 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17822/22095 [30:34:47<4:14:54, 3.58s/it] {'loss': 0.2999, 'grad_norm': 0.6828166339707074, 'learning_rate': 9.49579791146415e-07, 'epoch': 0.81} 81%|████████ | 17822/22095 [30:34:47<4:14:54, 3.58s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047833 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 
6.5cm\nB. 5cm\nC. 5.5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 81%|████████ | 17823/22095 [30:34:50<3:59:51, 3.37s/it] {'loss': 0.3415, 'grad_norm': 0.8659427099989623, 'learning_rate': 9.491501129728087e-07, 'epoch': 0.81} 81%|████████ | 17823/22095 [30:34:50<3:59:51, 3.37s/it] 81%|████████ | 17824/22095 [30:34:54<3:55:29, 3.31s/it] {'loss': 0.3317, 'grad_norm': 0.6783573168092197, 'learning_rate': 9.487205218395262e-07, 'epoch': 0.81} 81%|████████ | 17824/22095 [30:34:54<3:55:29, 3.31s/it] 81%|████████ | 17825/22095 [30:34:57<3:57:23, 3.34s/it] {'loss': 0.2727, 'grad_norm': 0.569484306281922, 'learning_rate': 9.482910177557975e-07, 'epoch': 0.81} 81%|████████ | 17825/22095 [30:34:57<3:57:23, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107064 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57447 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121522 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (130604 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102896 > 40960). 
Running this sequence through the model will result in indexing errors
81%|████████ | 17826/22095 [30:35:00<3:57:20, 3.34s/it] {'loss': 0.3235, 'grad_norm': 0.6803035123904807, 'learning_rate': 9.478616007308495e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (92658 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76565 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17827/22095 [30:35:11<6:30:04, 5.48s/it] {'loss': 0.4562, 'grad_norm': 0.34306276811634023, 'learning_rate': 9.474322707739103e-07, 'epoch': 0.81}
81%|████████ | 17828/22095 [30:35:14<5:52:26, 4.96s/it] {'loss': 0.3109, 'grad_norm': 0.6123543709961801, 'learning_rate': 9.470030278942066e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63656 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52480 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17829/22095 [30:35:22<6:53:59, 5.82s/it] {'loss': 0.4591, 'grad_norm': 0.2548844258247678, 'learning_rate': 9.465738721009598e-07, 'epoch': 0.81}
81%|████████ | 17830/22095 [30:35:30<7:41:22, 6.49s/it] {'loss': 0.4505, 'grad_norm': 0.26506859233429114, 'learning_rate': 9.461448034033905e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
81%|████████ | 17831/22095 [30:35:34<6:37:01, 5.59s/it] {'loss': 0.2945, 'grad_norm': 0.6048925450365089, 'learning_rate': 9.457158218107198e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (55383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44349 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47379 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49734 > 40960).
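Many samples in this log tokenize to far more than the model's 40960-token maximum, which surfaces only as a tokenizer warning here and risks indexing errors downstream. A minimal pre-filtering sketch, assuming per-sample token lengths have already been computed (`token_len` stands in for `len(tokenizer(text).input_ids)`; all names are illustrative, not from the training code above):

```python
# Illustrative pre-filter: separate samples that fit the context window
# from those that exceed it, instead of relying on tokenizer warnings.
MAX_LEN = 40960  # model maximum from the log messages

def filter_overlong(samples, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by precomputed token length."""
    kept, dropped = [], []
    for s in samples:
        (kept if s["token_len"] <= max_len else dropped).append(s)
    return kept, dropped

samples = [{"id": 1, "token_len": 1024}, {"id": 2, "token_len": 55539}]
kept, dropped = filter_overlong(samples)
```

Whether to drop, truncate, or re-chunk overlong samples is a data-curation choice; the sketch only shows where such a decision would plug in.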
Running this sequence through the model will result in indexing errors
81%|████████ | 17832/22095 [30:35:38<6:04:46, 5.13s/it] {'loss': 0.2926, 'grad_norm': 0.6257261716315243, 'learning_rate': 9.45286927332163e-07, 'epoch': 0.81}
81%|████████ | 17833/22095 [30:35:41<5:23:09, 4.55s/it] {'loss': 0.2789, 'grad_norm': 0.6735816668686118, 'learning_rate': 9.448581199769385e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17834/22095 [30:35:50<7:04:14, 5.97s/it] {'loss': 0.4636, 'grad_norm': 0.2743845808695478, 'learning_rate': 9.444293997542586e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8622764 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3743, 'image': '1573440906.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Jennifer Camper'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'SUBGurlz'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Comics & Graphic Novels'}, {'from': 'human', 'value': 'Is this a comics book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'Yes'}]}
81%|████████ | 17835/22095 [30:36:00<8:13:39, 6.95s/it] {'loss': 0.4819, 'grad_norm': 0.28865643557615306, 'learning_rate': 9.440007666733336e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (82073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55546 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17836/22095 [30:36:03<7:05:30, 5.99s/it] {'loss': 0.2733, 'grad_norm': 0.5745541285151884, 'learning_rate': 9.43572220743375e-07, 'epoch': 0.81}
81%|████████ | 17837/22095 [30:36:13<8:17:36, 7.01s/it] {'loss': 0.436, 'grad_norm': 0.24529223471523312, 'learning_rate': 9.431437619735928e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
81%|████████ | 17838/22095 [30:36:16<7:00:18, 5.92s/it] {'loss': 0.2857, 'grad_norm': 0.591013245264191, 'learning_rate': 9.427153903731912e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (100014728 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn(
81%|████████ | 17839/22095 [30:36:20<6:05:55, 5.16s/it] {'loss': 0.2954, 'grad_norm': 0.5672598214064283, 'learning_rate': 9.422871059513738e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17840/22095 [30:36:30<7:59:13, 6.76s/it] {'loss': 0.4775, 'grad_norm': 0.2732986318260846, 'learning_rate': 9.418589087173441e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17841/22095 [30:36:33<6:42:21, 5.68s/it] {'loss': 0.2578, 'grad_norm': 0.6033196974822153, 'learning_rate': 9.414307986803051e-07, 'epoch': 0.81}
81%|████████ | 17842/22095 [30:36:37<6:01:22, 5.10s/it] {'loss': 0.2617, 'grad_norm': 0.6700641066968276, 'learning_rate': 9.410027758494511e-07, 'epoch': 0.81}
81%|████████ | 17843/22095 [30:36:41<5:34:56, 4.73s/it] {'loss': 0.2894, 'grad_norm': 0.6062704598820653, 'learning_rate': 9.405748402339809e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17844/22095 [30:36:50<7:15:10, 6.14s/it] {'loss': 0.4807, 'grad_norm': 0.29249875010539206, 'learning_rate': 9.401469918430911e-07, 'epoch': 0.81}
81%|████████ | 17845/22095 [30:36:54<6:18:05, 5.34s/it] {'loss': 0.253, 'grad_norm': 0.5552771925540408, 'learning_rate': 9.397192306859737e-07, 'epoch': 0.81}
81%|████████ | 17846/22095 [30:36:56<5:24:10, 4.58s/it] {'loss': 0.3286, 'grad_norm': 0.729487092010495, 'learning_rate': 9.392915567718186e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396948 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63801, 'image': 'vrdu_table_final_2/astro-ph.EP/ce0bd968-32f6-411e-bcfb-fbdeadd680fe.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{l}$\\phi$\\end{tabular}\n```"}]}
81%|████████ | 17847/22095 [30:37:00<4:58:30, 4.22s/it] {'loss': 0.3288, 'grad_norm': 0.6632461653794297, 'learning_rate': 9.388639701098174e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17848/22095 [30:37:06<5:44:51, 4.87s/it] {'loss': 0.4735, 'grad_norm': 0.2959198050260553, 'learning_rate': 9.384364707091559e-07, 'epoch': 0.81}
81%|████████ | 17849/22095 [30:37:10<5:12:42, 4.42s/it] {'loss': 0.3435, 'grad_norm': 0.5870934956027272, 'learning_rate': 9.380090585790213e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (79325 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42794 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17850/22095 [30:37:13<4:39:55, 3.96s/it] {'loss': 0.2863, 'grad_norm': 0.5788613590093816, 'learning_rate': 9.375817337285969e-07, 'epoch': 0.81}
81%|████████ | 17851/22095 [30:37:16<4:36:59, 3.92s/it] {'loss': 0.2654, 'grad_norm': 0.6004090232292422, 'learning_rate': 9.371544961670625e-07, 'epoch': 0.81}
81%|████████ | 17852/22095 [30:37:19<4:12:48, 3.57s/it] {'loss': 0.2952, 'grad_norm': 0.6542045633090844, 'learning_rate': 9.367273459036003e-07, 'epoch': 0.81}
81%|████████ | 17853/22095 [30:37:22<4:07:13, 3.50s/it] {'loss': 0.2866, 'grad_norm': 0.6154991149598072, 'learning_rate': 9.363002829473894e-07, 'epoch': 0.81}
81%|████████ | 17854/22095 [30:37:26<4:18:32, 3.66s/it] {'loss': 0.2862, 'grad_norm': 0.6160983703257152, 'learning_rate': 9.358733073076048e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887273 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
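The repeated `ValueError: Image size [...] is too small. Minimum size is 28.` failures above could be caught before training by validating each sample's `image_wh` field. A hedged pre-check sketch (the 28-pixel minimum and the `image_wh` field name are taken from the logged errors and samples; the function itself is illustrative, not the repo's code):

```python
# Illustrative validation: reject samples whose recorded image dimensions
# fall below the loader's minimum side length before they reach a worker.
MIN_SIDE = 28  # minimum side enforced by the error messages in the log

def image_ok(sample, min_side=MIN_SIDE):
    """True if every (width, height) pair in the sample meets the minimum."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

# The failing UniGeo sample above has image_wh [[204, 22]]: height 22 < 28.
assert not image_ok({"image_wh": [[204, 22]]})
```

Running such a check over the dataset manifest once, offline, would surface these samples in one pass instead of as scattered retry failures at step time.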
Problematic sample: {'id': 10426, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]}
81%|████████ | 17855/22095 [30:37:30<4:13:33, 3.59s/it] {'loss': 0.298, 'grad_norm': 0.5935234985164778, 'learning_rate': 9.354464189934193e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (45997 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44947 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17856/22095 [30:37:33<3:55:59, 3.34s/it] {'loss': 0.3286, 'grad_norm': 0.6286116231527318, 'learning_rate': 9.35019618014007e-07, 'epoch': 0.81}
81%|████████ | 17857/22095 [30:37:37<4:10:51, 3.55s/it] {'loss': 0.2455, 'grad_norm': 0.6033556103336765, 'learning_rate': 9.345929043785396e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17858/22095 [30:37:43<5:14:09, 4.45s/it] {'loss': 0.458, 'grad_norm': 0.2754283766879807, 'learning_rate': 9.341662780961847e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8390160 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56979, 'image': 'vrdu_table_final_2/astro-ph.EP/117700f6-3fbb-473c-bdd6-8511ca41ecdb.png', 'image_wh': [[28, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{c} $\\mu_0$ \\\\ \\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17859/22095 [30:37:46<4:46:02, 4.05s/it] {'loss': 0.3176, 'grad_norm': 0.6388948865050439, 'learning_rate': 9.337397391761083e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (78775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71132 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55511 > 40960).
Running this sequence through the model will result in indexing errors
81%|████████ | 17860/22095 [30:37:50<4:35:13, 3.90s/it] {'loss': 0.314, 'grad_norm': 0.7290291472085741, 'learning_rate': 9.333132876274775e-07, 'epoch': 0.81}
81%|████████ | 17861/22095 [30:37:53<4:10:31, 3.55s/it] {'loss': 0.2904, 'grad_norm': 0.6223888135457937, 'learning_rate': 9.328869234594529e-07, 'epoch': 0.81}
81%|████████ | 17862/22095 [30:37:56<4:09:16, 3.53s/it] {'loss': 0.3566, 'grad_norm': 0.681702030465637, 'learning_rate': 9.32460646681198e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (120670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41149 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65089 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72412 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44816 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17863/22095 [30:37:59<3:56:40, 3.36s/it] {'loss': 0.2918, 'grad_norm': 0.642590939024477, 'learning_rate': 9.320344573018719e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17864/22095 [30:38:08<6:03:53, 5.16s/it] {'loss': 0.4925, 'grad_norm': 0.3022485742167686, 'learning_rate': 9.316083553306299e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8937801 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 60954, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nA. 4cm\nB. \\frac{9}{2}cm\nC. 5cm\nD. 
\\frac{11}{2}cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
81%|████████ | 17865/22095 [30:38:12<5:22:23, 4.57s/it] {'loss': 0.3207, 'grad_norm': 0.6056913667244028, 'learning_rate': 9.311823407766297e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17866/22095 [30:38:15<5:00:06, 4.26s/it] {'loss': 0.3178, 'grad_norm': 0.6621139643820714, 'learning_rate': 9.307564136490255e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 7323151 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'inventor/20250512_140254_1/images/before_screenshot_1_id_153_internvl_appearance_crop_1_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nDetermine the bounding box from: A small button displaying the word 'Home' in standard UI font. The button has a subtle gray background that blends with the interface color scheme and has a standard rectangular shape with rounded corners typical of Autodesk Inventor interface elements."}, {'from': 'gpt', 'value': "A small button displaying the word 'Home' in standard UI font. The button has a subtle gray background that blends with the interface color scheme and has a standard rectangular shape with rounded corners typical of Autodesk Inventor interface elements.[[116, 235, 125, 246]]"}], 'width': 3024, 'height': 1964}
81%|████████ | 17867/22095 [30:38:19<4:41:48, 4.00s/it] {'loss': 0.2781, 'grad_norm': 0.6056796351091303, 'learning_rate': 9.303305739569685e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17868/22095 [30:38:28<6:37:17, 5.64s/it] {'loss': 0.463, 'grad_norm': 0.26125384139409213, 'learning_rate': 9.299048217096068e-07, 'epoch': 0.81}
81%|████████ | 17869/22095 [30:38:32<5:55:44, 5.05s/it] {'loss': 0.2742, 'grad_norm': 0.6179247104927748, 'learning_rate': 9.294791569160899e-07, 'epoch': 0.81}
81%|████████ | 17870/22095 [30:38:36<5:39:57, 4.83s/it] {'loss': 0.3285, 'grad_norm': 0.5790561879273974, 'learning_rate': 9.290535795855659e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17871/22095 [30:38:45<7:13:11, 6.15s/it] {'loss': 0.4392, 'grad_norm': 0.25355361987720526, 'learning_rate': 9.286280897271777e-07, 'epoch': 0.81}
81%|████████ | 17872/22095 [30:38:49<6:19:21, 5.39s/it] {'loss': 0.263, 'grad_norm': 0.5659073976736065, 'learning_rate': 9.282026873500666e-07, 'epoch': 0.81}
81%|████████ | 17873/22095 [30:38:53<5:44:38, 4.90s/it] {'loss': 0.3104, 'grad_norm': 0.5744715963109575, 'learning_rate': 9.277773724633749e-07, 'epoch': 0.81}
81%|████████ | 17874/22095 [30:38:56<5:21:04, 4.56s/it] {'loss': 0.309, 'grad_norm': 0.6116546412618016, 'learning_rate': 9.273521450762391e-07, 'epoch': 0.81}
81%|████████ | 17875/22095 [30:39:00<5:00:48, 4.28s/it] {'loss': 0.3248, 'grad_norm': 0.5956581699580364, 'learning_rate': 9.269270051977991e-07, 'epoch': 0.81}
81%|████████ | 17876/22095 [30:39:04<4:54:57, 4.19s/it] {'loss': 0.2871, 'grad_norm': 0.5953470393565178, 'learning_rate': 9.265019528371882e-07, 'epoch': 0.81}
81%|████████ | 17877/22095 [30:39:07<4:33:21, 3.89s/it] {'loss': 0.2765, 'grad_norm': 0.6054083677074037, 'learning_rate': 9.260769880035387e-07, 'epoch': 0.81}
81%|████████ | 17878/22095 [30:39:10<4:15:58, 3.64s/it] {'loss': 0.2642, 'grad_norm': 0.5646933571107533, 'learning_rate': 9.256521107059834e-07, 'epoch': 0.81}
81%|████████ | 17879/22095 [30:39:13<4:00:31, 3.42s/it] {'loss': 0.2792, 'grad_norm': 0.5796948761794014, 'learning_rate': 9.25227320953651e-07, 'epoch': 0.81}
81%|████████ | 17880/22095 [30:39:17<4:11:21, 3.58s/it] {'loss': 0.2902, 'grad_norm': 0.5997158177178545, 'learning_rate': 9.248026187556674e-07, 'epoch': 0.81}
81%|████████ | 17881/22095 [30:39:20<3:59:32, 3.41s/it] {'loss': 0.277, 'grad_norm': 0.5903397760622614, 'learning_rate': 9.243780041211597e-07, 'epoch': 0.81}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (99586880 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
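The DecompressionBombWarning above fires because Pillow's default `Image.MAX_IMAGE_PIXELS` limit is 89478485 pixels (roughly a quarter-gigabyte 24-bit image) and these screenshots approach 100M pixels. A small illustrative reproduction of the check in plain Python (raising the limit, or setting `Image.MAX_IMAGE_PIXELS` to a larger value or `None` before loading, is the usual remedy for trusted data, but it deliberately disables a DoS protection):

```python
# Illustrative reproduction of Pillow's decompression-bomb threshold,
# kept dependency-free; Pillow itself warns when width*height exceeds
# MAX_IMAGE_PIXELS and errors at twice that value.
DEFAULT_MAX_IMAGE_PIXELS = 89478485  # Pillow's default limit

def exceeds_bomb_limit(width, height, limit=DEFAULT_MAX_IMAGE_PIXELS):
    """True if the pixel count would trigger a DecompressionBombWarning."""
    return width * height > limit

# The images warned about in this log (100014728 and 99586880 pixels)
# are only ~12% over the default limit.
```

Whether to raise the limit for this dataset or downscale the offending screenshots before training is a policy choice; the log shows the warning is currently just being tolerated.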
warnings.warn(
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8880210 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3363, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 8\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
81%|████████ | 17882/22095 [30:39:24<4:06:37, 3.51s/it] {'loss': 0.2828, 'grad_norm': 0.6532787133554799, 'learning_rate': 9.239534770592529e-07, 'epoch': 0.81}
81%|████████ | 17883/22095 [30:39:27<3:52:36, 3.31s/it] {'loss': 0.3298, 'grad_norm': 0.5876285887382989, 'learning_rate': 9.235290375790668e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17884/22095 [30:39:35<5:29:31, 4.70s/it] {'loss': 0.4658, 'grad_norm': 0.28400911286173425, 'learning_rate': 9.231046856897202e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (57529 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17885/22095 [30:39:38<5:06:07, 4.36s/it] {'loss': 0.3213, 'grad_norm': 0.6034911675444754, 'learning_rate': 9.226804214003332e-07, 'epoch': 0.81}
81%|████████ | 17886/22095 [30:39:42<4:47:09, 4.09s/it] {'loss': 0.2834, 'grad_norm': 0.5545357819570826, 'learning_rate': 9.222562447200228e-07, 'epoch': 0.81}
81%|████████ | 17887/22095 [30:39:45<4:23:37, 3.76s/it] {'loss': 0.3009, 'grad_norm': 0.5909633921075437, 'learning_rate': 9.218321556579013e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047898 in VC:s3://multi-modal/UniGeo/. Exception: Image size [129, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5439.png', 'image_wh': [[129, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C,D是线段AB上两点,CB=3cm,DB=5cm,D是AC的中点,则线段AB的长为()\nA. 1lcm\nB. 13cm\nC. 7cm\nD. 
8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
81%|████████ | 17888/22095 [30:39:48<4:15:07, 3.64s/it] {'loss': 0.2812, 'grad_norm': 0.6095861341205993, 'learning_rate': 9.214081542230808e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047939 in VC:s3://multi-modal/UniGeo/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 2\nB. 8\nC. 4\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
81%|████████ | 17889/22095 [30:39:51<3:58:37, 3.40s/it] {'loss': 0.3018, 'grad_norm': 0.6267122163248784, 'learning_rate': 9.209842404246738e-07, 'epoch': 0.81}
81%|████████ | 17890/22095 [30:39:55<4:07:34, 3.53s/it] {'loss': 0.3304, 'grad_norm': 0.6088538323220015, 'learning_rate': 9.205604142717866e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 87, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350003 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 87, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16676, 'image': 'vrdu_table_final_2/astro-ph.CO/6fcb75bc-9c57-4311-a3ff-1b6f4d681fc3.png', 'image_wh': [[14, 87]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}\n2\\tabularnewline\n2\\tabularnewline\n1\\tabularnewline\n\\end{tabular}\n```"}]}
81%|████████ | 17891/22095 [30:39:58<4:05:43, 3.51s/it] {'loss': 0.261, 'grad_norm': 0.6242827380667848, 'learning_rate': 9.201366757735281e-07, 'epoch': 0.81}
81%|████████ | 17892/22095 [30:40:02<4:04:00, 3.48s/it] {'loss': 0.2722, 'grad_norm': 0.686089695766711, 'learning_rate': 9.197130249390019e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (55884 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47053 > 40960). Running this sequence through the model will result in indexing errors
81%|████████ | 17893/22095 [30:40:09<5:21:30, 4.59s/it] {'loss': 0.4771, 'grad_norm': 0.2739949605184704, 'learning_rate': 9.192894617773102e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17894/22095 [30:40:12<4:57:18, 4.25s/it] {'loss': 0.2851, 'grad_norm': 0.5891830152785924, 'learning_rate': 9.188659862975552e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17895/22095 [30:40:15<4:27:54, 3.83s/it] {'loss': 0.2699, 'grad_norm': 0.6435546338845332, 'learning_rate': 9.184425985088368e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922256 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45409, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 8cm\nB. 10cm\nC. 12cm\nD. 
6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 81%|████████ | 17896/22095 [30:40:19<4:31:12, 3.88s/it] {'loss': 0.2669, 'grad_norm': 0.6427501974253429, 'learning_rate': 9.180192984202513e-07, 'epoch': 0.81} 81%|████████ | 17896/22095 [30:40:19<4:31:12, 3.88s/it] 81%|████████ | 17897/22095 [30:40:22<4:12:52, 3.61s/it] {'loss': 0.2409, 'grad_norm': 0.6225840811639484, 'learning_rate': 9.175960860408934e-07, 'epoch': 0.81} 81%|████████ | 17897/22095 [30:40:22<4:12:52, 3.61s/it] 81%|████████ | 17898/22095 [30:40:26<4:15:36, 3.65s/it] {'loss': 0.2685, 'grad_norm': 0.6270124468852579, 'learning_rate': 9.171729613798575e-07, 'epoch': 0.81} 81%|████████ | 17898/22095 [30:40:26<4:15:36, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17899/22095 [30:40:29<4:10:12, 3.58s/it] {'loss': 0.3228, 'grad_norm': 0.6317719814591691, 'learning_rate': 9.167499244462358e-07, 'epoch': 0.81} 81%|████████ | 17899/22095 [30:40:29<4:10:12, 3.58s/it] 81%|████████ | 17900/22095 [30:40:32<3:56:10, 3.38s/it] {'loss': 0.2426, 'grad_norm': 0.677980534030642, 'learning_rate': 9.163269752491183e-07, 'epoch': 0.81} 81%|████████ | 17900/22095 [30:40:32<3:56:10, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17901/22095 [30:40:41<5:43:59, 4.92s/it] {'loss': 0.4515, 'grad_norm': 0.28151380028711165, 'learning_rate': 9.159041137975904e-07, 'epoch': 0.81} 81%|████████ | 17901/22095 [30:40:41<5:43:59, 4.92s/it] 81%|████████ | 17902/22095 [30:40:45<5:27:33, 4.69s/it] {'loss': 0.3118, 'grad_norm': 0.6332702542591043, 'learning_rate': 9.154813401007406e-07, 'epoch': 0.81} 81%|████████ | 17902/22095 [30:40:45<5:27:33, 4.69s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg 
and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880211 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3364, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]} 81%|████████ | 17903/22095 [30:40:48<5:05:51, 4.38s/it] {'loss': 0.312, 'grad_norm': 0.6925917382896261, 'learning_rate': 9.150586541676515e-07, 'epoch': 0.81} 81%|████████ | 17903/22095 [30:40:48<5:05:51, 4.38s/it] 81%|████████ | 17904/22095 [30:40:51<4:33:54, 3.92s/it] {'loss': 0.3068, 'grad_norm': 0.5881584189149481, 'learning_rate': 9.146360560074074e-07, 'epoch': 0.81} 81%|████████ | 17904/22095 [30:40:51<4:33:54, 3.92s/it] 81%|████████ | 17905/22095 [30:40:55<4:26:40, 3.82s/it] {'loss': 0.2829, 'grad_norm': 0.6392350598878368, 'learning_rate': 9.142135456290868e-07, 'epoch': 0.81} 81%|████████ | 17905/22095 [30:40:55<4:26:40, 3.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [600, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8459716 in VC:s3://internvl-moe-sft-data/. Exception: Image size [600, 25, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'id': 113616, 'image': 'vrdu_texteq/astro-ph.CO/9c519c6d-1f62-4c7a-bd31-0a5792035297.png', 'image_wh': [[600, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $z_i$ is the mean redshift of each redshift bin.'}]} 81%|████████ | 17906/22095 [30:40:58<4:19:35, 3.72s/it] {'loss': 0.3119, 'grad_norm': 0.6827851072609606, 'learning_rate': 9.137911230417673e-07, 'epoch': 0.81} 81%|████████ | 17906/22095 [30:40:58<4:19:35, 3.72s/it] 81%|████████ | 17907/22095 [30:41:02<4:21:23, 3.74s/it] {'loss': 0.2369, 'grad_norm': 0.6056204064823583, 'learning_rate': 9.133687882545267e-07, 'epoch': 0.81} 81%|████████ | 17907/22095 [30:41:02<4:21:23, 3.74s/it] 81%|████████ | 17908/22095 [30:41:06<4:30:35, 3.88s/it] {'loss': 0.2816, 'grad_norm': 0.6214795257359838, 'learning_rate': 9.12946541276441e-07, 'epoch': 0.81} 81%|████████ | 17908/22095 [30:41:06<4:30:35, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (91479 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47238 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46316 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17909/22095 [30:41:11<4:38:12, 3.99s/it] {'loss': 0.2977, 'grad_norm': 0.6456268891092399, 'learning_rate': 9.125243821165819e-07, 'epoch': 0.81} 81%|████████ | 17909/22095 [30:41:11<4:38:12, 3.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41491 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41513 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17910/22095 [30:41:14<4:19:48, 3.72s/it] {'loss': 0.283, 'grad_norm': 0.5826360599394663, 'learning_rate': 9.121023107840188e-07, 'epoch': 0.81} 81%|████████ | 17910/22095 [30:41:14<4:19:48, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64006 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (135781 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17911/22095 [30:41:17<4:05:09, 3.52s/it] {'loss': 0.3122, 'grad_norm': 0.6909731471921312, 'learning_rate': 9.116803272878233e-07, 'epoch': 0.81} 81%|████████ | 17911/22095 [30:41:17<4:05:09, 3.52s/it] 81%|████████ | 17912/22095 [30:41:20<4:07:52, 3.56s/it] {'loss': 0.2973, 'grad_norm': 0.6152573399790072, 'learning_rate': 9.112584316370615e-07, 'epoch': 0.81} 81%|████████ | 17912/22095 [30:41:20<4:07:52, 3.56s/it] 81%|████████ | 17913/22095 [30:41:23<3:51:17, 3.32s/it] {'loss': 0.302, 'grad_norm': 0.6196372725926754, 'learning_rate': 9.108366238407968e-07, 'epoch': 0.81} 81%|████████ | 17913/22095 [30:41:23<3:51:17, 3.32s/it] 81%|████████ | 17914/22095 [30:41:27<4:06:31, 3.54s/it] {'loss': 0.2914, 'grad_norm': 0.6446896084894796, 'learning_rate': 9.104149039080939e-07, 'epoch': 0.81} 81%|████████ | 17914/22095 [30:41:27<4:06:31, 3.54s/it] 81%|████████ | 17915/22095 [30:41:30<3:57:51, 3.41s/it] {'loss': 0.2629, 'grad_norm': 0.5551934075128154, 'learning_rate': 9.099932718480158e-07, 'epoch': 0.81} 81%|████████ | 17915/22095 [30:41:30<3:57:51, 3.41s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (70731 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73583 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72856 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17916/22095 [30:41:34<4:11:24, 3.61s/it] {'loss': 0.2885, 'grad_norm': 0.6081600900581837, 'learning_rate': 9.095717276696214e-07, 'epoch': 0.81} 81%|████████ | 17916/22095 [30:41:34<4:11:24, 3.61s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50280 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42829 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17917/22095 [30:41:44<6:12:16, 5.35s/it] {'loss': 0.4772, 'grad_norm': 0.2681156513940687, 'learning_rate': 9.091502713819661e-07, 'epoch': 0.81} 81%|████████ | 17917/22095 [30:41:44<6:12:16, 5.35s/it] 81%|████████ | 17918/22095 [30:41:53<7:35:33, 6.54s/it] {'loss': 0.4607, 'grad_norm': 0.2478478171090661, 'learning_rate': 9.087289029941088e-07, 'epoch': 0.81} 81%|████████ | 17918/22095 [30:41:53<7:35:33, 6.54s/it] 81%|████████ | 17919/22095 [30:42:01<8:01:43, 6.92s/it] {'loss': 0.457, 'grad_norm': 0.25789240256355805, 'learning_rate': 9.083076225151005e-07, 'epoch': 0.81} 81%|████████ | 17919/22095 [30:42:01<8:01:43, 6.92s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 81%|████████ | 17920/22095 [30:42:05<6:59:59, 6.04s/it] {'loss': 0.2934, 'grad_norm': 0.6053463787096428, 'learning_rate': 9.078864299539963e-07, 'epoch': 0.81} 81%|████████ | 17920/22095 [30:42:05<6:59:59, 6.04s/it] 81%|████████ | 17921/22095 [30:42:08<6:01:51, 5.20s/it] {'loss': 0.3346, 'grad_norm': 0.73569061476308, 'learning_rate': 9.074653253198445e-07, 'epoch': 0.81} 81%|████████ | 17921/22095 [30:42:08<6:01:51, 5.20s/it] 81%|████████ | 17922/22095 [30:42:11<5:14:27, 4.52s/it] {'loss': 0.3413, 'grad_norm': 0.6063644964412455, 'learning_rate': 9.070443086216924e-07, 'epoch': 0.81} 81%|████████ | 17922/22095 [30:42:11<5:14:27, 4.52s/it] 81%|████████ | 17923/22095 [30:42:15<5:08:59, 4.44s/it] {'loss': 0.3067, 'grad_norm': 0.5351965867554764, 'learning_rate': 9.066233798685875e-07, 'epoch': 0.81} 81%|████████ | 17923/22095 [30:42:15<5:08:59, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17924/22095 [30:42:21<5:32:12, 4.78s/it] {'loss': 0.4717, 'grad_norm': 0.28645027424828934, 'learning_rate': 9.062025390695756e-07, 'epoch': 0.81} 81%|████████ | 17924/22095 [30:42:21<5:32:12, 4.78s/it]Rank 0: Number of image tokens 0 does not match number 
of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17925/22095 [30:42:31<7:15:25, 6.27s/it] {'loss': 0.4457, 'grad_norm': 0.2566172115969165, 'learning_rate': 9.057817862336982e-07, 'epoch': 0.81} 81%|████████ | 17925/22095 [30:42:31<7:15:25, 6.27s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (49608 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49249 > 40960). Running this sequence through the model will result in indexing errors 81%|████████ | 17926/22095 [30:42:34<6:12:52, 5.37s/it] {'loss': 0.2781, 'grad_norm': 0.5957504613420535, 'learning_rate': 9.053611213699942e-07, 'epoch': 0.81} 81%|████████ | 17926/22095 [30:42:34<6:12:52, 5.37s/it] 81%|████████ | 17927/22095 [30:42:37<5:34:50, 4.82s/it] {'loss': 0.3022, 'grad_norm': 0.6092725014501065, 'learning_rate': 9.049405444875042e-07, 'epoch': 0.81} 81%|████████ | 17927/22095 [30:42:37<5:34:50, 4.82s/it] 81%|████████ | 17928/22095 [30:42:41<5:02:04, 4.35s/it] {'loss': 0.2739, 'grad_norm': 0.5947499132955205, 'learning_rate': 9.04520055595266e-07, 'epoch': 0.81} 81%|████████ | 17928/22095 [30:42:41<5:02:04, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64030 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17929/22095 [30:42:44<4:31:48, 3.91s/it] {'loss': 0.2487, 'grad_norm': 0.6436976100680032, 'learning_rate': 9.040996547023134e-07, 'epoch': 0.81} 81%|████████ | 17929/22095 [30:42:44<4:31:48, 3.91s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957614 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8449, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1.5cm\nB. 2cm\nC. 4cm\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17930/22095 [30:42:47<4:30:02, 3.89s/it] {'loss': 0.3405, 'grad_norm': 0.5838302765186906, 'learning_rate': 9.036793418176786e-07, 'epoch': 0.81} 81%|████████ | 17930/22095 [30:42:47<4:30:02, 3.89s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908201 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 31354, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7'}]} 81%|████████ | 17931/22095 [30:42:51<4:31:22, 3.91s/it] {'loss': 0.2652, 'grad_norm': 0.6307928816587278, 'learning_rate': 9.032591169503951e-07, 'epoch': 0.81} 81%|████████ | 17931/22095 [30:42:51<4:31:22, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [272, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11245509 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [272, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8616, 'image': 'MMMU/Pharmacy/test_19_image_1.png', 'image_wh': [[272, 26]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words. Note that you do not need to give a specific answer to the question. 
Here are the content and options of this question: Sulfuric acid reacts with sodium hydroxide according to the following:,What mass of H2SO4 would be required to react with 0.75 mol NaOH?\nOptions: [' 36.76 grams', ' 36.77 grams', ' 36.78 grams']"}, {'from': 'gpt', 'value': "### Introduction\n\nThis question involves the chemical reaction between sulfuric acid (\\( \\text{H}_2\\text{SO}_4 \\)) and sodium hydroxide (\\( \\text{NaOH} \\)), which is an example of an acid-base neutralization reaction. Understanding this reaction requires knowledge of stoichiometry, chemical equations, and the properties of acids and bases. Here's a detailed introduction to the background knowledge involved.\n\n### Chemical Equations\n\nA **chemical equation** is a symbolic representation of a chemical reaction, where the reactants and products are expressed in terms of their chemical formulas. In the given equation:\n\n\\[ \\text{H}_2\\text{SO}_4 + \\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + \\text{H}_2\\text{O} \\]\n\n- **Reactants**: \\( \\text{H}_2\\text{SO}_4 \\) (sulfuric acid) and \\( \\text{NaOH} \\) (sodium hydroxide).\n- **Products**: \\( \\text{Na}_2\\text{SO}_4 \\) (sodium sulfate) and \\( \\text{H}_2\\text{O} \\) (water).\n\nChemical equations must be balanced, meaning the same number of each type of atom must appear on both sides of the equation.\n\n### Balancing Chemical Equations\n\nTo balance a chemical equation:\n\n1. **List All Elements**: Identify all elements involved in the reaction.\n2. **Count Atoms**: Count the number of atoms of each element in both reactants and products.\n3. **Adjust Coefficients**: Change the coefficients (numbers in front of compounds) to make the number of atoms for each element equal on both sides.\n4. 
**Check Work**: Verify that all elements balance and that the equation is simplified.\n\nFor the given equation:\n\\[ \\text{H}_2\\text{SO}_4 + 2\\text{NaOH} \\rightarrow \\text{Na}_2\\text{SO}_4 + 2\\text{H}_2\\text{O} \\]\n\nThis balanced equation indicates that one mole of sulfuric acid reacts with two moles of sodium hydroxide to produce one mole of sodium sulfate and two moles of water.\n\n### Stoichiometry\n\n**Stoichiometry** is the quantitative study of reactants and products in a chemical reaction. It is based on the balanced chemical equation and allows the calculation of the amounts of substances involved.\n\n#### Moles\n\nThe **mole** is a fundamental unit in chemistry that measures the amount of substance. One mole is \\( 6.022 \\times 10^{23} \\) entities, such as atoms, molecules, or ions (Avogadro's number).\n\n- **Molar Mass**: The mass of one mole of a substance, expressed in grams per mole (g/mol). It is calculated from the atomic masses of the elements in the compound.\n\n#### Calculating Molar Mass\n\n- **Sulfuric Acid (\\( \\text{H}_2\\text{SO}_4 \\))**: \n - Hydrogen (H): \\( 1.01 \\times 2 = 2.02 \\, \\text{g/mol} \\)\n - Sulfur (S): \\( 32.07 \\, \\text{g/mol} \\)\n - Oxygen (O): \\( 16.00 \\times 4 = 64.00 \\, \\text{g/mol} \\)\n - **Total**: \\( 2.02 + 32.07 + 64.00 = 98.09 \\, \\text{g/mol} \\)\n\n- **Sodium Hydroxide (\\( \\text{NaOH} \\))**:\n - Sodium (Na): \\( 22.99 \\, \\text{g/mol} \\)\n - Oxygen (O): \\( 16.00 \\, \\text{g/mol} \\)\n - Hydrogen (H): \\( 1.01 \\, \\text{g/mol} \\)\n - **Total**: \\( 22.99 + 16.00 + 1.01 = 40.00 \\, \\text{g/mol} \\)\n\n#### Stoichiometric Calculations\n\nTo determine the mass of a reactant or product:\n\n1. **Balance the Equation**: Ensure the equation is balanced.\n2. **Use Moles and Coefficients**: Use the stoichiometric coefficients to set up mole ratios.\n3. **Calculate Moles**: Determine the number of moles of substances involved.\n4. 
**Convert to Mass**: Use molar mass to convert moles to grams.\n\n### Acid-Base Reactions\n\n**Acid-base reactions** involve the transfer of protons between reactants. The reaction between sulfuric acid and sodium hydroxide is a classical example.\n\n#### Acids and Bases\n\n- **Acid**: A substance that donates protons (H⁺ ions) in a reaction. \n - *Sulfuric Acid (\\( \\text{H}_2\\text{SO}_4 \\))*: A strong diprotic acid that can donate two protons per molecule.\n \n- **Base**: A substance that accepts protons.\n - *Sodium Hydroxide (\\( \\text{NaOH} \\))*: A strong base that dissociates in water to release hydroxide ions (OH⁻).\n\n#### Neutralization\n\nNeutralization is the reaction between an acid and a base to form water and a salt:\n\n\\[ \\text{Acid} + \\text{Base} \\rightarrow \\text{Salt} + \\text{Water} \\]\n\nFor sulfuric acid and sodium hydroxide:\n- Sulfuric acid donates two protons.\n- Two sodium hydroxide molecules each provide one hydroxide ion.\n\n### Practical Applications\n\nUnderstanding stoichiometry and acid-base reactions has wide-ranging applications in laboratory settings, industrial processes, and environmental management.\n\n#### Laboratory Applications\n\n- **Titration**: A technique to determine the concentration of an unknown solution. An acid of known concentration can be used to titrate a base, or vice versa.\n- **Buffer Solutions**: Mixtures that maintain a stable pH. Knowing stoichiometry helps in preparing these solutions accurately.\n\n#### Industrial Applications\n\n- **Manufacturing**: Many industrial processes require precise chemical reactions. 
For instance, the production of fertilizers, pharmaceuticals, and dyes often involves acid-base reactions.\n- **Waste Treatment**: Neutralizing acidic or basic waste streams before discharge into the environment.\n\n#### Environmental Management\n\n- **Acid Rain**: Understanding the reactions of sulfuric acid in the atmosphere helps mitigate the effects of acid rain.\n- **Water Treatment**: Balancing the pH of water using neutralization reactions ensures safe drinking water.\n\n### Examples \n\n- **Household Cleaning**: Many cleaning agents are either acidic or basic. Vinegar (acetic acid) and baking soda (sodium bicarbonate) can neutralize each other, making them safer alternatives in cleaning.\n- **Biological Systems**: The human body uses buffers to maintain pH homeostasis. The blood has a bicarbonate buffer system that involves acid-base chemistry.\n\n### Safety Considerations\n\nWhen handling chemicals, safety is paramount. \n\n- **Sulfuric Acid**: It's highly corrosive and can cause burns; proper protective equipment, such as gloves and goggles, is essential.\n- **Sodium Hydroxide**: Also corrosive; contact with skin or eyes can cause severe damage.\n\n### Conclusion\n\nThe reaction between sulfuric acid and sodium hydroxide illustrates fundamental principles of chemistry, including stoichiometry, acid-base reactions, and the practicalities of working with chemical reactions. This knowledge is essential for academic pursuits, various industries, and understanding chemical processes in the world around us. 
By studying these reactions, we gain insights into the quantitative relationships between reactants and products, enabling advancements in technology, industry, and environmental protection."}]} 81%|████████ | 17932/22095 [30:42:55<4:17:16, 3.71s/it] {'loss': 0.2793, 'grad_norm': 0.6497305944441155, 'learning_rate': 9.028389801094895e-07, 'epoch': 0.81} 81%|████████ | 17932/22095 [30:42:55<4:17:16, 3.71s/it] 81%|████████ | 17933/22095 [30:42:59<4:26:45, 3.85s/it] {'loss': 0.2842, 'grad_norm': 0.5786671075336611, 'learning_rate': 9.024189313039922e-07, 'epoch': 0.81} 81%|████████ | 17933/22095 [30:42:59<4:26:45, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 81%|████████ | 17934/22095 [30:43:08<6:24:48, 5.55s/it] {'loss': 0.4457, 'grad_norm': 0.28544876263977276, 'learning_rate': 9.019989705429271e-07, 'epoch': 0.81} 81%|████████ | 17934/22095 [30:43:08<6:24:48, 5.55s/it] 81%|████████ | 17935/22095 [30:43:13<5:56:34, 5.14s/it] {'loss': 0.2923, 'grad_norm': 0.6562782485085401, 'learning_rate': 9.015790978353173e-07, 'epoch': 0.81} 81%|████████ | 17935/22095 [30:43:13<5:56:34, 5.14s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54897 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66670 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90074 > 40960). 
Running this sequence through the model will result in indexing errors 81%|████████ | 17936/22095 [30:43:16<5:28:34, 4.74s/it] {'loss': 0.3118, 'grad_norm': 0.6443448934260734, 'learning_rate': 9.011593131901852e-07, 'epoch': 0.81} 81%|████████ | 17936/22095 [30:43:16<5:28:34, 4.74s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 81%|████████ | 17937/22095 [30:43:26<7:07:42, 6.17s/it] {'loss': 0.4967, 'grad_norm': 0.3076745719943253, 'learning_rate': 9.007396166165516e-07, 'epoch': 0.81} 81%|████████ | 17937/22095 [30:43:26<7:07:42, 6.17s/it] 81%|████████ | 17938/22095 [30:43:30<6:23:08, 5.53s/it] {'loss': 0.3007, 'grad_norm': 0.6008853865334476, 'learning_rate': 9.003200081234342e-07, 'epoch': 0.81} 81%|████████ | 17938/22095 [30:43:30<6:23:08, 5.53s/it] 81%|████████ | 17939/22095 [30:43:34<5:48:45, 5.04s/it] {'loss': 0.2712, 'grad_norm': 0.5958514186274408, 'learning_rate': 8.999004877198475e-07, 'epoch': 0.81} 81%|████████ | 17939/22095 [30:43:34<5:48:45, 5.04s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [970, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8465122 in VC:s3://internvl-moe-sft-data/. Exception: Image size [970, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 126178, 'image': 'vrdu_texteq/astro-ph.CO/e7f5853a-0b84-435d-9c51-3e37400d5c8c.png', 'image_wh': [[970, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where $n\simeq -2$ near the nonlinear scale in the real universe. In this case we have'}]}
81%|████████ | 17940/22095 [30:43:37<5:18:00, 4.59s/it] {'loss': 0.297, 'grad_norm': 0.604126124888519, 'learning_rate': 8.994810554148065e-07, 'epoch': 0.81}
81%|████████ | 17941/22095 [30:43:40<4:43:22, 4.09s/it] {'loss': 0.2805, 'grad_norm': 0.5785700048548983, 'learning_rate': 8.990617112173261e-07, 'epoch': 0.81}
81%|████████ | 17942/22095 [30:43:43<4:25:02, 3.83s/it] {'loss': 0.3081, 'grad_norm': 0.6076645267963782, 'learning_rate': 8.986424551364126e-07, 'epoch': 0.81}
81%|████████ | 17943/22095 [30:43:46<4:02:21, 3.50s/it] {'loss': 0.2825, 'grad_norm': 0.71513808602186, 'learning_rate': 8.982232871810759e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████ | 17944/22095 [30:43:56<6:08:14, 5.32s/it] {'loss': 0.4603, 'grad_norm': 0.2832504811945113, 'learning_rate': 8.978042073603243e-07, 'epoch': 0.81}
81%|████████ | 17945/22095 [30:43:59<5:30:30, 4.78s/it] {'loss': 0.2935, 'grad_norm': 0.5795502680430666, 'learning_rate': 8.97385215683162e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████ | 17946/22095 [30:44:03<5:04:14, 4.40s/it] {'loss': 0.2908, 'grad_norm': 0.6008787860153728, 'learning_rate': 8.969663121585892e-07, 'epoch': 0.81}
81%|████████ | 17947/22095 [30:44:06<4:35:37, 3.99s/it] {'loss': 0.2657, 'grad_norm': 0.5873590487284043, 'learning_rate': 8.965474967956106e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8892998 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16151, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC=2MC,BC=2CN,由线段的和差得AC-BC=2MC-2NC=2(MC-NC)=2×2=4cm,'}]}
81%|████████ | 17948/22095 [30:44:09<4:26:24, 3.85s/it] {'loss': 0.3314, 'grad_norm': 0.6827117507553307, 'learning_rate': 8.961287696032217e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8898845 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 21998, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nA. 4\nB. 3\nC. 6\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
81%|████████ | 17949/22095 [30:44:13<4:18:10, 3.74s/it] {'loss': 0.2584, 'grad_norm': 0.6190725778935383, 'learning_rate': 8.957101305904231e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8878402 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1555, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 3\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
81%|████████ | 17950/22095 [30:44:16<4:09:43, 3.61s/it] {'loss': 0.3212, 'grad_norm': 0.635167529439559, 'learning_rate': 8.95291579766207e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922961 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46114, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
81%|████████ | 17951/22095 [30:44:20<4:06:43, 3.57s/it] {'loss': 0.295, 'grad_norm': 0.6384375583049835, 'learning_rate': 8.948731171395697e-07, 'epoch': 0.81}
81%|████████ | 17952/22095 [30:44:23<3:54:34, 3.40s/it] {'loss': 0.3228, 'grad_norm': 0.6788234412119452, 'learning_rate': 8.944547427195e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17953/22095 [30:44:26<4:02:25, 3.51s/it] {'loss': 0.2866, 'grad_norm': 0.5621391389443448, 'learning_rate': 8.940364565149895e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (48757 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55932 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55124 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17954/22095 [30:44:29<3:46:45, 3.29s/it] {'loss': 0.2987, 'grad_norm': 0.641359077850069, 'learning_rate': 8.936182585350256e-07, 'epoch': 0.81}
81%|████████▏ | 17955/22095 [30:44:32<3:43:40, 3.24s/it] {'loss': 0.2646, 'grad_norm': 0.6266553026836502, 'learning_rate': 8.932001487885916e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307808 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2AzHIhlTH8KJjy0FiXXcRsXXa_!!2691187853.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n小时\n12\n净化\n空气\n财神\n檀香\n一盘点12小时\n财源广进达三江\n生意兴隆通四海\n恭喜发财\n一组40盘\n恒河\n印度奇楠香\n送\n清新怡人\n足12小时\n香盘\n薰衣草香\n一组4盒\n买就送支架托盘'}]}
81%|████████▏ | 17956/22095 [30:44:35<3:37:39, 3.16s/it] {'loss': 0.3098, 'grad_norm': 0.5493797054964148, 'learning_rate': 8.927821272846737e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (84262 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17957/22095 [30:44:38<3:29:08, 3.03s/it] {'loss': 0.2677, 'grad_norm': 0.7846153097565445, 'learning_rate': 8.923641940322547e-07, 'epoch': 0.81}
81%|████████▏ | 17958/22095 [30:44:42<3:45:58, 3.28s/it] {'loss': 0.299, 'grad_norm': 0.5725269933339863, 'learning_rate': 8.919463490403141e-07, 'epoch': 0.81}
81%|████████▏ | 17959/22095 [30:44:45<3:37:04, 3.15s/it] {'loss': 0.2828, 'grad_norm': 0.5885140435649727, 'learning_rate': 8.915285923178274e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (134695 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48603 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48435 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119408 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44798 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17960/22095 [30:44:48<3:35:00, 3.12s/it] {'loss': 0.2552, 'grad_norm': 0.5613966687381595, 'learning_rate': 8.911109238737748e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (42037 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118896 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17961/22095 [30:44:51<3:27:45, 3.02s/it] {'loss': 0.2803, 'grad_norm': 0.5813063519597989, 'learning_rate': 8.906933437171278e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8877037 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 190, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': '\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '15cm'}]}
81%|████████▏ | 17962/22095 [30:44:58<5:00:03, 4.36s/it] {'loss': 0.4879, 'grad_norm': 0.2914511549712682, 'learning_rate': 8.90275851856861e-07, 'epoch': 0.81}
81%|████████▏ | 17963/22095 [30:45:01<4:37:24, 4.03s/it] {'loss': 0.2995, 'grad_norm': 0.5898202553287785, 'learning_rate': 8.89858448301944e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (95724 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65792 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98156 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17964/22095 [30:45:05<4:35:52, 4.01s/it] {'loss': 0.3256, 'grad_norm': 0.6590792125265483, 'learning_rate': 8.894411330613445e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884033 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7186, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1cm'}]}
81%|████████▏ | 17965/22095 [30:45:15<6:28:28, 5.64s/it] {'loss': 0.4525, 'grad_norm': 0.27424966293835146, 'learning_rate': 8.890239061440303e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (119703 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44540 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58691 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17966/22095 [30:45:18<5:47:38, 5.05s/it] {'loss': 0.2955, 'grad_norm': 0.5828946147301375, 'learning_rate': 8.886067675589682e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8883163 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6316, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '7'}]}
81%|████████▏ | 17967/22095 [30:45:21<5:05:33, 4.44s/it] {'loss': 0.315, 'grad_norm': 0.5702092181704951, 'learning_rate': 8.881897173151188e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████▏ | 17968/22095 [30:45:31<6:51:55, 5.99s/it] {'loss': 0.4809, 'grad_norm': 0.2968341038768363, 'learning_rate': 8.877727554214432e-07, 'epoch': 0.81}
81%|████████▏ | 17969/22095 [30:45:38<7:19:46, 6.40s/it] {'loss': 0.4748, 'grad_norm': 0.2735751270105653, 'learning_rate': 8.87355881886901e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
81%|████████▏ | 17970/22095 [30:45:42<6:28:53, 5.66s/it] {'loss': 0.2481, 'grad_norm': 0.5959178434447817, 'learning_rate': 8.869390967204527e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (47107 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84266 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (101833 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17971/22095 [30:45:46<5:48:31, 5.07s/it] {'loss': 0.2885, 'grad_norm': 0.581536219848497, 'learning_rate': 8.865223999310485e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (61550 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41272 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79367 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54965 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49779 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78980 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17972/22095 [30:45:56<7:24:03, 6.46s/it] {'loss': 0.4554, 'grad_norm': 0.25640784722105925, 'learning_rate': 8.861057915276438e-07, 'epoch': 0.81}
81%|████████▏ | 17973/22095 [30:45:59<6:25:45, 5.61s/it] {'loss': 0.2738, 'grad_norm': 0.6235145197841093, 'learning_rate': 8.856892715191929e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████▏ | 17974/22095 [30:46:06<6:53:22, 6.02s/it] {'loss': 0.4696, 'grad_norm': 0.2672252289127243, 'learning_rate': 8.852728399146427e-07, 'epoch': 0.81}
81%|████████▏ | 17975/22095 [30:46:10<5:59:08, 5.23s/it] {'loss': 0.2915, 'grad_norm': 0.7879556336384355, 'learning_rate': 8.848564967229407e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17976/22095 [30:46:14<5:33:00, 4.85s/it] {'loss': 0.25, 'grad_norm': 0.6113467717808888, 'learning_rate': 8.844402419530346e-07, 'epoch': 0.81}
81%|████████▏ | 17977/22095 [30:46:17<5:02:26, 4.41s/it] {'loss': 0.2584, 'grad_norm': 0.6012089521551512, 'learning_rate': 8.840240756138673e-07, 'epoch': 0.81}
81%|████████▏ | 17978/22095 [30:46:20<4:42:00, 4.11s/it] {'loss': 0.3234, 'grad_norm': 0.6901266025938774, 'learning_rate': 8.836079977143819e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████▏ | 17979/22095 [30:46:30<6:32:26, 5.72s/it] {'loss': 0.4446, 'grad_norm': 0.266879344823511, 'learning_rate': 8.831920082635175e-07, 'epoch': 0.81}
81%|████████▏ | 17980/22095 [30:46:40<7:53:23, 6.90s/it] {'loss': 0.458, 'grad_norm': 0.2728346231036277, 'learning_rate': 8.82776107270214e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
81%|████████▏ | 17981/22095 [30:46:43<6:37:26, 5.80s/it] {'loss': 0.2792, 'grad_norm': 0.5920782461326178, 'learning_rate': 8.823602947434056e-07, 'epoch': 0.81}
81%|████████▏ | 17982/22095 [30:46:46<5:44:43, 5.03s/it] {'loss': 0.3333, 'grad_norm': 0.6327987553773374, 'learning_rate': 8.819445706920293e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17983/22095 [30:46:49<5:01:17, 4.40s/it] {'loss': 0.2912, 'grad_norm': 0.6320931870141716, 'learning_rate': 8.815289351250166e-07, 'epoch': 0.81}
81%|████████▏ | 17984/22095 [30:46:52<4:30:14, 3.94s/it] {'loss': 0.2914, 'grad_norm': 0.6171374305864387, 'learning_rate': 8.811133880512967e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17985/22095 [30:46:55<4:22:27, 3.83s/it] {'loss': 0.3362, 'grad_norm': 0.5839525656588083, 'learning_rate': 8.806979294798001e-07, 'epoch': 0.81}
81%|████████▏ | 17986/22095 [30:46:58<4:03:37, 3.56s/it] {'loss': 0.2797, 'grad_norm': 0.6262579128109615, 'learning_rate': 8.802825594194553e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (55703 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62658 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51908 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17987/22095 [30:47:01<3:53:41, 3.41s/it] {'loss': 0.2652, 'grad_norm': 0.6644319477416321, 'learning_rate': 8.798672778791851e-07, 'epoch': 0.81}
81%|████████▏ | 17988/22095 [30:47:04<3:40:55, 3.23s/it] {'loss': 0.3056, 'grad_norm': 0.5892605327097258, 'learning_rate': 8.794520848679117e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17989/22095 [30:47:07<3:33:57, 3.13s/it] {'loss': 0.3171, 'grad_norm': 0.651773532486608, 'learning_rate': 8.790369803945586e-07, 'epoch': 0.81}
81%|████████▏ | 17990/22095 [30:47:10<3:33:03, 3.11s/it] {'loss': 0.3445, 'grad_norm': 0.6321790753968162, 'learning_rate': 8.786219644680433e-07, 'epoch': 0.81}
81%|████████▏ | 17991/22095 [30:47:13<3:35:09, 3.15s/it] {'loss': 0.308, 'grad_norm': 0.7756419867480293, 'learning_rate': 8.782070370972856e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (109100 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73750 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17992/22095 [30:47:16<3:28:51, 3.05s/it] {'loss': 0.2983, 'grad_norm': 0.6363231400815615, 'learning_rate': 8.777921982911996e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (56826 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78867 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41834 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113642 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17993/22095 [30:47:19<3:31:26, 3.09s/it] {'loss': 0.2611, 'grad_norm': 0.6368075871813724, 'learning_rate': 8.773774480586972e-07, 'epoch': 0.81}
81%|████████▏ | 17994/22095 [30:47:22<3:31:33, 3.10s/it] {'loss': 0.3046, 'grad_norm': 0.6556289472461144, 'learning_rate': 8.769627864086922e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████▏ | 17995/22095 [30:47:31<5:25:54, 4.77s/it] {'loss': 0.4674, 'grad_norm': 0.2811520282166436, 'learning_rate': 8.765482133500952e-07, 'epoch': 0.81}
81%|████████▏ | 17996/22095 [30:47:35<5:09:25, 4.53s/it] {'loss': 0.3681, 'grad_norm': 0.6395634551090364, 'learning_rate': 8.761337288918126e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (57307 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41257 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119386 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45397 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17997/22095 [30:47:38<4:35:55, 4.04s/it] {'loss': 0.2535, 'grad_norm': 0.5727505237743223, 'learning_rate': 8.757193330427494e-07, 'epoch': 0.81}
Token indices sequence length is longer than the specified maximum sequence length for this model (63520 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 17998/22095 [30:47:41<4:23:22, 3.86s/it] {'loss': 0.3083, 'grad_norm': 0.6768003967586744, 'learning_rate': 8.753050258118112e-07, 'epoch': 0.81}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
81%|████████▏ | 17999/22095 [30:47:44<4:03:28, 3.57s/it] {'loss': 0.3023, 'grad_norm': 0.6193381144097816, 'learning_rate': 8.748908072079021e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
81%|████████▏ | 18000/22095 [30:47:51<5:06:33, 4.49s/it] {'loss': 0.4727, 'grad_norm': 0.3014669651232553, 'learning_rate': 8.744766772399182e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [631, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8453732 in VC:s3://internvl-moe-sft-data/. Exception: Image size [631, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 58954, 'image': 'vrdu_texteq/astro-ph.CO/118cb0b7-e293-406a-9bc2-55d122d262bc.png', 'image_wh': [[631, 25]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'where $N_{i}$ is the number of SNe in $i-$th redshift bin.'}]}
81%|████████▏ | 18001/22095 [30:48:50<23:34:01, 20.72s/it] {'loss': 0.463, 'grad_norm': 0.29125316456076983, 'learning_rate': 8.740626359167598e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8339015 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5646, 'image': 'vrdu_table_final_2/astro-ph.CO/35965147-786b-4d28-8d25-abe2b820e323.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```"}]}
81%|████████▏ | 18002/22095 [30:48:53<17:44:13, 15.60s/it] {'loss': 0.2559, 'grad_norm': 0.5728738895218383, 'learning_rate': 8.736486832473246e-07, 'epoch': 0.81}
81%|████████▏ | 18003/22095 [30:48:57<13:38:12, 12.00s/it] {'loss': 0.2874, 'grad_norm': 0.592202199487586, 'learning_rate': 8.732348192405061e-07, 'epoch': 0.81}
81%|████████▏ | 18004/22095 [30:49:00<10:45:21, 9.46s/it] {'loss': 0.3314, 'grad_norm': 0.6240452002246657, 'learning_rate': 8.72821043905196e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (62897 > 40960). Running this sequence through the model will result in indexing errors
81%|████████▏ | 18005/22095 [30:49:11<11:15:38, 9.91s/it] {'loss': 0.4812, 'grad_norm': 0.27107133608396483, 'learning_rate': 8.724073572502867e-07, 'epoch': 0.81}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8306594 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1tTxJhNHI8KJjy1zbXXaxdpXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nRead the text on the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n满88元包邮\nicouplerSAicSACQC主营:全新原装\nⓔ\nDEV\nA\nADI\n我们做\n合作双赢\n的搬运工\n佐鑫得电子\nCOUNTHYOTASSEMBLYMALAYSIA\nCOUNTRYOFDIFFUSIONTALWAN\n(Q)QTY:1000\nC\n(9D)DATECODE:1723\n2001933644\nCQC\n(11)LOTNUBER:\n(1P)MFGNO:ADUM320ARZ-BUZ'}]}
81%|████████▏ | 18006/22095 [30:49:20<10:51:14, 9.56s/it] {'loss': 0.4683, 'grad_norm': 0.26570821313541987, 'learning_rate': 8.719937592846655e-07, 'epoch': 0.81}
Invalidate trace cache @ step 2: expected module 364, but got module 1
81%|████████▏ | 18007/22095 [30:49:24<8:59:55, 7.92s/it] {'loss': 0.3069, 'grad_norm': 0.6297787448641845, 'learning_rate': 8.715802500172215e-07, 'epoch': 0.81}
82%|████████▏ | 18008/22095 [30:49:27<7:23:48, 6.52s/it] {'loss': 0.2747, 'grad_norm': 1.1029666181074564, 'learning_rate': 8.71166829456837e-07, 'epoch': 0.82}
82%|████████▏ | 18009/22095 [30:49:30<6:12:40, 5.47s/it] {'loss': 0.2896, 'grad_norm': 0.6509827180821017, 'learning_rate': 8.707534976123982e-07, 'epoch': 0.82}
82%|████████▏ | 18010/22095 [30:49:34<5:39:45, 4.99s/it] {'loss': 0.3169, 'grad_norm': 0.6877266774259756, 'learning_rate': 8.70340254492783e-07, 'epoch': 0.82}
82%|████████▏ | 18011/22095 [30:49:37<4:55:47, 4.35s/it] {'loss': 0.3197, 'grad_norm': 0.6522964469108322, 'learning_rate': 8.699271001068737e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18012/22095 [30:49:40<4:27:22, 3.93s/it] {'loss': 0.3409, 'grad_norm': 0.607681550138856, 'learning_rate': 8.695140344635472e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18013/22095 [30:49:50<6:19:16, 5.57s/it] {'loss': 0.4695, 'grad_norm': 0.28011214100128695, 'learning_rate': 8.691010575716763e-07, 'epoch': 0.82}
82%|████████▏ | 18014/22095 [30:49:53<5:29:25, 4.84s/it] {'loss': 0.3129, 'grad_norm': 0.6319478865347986, 'learning_rate': 8.686881694401366e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18015/22095 [30:49:56<4:48:54, 4.25s/it] {'loss': 0.2748, 'grad_norm': 0.6545870615655542, 'learning_rate': 8.682753700778013e-07, 'epoch': 0.82}
82%|████████▏ | 18016/22095
[30:49:59<4:33:07, 4.02s/it] {'loss': 0.2814, 'grad_norm': 0.5383505723290435, 'learning_rate': 8.678626594935385e-07, 'epoch': 0.82}
82%|████████▏ | 18017/22095 [30:50:03<4:30:05, 3.97s/it] {'loss': 0.3214, 'grad_norm': 0.6287309199313841, 'learning_rate': 8.674500376962153e-07, 'epoch': 0.82}
82%|████████▏ | 18018/22095 [30:50:06<4:10:16, 3.68s/it] {'loss': 0.2629, 'grad_norm': 0.6113330668405157, 'learning_rate': 8.670375046946999e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8357839 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24549, 'image': 'vrdu_table_final_2/astro-ph.CO/98802ad2-0c63-4d3b-a8e4-f88f8e596778.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18019/22095 [30:50:10<4:12:31, 3.72s/it] {'loss': 0.2794, 'grad_norm': 0.6091712904104718, 'learning_rate': 8.666250604978532e-07, 'epoch': 0.82}
82%|████████▏ | 18020/22095 [30:50:13<4:05:59, 3.62s/it] {'loss': 0.2868, 'grad_norm': 0.5902500448470669, 'learning_rate': 8.662127051145414e-07, 'epoch': 0.82}
82%|████████▏ | 18021/22095 [30:50:16<3:55:53, 3.47s/it] {'loss': 0.3043, 'grad_norm': 0.852530710451694, 'learning_rate': 8.658004385536207e-07, 'epoch': 0.82}
82%|████████▏ | 18022/22095 [30:50:20<4:01:18, 3.55s/it] {'loss': 0.287, 'grad_norm': 0.662962701091303, 'learning_rate': 8.653882608239528e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18023/22095 [30:50:23<3:55:56, 3.48s/it] {'loss': 0.2593, 'grad_norm': 0.6398341194492669, 'learning_rate': 8.649761719343913e-07, 'epoch': 0.82}
82%|████████▏ | 18024/22095 [30:50:27<4:01:10, 3.55s/it] {'loss': 0.3002, 'grad_norm': 0.6288687742255715, 'learning_rate': 8.645641718937936e-07, 'epoch': 0.82}
82%|████████▏ | 18025/22095 [30:50:31<4:02:44, 3.58s/it]
{'loss': 0.2877, 'grad_norm': 0.6466427119006023, 'learning_rate': 8.641522607110108e-07, 'epoch': 0.82}
82%|████████▏ | 18026/22095 [30:50:35<4:11:54, 3.71s/it] {'loss': 0.2459, 'grad_norm': 0.5266603977129436, 'learning_rate': 8.637404383948922e-07, 'epoch': 0.82}
82%|████████▏ | 18027/22095 [30:50:38<4:01:11, 3.56s/it] {'loss': 0.2877, 'grad_norm': 0.5882476530914964, 'learning_rate': 8.633287049542882e-07, 'epoch': 0.82}
82%|████████▏ | 18028/22095 [30:50:41<4:00:23, 3.55s/it] {'loss': 0.3038, 'grad_norm': 0.6191750205576243, 'learning_rate': 8.62917060398048e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8584086 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10496, 'image': '898795494.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a sci-fi book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
82%|████████▏ | 18029/22095 [30:50:45<3:52:44, 3.43s/it] {'loss': 0.2968, 'grad_norm': 0.5662399271037007, 'learning_rate': 8.625055047350117e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (55462 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61885 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18030/22095 [30:50:49<4:05:35, 3.62s/it] {'loss': 0.2823, 'grad_norm': 0.6095585560534345, 'learning_rate': 8.620940379740245e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [728, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8480072 in VC:s3://internvl-moe-sft-data/. Exception: Image size [728, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 27162, 'image': 'vrdu_texteq/astro-ph.CO/8ffa6616-98c6-4bdd-b98f-a7fe23113972.png', 'image_wh': [[728, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'where we follow and to calculate the\nreaction coefficient $k_{m}$.'}]}
82%|████████▏ | 18031/22095 [30:50:52<3:52:01, 3.43s/it] {'loss': 0.2968, 'grad_norm': 0.658250569960579, 'learning_rate': 8.616826601239292e-07, 'epoch': 0.82}
82%|████████▏ | 18032/22095 [30:50:56<4:05:59, 3.63s/it] {'loss': 0.3158, 'grad_norm': 0.6742898954206353, 'learning_rate': 8.612713711935633e-07, 'epoch': 0.82}
82%|████████▏ | 18033/22095 [30:50:59<3:57:01, 3.50s/it] {'loss': 0.2991, 'grad_norm': 0.5886347097456335, 'learning_rate': 8.608601711917635e-07, 'epoch': 0.82}
82%|████████▏ | 18034/22095 [30:51:03<4:03:57, 3.60s/it] {'loss': 0.277, 'grad_norm': 0.5730933904332982, 'learning_rate': 8.60449060127368e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (129072 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63058 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118108 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94850 > 40960).
Running this sequence through the model will result in indexing errors
82%|████████▏ | 18035/22095 [30:51:12<6:06:24, 5.41s/it] {'loss': 0.4785, 'grad_norm': 0.314793965110205, 'learning_rate': 8.600380380092066e-07, 'epoch': 0.82}
82%|████████▏ | 18036/22095 [30:51:16<5:39:54, 5.02s/it] {'loss': 0.2944, 'grad_norm': 0.6019555970834, 'learning_rate': 8.596271048461141e-07, 'epoch': 0.82}
82%|████████▏ | 18037/22095 [30:51:19<4:58:08, 4.41s/it] {'loss': 0.3038, 'grad_norm': 0.6373598092176254, 'learning_rate': 8.592162606469179e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18038/22095 [30:51:29<6:37:20, 5.88s/it] {'loss': 0.4772, 'grad_norm': 0.2930173461656693, 'learning_rate': 8.588055054204481e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (76451 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44941 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18039/22095 [30:51:33<5:59:46, 5.32s/it] {'loss': 0.297, 'grad_norm': 0.5853816249810239, 'learning_rate': 8.583948391755281e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (65710 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49399 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83582 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44412 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18040/22095 [30:51:36<5:23:41, 4.79s/it] {'loss': 0.3457, 'grad_norm': 0.6216333902770305, 'learning_rate': 8.579842619209844e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [267, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8476972 in VC:s3://internvl-moe-sft-data/. Exception: Image size [267, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 13615, 'image': 'vrdu_texteq/astro-ph.CO/e0996434-bf4a-4718-9301-996ab92077e9.png', 'image_wh': [[267, 23]], 'conversations': [{'from': 'human', 'value': '\n请将图像中的文本识别并转换为机器可读格式。'}, {'from': 'gpt', 'value': 'where $\\mathcal{C}$ is defined in~.'}]}
82%|████████▏ | 18041/22095 [30:51:40<4:58:25, 4.42s/it] {'loss': 0.2948, 'grad_norm': 0.5844381819470246, 'learning_rate': 8.575737736656376e-07, 'epoch': 0.82}
82%|████████▏ | 18042/22095 [30:51:43<4:24:11, 3.91s/it] {'loss': 0.2926, 'grad_norm': 0.5545450711342551, 'learning_rate': 8.571633744183061e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18043/22095 [30:51:53<6:29:32, 5.77s/it] {'loss': 0.4673, 'grad_norm': 0.27769105334571487, 'learning_rate': 8.567530641878103e-07, 'epoch': 0.82}
82%|████████▏ | 18044/22095 [30:51:57<5:59:43, 5.33s/it] {'loss': 0.3118, 'grad_norm': 0.6955840626435977, 'learning_rate': 8.563428429829674e-07, 'epoch': 0.82}
82%|████████▏ | 18045/22095 [30:52:01<5:37:35, 5.00s/it] {'loss': 0.3299, 'grad_norm': 0.7041439251924754, 'learning_rate': 8.559327108125909e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924586 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47739, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 32cm\nB. 4cm\nC. 8cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
82%|████████▏ | 18046/22095 [30:52:05<5:08:58, 4.58s/it] {'loss': 0.2401, 'grad_norm': 0.6386908175674691, 'learning_rate': 8.555226676854911e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18047/22095 [30:52:15<6:59:42, 6.22s/it] {'loss': 0.4705, 'grad_norm': 0.2727069733305922, 'learning_rate': 8.55112713610482e-07, 'epoch': 0.82}
82%|████████▏ | 18048/22095 [30:52:23<7:47:53, 6.94s/it] {'loss': 0.4624, 'grad_norm': 0.2700845917645277, 'learning_rate': 8.547028485963693e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 364, but got module 1
82%|████████▏ | 18049/22095 [30:52:28<6:48:55, 6.06s/it] {'loss': 0.3294, 'grad_norm': 0.6460354403133349, 'learning_rate': 8.542930726519622e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (68546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114256 > 40960).
Running this sequence through the model will result in indexing errors
82%|████████▏ | 18050/22095 [30:52:31<5:55:27, 5.27s/it] {'loss': 0.2709, 'grad_norm': 0.6197181489036633, 'learning_rate': 8.538833857860635e-07, 'epoch': 0.82}
82%|████████▏ | 18051/22095 [30:52:34<5:13:28, 4.65s/it] {'loss': 0.2896, 'grad_norm': 0.735925769436235, 'learning_rate': 8.534737880074778e-07, 'epoch': 0.82}
82%|████████▏ | 18052/22095 [30:52:37<4:43:24, 4.21s/it] {'loss': 0.2803, 'grad_norm': 0.6346324401263916, 'learning_rate': 8.530642793250044e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047607 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 9\nB. 10\nC. 12\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由题意得,EC+FD=EF-CD=8-4=4,∵E是AC的中点,F是BD的中点,∴AE+FB=EC+FD=4,∴AB=AE+FB+EF=4+8=12.'}]}
82%|████████▏ | 18053/22095 [30:52:41<4:24:27, 3.93s/it] {'loss': 0.2843, 'grad_norm': 0.5409400804031699, 'learning_rate': 8.526548597474444e-07, 'epoch': 0.82}
82%|████████▏ | 18054/22095 [30:52:44<4:08:43, 3.69s/it] {'loss': 0.2799, 'grad_norm': 0.5605393417963239, 'learning_rate': 8.522455292835935e-07, 'epoch': 0.82}
82%|████████▏ | 18055/22095 [30:52:47<3:59:14, 3.55s/it] {'loss': 0.3166, 'grad_norm': 0.6013058872868924, 'learning_rate': 8.518362879422465e-07, 'epoch': 0.82}
82%|████████▏ | 18056/22095 [30:52:51<4:05:53, 3.65s/it] {'loss': 0.3177, 'grad_norm': 0.5732754320023369, 'learning_rate': 8.514271357321974e-07, 'epoch': 0.82}
82%|████████▏ | 18057/22095 [30:52:54<3:54:08, 3.48s/it] {'loss': 0.2924, 'grad_norm': 0.6212655502781934, 'learning_rate': 8.510180726622392e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1017, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8507731 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1017, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 128019, 'image': 'vrdu_texteq/astro-ph.CO/a706e590-4ac2-4011-8a43-bd9a58236728.png', 'image_wh': [[1017, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'where $C$ is the covariance matrix between the different bins of the cross-correlation.'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18058/22095 [30:53:05<6:30:43, 5.81s/it] {'loss': 0.4635, 'grad_norm': 0.30100841815431545, 'learning_rate': 8.506090987411603e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (41956 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78057 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45212 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49732 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18059/22095 [30:53:10<6:13:31, 5.55s/it] {'loss': 0.3195, 'grad_norm': 0.6259434935033461, 'learning_rate': 8.50200213977746e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (99218 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69102 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43804 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (41089 > 40960) for 4 sample(s). Truncating to 2485 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (79520 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18060/22095 [30:53:14<5:45:06, 5.13s/it] {'loss': 0.2448, 'grad_norm': 0.611446509904659, 'learning_rate': 8.49791418380786e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8306391 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1rFDSaXLM8KJjSZFBXXXJHVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n我想知道这张图中的所有文字写了什么,你能全部识别出来吗?'}, {'from': 'gpt', 'value': '图中所有文字:\n新品\n加粗款-银色\nNEW\n承重40斤\n送盖子\n买一送四\n送螺丝+扎带+皮套+盖子'}]}
82%|████████▏ | 18061/22095 [30:53:17<5:06:47, 4.56s/it] {'loss': 0.2709, 'grad_norm': 0.6051093462489923, 'learning_rate': 8.493827119590615e-07, 'epoch': 0.82}
82%|████████▏ | 18062/22095 [30:53:22<4:56:54, 4.42s/it] {'loss': 0.328, 'grad_norm': 0.6503222997170561, 'learning_rate': 8.489740947213537e-07, 'epoch': 0.82}
82%|████████▏ | 18063/22095 [30:53:25<4:44:15, 4.23s/it] {'loss': 0.2965, 'grad_norm': 0.6214017202106538, 'learning_rate': 8.485655666764448e-07, 'epoch': 0.82}
82%|████████▏ | 18064/22095 [30:53:28<4:20:42, 3.88s/it] {'loss': 0.2627, 'grad_norm': 0.5921564688005043, 'learning_rate': 8.481571278331108e-07, 'epoch': 0.82}
82%|████████▏ | 18065/22095 [30:53:32<4:18:26, 3.85s/it] {'loss': 0.2888, 'grad_norm': 0.6325059930394564, 'learning_rate': 8.477487782001298e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (41969 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77517 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83135 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18066/22095 [30:53:36<4:16:32, 3.82s/it] {'loss': 0.3454, 'grad_norm': 0.579900367318165, 'learning_rate': 8.473405177862737e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18067/22095 [30:53:45<6:10:40, 5.52s/it] {'loss': 0.4872, 'grad_norm': 0.51058712141538, 'learning_rate': 8.46932346600317e-07, 'epoch': 0.82}
82%|████████▏ | 18068/22095 [30:53:54<7:15:06, 6.48s/it] {'loss': 0.5006, 'grad_norm': 0.285936508840432, 'learning_rate': 8.46524264651028e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (53371 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95993 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89236 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52939 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56737 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46588 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
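The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` failures above all reject samples whose recorded `image_wh` has a side below 28 pixels. A minimal sketch of such a guard, assuming the threshold 28 (the names `MIN_SIDE` and `check_min_image_size` are illustrative, not the actual `data_qwen_2.py` API):

```python
# Hypothetical reconstruction of the minimum-image-size guard seen in the log.
# The threshold 28 is taken from the log messages themselves; the function
# name and signature are assumptions for illustration.
MIN_SIDE = 28

def check_min_image_size(width: int, height: int, min_side: int = MIN_SIDE) -> None:
    """Raise ValueError when either image side is below `min_side` pixels."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. Minimum size is {min_side}."
        )

# A sample with image_wh [[14, 23]] fails the check, as sample 8357839 did:
try:
    check_min_image_size(14, 23)
except ValueError as e:
    print(e)  # Image size [14, 23] is too small. Minimum size is 28.
```

Samples with `image_wh` of `[[0, 0]]` (unreadable or missing images) fail the same check, which is why the loader logs them as problematic and retries with `[Try #0]`.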
82%|████████▏ | 18069/22095 [30:53:57<6:08:33, 5.49s/it] {'loss': 0.2589, 'grad_norm': 0.6130167016167806, 'learning_rate': 8.461162719471772e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (60416 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18070/22095 [30:54:01<5:37:29, 5.03s/it] {'loss': 0.2764, 'grad_norm': 0.5926511878837478, 'learning_rate': 8.457083684975298e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18071/22095 [30:54:05<5:12:49, 4.66s/it] {'loss': 0.3133, 'grad_norm': 0.5814554261878402, 'learning_rate': 8.453005543108501e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (47054 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119058 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96354 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18072/22095 [30:54:09<4:55:48, 4.41s/it] {'loss': 0.2403, 'grad_norm': 0.5453222644749728, 'learning_rate': 8.448928293959007e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [178, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8431220 in VC:s3://internvl-moe-sft-data/. Exception: Image size [178, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24627, 'image': 'vrdu_texteq/astro-ph.CO/fcc8bb5b-0aaf-4c2a-a6de-9de060249286.png', 'image_wh': [[178, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'for $n_{v} >1$ and'}]}
82%|████████▏ | 18073/22095 [30:54:13<4:44:55, 4.25s/it] {'loss': 0.3076, 'grad_norm': 0.6616303384602777, 'learning_rate': 8.444851937614446e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18074/22095 [30:54:18<5:01:10, 4.49s/it] {'loss': 0.4685, 'grad_norm': 0.2775563857111059, 'learning_rate': 8.440776474162388e-07, 'epoch': 0.82}
82%|████████▏ | 18075/22095 [30:54:22<4:47:42, 4.29s/it] {'loss': 0.2993, 'grad_norm': 0.5854024135886258, 'learning_rate': 8.436701903690392e-07, 'epoch': 0.82}
82%|████████▏ | 18076/22095 [30:54:26<4:41:25, 4.20s/it] {'loss': 0.3135, 'grad_norm':
0.6477255114728717, 'learning_rate': 8.432628226286032e-07, 'epoch': 0.82} 82%|████████▏ | 18076/22095 [30:54:26<4:41:25, 4.20s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18077/22095 [30:54:35<6:28:05, 5.80s/it] {'loss': 0.4663, 'grad_norm': 0.27798679897661827, 'learning_rate': 8.428555442036812e-07, 'epoch': 0.82} 82%|████████▏ | 18077/22095 [30:54:35<6:28:05, 5.80s/it] 82%|████████▏ | 18078/22095 [30:54:39<5:40:19, 5.08s/it] {'loss': 0.3152, 'grad_norm': 0.5661684252041219, 'learning_rate': 8.424483551030277e-07, 'epoch': 0.82} 82%|████████▏ | 18078/22095 [30:54:39<5:40:19, 5.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80581 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (129791 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18079/22095 [30:54:41<4:55:51, 4.42s/it] {'loss': 0.2672, 'grad_norm': 0.616865906589501, 'learning_rate': 8.420412553353885e-07, 'epoch': 0.82} 82%|████████▏ | 18079/22095 [30:54:42<4:55:51, 4.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18080/22095 [30:54:47<5:26:00, 4.87s/it] {'loss': 0.4634, 'grad_norm': 0.2682258287041038, 'learning_rate': 8.416342449095138e-07, 'epoch': 0.82} 82%|████████▏ | 18080/22095 [30:54:47<5:26:00, 4.87s/it] 82%|████████▏ | 18081/22095 [30:54:51<4:55:43, 4.42s/it] {'loss': 0.306, 'grad_norm': 0.6944034869375798, 'learning_rate': 8.412273238341462e-07, 'epoch': 0.82} 82%|████████▏ | 18081/22095 [30:54:51<4:55:43, 4.42s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948331 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71484, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 6.5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 82%|████████▏ | 18082/22095 [30:54:54<4:29:49, 4.03s/it] {'loss': 0.28, 'grad_norm': 0.632263742360841, 'learning_rate': 8.408204921180324e-07, 'epoch': 0.82} 82%|████████▏ | 18082/22095 [30:54:54<4:29:49, 4.03s/it] 82%|████████▏ | 18083/22095 [30:54:57<4:08:09, 3.71s/it] {'loss': 0.2901, 'grad_norm': 0.626627799718189, 'learning_rate': 8.404137497699122e-07, 'epoch': 0.82} 82%|████████▏ | 18083/22095 [30:54:57<4:08:09, 3.71s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348840 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 15510, 'image': 'vrdu_table_final_2/astro-ph.CO/af20082c-6a3e-40da-b954-6f3657b37824.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 82%|████████▏ | 18084/22095 [30:55:01<4:08:07, 3.71s/it] {'loss': 0.3531, 'grad_norm': 0.6525676731594316, 'learning_rate': 8.400070967985241e-07, 'epoch': 0.82} 82%|████████▏ | 18084/22095 [30:55:01<4:08:07, 3.71s/it] 82%|████████▏ | 18085/22095 [30:55:03<3:49:43, 3.44s/it] {'loss': 0.3061, 'grad_norm': 0.7172573954857759, 'learning_rate': 8.396005332126068e-07, 'epoch': 0.82} 82%|████████▏ | 18085/22095 [30:55:03<3:49:43, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55249 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89789 > 40960). 
Running this sequence through the model will result in indexing errors 82%|████████▏ | 18086/22095 [30:55:10<4:56:20, 4.44s/it] {'loss': 0.4708, 'grad_norm': 0.2704532318775338, 'learning_rate': 8.391940590208975e-07, 'epoch': 0.82} 82%|████████▏ | 18086/22095 [30:55:10<4:56:20, 4.44s/it] 82%|████████▏ | 18087/22095 [30:55:21<6:54:57, 6.21s/it] {'loss': 0.4428, 'grad_norm': 0.24641823451323808, 'learning_rate': 8.387876742321294e-07, 'epoch': 0.82} 82%|████████▏ | 18087/22095 [30:55:21<6:54:57, 6.21s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 82%|████████▏ | 18088/22095 [30:55:24<6:00:28, 5.40s/it] {'loss': 0.3161, 'grad_norm': 0.6891107336428738, 'learning_rate': 8.383813788550326e-07, 'epoch': 0.82} 82%|████████▏ | 18088/22095 [30:55:24<6:00:28, 5.40s/it] 82%|████████▏ | 18089/22095 [30:55:28<5:38:13, 5.07s/it] {'loss': 0.3436, 'grad_norm': 0.6139209157814352, 'learning_rate': 8.379751728983399e-07, 'epoch': 0.82} 82%|████████▏ | 18089/22095 [30:55:28<5:38:13, 5.07s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 11259333 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 7733, 'image': 'MMMU/Pharmacy/test_112_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze all the relevant background knowledge involved in this question and give an in-depth and detailed introduction to each relevant knowledge point. 
You need to provide a clear explanation of each knowledge point involved in the question and answer, including its concept, definition, practical application, importance and influence in related fields. Please provide appropriate examples to support the explanation of each knowledge point, just like the detailed explanation in the textbook. Make sure the answer is clearly structured, well-organized, and covers all important aspects of each knowledge point. The answer requires at least 2,000 words, and avoid directly answering the answer to the specific question, and focus on the analysis and explanation of background knowledge.\nHere are the content and options of this question:\nQuestion: In a blast furnace, iron(lll) oxide is used to produce iron by the following (unbalanced) reaction: ,How many moles of each product are formed\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': "To thoroughly explore the background knowledge involved in the reaction given, which is the reduction of iron(III) oxide by carbon monoxide to produce iron and carbon dioxide, we must delve into the following key areas: chemical reactions and stoichiometry, redox reactions, the role of a blast furnace, properties of iron compounds, and practical applications in industry. Let's explore each concept in detail.\n\n### 1. Chemical Reactions and Stoichiometry\n\n**Concept and Definition:**\n\nChemical reactions involve the transformation of reactants into products. Stoichiometry is the field of chemistry concerned with the quantitative relationships between reactants and products in a chemical reaction. It is based on the conservation of mass, where the total mass of reactants equals the total mass of products.\n\n**Applications and Importance:**\n\nStoichiometry is essential for predicting the amounts of substances consumed and produced in a reaction. 
For instance, in industrial applications, stoichiometry helps in scaling reactions for manufacturing processes, ensuring efficient use of resources and minimizing waste.\n\n**Examples:**\n\nFor a balanced reaction, such as 2H₂ + O₂ → 2H₂O, stoichiometry tells us that 2 moles of hydrogen react with 1 mole of oxygen to produce 2 moles of water. This helps engineer systems where precise amounts of gas are required for combustion or other chemical processes.\n\n### 2. Redox Reactions\n\n**Concept and Definition:**\n\nRedox (reduction-oxidation) reactions involve the transfer of electrons between substances. One substance undergoes oxidation (loses electrons), while another undergoes reduction (gains electrons).\n\n**Applications and Importance:**\n\nRedox reactions are fundamental to various scientific and industrial processes, including energy production, metallurgy, and biological systems. They are crucial for electrochemical cells, where chemical energy is converted into electrical energy.\n\n**Examples:**\n\nIn the reaction 2Fe₂O₃ + 3CO → 4Fe + 3CO₂, iron(III) oxide (Fe₂O₃) is reduced to iron (Fe), while carbon monoxide (CO) is oxidized to carbon dioxide (CO₂). Redox reactions like this are integral to processes like metal extraction and refining.\n\n### 3. Role of a Blast Furnace\n\n**Concept and Definition:**\n\nA blast furnace is an industrial apparatus used to extract metals from their ores at high temperatures. It operates on the principle of chemical reduction, where a reducing agent helps convert metal oxides to pure metal.\n\n**Applications and Importance:**\n\nBlast furnaces are a critical component of the steel industry. They help in the mass production of iron and steel, which are foundational to construction and manufacturing industries worldwide.\n\n**Examples:**\n\nThe typical process in a blast furnace involves the reduction of iron ore (e.g., hematite or magnetite) using carbon-based reducing agents like coke. 
The equation is similar to 2Fe₂O₃ + 3CO → 4Fe + 3CO₂, where iron ore is transformed into liquid pig iron.\n\n### 4. Properties of Iron Compounds\n\n**Concept and Definition:**\n\nIron compounds, including oxides like Fe₂O₃, play crucial roles in various chemical reactions and industrial processes. They exhibit different properties depending on their oxidation state and molecular structure.\n\n**Applications and Importance:**\n\nIron oxides are common in industrial applications, from pigments and polishing compounds to magnetic materials and iron extraction. Understanding these properties is essential for process optimization in metallurgy and materials science.\n\n**Examples:**\n\n- **Iron(III) Oxide (Fe₂O₃):** Known as hematite, it is used in pigments and ores for iron extraction.\n- **Iron(II) Oxide (FeO):** Plays a role in steel production as a feed material.\n\n### 5. Practical Applications in Industry\n\n**Concept and Definition:**\n\nThe knowledge related to this chemical reaction is applied in the metallurgical industry to produce metals like iron. Industrial processes rely on finely tuned chemical reactions to obtain desired products efficiently and sustainably.\n\n**Applications and Importance:**\n\nRefining processes in industries like steel manufacturing are optimized using chemical principles like stoichiometry and redox reactions to enhance the quality and quantity of output. These processes are designed to be energy-efficient and environmentally friendly.\n\n**Examples:**\n\n- **Steel Production:** The blast furnace is used to produce pig iron, which is further refined into steel. 
Innovations in this process aim to reduce carbon emissions and energy consumption.\n- **Cement Industry:** Similar principles are applied in kilns where chemical reactions create clinker, a precursor to cement.\n\n### Conclusion\n\nUnderstanding the intricacies of chemical reactions, redox processes, and industrial applications provides a foundation for exploring the production of iron from iron(III) oxide. From the stoichiometric calculations that ensure efficient use of resources to the engineering of massive blast furnaces that drive the global steel industry, this knowledge is integral to both theoretical chemistry and practical industrial applications. Continual advancements in this field strive to make these processes more sustainable and economically viable, highlighting the ongoing evolution of chemistry and engineering."}]}
82%|████████▏ | 18090/22095 [30:55:32<5:16:07, 4.74s/it] {'loss': 0.2992, 'grad_norm': 0.595071574398606, 'learning_rate': 8.375690563707761e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (109554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44002 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68244 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47384 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (122356 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18091/22095 [30:55:36<4:49:11, 4.33s/it] {'loss': 0.2521, 'grad_norm': 0.770051033788159, 'learning_rate': 8.371630292810712e-07, 'epoch': 0.82}
82%|████████▏ | 18092/22095 [30:55:39<4:26:09, 3.99s/it] {'loss': 0.3236, 'grad_norm': 1.1735476464817132, 'learning_rate': 8.367570916379464e-07, 'epoch': 0.82}
82%|████████▏ | 18093/22095 [30:55:42<4:07:34, 3.71s/it] {'loss': 0.3005, 'grad_norm': 0.7713069831108617, 'learning_rate': 8.363512434501264e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (68730 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95411 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18094/22095 [30:55:51<6:04:00, 5.46s/it] {'loss': 0.4746, 'grad_norm': 0.2847719199034834, 'learning_rate': 8.359454847263293e-07, 'epoch': 0.82}
82%|████████▏ | 18095/22095 [30:55:55<5:32:31, 4.99s/it] {'loss': 0.2622, 'grad_norm': 0.552169758517861, 'learning_rate': 8.355398154752759e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (72358 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85995 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123635 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99633 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72976 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51504 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18096/22095 [30:55:58<4:50:50, 4.36s/it] {'loss': 0.2641, 'grad_norm': 0.6249685361362285, 'learning_rate': 8.351342357056818e-07, 'epoch': 0.82}
82%|████████▏ | 18097/22095 [30:56:02<4:31:47, 4.08s/it] {'loss': 0.265, 'grad_norm': 0.6003519390627243, 'learning_rate': 8.347287454262603e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18098/22095 [30:56:05<4:20:36, 3.91s/it] {'loss': 0.2519, 'grad_norm': 0.5991625551143557, 'learning_rate': 8.343233446457272e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18099/22095 [30:56:13<5:41:44, 5.13s/it] {'loss': 0.4893, 'grad_norm': 0.2810411008882779, 'learning_rate': 8.339180333727909e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
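The log above repeatedly warns that tokenized samples exceed the model's 40960-token maximum, and later shows one rank truncating such samples. A minimal sketch of that kind of length guard is below; the constant and the helper name are assumptions for illustration, not the trainer's actual API:

```python
# Hypothetical sketch of the length guard behind the
# "Token indices sequence length is longer than ..." warnings above.
# MAX_LEN mirrors the 40960 limit seen in the log; the function name is invented.
MAX_LEN = 40960

def truncate_token_ids(token_ids, max_len=MAX_LEN):
    """Warn and truncate when a tokenized sample exceeds the model's window."""
    if len(token_ids) > max_len:
        print(
            f"Token indices sequence length is longer than the specified "
            f"maximum sequence length for this model ({len(token_ids)} > {max_len})."
        )
        return token_ids[:max_len]  # keep only the first max_len token ids
    return token_ids
```

Truncation avoids the out-of-range position indices the warning refers to, at the cost of silently dropping the tail of the sample.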
82%|████████▏ | 18100/22095 [30:56:17<5:10:35, 4.66s/it] {'loss': 0.314, 'grad_norm': 0.5922428037797257, 'learning_rate': 8.335128116161595e-07, 'epoch': 0.82} 82%|████████▏ | 18100/22095 [30:56:17<5:10:35, 4.66s/it] 82%|████████▏ | 18101/22095 [30:56:19<4:32:04, 4.09s/it] {'loss': 0.2865, 'grad_norm': 0.7208442527501777, 'learning_rate': 8.331076793845422e-07, 'epoch': 0.82} 82%|████████▏ | 18101/22095 [30:56:19<4:32:04, 4.09s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43332 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66891 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47170 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18102/22095 [30:56:29<6:27:42, 5.83s/it] {'loss': 0.4443, 'grad_norm': 0.2829411112652835, 'learning_rate': 8.327026366866437e-07, 'epoch': 0.82} 82%|████████▏ | 18102/22095 [30:56:29<6:27:42, 5.83s/it] 82%|████████▏ | 18103/22095 [30:56:33<5:47:43, 5.23s/it] {'loss': 0.292, 'grad_norm': 0.6211237231506862, 'learning_rate': 8.322976835311669e-07, 'epoch': 0.82} 82%|████████▏ | 18103/22095 [30:56:33<5:47:43, 5.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (131769 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45773 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61217 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48808 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18104/22095 [30:56:43<7:22:36, 6.65s/it] {'loss': 0.4805, 'grad_norm': 0.26641329034091676, 'learning_rate': 8.318928199268117e-07, 'epoch': 0.82} 82%|████████▏ | 18104/22095 [30:56:43<7:22:36, 6.65s/it] 82%|████████▏ | 18105/22095 [30:56:48<6:39:04, 6.00s/it] {'loss': 0.3213, 'grad_norm': 0.5954674577374436, 'learning_rate': 8.314880458822794e-07, 'epoch': 0.82} 82%|████████▏ | 18105/22095 [30:56:48<6:39:04, 6.00s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50054 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42236 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101067 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18106/22095 [30:56:52<6:05:01, 5.49s/it] {'loss': 0.3267, 'grad_norm': 0.6591766551015656, 'learning_rate': 8.310833614062652e-07, 'epoch': 0.82} 82%|████████▏ | 18106/22095 [30:56:52<6:05:01, 5.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43847 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44338 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91330 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18107/22095 [30:56:56<5:30:17, 4.97s/it] {'loss': 0.2646, 'grad_norm': 0.634582558619486, 'learning_rate': 8.306787665074673e-07, 'epoch': 0.82} 82%|████████▏ | 18107/22095 [30:56:56<5:30:17, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73770 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43851 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76457 > 40960). Running this sequence through the model will result in indexing errors 82%|████████▏ | 18108/22095 [30:56:59<5:00:08, 4.52s/it] {'loss': 0.2812, 'grad_norm': 0.591732723888874, 'learning_rate': 8.302742611945758e-07, 'epoch': 0.82} 82%|████████▏ | 18108/22095 [30:56:59<5:00:08, 4.52s/it] 82%|████████▏ | 18109/22095 [30:57:02<4:29:50, 4.06s/it] {'loss': 0.2814, 'grad_norm': 0.5876345405105374, 'learning_rate': 8.298698454762854e-07, 'epoch': 0.82} 82%|████████▏ | 18109/22095 [30:57:02<4:29:50, 4.06s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [625, 25, 100, 100] 
is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8500757 in VC:s3://internvl-moe-sft-data/. Exception: Image size [625, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 125074, 'image': 'vrdu_texteq/astro-ph.CO/4d495a2a-790d-4eca-aeda-514f42af6757.png', 'image_wh': [[625, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'Note that $\\Omega_{m2} \\to 1$ in the $\\delta \\to 0$ and $\\delta \\to \\infty$ limits.'}]} 82%|████████▏ | 18110/22095 [30:57:11<6:12:17, 5.61s/it] {'loss': 0.4747, 'grad_norm': 0.2784159769170307, 'learning_rate': 8.294655193612838e-07, 'epoch': 0.82} 82%|████████▏ | 18110/22095 [30:57:11<6:12:17, 5.61s/it] 82%|████████▏ | 18111/22095 [30:57:15<5:28:38, 4.95s/it] {'loss': 0.3417, 'grad_norm': 0.6053642942795595, 'learning_rate': 8.2906128285826e-07, 'epoch': 0.82} 82%|████████▏ | 18111/22095 [30:57:15<5:28:38, 4.95s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18112/22095 [30:57:24<6:49:23, 6.17s/it] {'loss': 0.4722, 'grad_norm': 0.3103793994779556, 'learning_rate': 8.286571359758993e-07, 'epoch': 0.82} 82%|████████▏ | 18112/22095 [30:57:24<6:49:23, 6.17s/it] 82%|████████▏ | 18113/22095 [30:57:28<6:10:19, 5.58s/it] {'loss': 0.2578, 'grad_norm': 0.6863803379605348, 'learning_rate': 8.282530787228848e-07, 'epoch': 0.82} 82%|████████▏ | 18113/22095 [30:57:28<6:10:19, 5.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18114/22095 [30:57:38<7:29:11, 6.77s/it] {'loss': 0.4758, 'grad_norm': 0.286499942924758, 'learning_rate': 8.278491111078984e-07, 'epoch': 0.82} 82%|████████▏ | 18114/22095 [30:57:38<7:29:11, 6.77s/it] 82%|████████▏ | 18115/22095 [30:57:44<7:30:08, 6.79s/it] {'loss': 0.4866, 'grad_norm': 0.28709485782608885, 'learning_rate': 8.274452331396221e-07, 'epoch': 0.82} 82%|████████▏ | 18115/22095 [30:57:44<7:30:08, 
6.79s/it] 82%|████████▏ | 18116/22095 [30:57:54<8:27:38, 7.65s/it] {'loss': 0.4694, 'grad_norm': 0.2764893701502648, 'learning_rate': 8.270414448267333e-07, 'epoch': 0.82} 82%|████████▏ | 18116/22095 [30:57:54<8:27:38, 7.65s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 82%|████████▏ | 18117/22095 [30:57:57<6:57:54, 6.30s/it] {'loss': 0.3021, 'grad_norm': 0.8317042355162554, 'learning_rate': 8.266377461779057e-07, 'epoch': 0.82} 82%|████████▏ | 18117/22095 [30:57:57<6:57:54, 6.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 82%|████████▏ | 18118/22095 [30:58:01<6:15:35, 5.67s/it] {'loss': 0.2858, 'grad_norm': 0.5772234597460723, 'learning_rate': 8.262341372018168e-07, 'epoch': 0.82} 82%|████████▏ | 18118/22095 [30:58:01<6:15:35, 5.67s/it] 82%|████████▏ | 18119/22095 [30:58:04<5:22:45, 4.87s/it] {'loss': 0.3145, 'grad_norm': 0.5982028186899865, 'learning_rate': 8.258306179071368e-07, 'epoch': 0.82} 82%|████████▏ | 18119/22095 [30:58:04<5:22:45, 4.87s/it] 82%|████████▏ | 18120/22095 [30:58:08<5:00:29, 4.54s/it] {'loss': 0.2632, 'grad_norm': 0.6180362368064066, 'learning_rate': 8.254271883025377e-07, 'epoch': 0.82} 82%|████████▏ | 18120/22095 [30:58:08<5:00:29, 4.54s/it] 82%|████████▏ | 18121/22095 [30:58:11<4:29:40, 4.07s/it] {'loss': 0.2411, 'grad_norm': 0.6745805870388583, 'learning_rate': 8.250238483966855e-07, 'epoch': 0.82} 82%|████████▏ | 18121/22095 [30:58:11<4:29:40, 4.07s/it] 82%|████████▏ | 18122/22095 [30:58:15<4:19:59, 3.93s/it] {'loss': 0.3088, 'grad_norm': 0.5797029407307055, 'learning_rate': 8.246205981982503e-07, 'epoch': 0.82} 82%|████████▏ | 18122/22095 [30:58:15<4:19:59, 3.93s/it] 82%|████████▏ | 18123/22095 [30:58:18<4:06:06, 3.72s/it] {'loss': 0.2584, 'grad_norm': 0.5830337703042072, 'learning_rate': 8.242174377158929e-07, 'epoch': 0.82} 82%|████████▏ | 18123/22095 [30:58:18<4:06:06, 3.72s/it]Token indices sequence length is longer than the 
specified maximum sequence length for this model (79848 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45621 > 40960) for 4 sample(s). Truncating to 40491 with 2 samples.
82%|████████▏ | 18124/22095 [30:58:23<4:42:00, 4.26s/it] {'loss': 0.348, 'grad_norm': 0.8578833069169375, 'learning_rate': 8.238143669582794e-07, 'epoch': 0.82}
82%|████████▏ | 18125/22095 [30:58:27<4:21:19, 3.95s/it] {'loss': 0.2835, 'grad_norm': 0.5814887271750685, 'learning_rate': 8.234113859340687e-07, 'epoch': 0.82}
82%|████████▏ | 18126/22095 [30:58:30<4:04:45, 3.70s/it] {'loss': 0.3162, 'grad_norm': 0.994467368324847, 'learning_rate': 8.23008494651919e-07, 'epoch': 0.82}
82%|████████▏ | 18127/22095 [30:58:33<3:49:20, 3.47s/it] {'loss': 0.2825, 'grad_norm': 0.6106514026492696, 'learning_rate': 8.226056931204879e-07, 'epoch': 0.82}
82%|████████▏ | 18128/22095 [30:58:36<3:49:24, 3.47s/it] {'loss': 0.3031, 'grad_norm': 0.6112142660925025, 'learning_rate': 8.222029813484333e-07, 'epoch': 0.82}
82%|████████▏ | 18129/22095 [30:58:39<3:36:57, 3.28s/it] {'loss': 0.3145, 'grad_norm': 0.6635302954408059, 'learning_rate': 8.218003593444029e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18130/22095 [30:58:43<3:40:46, 3.34s/it] {'loss': 0.2666, 'grad_norm': 0.6091885895922927, 'learning_rate': 8.213978271170503e-07, 'epoch': 0.82}
82%|████████▏ | 18131/22095 [30:58:46<3:47:47, 3.45s/it] {'loss': 0.3235, 'grad_norm': 0.6142540711638156, 'learning_rate': 8.209953846750257e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18132/22095 [30:58:56<6:01:45, 5.48s/it] {'loss': 0.4826, 'grad_norm': 0.32349620026022546, 'learning_rate': 8.205930320269762e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18133/22095 [30:59:00<5:24:10, 4.91s/it] {'loss': 0.3094, 'grad_norm': 0.6565366682003055, 'learning_rate': 8.201907691815448e-07, 'epoch': 0.82}
82%|████████▏ | 18134/22095 [30:59:03<4:52:28, 4.43s/it] {'loss': 0.3032, 'grad_norm': 1.3502509981198274, 'learning_rate': 8.197885961473773e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (78407 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104077 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18135/22095 [30:59:06<4:18:44, 3.92s/it] {'loss': 0.2772, 'grad_norm': 0.5785720017965967, 'learning_rate': 8.193865129331136e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (78281 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41659 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43116 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102014 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18136/22095 [30:59:14<5:37:09, 5.11s/it] {'loss': 0.4586, 'grad_norm': 0.28259587502298983, 'learning_rate': 8.18984519547395e-07, 'epoch': 0.82}
82%|████████▏ | 18137/22095 [30:59:17<4:58:37, 4.53s/it] {'loss': 0.2603, 'grad_norm': 0.5966135804840258, 'learning_rate': 8.18582615998857e-07, 'epoch': 0.82}
82%|████████▏ | 18138/22095 [30:59:20<4:31:25, 4.12s/it] {'loss': 0.3201, 'grad_norm': 0.6561140381210594, 'learning_rate': 8.181808022961374e-07, 'epoch': 0.82}
82%|████████▏ | 18139/22095 [30:59:23<4:08:31, 3.77s/it] {'loss': 0.2884, 'grad_norm': 0.6647527581768201, 'learning_rate': 8.177790784478679e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18140/22095 [30:59:31<5:23:21, 4.91s/it] {'loss': 0.4565, 'grad_norm': 0.24971824884844132, 'learning_rate': 8.173774444626819e-07, 'epoch': 0.82}
82%|████████▏ | 18141/22095 [30:59:35<5:02:32, 4.59s/it] {'loss': 0.3287, 'grad_norm': 0.6438764482155982, 'learning_rate': 8.169759003492095e-07, 'epoch': 0.82}
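The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings above come from tokenized samples that exceed the model's 40960-token limit, and the "Truncating to 40491 with 2 samples" line shows the loader clipping such samples before they reach the model. A minimal sketch of that kind of guard (a hypothetical helper, not the actual `data_qwen_2.py` code):

```python
MODEL_MAX_LEN = 40960  # limit reported in the warnings above

def clip_sample(input_ids, labels, max_len=MODEL_MAX_LEN):
    """Truncate a tokenized sample so it cannot index past the model's
    maximum sequence length (hypothetical pre-collation guard)."""
    if len(input_ids) <= max_len:
        return input_ids, labels
    # Clip input ids and labels together so they stay aligned.
    return input_ids[:max_len], labels[:max_len]
```

Without such a clip, position ids past the configured maximum produce the "indexing errors" the warning describes.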
82%|████████▏ | 18142/22095 [30:59:39<4:57:48, 4.52s/it] {'loss': 0.3141, 'grad_norm': 0.6143142139977565, 'learning_rate': 8.165744461160763e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18143/22095 [30:59:48<6:23:12, 5.82s/it] {'loss': 0.4639, 'grad_norm': 0.25061996236946416, 'learning_rate': 8.161730817719094e-07, 'epoch': 0.82}
82%|████████▏ | 18144/22095 [30:59:57<7:34:24, 6.90s/it] {'loss': 0.461, 'grad_norm': 0.26770998240193866, 'learning_rate': 8.157718073253351e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 364, but got module 1
82%|████████▏ | 18145/22095 [31:00:01<6:30:59, 5.94s/it] {'loss': 0.2556, 'grad_norm': 0.6405285204560663, 'learning_rate': 8.153706227849734e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369309 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36061, 'image': 'vrdu_table_final_2/astro-ph.CO/aa2eb608-417d-4d12-b8ab-a6db4cfee36e.png', 'image_wh': [[25, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{c}$y_0$\\end{tabular}\n```'}]}
82%|████████▏ | 18146/22095 [31:00:04<5:38:11, 5.14s/it] {'loss': 0.2818, 'grad_norm': 0.637428253048071, 'learning_rate': 8.149695281594438e-07, 'epoch': 0.82}
82%|████████▏ | 18147/22095 [31:00:08<5:05:44, 4.65s/it] {'loss': 0.3443, 'grad_norm': 0.6201304150041989, 'learning_rate': 8.145685234573675e-07, 'epoch': 0.82}
82%|████████▏ | 18148/22095 [31:00:11<4:45:36, 4.34s/it] {'loss': 0.3329, 'grad_norm': 0.5821698916057503, 'learning_rate': 8.141676086873574e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18149/22095 [31:00:19<5:53:24, 5.37s/it] {'loss': 0.4701, 'grad_norm': 0.2703313986797029, 'learning_rate': 8.137667838580304e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (50565 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110272 > 40960).
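The "Image size ... is too small. Minimum size is 28" failures above all follow the same pattern: a sample whose `image_wh` has at least one edge under 28 px is rejected by the loader, retried, and logged. A minimal sketch of a pre-filter that would catch these during dataset preparation (hypothetical helper; the 28-px floor is taken from the error message):

```python
MIN_SIDE = 28  # minimum edge length the loader accepts, per the error above

def image_large_enough(width, height, min_side=MIN_SIDE):
    """Return True only if both edges meet the minimum, so samples like
    image_wh=[[25, 23]] are skipped up front instead of failing at fetch time."""
    return width >= min_side and height >= min_side
```

Filtering such samples once, offline, avoids the fetch-fail-retry cycle seen repeatedly in this log.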
Running this sequence through the model will result in indexing errors
82%|████████▏ | 18150/22095 [31:00:23<5:29:03, 5.00s/it] {'loss': 0.252, 'grad_norm': 0.5871374932622914, 'learning_rate': 8.13366048977997e-07, 'epoch': 0.82}
82%|████████▏ | 18151/22095 [31:00:27<4:58:40, 4.54s/it] {'loss': 0.3217, 'grad_norm': 0.6248557357627881, 'learning_rate': 8.12965404055871e-07, 'epoch': 0.82}
82%|████████▏ | 18152/22095 [31:00:31<4:42:48, 4.30s/it] {'loss': 0.2855, 'grad_norm': 0.6547924784275919, 'learning_rate': 8.125648491002569e-07, 'epoch': 0.82}
82%|████████▏ | 18153/22095 [31:00:34<4:23:35, 4.01s/it] {'loss': 0.29, 'grad_norm': 0.6527228580605702, 'learning_rate': 8.121643841197652e-07, 'epoch': 0.82}
82%|████████▏ | 18154/22095 [31:00:38<4:21:22, 3.98s/it] {'loss': 0.3293, 'grad_norm': 0.6755647792905147, 'learning_rate': 8.117640091229984e-07, 'epoch': 0.82}
82%|████████▏ | 18155/22095 [31:00:42<4:17:12, 3.92s/it] {'loss': 0.2874, 'grad_norm': 0.5999022378057997, 'learning_rate': 8.11363724118559e-07, 'epoch': 0.82}
82%|████████▏ | 18156/22095 [31:00:45<4:05:41, 3.74s/it] {'loss': 0.252, 'grad_norm': 0.6007954263800855, 'learning_rate': 8.109635291150492e-07, 'epoch': 0.82}
82%|████████▏ | 18157/22095 [31:00:49<4:12:36, 3.85s/it] {'loss': 0.3278, 'grad_norm': 0.7865647067064967, 'learning_rate': 8.105634241210692e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18158/22095 [31:00:58<5:58:41, 5.47s/it] {'loss': 0.4878, 'grad_norm': 0.27734227101871806, 'learning_rate': 8.101634091452121e-07, 'epoch': 0.82}
82%|████████▏ | 18159/22095 [31:01:02<5:19:02, 4.86s/it] {'loss': 0.2536, 'grad_norm': 0.5600608454366888, 'learning_rate': 8.097634841960756e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (111748 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43003 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43830 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18160/22095 [31:01:05<4:40:20, 4.27s/it] {'loss': 0.3144, 'grad_norm': 0.623917531673197, 'learning_rate': 8.093636492822532e-07, 'epoch': 0.82}
82%|████████▏ | 18161/22095 [31:01:07<4:11:29, 3.84s/it] {'loss': 0.2931, 'grad_norm': 0.623452221799919, 'learning_rate': 8.089639044123354e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item
    self.list_data_dict[i].get("height", 100),
ValueError: Number of image tokens ['data/ant-design/upload/other_screenshot/original/StyledFileUploader_1742052280.344486.png'] does not match number of images None
[Try #0] Failed to fetch sample 1843668 in VC:s3://gui-agent/jedi/images/final_1.5m/final_1.5m_extracted/. Exception: Number of image tokens ['data/ant-design/upload/other_screenshot/original/StyledFileUploader_1742052280.344486.png'] does not match number of images None
Problematic sample: {'image': 'data/ant-design/upload/other_screenshot/original/StyledFileUploader_1742052280.344486.png', 'conversations': []}
82%|████████▏ | 18162/22095 [31:01:11<4:13:09, 3.86s/it] {'loss': 0.2827, 'grad_norm': 0.6132492355170043, 'learning_rate': 8.085642495949108e-07, 'epoch': 0.82}
82%|████████▏ | 18163/22095 [31:01:15<4:13:19, 3.87s/it] {'loss': 0.3229, 'grad_norm': 0.6219717134997269, 'learning_rate': 8.081646848385671e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (100492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67074 > 40960).
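The recurring "Number of image tokens 0 does not match number of images 1" followed by "Fixed image tokens in the conversation" indicates the loader counts image placeholders in the conversation text, and when too few are found it repairs the sample rather than dropping it. A sketch of that repair under stated assumptions (the placeholder name `<image>` and the prepend-to-first-turn strategy are both guesses, not confirmed by this log):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder name, not shown in the log

def fix_image_tokens(conversations, num_images, token=IMAGE_TOKEN):
    """Hypothetical repair mirroring 'Fixed image tokens in the conversation':
    if fewer placeholders appear than images, prepend the missing ones to the
    first turn so downstream token/image counts line up."""
    found = sum(turn["value"].count(token) for turn in conversations)
    missing = num_images - found
    if missing > 0 and conversations:
        conversations[0]["value"] = token * missing + "\n" + conversations[0]["value"]
    return conversations
```

Note the sample above with `'conversations': []` cannot be repaired this way, which is why it raises instead of being fixed.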
Running this sequence through the model will result in indexing errors
82%|████████▏ | 18164/22095 [31:01:19<4:12:55, 3.86s/it] {'loss': 0.2871, 'grad_norm': 0.5882007765149354, 'learning_rate': 8.077652101518918e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18165/22095 [31:01:27<5:39:53, 5.19s/it] {'loss': 0.4568, 'grad_norm': 0.2626698915552704, 'learning_rate': 8.073658255434658e-07, 'epoch': 0.82}
82%|████████▏ | 18166/22095 [31:01:31<5:00:59, 4.60s/it] {'loss': 0.3057, 'grad_norm': 0.8530835081708823, 'learning_rate': 8.06966531021871e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (44783 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76415 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18167/22095 [31:01:34<4:36:59, 4.23s/it] {'loss': 0.3907, 'grad_norm': 0.6556701708777884, 'learning_rate': 8.065673265956886e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18168/22095 [31:01:44<6:26:09, 5.90s/it] {'loss': 0.4701, 'grad_norm': 0.27094993368097087, 'learning_rate': 8.061682122734937e-07, 'epoch': 0.82}
82%|████████▏ | 18169/22095 [31:01:47<5:35:03, 5.12s/it] {'loss': 0.3179, 'grad_norm': 0.5941525019158841, 'learning_rate': 8.057691880638651e-07, 'epoch': 0.82}
82%|████████▏ | 18170/22095 [31:01:51<5:04:56, 4.66s/it] {'loss': 0.2853, 'grad_norm': 0.5982256887096105, 'learning_rate': 8.053702539753749e-07, 'epoch': 0.82}
82%|████████▏ | 18171/22095 [31:01:54<4:41:39, 4.31s/it] {'loss': 0.2905, 'grad_norm': 0.7014758040783442, 'learning_rate': 8.04971410016594e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18172/22095 [31:02:02<6:02:14, 5.54s/it] {'loss': 0.4598, 'grad_norm': 0.28190956210995716, 'learning_rate': 8.045726561960931e-07, 'epoch': 0.82}
82%|████████▏ | 18173/22095 [31:02:06<5:20:38, 4.91s/it] {'loss': 0.2901, 'grad_norm': 0.6631737483533378, 'learning_rate': 8.041739925224424e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18174/22095 [31:02:10<4:57:43, 4.56s/it] {'loss': 0.2919, 'grad_norm': 0.643318251146491, 'learning_rate': 8.037754190042058e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18175/22095 [31:02:19<6:33:57, 6.03s/it] {'loss': 0.4767, 'grad_norm': 0.2627393483539942, 'learning_rate': 8.033769356499466e-07, 'epoch': 0.82}
82%|████████▏ | 18176/22095 [31:02:23<5:57:51, 5.48s/it] {'loss': 0.2922, 'grad_norm': 0.5960163959853573, 'learning_rate': 8.029785424682291e-07, 'epoch': 0.82}
82%|████████▏ | 18177/22095 [31:02:27<5:28:22, 5.03s/it] {'loss': 0.3352, 'grad_norm': 0.6325423313576644, 'learning_rate': 8.025802394676114e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (53475 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18178/22095 [31:02:31<4:56:11, 4.54s/it] {'loss': 0.3123, 'grad_norm': 0.6717469644769754, 'learning_rate': 8.021820266566538e-07, 'epoch': 0.82}
82%|████████▏ | 18179/22095 [31:02:34<4:30:39, 4.15s/it] {'loss': 0.2657, 'grad_norm': 0.6987405573728781, 'learning_rate': 8.017839040439113e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (86548 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18180/22095 [31:02:38<4:19:57, 3.98s/it] {'loss': 0.3038, 'grad_norm': 0.6914251246915453, 'learning_rate': 8.013858716379396e-07, 'epoch': 0.82}
82%|████████▏ | 18181/22095 [31:02:41<4:07:59, 3.80s/it] {'loss': 0.3245, 'grad_norm': 0.6329902662785449, 'learning_rate': 8.009879294472894e-07, 'epoch': 0.82}
82%|████████▏ | 18182/22095 [31:02:45<4:08:07, 3.80s/it] {'loss': 0.282, 'grad_norm': 0.6271602808615151, 'learning_rate': 8.005900774805137e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (45822 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86811 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18183/22095 [31:02:48<4:00:03, 3.68s/it] {'loss': 0.2784, 'grad_norm': 0.5855788353703362, 'learning_rate': 8.001923157461594e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
82%|████████▏ | 18184/22095 [31:02:51<3:48:29, 3.51s/it] {'loss': 0.2903, 'grad_norm': 0.661459765686884, 'learning_rate': 7.997946442527726e-07, 'epoch': 0.82}
82%|████████▏ | 18185/22095 [31:02:55<3:54:40, 3.60s/it] {'loss': 0.2791, 'grad_norm': 0.6626994250988717, 'learning_rate': 7.993970630088988e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18186/22095 [31:03:00<4:15:57, 3.93s/it] {'loss': 0.4706, 'grad_norm': 0.2745844389633259, 'learning_rate': 7.989995720230837e-07, 'epoch': 0.82}
82%|████████▏ | 18187/22095 [31:03:03<4:08:17, 3.81s/it] {'loss': 0.3139, 'grad_norm': 0.6205753959267856, 'learning_rate': 7.986021713038627e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18188/22095 [31:03:11<5:17:45, 4.88s/it] {'loss': 0.4659, 'grad_norm': 0.2632761372907264, 'learning_rate': 7.982048608597776e-07, 'epoch': 0.82}
82%|████████▏ | 18189/22095 [31:03:14<4:48:59, 4.44s/it] {'loss': 0.2786, 'grad_norm': 0.7485575335159796, 'learning_rate': 7.978076406993662e-07, 'epoch': 0.82}
82%|████████▏ | 18190/22095 [31:03:17<4:25:47, 4.08s/it] {'loss': 0.2959, 'grad_norm': 0.6566044462595303, 'learning_rate': 7.974105108311625e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18191/22095 [31:03:27<6:14:22, 5.75s/it] {'loss': 0.473, 'grad_norm': 0.2937101749404086, 'learning_rate': 7.970134712636984e-07, 'epoch': 0.82}
82%|████████▏ | 18192/22095 [31:03:31<5:39:55, 5.23s/it] {'loss': 0.3067, 'grad_norm': 0.6265125798422845, 'learning_rate': 7.966165220055067e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8351588 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 18266, 'image': 'vrdu_table_final_2/astro-ph.CO/dfce9cca-a5ae-4d51-9802-0715a90e4e1c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
82%|████████▏ | 18193/22095 [31:03:34<4:58:19, 4.59s/it] {'loss': 0.2693, 'grad_norm': 0.6204538426926942, 'learning_rate': 7.96219663065117e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (61749 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89772 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18194/22095 [31:03:38<4:38:26, 4.28s/it] {'loss': 0.2908, 'grad_norm': 0.6668241049027904, 'learning_rate': 7.95822894451056e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (42863 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18195/22095 [31:03:40<4:10:49, 3.86s/it] {'loss': 0.2979, 'grad_norm': 0.6167276572327692, 'learning_rate': 7.954262161718479e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18196/22095 [31:03:50<6:00:29, 5.55s/it] {'loss': 0.4892, 'grad_norm': 0.263679280167042, 'learning_rate': 7.950296282360181e-07, 'epoch': 0.82}
82%|████████▏ | 18197/22095 [31:03:54<5:22:33, 4.96s/it] {'loss': 0.2942, 'grad_norm': 0.7312264021619062, 'learning_rate': 7.946331306520854e-07, 'epoch': 0.82}
82%|████████▏ | 18198/22095 [31:03:57<4:47:23, 4.42s/it] {'loss': 0.2761, 'grad_norm': 0.6937632900134934, 'learning_rate': 7.942367234285725e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (105465 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41078 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45855 > 40960). Running this sequence through the model will result in indexing errors
82%|████████▏ | 18199/22095 [31:04:00<4:23:35, 4.06s/it] {'loss': 0.3461, 'grad_norm': 0.6389653819189878, 'learning_rate': 7.938404065739952e-07, 'epoch': 0.82}
82%|████████▏ | 18200/22095 [31:04:03<3:59:34, 3.69s/it] {'loss': 0.3216, 'grad_norm': 0.7198943347630627, 'learning_rate': 7.934441800968684e-07, 'epoch': 0.82}
82%|████████▏ | 18201/22095 [31:04:07<4:17:14, 3.96s/it] {'loss': 0.2941, 'grad_norm': 0.9748183117086185, 'learning_rate': 7.93048044005707e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18202/22095 [31:04:17<6:12:01, 5.73s/it] {'loss': 0.4613, 'grad_norm': 0.26246050734572146, 'learning_rate': 7.92651998309023e-07, 'epoch': 0.82}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8604702 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12512, 'image': '761525904.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Science Fiction & Fantasy? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Reference? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
82%|████████▏ | 18203/22095 [31:04:27<7:23:11, 6.83s/it] {'loss': 0.471, 'grad_norm': 0.2857559137902954, 'learning_rate': 7.922560430153259e-07, 'epoch': 0.82}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924514 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47667, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nA. 6cm\nB. 2cm\nC. 3cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
82%|████████▏ | 18204/22095 [31:04:30<6:14:37, 5.78s/it] {'loss': 0.2734, 'grad_norm': 0.6119988564709138, 'learning_rate': 7.918601781331225e-07, 'epoch': 0.82}
82%|████████▏ | 18205/22095 [31:04:34<5:34:26, 5.16s/it] {'loss': 0.3015, 'grad_norm': 0.6360295830543973, 'learning_rate': 7.914644036709202e-07, 'epoch': 0.82}
82%|████████▏ | 18206/22095 [31:04:37<4:59:31, 4.62s/it] {'loss': 0.3613, 'grad_norm': 0.5614432284169272, 'learning_rate': 7.910687196372214e-07, 'epoch': 0.82}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
82%|████████▏ | 18207/22095 [31:04:47<6:51:10, 6.35s/it] {'loss': 0.4515, 'grad_norm': 0.28829720900091843, 'learning_rate': 7.906731260405304e-07, 'epoch': 0.82}
Token indices sequence length is longer than the specified maximum sequence length for this model (48893 > 40960).
Running this sequence through the model will result in indexing errors 82%|████████▏ | 18208/22095 [31:04:52<6:14:09, 5.78s/it] {'loss': 0.2823, 'grad_norm': 0.5525539167791396, 'learning_rate': 7.902776228893444e-07, 'epoch': 0.82} 82%|████████▏ | 18208/22095 [31:04:52<6:14:09, 5.78s/it] 82%|████████▏ | 18209/22095 [31:04:56<5:34:52, 5.17s/it] {'loss': 0.3265, 'grad_norm': 0.6762310852060991, 'learning_rate': 7.898822101921644e-07, 'epoch': 0.82} 82%|████████▏ | 18209/22095 [31:04:56<5:34:52, 5.17s/it] 82%|████████▏ | 18210/22095 [31:04:59<4:58:22, 4.61s/it] {'loss': 0.2474, 'grad_norm': 0.6259275549416197, 'learning_rate': 7.894868879574847e-07, 'epoch': 0.82} 82%|████████▏ | 18210/22095 [31:04:59<4:58:22, 4.61s/it] 82%|████████▏ | 18211/22095 [31:05:03<4:39:27, 4.32s/it] {'loss': 0.2945, 'grad_norm': 0.6182305289073031, 'learning_rate': 7.890916561938006e-07, 'epoch': 0.82} 82%|████████▏ | 18211/22095 [31:05:03<4:39:27, 4.32s/it] 82%|████████▏ | 18212/22095 [31:05:06<4:22:12, 4.05s/it] {'loss': 0.3004, 'grad_norm': 0.634163999330956, 'learning_rate': 7.886965149096044e-07, 'epoch': 0.82} 82%|████████▏ | 18212/22095 [31:05:06<4:22:12, 4.05s/it] 82%|████████▏ | 18213/22095 [31:05:09<4:11:41, 3.89s/it] {'loss': 0.2223, 'grad_norm': 0.613473307222863, 'learning_rate': 7.883014641133846e-07, 'epoch': 0.82} 82%|████████▏ | 18213/22095 [31:05:09<4:11:41, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 82%|████████▏ | 18214/22095 [31:05:13<4:07:22, 3.82s/it] {'loss': 0.2979, 'grad_norm': 0.590548117790914, 'learning_rate': 7.879065038136314e-07, 'epoch': 0.82} 82%|████████▏ | 18214/22095 [31:05:13<4:07:22, 3.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72712 > 40960). 
Running this sequence through the model will result in indexing errors 82%|████████▏ | 18215/22095 [31:05:17<3:58:58, 3.70s/it] {'loss': 0.3004, 'grad_norm': 0.6453971910214398, 'learning_rate': 7.875116340188333e-07, 'epoch': 0.82} 82%|████████▏ | 18215/22095 [31:05:17<3:58:58, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18216/22095 [31:05:22<4:28:34, 4.15s/it] {'loss': 0.4756, 'grad_norm': 0.2649864985329331, 'learning_rate': 7.871168547374697e-07, 'epoch': 0.82} 82%|████████▏ | 18216/22095 [31:05:22<4:28:34, 4.15s/it] 82%|████████▏ | 18217/22095 [31:05:25<4:17:57, 3.99s/it] {'loss': 0.2831, 'grad_norm': 0.6405631892725506, 'learning_rate': 7.867221659780267e-07, 'epoch': 0.82} 82%|████████▏ | 18217/22095 [31:05:25<4:17:57, 3.99s/it] 82%|████████▏ | 18218/22095 [31:05:29<4:02:54, 3.76s/it] {'loss': 0.308, 'grad_norm': 0.6321704014564995, 'learning_rate': 7.863275677489851e-07, 'epoch': 0.82} 82%|████████▏ | 18218/22095 [31:05:29<4:02:54, 3.76s/it] 82%|████████▏ | 18219/22095 [31:05:32<3:53:07, 3.61s/it] {'loss': 0.2833, 'grad_norm': 0.6509092342644126, 'learning_rate': 7.859330600588228e-07, 'epoch': 0.82} 82%|████████▏ | 18219/22095 [31:05:32<3:53:07, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47499 > 40960). 
Running this sequence through the model will result in indexing errors 82%|████████▏ | 18220/22095 [31:05:36<3:57:26, 3.68s/it] {'loss': 0.2995, 'grad_norm': 0.6273536804779368, 'learning_rate': 7.85538642916015e-07, 'epoch': 0.82} 82%|████████▏ | 18220/22095 [31:05:36<3:57:26, 3.68s/it] 82%|████████▏ | 18221/22095 [31:05:39<3:55:20, 3.65s/it] {'loss': 0.2725, 'grad_norm': 0.587258924680402, 'learning_rate': 7.851443163290385e-07, 'epoch': 0.82} 82%|████████▏ | 18221/22095 [31:05:39<3:55:20, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 82%|████████▏ | 18222/22095 [31:05:43<3:51:32, 3.59s/it] {'loss': 0.2936, 'grad_norm': 0.6799857601085152, 'learning_rate': 7.847500803063668e-07, 'epoch': 0.82} 82%|████████▏ | 18222/22095 [31:05:43<3:51:32, 3.59s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18223/22095 [31:05:52<5:47:52, 5.39s/it] {'loss': 0.458, 'grad_norm': 0.2675036031094028, 'learning_rate': 7.843559348564694e-07, 'epoch': 0.82} 82%|████████▏ | 18223/22095 [31:05:52<5:47:52, 5.39s/it] 82%|████████▏ | 18224/22095 [31:05:56<5:06:24, 4.75s/it] {'loss': 0.3493, 'grad_norm': 0.6276520182225253, 'learning_rate': 7.839618799878146e-07, 'epoch': 0.82} 82%|████████▏ | 18224/22095 [31:05:56<5:06:24, 4.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44305 > 40960). 
Running this sequence through the model will result in indexing errors 82%|████████▏ | 18225/22095 [31:05:59<4:32:01, 4.22s/it] {'loss': 0.3136, 'grad_norm': 0.6380138615809682, 'learning_rate': 7.835679157088716e-07, 'epoch': 0.82} 82%|████████▏ | 18225/22095 [31:05:59<4:32:01, 4.22s/it] 82%|████████▏ | 18226/22095 [31:06:02<4:16:30, 3.98s/it] {'loss': 0.2854, 'grad_norm': 0.5879257908734836, 'learning_rate': 7.831740420281031e-07, 'epoch': 0.82} 82%|████████▏ | 18226/22095 [31:06:02<4:16:30, 3.98s/it] 82%|████████▏ | 18227/22095 [31:06:06<4:10:19, 3.88s/it] {'loss': 0.2752, 'grad_norm': 0.6046994935357449, 'learning_rate': 7.827802589539751e-07, 'epoch': 0.82} 82%|████████▏ | 18227/22095 [31:06:06<4:10:19, 3.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 82%|████████▏ | 18228/22095 [31:06:12<5:03:57, 4.72s/it] {'loss': 0.4613, 'grad_norm': 0.2677090660018164, 'learning_rate': 7.823865664949464e-07, 'epoch': 0.82} 82%|████████▏ | 18228/22095 [31:06:12<5:03:57, 4.72s/it] 83%|████████▎ | 18229/22095 [31:06:15<4:34:37, 4.26s/it] {'loss': 0.2667, 'grad_norm': 0.672067112995326, 'learning_rate': 7.819929646594765e-07, 'epoch': 0.83} 83%|████████▎ | 18229/22095 [31:06:15<4:34:37, 4.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48661 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52882 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69100 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18230/22095 [31:06:19<4:11:21, 3.90s/it] {'loss': 0.2792, 'grad_norm': 0.6228260122442586, 'learning_rate': 7.815994534560228e-07, 'epoch': 0.83} 83%|████████▎ | 18230/22095 [31:06:19<4:11:21, 3.90s/it] 83%|████████▎ | 18231/22095 [31:06:21<3:52:36, 3.61s/it] {'loss': 0.3122, 'grad_norm': 0.6467169822046223, 'learning_rate': 7.812060328930421e-07, 'epoch': 0.83} 83%|████████▎ | 18231/22095 [31:06:21<3:52:36, 3.61s/it] 83%|████████▎ | 18232/22095 [31:06:25<3:56:00, 3.67s/it] {'loss': 0.2974, 'grad_norm': 0.5694912430159714, 'learning_rate': 7.808127029789869e-07, 'epoch': 0.83} 83%|████████▎ | 18232/22095 [31:06:25<3:56:00, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18233/22095 [31:06:33<5:22:51, 5.02s/it] {'loss': 0.4857, 'grad_norm': 0.2691041012160917, 'learning_rate': 7.804194637223073e-07, 'epoch': 0.83} 83%|████████▎ | 18233/22095 [31:06:33<5:22:51, 5.02s/it] 83%|████████▎ | 18234/22095 [31:06:37<4:48:01, 4.48s/it] {'loss': 0.2878, 'grad_norm': 0.6448056928408912, 'learning_rate': 7.800263151314536e-07, 'epoch': 0.83} 83%|████████▎ | 18234/22095 [31:06:37<4:48:01, 4.48s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [462, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8501464 in VC:s3://internvl-moe-sft-data/. Exception: Image size [462, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36277, 'image': 'vrdu_texteq/astro-ph.CO/06e607fd-af0c-4b26-8b8e-56f02df535dd.png', 'image_wh': [[462, 23]], 'conversations': [{'from': 'human', 'value': '\n请分析图像并提取其中可以辨认的任何文本信息。'}, {'from': 'gpt', 'value': 'we need $F$ and its derivative at $z=0$.'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18235/22095 [31:06:40<4:20:04, 4.04s/it] {'loss': 0.2713, 'grad_norm': 0.5742698318047919, 'learning_rate': 7.796332572148752e-07, 'epoch': 0.83} 83%|████████▎ | 18235/22095 [31:06:40<4:20:04, 4.04s/it] 83%|████████▎ | 18236/22095 [31:06:43<3:57:06, 3.69s/it] {'loss': 0.2851, 'grad_norm': 0.6351411007375356, 'learning_rate': 7.792402899810164e-07, 'epoch': 0.83} 83%|████████▎ | 18236/22095 [31:06:43<3:57:06, 3.69s/it] 83%|████████▎ | 18237/22095 [31:06:46<3:46:47, 3.53s/it] {'loss': 0.2896, 'grad_norm': 0.6244020735748892, 'learning_rate': 7.788474134383195e-07, 'epoch': 0.83} 83%|████████▎ | 18237/22095 [31:06:46<3:46:47, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18238/22095 [31:06:53<5:07:04, 4.78s/it] {'loss': 0.4723, 'grad_norm': 0.27950882607118965, 'learning_rate': 7.784546275952281e-07, 'epoch': 0.83} 83%|████████▎ | 18238/22095 [31:06:53<5:07:04, 4.78s/it] 83%|████████▎ | 18239/22095 [31:06:58<4:56:05, 4.61s/it] {'loss': 0.2944, 'grad_norm': 0.622105002869538, 'learning_rate': 7.780619324601807e-07, 'epoch': 0.83} 83%|████████▎ | 18239/22095 [31:06:58<4:56:05, 4.61s/it] 83%|████████▎ | 18240/22095 [31:07:01<4:24:24, 4.12s/it] {'loss': 0.2871, 'grad_norm': 0.6414907773269223, 'learning_rate': 7.776693280416164e-07, 'epoch': 0.83} 83%|████████▎ | 18240/22095 [31:07:01<4:24:24, 4.12s/it] 83%|████████▎ | 18241/22095 [31:07:04<4:17:28, 4.01s/it] {'loss': 0.3158, 'grad_norm': 0.9654656942328771, 'learning_rate': 7.772768143479703e-07, 'epoch': 0.83} 83%|████████▎ | 18241/22095 [31:07:04<4:17:28, 4.01s/it] 
83%|████████▎ | 18242/22095 [31:07:07<4:00:58, 3.75s/it] {'loss': 0.3062, 'grad_norm': 0.7863181674165229, 'learning_rate': 7.768843913876756e-07, 'epoch': 0.83} 83%|████████▎ | 18242/22095 [31:07:07<4:00:58, 3.75s/it] 83%|████████▎ | 18243/22095 [31:07:11<3:56:01, 3.68s/it] {'loss': 0.2808, 'grad_norm': 0.5856414481972946, 'learning_rate': 7.76492059169165e-07, 'epoch': 0.83} 83%|████████▎ | 18243/22095 [31:07:11<3:56:01, 3.68s/it] 83%|████████▎ | 18244/22095 [31:07:14<3:41:51, 3.46s/it] {'loss': 0.2964, 'grad_norm': 0.7651089549539092, 'learning_rate': 7.760998177008694e-07, 'epoch': 0.83} 83%|████████▎ | 18244/22095 [31:07:14<3:41:51, 3.46s/it] 83%|████████▎ | 18245/22095 [31:07:17<3:34:58, 3.35s/it] {'loss': 0.3085, 'grad_norm': 0.987347191861977, 'learning_rate': 7.757076669912162e-07, 'epoch': 0.83} 83%|████████▎ | 18245/22095 [31:07:17<3:34:58, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18246/22095 [31:07:21<3:40:05, 3.43s/it] {'loss': 0.3031, 'grad_norm': 0.6591369804984707, 'learning_rate': 7.7531560704863e-07, 'epoch': 0.83} 83%|████████▎ | 18246/22095 [31:07:21<3:40:05, 3.43s/it] 83%|████████▎ | 18247/22095 [31:07:24<3:31:35, 3.30s/it] {'loss': 0.2856, 'grad_norm': 0.5720351463462361, 'learning_rate': 7.749236378815372e-07, 'epoch': 0.83} 83%|████████▎ | 18247/22095 [31:07:24<3:31:35, 3.30s/it] 83%|████████▎ | 18248/22095 [31:07:27<3:27:37, 3.24s/it] {'loss': 0.3181, 'grad_norm': 0.6792636477123376, 'learning_rate': 7.745317594983598e-07, 'epoch': 0.83} 83%|████████▎ | 18248/22095 [31:07:27<3:27:37, 3.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18249/22095 [31:07:36<5:29:21, 5.14s/it] {'loss': 0.4846, 'grad_norm': 0.2756247302938806, 'learning_rate': 7.741399719075154e-07, 'epoch': 0.83} 83%|████████▎ | 18249/22095 [31:07:36<5:29:21, 5.14s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (82377 > 40960). Running this sequence through the model will result in indexing errors 83%|████████▎ | 18250/22095 [31:07:40<4:56:10, 4.62s/it] {'loss': 0.3261, 'grad_norm': 0.6683871282988517, 'learning_rate': 7.737482751174247e-07, 'epoch': 0.83} 83%|████████▎ | 18250/22095 [31:07:40<4:56:10, 4.62s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18251/22095 [31:07:43<4:35:43, 4.30s/it] {'loss': 0.2598, 'grad_norm': 0.5851850370664479, 'learning_rate': 7.733566691365047e-07, 'epoch': 0.83} 83%|████████▎ | 18251/22095 [31:07:43<4:35:43, 4.30s/it] 83%|████████▎ | 18252/22095 [31:07:47<4:23:30, 4.11s/it] {'loss': 0.3272, 'grad_norm': 0.6077307922996963, 'learning_rate': 7.729651539731686e-07, 'epoch': 0.83} 83%|████████▎ | 18252/22095 [31:07:47<4:23:30, 4.11s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18253/22095 [31:07:51<4:12:46, 3.95s/it] {'loss': 0.3178, 'grad_norm': 0.6496080829350848, 'learning_rate': 7.725737296358283e-07, 'epoch': 0.83} 83%|████████▎ | 18253/22095 [31:07:51<4:12:46, 3.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43337 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77941 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42901 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42594 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18254/22095 [31:07:54<3:59:00, 3.73s/it] {'loss': 0.3069, 'grad_norm': 0.6895525446763424, 'learning_rate': 7.721823961328955e-07, 'epoch': 0.83} 83%|████████▎ | 18254/22095 [31:07:54<3:59:00, 3.73s/it] 83%|████████▎ | 18255/22095 [31:07:57<3:45:34, 3.52s/it] {'loss': 0.3144, 'grad_norm': 0.6007041300072309, 'learning_rate': 7.717911534727778e-07, 'epoch': 0.83} 83%|████████▎ | 18255/22095 [31:07:57<3:45:34, 3.52s/it] 83%|████████▎ | 18256/22095 [31:08:00<3:32:41, 3.32s/it] {'loss': 0.3323, 'grad_norm': 0.6795520769191841, 'learning_rate': 7.714000016638829e-07, 'epoch': 0.83} 83%|████████▎ | 18256/22095 [31:08:00<3:32:41, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (106439 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46551 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41909 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44612 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18257/22095 [31:08:02<3:23:05, 3.17s/it] {'loss': 0.2825, 'grad_norm': 0.5915455209682717, 'learning_rate': 7.710089407146154e-07, 'epoch': 0.83} 83%|████████▎ | 18257/22095 [31:08:02<3:23:05, 3.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18258/22095 [31:08:07<3:57:46, 3.72s/it] {'loss': 0.4318, 'grad_norm': 0.24806576764877158, 'learning_rate': 7.706179706333755e-07, 'epoch': 0.83} 83%|████████▎ | 18258/22095 [31:08:07<3:57:46, 3.72s/it] 83%|████████▎ | 18259/22095 [31:08:12<4:05:37, 3.84s/it] {'loss': 0.2767, 'grad_norm': 0.6538030184224267, 'learning_rate': 7.702270914285664e-07, 'epoch': 0.83} 83%|████████▎ | 18259/22095 [31:08:12<4:05:37, 3.84s/it] 83%|████████▎ | 18260/22095 [31:08:15<4:05:28, 3.84s/it] {'loss': 0.2771, 'grad_norm': 0.6195635754908934, 'learning_rate': 7.698363031085871e-07, 'epoch': 0.83} 83%|████████▎ | 18260/22095 [31:08:15<4:05:28, 3.84s/it] 83%|████████▎ | 18261/22095 [31:08:19<3:57:51, 3.72s/it] {'loss': 0.2988, 'grad_norm': 0.6598416768005659, 'learning_rate': 7.694456056818339e-07, 'epoch': 0.83} 83%|████████▎ | 18261/22095 [31:08:19<3:57:51, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18262/22095 [31:08:28<5:48:22, 5.45s/it] {'loss': 0.4652, 'grad_norm': 0.28054173240776076, 'learning_rate': 7.690549991567004e-07, 'epoch': 0.83} 83%|████████▎ | 18262/22095 [31:08:28<5:48:22, 5.45s/it] 83%|████████▎ | 18263/22095 [31:08:32<5:13:55, 4.92s/it] {'loss': 0.2588, 'grad_norm': 0.6181440261644516, 'learning_rate': 7.686644835415808e-07, 'epoch': 0.83} 83%|████████▎ | 18263/22095 [31:08:32<5:13:55, 4.92s/it] 83%|████████▎ | 18264/22095 [31:08:36<4:54:01, 4.60s/it] {'loss': 0.3479, 'grad_norm': 0.6146368913346695, 'learning_rate': 7.682740588448667e-07, 'epoch': 0.83} 83%|████████▎ | 18264/22095 [31:08:36<4:54:01, 4.60s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18265/22095 [31:08:42<5:29:14, 5.16s/it] {'loss': 0.4507, 'grad_norm': 0.25630777827620876, 'learning_rate': 7.67883725074946e-07, 'epoch': 0.83} 83%|████████▎ | 18265/22095 [31:08:42<5:29:14, 5.16s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8472779 in VC:s3://internvl-moe-sft-data/. Exception: Image size [75, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29052, 'image': 'vrdu_texteq/astro-ph.CO/e4835684-9a33-4f68-a765-1d2eac701469.png', 'image_wh': [[75, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': '\\ and $z$.'}]} 83%|████████▎ | 18266/22095 [31:08:46<4:54:51, 4.62s/it] {'loss': 0.3121, 'grad_norm': 0.5530316672521514, 'learning_rate': 7.674934822402052e-07, 'epoch': 0.83} 83%|████████▎ | 18266/22095 [31:08:46<4:54:51, 4.62s/it] 83%|████████▎ | 18267/22095 [31:08:50<4:40:40, 4.40s/it] {'loss': 0.3163, 'grad_norm': 1.172841472236167, 'learning_rate': 7.671033303490321e-07, 'epoch': 0.83} 83%|████████▎ | 18267/22095 [31:08:50<4:40:40, 4.40s/it] 83%|████████▎ | 18268/22095 [31:08:53<4:28:29, 4.21s/it] {'loss': 0.3352, 'grad_norm': 0.5813172977654625, 'learning_rate': 7.667132694098061e-07, 'epoch': 0.83} 83%|████████▎ | 18268/22095 [31:08:53<4:28:29, 4.21s/it] 83%|████████▎ | 18269/22095 [31:08:56<4:05:15, 3.85s/it] {'loss': 0.281, 'grad_norm': 0.678137676491045, 'learning_rate': 7.663232994309122e-07, 'epoch': 0.83} 83%|████████▎ | 18269/22095 [31:08:56<4:05:15, 3.85s/it] 83%|████████▎ | 18270/22095 [31:09:00<4:06:35, 3.87s/it] {'loss': 0.3451, 'grad_norm': 0.6224508351144359, 'learning_rate': 7.659334204207275e-07, 'epoch': 0.83} 83%|████████▎ | 18270/22095 [31:09:00<4:06:35, 3.87s/it] 83%|████████▎ | 18271/22095 [31:09:03<3:51:24, 3.63s/it] {'loss': 0.2906, 'grad_norm': 0.6507117854871701, 'learning_rate': 7.655436323876286e-07, 'epoch': 0.83} 83%|████████▎ | 18271/22095 [31:09:03<3:51:24, 3.63s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (92139516 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn(
 83%|████████▎ | 18272/22095 [31:09:07<3:47:50, 3.58s/it] {'loss': 0.3121, 'grad_norm': 0.6180031129751423, 'learning_rate': 7.651539353399917e-07, 'epoch': 0.83} 83%|████████▎ | 18272/22095 [31:09:07<3:47:50, 3.58s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881049 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4202, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nA. 5cm\nB. 无法确定\nC. 1cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
 83%|████████▎ | 18273/22095 [31:09:10<3:37:47, 3.42s/it] {'loss': 0.3068, 'grad_norm': 0.7227123766917248, 'learning_rate': 7.647643292861917e-07, 'epoch': 0.83} 83%|████████▎ | 18273/22095 [31:09:10<3:37:47, 3.42s/it] 83%|████████▎ | 18274/22095 [31:09:14<3:42:13, 3.49s/it] {'loss': 0.2751, 'grad_norm': 0.6440853563337421, 'learning_rate': 7.643748142345985e-07, 'epoch': 0.83} 83%|████████▎ | 18274/22095 [31:09:14<3:42:13, 3.49s/it] 83%|████████▎ | 18275/22095 [31:09:17<3:32:43, 3.34s/it] {'loss': 0.2505, 'grad_norm': 0.6618308941143313, 'learning_rate': 7.639853901935812e-07, 'epoch': 0.83} 83%|████████▎ | 18275/22095 [31:09:17<3:32:43, 3.34s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18276/22095 [31:09:19<3:22:26, 3.18s/it] {'loss': 0.2894, 'grad_norm': 0.6067429456867909, 'learning_rate': 7.635960571715073e-07, 'epoch': 0.83} 83%|████████▎ | 18276/22095 [31:09:19<3:22:26, 3.18s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18277/22095 [31:09:23<3:34:20, 3.37s/it] {'loss': 0.2784, 'grad_norm': 0.6330449486181741, 'learning_rate': 7.632068151767447e-07, 'epoch': 0.83} 83%|████████▎ | 18277/22095 [31:09:23<3:34:20, 3.37s/it] 83%|████████▎ | 18278/22095 [31:09:28<3:57:03, 3.73s/it] {'loss': 0.2569, 'grad_norm': 0.6117914158146655, 'learning_rate': 7.628176642176549e-07, 'epoch': 0.83} 83%|████████▎ | 18278/22095 [31:09:28<3:57:03, 3.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (118338 > 40960).
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18279/22095 [31:09:32<4:12:44, 3.97s/it] {'loss': 0.2808, 'grad_norm': 0.5901946253104174, 'learning_rate': 7.624286043025991e-07, 'epoch': 0.83} 83%|████████▎ | 18279/22095 [31:09:32<4:12:44, 3.97s/it] 83%|████████▎ | 18280/22095 [31:09:36<4:14:16, 4.00s/it] {'loss': 0.3258, 'grad_norm': 0.6189254349398654, 'learning_rate': 7.62039635439939e-07, 'epoch': 0.83} 83%|████████▎ | 18280/22095 [31:09:36<4:14:16, 4.00s/it] 83%|████████▎ | 18281/22095 [31:09:41<4:22:17, 4.13s/it] {'loss': 0.2848, 'grad_norm': 0.5864574378011577, 'learning_rate': 7.616507576380311e-07, 'epoch': 0.83} 83%|████████▎ | 18281/22095 [31:09:41<4:22:17, 4.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67789 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73080 > 40960). Running this sequence through the model will result in indexing errors 83%|████████▎ | 18282/22095 [31:09:44<4:09:48, 3.93s/it] {'loss': 0.3302, 'grad_norm': 0.6289188315276548, 'learning_rate': 7.612619709052305e-07, 'epoch': 0.83} 83%|████████▎ | 18282/22095 [31:09:44<4:09:48, 3.93s/it] 83%|████████▎ | 18283/22095 [31:09:48<4:00:54, 3.79s/it] {'loss': 0.2773, 'grad_norm': 0.6599194063754822, 'learning_rate': 7.608732752498926e-07, 'epoch': 0.83} 83%|████████▎ | 18283/22095 [31:09:48<4:00:54, 3.79s/it] 83%|████████▎ | 18284/22095 [31:09:52<4:06:07, 3.87s/it] {'loss': 0.306, 'grad_norm': 0.7143180101946018, 'learning_rate': 7.604846706803676e-07, 'epoch': 0.83} 83%|████████▎ | 18284/22095 [31:09:52<4:06:07, 3.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51906 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41197 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55365 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45224 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53353 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70258 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58254 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18285/22095 [31:09:55<3:49:01, 3.61s/it] {'loss': 0.2777, 'grad_norm': 0.5686795089354336, 'learning_rate': 7.600961572050076e-07, 'epoch': 0.83} 83%|████████▎ | 18285/22095 [31:09:55<3:49:01, 3.61s/it] 83%|████████▎ | 18286/22095 [31:09:58<3:50:25, 3.63s/it] {'loss': 0.3026, 'grad_norm': 0.5544043684290126, 'learning_rate': 7.59707734832159e-07, 'epoch': 0.83} 83%|████████▎ | 18286/22095 [31:09:58<3:50:25, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18287/22095 [31:10:05<4:49:50, 4.57s/it] {'loss': 0.4679, 'grad_norm': 0.26360211053535226, 'learning_rate': 7.593194035701667e-07, 'epoch': 0.83} 83%|████████▎ | 18287/22095 [31:10:05<4:49:50, 4.57s/it] 83%|████████▎ | 18288/22095 [31:10:09<4:38:19, 4.39s/it] {'loss': 0.3124, 'grad_norm': 0.6223757948396181, 'learning_rate': 7.589311634273766e-07, 'epoch': 0.83} 83%|████████▎ | 18288/22095 [31:10:09<4:38:19, 4.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41581 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18289/22095 [31:10:12<4:16:57, 4.05s/it] {'loss': 0.2925, 'grad_norm': 0.7011240909472778, 'learning_rate': 7.585430144121319e-07, 'epoch': 0.83} 83%|████████▎ | 18289/22095 [31:10:12<4:16:57, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18290/22095 [31:10:23<6:14:06, 5.90s/it] {'loss': 0.4616, 'grad_norm': 0.2836839740412367, 'learning_rate': 7.581549565327706e-07, 'epoch': 0.83} 83%|████████▎ | 18290/22095 [31:10:23<6:14:06, 5.90s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8933826 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 56979, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为10cm长AB段顶点,D、E分别为AC、CB中点,长度为()\nA. 6.5cm\nB. 5cm\nC. 5.5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 83%|████████▎ | 18291/22095 [31:10:26<5:26:38, 5.15s/it] {'loss': 0.2795, 'grad_norm': 0.5970258776833657, 'learning_rate': 7.577669897976303e-07, 'epoch': 0.83} 83%|████████▎ | 18291/22095 [31:10:26<5:26:38, 5.15s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18292/22095 [31:10:30<4:58:44, 4.71s/it] {'loss': 0.2582, 'grad_norm': 0.7240878255723272, 'learning_rate': 7.573791142150488e-07, 'epoch': 0.83} 83%|████████▎ | 18292/22095 [31:10:30<4:58:44, 4.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18293/22095 [31:10:40<6:48:08, 6.44s/it] {'loss': 0.4656, 'grad_norm': 0.2803187872537055, 'learning_rate': 7.569913297933606e-07, 'epoch': 0.83} 83%|████████▎ | 18293/22095 [31:10:40<6:48:08, 6.44s/it] 83%|████████▎ | 18294/22095 [31:10:44<5:55:37, 5.61s/it] {'loss': 0.3276, 'grad_norm': 0.6074138860738469, 'learning_rate': 7.566036365408974e-07, 'epoch': 0.83} 83%|████████▎ | 18294/22095 [31:10:44<5:55:37, 5.61s/it] 83%|████████▎ | 18295/22095 [31:10:47<5:10:54, 4.91s/it] {'loss': 0.2956, 'grad_norm': 0.6355019438669003, 'learning_rate': 7.562160344659886e-07, 'epoch': 0.83} 83%|████████▎ | 18295/22095 [31:10:47<5:10:54, 4.91s/it] 83%|████████▎ | 18296/22095 [31:10:50<4:33:03, 4.31s/it] {'loss': 0.3196, 'grad_norm': 0.6349727647123978, 'learning_rate': 7.558285235769647e-07, 'epoch': 0.83} 83%|████████▎ | 18296/22095 [31:10:50<4:33:03, 4.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18297/22095 [31:11:00<6:26:45, 6.11s/it] {'loss': 0.4601, 'grad_norm': 0.24541544171528995, 'learning_rate': 7.55441103882149e-07, 'epoch': 0.83} 83%|████████▎ | 18297/22095 [31:11:00<6:26:45, 6.11s/it] 83%|████████▎ | 18298/22095 [31:11:04<5:47:45, 5.50s/it] {'loss': 0.2831, 'grad_norm': 0.5882825549019338,
'learning_rate': 7.550537753898696e-07, 'epoch': 0.83} 83%|████████▎ | 18298/22095 [31:11:04<5:47:45, 5.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48436 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65204 > 40960). Running this sequence through the model will result in indexing errors 83%|████████▎ | 18299/22095 [31:11:08<5:10:06, 4.90s/it] {'loss': 0.2905, 'grad_norm': 0.6734086488781459, 'learning_rate': 7.546665381084467e-07, 'epoch': 0.83} 83%|████████▎ | 18299/22095 [31:11:08<5:10:06, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69976 > 40960). Running this sequence through the model will result in indexing errors 83%|████████▎ | 18300/22095 [31:11:12<4:53:45, 4.64s/it] {'loss': 0.2904, 'grad_norm': 0.6355215550001918, 'learning_rate': 7.542793920462005e-07, 'epoch': 0.83} 83%|████████▎ | 18300/22095 [31:11:12<4:53:45, 4.64s/it] 83%|████████▎ | 18301/22095 [31:11:15<4:28:43, 4.25s/it] {'loss': 0.3429, 'grad_norm': 0.6804674889052729, 'learning_rate': 7.538923372114504e-07, 'epoch': 0.83} 83%|████████▎ | 18301/22095 [31:11:15<4:28:43, 4.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 83%|████████▎ | 18302/22095 [31:11:18<4:06:37, 3.90s/it] {'loss': 0.296, 'grad_norm': 0.6514486631880623, 'learning_rate': 7.535053736125142e-07, 'epoch': 0.83} 83%|████████▎ | 18302/22095 [31:11:18<4:06:37, 3.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18303/22095 [31:11:28<5:47:45, 5.50s/it] {'loss': 0.4688, 'grad_norm': 0.2913101001381168, 'learning_rate': 7.531185012577052e-07, 'epoch': 0.83} 83%|████████▎ | 18303/22095 [31:11:28<5:47:45, 5.50s/it] 83%|████████▎ | 18304/22095 [31:11:31<5:13:32, 
4.96s/it] {'loss': 0.2999, 'grad_norm': 0.6438046832602575, 'learning_rate': 7.527317201553358e-07, 'epoch': 0.83} 83%|████████▎ | 18304/22095 [31:11:31<5:13:32, 4.96s/it] 83%|████████▎ | 18305/22095 [31:11:34<4:36:14, 4.37s/it] {'loss': 0.2919, 'grad_norm': 0.6767431632880623, 'learning_rate': 7.523450303137164e-07, 'epoch': 0.83} 83%|████████▎ | 18305/22095 [31:11:34<4:36:14, 4.37s/it] 83%|████████▎ | 18306/22095 [31:11:38<4:20:09, 4.12s/it] {'loss': 0.2844, 'grad_norm': 0.5343222301415081, 'learning_rate': 7.519584317411582e-07, 'epoch': 0.83} 83%|████████▎ | 18306/22095 [31:11:38<4:20:09, 4.12s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 83%|████████▎ | 18307/22095 [31:11:45<5:26:01, 5.16s/it] {'loss': 0.4714, 'grad_norm': 0.2829817591705852, 'learning_rate': 7.515719244459668e-07, 'epoch': 0.83} 83%|████████▎ | 18307/22095 [31:11:45<5:26:01, 5.16s/it] 83%|████████▎ | 18308/22095 [31:11:49<4:59:06, 4.74s/it] {'loss': 0.3009, 'grad_norm': 0.6824523978433078, 'learning_rate': 7.51185508436445e-07, 'epoch': 0.83} 83%|████████▎ | 18308/22095 [31:11:49<4:59:06, 4.74s/it] 83%|████████▎ | 18309/22095 [31:11:53<4:33:52, 4.34s/it] {'loss': 0.2714, 'grad_norm': 0.5993978929754125, 'learning_rate': 7.507991837208989e-07, 'epoch': 0.83} 83%|████████▎ | 18309/22095 [31:11:53<4:33:52, 4.34s/it] 83%|████████▎ | 18310/22095 [31:11:56<4:23:50, 4.18s/it] {'loss': 0.2783, 'grad_norm': 0.5934461993346714, 'learning_rate': 7.504129503076263e-07, 'epoch': 0.83} 83%|████████▎ | 18310/22095 [31:11:56<4:23:50, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57782 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138172 > 40960). 
Running this sequence through the model will result in indexing errors 83%|████████▎ | 18311/22095 [31:11:59<4:02:41, 3.85s/it] {'loss': 0.3129, 'grad_norm': 0.6268985757761065, 'learning_rate': 7.500268082049294e-07, 'epoch': 0.83} 83%|████████▎ | 18311/22095 [31:11:59<4:02:41, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89792 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97636 > 40960). Running this sequence through the model will result in indexing errors 83%|████████▎ | 18312/22095 [31:12:03<3:49:58, 3.65s/it] {'loss': 0.2818, 'grad_norm': 0.630009354553089, 'learning_rate': 7.496407574211034e-07, 'epoch': 0.83} 83%|████████▎ | 18312/22095 [31:12:03<3:49:58, 3.65s/it] 83%|████████▎ | 18313/22095 [31:12:07<4:06:11, 3.91s/it] {'loss': 0.2825, 'grad_norm': 0.5945913692915438, 'learning_rate': 7.492547979644421e-07, 'epoch': 0.83} 83%|████████▎ | 18313/22095 [31:12:07<4:06:11, 3.91s/it] 83%|████████▎ | 18314/22095 [31:12:10<3:47:42, 3.61s/it] {'loss': 0.2916, 'grad_norm': 0.6051115804362879, 'learning_rate': 7.488689298432406e-07, 'epoch': 0.83} 83%|████████▎ | 18314/22095 [31:12:10<3:47:42, 3.61s/it] 83%|████████▎ | 18315/22095 [31:12:14<3:48:15, 3.62s/it] {'loss': 0.3228, 'grad_norm': 0.6861148878623919, 'learning_rate': 7.484831530657916e-07, 'epoch': 0.83} 83%|████████▎ | 18315/22095 [31:12:14<3:48:15, 3.62s/it] 83%|████████▎ | 18316/22095 [31:12:17<3:37:22, 3.45s/it] {'loss': 0.2573, 'grad_norm': 2.0859298361444627, 'learning_rate': 7.480974676403796e-07, 'epoch': 0.83} 83%|████████▎ | 18316/22095 [31:12:17<3:37:22, 3.45s/it] 83%|████████▎ | 18317/22095 [31:12:20<3:30:38, 3.35s/it] 
83%|████████▎ | 18317/22095 [31:12:20<3:30:38, 3.35s/it] {'loss': 0.3188, 'grad_norm': 0.6780884526798012, 'learning_rate': 7.477118735752942e-07, 'epoch': 0.83}
83%|████████▎ | 18318/22095 [31:12:23<3:34:05, 3.40s/it] {'loss': 0.2828, 'grad_norm': 0.6263272543005641, 'learning_rate': 7.47326370878822e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18319/22095 [31:12:34<5:48:17, 5.53s/it] {'loss': 0.4676, 'grad_norm': 0.27210140659531634, 'learning_rate': 7.469409595592453e-07, 'epoch': 0.83}
83%|████████▎ | 18320/22095 [31:12:44<7:06:28, 6.78s/it] {'loss': 0.4613, 'grad_norm': 0.27514811354097995, 'learning_rate': 7.465556396248436e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18321/22095 [31:12:47<5:57:04, 5.68s/it] {'loss': 0.2831, 'grad_norm': 0.6326532116835614, 'learning_rate': 7.461704110838974e-07, 'epoch': 0.83}
83%|████████▎ | 18322/22095 [31:12:51<5:22:08, 5.12s/it] {'loss': 0.2694, 'grad_norm': 0.5798521668499196, 'learning_rate': 7.457852739446864e-07, 'epoch': 0.83}
83%|████████▎ | 18323/22095 [31:12:54<4:41:04, 4.47s/it] {'loss': 0.3344, 'grad_norm': 0.6280232175895027, 'learning_rate': 7.454002282154838e-07, 'epoch': 0.83}
83%|████████▎ | 18324/22095 [31:12:56<4:08:20, 3.95s/it] {'loss': 0.2764, 'grad_norm': 0.6556717243167894, 'learning_rate': 7.450152739045618e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (50344 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42274 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58600 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73368 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18325/22095 [31:12:59<3:53:51, 3.72s/it] {'loss': 0.2853, 'grad_norm': 0.6230288274977733, 'learning_rate': 7.446304110201947e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (57036 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18326/22095 [31:13:03<3:46:57, 3.61s/it] {'loss': 0.3202, 'grad_norm': 1.2027587348251012, 'learning_rate': 7.442456395706493e-07, 'epoch': 0.83}
83%|████████▎ | 18327/22095 [31:13:06<3:39:41, 3.50s/it] {'loss': 0.273, 'grad_norm': 0.5805160457644685, 'learning_rate': 7.43860959564196e-07, 'epoch': 0.83}
83%|████████▎ | 18328/22095 [31:13:09<3:29:20, 3.33s/it] {'loss': 0.2808, 'grad_norm': 0.6256624824026773, 'learning_rate': 7.434763710090991e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18329/22095 [31:13:16<4:32:18, 4.34s/it] {'loss': 0.4638, 'grad_norm': 0.2540345517953192, 'learning_rate': 7.430918739136206e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18330/22095 [31:13:19<4:10:26, 3.99s/it] {'loss': 0.3041, 'grad_norm': 0.627344755177281, 'learning_rate': 7.427074682860242e-07, 'epoch': 0.83}
83%|████████▎ | 18331/22095 [31:13:22<3:56:07, 3.76s/it] {'loss': 0.3091, 'grad_norm': 0.5883979495725812, 'learning_rate': 7.423231541345694e-07, 'epoch': 0.83}
83%|████████▎ | 18332/22095 [31:13:26<3:54:53, 3.75s/it] {'loss': 0.2699, 'grad_norm': 0.6278410167294457, 'learning_rate': 7.41938931467514e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18333/22095 [31:13:35<5:44:33, 5.50s/it] {'loss': 0.4733, 'grad_norm': 0.2937449638440562, 'learning_rate': 7.415548002931122e-07, 'epoch': 0.83}
83%|████████▎ | 18334/22095 [31:13:42<6:11:50, 5.93s/it] {'loss': 0.4704, 'grad_norm': 0.2606301138768154, 'learning_rate': 7.411707606196189e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18335/22095 [31:13:47<5:41:40, 5.45s/it] {'loss': 0.2954, 'grad_norm': 0.6100340455285485, 'learning_rate': 7.40786812455287e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18336/22095 [31:13:53<5:53:47, 5.65s/it] {'loss': 0.4779, 'grad_norm': 0.2608413139470748, 'learning_rate': 7.404029558083653e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18337/22095 [31:13:56<5:06:46, 4.90s/it] {'loss': 0.2791, 'grad_norm': 0.6010332039258852, 'learning_rate': 7.400191906871007e-07, 'epoch': 0.83}
83%|████████▎ | 18338/22095 [31:13:59<4:42:27, 4.51s/it] {'loss': 0.2787, 'grad_norm': 0.6136009454379556, 'learning_rate': 7.396355170997411e-07, 'epoch': 0.83}
83%|████████▎ | 18339/22095 [31:14:04<4:35:33, 4.40s/it] {'loss': 0.2675, 'grad_norm': 0.6287725134443892, 'learning_rate': 7.392519350545286e-07, 'epoch': 0.83}
83%|████████▎ | 18340/22095 [31:14:07<4:08:37, 3.97s/it] {'loss': 0.2872, 'grad_norm': 0.6205791387005112, 'learning_rate': 7.388684455970072e-07, 'epoch': 0.83}
83%|████████▎ | 18341/22095 [31:14:10<4:00:39, 3.85s/it] {'loss': 0.289, 'grad_norm': 0.6596207022524252, 'learning_rate': 7.384850456235154e-07, 'epoch': 0.83}
83%|████████▎ | 18342/22095 [31:14:13<3:42:19, 3.55s/it] {'loss': 0.2587, 'grad_norm': 0.6058336742195085, 'learning_rate': 7.38101738254191e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (64283 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43313 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53522 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18343/22095 [31:14:16<3:37:08, 3.47s/it] {'loss': 0.308, 'grad_norm': 0.6265365425373657, 'learning_rate': 7.377185224599709e-07, 'epoch': 0.83}
83%|████████▎ | 18344/22095 [31:14:19<3:29:05, 3.34s/it] {'loss': 0.3419, 'grad_norm': 0.6376954651218508, 'learning_rate': 7.373353982490916e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045966 in VC:s3://multi-modal/UniGeo/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 6\nB. 2\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
83%|████████▎ | 18345/22095 [31:14:23<3:37:18, 3.48s/it] {'loss': 0.2814, 'grad_norm': 0.6332723305238434, 'learning_rate': 7.369523656297805e-07, 'epoch': 0.83}
83%|████████▎ | 18346/22095 [31:14:27<3:46:11, 3.62s/it] {'loss': 0.2584, 'grad_norm': 0.6078735243851932, 'learning_rate': 7.3656942461027e-07, 'epoch': 0.83}
83%|████████▎ | 18347/22095 [31:14:31<3:43:33, 3.58s/it] {'loss': 0.2874, 'grad_norm': 0.619477069126535, 'learning_rate': 7.361865751987879e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930856 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54009, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 6cm\nB. 2cm\nC. 3cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
83%|████████▎ | 18348/22095 [31:14:35<3:53:57, 3.75s/it] {'loss': 0.2428, 'grad_norm': 0.5551233601014256, 'learning_rate': 7.358038174035642e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8410671 in VC:s3://internvl-moe-sft-data/. Exception: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12870, 'image': 'vrdu_table_final_2/astro-ph.CO/4b2b60e7-a29f-41a4-82c3-95f9b909fe7f.png', 'image_wh': [[214, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\small #1 \\today\n\\end{tabular}\n```"}]}
83%|████████▎ | 18349/22095 [31:14:38<3:54:09, 3.75s/it] {'loss': 0.3064, 'grad_norm': 0.6291363041257584, 'learning_rate': 7.354211512328169e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (100910 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18350/22095 [31:14:42<3:50:27, 3.69s/it] {'loss': 0.2716, 'grad_norm': 0.5688927307438404, 'learning_rate': 7.350385766947721e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (118154 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18351/22095 [31:14:52<5:40:02, 5.45s/it] {'loss': 0.4701, 'grad_norm': 0.2701299884443355, 'learning_rate': 7.346560937976499e-07, 'epoch': 0.83}
83%|████████▎ | 18352/22095 [31:14:55<5:02:11, 4.84s/it] {'loss': 0.3139, 'grad_norm': 0.5826385085192366, 'learning_rate': 7.342737025496688e-07, 'epoch': 0.83}
83%|████████▎ | 18353/22095 [31:14:58<4:28:31, 4.31s/it] {'loss': 0.3124, 'grad_norm': 0.5941275237233942, 'learning_rate': 7.338914029590432e-07, 'epoch': 0.83}
83%|████████▎ | 18354/22095 [31:15:01<4:02:05, 3.88s/it] {'loss': 0.2959, 'grad_norm': 0.661803715012073, 'learning_rate': 7.335091950339901e-07, 'epoch': 0.83}
83%|████████▎ | 18355/22095 [31:15:05<3:56:26, 3.79s/it] {'loss': 0.2954, 'grad_norm': 0.5813518898375629, 'learning_rate': 7.3312707878272e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (41601 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107945 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18356/22095 [31:15:08<3:41:04, 3.55s/it] {'loss': 0.3009, 'grad_norm': 0.6580523114531696, 'learning_rate': 7.327450542134457e-07, 'epoch': 0.83}
83%|████████▎ | 18357/22095 [31:15:31<9:47:04, 9.42s/it] {'loss': 0.3442, 'grad_norm': 0.6537994532148272, 'learning_rate': 7.323631213343735e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18358/22095 [31:15:40<9:46:51, 9.42s/it] {'loss': 0.4757, 'grad_norm': 0.26140961085679265, 'learning_rate': 7.319812801537101e-07, 'epoch': 0.83}
83%|████████▎ | 18359/22095 [31:15:45<8:31:06, 8.21s/it] {'loss': 0.4598, 'grad_norm': 0.2807159183875202, 'learning_rate': 7.315995306796608e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18360/22095 [31:15:49<6:55:19, 6.67s/it] {'loss': 0.2972, 'grad_norm': 0.635033300667888, 'learning_rate': 7.312178729204294e-07, 'epoch': 0.83}
83%|████████▎ | 18361/22095 [31:15:57<7:36:33, 7.34s/it] {'loss': 0.4711, 'grad_norm': 0.2736155481440812, 'learning_rate': 7.30836306884215e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045979 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6.4cm'}]}
83%|████████▎ | 18362/22095 [31:16:19<12:09:01, 11.72s/it] {'loss': 0.2794, 'grad_norm': 0.6417947220928917, 'learning_rate': 7.304548325792154e-07, 'epoch': 0.83}
83%|████████▎ | 18363/22095 [31:16:23<9:38:02, 9.29s/it] {'loss': 0.3218, 'grad_norm': 0.5762612450271447, 'learning_rate': 7.300734500136291e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18364/22095 [31:16:30<9:01:27, 8.71s/it] {'loss': 0.457, 'grad_norm': 0.2596337538859704, 'learning_rate': 7.296921591956513e-07, 'epoch': 0.83}
83%|████████▎ | 18365/22095 [31:16:35<7:44:03, 7.46s/it] {'loss': 0.308, 'grad_norm': 0.5954486161827727, 'learning_rate': 7.293109601334735e-07, 'epoch': 0.83}
83%|████████▎ | 18366/22095 [31:16:39<6:36:09, 6.37s/it] {'loss': 0.2823, 'grad_norm': 0.6269519114385219, 'learning_rate': 7.289298528352857e-07, 'epoch': 0.83}
83%|████████▎ | 18367/22095 [31:16:42<5:37:57, 5.44s/it] {'loss': 0.2188, 'grad_norm': 0.518563367356704, 'learning_rate': 7.285488373092792e-07, 'epoch': 0.83}
83%|████████▎ | 18368/22095 [31:16:45<4:53:57, 4.73s/it] {'loss': 0.346, 'grad_norm': 0.6214800550202545, 'learning_rate': 7.281679135636377e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18369/22095 [31:16:48<4:25:17, 4.27s/it] {'loss': 0.2907, 'grad_norm': 0.5959904463385176, 'learning_rate': 7.27787081606549e-07, 'epoch': 0.83}
83%|████████▎ | 18370/22095 [31:16:52<4:16:27, 4.13s/it] {'loss': 0.263, 'grad_norm': 0.6622710687024219, 'learning_rate': 7.274063414461952e-07, 'epoch': 0.83}
83%|████████▎ | 18371/22095 [31:16:55<3:52:29, 3.75s/it] {'loss': 0.3058, 'grad_norm': 0.5778474098970238, 'learning_rate': 7.270256930907555e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (53969 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108292 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18372/22095 [31:16:58<3:41:08, 3.56s/it] {'loss': 0.274, 'grad_norm': 0.5545792197697489, 'learning_rate': 7.266451365484106e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18373/22095 [31:17:21<9:33:46, 9.25s/it] {'loss': 0.3256, 'grad_norm': 0.8707822719027274, 'learning_rate': 7.262646718273392e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1098, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8468242 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1098, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 99917, 'image': 'vrdu_texteq/astro-ph.CO/41904481-0134-4525-8ccf-a2a89f2da069.png', 'image_wh': [[1098, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'The fraction of the mass in radiation that is transformed into PBHs at\nthe time $t_{\\mathrm{H}}$ is then'}]}
83%|████████▎ | 18374/22095 [31:17:25<7:59:25, 7.73s/it] {'loss': 0.3162, 'grad_norm': 0.6130864604779631, 'learning_rate': 7.258842989357118e-07, 'epoch': 0.83}
83%|████████▎ | 18375/22095 [31:17:29<6:50:11, 6.62s/it] {'loss': 0.3024, 'grad_norm': 0.6267185140516509, 'learning_rate': 7.255040178817035e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (46691 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18376/22095 [31:17:55<12:52:36, 12.46s/it] {'loss': 0.4653, 'grad_norm': 0.2661307336173574, 'learning_rate': 7.251238286734863e-07, 'epoch': 0.83}
83%|████████▎ | 18377/22095 [31:17:58<10:05:33, 9.77s/it] {'loss': 0.2627, 'grad_norm': 0.5452519945094221, 'learning_rate': 7.247437313192307e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8301864 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1GhRGLXXXXXaAXFXXunYpLFXX.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nTranscribe all text on this picture.'}, {'from': 'gpt', 'value': 'All words in the image:\nusaqueenla\nUSAQUEENLA\n美国洛杉矶实体店100%正品可美国直邮微信:usaqueenvip'}]}
83%|████████▎ | 18378/22095 [31:18:19<13:28:00, 13.04s/it] {'loss': 0.332, 'grad_norm': 0.597310206470709, 'learning_rate': 7.243637258270996e-07, 'epoch': 0.83}
83%|████████▎ | 18379/22095 [31:18:57<21:19:10, 20.65s/it] {'loss': 0.2937, 'grad_norm': 0.6262737359671798, 'learning_rate': 7.239838122052612e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_042031_before_screenshot_sub0.png 2025-08-28 23:16:56.258272 load time: 1023.42 ms
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_024434_before_screenshot.png 2025-08-28 23:16:56.258638 load time: 1017.77 ms
VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/5066034000249106_1.png 2025-08-28 23:16:56.258449 load time: 1032.5 ms
83%|████████▎ | 18380/22095 [31:19:07<17:51:31, 17.31s/it] {'loss': 0.4686, 'grad_norm': 0.2774610859586583, 'learning_rate': 7.23603990461878e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (56629 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97298 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80877 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92534 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18381/22095 [31:19:10<13:28:24, 13.06s/it] {'loss': 0.2775, 'grad_norm': 0.6236227179505209, 'learning_rate': 7.232242606051115e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (112283 > 40960). Running this sequence through the model will result in indexing errors
VC:s3://ocr/coco/train2014/COCO_train2014_000000494224.jpg 2025-08-28 23:17:08.905170 load time: 1051.54 ms
83%|████████▎ | 18382/22095 [31:19:14<10:33:08, 10.23s/it] {'loss': 0.2429, 'grad_norm': 0.7734252543559725, 'learning_rate': 7.228446226431196e-07, 'epoch': 0.83}
VC:s3://internvl-moe-sft-data/vrdu_texteq/astro-ph.CO/83d375f6-1178-4ecd-80f8-096c0686b09a.png 2025-08-28 23:17:12.541971 load time: 1029.72 ms
VC:s3://gui-agent/data_20250407/windows/images/word/20250402_174741_2/images/before_screenshot_43_id_92_function_2_crop_1.png 2025-08-28 23:17:12.542040 load time: 1029.81 ms
VC:s3://gui/aguvis/aguvis-stage2/guiact-web-multi-v2/images/uid_record_07713_step_05.png VC:s3://gui/aguvis/aguvis-stage2/android_control/images/19890/screenshot_3.png 2025-08-28 23:17:12.539720 load time: 1035.02 ms 2025-08-28 23:17:12.539461 load time: 1056.85 ms
VC:s3://gui-agent/data_20250612/web/images/yang_0611215436/10_140_52_49_0611215534/img/16.png 2025-08-28 23:17:12.541261 load time: 1053.86 ms
VC:s3://gui-agent/data_20250407/windows/images/paint3d/20250408_181231_1/images/before_screenshot_29.png 2025-08-28 23:17:12.541145 load time: 1052.22 ms
83%|████████▎ | 18383/22095 [31:19:17<8:29:09, 8.23s/it] {'loss': 0.2224, 'grad_norm': 0.5598087332181514, 'learning_rate': 7.224650765840613e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047749 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 2\nB. 0.5\nC. 1\nD. 1.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8410556 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12755, 'image': 'vrdu_table_final_2/astro-ph.CO/708390e5-b0f8-47f2-8f1a-cdac1cb6a201.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
83%|████████▎ | 18384/22095 [31:19:25<8:13:51, 7.98s/it] {'loss': 0.4499, 'grad_norm': 0.2714013413523971, 'learning_rate': 7.2208562243609e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047216 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 6\nB. 8\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
83%|████████▎ | 18385/22095 [31:19:28<6:46:22, 6.57s/it] {'loss': 0.2592, 'grad_norm': 0.627473107254701, 'learning_rate': 7.21706260207361e-07, 'epoch': 0.83}
83%|████████▎ | 18386/22095 [31:19:50<11:24:41, 11.08s/it] {'loss': 0.2811, 'grad_norm': 0.5923881783519446, 'learning_rate': 7.213269899060249e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18387/22095 [31:19:57<10:21:44, 10.06s/it] {'loss': 0.4794, 'grad_norm': 0.26871978050997436, 'learning_rate': 7.209478115402302e-07, 'epoch': 0.83}
83%|████████▎ | 18388/22095 [31:20:20<14:17:41, 13.88s/it] {'loss': 0.3104, 'grad_norm': 0.6224968148808394, 'learning_rate': 7.205687251181242e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18389/22095 [31:20:48<18:41:47, 18.16s/it] {'loss': 0.4785, 'grad_norm': 0.29372753562995985, 'learning_rate': 7.201897306478544e-07, 'epoch': 0.83}
83%|████████▎ | 18390/22095 [31:21:47<31:20:46, 30.46s/it] {'loss': 0.2776, 'grad_norm': 0.5670068481317462, 'learning_rate': 7.198108281375627e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8342150 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8796, 'image': 'vrdu_table_final_2/astro-ph.CO/11f8bae5-1ffa-4d1f-9d91-6f76d147d823.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
83%|████████▎ | 18391/22095 [31:22:09<28:29:04, 27.68s/it] {'loss': 0.319, 'grad_norm': 0.633384202144468, 'learning_rate': 7.194320175953901e-07, 'epoch': 0.83}
83%|████████▎ | 18392/22095 [31:23:06<37:42:54, 36.67s/it] {'loss': 0.3073, 'grad_norm': 0.6352015901863839, 'learning_rate': 7.190532990294762e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (121963 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18393/22095 [31:23:32<34:12:33, 33.27s/it] {'loss': 0.3043, 'grad_norm': 0.6621876582354893, 'learning_rate': 7.186746724479599e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (53304 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73996 >
Running this sequence through the model will result in indexing errors
83%|████████▎ | 18394/22095 [31:23:35<25:02:37, 24.36s/it] {'loss': 0.2757, 'grad_norm': 1.1329216180159083, 'learning_rate': 7.182961378589765e-07, 'epoch': 0.83}
VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_69747.png 2025-08-28 23:21:33.907053 load time: 1020.01 ms
VC:s3://gui/uground_web_processing/screenshots/web_direct_258k_function_filtered_211401.png 2025-08-28 23:21:33.908819 load time: 1046.34 ms
83%|████████▎ | 18395/22095 [31:23:39<18:41:12, 18.18s/it] {'loss': 0.2705, 'grad_norm': 0.6590145185584861, 'learning_rate': 7.179176952706574e-07, 'epoch': 0.83}
VC:s3://gui-agent/jedi/images/component_library_snap_icon_data/component_library_snap_icon_data_extracted/images_pure_color_background/component_library_icons/material-design-icons/src/search/dining/materialiconstwotone/24px.png 2025-08-28 23:21:37.674604 load time: 1024.41 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18396/22095 [31:24:01<19:49:56, 19.30s/it] {'loss': 0.3282, 'grad_norm': 0.710785836852594, 'learning_rate': 7.175393446911366e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (42065 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45020 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74917 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42743 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41081 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45592 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18397/22095 [31:24:04<14:46:49, 14.39s/it] {'loss': 0.3095, 'grad_norm': 0.5993552009304864, 'learning_rate': 7.171610861285417e-07, 'epoch': 0.83}
83%|████████▎ | 18398/22095 [31:24:24<16:32:35, 16.11s/it] {'loss': 0.26, 'grad_norm': 0.5895560821888629, 'learning_rate': 7.167829195910026e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (59012 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96229 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50842 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70847 > 40960).
Running this sequence through the model will result in indexing errors
83%|████████▎ | 18399/22095 [31:25:05<24:17:48, 23.67s/it] {'loss': 0.2786, 'grad_norm': 0.7544662890759368, 'learning_rate': 7.164048450866435e-07, 'epoch': 0.83}
83%|████████▎ | 18400/22095 [31:25:26<23:28:26, 22.87s/it] {'loss': 0.2823, 'grad_norm': 0.6287372418726318, 'learning_rate': 7.160268626235866e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45769 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83853 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64235 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18401/22095 [31:25:33<18:27:22, 17.99s/it] {'loss': 0.4802, 'grad_norm': 0.30292889477882584, 'learning_rate': 7.156489722099558e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (55477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49660 > 40960).
Running this sequence through the model will result in indexing errors
83%|████████▎ | 18402/22095 [31:26:01<21:27:50, 20.92s/it] {'loss': 0.4563, 'grad_norm': 0.26542531608177117, 'learning_rate': 7.152711738538725e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18403/22095 [31:26:04<16:11:50, 15.79s/it] {'loss': 0.3128, 'grad_norm': 0.6015363738942859, 'learning_rate': 7.148934675634494e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922713 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45866, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 1\nB. 2\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
83%|████████▎ | 18404/22095 [31:26:27<18:26:22, 17.99s/it] {'loss': 0.2569, 'grad_norm': 0.5985106227128415, 'learning_rate': 7.145158533468055e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18405/22095 [31:26:55<21:28:59, 20.96s/it] {'loss': 0.4644, 'grad_norm': 0.28834188236230374, 'learning_rate': 7.141383312120536e-07, 'epoch': 0.83}
83%|████████▎ | 18406/22095 [31:26:59<16:12:23, 15.82s/it] {'loss': 0.3215, 'grad_norm': 0.5680174625657594, 'learning_rate': 7.137609011673086e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18407/22095 [31:27:28<20:16:35, 19.79s/it] {'loss': 0.4899, 'grad_norm': 0.3533673673809758, 'learning_rate': 7.133835632206754e-07, 'epoch': 0.83}
83%|████████▎ | 18408/22095 [31:27:33<15:36:16, 15.24s/it] {'loss': 0.2644, 'grad_norm': 0.5735933751162591, 'learning_rate': 7.130063173802637e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (52880 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43035 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42500 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18409/22095 [31:27:41<13:33:37, 13.24s/it] {'loss': 0.4551, 'grad_norm': 0.3244715465410745, 'learning_rate': 7.126291636541815e-07, 'epoch': 0.83}
83%|████████▎ | 18410/22095 [31:28:08<17:33:08, 17.15s/it] {'loss': 0.4724, 'grad_norm': 0.2608028807397206, 'learning_rate': 7.122521020505302e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18411/22095 [31:28:30<19:06:39, 18.68s/it] {'loss': 0.298, 'grad_norm': 0.6129342907073956, 'learning_rate': 7.11875132577412e-07, 'epoch': 0.83}
83%|████████▎ | 18412/22095 [31:28:34<14:35:24, 14.26s/it] {'loss': 0.2856, 'grad_norm': 0.6824953976909873, 'learning_rate': 7.114982552429278e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18413/22095 [31:29:15<22:45:39, 22.25s/it] {'loss': 0.271, 'grad_norm': 0.6085744465612274, 'learning_rate': 7.111214700551738e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (88913 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67974 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42129 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116830 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83654 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81714 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18414/22095 [31:29:57<28:53:46, 28.26s/it] {'loss': 0.2786, 'grad_norm': 0.7326685463051631, 'learning_rate': 7.107447770222486e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (81446 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18415/22095 [31:30:18<26:31:09, 25.94s/it] {'loss': 0.2859, 'grad_norm': 0.6808975607293165, 'learning_rate': 7.103681761522446e-07, 'epoch': 0.83}
VC:s3://mm-dataset/LLaVAR/images/100001068542.jpg 2025-08-28 23:28:16.392718 load time: 1021.49 ms
VC:s3://gui-agent/data_20250612/web/images/yang_0611171254/10_140_52_49_0611192226/img/18.png 2025-08-28 23:28:16.392069 load time: 1027.91 ms
VC:s3://gui-agent/data_20250407/windows/images/ppt/20250402_180746_2/images/before_screenshot_23.png 2025-08-28 23:28:16.393493 load time: 1022.45 ms
83%|████████▎ | 18416/22095 [31:30:40<25:26:15, 24.89s/it] {'loss': 0.3479, 'grad_norm': 0.5929595113087428, 'learning_rate': 7.099916674532526e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18417/22095 [31:31:09<26:37:50, 26.07s/it] {'loss': 0.4868, 'grad_norm': 0.2701159437656839, 'learning_rate': 7.096152509333642e-07, 'epoch': 0.83}
83%|████████▎ | 18418/22095 [31:31:13<19:48:04, 19.39s/it] {'loss': 0.2754, 'grad_norm': 0.5661430809971738, 'learning_rate': 7.092389266006683e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8908008 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31161, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段上的点,D点为BC段的中点,AB=10,AC=6,则AD段的长度为()\nA. 4\nB. 6\nC. 2\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
83%|████████▎ | 18419/22095 [31:32:14<32:45:16, 32.08s/it] {'loss': 0.3262, 'grad_norm': 0.6159875929159425, 'learning_rate': 7.088626944632493e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18420/22095 [31:32:56<35:33:49, 34.84s/it] {'loss': 0.2933, 'grad_norm': 0.6080998733893245, 'learning_rate': 7.084865545291914e-07, 'epoch': 0.83}
83%|████████▎ | 18421/22095 [31:33:19<31:57:08, 31.31s/it] {'loss': 0.3477, 'grad_norm': 0.6126167266070449, 'learning_rate': 7.081105068065764e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18422/22095 [31:33:28<25:12:46, 24.71s/it] {'loss': 0.4426, 'grad_norm': 0.2536339863608969, 'learning_rate': 7.077345513034861e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302209 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1Ji6pcWmgSKJjSspiXXXyJFXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat text is hidden in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n川\n京\n全铜墙排/地排\nKingchun\n全铜整体\n防臭防堵\n时尚大气\n送\n39\n起'}]}
83%|████████▎ | 18423/22095 [31:33:37<20:31:42, 20.13s/it] {'loss': 0.4655, 'grad_norm': 0.2617414439907864, 'learning_rate': 7.073586880279981e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18424/22095 [31:33:41<15:21:36, 15.06s/it] {'loss': 0.2721, 'grad_norm': 0.5566006485219414, 'learning_rate': 7.06982916988187e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (62340 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (123477 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73374 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18425/22095 [31:33:44<11:47:36, 11.57s/it] {'loss': 0.3213, 'grad_norm': 0.6710321305525754, 'learning_rate': 7.066072381921285e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (45520 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86981 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51128 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77841 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72226 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68138 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18426/22095 [31:34:24<20:20:43, 19.96s/it] {'loss': 0.331, 'grad_norm': 0.6511717324963595, 'learning_rate': 7.06231651647894e-07, 'epoch': 0.83}
83%|████████▎ | 18427/22095 [31:35:25<33:01:13, 32.41s/it] {'loss': 0.2835, 'grad_norm': 0.5692816316713081, 'learning_rate': 7.058561573635548e-07, 'epoch': 0.83}
83%|████████▎ | 18428/22095 [31:36:08<36:10:53, 35.52s/it] {'loss': 0.278, 'grad_norm': 0.663359168877623, 'learning_rate': 7.054807553471782e-07, 'epoch': 0.83}
83%|████████▎ | 18429/22095 [31:37:09<43:58:42, 43.19s/it] {'loss': 0.2733, 'grad_norm': 0.5552336909462475, 'learning_rate': 7.05105445606829e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (84961 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44320 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18430/22095 [31:37:13<32:09:04, 31.58s/it] {'loss': 0.3261, 'grad_norm': 0.5845264375176329, 'learning_rate': 7.047302281505735e-07, 'epoch': 0.83}
83%|████████▎ | 18431/22095 [31:37:55<35:19:40, 34.71s/it] {'loss': 0.319, 'grad_norm': 0.6262122985968133, 'learning_rate': 7.043551029864759e-07, 'epoch': 0.83}
83%|████████▎ | 18432/22095 [31:38:38<37:34:22, 36.93s/it] {'loss': 0.2854, 'grad_norm': 0.6622169142228267, 'learning_rate': 7.039800701225918e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18433/22095 [31:38:47<29:09:10, 28.66s/it] {'loss': 0.461, 'grad_norm': 0.27194861515127755, 'learning_rate': 7.036051295669816e-07, 'epoch': 0.83}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8924516 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 47669, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C、D为AB段的两点,D为AC段的中点,AB=10cm,BC=4cm,广告长度为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BC=4cm,∴AC=6cm,∵D是线段AC的中点,∴AD=3cm.'}]}
83%|████████▎ | 18434/22095 [31:38:57<23:23:46, 23.01s/it] {'loss': 0.4504, 'grad_norm': 0.25211593145997274, 'learning_rate': 7.03230281327702e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 364, but got module 1
83%|████████▎ | 18435/22095 [31:39:01<17:43:10, 17.43s/it] {'loss': 0.2867, 'grad_norm': 0.6034517931757836, 'learning_rate': 7.028555254128089e-07, 'epoch': 0.83}
83%|████████▎ | 18436/22095 [31:39:23<19:09:31, 18.85s/it] {'loss': 0.2809, 'grad_norm': 0.6218373751755876, 'learning_rate': 7.024808618303508e-07, 'epoch': 0.83}
83%|████████▎ | 18437/22095 [31:39:27<14:25:52, 14.20s/it] {'loss': 0.3138, 'grad_norm': 0.622619872023975, 'learning_rate': 7.021062905883802e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18438/22095 [31:39:36<12:59:28, 12.79s/it] {'loss': 0.4385, 'grad_norm': 0.25726287792657343, 'learning_rate': 7.017318116949468e-07, 'epoch': 0.83}
83%|████████▎ | 18439/22095 [31:39:58<15:48:10, 15.56s/it] {'loss': 0.2942, 'grad_norm': 0.5635510149757561, 'learning_rate': 7.013574251580956e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (70820 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64446 > 40960). Running this sequence through the model will result in indexing errors
83%|████████▎ | 18440/22095 [31:40:56<28:44:38, 28.31s/it] {'loss': 0.2885, 'grad_norm': 0.5767968519994132, 'learning_rate': 7.009831309858701e-07, 'epoch': 0.83}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
83%|████████▎ | 18441/22095 [31:41:18<26:47:39, 26.40s/it] {'loss': 0.2802, 'grad_norm': 0.6454957968254394, 'learning_rate': 7.006089291863144e-07, 'epoch': 0.83}
VC:s3://internvl2/datasets/ocr/synthCalligraphy_poetry_v1.0/images/image_18_1152_ioomkddbcecglkbfahke.jpg 2025-08-28 23:39:16.999048 load time: 1022.29 ms
83%|████████▎ | 18442/22095 [31:41:21<19:35:59, 19.32s/it] {'loss': 0.2863, 'grad_norm': 0.6431056047668252, 'learning_rate': 7.002348197674669e-07, 'epoch': 0.83}
83%|████████▎ | 18443/22095 [31:41:24<14:39:13, 14.45s/it] {'loss': 0.2709, 'grad_norm': 0.6548391597943949, 'learning_rate': 6.998608027373694e-07, 'epoch': 0.83}
VC:s3://gui-agent/agentnet/win_mac_images/4b649c0f-6c42-4537-bd51-c9a563e6bc77.png 2025-08-28 23:39:22.868069 load time: 1058.61 ms
VC:s3://st2pj/20250222/images/sam-all/images/sa_544152.jpg 2025-08-28 23:39:22.868120 load time: 1060.37 ms
83%|████████▎ | 18444/22095 [31:42:24<28:37:04, 28.22s/it] {'loss': 0.2773, 'grad_norm': 0.5851244717476433, 'learning_rate': 6.994868781040553e-07, 'epoch': 0.83}
83%|████████▎ | 18445/22095 [31:42:49<27:25:10, 27.04s/it] {'loss': 0.3102, 'grad_norm': 0.6108327610865241, 'learning_rate': 6.991130458755596e-07, 'epoch': 0.83}
83%|████████▎ | 18446/22095 [31:43:10<25:42:06, 25.36s/it] {'loss': 0.287, 'grad_norm': 0.5969419647703962, 'learning_rate': 6.987393060599157e-07, 'epoch': 0.83}
83%|████████▎ | 18447/22095 [31:43:53<31:04:33, 30.67s/it] {'loss': 0.2959, 'grad_norm': 0.7424954024336571, 'learning_rate': 6.983656586651543e-07, 'epoch': 0.83}
Invalidate trace cache @ step 2: expected module 1, but got module 364
83%|████████▎ | 18448/22095 [31:44:20<29:49:37, 29.44s/it] {'loss': 0.4771, 'grad_norm': 0.32286385121349703, 'learning_rate': 6.979921036993042e-07, 'epoch': 0.83}
Token indices sequence length is longer than the specified maximum sequence length for this model (47392 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68112 > 40960).
Running this sequence through the model will result in indexing errors
83%|████████▎ | 18449/22095 [31:44:23<21:52:25, 21.60s/it] {'loss': 0.2719, 'grad_norm': 0.577601691006914, 'learning_rate': 6.976186411703894e-07, 'epoch': 0.83}
84%|████████▎ | 18450/22095 [31:45:05<28:01:55, 27.69s/it] {'loss': 0.3351, 'grad_norm': 0.6326586227520709, 'learning_rate': 6.972452710864364e-07, 'epoch': 0.84}
84%|████████▎ | 18451/22095 [31:46:05<37:56:58, 37.49s/it] {'loss': 0.3225, 'grad_norm': 0.6604271876887587, 'learning_rate': 6.968719934554691e-07, 'epoch': 0.84}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-0_1232644-split-0.jpg 2025-08-28 23:44:04.154671 load time: 1036.98 ms
84%|████████▎ | 18452/22095 [31:46:51<40:19:48, 39.85s/it] {'loss': 0.2977, 'grad_norm': 0.589944492600839, 'learning_rate': 6.964988082855062e-07, 'epoch': 0.84}
84%|████████▎ | 18453/22095 [31:46:54<29:15:23, 28.92s/it] {'loss': 0.2546, 'grad_norm': 0.5965548477237558, 'learning_rate': 6.961257155845658e-07, 'epoch': 0.84}
84%|████████▎ | 18454/22095 [31:47:15<26:48:05, 26.50s/it] {'loss': 0.2908, 'grad_norm': 0.6621594533928651, 'learning_rate': 6.957527153606664e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (46271 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130469 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81712 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▎ | 18455/22095 [31:47:56<31:05:42, 30.75s/it] {'loss': 0.2905, 'grad_norm': 0.5712805422352859, 'learning_rate': 6.953798076218204e-07, 'epoch': 0.84}
84%|████████▎ | 18456/22095 [31:48:19<28:56:52, 28.64s/it] {'loss': 0.2914, 'grad_norm': 0.5798015410898242, 'learning_rate': 6.950069923760433e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-3_276349948-split-0.jpg 2025-08-28 23:46:18.149282 load time: 1056.64 ms
84%|████████▎ | 18457/22095 [31:48:47<28:36:09, 28.30s/it] {'loss': 0.4883, 'grad_norm': 0.26862311862827815, 'learning_rate': 6.946342696313435e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▎ | 18458/22095 [31:49:14<28:05:09, 27.80s/it] {'loss': 0.4824, 'grad_norm': 0.2655924882413599, 'learning_rate': 6.942616393957297e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (42171 > 40960).
Running this sequence through the model will result in indexing errors 84%|████████▎ | 18459/22095 [31:49:34<26:00:28, 25.75s/it] {'loss': 0.2974, 'grad_norm': 0.828216103418796, 'learning_rate': 6.938891016772092e-07, 'epoch': 0.84} 84%|████████▎ | 18459/22095 [31:49:35<26:00:28, 25.75s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [570, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8424359 in VC:s3://internvl-moe-sft-data/. Exception: Image size [570, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 128651, 'image': 'vrdu_texteq/astro-ph.CO/b4b1ca3b-073f-41d9-bba2-50568dd1f18c.png', 'image_wh': [[570, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'where we introduced the substitution ${t\\equiv\\cos\\theta}$.'}]} 84%|████████▎ | 18460/22095 [31:49:57<24:57:17, 24.71s/it] {'loss': 0.315, 'grad_norm': 0.6503665178507614, 'learning_rate': 6.935166564837875e-07, 'epoch': 0.84} 84%|████████▎ | 18460/22095 [31:49:57<24:57:17, 24.71s/it] 84%|████████▎ | 18461/22095 [31:50:39<30:19:20, 30.04s/it] {'loss': 0.2828, 'grad_norm': 0.809480352605823, 'learning_rate': 6.93144303823467e-07, 'epoch': 0.84} 84%|████████▎ | 18461/22095 [31:50:39<30:19:20, 30.04s/it] 84%|████████▎ | 18462/22095 [31:51:21<33:52:30, 33.57s/it] {'loss': 0.3042, 'grad_norm': 0.6732385764784247, 'learning_rate': 6.927720437042462e-07, 'epoch': 0.84} 84%|████████▎ | 18462/22095 [31:51:21<33:52:30, 33.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 
18463/22095 [31:51:31<26:40:41, 26.44s/it] {'loss': 0.4947, 'grad_norm': 0.29168950544423544, 'learning_rate': 6.923998761341261e-07, 'epoch': 0.84} 84%|████████▎ | 18463/22095 [31:51:31<26:40:41, 26.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (140451 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18464/22095 [31:51:53<25:28:17, 25.25s/it] {'loss': 0.2732, 'grad_norm': 0.6211694790944401, 'learning_rate': 6.920278011211034e-07, 'epoch': 0.84} 84%|████████▎ | 18464/22095 [31:51:53<25:28:17, 25.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▎ | 18465/22095 [31:52:33<29:56:58, 29.70s/it] {'loss': 0.2932, 'grad_norm': 0.6205506528108833, 'learning_rate': 6.916558186731726e-07, 'epoch': 0.84} 84%|████████▎ | 18465/22095 [31:52:33<29:56:58, 29.70s/it] 84%|████████▎ | 18466/22095 [31:52:55<27:37:31, 27.40s/it] {'loss': 0.2827, 'grad_norm': 0.6014775172694962, 'learning_rate': 6.912839287983253e-07, 'epoch': 0.84} 84%|████████▎ | 18466/22095 [31:52:55<27:37:31, 27.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 18467/22095 [31:53:42<33:20:44, 33.09s/it] {'loss': 0.4542, 'grad_norm': 0.2592762946173183, 'learning_rate': 6.909121315045541e-07, 'epoch': 0.84} 84%|████████▎ | 18467/22095 [31:53:42<33:20:44, 33.09s/it] 84%|████████▎ | 18468/22095 [31:53:46<24:32:44, 24.36s/it] {'loss': 0.3685, 'grad_norm': 0.9117098601539213, 'learning_rate': 6.905404267998466e-07, 'epoch': 0.84} 84%|████████▎ | 18468/22095 [31:53:46<24:32:44, 24.36s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (142643256 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 84%|████████▎ | 18469/22095 [31:53:49<18:13:14, 18.09s/it] {'loss': 0.2811, 'grad_norm': 0.7034910533099269, 'learning_rate': 6.901688146921892e-07, 'epoch': 0.84} 84%|████████▎ | 18469/22095 [31:53:49<18:13:14, 18.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44225 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75099 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18470/22095 [31:55:06<36:03:56, 35.82s/it] {'loss': 0.3007, 'grad_norm': 0.6019756167831373, 'learning_rate': 6.897972951895682e-07, 'epoch': 0.84} 84%|████████▎ | 18470/22095 [31:55:06<36:03:56, 35.82s/it] 84%|████████▎ | 18471/22095 [31:55:32<32:49:15, 32.60s/it] {'loss': 0.3, 'grad_norm': 0.6456353538014693, 'learning_rate': 6.894258682999644e-07, 'epoch': 0.84} 84%|████████▎ | 18471/22095 [31:55:32<32:49:15, 32.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 18472/22095 [31:55:39<25:16:01, 25.11s/it] {'loss': 0.4847, 'grad_norm': 0.2788759898919128, 'learning_rate': 6.890545340313609e-07, 'epoch': 0.84} 84%|████████▎ | 18472/22095 [31:55:39<25:16:01, 25.11s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 34, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8398229 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 34, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 380, 'image': 'vrdu_table_final_2/astro-ph.CO/36b81f93-b1b0-40b9-8a51-794740f362cf.png', 'image_wh': [[23, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease transcribe the table in the picture into LaTeX code.'}, {'from': 'gpt', 'value': "Of course, I'll transcribe the table in the picture into LaTeX code for you.\n```latex\n\\begin{tabular}{c} $\\tilde{f}_i$ \\end{tabular}\n```"}]} 84%|████████▎ | 18473/22095 [31:55:43<18:47:43, 18.68s/it] {'loss': 0.3145, 'grad_norm': 0.7049032865204843, 'learning_rate': 6.886832923917358e-07, 'epoch': 0.84} 84%|████████▎ | 18473/22095 [31:55:43<18:47:43, 18.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (94961 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77757 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18474/22095 [31:56:06<20:01:58, 19.92s/it] {'loss': 0.2557, 'grad_norm': 0.5249122167302466, 'learning_rate': 6.883121433890639e-07, 'epoch': 0.84} 84%|████████▎ | 18474/22095 [31:56:06<20:01:58, 19.92s/it] 84%|████████▎ | 18475/22095 [31:56:09<14:56:12, 14.85s/it] {'loss': 0.3068, 'grad_norm': 0.6918651384335092, 'learning_rate': 6.879410870313219e-07, 'epoch': 0.84} 84%|████████▎ | 18475/22095 [31:56:09<14:56:12, 14.85s/it] 84%|████████▎ | 18476/22095 [31:56:12<11:21:18, 11.30s/it] {'loss': 0.251, 'grad_norm': 0.5832527914148368, 'learning_rate': 6.875701233264837e-07, 'epoch': 0.84} 84%|████████▎ | 18476/22095 [31:56:12<11:21:18, 11.30s/it] 84%|████████▎ | 18477/22095 [31:56:52<20:01:49, 19.93s/it] {'loss': 0.2806, 'grad_norm': 0.6140431784941516, 'learning_rate': 6.871992522825183e-07, 'epoch': 0.84} 84%|████████▎ | 18477/22095 [31:56:52<20:01:49, 19.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of 
image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▎ | 18478/22095 [31:57:01<16:52:04, 16.79s/it] {'loss': 0.4759, 'grad_norm': 0.26624596622234775, 'learning_rate': 6.868284739073949e-07, 'epoch': 0.84} 84%|████████▎ | 18478/22095 [31:57:01<16:52:04, 16.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▎ | 18479/22095 [31:57:05<12:51:57, 12.81s/it] {'loss': 0.2673, 'grad_norm': 0.6660657563262626, 'learning_rate': 6.8645778820908e-07, 'epoch': 0.84} 84%|████████▎ | 18479/22095 [31:57:05<12:51:57, 12.81s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51858 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18480/22095 [31:57:08<9:54:58, 9.88s/it] {'loss': 0.2871, 'grad_norm': 0.6050959441806862, 'learning_rate': 6.860871951955412e-07, 'epoch': 0.84} 84%|████████▎ | 18480/22095 [31:57:08<9:54:58, 9.88s/it] 84%|████████▎ | 18481/22095 [31:57:11<7:55:55, 7.90s/it] {'loss': 0.3073, 'grad_norm': 0.6525082970979413, 'learning_rate': 6.857166948747385e-07, 'epoch': 0.84} 84%|████████▎ | 18481/22095 [31:57:11<7:55:55, 7.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 18482/22095 [31:57:21<8:29:31, 8.46s/it] {'loss': 0.4547, 'grad_norm': 0.2765877051458427, 'learning_rate': 6.853462872546329e-07, 'epoch': 0.84} 84%|████████▎ | 18482/22095 [31:57:21<8:29:31, 8.46s/it] 84%|████████▎ | 18483/22095 [31:57:25<7:06:06, 7.08s/it] {'loss': 0.3209, 'grad_norm': 0.6329547373858155, 'learning_rate': 6.849759723431853e-07, 'epoch': 0.84} 84%|████████▎ | 18483/22095 [31:57:25<7:06:06, 7.08s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 
pixels, could be decompression bomb DOS attack. warnings.warn( 84%|████████▎ | 18484/22095 [31:58:05<17:12:39, 17.16s/it] {'loss': 0.317, 'grad_norm': 0.6013514520423907, 'learning_rate': 6.846057501483505e-07, 'epoch': 0.84} 84%|████████▎ | 18484/22095 [31:58:05<17:12:39, 17.16s/it] 84%|████████▎ | 18485/22095 [31:58:08<12:55:21, 12.89s/it] {'loss': 0.267, 'grad_norm': 0.6708127579750481, 'learning_rate': 6.842356206780853e-07, 'epoch': 0.84} 84%|████████▎ | 18485/22095 [31:58:08<12:55:21, 12.89s/it] 84%|████████▎ | 18486/22095 [31:59:10<27:35:47, 27.53s/it] {'loss': 0.2944, 'grad_norm': 0.6180890666149683, 'learning_rate': 6.838655839403419e-07, 'epoch': 0.84} 84%|████████▎ | 18486/22095 [31:59:10<27:35:47, 27.53s/it]VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-3_276216342-split-5.jpg 2025-08-28 23:57:08.775090 load time: 1040.85 ms 84%|████████▎ | 18487/22095 [31:59:13<20:13:26, 20.18s/it] {'loss': 0.3012, 'grad_norm': 0.5812224184554288, 'learning_rate': 6.834956399430703e-07, 'epoch': 0.84} 84%|████████▎ | 18487/22095 [31:59:13<20:13:26, 20.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 18488/22095 [31:59:22<16:54:41, 16.88s/it] {'loss': 0.4628, 'grad_norm': 0.2711922549276204, 'learning_rate': 6.8312578869422e-07, 'epoch': 0.84} 84%|████████▎ | 18488/22095 [31:59:22<16:54:41, 16.88s/it] 84%|████████▎ | 18489/22095 [31:59:46<18:50:14, 18.81s/it] {'loss': 0.3233, 'grad_norm': 0.797307266433614, 'learning_rate': 6.827560302017389e-07, 'epoch': 0.84} 84%|████████▎ | 18489/22095 [31:59:46<18:50:14, 18.81s/it] 84%|████████▎ | 18490/22095 [32:00:08<19:58:36, 19.95s/it] {'loss': 0.2964, 'grad_norm': 0.6030944382919147, 'learning_rate': 6.823863644735718e-07, 'epoch': 0.84} 84%|████████▎ | 18490/22095 [32:00:08<19:58:36, 19.95s/it] 84%|████████▎ | 18491/22095 [32:00:11<14:51:28, 14.84s/it] {'loss': 0.3053, 'grad_norm': 0.7432983858204644, 'learning_rate': 6.820167915176601e-07, 'epoch': 0.84} 84%|████████▎ | 
18491/22095 [32:00:11<14:51:28, 14.84s/it] 84%|████████▎ | 18492/22095 [32:00:35<17:32:06, 17.52s/it] {'loss': 0.2915, 'grad_norm': 0.7667762688745203, 'learning_rate': 6.816473113419459e-07, 'epoch': 0.84} 84%|████████▎ | 18492/22095 [32:00:35<17:32:06, 17.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64530 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42098 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18493/22095 [32:00:56<18:35:56, 18.59s/it] {'loss': 0.2945, 'grad_norm': 0.6022476658119068, 'learning_rate': 6.812779239543688e-07, 'epoch': 0.84} 84%|████████▎ | 18493/22095 [32:00:56<18:35:56, 18.59s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41927 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42172 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▎ | 18494/22095 [32:01:00<14:11:33, 14.19s/it] {'loss': 0.2849, 'grad_norm': 0.5689870681284912, 'learning_rate': 6.809086293628658e-07, 'epoch': 0.84} 84%|████████▎ | 18494/22095 [32:01:00<14:11:33, 14.19s/it] 84%|████████▎ | 18495/22095 [32:01:42<22:32:03, 22.53s/it] {'loss': 0.2985, 'grad_norm': 0.6094370358822103, 'learning_rate': 6.805394275753696e-07, 'epoch': 0.84} 84%|████████▎ | 18495/22095 [32:01:42<22:32:03, 22.53s/it] 84%|████████▎ | 18496/22095 [32:01:45<16:48:35, 16.81s/it] {'loss': 0.309, 'grad_norm': 0.6096821133744085, 'learning_rate': 6.801703185998165e-07, 'epoch': 0.84} 84%|████████▎ | 18496/22095 [32:01:45<16:48:35, 16.81s/it] 84%|████████▎ | 18497/22095 [32:01:48<12:36:57, 12.62s/it] {'loss': 0.2717, 'grad_norm': 0.6210190486095353, 'learning_rate': 6.798013024441346e-07, 'epoch': 0.84} 84%|████████▎ | 18497/22095 [32:01:48<12:36:57, 12.62s/it] 84%|████████▎ | 18498/22095 [32:01:52<9:56:25, 9.95s/it] {'loss': 0.2883, 'grad_norm': 0.5832405292000097, 'learning_rate': 6.794323791162549e-07, 'epoch': 0.84} 84%|████████▎ | 18498/22095 [32:01:52<9:56:25, 9.95s/it] 84%|████████▎ | 18499/22095 [32:01:55<7:57:53, 7.97s/it] {'loss': 0.2912, 'grad_norm': 0.6939500706572298, 'learning_rate': 6.790635486241043e-07, 'epoch': 0.84} 84%|████████▎ | 18499/22095 [32:01:55<7:57:53, 7.97s/it] 84%|████████▎ | 18500/22095 [32:01:58<6:25:14, 6.43s/it] {'loss': 0.3227, 'grad_norm': 0.5921908749665488, 'learning_rate': 6.786948109756064e-07, 'epoch': 0.84} 84%|████████▎ | 18500/22095 [32:01:58<6:25:14, 6.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▎ | 18501/22095 [32:02:07<7:18:24, 7.32s/it] {'loss': 0.479, 'grad_norm': 0.2750326264546234, 'learning_rate': 6.783261661786855e-07, 'epoch': 0.84} 84%|████████▎ | 18501/22095 [32:02:07<7:18:24, 7.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens 
in the conversation 84%|████████▎ | 18502/22095 [32:02:36<13:33:08, 13.58s/it] {'loss': 0.4474, 'grad_norm': 0.24164130552241625, 'learning_rate': 6.77957614241263e-07, 'epoch': 0.84} 84%|████████▎ | 18502/22095 [32:02:36<13:33:08, 13.58s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (51502 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▎ | 18503/22095 [32:03:00<16:48:03, 16.84s/it] {'loss': 0.2594, 'grad_norm': 0.5861892387606339, 'learning_rate': 6.775891551712555e-07, 'epoch': 0.84} 84%|████████▎ | 18503/22095 [32:03:00<16:48:03, 16.84s/it] 84%|████████▎ | 18504/22095 [32:03:25<19:04:13, 19.12s/it] {'loss': 0.3095, 'grad_norm': 0.7380838640426645, 'learning_rate': 6.77220788976582e-07, 'epoch': 0.84} 84%|████████▎ | 18504/22095 [32:03:25<19:04:13, 19.12s/it]VC:s3://gui-agent/data_20250707/windows/images/chrome/free_task_20250624_201718/images/20250624_201727_4.png 2025-08-29 00:01:23.287012 load time: 1034.91 ms 84%|████████▍ | 18505/22095 [32:03:46<19:47:14, 19.84s/it] {'loss': 0.257, 'grad_norm': 0.6415416692617534, 'learning_rate': 6.768525156651589e-07, 'epoch': 0.84} 84%|████████▍ | 18505/22095 [32:03:46<19:47:14, 19.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (109824 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18506/22095 [32:03:50<14:55:59, 14.98s/it] {'loss': 0.297, 'grad_norm': 0.6310572300126577, 'learning_rate': 6.764843352448974e-07, 'epoch': 0.84} 84%|████████▍ | 18506/22095 [32:03:50<14:55:59, 14.98s/it] 84%|████████▍ | 18507/22095 [32:03:54<11:45:37, 11.80s/it] {'loss': 0.266, 'grad_norm': 0.5824094431531872, 'learning_rate': 6.761162477237076e-07, 'epoch': 0.84} 84%|████████▍ | 18507/22095 [32:03:54<11:45:37, 11.80s/it] 84%|████████▍ | 18508/22095 [32:03:58<9:24:17, 9.44s/it] {'loss': 0.2683, 'grad_norm': 0.5821116481571228, 'learning_rate': 6.757482531094999e-07, 'epoch': 0.84} 84%|████████▍ | 18508/22095 [32:03:58<9:24:17, 9.44s/it] 84%|████████▍ | 18509/22095 [32:04:22<13:50:49, 13.90s/it] {'loss': 0.2824, 'grad_norm': 0.6269496246057881, 'learning_rate': 6.753803514101826e-07, 'epoch': 0.84} 84%|████████▍ | 18509/22095 [32:04:22<13:50:49, 13.90s/it] 84%|████████▍ | 18510/22095 [32:04:26<10:47:12, 10.83s/it] {'loss': 0.3464, 'grad_norm': 0.6786039172697631, 'learning_rate': 6.75012542633659e-07, 'epoch': 0.84} 84%|████████▍ | 18510/22095 [32:04:26<10:47:12, 10.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18511/22095 [32:04:35<10:20:22, 10.39s/it] {'loss': 0.4969, 'grad_norm': 0.2701244215128959, 'learning_rate': 6.74644826787832e-07, 'epoch': 0.84} 84%|████████▍ | 18511/22095 [32:04:35<10:20:22, 10.39s/it] 84%|████████▍ | 18512/22095 [32:04:39<8:18:02, 8.34s/it] {'loss': 0.3244, 'grad_norm': 0.5815568096960195, 'learning_rate': 6.742772038806045e-07, 'epoch': 0.84} 84%|████████▍ | 18512/22095 [32:04:39<8:18:02, 8.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43445 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56214 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81202 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82319 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18513/22095 [32:04:43<6:59:13, 7.02s/it] {'loss': 0.3038, 'grad_norm': 0.6155053020966658, 'learning_rate': 6.739096739198731e-07, 'epoch': 0.84} 84%|████████▍ | 18513/22095 [32:04:43<6:59:13, 7.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18514/22095 [32:04:52<7:42:42, 7.75s/it] {'loss': 0.4636, 'grad_norm': 0.2623644484991377, 'learning_rate': 6.735422369135375e-07, 'epoch': 0.84} 84%|████████▍ | 18514/22095 [32:04:52<7:42:42, 7.75s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72692 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75770 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18515/22095 [32:04:56<6:27:46, 6.50s/it] {'loss': 0.3144, 'grad_norm': 0.9128337737816552, 'learning_rate': 6.731748928694914e-07, 'epoch': 0.84} 84%|████████▍ | 18515/22095 [32:04:56<6:27:46, 6.50s/it] 84%|████████▍ | 18516/22095 [32:04:59<5:25:58, 5.46s/it] {'loss': 0.2951, 'grad_norm': 0.6336593216480945, 'learning_rate': 6.72807641795627e-07, 'epoch': 0.84} 84%|████████▍ | 18516/22095 [32:04:59<5:25:58, 5.46s/it] 84%|████████▍ | 18517/22095 [32:05:02<4:40:09, 4.70s/it] {'loss': 0.3192, 'grad_norm': 0.6044869413049455, 'learning_rate': 6.724404836998366e-07, 'epoch': 0.84} 84%|████████▍ | 18517/22095 [32:05:02<4:40:09, 4.70s/it] 84%|████████▍ | 18518/22095 [32:05:05<4:17:52, 4.33s/it] {'loss': 0.2872, 'grad_norm': 0.6188016034575473, 'learning_rate': 6.720734185900101e-07, 'epoch': 0.84} 84%|████████▍ | 18518/22095 [32:05:05<4:17:52, 4.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55303 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115398 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18519/22095 [32:05:09<3:58:39, 4.00s/it] {'loss': 0.2403, 'grad_norm': 0.6214542611014593, 'learning_rate': 6.717064464740336e-07, 'epoch': 0.84} 84%|████████▍ | 18519/22095 [32:05:09<3:58:39, 4.00s/it] 84%|████████▍ | 18520/22095 [32:05:12<3:49:45, 3.86s/it] {'loss': 0.2977, 'grad_norm': 0.7963334781011263, 'learning_rate': 6.713395673597911e-07, 'epoch': 0.84} 84%|████████▍ | 18520/22095 [32:05:12<3:49:45, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18521/22095 [32:05:22<5:36:16, 5.65s/it] {'loss': 0.4759, 'grad_norm': 0.28427234355197933, 'learning_rate': 6.709727812551669e-07, 'epoch': 0.84} 84%|████████▍ | 18521/22095 [32:05:22<5:36:16, 5.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68524 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54091 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18522/22095 [32:05:26<5:09:04, 5.19s/it] {'loss': 0.324, 'grad_norm': 0.6178113786648095, 'learning_rate': 6.706060881680432e-07, 'epoch': 0.84} 84%|████████▍ | 18522/22095 [32:05:26<5:09:04, 5.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (92677 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44780 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18523/22095 [32:05:36<6:27:37, 6.51s/it] {'loss': 0.4676, 'grad_norm': 0.27607898173326817, 'learning_rate': 6.702394881062974e-07, 'epoch': 0.84} 84%|████████▍ | 18523/22095 [32:05:36<6:27:37, 6.51s/it] 84%|████████▍ | 18524/22095 [32:05:39<5:40:12, 5.72s/it] {'loss': 0.3249, 'grad_norm': 0.6041009697924854, 'learning_rate': 6.698729810778065e-07, 'epoch': 0.84} 84%|████████▍ | 18524/22095 [32:05:39<5:40:12, 5.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (64718 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43813 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43616 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18525/22095 [32:06:03<10:57:39, 11.05s/it] {'loss': 0.2971, 'grad_norm': 0.6633498696162133, 'learning_rate': 6.695065670904477e-07, 'epoch': 0.84} 84%|████████▍ | 18525/22095 [32:06:03<10:57:39, 11.05s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50821 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18526/22095 [32:06:06<8:37:54, 8.71s/it] {'loss': 0.2991, 'grad_norm': 0.5935907065614485, 'learning_rate': 6.691402461520913e-07, 'epoch': 0.84} 84%|████████▍ | 18526/22095 [32:06:06<8:37:54, 8.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (67723 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67487 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103676 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18527/22095 [32:06:17<9:07:20, 9.20s/it] {'loss': 0.4612, 'grad_norm': 0.25082580423110495, 'learning_rate': 6.687740182706103e-07, 'epoch': 0.84} 84%|████████▍ | 18527/22095 [32:06:17<9:07:20, 9.20s/it] 84%|████████▍ | 18528/22095 [32:06:20<7:20:25, 7.41s/it] {'loss': 0.2703, 'grad_norm': 0.6078515373628862, 'learning_rate': 6.684078834538743e-07, 'epoch': 0.84} 84%|████████▍ | 18528/22095 [32:06:20<7:20:25, 7.41s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18529/22095 [32:06:23<6:12:24, 6.27s/it] {'loss': 0.3062, 'grad_norm': 0.6176734156235348, 'learning_rate': 6.680418417097478e-07, 'epoch': 0.84} 84%|████████▍ | 18529/22095 [32:06:23<6:12:24, 6.27s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18530/22095 [32:06:30<6:13:25, 6.28s/it] {'loss': 0.4671, 'grad_norm': 0.2644695950699044, 'learning_rate': 6.676758930460975e-07, 'epoch': 0.84} 84%|████████▍ | 18530/22095 [32:06:30<6:13:25, 6.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71063 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (40975 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46142 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72128 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18531/22095 [32:06:36<6:20:21, 6.40s/it] {'loss': 0.4595, 'grad_norm': 0.2695615966163874, 'learning_rate': 6.673100374707886e-07, 'epoch': 0.84} 84%|████████▍ | 18531/22095 [32:06:36<6:20:21, 6.40s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [387, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8438120 in VC:s3://internvl-moe-sft-data/. Exception: Image size [387, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 88602, 'image': 'vrdu_texteq/astro-ph.CO/53f6f625-d114-4f40-aae9-0758f57be144.png', 'image_wh': [[387, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'Note that $\\sigma$ remains unaffected.'}]} 84%|████████▍ | 18532/22095 [32:06:40<5:25:39, 5.48s/it] {'loss': 0.2783, 'grad_norm': 0.6319710434142184, 'learning_rate': 6.669442749916782e-07, 'epoch': 0.84} 84%|████████▍ | 18532/22095 [32:06:40<5:25:39, 5.48s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18533/22095 [32:06:47<6:06:01, 6.17s/it] {'loss': 0.4834, 'grad_norm': 0.27180888854367324, 'learning_rate': 6.665786056166274e-07, 'epoch': 0.84} 84%|████████▍ | 18533/22095 [32:06:47<6:06:01, 6.17s/it] 84%|████████▍ | 18534/22095 [32:06:53<5:47:48, 5.86s/it] {'loss': 0.4602, 'grad_norm': 0.263547407515624, 'learning_rate': 6.662130293534941e-07, 'epoch': 0.84} 84%|████████▍ | 18534/22095 [32:06:53<5:47:48, 5.86s/it] 84%|████████▍ | 18535/22095 [32:07:02<6:50:09, 6.91s/it] {'loss': 0.4475, 'grad_norm': 0.28470399398942675, 'learning_rate': 6.658475462101327e-07, 'epoch': 0.84} 84%|████████▍ | 18535/22095 [32:07:02<6:50:09, 6.91s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 84%|████████▍ | 18536/22095 [32:07:06<6:06:57, 6.19s/it] {'loss': 0.2538, 'grad_norm': 0.594393630301785, 'learning_rate': 6.654821561943953e-07, 'epoch': 0.84} 84%|████████▍ | 18536/22095 [32:07:06<6:06:57, 6.19s/it] 84%|████████▍ | 18537/22095 [32:07:10<5:14:35, 5.31s/it] {'loss': 0.28, 'grad_norm': 0.6516799206285943, 'learning_rate': 6.651168593141339e-07, 'epoch': 0.84} 84%|████████▍ | 18537/22095 [32:07:10<5:14:35, 5.31s/it] 84%|████████▍ | 18538/22095 [32:07:13<4:36:29, 4.66s/it] {'loss': 0.3085, 'grad_norm': 0.5994769333314872, 'learning_rate': 6.647516555771988e-07, 'epoch': 0.84} 84%|████████▍ | 18538/22095 [32:07:13<4:36:29, 4.66s/it]Invalidate 
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43266 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18539/22095 [32:07:22<6:00:38, 6.08s/it] {'loss': 0.4624, 'grad_norm': 0.3360962215042519, 'learning_rate': 6.643865449914355e-07, 'epoch': 0.84}
84%|████████▍ | 18540/22095 [32:07:26<5:21:03, 5.42s/it] {'loss': 0.2594, 'grad_norm': 0.553898450685638, 'learning_rate': 6.640215275646889e-07, 'epoch': 0.84}
84%|████████▍ | 18541/22095 [32:07:29<4:35:31, 4.65s/it] {'loss': 0.2977, 'grad_norm': 0.6146330382727423, 'learning_rate': 6.636566033048037e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (45615 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51015 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72699 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89010 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18542/22095 [32:07:33<4:17:58, 4.36s/it] {'loss': 0.3527, 'grad_norm': 0.6534901474622999, 'learning_rate': 6.632917722196186e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (51962 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74359 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18543/22095 [32:07:36<3:51:34, 3.91s/it] {'loss': 0.2602, 'grad_norm': 0.6374263507876112, 'learning_rate': 6.629270343169752e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (93412 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18544/22095 [32:07:39<3:49:11, 3.87s/it] {'loss': 0.2771, 'grad_norm': 0.9073245039519338, 'learning_rate': 6.625623896047101e-07, 'epoch': 0.84}
84%|████████▍ | 18545/22095 [32:07:42<3:33:27, 3.61s/it] {'loss': 0.2847, 'grad_norm': 0.5576977394379002, 'learning_rate': 6.621978380906563e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (46317 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83626 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18546/22095 [32:07:46<3:30:14, 3.55s/it] {'loss': 0.3053, 'grad_norm': 0.6261781506288243, 'learning_rate': 6.618333797826487e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (58774 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58148 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42063 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52308 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70827 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18547/22095 [32:07:55<5:12:44, 5.29s/it] {'loss': 0.4567, 'grad_norm': 0.2783462372789997, 'learning_rate': 6.614690146885189e-07, 'epoch': 0.84}
84%|████████▍ | 18548/22095 [32:07:59<4:55:22, 5.00s/it] {'loss': 0.3023, 'grad_norm': 0.5963827788955498, 'learning_rate': 6.611047428160954e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18549/22095 [32:08:03<4:29:31, 4.56s/it] {'loss': 0.2707, 'grad_norm': 0.5959170023377508, 'learning_rate': 6.60740564173204e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18550/22095 [32:08:12<5:56:26, 6.03s/it] {'loss': 0.4457, 'grad_norm': 0.25713733466545213, 'learning_rate': 6.603764787676703e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (87649 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57286 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47577 > 40960) for 4 sample(s). Truncating to 982 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (50582 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18551/22095 [32:08:19<6:12:39, 6.31s/it] {'loss': 0.4678, 'grad_norm': 0.38063115712027656, 'learning_rate': 6.600124866073199e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 364, but got module 1
84%|████████▍ | 18552/22095 [32:08:23<5:22:42, 5.46s/it] {'loss': 0.3188, 'grad_norm': 0.5948071627450263, 'learning_rate': 6.596485876999714e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18553/22095 [32:08:27<4:56:24, 5.02s/it] {'loss': 0.2585, 'grad_norm': 0.5947740569630174, 'learning_rate': 6.592847820534432e-07, 'epoch': 0.84}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8302777 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
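The "Image size ... is too small. Minimum size is 28" failures above all come from samples whose recorded width or height falls below the loader's 28-pixel minimum. A minimal sketch of a pre-flight check that would flag such samples before training (the function name and structure are illustrative; only the 28-pixel floor comes from the error messages in this log):

```python
MIN_SIDE = 28  # minimum edge length the loader accepts, per the ValueError above

def image_too_small(image_wh, min_side=MIN_SIDE):
    """Return True when either side of a (width, height) pair is below min_side."""
    w, h = image_wh
    return w < min_side or h < min_side

# The failing samples logged above all trip the check:
flags = [image_too_small(wh) for wh in ([0, 0], [232, 24], [437, 23])]
```

Running such a check over the dataset's `image_wh` fields offline would surface these samples once, instead of paying a failed fetch and retry at every epoch.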
Problematic sample: {'image': 'TB1OZi7a0LO8KJjSZFxXXaGEVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n我需要你分析图片,并提取上面所有的文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n满88元包邮\n电子元器件\n电子\n有限\n佐得\n公司\n深圳\n主营ADI\n保证原装\n假一赔十\n专业配单\n一站式采购站'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18554/22095 [32:08:36<6:02:52, 6.15s/it] {'loss': 0.4601, 'grad_norm': 0.26618875327704994, 'learning_rate': 6.589210696755549e-07, 'epoch': 0.84}
84%|████████▍ | 18555/22095 [32:08:45<7:02:17, 7.16s/it] {'loss': 0.4836, 'grad_norm': 0.3881434164089876, 'learning_rate': 6.585574505741188e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (58868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50235 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94717 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18556/22095 [32:08:49<6:07:04, 6.22s/it] {'loss': 0.2967, 'grad_norm': 0.6202801403288642, 'learning_rate': 6.581939247569508e-07, 'epoch': 0.84}
84%|████████▍ | 18557/22095 [32:08:52<5:13:46, 5.32s/it] {'loss': 0.2491, 'grad_norm': 0.5730519871310629, 'learning_rate': 6.578304922318607e-07, 'epoch': 0.84}
84%|████████▍ | 18558/22095 [32:08:56<4:49:16, 4.91s/it] {'loss': 0.3151, 'grad_norm': 0.6228626600596571, 'learning_rate': 6.574671530066557e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18559/22095 [32:09:03<5:24:28, 5.51s/it] {'loss': 0.4816, 'grad_norm': 0.28480024563607853, 'learning_rate': 6.571039070891449e-07, 'epoch': 0.84}
84%|████████▍ | 18560/22095 [32:09:06<4:42:25, 4.79s/it] {'loss': 0.2483, 'grad_norm': 0.7009703026905671, 'learning_rate': 6.567407544871341e-07, 'epoch': 0.84}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950724 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1559, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 2\nB. 3\nC. 4\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:根据题意,AC=12cm,CB=\\frac{2}{3}AC,所以CB=8cm,所以AB=AC+CB=20cm,又D、E分别为AC、AB的中点,所以DE=AE-AD=\\frac{1}{2}(AB-AC)=4cm.即DE=4cm.'}]}
84%|████████▍ | 18561/22095 [32:09:09<4:07:42, 4.21s/it] {'loss': 0.2819, 'grad_norm': 0.6737484038016858, 'learning_rate': 6.56377695208425e-07, 'epoch': 0.84}
84%|████████▍ | 18562/22095 [32:09:13<4:07:14, 4.20s/it] {'loss': 0.3284, 'grad_norm': 0.6495359229581547, 'learning_rate': 6.560147292608177e-07, 'epoch': 0.84}
84%|████████▍ | 18563/22095 [32:09:17<3:52:34, 3.95s/it] {'loss': 0.2784, 'grad_norm': 0.7227420956022628, 'learning_rate': 6.556518566521125e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (66278 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54652 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41186 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (125342 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41946 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18564/22095 [32:09:26<5:30:07, 5.61s/it] {'loss': 0.4648, 'grad_norm': 0.2921568050989983, 'learning_rate': 6.552890773901083e-07, 'epoch': 0.84}
84%|████████▍ | 18565/22095 [32:09:30<5:00:56, 5.12s/it] {'loss': 0.2818, 'grad_norm': 0.9524537477492414, 'learning_rate': 6.54926391482596e-07, 'epoch': 0.84}
84%|████████▍ | 18566/22095 [32:09:35<4:48:05, 4.90s/it] {'loss': 0.3514, 'grad_norm': 0.5970172164361539, 'learning_rate': 6.545637989373704e-07, 'epoch': 0.84}
84%|████████▍ | 18567/22095 [32:09:38<4:19:14, 4.41s/it] {'loss': 0.3053, 'grad_norm': 0.6591626744459652, 'learning_rate': 6.542012997622238e-07, 'epoch': 0.84}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [437, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8420062 in VC:s3://internvl-moe-sft-data/. Exception: Image size [437, 23, 100, 100] is too small. Minimum size is 28.
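The recurring "Token indices sequence length is longer than the specified maximum sequence length" warnings above mean individual samples tokenize past the 40960-token limit. A minimal sketch (not the training script's actual logic) of splitting such samples out before collation; `tokenize` here is a toy whitespace tokenizer standing in for the real one, and only the 40960 limit comes from this log:

```python
def filter_overlong(samples, tokenize, max_len=40960):
    """Split samples into (kept, dropped) lists of (sample, token_count) pairs."""
    kept, dropped = [], []
    for sample in samples:
        n_tokens = len(tokenize(sample))
        target = kept if n_tokens <= max_len else dropped
        target.append((sample, n_tokens))
    return kept, dropped

# Toy whitespace tokenizer stands in for the model's actual tokenizer.
kept, dropped = filter_overlong(["short sample", "tok " * 43266], str.split)
```

Whether to drop, truncate, or chunk the over-length samples is a separate policy decision; the point is to make it explicit rather than let indexing errors surface mid-epoch.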
Problematic sample: {'id': 5223, 'image': 'vrdu_texteq/astro-ph.CO/286dbe9c-b6e9-4860-8f9c-3bdae95f7c01.png', 'image_wh': [[437, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $D$ is the distance to the halo.'}]}
84%|████████▍ | 18568/22095 [32:09:42<4:09:30, 4.24s/it] {'loss': 0.263, 'grad_norm': 0.6245199254392384, 'learning_rate': 6.538388939649442e-07, 'epoch': 0.84}
84%|████████▍ | 18569/22095 [32:09:45<3:53:07, 3.97s/it] {'loss': 0.2426, 'grad_norm': 0.6100214864717944, 'learning_rate': 6.534765815533179e-07, 'epoch': 0.84}
84%|████████▍ | 18570/22095 [32:09:48<3:41:06, 3.76s/it] {'loss': 0.2755, 'grad_norm': 0.6087616477739742, 'learning_rate': 6.531143625351316e-07, 'epoch': 0.84}
84%|████████▍ | 18571/22095 [32:09:51<3:29:13, 3.56s/it] {'loss': 0.2807, 'grad_norm': 0.5870832214071411, 'learning_rate': 6.527522369181655e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18572/22095 [32:10:01<5:13:49, 5.34s/it] {'loss': 0.4648, 'grad_norm': 0.26530881471943285, 'learning_rate': 6.523902047102038e-07, 'epoch': 0.84}
84%|████████▍ | 18573/22095 [32:10:04<4:37:40, 4.73s/it] {'loss': 0.333, 'grad_norm': 0.6748885856919552, 'learning_rate': 6.520282659190241e-07, 'epoch': 0.84}
84%|████████▍ | 18574/22095 [32:10:08<4:22:23, 4.47s/it] {'loss': 0.3334, 'grad_norm': 0.6525134433216766, 'learning_rate': 6.516664205524021e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (44554 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49253 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49022 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44330 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18575/22095 [32:10:11<3:58:26, 4.06s/it] {'loss': 0.3059, 'grad_norm': 0.626439705015834, 'learning_rate': 6.513046686181135e-07, 'epoch': 0.84}
84%|████████▍ | 18576/22095 [32:10:16<4:03:45, 4.16s/it] {'loss': 0.3226, 'grad_norm': 0.6414577042974596, 'learning_rate': 6.509430101239328e-07, 'epoch': 0.84}
84%|████████▍ | 18577/22095 [32:10:19<3:46:03, 3.86s/it] {'loss': 0.3111, 'grad_norm': 0.6366298350867605, 'learning_rate': 6.505814450776299e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (52073 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18578/22095 [32:10:23<3:46:52, 3.87s/it] {'loss': 0.3032, 'grad_norm': 0.6135687883135332, 'learning_rate': 6.502199734869718e-07, 'epoch': 0.84}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307399 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB20eQfjqagSKJjy0FgXXcRqFXa_!!480667565.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n检测图中的文字并告诉我内容。'}, {'from': 'gpt', 'value': '图中所有文字:\nBanglah\n袋袋纸尿片\n帮宝适\n来自英国的呵护\n薄柔婴儿纸尿片\nISO\n9001\nBALYWqsu\n超级金袋\n强儿纸尿片\n★\n0\n+\n超透气\nXL\n24\n香港帮宝适国际有限公司\n超干爽\n超柔软\n纸尿片\n江苏心悦卫生用品有限公司\n24片'}]}
84%|████████▍ | 18579/22095 [32:10:26<3:45:12, 3.84s/it] {'loss': 0.2993, 'grad_norm': 0.6618915855858729, 'learning_rate': 6.498585953597275e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (43034 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60384 > 40960).
Running this sequence through the model will result in indexing errors
84%|████████▍ | 18580/22095 [32:10:30<3:37:36, 3.71s/it] {'loss': 0.2969, 'grad_norm': 0.5810166039177667, 'learning_rate': 6.494973107036628e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18581/22095 [32:10:33<3:30:32, 3.60s/it] {'loss': 0.2933, 'grad_norm': 0.6190871177789631, 'learning_rate': 6.491361195265394e-07, 'epoch': 0.84}
84%|████████▍ | 18582/22095 [32:10:37<3:33:20, 3.64s/it] {'loss': 0.2754, 'grad_norm': 0.571417004000721, 'learning_rate': 6.487750218361172e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47628 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64344 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18583/22095 [32:10:46<5:15:37, 5.39s/it] {'loss': 0.45, 'grad_norm': 0.26255393521512566, 'learning_rate': 6.484140176401565e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18584/22095 [32:10:50<4:36:33, 4.73s/it] {'loss': 0.2464, 'grad_norm': 0.5854899811507577, 'learning_rate': 6.48053106946413e-07, 'epoch': 0.84}
84%|████████▍ | 18585/22095 [32:10:54<4:24:27, 4.52s/it] {'loss': 0.2958, 'grad_norm': 0.685442932062727, 'learning_rate': 6.476922897626431e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18586/22095 [32:11:02<5:28:51, 5.62s/it] {'loss': 0.465, 'grad_norm': 0.25921140126472897, 'learning_rate': 6.47331566096599e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (41970 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79576 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18587/22095 [32:11:06<5:00:40, 5.14s/it] {'loss': 0.3098, 'grad_norm': 0.6404645261086098, 'learning_rate': 6.4697093595603e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18588/22095 [32:11:15<6:19:21, 6.49s/it] {'loss': 0.4794, 'grad_norm': 0.2909442563168596, 'learning_rate': 6.466103993486866e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18589/22095 [32:11:19<5:33:59, 5.72s/it] {'loss': 0.2952, 'grad_norm': 0.6357938988339054, 'learning_rate': 6.462499562823166e-07, 'epoch': 0.84}
84%|████████▍ | 18590/22095 [32:11:23<4:49:03, 4.95s/it] {'loss': 0.2687, 'grad_norm': 0.5806777395557357, 'learning_rate': 6.45889606764663e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
84%|████████▍ | 18591/22095 [32:11:33<6:17:23, 6.46s/it] {'loss': 0.4687, 'grad_norm': 0.2657439730935918, 'learning_rate': 6.455293508034682e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8938310 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 61463, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,直线长度AB=18cm,BC=6cm,D为BC中点,则直线长度AD为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 11cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=18cm,BC=6cm,∴AC=AB-BC=12cm又∵D为BC的中点,∴CD=\\frac{1}{2}BC=3于是AD=AC+CD=12+3=15'}]}
84%|████████▍ | 18592/22095 [32:11:36<5:29:18, 5.64s/it] {'loss': 0.2837, 'grad_norm': 1.2369104990125643, 'learning_rate': 6.451691884064737e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18593/22095 [32:11:40<4:59:09, 5.13s/it] {'loss': 0.3056, 'grad_norm': 0.5736522130921489, 'learning_rate': 6.44809119581421e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18594/22095 [32:11:44<4:38:40, 4.78s/it] {'loss': 0.2915, 'grad_norm': 0.5971583599445661, 'learning_rate': 6.444491443360423e-07, 'epoch': 0.84}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18595/22095 [32:11:52<5:39:40, 5.82s/it] {'loss': 0.4945, 'grad_norm': 0.26921985013098143, 'learning_rate': 6.440892626780742e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18596/22095 [32:11:56<5:08:10, 5.28s/it] {'loss': 0.3517, 'grad_norm': 0.702059806929883, 'learning_rate': 6.437294746152506e-07, 'epoch': 0.84}
84%|████████▍ | 18597/22095 [32:11:59<4:29:26, 4.62s/it] {'loss': 0.2978, 'grad_norm': 0.6959560991057743, 'learning_rate': 6.433697801553018e-07, 'epoch': 0.84}
84%|████████▍ | 18598/22095 [32:12:04<4:20:13, 4.46s/it] {'loss': 0.3127, 'grad_norm': 0.6424117920937337, 'learning_rate': 6.430101793059545e-07, 'epoch': 0.84}
84%|████████▍ | 18599/22095 [32:12:07<3:57:22, 4.07s/it] {'loss': 0.2602, 'grad_norm': 0.6470073503248186, 'learning_rate': 6.426506720749382e-07, 'epoch': 0.84}
84%|████████▍ | 18600/22095 [32:12:11<3:55:20, 4.04s/it] {'loss': 0.2543, 'grad_norm': 0.5961597104114085, 'learning_rate': 6.422912584699753e-07, 'epoch': 0.84}
84%|████████▍ | 18601/22095 [32:12:14<3:45:36, 3.87s/it] {'loss': 0.2955, 'grad_norm': 0.6579959921697064, 'learning_rate': 6.41931938498791e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18602/22095 [32:12:18<3:37:30, 3.74s/it] {'loss': 0.2896, 'grad_norm': 0.5739176031467433, 'learning_rate': 6.415727121691029e-07, 'epoch': 0.84}
84%|████████▍ | 18603/22095 [32:12:21<3:25:34, 3.53s/it] {'loss': 0.3284, 'grad_norm': 0.7145701489626878, 'learning_rate': 6.412135794886326e-07, 'epoch': 0.84}
84%|████████▍ | 18604/22095 [32:12:24<3:14:50,
3.35s/it] {'loss': 0.3477, 'grad_norm': 0.6603734675555657, 'learning_rate': 6.408545404650945e-07, 'epoch': 0.84}
84%|████████▍ | 18605/22095 [32:12:27<3:17:34, 3.40s/it] {'loss': 0.3122, 'grad_norm': 0.60232612770229, 'learning_rate': 6.404955951062058e-07, 'epoch': 0.84}
84%|████████▍ | 18606/22095 [32:12:30<3:06:07, 3.20s/it] {'loss': 0.2737, 'grad_norm': 0.5689447404173003, 'learning_rate': 6.40136743419677e-07, 'epoch': 0.84}
Token indices sequence length is longer than the specified maximum sequence length for this model (119518 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41658 > 40960). Running this sequence through the model will result in indexing errors
84%|████████▍ | 18607/22095 [32:12:33<3:09:44, 3.26s/it] {'loss': 0.2842, 'grad_norm': 0.5942540051798831, 'learning_rate': 6.39777985413218e-07, 'epoch': 0.84}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
84%|████████▍ | 18608/22095 [32:12:37<3:15:03, 3.36s/it] {'loss': 0.236, 'grad_norm': 0.5761743364454467, 'learning_rate': 6.394193210945393e-07, 'epoch': 0.84}
84%|████████▍ | 18609/22095 [32:12:40<3:17:53, 3.41s/it] {'loss': 0.3192, 'grad_norm': 0.6744896562351126, 'learning_rate': 6.390607504713476e-07, 'epoch': 0.84}
84%|████████▍ | 18610/22095 [32:12:44<3:17:12, 3.40s/it] {'loss': 0.2759, 'grad_norm': 0.5559132233571377, 'learning_rate': 6.387022735513465e-07, 'epoch': 0.84}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887274 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [917, 3, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7805473 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [917, 3, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10427, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC=2MC,BC=2CN,由线段的和差得AC-BC=2MC-2NC=2(MC-NC)=2×2=4cm,'}]}
Problematic sample: {'id': '26813', 'image': '51833.jpg', 'image_wh': [[917, 3]], 'conversations': [{'from': 'human', 'value': "\nI am providing an answer to the question below based on the image: \nPlease explain what you think is happening or what story the image is telling. \nHere is my response: \nThe picture showcased is a monochrome backdrop with delicate horizontal waves enveloping the entire scene. These waves are harmoniously arranged and in sync, culminating in a seamless wavy appearance. 
No explicit entities, forms, or superfluous details are detected within the visual space. The entire canvas is uniformly monochrome, nuanced with soft and slightly deeper hues, generating a subtle aquatic texture that imbues a feeling of serenity.\n\nGiven the character of the imagery and its apparent dearth of intricacy, this visual can serve as an adaptable element in various creative contexts such as graphical overlay for content integration, avant-garde artistic initiatives, and multimedia storytelling frameworks.\n\n### Contextual Application:\n\n1. **Visual Component**:\n - **Oceanic Background**: This imagery could function as an oceanic backdrop in diverse artistic undertakings. Artists frequently select such understated aesthetics to uphold a clean and refined demeanor in visual performances, websites, or promotional materials.\n - **Accentuation of Central Figure**: Owing to the absence of distractions presented by the uniform monochrome and horizontal waves, the picture would efficaciously spotlight any central figure, rendering it a perfect stage when the desired emphasis is on the lead subject.\n\n2. **Psychological Resonance**:\n - **Peaceful and Tranquil**: Monochrome is commonly perceived as a tranquil and soothing hue. The periodicity of the horizontal waves might also add to a sensation of harmony and equilibrium, useful in scenarios that necessitate a tranquil, non-disruptive setting.\n\n### Potential Questions and Answers:\n\n1. **Question**: Which elements are visible in the picture?\n **Answer**: The picture does not portray any concrete entities. It is a monochrome canvas with delicate horizontal waves spanning the entirety.\n \n2. **Question**: What is the predominant color of the backdrop?\n **Answer**: The predominant color of the backdrop is monochrome.\n\n3. 
**Question**: How are the horizontal waves organized?\n **Answer**: The horizontal waves in the picture are evenly distributed and oscillate in unison from the left to the right of the frame.\n\n4. **Question**: Can the backdrop be utilized in artistic projects?\n **Answer**: Indeed, the monochrome backdrop with horizontal waves can be integrated into artistic projects to furnish a tranquil and non-disruptive setting.\n\n5. **Question**: Is the picture elaborate?\n **Answer**: No, the picture is remarkably simple and devoid of intricacy, rendering it apt for circumstances where minimalism and lucidity are sought after.\n\n### Analysis:\n\n- This picture embodies avant-garde design tenets, concentrating on elegance and practicality without the exploitation of ornate or convoluted aspects.\n- The consistency and tranquility of the monochrome canvas can be harnessed to fabricate a non-disruptive interface or to direct attention to pivotal visual components situated in the foreground.\n- The understated horizontal waves introduce a suggestion of movement, endowing the scene with dynamics, preventing it from being purely static and devoid of characteristics.\n\nIn summary, the simplicity, serenity, and organized composition render this backdrop adaptable, appropriate in diverse situations where aesthetic sensibilities advocate minimalism and spotlight. \nPlease revise my answer using the image, and if there are any errors, kindly explain why it's incorrect."}, {'from': 'gpt', 'value': 'Your initial description accurately captures the essence of a design, though it describes a scene with horizontal waves in a monochrome setting, which isn\'t what the actual image shows. The real image features vertical stripes on a grey-scale background. Here are the discrepancies:\n\n1. **Orientation of Lines**: You mentioned horizontal waves, whereas the image presents vertical stripes.\n2. 
**Pattern Description**: The term "waves" suggests a curvy pattern, but the image depicts straight, parallel lines.\n3. **Color Detail**: While "monochrome" can imply various shades of a single color, specifying "grey-scale" is more accurate for the actual image.\n4. **Texture Characterization**: The description uses "aquatic texture," which doesn\'t apply to the visual of uniform vertical stripes.\n\nThe description is eloquent and rich in detail, which could be beneficial in contexts where a similar aesthetic is being described. However, for accuracy regarding the provided image, the elements should be revised to reflect vertical stripes on a grey-scale background.'}]} 84%|████████▍ | 18611/22095 [32:12:48<3:34:59, 3.70s/it] {'loss': 0.2971, 'grad_norm': 0.5737130358392764, 'learning_rate': 6.383438903422384e-07, 'epoch': 0.84} 84%|████████▍ | 18611/22095 [32:12:48<3:34:59, 3.70s/it] 84%|████████▍ | 18612/22095 [32:12:52<3:39:58, 3.79s/it] {'loss': 0.3357, 'grad_norm': 0.5801767481601431, 'learning_rate': 6.379856008517249e-07, 'epoch': 0.84} 84%|████████▍ | 18612/22095 [32:12:52<3:39:58, 3.79s/it] 84%|████████▍ | 18613/22095 [32:12:55<3:27:55, 3.58s/it] {'loss': 0.302, 'grad_norm': 0.6083233227030705, 'learning_rate': 6.376274050875031e-07, 'epoch': 0.84} 84%|████████▍ | 18613/22095 [32:12:55<3:27:55, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42700 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56068 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18614/22095 [32:12:58<3:13:13, 3.33s/it] {'loss': 0.3155, 'grad_norm': 0.644686585245738, 'learning_rate': 6.372693030572713e-07, 'epoch': 0.84} 84%|████████▍ | 18614/22095 [32:12:58<3:13:13, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18615/22095 [32:13:07<4:57:56, 5.14s/it] {'loss': 0.5087, 'grad_norm': 0.2862529176463453, 'learning_rate': 6.369112947687228e-07, 'epoch': 0.84} 84%|████████▍ | 18615/22095 [32:13:07<4:57:56, 5.14s/it] 84%|████████▍ | 18616/22095 [32:13:11<4:31:10, 4.68s/it] {'loss': 0.2779, 'grad_norm': 0.6002537971676198, 'learning_rate': 6.365533802295498e-07, 'epoch': 0.84} 84%|████████▍ | 18616/22095 [32:13:11<4:31:10, 4.68s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18617/22095 [32:13:14<4:01:49, 4.17s/it] {'loss': 0.2924, 'grad_norm': 0.6794693559548842, 'learning_rate': 6.361955594474434e-07, 'epoch': 0.84} 84%|████████▍ | 18617/22095 [32:13:14<4:01:49, 4.17s/it] 84%|████████▍ | 18618/22095 [32:13:17<3:47:06, 3.92s/it] {'loss': 0.2633, 'grad_norm': 0.6557389050187346, 'learning_rate': 6.358378324300929e-07, 'epoch': 0.84} 84%|████████▍ | 18618/22095 [32:13:17<3:47:06, 3.92s/it] 84%|████████▍ | 18619/22095 [32:13:21<3:35:38, 3.72s/it] {'loss': 0.2793, 'grad_norm': 0.5913684893750959, 'learning_rate': 6.354801991851839e-07, 'epoch': 0.84} 84%|████████▍ | 18619/22095 [32:13:21<3:35:38, 3.72s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43163 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117914 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18620/22095 [32:13:24<3:30:42, 3.64s/it] {'loss': 0.2813, 'grad_norm': 0.56507780537573, 'learning_rate': 6.351226597203996e-07, 'epoch': 0.84} 84%|████████▍ | 18620/22095 [32:13:24<3:30:42, 3.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18621/22095 [32:13:28<3:29:42, 3.62s/it] {'loss': 0.3006, 'grad_norm': 0.6543157854903586, 'learning_rate': 6.347652140434235e-07, 'epoch': 0.84} 84%|████████▍ | 18621/22095 [32:13:28<3:29:42, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18622/22095 [32:13:39<5:41:51, 5.91s/it] {'loss': 0.4696, 'grad_norm': 0.27062195826958063, 'learning_rate': 6.344078621619388e-07, 'epoch': 0.84} 84%|████████▍ | 18622/22095 [32:13:39<5:41:51, 5.91s/it] 84%|████████▍ | 18623/22095 [32:13:42<5:02:26, 5.23s/it] {'loss': 0.3229, 'grad_norm': 0.5956050072863, 'learning_rate': 6.340506040836186e-07, 'epoch': 0.84} 84%|████████▍ | 18623/22095 [32:13:42<5:02:26, 5.23s/it] 84%|████████▍ | 18624/22095 [32:13:46<4:30:09, 4.67s/it] {'loss': 0.2641, 'grad_norm': 0.6072215620693544, 'learning_rate': 6.336934398161421e-07, 'epoch': 0.84} 84%|████████▍ | 18624/22095 [32:13:46<4:30:09, 4.67s/it] 84%|████████▍ | 18625/22095 [32:13:50<4:14:40, 4.40s/it] {'loss': 0.2568, 'grad_norm': 0.6481129870782569, 'learning_rate': 6.333363693671846e-07, 'epoch': 0.84} 84%|████████▍ | 18625/22095 [32:13:50<4:14:40, 4.40s/it] 84%|████████▍ | 18626/22095 [32:13:53<4:04:53, 4.24s/it] {'loss': 0.311, 'grad_norm': 0.6048251027030022, 'learning_rate': 6.329793927444178e-07, 'epoch': 0.84} 84%|████████▍ | 18626/22095 [32:13:53<4:04:53, 4.24s/it] 84%|████████▍ | 18627/22095 [32:13:56<3:41:00, 3.82s/it] {'loss': 0.3054, 'grad_norm': 0.6737990008870608, 'learning_rate': 6.3262250995551e-07, 'epoch': 0.84} 84%|████████▍ | 18627/22095 [32:13:56<3:41:00, 
3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18628/22095 [32:14:06<5:18:34, 5.51s/it] {'loss': 0.4757, 'grad_norm': 0.27320233517049275, 'learning_rate': 6.322657210081318e-07, 'epoch': 0.84} 84%|████████▍ | 18628/22095 [32:14:06<5:18:34, 5.51s/it] 84%|████████▍ | 18629/22095 [32:14:10<4:54:53, 5.10s/it] {'loss': 0.2854, 'grad_norm': 0.5951501062135266, 'learning_rate': 6.319090259099486e-07, 'epoch': 0.84} 84%|████████▍ | 18629/22095 [32:14:10<4:54:53, 5.10s/it] 84%|████████▍ | 18630/22095 [32:14:13<4:17:40, 4.46s/it] {'loss': 0.2763, 'grad_norm': 0.6803703893987928, 'learning_rate': 6.31552424668625e-07, 'epoch': 0.84} 84%|████████▍ | 18630/22095 [32:14:13<4:17:40, 4.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18631/22095 [32:14:16<3:55:51, 4.09s/it] {'loss': 0.3098, 'grad_norm': 0.606461679888621, 'learning_rate': 6.311959172918225e-07, 'epoch': 0.84} 84%|████████▍ | 18631/22095 [32:14:16<3:55:51, 4.09s/it] 84%|████████▍ | 18632/22095 [32:14:19<3:42:24, 3.85s/it] {'loss': 0.3023, 'grad_norm': 0.6402462679139579, 'learning_rate': 6.308395037872034e-07, 'epoch': 0.84} 84%|████████▍ | 18632/22095 [32:14:19<3:42:24, 3.85s/it] 84%|████████▍ | 18633/22095 [32:14:24<3:49:00, 3.97s/it] {'loss': 0.3131, 'grad_norm': 0.6456806940445331, 'learning_rate': 6.304831841624231e-07, 'epoch': 0.84} 84%|████████▍ | 18633/22095 [32:14:24<3:49:00, 3.97s/it] 84%|████████▍ | 18634/22095 [32:14:28<3:51:12, 4.01s/it] {'loss': 0.3454, 'grad_norm': 0.6475293952371274, 'learning_rate': 6.301269584251402e-07, 'epoch': 0.84} 84%|████████▍ | 18634/22095 [32:14:28<3:51:12, 4.01s/it] 84%|████████▍ | 18635/22095 [32:14:31<3:39:18, 3.80s/it] {'loss': 0.3182, 'grad_norm': 0.6217532563366727, 'learning_rate': 6.297708265830083e-07, 'epoch': 0.84} 84%|████████▍ | 18635/22095 [32:14:31<3:39:18, 3.80s/it] 84%|████████▍ | 18636/22095 [32:14:36<3:58:22, 
4.13s/it] {'loss': 0.2799, 'grad_norm': 0.624619552227697, 'learning_rate': 6.294147886436774e-07, 'epoch': 0.84} 84%|████████▍ | 18636/22095 [32:14:36<3:58:22, 4.13s/it] 84%|████████▍ | 18637/22095 [32:14:40<3:49:26, 3.98s/it] {'loss': 0.2987, 'grad_norm': 0.6023133975470476, 'learning_rate': 6.290588446148005e-07, 'epoch': 0.84} 84%|████████▍ | 18637/22095 [32:14:40<3:49:26, 3.98s/it] 84%|████████▍ | 18638/22095 [32:14:43<3:48:10, 3.96s/it] {'loss': 0.2694, 'grad_norm': 0.6966724938516231, 'learning_rate': 6.287029945040251e-07, 'epoch': 0.84} 84%|████████▍ | 18638/22095 [32:14:43<3:48:10, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83295 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18639/22095 [32:14:47<3:33:01, 3.70s/it] {'loss': 0.2905, 'grad_norm': 0.9611111517466708, 'learning_rate': 6.28347238318997e-07, 'epoch': 0.84} 84%|████████▍ | 18639/22095 [32:14:47<3:33:01, 3.70s/it] 84%|████████▍ | 18640/22095 [32:14:50<3:20:33, 3.48s/it] {'loss': 0.3332, 'grad_norm': 0.6493403691536899, 'learning_rate': 6.279915760673593e-07, 'epoch': 0.84} 84%|████████▍ | 18640/22095 [32:14:50<3:20:33, 3.48s/it] 84%|████████▍ | 18641/22095 [32:14:55<3:53:52, 4.06s/it] {'loss': 0.286, 'grad_norm': 0.5719059711061655, 'learning_rate': 6.276360077567556e-07, 'epoch': 0.84} 84%|████████▍ | 18641/22095 [32:14:55<3:53:52, 4.06s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18642/22095 [32:15:00<4:08:25, 4.32s/it] {'loss': 0.3522, 'grad_norm': 0.6307456637814162, 'learning_rate': 6.27280533394825e-07, 'epoch': 0.84} 84%|████████▍ | 18642/22095 [32:15:00<4:08:25, 4.32s/it] 84%|████████▍ | 18643/22095 [32:15:03<3:56:25, 4.11s/it] {'loss': 0.3227, 'grad_norm': 0.6154099960134782, 'learning_rate': 6.269251529892067e-07, 'epoch': 0.84} 84%|████████▍ | 18643/22095 [32:15:03<3:56:25, 4.11s/it] 
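Several steps above log "Token indices sequence length is longer than the specified maximum sequence length for this model (… > 40960)". A minimal sketch of clipping such samples before they reach the model, assuming token ids arrive as a plain Python list; the helper name `clip_to_context` is hypothetical, not from the training code in this log:

```python
MAX_MODEL_LEN = 40960  # context limit reported in the warnings above

def clip_to_context(token_ids, max_len=MAX_MODEL_LEN):
    """Truncate a token-id sequence to the model's context window.

    Sequences at or below max_len are returned unchanged; longer ones
    are clipped so they cannot cause the out-of-range indexing errors
    the warnings describe.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len]
```

For example, a 42700-token sample like the one warned about above would be clipped to exactly 40960 tokens, while shorter samples pass through untouched.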
84%|████████▍ | 18644/22095 [32:15:07<3:40:13, 3.83s/it] {'loss': 0.2806, 'grad_norm': 0.5971840758665974, 'learning_rate': 6.265698665475362e-07, 'epoch': 0.84} 84%|████████▍ | 18644/22095 [32:15:07<3:40:13, 3.83s/it] 84%|████████▍ | 18645/22095 [32:15:12<3:58:13, 4.14s/it] {'loss': 0.262, 'grad_norm': 0.5660815219881201, 'learning_rate': 6.26214674077446e-07, 'epoch': 0.84} 84%|████████▍ | 18645/22095 [32:15:12<3:58:13, 4.14s/it] 84%|████████▍ | 18646/22095 [32:15:15<3:53:55, 4.07s/it] {'loss': 0.2644, 'grad_norm': 0.5582881587328437, 'learning_rate': 6.258595755865693e-07, 'epoch': 0.84} 84%|████████▍ | 18646/22095 [32:15:15<3:53:55, 4.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18647/22095 [32:15:24<5:04:25, 5.30s/it] {'loss': 0.4854, 'grad_norm': 0.270000743015364, 'learning_rate': 6.255045710825375e-07, 'epoch': 0.84} 84%|████████▍ | 18647/22095 [32:15:24<5:04:25, 5.30s/it] 84%|████████▍ | 18648/22095 [32:15:27<4:36:03, 4.81s/it] {'loss': 0.2698, 'grad_norm': 0.5930373376426961, 'learning_rate': 6.251496605729773e-07, 'epoch': 0.84} 84%|████████▍ | 18648/22095 [32:15:27<4:36:03, 4.81s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18649/22095 [32:15:34<5:15:14, 5.49s/it] {'loss': 0.4542, 'grad_norm': 0.2799876243783329, 'learning_rate': 6.247948440655133e-07, 'epoch': 0.84} 84%|████████▍ | 18649/22095 [32:15:34<5:15:14, 5.49s/it] 84%|████████▍ | 18650/22095 [32:15:38<4:39:10, 4.86s/it] {'loss': 0.2915, 'grad_norm': 0.5833763261664848, 'learning_rate': 6.244401215677709e-07, 'epoch': 0.84} 84%|████████▍ | 18650/22095 [32:15:38<4:39:10, 4.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59934 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18651/22095 [32:15:41<4:17:20, 4.48s/it] {'loss': 0.3143, 'grad_norm': 0.6616761716049248, 'learning_rate': 6.240854930873735e-07, 'epoch': 0.84} 84%|████████▍ | 18651/22095 [32:15:41<4:17:20, 4.48s/it] 84%|████████▍ | 18652/22095 [32:15:45<4:04:06, 4.25s/it] {'loss': 0.3434, 'grad_norm': 0.5763756407201951, 'learning_rate': 6.237309586319378e-07, 'epoch': 0.84} 84%|████████▍ | 18652/22095 [32:15:45<4:04:06, 4.25s/it] 84%|████████▍ | 18653/22095 [32:15:48<3:46:21, 3.95s/it] {'loss': 0.2785, 'grad_norm': 0.6994970687999154, 'learning_rate': 6.233765182090829e-07, 'epoch': 0.84} 84%|████████▍ | 18653/22095 [32:15:48<3:46:21, 3.95s/it] 84%|████████▍ | 18654/22095 [32:15:52<3:45:32, 3.93s/it] {'loss': 0.2635, 'grad_norm': 0.5907761492683824, 'learning_rate': 6.230221718264257e-07, 'epoch': 0.84} 84%|████████▍ | 18654/22095 [32:15:52<3:45:32, 3.93s/it] 84%|████████▍ | 18655/22095 [32:15:57<3:55:10, 4.10s/it] {'loss': 0.3221, 'grad_norm': 0.6022140883838824, 'learning_rate': 6.226679194915791e-07, 'epoch': 0.84} 84%|████████▍ | 18655/22095 [32:15:57<3:55:10, 4.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49905 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44004 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18656/22095 [32:16:01<3:53:48, 4.08s/it] {'loss': 0.2902, 'grad_norm': 0.6181478771378632, 'learning_rate': 6.223137612121538e-07, 'epoch': 0.84} 84%|████████▍ | 18656/22095 [32:16:01<3:53:48, 4.08s/it] 84%|████████▍ | 18657/22095 [32:16:04<3:41:56, 3.87s/it] {'loss': 0.2774, 'grad_norm': 0.6570667306521998, 'learning_rate': 6.219596969957619e-07, 'epoch': 0.84} 84%|████████▍ | 18657/22095 [32:16:04<3:41:56, 3.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18658/22095 [32:16:14<5:22:01, 5.62s/it] {'loss': 0.4801, 'grad_norm': 0.26911455724039257, 'learning_rate': 6.216057268500092e-07, 'epoch': 0.84} 84%|████████▍ | 18658/22095 [32:16:14<5:22:01, 5.62s/it] 84%|████████▍ | 18659/22095 [32:16:23<6:28:24, 6.78s/it] {'loss': 0.4859, 'grad_norm': 0.28515342343215133, 'learning_rate': 6.212518507825027e-07, 'epoch': 0.84} 84%|████████▍ | 18659/22095 [32:16:23<6:28:24, 6.78s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (105237 > 40960). Running this sequence through the model will result in indexing errors 84%|████████▍ | 18660/22095 [32:16:27<5:30:19, 5.77s/it] {'loss': 0.3107, 'grad_norm': 0.6390100496113705, 'learning_rate': 6.208980688008453e-07, 'epoch': 0.84} 84%|████████▍ | 18660/22095 [32:16:27<5:30:19, 5.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (87471 > 40960). 
Running this sequence through the model will result in indexing errors 84%|████████▍ | 18661/22095 [32:16:30<4:47:08, 5.02s/it] {'loss': 0.2957, 'grad_norm': 0.6064606481601544, 'learning_rate': 6.205443809126399e-07, 'epoch': 0.84} 84%|████████▍ | 18661/22095 [32:16:30<4:47:08, 5.02s/it] 84%|████████▍ | 18662/22095 [32:16:34<4:32:08, 4.76s/it] {'loss': 0.3138, 'grad_norm': 0.6515192999358557, 'learning_rate': 6.201907871254836e-07, 'epoch': 0.84} 84%|████████▍ | 18662/22095 [32:16:34<4:32:08, 4.76s/it] 84%|████████▍ | 18663/22095 [32:16:38<4:12:53, 4.42s/it] {'loss': 0.2815, 'grad_norm': 0.5526030700262841, 'learning_rate': 6.198372874469777e-07, 'epoch': 0.84} 84%|████████▍ | 18663/22095 [32:16:38<4:12:53, 4.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 84%|████████▍ | 18664/22095 [32:16:47<5:38:39, 5.92s/it] {'loss': 0.4647, 'grad_norm': 0.27052209067226685, 'learning_rate': 6.194838818847155e-07, 'epoch': 0.84} 84%|████████▍ | 18664/22095 [32:16:47<5:38:39, 5.92s/it] 84%|████████▍ | 18665/22095 [32:16:51<4:58:52, 5.23s/it] {'loss': 0.2587, 'grad_norm': 0.6545133811559477, 'learning_rate': 6.191305704462897e-07, 'epoch': 0.84} 84%|████████▍ | 18665/22095 [32:16:51<4:58:52, 5.23s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365274 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 32015, 'image': 'vrdu_table_final_2/astro-ph.CO/65b19b1e-9fbb-4563-aaf8-0d5f9d5f6e8b.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\boldmath$\\tau$\\end{tabular}\n```"}]} 84%|████████▍ | 18666/22095 [32:16:55<4:36:16, 4.83s/it] {'loss': 0.3043, 'grad_norm': 0.6538530551934313, 'learning_rate': 6.187773531392932e-07, 'epoch': 0.84} 84%|████████▍ | 18666/22095 [32:16:55<4:36:16, 4.83s/it] 84%|████████▍ | 18667/22095 [32:16:58<4:10:45, 4.39s/it] {'loss': 0.282, 'grad_norm': 0.640001109669132, 'learning_rate': 6.184242299713162e-07, 'epoch': 0.84} 84%|████████▍ | 18667/22095 [32:16:58<4:10:45, 4.39s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [253, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8934537 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [253, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57690, 'image': 'images/5362.png', 'image_wh': [[253, 25]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C是AB段上的一个点,D是BC段的中点,如果AB=10,AC=6,AD等于()\nA. 4\nB. 6\nC. 7.5\nD. 
8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 84%|████████▍ | 18668/22095 [32:17:07<5:30:29, 5.79s/it] {'loss': 0.4548, 'grad_norm': 0.25687967698350045, 'learning_rate': 6.180712009499462e-07, 'epoch': 0.84} 84%|████████▍ | 18668/22095 [32:17:07<5:30:29, 5.79s/it] 84%|████████▍ | 18669/22095 [32:17:15<6:14:20, 6.56s/it] {'loss': 0.4615, 'grad_norm': 0.27272353640801006, 'learning_rate': 6.177182660827664e-07, 'epoch': 0.84} 84%|████████▍ | 18669/22095 [32:17:15<6:14:20, 6.56s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 84%|████████▍ | 18670/22095 [32:17:19<5:17:49, 5.57s/it] {'loss': 0.2885, 'grad_norm': 0.561871817929382, 'learning_rate': 6.173654253773631e-07, 'epoch': 0.84} 84%|████████▍ | 18670/22095 [32:17:19<5:17:49, 5.57s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8897249 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 20402, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图所示,D为CB段中点,Cd=3,AB=11,则AC长度为()\nA. 6\nB. 8\nC. 4\nD. 
5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 85%|████████▍ | 18671/22095 [32:17:22<4:42:50, 4.96s/it] {'loss': 0.2925, 'grad_norm': 0.6008742399480839, 'learning_rate': 6.170126788413156e-07, 'epoch': 0.85} 85%|████████▍ | 18671/22095 [32:17:22<4:42:50, 4.96s/it] 85%|████████▍ | 18672/22095 [32:17:25<4:08:13, 4.35s/it] {'loss': 0.3027, 'grad_norm': 0.6377610207407267, 'learning_rate': 6.166600264822054e-07, 'epoch': 0.85} 85%|████████▍ | 18672/22095 [32:17:25<4:08:13, 4.35s/it] 85%|████████▍ | 18673/22095 [32:17:29<4:02:02, 4.24s/it] {'loss': 0.2892, 'grad_norm': 0.625818270946855, 'learning_rate': 6.163074683076081e-07, 'epoch': 0.85} 85%|████████▍ | 18673/22095 [32:17:29<4:02:02, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (88012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41949 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51241 > 40960). 
Running this sequence through the model will result in indexing errors 85%|████████▍ | 18674/22095 [32:17:33<3:54:37, 4.12s/it] {'loss': 0.3326, 'grad_norm': 0.8744926735664028, 'learning_rate': 6.159550043251006e-07, 'epoch': 0.85} 85%|████████▍ | 18674/22095 [32:17:33<3:54:37, 4.12s/it] 85%|████████▍ | 18675/22095 [32:17:37<3:47:03, 3.98s/it] {'loss': 0.2761, 'grad_norm': 0.6022902779539743, 'learning_rate': 6.156026345422539e-07, 'epoch': 0.85} 85%|████████▍ | 18675/22095 [32:17:37<3:47:03, 3.98s/it] 85%|████████▍ | 18676/22095 [32:17:39<3:27:04, 3.63s/it] {'loss': 0.2864, 'grad_norm': 0.7967324838798232, 'learning_rate': 6.152503589666426e-07, 'epoch': 0.85} 85%|████████▍ | 18676/22095 [32:17:39<3:27:04, 3.63s/it] 85%|████████▍ | 18677/22095 [32:17:42<3:15:16, 3.43s/it] {'loss': 0.2881, 'grad_norm': 0.7098888607833619, 'learning_rate': 6.148981776058344e-07, 'epoch': 0.85} 85%|████████▍ | 18677/22095 [32:17:42<3:15:16, 3.43s/it] 85%|████████▍ | 18678/22095 [32:17:45<3:06:24, 3.27s/it] {'loss': 0.2999, 'grad_norm': 0.6371350015862445, 'learning_rate': 6.14546090467395e-07, 'epoch': 0.85} 85%|████████▍ | 18678/22095 [32:17:45<3:06:24, 3.27s/it] 85%|████████▍ | 18679/22095 [32:17:49<3:08:24, 3.31s/it] {'loss': 0.312, 'grad_norm': 0.649784256762448, 'learning_rate': 6.141940975588917e-07, 'epoch': 0.85} 85%|████████▍ | 18679/22095 [32:17:49<3:08:24, 3.31s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 85%|████████▍ | 18680/22095 [32:17:52<3:08:44, 3.32s/it] {'loss': 0.2819, 'grad_norm': 0.6282063633236151, 'learning_rate': 6.138421988878884e-07, 'epoch': 0.85} 85%|████████▍ | 18680/22095 [32:17:52<3:08:44, 3.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46700 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43607 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49005 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45293 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107314 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48214 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41857 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56903 > 40960). 
Running this sequence through the model will result in indexing errors 85%|████████▍ | 18681/22095 [32:17:55<3:03:34, 3.23s/it] {'loss': 0.2983, 'grad_norm': 0.600461964804082, 'learning_rate': 6.134903944619447e-07, 'epoch': 0.85} 85%|████████▍ | 18681/22095 [32:17:55<3:03:34, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 85%|████████▍ | 18682/22095 [32:18:00<3:27:13, 3.64s/it] {'loss': 0.4629, 'grad_norm': 0.24963576277311014, 'learning_rate': 6.131386842886194e-07, 'epoch': 0.85} 85%|████████▍ | 18682/22095 [32:18:00<3:27:13, 3.64s/it] 85%|████████▍ | 18683/22095 [32:18:03<3:19:17, 3.50s/it] {'loss': 0.234, 'grad_norm': 0.6689409334850602, 'learning_rate': 6.127870683754717e-07, 'epoch': 0.85} 85%|████████▍ | 18683/22095 [32:18:03<3:19:17, 3.50s/it] 85%|████████▍ | 18684/22095 [32:18:06<3:13:30, 3.40s/it] {'loss': 0.3267, 'grad_norm': 0.671592567687221, 'learning_rate': 6.124355467300558e-07, 'epoch': 0.85} 85%|████████▍ | 18684/22095 [32:18:06<3:13:30, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74038 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▍ | 18685/22095 [32:18:09<3:10:03, 3.34s/it] {'loss': 0.3232, 'grad_norm': 0.6113351501273673, 'learning_rate': 6.120841193599231e-07, 'epoch': 0.85} 85%|████████▍ | 18685/22095 [32:18:09<3:10:03, 3.34s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50214 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58619 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81070 > 40960). 
Running this sequence through the model will result in indexing errors 85%|████████▍ | 18686/22095 [32:18:14<3:28:54, 3.68s/it] {'loss': 0.3147, 'grad_norm': 0.9509336152397326, 'learning_rate': 6.11732786272628e-07, 'epoch': 0.85} 85%|████████▍ | 18686/22095 [32:18:14<3:28:54, 3.68s/it] 85%|████████▍ | 18687/22095 [32:18:17<3:27:14, 3.65s/it] {'loss': 0.2984, 'grad_norm': 0.6433261864621458, 'learning_rate': 6.113815474757162e-07, 'epoch': 0.85} 85%|████████▍ | 18687/22095 [32:18:17<3:27:14, 3.65s/it] 85%|████████▍ | 18688/22095 [32:18:21<3:20:48, 3.54s/it] {'loss': 0.3001, 'grad_norm': 0.6026784560014598, 'learning_rate': 6.110304029767372e-07, 'epoch': 0.85} 85%|████████▍ | 18688/22095 [32:18:21<3:20:48, 3.54s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 85%|████████▍ | 18689/22095 [32:18:31<5:20:42, 5.65s/it] {'loss': 0.4963, 'grad_norm': 0.2685922247370795, 'learning_rate': 6.106793527832344e-07, 'epoch': 0.85} 85%|████████▍ | 18689/22095 [32:18:31<5:20:42, 5.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (108746 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44324 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46774 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43687 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47111 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107853 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44614 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▍ | 18690/22095 [32:18:38<5:36:54, 5.94s/it] {'loss': 0.4648, 'grad_norm': 0.26093778596229505, 'learning_rate': 6.103283969027524e-07, 'epoch': 0.85} 85%|████████▍ | 18690/22095 [32:18:38<5:36:54, 5.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 85%|████████▍ | 18691/22095 [32:18:42<5:07:42, 5.42s/it] {'loss': 0.2752, 'grad_norm': 0.6104503654019943, 'learning_rate': 6.099775353428306e-07, 'epoch': 0.85} 85%|████████▍ | 18691/22095 [32:18:42<5:07:42, 5.42s/it] 85%|████████▍ | 18692/22095 [32:18:46<4:41:15, 4.96s/it] {'loss': 0.2844, 'grad_norm': 0.6807608671938857, 'learning_rate': 6.096267681110097e-07, 'epoch': 0.85} 85%|████████▍ | 18692/22095 [32:18:46<4:41:15, 4.96s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 85%|████████▍ | 18693/22095 [32:18:53<5:12:25, 5.51s/it] {'loss': 0.4885, 'grad_norm': 0.27051557821575795, 'learning_rate': 6.092760952148253e-07, 'epoch': 0.85} 85%|████████▍ | 18693/22095 [32:18:53<5:12:25, 5.51s/it] 85%|████████▍ | 18694/22095 [32:19:02<6:20:00, 6.70s/it] {'loss': 0.4542, 'grad_norm': 0.26212738678058434, 'learning_rate': 6.089255166618113e-07, 'epoch': 0.85} 85%|████████▍ | 18694/22095 
[32:19:02<6:20:00, 6.70s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 85%|████████▍ | 18695/22095 [32:19:06<5:34:44, 5.91s/it] {'loss': 0.3503, 'grad_norm': 0.57817303041535, 'learning_rate': 6.085750324595019e-07, 'epoch': 0.85} 85%|████████▍ | 18695/22095 [32:19:06<5:34:44, 5.91s/it] 85%|████████▍ | 18696/22095 [32:19:10<4:51:38, 5.15s/it] {'loss': 0.2785, 'grad_norm': 0.8624097034013176, 'learning_rate': 6.082246426154292e-07, 'epoch': 0.85} 85%|████████▍ | 18696/22095 [32:19:10<4:51:38, 5.15s/it] 85%|████████▍ | 18697/22095 [32:19:14<4:34:52, 4.85s/it] {'loss': 0.3429, 'grad_norm': 0.6297980038757982, 'learning_rate': 6.078743471371207e-07, 'epoch': 0.85} 85%|████████▍ | 18697/22095 [32:19:14<4:34:52, 4.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 85%|████████▍ | 18698/22095 [32:19:23<5:53:44, 6.25s/it] {'loss': 0.4822, 'grad_norm': 0.28655257814501794, 'learning_rate': 6.075241460321013e-07, 'epoch': 0.85} 85%|████████▍ | 18698/22095 [32:19:23<5:53:44, 6.25s/it] 85%|████████▍ | 18699/22095 [32:19:26<5:00:51, 5.32s/it] {'loss': 0.3127, 'grad_norm': 0.6108057426107879, 'learning_rate': 6.071740393078995e-07, 'epoch': 0.85} 85%|████████▍ | 18699/22095 [32:19:26<5:00:51, 5.32s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [250, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8428326 in VC:s3://internvl-moe-sft-data/. Exception: Image size [250, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 106614, 'image': 'vrdu_texteq/astro-ph.CO/f3326e6d-f65a-408d-bff2-606bb51a1708.png', 'image_wh': [[250, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'where $x = 1+z$ and'}]}
85%|████████▍ | 18700/22095 [32:19:29<4:21:23, 4.62s/it] {'loss': 0.2779, 'grad_norm': 0.5917538247552456, 'learning_rate': 6.068240269720343e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (88002 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18701/22095 [32:19:32<3:55:12, 4.16s/it] {'loss': 0.264, 'grad_norm': 0.5479848011861002, 'learning_rate': 6.064741090320297e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (49249 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98580 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18702/22095 [32:19:36<3:37:48, 3.85s/it] {'loss': 0.302, 'grad_norm': 0.5713766082902766, 'learning_rate': 6.061242854954014e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918151 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41304, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 32cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
85%|████████▍ | 18703/22095 [32:19:39<3:37:36, 3.85s/it] {'loss': 0.334, 'grad_norm': 0.641789067864487, 'learning_rate': 6.057745563696688e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (46546 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18704/22095 [32:19:43<3:39:36, 3.89s/it] {'loss': 0.3085, 'grad_norm': 0.6392789934629359, 'learning_rate': 6.054249216623437e-07, 'epoch': 0.85}
85%|████████▍ | 18705/22095 [32:19:47<3:28:38, 3.69s/it] {'loss': 0.2877, 'grad_norm': 0.6277894109142051, 'learning_rate': 6.050753813809412e-07, 'epoch': 0.85}
85%|████████▍ | 18706/22095 [32:19:50<3:17:36, 3.50s/it] {'loss': 0.2878, 'grad_norm': 0.6102758916483524, 'learning_rate': 6.04725935532971e-07, 'epoch': 0.85}
85%|████████▍ | 18707/22095 [32:19:53<3:14:13, 3.44s/it] {'loss': 0.2652, 'grad_norm': 0.5987516515555169, 'learning_rate': 6.043765841259402e-07, 'epoch': 0.85}
85%|████████▍ | 18708/22095 [32:19:56<3:06:05, 3.30s/it] {'loss': 0.2849, 'grad_norm': 0.6540286604005203, 'learning_rate': 6.040273271673569e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▍ | 18709/22095 [32:20:06<4:54:10, 5.21s/it] {'loss': 0.4783, 'grad_norm': 0.2626634167184486, 'learning_rate': 6.036781646647261e-07, 'epoch': 0.85}
85%|████████▍ | 18710/22095 [32:20:15<6:08:34, 6.53s/it] {'loss': 0.4351, 'grad_norm': 0.2624905961997942, 'learning_rate': 6.03329096625549e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 364, but got module 1
85%|████████▍ | 18711/22095 [32:20:18<5:12:29, 5.54s/it] {'loss': 0.2784, 'grad_norm': 0.665480812530786, 'learning_rate': 6.029801230573252e-07, 'epoch': 0.85}
85%|████████▍ | 18712/22095 [32:20:28<6:16:11, 6.67s/it] {'loss': 0.4464, 'grad_norm': 0.25667545942246284, 'learning_rate': 6.026312439675553e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▍ | 18713/22095 [32:20:32<5:27:13, 5.81s/it] {'loss': 0.3045, 'grad_norm': 0.6136753121244453, 'learning_rate': 6.022824593637334e-07, 'epoch': 0.85}
85%|████████▍ | 18714/22095 [32:20:35<4:55:29, 5.24s/it] {'loss': 0.2807, 'grad_norm': 0.5794067426638368, 'learning_rate': 6.019337692533556e-07, 'epoch': 0.85}
85%|████████▍ | 18715/22095 [32:20:40<4:43:19, 5.03s/it] {'loss': 0.2631, 'grad_norm': 0.5829253591861213, 'learning_rate': 6.015851736439138e-07, 'epoch': 0.85}
85%|████████▍ | 18716/22095 [32:20:43<4:04:05, 4.33s/it] {'loss': 0.2638, 'grad_norm': 0.5811772762809982, 'learning_rate': 6.01236672542897e-07, 'epoch': 0.85}
85%|████████▍ | 18717/22095 [32:20:46<3:48:00, 4.05s/it] {'loss': 0.2473, 'grad_norm': 0.606423734682036, 'learning_rate': 6.008882659577942e-07, 'epoch': 0.85}
85%|████████▍ | 18718/22095 [32:20:49<3:31:43, 3.76s/it] {'loss': 0.2999, 'grad_norm': 0.6286834947259153, 'learning_rate': 6.005399538960927e-07, 'epoch': 0.85}
85%|████████▍ | 18719/22095 [32:20:54<3:41:47, 3.94s/it] {'loss': 0.2997, 'grad_norm': 0.6016500099893688, 'learning_rate': 6.001917363652759e-07, 'epoch': 0.85}
85%|████████▍ | 18720/22095 [32:20:58<3:42:18, 3.95s/it] {'loss': 0.2978, 'grad_norm': 0.8563674664688892, 'learning_rate': 5.998436133728247e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [637, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8522498 in VC:s3://internvl-moe-sft-data/. Exception: Image size [637, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 64244, 'image': 'vrdu_texteq/astro-ph.CO/1a98b353-3b4e-4e95-a4ae-ab62fa766d5f.png', 'image_wh': [[637, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'The values of $\\lambda_1$ and $\\lambda_2$ are set to be 1300 and 2000'}]}
85%|████████▍ | 18721/22095 [32:21:01<3:31:26, 3.76s/it] {'loss': 0.3001, 'grad_norm': 0.6109050842316257, 'learning_rate': 5.994955849262207e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8333774 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [20, 34, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 383, 'image': 'vrdu_table_final_2/astro-ph.CO/22b3fb85-a525-49bc-8fa7-7a82926820a6.png', 'image_wh': [[20, 34]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c} $\\tilde{b}_i$ \\end{tabular}\n```"}]}
85%|████████▍ | 18722/22095 [32:21:05<3:34:11, 3.81s/it] {'loss': 0.2903, 'grad_norm': 0.6730770713324719, 'learning_rate': 5.991476510329419e-07, 'epoch': 0.85}
85%|████████▍ | 18723/22095 [32:21:08<3:24:00, 3.63s/it] {'loss': 0.266, 'grad_norm': 0.6298080817289243, 'learning_rate': 5.987998117004628e-07, 'epoch': 0.85}
85%|████████▍ | 18724/22095 [32:21:11<3:11:24, 3.41s/it] {'loss': 0.2539, 'grad_norm': 0.6843953732002055, 'learning_rate': 5.984520669362587e-07, 'epoch': 0.85}
85%|████████▍ | 18725/22095 [32:21:15<3:20:36, 3.57s/it] {'loss': 0.3277, 'grad_norm': 0.6452454414382873, 'learning_rate': 5.981044167478017e-07, 'epoch': 0.85}
85%|████████▍ | 18726/22095 [32:21:18<3:16:56, 3.51s/it] {'loss': 0.2997, 'grad_norm': 0.6321163182383871, 'learning_rate': 5.977568611425621e-07, 'epoch': 0.85}
85%|████████▍ | 18727/22095 [32:21:22<3:19:35, 3.56s/it] {'loss': 0.3082, 'grad_norm': 0.6705720958675141, 'learning_rate': 5.974094001280056e-07, 'epoch': 0.85}
85%|████████▍ | 18728/22095 [32:21:25<3:08:19, 3.36s/it] {'loss': 0.3023, 'grad_norm': 0.7455710756474468, 'learning_rate': 5.970620337116012e-07, 'epoch': 0.85}
85%|████████▍ | 18729/22095 [32:21:28<3:05:31, 3.31s/it] {'loss': 0.2658, 'grad_norm': 1.577587002978819, 'learning_rate': 5.967147619008096e-07, 'epoch': 0.85}
85%|████████▍ | 18730/22095 [32:21:32<3:18:31, 3.54s/it] {'loss': 0.2914, 'grad_norm': 0.6551931720234587, 'learning_rate': 5.963675847030953e-07, 'epoch': 0.85}
85%|████████▍ | 18731/22095 [32:21:35<3:07:40, 3.35s/it] {'loss': 0.2945, 'grad_norm': 0.6306636264131432, 'learning_rate': 5.960205021259158e-07, 'epoch': 0.85}
85%|████████▍ | 18732/22095 [32:21:38<2:59:39, 3.21s/it] {'loss': 0.3369, 'grad_norm': 0.6513955703366753, 'learning_rate': 5.956735141767306e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (48729 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57017 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45447 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131717 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18733/22095 [32:21:41<3:01:25, 3.24s/it] {'loss': 0.2704, 'grad_norm': 0.6427932218936455, 'learning_rate': 5.953266208629943e-07, 'epoch': 0.85}
85%|████████▍ | 18734/22095 [32:21:45<3:04:43, 3.30s/it] {'loss': 0.2858, 'grad_norm': 0.6024984714834914, 'learning_rate': 5.949798221921616e-07, 'epoch': 0.85}
85%|████████▍ | 18735/22095 [32:21:47<2:58:17, 3.18s/it] {'loss': 0.2667, 'grad_norm': 0.6340217838797076, 'learning_rate': 5.946331181716836e-07, 'epoch': 0.85}
85%|████████▍ | 18736/22095 [32:21:51<2:59:08, 3.20s/it] {'loss': 0.2957, 'grad_norm': 0.6800193111456344, 'learning_rate': 5.942865088090088e-07, 'epoch': 0.85}
85%|████████▍ | 18737/22095 [32:21:54<2:55:50, 3.14s/it] {'loss': 0.2763, 'grad_norm': 0.6755842385766843, 'learning_rate': 5.939399941115859e-07, 'epoch': 0.85}
85%|████████▍ | 18738/22095 [32:21:57<2:56:17, 3.15s/it] {'loss': 0.3231, 'grad_norm': 0.5660382128890884, 'learning_rate': 5.935935740868614e-07, 'epoch': 0.85}
85%|████████▍ | 18739/22095 [32:22:00<2:52:18, 3.08s/it] {'loss': 0.289, 'grad_norm': 0.8721972221487893, 'learning_rate': 5.93247248742278e-07, 'epoch': 0.85}
85%|████████▍ | 18740/22095 [32:22:03<2:50:57, 3.06s/it] {'loss': 0.2935, 'grad_norm': 0.5421885890616007, 'learning_rate': 5.929010180852756e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▍ | 18741/22095 [32:22:13<4:48:47, 5.17s/it] {'loss': 0.4406, 'grad_norm': 0.2620228273329047, 'learning_rate': 5.925548821232957e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906517 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29670, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nA. 15cm\nB. 16cm\nC. 10cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
85%|████████▍ | 18742/22095 [32:22:16<4:17:28, 4.61s/it] {'loss': 0.2586, 'grad_norm': 0.7154912165594549, 'learning_rate': 5.922088408637743e-07, 'epoch': 0.85}
85%|████████▍ | 18743/22095 [32:22:19<3:50:54, 4.13s/it] {'loss': 0.2971, 'grad_norm': 0.5712037351530368, 'learning_rate': 5.918628943141486e-07, 'epoch': 0.85}
85%|████████▍ | 18744/22095 [32:22:22<3:33:36, 3.82s/it] {'loss': 0.2986, 'grad_norm': 0.6005964579283544, 'learning_rate': 5.915170424818495e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (100344 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73981 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63266 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18745/22095 [32:22:26<3:37:33, 3.90s/it] {'loss': 0.3353, 'grad_norm': 0.6268217753093963, 'learning_rate': 5.911712853743101e-07, 'epoch': 0.85}
85%|████████▍ | 18746/22095 [32:22:29<3:23:04, 3.64s/it] {'loss': 0.2985, 'grad_norm': 0.6444851868878317, 'learning_rate': 5.90825622998959e-07, 'epoch': 0.85}
85%|████████▍ | 18747/22095 [32:22:33<3:19:56, 3.58s/it] {'loss': 0.3326, 'grad_norm': 0.619428402824257, 'learning_rate': 5.90480055363224e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▍ | 18748/22095 [32:22:37<3:23:51, 3.65s/it] {'loss': 0.2932, 'grad_norm': 0.5733968766422702, 'learning_rate': 5.901345824745297e-07, 'epoch': 0.85}
85%|████████▍ | 18749/22095 [32:22:40<3:24:12, 3.66s/it] {'loss': 0.297, 'grad_norm': 0.6356372521086717, 'learning_rate': 5.897892043402986e-07, 'epoch': 0.85}
85%|████████▍ | 18750/22095 [32:22:44<3:18:16, 3.56s/it] {'loss': 0.2889, 'grad_norm': 0.7481215559015781, 'learning_rate': 5.89443920967952e-07, 'epoch': 0.85}
85%|████████▍ | 18751/22095 [32:22:47<3:20:22, 3.60s/it] {'loss': 0.2871, 'grad_norm': 0.7661646171965207, 'learning_rate': 5.890987323649122e-07, 'epoch': 0.85}
85%|████████▍ | 18752/22095 [32:22:50<3:10:05, 3.41s/it] {'loss': 0.2719, 'grad_norm': 0.565257020815254, 'learning_rate': 5.887536385385917e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▍ | 18753/22095 [32:23:00<4:49:01, 5.19s/it] {'loss': 0.4449, 'grad_norm': 0.2543059215630452, 'learning_rate': 5.884086394964067e-07, 'epoch': 0.85}
85%|████████▍ | 18754/22095 [32:23:03<4:17:36, 4.63s/it] {'loss': 0.2935, 'grad_norm': 0.6522712431560104, 'learning_rate': 5.880637352457724e-07, 'epoch': 0.85}
85%|████████▍ | 18755/22095 [32:23:06<3:52:56, 4.18s/it] {'loss': 0.2842, 'grad_norm': 0.7678483583609577, 'learning_rate': 5.87718925794098e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (77437 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18756/22095 [32:23:10<3:45:47, 4.06s/it] {'loss': 0.2841, 'grad_norm': 0.8928166842517539, 'learning_rate': 5.873742111487917e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28.
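The `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` warnings scattered through this log mean some conversations tokenize past the model's 40960-token context and would fail at the embedding lookup. A hedged sketch of a length guard that could run in the data pipeline; `tokenize` here is a stand-in for whatever tokenizer the run actually uses, and `skip_overlong` is a hypothetical name:

```python
MAX_SEQ_LEN = 40960  # the model maximum reported in the warnings above

def skip_overlong(samples, tokenize, max_len=MAX_SEQ_LEN):
    """Yield (sample, n_tokens) only for samples that fit the context window.

    `tokenize` is any callable returning a token list; a real pipeline would
    pass the model tokenizer's encode method here.
    """
    for sample in samples:
        n = len(tokenize(sample["text"]))
        if n > max_len:
            # mirrors the log: sequences past max_len would cause indexing errors
            continue
        yield sample, n
```

Whether to skip, truncate, or re-chunk such samples is a training-data decision; the sketch only shows where the 40960 check would sit.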
[Try #0] Failed to fetch sample 8888452 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 11605, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=10cm,M是AB中点,点N在AB上,NB=2cm,那么线段MN的长为()\nA. 4cm\nB. 3cm\nC. 2cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 85%|████████▍ | 18757/22095 [32:23:13<3:26:55, 3.72s/it] {'loss': 0.2586, 'grad_norm': 0.5877417914147925, 'learning_rate': 5.870295913172625e-07, 'epoch': 0.85} 85%|████████▍ | 18757/22095 [32:23:13<3:26:55, 3.72s/it] 85%|████████▍ | 18758/22095 [32:23:16<3:15:38, 3.52s/it] {'loss': 0.2773, 'grad_norm': 0.6971168951778712, 'learning_rate': 5.866850663069124e-07, 'epoch': 0.85} 85%|████████▍ | 18758/22095 [32:23:16<3:15:38, 3.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 85%|████████▍ | 18759/22095 [32:23:19<3:12:31, 3.46s/it] {'loss': 0.3014, 'grad_norm': 0.7007595502752746, 'learning_rate': 5.863406361251472e-07, 'epoch': 0.85} 85%|████████▍ | 18759/22095 [32:23:19<3:12:31, 3.46s/it] 85%|████████▍ | 18760/22095 [32:23:22<2:59:09, 3.22s/it] {'loss': 0.3526, 'grad_norm': 0.6213943742869298, 'learning_rate': 5.859963007793651e-07, 'epoch': 0.85} 85%|████████▍ | 18760/22095 [32:23:22<2:59:09, 3.22s/it] 85%|████████▍ | 18761/22095 [32:23:25<2:58:33, 3.21s/it] {'loss': 0.2767, 'grad_norm': 0.5838266017627748, 'learning_rate': 5.856520602769667e-07, 'epoch': 0.85} 85%|████████▍ | 18761/22095 [32:23:25<2:58:33, 3.21s/it] 85%|████████▍ | 18762/22095 [32:23:28<2:57:31, 3.20s/it] {'loss': 0.2932, 'grad_norm': 0.5926360558327918, 'learning_rate': 5.853079146253471e-07, 'epoch': 0.85} 85%|████████▍ | 18762/22095 [32:23:28<2:57:31, 3.20s/it] 85%|████████▍ | 18763/22095 [32:23:32<3:11:25, 3.45s/it] {'loss': 0.2734, 'grad_norm': 
0.5563599425447716, 'learning_rate': 5.849638638319027e-07, 'epoch': 0.85} 85%|████████▍ | 18763/22095 [32:23:32<3:11:25, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [84, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8333544 in VC:s3://internvl-moe-sft-data/. Exception: Image size [84, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 152, 'image': 'vrdu_table_final_2/astro-ph.CO/1380e8fb-ff9c-4f47-a52f-5b5762b3e632.png', 'image_wh': [[84, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{l}MMF3\\end{tabular}\n```"}]} 85%|████████▍ | 18764/22095 [32:23:38<3:47:30, 4.10s/it] {'loss': 0.4497, 'grad_norm': 0.2510181119494583, 'learning_rate': 5.846199079040249e-07, 'epoch': 0.85} 85%|████████▍ | 18764/22095 [32:23:38<3:47:30, 4.10s/it] 85%|████████▍ | 18765/22095 [32:23:42<3:48:58, 4.13s/it] {'loss': 0.3179, 'grad_norm': 0.6454068837621051, 'learning_rate': 5.842760468491037e-07, 'epoch': 0.85} 85%|████████▍ | 18765/22095 [32:23:42<3:48:58, 4.13s/it] 85%|████████▍ | 18766/22095 [32:23:45<3:36:51, 3.91s/it] {'loss': 0.3043, 'grad_norm': 0.6272581164047886, 'learning_rate': 5.839322806745285e-07, 'epoch': 0.85} 85%|████████▍ | 18766/22095 [32:23:46<3:36:51, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length 
for this model (51344 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76528 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50506 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▍ | 18767/22095 [32:23:48<3:18:36, 3.58s/it] {'loss': 0.2909, 'grad_norm': 0.654346951581988, 'learning_rate': 5.835886093876863e-07, 'epoch': 0.85} 85%|████████▍ | 18767/22095 [32:23:48<3:18:36, 3.58s/it] 85%|████████▍ | 18768/22095 [32:23:52<3:19:25, 3.60s/it] {'loss': 0.3006, 'grad_norm': 0.5879851108229344, 'learning_rate': 5.832450329959616e-07, 'epoch': 0.85} 85%|████████▍ | 18768/22095 [32:23:52<3:19:25, 3.60s/it] 85%|████████▍ | 18769/22095 [32:23:56<3:33:29, 3.85s/it] {'loss': 0.3012, 'grad_norm': 0.7232430310009791, 'learning_rate': 5.829015515067344e-07, 'epoch': 0.85} 85%|████████▍ | 18769/22095 [32:23:56<3:33:29, 3.85s/it] 85%|████████▍ | 18770/22095 [32:24:00<3:33:55, 3.86s/it] {'loss': 0.2543, 'grad_norm': 0.6953028335043215, 'learning_rate': 5.825581649273881e-07, 'epoch': 0.85} 85%|████████▍ | 18770/22095 [32:24:00<3:33:55, 3.86s/it] 85%|████████▍ | 18771/22095 [32:24:04<3:31:03, 3.81s/it] {'loss': 0.2915, 'grad_norm': 0.6051757432812306, 'learning_rate': 5.822148732652988e-07, 'epoch': 0.85} 85%|████████▍ | 18771/22095 [32:24:04<3:31:03, 3.81s/it] 85%|████████▍ | 18772/22095 [32:24:07<3:19:40, 3.61s/it] {'loss': 0.2894, 'grad_norm': 0.5621090053762571, 'learning_rate': 5.818716765278443e-07, 'epoch': 0.85} 85%|████████▍ | 18772/22095 [32:24:07<3:19:40, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (85218 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (108532 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65293 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104494 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75430 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113337 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▍ | 18773/22095 [32:24:11<3:19:50, 3.61s/it] {'loss': 0.3113, 'grad_norm': 0.6685361801495828, 'learning_rate': 5.815285747223975e-07, 'epoch': 0.85} 85%|████████▍ | 18773/22095 [32:24:11<3:19:50, 3.61s/it] 85%|████████▍ | 18774/22095 [32:24:15<3:37:54, 3.94s/it] {'loss': 0.3037, 'grad_norm': 0.6262082362319686, 'learning_rate': 5.811855678563322e-07, 'epoch': 0.85} 85%|████████▍ | 18774/22095 [32:24:15<3:37:54, 3.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 85%|████████▍ | 18775/22095 [32:24:25<5:19:12, 5.77s/it] {'loss': 0.4543, 'grad_norm': 0.27065950072618833, 'learning_rate': 5.808426559370172e-07, 'epoch': 0.85} 85%|████████▍ | 18775/22095 [32:24:25<5:19:12, 5.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52054 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60573 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45104 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79145 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43081 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46270 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▍ | 18776/22095 [32:24:29<4:45:16, 5.16s/it] {'loss': 0.3205, 'grad_norm': 0.6036759949570691, 'learning_rate': 5.804998389718214e-07, 'epoch': 0.85} 85%|████████▍ | 18776/22095 [32:24:29<4:45:16, 5.16s/it] 85%|████████▍ | 18777/22095 [32:24:32<4:09:38, 4.51s/it] {'loss': 0.2772, 'grad_norm': 0.5681712093767958, 'learning_rate': 5.801571169681108e-07, 'epoch': 0.85} 85%|████████▍ | 18777/22095 [32:24:32<4:09:38, 4.51s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71430 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58221 > 40960). 
Running this sequence through the model will result in indexing errors
85%|████████▍ | 18778/22095 [32:24:36<3:51:43, 4.19s/it] {'loss': 0.2948, 'grad_norm': 0.5971504050818677, 'learning_rate': 5.798144899332486e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (88515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119054 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▍ | 18779/22095 [32:24:39<3:31:08, 3.82s/it] {'loss': 0.2643, 'grad_norm': 0.7151176048313252, 'learning_rate': 5.794719578745972e-07, 'epoch': 0.85}
85%|████████▍ | 18780/22095 [32:24:42<3:21:33, 3.65s/it] {'loss': 0.299, 'grad_norm': 0.684751460375087, 'learning_rate': 5.79129520799519e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18781/22095 [32:24:49<4:19:50, 4.70s/it] {'loss': 0.4719, 'grad_norm': 0.2779298704574024, 'learning_rate': 5.787871787153676e-07, 'epoch': 0.85}
85%|████████▌ | 18782/22095 [32:24:53<4:03:30, 4.41s/it] {'loss': 0.306, 'grad_norm': 0.599828675778426, 'learning_rate': 5.784449316295005e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18783/22095 [32:24:56<3:51:29, 4.19s/it] {'loss': 0.2583, 'grad_norm': 0.6811163672881649, 'learning_rate': 5.781027795492738e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18784/22095 [32:25:04<4:41:55, 5.11s/it] {'loss': 0.4424, 'grad_norm': 0.26523875058027263, 'learning_rate': 5.77760722482037e-07, 'epoch': 0.85}
85%|████████▌ | 18785/22095 [32:25:08<4:28:04, 4.86s/it] {'loss': 0.3334, 'grad_norm': 0.5872002054895826, 'learning_rate': 5.7741876043514e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (75070 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18786/22095 [32:25:17<5:43:31, 6.23s/it] {'loss': 0.4533, 'grad_norm': 0.2891465936058678, 'learning_rate': 5.770768934159315e-07, 'epoch': 0.85}
85%|████████▌ | 18787/22095 [32:25:24<5:44:11, 6.24s/it] {'loss': 0.46, 'grad_norm': 0.27265696093260977, 'learning_rate': 5.767351214317557e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 364, but got module 1
85%|████████▌ | 18788/22095 [32:25:27<4:57:19, 5.39s/it] {'loss': 0.2617, 'grad_norm': 0.606245514986393, 'learning_rate': 5.763934444899577e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (60385 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18789/22095 [32:25:30<4:22:08, 4.76s/it] {'loss': 0.358, 'grad_norm': 0.5748858767760663, 'learning_rate': 5.760518625978778e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (61771 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45521 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52046 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18790/22095 [32:25:33<3:50:33, 4.19s/it] {'loss': 0.28, 'grad_norm': 0.6154813445986127, 'learning_rate': 5.757103757628573e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (67443 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18791/22095 [32:25:36<3:29:56, 3.81s/it] {'loss': 0.2952, 'grad_norm': 0.5935345821580245, 'learning_rate': 5.753689839922321e-07, 'epoch': 0.85}
85%|████████▌ | 18792/22095 [32:25:39<3:12:37, 3.50s/it] {'loss': 0.3154, 'grad_norm': 0.6015336243939349, 'learning_rate': 5.750276872933386e-07, 'epoch': 0.85}
85%|████████▌ | 18793/22095 [32:25:42<3:10:08, 3.45s/it] {'loss': 0.3443, 'grad_norm': 0.6187644617250292, 'learning_rate': 5.746864856735102e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18794/22095 [32:25:49<4:03:46, 4.43s/it] {'loss': 0.461, 'grad_norm': 0.24585577370749073, 'learning_rate': 5.743453791400766e-07, 'epoch': 0.85}
85%|████████▌ | 18795/22095 [32:25:53<3:58:40, 4.34s/it] {'loss': 0.2903, 'grad_norm': 0.6568526354093772, 'learning_rate': 5.740043677003688e-07, 'epoch': 0.85}
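The repeated `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings above mean some samples tokenize to far more than the 40960-token context window. A minimal sketch of pre-filtering such samples at dataset-build time; the whitespace `tokenize` stand-in and the `{"text": ...}` sample shape are assumptions, not the actual pipeline:

```python
# Minimal sketch: drop samples whose tokenized length exceeds the model's
# context window instead of letting them trigger indexing errors later.
MAX_LEN = 40960  # model_max_length reported in the warnings above

def tokenize(text):
    # Stand-in tokenizer: one token per whitespace-separated word.
    return text.split()

def filter_overlong(samples, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by tokenized length."""
    kept, dropped = [], []
    for s in samples:
        n = len(tokenize(s["text"]))
        if n > max_len:
            print(f"sample length {n} > {max_len}; dropping")
            dropped.append(s)
        else:
            kept.append(s)
    return kept, dropped

samples = [{"text": "a " * 10}, {"text": "b " * 50000}]
kept, dropped = filter_overlong(samples)
```

Filtering once up front is cheaper than re-hitting the warning every epoch; truncation is an alternative when the tail of a conversation is dispensable.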
85%|████████▌ | 18796/22095 [32:25:57<3:49:51, 4.18s/it] {'loss': 0.2996, 'grad_norm': 0.6077793753624469, 'learning_rate': 5.736634513617145e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18797/22095 [32:26:07<5:35:32, 6.10s/it] {'loss': 0.4758, 'grad_norm': 0.27141939267184767, 'learning_rate': 5.733226301314381e-07, 'epoch': 0.85}
85%|████████▌ | 18798/22095 [32:26:18<6:53:51, 7.53s/it] {'loss': 0.4898, 'grad_norm': 0.2860800943684531, 'learning_rate': 5.729819040168622e-07, 'epoch': 0.85}
85%|████████▌ | 18799/22095 [32:26:24<6:26:17, 7.03s/it] {'loss': 0.4675, 'grad_norm': 0.2722142642852313, 'learning_rate': 5.72641273025309e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 364, but got module 1
85%|████████▌ | 18800/22095 [32:26:28<5:35:15, 6.10s/it] {'loss': 0.317, 'grad_norm': 0.6485039319914844, 'learning_rate': 5.723007371640965e-07, 'epoch': 0.85}
85%|████████▌ | 18801/22095 [32:26:32<4:59:04, 5.45s/it] {'loss': 0.2897, 'grad_norm': 0.6123542113321424, 'learning_rate': 5.719602964405441e-07, 'epoch': 0.85}
85%|████████▌ | 18802/22095 [32:26:35<4:15:23, 4.65s/it] {'loss': 0.2893, 'grad_norm': 0.6664532371643351, 'learning_rate': 5.716199508619635e-07, 'epoch': 0.85}
85%|████████▌ | 18803/22095 [32:26:39<4:04:22, 4.45s/it] {'loss': 0.3076, 'grad_norm': 0.6345223426266785, 'learning_rate': 5.712797004356707e-07, 'epoch': 0.85}
85%|████████▌ | 18804/22095 [32:26:43<3:55:23, 4.29s/it] {'loss': 0.2654, 'grad_norm': 0.6016173433152902, 'learning_rate': 5.709395451689748e-07, 'epoch': 0.85}
85%|████████▌ | 18805/22095 [32:26:46<3:41:19, 4.04s/it] {'loss': 0.2816, 'grad_norm': 0.5833362501878927, 'learning_rate': 5.705994850691854e-07, 'epoch': 0.85}
85%|████████▌ | 18806/22095 [32:26:50<3:32:40, 3.88s/it] {'loss': 0.3097, 'grad_norm': 0.6211038318307641, 'learning_rate': 5.702595201436101e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18807/22095 [32:26:53<3:17:28, 3.60s/it] {'loss': 0.3061, 'grad_norm': 0.6724082872979813, 'learning_rate': 5.699196503995513e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (63031 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107048 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79894 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18808/22095 [32:26:56<3:20:06, 3.65s/it] {'loss': 0.3068, 'grad_norm': 0.6125127055127121, 'learning_rate': 5.695798758443133e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18809/22095 [32:26:59<3:04:41, 3.37s/it] {'loss': 0.2419, 'grad_norm': 0.656838136626938, 'learning_rate': 5.692401964851985e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884880 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8033, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为AB段顶点,D为BC段中点,AB=20,AD=14,则AC长度为()\nA. 10\nB. 8\nC. 7\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:∵AB=20,AD=14,∴BD=AB-AD=20-14=6,∵D为线段BC的中点,∴BC=2BD=12,∴AC=AB-BC=20-12=8.'}]}
85%|████████▌ | 18810/22095 [32:27:02<2:56:32, 3.22s/it] {'loss': 0.3015, 'grad_norm': 0.5754169823983257, 'learning_rate': 5.689006123295021e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [573, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8438389 in VC:s3://internvl-moe-sft-data/. Exception: Image size [573, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 159586, 'image': 'vrdu_texteq/astro-ph.CO/a91e205a-d222-4e31-871c-895608076936.png', 'image_wh': [[573, 23]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'as in . \nThe color selection criteria for $z \\sim9$ are'}]}
85%|████████▌ | 18811/22095 [32:27:06<3:08:17, 3.44s/it] {'loss': 0.2808, 'grad_norm': 0.6049474580450107, 'learning_rate': 5.685611233845228e-07, 'epoch': 0.85}
85%|████████▌ | 18812/22095 [32:27:09<3:08:05, 3.44s/it] {'loss': 0.3118, 'grad_norm': 0.6100428373955716, 'learning_rate': 5.682217296575554e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (60842 > 40960). Running this sequence through the model will result in indexing errors
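Both tracebacks above fail with `Image size [...] is too small. Minimum size is 28`: one side of the source image (164×26, 573×23) is below the loader's 28 px minimum. A minimal sketch of the same dimension check, usable to screen a manifest before training (the function name is illustrative, not the actual `data_qwen_2.py` API):

```python
# Minimal sketch: reject images whose shorter side is below the loader's
# 28 px minimum, reproducing the ValueError seen in the tracebacks above.
MIN_SIDE = 28

def check_image_size(width, height, min_side=MIN_SIDE):
    """Raise ValueError for undersized images; return True otherwise."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_side}."
        )
    return True
```

Running this over each sample's `image_wh` entry at manifest-build time would surface bad samples once, instead of as mid-epoch fetch retries.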
85%|████████▌ | 18813/22095 [32:27:13<3:05:34, 3.39s/it] {'loss': 0.3305, 'grad_norm': 0.6017512056907653, 'learning_rate': 5.678824311558923e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (116289 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18814/22095 [32:27:16<3:04:07, 3.37s/it] {'loss': 0.3152, 'grad_norm': 0.6388400015965405, 'learning_rate': 5.675432278868221e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047181 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,BC=\\frac{1}{2}AB,D为AC的中点,DC=3cm,则AB的长是()\nA. \\frac{11}{2}cm\nB. 4cm\nC. \\frac{9}{2}cm\nD. 5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
85%|████████▌ | 18815/22095 [32:27:19<2:56:50, 3.23s/it] {'loss': 0.2422, 'grad_norm': 0.5639933085295421, 'learning_rate': 5.672041198576345e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18816/22095 [32:27:28<4:37:48, 5.08s/it] {'loss': 0.4692, 'grad_norm': 0.27213150733841973, 'learning_rate': 5.668651070756176e-07, 'epoch': 0.85}
85%|████████▌ | 18817/22095 [32:27:32<4:10:42, 4.59s/it] {'loss': 0.3034, 'grad_norm': 0.6384639020043845, 'learning_rate': 5.66526189548054e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (91011 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111748 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49567 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84238 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57744 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18818/22095 [32:27:36<4:01:15, 4.42s/it] {'loss': 0.2785, 'grad_norm': 0.6115731159495899, 'learning_rate': 5.661873672822249e-07, 'epoch': 0.85}
85%|████████▌ | 18819/22095 [32:27:39<3:43:47, 4.10s/it] {'loss': 0.2857, 'grad_norm': 0.6277971478222045, 'learning_rate': 5.658486402854136e-07, 'epoch': 0.85}
85%|████████▌ | 18820/22095 [32:27:42<3:28:06, 3.81s/it] {'loss': 0.3337, 'grad_norm': 0.6595635277979729, 'learning_rate': 5.655100085648945e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18821/22095 [32:27:50<4:29:09, 4.93s/it] {'loss': 0.4378, 'grad_norm': 0.27716929417854275, 'learning_rate': 5.651714721279478e-07, 'epoch': 0.85}
85%|████████▌ | 18822/22095 [32:27:53<3:59:52, 4.40s/it] {'loss': 0.3588, 'grad_norm': 0.6675023621782427, 'learning_rate': 5.648330309818451e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18823/22095 [32:27:56<3:43:28, 4.10s/it] {'loss': 0.2927, 'grad_norm': 0.6199954797671143, 'learning_rate': 5.644946851338584e-07, 'epoch': 0.85}
85%|████████▌ | 18824/22095 [32:28:00<3:28:46, 3.83s/it] {'loss': 0.3057, 'grad_norm': 0.6273237859124571, 'learning_rate': 5.641564345912581e-07, 'epoch': 0.85}
85%|████████▌ | 18825/22095 [32:28:03<3:19:48, 3.67s/it] {'loss': 0.2555, 'grad_norm': 0.6171373265787468, 'learning_rate': 5.638182793613134e-07, 'epoch': 0.85}
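The `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` pairs above indicate the loader patches samples whose text lacks an image placeholder. A minimal sketch of such a repair; the literal `<image>` tag is an assumption here, since the actual placeholder string used by the training code is not shown in the log:

```python
# Minimal sketch: if a sample references more images than its first user
# message has placeholder tokens, prepend the missing placeholders so the
# processor can splice image features into the right positions.
PLACEHOLDER = "<image>"  # assumed tag; the real token is not in the log

def fix_image_tokens(message, num_images, placeholder=PLACEHOLDER):
    """Return `message` with exactly enough image placeholders prepended."""
    missing = num_images - message.count(placeholder)
    if missing > 0:
        message = placeholder * missing + "\n" + message
    return message
```

Counting placeholders instead of blindly prepending keeps already-correct samples (one tag per image) unchanged.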
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18826/22095 [32:28:12<4:57:13, 5.46s/it] {'loss': 0.4488, 'grad_norm': 0.2743814316281519, 'learning_rate': 5.634802194512889e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (50560 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18827/22095 [32:28:16<4:23:20, 4.83s/it] {'loss': 0.2773, 'grad_norm': 0.6320920857305001, 'learning_rate': 5.631422548684479e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18828/22095 [32:28:19<4:02:53, 4.46s/it] {'loss': 0.269, 'grad_norm': 0.5809184334187234, 'learning_rate': 5.628043856200543e-07, 'epoch': 0.85}
85%|████████▌ | 18829/22095 [32:28:24<3:56:50, 4.35s/it] {'loss': 0.2843, 'grad_norm': 0.636732723488326, 'learning_rate': 5.624666117133653e-07, 'epoch': 0.85}
85%|████████▌ | 18830/22095 [32:28:28<3:50:55, 4.24s/it] {'loss': 0.2933, 'grad_norm': 0.5666407427293152, 'learning_rate': 5.621289331556413e-07, 'epoch': 0.85}
85%|████████▌ | 18831/22095 [32:28:32<3:52:10, 4.27s/it] {'loss': 0.3025, 'grad_norm': 0.6216220501351425, 'learning_rate': 5.617913499541355e-07, 'epoch': 0.85}
85%|████████▌ | 18832/22095 [32:28:35<3:39:05, 4.03s/it] {'loss': 0.2516, 'grad_norm': 0.6165932173152734, 'learning_rate': 5.614538621161036e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [64, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8369922 in VC:s3://internvl-moe-sft-data/. Exception: Image size [64, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 36674, 'image': 'vrdu_table_final_2/astro-ph.CO/4d6533b1-a95b-46a6-b63b-835d4f9dcf95.png', 'image_wh': [[64, 25]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}l@{}} \\data{CMB} \\\\ $\\,$ \\end{tabular}\n```"}]}
85%|████████▌ | 18833/22095 [32:28:38<3:23:05, 3.74s/it] {'loss': 0.2596, 'grad_norm': 0.5415014831368999, 'learning_rate': 5.611164696487953e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (124443 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81390 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120164 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85398 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44132 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18834/22095 [32:28:42<3:28:16, 3.83s/it] {'loss': 0.2992, 'grad_norm': 0.6254587466014427, 'learning_rate': 5.607791725594619e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18835/22095 [32:28:50<4:21:57, 4.82s/it] {'loss': 0.4804, 'grad_norm': 0.2603911234479618, 'learning_rate': 5.604419708553504e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8304115 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
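The `[Try #0] Failed to fetch sample …` lines above show the dataset logging a bad sample and continuing rather than crashing the run. A minimal sketch of that fault-tolerant `__getitem__` pattern; the class, the `bad` flag, and the resample-on-failure policy are illustrative assumptions, not the actual `data_qwen_2.py` logic:

```python
# Minimal sketch: on a bad sample (e.g. the "Image size ... is too small"
# ValueError above), log the failure and fall back to a different index
# so one corrupt record cannot abort the epoch.
import random

class FaultTolerantDataset:
    def __init__(self, samples, max_tries=3):
        self.samples = samples
        self.max_tries = max_tries

    def load(self, idx):
        # Stand-in for the real fetch/decode step.
        sample = self.samples[idx]
        if sample.get("bad"):
            raise ValueError("Image size is too small.")
        return sample

    def __getitem__(self, idx):
        for attempt in range(self.max_tries):
            try:
                return self.load(idx)
            except ValueError as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {idx}. Exception: {exc}")
                idx = random.randrange(len(self.samples))  # resample another index
        raise RuntimeError("Too many consecutive bad samples")

    def __len__(self):
        return len(self.samples)
```

Resampling keeps batch shapes intact; the trade-off is that persistently bad shards get silently under-sampled, which is why logging each retry (as above) matters.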
Problematic sample: {'image': 'TB1_eYUmiqAXuNjy1XdXXaYcVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to read and retrieve all the words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n®\nnewyu\noppoR11S\n适用于\nop\n秒换\n颜色\nPVC冰膜前后贴纸\n买一送一\n加送软壳+支架'}]}
85%|████████▌ | 18836/22095 [32:28:53<4:00:34, 4.43s/it] {'loss': 0.2681, 'grad_norm': 0.5863409592256777, 'learning_rate': 5.601048645437046e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18837/22095 [32:28:57<3:45:41, 4.16s/it] {'loss': 0.3173, 'grad_norm': 0.5850552431665895, 'learning_rate': 5.597678536317697e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047662 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
85%|████████▌ | 18838/22095 [32:28:59<3:23:33, 3.75s/it] {'loss': 0.2922, 'grad_norm': 0.6171999858745233, 'learning_rate': 5.594309381267882e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8929815 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 52968, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果点C为AD段中点,AB=10cm,BD=4cm,则BC的长度为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881985 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5138, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nA. 8\nB. 4\nC. 6\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
85%|████████▌ | 18839/22095 [32:29:03<3:17:13, 3.63s/it] {'loss': 0.3304, 'grad_norm': 0.6624620574115613, 'learning_rate': 5.590941180359954e-07, 'epoch': 0.85}
85%|████████▌ | 18840/22095 [32:29:06<3:06:02, 3.43s/it] {'loss': 0.282, 'grad_norm': 0.5769346122637378, 'learning_rate': 5.587573933666307e-07, 'epoch': 0.85}
85%|████████▌ | 18841/22095 [32:29:09<3:01:09, 3.34s/it] {'loss': 0.2493, 'grad_norm': 0.5673563829975743, 'learning_rate': 5.584207641259309e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18842/22095 [32:29:18<4:42:32, 5.21s/it] {'loss': 0.4406, 'grad_norm': 0.2607286583379762, 'learning_rate': 5.580842303211275e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (54136 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18843/22095 [32:29:22<4:12:51, 4.67s/it] {'loss': 0.302, 'grad_norm': 0.6296153721567521, 'learning_rate': 5.577477919594504e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (137785 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18844/22095 [32:29:25<3:45:19, 4.16s/it] {'loss': 0.3279, 'grad_norm': 0.599328037123013, 'learning_rate': 5.574114490481303e-07, 'epoch': 0.85}
85%|████████▌ | 18845/22095 [32:29:28<3:26:55, 3.82s/it] {'loss': 0.3385, 'grad_norm': 0.6831492461060868, 'learning_rate': 5.570752015943942e-07, 'epoch': 0.85}
85%|████████▌ | 18846/22095 [32:29:31<3:12:41, 3.56s/it] {'loss': 0.2928, 'grad_norm': 0.5819428657972001, 'learning_rate': 5.56739049605467e-07, 'epoch': 0.85}
85%|████████▌ | 18847/22095 [32:29:34<3:10:17, 3.52s/it] {'loss': 0.2754, 'grad_norm': 0.5730264752763495, 'learning_rate': 5.5640299308857e-07, 'epoch': 0.85}
85%|████████▌ | 18848/22095 [32:29:37<3:00:02, 3.33s/it] {'loss': 0.3011, 'grad_norm': 0.5948009048473426, 'learning_rate': 5.560670320509265e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (42805 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47983 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73904 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86861 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18849/22095 [32:29:41<3:06:08, 3.44s/it] {'loss': 0.265, 'grad_norm': 0.6239783490017728, 'learning_rate': 5.557311664997528e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18850/22095 [32:29:44<3:04:38, 3.41s/it] {'loss': 0.287, 'grad_norm': 0.6317259979875175, 'learning_rate': 5.553953964422681e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18851/22095 [32:29:53<4:25:43, 4.91s/it] {'loss': 0.4627, 'grad_norm': 0.26053981077165644, 'learning_rate': 5.550597218856857e-07, 'epoch': 0.85}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308549 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2hb1fkTnI8KJjSszbXXb4KFXa_!!3191532427.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWould you be able to extract all visible text from this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n®\nA4加厚钢制可切0.2mm铁皮\n白色升级版可切长度320MM\n环美\nHUANMEI\nPAPEE\nB7\nA5\nB5\nA4(210mm×297mm)\nIV\n送\n包邮\n环美牌原工厂正品'}]}
85%|████████▌ | 18852/22095 [32:29:56<4:04:35, 4.53s/it] {'loss': 0.2711, 'grad_norm': 0.5959948035580483, 'learning_rate': 5.547241428372169e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (40985 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18853/22095 [32:30:00<3:49:47, 4.25s/it] {'loss': 0.2901, 'grad_norm': 0.605770568665207, 'learning_rate': 5.543886593040737e-07, 'epoch': 0.85}
85%|████████▌ | 18854/22095 [32:30:03<3:35:22, 3.99s/it] {'loss': 0.2762, 'grad_norm': 0.6075204232134758, 'learning_rate': 5.54053271293466e-07, 'epoch': 0.85}
85%|████████▌ | 18855/22095 [32:30:06<3:24:05, 3.78s/it] {'loss': 0.2638, 'grad_norm': 0.6230506570272137, 'learning_rate': 5.537179788125985e-07, 'epoch': 0.85}
85%|████████▌ | 18856/22095 [32:30:10<3:13:36, 3.59s/it] {'loss': 0.2977, 'grad_norm': 0.644161196347823, 'learning_rate': 5.533827818686749e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18857/22095 [32:30:20<5:10:00, 5.74s/it] {'loss': 0.4676, 'grad_norm': 0.2786588955899518, 'learning_rate': 5.530476804688994e-07, 'epoch': 0.85}
85%|████████▌ | 18858/22095 [32:30:24<4:30:31, 5.01s/it] {'loss': 0.2872, 'grad_norm': 0.6168369356365765, 'learning_rate': 5.527126746204708e-07, 'epoch': 0.85}
85%|████████▌ | 18859/22095 [32:30:27<3:55:26, 4.37s/it] {'loss': 0.2992, 'grad_norm': 0.568447828785355, 'learning_rate': 5.523777643305888e-07, 'epoch': 0.85}
85%|████████▌ | 18860/22095 [32:30:30<3:35:30, 4.00s/it] {'loss': 0.2535, 'grad_norm': 0.6224121972005846, 'learning_rate': 5.520429496064483e-07, 'epoch': 0.85}
Invalidate trace cache @ step 2: expected module 1, but got module 364
85%|████████▌ | 18861/22095 [32:30:38<4:43:52, 5.27s/it] {'loss': 0.4564, 'grad_norm': 0.2702032240620627, 'learning_rate': 5.517082304552446e-07, 'epoch': 0.85}
85%|████████▌ | 18862/22095 [32:30:41<4:13:28, 4.70s/it] {'loss': 0.2937, 'grad_norm': 0.6557325387237996, 'learning_rate': 5.513736068841679e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18863/22095 [32:30:44<3:46:43, 4.21s/it] {'loss': 0.3211, 'grad_norm': 0.6332128454241116, 'learning_rate': 5.510390789004105e-07, 'epoch': 0.85}
85%|████████▌ | 18864/22095 [32:30:48<3:41:38, 4.12s/it] {'loss': 0.3144, 'grad_norm': 0.5827273766818591, 'learning_rate': 5.507046465111598e-07, 'epoch': 0.85}
85%|████████▌ | 18865/22095 [32:30:51<3:26:04, 3.83s/it] {'loss': 0.2749, 'grad_norm': 0.9306378137704379, 'learning_rate': 5.503703097236002e-07, 'epoch': 0.85}
85%|████████▌ | 18866/22095 [32:30:54<3:12:41, 3.58s/it] {'loss': 0.2842, 'grad_norm': 0.6298125599693024, 'learning_rate': 5.500360685449163e-07, 'epoch': 0.85}
85%|████████▌ | 18867/22095 [32:30:58<3:08:23, 3.50s/it] {'loss': 0.2998, 'grad_norm': 0.5290613603715577, 'learning_rate': 5.497019229822914e-07, 'epoch': 0.85}
85%|████████▌ | 18868/22095 [32:31:01<3:07:47, 3.49s/it] {'loss': 0.3622, 'grad_norm': 0.6734853602452973, 'learning_rate': 5.493678730429041e-07, 'epoch': 0.85}
85%|████████▌ | 18869/22095 [32:31:04<2:57:59, 3.31s/it] {'loss': 0.323, 'grad_norm': 0.6189222555767221, 'learning_rate': 5.490339187339317e-07, 'epoch': 0.85}
85%|████████▌ | 18870/22095 [32:31:08<3:06:29, 3.47s/it] {'loss': 0.2794, 'grad_norm': 0.6925699517107268, 'learning_rate': 5.487000600625509e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18871/22095 [32:31:11<3:01:41, 3.38s/it] {'loss': 0.3002, 'grad_norm': 0.5761053277661985, 'learning_rate': 5.483662970359344e-07, 'epoch': 0.85}
Token indices sequence length is longer than the specified maximum sequence length for this model (45201 > 40960). Running this sequence through the model will result in indexing errors
85%|████████▌ | 18872/22095 [32:31:15<3:01:51, 3.39s/it] {'loss': 0.3006, 'grad_norm': 0.6464114513484791, 'learning_rate': 5.480326296612532e-07, 'epoch': 0.85}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
85%|████████▌ | 18873/22095 [32:31:17<2:55:16, 3.26s/it] {'loss': 0.2717, 'grad_norm': 0.6209160449026865, 'learning_rate': 5.476990579456776e-07, 'epoch': 0.85}
85%|████████▌ | 18874/22095 [32:31:21<2:58:34, 3.33s/it] {'loss': 0.2691, 'grad_norm': 0.6411823424903216, 'learning_rate': 5.473655818963758e-07, 'epoch': 0.85}
85%|████████▌ | 18875/22095 [32:31:24<2:52:31, 3.21s/it] {'loss': 0.2636, 'grad_norm': 0.6069517485269621, 'learning_rate': 5.470322015205132e-07, 'epoch': 0.85}
85%|████████▌ | 18876/22095 [32:31:27<2:58:10, 3.32s/it] {'loss': 0.2925, 'grad_norm': 0.6509685693287112, 'learning_rate': 5.466989168252506e-07, 'epoch': 0.85}
85%|████████▌ | 18877/22095 [32:31:31<3:01:31, 3.38s/it] {'loss': 0.3091, 'grad_norm': 0.5782497626693441, 'learning_rate': 5.463657278177526e-07, 'epoch': 0.85}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (106300000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn( 85%|████████▌ | 18878/22095 [32:31:34<3:00:02, 3.36s/it] {'loss': 0.3234, 'grad_norm': 0.6516349786922266, 'learning_rate': 5.460326345051753e-07, 'epoch': 0.85} 85%|████████▌ | 18878/22095 [32:31:34<3:00:02, 3.36s/it] 85%|████████▌ | 18879/22095 [32:31:38<2:57:24, 3.31s/it] {'loss': 0.2522, 'grad_norm': 0.5935709697545233, 'learning_rate': 5.456996368946782e-07, 'epoch': 0.85} 85%|████████▌ | 18879/22095 [32:31:38<2:57:24, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (89802 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61919 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42471 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87893 > 40960). 
Running this sequence through the model will result in indexing errors 85%|████████▌ | 18880/22095 [32:31:44<3:47:56, 4.25s/it] {'loss': 0.4659, 'grad_norm': 0.275509383061996, 'learning_rate': 5.45366734993416e-07, 'epoch': 0.85} 85%|████████▌ | 18880/22095 [32:31:44<3:47:56, 4.25s/it] 85%|████████▌ | 18881/22095 [32:31:47<3:29:52, 3.92s/it] {'loss': 0.2656, 'grad_norm': 0.5750454126126509, 'learning_rate': 5.450339288085404e-07, 'epoch': 0.85} 85%|████████▌ | 18881/22095 [32:31:47<3:29:52, 3.92s/it] 85%|████████▌ | 18882/22095 [32:31:50<3:12:31, 3.60s/it] {'loss': 0.29, 'grad_norm': 0.6455719189966478, 'learning_rate': 5.447012183472027e-07, 'epoch': 0.85} 85%|████████▌ | 18882/22095 [32:31:50<3:12:31, 3.60s/it] 85%|████████▌ | 18883/22095 [32:31:53<3:01:52, 3.40s/it] {'loss': 0.2941, 'grad_norm': 0.6337452815116204, 'learning_rate': 5.443686036165541e-07, 'epoch': 0.85} 85%|████████▌ | 18883/22095 [32:31:53<3:01:52, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (107700 > 40960). Running this sequence through the model will result in indexing errors 85%|████████▌ | 18884/22095 [32:31:56<3:04:31, 3.45s/it] {'loss': 0.3315, 'grad_norm': 0.6011439559025845, 'learning_rate': 5.440360846237397e-07, 'epoch': 0.85} 85%|████████▌ | 18884/22095 [32:31:56<3:04:31, 3.45s/it] 85%|████████▌ | 18885/22095 [32:32:00<3:09:52, 3.55s/it] {'loss': 0.2904, 'grad_norm': 0.6416022117789674, 'learning_rate': 5.437036613759028e-07, 'epoch': 0.85} 85%|████████▌ | 18885/22095 [32:32:00<3:09:52, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41671 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49144 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46507 > 40960) for 4 sample(s). Truncating to 1652 with 2 samples. 85%|████████▌ | 18886/22095 [32:32:08<4:09:42, 4.67s/it] {'loss': 0.2722, 'grad_norm': 0.5634699989693578, 'learning_rate': 5.433713338801883e-07, 'epoch': 0.85} 85%|████████▌ | 18886/22095 [32:32:08<4:09:42, 4.67s/it] 85%|████████▌ | 18887/22095 [32:32:11<3:46:45, 4.24s/it] {'loss': 0.3061, 'grad_norm': 0.614677168689508, 'learning_rate': 5.43039102143737e-07, 'epoch': 0.85} 85%|████████▌ | 18887/22095 [32:32:11<3:46:45, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 85%|████████▌ | 18888/22095 [32:32:20<5:10:32, 5.81s/it] {'loss': 0.4437, 'grad_norm': 0.2506225998611218, 'learning_rate': 5.427069661736873e-07, 'epoch': 0.85} 85%|████████▌ | 18888/22095 [32:32:20<5:10:32, 5.81s/it] 85%|████████▌ | 18889/22095 [32:32:24<4:40:37, 5.25s/it] {'loss': 0.3086, 'grad_norm': 0.5797397847475295, 'learning_rate': 5.423749259771738e-07, 'epoch': 0.85} 85%|████████▌ | 18889/22095 [32:32:24<4:40:37, 5.25s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047144 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 5\nB. 6\nC. 3\nD. 4\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 85%|████████▌ | 18890/22095 [32:32:27<4:02:34, 4.54s/it] {'loss': 0.2535, 'grad_norm': 0.5874281031572004, 'learning_rate': 5.420429815613343e-07, 'epoch': 0.85} 85%|████████▌ | 18890/22095 [32:32:27<4:02:34, 4.54s/it] 85%|████████▌ | 18891/22095 [32:32:30<3:34:55, 4.02s/it] {'loss': 0.3137, 'grad_norm': 0.6109100658942526, 'learning_rate': 5.41711132933298e-07, 'epoch': 0.85} 85%|████████▌ | 18891/22095 [32:32:30<3:34:55, 4.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18892/22095 [32:32:39<4:52:51, 5.49s/it] {'loss': 0.4826, 'grad_norm': 0.2686910506299942, 'learning_rate': 5.413793801001981e-07, 'epoch': 0.86} 86%|████████▌ | 18892/22095 [32:32:39<4:52:51, 5.49s/it] 86%|████████▌ | 18893/22095 [32:32:42<4:23:27, 4.94s/it] {'loss': 0.291, 'grad_norm': 0.6193657188011698, 'learning_rate': 5.410477230691618e-07, 'epoch': 0.86} 86%|████████▌ | 18893/22095 [32:32:42<4:23:27, 4.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18894/22095 [32:32:52<5:41:21, 6.40s/it] {'loss': 0.4563, 'grad_norm': 0.2658226084268709, 'learning_rate': 5.407161618473139e-07, 'epoch': 0.86} 86%|████████▌ | 18894/22095 [32:32:52<5:41:21, 6.40s/it] 86%|████████▌ | 18895/22095 [32:33:01<6:16:12, 7.05s/it] {'loss': 0.4571, 'grad_norm': 0.2628564515559104, 'learning_rate': 5.403846964417803e-07, 'epoch': 0.86} 86%|████████▌ | 18895/22095 [32:33:01<6:16:12, 7.05s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 86%|████████▌ | 18896/22095 [32:33:04<5:13:46, 5.89s/it] {'loss': 0.3062, 'grad_norm': 0.7782007656221628, 'learning_rate': 5.400533268596841e-07, 
'epoch': 0.86} 86%|████████▌ | 18896/22095 [32:33:04<5:13:46, 5.89s/it] 86%|████████▌ | 18897/22095 [32:33:07<4:30:28, 5.07s/it] {'loss': 0.3558, 'grad_norm': 0.6839910168496154, 'learning_rate': 5.397220531081437e-07, 'epoch': 0.86} 86%|████████▌ | 18897/22095 [32:33:07<4:30:28, 5.07s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51512 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18898/22095 [32:33:11<4:12:21, 4.74s/it] {'loss': 0.2878, 'grad_norm': 0.6251850507046652, 'learning_rate': 5.393908751942773e-07, 'epoch': 0.86} 86%|████████▌ | 18898/22095 [32:33:11<4:12:21, 4.74s/it] 86%|████████▌ | 18899/22095 [32:33:14<3:43:39, 4.20s/it] {'loss': 0.3013, 'grad_norm': 0.7957418093710049, 'learning_rate': 5.390597931252017e-07, 'epoch': 0.86} 86%|████████▌ | 18899/22095 [32:33:14<3:43:39, 4.20s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18900/22095 [32:33:17<3:26:46, 3.88s/it] {'loss': 0.2899, 'grad_norm': 0.638593999873009, 'learning_rate': 5.387288069080298e-07, 'epoch': 0.86} 86%|████████▌ | 18900/22095 [32:33:17<3:26:46, 3.88s/it] 86%|████████▌ | 18901/22095 [32:33:21<3:30:02, 3.95s/it] {'loss': 0.2584, 'grad_norm': 0.6447274806725127, 'learning_rate': 5.383979165498748e-07, 'epoch': 0.86} 86%|████████▌ | 18901/22095 [32:33:21<3:30:02, 3.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18902/22095 [32:33:25<3:28:25, 3.92s/it] {'loss': 0.2995, 'grad_norm': 0.6147292165958476, 'learning_rate': 5.380671220578454e-07, 'epoch': 0.86} 86%|████████▌ | 18902/22095 [32:33:25<3:28:25, 3.92s/it] 86%|████████▌ | 18903/22095 [32:33:29<3:28:04, 3.91s/it] {'loss': 0.314, 'grad_norm': 0.5864371107534302, 'learning_rate': 5.377364234390503e-07, 'epoch': 0.86} 86%|████████▌ | 18903/22095 
[32:33:29<3:28:04, 3.91s/it] 86%|████████▌ | 18904/22095 [32:33:32<3:15:51, 3.68s/it] {'loss': 0.2921, 'grad_norm': 0.7160624991087194, 'learning_rate': 5.374058207005945e-07, 'epoch': 0.86} 86%|████████▌ | 18904/22095 [32:33:32<3:15:51, 3.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67421 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18905/22095 [32:33:35<3:06:11, 3.50s/it] {'loss': 0.2886, 'grad_norm': 0.5965671451551371, 'learning_rate': 5.37075313849581e-07, 'epoch': 0.86} 86%|████████▌ | 18905/22095 [32:33:35<3:06:11, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18906/22095 [32:33:42<3:51:37, 4.36s/it] {'loss': 0.4619, 'grad_norm': 0.25418867193021494, 'learning_rate': 5.367449028931133e-07, 'epoch': 0.86} 86%|████████▌ | 18906/22095 [32:33:42<3:51:37, 4.36s/it] 86%|████████▌ | 18907/22095 [32:33:45<3:38:15, 4.11s/it] {'loss': 0.3046, 'grad_norm': 0.5835242042302965, 'learning_rate': 5.364145878382887e-07, 'epoch': 0.86} 86%|████████▌ | 18907/22095 [32:33:45<3:38:15, 4.11s/it] 86%|████████▌ | 18908/22095 [32:33:49<3:28:37, 3.93s/it] {'loss': 0.2951, 'grad_norm': 0.6649529932271935, 'learning_rate': 5.360843686922068e-07, 'epoch': 0.86} 86%|████████▌ | 18908/22095 [32:33:49<3:28:37, 3.93s/it] 86%|████████▌ | 18909/22095 [32:33:53<3:32:37, 4.00s/it] {'loss': 0.3277, 'grad_norm': 0.6184811126092957, 'learning_rate': 5.357542454619619e-07, 'epoch': 0.86} 86%|████████▌ | 18909/22095 [32:33:53<3:32:37, 4.00s/it] 86%|████████▌ | 18910/22095 [32:33:56<3:19:53, 3.77s/it] {'loss': 0.2856, 'grad_norm': 0.5683251110712597, 'learning_rate': 5.354242181546465e-07, 'epoch': 0.86} 86%|████████▌ | 18910/22095 [32:33:56<3:19:53, 3.77s/it] 86%|████████▌ | 18911/22095 [32:33:59<3:13:31, 3.65s/it] {'loss': 0.276, 'grad_norm': 0.658689800101766, 'learning_rate': 5.350942867773523e-07, 'epoch': 0.86} 86%|████████▌ | 
18911/22095 [32:33:59<3:13:31, 3.65s/it] 86%|████████▌ | 18912/22095 [32:34:02<3:01:15, 3.42s/it] {'loss': 0.3159, 'grad_norm': 0.700714967968001, 'learning_rate': 5.347644513371702e-07, 'epoch': 0.86} 86%|████████▌ | 18912/22095 [32:34:02<3:01:15, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18913/22095 [32:34:11<4:26:00, 5.02s/it] {'loss': 0.4583, 'grad_norm': 0.2668507086692464, 'learning_rate': 5.344347118411863e-07, 'epoch': 0.86} 86%|████████▌ | 18913/22095 [32:34:11<4:26:00, 5.02s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18914/22095 [32:34:14<3:57:18, 4.48s/it] {'loss': 0.2797, 'grad_norm': 0.6919274318584268, 'learning_rate': 5.341050682964844e-07, 'epoch': 0.86} 86%|████████▌ | 18914/22095 [32:34:14<3:57:18, 4.48s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [325, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8522529 in VC:s3://internvl-moe-sft-data/. Exception: Image size [325, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 116253, 'image': 'vrdu_texteq/astro-ph.CO/11af1247-d193-47a5-99c2-c2fbdf6174ea.png', 'image_wh': [[325, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'with restrictions on the $\\alpha_i$:'}]} 86%|████████▌ | 18915/22095 [32:34:18<3:39:55, 4.15s/it] {'loss': 0.2587, 'grad_norm': 0.6035576730213521, 'learning_rate': 5.337755207101486e-07, 'epoch': 0.86} 86%|████████▌ | 18915/22095 [32:34:18<3:39:55, 4.15s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48096 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18916/22095 [32:34:21<3:25:07, 3.87s/it] {'loss': 0.2689, 'grad_norm': 0.6375513840137232, 'learning_rate': 5.334460690892613e-07, 'epoch': 0.86} 86%|████████▌ | 18916/22095 [32:34:21<3:25:07, 3.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047717 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1.5cm\nB. 2cm\nC. 4cm\nD. 
1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 86%|████████▌ | 18917/22095 [32:34:24<3:09:51, 3.58s/it] {'loss': 0.3083, 'grad_norm': 0.7166262430313622, 'learning_rate': 5.331167134408994e-07, 'epoch': 0.86} 86%|████████▌ | 18917/22095 [32:34:24<3:09:51, 3.58s/it] 86%|████████▌ | 18918/22095 [32:34:27<3:01:20, 3.42s/it] {'loss': 0.3031, 'grad_norm': 0.5963358319668006, 'learning_rate': 5.327874537721395e-07, 'epoch': 0.86} 86%|████████▌ | 18918/22095 [32:34:27<3:01:20, 3.42s/it] 86%|████████▌ | 18919/22095 [32:34:31<3:19:50, 3.78s/it] {'loss': 0.3269, 'grad_norm': 0.6120859632707557, 'learning_rate': 5.324582900900587e-07, 'epoch': 0.86} 86%|████████▌ | 18919/22095 [32:34:31<3:19:50, 3.78s/it] 86%|████████▌ | 18920/22095 [32:34:35<3:15:31, 3.69s/it] {'loss': 0.2969, 'grad_norm': 0.6375785644361093, 'learning_rate': 5.321292224017266e-07, 'epoch': 0.86} 86%|████████▌ | 18920/22095 [32:34:35<3:15:31, 3.69s/it] 86%|████████▌ | 18921/22095 [32:34:38<3:07:07, 3.54s/it] {'loss': 0.3002, 'grad_norm': 0.6517198634795168, 'learning_rate': 5.318002507142167e-07, 'epoch': 0.86} 86%|████████▌ | 18921/22095 [32:34:38<3:07:07, 3.54s/it] 86%|████████▌ | 18922/22095 [32:34:42<3:08:09, 3.56s/it] {'loss': 0.295, 'grad_norm': 0.7554083589509392, 'learning_rate': 5.314713750345968e-07, 'epoch': 0.86} 86%|████████▌ | 18922/22095 [32:34:42<3:08:09, 3.56s/it] 86%|████████▌ | 18923/22095 [32:34:45<3:02:42, 3.46s/it] {'loss': 0.331, 'grad_norm': 0.6673993566220832, 'learning_rate': 5.311425953699312e-07, 'epoch': 0.86} 86%|████████▌ | 18923/22095 [32:34:45<3:02:42, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18924/22095 [32:34:54<4:38:47, 5.28s/it] {'loss': 0.452, 'grad_norm': 0.278931041184877, 'learning_rate': 5.30813911727287e-07, 'epoch': 0.86} 86%|████████▌ | 18924/22095 [32:34:54<4:38:47, 5.28s/it] 86%|████████▌ | 18925/22095 [32:34:58<4:17:09, 4.87s/it] {'loss': 0.3178, 
'grad_norm': 0.6127972655988442, 'learning_rate': 5.304853241137264e-07, 'epoch': 0.86} 86%|████████▌ | 18925/22095 [32:34:58<4:17:09, 4.87s/it] 86%|████████▌ | 18926/22095 [32:35:02<4:00:23, 4.55s/it] {'loss': 0.2958, 'grad_norm': 0.61155340954273, 'learning_rate': 5.301568325363088e-07, 'epoch': 0.86} 86%|████████▌ | 18926/22095 [32:35:02<4:00:23, 4.55s/it] 86%|████████▌ | 18927/22095 [32:35:06<3:50:55, 4.37s/it] {'loss': 0.304, 'grad_norm': 0.6030392436805826, 'learning_rate': 5.298284370020923e-07, 'epoch': 0.86} 86%|████████▌ | 18927/22095 [32:35:06<3:50:55, 4.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18928/22095 [32:35:16<5:17:11, 6.01s/it] {'loss': 0.4643, 'grad_norm': 0.2969858127149444, 'learning_rate': 5.295001375181336e-07, 'epoch': 0.86} 86%|████████▌ | 18928/22095 [32:35:16<5:17:11, 6.01s/it] 86%|████████▌ | 18929/22095 [32:35:20<4:41:36, 5.34s/it] {'loss': 0.3374, 'grad_norm': 0.5792598709562329, 'learning_rate': 5.291719340914875e-07, 'epoch': 0.86} 86%|████████▌ | 18929/22095 [32:35:20<4:41:36, 5.34s/it] 86%|████████▌ | 18930/22095 [32:35:23<4:13:53, 4.81s/it] {'loss': 0.3105, 'grad_norm': 0.6383794283244996, 'learning_rate': 5.288438267292057e-07, 'epoch': 0.86} 86%|████████▌ | 18930/22095 [32:35:23<4:13:53, 4.81s/it] 86%|████████▌ | 18931/22095 [32:35:27<3:54:50, 4.45s/it] {'loss': 0.3027, 'grad_norm': 0.6203512639753918, 'learning_rate': 5.285158154383369e-07, 'epoch': 0.86} 86%|████████▌ | 18931/22095 [32:35:27<3:54:50, 4.45s/it] 86%|████████▌ | 18932/22095 [32:35:30<3:31:58, 4.02s/it] {'loss': 0.2773, 'grad_norm': 0.5828768141171531, 'learning_rate': 5.28187900225931e-07, 'epoch': 0.86} 86%|████████▌ | 18932/22095 [32:35:30<3:31:58, 4.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56517 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18933/22095 [32:35:33<3:18:35, 3.77s/it] {'loss': 0.2567, 'grad_norm': 0.6918531275424276, 'learning_rate': 5.27860081099032e-07, 'epoch': 0.86} 86%|████████▌ | 18933/22095 [32:35:33<3:18:35, 3.77s/it] 86%|████████▌ | 18934/22095 [32:35:36<3:04:27, 3.50s/it] {'loss': 0.2882, 'grad_norm': 0.6421699593795226, 'learning_rate': 5.275323580646857e-07, 'epoch': 0.86} 86%|████████▌ | 18934/22095 [32:35:36<3:04:27, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72797 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18935/22095 [32:35:39<3:01:25, 3.44s/it] {'loss': 0.2321, 'grad_norm': 0.68124116019829, 'learning_rate': 5.272047311299333e-07, 'epoch': 0.86} 86%|████████▌ | 18935/22095 [32:35:39<3:01:25, 3.44s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18936/22095 [32:35:43<3:07:50, 3.57s/it] {'loss': 0.2825, 'grad_norm': 1.7539131270402744, 'learning_rate': 5.268772003018124e-07, 'epoch': 0.86} 86%|████████▌ | 18936/22095 [32:35:43<3:07:50, 3.57s/it] 86%|████████▌ | 18937/22095 [32:35:46<2:54:49, 3.32s/it] {'loss': 0.2473, 'grad_norm': 0.6545848551286552, 'learning_rate': 5.26549765587363e-07, 'epoch': 0.86} 86%|████████▌ | 18937/22095 [32:35:46<2:54:49, 3.32s/it] 86%|████████▌ | 18938/22095 [32:35:49<2:57:45, 3.38s/it] {'loss': 0.2917, 'grad_norm': 0.5745584200238323, 'learning_rate': 5.262224269936217e-07, 'epoch': 0.86} 86%|████████▌ | 18938/22095 [32:35:49<2:57:45, 3.38s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18939/22095 [32:35:52<2:50:03, 3.23s/it] {'loss': 0.2557, 'grad_norm': 0.596404529852008, 'learning_rate': 5.258951845276178e-07, 'epoch': 0.86} 86%|████████▌ | 18939/22095 [32:35:52<2:50:03, 
3.23s/it] 86%|████████▌ | 18940/22095 [32:35:56<2:57:17, 3.37s/it] {'loss': 0.3002, 'grad_norm': 0.6719309822426754, 'learning_rate': 5.255680381963856e-07, 'epoch': 0.86} 86%|████████▌ | 18940/22095 [32:35:56<2:57:17, 3.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18941/22095 [32:36:04<4:02:01, 4.60s/it] {'loss': 0.4646, 'grad_norm': 0.30837623461057606, 'learning_rate': 5.252409880069553e-07, 'epoch': 0.86} 86%|████████▌ | 18941/22095 [32:36:04<4:02:01, 4.60s/it] 86%|████████▌ | 18942/22095 [32:36:07<3:39:20, 4.17s/it] {'loss': 0.2939, 'grad_norm': 0.6469260324986573, 'learning_rate': 5.249140339663533e-07, 'epoch': 0.86} 86%|████████▌ | 18942/22095 [32:36:07<3:39:20, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18943/22095 [32:36:18<5:28:25, 6.25s/it] {'loss': 0.4531, 'grad_norm': 0.2768016336533323, 'learning_rate': 5.245871760816029e-07, 'epoch': 0.86} 86%|████████▌ | 18943/22095 [32:36:18<5:28:25, 6.25s/it] 86%|████████▌ | 18944/22095 [32:36:21<4:48:27, 5.49s/it] {'loss': 0.3148, 'grad_norm': 0.6037990091196405, 'learning_rate': 5.24260414359729e-07, 'epoch': 0.86} 86%|████████▌ | 18944/22095 [32:36:22<4:48:27, 5.49s/it] 86%|████████▌ | 18945/22095 [32:36:24<4:08:17, 4.73s/it] {'loss': 0.3031, 'grad_norm': 0.6346860481158199, 'learning_rate': 5.239337488077539e-07, 'epoch': 0.86} 86%|████████▌ | 18945/22095 [32:36:24<4:08:17, 4.73s/it] 86%|████████▌ | 18946/22095 [32:36:28<3:46:15, 4.31s/it] {'loss': 0.329, 'grad_norm': 0.6893880255876333, 'learning_rate': 5.236071794326952e-07, 'epoch': 0.86} 86%|████████▌ | 18946/22095 [32:36:28<3:46:15, 4.31s/it] 86%|████████▌ | 18947/22095 [32:36:32<3:37:01, 4.14s/it] {'loss': 0.2441, 'grad_norm': 0.6232018323259965, 'learning_rate': 5.232807062415691e-07, 'epoch': 0.86} 86%|████████▌ | 18947/22095 [32:36:32<3:37:01, 4.14s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in 
the conversation 86%|████████▌ | 18948/22095 [32:36:34<3:14:41, 3.71s/it] {'loss': 0.2605, 'grad_norm': 0.6015287791266947, 'learning_rate': 5.229543292413919e-07, 'epoch': 0.86} 86%|████████▌ | 18948/22095 [32:36:34<3:14:41, 3.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45577 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42900 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89488 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18949/22095 [32:36:38<3:10:19, 3.63s/it] {'loss': 0.2543, 'grad_norm': 0.6469701975664196, 'learning_rate': 5.226280484391754e-07, 'epoch': 0.86} 86%|████████▌ | 18949/22095 [32:36:38<3:10:19, 3.63s/it] 86%|████████▌ | 18950/22095 [32:36:41<3:06:10, 3.55s/it] {'loss': 0.2346, 'grad_norm': 0.5637765135573233, 'learning_rate': 5.22301863841932e-07, 'epoch': 0.86} 86%|████████▌ | 18950/22095 [32:36:41<3:06:10, 3.55s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18951/22095 [32:36:44<2:54:56, 3.34s/it] {'loss': 0.3188, 'grad_norm': 0.6134828769689903, 'learning_rate': 5.219757754566696e-07, 'epoch': 0.86} 86%|████████▌ | 18951/22095 [32:36:44<2:54:56, 3.34s/it] 86%|████████▌ | 18952/22095 [32:36:47<2:46:04, 3.17s/it] {'loss': 0.2864, 'grad_norm': 0.6391224852971039, 'learning_rate': 5.216497832903927e-07, 'epoch': 0.86} 86%|████████▌ | 18952/22095 [32:36:47<2:46:04, 3.17s/it] 86%|████████▌ | 18953/22095 [32:36:50<2:44:00, 3.13s/it] {'loss': 0.3206, 'grad_norm': 0.5689721588401953, 'learning_rate': 5.213238873501086e-07, 'epoch': 0.86} 86%|████████▌ | 18953/22095 [32:36:50<2:44:00, 
3.13s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8952530 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3365, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 8'}, {'from': 'gpt', 'value': '【解答】解:∵D为线段CB的中点,CD=3,∴BC=2CD=6,∴AC=AB-BC=5.'}]} 86%|████████▌ | 18954/22095 [32:36:53<2:46:56, 3.19s/it] {'loss': 0.3183, 'grad_norm': 0.7111119399235565, 'learning_rate': 5.209980876428195e-07, 'epoch': 0.86} 86%|████████▌ | 18954/22095 [32:36:53<2:46:56, 3.19s/it] 86%|████████▌ | 18955/22095 [32:36:56<2:43:17, 3.12s/it] {'loss': 0.3008, 'grad_norm': 0.7379886429307578, 'learning_rate': 5.206723841755257e-07, 'epoch': 0.86} 86%|████████▌ | 18955/22095 [32:36:56<2:43:17, 3.12s/it] 86%|████████▌ | 18956/22095 [32:36:59<2:40:53, 3.08s/it] {'loss': 0.274, 'grad_norm': 0.6115833659110569, 'learning_rate': 5.203467769552239e-07, 'epoch': 0.86} 86%|████████▌ | 18956/22095 [32:36:59<2:40:53, 3.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41425 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41955 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18957/22095 [32:37:05<3:33:12, 4.08s/it] {'loss': 0.4851, 'grad_norm': 0.24484072528037706, 'learning_rate': 5.200212659889114e-07, 'epoch': 0.86} 86%|████████▌ | 18957/22095 [32:37:05<3:33:12, 4.08s/it] 86%|████████▌ | 18958/22095 [32:37:09<3:20:53, 3.84s/it] {'loss': 0.257, 'grad_norm': 0.6280509556954103, 'learning_rate': 5.196958512835843e-07, 'epoch': 0.86} 86%|████████▌ | 18958/22095 [32:37:09<3:20:53, 3.84s/it] 86%|████████▌ | 18959/22095 [32:37:12<3:13:06, 3.69s/it] {'loss': 0.2908, 'grad_norm': 0.588999369246937, 'learning_rate': 5.193705328462328e-07, 'epoch': 0.86} 86%|████████▌ | 18959/22095 [32:37:12<3:13:06, 3.69s/it] 86%|████████▌ | 18960/22095 [32:37:16<3:23:43, 3.90s/it] {'loss': 0.3257, 'grad_norm': 0.6195333674917924, 'learning_rate': 5.190453106838461e-07, 'epoch': 0.86} 86%|████████▌ | 18960/22095 [32:37:16<3:23:43, 3.90s/it] 86%|████████▌ | 18961/22095 [32:37:19<3:11:02, 3.66s/it] {'loss': 0.318, 'grad_norm': 0.6086824841574335, 'learning_rate': 5.187201848034146e-07, 'epoch': 0.86} 86%|████████▌ | 18961/22095 [32:37:19<3:11:02, 3.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43658 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95264 > 40960). 
Running this sequence through the model will result in indexing errors
 86%|████████▌ | 18962/22095 [32:37:23<3:12:40, 3.69s/it] {'loss': 0.2802, 'grad_norm': 0.58719677522352, 'learning_rate': 5.183951552119227e-07, 'epoch': 0.86}
 86%|████████▌ | 18962/22095 [32:37:23<3:12:40, 3.69s/it]
 86%|████████▌ | 18963/22095 [32:37:27<3:07:56, 3.60s/it] {'loss': 0.327, 'grad_norm': 0.580374303078345, 'learning_rate': 5.180702219163552e-07, 'epoch': 0.86}
 86%|████████▌ | 18963/22095 [32:37:27<3:07:56, 3.60s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047721 in VC:s3://multi-modal/UniGeo/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,在直线l上顺次取A、B、C三点,使得AB=5cm,BC=3cm,如果O是线段AC的中点,那么线段OB长为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm'}, {'from': 'gpt', 'value': '【解答】解:根据上图所示OB=5cm-OA,∵OA=(AB+BC)÷2=4cm,∴OB=1cm.'}]}
 86%|████████▌ | 18964/22095 [32:37:30<3:01:15, 3.47s/it] {'loss': 0.3092, 'grad_norm': 0.6288797524191543, 'learning_rate': 5.177453849236935e-07, 'epoch': 0.86}
 86%|████████▌ | 18964/22095 [32:37:30<3:01:15, 3.47s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
 86%|████████▌ | 18965/22095 [32:37:39<4:26:01, 5.10s/it] {'loss': 0.4658, 'grad_norm': 0.25583946705194716, 'learning_rate': 5.174206442409163e-07, 'epoch': 0.86}
 86%|████████▌ | 18965/22095 [32:37:39<4:26:01, 5.10s/it]
 86%|████████▌ | 18966/22095 [32:37:42<3:54:20, 4.49s/it] {'loss': 0.2621, 'grad_norm': 0.6087906512514828, 'learning_rate': 5.17095999875002e-07, 'epoch': 0.86}
 86%|████████▌ | 18966/22095 [32:37:42<3:54:20, 4.49s/it]
 86%|████████▌ | 18967/22095 [32:37:46<3:47:05, 4.36s/it] {'loss': 0.2717, 'grad_norm': 0.5619462201669044, 'learning_rate': 5.167714518329286e-07, 'epoch': 0.86}
 86%|████████▌ | 18967/22095 [32:37:46<3:47:05, 4.36s/it]
 86%|████████▌ | 18968/22095 [32:37:49<3:22:42, 3.89s/it] {'loss': 0.3056, 'grad_norm': 0.6351139673480072, 'learning_rate': 5.16447000121666e-07, 'epoch': 0.86}
 86%|████████▌ | 18968/22095 [32:37:49<3:22:42, 3.89s/it]
 86%|████████▌ | 18969/22095 [32:37:52<3:21:05, 3.86s/it] {'loss': 0.3107, 'grad_norm': 0.5868098674192496, 'learning_rate': 5.161226447481865e-07, 'epoch': 0.86}
 86%|████████▌ | 18969/22095 [32:37:52<3:21:05, 3.86s/it]
Invalidate trace cache @ step 2: expected module 1, but got module 364
 86%|████████▌ | 18970/22095 [32:38:03<5:07:01, 5.89s/it] {'loss': 0.4457, 'grad_norm': 0.2619592796412879, 'learning_rate': 5.157983857194615e-07, 'epoch': 0.86}
 86%|████████▌ | 18970/22095 [32:38:03<5:07:01, 5.89s/it]
Token indices sequence length is longer than the specified maximum sequence length for this model (88003 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97732 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▌ | 18971/22095 [32:38:07<4:32:34, 5.24s/it] {'loss': 0.2846, 'grad_norm': 0.6057152229425675, 'learning_rate': 5.154742230424575e-07, 'epoch': 0.86} 86%|████████▌ | 18971/22095 [32:38:07<4:32:34, 5.24s/it] 86%|████████▌ | 18972/22095 [32:38:10<3:58:05, 4.57s/it] {'loss': 0.3123, 'grad_norm': 0.6068278424363579, 'learning_rate': 5.151501567241373e-07, 'epoch': 0.86} 86%|████████▌ | 18972/22095 [32:38:10<3:58:05, 4.57s/it] 86%|████████▌ | 18973/22095 [32:38:13<3:30:47, 4.05s/it] {'loss': 0.2686, 'grad_norm': 0.6502329942997074, 'learning_rate': 5.148261867714671e-07, 'epoch': 0.86} 86%|████████▌ | 18973/22095 [32:38:13<3:30:47, 4.05s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8302892 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
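[Editor's note] The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` failures above come from samples whose recorded width or height falls below the model's minimum image side (28 px, consistent with a 14-px vision patch and 2x spatial merging in Qwen2.5-VL). A minimal pre-filter sketch that could screen such samples out before training; the helper name and the `(w, h)` interface are illustrative, not the actual API of `data_qwen_2.py`:

```python
MIN_SIDE = 28  # minimum side length reported by the loader's error message

def is_usable(size, min_side=MIN_SIDE):
    """Return True when both image sides meet the minimum size.

    `size` is a (width, height) pair, as stored in the samples'
    'image_wh' field.
    """
    w, h = size
    return w >= min_side and h >= min_side
```

Applied to the failing MTWI sample above (`'image_wh': [[0, 0]]`), the check returns False, so the sample would be skipped up front instead of raising mid-epoch.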
Problematic sample: {'image': 'TB1Pdo7deEJL1JjSZFGXXa6OXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nIdentify the text in the picture.'}, {'from': 'gpt', 'value': 'All words in the image:\n汽车工业\n欧文\nouwen多个厚度任你选择!\n建筑装修\n工厂耗材\n可定做规格!!!\n品\n全国包邮\n高粘持久\n缓冲抗震\n可开17%增值税专票'}]} 86%|████████▌ | 18974/22095 [32:38:16<3:17:18, 3.79s/it] {'loss': 0.2948, 'grad_norm': 0.6056539004633698, 'learning_rate': 5.145023131914074e-07, 'epoch': 0.86} 86%|████████▌ | 18974/22095 [32:38:16<3:17:18, 3.79s/it] 86%|████████▌ | 18975/22095 [32:38:19<3:07:11, 3.60s/it] {'loss': 0.3182, 'grad_norm': 0.6625029521242054, 'learning_rate': 5.141785359909168e-07, 'epoch': 0.86} 86%|████████▌ | 18975/22095 [32:38:19<3:07:11, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48739 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58385 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111822 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18976/22095 [32:38:23<3:12:24, 3.70s/it] {'loss': 0.2629, 'grad_norm': 0.6228892249909714, 'learning_rate': 5.138548551769512e-07, 'epoch': 0.86} 86%|████████▌ | 18976/22095 [32:38:23<3:12:24, 3.70s/it] 86%|████████▌ | 18977/22095 [32:38:27<3:13:37, 3.73s/it] {'loss': 0.2828, 'grad_norm': 0.5779265648025561, 'learning_rate': 5.135312707564683e-07, 'epoch': 0.86} 86%|████████▌ | 18977/22095 [32:38:27<3:13:37, 3.73s/it] 86%|████████▌ | 18978/22095 [32:38:30<3:13:53, 3.73s/it] {'loss': 0.3255, 'grad_norm': 0.6126297100733569, 'learning_rate': 5.132077827364174e-07, 'epoch': 0.86} 86%|████████▌ | 18978/22095 [32:38:30<3:13:53, 3.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (73242 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106803 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18979/22095 [32:38:40<4:50:22, 5.59s/it] {'loss': 0.4581, 'grad_norm': 0.2882798253765335, 'learning_rate': 5.128843911237525e-07, 'epoch': 0.86} 86%|████████▌ | 18979/22095 [32:38:40<4:50:22, 5.59s/it] 86%|████████▌ | 18980/22095 [32:38:43<4:12:06, 4.86s/it] {'loss': 0.2562, 'grad_norm': 0.5635818665300483, 'learning_rate': 5.125610959254213e-07, 'epoch': 0.86} 86%|████████▌ | 18980/22095 [32:38:44<4:12:06, 4.86s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (92019200 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 86%|████████▌ | 18981/22095 [32:38:47<3:53:09, 4.49s/it] {'loss': 0.305, 'grad_norm': 0.6155989790532402, 'learning_rate': 5.122378971483683e-07, 'epoch': 0.86} 86%|████████▌ | 18981/22095 [32:38:47<3:53:09, 4.49s/it] 86%|████████▌ | 18982/22095 [32:38:50<3:29:58, 4.05s/it] {'loss': 0.3282, 'grad_norm': 0.6168694726468859, 'learning_rate': 5.119147947995401e-07, 'epoch': 0.86} 86%|████████▌ | 18982/22095 [32:38:50<3:29:58, 4.05s/it] 86%|████████▌ | 18983/22095 [32:38:53<3:17:11, 3.80s/it] {'loss': 0.324, 'grad_norm': 0.7878038210236072, 'learning_rate': 5.115917888858802e-07, 'epoch': 0.86} 86%|████████▌ | 18983/22095 [32:38:53<3:17:11, 3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43779 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18984/22095 [32:39:02<4:29:02, 5.19s/it] {'loss': 0.4733, 'grad_norm': 0.2860775231703091, 'learning_rate': 5.112688794143273e-07, 'epoch': 0.86} 86%|████████▌ | 18984/22095 [32:39:02<4:29:02, 5.19s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18985/22095 [32:39:06<4:09:27, 4.81s/it] {'loss': 0.3034, 'grad_norm': 0.584825913982036, 'learning_rate': 5.109460663918192e-07, 'epoch': 0.86} 86%|████████▌ | 18985/22095 [32:39:06<4:09:27, 4.81s/it] 86%|████████▌ | 18986/22095 [32:39:10<3:55:38, 4.55s/it] {'loss': 0.3081, 'grad_norm': 0.634246711145265, 'learning_rate': 5.106233498252927e-07, 'epoch': 0.86} 86%|████████▌ | 18986/22095 [32:39:10<3:55:38, 4.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 18987/22095 [32:39:19<5:16:59, 6.12s/it] {'loss': 0.4738, 'grad_norm': 0.2571638233495209, 'learning_rate': 5.103007297216838e-07, 'epoch': 0.86} 86%|████████▌ | 18987/22095 [32:39:19<5:16:59, 6.12s/it] 86%|████████▌ | 18988/22095 [32:39:23<4:31:12, 5.24s/it] {'loss': 0.2935, 'grad_norm': 0.636572794194953, 'learning_rate': 5.099782060879227e-07, 'epoch': 0.86} 86%|████████▌ | 18988/22095 [32:39:23<4:31:12, 5.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18989/22095 [32:39:26<4:04:23, 4.72s/it] {'loss': 0.3017, 'grad_norm': 0.8216739640425371, 'learning_rate': 5.096557789309392e-07, 'epoch': 0.86} 86%|████████▌ | 18989/22095 [32:39:26<4:04:23, 4.72s/it] 86%|████████▌ | 18990/22095 [32:39:30<3:43:59, 4.33s/it] {'loss': 0.3003, 'grad_norm': 0.5856909949197273, 'learning_rate': 5.093334482576634e-07, 'epoch': 0.86} 86%|████████▌ | 18990/22095 [32:39:30<3:43:59, 4.33s/it] 86%|████████▌ | 18991/22095 [32:39:33<3:25:38, 3.97s/it] {'loss': 0.3364, 'grad_norm': 
0.6767030167755445, 'learning_rate': 5.09011214075018e-07, 'epoch': 0.86} 86%|████████▌ | 18991/22095 [32:39:33<3:25:38, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43079 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74855 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72995 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47552 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72795 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65065 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82341 > 40960). 
Running this sequence through the model will result in indexing errors
 86%|████████▌ | 18992/22095 [32:39:39<3:55:39, 4.56s/it] {'loss': 0.4833, 'grad_norm': 0.26674590985470387, 'learning_rate': 5.086890763899299e-07, 'epoch': 0.86}
 86%|████████▌ | 18992/22095 [32:39:39<3:55:39, 4.56s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [37, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8501799 in VC:s3://internvl-moe-sft-data/. Exception: Image size [37, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 118298, 'image': 'vrdu_texteq/astro-ph.CO/b675e20e-7d3a-4c93-92dc-8a21862bdf89.png', 'image_wh': [[37, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': '$\\approx$6'}]}
 86%|████████▌ | 18993/22095 [32:39:42<3:33:53, 4.14s/it] {'loss': 0.3348, 'grad_norm': 0.5781600172894498, 'learning_rate': 5.083670352093196e-07, 'epoch': 0.86}
 86%|████████▌ | 18993/22095 [32:39:42<3:33:53, 4.14s/it]
 86%|████████▌ | 18994/22095 [32:39:45<3:22:27, 3.92s/it] {'loss': 0.3246, 'grad_norm': 0.6306420137429097, 'learning_rate': 5.080450905401057e-07, 'epoch': 0.86}
 86%|████████▌ | 18994/22095 [32:39:45<3:22:27, 3.92s/it]
 86%|████████▌ | 18995/22095 [32:39:48<3:09:32, 3.67s/it] {'loss': 0.2773, 'grad_norm': 0.6241465051839432, 'learning_rate': 5.07723242389207e-07, 'epoch': 0.86}
 86%|████████▌ | 18995/22095 [32:39:48<3:09:32, 3.67s/it]
 86%|████████▌ | 18996/22095 [32:39:52<3:09:30, 3.67s/it] {'loss': 0.3036, 'grad_norm':
0.6184597563032113, 'learning_rate': 5.074014907635405e-07, 'epoch': 0.86} 86%|████████▌ | 18996/22095 [32:39:52<3:09:30, 3.67s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8337577 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 4199, 'image': 'vrdu_table_final_2/astro-ph.CO/d6b3a072-dac7-4b9e-8673-6c04344e0312.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 86%|████████▌ | 18997/22095 [32:39:56<3:09:22, 3.67s/it] {'loss': 0.3068, 'grad_norm': 0.5868468882841714, 'learning_rate': 5.070798356700163e-07, 'epoch': 0.86} 86%|████████▌ | 18997/22095 [32:39:56<3:09:22, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65472 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83636 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 18998/22095 [32:40:05<4:36:46, 5.36s/it] {'loss': 0.2949, 'grad_norm': 0.6258664955028327, 'learning_rate': 5.067582771155472e-07, 'epoch': 0.86} 86%|████████▌ | 18998/22095 [32:40:05<4:36:46, 5.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 18999/22095 [32:40:09<4:12:43, 4.90s/it] {'loss': 0.2924, 'grad_norm': 0.578382879309725, 'learning_rate': 5.064368151070431e-07, 'epoch': 0.86} 86%|████████▌ | 18999/22095 [32:40:09<4:12:43, 4.90s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 19000/22095 [32:40:19<5:37:25, 6.54s/it] {'loss': 0.4532, 'grad_norm': 0.2750712488483772, 'learning_rate': 5.061154496514125e-07, 'epoch': 0.86} 86%|████████▌ | 19000/22095 [32:40:19<5:37:25, 6.54s/it] 86%|████████▌ | 19001/22095 [32:40:24<5:04:02, 5.90s/it] {'loss': 0.2935, 'grad_norm': 0.6657087671047848, 'learning_rate': 5.057941807555571e-07, 'epoch': 0.86} 86%|████████▌ | 19001/22095 [32:40:24<5:04:02, 5.90s/it] 86%|████████▌ | 19002/22095 [32:40:27<4:20:19, 5.05s/it] {'loss': 0.2981, 'grad_norm': 0.6026886189492119, 'learning_rate': 5.05473008426382e-07, 'epoch': 0.86} 86%|████████▌ | 19002/22095 [32:40:27<4:20:19, 5.05s/it] 86%|████████▌ | 19003/22095 [32:40:30<4:00:36, 4.67s/it] {'loss': 0.3204, 'grad_norm': 0.6383401618594063, 'learning_rate': 5.051519326707893e-07, 'epoch': 0.86} 86%|████████▌ | 19003/22095 [32:40:30<4:00:36, 4.67s/it] 86%|████████▌ | 19004/22095 [32:40:34<3:49:45, 4.46s/it] {'loss': 0.3278, 'grad_norm': 0.6416246251796458, 'learning_rate': 5.048309534956763e-07, 'epoch': 0.86} 86%|████████▌ | 19004/22095 [32:40:34<3:49:45, 4.46s/it] 86%|████████▌ | 19005/22095 [32:40:37<3:25:42, 3.99s/it] {'loss': 0.2715, 'grad_norm': 0.6460942069471778, 'learning_rate': 5.045100709079393e-07, 'epoch': 0.86} 86%|████████▌ | 19005/22095 [32:40:37<3:25:42, 
3.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 19006/22095 [32:40:47<5:01:53, 5.86s/it] {'loss': 0.4784, 'grad_norm': 0.40949862405358045, 'learning_rate': 5.041892849144753e-07, 'epoch': 0.86} 86%|████████▌ | 19006/22095 [32:40:47<5:01:53, 5.86s/it] 86%|████████▌ | 19007/22095 [32:40:52<4:47:27, 5.59s/it] {'loss': 0.3326, 'grad_norm': 0.5929098033750538, 'learning_rate': 5.038685955221745e-07, 'epoch': 0.86} 86%|████████▌ | 19007/22095 [32:40:52<4:47:27, 5.59s/it] 86%|████████▌ | 19008/22095 [32:40:56<4:18:48, 5.03s/it] {'loss': 0.3042, 'grad_norm': 0.61831375951075, 'learning_rate': 5.035480027379297e-07, 'epoch': 0.86} 86%|████████▌ | 19008/22095 [32:40:56<4:18:48, 5.03s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19009/22095 [32:41:00<3:59:52, 4.66s/it] {'loss': 0.2795, 'grad_norm': 0.6487690460983344, 'learning_rate': 5.032275065686287e-07, 'epoch': 0.86} 86%|████████▌ | 19009/22095 [32:41:00<3:59:52, 4.66s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94444 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84054 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▌ | 19010/22095 [32:41:04<3:48:54, 4.45s/it] {'loss': 0.3131, 'grad_norm': 0.6187830430063399, 'learning_rate': 5.029071070211566e-07, 'epoch': 0.86} 86%|████████▌ | 19010/22095 [32:41:04<3:48:54, 4.45s/it] 86%|████████▌ | 19011/22095 [32:41:08<3:48:29, 4.45s/it] {'loss': 0.3177, 'grad_norm': 0.5865905552575247, 'learning_rate': 5.025868041023996e-07, 'epoch': 0.86} 86%|████████▌ | 19011/22095 [32:41:08<3:48:29, 4.45s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19012/22095 [32:41:12<3:30:56, 4.11s/it] {'loss': 0.2824, 'grad_norm': 0.6062723886429946, 'learning_rate': 5.022665978192398e-07, 'epoch': 0.86} 86%|████████▌ | 19012/22095 [32:41:12<3:30:56, 4.11s/it] 86%|████████▌ | 19013/22095 [32:41:15<3:20:46, 3.91s/it] {'loss': 0.2734, 'grad_norm': 1.024759836194424, 'learning_rate': 5.019464881785569e-07, 'epoch': 0.86} 86%|████████▌ | 19013/22095 [32:41:15<3:20:46, 3.91s/it] 86%|████████▌ | 19014/22095 [32:41:19<3:13:30, 3.77s/it] {'loss': 0.2917, 'grad_norm': 0.599430239637231, 'learning_rate': 5.016264751872291e-07, 'epoch': 0.86} 86%|████████▌ | 19014/22095 [32:41:19<3:13:30, 3.77s/it] 86%|████████▌ | 19015/22095 [32:41:21<2:59:29, 3.50s/it] {'loss': 0.3019, 'grad_norm': 0.5957761500719001, 'learning_rate': 5.013065588521321e-07, 'epoch': 0.86} 86%|████████▌ | 19015/22095 [32:41:21<2:59:29, 3.50s/it] 86%|████████▌ | 19016/22095 [32:41:25<2:54:38, 3.40s/it] {'loss': 0.2628, 'grad_norm': 0.6263540868459652, 'learning_rate': 5.009867391801415e-07, 'epoch': 0.86} 86%|████████▌ | 19016/22095 [32:41:25<2:54:38, 3.40s/it] 86%|████████▌ | 19017/22095 [32:41:28<2:53:45, 3.39s/it] {'loss': 0.2809, 'grad_norm': 0.6826419226058544, 'learning_rate': 5.00667016178128e-07, 'epoch': 0.86} 86%|████████▌ | 19017/22095 [32:41:28<2:53:45, 3.39s/it]Invalidate trace cache @ step 2: expected module 1, but got module 
364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19018/22095 [32:41:37<4:18:06, 5.03s/it] {'loss': 0.4614, 'grad_norm': 0.2711226807270318, 'learning_rate': 5.00347389852961e-07, 'epoch': 0.86} 86%|████████▌ | 19018/22095 [32:41:37<4:18:06, 5.03s/it] 86%|████████▌ | 19019/22095 [32:41:41<3:59:54, 4.68s/it] {'loss': 0.3382, 'grad_norm': 0.5811627750482236, 'learning_rate': 5.0002786021151e-07, 'epoch': 0.86} 86%|████████▌ | 19019/22095 [32:41:41<3:59:54, 4.68s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19020/22095 [32:41:44<3:40:32, 4.30s/it] {'loss': 0.3412, 'grad_norm': 0.674512844703926, 'learning_rate': 4.997084272606384e-07, 'epoch': 0.86} 86%|████████▌ | 19020/22095 [32:41:44<3:40:32, 4.30s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19021/22095 [32:41:47<3:23:10, 3.97s/it] {'loss': 0.2769, 'grad_norm': 0.6337745654730212, 'learning_rate': 4.993890910072124e-07, 'epoch': 0.86} 86%|████████▌ | 19021/22095 [32:41:47<3:23:10, 3.97s/it] 86%|████████▌ | 19022/22095 [32:41:51<3:14:10, 3.79s/it] {'loss': 0.295, 'grad_norm': 0.5466312533797884, 'learning_rate': 4.990698514580922e-07, 'epoch': 0.86} 86%|████████▌ | 19022/22095 [32:41:51<3:14:10, 3.79s/it] 86%|████████▌ | 19023/22095 [32:41:54<3:05:08, 3.62s/it] {'loss': 0.2996, 'grad_norm': 0.6220405566632338, 'learning_rate': 4.987507086201359e-07, 'epoch': 0.86} 86%|████████▌ | 19023/22095 [32:41:54<3:05:08, 3.62s/it] 86%|████████▌ | 19024/22095 [32:41:57<3:05:16, 3.62s/it] {'loss': 0.2729, 'grad_norm': 0.5593024505119963, 'learning_rate': 4.984316625002029e-07, 'epoch': 0.86} 86%|████████▌ | 19024/22095 [32:41:57<3:05:16, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65697 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69122 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43330 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43169 > 40960) for 4 sample(s). Truncating to 1120 with 2 samples. 86%|████████▌ | 19025/22095 [32:42:01<2:56:40, 3.45s/it] {'loss': 0.2785, 'grad_norm': 0.597273768740094, 'learning_rate': 4.981127131051494e-07, 'epoch': 0.86} 86%|████████▌ | 19025/22095 [32:42:01<2:56:40, 3.45s/it] 86%|████████▌ | 19026/22095 [32:42:03<2:46:09, 3.25s/it] {'loss': 0.2976, 'grad_norm': 0.581083934494568, 'learning_rate': 4.977938604418259e-07, 'epoch': 0.86} 86%|████████▌ | 19026/22095 [32:42:03<2:46:09, 3.25s/it] 86%|████████▌ | 19027/22095 [32:42:08<3:04:04, 3.60s/it] {'loss': 0.306, 'grad_norm': 0.8116799266557139, 'learning_rate': 4.974751045170845e-07, 'epoch': 0.86} 86%|████████▌ | 19027/22095 [32:42:08<3:04:04, 3.60s/it] 86%|████████▌ | 19028/22095 [32:42:11<2:55:26, 3.43s/it] {'loss': 0.2869, 'grad_norm': 0.5998911145327874, 'learning_rate': 4.971564453377748e-07, 'epoch': 0.86} 86%|████████▌ | 19028/22095 [32:42:11<2:55:26, 3.43s/it] 86%|████████▌ | 19029/22095 [32:42:14<2:49:05, 3.31s/it] {'loss': 0.2929, 'grad_norm': 0.6301029855331602, 'learning_rate': 4.968378829107451e-07, 'epoch': 0.86} 86%|████████▌ | 19029/22095 [32:42:14<2:49:05, 3.31s/it] 86%|████████▌ | 19030/22095 [32:42:18<3:01:06, 3.55s/it] {'loss': 0.3149, 'grad_norm': 0.6169160539266364, 'learning_rate': 4.965194172428378e-07, 'epoch': 0.86} 86%|████████▌ | 19030/22095 [32:42:18<3:01:06, 3.55s/it] 86%|████████▌ | 19031/22095 [32:42:22<3:02:51, 3.58s/it] {'loss': 0.2762, 'grad_norm': 
0.6082576689734003, 'learning_rate': 4.962010483408964e-07, 'epoch': 0.86} 86%|████████▌ | 19031/22095 [32:42:22<3:02:51, 3.58s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19032/22095 [32:42:25<2:56:28, 3.46s/it] {'loss': 0.2853, 'grad_norm': 0.658833602087984, 'learning_rate': 4.95882776211763e-07, 'epoch': 0.86} 86%|████████▌ | 19032/22095 [32:42:25<2:56:28, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▌ | 19033/22095 [32:42:33<4:13:27, 4.97s/it] {'loss': 0.4381, 'grad_norm': 0.2621143460969143, 'learning_rate': 4.955646008622755e-07, 'epoch': 0.86} 86%|████████▌ | 19033/22095 [32:42:33<4:13:27, 4.97s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8403475 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28. 
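[Editor's note] The warnings `Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)` repeat throughout this run, and one rank-level message reports truncating oversized batches. A naive truncation sketch, assuming a plain list of token ids and the 40960 limit quoted in the warnings; real multimodal truncation must also keep image placeholder tokens intact, which this does not attempt:

```python
MAX_LEN = 40960  # model max sequence length quoted in the warnings

def truncate_ids(input_ids, max_len=MAX_LEN):
    """Drop trailing tokens so the sequence fits the context window."""
    if len(input_ids) <= max_len:
        return input_ids
    return input_ids[:max_len]
```

With pre-truncation in the collator, the "indexing errors" the tokenizer warns about cannot occur, at the cost of losing the tail of overlong conversations.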
Problematic sample: {'id': 5648, 'image': 'vrdu_table_final_2/astro-ph.CO/9090a52b-86b4-4c7e-9392-6cbe5a471b6d.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nKindly transform the table shown in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll kindly transform the table shown in the image into LaTeX format for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]} 86%|████████▌ | 19034/22095 [32:42:36<3:44:55, 4.41s/it] {'loss': 0.3295, 'grad_norm': 0.7241578876038501, 'learning_rate': 4.952465222992692e-07, 'epoch': 0.86} 86%|████████▌ | 19034/22095 [32:42:36<3:44:55, 4.41s/it] 86%|████████▌ | 19035/22095 [32:42:39<3:22:59, 3.98s/it] {'loss': 0.2792, 'grad_norm': 0.5715552371285294, 'learning_rate': 4.949285405295812e-07, 'epoch': 0.86} 86%|████████▌ | 19035/22095 [32:42:39<3:22:59, 3.98s/it] 86%|████████▌ | 19036/22095 [32:42:43<3:12:33, 3.78s/it] {'loss': 0.2917, 'grad_norm': 0.627102608544472, 'learning_rate': 4.94610655560041e-07, 'epoch': 0.86} 86%|████████▌ | 19036/22095 [32:42:43<3:12:33, 3.78s/it] 86%|████████▌ | 19037/22095 [32:42:46<3:05:23, 3.64s/it] {'loss': 0.2747, 'grad_norm': 0.5924211155800102, 'learning_rate': 4.942928673974823e-07, 'epoch': 0.86} 86%|████████▌ | 19037/22095 [32:42:46<3:05:23, 3.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▌ | 19038/22095 [32:42:49<2:53:59, 3.42s/it] {'loss': 0.3035, 'grad_norm': 0.6872239178605828, 'learning_rate': 4.93975176048731e-07, 'epoch': 0.86} 86%|████████▌ | 19038/22095 [32:42:49<2:53:59, 3.42s/it] 86%|████████▌ | 19039/22095 [32:42:52<2:43:21, 3.21s/it] {'loss': 0.2408, 'grad_norm': 0.6337648406533888, 'learning_rate': 4.936575815206134e-07, 'epoch': 0.86} 86%|████████▌ | 19039/22095 [32:42:52<2:43:21, 3.21s/it] 86%|████████▌ | 19040/22095 [32:42:56<2:55:50, 3.45s/it] {'loss': 0.3157, 'grad_norm': 0.600265567939054, 'learning_rate': 
4.933400838199543e-07, 'epoch': 0.86} 86%|████████▌ | 19040/22095 [32:42:56<2:55:50, 3.45s/it] 86%|████████▌ | 19041/22095 [32:42:59<2:53:41, 3.41s/it] {'loss': 0.2966, 'grad_norm': 0.5771906865411033, 'learning_rate': 4.930226829535767e-07, 'epoch': 0.86} 86%|████████▌ | 19041/22095 [32:42:59<2:53:41, 3.41s/it] 86%|████████▌ | 19042/22095 [32:43:02<2:42:57, 3.20s/it] {'loss': 0.3165, 'grad_norm': 0.6112288585274278, 'learning_rate': 4.927053789282988e-07, 'epoch': 0.86} 86%|████████▌ | 19042/22095 [32:43:02<2:42:57, 3.20s/it] 86%|████████▌ | 19043/22095 [32:43:05<2:42:02, 3.19s/it] {'loss': 0.2963, 'grad_norm': 0.5944446545435385, 'learning_rate': 4.923881717509388e-07, 'epoch': 0.86} 86%|████████▌ | 19043/22095 [32:43:05<2:42:02, 3.19s/it] 86%|████████▌ | 19044/22095 [32:43:08<2:40:48, 3.16s/it] {'loss': 0.2409, 'grad_norm': 0.6289189375880971, 'learning_rate': 4.920710614283131e-07, 'epoch': 0.86} 86%|████████▌ | 19044/22095 [32:43:08<2:40:48, 3.16s/it] 86%|████████▌ | 19045/22095 [32:43:11<2:47:44, 3.30s/it] {'loss': 0.2782, 'grad_norm': 0.599893130066422, 'learning_rate': 4.917540479672356e-07, 'epoch': 0.86} 86%|████████▌ | 19045/22095 [32:43:11<2:47:44, 3.30s/it] 86%|████████▌ | 19046/22095 [32:43:15<2:53:43, 3.42s/it] {'loss': 0.2983, 'grad_norm': 0.5993331924655295, 'learning_rate': 4.914371313745181e-07, 'epoch': 0.86} 86%|████████▌ | 19046/22095 [32:43:15<2:53:43, 3.42s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8365575 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 32316, 'image': 'vrdu_table_final_2/astro-ph.CO/6f90b2fe-bc70-4a83-b0a3-f786e077ae9a.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{@{}c@{}}\n #2 \\\\\n\n \\end{tabular}\n```"}]} 86%|████████▌ | 19047/22095 [32:43:18<2:49:05, 3.33s/it] {'loss': 0.315, 'grad_norm': 0.6404876556911271, 'learning_rate': 4.911203116569685e-07, 'epoch': 0.86} 86%|████████▌ | 19047/22095 [32:43:18<2:49:05, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047670 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 4\nB. 3\nC. 6\nD. 
5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 86%|████████▌ | 19048/22095 [32:43:22<2:50:44, 3.36s/it] {'loss': 0.3065, 'grad_norm': 0.5569564271980745, 'learning_rate': 4.908035888213964e-07, 'epoch': 0.86} 86%|████████▌ | 19048/22095 [32:43:22<2:50:44, 3.36s/it] 86%|████████▌ | 19049/22095 [32:43:25<2:43:26, 3.22s/it] {'loss': 0.3015, 'grad_norm': 0.6423297856896808, 'learning_rate': 4.904869628746051e-07, 'epoch': 0.86} 86%|████████▌ | 19049/22095 [32:43:25<2:43:26, 3.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8401745 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3909, 'image': 'vrdu_table_final_2/astro-ph.CO/985b6e65-5c63-45e9-a3ef-18a638b01f55.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 86%|████████▌ | 19050/22095 [32:43:28<2:42:54, 3.21s/it] {'loss': 0.2633, 'grad_norm': 0.6286364099047824, 'learning_rate': 4.901704338234004e-07, 'epoch': 0.86} 86%|████████▌ | 19050/22095 [32:43:28<2:42:54, 3.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71168 > 40960). 
Running this sequence through the model will result in indexing errors
86%|████████▌ | 19051/22095 [32:43:31<2:36:07, 3.08s/it] {'loss': 0.2815, 'grad_norm': 0.5863196918495058, 'learning_rate': 4.898540016745818e-07, 'epoch': 0.86}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8909861 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33014, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nA. 12\nB. 16\nC. 9\nD. 
10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 86%|████████▌ | 19052/22095 [32:43:35<2:49:52, 3.35s/it] {'loss': 0.3161, 'grad_norm': 0.5968521463422396, 'learning_rate': 4.895376664349482e-07, 'epoch': 0.86} 86%|████████▌ | 19052/22095 [32:43:35<2:49:52, 3.35s/it] 86%|████████▌ | 19053/22095 [32:43:37<2:42:10, 3.20s/it] {'loss': 0.346, 'grad_norm': 0.6459674874191642, 'learning_rate': 4.892214281112973e-07, 'epoch': 0.86} 86%|████████▌ | 19053/22095 [32:43:37<2:42:10, 3.20s/it] 86%|████████▌ | 19054/22095 [32:43:41<2:46:44, 3.29s/it] {'loss': 0.2461, 'grad_norm': 0.6018249731642297, 'learning_rate': 4.88905286710426e-07, 'epoch': 0.86} 86%|████████▌ | 19054/22095 [32:43:41<2:46:44, 3.29s/it] 86%|████████▌ | 19055/22095 [32:43:45<2:55:39, 3.47s/it] {'loss': 0.2744, 'grad_norm': 0.646433803061555, 'learning_rate': 4.88589242239123e-07, 'epoch': 0.86} 86%|████████▌ | 19055/22095 [32:43:45<2:55:39, 3.47s/it] 86%|████████▌ | 19056/22095 [32:43:49<3:06:12, 3.68s/it] {'loss': 0.2501, 'grad_norm': 0.5662144953473035, 'learning_rate': 4.882732947041818e-07, 'epoch': 0.86} 86%|████████▌ | 19056/22095 [32:43:49<3:06:12, 3.68s/it] 86%|████████▋ | 19057/22095 [32:43:52<3:03:02, 3.62s/it] {'loss': 0.2622, 'grad_norm': 0.620217354053335, 'learning_rate': 4.879574441123907e-07, 'epoch': 0.86} 86%|████████▋ | 19057/22095 [32:43:52<3:03:02, 3.62s/it] 86%|████████▋ | 19058/22095 [32:43:56<3:09:53, 3.75s/it] {'loss': 0.2975, 'grad_norm': 0.5854897297948762, 'learning_rate': 4.876416904705384e-07, 'epoch': 0.86} 86%|████████▋ | 19058/22095 [32:43:56<3:09:53, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▋ | 19059/22095 [32:44:05<4:22:35, 5.19s/it] {'loss': 0.4751, 'grad_norm': 0.26857698043195, 'learning_rate': 4.873260337854058e-07, 'epoch': 0.86} 86%|████████▋ | 19059/22095 [32:44:05<4:22:35, 5.19s/it] 86%|████████▋ | 19060/22095 [32:44:10<4:12:17, 4.99s/it] {'loss': 0.2864, 
'grad_norm': 0.6472511347380544, 'learning_rate': 4.870104740637771e-07, 'epoch': 0.86}
86%|████████▋ | 19061/22095 [32:44:13<3:51:27, 4.58s/it] {'loss': 0.2362, 'grad_norm': 0.7115035145085302, 'learning_rate': 4.866950113124335e-07, 'epoch': 0.86}
86%|████████▋ | 19062/22095 [32:44:18<3:49:43, 4.54s/it] {'loss': 0.2481, 'grad_norm': 0.5470744766222366, 'learning_rate': 4.863796455381525e-07, 'epoch': 0.86}
86%|████████▋ | 19063/22095 [32:44:21<3:31:10, 4.18s/it] {'loss': 0.2607, 'grad_norm': 0.5624476468618756, 'learning_rate': 4.860643767477097e-07, 'epoch': 0.86}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8960203 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11038, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1.5'}]} 86%|████████▋ | 19064/22095 [32:44:25<3:24:33, 4.05s/it] {'loss': 0.2816, 'grad_norm': 0.6459982206651735, 'learning_rate': 4.857492049478807e-07, 'epoch': 0.86} 86%|████████▋ | 19064/22095 [32:44:25<3:24:33, 4.05s/it] 86%|████████▋ | 19065/22095 [32:44:28<3:07:08, 3.71s/it] {'loss': 0.2854, 'grad_norm': 0.7126904269804301, 'learning_rate': 4.854341301454357e-07, 'epoch': 0.86} 86%|████████▋ | 19065/22095 [32:44:28<3:07:08, 3.71s/it] 86%|████████▋ | 19066/22095 [32:44:31<3:05:09, 3.67s/it] {'loss': 0.2792, 'grad_norm': 0.6386738086723427, 'learning_rate': 4.851191523471465e-07, 'epoch': 0.86} 86%|████████▋ | 19066/22095 [32:44:31<3:05:09, 3.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▋ | 19067/22095 [32:44:39<4:12:51, 5.01s/it] {'loss': 0.44, 'grad_norm': 0.26909620891534086, 'learning_rate': 4.848042715597811e-07, 'epoch': 0.86} 86%|████████▋ | 19067/22095 [32:44:39<4:12:51, 5.01s/it] 86%|████████▋ | 19068/22095 [32:44:43<3:48:12, 4.52s/it] {'loss': 0.2905, 'grad_norm': 0.6898405599890808, 'learning_rate': 4.84489487790103e-07, 'epoch': 0.86} 86%|████████▋ | 19068/22095 [32:44:43<3:48:12, 4.52s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▋ | 19069/22095 [32:44:46<3:31:03, 4.18s/it] {'loss': 0.2512, 'grad_norm': 0.6080263444100681, 'learning_rate': 4.841748010448777e-07, 'epoch': 0.86} 86%|████████▋ | 19069/22095 [32:44:46<3:31:03, 4.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78097 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48290 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97207 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (109720 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▋ | 19070/22095 [32:44:49<3:15:11, 3.87s/it] {'loss': 0.2908, 'grad_norm': 0.6142457407787201, 'learning_rate': 4.838602113308677e-07, 'epoch': 0.86} 86%|████████▋ | 19070/22095 [32:44:49<3:15:11, 3.87s/it] 86%|████████▋ | 19071/22095 [32:44:52<3:04:12, 3.65s/it] {'loss': 0.2512, 'grad_norm': 0.6201942893627198, 'learning_rate': 4.835457186548315e-07, 'epoch': 0.86} 86%|████████▋ | 19071/22095 [32:44:52<3:04:12, 3.65s/it] 86%|████████▋ | 19072/22095 [32:44:56<2:56:02, 3.49s/it] {'loss': 0.2799, 'grad_norm': 0.6845755794045176, 'learning_rate': 4.832313230235253e-07, 'epoch': 0.86} 86%|████████▋ | 19072/22095 [32:44:56<2:56:02, 3.49s/it] 86%|████████▋ | 19073/22095 [32:44:58<2:47:10, 3.32s/it] {'loss': 0.3014, 'grad_norm': 1.158425573208752, 'learning_rate': 4.829170244437064e-07, 'epoch': 0.86} 86%|████████▋ | 19073/22095 [32:44:58<2:47:10, 3.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▋ | 19074/22095 [32:45:03<2:59:58, 3.57s/it] {'loss': 0.2885, 'grad_norm': 0.6331246849845407, 'learning_rate': 4.82602822922128e-07, 'epoch': 0.86} 86%|████████▋ | 19074/22095 [32:45:03<2:59:58, 3.57s/it] 86%|████████▋ | 19075/22095 [32:45:06<2:54:48, 3.47s/it] {'loss': 0.2824, 'grad_norm': 0.5773053131214465, 'learning_rate': 4.822887184655406e-07, 'epoch': 0.86} 
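The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` above is raised for samples whose stored width or height falls below the image processor's 28-pixel minimum, and each one costs a failed fetch and retry inside `__getitem__`. A minimal pre-filter sketch, assuming only the `image_wh` field format visible in the logged samples (the helper names here are hypothetical, not part of qwen-vl-finetune):

```python
# Hypothetical pre-filter: skip samples whose recorded [width, height] is below
# the 28-px minimum, instead of letting them fail inside the dataset's __getitem__.
MIN_SIDE = 28  # minimum side length, per the logged "Minimum size is 28." errors

def is_large_enough(sample, min_side=MIN_SIDE):
    """True if every [width, height] pair in 'image_wh' meets the minimum."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))

def prefilter(samples, min_side=MIN_SIDE):
    """Split a sample list into (kept, dropped) before training starts."""
    kept, dropped = [], []
    for s in samples:
        (kept if is_large_enough(s, min_side) else dropped).append(s)
    return kept, dropped
```

Running such a pass once over the annotation files would surface all undersized table crops (the `[[14, 23]]`-style entries above) up front rather than as mid-epoch retries.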
86%|████████▋ | 19075/22095 [32:45:06<2:54:48, 3.47s/it] 86%|████████▋ | 19076/22095 [32:45:09<2:54:38, 3.47s/it] {'loss': 0.3117, 'grad_norm': 0.6958738240209923, 'learning_rate': 4.819747110806928e-07, 'epoch': 0.86} 86%|████████▋ | 19076/22095 [32:45:09<2:54:38, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (47026 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115470 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73759 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▋ | 19077/22095 [32:45:13<2:53:30, 3.45s/it] {'loss': 0.3118, 'grad_norm': 0.6302093385948211, 'learning_rate': 4.816608007743335e-07, 'epoch': 0.86} 86%|████████▋ | 19077/22095 [32:45:13<2:53:30, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▋ | 19078/22095 [32:45:20<3:57:25, 4.72s/it] {'loss': 0.479, 'grad_norm': 0.26448191342255795, 'learning_rate': 4.813469875532056e-07, 'epoch': 0.86} 86%|████████▋ | 19078/22095 [32:45:20<3:57:25, 4.72s/it] 86%|████████▋ | 19079/22095 [32:45:24<3:38:00, 4.34s/it] {'loss': 0.311, 'grad_norm': 0.6490925546607369, 'learning_rate': 4.810332714240534e-07, 'epoch': 0.86} 86%|████████▋ | 19079/22095 [32:45:24<3:38:00, 4.34s/it] 86%|████████▋ | 19080/22095 [32:45:27<3:17:12, 3.92s/it] {'loss': 0.3362, 'grad_norm': 0.6298511384731518, 'learning_rate': 4.80719652393618e-07, 'epoch': 0.86} 86%|████████▋ | 19080/22095 [32:45:27<3:17:12, 3.92s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59975 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46535 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48407 > 40960). Running this sequence through the model will result in indexing errors
86%|████████▋ | 19081/22095 [32:45:31<3:14:54, 3.88s/it] {'loss': 0.2934, 'grad_norm': 0.6095024090671313, 'learning_rate': 4.804061304686358e-07, 'epoch': 0.86}
86%|████████▋ | 19082/22095 [32:45:34<3:01:55, 3.62s/it] {'loss': 0.3192, 'grad_norm': 0.5775729096049567, 'learning_rate': 4.800927056558452e-07, 'epoch': 0.86}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
86%|████████▋ | 19083/22095 [32:45:37<2:55:49, 3.50s/it] {'loss': 0.3155, 'grad_norm': 0.6587368973474238, 'learning_rate': 4.79779377961982e-07, 'epoch': 0.86}
86%|████████▋ | 19084/22095 [32:45:40<2:45:53, 3.31s/it] {'loss': 0.3334, 'grad_norm': 0.6612433715020307, 'learning_rate': 4.794661473937761e-07, 'epoch': 0.86}
Invalidate trace cache @ step 2: expected module 1, but got module 364
86%|████████▋ | 19085/22095 [32:45:49<4:13:55, 5.06s/it] {'loss': 0.4772, 'grad_norm': 0.41205551413684943, 'learning_rate': 4.791530139579586e-07, 'epoch': 0.86}
86%|████████▋ | 19086/22095 [32:45:53<3:54:09, 4.67s/it] {'loss': 0.2692, 'grad_norm': 0.6138513167648837, 'learning_rate': 4.788399776612584e-07, 'epoch': 0.86}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [189, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8384164 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [189, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 50963, 'image': 'vrdu_table_final_2/astro-ph.CO/607ac515-35cb-481e-9a62-3ea2053b8fed.png', 'image_wh': [[189, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}{ll} J164347-650155\\\\ \\end{tabular}\n```"}]}
86%|████████▋ | 19087/22095 [32:45:55<3:27:08, 4.13s/it] {'loss': 0.2664, 'grad_norm': 0.6230080093900928, 'learning_rate': 4.785270385104018e-07, 'epoch': 0.86}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
86%|████████▋ | 19088/22095 [32:45:59<3:17:18, 3.94s/it] {'loss': 0.3199, 'grad_norm': 0.6425162193038234, 'learning_rate': 4.782141965121129e-07, 'epoch': 0.86}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (50831 > 40960).
Running this sequence through the model will result in indexing errors
86%|████████▋ | 19089/22095 [32:46:09<4:55:47, 5.90s/it] {'loss': 0.4722, 'grad_norm': 0.28171184535994015, 'learning_rate': 4.779014516731123e-07, 'epoch': 0.86}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8360127 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26848, 'image': 'vrdu_table_final_2/astro-ph.CO/42e8c367-dce6-4183-9c0a-ff9b8c4e59ae.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
86%|████████▋ | 19090/22095 [32:46:13<4:26:54, 5.33s/it] {'loss': 0.2832, 'grad_norm': 0.5849269358010359, 'learning_rate': 4.775888040001214e-07, 'epoch': 0.86}
86%|████████▋ | 19091/22095 [32:46:17<3:53:33, 4.66s/it] {'loss': 0.2955, 'grad_norm': 0.6586879746802395, 'learning_rate': 4.772762534998582e-07, 'epoch': 0.86}
86%|████████▋ | 19092/22095 [32:46:21<3:43:08, 4.46s/it] {'loss': 0.2682, 'grad_norm': 0.642220205319096, 'learning_rate': 4.769638001790366e-07, 'epoch': 0.86}
86%|████████▋ | 19092/22095 [32:46:21<3:43:08, 
4.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (119803 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▋ | 19093/22095 [32:46:24<3:28:29, 4.17s/it] {'loss': 0.2928, 'grad_norm': 0.6185275549850182, 'learning_rate': 4.766514440443726e-07, 'epoch': 0.86} 86%|████████▋ | 19093/22095 [32:46:24<3:28:29, 4.17s/it] 86%|████████▋ | 19094/22095 [32:46:27<3:14:32, 3.89s/it] {'loss': 0.2647, 'grad_norm': 0.6533054865884008, 'learning_rate': 4.763391851025756e-07, 'epoch': 0.86} 86%|████████▋ | 19094/22095 [32:46:27<3:14:32, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▋ | 19095/22095 [32:46:31<3:06:09, 3.72s/it] {'loss': 0.304, 'grad_norm': 0.7854922368088226, 'learning_rate': 4.76027023360357e-07, 'epoch': 0.86} 86%|████████▋ | 19095/22095 [32:46:31<3:06:09, 3.72s/it] 86%|████████▋ | 19096/22095 [32:46:33<2:52:29, 3.45s/it] {'loss': 0.2787, 'grad_norm': 0.643605119309677, 'learning_rate': 4.7571495882442363e-07, 'epoch': 0.86} 86%|████████▋ | 19096/22095 [32:46:33<2:52:29, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 86%|████████▋ | 19097/22095 [32:46:39<3:32:09, 4.25s/it] {'loss': 0.4567, 'grad_norm': 0.2500595226999286, 'learning_rate': 4.7540299150147906e-07, 'epoch': 0.86} 86%|████████▋ | 19097/22095 [32:46:40<3:32:09, 4.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▋ | 19098/22095 [32:46:43<3:20:50, 4.02s/it] {'loss': 0.2883, 'grad_norm': 0.654488899090377, 'learning_rate': 4.7509112139822846e-07, 'epoch': 0.86} 86%|████████▋ | 19098/22095 [32:46:43<3:20:50, 4.02s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 86%|████████▋ | 19099/22095 [32:46:46<3:07:59, 3.76s/it] {'loss': 0.3268, 
'grad_norm': 0.6683060747430509, 'learning_rate': 4.7477934852137306e-07, 'epoch': 0.86}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19100/22095 [32:46:54<4:14:09, 5.09s/it] {'loss': 0.4767, 'grad_norm': 0.3008385202888705, 'learning_rate': 4.7446767287761154e-07, 'epoch': 0.86}
87%|████████▋ | 19101/22095 [32:46:59<4:05:48, 4.93s/it] {'loss': 0.3294, 'grad_norm': 0.6540711942901398, 'learning_rate': 4.741560944736395e-07, 'epoch': 0.86}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 53, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8361543 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [25, 53, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 28273, 'image': 'vrdu_table_final_2/astro-ph.CO/ce073a7e-2e8d-4332-9e8e-6f6ff52b7bb7.png', 'image_wh': [[25, 53]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{c}$N$\\\\$n$\\end{tabular}\n```"}]}
86%|████████▋ | 19102/22095 [32:47:02<3:40:45, 4.43s/it] {'loss': 0.2962, 'grad_norm': 0.6253051333606227, 'learning_rate': 4.7384461331615284e-07, 'epoch': 0.86}
86%|████████▋ | 19103/22095 [32:47:05<3:19:46, 4.01s/it] {'loss': 0.2801, 'grad_norm': 0.6168597078443402, 'learning_rate': 4.735332294118455e-07, 'epoch': 0.86}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
86%|████████▋ | 19104/22095 [32:47:08<3:01:44, 3.65s/it] {'loss': 0.3245, 'grad_norm': 0.6142779127798528, 'learning_rate': 4.732219427674073e-07, 'epoch': 0.86}
86%|████████▋ | 19105/22095 [32:47:12<3:08:15, 3.78s/it] {'loss': 0.298, 'grad_norm': 0.6313286970148196, 'learning_rate': 4.729107533895255e-07, 'epoch': 0.86}
Token indices sequence length is longer than the specified maximum sequence length for this model (119493 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53271 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53207 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74444 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▋ | 19106/22095 [32:47:15<3:00:13, 3.62s/it] {'loss': 0.3042, 'grad_norm': 0.5928257456260824, 'learning_rate': 4.7259966128488876e-07, 'epoch': 0.86} 86%|████████▋ | 19106/22095 [32:47:15<3:00:13, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (73290 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (148854 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46648 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (124492 > 40960). Running this sequence through the model will result in indexing errors 86%|████████▋ | 19107/22095 [32:47:19<3:07:23, 3.76s/it] {'loss': 0.254, 'grad_norm': 0.554898880701603, 'learning_rate': 4.722886664601795e-07, 'epoch': 0.86} 86%|████████▋ | 19107/22095 [32:47:19<3:07:23, 3.76s/it] 86%|████████▋ | 19108/22095 [32:47:24<3:12:24, 3.86s/it] {'loss': 0.3293, 'grad_norm': 0.6208995354350286, 'learning_rate': 4.719777689220817e-07, 'epoch': 0.86} 86%|████████▋ | 19108/22095 [32:47:24<3:12:24, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59446 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (74055 > 40960). 
Running this sequence through the model will result in indexing errors 86%|████████▋ | 19109/22095 [32:47:28<3:16:06, 3.94s/it] {'loss': 0.2971, 'grad_norm': 0.6028647084343479, 'learning_rate': 4.716669686772751e-07, 'epoch': 0.86} 86%|████████▋ | 19109/22095 [32:47:28<3:16:06, 3.94s/it] 86%|████████▋ | 19110/22095 [32:47:31<3:05:03, 3.72s/it] {'loss': 0.2651, 'grad_norm': 0.5898969298823253, 'learning_rate': 4.7135626573243607e-07, 'epoch': 0.86} 86%|████████▋ | 19110/22095 [32:47:31<3:05:03, 3.72s/it] 86%|████████▋ | 19111/22095 [32:47:34<2:52:34, 3.47s/it] {'loss': 0.303, 'grad_norm': 0.6482671602848603, 'learning_rate': 4.710456600942431e-07, 'epoch': 0.86} 86%|████████▋ | 19111/22095 [32:47:34<2:52:34, 3.47s/it] 86%|████████▋ | 19112/22095 [32:47:37<2:56:23, 3.55s/it] {'loss': 0.2706, 'grad_norm': 0.6068135500269236, 'learning_rate': 4.707351517693698e-07, 'epoch': 0.86} 86%|████████▋ | 19112/22095 [32:47:37<2:56:23, 3.55s/it] 87%|████████▋ | 19113/22095 [32:47:41<2:55:19, 3.53s/it] {'loss': 0.2919, 'grad_norm': 0.6790603909443061, 'learning_rate': 4.704247407644874e-07, 'epoch': 0.87} 87%|████████▋ | 19113/22095 [32:47:41<2:55:19, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19114/22095 [32:47:50<4:17:03, 5.17s/it] {'loss': 0.4753, 'grad_norm': 0.31337349374409723, 'learning_rate': 4.701144270862651e-07, 'epoch': 0.87} 87%|████████▋ | 19114/22095 [32:47:50<4:17:03, 5.17s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19115/22095 [32:47:58<4:55:09, 5.94s/it] {'loss': 0.4887, 'grad_norm': 0.285205795606402, 'learning_rate': 4.6980421074137137e-07, 'epoch': 0.87} 87%|████████▋ | 19115/22095 [32:47:58<4:55:09, 5.94s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (137034 > 40960). 
Running this sequence through the model will result in indexing errors
87%|████████▋ | 19116/22095 [32:48:02<4:29:53, 5.44s/it] {'loss': 0.2864, 'grad_norm': 0.6040188107489841, 'learning_rate': 4.6949409173647267e-07, 'epoch': 0.87}
87%|████████▋ | 19117/22095 [32:48:11<5:29:23, 6.64s/it] {'loss': 0.4959, 'grad_norm': 0.28986218884428416, 'learning_rate': 4.691840700782313e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 364, but got module 1
87%|████████▋ | 19118/22095 [32:48:15<4:46:40, 5.78s/it] {'loss': 0.3104, 'grad_norm': 0.6388514118297662, 'learning_rate': 4.6887414577330814e-07, 'epoch': 0.87}
87%|████████▋ | 19119/22095 [32:48:18<4:08:23, 5.01s/it] {'loss': 0.2906, 'grad_norm': 0.620333813228641, 'learning_rate': 4.6856431882836397e-07, 'epoch': 0.87}
87%|████████▋ | 19120/22095 [32:48:22<3:46:09, 4.56s/it] {'loss': 0.2928, 'grad_norm': 0.604231251337452, 'learning_rate': 4.682545892500545e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8921714 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 44867, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': '\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 10\nB. 8\nC. 7\nD. 6'}, {'from': 'gpt', 'value': '【解答】解:∵AB=20,AD=14,∴BD=AB-AD=20-14=6,∵D为线段BC的中点,∴BC=2BD=12,∴AC=AB-BC=20-12=8.'}]}
87%|████████▋ | 19121/22095 [32:48:25<3:31:03, 4.26s/it] {'loss': 0.3025, 'grad_norm': 0.6711168213936696, 'learning_rate': 4.679449570450367e-07, 'epoch': 0.87}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
87%|████████▋ | 19122/22095 [32:48:29<3:16:04, 3.96s/it] {'loss': 0.3123, 'grad_norm': 0.5841088916852475, 'learning_rate': 4.676354222199625e-07, 'epoch': 0.87}
87%|████████▋ | 19123/22095 [32:48:32<3:11:29, 3.87s/it] {'loss': 0.29, 'grad_norm': 0.6040892371615114, 'learning_rate': 4.6732598478148264e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
87%|████████▋ | 19124/22095 [32:48:41<4:29:28, 5.44s/it] {'loss': 0.454, 'grad_norm': 0.26525114918965464, 'learning_rate': 4.6701664473624677e-07, 'epoch': 0.87}
87%|████████▋ | 19125/22095 [32:48:45<3:58:58, 4.83s/it] {'loss': 0.2827, 'grad_norm': 0.6638484740908424, 'learning_rate': 4.667074020909013e-07, 'epoch': 0.87}
87%|████████▋ | 19126/22095 [32:48:49<3:44:58, 4.55s/it] {'loss': 0.2832, 'grad_norm': 0.6088432095068631, 'learning_rate': 4.663982568520897e-07, 'epoch': 0.87}
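The repeated `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings above mean some tokenized conversations exceed the model's 40960-token maximum, which would cause indexing errors if actually run through the model. A minimal sketch, assuming only the limit reported in the log (the helper names are hypothetical, not the repo's actual code), for flagging or truncating such sequences before batching:

```python
# Hypothetical length guard for tokenized samples; 40960 is the model maximum
# reported by the tokenizer warnings in the log above.
MAX_LEN = 40960

def check_length(token_ids, max_len=MAX_LEN):
    """Return (fits, n_tokens); overlong sequences trigger the logged warning."""
    n = len(token_ids)
    return n <= max_len, n

def truncate(token_ids, max_len=MAX_LEN):
    """Naive tail truncation; a real pipeline might instead drop or re-chunk
    the sample so image tokens are not cut mid-span."""
    return token_ids[:max_len]
```

Whether truncating, dropping, or re-chunking is appropriate depends on where the image placeholder tokens sit in the sequence; blind tail truncation can orphan an image whose tokens were cut off.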
87%|████████▋ | 19126/22095 [32:48:49<3:44:58, 4.55s/it] 87%|████████▋ | 19127/22095 [32:48:52<3:30:17, 4.25s/it] {'loss': 0.3158, 'grad_norm': 0.606926588924379, 'learning_rate': 4.660892090264557e-07, 'epoch': 0.87} 87%|████████▋ | 19127/22095 [32:48:52<3:30:17, 4.25s/it] 87%|████████▋ | 19128/22095 [32:48:56<3:24:22, 4.13s/it] {'loss': 0.2902, 'grad_norm': 0.5796853979962601, 'learning_rate': 4.657802586206411e-07, 'epoch': 0.87} 87%|████████▋ | 19128/22095 [32:48:56<3:24:22, 4.13s/it] 87%|████████▋ | 19129/22095 [32:49:00<3:15:25, 3.95s/it] {'loss': 0.2957, 'grad_norm': 0.6563871862561939, 'learning_rate': 4.6547140564128236e-07, 'epoch': 0.87} 87%|████████▋ | 19129/22095 [32:49:00<3:15:25, 3.95s/it] 87%|████████▋ | 19130/22095 [32:49:03<3:01:08, 3.67s/it] {'loss': 0.2634, 'grad_norm': 0.6368524606296716, 'learning_rate': 4.651626500950157e-07, 'epoch': 0.87} 87%|████████▋ | 19130/22095 [32:49:03<3:01:08, 3.67s/it] 87%|████████▋ | 19131/22095 [32:49:06<2:56:55, 3.58s/it] {'loss': 0.3179, 'grad_norm': 0.6389499416717722, 'learning_rate': 4.648539919884759e-07, 'epoch': 0.87} 87%|████████▋ | 19131/22095 [32:49:06<2:56:55, 3.58s/it] 87%|████████▋ | 19132/22095 [32:49:11<3:13:03, 3.91s/it] {'loss': 0.2628, 'grad_norm': 0.5272471866796019, 'learning_rate': 4.6454543132829653e-07, 'epoch': 0.87} 87%|████████▋ | 19132/22095 [32:49:11<3:13:03, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80004 > 40960). 
Running this sequence through the model will result in indexing errors
87%|████████▋ | 19133/22095 [32:49:14<3:09:28, 3.84s/it] {'loss': 0.2985, 'grad_norm': 0.6371155766939642, 'learning_rate': 4.6423696812110564e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8882166 in VC:s3://multi-modal/playground/data/geoqa+/.
Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5319, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 8cm\nB. 5cm\nC. 6cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
87%|████████▋ | 19134/22095 [32:49:18<3:02:17, 3.69s/it] {'loss': 0.2576, 'grad_norm': 0.5966897807770152, 'learning_rate': 4.639286023735312e-07, 'epoch': 0.87}
87%|████████▋ | 19135/22095 [32:49:21<2:48:16, 3.41s/it] {'loss': 0.2903, 'grad_norm': 0.6236122764447524, 'learning_rate': 4.6362033409220077e-07, 'epoch': 0.87}
87%|████████▋ | 19136/22095 [32:49:24<2:44:56, 3.34s/it] {'loss': 0.3142, 'grad_norm': 0.5832164595872447, 'learning_rate': 4.6331216328373565e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (70592 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88946 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47369 > 40960). Running this sequence through the model will result in indexing errors
87%|████████▋ | 19137/22095 [32:49:27<2:39:54, 3.24s/it] {'loss': 0.3063, 'grad_norm': 0.6123744465516868, 'learning_rate': 4.6300408995476e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (45060 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41857 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (126448 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43347 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48755 > 40960). Running this sequence through the model will result in indexing errors
87%|████████▋ | 19138/22095 [32:49:30<2:43:32, 3.32s/it] {'loss': 0.3219, 'grad_norm': 0.5975181078415203, 'learning_rate': 4.6269611411189185e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047544 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 8cm\nB. 9cm\nC. 4cm\nD. 
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
87%|████████▋ | 19139/22095 [32:49:34<2:55:31, 3.56s/it] {'loss': 0.3384, 'grad_norm': 0.5996937805734076, 'learning_rate': 4.6238823576174817e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (58965 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45972 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77843 > 40960). Running this sequence through the model will result in indexing errors
87%|████████▋ | 19140/22095 [32:49:38<2:56:10, 3.58s/it] {'loss': 0.2838, 'grad_norm': 0.6347639658352031, 'learning_rate': 4.620804549109448e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
87%|████████▋ | 19141/22095 [32:49:45<3:54:16, 4.76s/it] {'loss': 0.475, 'grad_norm': 0.2863017410983737, 'learning_rate': 4.6177277156609634e-07, 'epoch': 0.87}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
87%|████████▋ | 19142/22095 [32:49:50<3:43:38, 4.54s/it] {'loss': 0.2555, 'grad_norm': 0.5374663072332243, 'learning_rate': 4.6146518573381314e-07, 'epoch': 0.87}
87%|████████▋ | 19143/22095 [32:49:53<3:22:04, 
4.11s/it] {'loss': 0.2865, 'grad_norm': 0.6177097756381084, 'learning_rate': 4.6115769742070326e-07, 'epoch': 0.87}
87%|████████▋ | 19144/22095 [32:49:56<3:05:43, 3.78s/it] {'loss': 0.3041, 'grad_norm': 0.5915875014756634, 'learning_rate': 4.608503066333742e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19145/22095 [32:50:05<4:35:43, 5.61s/it] {'loss': 0.4299, 'grad_norm': 0.2633069387160369, 'learning_rate': 4.6054301337843165e-07, 'epoch': 0.87}
87%|████████▋ | 19146/22095 [32:50:13<5:07:32, 6.26s/it] {'loss': 0.4688, 'grad_norm': 0.2848193531287552, 'learning_rate': 4.6023581766247825e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 364, but got module 1
87%|████████▋ | 19147/22095 [32:50:16<4:22:21, 5.34s/it] {'loss': 0.2719, 'grad_norm': 0.5622954185203691, 'learning_rate': 4.5992871949211373e-07, 'epoch': 0.87}
87%|████████▋ | 19148/22095 [32:50:24<4:54:18, 5.99s/it] {'loss': 0.4457, 'grad_norm': 0.2668497744033602, 'learning_rate': 4.596217188739377e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 9045977 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 6.8cm\nB. 7cm\nC. 5.4cm\nD. 6.4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
87%|████████▋ | 19149/22095 [32:50:27<4:13:36, 5.17s/it] {'loss': 0.2476, 'grad_norm': 0.6091572459376194, 'learning_rate': 4.593148158145455e-07, 'epoch': 0.87}
87%|████████▋ | 19150/22095 [32:50:31<3:45:51, 4.60s/it] {'loss': 0.2957, 'grad_norm': 0.5863981063835381, 'learning_rate': 4.59008010320533e-07, 'epoch': 0.87}
87%|████████▋ | 19151/22095 [32:50:34<3:28:11, 4.24s/it] {'loss': 0.2619, 'grad_norm': 0.7047953065256126, 'learning_rate': 4.587013023984921e-07, 'epoch': 0.87}
87%|████████▋ | 19152/22095 [32:50:37<3:11:01, 3.89s/it] {'loss': 0.2535, 'grad_norm': 0.7000958588469114, 'learning_rate': 4.583946920550114e-07, 'epoch': 0.87}
87%|████████▋ | 19153/22095 [32:50:40<3:01:49, 3.71s/it] {'loss': 0.2703, 'grad_norm': 0.6598872071664383, 'learning_rate': 4.580881792966807e-07, 'epoch': 0.87}
87%|████████▋ | 19154/22095 [32:50:43<2:49:48, 3.46s/it] {'loss': 0.2874, 'grad_norm': 0.602439683038341, 'learning_rate': 4.577817641300869e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in 
msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [578, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8460558 in VC:s3://internvl-moe-sft-data/. Exception: Image size [578, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65708, 'image': 'vrdu_texteq/astro-ph.CO/53cbc687-6a08-417d-a97d-5f1108acc286.png', 'image_wh': [[578, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'where $\\kappa$ is the extinction coefficient of the dust.'}]}
87%|████████▋ | 19155/22095 [32:50:47<2:50:09, 3.47s/it] {'loss': 0.2972, 'grad_norm': 0.590883109590152, 'learning_rate': 4.574754465618114e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (70326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118141 > 40960). 
Running this sequence through the model will result in indexing errors
87%|████████▋ | 19156/22095 [32:50:50<2:44:04, 3.35s/it] {'loss': 0.3008, 'grad_norm': 0.6449816228421348, 'learning_rate': 4.571692265984368e-07, 'epoch': 0.87}
VC:s3://gui-agent/data_20250612/mac/images/mac_gui_trajectory_1_0614/reminders_3/images/step_3.png 2025-08-29 00:48:49.596229 load time: 1020.53 ms
87%|████████▋ | 19157/22095 [32:50:53<2:38:25, 3.24s/it] {'loss': 0.28, 'grad_norm': 0.6160323402068825, 'learning_rate': 4.5686310424654325e-07, 'epoch': 0.87}
87%|████████▋ | 19158/22095 [32:50:56<2:37:18, 3.21s/it] {'loss': 0.2698, 'grad_norm': 0.6428005330677683, 'learning_rate': 4.565570795127106e-07, 'epoch': 0.87}
87%|████████▋ | 19159/22095 [32:50:59<2:37:44, 3.22s/it] {'loss': 0.3133, 'grad_norm': 0.5738385724075437, 'learning_rate': 4.5625115240351016e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (84846 > 40960). Running this sequence through the model will result in indexing errors
87%|████████▋ | 19160/22095 [32:51:02<2:33:09, 3.13s/it] {'loss': 0.2853, 'grad_norm': 0.8398364145820625, 'learning_rate': 4.559453229255173e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8959589 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10424, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 4cm\nB. 6cm\nC. 1cm\nD. 2cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
87%|████████▋ | 19161/22095 [32:51:05<2:30:34, 3.08s/it] {'loss': 0.2243, 'grad_norm': 0.5485241443634394, 'learning_rate': 4.5563959108530455e-07, 'epoch': 0.87}
87%|████████▋ | 19162/22095 [32:51:08<2:31:08, 3.09s/it] {'loss': 0.2844, 'grad_norm': 0.580763951997358, 'learning_rate': 4.553339568894399e-07, 'epoch': 0.87}
87%|████████▋ | 19163/22095 [32:51:12<2:45:44, 3.39s/it] {'loss': 0.2904, 'grad_norm': 0.6433553330581192, 'learning_rate': 4.550284203444899e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19164/22095 [32:51:20<3:51:05, 4.73s/it] {'loss': 0.4849, 'grad_norm': 0.4912122765876196, 'learning_rate': 4.5472298145702144e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [337, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8435156 in VC:s3://internvl-moe-sft-data/. Exception: Image size [337, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 39981, 'image': 'vrdu_texteq/astro-ph.CO/dca5ad16-b611-45f3-be0f-e1473f6dcbce.png', 'image_wh': [[337, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'Redshift slice $0.9 < z < 1.1$:'}]}
87%|████████▋ | 19165/22095 [32:51:24<3:34:30, 4.39s/it] {'loss': 0.2417, 'grad_norm': 0.6330924353046441, 'learning_rate': 4.5441764023359483e-07, 'epoch': 0.87}
87%|████████▋ | 19166/22095 [32:51:27<3:14:26, 3.98s/it] {'loss': 0.3085, 'grad_norm': 0.6120571184065314, 'learning_rate': 4.5411239668077366e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19167/22095 [32:51:33<3:44:47, 4.61s/it] {'loss': 0.4664, 'grad_norm': 0.2666466628316768, 'learning_rate': 4.5380725080511555e-07, 'epoch': 0.87}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
87%|████████▋ | 19168/22095 [32:51:36<3:27:45, 4.26s/it] {'loss': 0.2461, 'grad_norm': 0.6969170437946436, 'learning_rate': 4.5350220261317633e-07, 'epoch': 0.87}
87%|████████▋ | 19169/22095 [32:51:40<3:14:59, 4.00s/it] {'loss': 0.262, 'grad_norm': 0.6219580735827156, 'learning_rate': 4.5319725211151077e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (69202 > 40960). 
Running this sequence through the model will result in indexing errors
87%|████████▋ | 19170/22095 [32:51:43<3:05:31, 3.81s/it] {'loss': 0.2742, 'grad_norm': 0.5670702283976174, 'learning_rate': 4.5289239930667304e-07, 'epoch': 0.87}
87%|████████▋ | 19171/22095 [32:51:46<2:49:13, 3.47s/it] {'loss': 0.3061, 'grad_norm': 0.6359844552906783, 'learning_rate': 4.525876442052124e-07, 'epoch': 0.87}
87%|████████▋ | 19172/22095 [32:51:49<2:50:11, 3.49s/it] {'loss': 0.3253, 'grad_norm': 0.7597180178156122, 'learning_rate': 4.522829868136758e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (73164 > 40960). Running this sequence through the model will result in indexing errors
87%|████████▋ | 19173/22095 [32:51:54<3:11:14, 3.93s/it] {'loss': 0.2991, 'grad_norm': 0.5932075929375237, 'learning_rate': 4.519784271386107e-07, 'epoch': 0.87}
87%|████████▋ | 19174/22095 [32:51:58<3:07:36, 3.85s/it] {'loss': 0.2778, 'grad_norm': 0.6196670959860574, 'learning_rate': 4.516739651865615e-07, 'epoch': 0.87}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
87%|████████▋ | 19175/22095 [32:52:01<2:56:45, 3.63s/it] {'loss': 0.3191, 'grad_norm': 0.6200868791032506, 'learning_rate': 4.5136960096407e-07, 'epoch': 0.87}
87%|████████▋ | 19176/22095 [32:52:04<2:48:16, 3.46s/it] {'loss': 0.2861, 'grad_norm': 0.6199821020576625, 'learning_rate': 4.5106533447767496e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8574037 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 13658, 'image': '345410882.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a sci-fi book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a journey related book? 
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
87%|████████▋ | 19177/22095 [32:52:07<2:44:27, 3.38s/it] {'loss': 0.3108, 'grad_norm': 0.6538168468033612, 'learning_rate': 4.507611657339156e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19178/22095 [32:52:17<4:13:44, 5.22s/it] {'loss': 0.4587, 'grad_norm': 0.25948403348974586, 'learning_rate': 4.504570947393261e-07, 'epoch': 0.87}
87%|████████▋ | 19179/22095 [32:52:20<3:45:38, 4.64s/it] {'loss': 0.3174, 'grad_norm': 0.7001967108251121, 'learning_rate': 4.5015312150044177e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (48996 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104178 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51961 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46287 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80378 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51263 > 40960). 
Running this sequence through the model will result in indexing errors
87%|████████▋ | 19180/22095 [32:52:23<3:27:16, 4.27s/it] {'loss': 0.3012, 'grad_norm': 0.6252283624355447, 'learning_rate': 4.49849246023793e-07, 'epoch': 0.87}
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19181/22095 [32:52:30<4:07:08, 5.09s/it] {'loss': 0.4589, 'grad_norm': 0.2573979034500562, 'learning_rate': 4.4954546831590837e-07, 'epoch': 0.87}
87%|████████▋ | 19182/22095 [32:52:34<3:48:58, 4.72s/it] {'loss': 0.2555, 'grad_norm': 0.5719706327436379, 'learning_rate': 4.4924178838331554e-07, 'epoch': 0.87}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11259300 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [277, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7699, 'image': 'MMMU/Pharmacy/test_78_image_1.png', 'image_wh': [[277, 23]], 'conversations': [{'from': 'human', 'value': "\nPlease analyze all the relevant background knowledge involved in this question and give an in-depth and detailed introduction to each relevant knowledge point. You need to provide a clear explanation of each knowledge point involved in the question and answer, including its concept, definition, practical application, importance and influence in related fields. 
Please provide appropriate examples to support the explanation of each knowledge point, just like the detailed explanation in the textbook. Make sure the answer is clearly structured, well-organized, and covers all important aspects of each knowledge point. The answer requires at least 2,000 words, and avoid directly answering the answer to the specific question, and focus on the analysis and explanation of background knowledge.\nHere are the content and options of this question:\nQuestion: In a blast furnace, iron(lll) oxide is used to produce iron by the following (unbalanced) reaction: ,If 4.00 kg Fe2O3 are available to react, how many moles of CO are needed?\nOptions: ['75.15mol', '75.16mol', '75.17mol', '75.18mol']"}, {'from': 'gpt', 'value': '### Introduction\n\nUnderstanding the chemical reaction between iron(III) oxide (\\( \\text{Fe}_2\\text{O}_3 \\)) and carbon monoxide (\\( \\text{CO} \\)) to produce iron (\\( \\text{Fe} \\)) and carbon dioxide (\\( \\text{CO}_2 \\)) is crucial in various fields, including metallurgy, environmental science, and industrial chemistry. This reaction forms the core process in a blast furnace, which is essential in the production of iron from its ores. This essay provides a detailed exploration of the background knowledge involved, covering chemical reactions, stoichiometry, thermodynamics, industrial applications, and environmental considerations.\n\n### Chemical Reactions and Stoichiometry\n\n#### Concept and Definition\n\nA chemical reaction involves the transformation of reactants into products, characterized by the breaking and forming of chemical bonds. 
In our context, the reaction:\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + \\text{CO}(g) \\rightarrow \\text{Fe}(s) + \\text{CO}_2(g) \\]\n\nillustrates a redox reaction, where \\(\\text{Fe}_2\\text{O}_3\\) is reduced to \\(\\text{Fe}\\) and \\(\\text{CO}\\) is oxidized to \\(\\text{CO}_2\\).\n\n**Stoichiometry** is the quantitative relationship between reactants and products in a chemical reaction. It allows for the calculation of the amounts of substances consumed and produced.\n\n#### Practical Application\n\nStoichiometry is applied to ensure reactants are used efficiently in both laboratory and industrial processes. It is crucial for:\n\n- Calculating reactant and product quantities.\n- Scaling reactions for industrial production.\n- Balancing chemical equations.\n\nExample: In balancing the given reaction, coefficients ensure atom conservation.\n\nHow many moles of \\(\\text{CO}\\) are required to react with a given amount of \\(\\text{Fe}_2\\text{O}_3\\)?\n\n\\[ \\text{Fe}_2\\text{O}_3(s) + 3\\text{CO}(g) \\rightarrow 2\\text{Fe}(s) + 3\\text{CO}_2(g) \\]\n\nThis balanced equation indicates that 3 moles of \\(\\text{CO}\\) are needed for every mole of \\(\\text{Fe}_2\\text{O}_3\\).\n\n#### Importance and Influence\n\nStoichiometry is fundamental to:\n\n- Designing chemical processes.\n- Environmental assessments and waste reduction.\n- Pharmaceutical synthesis ensuring the right dosages.\n\n### Thermodynamics\n\n#### Concept and Definition\n\n**Thermodynamics** involves the study of energy changes, especially heat energy, in chemical reactions. In the context of \\(\\text{Fe}_2\\text{O}_3\\) reduction, the thermodynamic properties include:\n\n1. **Enthalpy (\\( \\Delta H \\))**: Heat change at constant pressure.\n2. **Gibbs Free Energy (\\( \\Delta G \\))**: Determines reaction spontaneity.\n3. 
**Entropy (\\( \\Delta S \\))**: Measures disorder or randomness.\n\n#### Practical Application\n\nThermodynamics guides:\n\n- The feasibility of reactions.\n- Temperature and pressure conditions for reactions.\n\nIn the blast furnace:\n\n- High temperatures are maintained to ensure \\(\\Delta G\\) is negative, favoring \\(\\text{Fe}\\) formation.\n- Coke combustion provides necessary heat.\n\nExample: The reduction of \\(\\text{Fe}_2\\text{O}_3\\) is endothermic, requiring heat input to progress.\n\n#### Importance and Influence\n\nThermodynamics affects:\n\n- Reaction design for energy efficiency.\n- Material choice and innovations.\n- Environmental impact assessments.\n\n### Industrial Applications: The Blast Furnace\n\n#### Concept and Definition\n\nA **blast furnace** is a large, steel stacked structure lined with refractory brick where iron ore, coke, and limestone are combined at high temperatures to produce molten iron, CO, and other byproducts.\n\n#### Practical Application\n\nCritical in metallurgy, blast furnaces:\n\n- Reduce iron ores to produce pig iron.\n- Generate \\(\\text{CO}\\) to act as a reducing agent.\n\n**Steps Involved**:\n1. Raw materials (iron ore, coke, limestone) are charged at the top.\n2. Hot air is blown at the bottom, initiating coke combustion.\n3. Heat decomposes and purifies materials.\n4. Molten iron and slag are tapped off separately.\n\nExample: Annually, blast furnaces produce millions of tons of iron, integral to steel manufacturing.\n\n#### Importance and Influence\n\nBlast furnaces are vital for:\n\n- Large-scale iron production.\n- Providing raw material for steelmaking.\n- Influencing technological advancements in construction and manufacturing.\n\n### Environmental Considerations\n\n#### Concept and Definition\n\n**Environmental science** assesses the impacts of chemical processes on ecosystems. 
Here, critical issues include air pollution, carbon emissions, and slag disposal.\n\n#### Practical Application\n\nEfforts in minimizing blast furnace impacts:\n\n- Implementing scrubbers to reduce \\(\\text{CO}_2\\) and \\(\\text{SO}_2\\) emissions.\n- Recycling slag into construction materials.\n- Innovating with carbon capture technology.\n\nExample: "Green steel" initiatives focus on reducing the carbon footprint of steel production by altering traditional blast furnace methods.\n\n#### Importance and Influence\n\nEnvironmental considerations drive:\n\n- Regulatory compliance in industries.\n- Sustainable development and practices.\n- Innovations in clean technology.\n\n### Related Fields and Innovations\n\n#### Concept and Definition\n\nEmerging technologies and fields, like **materials science** and **sustainable engineering**, are closely linked to advancements in reducing the environmental impact of iron production.\n\n#### Practical Application\n\n- **Materials science** develops corrosion-resistant and lightweight steel.\n- **Sustainable engineering** integrates renewable energy with traditional processes.\n\nExample: Hydrogen-based reduction processes are being researched to replace carbon-based reactions, aiming for a lower environmental impact.\n\n#### Importance and Influence\n\nInnovation in these fields contributes to:\n\n- Economic efficiency.\n- Reducing reliance on finite resources.\n- Enhancing global industrial sustainability.\n\n### Conclusion\n\nThe reduction of \\(\\text{Fe}_2\\text{O}_3\\) by \\(\\text{CO}\\) in a blast furnace encompasses a wealth of knowledge across various disciplines, from basic chemistry and stoichiometry to considerations in thermodynamics and environmental science. Proper understanding and application of these concepts are essential for efficient, sustainable, and technologically advanced production processes, addressing both contemporary industrial needs and future challenges. 
This foundational knowledge remains pivotal in academia and industry, influencing how resources are responsibly utilized and how related technologies are innovatively developed.'}]}
87%|████████▋ | 19183/22095 [32:52:38<3:39:00, 4.51s/it] {'loss': 0.2688, 'grad_norm': 0.6167229877308994, 'learning_rate': 4.4893820623254257e-07, 'epoch': 0.87}
87%|████████▋ | 19184/22095 [32:52:41<3:16:28, 4.05s/it] {'loss': 0.3255, 'grad_norm': 0.6365963953210592, 'learning_rate': 4.486347218701076e-07, 'epoch': 0.87}
87%|████████▋ | 19185/22095 [32:52:44<2:59:40, 3.70s/it] {'loss': 0.2828, 'grad_norm': 0.6035459960710321, 'learning_rate': 4.4833133530253425e-07, 'epoch': 0.87}
87%|████████▋ | 19186/22095 [32:52:48<2:59:30, 3.70s/it] {'loss': 0.2625, 'grad_norm': 0.7515385783969697, 'learning_rate': 4.4802804653634124e-07, 'epoch': 0.87}
87%|████████▋ | 19187/22095 [32:52:51<2:54:58, 3.61s/it] {'loss': 0.2819, 'grad_norm': 0.6160544269891035, 'learning_rate': 4.477248555780467e-07, 'epoch': 0.87}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
87%|████████▋ | 19188/22095 [32:53:02<4:36:21, 5.70s/it] {'loss': 0.4588, 'grad_norm': 0.2900133552382891, 'learning_rate': 4.4742176243416257e-07, 'epoch': 0.87}
87%|████████▋ | 19189/22095 [32:53:05<3:59:38, 4.95s/it] {'loss': 0.2858, 'grad_norm': 0.5679561213409132, 'learning_rate': 4.4711876711120206e-07, 'epoch': 0.87}
87%|████████▋ | 19190/22095 [32:53:08<3:30:58, 4.36s/it] {'loss': 0.3471, 'grad_norm': 
2.1890383143218557, 'learning_rate': 4.4681586961567714e-07, 'epoch': 0.87}
87%|████████▋ | 19191/22095 [32:53:11<3:18:31, 4.10s/it] {'loss': 0.2829, 'grad_norm': 0.5816623983622343, 'learning_rate': 4.4651306995409485e-07, 'epoch': 0.87}
87%|████████▋ | 19192/22095 [32:53:15<3:10:51, 3.94s/it] {'loss': 0.31, 'grad_norm': 0.5621643901170886, 'learning_rate': 4.462103681329616e-07, 'epoch': 0.87}
Token indices sequence length is longer than the specified maximum sequence length for this model (50326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51951 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (94890 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19193/22095 [32:53:18<3:02:23, 3.77s/it] {'loss': 0.3343, 'grad_norm': 0.6637047733200644, 'learning_rate': 4.4590776415878166e-07, 'epoch': 0.87} 87%|████████▋ | 19193/22095 [32:53:18<3:02:23, 3.77s/it] 87%|████████▋ | 19194/22095 [32:53:21<2:50:55, 3.54s/it] {'loss': 0.2687, 'grad_norm': 0.6641349132062052, 'learning_rate': 4.4560525803805654e-07, 'epoch': 0.87} 87%|████████▋ | 19194/22095 [32:53:21<2:50:55, 3.54s/it] 87%|████████▋ | 19195/22095 [32:53:24<2:39:16, 3.30s/it] {'loss': 0.2857, 'grad_norm': 0.5905567406914118, 'learning_rate': 4.453028497772877e-07, 'epoch': 0.87} 87%|████████▋ | 19195/22095 [32:53:24<2:39:16, 3.30s/it] 87%|████████▋ | 19196/22095 [32:53:27<2:40:10, 3.31s/it] {'loss': 0.2685, 'grad_norm': 0.6181811323044204, 'learning_rate': 4.4500053938297205e-07, 'epoch': 0.87} 87%|████████▋ | 19196/22095 [32:53:27<2:40:10, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74604 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55292 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19197/22095 [32:53:30<2:32:24, 3.16s/it] {'loss': 0.3152, 'grad_norm': 0.5688526447255133, 'learning_rate': 4.4469832686160395e-07, 'epoch': 0.87} 87%|████████▋ | 19197/22095 [32:53:30<2:32:24, 3.16s/it] 87%|████████▋ | 19198/22095 [32:53:33<2:28:36, 3.08s/it] {'loss': 0.2868, 'grad_norm': 0.5609431518771413, 'learning_rate': 4.443962122196782e-07, 'epoch': 0.87} 87%|████████▋ | 19198/22095 [32:53:33<2:28:36, 3.08s/it] 87%|████████▋ | 19199/22095 [32:53:37<2:44:42, 3.41s/it] {'loss': 0.2702, 'grad_norm': 0.61520595196673, 'learning_rate': 4.4409419546368735e-07, 'epoch': 0.87} 87%|████████▋ | 19199/22095 [32:53:37<2:44:42, 3.41s/it] 87%|████████▋ | 19200/22095 [32:53:41<2:42:53, 3.38s/it] {'loss': 0.2714, 'grad_norm': 0.6052659169977241, 'learning_rate': 4.437922766001201e-07, 'epoch': 0.87} 87%|████████▋ | 19200/22095 [32:53:41<2:42:53, 3.38s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19201/22095 [32:53:48<3:45:37, 4.68s/it] {'loss': 0.4929, 'grad_norm': 0.27276169252499394, 'learning_rate': 4.4349045563546245e-07, 'epoch': 0.87} 87%|████████▋ | 19201/22095 [32:53:48<3:45:37, 4.68s/it] 87%|████████▋ | 19202/22095 [32:53:52<3:28:25, 4.32s/it] {'loss': 0.2841, 'grad_norm': 0.6297723203334212, 'learning_rate': 4.4318873257620077e-07, 'epoch': 0.87} 87%|████████▋ | 19202/22095 [32:53:52<3:28:25, 4.32s/it] 87%|████████▋ | 19203/22095 [32:53:55<3:13:31, 4.02s/it] {'loss': 0.261, 'grad_norm': 0.5606502021223138, 'learning_rate': 4.428871074288188e-07, 'epoch': 0.87} 87%|████████▋ | 19203/22095 [32:53:55<3:13:31, 4.02s/it] 87%|████████▋ | 19204/22095 [32:53:58<2:59:16, 3.72s/it] {'loss': 0.2845, 'grad_norm': 0.6184436659663045, 'learning_rate': 4.425855801997969e-07, 'epoch': 0.87} 87%|████████▋ | 19204/22095 [32:53:58<2:59:16, 3.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19205/22095 
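The repeated "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings above come from samples whose tokenized length exceeds the model's 40960-token context window. A minimal pre-filtering sketch (hypothetical helper, not part of the training code; the real pipeline truncates some of these, per the "Truncating to 775" message later in the log):

```python
# Drop samples whose tokenized length exceeds the context window, mirroring
# the over-length warnings in the log. `tokenize` is any callable returning
# a token list; a real run would use the model's tokenizer instead.
MAX_LEN = 40960  # context length reported in the warnings

def filter_overlong(samples, tokenize, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by tokenized length."""
    kept, dropped = [], []
    for s in samples:
        n = len(tokenize(s))
        (kept if n <= max_len else dropped).append((s, n))
    return kept, dropped

# Toy tokenizer for illustration: one token per whitespace-separated word.
toy_tokenize = str.split
kept, dropped = filter_overlong(["a b", "x " * 50000], toy_tokenize)
```

Filtering (or length-aware packing) at dataset-build time avoids paying the tokenization cost again inside the dataloader on every epoch.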
[32:54:08<4:22:43, 5.45s/it] {'loss': 0.4507, 'grad_norm': 0.26524336851540486, 'learning_rate': 4.422841508956127e-07, 'epoch': 0.87} 87%|████████▋ | 19205/22095 [32:54:08<4:22:43, 5.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48425 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19206/22095 [32:54:12<4:02:34, 5.04s/it] {'loss': 0.2866, 'grad_norm': 0.6086760919671172, 'learning_rate': 4.419828195227455e-07, 'epoch': 0.87} 87%|████████▋ | 19206/22095 [32:54:12<4:02:34, 5.04s/it] 87%|████████▋ | 19207/22095 [32:54:16<3:44:07, 4.66s/it] {'loss': 0.2894, 'grad_norm': 0.6494256573886917, 'learning_rate': 4.416815860876672e-07, 'epoch': 0.87} 87%|████████▋ | 19207/22095 [32:54:16<3:44:07, 4.66s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8923671 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 46824, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 6\nB. 5\nC. 4\nD. 
3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 87%|████████▋ | 19208/22095 [32:54:19<3:30:52, 4.38s/it] {'loss': 0.2499, 'grad_norm': 0.6112122012915637, 'learning_rate': 4.413804505968533e-07, 'epoch': 0.87} 87%|████████▋ | 19208/22095 [32:54:19<3:30:52, 4.38s/it] 87%|████████▋ | 19209/22095 [32:54:22<3:07:17, 3.89s/it] {'loss': 0.2771, 'grad_norm': 0.6051172673894841, 'learning_rate': 4.410794130567725e-07, 'epoch': 0.87} 87%|████████▋ | 19209/22095 [32:54:22<3:07:17, 3.89s/it] 87%|████████▋ | 19210/22095 [32:54:25<2:56:03, 3.66s/it] {'loss': 0.28, 'grad_norm': 0.5811739345315933, 'learning_rate': 4.4077847347389236e-07, 'epoch': 0.87} 87%|████████▋ | 19210/22095 [32:54:25<2:56:03, 3.66s/it] 87%|████████▋ | 19211/22095 [32:54:29<2:55:06, 3.64s/it] {'loss': 0.3089, 'grad_norm': 0.6868280778020592, 'learning_rate': 4.404776318546805e-07, 'epoch': 0.87} 87%|████████▋ | 19211/22095 [32:54:29<2:55:06, 3.64s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19212/22095 [32:54:32<2:51:02, 3.56s/it] {'loss': 0.2917, 'grad_norm': 0.5590297594558641, 'learning_rate': 4.401768882056012e-07, 'epoch': 0.87} 87%|████████▋ | 19212/22095 [32:54:32<2:51:02, 3.56s/it] 87%|████████▋ | 19213/22095 [32:54:35<2:40:02, 3.33s/it] {'loss': 0.2982, 'grad_norm': 0.6357604627136214, 'learning_rate': 4.3987624253311657e-07, 'epoch': 0.87} 87%|████████▋ | 19213/22095 [32:54:35<2:40:02, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19214/22095 [32:54:45<4:14:16, 5.30s/it] {'loss': 0.4706, 'grad_norm': 0.25405496257449234, 'learning_rate': 4.3957569484368523e-07, 'epoch': 0.87} 87%|████████▋ | 19214/22095 [32:54:45<4:14:16, 5.30s/it] 87%|████████▋ | 19215/22095 [32:54:49<3:53:16, 4.86s/it] 
{'loss': 0.283, 'grad_norm': 0.5730905648746117, 'learning_rate': 4.3927524514376596e-07, 'epoch': 0.87} 87%|████████▋ | 19215/22095 [32:54:49<3:53:16, 4.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70861 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76684 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59050 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78515 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44186 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (42776 > 40960) for 4 sample(s). Truncating to 775 with 1 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (47269 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19216/22095 [32:54:52<3:26:16, 4.30s/it] {'loss': 0.2872, 'grad_norm': 0.5808504897339036, 'learning_rate': 4.389748934398164e-07, 'epoch': 0.87} 87%|████████▋ | 19216/22095 [32:54:52<3:26:16, 4.30s/it] 87%|████████▋ | 19217/22095 [32:54:55<3:09:05, 3.94s/it] {'loss': 0.3183, 'grad_norm': 0.5989775669737522, 'learning_rate': 4.386746397382863e-07, 'epoch': 0.87} 87%|████████▋ | 19217/22095 [32:54:55<3:09:05, 3.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [595, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8517204 in VC:s3://internvl-moe-sft-data/. Exception: Image size [595, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 158564, 'image': 'vrdu_texteq/astro-ph.CO/b208e17a-a83f-4f7a-83b0-cdfe5f006d2c.png', 'image_wh': [[595, 23]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'The mean-life of neutrons is taken to be $880.1$~s~.'}]} 87%|████████▋ | 19218/22095 [32:55:03<4:06:38, 5.14s/it] {'loss': 0.4799, 'grad_norm': 0.2539415810006778, 'learning_rate': 4.3837448404562886e-07, 'epoch': 0.87} 87%|████████▋ | 19218/22095 [32:55:03<4:06:38, 5.14s/it] 87%|████████▋ | 19219/22095 [32:55:06<3:40:35, 4.60s/it] {'loss': 0.2673, 'grad_norm': 0.7084341348935672, 'learning_rate': 4.3807442636829513e-07, 'epoch': 0.87} 87%|████████▋ | 19219/22095 [32:55:06<3:40:35, 4.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19220/22095 [32:55:09<3:14:53, 4.07s/it] {'loss': 0.2818, 'grad_norm': 0.6404269693956344, 'learning_rate': 4.3777446671273093e-07, 'epoch': 0.87} 87%|████████▋ | 19220/22095 [32:55:09<3:14:53, 4.07s/it] 87%|████████▋ | 19221/22095 [32:55:12<3:01:23, 3.79s/it] {'loss': 0.3055, 'grad_norm': 0.6818600806110079, 'learning_rate': 4.3747460508538064e-07, 'epoch': 0.87} 87%|████████▋ | 19221/22095 [32:55:12<3:01:23, 3.79s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [37, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8371036 in VC:s3://internvl-moe-sft-data/. Exception: Image size [37, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 37797, 'image': 'vrdu_table_final_2/astro-ph.CO/84c9b012-9b56-45c5-a20a-bbd88535d1d5.png', 'image_wh': [[37, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}} 0.6 \\end{tabular}\n```"}]} 87%|████████▋ | 19222/22095 [32:55:15<2:55:48, 3.67s/it] {'loss': 0.2804, 'grad_norm': 0.5967480519356161, 'learning_rate': 4.371748414926896e-07, 'epoch': 0.87} 87%|████████▋ | 19222/22095 [32:55:16<2:55:48, 3.67s/it] 87%|████████▋ | 19223/22095 [32:55:19<2:53:30, 3.62s/it] {'loss': 0.3152, 'grad_norm': 0.6705196046683227, 'learning_rate': 4.3687517594109664e-07, 'epoch': 0.87} 87%|████████▋ | 19223/22095 [32:55:19<2:53:30, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19224/22095 [32:55:28<4:17:40, 5.38s/it] {'loss': 0.4877, 'grad_norm': 0.27213909087842897, 'learning_rate': 4.3657560843704207e-07, 'epoch': 0.87} 87%|████████▋ | 19224/22095 [32:55:28<4:17:40, 5.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49713 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115637 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (114564 > 40960). 
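The `ValueError: Image size ... is too small. Minimum size is 28.` failures above (and the later `height:21 and width:135 must be larger than factor:28` raised from `smart_resize`) are the Qwen2-VL image processor rejecting images with any side smaller than its spatial patch factor of 28 px. A pre-filter replicating that check (hypothetical helper) can skip such samples before they reach the dataloader and trigger retries:

```python
# Reject images that Qwen2-VL's smart_resize would refuse: both sides must
# be at least the patch factor (28 px, per the errors in the log).
FACTOR = 28  # spatial patch factor from the ValueError messages

def is_trainable_image(width, height, factor=FACTOR):
    """True if the image passes the minimum-size check in smart_resize."""
    return width >= factor and height >= factor

# The failing samples in the log: 211x18, 595x23, 37x23, 135x21 -- all too short.
bad = [(211, 18), (595, 23), (37, 23), (135, 21)]
```

Alternatively, upscaling tiny images to at least 28 px per side before preprocessing keeps the samples instead of dropping them, though for 18-px-tall geometry diagrams the result may be illegible to the model anyway.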
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19225/22095 [32:55:32<3:57:36, 4.97s/it] {'loss': 0.2902, 'grad_norm': 0.5580890194538464, 'learning_rate': 4.362761389869624e-07, 'epoch': 0.87} 87%|████████▋ | 19225/22095 [32:55:32<3:57:36, 4.97s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922958 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 46111, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=6cm,在线段AB的延长线上有一点C,且BC=4cm,若点M、N分别为AB、BC的中点,那么M、N两点之间的距离为()\nA. 4cm\nB. 5cm\nC. 无法确定\nD. 1cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 87%|████████▋ | 19226/22095 [32:55:35<3:30:49, 4.41s/it] {'loss': 0.2687, 'grad_norm': 0.6272575113776604, 'learning_rate': 4.3597676759729147e-07, 'epoch': 0.87} 87%|████████▋ | 19226/22095 [32:55:35<3:30:49, 4.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (114532 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19227/22095 [32:55:39<3:25:08, 4.29s/it] {'loss': 0.3212, 'grad_norm': 0.6501694422134591, 'learning_rate': 4.356774942744618e-07, 'epoch': 0.87} 87%|████████▋ | 19227/22095 [32:55:39<3:25:08, 4.29s/it] 87%|████████▋ | 19228/22095 [32:55:43<3:16:08, 4.10s/it] {'loss': 0.3212, 'grad_norm': 0.6227079354249133, 'learning_rate': 4.353783190249061e-07, 'epoch': 0.87} 87%|████████▋ | 19228/22095 [32:55:43<3:16:08, 4.10s/it] 87%|████████▋ | 19229/22095 [32:55:47<3:09:09, 3.96s/it] {'loss': 0.3429, 'grad_norm': 0.6774682709699127, 'learning_rate': 4.350792418550509e-07, 'epoch': 0.87} 87%|████████▋ | 19229/22095 [32:55:47<3:09:09, 3.96s/it] 87%|████████▋ | 19230/22095 [32:55:50<2:58:51, 3.75s/it] {'loss': 0.3038, 'grad_norm': 0.6162439245033716, 'learning_rate': 4.3478026277132157e-07, 'epoch': 0.87} 87%|████████▋ | 19230/22095 [32:55:50<2:58:51, 3.75s/it] 87%|████████▋ | 19231/22095 [32:55:53<2:48:15, 3.53s/it] {'loss': 0.2752, 'grad_norm': 0.5618576367930656, 'learning_rate': 4.3448138178014354e-07, 'epoch': 0.87} 87%|████████▋ | 19231/22095 [32:55:53<2:48:15, 3.53s/it] 87%|████████▋ | 19232/22095 [32:55:56<2:43:36, 3.43s/it] {'loss': 0.2934, 'grad_norm': 0.8201144878058423, 'learning_rate': 4.3418259888794e-07, 'epoch': 0.87} 87%|████████▋ | 19232/22095 [32:55:56<2:43:36, 3.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19233/22095 [32:56:04<3:51:51, 4.86s/it] {'loss': 0.4781, 'grad_norm': 0.3217543622772321, 'learning_rate': 4.338839141011292e-07, 'epoch': 0.87} 87%|████████▋ | 19233/22095 [32:56:04<3:51:51, 4.86s/it] 87%|████████▋ | 19234/22095 [32:56:08<3:29:41, 4.40s/it] {'loss': 0.2976, 'grad_norm': 0.5878816880054325, 'learning_rate': 4.3358532742612814e-07, 'epoch': 0.87} 87%|████████▋ | 19234/22095 [32:56:08<3:29:41, 4.40s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19235/22095 
[32:56:15<4:13:07, 5.31s/it] {'loss': 0.4722, 'grad_norm': 0.26947232868969295, 'learning_rate': 4.3328683886935507e-07, 'epoch': 0.87} 87%|████████▋ | 19235/22095 [32:56:15<4:13:07, 5.31s/it] 87%|████████▋ | 19236/22095 [32:56:19<3:50:28, 4.84s/it] {'loss': 0.3097, 'grad_norm': 0.7166495013877396, 'learning_rate': 4.329884484372215e-07, 'epoch': 0.87} 87%|████████▋ | 19236/22095 [32:56:19<3:50:28, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53300 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68567 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19237/22095 [32:56:22<3:30:25, 4.42s/it] {'loss': 0.2757, 'grad_norm': 0.6034099383936006, 'learning_rate': 4.326901561361402e-07, 'epoch': 0.87} 87%|████████▋ | 19237/22095 [32:56:22<3:30:25, 4.42s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50703 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53559 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19238/22095 [32:56:25<3:08:42, 3.96s/it] {'loss': 0.2653, 'grad_norm': 0.5717583707219932, 'learning_rate': 4.3239196197252034e-07, 'epoch': 0.87} 87%|████████▋ | 19238/22095 [32:56:25<3:08:42, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (72994 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68874 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41460 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63099 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19239/22095 [32:56:28<2:52:46, 3.63s/it] {'loss': 0.2792, 'grad_norm': 0.5895933010053086, 'learning_rate': 4.3209386595276737e-07, 'epoch': 0.87} 87%|████████▋ | 19239/22095 [32:56:28<2:52:46, 3.63s/it] 87%|████████▋ | 19240/22095 [32:56:31<2:47:28, 3.52s/it] {'loss': 0.2686, 'grad_norm': 0.6151786474253428, 'learning_rate': 4.317958680832884e-07, 'epoch': 0.87} 87%|████████▋ | 19240/22095 [32:56:31<2:47:28, 3.52s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in sizes = reduce(lambda x, y: x + y, sizes) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 907, in process_image_unified ) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 432, in preprocess patches, image_grid_thw = self._preprocess( File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 252, in _preprocess resized_height, resized_width = smart_resize( File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/transformers/models/qwen2_vl/image_processing_qwen2_vl.py", line 69, in smart_resize raise ValueError(f"height:{height} and width:{width} must be larger than factor:{factor}") ValueError: height:21 and width:135 must be larger than factor:28 [Try #0] Failed to fetch sample 2206265 in VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/. Exception: height:21 and width:135 must be larger than factor:28 Problematic sample: {'image': 'a4b739659e7c325d58fab0a3e55e135c875dc9a485e2b2af942dbafdf4662908.png', 'conversations': [{'from': 'human', 'value': "\nThe Date picker trigger button icon's intended function:\nActs as a date picker trigger that opens a calendar dropdown when clicked, allowing users to select dates visually rather than typing them manually. It's part of the browser's default date input functionality."}, {'from': 'gpt', 'value': '[[674, 238, 756, 762]]'}, {'from': 'human', 'value': "The spatial layout of this Date picker trigger button icon:\nLocated at the absolute right edge of a date input field, aligned vertically in the center. The triangle appears immediately adjacent to another UI control in the same date field.\n\nRegarding the Date picker trigger button icon's functionality:\nActs as a date picker trigger that opens a calendar dropdown when clicked, allowing users to select dates visually rather than typing them manually. 
It's part of the browser's default date input functionality."}, {'from': 'gpt', 'value': '[[674, 238, 756, 762]]'}, {'from': 'human', 'value': 'Location and arrangement of this Date picker trigger button icon:\nLocated at the absolute right edge of a date input field, aligned vertically in the center. The triangle appears immediately adjacent to another UI control in the same date field.'}, {'from': 'gpt', 'value': '[[674, 238, 756, 762]]'}, {'from': 'human', 'value': "The Button (Spinner Button)'s visual characteristics are as follows:\nA small downward-pointing arrow button that's part of a number input control. The arrow has a dark color and appears in a square container with border styling.\n\nPositional details of the Button (Spinner Button):\nThis button is positioned on the right side of a date input field, located in a row with two identical date input controls. Each date field has a dd/mm/yyyy format placeholder text."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[785, 190, 852, 857]]'}, {'from': 'human', 'value': "The visual appearance of this Button (Spinner Button) is as follows:\n\nVisual Description: A small downward-pointing arrow button that's part of a number input control. The arrow has a dark color and appears in a square container with border styling."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[785, 190, 852, 857]]'}, {'from': 'human', 'value': "This Button (Spinner Button) serves the following purpose:\nWhen clicked, this button decreases the value in the associated date input field. 
It's part of the browser's default date picker UI controls that allow users to adjust date values without typing."}, {'from': 'gpt', 'value': 'x1, y1, x2, y2: [[785, 190, 852, 857]]'}, {'from': 'human', 'value': "The User Profile Component's appearance can be described as:\nA compact user profile display showing the username 'NotTerry' accompanied by a small circular avatar with a cyan/blue profile picture\n\nThe User Profile Component's placement in the interface:\nLocated in the live chat section on the right side of the interface, integrated into the message stream, appearing as a message sender's identification"}, {'from': 'gpt', 'value': '[[785, 190, 852, 857]]'}, {'from': 'human', 'value': "The User Profile Component's appearance can be described as:\nA compact user profile display showing the username 'NotTerry' accompanied by a small circular avatar with a cyan/blue profile picture\n\nFunctional capabilities of the User Profile Component:\nRepresents a user's identity and serves as a header for their chat messages in the live chat interface"}, {'from': 'gpt', 'value': '[[785, 190, 852, 857]]'}, {'from': 'human', 'value': "The visual attributes of this User Profile Component are:\nA compact user profile display showing the username 'NotTerry' accompanied by a small circular avatar with a cyan/blue profile picture\n\nThis User Profile Component is positioned as follows:\nLocated in the live chat section on the right side of the interface, integrated into the message stream, appearing as a message sender's identification\n\nUsage and purpose of this User Profile Component:\nRepresents a user's identity and serves as a header for their chat messages in the live chat interface"}, {'from': 'gpt', 'value': '[[785, 190, 852, 857]]'}, {'from': 'human', 'value': "Here's a detailed visual description of the Navigation Button:\nA navigation icon featuring three small squares arranged in a grid pattern, located within a dark navy interface. 
The icon appears to use a faintly glowing blue color against the dark background.\n\nThis Navigation Button allows users to:\nA navigation button that likely serves as a 'MAIN' menu control, allowing users to access primary navigation options or return to the main view of the application."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}, {'from': 'human', 'value': "The position of this Navigation Button can be described as:\nThe icon is positioned in the upper-left side of the interface, just below the RustyLoot logo. It's the first item in what appears to be a vertical navigation menu with other gaming-related options.\n\nThis Navigation Button serves the following purpose:\nA navigation button that likely serves as a 'MAIN' menu control, allowing users to access primary navigation options or return to the main view of the application."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}, {'from': 'human', 'value': "The Navigation Button's appearance can be described as:\nA navigation icon featuring three small squares arranged in a grid pattern, located within a dark navy interface. The icon appears to use a faintly glowing blue color against the dark background.\n\nPositional details of the Navigation Button:\nThe icon is positioned in the upper-left side of the interface, just below the RustyLoot logo. 
It's the first item in what appears to be a vertical navigation menu with other gaming-related options."}, {'from': 'gpt', 'value': '[[0, 0, 1000, 1000]]'}]} 87%|████████▋ | 19241/22095 [32:56:35<2:49:52, 3.57s/it] {'loss': 0.289, 'grad_norm': 0.6573071928254116, 'learning_rate': 4.3149796837048677e-07, 'epoch': 0.87} 87%|████████▋ | 19241/22095 [32:56:35<2:49:52, 3.57s/it] 87%|████████▋ | 19242/22095 [32:56:39<2:56:10, 3.71s/it] {'loss': 0.306, 'grad_norm': 0.6535222992402957, 'learning_rate': 4.3120016682076324e-07, 'epoch': 0.87} 87%|████████▋ | 19242/22095 [32:56:39<2:56:10, 3.71s/it] 87%|████████▋ | 19243/22095 [32:56:42<2:49:41, 3.57s/it] {'loss': 0.2703, 'grad_norm': 0.6144457344128956, 'learning_rate': 4.309024634405146e-07, 'epoch': 0.87} 87%|████████▋ | 19243/22095 [32:56:42<2:49:41, 3.57s/it] 87%|████████▋ | 19244/22095 [32:56:45<2:39:19, 3.35s/it] {'loss': 0.2945, 'grad_norm': 0.5685603172224465, 'learning_rate': 4.306048582361394e-07, 'epoch': 0.87} 87%|████████▋ | 19244/22095 [32:56:45<2:39:19, 3.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19245/22095 [32:56:55<4:07:12, 5.20s/it] {'loss': 0.4831, 'grad_norm': 0.2646618931089399, 'learning_rate': 4.3030735121403376e-07, 'epoch': 0.87} 87%|████████▋ | 19245/22095 [32:56:55<4:07:12, 5.20s/it] 87%|████████▋ | 19246/22095 [32:57:01<4:23:04, 5.54s/it] {'loss': 0.4574, 'grad_norm': 0.2749930163517183, 'learning_rate': 4.300099423805865e-07, 'epoch': 0.87} 87%|████████▋ | 19246/22095 [32:57:01<4:23:04, 5.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (128725 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48359 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78440 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60720 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19247/22095 [32:57:11<5:26:44, 6.88s/it] {'loss': 0.4804, 'grad_norm': 0.27635098523908386, 'learning_rate': 4.2971263174219014e-07, 'epoch': 0.87} 87%|████████▋ | 19247/22095 [32:57:11<5:26:44, 6.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 87%|████████▋ | 19248/22095 [32:57:14<4:33:22, 5.76s/it] {'loss': 0.2964, 'grad_norm': 0.5877024693849136, 'learning_rate': 4.2941541930523356e-07, 'epoch': 0.87} 87%|████████▋ | 19248/22095 [32:57:14<4:33:22, 5.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (115863 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68343 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52814 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19249/22095 [32:57:17<3:57:41, 5.01s/it] {'loss': 0.2738, 'grad_norm': 0.6122661833595895, 'learning_rate': 4.291183050761022e-07, 'epoch': 0.87} 87%|████████▋ | 19249/22095 [32:57:17<3:57:41, 5.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62596 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48150 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41674 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113196 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56502 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19250/22095 [32:57:21<3:31:12, 4.45s/it] {'loss': 0.2818, 'grad_norm': 0.6142665018253046, 'learning_rate': 4.288212890611787e-07, 'epoch': 0.87} 87%|████████▋ | 19250/22095 [32:57:21<3:31:12, 4.45s/it] 87%|████████▋ | 19251/22095 [32:57:24<3:14:35, 4.11s/it] {'loss': 0.3234, 'grad_norm': 0.6032876851784271, 'learning_rate': 4.28524371266848e-07, 'epoch': 0.87} 87%|████████▋ | 19251/22095 [32:57:24<3:14:35, 4.11s/it] 87%|████████▋ | 19252/22095 [32:57:27<3:03:29, 3.87s/it] {'loss': 0.3027, 'grad_norm': 0.6290539406396431, 'learning_rate': 4.2822755169948714e-07, 'epoch': 0.87} 87%|████████▋ | 19252/22095 [32:57:27<3:03:29, 3.87s/it] 87%|████████▋ | 19253/22095 [32:57:30<2:51:32, 3.62s/it] {'loss': 0.2438, 'grad_norm': 0.5551021014013245, 'learning_rate': 4.2793083036547554e-07, 'epoch': 0.87} 87%|████████▋ | 19253/22095 [32:57:30<2:51:32, 3.62s/it] 87%|████████▋ | 19254/22095 [32:57:34<2:49:31, 3.58s/it] {'loss': 0.2844, 'grad_norm': 0.7048705729104559, 'learning_rate': 4.276342072711881e-07, 'epoch': 0.87} 87%|████████▋ | 19254/22095 [32:57:34<2:49:31, 3.58s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 
19255/22095 [32:57:43<4:14:16, 5.37s/it] {'loss': 0.4484, 'grad_norm': 0.28930790958547165, 'learning_rate': 4.273376824229991e-07, 'epoch': 0.87} 87%|████████▋ | 19255/22095 [32:57:43<4:14:16, 5.37s/it] 87%|████████▋ | 19256/22095 [32:57:47<3:46:14, 4.78s/it] {'loss': 0.3384, 'grad_norm': 0.615694385686459, 'learning_rate': 4.270412558272785e-07, 'epoch': 0.87} 87%|████████▋ | 19256/22095 [32:57:47<3:46:14, 4.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19257/22095 [32:57:57<5:06:22, 6.48s/it] {'loss': 0.5046, 'grad_norm': 0.2746311252453542, 'learning_rate': 4.267449274903979e-07, 'epoch': 0.87} 87%|████████▋ | 19257/22095 [32:57:57<5:06:22, 6.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (131830 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43185 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117591 > 40960). 
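The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings mean some samples tokenize to far more than the model's 40960-token context; such samples either get truncated or overflow position indices at forward time. A minimal sketch of a length pre-filter one could run over the dataset before training (here `tokenize_fn` is a hypothetical stand-in for the real tokenizer call, and `max_len=40960` is taken from the warnings above):

```python
def filter_overlong(samples, tokenize_fn, max_len=40960):
    """Partition samples by tokenized length against the model's context limit.

    samples     -- iterable of raw text samples
    tokenize_fn -- callable returning a token list (assumption: stands in for
                   the actual tokenizer used by the training pipeline)
    max_len     -- model maximum sequence length (40960 in the log above)
    Returns (kept, dropped), each a list of (sample, token_count) pairs.
    """
    kept, dropped = [], []
    for s in samples:
        n = len(tokenize_fn(s))
        (kept if n <= max_len else dropped).append((s, n))
    return kept, dropped
```

Running this once offline and logging `dropped` would replace thousands of interleaved runtime warnings with a single audit list.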
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19258/22095 [32:58:04<5:04:49, 6.45s/it] {'loss': 0.4676, 'grad_norm': 0.25477527336527783, 'learning_rate': 4.2644869741872263e-07, 'epoch': 0.87} 87%|████████▋ | 19258/22095 [32:58:04<5:04:49, 6.45s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 87%|████████▋ | 19259/22095 [32:58:07<4:18:17, 5.46s/it] {'loss': 0.3168, 'grad_norm': 0.6237616386439251, 'learning_rate': 4.2615256561861773e-07, 'epoch': 0.87} 87%|████████▋ | 19259/22095 [32:58:07<4:18:17, 5.46s/it] 87%|████████▋ | 19260/22095 [32:58:10<3:44:39, 4.75s/it] {'loss': 0.2646, 'grad_norm': 0.5799414767663191, 'learning_rate': 4.258565320964464e-07, 'epoch': 0.87} 87%|████████▋ | 19260/22095 [32:58:10<3:44:39, 4.75s/it] 87%|████████▋ | 19261/22095 [32:58:13<3:29:18, 4.43s/it] {'loss': 0.2521, 'grad_norm': 0.5581433964083954, 'learning_rate': 4.2556059685857133e-07, 'epoch': 0.87} 87%|████████▋ | 19261/22095 [32:58:13<3:29:18, 4.43s/it] 87%|████████▋ | 19262/22095 [32:58:16<3:07:30, 3.97s/it] {'loss': 0.2704, 'grad_norm': 0.647202387736622, 'learning_rate': 4.252647599113491e-07, 'epoch': 0.87} 87%|████████▋ | 19262/22095 [32:58:16<3:07:30, 3.97s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19263/22095 [32:58:19<2:52:01, 3.64s/it] {'loss': 0.3396, 'grad_norm': 0.608873076960802, 'learning_rate': 4.2496902126113626e-07, 'epoch': 0.87} 87%|████████▋ | 19263/22095 [32:58:19<2:52:01, 3.64s/it] 87%|████████▋ | 19264/22095 [32:58:23<2:56:32, 3.74s/it] {'loss': 0.3353, 'grad_norm': 0.689268710860395, 'learning_rate': 4.246733809142889e-07, 'epoch': 0.87} 87%|████████▋ | 19264/22095 [32:58:23<2:56:32, 3.74s/it] 87%|████████▋ | 19265/22095 [32:58:27<3:01:06, 3.84s/it] {'loss': 0.282, 'grad_norm': 0.6566491745917127, 'learning_rate': 4.2437783887715745e-07, 'epoch': 0.87} 87%|████████▋ | 19265/22095 [32:58:27<3:01:06, 
3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19266/22095 [32:58:34<3:41:35, 4.70s/it] {'loss': 0.4821, 'grad_norm': 0.28732687237202653, 'learning_rate': 4.2408239515609407e-07, 'epoch': 0.87} 87%|████████▋ | 19266/22095 [32:58:34<3:41:35, 4.70s/it] 87%|████████▋ | 19267/22095 [32:58:38<3:36:05, 4.58s/it] {'loss': 0.2778, 'grad_norm': 0.6146920890965886, 'learning_rate': 4.2378704975744646e-07, 'epoch': 0.87} 87%|████████▋ | 19267/22095 [32:58:38<3:36:05, 4.58s/it] 87%|████████▋ | 19268/22095 [32:58:42<3:21:42, 4.28s/it] {'loss': 0.2898, 'grad_norm': 0.6326512945876744, 'learning_rate': 4.2349180268755953e-07, 'epoch': 0.87} 87%|████████▋ | 19268/22095 [32:58:42<3:21:42, 4.28s/it] 87%|████████▋ | 19269/22095 [32:58:45<3:02:32, 3.88s/it] {'loss': 0.3457, 'grad_norm': 0.6194643099384461, 'learning_rate': 4.231966539527782e-07, 'epoch': 0.87} 87%|████████▋ | 19269/22095 [32:58:45<3:02:32, 3.88s/it] 87%|████████▋ | 19270/22095 [32:58:48<2:59:40, 3.82s/it] {'loss': 0.2819, 'grad_norm': 0.5949243243637904, 'learning_rate': 4.2290160355944467e-07, 'epoch': 0.87} 87%|████████▋ | 19270/22095 [32:58:49<2:59:40, 3.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047925 in VC:s3://multi-modal/UniGeo/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 2\nB. 4\nC. 8\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由点D是线段AB的中点,得AD=\\frac{1}{2}AB=\\frac{1}{2}×16=8cm,由C是线段AD的中点,得CD=\\frac{1}{2}AD=\\frac{1}{2}×8=4cm.'}]} 87%|████████▋ | 19271/22095 [32:58:52<2:51:59, 3.65s/it] {'loss': 0.2972, 'grad_norm': 0.6479036250261243, 'learning_rate': 4.2260665151389825e-07, 'epoch': 0.87} 87%|████████▋ | 19271/22095 [32:58:52<2:51:59, 3.65s/it] 87%|████████▋ | 19272/22095 [32:58:55<2:43:14, 3.47s/it] {'loss': 0.2769, 'grad_norm': 0.6817991465424627, 'learning_rate': 4.223117978224761e-07, 'epoch': 0.87} 87%|████████▋ | 19272/22095 [32:58:55<2:43:14, 3.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45597 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44815 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87783 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19273/22095 [32:58:59<2:53:29, 3.69s/it] {'loss': 0.339, 'grad_norm': 0.6523627812235554, 'learning_rate': 4.2201704249151377e-07, 'epoch': 0.87} 87%|████████▋ | 19273/22095 [32:58:59<2:53:29, 3.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68057 > 40960). 
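The `ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28` failures above come from samples whose images have one edge below 28 pixels (presumably the vision encoder's effective patch size, e.g. a 14-px patch with 2x2 merging). The dataloader retries and skips these at runtime; a sketch of a pre-screen over the `Problematic sample` records as printed in the log (the `image_wh` key and its `[[width, height], ...]` shape are taken directly from those records):

```python
MIN_SIDE = 28  # minimum accepted edge length, from the error message in the log

def validate_sample(sample, min_side=MIN_SIDE):
    """Return True if every image in the sample meets the minimum edge length.

    `sample["image_wh"]` holds [width, height] pairs, matching the
    'Problematic sample' dicts printed above.
    """
    for w, h in sample.get("image_wh", []):
        if w < min_side or h < min_side:
            return False
    return True
```

Filtering with this before launch would avoid the fetch-retry churn these samples cause mid-epoch.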
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19274/22095 [32:59:02<2:49:20, 3.60s/it] {'loss': 0.3178, 'grad_norm': 0.630393557338104, 'learning_rate': 4.217223855273467e-07, 'epoch': 0.87} 87%|████████▋ | 19274/22095 [32:59:02<2:49:20, 3.60s/it] 87%|████████▋ | 19275/22095 [32:59:06<2:43:45, 3.48s/it] {'loss': 0.3071, 'grad_norm': 0.6068565329435825, 'learning_rate': 4.214278269363026e-07, 'epoch': 0.87} 87%|████████▋ | 19275/22095 [32:59:06<2:43:45, 3.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19276/22095 [32:59:13<3:45:08, 4.79s/it] {'loss': 0.4875, 'grad_norm': 0.26120053279786065, 'learning_rate': 4.211333667247125e-07, 'epoch': 0.87} 87%|████████▋ | 19276/22095 [32:59:13<3:45:08, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75125 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41936 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (88519 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78594 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19277/22095 [32:59:17<3:28:41, 4.44s/it] {'loss': 0.3256, 'grad_norm': 0.6359553199205671, 'learning_rate': 4.208390048989047e-07, 'epoch': 0.87} 87%|████████▋ | 19277/22095 [32:59:17<3:28:41, 4.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19278/22095 [32:59:28<5:00:52, 6.41s/it] {'loss': 0.449, 'grad_norm': 0.28486232174258747, 'learning_rate': 4.2054474146520254e-07, 'epoch': 0.87} 87%|████████▋ | 19278/22095 [32:59:28<5:00:52, 6.41s/it] 87%|████████▋ | 19279/22095 [32:59:32<4:18:41, 5.51s/it] {'loss': 0.3179, 'grad_norm': 0.5948076828239628, 'learning_rate': 4.202505764299286e-07, 'epoch': 0.87} 87%|████████▋ | 19279/22095 [32:59:32<4:18:41, 5.51s/it] 87%|████████▋ | 19280/22095 [32:59:35<3:50:29, 4.91s/it] {'loss': 0.2939, 'grad_norm': 0.683832652398244, 'learning_rate': 4.199565097994046e-07, 'epoch': 0.87} 87%|████████▋ | 19280/22095 [32:59:35<3:50:29, 4.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19281/22095 [32:59:46<5:10:35, 6.62s/it] {'loss': 0.4816, 'grad_norm': 0.29879773745404237, 'learning_rate': 4.1966254157994826e-07, 'epoch': 0.87} 87%|████████▋ | 19281/22095 [32:59:46<5:10:35, 6.62s/it] 87%|████████▋ | 19282/22095 [32:59:55<5:51:52, 7.51s/it] {'loss': 0.4963, 'grad_norm': 0.27659396071200854, 'learning_rate': 4.1936867177787723e-07, 'epoch': 0.87} 87%|████████▋ | 19282/22095 [32:59:55<5:51:52, 7.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (129783 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47514 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19283/22095 [33:00:00<5:16:48, 6.76s/it] {'loss': 0.3116, 'grad_norm': 0.6063704551071534, 'learning_rate': 4.190749003995037e-07, 'epoch': 0.87} 87%|████████▋ | 19283/22095 [33:00:00<5:16:48, 6.76s/it] 87%|████████▋ | 19284/22095 [33:00:04<4:37:20, 5.92s/it] {'loss': 0.2813, 'grad_norm': 0.6478424532617484, 'learning_rate': 4.187812274511427e-07, 'epoch': 0.87} 87%|████████▋ | 19284/22095 [33:00:04<4:37:20, 5.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19285/22095 [33:00:14<5:28:05, 7.01s/it] {'loss': 0.4547, 'grad_norm': 0.25002317875374175, 'learning_rate': 4.1848765293910187e-07, 'epoch': 0.87} 87%|████████▋ | 19285/22095 [33:00:14<5:28:05, 7.01s/it] 87%|████████▋ | 19286/22095 [33:00:18<4:42:57, 6.04s/it] {'loss': 0.3154, 'grad_norm': 0.8394752438160599, 'learning_rate': 4.181941768696912e-07, 'epoch': 0.87} 87%|████████▋ | 19286/22095 [33:00:18<4:42:57, 6.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19287/22095 [33:00:25<4:57:46, 6.36s/it] {'loss': 0.4757, 'grad_norm': 0.3167525239809427, 'learning_rate': 4.1790079924921625e-07, 'epoch': 0.87} 87%|████████▋ | 19287/22095 [33:00:25<4:57:46, 6.36s/it] 87%|████████▋ | 19288/22095 [33:00:28<4:15:38, 5.46s/it] {'loss': 0.2937, 'grad_norm': 0.6515488862914801, 'learning_rate': 4.176075200839791e-07, 'epoch': 0.87} 87%|████████▋ | 19288/22095 [33:00:28<4:15:38, 5.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54402 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59036 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75775 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50846 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19289/22095 [33:00:32<3:49:19, 4.90s/it] {'loss': 0.3317, 'grad_norm': 0.6422687423007235, 'learning_rate': 4.173143393802825e-07, 'epoch': 0.87} 87%|████████▋ | 19289/22095 [33:00:32<3:49:19, 4.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74471 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19290/22095 [33:00:35<3:22:55, 4.34s/it] {'loss': 0.2833, 'grad_norm': 0.6589076061519356, 'learning_rate': 4.170212571444271e-07, 'epoch': 0.87} 87%|████████▋ | 19290/22095 [33:00:35<3:22:55, 4.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19291/22095 [33:00:43<4:14:19, 5.44s/it] {'loss': 0.4838, 'grad_norm': 0.26402214706993543, 'learning_rate': 4.1672827338270884e-07, 'epoch': 0.87} 87%|████████▋ | 19291/22095 [33:00:43<4:14:19, 5.44s/it] 87%|████████▋ | 19292/22095 [33:00:46<3:46:16, 4.84s/it] {'loss': 0.2827, 'grad_norm': 0.5423897183305791, 'learning_rate': 4.1643538810142324e-07, 'epoch': 0.87} 87%|████████▋ | 19292/22095 [33:00:46<3:46:16, 4.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50344 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61672 > 40960). 
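The paired messages "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" indicate the loader counts image placeholder tokens in the prompt and patches in any that are missing before tokenization. A hedged sketch of that repair step (the `<image>` placeholder string is an assumption based on common Qwen-VL data formats; the real token depends on the chat template):

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; actual token is template-dependent

def fix_image_tokens(text, num_images):
    """Prepend placeholders until the text references every attached image."""
    missing = num_images - text.count(IMAGE_TOKEN)
    if missing > 0:
        text = (IMAGE_TOKEN + "\n") * missing + text
    return text
```

This mirrors the log's behavior of silently repairing rather than dropping the sample.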
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67497 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43074 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19293/22095 [33:00:50<3:33:49, 4.58s/it] {'loss': 0.29, 'grad_norm': 0.693746363923824, 'learning_rate': 4.1614260130686424e-07, 'epoch': 0.87} 87%|████████▋ | 19293/22095 [33:00:50<3:33:49, 4.58s/it] 87%|████████▋ | 19294/22095 [33:00:53<3:10:57, 4.09s/it] {'loss': 0.2907, 'grad_norm': 0.6786270636978193, 'learning_rate': 4.158499130053223e-07, 'epoch': 0.87} 87%|████████▋ | 19294/22095 [33:00:53<3:10:57, 4.09s/it] 87%|████████▋ | 19295/22095 [33:00:57<3:02:54, 3.92s/it] {'loss': 0.2811, 'grad_norm': 0.5433855176325215, 'learning_rate': 4.155573232030868e-07, 'epoch': 0.87} 87%|████████▋ | 19295/22095 [33:00:57<3:02:54, 3.92s/it] 87%|████████▋ | 19296/22095 [33:01:00<2:53:07, 3.71s/it] {'loss': 0.2744, 'grad_norm': 0.6064155161405602, 'learning_rate': 4.152648319064445e-07, 'epoch': 0.87} 87%|████████▋ | 19296/22095 [33:01:00<2:53:07, 3.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19297/22095 [33:01:09<4:15:52, 5.49s/it] {'loss': 0.4553, 'grad_norm': 0.4722005543780067, 'learning_rate': 4.1497243912167975e-07, 'epoch': 0.87} 87%|████████▋ | 19297/22095 [33:01:09<4:15:52, 5.49s/it] 87%|████████▋ | 19298/22095 [33:01:13<3:54:39, 5.03s/it] {'loss': 0.2969, 'grad_norm': 0.6905482528007223, 'learning_rate': 4.146801448550747e-07, 'epoch': 0.87} 87%|████████▋ | 19298/22095 [33:01:13<3:54:39, 5.03s/it] 87%|████████▋ | 19299/22095 [33:01:17<3:31:51, 4.55s/it] {'loss': 0.2956, 'grad_norm': 0.6159339429816463, 'learning_rate': 4.143879491129116e-07, 'epoch': 0.87} 
87%|████████▋ | 19299/22095 [33:01:17<3:31:51, 4.55s/it] 87%|████████▋ | 19300/22095 [33:01:20<3:07:51, 4.03s/it] {'loss': 0.2665, 'grad_norm': 0.5837472702885921, 'learning_rate': 4.140958519014682e-07, 'epoch': 0.87} 87%|████████▋ | 19300/22095 [33:01:20<3:07:51, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19301/22095 [33:01:29<4:26:49, 5.73s/it] {'loss': 0.443, 'grad_norm': 0.26396583160348003, 'learning_rate': 4.1380385322701945e-07, 'epoch': 0.87} 87%|████████▋ | 19301/22095 [33:01:29<4:26:49, 5.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68614 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64098 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19302/22095 [33:01:32<3:50:18, 4.95s/it] {'loss': 0.2684, 'grad_norm': 0.6096296037623206, 'learning_rate': 4.1351195309584034e-07, 'epoch': 0.87} 87%|████████▋ | 19302/22095 [33:01:32<3:50:18, 4.95s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [464, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8465554 in VC:s3://internvl-moe-sft-data/. Exception: Image size [464, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 115623, 'image': 'vrdu_texteq/astro-ph.CO/bd7ce36d-eb24-4cae-b316-3a620098827b.png', 'image_wh': [[464, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly identify and transcribe the text present in the image.'}, {'from': 'gpt', 'value': 'while the solution for the $\\epsilon=1$ case is'}]} 87%|████████▋ | 19303/22095 [33:01:36<3:27:06, 4.45s/it] {'loss': 0.2921, 'grad_norm': 0.5867769778710267, 'learning_rate': 4.132201515142037e-07, 'epoch': 0.87} 87%|████████▋ | 19303/22095 [33:01:36<3:27:06, 4.45s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19304/22095 [33:01:39<3:10:51, 4.10s/it] {'loss': 0.2702, 'grad_norm': 0.7311603956808999, 'learning_rate': 4.129284484883789e-07, 'epoch': 0.87} 87%|████████▋ | 19304/22095 [33:01:39<3:10:51, 4.10s/it] 87%|████████▋ | 19305/22095 [33:01:42<2:57:16, 3.81s/it] {'loss': 0.3367, 'grad_norm': 0.5948717141358987, 'learning_rate': 4.126368440246331e-07, 'epoch': 0.87} 87%|████████▋ | 19305/22095 [33:01:42<2:57:16, 3.81s/it] 87%|████████▋ | 19306/22095 [33:01:46<2:54:28, 3.75s/it] {'loss': 0.2991, 'grad_norm': 0.572718298861418, 'learning_rate': 4.1234533812923307e-07, 'epoch': 0.87} 87%|████████▋ | 19306/22095 [33:01:46<2:54:28, 3.75s/it] 87%|████████▋ | 19307/22095 [33:01:49<2:54:14, 3.75s/it] {'loss': 0.2663, 'grad_norm': 0.5464262209786708, 'learning_rate': 4.120539308084409e-07, 'epoch': 0.87} 87%|████████▋ | 19307/22095 [33:01:49<2:54:14, 3.75s/it] 87%|████████▋ | 19308/22095 [33:01:53<2:45:54, 3.57s/it] {'loss': 0.2939, 'grad_norm': 0.6203983342770546, 'learning_rate': 4.1176262206852e-07, 'epoch': 0.87} 87%|████████▋ | 19308/22095 [33:01:53<2:45:54, 3.57s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (104400000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb 
DOS attack. warnings.warn( 87%|████████▋ | 19309/22095 [33:01:55<2:35:02, 3.34s/it] {'loss': 0.2894, 'grad_norm': 0.6275878683163069, 'learning_rate': 4.114714119157287e-07, 'epoch': 0.87} 87%|████████▋ | 19309/22095 [33:01:55<2:35:02, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (45233 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106174 > 40960). Running this sequence through the model will result in indexing errors 87%|████████▋ | 19310/22095 [33:02:03<3:40:20, 4.75s/it] {'loss': 0.4439, 'grad_norm': 0.2656071642941677, 'learning_rate': 4.111803003563231e-07, 'epoch': 0.87} 87%|████████▋ | 19310/22095 [33:02:03<3:40:20, 4.75s/it] 87%|████████▋ | 19311/22095 [33:02:07<3:20:19, 4.32s/it] {'loss': 0.2719, 'grad_norm': 0.6453478987417932, 'learning_rate': 4.108892873965603e-07, 'epoch': 0.87} 87%|████████▋ | 19311/22095 [33:02:07<3:20:19, 4.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19312/22095 [33:02:10<3:03:38, 3.96s/it] {'loss': 0.3123, 'grad_norm': 0.6799707695236196, 'learning_rate': 4.105983730426916e-07, 'epoch': 0.87} 87%|████████▋ | 19312/22095 [33:02:10<3:03:38, 3.96s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19313/22095 [33:02:13<2:55:10, 3.78s/it] {'loss': 0.2624, 'grad_norm': 0.5816988735360314, 'learning_rate': 4.103075573009691e-07, 'epoch': 0.87} 87%|████████▋ | 19313/22095 [33:02:13<2:55:10, 3.78s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (143038128 pixels) exceeds limit of 
89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 87%|████████▋ | 19314/22095 [33:02:16<2:41:59, 3.50s/it] {'loss': 0.2969, 'grad_norm': 0.6331262659881548, 'learning_rate': 4.1001684017764053e-07, 'epoch': 0.87} 87%|████████▋ | 19314/22095 [33:02:16<2:41:59, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19315/22095 [33:02:19<2:38:48, 3.43s/it] {'loss': 0.2979, 'grad_norm': 0.7024568024365313, 'learning_rate': 4.097262216789538e-07, 'epoch': 0.87} 87%|████████▋ | 19315/22095 [33:02:19<2:38:48, 3.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71099 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79360 > 40960). 
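The `DecompressionBombWarning` above fires because Pillow caps images at `Image.MAX_IMAGE_PIXELS` (default 89478485, matching the limit printed in the log); the 104400000- and 143038128-pixel images exceed it. If such images are trusted, the cap can be raised process-wide via `PIL.Image.MAX_IMAGE_PIXELS`; alternatively one can pre-screen sizes. A minimal check, kept stdlib-only for illustration:

```python
PIL_DEFAULT_LIMIT = 89_478_485  # Pillow's default Image.MAX_IMAGE_PIXELS

def exceeds_pixel_limit(width, height, limit=PIL_DEFAULT_LIMIT):
    """True if an image of this size would trigger DecompressionBombWarning."""
    return width * height > limit
```

In the training script itself, `from PIL import Image; Image.MAX_IMAGE_PIXELS = None` (or a deliberately chosen higher cap) silences the warning for trusted datasets.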
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19316/22095 [33:02:22<2:31:12, 3.26s/it] {'loss': 0.2764, 'grad_norm': 0.6067067548843695, 'learning_rate': 4.0943570181115275e-07, 'epoch': 0.87} 87%|████████▋ | 19316/22095 [33:02:22<2:31:12, 3.26s/it] 87%|████████▋ | 19317/22095 [33:02:26<2:31:43, 3.28s/it] {'loss': 0.3062, 'grad_norm': 0.608306134835602, 'learning_rate': 4.091452805804785e-07, 'epoch': 0.87} 87%|████████▋ | 19317/22095 [33:02:26<2:31:43, 3.28s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19318/22095 [33:02:34<3:37:26, 4.70s/it] {'loss': 0.4438, 'grad_norm': 0.2563566022423515, 'learning_rate': 4.088549579931722e-07, 'epoch': 0.87} 87%|████████▋ | 19318/22095 [33:02:34<3:37:26, 4.70s/it] 87%|████████▋ | 19319/22095 [33:02:38<3:28:54, 4.52s/it] {'loss': 0.2903, 'grad_norm': 0.6088891635681201, 'learning_rate': 4.085647340554738e-07, 'epoch': 0.87} 87%|████████▋ | 19319/22095 [33:02:38<3:28:54, 4.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71218 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56191 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93996 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19320/22095 [33:02:41<3:18:54, 4.30s/it] {'loss': 0.2875, 'grad_norm': 0.6325971987131987, 'learning_rate': 4.0827460877361724e-07, 'epoch': 0.87} 87%|████████▋ | 19320/22095 [33:02:41<3:18:54, 4.30s/it] 87%|████████▋ | 19321/22095 [33:02:45<3:04:22, 3.99s/it] {'loss': 0.2751, 'grad_norm': 0.5723939317045896, 'learning_rate': 4.079845821538364e-07, 'epoch': 0.87} 87%|████████▋ | 19321/22095 [33:02:45<3:04:22, 3.99s/it] 87%|████████▋ | 19322/22095 [33:02:49<3:06:44, 4.04s/it] {'loss': 0.281, 'grad_norm': 0.7989220925329679, 'learning_rate': 4.0769465420236407e-07, 'epoch': 0.87} 87%|████████▋ | 19322/22095 [33:02:49<3:06:44, 4.04s/it] 87%|████████▋ | 19323/22095 [33:02:52<2:50:46, 3.70s/it] {'loss': 0.3168, 'grad_norm': 0.5519022007177745, 'learning_rate': 4.0740482492542864e-07, 'epoch': 0.87} 87%|████████▋ | 19323/22095 [33:02:52<2:50:46, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (42661 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41633 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49719 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50505 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19324/22095 [33:03:02<4:15:34, 5.53s/it] {'loss': 0.4728, 'grad_norm': 0.27196518760632776, 'learning_rate': 4.0711509432925955e-07, 'epoch': 0.87} 87%|████████▋ | 19324/22095 [33:03:02<4:15:34, 5.53s/it] 87%|████████▋ | 19325/22095 [33:03:05<3:51:45, 5.02s/it] {'loss': 0.2973, 'grad_norm': 0.5852629861057203, 'learning_rate': 4.0682546242008017e-07, 'epoch': 0.87} 87%|████████▋ | 19325/22095 [33:03:05<3:51:45, 5.02s/it] 87%|████████▋ | 19326/22095 [33:03:09<3:27:37, 4.50s/it] {'loss': 0.3121, 'grad_norm': 0.8416817462412677, 'learning_rate': 4.0653592920411545e-07, 'epoch': 0.87} 87%|████████▋ | 19326/22095 [33:03:09<3:27:37, 4.50s/it] 87%|████████▋ | 19327/22095 [33:03:12<3:12:46, 4.18s/it] {'loss': 0.2591, 'grad_norm': 0.656747312664601, 'learning_rate': 4.0624649468758494e-07, 'epoch': 0.87} 87%|████████▋ | 19327/22095 [33:03:12<3:12:46, 4.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 87%|████████▋ | 19328/22095 [33:03:22<4:25:30, 5.76s/it] {'loss': 0.4515, 'grad_norm': 0.2852204573784242, 'learning_rate': 4.0595715887670973e-07, 'epoch': 0.87} 87%|████████▋ | 19328/22095 [33:03:22<4:25:30, 5.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (99175 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80876 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (127383 > 40960). 
Running this sequence through the model will result in indexing errors 87%|████████▋ | 19329/22095 [33:03:25<3:47:46, 4.94s/it] {'loss': 0.2606, 'grad_norm': 0.5623803990170919, 'learning_rate': 4.056679217777054e-07, 'epoch': 0.87} 87%|████████▋ | 19329/22095 [33:03:25<3:47:46, 4.94s/it] 87%|████████▋ | 19330/22095 [33:03:28<3:20:28, 4.35s/it] {'loss': 0.2584, 'grad_norm': 0.5641726428377336, 'learning_rate': 4.0537878339678647e-07, 'epoch': 0.87} 87%|████████▋ | 19330/22095 [33:03:28<3:20:28, 4.35s/it] 87%|████████▋ | 19331/22095 [33:03:31<3:06:25, 4.05s/it] {'loss': 0.2992, 'grad_norm': 0.6174679946555726, 'learning_rate': 4.050897437401657e-07, 'epoch': 0.87} 87%|████████▋ | 19331/22095 [33:03:31<3:06:25, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 87%|████████▋ | 19332/22095 [33:03:39<3:57:48, 5.16s/it] {'loss': 0.4789, 'grad_norm': 0.2592118664339325, 'learning_rate': 4.0480080281405544e-07, 'epoch': 0.87} 87%|████████▋ | 19332/22095 [33:03:39<3:57:48, 5.16s/it] 87%|████████▋ | 19333/22095 [33:03:43<3:40:55, 4.80s/it] {'loss': 0.2708, 'grad_norm': 0.6222844653809816, 'learning_rate': 4.045119606246628e-07, 'epoch': 0.87} 87%|████████▋ | 19333/22095 [33:03:43<3:40:55, 4.80s/it] 88%|████████▊ | 19334/22095 [33:03:46<3:14:52, 4.23s/it] {'loss': 0.2892, 'grad_norm': 0.561262388965449, 'learning_rate': 4.0422321717819347e-07, 'epoch': 0.88} 88%|████████▊ | 19334/22095 [33:03:46<3:14:52, 4.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50340 > 40960). 
Running this sequence through the model will result in indexing errors 88%|████████▊ | 19335/22095 [33:03:49<3:04:53, 4.02s/it] {'loss': 0.277, 'grad_norm': 0.6080040410481603, 'learning_rate': 4.03934572480853e-07, 'epoch': 0.88} 88%|████████▊ | 19335/22095 [33:03:49<3:04:53, 4.02s/it] 88%|████████▊ | 19336/22095 [33:03:53<3:08:38, 4.10s/it] {'loss': 0.2712, 'grad_norm': 0.6680865928353568, 'learning_rate': 4.03646026538842e-07, 'epoch': 0.88} 88%|████████▊ | 19336/22095 [33:03:53<3:08:38, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19337/22095 [33:04:01<3:58:18, 5.18s/it] {'loss': 0.4717, 'grad_norm': 0.2658129002143189, 'learning_rate': 4.0335757935836216e-07, 'epoch': 0.88} 88%|████████▊ | 19337/22095 [33:04:01<3:58:18, 5.18s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (84925 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57001 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48850 > 40960). 
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19338/22095 [33:04:05<3:36:03, 4.70s/it] {'loss': 0.2591, 'grad_norm': 0.6268641775392694, 'learning_rate': 4.0306923094561025e-07, 'epoch': 0.88}
88%|████████▊ | 19339/22095 [33:04:08<3:13:59, 4.22s/it] {'loss': 0.3132, 'grad_norm': 0.6131332345601052, 'learning_rate': 4.027809813067812e-07, 'epoch': 0.88}
88%|████████▊ | 19340/22095 [33:04:11<3:03:43, 4.00s/it] {'loss': 0.2804, 'grad_norm': 0.6250152587687039, 'learning_rate': 4.024928304480696e-07, 'epoch': 0.88}
88%|████████▊ | 19341/22095 [33:04:14<2:47:31, 3.65s/it] {'loss': 0.2782, 'grad_norm': 0.6700185047153728, 'learning_rate': 4.022047783756683e-07, 'epoch': 0.88}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366229 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32975, 'image': 'vrdu_table_final_2/astro-ph.CO/783d5349-8b3c-4661-a855-db8973d6eb74.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
88%|████████▊ | 19342/22095 [33:04:18<2:49:02, 3.68s/it] {'loss': 0.2948, 'grad_norm': 0.5891930228963966, 'learning_rate': 4.0191682509576503e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19343/22095 [33:04:26<3:52:44, 5.07s/it] {'loss': 0.4634, 'grad_norm': 0.27279569033472617, 'learning_rate': 4.0162897061454596e-07, 'epoch': 0.88}
88%|████████▊ | 19344/22095 [33:04:29<3:27:40, 4.53s/it] {'loss': 0.3157, 'grad_norm': 0.6391335136643886, 'learning_rate': 4.0134121493819897e-07, 'epoch': 0.88}
88%|████████▊ | 19345/22095 [33:04:33<3:12:18, 4.20s/it] {'loss': 0.3173, 'grad_norm': 0.6616544774477524, 'learning_rate': 4.0105355807290523e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8942805 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 65958, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图所示,如果已知段AB=12,则将段AB延伸至点C,使BC=\\ frac{1}{2}AB,点D为段AC的中点,段BD的长度为()\nA. 3\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
88%|████████▊ | 19346/22095 [33:04:42<4:24:27, 5.77s/it] {'loss': 0.4681, 'grad_norm': 0.2702210661503924, 'learning_rate': 4.0076600002484533e-07, 'epoch': 0.88}
88%|████████▊ | 19347/22095 [33:04:49<4:38:02, 6.07s/it] {'loss': 0.475, 'grad_norm': 0.285967670957762, 'learning_rate': 4.004785408001982e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19348/22095 [33:04:53<4:03:37, 5.32s/it] {'loss': 0.3145, 'grad_norm': 0.6405104658919056, 'learning_rate': 4.001911804051417e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (90242 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88422 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19349/22095 [33:04:56<3:35:51, 4.72s/it] {'loss': 0.3267, 'grad_norm': 0.6727554754483442, 'learning_rate': 3.999039188458498e-07, 'epoch': 0.88}
88%|████████▊ | 19350/22095 [33:04:59<3:16:08, 4.29s/it] {'loss': 0.258, 'grad_norm': 0.6028274037616632, 'learning_rate': 3.996167561284936e-07, 'epoch': 0.88}
88%|████████▊ | 19351/22095 [33:05:02<2:55:30, 3.84s/it] {'loss': 0.2794, 'grad_norm': 0.5946123279302, 'learning_rate': 3.9932969225924546e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (54210 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87489 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19352/22095 [33:05:06<2:52:19, 3.77s/it] {'loss': 0.2662, 'grad_norm': 0.5871803342675288, 'learning_rate': 3.990427272442715e-07, 'epoch': 0.88}
88%|████████▊ | 19353/22095 [33:05:09<2:50:54, 3.74s/it] {'loss': 0.2839, 'grad_norm': 0.6323833531414866, 'learning_rate': 3.987558610897391e-07, 'epoch': 0.88}
88%|████████▊ | 19354/22095 [33:05:13<2:53:20, 3.79s/it] {'loss': 0.2938, 'grad_norm': 0.5265504978894515, 'learning_rate': 3.9846909380181096e-07, 'epoch': 0.88}
88%|████████▊ | 19355/22095 [33:05:17<2:48:00, 3.68s/it] {'loss': 0.3392, 'grad_norm': 0.6319468212936, 'learning_rate': 3.981824253866501e-07, 'epoch': 0.88}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8409143 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 11337, 'image': 'vrdu_table_final_2/astro-ph.CO/d77dc4e3-2f67-40c2-addd-2992078a076b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
88%|████████▊ | 19356/22095 [33:05:20<2:38:59, 3.48s/it] {'loss': 0.2472, 'grad_norm': 0.6347599472287441, 'learning_rate': 3.978958558504148e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19357/22095 [33:05:28<3:42:05, 4.87s/it] {'loss': 0.4936, 'grad_norm': 0.29546727970639075, 'learning_rate': 3.9760938519926404e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (48843 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59719 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46518 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19358/22095 [33:05:31<3:23:08, 4.45s/it] {'loss': 0.2786, 'grad_norm': 0.6320587601466439, 'learning_rate': 3.9732301343935243e-07, 'epoch': 0.88}
88%|████████▊ | 19359/22095 [33:05:35<3:10:08, 4.17s/it] {'loss': 0.2796, 'grad_norm': 0.63209996992386, 'learning_rate': 3.970367405768322e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54608 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54989 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46018 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113523 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19360/22095 [33:05:43<4:10:22, 5.49s/it] {'loss': 0.4559, 'grad_norm': 0.26120033002662857, 'learning_rate': 3.9675056661785563e-07, 'epoch': 0.88}
88%|████████▊ | 19361/22095 [33:05:47<3:48:15, 5.01s/it] {'loss': 0.2847, 'grad_norm': 0.5661813233089844, 'learning_rate': 3.964644915685728e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (94671 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69493 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127649 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65500 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19362/22095 [33:05:51<3:33:32, 4.69s/it] {'loss': 0.3123, 'grad_norm': 0.6678577903977209, 'learning_rate': 3.961785154351289e-07, 'epoch': 0.88}
88%|████████▊ | 19363/22095 [33:05:55<3:25:02, 4.50s/it] {'loss': 0.3141, 'grad_norm': 0.6577484082745595, 'learning_rate': 3.9589263822366886e-07, 'epoch': 0.88}
88%|████████▊ | 19364/22095 [33:05:58<3:05:15, 4.07s/it] {'loss': 0.2685, 'grad_norm': 0.6469438211303374, 'learning_rate': 3.9560685994033566e-07, 'epoch': 0.88}
88%|████████▊ | 19365/22095 [33:06:02<2:55:55, 3.87s/it] {'loss': 0.2983, 'grad_norm': 0.6733623265000933, 'learning_rate': 3.9532118059126935e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49655 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19366/22095 [33:06:07<3:19:22, 4.38s/it] {'loss': 0.4546, 'grad_norm': 0.2691216368567256, 'learning_rate': 3.9503560018260945e-07, 'epoch': 0.88}
88%|████████▊ | 19367/22095 [33:06:11<3:05:39, 4.08s/it] {'loss': 0.285, 'grad_norm': 0.631367657505299, 'learning_rate': 3.9475011872049164e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19368/22095 [33:06:20<4:19:52, 5.72s/it] {'loss': 0.4689, 'grad_norm': 0.28257712047841743, 'learning_rate': 3.9446473621104877e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (44330 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19369/22095 [33:06:24<3:56:48, 5.21s/it] {'loss': 0.3236, 'grad_norm': 0.6845795804766492, 'learning_rate': 3.9417945266041367e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19370/22095 [33:06:34<4:54:07, 6.48s/it] {'loss': 0.4324, 'grad_norm': 0.2629276730359694, 'learning_rate': 3.9389426807471764e-07, 'epoch': 0.88}
88%|████████▊ | 19371/22095 [33:06:38<4:20:58, 5.75s/it] {'loss': 0.2742, 'grad_norm': 0.6353848202383956, 'learning_rate': 3.9360918246008684e-07, 'epoch': 0.88}
88%|████████▊ | 19372/22095 [33:06:41<3:51:19, 5.10s/it] {'loss': 0.321, 'grad_norm': 0.5924153676161918, 'learning_rate': 3.933241958226469e-07, 'epoch': 0.88}
88%|████████▊ | 19373/22095 [33:06:44<3:19:25, 4.40s/it] {'loss': 0.2434, 'grad_norm': 0.7712837707043331, 'learning_rate': 3.930393081685213e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308730 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2pR0vcrsTMeJjSszhXXcGCFXa_!!1071328078.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n检索图中的字句,忽略背景,只提供文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n3条\n省\n8.2元\n纯灰+条纹橘+条纹灰3条装\nXL\nXXL\nL\nM\n130-145\n160-180\n115-130\n<115斤\n145-160\n斤'}]}
88%|████████▊ | 19374/22095 [33:06:53<4:27:50, 5.91s/it] {'loss': 0.4528, 'grad_norm': 0.2724202673210525, 'learning_rate': 3.9275451950383346e-07, 'epoch': 0.88}
88%|████████▊ | 19375/22095 [33:06:57<3:59:52, 5.29s/it] {'loss': 0.3075, 'grad_norm': 0.6339317784538435, 'learning_rate': 3.924698298346996e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (51233 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48304 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19376/22095 [33:07:00<3:28:34, 4.60s/it] {'loss': 0.3037, 'grad_norm': 0.6102452598889615, 'learning_rate': 3.9218523916723814e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19377/22095 [33:07:05<3:34:13, 4.73s/it] {'loss': 0.4438, 'grad_norm': 0.24234823449444134, 'learning_rate': 3.9190074750756424e-07, 'epoch': 0.88}
88%|████████▊ | 19378/22095 [33:07:08<3:12:41, 4.26s/it] {'loss': 0.3213, 'grad_norm': 0.7741991460040867, 'learning_rate': 3.916163548617913e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19379/22095 [33:07:12<3:03:06, 4.05s/it] {'loss': 0.2576, 'grad_norm': 0.5343018843761397, 'learning_rate': 3.913320612360283e-07, 'epoch': 0.88}
88%|████████▊ | 19380/22095 [33:07:15<2:48:46, 3.73s/it] {'loss': 0.2738, 'grad_norm': 0.6245680766490284, 'learning_rate': 3.9104786663638537e-07, 'epoch': 0.88}
88%|████████▊ | 19381/22095 [33:07:18<2:36:44, 3.47s/it] {'loss': 0.3479, 'grad_norm': 0.6527068583160411, 'learning_rate': 3.9076377106896765e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19382/22095 [33:07:22<2:48:17, 3.72s/it] {'loss': 0.3232, 'grad_norm': 0.5873223696868036, 'learning_rate': 3.904797745398814e-07, 'epoch': 0.88}
88%|████████▊ | 19383/22095 [33:07:26<2:48:39, 3.73s/it] {'loss': 0.3038, 'grad_norm': 0.8239279730871042, 'learning_rate': 3.901958770552272e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19384/22095 [33:07:36<4:10:10, 5.54s/it] {'loss': 0.4633, 'grad_norm': 0.2989201442753827, 'learning_rate': 3.899120786211058e-07, 'epoch': 0.88}
88%|████████▊ | 19385/22095 [33:07:39<3:39:19, 4.86s/it] {'loss': 0.3101, 'grad_norm': 0.588468140386913, 'learning_rate': 3.8962837924361454e-07, 'epoch': 0.88}
88%|████████▊ | 19386/22095 [33:07:43<3:27:05, 4.59s/it] {'loss': 0.3124, 'grad_norm': 0.6190450837806823, 'learning_rate': 3.893447789288507e-07, 'epoch': 0.88}
88%|████████▊ | 19387/22095 [33:07:46<3:04:52, 4.10s/it] {'loss': 0.3072, 'grad_norm': 0.5657772448108447, 'learning_rate': 3.890612776829067e-07, 'epoch': 0.88}
88%|████████▊ | 19388/22095 [33:07:49<2:49:30, 3.76s/it] {'loss': 0.2591, 'grad_norm': 0.7836726961639346, 'learning_rate': 3.887778755118743e-07, 'epoch': 0.88}
88%|████████▊ | 19389/22095 [33:07:53<3:00:43, 4.01s/it] {'loss': 0.3644, 'grad_norm': 0.6392129475718579, 'learning_rate': 3.884945724218425e-07, 'epoch': 0.88}
88%|████████▊ | 19390/22095 [33:07:56<2:45:24, 3.67s/it] {'loss': 0.2711, 'grad_norm': 0.7234741241049042, 'learning_rate': 3.882113684188998e-07, 'epoch': 0.88}
88%|████████▊ | 19391/22095 [33:08:00<2:50:36, 3.79s/it] {'loss': 0.3199, 'grad_norm': 0.6200475117502785, 'learning_rate': 3.879282635091308e-07, 'epoch': 0.88}
88%|████████▊ | 19392/22095 [33:08:03<2:39:28, 3.54s/it] {'loss': 0.3023, 'grad_norm': 0.6234293255662298, 'learning_rate': 3.876452576986184e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19393/22095 [33:08:07<2:45:45, 3.68s/it] {'loss': 0.2745, 'grad_norm': 0.701509162656271, 'learning_rate': 3.8736235099344375e-07, 'epoch': 0.88}
88%|████████▊ | 19394/22095 [33:08:11<2:45:28, 3.68s/it] {'loss': 0.2669, 'grad_norm': 0.6403016579755417, 'learning_rate': 3.870795433996849e-07, 'epoch': 0.88}
88%|████████▊ | 19395/22095 [33:08:15<2:49:01, 3.76s/it] {'loss': 0.3066, 'grad_norm': 0.6178465099533741, 'learning_rate': 3.8679683492342023e-07, 'epoch': 0.88}
88%|████████▊ | 19396/22095 [33:08:18<2:43:10, 3.63s/it] {'loss': 0.3097, 'grad_norm': 0.5918812381592131, 'learning_rate': 3.865142255707222e-07, 'epoch': 0.88}
88%|████████▊ | 19397/22095 [33:08:21<2:33:39, 3.42s/it] {'loss': 0.2848, 'grad_norm': 0.6154359170578279, 'learning_rate': 3.862317153476647e-07, 'epoch': 0.88}
88%|████████▊ | 19398/22095 [33:08:25<2:38:56, 3.54s/it] {'loss': 0.3003, 'grad_norm': 0.60710922438913, 'learning_rate': 3.859493042603174e-07, 'epoch': 0.88}
88%|████████▊ | 19399/22095 [33:08:28<2:36:26, 3.48s/it] {'loss': 0.2623, 'grad_norm': 0.6197900213750153, 'learning_rate': 3.856669923147488e-07, 'epoch': 0.88}
88%|████████▊ | 19400/22095 [33:08:32<2:43:11, 3.63s/it] {'loss': 0.2578, 'grad_norm': 0.6219230728627466, 'learning_rate': 3.8538477951702515e-07, 'epoch': 0.88}
88%|████████▊ | 19401/22095 [33:08:36<2:39:47, 3.56s/it] {'loss': 0.2988, 'grad_norm': 0.6047950271722398, 'learning_rate': 3.8510266587320876e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (54901 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67381 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19402/22095 [33:08:39<2:32:41, 3.40s/it] {'loss': 0.3019, 'grad_norm': 0.6586513286928564, 'learning_rate': 3.8482065138936263e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19403/22095 [33:08:45<3:13:02, 4.30s/it] {'loss': 0.4491, 'grad_norm': 0.27443656822880147, 'learning_rate': 3.84538736071548e-07, 'epoch': 0.88}
88%|████████▊ | 19404/22095 [33:08:49<3:04:37, 4.12s/it] {'loss': 0.2536, 'grad_norm': 0.6154796542159318, 'learning_rate': 3.8425691992581836e-07, 'epoch': 0.88}
88%|████████▊ | 19405/22095 [33:08:52<2:55:32, 3.92s/it] {'loss': 0.2785, 'grad_norm': 0.6173411982057696, 'learning_rate': 3.839752029582322e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (45385 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19406/22095 [33:08:56<2:49:27, 3.78s/it] {'loss': 0.2677, 'grad_norm': 0.6446679072516598, 'learning_rate': 3.836935851748419e-07, 'epoch': 0.88}
88%|████████▊ | 19407/22095 [33:09:00<2:59:12, 4.00s/it] {'loss': 0.3072, 'grad_norm': 0.6188887158583402, 'learning_rate': 3.834120665816993e-07, 'epoch': 0.88}
88%|████████▊ | 19408/22095 [33:09:04<2:55:59, 3.93s/it] {'loss': 0.2933, 'grad_norm': 0.6223118704251513, 'learning_rate': 3.8313064718485116e-07, 'epoch': 0.88}
88%|████████▊ | 19409/22095 [33:09:07<2:46:15, 3.71s/it] {'loss': 0.2547, 'grad_norm': 0.6092096403353362, 'learning_rate': 3.8284932699034717e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19410/22095 [33:09:17<4:02:31, 5.42s/it] {'loss': 0.4956, 'grad_norm': 0.2827938848992071, 'learning_rate': 3.825681060042297e-07, 'epoch': 0.88}
88%|████████▊ | 19411/22095 [33:09:21<3:41:57, 4.96s/it] {'loss': 0.3117, 'grad_norm': 0.6332323053744587, 'learning_rate': 3.822869842325427e-07, 'epoch': 0.88}
88%|████████▊ | 19412/22095 [33:09:24<3:19:17, 4.46s/it] {'loss': 0.3028, 'grad_norm': 0.5821023599323277, 'learning_rate': 3.8200596168132596e-07, 'epoch': 0.88}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 95, in pil_loader
    img = Image.open(buff)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 993, in convert
    self.load()
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/ImageFile.py", line 319, in load
    raise _get_oserror(err_code, encoder=False)
OSError: unrecognized data stream contents when reading image file
[Try #0] Failed to fetch sample 6777584 in VC:s3://gui-agent/data_20250623/windows_augment/images. Exception: unrecognized data stream contents when reading image file
Problematic sample: {'image': 'autocad/20250509_125727_1/images/before_screenshot_1_id_121_function_1_crop_0_grounding_instructions_point_o_paste.png', 'conversations': [{'from': 'human', 'value': "\nSelect 'CUSTOMIZE' to begin editing and managing your AutoCAD tool palettes."}, {'from': 'gpt', 'value': '\nclick(x=0.4773, y=0.1250)\n'}], 'width': 3024, 'height': 1964}
88%|████████▊ | 19413/22095 [33:09:27<2:59:22, 4.01s/it] {'loss': 0.2867, 'grad_norm': 0.6325107408918255, 'learning_rate': 3.8172503835661846e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19414/22095 [33:09:36<4:13:02, 5.66s/it] {'loss': 0.4556, 'grad_norm': 0.2831159664124358, 'learning_rate': 3.814442142644548e-07, 'epoch': 0.88}
88%|████████▊ | 19415/22095 [33:09:40<3:52:08, 5.20s/it] {'loss': 0.2689, 'grad_norm': 0.5849533493887362, 'learning_rate': 3.8116348941087176e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (49245 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44852 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19416/22095 [33:09:44<3:29:31, 4.69s/it] {'loss': 0.2817, 'grad_norm': 0.6519080783139413, 'learning_rate': 3.808828638018991e-07, 'epoch': 0.88}
88%|████████▊ | 19417/22095 [33:09:48<3:17:23, 4.42s/it] {'loss': 0.2636, 'grad_norm': 0.5594713028214605, 'learning_rate': 3.8060233744356634e-07, 'epoch': 0.88}
88%|████████▊ | 19418/22095 [33:09:51<3:03:53, 4.12s/it] {'loss': 0.2823, 'grad_norm': 0.5705113920596403, 'learning_rate': 3.8032191034190204e-07, 'epoch': 0.88}
88%|████████▊ | 19419/22095 [33:09:54<2:51:07, 3.84s/it] {'loss': 0.2877, 'grad_norm': 0.6227759180273239, 'learning_rate': 3.8004158250293246e-07, 'epoch': 0.88}
88%|████████▊ | 19420/22095 [33:09:57<2:42:10, 3.64s/it] {'loss': 0.2699, 'grad_norm': 0.6259684011915944, 'learning_rate': 3.7976135393268057e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19421/22095 [33:10:02<2:48:40, 3.78s/it] {'loss': 0.2828, 'grad_norm': 0.6387988070703673, 'learning_rate': 3.79481224637166e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (49307 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54585 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43541 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19422/22095 [33:10:05<2:38:18, 3.55s/it] {'loss': 0.3304, 'grad_norm': 0.6254957096979336, 'learning_rate': 3.7920119462241e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (59388 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (48293 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (56230 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19423/22095 [33:10:09<2:44:21, 3.69s/it] {'loss': 0.2539, 'grad_norm': 0.5490326198817921, 'learning_rate': 3.789212638944273e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19424/22095 [33:10:19<4:17:09, 5.78s/it] {'loss': 0.4545, 'grad_norm': 0.25844367227032256, 'learning_rate': 3.786414324592358e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19425/22095 [33:10:24<3:59:16, 5.38s/it] {'loss': 0.276, 'grad_norm': 0.707449751803239, 'learning_rate': 3.7836170032284516e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (55526 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55281 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50432 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50111 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19426/22095 [33:10:27<3:27:24, 4.66s/it] {'loss': 0.2782, 'grad_norm': 0.7146855031244232, 'learning_rate': 3.7808206749126777e-07, 'epoch': 0.88}
88%|████████▊ | 19427/22095 [33:10:30<3:03:22, 4.12s/it] {'loss': 0.2449, 'grad_norm': 0.681521661479075, 'learning_rate': 3.778025339705116e-07, 'epoch': 0.88}
88%|████████▊ | 19428/22095 [33:10:33<2:48:52, 3.80s/it] {'loss': 0.2698, 'grad_norm': 0.624553488593751, 'learning_rate': 3.7752309976658295e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (68750 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81540 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116081 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (150009 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19429/22095 [33:10:36<2:39:29, 3.59s/it] {'loss': 0.2938, 'grad_norm': 0.5821597816164235, 'learning_rate': 3.7724376488548655e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19430/22095 [33:10:46<4:09:09, 5.61s/it] {'loss': 0.4679, 'grad_norm': 0.25441806517758003, 'learning_rate': 3.7696452933322305e-07, 'epoch': 0.88}
88%|████████▊ | 19431/22095 [33:10:56<5:11:01, 7.01s/it] {'loss': 0.436, 'grad_norm': 0.24762157896644646, 'learning_rate': 3.766853931157932e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 364, but got module 1
88%|████████▊ | 19432/22095 [33:11:00<4:21:12, 5.89s/it] {'loss': 0.317, 'grad_norm': 0.5959314807385596, 'learning_rate': 3.7640635623919674e-07, 'epoch': 0.88}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
88%|████████▊ | 19433/22095 [33:11:03<3:52:57, 5.25s/it] {'loss': 0.3062, 'grad_norm': 0.6562698471158812, 'learning_rate': 3.761274187094255e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
88%|████████▊ | 19434/22095 [33:11:12<4:37:27, 6.26s/it] {'loss': 0.4677, 'grad_norm': 0.27390885110652485, 'learning_rate': 3.758485805324746e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (44559 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50181 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19435/22095 [33:11:15<4:00:57, 5.44s/it] {'loss': 0.2797, 'grad_norm': 0.692513863185441, 'learning_rate': 3.7556984171433663e-07, 'epoch': 0.88} 88%|████████▊ | 19435/22095 [33:11:15<4:00:57, 5.44s/it] 88%|████████▊ | 19436/22095 [33:11:19<3:38:46, 4.94s/it] {'loss': 0.2814, 'grad_norm': 0.6152037869565279, 'learning_rate': 3.752912022610006e-07, 'epoch': 0.88} 88%|████████▊ | 19436/22095 [33:11:19<3:38:46, 4.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19437/22095 [33:11:29<4:38:11, 6.28s/it] {'loss': 0.4398, 'grad_norm': 0.24162605717765784, 'learning_rate': 3.750126621784511e-07, 'epoch': 0.88} 88%|████████▊ | 19437/22095 [33:11:29<4:38:11, 6.28s/it] 88%|████████▊ | 19438/22095 [33:11:33<4:05:52, 5.55s/it] {'loss': 0.285, 'grad_norm': 0.6122811807147558, 'learning_rate': 3.7473422147267623e-07, 'epoch': 0.88} 88%|████████▊ | 19438/22095 [33:11:33<4:05:52, 5.55s/it] 88%|████████▊ | 19439/22095 [33:11:36<3:35:37, 4.87s/it] {'loss': 0.2805, 'grad_norm': 0.5819518988029752, 'learning_rate': 3.744558801496567e-07, 'epoch': 0.88} 88%|████████▊ | 19439/22095 [33:11:36<3:35:37, 4.87s/it] 88%|████████▊ | 19440/22095 [33:11:40<3:25:47, 4.65s/it] {'loss': 0.2991, 'grad_norm': 0.8562388874106288, 'learning_rate': 3.74177638215375e-07, 'epoch': 0.88} 88%|████████▊ | 19440/22095 [33:11:40<3:25:47, 4.65s/it] 88%|████████▊ | 19441/22095 [33:11:44<3:17:24, 4.46s/it] {'loss': 0.2922, 'grad_norm': 0.6023165883846132, 'learning_rate': 3.73899495675808e-07, 'epoch': 0.88} 88%|████████▊ | 19441/22095 [33:11:44<3:17:24, 4.46s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8957883 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8718, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nA. 2\nB. 3\nC. 10\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 88%|████████▊ | 19442/22095 [33:11:47<3:00:13, 4.08s/it] {'loss': 0.3151, 'grad_norm': 0.6116074594476207, 'learning_rate': 3.736214525369336e-07, 'epoch': 0.88} 88%|████████▊ | 19442/22095 [33:11:47<3:00:13, 4.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19443/22095 [33:11:57<4:10:55, 5.68s/it] {'loss': 0.4811, 'grad_norm': 0.3015748179436036, 'learning_rate': 3.7334350880472434e-07, 'epoch': 0.88} 88%|████████▊ | 19443/22095 [33:11:57<4:10:55, 5.68s/it] 88%|████████▊ | 19444/22095 [33:12:00<3:40:41, 4.99s/it] {'loss': 0.3192, 'grad_norm': 0.5836461033173239, 'learning_rate': 3.730656644851538e-07, 'epoch': 0.88} 88%|████████▊ | 19444/22095 [33:12:00<3:40:41, 4.99s/it] 88%|████████▊ | 19445/22095 [33:12:04<3:21:52, 4.57s/it] {'loss': 0.2662, 'grad_norm': 1.1388168216379084, 'learning_rate': 3.727879195841921e-07, 'epoch': 0.88} 88%|████████▊ | 19445/22095 [33:12:04<3:21:52, 4.57s/it] 88%|████████▊ | 19446/22095 [33:12:07<3:05:08, 4.19s/it] {'loss': 0.3012, 'grad_norm': 0.6501681088951335, 'learning_rate': 
3.7251027410780573e-07, 'epoch': 0.88} 88%|████████▊ | 19446/22095 [33:12:07<3:05:08, 4.19s/it] 88%|████████▊ | 19447/22095 [33:12:10<2:57:04, 4.01s/it] {'loss': 0.3035, 'grad_norm': 0.6716599059645094, 'learning_rate': 3.722327280619614e-07, 'epoch': 0.88} 88%|████████▊ | 19447/22095 [33:12:10<2:57:04, 4.01s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65175 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64613 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19448/22095 [33:12:14<2:54:28, 3.95s/it] {'loss': 0.3487, 'grad_norm': 0.6544174561508919, 'learning_rate': 3.7195528145262337e-07, 'epoch': 0.88} 88%|████████▊ | 19448/22095 [33:12:14<2:54:28, 3.95s/it] 88%|████████▊ | 19449/22095 [33:12:17<2:41:30, 3.66s/it] {'loss': 0.2736, 'grad_norm': 0.6577095681684545, 'learning_rate': 3.7167793428575236e-07, 'epoch': 0.88} 88%|████████▊ | 19449/22095 [33:12:17<2:41:30, 3.66s/it] 88%|████████▊ | 19450/22095 [33:12:20<2:31:53, 3.45s/it] {'loss': 0.2975, 'grad_norm': 0.6075001612961495, 'learning_rate': 3.71400686567307e-07, 'epoch': 0.88} 88%|████████▊ | 19450/22095 [33:12:20<2:31:53, 3.45s/it] 88%|████████▊ | 19451/22095 [33:12:24<2:36:16, 3.55s/it] {'loss': 0.2914, 'grad_norm': 0.6407300018974064, 'learning_rate': 3.7112353830324576e-07, 'epoch': 0.88} 88%|████████▊ | 19451/22095 [33:12:24<2:36:16, 3.55s/it] 88%|████████▊ | 19452/22095 [33:12:27<2:29:13, 3.39s/it] {'loss': 0.2878, 'grad_norm': 0.6312364077947341, 'learning_rate': 3.7084648949952284e-07, 'epoch': 0.88} 88%|████████▊ | 19452/22095 [33:12:27<2:29:13, 3.39s/it] 88%|████████▊ | 19453/22095 [33:12:31<2:33:54, 3.50s/it] {'loss': 0.3099, 'grad_norm': 0.7431039517859039, 'learning_rate': 3.705695401620918e-07, 'epoch': 0.88} 88%|████████▊ | 19453/22095 [33:12:31<2:33:54, 
3.50s/it] 88%|████████▊ | 19454/22095 [33:12:34<2:25:35, 3.31s/it] {'loss': 0.2982, 'grad_norm': 0.5882110778724492, 'learning_rate': 3.7029269029690287e-07, 'epoch': 0.88} 88%|████████▊ | 19454/22095 [33:12:34<2:25:35, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50746 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19455/22095 [33:12:37<2:31:58, 3.45s/it] {'loss': 0.2796, 'grad_norm': 0.8440818057248546, 'learning_rate': 3.700159399099057e-07, 'epoch': 0.88} 88%|████████▊ | 19455/22095 [33:12:37<2:31:58, 3.45s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19456/22095 [33:12:47<3:51:04, 5.25s/it] {'loss': 0.4566, 'grad_norm': 0.2834390846480046, 'learning_rate': 3.6973928900704503e-07, 'epoch': 0.88} 88%|████████▊ | 19456/22095 [33:12:47<3:51:04, 5.25s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19457/22095 [33:12:50<3:25:06, 4.66s/it] {'loss': 0.3159, 'grad_norm': 0.5899447925861252, 'learning_rate': 3.6946273759426667e-07, 'epoch': 0.88} 88%|████████▊ | 19457/22095 [33:12:50<3:25:06, 4.66s/it] 88%|████████▊ | 19458/22095 [33:12:53<3:04:19, 4.19s/it] {'loss': 0.2919, 'grad_norm': 0.5717095051810808, 'learning_rate': 3.69186285677513e-07, 'epoch': 0.88} 88%|████████▊ | 19458/22095 [33:12:53<3:04:19, 4.19s/it] 88%|████████▊ | 19459/22095 [33:12:57<2:59:11, 4.08s/it] {'loss': 0.3135, 'grad_norm': 0.5766207199446806, 'learning_rate': 3.6890993326272273e-07, 'epoch': 0.88} 88%|████████▊ | 19459/22095 [33:12:57<2:59:11, 4.08s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8306314 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1qX.nNpXXXXXEapXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nRead the text on the image.'}, {'from': 'gpt', 'value': 'All words in the image:\n500张\n果茶饮品坊\n爱尚饮\n积分卡\n10\n凡进店消费满\n元送积分卡1张\n青桔蜜茶\n积满10次即送\n一杯\n解释权归本店所有\n地址:宁波市海曙区新芝路53号顺旺基旁()\n联系电话:13777117067\n18968299395\n爱尚饮\n果茶饮品坊\n1\n2\n3\n4\n5\n8\n包邮\n6\n7\n9\n10\n买就送积分贴'}]} 88%|████████▊ | 19460/22095 [33:13:00<2:45:17, 3.76s/it] {'loss': 0.2814, 'grad_norm': 0.6403324284212105, 'learning_rate': 3.6863368035583494e-07, 'epoch': 0.88} 88%|████████▊ | 19460/22095 [33:13:00<2:45:17, 3.76s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8307531 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB24dJ4jCYH8KJjSspdXXcRgVXa_!!2426498448.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nRead the text in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n3G\n绝缘电工胶布\n爆款热销\n颜色请备注10米长\n拍一件就是10卷\n皇冠\ncrown\n一筒10卷装\n颜色请备注\n免费开发票\n全国包邮\n10卷一筒装\n价格实惠\n品质保证'}]} 88%|████████▊ | 19461/22095 [33:13:03<2:38:52, 3.62s/it] {'loss': 0.3103, 'grad_norm': 0.5780060358993144, 'learning_rate': 3.683575269627865e-07, 'epoch': 0.88} 88%|████████▊ | 19461/22095 [33:13:03<2:38:52, 3.62s/it] 88%|████████▊ | 19462/22095 [33:13:08<2:45:57, 3.78s/it] {'loss': 0.2913, 'grad_norm': 0.5773243391426244, 'learning_rate': 3.680814730895077e-07, 'epoch': 0.88} 88%|████████▊ | 19462/22095 [33:13:08<2:45:57, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19463/22095 [33:13:15<3:40:51, 5.03s/it] {'loss': 0.4877, 'grad_norm': 0.2893917732617025, 'learning_rate': 3.6780551874193273e-07, 'epoch': 0.88} 88%|████████▊ | 19463/22095 [33:13:15<3:40:51, 5.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (86420 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83620 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45853 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19464/22095 [33:13:19<3:27:23, 4.73s/it] {'loss': 0.3022, 'grad_norm': 0.6755635937322024, 'learning_rate': 3.675296639259912e-07, 'epoch': 0.88} 88%|████████▊ | 19464/22095 [33:13:19<3:27:23, 4.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46911 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42108 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (130693 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19465/22095 [33:13:23<3:16:06, 4.47s/it] {'loss': 0.3154, 'grad_norm': 1.2261435587924292, 'learning_rate': 3.672539086476101e-07, 'epoch': 0.88} 88%|████████▊ | 19465/22095 [33:13:23<3:16:06, 4.47s/it] 88%|████████▊ | 19466/22095 [33:13:27<3:10:41, 4.35s/it] {'loss': 0.2787, 'grad_norm': 0.593591631903637, 'learning_rate': 3.669782529127125e-07, 'epoch': 0.88} 88%|████████▊ | 19466/22095 [33:13:27<3:10:41, 4.35s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19467/22095 [33:13:34<3:39:31, 5.01s/it] {'loss': 0.4651, 'grad_norm': 0.2859027748282125, 'learning_rate': 3.667026967272236e-07, 'epoch': 0.88} 88%|████████▊ | 19467/22095 [33:13:34<3:39:31, 5.01s/it] 88%|████████▊ | 19468/22095 [33:13:37<3:17:10, 4.50s/it] {'loss': 0.2995, 'grad_norm': 0.5509629077468778, 'learning_rate': 3.6642724009706423e-07, 'epoch': 0.88} 88%|████████▊ | 19468/22095 [33:13:37<3:17:10, 4.50s/it] 88%|████████▊ | 19469/22095 [33:13:41<3:09:18, 4.33s/it] {'loss': 0.3176, 'grad_norm': 0.552569680672477, 'learning_rate': 3.661518830281524e-07, 'epoch': 0.88} 88%|████████▊ | 19469/22095 [33:13:41<3:09:18, 4.33s/it] 88%|████████▊ | 19470/22095 [33:13:44<2:52:52, 3.95s/it] {'loss': 0.3136, 'grad_norm': 0.6266113022409174, 'learning_rate': 3.658766255264046e-07, 'epoch': 0.88} 88%|████████▊ | 19470/22095 [33:13:44<2:52:52, 3.95s/it] 88%|████████▊ | 19471/22095 [33:13:48<2:52:59, 3.96s/it] {'loss': 0.3461, 'grad_norm': 0.6089083806216865, 'learning_rate': 3.65601467597736e-07, 'epoch': 0.88} 
88%|████████▊ | 19471/22095 [33:13:48<2:52:59, 3.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (104731 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94108 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19472/22095 [33:13:56<3:37:13, 4.97s/it] {'loss': 0.4639, 'grad_norm': 0.2774525191944238, 'learning_rate': 3.653264092480574e-07, 'epoch': 0.88} 88%|████████▊ | 19472/22095 [33:13:56<3:37:13, 4.97s/it] 88%|████████▊ | 19473/22095 [33:14:05<4:34:43, 6.29s/it] {'loss': 0.469, 'grad_norm': 0.2802605123799578, 'learning_rate': 3.650514504832808e-07, 'epoch': 0.88} 88%|████████▊ | 19473/22095 [33:14:05<4:34:43, 6.29s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (91697 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80386 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84197 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51857 > 40960). 
Running this sequence through the model will result in indexing errors 88%|████████▊ | 19474/22095 [33:14:08<3:55:46, 5.40s/it] {'loss': 0.2696, 'grad_norm': 0.6079325883555515, 'learning_rate': 3.647765913093132e-07, 'epoch': 0.88} 88%|████████▊ | 19474/22095 [33:14:08<3:55:46, 5.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46747 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19475/22095 [33:14:12<3:33:56, 4.90s/it] {'loss': 0.2796, 'grad_norm': 0.7064070655192801, 'learning_rate': 3.6450183173205975e-07, 'epoch': 0.88} 88%|████████▊ | 19475/22095 [33:14:12<3:33:56, 4.90s/it] 88%|████████▊ | 19476/22095 [33:14:16<3:15:58, 4.49s/it] {'loss': 0.3106, 'grad_norm': 0.6073929551673423, 'learning_rate': 3.6422717175742584e-07, 'epoch': 0.88} 88%|████████▊ | 19476/22095 [33:14:16<3:15:58, 4.49s/it] 88%|████████▊ | 19477/22095 [33:14:19<2:59:59, 4.13s/it] {'loss': 0.3233, 'grad_norm': 0.6969037107691501, 'learning_rate': 3.639526113913122e-07, 'epoch': 0.88} 88%|████████▊ | 19477/22095 [33:14:19<2:59:59, 4.13s/it] 88%|████████▊ | 19478/22095 [33:14:22<2:49:05, 3.88s/it] {'loss': 0.2654, 'grad_norm': 0.6067566723679195, 'learning_rate': 3.636781506396192e-07, 'epoch': 0.88} 88%|████████▊ | 19478/22095 [33:14:22<2:49:05, 3.88s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19479/22095 [33:14:31<3:51:25, 5.31s/it] {'loss': 0.4727, 'grad_norm': 0.27022736806054953, 'learning_rate': 3.634037895082421e-07, 'epoch': 0.88} 88%|████████▊ | 19479/22095 [33:14:31<3:51:25, 5.31s/it] 88%|████████▊ | 19480/22095 [33:14:39<4:23:29, 6.05s/it] {'loss': 0.4623, 'grad_norm': 0.2748053658632415, 'learning_rate': 3.631295280030783e-07, 'epoch': 0.88} 88%|████████▊ | 19480/22095 [33:14:39<4:23:29, 6.05s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 88%|████████▊ | 19481/22095 [33:14:42<3:50:08, 5.28s/it] {'loss': 
0.2831, 'grad_norm': 0.6392027857838919, 'learning_rate': 3.628553661300194e-07, 'epoch': 0.88} 88%|████████▊ | 19481/22095 [33:14:42<3:50:08, 5.28s/it] 88%|████████▊ | 19482/22095 [33:14:52<4:57:31, 6.83s/it] {'loss': 0.4477, 'grad_norm': 0.2676240449487449, 'learning_rate': 3.6258130389495714e-07, 'epoch': 0.88} 88%|████████▊ | 19482/22095 [33:14:52<4:57:31, 6.83s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 88%|████████▊ | 19483/22095 [33:14:56<4:15:12, 5.86s/it] {'loss': 0.2927, 'grad_norm': 0.603529385483359, 'learning_rate': 3.623073413037792e-07, 'epoch': 0.88} 88%|████████▊ | 19483/22095 [33:14:56<4:15:12, 5.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (55547 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63351 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (135078 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19484/22095 [33:14:59<3:40:01, 5.06s/it] {'loss': 0.3097, 'grad_norm': 0.5873583234591071, 'learning_rate': 3.620334783623736e-07, 'epoch': 0.88} 88%|████████▊ | 19484/22095 [33:14:59<3:40:01, 5.06s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48315 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41280 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69580 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (70255 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (124869 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19485/22095 [33:15:03<3:23:00, 4.67s/it] {'loss': 0.2814, 'grad_norm': 0.6453616824837317, 'learning_rate': 3.6175971507662334e-07, 'epoch': 0.88} 88%|████████▊ | 19485/22095 [33:15:03<3:23:00, 4.67s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8347056 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 13721, 'image': 'vrdu_table_final_2/astro-ph.CO/4af78706-c702-4053-8a4c-3aeabb9eec80.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]} 88%|████████▊ | 19486/22095 [33:15:06<3:04:30, 4.24s/it] {'loss': 0.2953, 'grad_norm': 0.7953146155133014, 'learning_rate': 3.6148605145241264e-07, 'epoch': 0.88} 88%|████████▊ | 19486/22095 [33:15:06<3:04:30, 4.24s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19487/22095 [33:15:09<2:45:40, 3.81s/it] {'loss': 0.3198, 'grad_norm': 0.6501379276469679, 'learning_rate': 3.612124874956202e-07, 'epoch': 0.88} 88%|████████▊ | 19487/22095 [33:15:09<2:45:40, 3.81s/it] 88%|████████▊ | 19488/22095 [33:15:13<2:44:39, 3.79s/it] {'loss': 0.3009, 'grad_norm': 0.6536634477685312, 'learning_rate': 3.6093902321212405e-07, 'epoch': 0.88} 88%|████████▊ | 19488/22095 [33:15:13<2:44:39, 3.79s/it] 88%|████████▊ | 19489/22095 [33:15:17<2:44:02, 3.78s/it] {'loss': 0.2996, 'grad_norm': 0.6306391823413456, 'learning_rate': 3.606656586078e-07, 'epoch': 0.88} 88%|████████▊ | 19489/22095 [33:15:17<2:44:02, 3.78s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49005 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81155 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60053 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41058 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97312 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19490/22095 [33:15:20<2:44:40, 3.79s/it] {'loss': 0.2883, 'grad_norm': 0.6143080705714032, 'learning_rate': 3.603923936885234e-07, 'epoch': 0.88} 88%|████████▊ | 19490/22095 [33:15:20<2:44:40, 3.79s/it] 88%|████████▊ | 19491/22095 [33:15:25<2:52:06, 3.97s/it] {'loss': 0.2809, 'grad_norm': 0.5784749363086075, 'learning_rate': 3.6011922846016513e-07, 'epoch': 0.88} 88%|████████▊ | 19491/22095 [33:15:25<2:52:06, 3.97s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19492/22095 [33:15:35<4:12:46, 5.83s/it] {'loss': 0.4634, 'grad_norm': 0.24860289722800014, 'learning_rate': 3.598461629285932e-07, 'epoch': 0.88} 88%|████████▊ | 19492/22095 [33:15:35<4:12:46, 5.83s/it] 88%|████████▊ | 19493/22095 [33:15:45<5:03:28, 7.00s/it] {'loss': 0.4477, 'grad_norm': 0.25458788983963304, 'learning_rate': 3.5957319709967686e-07, 'epoch': 0.88} 88%|████████▊ | 19493/22095 [33:15:45<5:03:28, 7.00s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 88%|████████▊ | 19494/22095 [33:15:48<4:12:59, 5.84s/it] {'loss': 0.2921, 'grad_norm': 0.5409817762850514, 'learning_rate': 3.5930033097928086e-07, 'epoch': 0.88} 88%|████████▊ | 19494/22095 [33:15:48<4:12:59, 5.84s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in 
msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8948334 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 71487, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 6.5cm\nB. 5cm\nC. 5.5cm\nD. 6cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 88%|████████▊ | 19495/22095 [33:15:52<3:46:47, 5.23s/it] {'loss': 0.3658, 'grad_norm': 0.82768520924384, 'learning_rate': 3.590275645732666e-07, 'epoch': 0.88} 88%|████████▊ | 19495/22095 [33:15:52<3:46:47, 5.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (98622 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97197 > 40960). 
Running this sequence through the model will result in indexing errors 88%|████████▊ | 19496/22095 [33:15:55<3:19:00, 4.59s/it] {'loss': 0.271, 'grad_norm': 0.6294611657462783, 'learning_rate': 3.5875489788749665e-07, 'epoch': 0.88} 88%|████████▊ | 19496/22095 [33:15:55<3:19:00, 4.59s/it] 88%|████████▊ | 19497/22095 [33:15:58<3:03:09, 4.23s/it] {'loss': 0.362, 'grad_norm': 0.6913804587027634, 'learning_rate': 3.5848233092783015e-07, 'epoch': 0.88} 88%|████████▊ | 19497/22095 [33:15:58<3:03:09, 4.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19498/22095 [33:16:08<4:14:05, 5.87s/it] {'loss': 0.473, 'grad_norm': 0.25888332790044416, 'learning_rate': 3.5820986370012303e-07, 'epoch': 0.88} 88%|████████▊ | 19498/22095 [33:16:08<4:14:05, 5.87s/it] 88%|████████▊ | 19499/22095 [33:16:11<3:43:01, 5.15s/it] {'loss': 0.2763, 'grad_norm': 0.6432899343617813, 'learning_rate': 3.579374962102289e-07, 'epoch': 0.88} 88%|████████▊ | 19499/22095 [33:16:11<3:43:01, 5.15s/it] 88%|████████▊ | 19500/22095 [33:16:16<3:32:28, 4.91s/it] {'loss': 0.2607, 'grad_norm': 0.6067096968977975, 'learning_rate': 3.57665228464002e-07, 'epoch': 0.88} 88%|████████▊ | 19500/22095 [33:16:16<3:32:28, 4.91s/it] 88%|████████▊ | 19501/22095 [33:16:18<3:05:56, 4.30s/it] {'loss': 0.2651, 'grad_norm': 0.6861618750869042, 'learning_rate': 3.573930604672904e-07, 'epoch': 0.88} 88%|████████▊ | 19501/22095 [33:16:18<3:05:56, 4.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19502/22095 [33:16:25<3:40:53, 5.11s/it] {'loss': 0.4661, 'grad_norm': 0.2943924522520371, 'learning_rate': 3.571209922259439e-07, 'epoch': 0.88} 88%|████████▊ | 19502/22095 [33:16:25<3:40:53, 5.11s/it] 88%|████████▊ | 19503/22095 [33:16:29<3:14:48, 4.51s/it] {'loss': 0.303, 'grad_norm': 0.6177842227396511, 'learning_rate': 3.568490237458083e-07, 'epoch': 0.88} 88%|████████▊ | 19503/22095 [33:16:29<3:14:48, 4.51s/it]Invalidate trace cache @ step 2: 
expected module 1, but got module 364 88%|████████▊ | 19504/22095 [33:16:37<4:10:54, 5.81s/it] {'loss': 0.4871, 'grad_norm': 0.2745503493714577, 'learning_rate': 3.5657715503272574e-07, 'epoch': 0.88} 88%|████████▊ | 19504/22095 [33:16:37<4:10:54, 5.81s/it] 88%|████████▊ | 19505/22095 [33:16:41<3:41:31, 5.13s/it] {'loss': 0.302, 'grad_norm': 0.5571673918655595, 'learning_rate': 3.563053860925392e-07, 'epoch': 0.88} 88%|████████▊ | 19505/22095 [33:16:41<3:41:31, 5.13s/it] 88%|████████▊ | 19506/22095 [33:16:44<3:12:59, 4.47s/it] {'loss': 0.3064, 'grad_norm': 0.6663631237912184, 'learning_rate': 3.5603371693108845e-07, 'epoch': 0.88} 88%|████████▊ | 19506/22095 [33:16:44<3:12:59, 4.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45036 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101370 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62017 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54110 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19507/22095 [33:16:47<2:50:36, 3.96s/it] {'loss': 0.2845, 'grad_norm': 0.6260927130262418, 'learning_rate': 3.5576214755421e-07, 'epoch': 0.88} 88%|████████▊ | 19507/22095 [33:16:47<2:50:36, 3.96s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93725 > 40960). 
Running this sequence through the model will result in indexing errors 88%|████████▊ | 19508/22095 [33:16:51<2:56:52, 4.10s/it] {'loss': 0.31, 'grad_norm': 0.5917560520583667, 'learning_rate': 3.5549067796773915e-07, 'epoch': 0.88} 88%|████████▊ | 19508/22095 [33:16:51<2:56:52, 4.10s/it] 88%|████████▊ | 19509/22095 [33:16:54<2:44:31, 3.82s/it] {'loss': 0.2842, 'grad_norm': 1.1379628888985436, 'learning_rate': 3.5521930817750963e-07, 'epoch': 0.88} 88%|████████▊ | 19509/22095 [33:16:54<2:44:31, 3.82s/it] 88%|████████▊ | 19510/22095 [33:16:57<2:34:58, 3.60s/it] {'loss': 0.2603, 'grad_norm': 0.6216868931141476, 'learning_rate': 3.549480381893505e-07, 'epoch': 0.88} 88%|████████▊ | 19510/22095 [33:16:57<2:34:58, 3.60s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19511/22095 [33:17:00<2:28:25, 3.45s/it] {'loss': 0.2915, 'grad_norm': 0.6292864149594591, 'learning_rate': 3.546768680090934e-07, 'epoch': 0.88} 88%|████████▊ | 19511/22095 [33:17:00<2:28:25, 3.45s/it] 88%|████████▊ | 19512/22095 [33:17:04<2:29:42, 3.48s/it] {'loss': 0.2967, 'grad_norm': 0.5450364036985776, 'learning_rate': 3.544057976425619e-07, 'epoch': 0.88} 88%|████████▊ | 19512/22095 [33:17:04<2:29:42, 3.48s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19513/22095 [33:17:08<2:33:35, 3.57s/it] {'loss': 0.297, 'grad_norm': 0.6244991880782165, 'learning_rate': 3.5413482709558353e-07, 'epoch': 0.88} 88%|████████▊ | 19513/22095 [33:17:08<2:33:35, 3.57s/it] 88%|████████▊ | 19514/22095 [33:17:11<2:31:12, 3.52s/it] {'loss': 0.3106, 'grad_norm': 0.6562732652908535, 'learning_rate': 3.538639563739776e-07, 'epoch': 0.88} 88%|████████▊ | 19514/22095 [33:17:11<2:31:12, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 88%|████████▊ | 19515/22095 [33:17:17<3:07:21, 4.36s/it] {'loss': 0.4413, 'grad_norm': 
0.24905978799813444, 'learning_rate': 3.535931854835667e-07, 'epoch': 0.88} 88%|████████▊ | 19515/22095 [33:17:18<3:07:21, 4.36s/it] 88%|████████▊ | 19516/22095 [33:17:22<3:03:24, 4.27s/it] {'loss': 0.314, 'grad_norm': 0.6667836600959485, 'learning_rate': 3.533225144301683e-07, 'epoch': 0.88} 88%|████████▊ | 19516/22095 [33:17:22<3:03:24, 4.27s/it] 88%|████████▊ | 19517/22095 [33:17:25<2:57:05, 4.12s/it] {'loss': 0.269, 'grad_norm': 0.6324897324523692, 'learning_rate': 3.530519432195967e-07, 'epoch': 0.88} 88%|████████▊ | 19517/22095 [33:17:25<2:57:05, 4.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19518/22095 [33:17:29<2:49:12, 3.94s/it] {'loss': 0.315, 'grad_norm': 0.6004718094460242, 'learning_rate': 3.5278147185766665e-07, 'epoch': 0.88} 88%|████████▊ | 19518/22095 [33:17:29<2:49:12, 3.94s/it] 88%|████████▊ | 19519/22095 [33:17:32<2:44:10, 3.82s/it] {'loss': 0.3061, 'grad_norm': 0.580260577387419, 'learning_rate': 3.525111003501908e-07, 'epoch': 0.88} 88%|████████▊ | 19519/22095 [33:17:32<2:44:10, 3.82s/it] 88%|████████▊ | 19520/22095 [33:17:36<2:35:42, 3.63s/it] {'loss': 0.3236, 'grad_norm': 0.5955680232408859, 'learning_rate': 3.522408287029783e-07, 'epoch': 0.88} 88%|████████▊ | 19520/22095 [33:17:36<2:35:42, 3.63s/it] 88%|████████▊ | 19521/22095 [33:17:39<2:30:29, 3.51s/it] {'loss': 0.2763, 'grad_norm': 0.5709331388756373, 'learning_rate': 3.519706569218345e-07, 'epoch': 0.88} 88%|████████▊ | 19521/22095 [33:17:39<2:30:29, 3.51s/it] 88%|████████▊ | 19522/22095 [33:17:43<2:35:23, 3.62s/it] {'loss': 0.2955, 'grad_norm': 0.7962422901003292, 'learning_rate': 3.517005850125671e-07, 'epoch': 0.88} 88%|████████▊ | 19522/22095 [33:17:43<2:35:23, 3.62s/it] 88%|████████▊ | 19523/22095 [33:17:46<2:29:29, 3.49s/it] {'loss': 0.2651, 'grad_norm': 0.5829977543303796, 'learning_rate': 3.5143061298097693e-07, 'epoch': 0.88} 88%|████████▊ | 19523/22095 [33:17:46<2:29:29, 3.49s/it] 
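The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings in this log mean some samples tokenize past the model's 40960-token context window. A minimal sketch of a dataset-side guard (not the repo's actual code — `filter_overlong` and the `input_ids` field are assumptions) that drops or clips such samples before they reach the model:

```python
# Hypothetical pre-filter for the overlong-sequence warnings seen above.
# Assumes each sample is a dict carrying its tokenized 'input_ids'.

MAX_LEN = 40960  # model_max_length reported in the warnings


def filter_overlong(batch, max_len=MAX_LEN, truncate=False):
    """Return samples that fit in the context window.

    truncate=True keeps overlong samples but clips them to max_len;
    truncate=False drops them entirely (a skip-and-log policy).
    """
    kept = []
    for sample in batch:
        ids = sample["input_ids"]
        if len(ids) <= max_len:
            kept.append(sample)
        elif truncate:
            kept.append({**sample, "input_ids": ids[:max_len]})
        else:
            print(f"skipping sample: {len(ids)} > {max_len} tokens")
    return kept
```

Running such a pass once over the index would turn these per-step warnings (and the indexing errors they predict) into an up-front count of discarded samples.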
88%|████████▊ | 19524/22095 [33:17:49<2:31:09, 3.53s/it] {'loss': 0.2744, 'grad_norm': 0.6197722715291717, 'learning_rate': 3.5116074083286655e-07, 'epoch': 0.88} 88%|████████▊ | 19524/22095 [33:17:49<2:31:09, 3.53s/it] 88%|████████▊ | 19525/22095 [33:17:52<2:23:10, 3.34s/it] {'loss': 0.2971, 'grad_norm': 0.7073843878252019, 'learning_rate': 3.508909685740336e-07, 'epoch': 0.88} 88%|████████▊ | 19525/22095 [33:17:52<2:23:10, 3.34s/it] 88%|████████▊ | 19526/22095 [33:17:57<2:38:17, 3.70s/it] {'loss': 0.3349, 'grad_norm': 0.6411539261445458, 'learning_rate': 3.5062129621027565e-07, 'epoch': 0.88} 88%|████████▊ | 19526/22095 [33:17:57<2:38:17, 3.70s/it] 88%|████████▊ | 19527/22095 [33:18:00<2:28:48, 3.48s/it] {'loss': 0.3039, 'grad_norm': 0.5867776122798319, 'learning_rate': 3.5035172374738636e-07, 'epoch': 0.88} 88%|████████▊ | 19527/22095 [33:18:00<2:28:48, 3.48s/it] 88%|████████▊ | 19528/22095 [33:18:03<2:23:35, 3.36s/it] {'loss': 0.3061, 'grad_norm': 0.5909421682088367, 'learning_rate': 3.500822511911578e-07, 'epoch': 0.88} 88%|████████▊ | 19528/22095 [33:18:03<2:23:35, 3.36s/it] 88%|████████▊ | 19529/22095 [33:18:07<2:32:22, 3.56s/it] {'loss': 0.3391, 'grad_norm': 1.0457195445167602, 'learning_rate': 3.4981287854738143e-07, 'epoch': 0.88} 88%|████████▊ | 19529/22095 [33:18:07<2:32:22, 3.56s/it] 88%|████████▊ | 19530/22095 [33:18:11<2:38:06, 3.70s/it] {'loss': 0.2978, 'grad_norm': 0.6062906376849054, 'learning_rate': 3.495436058218432e-07, 'epoch': 0.88} 88%|████████▊ | 19530/22095 [33:18:11<2:38:06, 3.70s/it] 88%|████████▊ | 19531/22095 [33:18:14<2:29:35, 3.50s/it] {'loss': 0.298, 'grad_norm': 0.6345821755484723, 'learning_rate': 3.4927443302033127e-07, 'epoch': 0.88} 88%|████████▊ | 19531/22095 [33:18:14<2:29:35, 3.50s/it] 88%|████████▊ | 19532/22095 [33:18:17<2:21:37, 3.32s/it] {'loss': 0.2746, 'grad_norm': 0.5693391318164701, 'learning_rate': 3.4900536014862763e-07, 'epoch': 0.88} 88%|████████▊ | 19532/22095 [33:18:17<2:21:37, 3.32s/it] 88%|████████▊ | 
19533/22095 [33:18:21<2:25:57, 3.42s/it] {'loss': 0.3167, 'grad_norm': 0.607301733592083, 'learning_rate': 3.487363872125138e-07, 'epoch': 0.88} 88%|████████▊ | 19533/22095 [33:18:21<2:25:57, 3.42s/it] 88%|████████▊ | 19534/22095 [33:18:25<2:40:51, 3.77s/it] {'loss': 0.2913, 'grad_norm': 0.6503070900326725, 'learning_rate': 3.4846751421777014e-07, 'epoch': 0.88} 88%|████████▊ | 19534/22095 [33:18:25<2:40:51, 3.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74934 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19535/22095 [33:18:30<2:54:03, 4.08s/it] {'loss': 0.2766, 'grad_norm': 0.6221687774308418, 'learning_rate': 3.4819874117017373e-07, 'epoch': 0.88} 88%|████████▊ | 19535/22095 [33:18:30<2:54:03, 4.08s/it] 88%|████████▊ | 19536/22095 [33:18:34<2:47:44, 3.93s/it] {'loss': 0.3141, 'grad_norm': 0.5918727382827849, 'learning_rate': 3.479300680754999e-07, 'epoch': 0.88} 88%|████████▊ | 19536/22095 [33:18:34<2:47:44, 3.93s/it] 88%|████████▊ | 19537/22095 [33:18:38<2:52:25, 4.04s/it] {'loss': 0.3138, 'grad_norm': 0.7346385733269784, 'learning_rate': 3.4766149493952015e-07, 'epoch': 0.88} 88%|████████▊ | 19537/22095 [33:18:38<2:52:25, 4.04s/it] 88%|████████▊ | 19538/22095 [33:18:41<2:37:47, 3.70s/it] {'loss': 0.2779, 'grad_norm': 0.9691102569844686, 'learning_rate': 3.4739302176800603e-07, 'epoch': 0.88} 88%|████████▊ | 19538/22095 [33:18:41<2:37:47, 3.70s/it] 88%|████████▊ | 19539/22095 [33:18:44<2:32:36, 3.58s/it] {'loss': 0.2885, 'grad_norm': 0.5998689432100688, 'learning_rate': 3.471246485667279e-07, 'epoch': 0.88} 88%|████████▊ | 19539/22095 [33:18:44<2:32:36, 3.58s/it] 88%|████████▊ | 19540/22095 [33:18:47<2:29:06, 3.50s/it] {'loss': 0.2637, 'grad_norm': 0.5951426041184009, 'learning_rate': 3.468563753414506e-07, 'epoch': 0.88} 88%|████████▊ | 19540/22095 [33:18:47<2:29:06, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: 
Fixed image tokens in the conversation 88%|████████▊ | 19541/22095 [33:18:50<2:22:45, 3.35s/it] {'loss': 0.2924, 'grad_norm': 0.6478330409952806, 'learning_rate': 3.4658820209793773e-07, 'epoch': 0.88} 88%|████████▊ | 19541/22095 [33:18:50<2:22:45, 3.35s/it] 88%|████████▊ | 19542/22095 [33:18:53<2:17:03, 3.22s/it] {'loss': 0.2855, 'grad_norm': 0.6745078177873124, 'learning_rate': 3.463201288419532e-07, 'epoch': 0.88} 88%|████████▊ | 19542/22095 [33:18:53<2:17:03, 3.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 88%|████████▊ | 19543/22095 [33:18:59<2:51:23, 4.03s/it] {'loss': 0.4745, 'grad_norm': 0.2845659706695868, 'learning_rate': 3.460521555792562e-07, 'epoch': 0.88} 88%|████████▊ | 19543/22095 [33:18:59<2:51:23, 4.03s/it] 88%|████████▊ | 19544/22095 [33:19:03<2:46:41, 3.92s/it] {'loss': 0.3066, 'grad_norm': 0.588620732096941, 'learning_rate': 3.4578428231560547e-07, 'epoch': 0.88} 88%|████████▊ | 19544/22095 [33:19:03<2:46:41, 3.92s/it] 88%|████████▊ | 19545/22095 [33:19:06<2:34:52, 3.64s/it] {'loss': 0.3262, 'grad_norm': 0.6574074748283296, 'learning_rate': 3.4551650905675584e-07, 'epoch': 0.88} 88%|████████▊ | 19545/22095 [33:19:06<2:34:52, 3.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42634 > 40960). Running this sequence through the model will result in indexing errors 88%|████████▊ | 19546/22095 [33:19:10<2:35:43, 3.67s/it] {'loss': 0.3199, 'grad_norm': 0.6045371469529954, 'learning_rate': 3.4524883580846045e-07, 'epoch': 0.88} 88%|████████▊ | 19546/22095 [33:19:10<2:35:43, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44195 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (118027 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130082 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45825 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19547/22095 [33:19:13<2:28:51, 3.51s/it] {'loss': 0.2819, 'grad_norm': 0.5269759620850714, 'learning_rate': 3.44981262576472e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (42648 > 40960). Running this sequence through the model will result in indexing errors
88%|████████▊ | 19548/22095 [33:19:16<2:22:57, 3.37s/it] {'loss': 0.2828, 'grad_norm': 0.5953117426683678, 'learning_rate': 3.4471378936654033e-07, 'epoch': 0.88}
88%|████████▊ | 19549/22095 [33:19:18<2:13:35, 3.15s/it] {'loss': 0.2823, 'grad_norm': 0.5621445856567112, 'learning_rate': 3.444464161844113e-07, 'epoch': 0.88}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8382447 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49241, 'image': 'vrdu_table_final_2/astro-ph.CO/e20575b6-21e8-4b55-be29-61c9a99c089d.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
88%|████████▊ | 19550/22095 [33:19:21<2:09:50, 3.06s/it] {'loss': 0.2874, 'grad_norm': 0.5682398961006359, 'learning_rate': 3.441791430358299e-07, 'epoch': 0.88}
Token indices sequence length is longer than the specified maximum sequence length for this model (123528 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78677 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (147862 > 40960).
Running this sequence through the model will result in indexing errors
88%|████████▊ | 19551/22095 [33:19:25<2:20:15, 3.31s/it] {'loss': 0.253, 'grad_norm': 0.5835295496557882, 'learning_rate': 3.4391196992653976e-07, 'epoch': 0.88}
88%|████████▊ | 19552/22095 [33:19:28<2:17:43, 3.25s/it] {'loss': 0.2733, 'grad_norm': 0.6077999682841784, 'learning_rate': 3.4364489686228076e-07, 'epoch': 0.88}
88%|████████▊ | 19553/22095 [33:19:32<2:24:31, 3.41s/it] {'loss': 0.2468, 'grad_norm': 0.6696188768672875, 'learning_rate': 3.4337792384879274e-07, 'epoch': 0.88}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [520, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8427815 in VC:s3://internvl-moe-sft-data/. Exception: Image size [520, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 142867, 'image': 'vrdu_texteq/astro-ph.CO/e36704a4-2834-4554-aeea-ddbbb630ad95.png', 'image_wh': [[520, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'We wish to solve this at linear order in $\\epsilon_m$.'}]}
88%|████████▊ | 19554/22095 [33:19:42<3:42:45, 5.26s/it] {'loss': 0.4613, 'grad_norm': 0.2749716275869113, 'learning_rate': 3.431110508918112e-07, 'epoch': 0.88}
89%|████████▊ | 19555/22095 [33:19:51<4:36:00, 6.52s/it] {'loss': 0.4337, 'grad_norm': 0.312283411686006, 'learning_rate': 3.428442779970709e-07, 'epoch': 0.89}
89%|████████▊ | 19556/22095 [33:20:01<5:14:32, 7.43s/it] {'loss': 0.4855, 'grad_norm': 0.2585317322224034, 'learning_rate': 3.425776051703028e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 364, but got module 1
89%|████████▊ | 19557/22095 [33:20:04<4:28:09, 6.34s/it] {'loss': 0.2766, 'grad_norm': 0.5817616751386044, 'learning_rate': 3.4231103241723904e-07, 'epoch': 0.89}
89%|████████▊ | 19558/22095 [33:20:09<4:04:49, 5.79s/it] {'loss': 0.3129, 'grad_norm': 0.657507339106589, 'learning_rate': 3.420445597436056e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▊ | 19559/22095 [33:20:20<5:05:25, 7.23s/it] {'loss': 0.4524, 'grad_norm': 0.25678973673554567, 'learning_rate': 3.4177818715512844e-07, 'epoch': 0.89}
89%|████████▊ | 19560/22095 [33:20:24<4:24:31, 6.26s/it] {'loss': 0.2883, 'grad_norm': 0.6084533850542106, 'learning_rate': 3.415119146575313e-07, 'epoch': 0.89}
Token indices sequence length is longer
than the specified maximum sequence length for this model (44644 > 40960). Running this sequence through the model will result in indexing errors 89%|████████▊ | 19561/22095 [33:20:28<4:04:17, 5.78s/it] {'loss': 0.3022, 'grad_norm': 0.6689930661361905, 'learning_rate': 3.412457422565368e-07, 'epoch': 0.89} 89%|████████▊ | 19561/22095 [33:20:28<4:04:17, 5.78s/it] 89%|████████▊ | 19562/22095 [33:20:32<3:36:38, 5.13s/it] {'loss': 0.3097, 'grad_norm': 0.6389051792047346, 'learning_rate': 3.409796699578621e-07, 'epoch': 0.89} 89%|████████▊ | 19562/22095 [33:20:32<3:36:38, 5.13s/it] 89%|████████▊ | 19563/22095 [33:20:36<3:19:38, 4.73s/it] {'loss': 0.2956, 'grad_norm': 0.6029784116061799, 'learning_rate': 3.4071369776722487e-07, 'epoch': 0.89} 89%|████████▊ | 19563/22095 [33:20:36<3:19:38, 4.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▊ | 19564/22095 [33:20:44<4:09:10, 5.91s/it] {'loss': 0.4892, 'grad_norm': 0.2873501741847482, 'learning_rate': 3.4044782569034096e-07, 'epoch': 0.89} 89%|████████▊ | 19564/22095 [33:20:44<4:09:10, 5.91s/it] 89%|████████▊ | 19565/22095 [33:20:48<3:40:04, 5.22s/it] {'loss': 0.2888, 'grad_norm': 0.6136217708277859, 'learning_rate': 3.401820537329231e-07, 'epoch': 0.89} 89%|████████▊ | 19565/22095 [33:20:48<3:40:04, 5.22s/it] 89%|████████▊ | 19566/22095 [33:20:51<3:12:11, 4.56s/it] {'loss': 0.2727, 'grad_norm': 0.6195423898571132, 'learning_rate': 3.399163819006801e-07, 'epoch': 0.89} 89%|████████▊ | 19566/22095 [33:20:51<3:12:11, 4.56s/it] 89%|████████▊ | 19567/22095 [33:20:55<3:03:07, 4.35s/it] {'loss': 0.2648, 'grad_norm': 0.5806307328938312, 'learning_rate': 3.3965081019932176e-07, 'epoch': 0.89} 89%|████████▊ | 19567/22095 [33:20:55<3:03:07, 4.35s/it] 89%|████████▊ | 19568/22095 [33:20:59<2:55:34, 4.17s/it] {'loss': 0.321, 'grad_norm': 1.002216804194398, 'learning_rate': 3.3938533863455526e-07, 'epoch': 0.89} 89%|████████▊ | 19568/22095 [33:20:59<2:55:34, 4.17s/it]Invalidate trace cache @ step 
2: expected module 1, but got module 364 89%|████████▊ | 19569/22095 [33:21:06<3:35:55, 5.13s/it] {'loss': 0.4512, 'grad_norm': 0.2583117712856083, 'learning_rate': 3.3911996721208373e-07, 'epoch': 0.89} 89%|████████▊ | 19569/22095 [33:21:06<3:35:55, 5.13s/it] 89%|████████▊ | 19570/22095 [33:21:09<3:11:40, 4.55s/it] {'loss': 0.2698, 'grad_norm': 0.6515063155660002, 'learning_rate': 3.388546959376088e-07, 'epoch': 0.89} 89%|████████▊ | 19570/22095 [33:21:09<3:11:40, 4.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (113163 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41671 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47665 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97251 > 40960). Running this sequence through the model will result in indexing errors 89%|████████▊ | 19571/22095 [33:21:12<2:49:24, 4.03s/it] {'loss': 0.2663, 'grad_norm': 0.6489297620437484, 'learning_rate': 3.385895248168314e-07, 'epoch': 0.89} 89%|████████▊ | 19571/22095 [33:21:12<2:49:24, 4.03s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49994 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81962 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56942 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (97287 > 40960). Running this sequence through the model will result in indexing errors 89%|████████▊ | 19572/22095 [33:21:15<2:41:35, 3.84s/it] {'loss': 0.312, 'grad_norm': 0.6414007301380998, 'learning_rate': 3.383244538554481e-07, 'epoch': 0.89} 89%|████████▊ | 19572/22095 [33:21:15<2:41:35, 3.84s/it] 89%|████████▊ | 19573/22095 [33:21:20<2:48:41, 4.01s/it] {'loss': 0.311, 'grad_norm': 0.6456000031136664, 'learning_rate': 3.380594830591555e-07, 'epoch': 0.89} 89%|████████▊ | 19573/22095 [33:21:20<2:48:41, 4.01s/it] 89%|████████▊ | 19574/22095 [33:21:23<2:39:11, 3.79s/it] {'loss': 0.3125, 'grad_norm': 0.6274904157405076, 'learning_rate': 3.3779461243364673e-07, 'epoch': 0.89} 89%|████████▊ | 19574/22095 [33:21:23<2:39:11, 3.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▊ | 19575/22095 [33:21:27<2:40:46, 3.83s/it] {'loss': 0.3102, 'grad_norm': 0.6555214158150692, 'learning_rate': 3.3752984198461236e-07, 'epoch': 0.89} 89%|████████▊ | 19575/22095 [33:21:27<2:40:46, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (83378 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54369 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100916 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▊ | 19576/22095 [33:21:30<2:33:54, 3.67s/it] {'loss': 0.251, 'grad_norm': 0.6002976364382615, 'learning_rate': 3.3726517171774163e-07, 'epoch': 0.89} 89%|████████▊ | 19576/22095 [33:21:30<2:33:54, 3.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (104961 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (84678 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71664 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89045 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47973 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▊ | 19577/22095 [33:21:52<6:24:32, 9.16s/it] {'loss': 0.2729, 'grad_norm': 0.6060495541921025, 'learning_rate': 3.3700060163872285e-07, 'epoch': 0.89} 89%|████████▊ | 19577/22095 [33:21:52<6:24:32, 9.16s/it] 89%|████████▊ | 19578/22095 [33:21:56<5:15:34, 7.52s/it] {'loss': 0.2833, 'grad_norm': 0.5862835298549485, 'learning_rate': 3.367361317532397e-07, 'epoch': 0.89} 89%|████████▊ | 19578/22095 [33:21:56<5:15:34, 7.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▊ | 19579/22095 [33:22:06<5:43:28, 8.19s/it] {'loss': 0.4968, 'grad_norm': 0.26710656491030127, 'learning_rate': 3.3647176206697387e-07, 'epoch': 0.89} 89%|████████▊ | 19579/22095 [33:22:06<5:43:28, 8.19s/it] 89%|████████▊ | 19580/22095 [33:22:16<6:04:56, 8.71s/it] {'loss': 0.431, 'grad_norm': 0.2666211536533626, 'learning_rate': 3.362074925856079e-07, 'epoch': 0.89} 89%|████████▊ | 19580/22095 [33:22:16<6:04:56, 8.71s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 89%|████████▊ | 19581/22095 [33:22:39<9:12:11, 13.18s/it] {'loss': 0.2709, 'grad_norm': 0.5695538360488616, 'learning_rate': 3.359433233148185e-07, 'epoch': 0.89} 89%|████████▊ | 19581/22095 [33:22:39<9:12:11, 13.18s/it] 89%|████████▊ | 19582/22095 [33:22:43<7:08:33, 10.23s/it] {'loss': 0.2706, 'grad_norm': 0.59719229785415, 'learning_rate': 3.356792542602838e-07, 'epoch': 0.89} 89%|████████▊ | 19582/22095 [33:22:43<7:08:33, 10.23s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (78121 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77020 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (121958 > 40960). Running this sequence through the model will result in indexing errors 89%|████████▊ | 19583/22095 [33:22:46<5:41:16, 8.15s/it] {'loss': 0.2906, 'grad_norm': 0.5836204564546559, 'learning_rate': 3.354152854276749e-07, 'epoch': 0.89} 89%|████████▊ | 19583/22095 [33:22:46<5:41:16, 8.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▊ | 19584/22095 [33:22:55<5:55:30, 8.49s/it] {'loss': 0.4762, 'grad_norm': 0.26799320129721166, 'learning_rate': 3.351514168226666e-07, 'epoch': 0.89} 89%|████████▊ | 19584/22095 [33:22:55<5:55:30, 8.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62190 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45255 > 40960). 
Running this sequence through the model will result in indexing errors
89%|████████▊ | 19585/22095 [33:23:05<6:09:26, 8.83s/it] {'loss': 0.4441, 'grad_norm': 0.26377778286885706, 'learning_rate': 3.348876484509267e-07, 'epoch': 0.89}
89%|████████▊ | 19586/22095 [33:23:14<6:12:27, 8.91s/it] {'loss': 0.4714, 'grad_norm': 0.2715906803668781, 'learning_rate': 3.346239803181239e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
VC:s3://gui-agent/data_20250813/windows/images/libreoffice_calc/free_task_20250728_173051/images/20250728_173055_2.png 2025-08-29 01:21:12.582396 load time: 1045.9 ms
89%|████████▊ | 19587/22095 [33:23:17<4:59:35, 7.17s/it] {'loss': 0.309, 'grad_norm': 0.6451659642294735, 'learning_rate': 3.343604124299232e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047746 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 0.5\nB. 1\nC. 1.5\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
89%|████████▊ | 19588/22095 [33:23:20<4:10:23, 5.99s/it] {'loss': 0.2856, 'grad_norm': 0.6017547316225268, 'learning_rate': 3.340969447919873e-07, 'epoch': 0.89}
89%|████████▊ | 19589/22095 [33:23:43<7:46:05, 11.16s/it] {'loss': 0.3516, 'grad_norm': 0.6555996985935236, 'learning_rate': 3.338335774099777e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▊ | 19590/22095 [33:23:47<6:10:30, 8.87s/it] {'loss': 0.3465, 'grad_norm': 0.6593669661485506, 'learning_rate': 3.335703102895549e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▊ | 19591/22095 [33:23:56<6:16:46, 9.03s/it] {'loss': 0.4901, 'grad_norm': 0.26153963630435606, 'learning_rate': 3.333071434363727e-07, 'epoch': 0.89}
89%|████████▊ | 19592/22095 [33:24:00<5:10:46, 7.45s/it] {'loss': 0.2507, 'grad_norm': 0.5857658939032128, 'learning_rate': 3.3304407685608777e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (124615 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44306 > 40960).
Running this sequence through the model will result in indexing errors 89%|████████▊ | 19593/22095 [33:24:10<5:36:22, 8.07s/it] {'loss': 0.4731, 'grad_norm': 0.26285765382768767, 'learning_rate': 3.3278111055435214e-07, 'epoch': 0.89} 89%|████████▊ | 19593/22095 [33:24:10<5:36:22, 8.07s/it] 89%|████████▊ | 19594/22095 [33:24:13<4:39:16, 6.70s/it] {'loss': 0.3197, 'grad_norm': 0.590814928105251, 'learning_rate': 3.325182445368169e-07, 'epoch': 0.89} 89%|████████▊ | 19594/22095 [33:24:13<4:39:16, 6.70s/it] 89%|████████▊ | 19595/22095 [33:24:17<4:04:29, 5.87s/it] {'loss': 0.3039, 'grad_norm': 0.6646910180079847, 'learning_rate': 3.322554788091287e-07, 'epoch': 0.89} 89%|████████▊ | 19595/22095 [33:24:17<4:04:29, 5.87s/it] 89%|████████▊ | 19596/22095 [33:24:20<3:27:34, 4.98s/it] {'loss': 0.2911, 'grad_norm': 0.6051269524511917, 'learning_rate': 3.31992813376934e-07, 'epoch': 0.89} 89%|████████▊ | 19596/22095 [33:24:20<3:27:34, 4.98s/it] 89%|████████▊ | 19597/22095 [33:24:41<6:49:51, 9.84s/it] {'loss': 0.2668, 'grad_norm': 0.6154458023837881, 'learning_rate': 3.3173024824587786e-07, 'epoch': 0.89} 89%|████████▊ | 19597/22095 [33:24:41<6:49:51, 9.84s/it] 89%|████████▊ | 19598/22095 [33:25:04<9:34:19, 13.80s/it] {'loss': 0.2939, 'grad_norm': 0.7649855997860965, 'learning_rate': 3.314677834216012e-07, 'epoch': 0.89} 89%|████████▊ | 19598/22095 [33:25:04<9:34:19, 13.80s/it] 89%|████████▊ | 19599/22095 [33:25:26<11:15:41, 16.24s/it] {'loss': 0.3067, 'grad_norm': 0.5910296190533375, 'learning_rate': 3.31205418909743e-07, 'epoch': 0.89} 89%|████████▊ | 19599/22095 [33:25:26<11:15:41, 16.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42639 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48784 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51840 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77854 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▊ | 19600/22095 [33:25:29<8:30:48, 12.28s/it] {'loss': 0.384, 'grad_norm': 0.6687276660419679, 'learning_rate': 3.30943154715942e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▊ | 19601/22095 [33:25:39<7:55:17, 11.43s/it] {'loss': 0.4374, 'grad_norm': 0.26577943660214226, 'learning_rate': 3.3068099084583195e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▊ | 19602/22095 [33:25:42<6:16:14, 9.06s/it] {'loss': 0.2687, 'grad_norm': 0.5708204411176053, 'learning_rate': 3.304189273050473e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▊ | 19603/22095 [33:25:51<6:16:03, 9.05s/it] {'loss': 0.4675, 'grad_norm': 0.28507712724926143, 'learning_rate': 3.301569640992186e-07, 'epoch': 0.89}
89%|████████▊ | 19604/22095 [33:26:14<9:08:48, 13.22s/it] {'loss': 0.2979, 'grad_norm': 0.6315800803320455, 'learning_rate': 3.298951012339735e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8555754 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24644, 'image': '1570190380.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Teen & Young Adult? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
89%|████████▊ | 19605/22095 [33:26:37<11:07:14, 16.08s/it] {'loss': 0.2999, 'grad_norm': 0.6476269168568197, 'learning_rate': 3.2963333871493917e-07, 'epoch': 0.89}
89%|████████▊ | 19606/22095 [33:26:59<12:19:33, 17.83s/it] {'loss': 0.2903, 'grad_norm': 0.5974437769779533, 'learning_rate': 3.293716765477417e-07, 'epoch': 0.89}
89%|████████▊ | 19607/22095 [33:27:41<17:25:33, 25.21s/it] {'loss': 0.2874, 'grad_norm': 0.6080218454670456, 'learning_rate': 3.2911011473800213e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▊ | 19608/22095 [33:27:49<13:48:10, 19.98s/it] {'loss': 0.4416, 'grad_norm': 0.24843599926496016, 'learning_rate': 3.2884865329133986e-07, 'epoch': 0.89}
89%|████████▊ | 19609/22095 [33:27:52<10:22:47, 15.03s/it] {'loss': 0.264, 'grad_norm': 0.6070739732249638, 'learning_rate': 3.285872922133737e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19610/22095 [33:28:33<15:42:19, 22.75s/it] {'loss': 0.2928, 'grad_norm': 0.61134458557448, 'learning_rate': 3.2832603150971974e-07, 'epoch': 0.89}
89%|████████▉ | 19611/22095 [33:28:36<11:39:40, 16.90s/it] {'loss': 0.2799, 'grad_norm': 0.590230363263004, 'learning_rate': 3.2806487118599237e-07, 'epoch': 0.89}
89%|████████▉ | 19612/22095 [33:28:40<8:56:32, 12.96s/it] {'loss': 0.306, 'grad_norm': 0.5998546487827437, 'learning_rate': 3.2780381124780046e-07, 'epoch': 0.89}
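The repeated `ValueError: Image size ... is too small. Minimum size is 28.` tracebacks above are raised per sample at fetch time, after which the loader retries. Below is a minimal sketch of how such samples could be screened out before training, assuming the sample-dict layout shown in the log (`image_wh` as a list of `[width, height]` pairs); the `valid_sample`/`filter_samples` helpers are hypothetical and not part of `data_qwen_2.py`:

```python
# Hypothetical pre-filter for samples whose recorded image size is degenerate.
# The 28-pixel minimum mirrors the "Minimum size is 28" errors in the log above;
# the helper names and sample-dict layout are assumptions for illustration only.
MIN_IMAGE_SIDE = 28

def valid_sample(sample: dict) -> bool:
    """Return True when every recorded (width, height) meets the minimum side length."""
    sizes = sample.get("image_wh") or []
    if not sizes:
        return False  # no size metadata: treat as invalid rather than crash in __getitem__
    return all(w >= MIN_IMAGE_SIDE and h >= MIN_IMAGE_SIDE for w, h in sizes)

def filter_samples(samples: list[dict]) -> list[dict]:
    """Drop bad samples up front instead of raising ValueError during fetch."""
    return [s for s in samples if valid_sample(s)]
```

Filtering at dataset-build time would avoid the `[Try #0] Failed to fetch sample ...` retry loop during the training run.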
89%|████████▉ | 19612/22095 [33:28:40<8:56:32, 12.96s/it] 89%|████████▉ | 19613/22095 [33:29:02<10:45:04, 15.59s/it] {'loss': 0.2695, 'grad_norm': 0.6350566283688125, 'learning_rate': 3.275428517007562e-07, 'epoch': 0.89} 89%|████████▉ | 19613/22095 [33:29:02<10:45:04, 15.59s/it] 89%|████████▉ | 19614/22095 [33:29:23<11:56:32, 17.33s/it] {'loss': 0.3069, 'grad_norm': 0.607949377490081, 'learning_rate': 3.27281992550465e-07, 'epoch': 0.89} 89%|████████▉ | 19614/22095 [33:29:23<11:56:32, 17.33s/it] 89%|████████▉ | 19615/22095 [33:30:38<23:51:41, 34.64s/it] {'loss': 0.2813, 'grad_norm': 0.6484075872349462, 'learning_rate': 3.270212338025336e-07, 'epoch': 0.89} 89%|████████▉ | 19615/22095 [33:30:38<23:51:41, 34.64s/it] 89%|████████▉ | 19616/22095 [33:30:41<17:17:10, 25.10s/it] {'loss': 0.2596, 'grad_norm': 0.6570954002846704, 'learning_rate': 3.2676057546256354e-07, 'epoch': 0.89} 89%|████████▉ | 19616/22095 [33:30:41<17:17:10, 25.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49398 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59417 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63283 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19617/22095 [33:31:03<16:39:50, 24.21s/it] {'loss': 0.2986, 'grad_norm': 0.6249630982237082, 'learning_rate': 3.2650001753615547e-07, 'epoch': 0.89} 89%|████████▉ | 19617/22095 [33:31:03<16:39:50, 24.21s/it] 89%|████████▉ | 19618/22095 [33:31:08<12:33:13, 18.25s/it] {'loss': 0.3208, 'grad_norm': 0.6217870094199994, 'learning_rate': 3.262395600289087e-07, 'epoch': 0.89} 89%|████████▉ | 19618/22095 [33:31:08<12:33:13, 18.25s/it] 89%|████████▉ | 19619/22095 [33:31:11<9:30:20, 13.82s/it] {'loss': 0.293, 'grad_norm': 0.5814716308866661, 'learning_rate': 3.259792029464204e-07, 'epoch': 0.89} 89%|████████▉ | 19619/22095 [33:31:11<9:30:20, 13.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46502 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61981 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71361 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56334 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46884 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19620/22095 [33:31:32<10:57:18, 15.93s/it] {'loss': 0.3349, 'grad_norm': 0.6060563502689635, 'learning_rate': 3.2571894629428224e-07, 'epoch': 0.89} 89%|████████▉ | 19620/22095 [33:31:32<10:57:18, 15.93s/it] 89%|████████▉ | 19621/22095 [33:31:36<8:32:23, 12.43s/it] {'loss': 0.2974, 'grad_norm': 0.6316516663442189, 'learning_rate': 3.2545879007808866e-07, 'epoch': 0.89} 89%|████████▉ | 19621/22095 [33:31:36<8:32:23, 12.43s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▉ | 19622/22095 [33:31:58<10:29:04, 15.26s/it] {'loss': 0.2803, 'grad_norm': 0.5989664177057166, 'learning_rate': 3.2519873430342905e-07, 'epoch': 0.89} 89%|████████▉ | 19622/22095 [33:31:58<10:29:04, 15.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66199 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48428 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50810 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49974 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19623/22095 [33:32:37<15:18:08, 22.28s/it] {'loss': 0.2616, 'grad_norm': 0.650238534391762, 'learning_rate': 3.2493877897589123e-07, 'epoch': 0.89} 89%|████████▉ | 19623/22095 [33:32:37<15:18:08, 22.28s/it] 89%|████████▉ | 19624/22095 [33:32:59<15:10:31, 22.11s/it] {'loss': 0.3034, 'grad_norm': 0.5862701585975826, 'learning_rate': 3.2467892410106006e-07, 'epoch': 0.89} 89%|████████▉ | 19624/22095 [33:32:59<15:10:31, 22.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▉ | 19625/22095 [33:33:08<12:33:09, 18.30s/it] {'loss': 0.4626, 'grad_norm': 0.2878441759964836, 'learning_rate': 3.2441916968452003e-07, 'epoch': 0.89} 89%|████████▉ | 19625/22095 [33:33:08<12:33:09, 18.30s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48237 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42555 > 40960). 
Running this sequence through the model will result in indexing errors
89%|████████▉ | 19626/22095 [33:33:30<13:21:26, 19.48s/it] {'loss': 0.2857, 'grad_norm': 0.7429412990238157, 'learning_rate': 3.2415951573185224e-07, 'epoch': 0.89}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_104847_before_screenshot.png 2025-08-29 01:31:28.947690 load time: 1041.06 ms
89%|████████▉ | 19627/22095 [33:34:10<17:36:25, 25.68s/it] {'loss': 0.271, 'grad_norm': 0.5785353982968996, 'learning_rate': 3.2389996224863604e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19628/22095 [33:34:20<14:15:07, 20.80s/it] {'loss': 0.4653, 'grad_norm': 0.282444870309688, 'learning_rate': 3.236405092404471e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8307200 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1z4LSj8DH8KJjy1zeXXXjepXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n从图片中辨认文字。'}, {'from': 'gpt', 'value': '图中所有文字:\n正品保障,盗图必究'}]}
89%|████████▉ | 19629/22095 [33:34:44<14:54:53, 21.77s/it] {'loss': 0.305, 'grad_norm': 0.5859165748543158, 'learning_rate': 3.2338115671286254e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8557178 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 22815, 'image': '866364935.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a sociopolitical book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
89%|████████▉ | 19630/22095 [33:35:23<18:32:04, 27.07s/it] {'loss': 0.2632, 'grad_norm': 0.5722429759030558, 'learning_rate': 3.231219046714523e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19631/22095 [33:35:52<18:57:53, 27.71s/it] {'loss': 0.464, 'grad_norm': 0.2766515224836743, 'learning_rate': 3.2286275312178984e-07, 'epoch': 0.89}
89%|████████▉ | 19632/22095 [33:35:56<13:58:22, 20.42s/it] {'loss': 0.2758, 'grad_norm': 0.6600272957732479, 'learning_rate': 3.226037020694417e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19633/22095 [33:36:36<18:01:03, 26.35s/it] {'loss': 0.2981, 'grad_norm': 0.6126178906891568, 'learning_rate': 3.2234475151997345e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19634/22095 [33:37:04<18:18:37, 26.78s/it] {'loss': 0.4847, 'grad_norm': 0.4553668902972284, 'learning_rate': 3.220859014789507e-07, 'epoch': 0.89}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn( 89%|████████▉ | 19635/22095 [33:37:26<17:15:43, 25.26s/it] {'loss': 0.2758, 'grad_norm': 0.6261440561231867, 'learning_rate': 3.21827151951935e-07, 'epoch': 0.89} 89%|████████▉ | 19635/22095 [33:37:26<17:15:43, 25.26s/it] 89%|████████▉ | 19636/22095 [33:38:06<20:17:36, 29.71s/it] {'loss': 0.2576, 'grad_norm': 0.6035397288881246, 'learning_rate': 3.215685029444865e-07, 'epoch': 0.89} 89%|████████▉ | 19636/22095 [33:38:06<20:17:36, 29.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▉ | 19637/22095 [33:38:30<19:14:41, 28.19s/it] {'loss': 0.2752, 'grad_norm': 0.6880253117826668, 'learning_rate': 3.213099544621612e-07, 'epoch': 0.89} 89%|████████▉ | 19637/22095 [33:38:30<19:14:41, 28.19s/it] 89%|████████▉ | 19638/22095 [33:38:53<18:11:53, 26.66s/it] {'loss': 0.2761, 'grad_norm': 0.5554182104281393, 'learning_rate': 3.210515065105152e-07, 'epoch': 0.89} 89%|████████▉ | 19638/22095 [33:38:53<18:11:53, 26.66s/it] 89%|████████▉ | 19639/22095 [33:39:36<21:26:59, 31.44s/it] {'loss': 0.2992, 'grad_norm': 0.7699347195496511, 'learning_rate': 3.20793159095103e-07, 'epoch': 0.89} 89%|████████▉ | 19639/22095 [33:39:36<21:26:59, 31.44s/it] 89%|████████▉ | 19640/22095 [33:39:39<15:40:36, 22.99s/it] {'loss': 0.2525, 'grad_norm': 0.674613342892287, 'learning_rate': 3.2053491222147514e-07, 'epoch': 0.89} 89%|████████▉ | 19640/22095 [33:39:39<15:40:36, 22.99s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79619 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19641/22095 [33:40:01<15:29:10, 22.72s/it] {'loss': 0.3129, 'grad_norm': 0.6265349024511476, 'learning_rate': 3.2027676589517885e-07, 'epoch': 0.89} 89%|████████▉ | 19641/22095 [33:40:01<15:29:10, 22.72s/it] 89%|████████▉ | 19642/22095 [33:41:40<30:55:50, 45.39s/it] {'loss': 0.3456, 'grad_norm': 0.6407998943248913, 'learning_rate': 3.2001872012176304e-07, 'epoch': 0.89} 89%|████████▉ | 19642/22095 [33:41:40<30:55:50, 45.39s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 89%|████████▉ | 19643/22095 [33:42:21<30:10:58, 44.31s/it] {'loss': 0.2703, 'grad_norm': 0.6249769827265742, 'learning_rate': 3.1976077490677106e-07, 'epoch': 0.89} 89%|████████▉ | 19643/22095 [33:42:21<30:10:58, 44.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 VC:s3://gui-agent/data_20250612/android/images/android_lab_data_Clock/clock_3/images/002_click_1749555722336.png 2025-08-29 01:40:20.168116 load time: 1054.67 ms 89%|████████▉ | 19644/22095 [33:42:29<22:43:01, 33.37s/it] {'loss': 0.4517, 'grad_norm': 0.2689051847218968, 'learning_rate': 3.195029302557462e-07, 'epoch': 0.89} 89%|████████▉ | 19644/22095 [33:42:29<22:43:01, 33.37s/it] 89%|████████▉ | 19645/22095 [33:42:33<16:36:20, 24.40s/it] {'loss': 0.2631, 'grad_norm': 0.8638207476845396, 'learning_rate': 3.1924518617422796e-07, 'epoch': 0.89} 89%|████████▉ | 19645/22095 [33:42:33<16:36:20, 24.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▉ | 19646/22095 [33:43:52<27:44:00, 40.77s/it] {'loss': 0.2959, 'grad_norm': 0.5727394461563624, 'learning_rate': 3.1898754266775467e-07, 'epoch': 0.89} 89%|████████▉ | 19646/22095 [33:43:52<27:44:00, 
40.77s/it]
89%|████████▉ | 19647/22095 [33:44:15<24:04:49, 35.41s/it] {'loss': 0.2718, 'grad_norm': 0.6598354797321724, 'learning_rate': 3.1872999974186194e-07, 'epoch': 0.89}
89%|████████▉ | 19648/22095 [33:44:37<21:19:48, 31.38s/it] {'loss': 0.2871, 'grad_norm': 0.5766600113148568, 'learning_rate': 3.1847255740208636e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19649/22095 [33:44:42<16:02:26, 23.61s/it] {'loss': 0.4891, 'grad_norm': 0.2839105455499461, 'learning_rate': 3.182152156539553e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8360264 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 26986, 'image': 'vrdu_table_final_2/astro-ph.CO/98ab8355-2c06-4c12-a579-8fa788375638.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [395, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8424683 in VC:s3://internvl-moe-sft-data/. Exception: Image size [395, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63626, 'image': 'vrdu_texteq/astro-ph.CO/972fb00b-e2e8-4f38-86c4-f4032d256d01.png', 'image_wh': [[395, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'Then $R_K$ and $G_K$ are related as'}]}
89%|████████▉ | 19650/22095 [33:45:03<15:24:14, 22.68s/it] {'loss': 0.2825, 'grad_norm': 0.6785505816778864, 'learning_rate': 3.179579745029998e-07, 'epoch': 0.89}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/rico/dataset/image/9411.jpg 2025-08-29 01:43:01.303013 load time: 1047.17 ms
89%|████████▉ | 19651/22095 [33:45:25<15:17:34, 22.53s/it] {'loss': 0.2801, 'grad_norm': 0.5529712585838098, 'learning_rate': 3.1770083395474827e-07, 'epoch': 0.89}
89%|████████▉ | 19652/22095 [33:46:22<22:23:04, 32.99s/it] {'loss': 0.3026, 'grad_norm': 0.6154279850046951, 'learning_rate': 3.174437940147268e-07, 'epoch': 0.89}
89%|████████▉ | 19653/22095 [33:46:46<20:28:58, 30.20s/it] {'loss': 0.2991, 'grad_norm': 0.6320580096458656, 'learning_rate': 3.171868546884549e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (63787 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19654/22095 [33:47:07<18:37:37, 27.47s/it] {'loss': 0.2572, 'grad_norm': 0.5896944818087667, 'learning_rate': 3.169300159814559e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (41101 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76764 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58132 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19655/22095 [33:48:04<24:34:58, 36.27s/it] {'loss': 0.2978, 'grad_norm': 0.6314943278887438, 'learning_rate': 3.1667327789924815e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (56002 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46880 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19656/22095 [33:49:04<29:23:51, 43.39s/it] {'loss': 0.3456, 'grad_norm': 0.6240822715544811, 'learning_rate': 3.1641664044734786e-07, 'epoch': 0.89}
VC:s3://multi-modal/TQA/train/teaching_images/atomic_mass_number_9006.png 2025-08-29 01:47:02.477307 load time: 1037.18 ms
VC:s3://gui/aguvis/aguvis-stage1/seeclick/seeclick_web_imgs/8f7a054831d88e6f243d1ae8f8fae61c.png 2025-08-29 01:47:02.479041 load time: 1261.52 ms
89%|████████▉ | 19657/22095 [33:49:45<28:54:21, 42.68s/it] {'loss': 0.3238, 'grad_norm': 0.5924244399609876, 'learning_rate': 3.1616010363126893e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [634, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8502083 in VC:s3://internvl-moe-sft-data/. Exception: Image size [634, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 141001, 'image': 'vrdu_texteq/astro-ph.CO/33c23d4a-879e-4863-8d25-3e199d852dc9.png', 'image_wh': [[634, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'The transformation rule for $A$ can then be solved as'}]}
89%|████████▉ | 19658/22095 [33:50:07<24:49:15, 36.67s/it] {'loss': 0.2926, 'grad_norm': 0.6401946065654505, 'learning_rate': 3.159036674565247e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (64269 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81148 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48803 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42638 > 40960).
Running this sequence through the model will result in indexing errors
89%|████████▉ | 19659/22095 [33:50:28<21:39:38, 32.01s/it] {'loss': 0.2609, 'grad_norm': 0.6277174806465244, 'learning_rate': 3.156473319286241e-07, 'epoch': 0.89}
89%|████████▉ | 19660/22095 [33:50:31<15:44:29, 23.27s/it] {'loss': 0.2848, 'grad_norm': 0.6024028873920564, 'learning_rate': 3.15391097053076e-07, 'epoch': 0.89}
VC:s3://gui/aguvis/aguvis-stage2/miniwob/images/click-checkboxes_seed32_step1.jpg.jpg 2025-08-29 01:48:30.163574 load time: 1020.33 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_587889.png 2025-08-29 01:48:30.163438 load time: 1015.42 ms
VC:s3://gui/aguvis/aguvis-stage2/guiact-web-single/images/f4286227-97e9-4737-a18a-bf600d1bbfde.jpg 2025-08-29 01:48:30.161639 load time: 1018.92 ms
VC:s3://gui/aguvis/aguvis-stage2/android_control/images/2498/screenshot_4.png 2025-08-29 01:48:30.161473 load time: 1019.08 ms
VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/vision/test_1363_image.png 2025-08-29 01:48:30.161558 load time: 1023.76 ms
VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/data/tabs/other_screenshot/original/ModernDashboard_1739966744.2230458.png 2025-08-29 01:48:30.163382 load time: 1024.08 ms
VC:s3://gui-agent/data_20250612/windows/images/settings/free_task_20250606_205214/images/20250606_205218_1.png 2025-08-29 01:48:30.163664 load time: 1053.8 ms
89%|████████▉ | 19661/22095 [33:51:14<19:35:29, 28.98s/it] {'loss': 0.2723, 'grad_norm': 0.6713968922655957, 'learning_rate': 3.151349628353856e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19662/22095 [33:51:23<15:32:56, 23.01s/it] {'loss': 0.4566, 'grad_norm': 0.28489424233366306, 'learning_rate': 3.1487892928105554e-07,
'epoch': 0.89} 89%|████████▉ | 19662/22095 [33:51:23<15:32:56, 23.01s/it] 89%|████████▉ | 19663/22095 [33:51:32<12:47:39, 18.94s/it] {'loss': 0.4745, 'grad_norm': 0.2687351928135246, 'learning_rate': 3.146229963955877e-07, 'epoch': 0.89} 89%|████████▉ | 19663/22095 [33:51:32<12:47:39, 18.94s/it] 89%|████████▉ | 19664/22095 [33:52:22<19:08:02, 28.34s/it] {'loss': 0.464, 'grad_norm': 0.2776367386077948, 'learning_rate': 3.143671641844831e-07, 'epoch': 0.89} 89%|████████▉ | 19664/22095 [33:52:22<19:08:02, 28.34s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 89%|████████▉ | 19665/22095 [33:52:26<14:02:00, 20.79s/it] {'loss': 0.3152, 'grad_norm': 0.7031881152911872, 'learning_rate': 3.1411143265323684e-07, 'epoch': 0.89} 89%|████████▉ | 19665/22095 [33:52:26<14:02:00, 20.79s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▉ | 19666/22095 [33:52:47<14:09:42, 20.99s/it] {'loss': 0.3258, 'grad_norm': 0.7058053280755887, 'learning_rate': 3.138558018073434e-07, 'epoch': 0.89} 89%|████████▉ | 19666/22095 [33:52:47<14:09:42, 20.99s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (43568 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86917 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75682 > 40960). 
Running this sequence through the model will result in indexing errors
89%|████████▉ | 19667/22095 [33:52:57<11:50:09, 17.55s/it] {'loss': 0.4666, 'grad_norm': 0.2524281733289989, 'learning_rate': 3.1360027165229677e-07, 'epoch': 0.89}
89%|████████▉ | 19668/22095 [33:53:19<12:46:07, 18.94s/it] {'loss': 0.3123, 'grad_norm': 0.5566851001002566, 'learning_rate': 3.1334484219358754e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (56371 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52377 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72613 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19669/22095 [33:53:22<9:33:10, 14.18s/it] {'loss': 0.3133, 'grad_norm': 0.6187301932164185, 'learning_rate': 3.13089513436704e-07, 'epoch': 0.89}
89%|████████▉ | 19670/22095 [33:54:05<15:28:23, 22.97s/it] {'loss': 0.2975, 'grad_norm': 0.6546110361034537, 'learning_rate': 3.1283428538713134e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (85880 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19671/22095 [33:54:09<11:38:40, 17.29s/it] {'loss': 0.2893, 'grad_norm': 0.5680701004053491, 'learning_rate': 3.125791580503551e-07, 'epoch': 0.89}
VC:s3://gui-agent/jedi/images/figma400k/figma400k_extracted/9b4ebf2c185fd916f66fb7077d814a73aee14f7284bd03fe3ab5982be1cfb63c.png 2025-08-29 01:52:08.180230 load time: 1044.19 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_821145.png 2025-08-29 01:52:08.182369 load time: 1027.46 ms
89%|████████▉ | 19672/22095 [33:54:12<8:44:10, 12.98s/it] {'loss': 0.285, 'grad_norm': 0.6341125142971917, 'learning_rate': 3.1232413143185534e-07, 'epoch': 0.89}
89%|████████▉ | 19673/22095 [33:54:16<6:52:05, 10.21s/it] {'loss': 0.2534, 'grad_norm': 0.6664029205667784, 'learning_rate': 3.1206920553711385e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19674/22095 [33:54:37<9:05:58, 13.53s/it] {'loss': 0.3067, 'grad_norm': 0.660749332252751, 'learning_rate': 3.1181438037160727e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19675/22095 [33:55:22<15:22:59, 22.88s/it] {'loss': 0.4597, 'grad_norm': 0.2588901641031759, 'learning_rate': 3.1155965594081017e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (48551 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61201 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19676/22095 [33:55:43<15:01:43, 22.37s/it] {'loss': 0.3262, 'grad_norm': 0.6116703932544021, 'learning_rate': 3.1130503225019705e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19677/22095 [33:55:48<11:23:34, 16.96s/it] {'loss': 0.4598, 'grad_norm': 0.28303243772908865, 'learning_rate': 3.110505093052396e-07, 'epoch': 0.89}
89%|████████▉ | 19678/22095 [33:55:51<8:43:33, 13.00s/it] {'loss': 0.2759, 'grad_norm': 0.624993176785746, 'learning_rate': 3.107960871114041e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui-agent/jedi/images/icons_v0122/icons_v0122_extracted/images_pure_color_background/AppleMusicAssets/iTunesExtraListView_ko_Normal.png 2025-08-29 01:53:50.080173 load time: 1057.69 ms
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19679/22095 [33:55:58<7:33:19, 11.26s/it] {'loss': 0.4466, 'grad_norm': 0.2662360133368587, 'learning_rate': 3.1054176567415937e-07, 'epoch': 0.89}
89%|████████▉ | 19680/22095 [33:56:21<9:52:35, 14.72s/it] {'loss': 0.2865, 'grad_norm': 0.6189525480592604, 'learning_rate': 3.1028754499896895e-07, 'epoch': 0.89}
89%|████████▉ | 19681/22095 [33:57:26<20:01:10, 29.86s/it] {'loss': 0.287, 'grad_norm': 0.6925785423664801, 'learning_rate': 3.1003342509129783e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (81087 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51999 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19682/22095 [33:57:31<14:49:29, 22.12s/it] {'loss': 0.2926, 'grad_norm': 0.5897566594709222, 'learning_rate': 3.097794059566023e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922714 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45867, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 2\nB. 3\nC. 4\nD. 1\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
89%|████████▉ | 19683/22095 [33:57:51<14:33:54, 21.74s/it] {'loss': 0.2999, 'grad_norm': 0.5965249195848944, 'learning_rate': 3.0952548760034284e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19684/22095 [33:59:18<27:34:28, 41.17s/it] {'loss': 0.443, 'grad_norm': 0.2501494233144302, 'learning_rate': 3.0927167002797574e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (75147 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19685/22095 [33:59:40<23:48:37, 35.57s/it] {'loss': 0.2894, 'grad_norm': 0.6550215832039792, 'learning_rate': 3.0901795324495334e-07, 'epoch': 0.89}
89%|████████▉ | 19686/22095 [34:00:04<21:19:10, 31.86s/it] {'loss': 0.2451, 'grad_norm': 0.6083383919033705, 'learning_rate': 3.0876433725672783e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (68073 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79707 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41657 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108407 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (43192 > 40960) for 4 sample(s). Truncating to 1269 with 2 samples.
89%|████████▉ | 19687/22095 [34:01:05<27:09:09, 40.59s/it] {'loss': 0.2653, 'grad_norm': 0.5901909902580682, 'learning_rate': 3.085108220687494e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1120, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8406259 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1120, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8446, 'image': 'vrdu_table_final_2/astro-ph.CO/9bfc0a5f-b6bb-4aab-9a65-70d8c609222f.png', 'image_wh': [[1120, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n1&2&3&4&5&7&8\n\\end{tabular}\n```"}]}
89%|████████▉ | 19688/22095 [34:01:27<23:30:44, 35.17s/it] {'loss': 0.3328, 'grad_norm': 0.6244542350942888, 'learning_rate': 3.082574076864636e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [334, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8459148 in VC:s3://internvl-moe-sft-data/. Exception: Image size [334, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 71399, 'image': 'vrdu_texteq/astro-ph.CO/38c6eb56-e043-4748-9646-aa46bf9a8827.png', 'image_wh': [[334, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease employ OCR technology to recognize and extract the text from the image.'}, {'from': 'gpt', 'value': 'where $\\theta_b$ is the beam width'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8883160 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6313, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2.5\nB. 4.5\nC. 7\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
89%|████████▉ | 19689/22095 [34:01:31<17:10:34, 25.70s/it] {'loss': 0.2903, 'grad_norm': 0.6182059717072552, 'learning_rate': 3.0800409411531727e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19690/22095 [34:01:37<13:12:07, 19.76s/it] {'loss': 0.4782, 'grad_norm': 0.27308560925210523, 'learning_rate': 3.077508813607527e-07, 'epoch': 0.89}
VC:s3://internvl2/datasets/VCR-wiki-zh-easy/images/0015749.jpg 2025-08-29 01:59:35.374591 load time: 1022.42 ms
89%|████████▉ | 19691/22095 [34:01:58<13:25:45, 20.11s/it] {'loss': 0.3073, 'grad_norm': 0.6191578915878152, 'learning_rate': 3.0749776942820943e-07, 'epoch': 0.89}
89%|████████▉ | 19692/22095 [34:02:20<13:55:30, 20.86s/it] {'loss': 0.224, 'grad_norm': 0.6610736152984831, 'learning_rate': 3.072447583231275e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19693/22095 [34:02:28<11:13:03, 16.81s/it] {'loss': 0.4748, 'grad_norm': 0.2789734855593772, 'learning_rate': 3.0699184805094374e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19694/22095 [34:02:31<8:36:17, 12.90s/it] {'loss': 0.2646, 'grad_norm': 0.7874295132580852, 'learning_rate': 3.067390386170915e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19695/22095 [34:02:39<7:38:10, 11.45s/it] {'loss': 0.4445, 'grad_norm': 0.2728170143891171,
'learning_rate': 3.064863300270027e-07, 'epoch': 0.89}
89%|████████▉ | 19696/22095 [34:03:04<10:11:11, 15.29s/it] {'loss': 0.3046, 'grad_norm': 0.7961230497887644, 'learning_rate': 3.0623372228610725e-07, 'epoch': 0.89}
89%|████████▉ | 19697/22095 [34:03:26<11:31:22, 17.30s/it] {'loss': 0.3039, 'grad_norm': 1.4269209581457398, 'learning_rate': 3.059812153998343e-07, 'epoch': 0.89}
89%|████████▉ | 19698/22095 [34:03:47<12:18:46, 18.49s/it] {'loss': 0.2733, 'grad_norm': 0.7972617116547703, 'learning_rate': 3.057288093736083e-07, 'epoch': 0.89}
89%|████████▉ | 19699/22095 [34:03:51<9:21:17, 14.06s/it] {'loss': 0.2966, 'grad_norm': 0.650142097827148, 'learning_rate': 3.0547650421285216e-07, 'epoch': 0.89}
89%|████████▉ | 19700/22095 [34:04:12<10:52:55, 16.36s/it] {'loss': 0.3322, 'grad_norm': 1.0215184862868254, 'learning_rate': 3.0522429992298873e-07, 'epoch': 0.89}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (160124908 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
89%|████████▉ | 19701/22095 [34:04:15<8:12:33, 12.34s/it] {'loss': 0.2878, 'grad_norm': 0.5915881881019676, 'learning_rate': 3.0497219650943545e-07, 'epoch': 0.89}
89%|████████▉ | 19702/22095 [34:04:18<6:22:21, 9.59s/it] {'loss': 0.2904, 'grad_norm': 0.6089222195563149, 'learning_rate': 3.0472019397761065e-07, 'epoch': 0.89}
89%|████████▉ | 19703/22095 [34:04:39<8:38:59, 13.02s/it] {'loss': 0.2931, 'grad_norm': 0.6306950949640981, 'learning_rate': 3.044682923329284e-07, 'epoch': 0.89}
89%|████████▉ | 19704/22095 [34:05:01<10:25:16, 15.69s/it] {'loss': 0.29, 'grad_norm': 0.6377067107024558, 'learning_rate': 3.0421649158080047e-07, 'epoch': 0.89}
89%|████████▉ | 19705/22095 [34:05:05<7:56:32, 11.96s/it] {'loss': 0.2944, 'grad_norm': 0.6472275255096006, 'learning_rate': 3.0396479172663806e-07, 'epoch': 0.89}
89%|████████▉ | 19706/22095 [34:05:27<10:00:19, 15.08s/it] {'loss': 0.2888, 'grad_norm': 0.6015998556020318, 'learning_rate': 3.037131927758502e-07, 'epoch': 0.89}
89%|████████▉ | 19707/22095 [34:05:32<7:54:28, 11.92s/it] {'loss': 0.2742, 'grad_norm': 0.6050816306761447, 'learning_rate': 3.0346169473384255e-07, 'epoch': 0.89}
89%|████████▉ | 19708/22095 [34:05:54<10:02:18, 15.14s/it] {'loss': 0.2751, 'grad_norm': 0.5798568141658198, 'learning_rate': 3.032102976060181e-07, 'epoch': 0.89}
89%|████████▉ | 19709/22095 [34:05:57<7:34:41, 11.43s/it] {'loss': 0.2635, 'grad_norm': 0.7827332971119896, 'learning_rate': 3.02959001397779e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [606, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8456477 in VC:s3://internvl-moe-sft-data/. Exception: Image size [606, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 142977, 'image': 'vrdu_texteq/astro-ph.CO/a92ca682-ff2d-4c31-bca4-9353c1243af7.png', 'image_wh': [[606, 25]], 'conversations': [{'from': 'human', 'value': '\n能否使用光学字符识别(OCR)技术在图像上提取文本?'}, {'from': 'gpt', 'value': 'which for $D_B=0$ coincides with the result of and'}]}
89%|████████▉ | 19710/22095 [34:06:00<5:58:55, 9.03s/it] {'loss': 0.3007, 'grad_norm': 0.6218102387465159, 'learning_rate': 3.027078061145261e-07, 'epoch': 0.89}
89%|████████▉ | 19711/22095 [34:06:23<8:43:25, 13.17s/it] {'loss': 0.2722, 'grad_norm': 0.5946378433026083, 'learning_rate': 3.024567117616556e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (49018 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19712/22095 [34:06:45<10:24:14, 15.72s/it] {'loss': 0.2859, 'grad_norm': 0.6229968934552937, 'learning_rate': 3.0220571834456256e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (82181 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77382 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19713/22095 [34:06:49<8:05:00, 12.22s/it] {'loss': 0.2961, 'grad_norm': 0.551512068307753, 'learning_rate': 3.0195482586864055e-07, 'epoch': 0.89}
89%|████████▉ | 19714/22095 [34:06:52<6:16:49, 9.50s/it] {'loss': 0.2716, 'grad_norm': 0.6761291147721131, 'learning_rate': 3.0170403433928077e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19715/22095 [34:07:33<12:27:20, 18.84s/it] {'loss': 0.2877, 'grad_norm': 0.5916261620907665, 'learning_rate': 3.014533437618711e-07, 'epoch': 0.89}
89%|████████▉ | 19716/22095 [34:07:36<9:17:35, 14.06s/it] {'loss': 0.3192, 'grad_norm': 0.6567528853482709, 'learning_rate': 3.012027541417989e-07, 'epoch': 0.89}
89%|████████▉ | 19717/22095 [34:07:59<11:09:15, 16.89s/it] {'loss': 0.3236, 'grad_norm': 0.6371454442751823, 'learning_rate': 3.0095226548444765e-07, 'epoch': 0.89}
89%|████████▉ | 19718/22095 [34:08:02<8:21:59, 12.67s/it] {'loss': 0.3087, 'grad_norm': 0.5819460949351073, 'learning_rate': 3.007018777952009e-07, 'epoch': 0.89}
89%|████████▉ | 19719/22095 [34:08:06<6:43:05, 10.18s/it] {'loss': 0.2836, 'grad_norm': 0.6690205421170103, 'learning_rate': 3.004515910794381e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
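The "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" lines above are tokenizer warnings: a sample tokenized past the model's 40960-token context, and feeding it through unchanged would index past the embedding table. A minimal, hypothetical sketch of the kind of guard a data pipeline can apply before batching (the function name and shape are illustrative, not the actual qwen-vl-finetune code):

```python
MAX_LEN = 40960  # model context limit quoted in the warnings above


def clamp_token_ids(token_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Truncate a tokenized sample so embedding lookups stay in range.

    Hypothetical helper: samples longer than max_len are cut to the first
    max_len tokens instead of triggering the "indexing errors" the
    tokenizer warning refers to.
    """
    if len(token_ids) <= max_len:
        return token_ids
    return token_ids[:max_len]
```

The later "Rank 0: ... Truncating to 40960 with 1 samples" messages suggest the trainer applies some clamp of this kind on the rank-0 data path, though the exact policy (truncate vs. drop) is not visible in the log.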
89%|████████▉ | 19720/22095 [34:08:12<5:46:29, 8.75s/it] {'loss': 0.4502, 'grad_norm': 0.24512552320011244, 'learning_rate': 3.0020140534253617e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (46866 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19721/22095 [34:08:21<5:54:12, 8.95s/it] {'loss': 0.4571, 'grad_norm': 0.2524877763429363, 'learning_rate': 2.9995132058987185e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 364, but got module 1
89%|████████▉ | 19722/22095 [34:08:25<4:52:39, 7.40s/it] {'loss': 0.2898, 'grad_norm': 0.6603633468410276, 'learning_rate': 2.9970133682681924e-07, 'epoch': 0.89}
89%|████████▉ | 19723/22095 [34:08:47<7:49:13, 11.87s/it] {'loss': 0.2916, 'grad_norm': 0.623220132327247, 'learning_rate': 2.9945145405874955e-07, 'epoch': 0.89}
89%|████████▉ | 19724/22095 [34:09:29<13:47:13, 20.93s/it] {'loss': 0.284, 'grad_norm': 0.694842873664057, 'learning_rate': 2.9920167229103015e-07, 'epoch': 0.89}
89%|████████▉ | 19725/22095 [34:09:32<10:12:42, 15.51s/it] {'loss': 0.2988, 'grad_norm': 0.6110616350131243, 'learning_rate': 2.9895199152902955e-07, 'epoch': 0.89}
89%|████████▉ | 19726/22095 [34:09:56<11:49:49, 17.98s/it] {'loss': 0.2796, 'grad_norm': 0.6380149501154079, 'learning_rate': 2.987024117781129e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047142 in VC:s3://multi-modal/UniGeo/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 3\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
89%|████████▉ | 19727/22095 [34:09:59<8:51:13, 13.46s/it] {'loss': 0.2867, 'grad_norm': 0.5801201468884075, 'learning_rate': 2.984529330436431e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [264, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8529861 in VC:s3://internvl-moe-sft-data/. Exception: Image size [264, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3385, 'image': 'vrdu_texteq/astro-ph.CO/665ce88c-b95c-430f-80c3-61f4790c8f1b.png', 'image_wh': [[264, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'with $a\\!\\gg\\!a_{1}$ or $a\\!\\ll\\!a_{2}$.'}]}
89%|████████▉ | 19728/22095 [34:10:02<6:45:15, 10.27s/it] {'loss': 0.2699, 'grad_norm': 0.5946765214793907, 'learning_rate': 2.9820355533097864e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
89%|████████▉ | 19729/22095 [34:10:09<6:12:38, 9.45s/it] {'loss': 0.4857, 'grad_norm': 0.2925541355352873, 'learning_rate': 2.9795427864548034e-07, 'epoch': 0.89}
89%|████████▉ | 19730/22095 [34:10:13<5:02:23, 7.67s/it] {'loss': 0.2928, 'grad_norm': 0.6101996258652933, 'learning_rate': 2.9770510299250265e-07, 'epoch': 0.89}
89%|████████▉ | 19731/22095 [34:10:39<8:41:53, 13.25s/it] {'loss': 0.3178, 'grad_norm': 0.9588491138498303, 'learning_rate': 2.974560283774014e-07, 'epoch': 0.89}
89%|████████▉ | 19732/22095 [34:10:43<6:48:53, 10.38s/it] {'loss': 0.3092, 'grad_norm': 0.650951067812214, 'learning_rate': 2.972070548055267e-07, 'epoch': 0.89}
89%|████████▉ | 19733/22095 [34:10:47<5:35:50, 8.53s/it] {'loss': 0.2679, 'grad_norm': 0.6326366483138217, 'learning_rate': 2.9695818228222873e-07, 'epoch': 0.89}
89%|████████▉ | 19734/22095 [34:10:50<4:33:15, 6.94s/it] {'loss': 0.3045, 'grad_norm': 0.6164318764744039, 'learning_rate': 2.967094108128549e-07, 'epoch': 0.89}
89%|████████▉ | 19735/22095 [34:10:54<3:54:32, 5.96s/it]
{'loss': 0.2565, 'grad_norm': 0.563153323637875, 'learning_rate': 2.964607404027514e-07, 'epoch': 0.89}
89%|████████▉ | 19736/22095 [34:10:57<3:25:24, 5.22s/it] {'loss': 0.3056, 'grad_norm': 0.607980885936067, 'learning_rate': 2.9621217105726077e-07, 'epoch': 0.89}
89%|████████▉ | 19737/22095 [34:11:00<3:00:36, 4.60s/it] {'loss': 0.2901, 'grad_norm': 0.708213998184299, 'learning_rate': 2.9596370278172305e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (52172 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46512 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58448 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19738/22095 [34:11:03<2:38:23, 4.03s/it] {'loss': 0.2731, 'grad_norm': 0.6246550411649562, 'learning_rate': 2.9571533558147845e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (137030 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44957 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47235 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19739/22095 [34:11:07<2:33:38, 3.91s/it] {'loss': 0.3315, 'grad_norm': 0.6935539968631732, 'learning_rate': 2.9546706946186387e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (87445 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87717 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93565 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19740/22095 [34:11:11<2:31:25, 3.86s/it] {'loss': 0.2844, 'grad_norm': 0.61341847188213, 'learning_rate': 2.9521890442821276e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (42740 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57367 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19741/22095 [34:11:13<2:21:04, 3.60s/it] {'loss': 0.2901, 'grad_norm': 0.6046356361903988, 'learning_rate': 2.9497084048585755e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Token indices sequence length is longer than the specified maximum sequence length for this model (143555 > 40960).
Running this sequence through the model will result in indexing errors
89%|████████▉ | 19742/22095 [34:11:16<2:12:52, 3.39s/it] {'loss': 0.2884, 'grad_norm': 0.6001415015783245, 'learning_rate': 2.94722877640129e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (60645 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54974 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19743/22095 [34:11:20<2:12:50, 3.39s/it] {'loss': 0.247, 'grad_norm': 0.6380842124636362, 'learning_rate': 2.9447501589635387e-07, 'epoch': 0.89}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8898847 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 22000, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
89%|████████▉ | 19744/22095 [34:11:23<2:11:43, 3.36s/it] {'loss': 0.3277, 'grad_norm': 0.5885874973556446, 'learning_rate': 2.942272552598596e-07, 'epoch': 0.89}
89%|████████▉ | 19745/22095 [34:11:45<5:50:07, 8.94s/it] {'loss': 0.2867, 'grad_norm': 0.6059575773494633, 'learning_rate': 2.9397959573596867e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366419 in VC:s3://internvl-moe-sft-data/. Exception: Image size [214, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33165, 'image': 'vrdu_table_final_2/astro-ph.CO/16075af7-f7b5-44fa-804e-2e3ae24327d4.png', 'image_wh': [[214, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.8\\hsize}}\n \\small #1 \\today\n\\end{tabular}\n```"}]}
89%|████████▉ | 19746/22095 [34:11:54<5:55:47, 9.09s/it] {'loss': 0.4673, 'grad_norm': 0.25104711480411934, 'learning_rate': 2.9373203733000234e-07, 'epoch': 0.89}
89%|████████▉ | 19747/22095 [34:12:04<5:59:22, 9.18s/it] {'loss': 0.4565, 'grad_norm': 0.2764519128206775, 'learning_rate': 2.9348458004728074e-07, 'epoch': 0.89}
Token indices sequence length is longer than the specified maximum sequence length for this model (41510 > 40960). Running this sequence through the model will result in indexing errors
89%|████████▉ | 19748/22095 [34:12:13<6:02:21, 9.26s/it] {'loss': 0.448, 'grad_norm': 0.25004411755184186, 'learning_rate': 2.9323722389312084e-07, 'epoch': 0.89}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
89%|████████▉ | 19749/22095 [34:12:23<6:05:57, 9.36s/it] {'loss': 0.4502, 'grad_norm': 0.2816876893548513, 'learning_rate': 2.929899688728366e-07, 'epoch': 0.89}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (49778 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84370 > 40960).
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19750/22095 [34:12:26<4:57:38, 7.62s/it] {'loss': 0.2989, 'grad_norm': 0.6316241558565415, 'learning_rate': 2.927428149917416e-07, 'epoch': 0.89} 89%|████████▉ | 19750/22095 [34:12:26<4:57:38, 7.62s/it] 89%|████████▉ | 19751/22095 [34:12:37<5:30:03, 8.45s/it] {'loss': 0.4817, 'grad_norm': 0.2710113008541533, 'learning_rate': 2.9249576225514664e-07, 'epoch': 0.89} 89%|████████▉ | 19751/22095 [34:12:37<5:30:03, 8.45s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8910450 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 33603, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 16cm\nB. 4cm\nC. 8cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 89%|████████▉ | 19752/22095 [34:12:40<4:27:06, 6.84s/it] {'loss': 0.3181, 'grad_norm': 0.633843670051217, 'learning_rate': 2.922488106683596e-07, 'epoch': 0.89} 89%|████████▉ | 19752/22095 [34:12:40<4:27:06, 6.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45230 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47647 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67797 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (47165 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. 89%|████████▉ | 19753/22095 [34:12:44<3:52:14, 5.95s/it] {'loss': 0.3278, 'grad_norm': 0.6530386741855132, 'learning_rate': 2.9200196023668693e-07, 'epoch': 0.89} 89%|████████▉ | 19753/22095 [34:12:44<3:52:14, 5.95s/it] 89%|████████▉ | 19754/22095 [34:12:47<3:24:23, 5.24s/it] {'loss': 0.3048, 'grad_norm': 0.6829899675953176, 'learning_rate': 2.91755210965432e-07, 'epoch': 0.89} 89%|████████▉ | 19754/22095 [34:12:47<3:24:23, 5.24s/it] 89%|████████▉ | 19755/22095 [34:12:52<3:11:59, 4.92s/it] {'loss': 0.291, 'grad_norm': 0.5950240474932199, 'learning_rate': 2.915085628598979e-07, 'epoch': 0.89} 89%|████████▉ | 19755/22095 [34:12:52<3:11:59, 4.92s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (68776 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62474 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19756/22095 [34:13:00<3:48:06, 5.85s/it] {'loss': 0.4681, 'grad_norm': 0.27733112695232565, 'learning_rate': 2.9126201592538427e-07, 'epoch': 0.89} 89%|████████▉ | 19756/22095 [34:13:00<3:48:06, 5.85s/it] 89%|████████▉ | 19757/22095 [34:13:03<3:21:19, 5.17s/it] {'loss': 0.2653, 'grad_norm': 0.6544517655613107, 'learning_rate': 2.910155701671868e-07, 'epoch': 0.89} 89%|████████▉ | 19757/22095 [34:13:03<3:21:19, 5.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▉ | 19758/22095 [34:13:13<4:20:31, 6.69s/it] {'loss': 0.4757, 'grad_norm': 0.2657406569618254, 'learning_rate': 2.907692255906036e-07, 'epoch': 0.89} 89%|████████▉ | 19758/22095 [34:13:13<4:20:31, 6.69s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 89%|████████▉ | 19759/22095 [34:13:17<3:49:44, 5.90s/it] {'loss': 0.3149, 'grad_norm': 0.6613698668778631, 'learning_rate': 2.905229822009253e-07, 'epoch': 0.89} 89%|████████▉ | 19759/22095 [34:13:17<3:49:44, 5.90s/it] 89%|████████▉ | 19760/22095 [34:13:21<3:23:16, 5.22s/it] {'loss': 0.2823, 'grad_norm': 0.5564095968570708, 'learning_rate': 2.9027684000344446e-07, 'epoch': 0.89} 89%|████████▉ | 19760/22095 [34:13:21<3:23:16, 5.22s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▉ | 19761/22095 [34:13:30<4:10:58, 6.45s/it] {'loss': 0.4609, 'grad_norm': 0.2668504868753882, 'learning_rate': 2.900307990034501e-07, 'epoch': 0.89} 89%|████████▉ | 19761/22095 [34:13:30<4:10:58, 6.45s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = 
sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [164, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8921709 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [164, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 44862, 'image': 'images/4915.png', 'image_wh': [[164, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,C为线段AB上一点,D为线段BC的中点,AB=20,AD=14,则AC的长为()\nA. 10\nB. 8\nC. 7\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 89%|████████▉ | 19762/22095 [34:13:34<3:33:20, 5.49s/it] {'loss': 0.3106, 'grad_norm': 0.6364155599319349, 'learning_rate': 2.8978485920622747e-07, 'epoch': 0.89} 89%|████████▉ | 19762/22095 [34:13:34<3:33:20, 5.49s/it] 89%|████████▉ | 19763/22095 [34:13:37<3:03:49, 4.73s/it] {'loss': 0.2661, 'grad_norm': 0.550987796887358, 'learning_rate': 2.8953902061706173e-07, 'epoch': 0.89} 89%|████████▉ | 19763/22095 [34:13:37<3:03:49, 4.73s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49842 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50206 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100415 > 40960). 
Running this sequence through the model will result in indexing errors 89%|████████▉ | 19764/22095 [34:13:46<3:59:19, 6.16s/it] {'loss': 0.4433, 'grad_norm': 0.25702933360286156, 'learning_rate': 2.8929328324123595e-07, 'epoch': 0.89} 89%|████████▉ | 19764/22095 [34:13:46<3:59:19, 6.16s/it] 89%|████████▉ | 19765/22095 [34:13:50<3:28:38, 5.37s/it] {'loss': 0.3219, 'grad_norm': 0.5719382824193083, 'learning_rate': 2.890476470840303e-07, 'epoch': 0.89} 89%|████████▉ | 19765/22095 [34:13:50<3:28:38, 5.37s/it] 89%|████████▉ | 19766/22095 [34:13:53<3:02:29, 4.70s/it] {'loss': 0.279, 'grad_norm': 0.6305795164274872, 'learning_rate': 2.8880211215072065e-07, 'epoch': 0.89} 89%|████████▉ | 19766/22095 [34:13:53<3:02:29, 4.70s/it] 89%|████████▉ | 19767/22095 [34:13:56<2:40:57, 4.15s/it] {'loss': 0.2827, 'grad_norm': 0.6123742014140814, 'learning_rate': 2.8855667844658484e-07, 'epoch': 0.89} 89%|████████▉ | 19767/22095 [34:13:56<2:40:57, 4.15s/it] 89%|████████▉ | 19768/22095 [34:13:59<2:32:52, 3.94s/it] {'loss': 0.2471, 'grad_norm': 0.6371501831330458, 'learning_rate': 2.8831134597689604e-07, 'epoch': 0.89} 89%|████████▉ | 19768/22095 [34:13:59<2:32:52, 3.94s/it] 89%|████████▉ | 19769/22095 [34:14:02<2:22:21, 3.67s/it] {'loss': 0.3209, 'grad_norm': 0.6011369890792166, 'learning_rate': 2.8806611474692604e-07, 'epoch': 0.89} 89%|████████▉ | 19769/22095 [34:14:02<2:22:21, 3.67s/it] 89%|████████▉ | 19770/22095 [34:14:05<2:16:19, 3.52s/it] {'loss': 0.2731, 'grad_norm': 0.6019283022002772, 'learning_rate': 2.878209847619429e-07, 'epoch': 0.89} 89%|████████▉ | 19770/22095 [34:14:05<2:16:19, 3.52s/it] 89%|████████▉ | 19771/22095 [34:14:08<2:12:10, 3.41s/it] {'loss': 0.2717, 'grad_norm': 0.6117442681689266, 'learning_rate': 2.875759560272151e-07, 'epoch': 0.89} 89%|████████▉ | 19771/22095 [34:14:08<2:12:10, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 89%|████████▉ | 19772/22095 [34:14:18<3:24:52, 5.29s/it] {'loss': 0.4742, 'grad_norm': 
0.2607889799884667, 'learning_rate': 2.873310285480063e-07, 'epoch': 0.89} 89%|████████▉ | 19772/22095 [34:14:18<3:24:52, 5.29s/it] 89%|████████▉ | 19773/22095 [34:14:21<3:01:15, 4.68s/it] {'loss': 0.3086, 'grad_norm': 0.6182961608864804, 'learning_rate': 2.8708620232958004e-07, 'epoch': 0.89} 89%|████████▉ | 19773/22095 [34:14:21<3:01:15, 4.68s/it] 89%|████████▉ | 19774/22095 [34:14:25<2:53:40, 4.49s/it] {'loss': 0.294, 'grad_norm': 0.579715758551968, 'learning_rate': 2.868414773771971e-07, 'epoch': 0.89} 89%|████████▉ | 19774/22095 [34:14:25<2:53:40, 4.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52485 > 40960). Running this sequence through the model will result in indexing errors 89%|████████▉ | 19775/22095 [34:14:30<2:52:13, 4.45s/it] {'loss': 0.275, 'grad_norm': 0.5744817660343502, 'learning_rate': 2.8659685369611503e-07, 'epoch': 0.89} 89%|████████▉ | 19775/22095 [34:14:30<2:52:13, 4.45s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|████████▉ | 19776/22095 [34:14:34<2:46:43, 4.31s/it] {'loss': 0.2363, 'grad_norm': 0.5618204289540498, 'learning_rate': 2.8635233129159004e-07, 'epoch': 0.9} 90%|████████▉ | 19776/22095 [34:14:34<2:46:43, 4.31s/it] 90%|████████▉ | 19777/22095 [34:14:37<2:30:45, 3.90s/it] {'loss': 0.2863, 'grad_norm': 0.650461405504146, 'learning_rate': 2.8610791016887794e-07, 'epoch': 0.9} 90%|████████▉ | 19777/22095 [34:14:37<2:30:45, 3.90s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70085 > 40960). 
Running this sequence through the model will result in indexing errors 90%|████████▉ | 19778/22095 [34:14:40<2:17:22, 3.56s/it] {'loss': 0.2734, 'grad_norm': 0.5736113775880847, 'learning_rate': 2.85863590333228e-07, 'epoch': 0.9} 90%|████████▉ | 19778/22095 [34:14:40<2:17:22, 3.56s/it] 90%|████████▉ | 19779/22095 [34:14:44<2:26:46, 3.80s/it] {'loss': 0.2767, 'grad_norm': 0.5879128588576477, 'learning_rate': 2.8561937178989087e-07, 'epoch': 0.9} 90%|████████▉ | 19779/22095 [34:14:44<2:26:46, 3.80s/it] 90%|████████▉ | 19780/22095 [34:14:48<2:30:19, 3.90s/it] {'loss': 0.2962, 'grad_norm': 0.6277117466751468, 'learning_rate': 2.853752545441146e-07, 'epoch': 0.9} 90%|████████▉ | 19780/22095 [34:14:48<2:30:19, 3.90s/it] 90%|████████▉ | 19781/22095 [34:14:52<2:29:03, 3.86s/it] {'loss': 0.3418, 'grad_norm': 0.666912011889595, 'learning_rate': 2.851312386011457e-07, 'epoch': 0.9} 90%|████████▉ | 19781/22095 [34:14:52<2:29:03, 3.86s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (116189 > 40960). Running this sequence through the model will result in indexing errors 90%|████████▉ | 19782/22095 [34:14:55<2:19:55, 3.63s/it] {'loss': 0.2705, 'grad_norm': 0.724911195777025, 'learning_rate': 2.8488732396622476e-07, 'epoch': 0.9} 90%|████████▉ | 19782/22095 [34:14:55<2:19:55, 3.63s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (101799 > 40960). 
Running this sequence through the model will result in indexing errors 90%|████████▉ | 19783/22095 [34:15:01<2:54:07, 4.52s/it] {'loss': 0.459, 'grad_norm': 0.28456627284890906, 'learning_rate': 2.846435106445933e-07, 'epoch': 0.9} 90%|████████▉ | 19783/22095 [34:15:01<2:54:07, 4.52s/it] 90%|████████▉ | 19784/22095 [34:15:05<2:42:59, 4.23s/it] {'loss': 0.254, 'grad_norm': 0.6560552485337452, 'learning_rate': 2.843997986414915e-07, 'epoch': 0.9} 90%|████████▉ | 19784/22095 [34:15:05<2:42:59, 4.23s/it] 90%|████████▉ | 19785/22095 [34:15:08<2:30:31, 3.91s/it] {'loss': 0.3142, 'grad_norm': 0.6501966318352483, 'learning_rate': 2.8415618796215516e-07, 'epoch': 0.9} 90%|████████▉ | 19785/22095 [34:15:08<2:30:31, 3.91s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [639, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8504937 in VC:s3://internvl-moe-sft-data/. Exception: Image size [639, 25, 100, 100] is too small. Minimum size is 28. 
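The recurring `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings come from the tokenizer, while the rank-0 line `Truncating to 40960 with 1 samples` shows the training loop clamps overlong sequences itself. A minimal sketch of that clamping (the function name is hypothetical; the real code lives elsewhere in `data_qwen_2.py`):

```python
MAX_SEQ_LEN = 40960  # model's maximum sequence length, per the warnings above


def clamp_to_max_len(input_ids: list, max_len: int = MAX_SEQ_LEN) -> list:
    """Truncate a token id sequence that exceeds the model context window,
    mirroring the 'Truncating to 40960' behaviour reported at rank 0."""
    if len(input_ids) > max_len:
        return input_ids[:max_len]
    return input_ids


# A 41510-token sequence (like the first warning above) gets clamped;
# in-budget sequences pass through unchanged.
assert len(clamp_to_max_len(list(range(41510)))) == 40960
assert clamp_to_max_len([1, 2, 3]) == [1, 2, 3]
```

Note that naive tail truncation can cut an assistant turn mid-answer; the actual loader may truncate at turn boundaries instead, which this sketch does not attempt.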
Problematic sample: {'id': 37636, 'image': 'vrdu_texteq/astro-ph.CO/7d0e4eb5-c885-4028-8e51-8d8662211d47.png', 'image_wh': [[639, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'while those with size $\\ell+d\\ell$ at time $t$ were formed at'}]}
 90%|████████▉ | 19786/22095 [34:15:16<3:10:18, 4.95s/it] {'loss': 0.4688, 'grad_norm': 0.25189795807932797, 'learning_rate': 2.839126786118179e-07, 'epoch': 0.9}
 90%|████████▉ | 19787/22095 [34:15:19<2:55:53, 4.57s/it] {'loss': 0.2673, 'grad_norm': 0.5445596252129087, 'learning_rate': 2.8366927059571393e-07, 'epoch': 0.9}
 90%|████████▉ | 19788/22095 [34:15:23<2:40:34, 4.18s/it] {'loss': 0.2709, 'grad_norm': 0.6155411118466514, 'learning_rate': 2.834259639190712e-07, 'epoch': 0.9}
 90%|████████▉ | 19789/22095 [34:15:26<2:28:13, 3.86s/it] {'loss': 0.3112, 'grad_norm': 0.5917130188296267, 'learning_rate': 2.8318275858711943e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 90%|████████▉ | 19790/22095 [34:15:34<3:21:12, 5.24s/it] {'loss': 0.4581, 'grad_norm': 0.2553760666330001, 'learning_rate': 2.829396546050839e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 90%|████████▉ | 19791/22095 [34:15:44<4:13:27, 6.60s/it] {'loss': 0.4565, 'grad_norm': 0.26227643370388387, 'learning_rate': 2.826966519781871e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 90%|████████▉ | 19792/22095 [34:15:48<3:48:05, 5.94s/it] {'loss': 0.2896, 'grad_norm': 0.6770457020151776, 'learning_rate': 2.824537507116504e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (43386 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89909 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47899 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46767 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97298 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19793/22095 [34:15:51<3:15:30, 5.10s/it] {'loss': 0.2608, 'grad_norm': 0.6022884364992735, 'learning_rate': 2.8221095081069517e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8965316 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 16151, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上一点,点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC比BC长()\nA. 1cm\nB. 2cm\nC. 4cm\nD. 6cm'}, {'from': 'gpt', 'value': '【解答】解:点M是AC的中点,点N是BC的中点,如果MC比NC长2cm,AC=2MC,BC=2CN,由线段的和差得AC-BC=2MC-2NC=2(MC-NC)=2×2=4cm,'}]}
 90%|████████▉ | 19794/22095 [34:15:54<2:50:44, 4.45s/it] {'loss': 0.2736, 'grad_norm': 0.5787444811677719, 'learning_rate': 2.819682522805367e-07, 'epoch': 0.9}
 90%|████████▉ | 19795/22095 [34:15:57<2:33:55, 4.02s/it] {'loss': 0.2927, 'grad_norm': 0.599606507374959, 'learning_rate': 2.8172565512638974e-07, 'epoch': 0.9}
 90%|████████▉ | 19796/22095 [34:16:01<2:28:34, 3.88s/it] {'loss': 0.2343, 'grad_norm': 0.625534914275994, 'learning_rate': 2.8148315935346725e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 90%|████████▉ | 19797/22095 [34:16:04<2:17:01, 3.58s/it] {'loss': 0.3011, 'grad_norm': 0.5826505623078216, 'learning_rate': 2.812407649669807e-07, 'epoch': 0.9}
 90%|████████▉ | 19798/22095 [34:16:07<2:13:16, 3.48s/it] {'loss': 0.2726, 'grad_norm': 0.6161056662098575, 'learning_rate': 2.809984719721376e-07, 'epoch': 0.9}
 90%|████████▉ | 19799/22095 [34:16:10<2:06:28, 3.31s/it] {'loss': 0.2774, 'grad_norm': 0.5970542362819024, 'learning_rate': 2.807562803741426e-07, 'epoch': 0.9}
 90%|████████▉ | 19800/22095 [34:16:13<2:04:07, 3.25s/it] {'loss': 0.2983, 'grad_norm': 0.6152274129951627, 'learning_rate': 2.805141901782027e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 90%|████████▉ | 19801/22095 [34:16:22<3:15:24, 5.11s/it] {'loss': 0.4626, 'grad_norm': 0.28077492126298775, 'learning_rate': 2.8027220138951705e-07, 'epoch': 0.9}
 90%|████████▉ | 19802/22095 [34:16:26<2:56:34, 4.62s/it] {'loss': 0.2933, 'grad_norm': 0.6393216136802916, 'learning_rate': 2.8003031401328653e-07, 'epoch': 0.9}
 90%|████████▉ | 19803/22095 [34:16:29<2:41:34, 4.23s/it] {'loss': 0.2614, 'grad_norm': 0.5985841723566697, 'learning_rate': 2.797885280547086e-07, 'epoch': 0.9}
 90%|████████▉ | 19804/22095 [34:16:34<2:41:38, 4.23s/it] {'loss': 0.2568, 'grad_norm': 0.5298212310543782, 'learning_rate': 2.795468435189774e-07, 'epoch': 0.9}
 90%|████████▉ | 19805/22095 [34:16:37<2:31:50, 3.98s/it] {'loss': 0.2981, 'grad_norm': 0.5987061216283815, 'learning_rate': 2.7930526041128727e-07, 'epoch': 0.9}
 90%|████████▉ | 19806/22095 [34:16:41<2:33:20, 4.02s/it] {'loss': 0.3146, 'grad_norm': 0.6934345951631308, 'learning_rate': 2.790637787368294e-07, 'epoch': 0.9}
 90%|████████▉ | 19807/22095 [34:16:45<2:28:59, 3.91s/it] {'loss': 0.2934, 'grad_norm': 0.6084117691504284, 'learning_rate': 2.788223985007904e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (47577 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41433 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19808/22095 [34:16:48<2:21:16, 3.71s/it] {'loss': 0.3225, 'grad_norm': 0.5735161793850959, 'learning_rate': 2.7858111970835823e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (48862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67245 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19809/22095 [34:16:51<2:12:45, 3.48s/it] {'loss': 0.3502, 'grad_norm': 0.6054105921436966, 'learning_rate': 2.783399423647171e-07, 'epoch': 0.9}
 90%|████████▉ | 19810/22095 [34:16:55<2:16:40, 3.59s/it] {'loss': 0.3202, 'grad_norm': 0.7195183412606594, 'learning_rate': 2.7809886647505e-07, 'epoch': 0.9}
 90%|████████▉ | 19811/22095 [34:16:58<2:17:27, 3.61s/it] {'loss': 0.2581, 'grad_norm': 0.5922610762383163, 'learning_rate': 2.778578920445352e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (41483 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19812/22095 [34:17:02<2:17:21, 3.61s/it] {'loss': 0.3336, 'grad_norm': 0.8144053010320692, 'learning_rate': 2.7761701907835114e-07, 'epoch': 0.9}
 90%|████████▉ | 19813/22095 [34:17:05<2:12:04, 3.47s/it] {'loss': 0.3069, 'grad_norm': 0.5985999716901659, 'learning_rate': 2.7737624758167436e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8930857 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
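The paired rank-0 lines `Number of image tokens N does not match number of images M` / `Fixed image tokens in the conversation` indicate the loader repairs conversations whose image placeholder count disagrees with the number of attached images (both 0-vs-1 and 2-vs-1 cases appear above). A sketch of one plausible repair, assuming a generic `<image>` placeholder string; the actual placeholder token and fix-up logic in the training code may differ:

```python
IMAGE_TOKEN = "<image>"  # assumed placeholder; not confirmed by the log


def fix_image_tokens(text: str, num_images: int, token: str = IMAGE_TOKEN) -> str:
    """Make the placeholder count in `text` match `num_images` by stripping
    all placeholders and re-inserting the right number at the front."""
    if text.count(token) == num_images:
        return text  # already consistent, leave untouched
    stripped = text.replace(token, "").lstrip("\n")
    return (token + "\n") * num_images + stripped


# Missing placeholder (the '0 does not match 1' case) gets one added:
assert fix_image_tokens("describe the picture", 1).count(IMAGE_TOKEN) == 1
# Extra placeholder (the '2 does not match 1' case) gets collapsed to one:
assert fix_image_tokens("<image>\n<image>\ndescribe", 1).count(IMAGE_TOKEN) == 1
```

Whether the real code prepends, appends, or re-positions placeholders is not visible from the log; only the invariant (count matches image count) is.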
Problematic sample: {'id': 54010, 'image': 'images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '3cm'}]}
 90%|████████▉ | 19814/22095 [34:17:08<2:08:21, 3.38s/it] {'loss': 0.2978, 'grad_norm': 0.6518395917982522, 'learning_rate': 2.771355775596779e-07, 'epoch': 0.9}
 90%|████████▉ | 19815/22095 [34:17:12<2:09:24, 3.41s/it] {'loss': 0.3159, 'grad_norm': 0.6309417840869341, 'learning_rate': 2.768950090175315e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 90%|████████▉ | 19816/22095 [34:17:18<2:43:03, 4.29s/it] {'loss': 0.4729, 'grad_norm': 0.28977489773166926, 'learning_rate': 2.7665454196040665e-07, 'epoch': 0.9}
 90%|████████▉ | 19817/22095 [34:17:28<3:41:25, 5.83s/it] {'loss': 0.4647, 'grad_norm': 0.2650338371075625, 'learning_rate': 2.76414176393468e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 90%|████████▉ | 19818/22095 [34:17:31<3:14:49, 5.13s/it] {'loss': 0.2825, 'grad_norm': 0.5966383668175729, 'learning_rate': 2.7617391232188207e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 90%|████████▉ | 19819/22095 [34:17:35<3:00:45, 4.77s/it] {'loss': 0.2948, 'grad_norm': 0.7291846411758335, 'learning_rate': 2.7593374975081075e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 90%|████████▉ | 19820/22095 [34:17:43<3:32:29, 5.60s/it] {'loss': 0.4718, 'grad_norm': 0.29386027248409563, 'learning_rate': 2.7569368868541333e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (59540 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95115 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19821/22095 [34:17:46<3:11:47, 5.06s/it] {'loss': 0.2841, 'grad_norm': 0.5366175112637515, 'learning_rate': 2.75453729130849e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8379633 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 6, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46418, 'image': 'vrdu_table_final_2/astro-ph.CO/ea91da11-b52a-4078-89ae-d7b5bae850b9.png', 'image_wh': [[23, 6]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}ccccccccccc@{}}\n...\n\\end{tabular}\n```"}]}
 90%|████████▉ | 19822/22095 [34:17:49<2:48:51, 4.46s/it] {'loss': 0.2648, 'grad_norm': 0.5773925409926636, 'learning_rate': 2.752138710922747e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (101740 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65019 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19823/22095 [34:17:52<2:30:18, 3.97s/it] {'loss': 0.2812, 'grad_norm': 0.6528258350735204, 'learning_rate': 2.74974114574843e-07, 'epoch': 0.9}
 90%|████████▉ | 19824/22095 [34:17:56<2:29:39, 3.95s/it] {'loss': 0.2388, 'grad_norm': 0.5804721946337439, 'learning_rate': 2.747344595837048e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (66983 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83603 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (124615 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19825/22095 [34:18:00<2:24:11, 3.81s/it] {'loss': 0.3212, 'grad_norm': 0.5968044745916934, 'learning_rate': 2.74494906124011e-07, 'epoch': 0.9}
 90%|████████▉ | 19826/22095 [34:18:03<2:16:39, 3.61s/it] {'loss': 0.2932, 'grad_norm': 0.6206777396994065, 'learning_rate': 2.7425545420090906e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (45194 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85688 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41953 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48043 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44002 > 40960). Running this sequence through the model will result in indexing errors
 90%|████████▉ | 19827/22095 [34:18:06<2:11:09, 3.47s/it] {'loss': 0.2769, 'grad_norm': 0.5662554206299493, 'learning_rate': 2.7401610381954325e-07, 'epoch': 0.9}
 90%|████████▉ | 19828/22095 [34:18:11<2:32:14, 4.03s/it] {'loss': 0.3565, 'grad_norm': 0.6144031913927165, 'learning_rate': 2.7377685498505557e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 90%|████████▉ | 19829/22095 [34:18:20<3:27:56, 5.51s/it] {'loss': 0.4778, 'grad_norm': 0.2641253441420351, 'learning_rate': 2.7353770770258915e-07, 'epoch': 0.9}
 90%|████████▉ | 19830/22095 [34:18:23<3:02:36, 4.84s/it] {'loss': 0.3096, 'grad_norm': 0.5820930843681503, 'learning_rate': 2.7329866197727983e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 90%|████████▉ | 19831/22095 [34:18:32<3:42:25, 5.89s/it] {'loss': 0.4733, 'grad_norm': 0.274671774086108, 'learning_rate': 2.7305971781426634e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8387348 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54159, 'image': 'vrdu_table_final_2/astro-ph.CO/123568ed-4b60-4d52-9cc5-9f65c6b39b4d.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 90%|████████▉ | 19832/22095 [34:18:36<3:20:44, 5.32s/it] {'loss': 0.2492, 'grad_norm': 0.586907469338291, 'learning_rate': 2.728208752186817e-07, 'epoch': 0.9} 90%|████████▉ | 19832/22095 [34:18:36<3:20:44, 5.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (51878 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89360 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50203 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98560 > 40960). 
Running this sequence through the model will result in indexing errors 90%|████████▉ | 19833/22095 [34:18:45<4:07:11, 6.56s/it] {'loss': 0.4904, 'grad_norm': 0.2651039284392432, 'learning_rate': 2.725821341956575e-07, 'epoch': 0.9} 90%|████████▉ | 19833/22095 [34:18:45<4:07:11, 6.56s/it] 90%|████████▉ | 19834/22095 [34:18:49<3:35:31, 5.72s/it] {'loss': 0.2695, 'grad_norm': 0.558027137087256, 'learning_rate': 2.7234349475032395e-07, 'epoch': 0.9} 90%|████████▉ | 19834/22095 [34:18:49<3:35:31, 5.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|████████▉ | 19835/22095 [34:18:59<4:21:31, 6.94s/it] {'loss': 0.4558, 'grad_norm': 0.2854868461334528, 'learning_rate': 2.7210495688781037e-07, 'epoch': 0.9} 90%|████████▉ | 19835/22095 [34:18:59<4:21:31, 6.94s/it] 90%|████████▉ | 19836/22095 [34:19:08<4:42:55, 7.51s/it] {'loss': 0.4828, 'grad_norm': 0.2792178849027721, 'learning_rate': 2.7186652061323924e-07, 'epoch': 0.9} 90%|████████▉ | 19836/22095 [34:19:08<4:42:55, 7.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|████████▉ | 19837/22095 [34:19:11<3:56:42, 6.29s/it] {'loss': 0.2715, 'grad_norm': 0.6265739449917149, 'learning_rate': 2.716281859317349e-07, 'epoch': 0.9} 90%|████████▉ | 19837/22095 [34:19:11<3:56:42, 6.29s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [188, 24, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8893394 in VC:s3://multi-modal/playground/data/geoqa+/. 
Exception: Image size [188, 24, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16547, 'image': 'images/5291.png', 'image_wh': [[188, 24]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果线段AB=10cm,M为AB的中点,N点在AB上,Nb=2cm,则线段Mn的长度为()\nA. 5cm\nB. 4cm\nC. 3cm\nD. 2cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,M是AB中点,∴BM=\\frac{1}{2}AB=5cm,又∵NB=2cm,∴MN=BM-BN=5-2=3cm.'}]} 90%|████████▉ | 19838/22095 [34:19:15<3:30:26, 5.59s/it] {'loss': 0.3151, 'grad_norm': 0.6166940853718327, 'learning_rate': 2.713899528484193e-07, 'epoch': 0.9} 90%|████████▉ | 19838/22095 [34:19:15<3:30:26, 5.59s/it] 90%|████████▉ | 19839/22095 [34:19:18<3:02:03, 4.84s/it] {'loss': 0.3109, 'grad_norm': 0.6109203277092121, 'learning_rate': 2.7115182136841166e-07, 'epoch': 0.9} 90%|████████▉ | 19839/22095 [34:19:18<3:02:03, 4.84s/it] 90%|████████▉ | 19840/22095 [34:19:22<2:45:38, 4.41s/it] {'loss': 0.3114, 'grad_norm': 0.5714322175597286, 'learning_rate': 2.7091379149682683e-07, 'epoch': 0.9} 90%|████████▉ | 19840/22095 [34:19:22<2:45:38, 4.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|████████▉ | 19841/22095 [34:19:33<4:01:47, 6.44s/it] {'loss': 0.4597, 'grad_norm': 0.2799105941790582, 'learning_rate': 2.7067586323878014e-07, 'epoch': 0.9} 90%|████████▉ | 19841/22095 [34:19:33<4:01:47, 6.44s/it] 90%|████████▉ | 19842/22095 [34:19:37<3:42:15, 5.92s/it] {'loss': 0.2858, 'grad_norm': 0.6264111829057746, 'learning_rate': 2.704380365993847e-07, 'epoch': 0.9} 90%|████████▉ | 19842/22095 [34:19:37<3:42:15, 5.92s/it] 90%|████████▉ | 19843/22095 [34:19:41<3:13:24, 5.15s/it] {'loss': 0.3181, 'grad_norm': 0.6384133066448844, 'learning_rate': 2.7020031158375037e-07, 'epoch': 0.9} 90%|████████▉ | 19843/22095 [34:19:41<3:13:24, 5.15s/it] 90%|████████▉ | 19844/22095 [34:19:45<2:58:53, 4.77s/it] {'loss': 0.3096, 'grad_norm': 0.6823010909322235, 'learning_rate': 2.699626881969841e-07, 'epoch': 0.9} 90%|████████▉ | 19844/22095 [34:19:45<2:58:53, 4.77s/it] 
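The recurring `ValueError: Image size ... is too small. Minimum size is 28` failures all come from samples whose width or height (the `image_wh` field in each "Problematic sample" record) is below 28 px, which is the smallest side Qwen2.5-VL's vision tower can process (14 px patches with 2x2 patch merging). A minimal sketch of pre-filtering such samples from the dataset metadata before training, assuming the same `image_wh` layout seen in the log; the function name and return shape are illustrative, not the trainer's actual code:

```python
# Minimal sketch: drop samples whose image is smaller than the model's
# minimum side length (28 px for Qwen2.5-VL: 14 px patches, 2x2 merge).
# Assumes each sample carries the 'image_wh' metadata seen in the log.
MIN_SIDE = 28

def filter_small_images(samples, min_side=MIN_SIDE):
    """Split samples into (kept, dropped) by minimum image side length."""
    kept, dropped = [], []
    for sample in samples:
        # 'image_wh' is a list of [width, height] pairs, one per image.
        sizes = sample.get("image_wh", [])
        if all(w >= min_side and h >= min_side for w, h in sizes):
            kept.append(sample)
        else:
            dropped.append(sample)
    return kept, dropped

if __name__ == "__main__":
    samples = [
        {"id": 54159, "image_wh": [[14, 23]]},   # too small, like the log's failure
        {"id": 98181, "image_wh": [[709, 25]]},  # height 25 < 28, also rejected
        {"id": 1, "image_wh": [[640, 480]]},     # fine
    ]
    kept, dropped = filter_small_images(samples)
    print(len(kept), len(dropped))  # 1 2
```

Running a pass like this offline would avoid the `[Try #0] Failed to fetch sample ...` retries at training time; note the rejection is per-side, so a wide-but-short 709x25 crop is rejected just like a 14x23 one.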
90%|████████▉ | 19845/22095 [34:19:48<2:41:06, 4.30s/it] {'loss': 0.2549, 'grad_norm': 0.6476981453613727, 'learning_rate': 2.6972516644419356e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8915457 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38610, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
90%|████████▉ | 19846/22095 [34:19:51<2:29:14, 3.98s/it] {'loss': 0.3313, 'grad_norm': 0.6620153827854617, 'learning_rate': 2.6948774633048016e-07, 'epoch': 0.9}
90%|████████▉ | 19847/22095 [34:19:55<2:27:02, 3.92s/it] {'loss': 0.295, 'grad_norm': 0.748192870105131, 'learning_rate': 2.69250427860947e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [709, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8523172 in VC:s3://internvl-moe-sft-data/. Exception: Image size [709, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 98181, 'image': 'vrdu_texteq/astro-ph.CO/b4b1f7c4-7d9e-4a2c-ae19-e2d3bf431a12.png', 'image_wh': [[709, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease recognize the text in the image.'}, {'from': 'gpt', 'value': 'which is made of four blocks of $N_k\\times N_k$ matrices that are'}]}
90%|████████▉ | 19848/22095 [34:19:58<2:16:25, 3.64s/it] {'loss': 0.2879, 'grad_norm': 0.5661829688022068, 'learning_rate': 2.690132110406929e-07, 'epoch': 0.9}
90%|████████▉ | 19849/22095 [34:20:02<2:19:56, 3.74s/it] {'loss': 0.3005, 'grad_norm': 0.602374880920724, 'learning_rate': 2.687760958748137e-07, 'epoch': 0.9}
90%|████████▉ | 19850/22095 [34:20:05<2:10:20, 3.48s/it] {'loss': 0.3028, 'grad_norm': 0.7271757380328313, 'learning_rate': 2.6853908236840586e-07, 'epoch': 0.9}
90%|████████▉ | 19851/22095 [34:20:08<2:03:41, 3.31s/it] {'loss': 0.2946, 'grad_norm': 0.6395083076273245, 'learning_rate': 2.68302170526562e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8956347 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7182, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
90%|████████▉ | 19852/22095 [34:20:13<2:24:25, 3.86s/it] {'loss': 0.268, 'grad_norm': 0.5677435398394265, 'learning_rate': 2.680653603543726e-07, 'epoch': 0.9}
90%|████████▉ | 19853/22095 [34:20:16<2:20:47, 3.77s/it] {'loss': 0.3044, 'grad_norm': 0.6325083354864175, 'learning_rate': 2.678286518569245e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [414, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8523753 in VC:s3://internvl-moe-sft-data/. Exception: Image size [414, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41772, 'image': 'vrdu_texteq/astro-ph.CO/21d615d3-7f0b-40d3-87e4-bef291342490.png', 'image_wh': [[414, 25]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': 'The acoustic scale $l_A$ is defined as'}]}
90%|████████▉ | 19854/22095 [34:20:19<2:09:17, 3.46s/it] {'loss': 0.3017, 'grad_norm': 0.5950952403576475, 'learning_rate': 2.675920450393049e-07, 'epoch': 0.9}
90%|████████▉ | 19855/22095 [34:20:23<2:18:41, 3.71s/it] {'loss': 0.2637, 'grad_norm': 0.6290601694265202, 'learning_rate': 2.673555399065986e-07, 'epoch': 0.9}
90%|████████▉ | 19856/22095 [34:20:27<2:16:21, 3.65s/it] {'loss': 0.2924, 'grad_norm': 0.6594481525414645, 'learning_rate': 2.6711913646388645e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047747 in VC:s3://multi-modal/UniGeo/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1\nB. 1.5\nC. 2\nD. 0.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|████████▉ | 19857/22095 [34:20:30<2:15:49, 3.64s/it] {'loss': 0.3086, 'grad_norm': 0.6759843523758717, 'learning_rate': 2.6688283471624775e-07, 'epoch': 0.9}
90%|████████▉ | 19858/22095 [34:20:35<2:22:04, 3.81s/it] {'loss': 0.2711, 'grad_norm': 0.6150657269648987, 'learning_rate': 2.666466346687607e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8884029 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [223, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7182, 'image': 'images/5329.png', 'image_wh': [[223, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,在L线上依次取点A、B、C,AB=5cm,BC=3cm。如果O是AC段的中点,则OB段的长度为()\nA. 1cm\nB. 1.5cm\nC. 2cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
90%|████████▉ | 19859/22095 [34:20:38<2:11:47, 3.54s/it] {'loss': 0.2714, 'grad_norm': 0.683435761376238, 'learning_rate': 2.6641053632649907e-07, 'epoch': 0.9}
90%|████████▉ | 19860/22095 [34:20:41<2:16:02, 3.65s/it] {'loss': 0.3114, 'grad_norm': 0.6200624707524008, 'learning_rate': 2.661745396945381e-07, 'epoch': 0.9}
90%|████████▉ | 19861/22095 [34:20:45<2:13:46, 3.59s/it] {'loss': 0.2759, 'grad_norm': 0.5928210284723426, 'learning_rate': 2.6593864477794716e-07, 'epoch': 0.9}
90%|████████▉ | 19862/22095 [34:20:48<2:12:24, 3.56s/it] {'loss': 0.2771, 'grad_norm': 0.6245841851464607, 'learning_rate': 2.65702851581795e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (44431 > 40960). Running this sequence through the model will result in indexing errors
90%|████████▉ | 19863/22095 [34:20:51<2:05:06, 3.36s/it] {'loss': 0.3323, 'grad_norm': 0.6491764300200128, 'learning_rate': 2.654671601111475e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (67232 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108897 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49624 > 40960). Running this sequence through the model will result in indexing errors
90%|████████▉ | 19864/22095 [34:20:55<2:08:06, 3.45s/it] {'loss': 0.3018, 'grad_norm': 0.5722118069854693, 'learning_rate': 2.652315703710712e-07, 'epoch': 0.9}
90%|████████▉ | 19865/22095 [34:20:59<2:09:34, 3.49s/it] {'loss': 0.2881, 'grad_norm': 0.5964576792561528, 'learning_rate': 2.649960823666259e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|████████▉ | 19866/22095 [34:21:02<2:12:08, 3.56s/it] {'loss': 0.316, 'grad_norm': 0.5918655001352278, 'learning_rate': 2.64760696102872e-07, 'epoch': 0.9}
90%|████████▉ | 19867/22095 [34:21:05<2:05:52, 3.39s/it] {'loss': 0.3112, 'grad_norm': 0.5898085749442864, 'learning_rate': 2.6452541158486776e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|████████▉ | 19868/22095 [34:21:08<2:01:11, 3.26s/it] {'loss': 0.254, 'grad_norm': 0.6609907398862443, 'learning_rate': 2.642902288176696e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [514, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8461979 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [514, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 101253, 'image': 'vrdu_texteq/astro-ph.CO/ecb29b38-c352-487c-8d18-e0ed1904eace.png', 'image_wh': [[514, 25]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'and $H_\\Lambda$ contains the self-interactions for $v$'}]}
90%|████████▉ | 19869/22095 [34:21:11<2:00:13, 3.24s/it] {'loss': 0.2878, 'grad_norm': 0.6117096439894077, 'learning_rate': 2.640551478063286e-07, 'epoch': 0.9}
90%|████████▉ | 19870/22095 [34:21:15<2:04:20, 3.35s/it] {'loss': 0.2783, 'grad_norm': 0.6931583049749798, 'learning_rate': 2.638201685558972e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|████████▉ | 19871/22095 [34:21:22<2:47:43, 4.53s/it] {'loss': 0.4694, 'grad_norm': 0.2860786343684956, 'learning_rate': 2.6358529107142485e-07, 'epoch': 0.9}
90%|████████▉ | 19872/22095 [34:21:32<3:41:37, 5.98s/it] {'loss': 0.4533, 'grad_norm': 0.2578641755720357, 'learning_rate': 2.63350515357958e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 364, but got module 1
90%|████████▉ | 19873/22095 [34:21:36<3:26:01, 5.56s/it] {'loss': 0.3173, 'grad_norm': 0.5993616311568444, 'learning_rate': 2.6311584142054036e-07, 'epoch': 0.9}
90%|████████▉ | 19874/22095 [34:21:40<3:10:43, 5.15s/it] {'loss': 0.2741, 'grad_norm': 0.6154401783980245, 'learning_rate': 2.6288126926421576e-07, 'epoch': 0.9}
90%|████████▉ | 19875/22095 [34:21:44<2:48:13, 4.55s/it] {'loss': 0.2823, 'grad_norm': 0.6898472911804348, 'learning_rate': 2.626467988940229e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (61135 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49358 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43200 > 40960). Running this sequence through the model will result in indexing errors
90%|████████▉ | 19876/22095 [34:21:46<2:27:59, 4.00s/it] {'loss': 0.2818, 'grad_norm': 0.6204420429013519, 'learning_rate': 2.624124303150011e-07, 'epoch': 0.9}
90%|████████▉ | 19877/22095 [34:21:49<2:17:59, 3.73s/it] {'loss': 0.2975, 'grad_norm': 0.6136446340161927, 'learning_rate': 2.621781635321863e-07, 'epoch': 0.9}
90%|████████▉ | 19878/22095 [34:21:52<2:10:13, 3.52s/it] {'loss': 0.2961, 'grad_norm': 0.6612494059694828, 'learning_rate': 2.6194399855061056e-07, 'epoch': 0.9}
90%|████████▉ | 19879/22095 [34:21:55<2:04:13, 3.36s/it] {'loss': 0.2769, 'grad_norm': 0.6256958878919391, 'learning_rate': 2.6170993537530665e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (53171 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46490 > 40960) for 4 sample(s). Truncating to 4388 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (102809 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60069 > 40960). Running this sequence through the model will result in indexing errors
90%|████████▉ | 19880/22095 [34:21:59<2:10:49, 3.54s/it] {'loss': 0.2967, 'grad_norm': 0.6156127701428498, 'learning_rate': 2.6147597401130433e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|████████▉ | 19881/22095 [34:22:02<2:03:30, 3.35s/it] {'loss': 0.2683, 'grad_norm': 0.5999686058867921, 'learning_rate': 2.612421144636301e-07, 'epoch': 0.9}
90%|████████▉ | 19882/22095 [34:22:05<2:00:47, 3.28s/it] {'loss': 0.2729, 'grad_norm': 0.6345276232974958, 'learning_rate': 2.610083567373078e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (77349 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45707 > 40960). Running this sequence through the model will result in indexing errors
90%|████████▉ | 19883/22095 [34:22:09<2:05:54, 3.42s/it] {'loss': 0.2669, 'grad_norm': 0.6370424196514467, 'learning_rate': 2.6077470083736176e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|████████▉ | 19884/22095 [34:22:20<3:25:20, 5.57s/it] {'loss': 0.4509, 'grad_norm': 0.25297348773573924, 'learning_rate': 2.6054114676881237e-07, 'epoch': 0.9}
90%|████████▉ | 19885/22095 [34:22:23<3:03:07, 4.97s/it] {'loss': 0.2861, 'grad_norm': 0.6323420343265559, 'learning_rate': 2.6030769453667783e-07, 'epoch': 0.9}
90%|█████████ | 19886/22095 [34:22:27<2:50:53, 4.64s/it] {'loss': 0.3002, 'grad_norm': 0.6696787676826822, 'learning_rate': 2.60074344145973e-07, 'epoch': 0.9}
90%|█████████ | 19887/22095 [34:22:31<2:37:35, 4.28s/it] {'loss': 0.3064, 'grad_norm': 0.6649350361638969, 'learning_rate': 2.5984109560171387e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19888/22095 [34:22:36<2:45:59, 4.51s/it] {'loss': 0.4569, 'grad_norm': 0.2588500924076885, 'learning_rate': 2.5960794890891093e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (50346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (132858 > 40960). Running this sequence through the model will result in indexing errors
90%|█████████ | 19889/22095 [34:22:40<2:38:17, 4.31s/it] {'loss': 0.3292, 'grad_norm': 0.6195830757026074, 'learning_rate': 2.593749040725746e-07, 'epoch': 0.9}
90%|█████████ | 19890/22095 [34:22:42<2:22:18, 3.87s/it] {'loss': 0.2796, 'grad_norm': 0.5795950758623272, 'learning_rate': 2.5914196109771197e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19891/22095 [34:22:49<2:54:41, 4.76s/it] {'loss': 0.4723, 'grad_norm': 0.2654230841344855, 'learning_rate': 2.5890911998932735e-07, 'epoch': 0.9}
90%|█████████ | 19892/22095 [34:22:53<2:42:24, 4.42s/it] {'loss': 0.2968, 'grad_norm': 0.5663503785166497, 'learning_rate': 2.5867638075242454e-07, 'epoch': 0.9}
90%|█████████ | 19893/22095 [34:22:56<2:30:40, 4.11s/it] {'loss': 0.3256, 'grad_norm': 0.6279957619169866, 'learning_rate': 2.5844374339200505e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|█████████ | 19894/22095 [34:22:59<2:18:51, 3.79s/it] {'loss': 0.2999, 'grad_norm': 0.5559620151418101, 'learning_rate': 2.5821120791306665e-07, 'epoch': 0.9}
90%|█████████ | 19895/22095 [34:23:03<2:13:59, 3.65s/it] {'loss': 0.2995, 'grad_norm': 0.7242133638005693, 'learning_rate': 2.579787743206058e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8348837 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 15507, 'image': 'vrdu_table_final_2/astro-ph.CO/dc49bd9d-fd0f-48b4-8bf9-58c67b6f10ee.png', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$x_0$\\end{tabular}\n```"}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|█████████ | 19896/22095 [34:23:06<2:07:01, 3.47s/it] {'loss': 0.2885, 'grad_norm': 0.6079929095062804, 'learning_rate': 2.5774644261961746e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19897/22095 [34:23:12<2:38:25, 4.32s/it] {'loss': 0.4523, 'grad_norm': 0.25827572484656686, 'learning_rate': 2.5751421281509426e-07, 'epoch': 0.9}
90%|█████████ | 19898/22095 [34:23:15<2:25:09, 3.96s/it] {'loss': 0.2691, 'grad_norm': 0.5761463549202924, 'learning_rate': 2.572820849120239e-07, 'epoch': 0.9}
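The interleaved "Rank 0: Number of image tokens X does not match number of images Y" / "Fixed image tokens in the conversation" pairs indicate a repair pass that reconciles the count of image placeholder tokens in the prompt with the number of attached images. A minimal sketch of such a check, assuming a `<image>` placeholder string and a strip-and-prepend repair strategy; both are assumptions for illustration, not the actual `data_qwen_2.py` logic:

```python
# Minimal sketch of the consistency check behind the log's "Number of image
# tokens X does not match number of images Y" messages. The placeholder
# string and the repair strategy are assumptions, not the trainer's code.
IMAGE_TOKEN = "<image>"  # assumed placeholder token

def fix_image_tokens(text, num_images, token=IMAGE_TOKEN):
    """Return (possibly repaired text, whether a repair was applied)."""
    found = text.count(token)
    if found == num_images:
        return text, False
    # Strip all placeholders, then prepend the correct number, one per image.
    stripped = text.replace(token, "").lstrip("\n")
    repaired = "\n".join([token] * num_images + [stripped])
    return repaired, True

if __name__ == "__main__":
    # The common case in this log: 0 placeholders but 1 image attached.
    text, fixed = fix_image_tokens("Describe the picture.", num_images=1)
    print(fixed)                     # True
    print(text.count(IMAGE_TOKEN))   # 1
```

Logging the mismatch before repairing, as this run does, is useful: a systematic count of these messages per data source would show which conversion pipeline dropped the placeholders in the first place.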
90%|█████████ | 19899/22095 [34:23:18<2:16:33, 3.73s/it] {'loss': 0.2806, 'grad_norm': 0.6240479186911713, 'learning_rate': 2.5705005891539516e-07, 'epoch': 0.9}
90%|█████████ | 19900/22095 [34:23:22<2:11:43, 3.60s/it] {'loss': 0.2929, 'grad_norm': 0.6456703003520794, 'learning_rate': 2.5681813483019515e-07, 'epoch': 0.9}
90%|█████████ | 19901/22095 [34:23:25<2:12:07, 3.61s/it] {'loss': 0.2685, 'grad_norm': 0.624521209277269, 'learning_rate': 2.565863126614049e-07, 'epoch': 0.9}
90%|█████████ | 19902/22095 [34:23:29<2:16:14, 3.73s/it] {'loss': 0.3145, 'grad_norm': 0.6036744838044934, 'learning_rate': 2.563545924140065e-07, 'epoch': 0.9}
90%|█████████ | 19903/22095 [34:23:32<2:04:37, 3.41s/it] {'loss': 0.2643, 'grad_norm': 0.6375461561357829, 'learning_rate': 2.5612297409297937e-07, 'epoch': 0.9}
90%|█████████ | 19904/22095 [34:23:35<1:59:42, 3.28s/it] {'loss': 0.3051, 'grad_norm': 0.6038009033673213, 'learning_rate': 2.558914577032995e-07, 'epoch': 0.9}
90%|█████████ | 19905/22095 [34:23:38<1:56:52, 3.20s/it] {'loss': 0.2905, 'grad_norm': 0.5593653532095186, 'learning_rate': 2.5566004324994174e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|█████████ | 19906/22095 [34:23:41<1:52:31, 3.08s/it] {'loss': 0.2783, 'grad_norm': 0.5949014043160534, 'learning_rate': 2.554287307378794e-07, 'epoch': 0.9}
90%|█████████ | 19907/22095 [34:23:44<1:59:41, 3.28s/it] {'loss': 0.2942, 'grad_norm': 0.5745613856583127, 'learning_rate': 2.551975201720802e-07, 'epoch': 0.9}
90%|█████████ | 19908/22095 [34:23:48<1:59:07, 3.27s/it] {'loss': 0.2976, 'grad_norm': 0.6167366159363535, 'learning_rate': 2.5496641155751456e-07, 'epoch': 0.9}
90%|█████████ | 19909/22095 [34:23:51<1:57:42, 3.23s/it] {'loss': 0.295, 'grad_norm': 0.6439202150247513, 'learning_rate': 2.5473540489914794e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (41927 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60732 > 40960). Running this sequence through the model will result in indexing errors
90%|█████████ | 19910/22095 [34:23:53<1:51:36, 3.06s/it] {'loss': 0.3307, 'grad_norm': 0.6657246782177099, 'learning_rate': 2.5450450020194306e-07, 'epoch': 0.9}
90%|█████████ | 19911/22095 [34:23:58<2:03:51, 3.40s/it] {'loss': 0.2642, 'grad_norm': 0.6133423808228226, 'learning_rate': 2.542736974708615e-07, 'epoch': 0.9}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8946831 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69984, 'image': 'images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C和D是AB段上的两点,Cd=3c m,M是AC的中点,N是DB的中点,AB=9.8cm,则Mn段的长度等于()\nA. 5.4cm\nB. 6.4cm\nC. 6.8cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
90%|█████████ | 19912/22095 [34:24:01<1:57:57, 3.24s/it] {'loss': 0.3237, 'grad_norm': 0.686243999667497, 'learning_rate': 2.5404299671086264e-07, 'epoch': 0.9}
90%|█████████ | 19913/22095 [34:24:03<1:53:19, 3.12s/it] {'loss': 0.2688, 'grad_norm': 0.6013825159558385, 'learning_rate': 2.538123979269047e-07, 'epoch': 0.9}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
90%|█████████ | 19914/22095 [34:24:06<1:51:36, 3.07s/it] {'loss': 0.2772, 'grad_norm': 0.5847798766604101, 'learning_rate': 2.5358190112394097e-07, 'epoch': 0.9}
90%|█████████ | 19915/22095 [34:24:10<1:57:17, 3.23s/it] {'loss': 0.2988, 'grad_norm': 0.6089930596789058, 'learning_rate': 2.5335150630692476e-07, 'epoch': 0.9}
90%|█████████ | 19916/22095 [34:24:14<2:06:45, 3.49s/it] {'loss': 0.2939, 'grad_norm': 0.5788620985116087, 'learning_rate': 2.5312121348080643e-07, 'epoch': 0.9}
90%|█████████ | 19917/22095 [34:24:18<2:08:58, 3.55s/it] {'loss': 0.2769, 'grad_norm': 0.6364074845395432, 'learning_rate': 2.528910226505338e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19918/22095 [34:24:27<3:12:32, 5.31s/it] {'loss': 0.4693, 'grad_norm': 0.2567038459793554, 'learning_rate': 2.5266093382105395e-07, 'epoch': 0.9}
90%|█████████ | 19919/22095 [34:24:31<2:59:14, 4.94s/it] {'loss': 0.2732, 'grad_norm': 0.7494711890048615, 'learning_rate': 2.5243094699731076e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (71731 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53577 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (132447 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144696 > 40960). Running this sequence through the model will result in indexing errors
90%|█████████ | 19920/22095 [34:24:35<2:44:56, 4.55s/it] {'loss': 0.2735, 'grad_norm': 0.5794564708348203, 'learning_rate': 2.522010621842447e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (49372 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51776 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45149 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (104090 > 40960). Running this sequence through the model will result in indexing errors 90%|█████████ | 19921/22095 [34:24:39<2:36:15, 4.31s/it] {'loss': 0.3306, 'grad_norm': 0.6298020059586386, 'learning_rate': 2.5197127938679567e-07, 'epoch': 0.9} 90%|█████████ | 19921/22095 [34:24:39<2:36:15, 4.31s/it] 90%|█████████ | 19922/22095 [34:24:43<2:33:06, 4.23s/it] {'loss': 0.3239, 'grad_norm': 0.6395789983688206, 'learning_rate': 2.5174159860990256e-07, 'epoch': 0.9} 90%|█████████ | 19922/22095 [34:24:43<2:33:06, 4.23s/it] 90%|█████████ | 19923/22095 [34:24:46<2:18:46, 3.83s/it] {'loss': 0.2741, 'grad_norm': 0.6277059924919379, 'learning_rate': 2.5151201985849915e-07, 'epoch': 0.9} 90%|█████████ | 19923/22095 [34:24:46<2:18:46, 3.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19924/22095 [34:24:48<2:08:17, 3.55s/it] {'loss': 0.2628, 'grad_norm': 0.7003457691753658, 'learning_rate': 2.512825431375177e-07, 'epoch': 0.9} 90%|█████████ | 19924/22095 [34:24:48<2:08:17, 3.55s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19925/22095 [34:24:51<2:02:37, 3.39s/it] {'loss': 0.2927, 'grad_norm': 0.5869953938455421, 'learning_rate': 2.510531684518902e-07, 'epoch': 0.9} 90%|█████████ | 19925/22095 [34:24:51<2:02:37, 3.39s/it] 90%|█████████ | 19926/22095 [34:24:55<1:59:05, 3.29s/it] {'loss': 0.2906, 'grad_norm': 0.6669116927869857, 'learning_rate': 2.508238958065451e-07, 'epoch': 0.9} 90%|█████████ | 19926/22095 [34:24:55<1:59:05, 3.29s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047219 in VC:s3://multi-modal/UniGeo/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 4\nB. 5\nC. 6\nD. 8'}, {'from': 'gpt', 'value': '【解答】解:∵D为线段CB的中点,CD=3,∴BC=2CD=6,∴AC=AB-BC=5.'}]} 90%|█████████ | 19927/22095 [34:24:59<2:06:51, 3.51s/it] {'loss': 0.3189, 'grad_norm': 0.6093736144281799, 'learning_rate': 2.505947252064089e-07, 'epoch': 0.9} 90%|█████████ | 19927/22095 [34:24:59<2:06:51, 3.51s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [203, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8530547 in VC:s3://internvl-moe-sft-data/. Exception: Image size [203, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 113284, 'image': 'vrdu_texteq/astro-ph.CO/449aa1dc-a1e5-4610-818c-bf8ce61d9cf0.png', 'image_wh': [[203, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'for some $S_1\\le S$.'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|█████████ | 19928/22095 [34:25:10<3:27:32, 5.75s/it] {'loss': 0.465, 'grad_norm': 0.25538566156125225, 'learning_rate': 2.5036565665640443e-07, 'epoch': 0.9} 90%|█████████ | 19928/22095 [34:25:10<3:27:32, 5.75s/it] 90%|█████████ | 19929/22095 [34:25:20<4:21:12, 7.24s/it] {'loss': 0.4727, 'grad_norm': 0.2582399895913366, 'learning_rate': 2.501366901614555e-07, 'epoch': 0.9} 90%|█████████ | 19929/22095 [34:25:20<4:21:12, 7.24s/it] 90%|█████████ | 19930/22095 [34:25:31<4:55:59, 8.20s/it] {'loss': 0.4667, 'grad_norm': 0.28190071012238044, 'learning_rate': 2.4990782572647977e-07, 'epoch': 0.9} 90%|█████████ | 19930/22095 [34:25:31<4:55:59, 8.20s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 90%|█████████ | 19931/22095 [34:25:35<4:08:50, 6.90s/it] {'loss': 0.3162, 'grad_norm': 1.11822190822124, 'learning_rate': 2.4967906335639725e-07, 'epoch': 0.9} 90%|█████████ | 19931/22095 [34:25:35<4:08:50, 6.90s/it] 90%|█████████ | 19932/22095 [34:25:38<3:32:10, 5.89s/it] {'loss': 0.3137, 'grad_norm': 0.6665944056354663, 'learning_rate': 2.494504030561223e-07, 'epoch': 0.9} 90%|█████████ | 19932/22095 [34:25:38<3:32:10, 5.89s/it] 90%|█████████ | 19933/22095 [34:25:41<3:05:25, 5.15s/it] {'loss': 0.2803, 'grad_norm': 0.5415064243105101, 'learning_rate': 2.4922184483056665e-07, 'epoch': 0.9} 90%|█████████ | 19933/22095 [34:25:41<3:05:25, 5.15s/it] 90%|█████████ | 19934/22095 [34:25:45<2:43:05, 4.53s/it] {'loss': 0.3501, 'grad_norm': 0.7574661488081733, 'learning_rate': 2.4899338868464404e-07, 'epoch': 0.9} 90%|█████████ | 19934/22095 [34:25:45<2:43:05, 4.53s/it] 
90%|█████████ | 19935/22095 [34:25:48<2:33:10, 4.26s/it] {'loss': 0.3302, 'grad_norm': 0.6385815253014705, 'learning_rate': 2.487650346232606e-07, 'epoch': 0.9} 90%|█████████ | 19935/22095 [34:25:48<2:33:10, 4.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74275 > 40960). Running this sequence through the model will result in indexing errors 90%|█████████ | 19936/22095 [34:25:51<2:19:33, 3.88s/it] {'loss': 0.2581, 'grad_norm': 0.5893896562648631, 'learning_rate': 2.485367826513258e-07, 'epoch': 0.9} 90%|█████████ | 19936/22095 [34:25:51<2:19:33, 3.88s/it] 90%|█████████ | 19937/22095 [34:25:56<2:26:03, 4.06s/it] {'loss': 0.3226, 'grad_norm': 0.5638720757545203, 'learning_rate': 2.483086327737411e-07, 'epoch': 0.9} 90%|█████████ | 19937/22095 [34:25:56<2:26:03, 4.06s/it] 90%|█████████ | 19938/22095 [34:26:00<2:23:52, 4.00s/it] {'loss': 0.2622, 'grad_norm': 0.5858279262262214, 'learning_rate': 2.48080584995411e-07, 'epoch': 0.9} 90%|█████████ | 19938/22095 [34:26:00<2:23:52, 4.00s/it] 90%|█████████ | 19939/22095 [34:26:03<2:15:19, 3.77s/it] {'loss': 0.3356, 'grad_norm': 0.6459975315468293, 'learning_rate': 2.4785263932123495e-07, 'epoch': 0.9} 90%|█████████ | 19939/22095 [34:26:03<2:15:19, 3.77s/it] 90%|█████████ | 19940/22095 [34:26:07<2:17:21, 3.82s/it] {'loss': 0.288, 'grad_norm': 0.5968005891204486, 'learning_rate': 2.4762479575610954e-07, 'epoch': 0.9} 90%|█████████ | 19940/22095 [34:26:07<2:17:21, 3.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41504 > 40960). 
Running this sequence through the model will result in indexing errors 90%|█████████ | 19941/22095 [34:26:11<2:17:23, 3.83s/it] {'loss': 0.2635, 'grad_norm': 0.6842359527286419, 'learning_rate': 2.47397054304932e-07, 'epoch': 0.9} 90%|█████████ | 19941/22095 [34:26:11<2:17:23, 3.83s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19942/22095 [34:26:14<2:11:57, 3.68s/it] {'loss': 0.293, 'grad_norm': 0.6616650535720541, 'learning_rate': 2.4716941497259563e-07, 'epoch': 0.9} 90%|█████████ | 19942/22095 [34:26:14<2:11:57, 3.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42311 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45692 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81750 > 40960). 
Running this sequence through the model will result in indexing errors 90%|█████████ | 19943/22095 [34:26:17<2:11:15, 3.66s/it] {'loss': 0.2953, 'grad_norm': 0.6865500924952155, 'learning_rate': 2.4694187776399094e-07, 'epoch': 0.9} 90%|█████████ | 19943/22095 [34:26:17<2:11:15, 3.66s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|█████████ | 19944/22095 [34:26:26<3:00:32, 5.04s/it] {'loss': 0.465, 'grad_norm': 0.24231874254102226, 'learning_rate': 2.4671444268400736e-07, 'epoch': 0.9} 90%|█████████ | 19944/22095 [34:26:26<3:00:32, 5.04s/it] 90%|█████████ | 19945/22095 [34:26:29<2:43:49, 4.57s/it] {'loss': 0.2587, 'grad_norm': 0.5940515750046157, 'learning_rate': 2.464871097375321e-07, 'epoch': 0.9} 90%|█████████ | 19945/22095 [34:26:29<2:43:49, 4.57s/it] 90%|█████████ | 19946/22095 [34:26:33<2:33:40, 4.29s/it] {'loss': 0.3054, 'grad_norm': 0.6427760010776417, 'learning_rate': 2.46259878929449e-07, 'epoch': 0.9} 90%|█████████ | 19946/22095 [34:26:33<2:33:40, 4.29s/it] 90%|█████████ | 19947/22095 [34:26:36<2:22:14, 3.97s/it] {'loss': 0.2891, 'grad_norm': 0.6293624346771375, 'learning_rate': 2.460327502646415e-07, 'epoch': 0.9} 90%|█████████ | 19947/22095 [34:26:36<2:22:14, 3.97s/it] 90%|█████████ | 19948/22095 [34:26:40<2:17:34, 3.84s/it] {'loss': 0.2955, 'grad_norm': 0.6467156463352277, 'learning_rate': 2.4580572374798997e-07, 'epoch': 0.9} 90%|█████████ | 19948/22095 [34:26:40<2:17:34, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|█████████ | 19949/22095 [34:26:46<2:48:51, 4.72s/it] {'loss': 0.4673, 'grad_norm': 0.2585270588356642, 'learning_rate': 2.455787993843711e-07, 'epoch': 0.9} 90%|█████████ | 19949/22095 [34:26:46<2:48:51, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19950/22095 [34:26:50<2:38:28, 4.43s/it] {'loss': 0.2961, 'grad_norm': 0.5902196062976237, 'learning_rate': 2.453519771786617e-07, 
'epoch': 0.9} 90%|█████████ | 19950/22095 [34:26:50<2:38:28, 4.43s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (126645 > 40960). Running this sequence through the model will result in indexing errors 90%|█████████ | 19951/22095 [34:27:01<3:50:21, 6.45s/it] {'loss': 0.4569, 'grad_norm': 0.2511985102721687, 'learning_rate': 2.451252571357365e-07, 'epoch': 0.9} 90%|█████████ | 19951/22095 [34:27:01<3:50:21, 6.45s/it] 90%|█████████ | 19952/22095 [34:27:05<3:20:01, 5.60s/it] {'loss': 0.2901, 'grad_norm': 0.6232119279137442, 'learning_rate': 2.4489863926046577e-07, 'epoch': 0.9} 90%|█████████ | 19952/22095 [34:27:05<3:20:01, 5.60s/it] 90%|█████████ | 19953/22095 [34:27:08<2:54:55, 4.90s/it] {'loss': 0.2961, 'grad_norm': 1.440194667225179, 'learning_rate': 2.446721235577182e-07, 'epoch': 0.9} 90%|█████████ | 19953/22095 [34:27:08<2:54:55, 4.90s/it] 90%|█████████ | 19954/22095 [34:27:12<2:38:13, 4.43s/it] {'loss': 0.3075, 'grad_norm': 0.615639529056928, 'learning_rate': 2.4444571003236216e-07, 'epoch': 0.9} 90%|█████████ | 19954/22095 [34:27:12<2:38:13, 4.43s/it] 90%|█████████ | 19955/22095 [34:27:16<2:34:12, 4.32s/it] {'loss': 0.2727, 'grad_norm': 0.593651072497511, 'learning_rate': 2.4421939868926325e-07, 'epoch': 0.9} 90%|█████████ | 19955/22095 [34:27:16<2:34:12, 4.32s/it] 90%|█████████ | 19956/22095 [34:27:19<2:24:43, 4.06s/it] {'loss': 0.2899, 'grad_norm': 0.6528840635575128, 'learning_rate': 2.4399318953328255e-07, 'epoch': 0.9} 90%|█████████ | 19956/22095 [34:27:19<2:24:43, 4.06s/it] 90%|█████████ | 19957/22095 [34:27:22<2:12:24, 3.72s/it] {'loss': 0.2697, 'grad_norm': 0.6198831311701227, 'learning_rate': 2.437670825692812e-07, 'epoch': 0.9} 90%|█████████ | 19957/22095 [34:27:22<2:12:24, 3.72s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 7800921 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': '22261', 'image': '29669.jpg', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n Here is the caption I wrote for the image.\n### Description:\n\nThe image depicts a geometric line segment diagram featuring a horizontal line with four designated points, labeled as F, P, R, and X, respectively. The labels are placed above the line at their corresponding positions.\n1. **Point F**: Positioned at the extreme left of the line segment.\n2. **Point P**: Situated between points F and R, the distance from F to P is noticeably shorter compared to the intervals between the other points.\n3. **Point R**: Located to the right of point P and to the left of point X, maintaining roughly equal distance from points P and X.\n4. **Point X**: Marked at the far right of the line segment.\n\nThe configuration of these points suggests a sequence from left to right as F, P, R, and X. The points are not evenly spaced, and there are no numerical values or measurements provided to indicate the exact distances. 
The labels and relative positioning indicate an illustrative aim, potentially to elucidate concepts related to points and line segments in geometry, such as the division of a line segment.\n\n### Deductive Analysis:\n\nFrom the layout and labels of the points, one can deduce several geometric properties or theorems, including the concept of betweenness of points on a line. This could serve to illustrate ideas like:\n1. **Betweenness Theorem**: Point P is between points F and R, and point R is between points P and X.\n2. **Segment Addition**: The total length of segment FX is the aggregate of the lengths of segments FP, PR, and RX. This relationship is represented as \\( FX = FP + PR + RX \\).\n\n### Answer the Question:\n\nIf the illustration fundamentally poses a question to discern the relationships between the points, the ensuing rationale applies:\n#### Chain of Thought (CoT):\n\n1. **Determine the Sequence**: The points are sequentially arranged from left to right as F, P, R, and X.\n2. **Define Betweenness**:\n - Point F signifies the commencement.\n - Point P is situated between F and R.\n - Point R lies between P and X.\n - Point X denotes the termination.\n3. **Estimate Segment Lengths** (if necessary):\n - Absent specific measurements, the main objective is to display relative positions rather than absolute distances.\n\nBy following these logical stages, one can comprehend the geometric relationships and roles of each point on the line segment, facilitating insight into its mathematical attributes.\n\nGiven these points, any inquiries pertaining to distances would necessitate specific numerical values for precise computations. For instance, if distances were given, one could calculate the exact lengths of segments FP, PR, and RX, culminating in the accurate length of FX. 
\nPlease assess if my caption is accurate, and if it’s not, please point out all the errors.'}, {'from': 'gpt', 'value': "Your description and analysis are well-structured and demonstrate a clear understanding of the geometric concepts involved. However, there are a few discrepancies between your description and the actual image provided later:\n\n1. **Point Labels**: You described points as F, P, R, and X, but the image shows points labeled as A, D, B, and C. This mismatch in labels could lead to confusion if someone is trying to follow along with the diagram.\n2. **Betweenness and Segment Addition**: While your explanation of these concepts is correct, it's based on the incorrect labels. The correct interpretation should refer to points A, D, B, and C for the betweenness theorem and segment addition.\n\nThe rest of your analysis, such as the sequence of points, the absence of numerical values, and the implications for calculations if such values were provided, is accurate and insightful. Just ensure that the point labels in your description match those in the actual image."}]} 90%|█████████ | 19958/22095 [34:27:26<2:15:15, 3.80s/it] {'loss': 0.3065, 'grad_norm': 0.6317778826705256, 'learning_rate': 2.4354107780211745e-07, 'epoch': 0.9} 90%|█████████ | 19958/22095 [34:27:26<2:15:15, 3.80s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [689, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8478192 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [689, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 150946, 'image': 'vrdu_texteq/astro-ph.CO/4d68f153-a6ed-4081-b952-0f28b575fafe.png', 'image_wh': [[689, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'A bound on $H_*$ is derived from the tensor-to-scalar ratio~'}]} 90%|█████████ | 19959/22095 [34:27:36<3:20:56, 5.64s/it] {'loss': 0.4926, 'grad_norm': 0.2854533424264698, 'learning_rate': 2.433151752366475e-07, 'epoch': 0.9} 90%|█████████ | 19959/22095 [34:27:36<3:20:56, 5.64s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (96204 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50982 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91226 > 40960). 
Running this sequence through the model will result in indexing errors 90%|█████████ | 19960/22095 [34:27:40<2:59:14, 5.04s/it] {'loss': 0.2814, 'grad_norm': 0.735995959989909, 'learning_rate': 2.4308937487772576e-07, 'epoch': 0.9} 90%|█████████ | 19960/22095 [34:27:40<2:59:14, 5.04s/it] 90%|█████████ | 19961/22095 [34:27:43<2:38:25, 4.45s/it] {'loss': 0.2507, 'grad_norm': 0.6361012382328806, 'learning_rate': 2.4286367673020396e-07, 'epoch': 0.9} 90%|█████████ | 19961/22095 [34:27:43<2:38:25, 4.45s/it] 90%|█████████ | 19962/22095 [34:27:47<2:33:08, 4.31s/it] {'loss': 0.2508, 'grad_norm': 0.6104055856919773, 'learning_rate': 2.4263808079893035e-07, 'epoch': 0.9} 90%|█████████ | 19962/22095 [34:27:47<2:33:08, 4.31s/it] 90%|█████████ | 19963/22095 [34:27:50<2:21:20, 3.98s/it] {'loss': 0.2905, 'grad_norm': 0.6575551325731162, 'learning_rate': 2.4241258708875336e-07, 'epoch': 0.9} 90%|█████████ | 19963/22095 [34:27:50<2:21:20, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70873 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46649 > 40960). 
Running this sequence through the model will result in indexing errors 90%|█████████ | 19964/22095 [34:27:53<2:13:28, 3.76s/it] {'loss': 0.3214, 'grad_norm': 0.5908448190267949, 'learning_rate': 2.4218719560451907e-07, 'epoch': 0.9} 90%|█████████ | 19964/22095 [34:27:53<2:13:28, 3.76s/it] 90%|█████████ | 19965/22095 [34:27:58<2:21:11, 3.98s/it] {'loss': 0.2715, 'grad_norm': 0.5794135894633232, 'learning_rate': 2.4196190635106917e-07, 'epoch': 0.9} 90%|█████████ | 19965/22095 [34:27:58<2:21:11, 3.98s/it] 90%|█████████ | 19966/22095 [34:28:02<2:26:59, 4.14s/it] {'loss': 0.2657, 'grad_norm': 0.703211043777139, 'learning_rate': 2.4173671933324373e-07, 'epoch': 0.9} 90%|█████████ | 19966/22095 [34:28:02<2:26:59, 4.14s/it] 90%|█████████ | 19967/22095 [34:28:05<2:13:51, 3.77s/it] {'loss': 0.2983, 'grad_norm': 0.6009922340041405, 'learning_rate': 2.415116345558832e-07, 'epoch': 0.9} 90%|█████████ | 19967/22095 [34:28:05<2:13:51, 3.77s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|█████████ | 19968/22095 [34:28:15<3:19:41, 5.63s/it] {'loss': 0.464, 'grad_norm': 0.2530597694122791, 'learning_rate': 2.4128665202382327e-07, 'epoch': 0.9} 90%|█████████ | 19968/22095 [34:28:15<3:19:41, 5.63s/it] 90%|█████████ | 19969/22095 [34:28:18<2:53:34, 4.90s/it] {'loss': 0.3327, 'grad_norm': 0.6404494730165863, 'learning_rate': 2.4106177174189724e-07, 'epoch': 0.9} 90%|█████████ | 19969/22095 [34:28:18<2:53:34, 4.90s/it] 90%|█████████ | 19970/22095 [34:28:21<2:33:54, 4.35s/it] {'loss': 0.2858, 'grad_norm': 0.6221944918677672, 'learning_rate': 2.408369937149374e-07, 'epoch': 0.9} 90%|█████████ | 19970/22095 [34:28:21<2:33:54, 4.35s/it] 90%|█████████ | 19971/22095 [34:28:25<2:25:17, 4.10s/it] {'loss': 0.3332, 'grad_norm': 0.7356818975973848, 'learning_rate': 2.4061231794777483e-07, 'epoch': 0.9} 90%|█████████ | 19971/22095 [34:28:25<2:25:17, 4.10s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 90%|█████████ | 19972/22095 
[34:28:35<3:27:09, 5.85s/it] {'loss': 0.4571, 'grad_norm': 0.2557326175187924, 'learning_rate': 2.4038774444523627e-07, 'epoch': 0.9} 90%|█████████ | 19972/22095 [34:28:35<3:27:09, 5.85s/it] 90%|█████████ | 19973/22095 [34:28:38<2:59:22, 5.07s/it] {'loss': 0.3138, 'grad_norm': 0.6719787234019912, 'learning_rate': 2.4016327321214614e-07, 'epoch': 0.9} 90%|█████████ | 19973/22095 [34:28:38<2:59:22, 5.07s/it] 90%|█████████ | 19974/22095 [34:28:41<2:40:36, 4.54s/it] {'loss': 0.3172, 'grad_norm': 0.5657826799708439, 'learning_rate': 2.3993890425332957e-07, 'epoch': 0.9} 90%|█████████ | 19974/22095 [34:28:41<2:40:36, 4.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19975/22095 [34:28:44<2:24:10, 4.08s/it] {'loss': 0.2936, 'grad_norm': 0.6515336390605506, 'learning_rate': 2.3971463757360537e-07, 'epoch': 0.9} 90%|█████████ | 19975/22095 [34:28:44<2:24:10, 4.08s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 90%|█████████ | 19976/22095 [34:28:48<2:25:21, 4.12s/it] {'loss': 0.3024, 'grad_norm': 0.630534874722075, 'learning_rate': 2.394904731777947e-07, 'epoch': 0.9} 90%|█████████ | 19976/22095 [34:28:48<2:25:21, 4.12s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79476 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41186 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (89821 > 40960). 
Running this sequence through the model will result in indexing errors 90%|█████████ | 19977/22095 [34:28:51<2:12:23, 3.75s/it] {'loss': 0.2706, 'grad_norm': 0.5926030588348037, 'learning_rate': 2.392664110707116e-07, 'epoch': 0.9} 90%|█████████ | 19977/22095 [34:28:51<2:12:23, 3.75s/it] 90%|█████████ | 19978/22095 [34:28:55<2:12:24, 3.75s/it] {'loss': 0.3356, 'grad_norm': 0.6027188615891163, 'learning_rate': 2.390424512571732e-07, 'epoch': 0.9} 90%|█████████ | 19978/22095 [34:28:55<2:12:24, 3.75s/it] 90%|█████████ | 19979/22095 [34:28:58<2:05:05, 3.55s/it] {'loss': 0.2613, 'grad_norm': 0.8008295264317392, 'learning_rate': 2.388185937419896e-07, 'epoch': 0.9} 90%|█████████ | 19979/22095 [34:28:58<2:05:05, 3.55s/it] 90%|█████████ | 19980/22095 [34:29:01<1:56:43, 3.31s/it] {'loss': 0.2557, 'grad_norm': 0.5926005579561792, 'learning_rate': 2.385948385299719e-07, 'epoch': 0.9} 90%|█████████ | 19980/22095 [34:29:01<1:56:43, 3.31s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59313 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61605 > 40960). Running this sequence through the model will result in indexing errors 90%|█████████ | 19981/22095 [34:29:04<1:59:02, 3.38s/it] {'loss': 0.2976, 'grad_norm': 0.6086515646248625, 'learning_rate': 2.3837118562592799e-07, 'epoch': 0.9} 90%|█████████ | 19981/22095 [34:29:04<1:59:02, 3.38s/it] 90%|█████████ | 19982/22095 [34:29:08<2:01:09, 3.44s/it] {'loss': 0.2795, 'grad_norm': 0.7498121647059611, 'learning_rate': 2.3814763503466175e-07, 'epoch': 0.9} 90%|█████████ | 19982/22095 [34:29:08<2:01:09, 3.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44764 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42501 > 40960). Running this sequence through the model will result in indexing errors
90%|█████████ | 19983/22095 [34:29:12<2:05:12, 3.56s/it] {'loss': 0.3199, 'grad_norm': 0.6778933569637745, 'learning_rate': 2.3792418676097884e-07, 'epoch': 0.9}
90%|█████████ | 19984/22095 [34:29:15<2:05:44, 3.57s/it] {'loss': 0.274, 'grad_norm': 0.6100026715479245, 'learning_rate': 2.3770084080967926e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19985/22095 [34:29:24<2:54:53, 4.97s/it] {'loss': 0.4535, 'grad_norm': 0.26096261402513665, 'learning_rate': 2.3747759718556308e-07, 'epoch': 0.9}
90%|█████████ | 19986/22095 [34:29:28<2:45:32, 4.71s/it] {'loss': 0.3258, 'grad_norm': 0.6154908976706324, 'learning_rate': 2.3725445589342534e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (89819 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52100 > 40960). Running this sequence through the model will result in indexing errors
90%|█████████ | 19987/22095 [34:29:32<2:39:15, 4.53s/it] {'loss': 0.2933, 'grad_norm': 0.6117443541239407, 'learning_rate': 2.3703141693806276e-07, 'epoch': 0.9}
90%|█████████ | 19988/22095 [34:29:36<2:33:44, 4.38s/it] {'loss': 0.3113, 'grad_norm': 0.7156865784325133, 'learning_rate': 2.368084803242654e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19989/22095 [34:29:45<3:24:21, 5.82s/it] {'loss': 0.4333, 'grad_norm': 0.4855626199280217, 'learning_rate': 2.3658564605682555e-07, 'epoch': 0.9}
90%|█████████ | 19990/22095 [34:29:49<3:02:43, 5.21s/it] {'loss': 0.2703, 'grad_norm': 0.6961645975967724, 'learning_rate': 2.3636291414053104e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19991/22095 [34:29:59<3:57:57, 6.79s/it] {'loss': 0.4774, 'grad_norm': 0.2763988693404931, 'learning_rate': 2.3614028458016581e-07, 'epoch': 0.9}
90%|█████████ | 19992/22095 [34:30:03<3:24:21, 5.83s/it] {'loss': 0.2983, 'grad_norm': 0.6112826169360427, 'learning_rate': 2.3591775738051491e-07, 'epoch': 0.9}
90%|█████████ | 19993/22095 [34:30:06<2:58:58, 5.11s/it] {'loss': 0.2626, 'grad_norm': 0.6104745671130538, 'learning_rate': 2.356953325463607e-07, 'epoch': 0.9}
90%|█████████ | 19994/22095 [34:30:09<2:35:51, 4.45s/it] {'loss': 0.2903, 'grad_norm': 0.7505273977344898, 'learning_rate': 2.354730100824809e-07, 'epoch': 0.9}
Invalidate trace cache @ step 2: expected module 1, but got module 364
90%|█████████ | 19995/22095 [34:30:20<3:38:24, 6.24s/it] {'loss': 0.4528, 'grad_norm': 0.26741496817940696, 'learning_rate': 2.3525078999365236e-07, 'epoch': 0.9}
Token indices sequence length is longer than the specified maximum sequence length for this model (87329 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 19996/22095 [34:30:27<3:50:48, 6.60s/it] {'loss': 0.4586, 'grad_norm': 0.28491452323675803, 'learning_rate': 2.3502867228465064e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 364, but got module 1
91%|█████████ | 19997/22095 [34:30:31<3:16:41, 5.63s/it] {'loss': 0.3186, 'grad_norm': 0.6851341238362276, 'learning_rate': 2.3480665696024974e-07, 'epoch': 0.91}
91%|█████████ | 19998/22095 [34:30:34<2:53:25, 4.96s/it] {'loss': 0.3063, 'grad_norm': 0.5780368717859541, 'learning_rate': 2.3458474402521747e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (130422 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41296 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74277 > 40960). Running this sequence through the model will result in indexing errors
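The token-length warnings above mean some tokenized samples exceed the model's 40960-token context before they ever reach the forward pass. A minimal pre-filter sketch, under stated assumptions: `filter_overlong` and its `tokenize` argument are hypothetical names, not part of the training repo, which instead truncates at runtime (the "Truncating to ..." message printed by Rank 0).

```python
from typing import Callable, Iterable, List

MAX_SEQ_LEN = 40960  # model limit quoted in the warnings ("... > 40960")

def filter_overlong(samples: Iterable[str],
                    tokenize: Callable[[str], List[int]],
                    max_len: int = MAX_SEQ_LEN) -> List[str]:
    """Keep only samples whose token count fits the context window.

    `tokenize` stands in for the real tokenizer's encode method; running
    an over-long sequence through the model would cause indexing errors,
    so such samples are dropped (or could instead be truncated) up front.
    """
    kept = []
    for text in samples:
        n = len(tokenize(text))
        if n <= max_len:
            kept.append(text)
        else:
            print(f"dropping sample: {n} > {max_len} tokens")
    return kept
```

Running this once over the dataset manifest would silence the per-step warnings and make the effective training set explicit, at the cost of one tokenization pass.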
91%|█████████ | 19999/22095 [34:30:38<2:42:40, 4.66s/it] {'loss': 0.2774, 'grad_norm': 0.5757468936890212, 'learning_rate': 2.343629334843228e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (53767 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (120163 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57421 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52882 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45856 > 40960) for 4 sample(s). Truncating to 91 with 1 samples.
91%|█████████ | 20000/22095 [34:30:42<2:36:43, 4.49s/it] {'loss': 0.3095, 'grad_norm': 0.6206281078686119, 'learning_rate': 2.3414122534233252e-07, 'epoch': 0.91}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
91%|█████████ | 20001/22095 [34:31:31<10:18:57, 17.74s/it] {'loss': 0.2821, 'grad_norm': 0.6001697472110555, 'learning_rate': 2.3391961960401055e-07, 'epoch': 0.91}
91%|█████████ | 20002/22095 [34:31:34<7:47:51, 13.41s/it] {'loss': 0.2946, 'grad_norm': 0.6455405388295047, 'learning_rate': 2.3369811627411755e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20003/22095 [34:31:37<5:57:54, 10.27s/it] {'loss': 0.2875, 'grad_norm': 0.5857060542849744, 'learning_rate': 2.3347671535741356e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308527 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2gwyxj46I8KJjy0FgXXXXzVXa_!!3401535694.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n能否将这幅图像中的所有文字详细地列出来?'}, {'from': 'gpt', 'value': '图中所有文字:\n嗨\n起来!!\n让宝宝\n增强身体发育\n增强宝宝平衡力、\n质量好\n玩不坏\n加粗碳钢车身'}]}
91%|█████████ | 20004/22095 [34:31:44<5:20:17, 9.19s/it] {'loss': 0.4448, 'grad_norm': 0.2701905458880112, 'learning_rate': 2.3325541685865538e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (45326 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20005/22095 [34:31:47<4:24:49, 7.60s/it] {'loss': 0.2884, 'grad_norm': 0.7333708834362386, 'learning_rate': 2.3303422078259918e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [281, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8444694 in VC:s3://internvl-moe-sft-data/. Exception: Image size [281, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 60602, 'image': 'vrdu_texteq/astro-ph.CO/5b1fab22-34b1-4ad8-956d-4602a0ce86f9.png', 'image_wh': [[281, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'and $ 1 $ meV $ = 10^{-3} \\; $ eV.'}]}
91%|█████████ | 20006/22095 [34:31:51<3:39:01, 6.29s/it] {'loss': 0.2768, 'grad_norm': 0.6560985783085491, 'learning_rate': 2.3281312713399618e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (75390 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20007/22095 [34:32:00<4:13:12, 7.28s/it] {'loss': 0.468, 'grad_norm': 0.24549467484021845, 'learning_rate': 2.325921359175981e-07, 'epoch': 0.91}
91%|█████████ | 20008/22095 [34:32:10<4:37:26, 7.98s/it] {'loss': 0.4485, 'grad_norm': 0.35618191466750615, 'learning_rate': 2.3237124713815285e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20009/22095 [34:32:14<3:56:08, 6.79s/it] {'loss': 0.28, 'grad_norm': 1.165867138539832, 'learning_rate': 2.3215046080040714e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (43910 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20010/22095 [34:32:17<3:22:30, 5.83s/it] {'loss': 0.265, 'grad_norm': 0.5878889928573706, 'learning_rate': 2.31929776909105e-07, 'epoch': 0.91}
91%|█████████ | 20011/22095 [34:32:22<3:04:28, 5.31s/it] {'loss': 0.2676, 'grad_norm': 0.6493016711033609, 'learning_rate': 2.3170919546898707e-07, 'epoch': 0.91}
91%|█████████ | 20012/22095 [34:32:25<2:48:14, 4.85s/it] {'loss': 0.3042, 'grad_norm': 0.6119097365861756, 'learning_rate': 2.3148871648479398e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20013/22095 [34:32:34<3:28:53, 6.02s/it] {'loss': 0.4837, 'grad_norm': 0.25708074637369255, 'learning_rate': 2.3126833996126364e-07, 'epoch': 0.91}
91%|█████████ | 20014/22095 [34:32:38<3:03:25, 5.29s/it] {'loss': 0.2769, 'grad_norm': 0.5724588237659539, 'learning_rate': 2.3104806590313055e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (98273 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57622 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84452 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20015/22095 [34:32:47<3:49:30, 6.62s/it] {'loss': 0.4353, 'grad_norm': 0.2631363793982775, 'learning_rate': 2.308278943151271e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047551 in VC:s3://multi-modal/UniGeo/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 5cm\nB. 15cm\nC. 16cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
91%|█████████ | 20016/22095 [34:32:52<3:26:06, 5.95s/it] {'loss': 0.2949, 'grad_norm': 0.6307812688200601, 'learning_rate': 2.3060782520198554e-07, 'epoch': 0.91}
91%|█████████ | 20017/22095 [34:32:55<2:54:18, 5.03s/it] {'loss': 0.3419, 'grad_norm': 0.8016890777574586, 'learning_rate': 2.3038785856843328e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20018/22095 [34:33:04<3:39:46, 6.35s/it] {'loss': 0.4753, 'grad_norm': 0.2989237934568109, 'learning_rate': 2.3016799441919756e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047934 in VC:s3://multi-modal/UniGeo/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,点C在线段AB上,点D是AC的中点,如果CD=4,AB=14,那么BC长度为()\nA. 6\nB. 6.5\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
91%|█████████ | 20019/22095 [34:33:08<3:18:02, 5.72s/it] {'loss': 0.2807, 'grad_norm': 0.6004554050861189, 'learning_rate': 2.2994823275900246e-07, 'epoch': 0.91}
91%|█████████ | 20020/22095 [34:33:13<3:02:20, 5.27s/it] {'loss': 0.3212, 'grad_norm': 0.6495330424530107, 'learning_rate': 2.2972857359256862e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8355174 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 21865, 'image': 'vrdu_table_final_2/astro-ph.CO/7c7199ee-8816-4f22-873b-f8ad9a08d4ce.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{l} #1 \\end{tabular}\n```"}]}
91%|█████████ | 20021/22095 [34:33:17<2:54:00, 5.03s/it] {'loss': 0.2954, 'grad_norm': 0.5534044314636215, 'learning_rate': 2.2950901692461725e-07, 'epoch': 0.91}
91%|█████████ | 20022/22095 [34:33:21<2:40:31, 4.65s/it] {'loss': 0.3441, 'grad_norm': 0.5939677994361185, 'learning_rate': 2.292895627598668e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20023/22095 [34:33:31<3:39:29, 6.36s/it] {'loss': 0.4868, 'grad_norm': 0.28416916028422884, 'learning_rate': 2.2907021110303073e-07, 'epoch': 0.91}
91%|█████████ | 20024/22095 [34:33:34<3:06:53, 5.41s/it] {'loss': 0.2729, 'grad_norm': 0.6777718658467312, 'learning_rate': 2.2885096195882306e-07, 'epoch': 0.91}
91%|█████████ | 20025/22095 [34:33:37<2:41:57, 4.69s/it] {'loss': 0.3077, 'grad_norm': 0.6626368509700329, 'learning_rate': 2.2863181533195443e-07, 'epoch': 0.91}
91%|█████████ | 20026/22095 [34:33:40<2:22:47, 4.14s/it] {'loss': 0.2678, 'grad_norm': 0.5582505524440544, 'learning_rate': 2.2841277122713502e-07, 'epoch': 0.91}
91%|█████████ | 20027/22095 [34:33:44<2:17:22, 3.99s/it] {'loss': 0.2972, 'grad_norm': 0.6291111402466241, 'learning_rate': 2.2819382964906933e-07, 'epoch': 0.91}
91%|█████████ | 20028/22095 [34:33:47<2:11:43, 3.82s/it] {'loss': 0.339, 'grad_norm': 0.6332071836580533, 'learning_rate': 2.2797499060246253e-07, 'epoch': 0.91}
91%|█████████ | 20029/22095 [34:33:51<2:12:46, 3.86s/it] {'loss': 0.2715, 'grad_norm': 0.592577141665457, 'learning_rate': 2.2775625409201807e-07, 'epoch': 0.91}
91%|█████████ | 20030/22095 [34:33:54<2:03:09, 3.58s/it] {'loss': 0.29, 'grad_norm': 0.5395007641614055, 'learning_rate': 2.275376201224344e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (42326 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62613 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87279 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20031/22095 [34:33:57<1:59:28, 3.47s/it] {'loss': 0.2871, 'grad_norm': 0.5726874305021403, 'learning_rate': 2.2731908869840945e-07, 'epoch': 0.91}
91%|█████████ | 20032/22095 [34:34:01<2:02:01, 3.55s/it] {'loss': 0.2853, 'grad_norm': 0.5915757607695858, 'learning_rate': 2.2710065982464001e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (113354 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89972 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67484 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20033/22095 [34:34:05<2:02:03, 3.55s/it] {'loss': 0.3084, 'grad_norm': 0.6336437929439116, 'learning_rate': 2.2688233350581734e-07, 'epoch': 0.91}
91%|█████████ | 20034/22095 [34:34:08<2:00:13, 3.50s/it] {'loss': 0.2711, 'grad_norm': 0.5800979237410989, 'learning_rate': 2.266641097466349e-07, 'epoch': 0.91}
91%|█████████ | 20035/22095 [34:34:11<1:54:03, 3.32s/it] {'loss': 0.2564, 'grad_norm': 0.6896619283347677, 'learning_rate': 2.2644598855177947e-07, 'epoch': 0.91}
91%|█████████ | 20036/22095 [34:34:14<1:53:30, 3.31s/it] {'loss': 0.3011, 'grad_norm': 0.6594548147719869, 'learning_rate': 2.262279699259401e-07, 'epoch': 0.91}
91%|█████████ | 20037/22095 [34:34:17<1:47:49, 3.14s/it] {'loss': 0.3024, 'grad_norm': 0.6633480941917634, 'learning_rate': 2.2601005387379914e-07, 'epoch': 0.91}
91%|█████████ | 20038/22095 [34:34:20<1:44:17, 3.04s/it] {'loss': 0.2946, 'grad_norm': 0.6002884781872193, 'learning_rate': 2.2579224040004068e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (42219 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67304 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20039/22095 [34:34:29<2:50:24, 4.97s/it] {'loss': 0.4717, 'grad_norm': 0.26351204782415494, 'learning_rate': 2.2557452950934367e-07, 'epoch': 0.91}
91%|█████████ | 20040/22095 [34:34:33<2:33:26, 4.48s/it] {'loss': 0.3007, 'grad_norm': 0.6369228382695861, 'learning_rate': 2.2535692120638665e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20041/22095 [34:34:40<3:04:28, 5.39s/it] {'loss': 0.4742, 'grad_norm': 0.27484098339073637, 'learning_rate': 2.2513941549584473e-07, 'epoch': 0.91}
91%|█████████ | 20042/22095 [34:34:43<2:41:01, 4.71s/it] {'loss': 0.2869, 'grad_norm': 0.5711325651584115, 'learning_rate': 2.2492201238239252e-07, 'epoch': 0.91}
91%|█████████ | 20043/22095 [34:34:47<2:30:09, 4.39s/it] {'loss': 0.2799, 'grad_norm': 0.5805424742255618, 'learning_rate': 2.2470471187070075e-07, 'epoch': 0.91}
91%|█████████ | 20044/22095 [34:34:50<2:18:48, 4.06s/it] {'loss': 0.2586, 'grad_norm': 0.5694199887498899, 'learning_rate': 2.2448751396543788e-07, 'epoch': 0.91}
91%|█████████ | 20045/22095 [34:34:54<2:19:47, 4.09s/it] {'loss': 0.267, 'grad_norm': 0.5960203023496685, 'learning_rate': 2.242704186712724e-07, 'epoch': 0.91}
91%|█████████ | 20046/22095 [34:34:58<2:11:29, 3.85s/it] {'loss': 0.2806, 'grad_norm': 0.6596114041630285, 'learning_rate': 2.2405342599286672e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20047/22095 [34:35:06<3:01:14, 5.31s/it] {'loss': 0.479, 'grad_norm': 0.27839170887263026, 'learning_rate': 2.2383653593488596e-07, 'epoch': 0.91}
91%|█████████ | 20048/22095 [34:35:09<2:39:10, 4.67s/it] {'loss': 0.2961, 'grad_norm': 0.6263478186913513, 'learning_rate': 2.2361974850198865e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1120, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8341801 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1120, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8446, 'image': 'vrdu_table_final_2/astro-ph.CO/9bfc0a5f-b6bb-4aab-9a65-70d8c609222f.png', 'image_wh': [[1120, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}p{19.2mm}}\n1&2&3&4&5&7&8\n\\end{tabular}\n```"}]}
91%|█████████ | 20049/22095 [34:35:12<2:20:40, 4.13s/it] {'loss': 0.2843, 'grad_norm': 0.6984960638546034, 'learning_rate': 2.234030636988338e-07, 'epoch': 0.91}
91%|█████████ | 20050/22095 [34:35:16<2:16:05, 3.99s/it] {'loss': 0.3258, 'grad_norm': 0.6520341024668019, 'learning_rate': 2.2318648153007605e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (58254 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20051/22095 [34:35:19<2:10:19, 3.83s/it] {'loss': 0.2776, 'grad_norm': 0.6284751430203892, 'learning_rate': 2.229700020003711e-07, 'epoch': 0.91}
91%|█████████ | 20052/22095 [34:35:22<2:00:49, 3.55s/it] {'loss': 0.2703, 'grad_norm': 0.5311253177293819, 'learning_rate': 2.2275362511436914e-07, 'epoch': 0.91}
91%|█████████ | 20053/22095 [34:35:26<1:57:10, 3.44s/it] {'loss': 0.2971, 'grad_norm': 0.616319374180847, 'learning_rate': 2.2253735087671867e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (79052 > 40960). Running this sequence through the model will result in indexing errors
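The repeated `ValueError: Image size ... is too small. Minimum size is 28.` tracebacks come from samples whose recorded width or height is under 28 px. A minimal offline filter sketch, under stated assumptions: `drop_tiny_images` is a hypothetical helper, and it relies on the `image_wh` field shown in the "Problematic sample" dumps rather than opening the images.

```python
MIN_SIDE = 28  # minimum image edge quoted in the ValueError messages

def drop_tiny_images(samples):
    """Filter out samples whose recorded width or height is below MIN_SIDE.

    Each sample dict follows the conversation-JSON layout dumped in the
    log: 'image_wh' is a list of [width, height] pairs. Screening these
    offline avoids the repeated "[Try #0] Failed to fetch sample" retries
    during training.
    """
    kept = []
    for s in samples:
        whs = s.get("image_wh", [])
        if all(w >= MIN_SIDE and h >= MIN_SIDE for w, h in whs):
            kept.append(s)
    return kept
```

Note the filter only trusts the metadata; samples like the one above with `image_wh: [[0, 0]]` (an unreadable GIF-renamed JPEG) would also need a decode check to be caught reliably.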
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46601 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113321 > 40960). Running this sequence through the model will result in indexing errors 91%|█████████ | 20054/22095 [34:35:31<2:20:26, 4.13s/it] {'loss': 0.4823, 'grad_norm': 0.26938372789301107, 'learning_rate': 2.2232117929206764e-07, 'epoch': 0.91} 91%|█████████ | 20054/22095 [34:35:31<2:20:26, 4.13s/it] 91%|█████████ | 20055/22095 [34:35:35<2:17:50, 4.05s/it] {'loss': 0.3024, 'grad_norm': 0.6187431043930877, 'learning_rate': 2.2210511036506232e-07, 'epoch': 0.91} 91%|█████████ | 20055/22095 [34:35:35<2:17:50, 4.05s/it] 91%|█████████ | 20056/22095 [34:35:39<2:12:02, 3.89s/it] {'loss': 0.3081, 'grad_norm': 0.6003295596575504, 'learning_rate': 2.218891441003429e-07, 'epoch': 0.91} 91%|█████████ | 20056/22095 [34:35:39<2:12:02, 3.89s/it] 91%|█████████ | 20057/22095 [34:35:43<2:11:19, 3.87s/it] {'loss': 0.3059, 'grad_norm': 0.5985733209787534, 'learning_rate': 2.2167328050255122e-07, 'epoch': 0.91} 91%|█████████ | 20057/22095 [34:35:43<2:11:19, 3.87s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62114 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (119587 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (106527 > 40960). 
Running this sequence through the model will result in indexing errors 91%|█████████ | 20058/22095 [34:35:46<2:03:09, 3.63s/it] {'loss': 0.3032, 'grad_norm': 0.6669948310068601, 'learning_rate': 2.2145751957632521e-07, 'epoch': 0.91} 91%|█████████ | 20058/22095 [34:35:46<2:03:09, 3.63s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [392, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8397498 in VC:s3://internvl-moe-sft-data/. Exception: Image size [392, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 64353, 'image': 'vrdu_table_final_2/astro-ph.EP/fc3f9350-365a-438f-81ba-23e20c5b0e55.png', 'image_wh': [[392, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease code the table in the image into LaTeX language.'}, {'from': 'gpt', 'value': "Of course, I'll code the table in the image into LaTeX language for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}\\hspace{-1 cm}Observations and Measurements \\end{tabular}\n```"}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 91%|█████████ | 20059/22095 [34:35:55<3:03:46, 5.42s/it] {'loss': 0.4851, 'grad_norm': 0.25838696674097744, 'learning_rate': 2.2124186132630122e-07, 'epoch': 0.91} 91%|█████████ | 20059/22095 [34:35:55<3:03:46, 5.42s/it] 91%|█████████ | 20060/22095 [34:35:59<2:45:02, 4.87s/it] {'loss': 0.2995, 'grad_norm': 0.5950360146604151, 'learning_rate': 2.2102630575711215e-07, 'epoch': 0.91} 91%|█████████ | 20060/22095 [34:35:59<2:45:02, 4.87s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens 
in the conversation
91%|█████████ | 20061/22095 [34:36:02<2:29:42, 4.42s/it] {'loss': 0.2496, 'grad_norm': 0.6025156236670351, 'learning_rate': 2.20810852873391e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (48654 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51995 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20062/22095 [34:36:12<3:23:03, 5.99s/it] {'loss': 0.4513, 'grad_norm': 0.26372452664482027, 'learning_rate': 2.2059550267976572e-07, 'epoch': 0.91}
91%|█████████ | 20063/22095 [34:36:15<2:58:36, 5.27s/it] {'loss': 0.2796, 'grad_norm': 0.5920258230976766, 'learning_rate': 2.2038025518086482e-07, 'epoch': 0.91}
91%|█████████ | 20064/22095 [34:36:19<2:39:50, 4.72s/it] {'loss': 0.3333, 'grad_norm': 0.615303934129582, 'learning_rate': 2.2016511038131238e-07, 'epoch': 0.91}
91%|█████████ | 20065/22095 [34:36:22<2:27:42, 4.37s/it] {'loss': 0.2623, 'grad_norm': 0.6655320396579752, 'learning_rate': 2.1995006828573194e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_134446_before_screenshot_sub2.png 2025-08-29 02:34:21.137790 load time: 1024.25 ms
VC:s3://gui-agent/data_20250714/windows/images/unreal_engine/free_task_20250715_150831/images/20250715_150932_15.png 2025-08-29 02:34:21.136018 load time: 1059.05 ms
91%|█████████ | 20066/22095 [34:36:32<3:23:13, 6.01s/it] {'loss': 0.4526, 'grad_norm': 0.2551258815085798, 'learning_rate': 2.1973512889874316e-07, 'epoch': 0.91}
91%|█████████ | 20067/22095 [34:36:36<2:57:19, 5.25s/it] {'loss': 0.3186, 'grad_norm': 0.786400107775112, 'learning_rate': 2.1952029222496562e-07, 'epoch': 0.91}
91%|█████████ | 20068/22095 [34:36:39<2:34:56, 4.59s/it] {'loss': 0.2552, 'grad_norm': 0.5946120226496916, 'learning_rate': 2.1930555826901513e-07, 'epoch': 0.91}
91%|█████████ | 20069/22095 [34:36:42<2:21:10, 4.18s/it] {'loss': 0.3191, 'grad_norm': 0.6447590483502432, 'learning_rate': 2.1909092703550406e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954305 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [225, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5140, 'image': 'images/5466.png', 'image_wh': [[225, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是线段AB上的点,点D是线段BC的中点,AB=10,AC=6,则线段AD的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
91%|█████████ | 20070/22095 [34:36:48<2:44:00, 4.86s/it] {'loss': 0.4662, 'grad_norm': 0.25579389228282384, 'learning_rate': 2.1887639852904653e-07, 'epoch': 0.91}
91%|█████████ | 20071/22095 [34:36:53<2:37:41, 4.67s/it] {'loss': 0.2557, 'grad_norm': 0.6112753434468116, 'learning_rate': 2.1866197275425106e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (66556 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72061 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49203 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42068 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96063 > 40960).
Running this sequence through the model will result in indexing errors
91%|█████████ | 20072/22095 [34:36:56<2:25:26, 4.31s/it] {'loss': 0.3018, 'grad_norm': 0.674747862300124, 'learning_rate': 2.1844764971572507e-07, 'epoch': 0.91}
91%|█████████ | 20073/22095 [34:36:59<2:10:43, 3.88s/it] {'loss': 0.2986, 'grad_norm': 0.6150259270468362, 'learning_rate': 2.1823342941807324e-07, 'epoch': 0.91}
91%|█████████ | 20074/22095 [34:37:03<2:12:16, 3.93s/it] {'loss': 0.3338, 'grad_norm': 0.6258412983607646, 'learning_rate': 2.1801931186589963e-07, 'epoch': 0.91}
91%|█████████ | 20075/22095 [34:37:06<2:01:26, 3.61s/it] {'loss': 0.2614, 'grad_norm': 0.6052915432140621, 'learning_rate': 2.1780529706380337e-07, 'epoch': 0.91}
91%|█████████ | 20076/22095 [34:37:10<2:04:36, 3.70s/it] {'loss': 0.2688, 'grad_norm': 0.6119499527385366, 'learning_rate': 2.1759138501638466e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8382446 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 17, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 49240, 'image': 'vrdu_table_final_2/astro-ph.CO/145c69b2-1a6f-4c2d-a67b-a42a5e3187b7.png', 'image_wh': [[17, 17]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\alpha$\\end{tabular}\n```"}]}
91%|█████████ | 20077/22095 [34:37:13<2:01:13, 3.60s/it] {'loss': 0.3054, 'grad_norm': 0.5978537982068777, 'learning_rate': 2.1737757572823813e-07, 'epoch': 0.91}
91%|█████████ | 20078/22095 [34:37:16<1:55:09, 3.43s/it] {'loss': 0.283, 'grad_norm': 0.5522834789899027, 'learning_rate': 2.1716386920396016e-07, 'epoch': 0.91}
91%|█████████ | 20079/22095 [34:37:20<2:03:05, 3.66s/it] {'loss': 0.302, 'grad_norm': 0.5917621379636202, 'learning_rate': 2.169502654481398e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20080/22095 [34:37:30<3:02:09, 5.42s/it] {'loss': 0.4688, 'grad_norm': 0.27429627501030185, 'learning_rate': 2.1673676446536952e-07, 'epoch': 0.91}
91%|█████████ | 20081/22095 [34:37:34<2:46:49, 4.97s/it] {'loss': 0.3555, 'grad_norm': 0.6334485282257166, 'learning_rate': 2.1652336626023506e-07, 'epoch': 0.91}
91%|█████████ | 20082/22095 [34:37:37<2:26:25, 4.36s/it] {'loss': 0.2904, 'grad_norm': 0.6271218443000107, 'learning_rate': 2.1631007083732169e-07, 'epoch': 0.91}
91%|█████████ | 20083/22095 [34:37:40<2:18:55, 4.14s/it] {'loss': 0.2757, 'grad_norm': 0.5528300230753579, 'learning_rate': 2.1609687820121295e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20084/22095 [34:37:43<2:07:04, 3.79s/it] {'loss': 0.2674, 'grad_norm': 0.5809626317071711, 'learning_rate': 2.158837883564907e-07, 'epoch': 0.91}
91%|█████████ | 20085/22095 [34:37:46<1:58:32, 3.54s/it] {'loss': 0.2712, 'grad_norm': 0.5976098597063977, 'learning_rate': 2.1567080130773188e-07, 'epoch': 0.91}
91%|█████████ | 20086/22095 [34:37:50<1:59:47, 3.58s/it] {'loss': 0.2683, 'grad_norm': 0.5778406755545751, 'learning_rate': 2.154579170595128e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8948336 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 71489, 'image': 'images/5400.png', 'image_wh': [[204, 22]], 'conversations': [{'from': 'human', 'value': '\n如图,点C是长为10cm的线段AB上一点,D、E分别是AC、CB中点,则DE的长为()\nA. 5cm\nB. 5.5cm\nC. 6cm\nD. 6.5cm'}, {'from': 'gpt', 'value': '【解答】解:如图,∵点D、E分别是线段AC、CB的中点,∴DC=\\frac{1}{2}AC,CE=\\frac{1}{2}BC,∴DE=DC+CE=\\frac{1}{2}(AC+BC)=\\frac{1}{2}AB.又∵AB=10cm,∴DE=5cm;'}]}
91%|█████████ | 20087/22095 [34:37:53<1:56:29, 3.48s/it] {'loss': 0.2899, 'grad_norm': 0.6191740308187985, 'learning_rate': 2.152451356164098e-07, 'epoch': 0.91}
91%|█████████ | 20088/22095 [34:37:57<1:56:53, 3.49s/it] {'loss': 0.266, 'grad_norm': 0.6120327299909363, 'learning_rate': 2.1503245698299312e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (41819 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111828 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (130805 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20089/22095 [34:38:01<1:59:06, 3.56s/it] {'loss': 0.2843, 'grad_norm': 0.688995530064606, 'learning_rate': 2.1481988116383246e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20090/22095 [34:38:05<2:04:06, 3.71s/it] {'loss': 0.282, 'grad_norm': 0.5403359844763675, 'learning_rate': 2.146074081634969e-07, 'epoch': 0.91}
91%|█████████ | 20091/22095 [34:38:08<1:57:33, 3.52s/it] {'loss': 0.2768, 'grad_norm': 1.0998611164448522, 'learning_rate': 2.1439503798655003e-07, 'epoch': 0.91}
91%|█████████ | 20092/22095 [34:38:11<2:00:27, 3.61s/it] {'loss': 0.2578, 'grad_norm': 0.5736670905305177, 'learning_rate': 2.1418277063755656e-07, 'epoch': 0.91}
91%|█████████ | 20093/22095 [34:38:14<1:54:26, 3.43s/it] {'loss': 0.272, 'grad_norm': 0.6303521103735157, 'learning_rate': 2.139706061210761e-07, 'epoch': 0.91}
91%|█████████ | 20094/22095 [34:38:19<2:03:48, 3.71s/it] {'loss': 0.2708, 'grad_norm': 0.5708800033364091, 'learning_rate': 2.13758544441669e-07, 'epoch': 0.91}
91%|█████████ | 20095/22095 [34:38:22<1:55:56, 3.48s/it] {'loss': 0.2416, 'grad_norm': 0.6194234916597959, 'learning_rate': 2.1354658560389042e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (78547 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79241 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20096/22095 [34:38:26<2:00:42, 3.62s/it] {'loss': 0.2535, 'grad_norm': 0.5330122281441105, 'learning_rate': 2.1333472961229563e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20097/22095 [34:38:36<3:02:55, 5.49s/it] {'loss': 0.4591, 'grad_norm': 0.2637332376720348, 'learning_rate': 2.1312297647143653e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20098/22095 [34:38:39<2:39:48, 4.80s/it] {'loss': 0.2574, 'grad_norm': 0.625594032869571, 'learning_rate': 2.129113261858623e-07, 'epoch': 0.91}
91%|█████████ | 20099/22095 [34:38:43<2:29:03, 4.48s/it] {'loss': 0.2765, 'grad_norm': 0.5724869592135517, 'learning_rate': 2.1269977876012094e-07, 'epoch': 0.91}
91%|█████████ | 20100/22095 [34:38:46<2:15:08, 4.06s/it] {'loss': 0.2833, 'grad_norm': 0.5638714706740632, 'learning_rate': 2.1248833419875936e-07, 'epoch': 0.91}
91%|█████████ | 20101/22095 [34:38:49<2:08:52, 3.88s/it] {'loss': 0.3008, 'grad_norm': 0.6085258497235556, 'learning_rate': 2.122769925063195e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (48599 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20102/22095 [34:38:52<2:04:11, 3.74s/it] {'loss': 0.2708, 'grad_norm': 0.5601390088119588, 'learning_rate': 2.1206575368734216e-07, 'epoch': 0.91}
91%|█████████ | 20103/22095 [34:38:56<2:06:07, 3.80s/it] {'loss': 0.3048, 'grad_norm': 0.6428953467497713, 'learning_rate': 2.1185461774636705e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20104/22095 [34:39:05<2:57:59, 5.36s/it] {'loss': 0.4753, 'grad_norm': 0.2727656342620612, 'learning_rate': 2.1164358468793055e-07, 'epoch': 0.91}
91%|█████████ | 20105/22095 [34:39:09<2:35:42, 4.69s/it] {'loss': 0.3, 'grad_norm': 0.6124735588652476, 'learning_rate': 2.1143265451656736e-07, 'epoch': 0.91}
91%|█████████ | 20106/22095 [34:39:12<2:27:15, 4.44s/it] {'loss': 0.2805, 'grad_norm': 0.647641665379744, 'learning_rate': 2.1122182723680883e-07, 'epoch': 0.91}
91%|█████████ | 20107/22095 [34:39:16<2:18:41, 4.19s/it] {'loss': 0.3268, 'grad_norm': 0.5980609552698322, 'learning_rate': 2.1101110285318639e-07, 'epoch': 0.91}
91%|█████████ | 20108/22095 [34:39:19<2:07:27, 3.85s/it] {'loss': 0.2965, 'grad_norm': 0.6609734334611863, 'learning_rate': 2.108004813702258e-07, 'epoch': 0.91}
91%|█████████ | 20109/22095 [34:39:22<2:01:40, 3.68s/it] {'loss': 0.2486, 'grad_norm': 0.5644962278927006, 'learning_rate': 2.1058996279245515e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20110/22095 [34:39:30<2:40:19, 4.85s/it] {'loss': 0.4734, 'grad_norm': 0.2563080729027022, 'learning_rate': 2.103795471243969e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8366679 in VC:s3://internvl-moe-sft-data/. Exception: Image size [28, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33425, 'image': 'vrdu_table_final_2/astro-ph.CO/905dc4d4-7a66-4724-acd9-5613631777b6.png', 'image_wh': [[28, 25]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$S_{9}$\\end{tabular}\n```"}]}
91%|█████████ | 20111/22095 [34:39:33<2:27:26, 4.46s/it] {'loss': 0.2781, 'grad_norm': 0.6255772447099949, 'learning_rate': 2.101692343705708e-07, 'epoch': 0.91}
91%|█████████ | 20112/22095 [34:39:37<2:18:44, 4.20s/it] {'loss': 0.2852, 'grad_norm': 0.6850588419635695, 'learning_rate': 2.0995902453549766e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8308514 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB2gPn7dsnI8KJjSspeXXcwIpXa_!!2542198347.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat are the exact words displayed in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n欧铂尔墙布\n来图定制\n免费设计\n海量图库\n专业设计\n专业\n背景墙/壁画/墙布设计'}]}
91%|█████████ | 20113/22095 [34:39:42<2:25:46, 4.41s/it] {'loss': 0.3114, 'grad_norm': 0.9849127423646337, 'learning_rate': 2.0974891762369386e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (54914 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115318 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (115762 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79517 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20114/22095 [34:39:46<2:17:53, 4.18s/it] {'loss': 0.3264, 'grad_norm': 0.6059976563244579, 'learning_rate': 2.095389136396736e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (72334 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20115/22095 [34:39:50<2:19:36, 4.23s/it] {'loss': 0.2636, 'grad_norm': 0.5625520832592269, 'learning_rate': 2.093290125879488e-07, 'epoch': 0.91}
91%|█████████ | 20116/22095 [34:39:54<2:17:35, 4.17s/it] {'loss': 0.3275, 'grad_norm': 0.6029349016535929, 'learning_rate': 2.0911921447303086e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [231, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8528889 in VC:s3://internvl-moe-sft-data/. Exception: Image size [231, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 104220, 'image': 'vrdu_texteq/astro-ph.CO/bbf5f01e-3a5c-47ea-a18f-4223fbdc1407.png', 'image_wh': [[231, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'For $\\widetilde u_i \\equiv 0$ we have'}]}
91%|█████████ | 20117/22095 [34:39:57<2:08:09, 3.89s/it] {'loss': 0.2979, 'grad_norm': 0.6418283491471828, 'learning_rate': 2.0890951929942671e-07, 'epoch': 0.91}
91%|█████████ | 20118/22095 [34:40:00<2:00:25, 3.65s/it] {'loss': 0.3058, 'grad_norm': 0.5858464661181148, 'learning_rate': 2.0869992707164166e-07, 'epoch': 0.91}
91%|█████████ | 20119/22095 [34:40:04<1:59:07, 3.62s/it] {'loss': 0.253, 'grad_norm': 0.5446063183284051, 'learning_rate': 2.0849043779417987e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20120/22095 [34:40:07<1:52:33, 3.42s/it] {'loss': 0.2791, 'grad_norm': 0.6664356187095676, 'learning_rate': 2.0828105147154275e-07, 'epoch': 0.91}
91%|█████████ | 20121/22095 [34:40:10<1:47:13, 3.26s/it] {'loss': 0.2497, 'grad_norm': 0.6085205672276123, 'learning_rate': 2.0807176810823005e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20122/22095 [34:40:15<2:10:11, 3.96s/it] {'loss': 0.4548, 'grad_norm': 0.2654758316672969, 'learning_rate': 2.0786258770873647e-07, 'epoch': 0.91}
91%|█████████ | 20123/22095 [34:40:19<2:06:23, 3.85s/it] {'loss': 0.2668, 'grad_norm': 0.6455911301462177, 'learning_rate': 2.0765351027755897e-07, 'epoch': 0.91}
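The repeated `ValueError: Image size ... is too small. Minimum size is 28.` failures above all trace to dataset records whose `image_wh` falls below a 28-pixel minimum side length. A minimal pre-filter sketch, assuming only what the log shows (the `image_wh` field from the "Problematic sample" dumps and the 28-px threshold from the error message; the function names are hypothetical, not from the training code), could drop such records before they ever reach `__getitem__`:

```python
# Hypothetical pre-filter for "Image size ... is too small. Minimum size
# is 28." failures: drop any sample whose recorded [width, height] falls
# below the minimum side length before it reaches the dataloader.
MIN_SIDE = 28  # threshold quoted in the ValueError messages


def image_large_enough(sample: dict, min_side: int = MIN_SIDE) -> bool:
    """True when every [width, height] pair in image_wh meets min_side."""
    return all(w >= min_side and h >= min_side
               for w, h in sample.get("image_wh", []))


def prefilter(samples: list[dict], min_side: int = MIN_SIDE) -> list[dict]:
    """Keep only samples that would pass the minimum-size validation."""
    return [s for s in samples if image_large_enough(s, min_side)]
```

Against the samples dumped in this log, records with `image_wh` of `[[225, 24]]`, `[[17, 17]]`, `[[204, 22]]`, `[[28, 25]]`, `[[0, 0]]`, and `[[231, 25]]` would all be filtered out, avoiding the retry-and-resample churn visible in the step times.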
91%|█████████ | 20124/22095 [34:40:22<1:59:13, 3.63s/it] {'loss': 0.3166, 'grad_norm': 0.6219158113543389, 'learning_rate': 2.0744453581918843e-07, 'epoch': 0.91}
91%|█████████ | 20125/22095 [34:40:26<2:00:45, 3.68s/it] {'loss': 0.2693, 'grad_norm': 0.7528225257884705, 'learning_rate': 2.0723566433811572e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20126/22095 [34:40:29<1:57:33, 3.58s/it] {'loss': 0.3284, 'grad_norm': 0.6306450779415568, 'learning_rate': 2.0702689583882883e-07, 'epoch': 0.91}
91%|█████████ | 20127/22095 [34:40:32<1:51:22, 3.40s/it] {'loss': 0.3162, 'grad_norm': 0.5932455349455487, 'learning_rate': 2.0681823032581316e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (59158 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42321 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20128/22095 [34:40:36<1:57:23, 3.58s/it] {'loss': 0.3029, 'grad_norm': 0.6433329829199189, 'learning_rate': 2.066096678035523e-07, 'epoch': 0.91}
91%|█████████ | 20129/22095 [34:40:39<1:55:19, 3.52s/it] {'loss': 0.3142, 'grad_norm': 0.6278373181227935, 'learning_rate': 2.0640120827652876e-07, 'epoch': 0.91}
91%|█████████ | 20130/22095 [34:40:43<1:51:34, 3.41s/it] {'loss': 0.2493, 'grad_norm': 0.5950706534219158, 'learning_rate': 2.0619285174922067e-07, 'epoch': 0.91}
91%|█████████ | 20131/22095 [34:40:46<1:53:54, 3.48s/it] {'loss': 0.2582, 'grad_norm': 0.6003694710464432, 'learning_rate': 2.0598459822610494e-07, 'epoch': 0.91}
91%|█████████ | 20132/22095 [34:40:49<1:49:16, 3.34s/it] {'loss': 0.3004, 'grad_norm': 0.6110120573969848, 'learning_rate': 2.057764477116564e-07, 'epoch': 0.91}
91%|█████████ | 20133/22095 [34:40:52<1:46:54, 3.27s/it] {'loss': 0.2936, 'grad_norm': 0.6446796955432598, 'learning_rate': 2.0556840021034753e-07, 'epoch': 0.91}
91%|█████████ | 20134/22095 [34:40:56<1:48:36, 3.32s/it] {'loss': 0.3027, 'grad_norm': 0.6100270371936517, 'learning_rate': 2.053604557266492e-07, 'epoch': 0.91}
91%|█████████ | 20135/22095 [34:40:59<1:50:33, 3.38s/it] {'loss': 0.2601, 'grad_norm': 0.6406473157774112, 'learning_rate': 2.0515261426502897e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████ | 20136/22095 [34:41:09<2:49:36, 5.19s/it] {'loss': 0.4509, 'grad_norm': 0.4227689291899131, 'learning_rate': 2.049448758299527e-07, 'epoch': 0.91}
91%|█████████ | 20137/22095 [34:41:12<2:29:54, 4.59s/it] {'loss': 0.254, 'grad_norm': 0.581610239211927, 'learning_rate': 2.0473724042588405e-07, 'epoch': 0.91}
91%|█████████ | 20138/22095 [34:41:16<2:22:31, 4.37s/it] {'loss': 0.2672, 'grad_norm': 0.6115934720483611, 'learning_rate': 2.0452970805728502e-07, 'epoch': 0.91}
91%|█████████ | 20139/22095 [34:41:19<2:08:50, 3.95s/it] {'loss': 0.3153, 'grad_norm': 0.6571594489557938, 'learning_rate': 2.0432227872861422e-07, 'epoch': 0.91}
91%|█████████ | 20140/22095 [34:41:22<1:56:48, 3.58s/it] {'loss': 0.2624, 'grad_norm': 0.5950439047019607, 'learning_rate': 2.041149524443281e-07, 'epoch': 0.91}
91%|█████████ | 20141/22095 [34:41:25<1:56:26, 3.58s/it] {'loss': 0.3267, 'grad_norm': 1.4537733982204537, 'learning_rate': 2.0390772920888258e-07, 'epoch': 0.91}
91%|█████████ | 20142/22095 [34:41:28<1:51:40, 3.43s/it] {'loss': 0.2958, 'grad_norm': 0.7336297650765733, 'learning_rate': 2.0370060902673074e-07, 'epoch': 0.91}
91%|█████████ | 20143/22095 [34:41:31<1:44:33, 3.21s/it] {'loss': 0.3183, 'grad_norm': 0.6293224370546653, 'learning_rate': 2.0349359190232176e-07, 'epoch': 0.91}
91%|█████████ | 20144/22095 [34:41:35<1:57:57, 3.63s/it] {'loss': 0.2704, 'grad_norm': 0.6208929072969963, 'learning_rate': 2.0328667784010324e-07, 'epoch': 0.91}
91%|█████████ | 20145/22095 [34:41:40<2:02:55, 3.78s/it] {'loss': 0.2778, 'grad_norm': 0.6356988807823192, 'learning_rate': 2.030798668445233e-07, 'epoch': 0.91}
91%|█████████ | 20146/22095 [34:41:42<1:53:51, 3.50s/it] {'loss': 0.2838, 'grad_norm': 0.632110705142832, 'learning_rate': 2.0287315892002335e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20147/22095 [34:41:46<1:49:50, 3.38s/it] {'loss': 0.3172, 'grad_norm': 0.6456962160612171, 'learning_rate': 2.0266655407104652e-07, 'epoch': 0.91}
91%|█████████ | 20148/22095 [34:41:48<1:44:23, 3.22s/it] {'loss': 0.2721, 'grad_norm': 0.6471312561425153, 'learning_rate': 2.024600523020309e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (49614 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52317 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20149/22095 [34:41:52<1:52:17, 3.46s/it] {'loss': 0.3046, 'grad_norm': 0.6297922945477495, 'learning_rate': 2.0225365361741522e-07, 'epoch': 0.91}
91%|█████████ | 20150/22095 [34:41:56<1:54:29, 3.53s/it] {'loss': 0.2654, 'grad_norm': 0.6130515227525498, 'learning_rate': 2.0204735802163254e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (57515 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63824 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56680 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████ | 20151/22095 [34:41:59<1:51:30, 3.44s/it] {'loss': 0.282, 'grad_norm': 0.5487820024424245, 'learning_rate': 2.0184116551911714e-07, 'epoch': 0.91}
91%|█████████ | 20152/22095 [34:42:03<1:48:44, 3.36s/it] {'loss': 0.3083, 'grad_norm': 0.599754947660118, 'learning_rate': 2.0163507611429823e-07, 'epoch': 0.91}
91%|█████████ | 20153/22095 [34:42:06<1:50:30, 3.41s/it] {'loss': 0.3301, 'grad_norm': 0.6336611363559453, 'learning_rate': 2.0142908981160447e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████ | 20154/22095 [34:42:09<1:49:50, 3.40s/it] {'loss': 0.3038, 'grad_norm': 0.6186858420131777, 'learning_rate': 2.012232066154618e-07, 'epoch': 0.91}
91%|█████████ | 20155/22095 [34:42:12<1:45:06, 3.25s/it] {'loss': 0.2921, 'grad_norm': 0.6368384635497683, 'learning_rate': 2.01017426530295e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (41765 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58462 > 40960).
Running this sequence through the model will result in indexing errors 91%|█████████ | 20156/22095 [34:42:16<1:47:05, 3.31s/it] {'loss': 0.3185, 'grad_norm': 0.7317290019886159, 'learning_rate': 2.0081174956052329e-07, 'epoch': 0.91} 91%|█████████ | 20156/22095 [34:42:16<1:47:05, 3.31s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 91%|█████████ | 20157/22095 [34:42:25<2:46:04, 5.14s/it] {'loss': 0.4373, 'grad_norm': 0.26234659961321066, 'learning_rate': 2.0060617571056817e-07, 'epoch': 0.91} 91%|█████████ | 20157/22095 [34:42:25<2:46:04, 5.14s/it] 91%|█████████ | 20158/22095 [34:42:29<2:32:17, 4.72s/it] {'loss': 0.3736, 'grad_norm': 0.7022199549928246, 'learning_rate': 2.004007049848461e-07, 'epoch': 0.91} 91%|█████████ | 20158/22095 [34:42:29<2:32:17, 4.72s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 91%|█████████ | 20159/22095 [34:42:35<2:48:32, 5.22s/it] {'loss': 0.4316, 'grad_norm': 0.24999613707843227, 'learning_rate': 2.001953373877724e-07, 'epoch': 0.91} 91%|█████████ | 20159/22095 [34:42:35<2:48:32, 5.22s/it] 91%|█████████ | 20160/22095 [34:42:38<2:28:16, 4.60s/it] {'loss': 0.2693, 'grad_norm': 1.258637722295874, 'learning_rate': 1.999900729237586e-07, 'epoch': 0.91} 91%|█████████ | 20160/22095 [34:42:38<2:28:16, 4.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (91296 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42045 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41279 > 40960). 
Running this sequence through the model will result in indexing errors
91%|█████████ | 20161/22095 [34:42:42<2:13:55, 4.15s/it] {'loss': 0.287, 'grad_norm': 0.5582673961022042, 'learning_rate': 1.9978491159721724e-07, 'epoch': 0.91}
91%|█████████▏| 20162/22095 [34:42:45<2:10:15, 4.04s/it] {'loss': 0.2933, 'grad_norm': 0.5574906899089822, 'learning_rate': 1.9957985341255427e-07, 'epoch': 0.91}
91%|█████████▏| 20163/22095 [34:42:50<2:11:12, 4.07s/it] {'loss': 0.2954, 'grad_norm': 0.6276203254950994, 'learning_rate': 1.9937489837417723e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20164/22095 [34:42:59<3:02:47, 5.68s/it] {'loss': 0.4516, 'grad_norm': 0.25034283765087206, 'learning_rate': 1.991700464864893e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (82888 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104828 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20165/22095 [34:43:02<2:41:54, 5.03s/it] {'loss': 0.3309, 'grad_norm': 0.6101342253794745, 'learning_rate': 1.9896529775389363e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (86006 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65063 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20166/22095 [34:43:05<2:20:44, 4.38s/it] {'loss': 0.3228, 'grad_norm': 0.5735698779122026, 'learning_rate': 1.9876065218078722e-07, 'epoch': 0.91}
91%|█████████▏| 20167/22095 [34:43:09<2:12:33, 4.13s/it] {'loss': 0.298, 'grad_norm': 0.9948355697347983, 'learning_rate': 1.9855610977156882e-07, 'epoch': 0.91}
91%|█████████▏| 20168/22095 [34:43:13<2:09:27, 4.03s/it] {'loss': 0.2965, 'grad_norm': 0.6163256323172275, 'learning_rate': 1.9835167053063376e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (72925 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83113 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20169/22095 [34:43:21<2:51:22, 5.34s/it] {'loss': 0.4638, 'grad_norm': 0.29371825179107114, 'learning_rate': 1.9814733446237356e-07, 'epoch': 0.91}
91%|█████████▏| 20170/22095 [34:43:25<2:33:43, 4.79s/it] {'loss': 0.3371, 'grad_norm': 0.70807913278701, 'learning_rate': 1.9794310157117913e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (85712 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144492 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43019 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20171/22095 [34:43:28<2:18:10, 4.31s/it] {'loss': 0.3038, 'grad_norm': 0.6712528365040975, 'learning_rate': 1.977389718614392e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (86852 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60531 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20172/22095 [34:43:31<2:04:04, 3.87s/it] {'loss': 0.3169, 'grad_norm': 0.6521966163230466, 'learning_rate': 1.9753494533754026e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20173/22095 [34:43:41<3:10:42, 5.95s/it] {'loss': 0.4618, 'grad_norm': 0.260499083334948, 'learning_rate': 1.9733102200386544e-07, 'epoch': 0.91}
91%|█████████▏| 20174/22095 [34:43:51<3:44:10, 7.00s/it] {'loss': 0.4476, 'grad_norm': 0.2448547025157443, 'learning_rate': 1.9712720186479685e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (71391 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63518 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (60999 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20175/22095 [34:43:54<3:08:16, 5.88s/it] {'loss': 0.2788, 'grad_norm': 0.6350681819680561, 'learning_rate': 1.9692348492471313e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20176/22095 [34:43:58<2:44:21, 5.14s/it] {'loss': 0.3113, 'grad_norm': 0.6428461694612434, 'learning_rate': 1.9671987118799307e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20177/22095 [34:44:07<3:25:58, 6.44s/it] {'loss': 0.4812, 'grad_norm': 0.2592207557142955, 'learning_rate': 1.965163606590098e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20178/22095 [34:44:10<2:56:23, 5.52s/it] {'loss': 0.3166, 'grad_norm': 0.6141639243091234, 'learning_rate': 1.963129533421382e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20179/22095 [34:44:20<3:34:25, 6.71s/it] {'loss': 0.4558, 'grad_norm': 0.2552167500263459, 'learning_rate': 1.961096492417469e-07, 'epoch': 0.91}
91%|█████████▏| 20180/22095 [34:44:29<4:00:59, 7.55s/it] {'loss': 0.4447, 'grad_norm': 0.24585128292836528, 'learning_rate': 1.9590644836220584e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 364, but got module 1
91%|█████████▏| 20181/22095 [34:44:33<3:19:36, 6.26s/it] {'loss': 0.337, 'grad_norm': 0.6693985643334417, 'learning_rate': 1.9570335070788093e-07, 'epoch': 0.91}
91%|█████████▏| 20182/22095 [34:44:36<2:50:45, 5.36s/it] {'loss': 0.3002, 'grad_norm': 0.566802657436706, 'learning_rate': 1.9550035628313478e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20183/22095 [34:44:45<3:30:13, 6.60s/it] {'loss': 0.4712, 'grad_norm': 0.24719745682119912, 'learning_rate': 1.9529746509233006e-07, 'epoch': 0.91}
91%|█████████▏| 20184/22095 [34:44:49<3:03:34, 5.76s/it] {'loss': 0.2691, 'grad_norm': 0.5804125523677844, 'learning_rate': 1.950946771398282e-07, 'epoch': 0.91}
91%|█████████▏| 20185/22095 [34:44:53<2:40:46, 5.05s/it] {'loss': 0.2865, 'grad_norm': 0.5569565564874374, 'learning_rate': 1.9489199242998248e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (83285 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108651 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46364 > 40960).
Running this sequence through the model will result in indexing errors
91%|█████████▏| 20186/22095 [34:44:56<2:29:12, 4.69s/it] {'loss': 0.3374, 'grad_norm': 0.5907424873541962, 'learning_rate': 1.9468941096715043e-07, 'epoch': 0.91}
91%|█████████▏| 20187/22095 [34:44:59<2:11:10, 4.13s/it] {'loss': 0.2975, 'grad_norm': 1.7032824969349205, 'learning_rate': 1.9448693275568532e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (65960 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42044 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91483 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20188/22095 [34:45:02<2:00:17, 3.78s/it] {'loss': 0.3074, 'grad_norm': 0.8489322917680525, 'learning_rate': 1.9428455779993694e-07, 'epoch': 0.91}
91%|█████████▏| 20189/22095 [34:45:06<2:00:32, 3.79s/it] {'loss': 0.2862, 'grad_norm': 0.6903877179464709, 'learning_rate': 1.9408228610425296e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20190/22095 [34:45:10<2:01:50, 3.84s/it] {'loss': 0.292, 'grad_norm': 0.652244246341138, 'learning_rate': 1.9388011767298042e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (78365 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50712 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71321 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20191/22095 [34:45:13<1:53:56, 3.59s/it] {'loss': 0.2418, 'grad_norm': 0.597657405511287, 'learning_rate': 1.9367805251046422e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (42315 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66954 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45636 > 40960).
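The recurring `Token indices sequence length is longer than the specified maximum sequence length` warnings are emitted by the Hugging Face tokenizer whenever a single sample tokenizes to more than the 40960-token context configured for this run (some samples above exceed 140k tokens). A minimal pre-filter that screens such samples out before batching might look like the following sketch; the sample layout and the toy whitespace tokenizer are assumptions for illustration, not the actual training code:

```python
# Sketch (assumed helper, not from the training repo): drop samples whose
# tokenized length exceeds the context window, mirroring the 40960-token
# limit reported in the warnings above.

def filter_overlong(samples, tokenize, max_len=40960):
    """Split samples into (kept, dropped) lists by tokenized length."""
    kept, dropped = [], []
    for sample in samples:
        n_tokens = len(tokenize(sample["text"]))
        (kept if n_tokens <= max_len else dropped).append(sample)
    return kept, dropped

# Toy whitespace tokenizer stands in for the real Qwen2.5-VL tokenizer.
samples = [
    {"id": 1, "text": "a short conversation"},
    {"id": 2, "text": "tok " * 56680},  # 56680 "tokens", like the first warning
]
kept, dropped = filter_overlong(samples, str.split)
```

Logging the dropped list rather than silently discarding it would preserve the audit trail these warnings currently provide.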
Running this sequence through the model will result in indexing errors
91%|█████████▏| 20192/22095 [34:45:17<1:57:20, 3.70s/it] {'loss': 0.3036, 'grad_norm': 0.6294900696669814, 'learning_rate': 1.9347609062104478e-07, 'epoch': 0.91}
91%|█████████▏| 20193/22095 [34:45:20<1:50:27, 3.48s/it] {'loss': 0.2796, 'grad_norm': 0.7343966627058698, 'learning_rate': 1.932742320090619e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20194/22095 [34:45:27<2:21:23, 4.46s/it] {'loss': 0.4451, 'grad_norm': 0.2675377517537364, 'learning_rate': 1.9307247667885331e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20195/22095 [34:45:30<2:09:42, 4.10s/it] {'loss': 0.2973, 'grad_norm': 0.6294374440983784, 'learning_rate': 1.9287082463475326e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
91%|█████████▏| 20196/22095 [34:45:38<2:43:00, 5.15s/it] {'loss': 0.4829, 'grad_norm': 0.2735152794246382, 'learning_rate': 1.926692758810955e-07, 'epoch': 0.91}
91%|█████████▏| 20197/22095 [34:45:41<2:27:30, 4.66s/it] {'loss': 0.2955, 'grad_norm': 1.029793456721243, 'learning_rate': 1.9246783042221106e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8881051 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [190, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4204, 'image': 'images/4938.png', 'image_wh': [[190, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知截面AB=6cm,则截面AB的延长线上有一个C点,BC=4cm。如果M点和N点分别是AB和BC的中点,则M点和N点之间的距离为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5cm'}]}
91%|█████████▏| 20198/22095 [34:45:44<2:15:31, 4.29s/it] {'loss': 0.2777, 'grad_norm': 0.5930683611980938, 'learning_rate': 1.9226648826242699e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [909, 6, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8350481 in VC:s3://internvl-moe-sft-data/. Exception: Image size [909, 6, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 17155, 'image': 'vrdu_table_final_2/astro-ph.CO/0cb9ccb7-74cf-44c9-963f-f3d3f714521a.png', 'image_wh': [[909, 6]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{p{0.45\\textwidth}}\n\\hline \\\\\n\\end{tabular}\n```"}]}
91%|█████████▏| 20199/22095 [34:45:54<3:03:44, 5.81s/it] {'loss': 0.4717, 'grad_norm': 0.2621875181458503, 'learning_rate': 1.9206524940606984e-07, 'epoch': 0.91}
91%|█████████▏| 20200/22095 [34:46:03<3:38:15, 6.91s/it] {'loss': 0.469, 'grad_norm': 0.27766580160689125, 'learning_rate': 1.9186411385746507e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 364, but got module 1
91%|█████████▏| 20201/22095 [34:46:07<3:03:37, 5.82s/it] {'loss': 0.3206, 'grad_norm': 0.6663637678544063, 'learning_rate': 1.9166308162093306e-07, 'epoch': 0.91}
91%|█████████▏| 20202/22095 [34:46:10<2:41:08, 5.11s/it] {'loss': 0.2368, 'grad_norm': 0.5736598664573532, 'learning_rate': 1.914621527007937e-07, 'epoch': 0.91}
91%|█████████▏| 20203/22095 [34:46:14<2:28:33, 4.71s/it] {'loss': 0.3147, 'grad_norm': 0.5907927822480857, 'learning_rate': 1.912613271013647e-07, 'epoch': 0.91}
91%|█████████▏| 20204/22095 [34:46:18<2:19:47, 4.44s/it] {'loss': 0.296, 'grad_norm': 0.6438013056377528, 'learning_rate': 1.9106060482695976e-07, 'epoch': 0.91}
91%|█████████▏| 20205/22095 [34:46:21<2:08:44, 4.09s/it] {'loss': 0.2871, 'grad_norm': 0.6053099453746283, 'learning_rate': 1.9085998588189436e-07, 'epoch': 0.91}
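The `ValueError: Image size ... is too small. Minimum size is 28` tracebacks interleaved above come from a dataset-side guard: Qwen2.5-VL's vision encoder works on 14-pixel patches merged 2x2, so an image side shorter than 28 pixels cannot yield even one visual token (the `[0, 0]` and `[909, 6]` entries logged here are degenerate or extremely thin images). A standalone screen that would catch such samples ahead of training could look like this sketch; the function name is illustrative, and only the `image_wh` field mirrors the log:

```python
# Sketch (assumed helper): flag samples whose recorded image dimensions are
# below the 28-px minimum side that the loader enforces in the log above.
MIN_SIDE = 28  # smallest usable side: 14-px patches with 2x2 patch merging

def too_small_ids(samples, min_side=MIN_SIDE):
    """Return ids of samples with any recorded image side below min_side."""
    return [
        s["id"]
        for s in samples
        if any(w < min_side or h < min_side for w, h in s.get("image_wh", []))
    ]

# Entries modeled on the problematic samples reported in the log.
samples = [
    {"id": 4204, "image_wh": [[190, 22]]},   # height 22 < 28 -> flagged
    {"id": 17155, "image_wh": [[909, 6]]},   # height 6 < 28 -> flagged
    {"id": 9000, "image_wh": [[640, 480]]},  # fine
]
```

Running such a screen once over the manifest avoids paying the fetch-and-retry cost at training time.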
91%|█████████▏| 20206/22095 [34:46:25<2:04:10, 3.94s/it] {'loss': 0.3364, 'grad_norm': 0.622434748725618, 'learning_rate': 1.906594702704767e-07, 'epoch': 0.91}
91%|█████████▏| 20207/22095 [34:46:28<1:57:19, 3.73s/it] {'loss': 0.2924, 'grad_norm': 0.6125138418175076, 'learning_rate': 1.904590579970167e-07, 'epoch': 0.91}
91%|█████████▏| 20208/22095 [34:46:31<1:56:13, 3.70s/it] {'loss': 0.3192, 'grad_norm': 0.6837576033495794, 'learning_rate': 1.9025874906581975e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (51233 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43432 > 40960). Running this sequence through the model will result in indexing errors
91%|█████████▏| 20209/22095 [34:46:34<1:48:57, 3.47s/it] {'loss': 0.287, 'grad_norm': 0.5913973359919333, 'learning_rate': 1.900585434811908e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (63427 > 40960). Running this sequence through the model will result in indexing errors
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [642, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8375612 in VC:s3://internvl-moe-sft-data/. Exception: Image size [642, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 42389, 'image': 'vrdu_table_final_2/astro-ph.CO/69e14bd2-89a3-48c8-83ee-a48b315dc968.png', 'image_wh': [[642, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{@{}llll}\n$\\Omega_\\Lambda = 0.72$ & $\\Omega_\\text{CDM} = 0.23$ & $\\Omega_\\text{b} = 0.05$ & $\\Omega_\\text{r} = \\num{8e-5}$\n\\end{tabular}\n```'}]}
91%|█████████▏| 20210/22095 [34:46:38<1:49:35, 3.49s/it] {'loss': 0.2943, 'grad_norm': 0.7310883652397158, 'learning_rate': 1.8985844124743136e-07, 'epoch': 0.91}
91%|█████████▏| 20211/22095 [34:46:41<1:44:25, 3.33s/it] {'loss': 0.3022, 'grad_norm': 0.620939923788956, 'learning_rate': 1.8965844236883968e-07, 'epoch': 0.91}
91%|█████████▏| 20212/22095 [34:46:43<1:38:26, 3.14s/it] {'loss': 0.267, 'grad_norm': 0.5870684472252491, 'learning_rate': 1.894585468497151e-07, 'epoch': 0.91}
91%|█████████▏| 20213/22095 [34:46:48<1:47:44, 3.43s/it] {'loss': 0.289, 'grad_norm': 0.579162736305499, 'learning_rate': 1.892587546943525e-07, 'epoch': 0.91}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
91%|█████████▏| 20214/22095 [34:46:51<1:48:41, 3.47s/it] {'loss': 0.3629, 'grad_norm': 0.61310203729002, 'learning_rate': 1.8905906590704293e-07, 'epoch': 0.91}
Token indices sequence length is longer than the specified maximum sequence length for this model (69434 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43768 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (41300 > 40960) for 4 sample(s). Truncating to 8889 with 3 samples.
91%|█████████▏| 20215/22095 [34:46:54<1:45:29, 3.37s/it] {'loss': 0.2849, 'grad_norm': 0.6277058044001794, 'learning_rate': 1.8885948049207847e-07, 'epoch': 0.91}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8573638 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 22252, 'image': '810140047.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment?
Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Biographies & Memoirs? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
91%|█████████▏| 20216/22095 [34:46:58<1:45:45, 3.38s/it] {'loss': 0.323, 'grad_norm': 0.8194378360950997, 'learning_rate': 1.8865999845374794e-07, 'epoch': 0.91}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20217/22095 [34:47:07<2:41:13, 5.15s/it] {'loss': 0.4627, 'grad_norm': 0.2803414338616721, 'learning_rate': 1.8846061979633734e-07, 'epoch': 0.92}
92%|█████████▏| 20218/22095 [34:47:11<2:29:33, 4.78s/it] {'loss': 0.3527, 'grad_norm': 0.6307224297322953, 'learning_rate': 1.8826134452412993e-07, 'epoch': 0.92}
92%|█████████▏| 20219/22095 [34:47:15<2:21:35, 4.53s/it] {'loss': 0.3024, 'grad_norm': 0.5779802801482953, 'learning_rate': 1.8806217264140836e-07, 'epoch': 0.92}
92%|█████████▏| 20220/22095 [34:47:18<2:06:39, 4.05s/it] {'loss': 0.2627, 'grad_norm': 0.5418445470374095, 'learning_rate': 1.87863104152452e-07, 'epoch': 0.92}
92%|█████████▏| 20221/22095 [34:47:21<2:00:16, 3.85s/it] {'loss': 0.3207, 'grad_norm': 0.5964870628673384, 'learning_rate': 1.8766413906153856e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [406, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8476380 in VC:s3://internvl-moe-sft-data/. Exception: Image size [406, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 97154, 'image': 'vrdu_texteq/astro-ph.CO/e68b77fa-cf2d-4034-ba00-d585df33b729.png', 'image_wh': [[406, 25]], 'conversations': [{'from': 'human', 'value': '\n能提取图像中的任何可读文本吗?'}, {'from': 'gpt', 'value': 'where \n$\\delta_h$ is the unsmoothed field.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20222/22095 [34:47:25<2:01:38, 3.90s/it] {'loss': 0.2746, 'grad_norm': 0.6014968870783488, 'learning_rate': 1.874652773729424e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20223/22095 [34:47:32<2:24:58, 4.65s/it] {'loss': 0.4705, 'grad_norm': 0.2805350431099635, 'learning_rate': 1.8726651909093675e-07, 'epoch': 0.92}
92%|█████████▏| 20224/22095 [34:47:36<2:27:06, 4.72s/it] {'loss': 0.3217, 'grad_norm': 0.6307605153708684, 'learning_rate': 1.870678642197926e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (92146 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47908 > 40960).
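The paired messages `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` indicate the loader found a conversation with an image attached but no image placeholder in the text, and patched the placeholder in before tokenization. A hedged reconstruction of that repair follows; the `<image>` placeholder string is the Qwen-VL data convention, but the function itself is illustrative, not the repository's actual code:

```python
# Sketch (assumed repair, modeled on the "Fixed image tokens" log message):
# ensure the conversation carries one <image> placeholder per attached image.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversation, num_images):
    """Prepend missing <image> placeholders to the first human turn.

    Assumes fewer placeholders than images, as in the logged case.
    """
    n_found = sum(t["value"].count(IMAGE_TOKEN) for t in conversation)
    if n_found >= num_images:
        return conversation, False
    fixed = [dict(t) for t in conversation]  # shallow copy; input untouched
    for turn in fixed:
        if turn["from"] == "human":
            turn["value"] = IMAGE_TOKEN * (num_images - n_found) + "\n" + turn["value"]
            break
    return fixed, True

conv = [{"from": "human", "value": "Describe the picture."},
        {"from": "gpt", "value": "A cat on a sofa."}]
fixed, changed = fix_image_tokens(conv, num_images=1)
```

A second pass over an already-repaired conversation is a no-op, which matches the log showing the fix firing only once per affected sample.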
Running this sequence through the model will result in indexing errors
92%|█████████▏| 20225/22095 [34:47:40<2:13:06, 4.27s/it] {'loss': 0.2803, 'grad_norm': 0.5446748202273499, 'learning_rate': 1.868693127637783e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [348, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8473529 in VC:s3://internvl-moe-sft-data/. Exception: Image size [348, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20824, 'image': 'vrdu_texteq/astro-ph.CO/7f32fbd8-7d94-4129-a093-17692a843bb6.png', 'image_wh': [[348, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $l_1=220$ and $l_2=240$.'}]}
92%|█████████▏| 20226/22095 [34:47:43<2:01:38, 3.91s/it] {'loss': 0.3142, 'grad_norm': 0.5966799858674203, 'learning_rate': 1.8667086472716034e-07, 'epoch': 0.92}
92%|█████████▏| 20227/22095 [34:47:46<1:52:25, 3.61s/it] {'loss': 0.2786, 'grad_norm': 0.5956319525704, 'learning_rate': 1.8647252011420202e-07, 'epoch': 0.92}
92%|█████████▏| 20228/22095 [34:47:49<1:53:32, 3.65s/it] {'loss': 0.2844, 'grad_norm': 0.5717016703576566, 'learning_rate': 1.8627427892916493e-07, 'epoch': 0.92}
92%|█████████▏| 20229/22095 [34:47:52<1:48:15, 3.48s/it] {'loss': 0.2898, 'grad_norm': 0.670300461314522, 'learning_rate': 1.860761411763107e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (98381 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77525 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20230/22095 [34:47:55<1:42:48, 3.31s/it] {'loss': 0.2834, 'grad_norm': 0.6314533304522649, 'learning_rate': 1.8587810685989528e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (83042 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20231/22095 [34:47:58<1:37:28, 3.14s/it] {'loss': 0.2744, 'grad_norm': 0.6125017548125974, 'learning_rate': 1.856801759841731e-07, 'epoch': 0.92}
92%|█████████▏| 20232/22095 [34:48:01<1:37:58, 3.16s/it] {'loss': 0.2708, 'grad_norm': 0.5840046732133334, 'learning_rate': 1.8548234855339798e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [487, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8415536 in VC:s3://internvl-moe-sft-data/. Exception: Image size [487, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 107156, 'image': 'vrdu_texteq/astro-ph.CO/e08eb43e-f0f6-4678-aa70-1443380d0479.png', 'image_wh': [[487, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'where $M$ is the mass of considered halo.'}]}
92%|█████████▏| 20233/22095 [34:48:11<2:36:26, 5.04s/it] {'loss': 0.4598, 'grad_norm': 0.25643681040427324, 'learning_rate': 1.8528462457182095e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8954482 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 5317, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 6cm\nB. 7cm\nC. 8cm\nD.
5cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 92%|█████████▏| 20234/22095 [34:48:14<2:18:28, 4.46s/it] {'loss': 0.2768, 'grad_norm': 0.6206408904388228, 'learning_rate': 1.8508700404368973e-07, 'epoch': 0.92} 92%|█████████▏| 20234/22095 [34:48:14<2:18:28, 4.46s/it] 92%|█████████▏| 20235/22095 [34:48:17<2:05:28, 4.05s/it] {'loss': 0.3203, 'grad_norm': 0.6378085195757364, 'learning_rate': 1.8488948697325094e-07, 'epoch': 0.92} 92%|█████████▏| 20235/22095 [34:48:17<2:05:28, 4.05s/it] 92%|█████████▏| 20236/22095 [34:48:20<1:59:25, 3.85s/it] {'loss': 0.2854, 'grad_norm': 0.5901667715023341, 'learning_rate': 1.8469207336474893e-07, 'epoch': 0.92} 92%|█████████▏| 20236/22095 [34:48:20<1:59:25, 3.85s/it] 92%|█████████▏| 20237/22095 [34:48:24<2:01:49, 3.93s/it] {'loss': 0.3057, 'grad_norm': 0.6633365915851603, 'learning_rate': 1.8449476322242476e-07, 'epoch': 0.92} 92%|█████████▏| 20237/22095 [34:48:24<2:01:49, 3.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (83105 > 40960). Running this sequence through the model will result in indexing errors 92%|█████████▏| 20238/22095 [34:48:34<2:55:52, 5.68s/it] {'loss': 0.4787, 'grad_norm': 0.2721618197466199, 'learning_rate': 1.8429755655051896e-07, 'epoch': 0.92} 92%|█████████▏| 20238/22095 [34:48:35<2:55:52, 5.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57931 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77730 > 40960). 
Running this sequence through the model will result in indexing errors 92%|█████████▏| 20239/22095 [34:48:38<2:39:29, 5.16s/it] {'loss': 0.2677, 'grad_norm': 0.5846259701182656, 'learning_rate': 1.841004533532681e-07, 'epoch': 0.92} 92%|█████████▏| 20239/22095 [34:48:38<2:39:29, 5.16s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [1334, 12, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8345607 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1334, 12, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 12263, 'image': 'vrdu_table_final_2/astro-ph.CO/8f33418b-d929-47b3-8c04-56a22b2c66f0.png', 'image_wh': [[1334, 12]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{llllllllllllllllllllllllllllllllllllllllllllllll}\n & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & \\\\\n\\hline \\hline\n\\end{tabular}\n```"}]} 92%|█████████▏| 20240/22095 [34:48:42<2:26:48, 4.75s/it] {'loss': 0.2987, 'grad_norm': 0.586431225367831, 'learning_rate': 1.8390345363490713e-07, 'epoch': 0.92} 92%|█████████▏| 20240/22095 [34:48:42<2:26:48, 4.75s/it] 92%|█████████▏| 20241/22095 [34:48:45<2:09:26, 4.19s/it] {'loss': 0.3144, 'grad_norm': 0.5920702643710308, 'learning_rate': 1.8370655739966937e-07, 'epoch': 0.92} 92%|█████████▏| 20241/22095 [34:48:45<2:09:26, 4.19s/it]Token indices sequence length is longer than the specified maximum 
sequence length for this model (42065 > 40960). Running this sequence through the model will result in indexing errors 92%|█████████▏| 20242/22095 [34:48:48<1:59:59, 3.89s/it] {'loss': 0.2833, 'grad_norm': 0.5928483811429326, 'learning_rate': 1.8350976465178693e-07, 'epoch': 0.92} 92%|█████████▏| 20242/22095 [34:48:48<1:59:59, 3.89s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 92%|█████████▏| 20243/22095 [34:48:52<1:57:11, 3.80s/it] {'loss': 0.2931, 'grad_norm': 0.6568539880090193, 'learning_rate': 1.8331307539548593e-07, 'epoch': 0.92} 92%|█████████▏| 20243/22095 [34:48:52<1:57:11, 3.80s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 92%|█████████▏| 20244/22095 [34:48:56<1:58:23, 3.84s/it] {'loss': 0.2672, 'grad_norm': 0.6174884944900461, 'learning_rate': 1.831164896349935e-07, 'epoch': 0.92} 92%|█████████▏| 20244/22095 [34:48:56<1:58:23, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65611 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46937 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57211 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41367 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63268 > 40960). 
Running this sequence through the model will result in indexing errors 92%|█████████▏| 20245/22095 [34:48:58<1:49:57, 3.57s/it] {'loss': 0.2718, 'grad_norm': 0.6719921439413501, 'learning_rate': 1.829200073745341e-07, 'epoch': 0.92} 92%|█████████▏| 20245/22095 [34:48:58<1:49:57, 3.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 92%|█████████▏| 20246/22095 [34:49:07<2:37:52, 5.12s/it] {'loss': 0.4778, 'grad_norm': 0.2804039360211197, 'learning_rate': 1.8272362861832925e-07, 'epoch': 0.92} 92%|█████████▏| 20246/22095 [34:49:07<2:37:52, 5.12s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 92%|█████████▏| 20247/22095 [34:49:17<3:20:42, 6.52s/it] {'loss': 0.4582, 'grad_norm': 0.24948144674247946, 'learning_rate': 1.825273533705979e-07, 'epoch': 0.92} 92%|█████████▏| 20247/22095 [34:49:17<3:20:42, 6.52s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 92%|█████████▏| 20248/22095 [34:49:21<3:00:05, 5.85s/it] {'loss': 0.2726, 'grad_norm': 0.7611009625456361, 'learning_rate': 1.823311816355583e-07, 'epoch': 0.92} 92%|█████████▏| 20248/22095 [34:49:21<3:00:05, 5.85s/it] 92%|█████████▏| 20249/22095 [34:49:25<2:42:54, 5.29s/it] {'loss': 0.2935, 'grad_norm': 0.5962263712466588, 'learning_rate': 1.8213511341742596e-07, 'epoch': 0.92} 92%|█████████▏| 20249/22095 [34:49:25<2:42:54, 5.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (59199 > 40960). 
Running this sequence through the model will result in indexing errors 92%|█████████▏| 20250/22095 [34:49:35<3:19:43, 6.50s/it] {'loss': 0.479, 'grad_norm': 0.2622130822164122, 'learning_rate': 1.819391487204125e-07, 'epoch': 0.92} 92%|█████████▏| 20250/22095 [34:49:35<3:19:43, 6.50s/it] 92%|█████████▏| 20251/22095 [34:49:39<3:02:48, 5.95s/it] {'loss': 0.2505, 'grad_norm': 0.5557890604995867, 'learning_rate': 1.8174328754872906e-07, 'epoch': 0.92} 92%|█████████▏| 20251/22095 [34:49:39<3:02:48, 5.95s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887272 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10425, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nA. 6cm\nB. 1cm\nC. 2cm\nD. 
4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 92%|█████████▏| 20252/22095 [34:49:44<2:49:29, 5.52s/it] {'loss': 0.254, 'grad_norm': 0.5915074391315309, 'learning_rate': 1.815475299065844e-07, 'epoch': 0.92} 92%|█████████▏| 20252/22095 [34:49:44<2:49:29, 5.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 92%|█████████▏| 20253/22095 [34:49:53<3:26:42, 6.73s/it] {'loss': 0.4755, 'grad_norm': 0.2590629272394503, 'learning_rate': 1.8135187579818415e-07, 'epoch': 0.92} 92%|█████████▏| 20253/22095 [34:49:53<3:26:42, 6.73s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65372 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51883 > 40960). Running this sequence through the model will result in indexing errors 92%|█████████▏| 20254/22095 [34:49:57<2:59:47, 5.86s/it] {'loss': 0.2421, 'grad_norm': 0.554163045102261, 'learning_rate': 1.8115632522773375e-07, 'epoch': 0.92} 92%|█████████▏| 20254/22095 [34:49:57<2:59:47, 5.86s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 92%|█████████▏| 20255/22095 [34:50:01<2:37:12, 5.13s/it] {'loss': 0.2769, 'grad_norm': 0.5629796866398182, 'learning_rate': 1.8096087819943376e-07, 'epoch': 0.92} 92%|█████████▏| 20255/22095 [34:50:01<2:37:12, 5.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (46352 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46928 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52010 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58211 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42953 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66979 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49128 > 40960). Running this sequence through the model will result in indexing errors 92%|█████████▏| 20256/22095 [34:50:03<2:15:51, 4.43s/it] {'loss': 0.2869, 'grad_norm': 0.6647392026898864, 'learning_rate': 1.8076553471748304e-07, 'epoch': 0.92} 92%|█████████▏| 20256/22095 [34:50:03<2:15:51, 4.43s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (135787 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (98131 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43115 > 40960). 
Running this sequence through the model will result in indexing errors 92%|█████████▏| 20257/22095 [34:50:06<2:00:25, 3.93s/it] {'loss': 0.2561, 'grad_norm': 0.5923359751380025, 'learning_rate': 1.805702947860799e-07, 'epoch': 0.92} 92%|█████████▏| 20257/22095 [34:50:06<2:00:25, 3.93s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [25, 17, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396970 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 17, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63823, 'image': 'vrdu_table_final_2/astro-ph.EP/f11d136e-7429-4576-9800-28c941d72660.png', 'image_wh': [[25, 17]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}[t]{l}$e_z$\\end{tabular}\n```"}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [198, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8527002 in VC:s3://internvl-moe-sft-data/. 
Exception: Image size [198, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 155407, 'image': 'vrdu_texteq/astro-ph.CO/3d6b548b-9dac-4ddb-8416-016c7b6f34d0.png', 'image_wh': [[198, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'at over $95\\%$ CL.'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [364, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8508425 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [364, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 93018, 'image': 'vrdu_texteq/astro-ph.CO/7bec64a4-bbc2-444a-9e71-f2ff59560150.png', 'image_wh': [[364, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $R$ is the rate defined as'}]}
92%|█████████▏| 20258/22095 [34:50:14<2:33:57, 5.03s/it] {'loss': 0.4584, 'grad_norm': 0.2732673759257525, 'learning_rate': 1.8037515840942043e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20259/22095 [34:50:18<2:30:03, 4.90s/it] {'loss': 0.2926, 'grad_norm': 0.5901968281956451, 'learning_rate': 1.8018012559169573e-07, 'epoch': 0.92}
92%|█████████▏| 20260/22095 [34:50:22<2:21:48, 4.64s/it] {'loss': 0.3034, 'grad_norm': 0.694288842386728, 'learning_rate': 1.7998519633709688e-07, 'epoch': 0.92}
92%|█████████▏| 20261/22095 [34:50:27<2:19:36, 4.57s/it] {'loss': 0.3262, 'grad_norm': 0.581170916707714, 'learning_rate': 1.7979037064981275e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20262/22095 [34:50:38<3:17:41, 6.47s/it] {'loss': 0.4322, 'grad_norm': 0.24995335589526108, 'learning_rate': 1.7959564853403e-07, 'epoch': 0.92}
92%|█████████▏| 20263/22095 [34:50:41<2:51:15, 5.61s/it] {'loss': 0.2856, 'grad_norm': 0.6144727338350052, 'learning_rate': 1.7940102999393194e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (87716 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20264/22095 [34:50:51<3:33:13, 6.99s/it] {'loss': 0.4764, 'grad_norm': 0.25951800013520543, 'learning_rate': 1.7920651503370022e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (42950 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44573 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (99219 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51641 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20265/22095 [34:51:00<3:45:58, 7.41s/it] {'loss': 0.4756, 'grad_norm': 0.2825775327652799, 'learning_rate': 1.7901210365751488e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20266/22095 [34:51:04<3:13:38, 6.35s/it] {'loss': 0.2479, 'grad_norm': 0.5810418071924581, 'learning_rate': 1.7881779586955196e-07, 'epoch': 0.92}
92%|█████████▏| 20267/22095 [34:51:08<2:52:36, 5.67s/it] {'loss': 0.2806, 'grad_norm': 0.6689716280098308, 'learning_rate': 1.7862359167398814e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (62679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95226 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20268/22095 [34:51:12<2:36:48, 5.15s/it] {'loss': 0.2845, 'grad_norm': 0.6429654330586431, 'learning_rate': 1.784294910749962e-07, 'epoch': 0.92}
92%|█████████▏| 20269/22095 [34:51:15<2:21:57, 4.66s/it] {'loss': 0.2493, 'grad_norm': 0.6171635547788606, 'learning_rate': 1.78235494076745e-07, 'epoch': 0.92}
92%|█████████▏| 20270/22095 [34:51:18<2:05:37, 4.13s/it] {'loss': 0.3252, 'grad_norm': 0.6655164567554036, 'learning_rate': 1.7804160068340403e-07, 'epoch': 0.92}
92%|█████████▏| 20271/22095 [34:51:21<1:57:11, 3.86s/it] {'loss': 0.2945, 'grad_norm': 0.5822090955235841, 'learning_rate': 1.7784781089914106e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20272/22095 [34:51:30<2:40:31, 5.28s/it] {'loss': 0.456, 'grad_norm': 0.24552934203250296, 'learning_rate': 1.776541247281177e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047928 in VC:s3://multi-modal/UniGeo/.
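Editor's note: the recurring "Image size [...] is too small. Minimum size is 28" failures above all come from samples whose recorded `image_wh` has a side shorter than 28 pixels. A minimal pre-filter sketch that would drop such samples before training; `MIN_SIZE` and `is_valid_sample` are hypothetical names, not part of `data_qwen_2.py`:

```python
# Hypothetical pre-filter for the "Image size ... is too small" errors in this log.
MIN_SIZE = 28  # minimum side length accepted by the loader, per the ValueError messages above

def is_valid_sample(sample: dict) -> bool:
    """Return False if any recorded image width/height is under MIN_SIZE."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIZE or h < MIN_SIZE:
            return False
    return True

# Example: the first failing sample above had image_wh [[487, 23]] (height 23 < 28).
samples = [
    {"image_wh": [[487, 23]]},   # rejected: height below the minimum
    {"image_wh": [[640, 480]]},  # kept (hypothetical valid sample)
]
valid = [s for s in samples if is_valid_sample(s)]
```

Filtering once at dataset-build time would avoid the repeated `[Try #0] Failed to fetch sample` retries seen throughout this run.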
Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 13cm\nB. 11cm\nC. 12cm\nD. 15cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
92%|█████████▏| 20273/22095 [34:51:40<3:19:22, 6.57s/it] {'loss': 0.4819, 'grad_norm': 0.2575428016919689, 'learning_rate': 1.774605421744957e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20274/22095 [34:51:43<2:47:57, 5.53s/it] {'loss': 0.3044, 'grad_norm': 0.5968669926945249, 'learning_rate': 1.7726706324243614e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (46597 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49854 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63940 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50401 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20275/22095 [34:51:47<2:34:44, 5.10s/it] {'loss': 0.3052, 'grad_norm': 0.679367312200526, 'learning_rate': 1.770736879360957e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (47732 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20276/22095 [34:51:53<2:43:48, 5.40s/it] {'loss': 0.4838, 'grad_norm': 0.2811066629242291, 'learning_rate': 1.7688041625962881e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (42867 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82360 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20277/22095 [34:51:56<2:27:15, 4.86s/it] {'loss': 0.2946, 'grad_norm': 0.5880107542504787, 'learning_rate': 1.766872482171883e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (54725 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48141 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43798 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43083 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20278/22095 [34:51:59<2:09:17, 4.27s/it] {'loss': 0.2766, 'grad_norm': 0.6804630857887852, 'learning_rate': 1.7649418381292584e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [678, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8443649 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [678, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 60941, 'image': 'vrdu_texteq/astro-ph.CO/adaf0b6d-2184-4bd6-aaba-9b5680786117.png', 'image_wh': [[678, 23]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': 'where a dot stands for a derivative w.r.t. cosmic time $t$.'}]}
92%|█████████▏| 20279/22095 [34:52:02<1:55:09, 3.80s/it] {'loss': 0.2822, 'grad_norm': 0.7164473666448703, 'learning_rate': 1.7630122305098919e-07, 'epoch': 0.92}
92%|█████████▏| 20280/22095 [34:52:05<1:47:57, 3.57s/it] {'loss': 0.2706, 'grad_norm': 0.5634355600166236, 'learning_rate': 1.7610836593552394e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045965 in VC:s3://multi-modal/UniGeo/.
Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nA. 4\nB. 6\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
92%|█████████▏| 20281/22095 [34:52:10<1:57:38, 3.89s/it] {'loss': 0.2619, 'grad_norm': 0.6060948444999675, 'learning_rate': 1.7591561247067513e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20282/22095 [34:52:13<1:47:54, 3.57s/it] {'loss': 0.2884, 'grad_norm': 0.6906994401131011, 'learning_rate': 1.7572296266058274e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20283/22095 [34:52:19<2:16:20, 4.51s/it] {'loss': 0.4929, 'grad_norm': 0.3472640325425808, 'learning_rate': 1.7553041650938797e-07, 'epoch': 0.92}
92%|█████████▏| 20284/22095 [34:52:29<3:01:20, 6.01s/it] {'loss': 0.4658, 'grad_norm': 0.2600903816186952, 'learning_rate': 1.7533797402122743e-07, 'epoch': 0.92}
92%|█████████▏| 20285/22095 [34:52:37<3:20:21, 6.64s/it] {'loss': 0.4415, 'grad_norm': 0.28327782277311986, 'learning_rate': 1.7514563520023565e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (89868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113826 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (144817 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20286/22095 [34:52:46<3:44:43, 7.45s/it] {'loss': 0.4762, 'grad_norm': 0.26357230226310807, 'learning_rate': 1.749534000505454e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [720, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8475827 in VC:s3://internvl-moe-sft-data/.
Exception: Image size [720, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55436, 'image': 'vrdu_texteq/astro-ph.CO/e8a59d87-050c-42b9-bb87-24db8ca368e2.png', 'image_wh': [[720, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'The above relation works when $0.3 < \\Delta < 1$ and $0 < z < 2$.'}]} 92%|█████████▏| 20287/22095 [34:52:50<3:09:49, 6.30s/it] {'loss': 0.2731, 'grad_norm': 0.6251890113242208, 'learning_rate': 1.747612685762884e-07, 'epoch': 0.92} 92%|█████████▏| 20287/22095 [34:52:50<3:09:49, 6.30s/it] 92%|█████████▏| 20288/22095 [34:52:54<2:50:20, 5.66s/it] {'loss': 0.2898, 'grad_norm': 0.5775415680510588, 'learning_rate': 1.7456924078159187e-07, 'epoch': 0.92} 92%|█████████▏| 20288/22095 [34:52:54<2:50:20, 5.66s/it] 92%|█████████▏| 20289/22095 [34:52:57<2:30:10, 4.99s/it] {'loss': 0.2839, 'grad_norm': 0.5923941246023781, 'learning_rate': 1.7437731667058143e-07, 'epoch': 0.92} 92%|█████████▏| 20289/22095 [34:52:57<2:30:10, 4.99s/it] 92%|█████████▏| 20290/22095 [34:53:01<2:13:26, 4.44s/it] {'loss': 0.3324, 'grad_norm': 0.8476013350525512, 'learning_rate': 1.7418549624738213e-07, 'epoch': 0.92} 92%|█████████▏| 20290/22095 [34:53:01<2:13:26, 4.44s/it] 92%|█████████▏| 20291/22095 [34:53:05<2:09:45, 4.32s/it] {'loss': 0.2754, 'grad_norm': 0.6336776573796626, 'learning_rate': 1.7399377951611563e-07, 'epoch': 0.92} 92%|█████████▏| 20291/22095 [34:53:05<2:09:45, 4.32s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41376 > 40960). 
Running this sequence through the model will result in indexing errors
92%|█████████▏| 20292/22095 [34:53:08<2:02:54, 4.09s/it] {'loss': 0.3239, 'grad_norm': 0.6547617577821889, 'learning_rate': 1.7380216648090087e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7805765 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [28, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '27105', 'image': '51668.jpg', 'image_wh': [[28, 20]], 'conversations': [{'from': 'human', 'value': '\n Here is the caption I wrote for the image.\nThe image showcases a singular letter "X" in an elegant, cursive script. This character is crimson and placed against a rich, midnight blue backdrop. The script is ornate, resembling a stylized calligraphic font, with its fluid strokes and embellishments that imbue the "X" with a sense of artistic flair and historical resonance.\n\nThe letter "X" is crafted from two curved lines intersecting at their midpoints. Each line bears intricate swirls at the ends, giving it a romantic, almost Gothic appearance. The overall design is elaborate, with delicate embellishments and a nuanced color palette that draws attention to the letter\'s form and meaning. The placement of the character is strategic, indicating a sense of symmetry and purpose.\n\nGiven the absence of supplementary objects or details, the letter "X" can be understood in various contexts, such as denoting an unknown variable in mathematics, marking a spot in a treasure map, representing a crossing or intersection, or even serving as a symbol in ancient languages.\n\n### Analysis:\nThe letter "X" pervades numerous domains:\n- **In Literature**: X often stands for an unknown quantity or a placeholder for information yet to be revealed, sparking intrigue and mystery.\n- **In Cartography**: X marks the spot on treasure maps, guiding adventurers to hidden riches, a tradition that has fascinated generations.\n- **In Science**: In the field of genetics, the X chromosome plays a crucial role in determining sex, being present in all females and half of all males.\n- **In Mythology**: Certain mythological tales revolve around an "X" element, such as the intersection of fate or a pivotal moment where paths cross, leading to pivotal outcomes.\n\n### Example Question and Answer:\n**Question**: Given the visual depiction, what might the letter "X" represent?\n**Answer (CoT)**:\n1. We acknowledge that the illustration features the letter "X".\n2. The letter\'s significance can vary extensively, contingent on the environment in which it is employed.\n3. In literature, it often conceals a secret or an unknown element.\n4. In cartography, it symbolizes the coveted location of buried treasure.\n5. In science, it denotes the X chromosome, central to genetic studies.\n6. Mythologically, it might signify a crossroads or a fateful encounter.\n \nHence, the meaning of "X" is contextually driven, and without further context from the illustration, it could embody any of these interpretations.\nPlease assess if my caption is accurate, and if it’s not, please point out all the errors.'}, {'from': 'gpt', 'value': 'The description you provided for the letter "X" is quite detailed and elaborate, demonstrating an understanding of the visual elements and symbolic potential of the character. However, since the image actually presents the number "11," your description is entirely misplaced. Here are the errors:\n\n1. **Misidentification of the Subject**: The description focuses on a letter "X" when the image shows the number "11."\n2. **Inapplicable Attributes**: Attributes like "crimson," "midnight blue backdrop," and "cursive script" do not apply to the number "11" in a bold, serif typeface.\n3. **Incorrect Symbolic Interpretations**: The various symbolic interpretations given for "X" are not relevant to the numeral "11."\n4. **Misleading Contextual Analysis**: The analysis provided is tailored to the letter "X" and does not apply to the numeral "11" or its significance in mathematics, timekeeping, sports, or history.\n\nFor the correct image (number "11"), ensure the description matches the actual visual elements and symbolic meanings associated with the numeral.'}]}
92%|█████████▏| 20293/22095 [34:53:11<1:55:36, 3.85s/it] {'loss': 0.2698, 'grad_norm': 0.5762644864125279, 'learning_rate': 1.7361065714585458e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20294/22095 [34:53:21<2:46:33, 5.55s/it] {'loss': 0.4401, 'grad_norm': 0.26382957927102185, 'learning_rate': 1.734192515150923e-07, 'epoch': 0.92}
92%|█████████▏| 20295/22095 [34:53:25<2:34:49, 5.16s/it] {'loss': 0.3416, 'grad_norm': 0.6306528339851688, 'learning_rate': 1.732279495927264e-07, 'epoch': 0.92}
92%|█████████▏| 20296/22095 [34:53:28<2:15:49, 4.53s/it] {'loss': 0.2814, 'grad_norm': 0.587214543222049, 'learning_rate': 1.730367513828679e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (91041 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57198 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53100 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48292 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20297/22095 [34:53:32<2:04:51, 4.17s/it] {'loss': 0.2797, 'grad_norm': 0.5364668170202281, 'learning_rate': 1.7284565688962474e-07, 'epoch': 0.92}
92%|█████████▏| 20298/22095 [34:53:35<1:55:52, 3.87s/it] {'loss': 0.2945, 'grad_norm': 0.5879666024093826, 'learning_rate': 1.7265466611710248e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20299/22095 [34:53:38<1:49:14, 3.65s/it] {'loss': 0.3117, 'grad_norm': 0.7149839138267361, 'learning_rate': 1.7246377906940503e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20300/22095 [34:53:47<2:40:42, 5.37s/it] {'loss': 0.474, 'grad_norm': 0.27485696126433967, 'learning_rate': 1.7227299575063528e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367287 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34035, 'image': 'vrdu_table_final_2/astro-ph.CO/ed1b9ac4-b670-4037-8637-165435a3b00d.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
92%|█████████▏| 20301/22095 [34:53:51<2:23:03, 4.78s/it] {'loss': 0.2535, 'grad_norm': 0.6842081565564186, 'learning_rate': 1.7208231616489156e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20302/22095 [34:53:54<2:05:25, 4.20s/it] {'loss': 0.3039, 'grad_norm': 0.9067952840010864, 'learning_rate': 1.7189174031627064e-07, 'epoch': 0.92}
92%|█████████▏| 20303/22095 [34:53:57<1:56:04, 3.89s/it] {'loss': 0.2834, 'grad_norm': 0.6247776678453516, 'learning_rate': 1.7170126820886755e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047577 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2\nB. 2.5\nC. 4.5\nD. 7'}, {'from': 'gpt', 'value': '【解答】解:∵AB=9,BC=5,∴AC=AB+BC=14,∵D为线段AC的中点,∴AD=\\frac{1}{2}AC=7,'}]}
92%|█████████▏| 20304/22095 [34:54:00<1:53:30, 3.80s/it] {'loss': 0.265, 'grad_norm': 0.6025353886341012, 'learning_rate': 1.7151089984677684e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20305/22095 [34:54:10<2:42:54, 5.46s/it] {'loss': 0.4568, 'grad_norm': 0.2846579159301575, 'learning_rate': 1.713206352340857e-07, 'epoch': 0.92}
92%|█████████▏| 20306/22095 [34:54:13<2:23:15, 4.80s/it] {'loss': 0.267, 'grad_norm': 0.6058765647167526, 'learning_rate': 1.7113047437488373e-07, 'epoch': 0.92}
92%|█████████▏| 20307/22095 [34:54:17<2:13:23, 4.48s/it] {'loss': 0.2987, 'grad_norm': 0.6440807707619911, 'learning_rate': 1.7094041727325817e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (41123 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (59941 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116675 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20308/22095 [34:54:20<1:59:07, 4.00s/it] {'loss': 0.2649, 'grad_norm': 0.6087728665863835, 'learning_rate': 1.7075046393329132e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (45850 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20309/22095 [34:54:23<1:51:22, 3.74s/it] {'loss': 0.3285, 'grad_norm': 0.6629216048938273, 'learning_rate': 1.705606143590649e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8931434 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54587, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nA. 6\nB. 8\nC. 10\nD. 12'}, {'from': 'gpt', 'value': '【解答】解:由题意可知AC=\\frac{1}{4}AB,AD=\\frac{1}{2}AB而CD=AD-AC∴CD=\\frac{1}{2}AB-\\frac{1}{4}AB=2∴\\frac{1}{4}AB=2∴AB=8'}]}
92%|█████████▏| 20310/22095 [34:54:26<1:46:30, 3.58s/it] {'loss': 0.2972, 'grad_norm': 0.6432855564190427, 'learning_rate': 1.7037086855465902e-07, 'epoch': 0.92}
92%|█████████▏| 20311/22095 [34:54:29<1:42:08, 3.44s/it] {'loss': 0.3088, 'grad_norm': 0.5504870130107095, 'learning_rate': 1.7018122652414926e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8906520 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 29673, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图所示,C点为AB段中点,D点为BC段中点,AB=20cm,则AD段等于()\nA. 5cm\nB. 15cm\nC. 16cm\nD. 10cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
92%|█████████▏| 20312/22095 [34:54:32<1:38:54, 3.33s/it] {'loss': 0.2958, 'grad_norm': 0.6063378414760505, 'learning_rate': 1.6999168827161182e-07, 'epoch': 0.92}
92%|█████████▏| 20313/22095 [34:54:36<1:43:22, 3.48s/it] {'loss': 0.2869, 'grad_norm': 0.5759916244656855, 'learning_rate': 1.6980225380111904e-07, 'epoch': 0.92}
92%|█████████▏| 20314/22095 [34:54:39<1:38:43, 3.33s/it] {'loss': 0.2959, 'grad_norm': 0.591904167150174, 'learning_rate': 1.6961292311674037e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20315/22095 [34:54:43<1:43:56, 3.50s/it] {'loss': 0.2788, 'grad_norm': 0.6214481185267032, 'learning_rate': 1.6942369622254428e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (58015 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48084 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47436 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129638 > 40960).
Running this sequence through the model will result in indexing errors
92%|█████████▏| 20316/22095 [34:54:52<2:37:24, 5.31s/it] {'loss': 0.4749, 'grad_norm': 0.2848607056517523, 'learning_rate': 1.692345731225975e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (58464 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68151 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20317/22095 [34:55:00<2:55:43, 5.93s/it] {'loss': 0.4568, 'grad_norm': 0.25736908411097426, 'learning_rate': 1.6904555382096343e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20318/22095 [34:55:03<2:34:30, 5.22s/it] {'loss': 0.2423, 'grad_norm': 0.6279904182505865, 'learning_rate': 1.6885663832170274e-07, 'epoch': 0.92}
92%|█████████▏| 20319/22095 [34:55:07<2:24:01, 4.87s/it] {'loss': 0.2607, 'grad_norm': 0.5400609614569958, 'learning_rate': 1.686678266288755e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914370 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37523, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,D点为AB段中点,C点为AD段中点,AB=16cm,则CD段=cm。(一)\nA. 2\nB. 4\nC. 8\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由点D是线段AB的中点,得AD=\\frac{1}{2}AB=\\frac{1}{2}×16=8cm,由C是线段AD的中点,得CD=\\frac{1}{2}AD=\\frac{1}{2}×8=4cm.'}]}
92%|█████████▏| 20320/22095 [34:55:11<2:17:24, 4.64s/it] {'loss': 0.2651, 'grad_norm': 0.5738164996555193, 'learning_rate': 1.6847911874653843e-07, 'epoch': 0.92}
92%|█████████▏| 20321/22095 [34:55:15<2:04:56, 4.23s/it] {'loss': 0.2998, 'grad_norm': 0.6456818098102596, 'learning_rate': 1.6829051467874613e-07, 'epoch': 0.92}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
92%|█████████▏| 20322/22095 [34:55:18<1:56:01, 3.93s/it] {'loss': 0.3245, 'grad_norm': 0.6348852708401035, 'learning_rate': 1.6810201442955087e-07, 'epoch': 0.92}
92%|█████████▏| 20323/22095 [34:55:21<1:51:47, 3.79s/it] {'loss': 0.2987, 'grad_norm': 0.6234987326189086, 'learning_rate': 1.6791361800300386e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20324/22095 [34:55:30<2:33:00, 5.18s/it] {'loss': 0.4803, 'grad_norm': 0.46159750163370644, 'learning_rate': 1.6772532540315188e-07, 'epoch': 0.92}
92%|█████████▏| 20325/22095 [34:55:40<3:21:22, 6.83s/it] {'loss': 0.472, 'grad_norm': 0.27836394932852804, 'learning_rate': 1.6753713663404224e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20326/22095 [34:55:44<2:51:50, 5.83s/it] {'loss': 0.2471, 'grad_norm': 0.6227338739189778, 'learning_rate': 1.6734905169971782e-07, 'epoch': 0.92}
92%|█████████▏| 20327/22095 [34:55:47<2:29:03, 5.06s/it] {'loss': 0.2589, 'grad_norm': 0.6299923273196348, 'learning_rate': 1.671610706042187e-07, 'epoch': 0.92}
92%|█████████▏| 20328/22095 [34:55:51<2:14:12, 4.56s/it] {'loss': 0.2706, 'grad_norm': 0.5676574324783089, 'learning_rate': 1.6697319335158613e-07, 'epoch': 0.92}
92%|█████████▏| 20329/22095 [34:55:55<2:08:50, 4.38s/it] {'loss': 0.2926, 'grad_norm': 0.6999647644794571, 'learning_rate': 1.6678541994585629e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047926 in VC:s3://multi-modal/UniGeo/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 11cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
92%|█████████▏| 20330/22095 [34:55:58<2:01:36, 4.13s/it] {'loss': 0.3487, 'grad_norm': 0.643394225929882, 'learning_rate': 1.665977503910632e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (93570 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79567 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20331/22095 [34:56:01<1:51:38, 3.80s/it] {'loss': 0.2489, 'grad_norm': 0.5346259476521655, 'learning_rate': 1.664101846912397e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20332/22095 [34:56:04<1:44:31, 3.56s/it] {'loss': 0.2456, 'grad_norm': 0.625222026956306, 'learning_rate': 1.6622272285041652e-07, 'epoch': 0.92}
92%|█████████▏| 20333/22095 [34:56:07<1:41:36, 3.46s/it] {'loss': 0.2566, 'grad_norm': 0.7322590324640349, 'learning_rate': 1.6603536487262095e-07, 'epoch': 0.92}
92%|█████████▏| 20334/22095 [34:56:11<1:44:24, 3.56s/it] {'loss': 0.2397, 'grad_norm': 0.5569926068928689, 'learning_rate': 1.658481107618798e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20335/22095 [34:56:14<1:39:01, 3.38s/it] {'loss': 0.2932, 'grad_norm': 0.6147399641282735, 'learning_rate': 1.6566096052221482e-07, 'epoch': 0.92}
92%|█████████▏| 20336/22095 [34:56:18<1:40:16, 3.42s/it] {'loss': 0.3123, 'grad_norm': 0.639177648247498, 'learning_rate': 1.6547391415764836e-07, 'epoch': 0.92}
92%|█████████▏| 20337/22095 [34:56:22<1:47:56, 3.68s/it] {'loss': 0.3208, 'grad_norm': 0.6068624796114247, 'learning_rate': 1.652869716722e-07, 'epoch': 0.92}
92%|█████████▏| 20338/22095 [34:56:25<1:44:19, 3.56s/it] {'loss': 0.3149, 'grad_norm': 0.762520313233785, 'learning_rate': 1.6510013306988538e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20339/22095 [34:56:30<1:54:27, 3.91s/it] {'loss': 0.474, 'grad_norm': 0.29952288591201587, 'learning_rate': 1.6491339835471964e-07, 'epoch': 0.92}
92%|█████████▏| 20340/22095 [34:56:34<1:52:47, 3.86s/it] {'loss': 0.2757, 'grad_norm': 0.6158717020716946, 'learning_rate': 1.6472676753071516e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8950723 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [232, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1558, 'image': 'images/4917.png', 'image_wh': [[232, 24]], 'conversations': [{'from': 'human', 'value': '\n如图,已知点C为线段AB上一点,AC=12cm,CB=\\frac{2}{3}AC,D、E分别为AC、AB的中点,则DE的长是()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20341/22095 [34:56:38<1:53:58, 3.90s/it] {'loss': 0.3228, 'grad_norm': 0.610806311783428, 'learning_rate': 1.6454024060188257e-07, 'epoch': 0.92}
92%|█████████▏| 20342/22095 [34:56:42<1:59:14, 4.08s/it] {'loss': 0.3003, 'grad_norm': 0.6269652583233704, 'learning_rate': 1.6435381757222869e-07, 'epoch': 0.92}
92%|█████████▏| 20343/22095 [34:56:46<1:58:05, 4.04s/it] {'loss': 0.2882, 'grad_norm': 0.6759177811930258, 'learning_rate': 1.6416749844575974e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (139918 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57317 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20344/22095 [34:56:49<1:47:05, 3.67s/it] {'loss': 0.2816, 'grad_norm': 0.631666234340163, 'learning_rate': 1.6398128322647865e-07, 'epoch': 0.92}
92%|█████████▏| 20345/22095 [34:56:53<1:49:20, 3.75s/it] {'loss': 0.2593, 'grad_norm': 0.6173364819027063, 'learning_rate': 1.6379517191838777e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (75644 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20346/22095 [34:56:56<1:43:15, 3.54s/it] {'loss': 0.2865, 'grad_norm': 0.5951260182263146, 'learning_rate': 1.636091645254856e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (43887 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41063 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50032 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20347/22095 [34:57:00<1:46:09, 3.64s/it] {'loss': 0.2656, 'grad_norm': 0.5968563815932153, 'learning_rate': 1.634232610517683e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (76844 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47016 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20348/22095 [34:57:03<1:39:47, 3.43s/it] {'loss': 0.276, 'grad_norm': 0.5688768727910141, 'learning_rate': 1.6323746150123e-07, 'epoch': 0.92}
92%|█████████▏| 20349/22095 [34:57:06<1:37:06, 3.34s/it] {'loss': 0.2923, 'grad_norm': 0.602614942186902, 'learning_rate': 1.6305176587786465e-07, 'epoch': 0.92}
92%|█████████▏| 20350/22095 [34:57:09<1:36:00, 3.30s/it] {'loss': 0.2828, 'grad_norm': 0.8071633035076183, 'learning_rate': 1.628661741856613e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (57798 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20351/22095 [34:57:13<1:39:37, 3.43s/it] {'loss': 0.2556, 'grad_norm': 0.5866499802529012, 'learning_rate': 1.6268068642860735e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (41187 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56628 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57362 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47386 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20352/22095 [34:57:20<2:12:56, 4.58s/it] {'loss': 0.466, 'grad_norm': 0.25558126830066286, 'learning_rate': 1.6249530261068903e-07, 'epoch': 0.92}
92%|█████████▏| 20353/22095 [34:57:27<2:35:09, 5.34s/it] {'loss': 0.4592, 'grad_norm': 0.25617421405222546, 'learning_rate': 1.623100227358887e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8341062 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 7707, 'image': 'vrdu_table_final_2/astro-ph.CO/94529a1c-bdb1-435b-992c-ec9f841ce93e.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease process the table in the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll process the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
92%|█████████▏| 20354/22095 [34:57:31<2:19:50, 4.82s/it] {'loss': 0.2959, 'grad_norm': 0.6722170018272257, 'learning_rate': 1.621248468081893e-07, 'epoch': 0.92}
92%|█████████▏| 20355/22095 [34:57:42<3:13:35, 6.68s/it] {'loss': 0.4913, 'grad_norm': 0.2676472468700252, 'learning_rate': 1.619397748315682e-07, 'epoch': 0.92}
92%|█████████▏| 20356/22095 [34:57:53<3:50:55, 7.97s/it] {'loss': 0.4517, 'grad_norm': 0.269768813951422, 'learning_rate': 1.6175480681000167e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20357/22095 [34:57:57<3:19:47, 6.90s/it] {'loss': 0.3, 'grad_norm': 0.6520827900369393, 'learning_rate': 1.6156994274746484e-07, 'epoch': 0.92}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (94609968 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
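The `DecompressionBombWarning` above fires because one training image decodes to 94,609,968 pixels, just over Pillow's default `MAX_IMAGE_PIXELS` threshold of 89,478,485. For trusted training data the limit can be raised, and downscaling oversized images on load keeps them from blowing up memory anyway. A minimal sketch (the 120M threshold and `cap_resolution` helper are illustrative choices, not the training code's actual handling):

```python
from PIL import Image

# Trusted data: raise Pillow's decompression-bomb threshold above the largest
# image observed in the log (94,609,968 px). Setting it to None disables the
# check entirely, which is riskier if any inputs are untrusted.
Image.MAX_IMAGE_PIXELS = 120_000_000

def cap_resolution(img: Image.Image, max_side: int = 4096) -> Image.Image:
    """Downscale an image so its longest side is at most `max_side` pixels."""
    w, h = img.size
    if max(w, h) <= max_side:
        return img
    scale = max_side / max(w, h)
    return img.resize((max(1, round(w * scale)), max(1, round(h * scale))),
                      Image.BICUBIC)
```

For a vision-language model the processor will rescale images to its own token budget regardless, so capping at load time mainly protects the dataloader from pathological decode sizes.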
  warnings.warn(
92%|█████████▏| 20358/22095 [34:58:01<2:52:48, 5.97s/it] {'loss': 0.3292, 'grad_norm': 0.6370859639103148, 'learning_rate': 1.613851826479307e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20359/22095 [34:58:10<3:21:18, 6.96s/it] {'loss': 0.4813, 'grad_norm': 0.2639862715381424, 'learning_rate': 1.6120052651536766e-07, 'epoch': 0.92}
92%|█████████▏| 20360/22095 [34:58:21<3:51:09, 7.99s/it] {'loss': 0.4575, 'grad_norm': 0.2595331991104463, 'learning_rate': 1.6101597435374428e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20361/22095 [34:58:26<3:25:37, 7.12s/it] {'loss': 0.3012, 'grad_norm': 0.6429967201358898, 'learning_rate': 1.6083152616702512e-07, 'epoch': 0.92}
92%|█████████▏| 20362/22095 [34:58:30<3:01:26, 6.28s/it] {'loss': 0.3135, 'grad_norm': 0.5585878017776091, 'learning_rate': 1.606471819591754e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (86015 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42202 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56967 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20363/22095 [34:58:33<2:35:32, 5.39s/it] {'loss': 0.2622, 'grad_norm': 0.5677551745022221, 'learning_rate': 1.604629417341541e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20364/22095 [34:58:36<2:15:09, 4.68s/it] {'loss': 0.2885, 'grad_norm': 0.6363566209381748, 'learning_rate': 1.6027880549592033e-07, 'epoch': 0.92}
92%|█████████▏| 20365/22095 [34:58:43<2:27:35, 5.12s/it] {'loss': 0.3059, 'grad_norm': 0.7943373394647701, 'learning_rate': 1.6009477324843204e-07, 'epoch': 0.92}
92%|█████████▏| 20366/22095 [34:58:47<2:25:32, 5.05s/it] {'loss': 0.2455, 'grad_norm': 0.5412161990938706, 'learning_rate': 1.59910844995641e-07, 'epoch': 0.92}
92%|█████████▏| 20367/22095 [34:58:51<2:12:12, 4.59s/it] {'loss': 0.2965, 'grad_norm': 0.628707186222274, 'learning_rate': 1.5972702074150194e-07, 'epoch': 0.92}
92%|█████████▏| 20368/22095 [34:58:55<2:07:26, 4.43s/it] {'loss': 0.2751, 'grad_norm': 0.671824730233963, 'learning_rate': 1.5954330048996326e-07, 'epoch': 0.92}
92%|█████████▏| 20369/22095 [34:58:58<1:53:22, 3.94s/it] {'loss': 0.2822, 'grad_norm': 0.5892389520276957, 'learning_rate': 1.5935968424497184e-07, 'epoch': 0.92}
92%|█████████▏| 20370/22095 [34:59:01<1:45:48, 3.68s/it] {'loss': 0.2617, 'grad_norm': 0.5617899868690214, 'learning_rate': 1.5917617201047508e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20371/22095 [34:59:11<2:43:53, 5.70s/it] {'loss': 0.4458, 'grad_norm': 0.27106261944272375, 'learning_rate': 1.589927637904143e-07, 'epoch': 0.92}
92%|█████████▏| 20372/22095 [34:59:15<2:27:21, 5.13s/it] {'loss': 0.2717, 'grad_norm': 0.6156014604944067, 'learning_rate': 1.5880945958873073e-07, 'epoch': 0.92}
92%|█████████▏| 20373/22095 [34:59:19<2:18:22, 4.82s/it] {'loss': 0.322, 'grad_norm': 0.6014911690767994, 'learning_rate': 1.586262594093635e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (81490 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74626 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (113138 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50486 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47519 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (44428 > 40960) for 4 sample(s). Truncating to 3468 with 3 samples.
Token indices sequence length is longer than the specified maximum sequence length for this model (93868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (117034 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20374/22095 [34:59:23<2:08:01, 4.46s/it] {'loss': 0.2976, 'grad_norm': 0.6133283076051357, 'learning_rate': 1.5844316325624887e-07, 'epoch': 0.92}
92%|█████████▏| 20375/22095 [34:59:26<1:55:04, 4.01s/it] {'loss': 0.2794, 'grad_norm': 0.6033457098179053, 'learning_rate': 1.5826017113332148e-07, 'epoch': 0.92}
92%|█████████▏| 20376/22095 [34:59:29<1:52:06, 3.91s/it] {'loss': 0.2638, 'grad_norm': 0.5862417085140412, 'learning_rate': 1.580772830445121e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (45235 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20377/22095 [34:59:34<1:54:20, 3.99s/it] {'loss': 0.2529, 'grad_norm': 0.5840447890147605, 'learning_rate': 1.5789449899375086e-07, 'epoch': 0.92}
92%|█████████▏| 20378/22095 [34:59:38<1:56:17, 4.06s/it] {'loss': 0.2991, 'grad_norm': 0.5688909679197075, 'learning_rate': 1.5771181898496578e-07, 'epoch': 0.92}
92%|█████████▏| 20379/22095 [34:59:41<1:51:36, 3.90s/it] {'loss': 0.2936, 'grad_norm': 0.724734229839457, 'learning_rate': 1.5752924302208206e-07, 'epoch': 0.92}
92%|█████████▏| 20380/22095 [34:59:45<1:45:49, 3.70s/it] {'loss': 0.3022, 'grad_norm': 0.5286664324455521, 'learning_rate': 1.573467711090221e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (43663 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86596 > 40960).
Running this sequence through the model will result in indexing errors
92%|█████████▏| 20381/22095 [34:59:47<1:37:58, 3.43s/it] {'loss': 0.2745, 'grad_norm': 0.6311872016999195, 'learning_rate': 1.5716440324970716e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20382/22095 [34:59:57<2:29:10, 5.23s/it] {'loss': 0.4545, 'grad_norm': 0.2896336364166879, 'learning_rate': 1.5698213944805528e-07, 'epoch': 0.92}
92%|█████████▏| 20383/22095 [35:00:04<2:46:55, 5.85s/it] {'loss': 0.4907, 'grad_norm': 0.26109570845484165, 'learning_rate': 1.5679997970798333e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
92%|█████████▏| 20384/22095 [35:00:07<2:25:19, 5.10s/it] {'loss': 0.2804, 'grad_norm': 0.5823890638539553, 'learning_rate': 1.566179240334048e-07, 'epoch': 0.92}
92%|█████████▏| 20385/22095 [35:00:14<2:36:47, 5.50s/it] {'loss': 0.4458, 'grad_norm': 0.24110901552768604, 'learning_rate': 1.564359724282316e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20386/22095 [35:00:17<2:16:40, 4.80s/it] {'loss': 0.268, 'grad_norm': 0.6395430045112075, 'learning_rate': 1.5625412489637337e-07, 'epoch': 0.92}
92%|█████████▏| 20387/22095 [35:00:21<2:05:11, 4.40s/it] {'loss': 0.2735, 'grad_norm': 0.5978845010368887, 'learning_rate': 1.5607238144173864e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8523735 in VC:s3://internvl-moe-sft-data/. Exception: Image size [387, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19888, 'image': 'vrdu_texteq/astro-ph.CO/c8be1e08-0859-4e31-9f9c-e2678b7dad1e.png', 'image_wh': [[387, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'and $\\Delta_n = \\Delta$ as defined in Ref.~.'}]}
92%|█████████▏| 20388/22095 [35:00:24<1:55:12, 4.05s/it] {'loss': 0.2703, 'grad_norm': 0.6037538312066337, 'learning_rate': 1.5589074206823096e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20389/22095 [35:00:32<2:27:00, 5.17s/it] {'loss': 0.4569, 'grad_norm': 0.27980886714460196, 'learning_rate': 1.5570920677975276e-07, 'epoch': 0.92}
92%|█████████▏| 20390/22095 [35:00:35<2:12:43, 4.67s/it] {'loss': 0.2486, 'grad_norm': 0.6337457442557619, 'learning_rate': 1.5552777558020594e-07, 'epoch': 0.92}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [1303, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8474350 in VC:s3://internvl-moe-sft-data/. Exception: Image size [1303, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6164, 'image': 'vrdu_texteq/astro-ph.CO/4f7a1cb0-db81-47b4-90bb-ad1aa459245c.png', 'image_wh': [[1303, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease analyze the image and retrieve any textual information that is discernible.'}, {'from': 'gpt', 'value': 'Our initial condition leads to a timescale for KHI of $\\tau_{\\rm KHI}=2.5$ and we run simulations for $t=2 \\tau_{\\rm KHI}=5.0$.'}]}
92%|█████████▏| 20391/22095 [35:00:39<2:03:37, 4.35s/it] {'loss': 0.2934, 'grad_norm': 0.6288532464939638, 'learning_rate': 1.5534644847348957e-07, 'epoch': 0.92}
92%|█████████▏| 20392/22095 [35:00:42<1:52:09, 3.95s/it] {'loss': 0.2449, 'grad_norm': 0.6810993647037666, 'learning_rate': 1.5516522546349833e-07, 'epoch': 0.92}
92%|█████████▏| 20393/22095 [35:00:46<1:58:46, 4.19s/it] {'loss': 0.283, 'grad_norm': 0.6479504141767006, 'learning_rate': 1.5498410655412577e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20394/22095 [35:00:49<1:49:00, 3.85s/it] {'loss': 0.3113, 'grad_norm': 0.6164855382342345, 'learning_rate': 1.5480309174926544e-07, 'epoch': 0.92}
92%|█████████▏| 20395/22095 [35:00:52<1:41:54, 3.60s/it] {'loss': 0.2923, 'grad_norm': 0.8744004839033451, 'learning_rate': 1.5462218105280535e-07, 'epoch': 0.92}
92%|█████████▏| 20396/22095 [35:00:55<1:35:13, 3.36s/it] {'loss':
0.3171, 'grad_norm': 0.6521839231619401, 'learning_rate': 1.544413744686335e-07, 'epoch': 0.92}
92%|█████████▏| 20397/22095 [35:00:59<1:37:28, 3.44s/it] {'loss': 0.3141, 'grad_norm': 0.628945928780969, 'learning_rate': 1.5426067200063454e-07, 'epoch': 0.92}
92%|█████████▏| 20398/22095 [35:01:02<1:37:31, 3.45s/it] {'loss': 0.3094, 'grad_norm': 0.6002713949526557, 'learning_rate': 1.540800736526904e-07, 'epoch': 0.92}
92%|█████████▏| 20399/22095 [35:01:06<1:38:31, 3.49s/it] {'loss': 0.325, 'grad_norm': 0.6054775864880105, 'learning_rate': 1.5389957942868295e-07, 'epoch': 0.92}
92%|█████████▏| 20400/22095 [35:01:10<1:40:18, 3.55s/it] {'loss': 0.2665, 'grad_norm': 0.5423819600754345, 'learning_rate': 1.5371918933249018e-07, 'epoch': 0.92}
92%|█████████▏| 20401/22095 [35:01:13<1:34:13, 3.34s/it] {'loss': 0.2896, 'grad_norm': 0.6114592856681977, 'learning_rate': 1.5353890336798738e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (43632 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20402/22095 [35:01:16<1:39:11, 3.52s/it] {'loss': 0.3219, 'grad_norm': 0.6591721781153652, 'learning_rate': 1.5335872153904863e-07, 'epoch': 0.92}
92%|█████████▏| 20403/22095 [35:01:19<1:35:03, 3.37s/it] {'loss': 0.2773, 'grad_norm': 0.5609042868126081, 'learning_rate': 1.5317864384954527e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (85228 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44761 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41316 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (103445 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56114 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20404/22095 [35:01:23<1:35:55, 3.40s/it] {'loss': 0.273, 'grad_norm': 0.6318151743882193, 'learning_rate': 1.5299867030334815e-07, 'epoch': 0.92}
92%|█████████▏| 20405/22095 [35:01:26<1:36:19, 3.42s/it] {'loss': 0.3142, 'grad_norm': 0.6692743045696299, 'learning_rate': 1.5281880090432245e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (79323 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82882 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86708 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20406/22095 [35:01:30<1:34:59, 3.37s/it] {'loss': 0.2948, 'grad_norm': 0.623888346712142, 'learning_rate': 1.5263903565633342e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20407/22095 [35:01:39<2:23:21, 5.10s/it] {'loss': 0.4559, 'grad_norm': 0.2757973992857609, 'learning_rate': 1.5245937456324468e-07, 'epoch': 0.92}
92%|█████████▏| 20408/22095 [35:01:42<2:08:10, 4.56s/it] {'loss': 0.2783, 'grad_norm': 0.6188098397848397, 'learning_rate': 1.5227981762891586e-07, 'epoch': 0.92}
92%|█████████▏| 20409/22095 [35:01:45<1:56:08, 4.13s/it] {'loss': 0.302, 'grad_norm': 0.8300233252171786, 'learning_rate': 1.5210036485720503e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20410/22095 [35:01:54<2:39:04, 5.66s/it] {'loss': 0.4886, 'grad_norm': 0.37985033345341146, 'learning_rate': 1.5192101625196798e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20411/22095 [35:01:58<2:17:14, 4.89s/it] {'loss': 0.2816, 'grad_norm': 0.5821665503233471, 'learning_rate': 1.517417718170583e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20412/22095 [35:02:07<2:57:33, 6.33s/it] {'loss': 0.4586, 'grad_norm': 0.24585264221427977, 'learning_rate': 1.5156263155632844e-07, 'epoch': 0.92}
92%|█████████▏| 20413/22095 [35:02:11<2:38:14, 5.64s/it] {'loss':
0.2861, 'grad_norm': 0.6593814272013578, 'learning_rate': 1.5138359547362645e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20414/22095 [35:02:21<3:10:51, 6.81s/it] {'loss': 0.4899, 'grad_norm': 0.2898750406110483, 'learning_rate': 1.5120466357279929e-07, 'epoch': 0.92}
92%|█████████▏| 20415/22095 [35:02:25<2:47:01, 5.97s/it] {'loss': 0.2763, 'grad_norm': 0.5845322241114378, 'learning_rate': 1.510258358576916e-07, 'epoch': 0.92}
92%|█████████▏| 20416/22095 [35:02:29<2:29:55, 5.36s/it] {'loss': 0.2856, 'grad_norm': 0.6005937355250502, 'learning_rate': 1.5084711233214699e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (62193 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44879 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42067 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20417/22095 [35:02:32<2:11:09, 4.69s/it] {'loss': 0.2622, 'grad_norm': 0.6675591618962443, 'learning_rate': 1.5066849300000519e-07, 'epoch': 0.92}
92%|█████████▏| 20418/22095 [35:02:35<1:56:40, 4.17s/it] {'loss': 0.3195, 'grad_norm': 0.6381729858413194, 'learning_rate': 1.5048997786510311e-07, 'epoch': 0.92}
92%|█████████▏| 20419/22095 [35:02:39<1:54:47, 4.11s/it] {'loss': 0.2958, 'grad_norm': 0.6069740562512995, 'learning_rate': 1.5031156693127714e-07, 'epoch': 0.92}
92%|█████████▏| 20420/22095 [35:02:43<1:56:52, 4.19s/it] {'loss': 0.3274, 'grad_norm': 0.6311512886772741, 'learning_rate': 1.5013326020236141e-07, 'epoch': 0.92}
92%|█████████▏| 20421/22095 [35:02:46<1:47:28, 3.85s/it] {'loss': 0.2664, 'grad_norm': 0.5988129173441522, 'learning_rate': 1.4995505768218677e-07, 'epoch': 0.92}
92%|█████████▏| 20422/22095 [35:02:50<1:44:58, 3.76s/it] {'loss': 0.2802, 'grad_norm': 0.6004426149550289, 'learning_rate': 1.497769593745818e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20423/22095 [35:02:53<1:41:24, 3.64s/it] {'loss': 0.3328, 'grad_norm': 0.6129639964883378, 'learning_rate': 1.4959896528337402e-07, 'epoch': 0.92}
92%|█████████▏| 20424/22095 [35:02:57<1:39:15, 3.56s/it] {'loss': 0.3161, 'grad_norm': 0.5832031731496322, 'learning_rate': 1.4942107541238705e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (121236 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49613 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20425/22095 [35:03:06<2:28:14, 5.33s/it] {'loss': 0.4766, 'grad_norm': 0.2922199415284455, 'learning_rate': 1.4924328976544446e-07, 'epoch': 0.92}
92%|█████████▏| 20426/22095 [35:03:09<2:12:31, 4.76s/it] {'loss': 0.3016, 'grad_norm': 0.6427471456037717, 'learning_rate': 1.490656083463654e-07, 'epoch': 0.92}
92%|█████████▏| 20427/22095 [35:03:12<1:56:02, 4.17s/it] {'loss': 0.2993, 'grad_norm': 0.6452756589772367, 'learning_rate': 1.4888803115896745e-07, 'epoch': 0.92}
92%|█████████▏| 20428/22095 [35:03:16<1:52:50, 4.06s/it] {'loss': 0.297, 'grad_norm': 0.7535574291069664, 'learning_rate': 1.4871055820706692e-07, 'epoch': 0.92}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
92%|█████████▏| 20429/22095 [35:03:19<1:46:43, 3.84s/it] {'loss': 0.3203, 'grad_norm': 0.619299210313287, 'learning_rate': 1.4853318949447747e-07, 'epoch': 0.92}
92%|█████████▏| 20430/22095 [35:03:23<1:44:36, 3.77s/it] {'loss': 0.2791, 'grad_norm': 0.7475947457906347, 'learning_rate': 1.4835592502500883e-07, 'epoch': 0.92}
Token indices sequence length is longer than the specified maximum sequence length for this model (44299 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47695 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57926 > 40960). Running this sequence through the model will result in indexing errors
92%|█████████▏| 20431/22095 [35:03:26<1:35:50, 3.46s/it] {'loss': 0.2761, 'grad_norm': 0.6180770913988448, 'learning_rate': 1.4817876480247074e-07, 'epoch': 0.92}
92%|█████████▏| 20432/22095 [35:03:29<1:32:54, 3.35s/it] {'loss': 0.2817, 'grad_norm': 0.7096132177751993, 'learning_rate': 1.4800170883066954e-07, 'epoch': 0.92}
Invalidate trace cache @ step 2: expected module 1, but got module 364
92%|█████████▏| 20433/22095 [35:03:38<2:25:11, 5.24s/it] {'loss': 0.4727, 'grad_norm': 0.27086503197020917, 'learning_rate': 1.4782475711341115e-07, 'epoch': 0.92}
92%|█████████▏| 20434/22095 [35:03:42<2:12:04, 4.77s/it] {'loss': 0.2906, 'grad_norm': 0.5942397803198581, 'learning_rate': 1.4764790965449528e-07, 'epoch': 0.92}
92%|█████████▏| 20435/22095 [35:03:46<2:05:32, 4.54s/it] {'loss': 0.2978, 'grad_norm': 0.6921617036728822, 'learning_rate': 1.474711664577233e-07, 'epoch': 0.92}
92%|█████████▏| 20436/22095 [35:03:50<1:57:08, 4.24s/it] {'loss': 0.3047, 'grad_norm': 0.5857174790255871, 'learning_rate': 1.4729452752689277e-07, 'epoch': 0.92}
92%|█████████▏| 20437/22095 [35:03:54<1:56:29, 4.22s/it] {'loss': 0.2796, 'grad_norm': 0.7227079550701959, 'learning_rate': 1.471179928657984e-07, 'epoch': 0.92}
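The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings above come from samples whose tokenized length exceeds the 40960-token limit; the loader's own message ("Truncating to 3468 with 3 samples") suggests it truncates after the fact. A minimal sketch of screening such samples up front, assuming a `tokenize_fn` callable and a `"text"` field that stand in for the real tokenizer and sample schema (neither is from data_qwen_2.py):

```python
# Hedged sketch: partition samples by tokenized length before batching.
# MAX_LEN mirrors the 40960-token limit reported in the warnings above.
MAX_LEN = 40960

def split_by_length(samples, tokenize_fn, max_len=MAX_LEN):
    """Return (kept, overlong): samples at or under max_len, and the rest."""
    kept, overlong = [], []
    for sample in samples:
        n_tokens = len(tokenize_fn(sample["text"]))
        (kept if n_tokens <= max_len else overlong).append(sample)
    return kept, overlong

if __name__ == "__main__":
    # Toy tokenizer for illustration: one token per whitespace-separated word.
    data = [{"text": "short sample"}, {"text": "w " * 50000}]
    kept, overlong = split_by_length(data, str.split)
    print(len(kept), len(overlong))  # 1 1
```

In a real pipeline `tokenize_fn` would be the model's tokenizer (e.g. `tokenizer.encode`), and overlong samples would be truncated or dropped once at preprocessing time rather than warned about on every epoch.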
93%|█████████▎| 20438/22095 [35:03:58<1:53:18, 4.10s/it] {'loss': 0.3013, 'grad_norm': 0.6329769635085243, 'learning_rate': 1.4694156247823387e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20439/22095 [35:04:09<2:54:13, 6.31s/it] {'loss': 0.4498, 'grad_norm': 0.301921818944306, 'learning_rate': 1.4676523636799057e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20440/22095 [35:04:13<2:34:26, 5.60s/it] {'loss': 0.2787, 'grad_norm': 0.5936333054729149, 'learning_rate': 1.4658901453885654e-07, 'epoch': 0.93}
93%|█████████▎| 20441/22095 [35:04:17<2:22:31, 5.17s/it] {'loss': 0.3102, 'grad_norm': 0.6066804280429381, 'learning_rate': 1.464128969946188e-07, 'epoch': 0.93}
93%|█████████▎| 20442/22095 [35:04:22<2:17:37, 5.00s/it] {'loss': 0.2788, 'grad_norm': 0.5606739862421105, 'learning_rate': 1.4623688373906098e-07, 'epoch': 0.93}
93%|█████████▎| 20443/22095 [35:04:26<2:07:44, 4.64s/it] {'loss': 0.3192, 'grad_norm': 0.6343599251414981, 'learning_rate': 1.4606097477596504e-07, 'epoch': 0.93}
93%|█████████▎| 20444/22095 [35:04:29<1:55:53, 4.21s/it] {'loss': 0.3119, 'grad_norm': 0.5491031501485523, 'learning_rate': 1.4588517010911073e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047737 in VC:s3://multi-modal/UniGeo/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 8cm\nB. 5cm\nC. 6cm\nD. 7cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
93%|█████████▎| 20445/22095 [35:04:32<1:48:08, 3.93s/it] {'loss': 0.2711, 'grad_norm': 0.6006706830936251, 'learning_rate': 1.4570946974227674e-07, 'epoch': 0.93}
93%|█████████▎| 20446/22095 [35:04:36<1:49:18, 3.98s/it] {'loss': 0.3267, 'grad_norm': 0.6806183941372534, 'learning_rate': 1.455338736792372e-07, 'epoch': 0.93}
93%|█████████▎| 20447/22095 [35:04:40<1:47:57, 3.93s/it] {'loss': 0.3118, 'grad_norm': 0.6494127773785351, 'learning_rate': 1.4535838192376527e-07, 'epoch': 0.93}
93%|█████████▎| 20448/22095 [35:04:43<1:43:26, 3.77s/it] {'loss': 0.3411, 'grad_norm': 0.6462344113018397, 'learning_rate': 1.4518299447963126e-07, 'epoch': 0.93}
93%|█████████▎| 20449/22095 [35:04:47<1:42:29, 3.74s/it] {'loss': 0.3017, 'grad_norm': 0.5892924906517587, 'learning_rate': 1.4500771135060486e-07, 'epoch': 0.93}
93%|█████████▎| 20450/22095 [35:04:50<1:36:56, 3.54s/it] {'loss': 0.2883, 'grad_norm': 0.6098691150241621, 'learning_rate': 1.4483253254045205e-07, 'epoch': 0.93}
93%|█████████▎| 20450/22095 [35:04:50<1:36:56, 3.54s/it] 93%|█████████▎| 20451/22095 [35:04:54<1:39:58, 3.65s/it] {'loss': 0.3161, 'grad_norm': 0.6008950850538864, 'learning_rate': 1.4465745805293584e-07, 'epoch': 0.93} 93%|█████████▎| 20451/22095 [35:04:54<1:39:58, 3.65s/it] 93%|█████████▎| 20452/22095 [35:04:57<1:35:02, 3.47s/it] {'loss': 0.2913, 'grad_norm': 0.6600949221221687, 'learning_rate': 1.444824878918194e-07, 'epoch': 0.93} 93%|█████████▎| 20452/22095 [35:04:57<1:35:02, 3.47s/it] 93%|█████████▎| 20453/22095 [35:05:01<1:35:44, 3.50s/it] {'loss': 0.2743, 'grad_norm': 0.5904981565608284, 'learning_rate': 1.4430762206086136e-07, 'epoch': 0.93} 93%|█████████▎| 20453/22095 [35:05:01<1:35:44, 3.50s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8916667 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 39820, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': '\n如图所示,O是AB段的中点,C在OB段上,AC=6,CB=3,OC的长度等于()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '1.5'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047797 in VC:s3://multi-modal/UniGeo/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 5\nB. 2\nC. 3\nD. 10\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
93%|█████████▎| 20454/22095 [35:05:03<1:30:03, 3.29s/it] {'loss': 0.3187, 'grad_norm': 0.604631254211844, 'learning_rate': 1.441328605638198e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (43974 > 40960).
Running this sequence through the model will result in indexing errors
93%|█████████▎| 20455/22095 [35:05:07<1:33:15, 3.41s/it] {'loss': 0.2989, 'grad_norm': 0.6057068609858999, 'learning_rate': 1.4395820340444954e-07, 'epoch': 0.93}
93%|█████████▎| 20456/22095 [35:05:10<1:30:45, 3.32s/it] {'loss': 0.3047, 'grad_norm': 0.6196797755533714, 'learning_rate': 1.4378365058650257e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8899861 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 23014, 'image': 'images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果已知线段AB=16cm,则点C为线段AB上的移动点,点D和E分别为AC和BC的中点,则长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20457/22095 [35:05:14<1:35:25, 3.50s/it] {'loss': 0.2727, 'grad_norm': 0.5852014846220924, 'learning_rate': 1.436092021137303e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8299913 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB105UWiC3PL1JjSZFtXXclRVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you decode and provide me with the exact words in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n24V暖手桌垫\n安全第一\n双档可调\nSpring♬\nhffgyxj'}]}
93%|█████████▎| 20458/22095 [35:05:17<1:30:20, 3.31s/it] {'loss': 0.3025, 'grad_norm': 0.6442860368973576, 'learning_rate': 1.4343485798988198e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (90035 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (95620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69845 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20459/22095 [35:05:20<1:26:22, 3.17s/it] {'loss': 0.2908, 'grad_norm': 0.6644307473314526, 'learning_rate': 1.4326061821870186e-07, 'epoch': 0.93}
93%|█████████▎| 20460/22095 [35:05:23<1:27:06, 3.20s/it] {'loss': 0.2927, 'grad_norm': 0.6125065176034205, 'learning_rate': 1.4308648280393466e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20461/22095 [35:05:26<1:23:41, 3.07s/it] {'loss': 0.2954, 'grad_norm': 0.7227698857705623, 'learning_rate': 1.429124517493219e-07, 'epoch': 0.93}
93%|█████████▎| 20462/22095 [35:05:30<1:28:27, 3.25s/it] {'loss': 0.2987, 'grad_norm': 0.6853652240967014, 'learning_rate': 1.4273852505860443e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20463/22095 [35:05:39<2:19:37, 5.13s/it] {'loss': 0.474, 'grad_norm': 0.2613456844771442, 'learning_rate': 1.4256470273551705e-07, 'epoch': 0.93}
93%|█████████▎| 20464/22095 [35:05:43<2:08:41, 4.73s/it] {'loss': 0.2837, 'grad_norm': 1.0479860117384798, 'learning_rate': 1.4239098478379565e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8902706 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 25859, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': '\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 4cm\nB. 5cm\nC. 8cm\nD. 9cm'}, {'from': 'gpt', 'value': '【解答】解:点D是AC的中点,如果CD=4cm,AC=2CD=2×4=8(cm),BC=AB-AC=13-8=5(cm).'}]}
93%|█████████▎| 20465/22095 [35:05:46<1:55:15, 4.24s/it] {'loss': 0.2954, 'grad_norm': 0.5307899678938974, 'learning_rate': 1.4221737120717338e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20466/22095 [35:05:49<1:48:45, 4.01s/it] {'loss': 0.3101, 'grad_norm': 0.6494081792983811, 'learning_rate': 1.4204386200938057e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (43523 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74546 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53831 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20467/22095 [35:05:59<2:33:37, 5.66s/it] {'loss': 0.4481, 'grad_norm': 0.26612494706636547, 'learning_rate': 1.4187045719414427e-07, 'epoch': 0.93}
93%|█████████▎| 20468/22095 [35:06:04<2:24:12, 5.32s/it] {'loss': 0.2688, 'grad_norm': 0.5947286912424484, 'learning_rate': 1.4169715676519203e-07, 'epoch': 0.93}
93%|█████████▎| 20469/22095 [35:06:07<2:09:57, 4.80s/it] {'loss': 0.2903, 'grad_norm': 0.5686910911062496, 'learning_rate': 1.4152396072624587e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8949353 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [157, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 188, 'image': 'images/5238.png', 'image_wh': [[157, 19]], 'conversations': [{'from': 'human', 'value': "\n已知:如图,点C是线段AB的中点,点D是线段BC的中点,AB=20cm,那么线段AD等于()\nA. 10cm\nB. 5cm\nC. 15cm\nD. 16cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
93%|█████████▎| 20470/22095 [35:06:11<1:58:55, 4.39s/it] {'loss': 0.3428, 'grad_norm': 0.5883994277724498, 'learning_rate': 1.413508690810289e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (57552 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (119014 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89142 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80504 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20471/22095 [35:06:14<1:47:31, 3.97s/it] {'loss': 0.2879, 'grad_norm': 0.6112876770382101, 'learning_rate': 1.4117788183325986e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (86783 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58171 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55604 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71987 > 40960).
Running this sequence through the model will result in indexing errors
93%|█████████▎| 20472/22095 [35:06:16<1:38:31, 3.64s/it] {'loss': 0.2958, 'grad_norm': 0.6009679063039658, 'learning_rate': 1.410049989866541e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (53122 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20473/22095 [35:06:19<1:31:51, 3.40s/it] {'loss': 0.2611, 'grad_norm': 0.7066580354453394, 'learning_rate': 1.4083222054492862e-07, 'epoch': 0.93}
93%|█████████▎| 20474/22095 [35:06:23<1:31:05, 3.37s/it] {'loss': 0.3046, 'grad_norm': 0.6038274186738244, 'learning_rate': 1.4065954651179492e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20475/22095 [35:06:29<1:57:32, 4.35s/it] {'loss': 0.461, 'grad_norm': 0.26245478634417746, 'learning_rate': 1.404869768909628e-07, 'epoch': 0.93}
93%|█████████▎| 20476/22095 [35:06:32<1:48:45, 4.03s/it] {'loss': 0.2844, 'grad_norm': 0.5982783265922673, 'learning_rate': 1.4031451168614097e-07, 'epoch': 0.93}
93%|█████████▎| 20477/22095 [35:06:35<1:39:20, 3.68s/it] {'loss': 0.3082, 'grad_norm': 0.5835922919662517, 'learning_rate': 1.4014215090103424e-07, 'epoch': 0.93}
93%|█████████▎| 20478/22095 [35:06:39<1:37:20, 3.61s/it] {'loss': 0.2762, 'grad_norm': 0.5992267862265387, 'learning_rate': 1.3996989453934795e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20479/22095 [35:06:48<2:25:13, 5.39s/it] {'loss': 0.4519, 'grad_norm': 0.2659060116019865, 'learning_rate': 1.397977426047814e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20480/22095 [35:06:58<2:58:47, 6.64s/it] {'loss': 0.4599, 'grad_norm': 0.2554853549334449, 'learning_rate': 1.396256951010344e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 364, but got module 1
93%|█████████▎| 20481/22095 [35:07:01<2:33:49, 5.72s/it] {'loss': 0.3366, 'grad_norm': 0.6112969280647528, 'learning_rate': 1.39453752031804e-07, 'epoch': 0.93}
93%|█████████▎| 20482/22095 [35:07:05<2:16:47, 5.09s/it] {'loss': 0.3113, 'grad_norm': 0.6211876213503223, 'learning_rate': 1.3928191340078446e-07, 'epoch': 0.93}
93%|█████████▎| 20483/22095 [35:07:09<2:03:11, 4.59s/it] {'loss': 0.3026, 'grad_norm': 0.6725018512657114, 'learning_rate': 1.391101792116678e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (44527 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (109158 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20484/22095 [35:07:12<1:53:31, 4.23s/it] {'loss': 0.2821, 'grad_norm': 0.6567315780209119, 'learning_rate': 1.38938549468145e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (62156 > 40960).
Running this sequence through the model will result in indexing errors
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20485/22095 [35:07:15<1:42:30, 3.82s/it] {'loss': 0.3291, 'grad_norm': 0.6525421756495428, 'learning_rate': 1.3876702417390197e-07, 'epoch': 0.93}
93%|█████████▎| 20486/22095 [35:07:18<1:37:01, 3.62s/it] {'loss': 0.3101, 'grad_norm': 0.5911414570134386, 'learning_rate': 1.3859560333262578e-07, 'epoch': 0.93}
93%|█████████▎| 20487/22095 [35:07:21<1:34:59, 3.54s/it] {'loss': 0.2805, 'grad_norm': 0.5579965197267512, 'learning_rate': 1.384242869480007e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20488/22095 [35:07:24<1:30:59, 3.40s/it] {'loss': 0.2891, 'grad_norm': 0.665727549513722, 'learning_rate': 1.3825307502370487e-07, 'epoch': 0.93}
93%|█████████▎| 20489/22095 [35:07:28<1:32:09, 3.44s/it] {'loss': 0.2708, 'grad_norm': 0.5640027269792147, 'learning_rate': 1.3808196756341928e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (47387 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47400 > 40960).
Running this sequence through the model will result in indexing errors
93%|█████████▎| 20490/22095 [35:07:32<1:37:30, 3.65s/it] {'loss': 0.2878, 'grad_norm': 0.6294402244906123, 'learning_rate': 1.3791096457081987e-07, 'epoch': 0.93}
93%|█████████▎| 20491/22095 [35:07:35<1:34:52, 3.55s/it] {'loss': 0.3326, 'grad_norm': 0.8039347348919981, 'learning_rate': 1.3774006604958202e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20492/22095 [35:07:45<2:22:02, 5.32s/it] {'loss': 0.4793, 'grad_norm': 0.2858853839858013, 'learning_rate': 1.3756927200337555e-07, 'epoch': 0.93}
93%|█████████▎| 20493/22095 [35:07:48<2:07:57, 4.79s/it] {'loss': 0.2837, 'grad_norm': 0.6492039128977286, 'learning_rate': 1.37398582435872e-07, 'epoch': 0.93}
93%|█████████▎| 20494/22095 [35:07:52<1:59:51, 4.49s/it] {'loss': 0.3027, 'grad_norm': 0.6545488674984096, 'learning_rate': 1.3722799735073898e-07, 'epoch': 0.93}
93%|█████████▎| 20495/22095 [35:07:55<1:45:45, 3.97s/it] {'loss': 0.2678, 'grad_norm': 0.6080759408041968, 'learning_rate': 1.3705751675164137e-07, 'epoch': 0.93}
93%|█████████▎| 20496/22095 [35:07:58<1:40:11, 3.76s/it] {'loss': 0.2975, 'grad_norm': 0.6326392617289096, 'learning_rate': 1.3688714064224175e-07, 'epoch': 0.93}
93%|█████████▎| 20497/22095 [35:08:02<1:40:30, 3.77s/it] {'loss': 0.3192, 'grad_norm': 0.6756598727801661, 'learning_rate': 1.367168690262022e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20498/22095 [35:08:08<2:00:12, 4.52s/it] {'loss': 0.447, 'grad_norm': 0.2631476713639336, 'learning_rate': 1.3654670190718035e-07, 'epoch': 0.93}
93%|█████████▎| 20499/22095 [35:08:11<1:49:28, 4.12s/it] {'loss': 0.2915, 'grad_norm': 0.5927515198474593, 'learning_rate': 1.3637663928883328e-07, 'epoch': 0.93}
93%|█████████▎| 20500/22095 [35:08:16<1:50:57, 4.17s/it] {'loss': 0.2322, 'grad_norm': 0.6149204123557976, 'learning_rate': 1.3620668117481471e-07, 'epoch': 0.93}
93%|█████████▎| 20501/22095 [35:08:18<1:39:41, 3.75s/it] {'loss': 0.29, 'grad_norm': 0.6259834762829701, 'learning_rate': 1.3603682756877624e-07, 'epoch': 0.93}
93%|█████████▎| 20502/22095 [35:08:22<1:39:10, 3.74s/it] {'loss': 0.2761, 'grad_norm': 0.592115814894418, 'learning_rate': 1.3586707847436765e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20503/22095 [35:08:26<1:36:30, 3.64s/it] {'loss': 0.3121, 'grad_norm': 0.6097902401023106, 'learning_rate': 1.356974338952366e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (85083 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92784 > 40960).
Running this sequence through the model will result in indexing errors
93%|█████████▎| 20504/22095 [35:08:29<1:31:45, 3.46s/it] {'loss': 0.278, 'grad_norm': 0.6350445882270588, 'learning_rate': 1.3552789383502906e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (90880 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20505/22095 [35:08:32<1:29:20, 3.37s/it] {'loss': 0.3052, 'grad_norm': 0.617967458484192, 'learning_rate': 1.3535845829738547e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20506/22095 [35:08:36<1:32:12, 3.48s/it] {'loss': 0.2497, 'grad_norm': 0.5859739429927824, 'learning_rate': 1.3518912728594902e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (53787 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67308 > 40960).
Running this sequence through the model will result in indexing errors
93%|█████████▎| 20507/22095 [35:08:39<1:29:26, 3.38s/it] {'loss': 0.3071, 'grad_norm': 0.5816997049619769, 'learning_rate': 1.350199008043568e-07, 'epoch': 0.93}
93%|█████████▎| 20508/22095 [35:08:42<1:25:18, 3.23s/it] {'loss': 0.295, 'grad_norm': 0.6387081232887973, 'learning_rate': 1.3485077885624587e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (47858 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20509/22095 [35:08:45<1:23:31, 3.16s/it] {'loss': 0.2468, 'grad_norm': 0.6058050849986356, 'learning_rate': 1.3468176144524837e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20510/22095 [35:08:48<1:24:02, 3.18s/it] {'loss': 0.3326, 'grad_norm': 0.6973876198738655, 'learning_rate': 1.3451284857499803e-07, 'epoch': 0.93}
93%|█████████▎| 20511/22095 [35:08:51<1:21:51, 3.10s/it] {'loss': 0.2416, 'grad_norm': 0.5773441879086344, 'learning_rate': 1.3434404024912307e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20512/22095 [35:08:58<1:56:47, 4.43s/it] {'loss': 0.4776, 'grad_norm': 0.26222600729671075, 'learning_rate': 1.3417533647125114e-07, 'epoch': 0.93}
93%|█████████▎| 20513/22095 [35:09:03<1:56:06, 4.40s/it] {'loss': 0.313, 'grad_norm': 0.5985889900254078, 'learning_rate': 1.3400673724500713e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (42299 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20514/22095 [35:09:07<1:56:12, 4.41s/it] {'loss': 0.2749, 'grad_norm': 1.1236766909494023, 'learning_rate': 1.3383824257401256e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (63236 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61242 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49062 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20515/22095 [35:09:17<2:44:20, 6.24s/it] {'loss': 0.4529, 'grad_norm': 0.3598829426277801, 'learning_rate': 1.3366985246188958e-07, 'epoch': 0.93}
93%|█████████▎| 20516/22095 [35:09:23<2:34:46, 5.88s/it] {'loss': 0.3305, 'grad_norm': 0.6381871757977565, 'learning_rate': 1.335015669122558e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20517/22095 [35:09:34<3:21:16, 7.65s/it] {'loss': 0.476, 'grad_norm': 0.27547840975691507, 'learning_rate': 1.3333338592872725e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (83972 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111959 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20518/22095 [35:09:38<2:52:50, 6.58s/it] {'loss': 0.3096, 'grad_norm': 0.654558987708579, 'learning_rate': 1.3316530951491712e-07, 'epoch': 0.93}
93%|█████████▎| 20519/22095 [35:09:42<2:28:45, 5.66s/it] {'loss': 0.2882, 'grad_norm': 0.6093967244105901, 'learning_rate': 1.3299733767443645e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (66015 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20520/22095 [35:09:46<2:13:16, 5.08s/it] {'loss': 0.3011, 'grad_norm': 0.6398248693621602, 'learning_rate': 1.3282947041089678e-07, 'epoch': 0.93}
93%|█████████▎| 20521/22095 [35:09:48<1:54:35, 4.37s/it] {'loss': 0.2636, 'grad_norm': 0.6069225158778648, 'learning_rate': 1.3266170772790244e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20522/22095 [35:09:57<2:27:30, 5.63s/it] {'loss': 0.4825, 'grad_norm': 0.26498649232465793, 'learning_rate': 1.3249404962905832e-07, 'epoch': 0.93}
93%|█████████▎| 20523/22095 [35:10:06<2:57:08, 6.76s/it] {'loss': 0.4737, 'grad_norm': 0.4456432719083173, 'learning_rate': 1.3232649611796878e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 364, but got module 1
93%|█████████▎| 20524/22095 [35:10:10<2:29:21, 5.70s/it] {'loss': 0.2678, 'grad_norm': 0.6128672166428326, 'learning_rate': 1.3215904719823313e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (42956 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107931 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20525/22095 [35:10:13<2:12:34, 5.07s/it] {'loss': 0.3146, 'grad_norm': 0.629973709969791, 'learning_rate': 1.3199170287344797e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20526/22095 [35:10:23<2:47:36, 6.41s/it] {'loss': 0.4286, 'grad_norm': 0.26166497076034695, 'learning_rate': 1.3182446314721154e-07, 'epoch': 0.93}
93%|█████████▎| 20527/22095 [35:10:26<2:25:47, 5.58s/it] {'loss': 0.2927, 'grad_norm': 0.7510941062492887, 'learning_rate': 1.316573280231148e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20528/22095 [35:10:36<2:55:41, 6.73s/it] {'loss': 0.4667, 'grad_norm': 0.28980127141565226, 'learning_rate': 1.3149029750475052e-07, 'epoch': 0.93}
93%|█████████▎| 20529/22095 [35:10:40<2:34:31, 5.92s/it] {'loss': 0.3383, 'grad_norm': 0.5980028104852588, 'learning_rate': 1.313233715957074e-07, 'epoch': 0.93}
93%|█████████▎| 20530/22095 [35:10:43<2:13:22, 5.11s/it] {'loss': 0.3128, 'grad_norm': 0.700142480044533, 'learning_rate': 1.3115655029957207e-07, 'epoch': 0.93}
93%|█████████▎| 20531/22095 [35:10:46<1:57:33, 4.51s/it] {'loss': 0.31, 'grad_norm': 0.6492349316199251, 'learning_rate': 1.3098983361992834e-07, 'epoch': 0.93}
93%|█████████▎| 20532/22095 [35:10:49<1:45:55, 4.07s/it] {'loss': 0.3311, 'grad_norm': 0.6112670944838561, 'learning_rate': 1.3082322156035942e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (107011 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20533/22095 [35:10:59<2:31:13, 5.81s/it] {'loss': 0.4467, 'grad_norm': 0.2691885698944432, 'learning_rate': 1.3065671412444526e-07, 'epoch': 0.93}
93%|█████████▎| 20534/22095 [35:11:03<2:13:44, 5.14s/it] {'loss': 0.3091, 'grad_norm': 0.640175163072835, 'learning_rate': 1.3049031131576294e-07, 'epoch': 0.93}
93%|█████████▎| 20535/22095 [35:11:06<1:59:27, 4.59s/it] {'loss': 0.2457, 'grad_norm': 0.5699163789000503, 'learning_rate': 1.30324013137888e-07, 'epoch': 0.93}
93%|█████████▎| 20536/22095 [35:11:10<1:58:10, 4.55s/it] {'loss': 0.2705, 'grad_norm': 0.6347799630181783, 'learning_rate': 1.3015781959439478e-07, 'epoch': 0.93}
93%|█████████▎| 20537/22095 [35:11:14<1:52:12, 4.32s/it] {'loss': 0.2613, 'grad_norm': 0.6121332607475962, 'learning_rate': 1.299917306888532e-07, 'epoch': 0.93}
93%|█████████▎| 20538/22095 [35:11:17<1:42:16, 3.94s/it] {'loss': 0.302, 'grad_norm': 0.579396510362602, 'learning_rate': 1.2982574642483148e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (58235 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (53852 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20539/22095 [35:11:20<1:33:52, 3.62s/it] {'loss': 0.2864, 'grad_norm': 0.6109700917502632, 'learning_rate': 1.2965986680589793e-07, 'epoch': 0.93}
93%|█████████▎| 20540/22095 [35:11:23<1:29:38, 3.46s/it] {'loss': 0.2482, 'grad_norm': 0.5384522792034464, 'learning_rate': 1.2949409183561467e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20541/22095 [35:11:31<2:05:16, 4.84s/it] {'loss': 0.4769, 'grad_norm': 0.27588932629748913, 'learning_rate': 1.2932842151754555e-07, 'epoch': 0.93}
93%|█████████▎| 20542/22095 [35:11:36<2:01:48, 4.71s/it] {'loss': 0.3171, 'grad_norm': 0.5868988348140859, 'learning_rate': 1.2916285585524936e-07, 'epoch': 0.93}
93%|█████████▎| 20543/22095 [35:11:40<1:55:41, 4.47s/it] {'loss': 0.2801, 'grad_norm': 0.5927639601052319, 'learning_rate': 1.2899739485228325e-07, 'epoch': 0.93}
93%|█████████▎| 20544/22095 [35:11:43<1:50:45, 4.28s/it] {'loss': 0.3065, 'grad_norm': 0.5875612986420774, 'learning_rate': 1.2883203851220326e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (44812 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (83012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44338 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71153 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20545/22095 [35:11:48<1:51:17, 4.31s/it] {'loss': 0.2955, 'grad_norm': 0.5992372891431096, 'learning_rate': 1.286667868385627e-07, 'epoch': 0.93} 93%|█████████▎| 20545/22095 [35:11:48<1:51:17, 4.31s/it] 93%|█████████▎| 20546/22095 [35:11:52<1:48:03, 4.19s/it] {'loss': 0.2895, 'grad_norm': 0.5718712978749788, 'learning_rate': 1.285016398349115e-07, 'epoch': 0.93} 93%|█████████▎| 20546/22095 [35:11:52<1:48:03, 4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20547/22095 [35:12:03<2:47:01, 6.47s/it] {'loss': 0.472, 'grad_norm': 0.30508341455682847, 'learning_rate': 1.2833659750479787e-07, 'epoch': 0.93} 93%|█████████▎| 20547/22095 [35:12:03<2:47:01, 6.47s/it] 93%|█████████▎| 20548/22095 [35:12:08<2:30:03, 5.82s/it] {'loss': 0.2778, 'grad_norm': 0.6402722698777407, 'learning_rate': 1.281716598517685e-07, 'epoch': 0.93} 93%|█████████▎| 20548/22095 [35:12:08<2:30:03, 5.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (66432 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55053 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76015 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20549/22095 [35:12:12<2:16:08, 5.28s/it] {'loss': 0.2441, 'grad_norm': 0.6912387838467804, 'learning_rate': 1.2800682687936826e-07, 'epoch': 0.93} 93%|█████████▎| 20549/22095 [35:12:12<2:16:08, 5.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20550/22095 [35:12:15<2:03:17, 4.79s/it] {'loss': 0.2777, 'grad_norm': 0.5550573796308262, 'learning_rate': 1.2784209859113773e-07, 'epoch': 0.93} 93%|█████████▎| 20550/22095 [35:12:15<2:03:17, 4.79s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57437 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (95814 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60824 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113768 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (80654 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (151731 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20551/22095 [35:12:19<1:52:00, 4.35s/it] {'loss': 0.316, 'grad_norm': 0.7114796618199457, 'learning_rate': 1.2767747499061677e-07, 'epoch': 0.93} 93%|█████████▎| 20551/22095 [35:12:19<1:52:00, 4.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43844 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69074 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82687 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45594 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138730 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (96297 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57083 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20552/22095 [35:12:22<1:44:35, 4.07s/it] {'loss': 0.2525, 'grad_norm': 0.6176338168213179, 'learning_rate': 1.2751295608134262e-07, 'epoch': 0.93} 93%|█████████▎| 20552/22095 [35:12:22<1:44:35, 4.07s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20553/22095 [35:12:32<2:28:38, 5.78s/it] {'loss': 0.4685, 'grad_norm': 0.257163146418204, 'learning_rate': 1.273485418668502e-07, 'epoch': 0.93} 93%|█████████▎| 20553/22095 [35:12:32<2:28:38, 5.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20554/22095 [35:12:35<2:09:03, 5.02s/it] {'loss': 0.3013, 'grad_norm': 0.5724525940518886, 'learning_rate': 1.2718423235067278e-07, 'epoch': 0.93} 93%|█████████▎| 20554/22095 [35:12:35<2:09:03, 5.02s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56284 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56796 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (115628 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85978 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44904 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20555/22095 [35:12:39<1:57:22, 4.57s/it] {'loss': 0.3173, 'grad_norm': 0.6024815855082362, 'learning_rate': 1.2702002753634092e-07, 'epoch': 0.93} 93%|█████████▎| 20555/22095 [35:12:39<1:57:22, 4.57s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48821 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67960 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (58879 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20556/22095 [35:12:41<1:42:31, 4.00s/it] {'loss': 0.3283, 'grad_norm': 0.6725907048687568, 'learning_rate': 1.2685592742738173e-07, 'epoch': 0.93} 93%|█████████▎| 20556/22095 [35:12:41<1:42:31, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20557/22095 [35:12:51<2:24:42, 5.65s/it] {'loss': 0.4496, 'grad_norm': 0.2593537537342258, 'learning_rate': 1.266919320273219e-07, 'epoch': 0.93} 93%|█████████▎| 20557/22095 [35:12:51<2:24:42, 5.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43528 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48735 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (64831 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50367 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71133 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20558/22095 [35:12:55<2:12:20, 5.17s/it] {'loss': 0.3077, 'grad_norm': 0.6055413336996001, 'learning_rate': 1.2652804133968578e-07, 'epoch': 0.93} 93%|█████████▎| 20558/22095 [35:12:55<2:12:20, 5.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20559/22095 [35:13:04<2:45:23, 6.46s/it] {'loss': 0.4526, 'grad_norm': 0.27130371236025036, 'learning_rate': 1.263642553679939e-07, 'epoch': 0.93} 93%|█████████▎| 20559/22095 [35:13:04<2:45:23, 6.46s/it] 93%|█████████▎| 20560/22095 [35:13:14<3:08:52, 7.38s/it] {'loss': 0.4429, 'grad_norm': 0.2867604133985603, 'learning_rate': 1.2620057411576568e-07, 'epoch': 0.93} 93%|█████████▎| 20560/22095 [35:13:14<3:08:52, 7.38s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (43432 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53390 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20561/22095 [35:13:17<2:38:56, 6.22s/it] {'loss': 0.3223, 'grad_norm': 0.7368223557648287, 'learning_rate': 1.2603699758651888e-07, 'epoch': 0.93} 93%|█████████▎| 20561/22095 [35:13:17<2:38:56, 6.22s/it] 93%|█████████▎| 20562/22095 [35:13:21<2:20:29, 5.50s/it] {'loss': 0.3017, 'grad_norm': 0.5805378186525145, 'learning_rate': 1.2587352578376787e-07, 'epoch': 0.93} 93%|█████████▎| 20562/22095 [35:13:21<2:20:29, 5.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20563/22095 [35:13:31<2:55:58, 6.89s/it] {'loss': 0.4467, 'grad_norm': 0.28028151826188114, 'learning_rate': 1.2571015871102433e-07, 'epoch': 0.93} 93%|█████████▎| 20563/22095 [35:13:31<2:55:58, 6.89s/it] 93%|█████████▎| 20564/22095 [35:13:38<2:52:42, 6.77s/it] {'loss': 0.4622, 'grad_norm': 0.28380587605428187, 'learning_rate': 1.2554689637179984e-07, 'epoch': 0.93} 93%|█████████▎| 20564/22095 [35:13:38<2:52:42, 6.77s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 93%|█████████▎| 20565/22095 [35:13:41<2:25:35, 5.71s/it] {'loss': 0.2572, 'grad_norm': 0.6322006217264248, 'learning_rate': 1.2538373876960162e-07, 'epoch': 0.93} 93%|█████████▎| 20565/22095 [35:13:41<2:25:35, 5.71s/it] 93%|█████████▎| 20566/22095 [35:13:51<2:54:20, 6.84s/it] {'loss': 0.4756, 'grad_norm': 0.28134304880944516, 'learning_rate': 1.2522068590793578e-07, 'epoch': 0.93} 93%|█████████▎| 20566/22095 [35:13:51<2:54:20, 6.84s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 93%|█████████▎| 20567/22095 [35:13:54<2:31:38, 5.95s/it] {'loss': 0.342, 'grad_norm': 0.5847147524196179, 'learning_rate': 1.2505773779030562e-07, 'epoch': 0.93} 93%|█████████▎| 20567/22095 [35:13:54<2:31:38, 5.95s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20568/22095 [35:13:58<2:10:06, 5.11s/it] 
{'loss': 0.28, 'grad_norm': 0.6099517197345848, 'learning_rate': 1.2489489442021275e-07, 'epoch': 0.93} 93%|█████████▎| 20568/22095 [35:13:58<2:10:06, 5.11s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49039 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20569/22095 [35:14:06<2:38:03, 6.21s/it] {'loss': 0.4718, 'grad_norm': 0.3126201811819335, 'learning_rate': 1.2473215580115493e-07, 'epoch': 0.93} 93%|█████████▎| 20569/22095 [35:14:06<2:38:03, 6.21s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44418 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20570/22095 [35:14:11<2:22:17, 5.60s/it] {'loss': 0.3037, 'grad_norm': 0.5750391423215298, 'learning_rate': 1.2456952193663052e-07, 'epoch': 0.93} 93%|█████████▎| 20570/22095 [35:14:11<2:22:17, 5.60s/it] 93%|█████████▎| 20571/22095 [35:14:15<2:11:26, 5.18s/it] {'loss': 0.3163, 'grad_norm': 0.613391861418245, 'learning_rate': 1.2440699283013335e-07, 'epoch': 0.93} 93%|█████████▎| 20571/22095 [35:14:15<2:11:26, 5.18s/it] 93%|█████████▎| 20572/22095 [35:14:18<1:56:11, 4.58s/it] {'loss': 0.27, 'grad_norm': 0.5927581691696879, 'learning_rate': 1.2424456848515565e-07, 'epoch': 0.93} 93%|█████████▎| 20572/22095 [35:14:18<1:56:11, 4.58s/it] 93%|█████████▎| 20573/22095 [35:14:21<1:43:40, 4.09s/it] {'loss': 0.2794, 'grad_norm': 0.6240474780698677, 'learning_rate': 1.2408224890518683e-07, 'epoch': 0.93} 93%|█████████▎| 20573/22095 [35:14:21<1:43:40, 4.09s/it] 93%|█████████▎| 20574/22095 [35:14:24<1:39:11, 3.91s/it] {'loss': 0.3358, 'grad_norm': 0.6195892179866072, 'learning_rate': 1.2392003409371578e-07, 'epoch': 0.93} 93%|█████████▎| 20574/22095 [35:14:24<1:39:11, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence 
length for this model (79726 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20575/22095 [35:14:27<1:30:38, 3.58s/it] {'loss': 0.2981, 'grad_norm': 0.6295648159672592, 'learning_rate': 1.2375792405422748e-07, 'epoch': 0.93} 93%|█████████▎| 20575/22095 [35:14:27<1:30:38, 3.58s/it] 93%|█████████▎| 20576/22095 [35:14:31<1:33:26, 3.69s/it] {'loss': 0.3139, 'grad_norm': 0.6303571979522588, 'learning_rate': 1.2359591879020528e-07, 'epoch': 0.93} 93%|█████████▎| 20576/22095 [35:14:31<1:33:26, 3.69s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20577/22095 [35:14:39<2:06:16, 4.99s/it] {'loss': 0.4946, 'grad_norm': 0.2804170390755626, 'learning_rate': 1.2343401830512914e-07, 'epoch': 0.93} 93%|█████████▎| 20577/22095 [35:14:39<2:06:16, 4.99s/it] 93%|█████████▎| 20578/22095 [35:14:48<2:33:15, 6.06s/it] {'loss': 0.4687, 'grad_norm': 0.2491879000750708, 'learning_rate': 1.232722226024796e-07, 'epoch': 0.93} 93%|█████████▎| 20578/22095 [35:14:48<2:33:15, 6.06s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 93%|█████████▎| 20579/22095 [35:14:51<2:12:44, 5.25s/it] {'loss': 0.2897, 'grad_norm': 0.5779750601252036, 'learning_rate': 1.231105316857323e-07, 'epoch': 0.93} 93%|█████████▎| 20579/22095 [35:14:51<2:12:44, 5.25s/it] 93%|█████████▎| 20580/22095 [35:14:54<1:57:10, 4.64s/it] {'loss': 0.3061, 'grad_norm': 0.6781888687374773, 'learning_rate': 1.22948945558361e-07, 'epoch': 0.93} 93%|█████████▎| 20580/22095 [35:14:54<1:57:10, 4.64s/it] 93%|█████████▎| 20581/22095 [35:14:58<1:52:17, 4.45s/it] {'loss': 0.2441, 'grad_norm': 0.5907780247980898, 'learning_rate': 1.2278746422383858e-07, 'epoch': 0.93} 93%|█████████▎| 20581/22095 [35:14:58<1:52:17, 4.45s/it] 93%|█████████▎| 20582/22095 [35:15:02<1:45:46, 4.19s/it] {'loss': 0.2912, 'grad_norm': 0.5702829526712206, 'learning_rate': 1.226260876856339e-07, 'epoch': 0.93} 93%|█████████▎| 20582/22095 [35:15:02<1:45:46, 
4.19s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20583/22095 [35:15:12<2:28:18, 5.89s/it] {'loss': 0.4518, 'grad_norm': 0.23769228889810345, 'learning_rate': 1.2246481594721582e-07, 'epoch': 0.93} 93%|█████████▎| 20583/22095 [35:15:12<2:28:18, 5.89s/it] 93%|█████████▎| 20584/22095 [35:15:15<2:09:15, 5.13s/it] {'loss': 0.2644, 'grad_norm': 0.6490998239245681, 'learning_rate': 1.2230364901204773e-07, 'epoch': 0.93} 93%|█████████▎| 20584/22095 [35:15:15<2:09:15, 5.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20585/22095 [35:15:18<1:51:37, 4.44s/it] {'loss': 0.2807, 'grad_norm': 0.5921873669323882, 'learning_rate': 1.2214258688359347e-07, 'epoch': 0.93} 93%|█████████▎| 20585/22095 [35:15:18<1:51:37, 4.44s/it] 93%|█████████▎| 20586/22095 [35:15:21<1:41:17, 4.03s/it] {'loss': 0.2608, 'grad_norm': 0.6513329181124101, 'learning_rate': 1.2198162956531423e-07, 'epoch': 0.93} 93%|█████████▎| 20586/22095 [35:15:21<1:41:17, 4.03s/it] 93%|█████████▎| 20587/22095 [35:15:25<1:44:03, 4.14s/it] {'loss': 0.2831, 'grad_norm': 0.5945147095377089, 'learning_rate': 1.2182077706066776e-07, 'epoch': 0.93} 93%|█████████▎| 20587/22095 [35:15:25<1:44:03, 4.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20588/22095 [35:15:33<2:08:59, 5.14s/it] {'loss': 0.4728, 'grad_norm': 0.27614195415263953, 'learning_rate': 1.2166002937311128e-07, 'epoch': 0.93} 93%|█████████▎| 20588/22095 [35:15:33<2:08:59, 5.14s/it] 93%|█████████▎| 20589/22095 [35:15:36<1:57:22, 4.68s/it] {'loss': 0.2974, 'grad_norm': 0.5984964831266933, 'learning_rate': 1.2149938650609704e-07, 'epoch': 0.93} 93%|█████████▎| 20589/22095 [35:15:36<1:57:22, 4.68s/it] 93%|█████████▎| 20590/22095 [35:15:40<1:51:59, 4.46s/it] {'loss': 0.2943, 'grad_norm': 0.5825886799572433, 'learning_rate': 1.2133884846307898e-07, 'epoch': 0.93} 93%|█████████▎| 20590/22095 
[35:15:40<1:51:59, 4.46s/it] 93%|█████████▎| 20591/22095 [35:15:43<1:39:57, 3.99s/it] {'loss': 0.3212, 'grad_norm': 0.6139118481888114, 'learning_rate': 1.2117841524750485e-07, 'epoch': 0.93} 93%|█████████▎| 20591/22095 [35:15:43<1:39:57, 3.99s/it] 93%|█████████▎| 20592/22095 [35:15:48<1:43:21, 4.13s/it] {'loss': 0.2863, 'grad_norm': 0.6098701406625552, 'learning_rate': 1.210180868628219e-07, 'epoch': 0.93} 93%|█████████▎| 20592/22095 [35:15:48<1:43:21, 4.13s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8306196 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB1pTKrSVXXXXblXpXXXXXXXXXX_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide me with the written content in this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n包邮\n修牙机\n高精过牙/修牙\n小孔可攻牙\n稳定性高\n可调两速\n可提供正规发票\n稳定性一般\n可以调三速'}]} 93%|█████████▎| 20593/22095 [35:15:51<1:34:05, 3.76s/it] {'loss': 0.2707, 'grad_norm': 0.6222401621125467, 'learning_rate': 1.2085786331247574e-07, 'epoch': 0.93} 93%|█████████▎| 20593/22095 [35:15:51<1:34:05, 3.76s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41995 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52915 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48497 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20594/22095 [35:15:54<1:30:16, 3.61s/it] {'loss': 0.2575, 'grad_norm': 0.6321271350748866, 'learning_rate': 1.206977445999097e-07, 'epoch': 0.93} 93%|█████████▎| 20594/22095 [35:15:54<1:30:16, 3.61s/it] 93%|█████████▎| 20595/22095 [35:15:57<1:24:15, 3.37s/it] {'loss': 0.3058, 'grad_norm': 0.65663036194109, 'learning_rate': 1.2053773072856323e-07, 'epoch': 0.93} 93%|█████████▎| 20595/22095 [35:15:57<1:24:15, 3.37s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51824 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62720 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20596/22095 [35:16:01<1:29:36, 3.59s/it] {'loss': 0.2685, 'grad_norm': 0.5785509596715999, 'learning_rate': 1.2037782170187472e-07, 'epoch': 0.93} 93%|█████████▎| 20596/22095 [35:16:01<1:29:36, 3.59s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20597/22095 [35:16:04<1:29:01, 3.57s/it] {'loss': 0.289, 'grad_norm': 0.6303907355875169, 'learning_rate': 1.2021801752328034e-07, 'epoch': 0.93} 93%|█████████▎| 20597/22095 [35:16:04<1:29:01, 3.57s/it] 93%|█████████▎| 20598/22095 [35:16:07<1:26:00, 3.45s/it] {'loss': 0.2338, 'grad_norm': 0.5863599732194953, 'learning_rate': 1.2005831819621284e-07, 'epoch': 0.93} 93%|█████████▎| 20598/22095 [35:16:08<1:26:00, 3.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (75148 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59984 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69761 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62559 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20599/22095 [35:16:11<1:26:36, 3.47s/it] {'loss': 0.2871, 'grad_norm': 0.610079565117258, 'learning_rate': 1.198987237241056e-07, 'epoch': 0.93} 93%|█████████▎| 20599/22095 [35:16:11<1:26:36, 3.47s/it] 93%|█████████▎| 20600/22095 [35:16:14<1:21:37, 3.28s/it] {'loss': 0.3204, 'grad_norm': 0.6265622624776339, 'learning_rate': 1.1973923411038646e-07, 'epoch': 0.93} 93%|█████████▎| 20600/22095 [35:16:14<1:21:37, 3.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (98061 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48156 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79271 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20601/22095 [35:16:17<1:22:52, 3.33s/it] {'loss': 0.2588, 'grad_norm': 0.6561079202368796, 'learning_rate': 1.195798493584821e-07, 'epoch': 0.93} 93%|█████████▎| 20601/22095 [35:16:17<1:22:52, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20602/22095 [35:16:27<2:08:06, 5.15s/it] {'loss': 0.4727, 'grad_norm': 0.25492769257299186, 'learning_rate': 1.1942056947181757e-07, 'epoch': 0.93} 93%|█████████▎| 20602/22095 [35:16:27<2:08:06, 5.15s/it] 93%|█████████▎| 20603/22095 [35:16:30<1:55:32, 4.65s/it] {'loss': 0.283, 'grad_norm': 0.6279894540127755, 'learning_rate': 1.1926139445381624e-07, 'epoch': 0.93} 93%|█████████▎| 20603/22095 [35:16:30<1:55:32, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (98821 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (103325 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104654 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20604/22095 [35:16:41<2:38:01, 6.36s/it] {'loss': 0.458, 'grad_norm': 0.2566352653015581, 'learning_rate': 1.1910232430789703e-07, 'epoch': 0.93} 93%|█████████▎| 20604/22095 [35:16:41<2:38:01, 6.36s/it] 93%|█████████▎| 20605/22095 [35:16:44<2:15:16, 5.45s/it] {'loss': 0.2847, 'grad_norm': 0.6515167565739174, 'learning_rate': 1.1894335903747834e-07, 'epoch': 0.93} 93%|█████████▎| 20605/22095 [35:16:44<2:15:16, 5.45s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43131 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20606/22095 [35:16:47<1:59:52, 4.83s/it] {'loss': 0.3173, 'grad_norm': 0.6284090350465346, 'learning_rate': 1.1878449864597575e-07, 'epoch': 0.93} 93%|█████████▎| 20606/22095 [35:16:47<1:59:52, 4.83s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50056 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20607/22095 [35:16:57<2:34:20, 6.22s/it] {'loss': 0.4519, 'grad_norm': 0.26375783626835564, 'learning_rate': 1.1862574313680264e-07, 'epoch': 0.93} 93%|█████████▎| 20607/22095 [35:16:57<2:34:20, 6.22s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308310 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2aKEUbBfM8KJjSZFOXXXr5XXa_!!1079374403.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nInterpret the sentences content on this picture.'}, {'from': 'gpt', 'value': 'All words in the image:\nCHANGANMOTORS\nCHANGANMOTORS\nCHANGANMOTORS\nCHANGANMOTORS\n长安专用侧标\n对装'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [132, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8882163 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [132, 18, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 5316, 'image': 'images/5338.png', 'image_wh': [[132, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点C是线段AD的中点,AB=10cm,BD=4cm,则BC的长为()\nA. 5cm\nB. 6cm\nC. 7cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
93%|█████████▎| 20608/22095 [35:17:01<2:17:46, 5.56s/it] {'loss': 0.2929, 'grad_norm': 0.6398174159409153, 'learning_rate': 1.1846709251337129e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20609/22095 [35:17:10<2:46:37, 6.73s/it] {'loss': 0.4455, 'grad_norm': 0.2867187620695821, 'learning_rate': 1.1830854677908842e-07, 'epoch': 0.93}
93%|█████████▎| 20610/22095 [35:17:13<2:19:50, 5.65s/it] {'loss': 0.2664, 'grad_norm': 0.5787649635408306, 'learning_rate': 1.1815010593736298e-07, 'epoch': 0.93}
93%|█████████▎| 20611/22095 [35:17:17<2:05:42, 5.08s/it] {'loss': 0.2548, 'grad_norm': 0.585519096160613, 'learning_rate': 1.1799176999159722e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20612/22095 [35:17:26<2:37:44, 6.38s/it] {'loss': 0.4511, 'grad_norm': 0.26683750157452135, 'learning_rate': 1.1783353894519512e-07, 'epoch': 0.93}
93%|█████████▎| 20613/22095 [35:17:30<2:12:55, 5.38s/it] {'loss': 0.3128, 'grad_norm': 0.5801571515630817, 'learning_rate': 1.1767541280155614e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20614/22095 [35:17:38<2:33:35, 6.22s/it] {'loss': 0.4847, 'grad_norm': 0.3611265207641925, 'learning_rate': 1.1751739156407649e-07, 'epoch': 0.93}
93%|█████████▎| 20615/22095 [35:17:41<2:11:49, 5.34s/it] {'loss': 0.2688, 'grad_norm': 0.6068286561560011, 'learning_rate': 1.1735947523615344e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20616/22095 [35:17:46<2:09:34, 5.26s/it] {'loss': 0.2683, 'grad_norm': 0.5836598884521632, 'learning_rate': 1.1720166382117925e-07, 'epoch': 0.93}
93%|█████████▎| 20617/22095 [35:17:49<1:51:06, 4.51s/it] {'loss': 0.281, 'grad_norm': 0.5905638232357641, 'learning_rate': 1.1704395732254515e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 11269371 in VC:s3://internvl2/datasets/MMMUDataset/. Exception: Image size [174, 169, 188, 172, 197, 21, 194, 51, 194, 42, 193, 45, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 776, 'image': ['MMMU/Architecture_and_Engineering/test_108_image_1.png', 'MMMU/Architecture_and_Engineering/test_108_image_2.png', 'MMMU/Architecture_and_Engineering/test_108_image_3.png', 'MMMU/Architecture_and_Engineering/test_108_image_4.png', 'MMMU/Architecture_and_Engineering/test_108_image_5.png', 'MMMU/Architecture_and_Engineering/test_108_image_6.png'], 'image_wh': [[174, 169], [188, 172], [197, 21], [194, 51], [194, 42], [193, 45]], 'conversations': [{'from': 'human', 'value': "Image-1: \nImage-2: \nImage-3: \nImage-4: \nImage-5: \nImage-6: \nPlease analyze the relevant background knowledge involved in this question and provide a detailed introduction to each knowledge point. You need to provide a detailed introduction to all the knowledge points involved in the question and answer, and explain each knowledge point in detail, including concepts, applications, and related examples, similar to the detailed content in textbooks. \nPlease note that the answer requires 2,000 words.Note that you do not need to give a specific answer to the question.Here are the content and options of this question: Match List I with List II and select the correct answer using the codes given below the lists:\nOptions: ['', '', '', '']"}, {'from': 'gpt', 'value': 'To thoroughly understand the nature of the question and effectively match the given objects and symbols, it is essential to delve into the specific characteristics of each fencing type and the general principles of representation using symbols. This exploration not only involves understanding the physical characteristics of each item but also interpreting how these can be visually symbolized. Below are detailed introductions to each concept and its applications.\n\n### 1. Hedge\n\n**Concept**: \nA hedge is a dense row of shrubs or low trees that forms a boundary or screen. Traditionally, hedges have been used for privacy, windbreaks, or to define property lines. 
They are living plants that can be shaped and maintained through regular trimming and care.\n\n**Applications**:\n- **Privacy and Security**: Hedges provide a natural barrier against intruders – both human and animal – and offer privacy by creating a visual screen.\n- **Aesthetic Appeal**: They enhance the landscape architecture due to their lush green appearance and ability to be sculpted into various shapes and designs.\n- **Environmental Benefits**: Hedges contribute to biodiversity by providing habitat for wildlife, reducing pollution by trapping particulates, and aiding in air purification.\n\n**Example**: A common example of a hedge is a row of well-maintained boxwood shrubs or privet bushes along the boundary of a garden.\n\n### 2. Wire Fencing\n\n**Concept**: \nWire fencing consists of strands of wire arranged in various configurations, often held up by posts. It is designed to keep animals in or out and to mark boundaries.\n\n**Types of Wire Fencing**:\n- **Barbed Wire**: Used mainly for livestock containment; it has sharp edges to deter animals.\n- **Woven Wire**: Also known as field fencing, suitable for large animals, with woven grids to provide strength.\n- **Chain Link Fencing**: Often used in urban areas for its durability and low cost.\n\n**Applications**:\n- **Agricultural**: Secures fields and pens for the safety of livestock, protecting them against predators.\n- **Industrial and Urban Use**: Provides security around buildings, parks, and construction sites.\n\n**Example**: Barbed wire fencing is often seen around ranches, protecting cattle and other livestock.\n\n### 3. Pipe Fencing\n\n**Concept**: \nPipe fencing uses metal pipes as posts and rails to construct a sturdy, long-lasting fence. 
It is highly durable and resistant to damage from animals and weather conditions.\n\n**Applications**:\n- **Ranch and Farm Usage**: Ideal for containing horses and cattle due to its strength and minimal maintenance requirements.\n- **Industrial Use**: Used in facilities that demand high security and safety standards.\n\n**Example**: Pipe fencing is commonly employed in horse stables and large farms where robust containment is crucial.\n\n### 4. Wood Fencing\n\n**Concept**: \nWood fencing is a traditional form of fencing using timber. It can be constructed in various styles such as picket, panel, or post and rail.\n\n**Types of Wood Fencing**:\n- **Picket Fence**: Iconic style often used in suburban homes.\n- **Privacy Fence**: Tall and solid to block visibility, commonly used around gardens.\n- **Split Rail**: Rustic style typically found in rural settings.\n\n**Applications**:\n- **Residential**: Provides privacy, defines boundaries, and enhances aesthetic appeal.\n- **Decorative**: Frequently used for purely ornamental reasons in landscaping.\n\n**Example**: A white picket fence is a traditional American symbol of the quintessential suburban lifestyle.\n\n### Symbol Representation in Fencing\n\n**Concept**: \nThe use of symbols to represent physical objects like fencing types is a method employed in mapping, technical drawings, and schematic designs to convey information in a simplified visual format.\n\n**Principles**:\n- **Recognizability**: Symbols should be easily recognizable and associated with the object they represent.\n- **Simplicity**: Symbols are typically simplified versions, capturing the essence rather than detailing every characteristic.\n- **Standardization**: In many fields, symbols are standardized to facilitate universal understanding and interpretation.\n\n**Applications**:\n- **Maps and Diagrams**: Symbols allow for quick reference without overwhelming with details.\n- **Technical Plans**: In construction or engineering, symbols efficiently 
communicate specifications.\n\n**Example**: On a garden landscape design, a series of “arches” might symbolize a hedge, while “crossed lines” could indicate wire fencing.\n\n### Matching Symbols to Objects\n\n**Process**:\n- **Identify Characteristics**: Assess the characteristics of the object and find the visual element in the symbol that best represents these.\n- **Logical Association**: Consider what aspects of the object are most significant in context and how they relate to the intended symbol.\n- **Contextual Relevance**: The match should make sense in the context in which the symbols are being used, reflecting the practical application of each object.\n\n**Example**: A symbol with circular repetitions or continuous flow might align with the linear continuity of a pipe fence, while scattered dots could suggest the irregularity found in wire fencing patterns.\n\nBy understanding these concepts, one can accurately match objects like fences with their symbolic representations, which is a fundamental skill in fields such as landscaping, architecture, and cartography.'}]}
93%|█████████▎| 20618/22095 [35:17:52<1:42:34, 4.17s/it] {'loss': 0.2656, 'grad_norm': 0.5952007858494884, 'learning_rate': 1.1688635574363894e-07, 'epoch': 0.93}
93%|█████████▎| 20619/22095 [35:17:56<1:39:57, 4.06s/it] {'loss': 0.286, 'grad_norm': 0.6158201663415633, 'learning_rate': 1.1672885908784792e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918155 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41308, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]}
93%|█████████▎| 20620/22095 [35:17:59<1:29:47, 3.65s/it] {'loss': 0.2793, 'grad_norm': 0.561801502229038, 'learning_rate': 1.1657146735855662e-07, 'epoch': 0.93}
93%|█████████▎| 20621/22095 [35:18:01<1:22:53, 3.37s/it] {'loss': 0.3061, 'grad_norm': 0.6148801897922445, 'learning_rate': 1.1641418055914566e-07, 'epoch': 0.93}
93%|█████████▎| 20622/22095 [35:18:06<1:30:31, 3.69s/it] {'loss': 0.2931, 'grad_norm': 0.5864090389864688, 'learning_rate': 1.1625699869299457e-07, 'epoch': 0.93}
93%|█████████▎| 20623/22095 [35:18:09<1:27:09, 3.55s/it] {'loss': 0.2819, 'grad_norm': 0.6281392670998087, 'learning_rate': 1.1609992176348228e-07, 'epoch': 0.93}
93%|█████████▎| 20624/22095 [35:18:12<1:25:08, 3.47s/it] {'loss': 0.3034, 'grad_norm': 0.5604583332916402, 'learning_rate': 1.1594294977398224e-07, 'epoch': 0.93}
93%|█████████▎| 20625/22095 [35:18:15<1:20:50, 3.30s/it] {'loss': 0.2666, 'grad_norm': 0.6333945542597627, 'learning_rate': 1.1578608272786785e-07, 'epoch': 0.93}
93%|█████████▎| 20626/22095 [35:18:18<1:17:19, 3.16s/it] {'loss': 0.3127, 'grad_norm': 0.6070115405261841, 'learning_rate': 1.1562932062851084e-07, 'epoch': 0.93}
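Alongside the image-size failures, this log repeatedly shows `Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)` warnings, which the tokenizer says "will result in indexing errors" if such a sequence reaches the model. A minimal sketch of a pre-collation length guard, not the training script's actual behaviour: the `(sample_id, token_count)` input shape and the skip-rather-than-truncate policy are assumptions, and 40960 is simply the `model_max_length` value the warnings report.

```python
MAX_LEN = 40960  # model maximum reported by the tokenizer warnings in this log


def filter_by_length(samples, max_len=MAX_LEN):
    """Partition samples into those that fit the model context and those that don't.

    `samples` is an iterable of (sample_id, token_count) pairs; oversized
    samples are the ones that would otherwise trigger the 'Token indices
    sequence length is longer ...' warning.
    """
    kept, skipped = [], []
    for sample_id, n_tokens in samples:
        (kept if n_tokens <= max_len else skipped).append(sample_id)
    return kept, skipped
```

Counting tokens once per sample during dataset preparation would make these overruns visible (and fixable) before they surface as per-step warnings mid-training.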
93%|█████████▎| 20627/22095 [35:18:22<1:19:13, 3.24s/it] {'loss': 0.3199, 'grad_norm': 0.6164082922394221, 'learning_rate': 1.1547266347927743e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8607360 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20283, 'image': '1580800165.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Self-Help? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
93%|█████████▎| 20628/22095 [35:18:25<1:24:31, 3.46s/it] {'loss': 0.3147, 'grad_norm': 0.5987945549595481, 'learning_rate': 1.1531611128353548e-07, 'epoch': 0.93}
93%|█████████▎| 20629/22095 [35:18:29<1:22:27, 3.37s/it] {'loss': 0.2713, 'grad_norm': 0.6618821649290977, 'learning_rate': 1.1515966404464728e-07, 'epoch': 0.93}
93%|█████████▎| 20630/22095 [35:18:32<1:25:34, 3.50s/it] {'loss': 0.2605, 'grad_norm': 0.5678115155151787, 'learning_rate': 1.1500332176597629e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8944628 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 67781, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': '\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 11cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=18cm,BC=6cm,∴AC=AB-BC=12cm又∵D为BC的中点,∴CD=\\frac{1}{2}BC=3于是AD=AC+CD=12+3=15'}]}
93%|█████████▎| 20631/22095 [35:18:36<1:23:09, 3.41s/it] {'loss': 0.3178, 'grad_norm': 0.6471740693252899, 'learning_rate': 1.1484708445087978e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8346098 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12755, 'image': 'vrdu_table_final_2/astro-ph.CO/708390e5-b0f8-47f2-8f1a-cdac1cb6a201.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
93%|█████████▎| 20632/22095 [35:18:39<1:26:13, 3.54s/it] {'loss': 0.3047, 'grad_norm': 0.5827959737037237, 'learning_rate': 1.1469095210271675e-07, 'epoch': 0.93}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
93%|█████████▎| 20633/22095 [35:18:42<1:20:12, 3.29s/it] {'loss': 0.2448, 'grad_norm': 0.6409107104206825, 'learning_rate': 1.1453492472484118e-07, 'epoch': 0.93}
Token indices sequence length is longer
than the specified maximum sequence length for this model (58673 > 40960). Running this sequence through the model will result in indexing errors 93%|█████████▎| 20634/22095 [35:18:46<1:25:57, 3.53s/it] {'loss': 0.2844, 'grad_norm': 0.5763797274333525, 'learning_rate': 1.1437900232060483e-07, 'epoch': 0.93} 93%|█████████▎| 20634/22095 [35:18:46<1:25:57, 3.53s/it] 93%|█████████▎| 20635/22095 [35:18:50<1:27:53, 3.61s/it] {'loss': 0.3084, 'grad_norm': 0.6591185442707223, 'learning_rate': 1.1422318489335838e-07, 'epoch': 0.93} 93%|█████████▎| 20635/22095 [35:18:50<1:27:53, 3.61s/it] 93%|█████████▎| 20636/22095 [35:18:53<1:25:32, 3.52s/it] {'loss': 0.2699, 'grad_norm': 0.6101336407060521, 'learning_rate': 1.1406747244645078e-07, 'epoch': 0.93} 93%|█████████▎| 20636/22095 [35:18:53<1:25:32, 3.52s/it] 93%|█████████▎| 20637/22095 [35:18:56<1:20:51, 3.33s/it] {'loss': 0.2701, 'grad_norm': 0.6088189525568914, 'learning_rate': 1.1391186498322771e-07, 'epoch': 0.93} 93%|█████████▎| 20637/22095 [35:18:56<1:20:51, 3.33s/it] 93%|█████████▎| 20638/22095 [35:19:00<1:27:10, 3.59s/it] {'loss': 0.2833, 'grad_norm': 0.647299207439683, 'learning_rate': 1.1375636250703092e-07, 'epoch': 0.93} 93%|█████████▎| 20638/22095 [35:19:01<1:27:10, 3.59s/it] 93%|█████████▎| 20639/22095 [35:19:04<1:26:02, 3.55s/it] {'loss': 0.2807, 'grad_norm': 0.6330465226385166, 'learning_rate': 1.1360096502120387e-07, 'epoch': 0.93} 93%|█████████▎| 20639/22095 [35:19:04<1:26:02, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (68973 > 40960). 
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20640/22095 [35:19:07<1:25:51, 3.54s/it] {'loss': 0.3136, 'grad_norm': 0.7381215076472419, 'learning_rate': 1.1344567252908445e-07, 'epoch': 0.93} 93%|█████████▎| 20640/22095 [35:19:07<1:25:51, 3.54s/it] 93%|█████████▎| 20641/22095 [35:19:11<1:23:15, 3.44s/it] {'loss': 0.2919, 'grad_norm': 0.5984678510006978, 'learning_rate': 1.1329048503400996e-07, 'epoch': 0.93} 93%|█████████▎| 20641/22095 [35:19:11<1:23:15, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20642/22095 [35:19:23<2:26:40, 6.06s/it] {'loss': 0.452, 'grad_norm': 0.2648725606387613, 'learning_rate': 1.1313540253931387e-07, 'epoch': 0.93} 93%|█████████▎| 20642/22095 [35:19:23<2:26:40, 6.06s/it] 93%|█████████▎| 20643/22095 [35:19:27<2:12:55, 5.49s/it] {'loss': 0.3172, 'grad_norm': 0.6362303922975918, 'learning_rate': 1.1298042504832963e-07, 'epoch': 0.93} 93%|█████████▎| 20643/22095 [35:19:27<2:12:55, 5.49s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 93%|█████████▎| 20644/22095 [35:19:31<2:03:04, 5.09s/it] {'loss': 0.2738, 'grad_norm': 0.5489044672031816, 'learning_rate': 1.1282555256438622e-07, 'epoch': 0.93} 93%|█████████▎| 20644/22095 [35:19:31<2:03:04, 5.09s/it] 93%|█████████▎| 20645/22095 [35:19:34<1:46:51, 4.42s/it] {'loss': 0.2534, 'grad_norm': 0.6001819554223173, 'learning_rate': 1.1267078509081209e-07, 'epoch': 0.93} 93%|█████████▎| 20645/22095 [35:19:34<1:46:51, 4.42s/it] 93%|█████████▎| 20646/22095 [35:19:37<1:36:00, 3.98s/it] {'loss': 0.2834, 'grad_norm': 0.5818315990671367, 'learning_rate': 1.1251612263093292e-07, 'epoch': 0.93} 93%|█████████▎| 20646/22095 [35:19:37<1:36:00, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (60029 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49440 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66971 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63575 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98732 > 40960). Running this sequence through the model will result in indexing errors
93%|█████████▎| 20647/22095 [35:19:40<1:29:49, 3.72s/it] {'loss': 0.3121, 'grad_norm': 0.6244052785314712, 'learning_rate': 1.1236156518807106e-07, 'epoch': 0.93}
Invalidate trace cache @ step 2: expected module 1, but got module 364
93%|█████████▎| 20648/22095 [35:19:47<1:52:30, 4.67s/it] {'loss': 0.4486, 'grad_norm': 0.2778742047993764, 'learning_rate': 1.1220711276554775e-07, 'epoch': 0.93}
93%|█████████▎| 20649/22095 [35:19:51<1:50:33, 4.59s/it] {'loss': 0.2993, 'grad_norm': 0.6124342133392783, 'learning_rate': 1.1205276536668252e-07, 'epoch': 0.93}
93%|█████████▎| 20650/22095 [35:19:55<1:41:17, 4.21s/it] {'loss': 0.2454, 'grad_norm': 0.5638354052595087, 'learning_rate': 1.118985229947911e-07, 'epoch': 0.93}
93%|█████████▎| 20651/22095 [35:19:58<1:36:12, 4.00s/it] {'loss': 0.2681, 'grad_norm': 0.6008229714251444, 'learning_rate': 1.1174438565318691e-07, 'epoch': 0.93}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8909859 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 33012, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nA. 9\nB. 10\nC. 12\nD. 16\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
93%|█████████▎| 20652/22095 [35:20:02<1:35:11, 3.96s/it] {'loss': 0.2764, 'grad_norm': 0.6160137116835261, 'learning_rate': 1.1159035334518343e-07, 'epoch': 0.93}
Token indices sequence length is longer than the specified maximum sequence length for this model (57001 > 40960).
Running this sequence through the model will result in indexing errors 93%|█████████▎| 20653/22095 [35:20:05<1:29:03, 3.71s/it] {'loss': 0.2865, 'grad_norm': 0.6336472871929654, 'learning_rate': 1.1143642607409023e-07, 'epoch': 0.93} 93%|█████████▎| 20653/22095 [35:20:05<1:29:03, 3.71s/it] 93%|█████████▎| 20654/22095 [35:20:09<1:26:21, 3.60s/it] {'loss': 0.2658, 'grad_norm': 0.6108194372529918, 'learning_rate': 1.11282603843213e-07, 'epoch': 0.93} 93%|█████████▎| 20654/22095 [35:20:09<1:26:21, 3.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 93%|█████████▎| 20655/22095 [35:20:20<2:25:46, 6.07s/it] {'loss': 0.4549, 'grad_norm': 0.2761153769668537, 'learning_rate': 1.1112888665585852e-07, 'epoch': 0.93} 93%|█████████▎| 20655/22095 [35:20:20<2:25:46, 6.07s/it] 93%|█████████▎| 20656/22095 [35:20:24<2:08:57, 5.38s/it] {'loss': 0.3186, 'grad_norm': 0.6343527944142431, 'learning_rate': 1.109752745153292e-07, 'epoch': 0.93} 93%|█████████▎| 20656/22095 [35:20:24<2:08:57, 5.38s/it] 93%|█████████▎| 20657/22095 [35:20:27<1:54:15, 4.77s/it] {'loss': 0.2781, 'grad_norm': 0.5965075668169835, 'learning_rate': 1.1082176742492623e-07, 'epoch': 0.93} 93%|█████████▎| 20657/22095 [35:20:27<1:54:15, 4.77s/it] 93%|█████████▎| 20658/22095 [35:20:31<1:42:51, 4.29s/it] {'loss': 0.2521, 'grad_norm': 0.6259405292833713, 'learning_rate': 1.1066836538794645e-07, 'epoch': 0.93} 93%|█████████▎| 20658/22095 [35:20:31<1:42:51, 4.29s/it] 94%|█████████▎| 20659/22095 [35:20:35<1:43:01, 4.30s/it] {'loss': 0.3014, 'grad_norm': 0.6056507338546241, 'learning_rate': 1.1051506840768833e-07, 'epoch': 0.94} 94%|█████████▎| 20659/22095 [35:20:35<1:43:01, 4.30s/it] 94%|█████████▎| 20660/22095 [35:20:38<1:36:27, 4.03s/it] {'loss': 0.3197, 'grad_norm': 0.5962640258792247, 'learning_rate': 1.1036187648744311e-07, 'epoch': 0.94} 94%|█████████▎| 20660/22095 [35:20:38<1:36:27, 4.03s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length 
is longer than the specified maximum sequence length for this model (40966 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47576 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51843 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92003 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57232 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62671 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20661/22095 [35:20:48<2:17:59, 5.77s/it] {'loss': 0.4817, 'grad_norm': 0.28611351228558085, 'learning_rate': 1.1020878963050485e-07, 'epoch': 0.94} 94%|█████████▎| 20661/22095 [35:20:48<2:17:59, 5.77s/it] 94%|█████████▎| 20662/22095 [35:20:52<2:01:40, 5.09s/it] {'loss': 0.3086, 'grad_norm': 0.8179938486173777, 'learning_rate': 1.10055807840162e-07, 'epoch': 0.94} 94%|█████████▎| 20662/22095 [35:20:52<2:01:40, 5.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (128535 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48610 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85496 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20663/22095 [35:20:55<1:48:37, 4.55s/it] {'loss': 0.2467, 'grad_norm': 0.6348754369470706, 'learning_rate': 1.0990293111970085e-07, 'epoch': 0.94} 94%|█████████▎| 20663/22095 [35:20:55<1:48:37, 4.55s/it] 94%|█████████▎| 20664/22095 [35:20:58<1:38:28, 4.13s/it] {'loss': 0.2806, 'grad_norm': 0.6353518229408754, 'learning_rate': 1.0975015947240652e-07, 'epoch': 0.94} 94%|█████████▎| 20664/22095 [35:20:58<1:38:28, 4.13s/it] 94%|█████████▎| 20665/22095 [35:21:02<1:36:39, 4.06s/it] {'loss': 0.2838, 'grad_norm': 0.5646218500174285, 'learning_rate': 1.0959749290156307e-07, 'epoch': 0.94} 94%|█████████▎| 20665/22095 [35:21:02<1:36:39, 4.06s/it] 94%|█████████▎| 20666/22095 [35:21:06<1:34:42, 3.98s/it] {'loss': 0.3028, 'grad_norm': 0.6067841157078296, 'learning_rate': 1.0944493141044953e-07, 'epoch': 0.94} 94%|█████████▎| 20666/22095 [35:21:06<1:34:42, 3.98s/it] 94%|█████████▎| 20667/22095 [35:21:10<1:34:26, 3.97s/it] {'loss': 0.3187, 'grad_norm': 0.6727425150929137, 'learning_rate': 1.0929247500234386e-07, 'epoch': 0.94} 94%|█████████▎| 20667/22095 [35:21:10<1:34:26, 3.97s/it] 94%|█████████▎| 20668/22095 [35:21:13<1:31:16, 3.84s/it] {'loss': 0.2772, 'grad_norm': 0.5660911183506132, 'learning_rate': 1.0914012368052229e-07, 'epoch': 0.94} 94%|█████████▎| 20668/22095 [35:21:13<1:31:16, 3.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▎| 20669/22095 [35:21:23<2:11:59, 5.55s/it] {'loss': 0.4611, 'grad_norm': 0.26176590366847236, 'learning_rate': 1.0898787744825833e-07, 'epoch': 0.94} 94%|█████████▎| 20669/22095 [35:21:23<2:11:59, 5.55s/it] 94%|█████████▎| 20670/22095 [35:21:26<1:57:55, 4.97s/it] {'loss': 0.3043, 'grad_norm': 0.6051240123558446, 'learning_rate': 
1.0883573630882327e-07, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▎| 20671/22095 [35:21:36<2:27:55, 6.23s/it] {'loss': 0.4457, 'grad_norm': 0.28517697737707726, 'learning_rate': 1.086837002654867e-07, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8948713 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 71866, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,AB段上有两个点C和D,AD=\\ frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '2'}]}
94%|█████████▎| 20672/22095 [35:21:39<2:06:36, 5.34s/it] {'loss': 0.283, 'grad_norm': 0.6380053084611067, 'learning_rate': 1.0853176932151432e-07, 'epoch': 0.94}
94%|█████████▎| 20673/22095 [35:21:42<1:49:54, 4.64s/it] {'loss': 0.2406, 'grad_norm': 0.6196496503170629, 'learning_rate': 1.0837994348017133e-07, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (51835 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47159 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45535 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20674/22095 [35:21:45<1:37:12, 4.10s/it] {'loss': 0.32, 'grad_norm': 0.5589782837257363, 'learning_rate': 1.0822822274472011e-07, 'epoch': 0.94} 94%|█████████▎| 20674/22095 [35:21:45<1:37:12, 4.10s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51219 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41411 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (71983 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20675/22095 [35:21:48<1:28:16, 3.73s/it] {'loss': 0.326, 'grad_norm': 0.6107204778054238, 'learning_rate': 1.0807660711842027e-07, 'epoch': 0.94} 94%|█████████▎| 20675/22095 [35:21:48<1:28:16, 3.73s/it] 94%|█████████▎| 20676/22095 [35:21:51<1:25:59, 3.64s/it] {'loss': 0.2679, 'grad_norm': 0.5624460182654537, 'learning_rate': 1.0792509660452921e-07, 'epoch': 0.94} 94%|█████████▎| 20676/22095 [35:21:51<1:25:59, 3.64s/it] 94%|█████████▎| 20677/22095 [35:21:54<1:20:25, 3.40s/it] {'loss': 0.2483, 'grad_norm': 0.6935165199403789, 'learning_rate': 1.0777369120630377e-07, 'epoch': 0.94} 94%|█████████▎| 20677/22095 [35:21:54<1:20:25, 3.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (77838 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57530 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120358 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20678/22095 [35:21:57<1:17:27, 3.28s/it] {'loss': 0.2961, 'grad_norm': 0.6135992877109803, 'learning_rate': 1.0762239092699633e-07, 'epoch': 0.94} 94%|█████████▎| 20678/22095 [35:21:57<1:17:27, 3.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67720 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (125341 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20679/22095 [35:22:00<1:14:48, 3.17s/it] {'loss': 0.3076, 'grad_norm': 0.5876182716494162, 'learning_rate': 1.0747119576985765e-07, 'epoch': 0.94} 94%|█████████▎| 20679/22095 [35:22:00<1:14:48, 3.17s/it] 94%|█████████▎| 20680/22095 [35:22:04<1:19:51, 3.39s/it] {'loss': 0.2848, 'grad_norm': 0.6489565993731783, 'learning_rate': 1.0732010573813623e-07, 'epoch': 0.94} 94%|█████████▎| 20680/22095 [35:22:04<1:19:51, 3.39s/it] 94%|█████████▎| 20681/22095 [35:22:07<1:20:23, 3.41s/it] {'loss': 0.3443, 'grad_norm': 0.6246061102068522, 'learning_rate': 1.0716912083508003e-07, 'epoch': 0.94} 94%|█████████▎| 20681/22095 [35:22:07<1:20:23, 3.41s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▎| 20682/22095 [35:22:17<2:02:29, 5.20s/it] {'loss': 0.4986, 'grad_norm': 0.2884324669723961, 'learning_rate': 1.07018241063932e-07, 'epoch': 0.94} 94%|█████████▎| 20682/22095 [35:22:17<2:02:29, 5.20s/it]Token indices 
sequence length is longer than the specified maximum sequence length for this model (90916 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107369 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (75409 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20683/22095 [35:22:24<2:19:23, 5.92s/it] {'loss': 0.4963, 'grad_norm': 0.29179013837324413, 'learning_rate': 1.06867466427934e-07, 'epoch': 0.94} 94%|█████████▎| 20683/22095 [35:22:24<2:19:23, 5.92s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (81903 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▎| 20684/22095 [35:22:27<2:00:40, 5.13s/it] {'loss': 0.3042, 'grad_norm': 0.6290104692273423, 'learning_rate': 1.0671679693032621e-07, 'epoch': 0.94} 94%|█████████▎| 20684/22095 [35:22:27<2:00:40, 5.13s/it] 94%|█████████▎| 20685/22095 [35:22:31<1:47:22, 4.57s/it] {'loss': 0.3001, 'grad_norm': 0.5833963754232364, 'learning_rate': 1.0656623257434551e-07, 'epoch': 0.94} 94%|█████████▎| 20685/22095 [35:22:31<1:47:22, 4.57s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (49463 > 40960). 
Running this sequence through the model will result in indexing errors 94%|█████████▎| 20686/22095 [35:22:40<2:21:17, 6.02s/it] {'loss': 0.4641, 'grad_norm': 0.2653414681907125, 'learning_rate': 1.0641577336322761e-07, 'epoch': 0.94} 94%|█████████▎| 20686/22095 [35:22:40<2:21:17, 6.02s/it] 94%|█████████▎| 20687/22095 [35:22:49<2:42:08, 6.91s/it] {'loss': 0.4616, 'grad_norm': 0.271146481973984, 'learning_rate': 1.0626541930020551e-07, 'epoch': 0.94} 94%|█████████▎| 20687/22095 [35:22:49<2:42:08, 6.91s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 94%|█████████▎| 20688/22095 [35:22:52<2:15:35, 5.78s/it] {'loss': 0.2942, 'grad_norm': 0.6061218869882479, 'learning_rate': 1.0611517038850938e-07, 'epoch': 0.94} 94%|█████████▎| 20688/22095 [35:22:52<2:15:35, 5.78s/it] 94%|█████████▎| 20689/22095 [35:22:55<1:57:25, 5.01s/it] {'loss': 0.305, 'grad_norm': 0.6484075906920178, 'learning_rate': 1.0596502663136776e-07, 'epoch': 0.94} 94%|█████████▎| 20689/22095 [35:22:55<1:57:25, 5.01s/it] 94%|█████████▎| 20690/22095 [35:22:59<1:45:58, 4.53s/it] {'loss': 0.2868, 'grad_norm': 0.6201439884226438, 'learning_rate': 1.0581498803200696e-07, 'epoch': 0.94} 94%|█████████▎| 20690/22095 [35:22:59<1:45:58, 4.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▎| 20691/22095 [35:23:09<2:24:35, 6.18s/it] {'loss': 0.4628, 'grad_norm': 0.26360757034570326, 'learning_rate': 1.0566505459365106e-07, 'epoch': 0.94} 94%|█████████▎| 20691/22095 [35:23:09<2:24:35, 6.18s/it] 94%|█████████▎| 20692/22095 [35:23:19<2:50:35, 7.30s/it] {'loss': 0.4424, 'grad_norm': 0.2568003036889309, 'learning_rate': 1.0551522631952083e-07, 'epoch': 0.94} 94%|█████████▎| 20692/22095 [35:23:19<2:50:35, 7.30s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 94%|█████████▎| 20693/22095 [35:23:23<2:26:59, 6.29s/it] {'loss': 0.2877, 'grad_norm': 0.670806761610968, 'learning_rate': 1.0536550321283589e-07, 'epoch': 0.94} 94%|█████████▎| 
20693/22095 [35:23:23<2:26:59, 6.29s/it] 94%|█████████▎| 20694/22095 [35:23:31<2:39:10, 6.82s/it] {'loss': 0.461, 'grad_norm': 0.26251251174654255, 'learning_rate': 1.0521588527681426e-07, 'epoch': 0.94} 94%|█████████▎| 20694/22095 [35:23:31<2:39:10, 6.82s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 94%|█████████▎| 20695/22095 [35:23:34<2:15:38, 5.81s/it] {'loss': 0.3156, 'grad_norm': 0.62146968516843, 'learning_rate': 1.0506637251467e-07, 'epoch': 0.94} 94%|█████████▎| 20695/22095 [35:23:34<2:15:38, 5.81s/it] 94%|█████████▎| 20696/22095 [35:23:37<1:57:25, 5.04s/it] {'loss': 0.2895, 'grad_norm': 0.6042091551748556, 'learning_rate': 1.0491696492961501e-07, 'epoch': 0.94} 94%|█████████▎| 20696/22095 [35:23:37<1:57:25, 5.04s/it] 94%|█████████▎| 20697/22095 [35:23:41<1:43:50, 4.46s/it] {'loss': 0.3159, 'grad_norm': 0.6251271004901205, 'learning_rate': 1.0476766252486114e-07, 'epoch': 0.94} 94%|█████████▎| 20697/22095 [35:23:41<1:43:50, 4.46s/it] 94%|█████████▎| 20698/22095 [35:23:45<1:41:00, 4.34s/it] {'loss': 0.2795, 'grad_norm': 0.7420180169744252, 'learning_rate': 1.046184653036153e-07, 'epoch': 0.94} 94%|█████████▎| 20698/22095 [35:23:45<1:41:00, 4.34s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [133, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8887882 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [133, 25, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 11035, 'image': 'images/5343.png', 'image_wh': [[133, 25]], 'conversations': [{'from': 'human', 'value': "\n如图,O是线段AB的中点,C在线段OB上,AC=6,CB=3,则OC的长等于()\nA. 1\nB. 1.5\nC. 2\nD. 0.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 94%|█████████▎| 20699/22095 [35:23:48<1:32:53, 3.99s/it] {'loss': 0.2615, 'grad_norm': 0.5907238212284458, 'learning_rate': 1.044693732690838e-07, 'epoch': 0.94} 94%|█████████▎| 20699/22095 [35:23:48<1:32:53, 3.99s/it] 94%|█████████▎| 20700/22095 [35:23:52<1:33:33, 4.02s/it] {'loss': 0.3017, 'grad_norm': 0.5952101244183566, 'learning_rate': 1.0432038642446962e-07, 'epoch': 0.94} 94%|█████████▎| 20700/22095 [35:23:52<1:33:33, 4.02s/it] 94%|█████████▎| 20701/22095 [35:23:55<1:27:19, 3.76s/it] {'loss': 0.2927, 'grad_norm': 0.6378778314486471, 'learning_rate': 1.0417150477297466e-07, 'epoch': 0.94} 94%|█████████▎| 20701/22095 [35:23:55<1:27:19, 3.76s/it] 94%|█████████▎| 20702/22095 [35:23:59<1:25:46, 3.69s/it] {'loss': 0.2539, 'grad_norm': 0.6421025040459815, 'learning_rate': 1.0402272831779747e-07, 'epoch': 0.94} 94%|█████████▎| 20702/22095 [35:23:59<1:25:46, 3.69s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047226 in VC:s3://multi-modal/UniGeo/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'calculation_images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 6cm\nB. 8cm\nC. 10cm\nD. 12cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 94%|█████████▎| 20703/22095 [35:24:02<1:26:12, 3.72s/it] {'loss': 0.2801, 'grad_norm': 0.6106293661526145, 'learning_rate': 1.038740570621355e-07, 'epoch': 0.94} 94%|█████████▎| 20703/22095 [35:24:02<1:26:12, 3.72s/it] 94%|█████████▎| 20704/22095 [35:24:06<1:25:40, 3.70s/it] {'loss': 0.2888, 'grad_norm': 0.5933251946238154, 'learning_rate': 1.0372549100918283e-07, 'epoch': 0.94} 94%|█████████▎| 20704/22095 [35:24:06<1:25:40, 3.70s/it] 94%|█████████▎| 20705/22095 [35:24:09<1:20:38, 3.48s/it] {'loss': 0.3189, 'grad_norm': 0.6368085204757286, 'learning_rate': 1.0357703016213083e-07, 'epoch': 0.94} 94%|█████████▎| 20705/22095 [35:24:09<1:20:38, 3.48s/it] 94%|█████████▎| 20706/22095 [35:24:12<1:17:36, 3.35s/it] {'loss': 0.2458, 'grad_norm': 0.8117745628825518, 'learning_rate': 1.0342867452417027e-07, 'epoch': 0.94} 94%|█████████▎| 20706/22095 [35:24:12<1:17:36, 3.35s/it] 94%|█████████▎| 20707/22095 [35:24:15<1:14:24, 3.22s/it] {'loss': 0.3155, 'grad_norm': 0.593644986221223, 'learning_rate': 1.0328042409849026e-07, 'epoch': 0.94} 94%|█████████▎| 20707/22095 [35:24:15<1:14:24, 3.22s/it] 94%|█████████▎| 20708/22095 [35:24:18<1:12:46, 3.15s/it] {'loss': 0.314, 'grad_norm': 0.6291442440715559, 'learning_rate': 1.0313227888827326e-07, 'epoch': 0.94} 94%|█████████▎| 20708/22095 [35:24:18<1:12:46, 3.15s/it] 94%|█████████▎| 20709/22095 [35:24:21<1:13:54, 3.20s/it] {'loss': 0.3022, 'grad_norm': 0.5833471343156812, 'learning_rate': 1.0298423889670395e-07, 'epoch': 0.94} 94%|█████████▎| 20709/22095 [35:24:21<1:13:54, 3.20s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8898848 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 22001, 'image': 'images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,M点是AB段的中点,N点在MB段上。如果AB=12,AM:BN=3:1,则Mn段的长度为()\nA. 6\nB. 5\nC. 4\nD. 3'}, {'from': 'gpt', 'value': '【解答】解:∵AM:BN=3:1,而点M是线段AB的中点,且AB=12,∴AM=BM=6,BN=2而MN=BM-BN=6-2=4'}]} 94%|█████████▎| 20710/22095 [35:24:25<1:15:23, 3.27s/it] {'loss': 0.3423, 'grad_norm': 0.6802737846945168, 'learning_rate': 1.0283630412696422e-07, 'epoch': 0.94} 94%|█████████▎| 20710/22095 [35:24:25<1:15:23, 3.27s/it] 94%|█████████▎| 20711/22095 [35:24:28<1:16:32, 3.32s/it] {'loss': 0.2871, 'grad_norm': 0.6108634600364936, 'learning_rate': 1.0268847458223152e-07, 'epoch': 0.94} 94%|█████████▎| 20711/22095 [35:24:28<1:16:32, 3.32s/it] 94%|█████████▎| 20712/22095 [35:24:32<1:20:13, 3.48s/it] {'loss': 0.2992, 'grad_norm': 0.6103729859109673, 'learning_rate': 1.0254075026568222e-07, 'epoch': 0.94} 94%|█████████▎| 20712/22095 [35:24:32<1:20:13, 3.48s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▎| 20713/22095 [35:24:35<1:16:44, 3.33s/it] {'loss': 0.3295, 'grad_norm': 0.68933196891107, 'learning_rate': 1.0239313118049155e-07, 'epoch': 0.94} 94%|█████████▎| 20713/22095 [35:24:35<1:16:44, 3.33s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed 
image tokens in the conversation 94%|█████████▎| 20714/22095 [35:24:38<1:13:48, 3.21s/it] {'loss': 0.3062, 'grad_norm': 0.6171769175551642, 'learning_rate': 1.0224561732982973e-07, 'epoch': 0.94} 94%|█████████▎| 20714/22095 [35:24:38<1:13:48, 3.21s/it] 94%|█████████▍| 20715/22095 [35:24:41<1:14:35, 3.24s/it] {'loss': 0.2867, 'grad_norm': 0.6477669112440486, 'learning_rate': 1.0209820871686816e-07, 'epoch': 0.94} 94%|█████████▍| 20715/22095 [35:24:41<1:14:35, 3.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (117800 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57872 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101568 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20716/22095 [35:24:44<1:13:48, 3.21s/it] {'loss': 0.2634, 'grad_norm': 0.5893855899352083, 'learning_rate': 1.0195090534477258e-07, 'epoch': 0.94} 94%|█████████▍| 20716/22095 [35:24:44<1:13:48, 3.21s/it] 94%|█████████▍| 20717/22095 [35:24:48<1:14:16, 3.23s/it] {'loss': 0.29, 'grad_norm': 0.5858031388266357, 'learning_rate': 1.0180370721670941e-07, 'epoch': 0.94} 94%|█████████▍| 20717/22095 [35:24:48<1:14:16, 3.23s/it] 94%|█████████▍| 20718/22095 [35:24:51<1:13:55, 3.22s/it] {'loss': 0.2734, 'grad_norm': 0.6179295544108785, 'learning_rate': 1.0165661433583996e-07, 'epoch': 0.94} 94%|█████████▍| 20718/22095 [35:24:51<1:13:55, 3.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52333 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (67998 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49563 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44453 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62376 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (45437 > 40960) for 4 sample(s). Truncating to 4240 with 2 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (52413 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20719/22095 [35:24:55<1:18:48, 3.44s/it] {'loss': 0.2639, 'grad_norm': 0.588071945852166, 'learning_rate': 1.0150962670532671e-07, 'epoch': 0.94} 94%|█████████▍| 20719/22095 [35:24:55<1:18:48, 3.44s/it] 94%|█████████▍| 20720/22095 [35:24:59<1:24:09, 3.67s/it] {'loss': 0.3177, 'grad_norm': 0.654440332520569, 'learning_rate': 1.0136274432832715e-07, 'epoch': 0.94} 94%|█████████▍| 20720/22095 [35:24:59<1:24:09, 3.67s/it] 94%|█████████▍| 20721/22095 [35:25:02<1:19:46, 3.48s/it] {'loss': 0.3317, 'grad_norm': 0.6446000731328926, 'learning_rate': 1.0121596720799653e-07, 'epoch': 0.94} 94%|█████████▍| 20721/22095 [35:25:02<1:19:46, 3.48s/it] 94%|█████████▍| 20722/22095 [35:25:06<1:22:10, 3.59s/it] {'loss': 0.2716, 'grad_norm': 0.6274830605070559, 'learning_rate': 1.01069295347489e-07, 'epoch': 0.94} 94%|█████████▍| 20722/22095 [35:25:06<1:22:10, 3.59s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▍| 20723/22095 
[35:25:10<1:23:11, 3.64s/it] {'loss': 0.2628, 'grad_norm': 0.571998916266495, 'learning_rate': 1.00922728749957e-07, 'epoch': 0.94} 94%|█████████▍| 20723/22095 [35:25:10<1:23:11, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▍| 20724/22095 [35:25:19<2:02:38, 5.37s/it] {'loss': 0.4834, 'grad_norm': 0.2695101803328294, 'learning_rate': 1.0077626741854973e-07, 'epoch': 0.94} 94%|█████████▍| 20724/22095 [35:25:19<2:02:38, 5.37s/it] 94%|█████████▍| 20725/22095 [35:25:22<1:46:47, 4.68s/it] {'loss': 0.2976, 'grad_norm': 0.605191147919929, 'learning_rate': 1.0062991135641242e-07, 'epoch': 0.94} 94%|█████████▍| 20725/22095 [35:25:22<1:46:47, 4.68s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48155 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (126314 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20726/22095 [35:25:25<1:37:57, 4.29s/it] {'loss': 0.253, 'grad_norm': 0.6077927199877912, 'learning_rate': 1.0048366056669201e-07, 'epoch': 0.94} 94%|█████████▍| 20726/22095 [35:25:25<1:37:57, 4.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▍| 20727/22095 [35:25:35<2:13:56, 5.87s/it] {'loss': 0.4636, 'grad_norm': 0.2903158097699309, 'learning_rate': 1.0033751505252987e-07, 'epoch': 0.94} 94%|█████████▍| 20727/22095 [35:25:35<2:13:56, 5.87s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] 
ValueError: Image size [14, 87, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8350009 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 87, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 16682, 'image': 'vrdu_table_final_2/astro-ph.CO/70a737e2-2e74-4238-a751-8d978080403c.png', 'image_wh': [[14, 87]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}{c}\n0\\tabularnewline\n0\\tabularnewline\n0\\tabularnewline\n\\end{tabular}\n```"}]} 94%|█████████▍| 20728/22095 [35:25:38<1:56:06, 5.10s/it] {'loss': 0.2904, 'grad_norm': 0.5420678267629183, 'learning_rate': 1.0019147481706626e-07, 'epoch': 0.94} 94%|█████████▍| 20728/22095 [35:25:38<1:56:06, 5.10s/it] 94%|█████████▍| 20729/22095 [35:25:43<1:52:17, 4.93s/it] {'loss': 0.3297, 'grad_norm': 0.566633524220132, 'learning_rate': 1.0004553986343868e-07, 'epoch': 0.94} 94%|█████████▍| 20729/22095 [35:25:43<1:52:17, 4.93s/it] 94%|█████████▍| 20730/22095 [35:25:46<1:39:44, 4.38s/it] {'loss': 0.2887, 'grad_norm': 0.6372132574695685, 'learning_rate': 9.989971019478406e-08, 'epoch': 0.94} 94%|█████████▍| 20730/22095 [35:25:46<1:39:44, 4.38s/it] 94%|█████████▍| 20731/22095 [35:25:49<1:33:09, 4.10s/it] {'loss': 0.2905, 'grad_norm': 0.5991609239206822, 'learning_rate': 9.97539858142349e-08, 'epoch': 0.94} 94%|█████████▍| 20731/22095 [35:25:49<1:33:09, 4.10s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size 
[645, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8514222 in VC:s3://internvl-moe-sft-data/. Exception: Image size [645, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 96653, 'image': 'vrdu_texteq/astro-ph.CO/0ea792f4-329e-4308-813a-297e8e87cbbd.png', 'image_wh': [[645, 25]], 'conversations': [{'from': 'human', 'value': '\nCould you transcribe any visible text from the image, please?'}, {'from': 'gpt', 'value': 'where $D_2$ is the fractal dimension of the distribution.'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [306, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8490259 in VC:s3://internvl-moe-sft-data/. Exception: Image size [306, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 8286, 'image': 'vrdu_texteq/astro-ph.CO/36bd53c7-4f9b-4f3b-a044-aa8bbaaf4324.png', 'image_wh': [[306, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'where we used ${\\cal C} \\simeq 3$ and'}]} 94%|█████████▍| 20732/22095 [35:25:53<1:27:39, 3.86s/it] {'loss': 0.294, 'grad_norm': 0.6303969117656907, 'learning_rate': 9.960836672492313e-08, 'epoch': 0.94} 94%|█████████▍| 20732/22095 [35:25:53<1:27:39, 3.86s/it] 94%|█████████▍| 20733/22095 [35:25:56<1:20:41, 3.55s/it] {'loss': 0.2889, 'grad_norm': 0.6249848427120213, 'learning_rate': 9.946285292997681e-08, 'epoch': 0.94} 94%|█████████▍| 20733/22095 [35:25:56<1:20:41, 3.55s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▍| 20734/22095 [35:26:05<2:00:30, 5.31s/it] {'loss': 0.4659, 'grad_norm': 0.2906943402790951, 'learning_rate': 9.931744443252234e-08, 'epoch': 0.94} 94%|█████████▍| 20734/22095 [35:26:05<2:00:30, 5.31s/it] 94%|█████████▍| 20735/22095 [35:26:08<1:45:47, 4.67s/it] {'loss': 0.2762, 'grad_norm': 0.601588500629499, 'learning_rate': 9.917214123568498e-08, 'epoch': 0.94} 94%|█████████▍| 20735/22095 [35:26:08<1:45:47, 4.67s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50218 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68647 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20736/22095 [35:26:11<1:36:53, 4.28s/it] {'loss': 0.2967, 'grad_norm': 0.5907351514132653, 'learning_rate': 9.902694334258722e-08, 'epoch': 0.94} 94%|█████████▍| 20736/22095 [35:26:11<1:36:53, 4.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48173 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (94555 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62512 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44820 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20737/22095 [35:26:15<1:28:25, 3.91s/it] {'loss': 0.2877, 'grad_norm': 0.6572000147313092, 'learning_rate': 9.88818507563477e-08, 'epoch': 0.94} 94%|█████████▍| 20737/22095 [35:26:15<1:28:25, 3.91s/it] 94%|█████████▍| 20738/22095 [35:26:17<1:20:51, 3.58s/it] {'loss': 0.29, 'grad_norm': 0.5992227687883745, 'learning_rate': 9.873686348008448e-08, 'epoch': 0.94} 94%|█████████▍| 20738/22095 [35:26:17<1:20:51, 3.58s/it] 94%|█████████▍| 20739/22095 [35:26:21<1:20:13, 3.55s/it] {'loss': 0.2909, 'grad_norm': 0.578189315166507, 'learning_rate': 9.859198151691341e-08, 'epoch': 0.94} 94%|█████████▍| 20739/22095 [35:26:21<1:20:13, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59860 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78662 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59311 > 40960). 
Running this sequence through the model will result in indexing errors 94%|█████████▍| 20740/22095 [35:26:24<1:15:33, 3.35s/it] {'loss': 0.2614, 'grad_norm': 0.6107110327904092, 'learning_rate': 9.844720486994752e-08, 'epoch': 0.94} 94%|█████████▍| 20740/22095 [35:26:24<1:15:33, 3.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41738 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (101202 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (136698 > 40960). Running this sequence through the model will result in indexing errors 94%|█████████▍| 20741/22095 [35:26:28<1:19:11, 3.51s/it] {'loss': 0.2667, 'grad_norm': 0.6555056399941228, 'learning_rate': 9.830253354229601e-08, 'epoch': 0.94} 94%|█████████▍| 20741/22095 [35:26:28<1:19:11, 3.51s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▍| 20742/22095 [35:26:30<1:14:46, 3.32s/it] {'loss': 0.2976, 'grad_norm': 0.6031419210418268, 'learning_rate': 9.815796753706975e-08, 'epoch': 0.94} 94%|█████████▍| 20742/22095 [35:26:30<1:14:46, 3.32s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▍| 20743/22095 [35:26:34<1:15:02, 3.33s/it] {'loss': 0.2762, 'grad_norm': 0.6368408296601574, 'learning_rate': 9.801350685737288e-08, 'epoch': 0.94} 94%|█████████▍| 20743/22095 [35:26:34<1:15:02, 3.33s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 94%|█████████▍| 20744/22095 [35:26:44<2:03:43, 5.50s/it] {'loss': 0.4666, 'grad_norm': 0.2690041210892171, 'learning_rate': 9.786915150631126e-08, 'epoch': 0.94} 94%|█████████▍| 20744/22095 
[35:26:44<2:03:43, 5.50s/it]
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [100, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8377912 in VC:s3://internvl-moe-sft-data/. Exception: Image size [100, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 44695, 'image': 'vrdu_table_final_2/astro-ph.CO/edd3afbd-e738-4539-b538-a5b5c9376f08.png', 'image_wh': [[100, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[c]{@{}c@{}}Box size \\\\ \\Mpch \\end{tabular}\n```"}]}
94%|█████████▍| 20745/22095 [35:26:54<2:31:14, 6.72s/it] {'loss': 0.4607, 'grad_norm': 0.2698433370702812, 'learning_rate': 9.772490148698522e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20746/22095 [35:26:58<2:11:11, 5.84s/it] {'loss': 0.3069, 'grad_norm': 0.690187986117633, 'learning_rate': 9.758075680249556e-08, 'epoch': 0.94}
94%|█████████▍| 20747/22095 [35:27:01<1:54:46, 5.11s/it] {'loss': 0.2957, 'grad_norm': 0.5945569657567784, 'learning_rate': 9.743671745593819e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20748/22095 [35:27:11<2:25:46, 6.49s/it] {'loss': 0.4828, 'grad_norm': 0.25335591426758786, 'learning_rate': 9.729278345040894e-08, 'epoch': 0.94}
94%|█████████▍| 20749/22095 [35:27:23<3:03:34, 8.18s/it] {'loss': 0.4656, 'grad_norm': 0.27672363140824363, 'learning_rate': 9.714895478900088e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (50733 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50549 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64212 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20750/22095 [35:27:28<2:39:53, 7.13s/it] {'loss': 0.3492, 'grad_norm': 0.6777892512541681, 'learning_rate': 9.700523147480267e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (73130 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43872 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45133 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20751/22095 [35:27:31<2:13:09, 5.94s/it] {'loss': 0.2834, 'grad_norm': 0.6734370034072383, 'learning_rate': 9.686161351090407e-08, 'epoch': 0.94}
94%|█████████▍| 20752/22095 [35:27:34<1:54:17, 5.11s/it] {'loss': 0.2997, 'grad_norm': 0.6072316263959454, 'learning_rate': 9.671810090039091e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (42545 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20753/22095 [35:27:37<1:41:02, 4.52s/it] {'loss': 0.2921, 'grad_norm': 0.7106171343448984, 'learning_rate': 9.65746936463463e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (42314 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54123 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (92057 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20754/22095 [35:27:40<1:30:54, 4.07s/it] {'loss': 0.3073, 'grad_norm': 0.6342138174196997, 'learning_rate': 9.643139175185168e-08, 'epoch': 0.94}
94%|█████████▍| 20755/22095 [35:27:43<1:25:58, 3.85s/it] {'loss': 0.3091, 'grad_norm': 0.6157424495113403, 'learning_rate': 9.628819521998622e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (57590 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20756/22095 [35:27:47<1:20:41, 3.62s/it] {'loss': 0.2859, 'grad_norm': 0.6602136680439613, 'learning_rate': 9.614510405382693e-08, 'epoch': 0.94}
94%|█████████▍| 20757/22095 [35:27:50<1:18:38, 3.53s/it] {'loss': 0.2603, 'grad_norm': 0.5572487286447867, 'learning_rate': 9.600211825644856e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (70972 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (86161 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (131971 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42404 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52111 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20758/22095 [35:27:59<1:57:43, 5.28s/it] {'loss': 0.4781, 'grad_norm': 0.261804490907496, 'learning_rate': 9.585923783092255e-08, 'epoch': 0.94}
94%|█████████▍| 20759/22095 [35:28:08<2:19:31, 6.27s/it] {'loss': 0.4741, 'grad_norm': 0.2599046113400412, 'learning_rate': 9.571646278032032e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8952527 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 3362, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,D为线段CB的中点,CD=3,AB=11,则AC的长为()\nA. 6\nB. 8\nC. 4\nD. 
5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
94%|█████████▍| 20760/22095 [35:28:12<2:07:46, 5.74s/it] {'loss': 0.3133, 'grad_norm': 0.6160630272152418, 'learning_rate': 9.557379310770831e-08, 'epoch': 0.94}
94%|█████████▍| 20761/22095 [35:28:21<2:26:29, 6.59s/it] {'loss': 0.4767, 'grad_norm': 0.26545070340134724, 'learning_rate': 9.543122881615297e-08, 'epoch': 0.94}
94%|█████████▍| 20762/22095 [35:28:30<2:40:33, 7.23s/it] {'loss': 0.4777, 'grad_norm': 0.27299182989107457, 'learning_rate': 9.528876990871793e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
94%|█████████▍| 20763/22095 [35:28:33<2:13:43, 6.02s/it] {'loss': 0.2666, 'grad_norm': 0.6274370211652937, 'learning_rate': 9.514641638846245e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [267, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7806644 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [267, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '27984', 'image': '51804.jpg', 'image_wh': [[267, 25]], 'conversations': [{'from': 'human', 'value': '\nI am providing an answer to the question below based on the image: \nCan you describe the background and setting of the image in detail? \nHere is my response: \nThe image presents a chemical name in text: "3-Ethyl-1-hexanethiol." This title embodies an organic compound adhering to the IUPAC nomenclature system for chemical substances.\n\nTo dissect this nomenclature:\n- The "3-Ethyl" prefix highlights an ethyl group (-C2H5) bonded to the third carbon of the principal chain.\n- "1-hex" denotes the main chain, comprising six carbon atoms with a thiol group (-SH) attached to the first carbon (counting from the end nearest to the thiol group).\n- "Thiol" suggests the molecule features a sulfur atom bonded to a hydrogen atom, forming a thiol functional group (-SH).\n\nStructurally, 3-ethyl-1-hexanethiol exhibits these characteristics:\n1. A primary carbon chain of six carbons with a thiol group (-SH) attached to the first carbon.\n2. An ethyl group is connected to the third carbon.\n3. A thiol functional group (-SH) is included, finishing the molecule, presumably bonded to the first carbon (since it\'s customarily at the commencement of the chain without a particular numerical position specified otherwise).\n\nThe skeletal formula can be illustrated with Carbons and Hydrogens explicitly marking the branching and functional group attachment:\n1. C1 bears the thiol group (-SH).\n2. C3 carries the ethyl group (-C2H5).\n3. C6 concludes the hexane chain.\n\nEmploying the IUPAC nomenclature, we can conclude that mastering organic compound naming involves discerning the prefixes (position and kind of substituents), infixes (nature and location of double/triple bonds), and suffixes (functional group category), permitting a systematic means to depict and recognize chemical structures. For any questions, one requires a robust understanding of functional groups, molecular structure, and carbon chain nomenclature.\n\nThis compound manifests both aliphatic traits (owing to the hexane chain) and attributes conferred by the thiol group, impacting its responsiveness and engagement in chemical landscapes. As a thiol, it potentially takes part in disulfide bond formation, affecting its solubility characteristics and boiling/condensation points. Grasping its chemical demeanor in different mediums and reactions necessitates comprehending these structural subtleties.\n\nTaking these points into account, the image fundamentally imparts information essential to organic chemistry concerning the architecture and classification of chemical compounds as delineated in the IUPAC system. \nPlease revise my answer using the image, and if there are any errors, kindly explain why it\'s incorrect.'}, {'from': 'gpt', 'value': 'Your description closely follows the conventions of IUPAC nomenclature and provides a detailed explanation of how to interpret the name of the compound. However, there are a few subtle points that could be refined for greater accuracy:\n\n1. In describing the position of the functional groups, you mention that they are "presumably bonded to the first carbon" or "likely attached to the first carbon" without specifying a position in the second case. IUPAC names do not assume a default position for a functional group but rather specify its location explicitly, as you correctly did with "1-hex" and "2-pent." Ensure that any functional group position is clearly stated, as in "1-hexanethiol" or "2-pentenamide."\n\n2. The term "infixes" is not standard in the context of IUPAC nomenclature. Typically, the structure of an IUPAC name is broken down into prefixes (indicating substituents), the stem (denoting the main chain and its length), and the suffix (indicating the principal functional group). The location of double or triple bonds is indicated by a number before the stem, as in "2-pent," which does not require a separate term.\n\n3. In your description, you mention "molecular geometry" as something to understand for inquiries. While understanding molecular geometry is crucial for organic chemistry, the focus of your description should be more on molecular structure and functional groups as they relate to IUPAC nomenclature.\n\nOverall, your descriptions are thorough and accurate, with just a few minor adjustments needed for greater precision in terminology and clarity of functional group positions.'}]}
94%|█████████▍| 20764/22095 [35:28:42<2:35:04, 6.99s/it] {'loss': 0.4652, 'grad_norm': 0.26111906070225804, 'learning_rate': 9.500416825844682e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
94%|█████████▍| 20765/22095 [35:28:45<2:11:17, 5.92s/it] {'loss': 0.288, 'grad_norm': 0.6329097033253354, 'learning_rate': 9.486202552172697e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (47382 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (40967 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20766/22095 [35:28:52<2:17:55, 6.23s/it] {'loss': 0.451, 'grad_norm': 0.25605798885576536, 'learning_rate': 9.471998818135764e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Token indices sequence length is longer than the specified maximum sequence length for this model (82522 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41809 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20767/22095 [35:28:56<2:02:28, 5.53s/it] {'loss': 0.2861, 'grad_norm': 0.5778371870857809, 'learning_rate': 9.457805624038974e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (44083 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87582 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52359 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (154424 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20768/22095 [35:29:00<1:52:58, 5.11s/it] {'loss': 0.2864, 'grad_norm': 0.6041845447583947, 'learning_rate': 9.443622970187415e-08, 'epoch': 0.94}
94%|█████████▍| 20769/22095 [35:29:04<1:42:37, 4.64s/it] {'loss': 0.2925, 'grad_norm': 0.6005423832736728, 'learning_rate': 9.429450856885736e-08, 'epoch': 0.94}
94%|█████████▍| 20770/22095 [35:29:07<1:33:42, 4.24s/it] {'loss': 0.2868, 'grad_norm': 0.6072185585444965, 'learning_rate': 9.415289284438523e-08, 'epoch': 0.94}
94%|█████████▍| 20771/22095 [35:29:10<1:25:52, 3.89s/it] {'loss': 0.2903, 'grad_norm': 0.5844322503894608, 'learning_rate': 9.401138253149977e-08, 'epoch': 0.94}
94%|█████████▍| 20772/22095 [35:29:15<1:28:19, 4.01s/it] {'loss': 0.3265, 'grad_norm': 0.5930622865313223, 'learning_rate': 9.386997763324246e-08, 'epoch': 0.94}
94%|█████████▍| 20773/22095 [35:29:19<1:32:16, 4.19s/it] {'loss': 0.2775, 'grad_norm': 0.5847618141175167, 'learning_rate': 9.372867815265085e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20774/22095 [35:29:28<1:59:52, 5.44s/it] {'loss': 0.4778, 'grad_norm': 0.2686276723017246, 'learning_rate': 9.358748409276196e-08, 'epoch': 0.94}
94%|█████████▍| 20775/22095 [35:29:37<2:28:21, 6.74s/it] {'loss': 0.4637, 'grad_norm': 0.2631650236530227, 'learning_rate': 9.34463954566095e-08, 'epoch': 0.94}
94%|█████████▍| 20776/22095 [35:29:47<2:46:14, 7.56s/it] {'loss': 0.4509, 'grad_norm': 0.2683016572122838, 'learning_rate': 9.330541224722378e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 364, but got module 1
94%|█████████▍| 20777/22095 [35:29:50<2:18:54, 6.32s/it] {'loss': 0.2846, 'grad_norm': 0.7865247979900946, 'learning_rate': 9.316453446763518e-08, 'epoch': 0.94}
94%|█████████▍| 20778/22095 [35:29:54<2:00:06, 5.47s/it] {'loss': 0.2904, 'grad_norm': 0.7200645826034856, 'learning_rate': 9.302376212087128e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (54639 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54428 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20779/22095 [35:30:01<2:11:49, 6.01s/it] {'loss': 0.4622, 'grad_norm': 0.24627763004895964, 'learning_rate': 9.28830952099552e-08, 'epoch': 0.94}
94%|█████████▍| 20780/22095 [35:30:05<1:55:19, 5.26s/it] {'loss': 0.289, 'grad_norm': 0.8043361758815954, 'learning_rate': 9.274253373791064e-08, 'epoch': 0.94}
94%|█████████▍| 20781/22095 [35:30:08<1:41:13, 4.62s/it] {'loss': 0.2704, 'grad_norm': 0.6579035360584158, 'learning_rate': 9.260207770775742e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20782/22095 [35:30:12<1:35:51, 4.38s/it] {'loss': 0.2917, 'grad_norm': 0.6009245769666856, 'learning_rate': 9.246172712251422e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (50117 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75369 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65469 > 40960). Running this sequence through the model will result in indexing errors
Rank 0: Token indices sequence length is longer than the specified maximum sequence length (55142 > 40960) for 4 sample(s). Truncating to 13821 with 2 samples.
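A recurring pattern in this log is the tokenizer warning "Token indices sequence length is longer than the specified maximum sequence length for this model (... > 40960)", followed eventually by the collator's own truncation message. One way to avoid paying the cost at training time is to screen over-long samples before they reach the model. The sketch below is an illustrative assumption, not the actual data_qwen_2.py logic: `tokenize` is a whitespace stand-in for the real tokenizer, and `filter_overlong` and the sample schema are hypothetical.

```python
# Hypothetical pre-filter: drop samples whose conversation text tokenizes
# past the model's maximum sequence length (40960 in the warnings above).

MAX_LEN = 40960

def tokenize(text):
    # Stand-in tokenizer: one token per whitespace-separated word.
    # A real check would use the model's own tokenizer here.
    return text.split()

def filter_overlong(samples, max_len=MAX_LEN):
    """Split samples into (kept, dropped) by tokenized length."""
    kept, dropped = [], []
    for sample in samples:
        text = " ".join(turn["value"] for turn in sample["conversations"])
        if len(tokenize(text)) > max_len:
            dropped.append(sample)  # would otherwise trigger the warning
        else:
            kept.append(sample)
    return kept, dropped

samples = [
    {"id": 1, "conversations": [{"value": "short prompt"}]},
    {"id": 2, "conversations": [{"value": "word " * 50000}]},
]
kept, dropped = filter_overlong(samples)
print([s["id"] for s in kept], [s["id"] for s in dropped])  # prints: [1] [2]
```

Truncation (as the "Truncating to 13821 with 2 samples" message does) is the other option; dropping keeps labels intact at the cost of data volume.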
94%|█████████▍| 20783/22095 [35:30:15<1:27:59, 4.02s/it] {'loss': 0.2859, 'grad_norm': 0.6449447415026867, 'learning_rate': 9.23214819851953e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20784/22095 [35:30:18<1:19:42, 3.65s/it] {'loss': 0.2819, 'grad_norm': 0.7220785813494247, 'learning_rate': 9.218134229881548e-08, 'epoch': 0.94}
94%|█████████▍| 20785/22095 [35:30:22<1:23:15, 3.81s/it] {'loss': 0.2776, 'grad_norm': 0.6103242402975791, 'learning_rate': 9.204130806638511e-08, 'epoch': 0.94}
94%|█████████▍| 20786/22095 [35:30:25<1:21:03, 3.72s/it] {'loss': 0.3184, 'grad_norm': 0.6693346951778233, 'learning_rate': 9.190137929091403e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20787/22095 [35:30:33<1:47:59, 4.95s/it] {'loss': 0.4541, 'grad_norm': 0.2612038012801092, 'learning_rate': 9.176155597540759e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (47254 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51601 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89136 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96009 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58577 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20788/22095 [35:30:38<1:45:48, 4.86s/it] {'loss': 0.2876, 'grad_norm': 0.5896828998572881, 'learning_rate': 9.162183812287117e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20789/22095 [35:30:41<1:38:37, 4.53s/it] {'loss': 0.2627, 'grad_norm': 0.6221679556915928, 'learning_rate': 9.148222573630572e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20790/22095 [35:30:44<1:27:07, 4.01s/it] {'loss': 0.3288, 'grad_norm': 0.6494938453249083, 'learning_rate': 9.13427188187127e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20791/22095 [35:30:54<2:06:00, 5.80s/it] {'loss': 0.453, 'grad_norm': 0.2865355009159929, 'learning_rate': 9.120331737308919e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20792/22095 [35:30:59<1:56:41, 5.37s/it] {'loss': 0.3156, 'grad_norm': 0.6116692277411361, 'learning_rate': 9.106402140242943e-08, 'epoch': 0.94}
94%|█████████▍| 20793/22095 [35:31:02<1:41:21, 4.67s/it] {'loss': 0.2678, 'grad_norm': 0.5744247951556596, 'learning_rate': 9.092483090972714e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20794/22095 [35:31:05<1:34:25, 4.35s/it] {'loss': 0.2615, 'grad_norm': 0.5521819457940438, 'learning_rate': 9.078574589797329e-08, 'epoch': 0.94}
94%|█████████▍| 20795/22095 [35:31:08<1:25:04, 3.93s/it] {'loss': 0.2676, 'grad_norm': 0.9947077639779605, 'learning_rate': 9.064676637015656e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20796/22095 [35:31:18<2:01:24, 5.61s/it] {'loss': 0.4613, 'grad_norm': 0.28656657329167173, 'learning_rate': 9.050789232926293e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9045978 in VC:s3://multi-modal/UniGeo/. Exception: Image size [166, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/4924.png', 'image_wh': [[166, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两个点,CD=3cm,M是AC的中点,N是DB的中点,AB=9.8cm,那么线段MN的长等于()\nA. 7cm\nB. 5.4cm\nC. 6.4cm\nD. 6.8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
94%|█████████▍| 20797/22095 [35:31:21<1:46:08, 4.91s/it] {'loss': 0.2707, 'grad_norm': 0.6196670981555568, 'learning_rate': 9.036912377827611e-08, 'epoch': 0.94}
94%|█████████▍| 20798/22095 [35:31:24<1:34:56, 4.39s/it] {'loss': 0.2722, 'grad_norm': 0.6390512291005944, 'learning_rate': 9.023046072017761e-08, 'epoch': 0.94}
94%|█████████▍| 20799/22095 [35:31:27<1:27:02, 4.03s/it] {'loss': 0.308, 'grad_norm': 0.6302381460021429, 'learning_rate': 9.009190315794835e-08, 'epoch': 0.94}
94%|█████████▍| 20800/22095 [35:31:30<1:20:44, 3.74s/it] {'loss': 0.2708, 'grad_norm': 0.6638656744628239, 'learning_rate': 8.995345109456377e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [312, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8512203 in VC:s3://internvl-moe-sft-data/. Exception: Image size [312, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 125839, 'image': 'vrdu_texteq/astro-ph.CO/f729c91c-deeb-4439-8811-64740c2052b7.png', 'image_wh': [[312, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you extract any readable text contained in the image?'}, {'from': 'gpt', 'value': 'and $x$ is defined as above.'}]}
94%|█████████▍| 20801/22095 [35:31:35<1:26:54, 4.03s/it] {'loss': 0.3135, 'grad_norm': 0.6004928538328558, 'learning_rate': 8.981510453299925e-08, 'epoch': 0.94}
94%|█████████▍| 20802/22095 [35:31:39<1:27:31, 4.06s/it] {'loss': 0.3093, 'grad_norm': 0.6447108416431894, 'learning_rate': 8.967686347622795e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (53230 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (104314 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (110314 > 40960). Running this sequence through the model will result in indexing errors
94%|█████████▍| 20803/22095 [35:31:42<1:21:35, 3.79s/it] {'loss': 0.2828, 'grad_norm': 0.5841072682650577, 'learning_rate': 8.953872792722029e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
94%|█████████▍| 20804/22095 [35:31:49<1:39:42, 4.63s/it] {'loss': 0.4916, 'grad_norm': 0.26468154786651854, 'learning_rate': 8.940069788894389e-08, 'epoch': 0.94}
94%|█████████▍| 20805/22095 [35:31:52<1:32:12, 4.29s/it] {'loss': 0.2887, 'grad_norm': 0.6191241165888246, 'learning_rate': 8.926277336436417e-08, 'epoch': 0.94}
94%|█████████▍| 20806/22095 [35:31:56<1:26:08, 4.01s/it] {'loss': 0.2721, 'grad_norm': 0.5628097473321694, 'learning_rate': 8.912495435644542e-08, 'epoch': 0.94}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (104400000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn( Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▍| 20807/22095 [35:32:00<1:26:39, 4.04s/it] {'loss': 0.2938, 'grad_norm': 0.6086306229708609, 'learning_rate': 8.898724086814969e-08, 'epoch': 0.94} 94%|█████████▍| 20807/22095 [35:32:00<1:26:39, 4.04s/it] 94%|█████████▍| 20808/22095 [35:32:04<1:27:40, 4.09s/it] {'loss': 0.3095, 'grad_norm': 0.6401280558876904, 'learning_rate': 8.88496329024341e-08, 'epoch': 0.94} 94%|█████████▍| 20808/22095 [35:32:04<1:27:40, 4.09s/it] 94%|█████████▍| 20809/22095 [35:32:08<1:27:38, 4.09s/it] {'loss': 0.3041, 'grad_norm': 0.616183908277212, 'learning_rate': 8.87121304622568e-08, 'epoch': 0.94} 94%|█████████▍| 20809/22095 [35:32:08<1:27:38, 4.09s/it] 94%|█████████▍| 20810/22095 [35:32:11<1:20:35, 3.76s/it] {'loss': 0.3106, 'grad_norm': 0.5798350520600436, 'learning_rate': 8.857473355057211e-08, 'epoch': 0.94} 94%|█████████▍| 20810/22095 [35:32:11<1:20:35, 3.76s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 94%|█████████▍| 20811/22095 [35:32:19<1:48:39, 5.08s/it] {'loss': 0.4618, 'grad_norm': 0.2668041636397088, 'learning_rate': 8.843744217033212e-08, 'epoch': 0.94} 94%|█████████▍| 20811/22095 [35:32:19<1:48:39, 5.08s/it] 94%|█████████▍| 20812/22095 [35:32:42<3:38:16, 10.21s/it] {'loss': 0.3185, 'grad_norm': 0.5873337668301859, 'learning_rate': 8.83002563244867e-08, 'epoch': 0.94} 94%|█████████▍| 20812/22095 [35:32:42<3:38:16, 10.21s/it] 94%|█████████▍| 20813/22095 [35:32:45<2:55:12, 8.20s/it] {'loss': 0.2823, 'grad_norm': 0.5872641247341023, 'learning_rate': 8.816317601598346e-08, 'epoch': 0.94} 94%|█████████▍| 20813/22095 [35:32:45<2:55:12, 8.20s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: 
Image size (100560000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 94%|█████████▍| 20814/22095 [35:32:48<2:23:28, 6.72s/it] {'loss': 0.3097, 'grad_norm': 0.6006640705752363, 'learning_rate': 8.802620124776784e-08, 'epoch': 0.94} 94%|█████████▍| 20814/22095 [35:32:48<2:23:28, 6.72s/it] 94%|█████████▍| 20815/22095 [35:32:53<2:08:55, 6.04s/it] {'loss': 0.2669, 'grad_norm': 0.5830047965432855, 'learning_rate': 8.78893320227836e-08, 'epoch': 0.94} 94%|█████████▍| 20815/22095 [35:32:53<2:08:55, 6.04s/it] 94%|█████████▍| 20816/22095 [35:32:56<1:49:35, 5.14s/it] {'loss': 0.2908, 'grad_norm': 0.6608002252496344, 'learning_rate': 8.775256834397117e-08, 'epoch': 0.94} 94%|█████████▍| 20816/22095 [35:32:56<1:49:35, 5.14s/it] 94%|█████████▍| 20817/22095 [35:33:01<1:46:42, 5.01s/it] {'loss': 0.2896, 'grad_norm': 0.6022366659295527, 'learning_rate': 8.761591021426929e-08, 'epoch': 0.94} 94%|█████████▍| 20817/22095 [35:33:01<1:46:42, 5.01s/it] 94%|█████████▍| 20818/22095 [35:33:22<3:29:55, 9.86s/it] {'loss': 0.2679, 'grad_norm': 0.5838450301006088, 'learning_rate': 8.747935763661397e-08, 'epoch': 0.94} 94%|█████████▍| 20818/22095 [35:33:22<3:29:55, 9.86s/it] 94%|█████████▍| 20819/22095 [35:33:44<4:49:06, 13.59s/it] {'loss': 0.279, 'grad_norm': 0.5875361382398744, 'learning_rate': 8.734291061394006e-08, 'epoch': 0.94} 94%|█████████▍| 20819/22095 [35:33:44<4:49:06, 13.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [20, 25, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8365271 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 32012, 'image': 'vrdu_table_final_2/astro-ph.CO/3c2a693a-37a7-4279-9eb0-26d949d19ec8.png', 'image_wh': [[20, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}}${\\bf f_*}$\\end{tabular}\n```"}]}
 94%|█████████▍| 20820/22095 [35:33:47<3:43:21, 10.51s/it] {'loss': 0.2504, 'grad_norm': 0.5922955937040799, 'learning_rate': 8.720656914917858e-08, 'epoch': 0.94}
 94%|█████████▍| 20821/22095 [35:33:50<2:54:04, 8.20s/it] {'loss': 0.2884, 'grad_norm': 0.5978814440316839, 'learning_rate': 8.707033324525937e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (103586 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72833 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41342 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44448 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71576 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20822/22095 [35:33:53<2:21:12, 6.66s/it] {'loss': 0.2639, 'grad_norm': 0.5789905148749843, 'learning_rate': 8.693420290510957e-08, 'epoch': 0.94}
 94%|█████████▍| 20823/22095 [35:33:56<1:56:40, 5.50s/it] {'loss': 0.2994, 'grad_norm': 0.6163399307257849, 'learning_rate': 8.679817813165514e-08, 'epoch': 0.94}
 94%|█████████▍| 20824/22095 [35:34:00<1:43:56, 4.91s/it] {'loss': 0.3021, 'grad_norm': 0.8125102862013436, 'learning_rate': 8.666225892781765e-08, 'epoch': 0.94}
 94%|█████████▍| 20825/22095 [35:34:03<1:34:37, 4.47s/it] {'loss': 0.2927, 'grad_norm': 0.5710581587433058, 'learning_rate': 8.65264452965181e-08, 'epoch': 0.94}
 94%|█████████▍| 20826/22095 [35:34:24<3:21:56, 9.55s/it] {'loss': 0.2423, 'grad_norm': 0.6127986569841006, 'learning_rate': 8.63907372406747e-08, 'epoch': 0.94}
 94%|█████████▍| 20827/22095 [35:34:28<2:41:22, 7.64s/it] {'loss': 0.2586, 'grad_norm': 1.494464415261979, 'learning_rate': 8.625513476320291e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (54299 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (84935 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20828/22095 [35:34:31<2:11:38, 6.23s/it] {'loss': 0.2899, 'grad_norm': 0.6249746398755778, 'learning_rate': 8.61196378670176e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (93795 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20829/22095 [35:34:34<1:53:15, 5.37s/it] {'loss': 0.2926, 'grad_norm': 0.6167952244077558, 'learning_rate': 8.598424655502868e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20830/22095 [35:34:44<2:26:12, 6.93s/it] {'loss': 0.4542, 'grad_norm': 0.2539420689861876, 'learning_rate': 8.584896083014715e-08, 'epoch': 0.94}
 94%|█████████▍| 20831/22095 [35:34:49<2:08:53, 6.12s/it] {'loss': 0.2907, 'grad_norm': 0.6555364937207958, 'learning_rate': 8.571378069527792e-08, 'epoch': 0.94}
 94%|█████████▍| 20832/22095 [35:35:29<5:42:42, 16.28s/it] {'loss': 0.3222, 'grad_norm': 0.6597898779039076, 'learning_rate': 8.557870615332642e-08, 'epoch': 0.94}
 94%|█████████▍| 20833/22095 [35:35:32<4:18:56, 12.31s/it] {'loss': 0.2518, 'grad_norm': 0.5962301735572672, 'learning_rate': 8.54437372071959e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 94%|█████████▍| 20834/22095 [35:35:55<5:24:57, 15.46s/it] {'loss': 0.3197, 'grad_norm': 0.6429082872294095, 'learning_rate': 8.53088738597846e-08, 'epoch': 0.94}
 94%|█████████▍| 20835/22095 [35:36:18<6:13:02, 17.76s/it] {'loss': 0.3089, 'grad_norm': 0.5912302009849205, 'learning_rate': 8.517411611399129e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20836/22095 [35:36:26<5:15:39, 15.04s/it] {'loss': 0.4697, 'grad_norm': 0.25971197451149486, 'learning_rate': 8.503946397271257e-08, 'epoch': 0.94}
 94%|█████████▍| 20837/22095 [35:36:30<4:02:18, 11.56s/it] {'loss': 0.2927, 'grad_norm': 0.6035761663858425, 'learning_rate': 8.490491743883944e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8398835 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 987, 'image': 'vrdu_table_final_2/astro-ph.CO/c49b457c-749a-493c-afaa-41e88078b5f6.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
 94%|█████████▍| 20838/22095 [35:36:33<3:10:53, 9.11s/it] {'loss': 0.3457, 'grad_norm': 0.6500550482103429, 'learning_rate': 8.47704765152646e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8885569 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 8722, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C为线段AB的上点,AC=4,BC=6,M点和N点分别为线段AC和BC的中点,Mn=()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]}
 94%|█████████▍| 20839/22095 [35:36:55<4:27:36, 12.78s/it] {'loss': 0.3086, 'grad_norm': 0.6167534463037921, 'learning_rate': 8.463614120487629e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (41276 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74811 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74479 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (91300 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (66457 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20840/22095 [35:37:16<5:19:37, 15.28s/it] {'loss': 0.324, 'grad_norm': 0.6201984186786818, 'learning_rate': 8.450191151056054e-08, 'epoch': 0.94}
 94%|█████████▍| 20841/22095 [35:37:56<7:54:00, 22.68s/it] {'loss': 0.2899, 'grad_norm': 0.6488985759841122, 'learning_rate': 8.436778743520225e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8612315 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 24586, 'image': '1570190003.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho wrote this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the genre of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this a comedy book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
 94%|█████████▍| 20842/22095 [35:37:59<5:50:50, 16.80s/it] {'loss': 0.2632, 'grad_norm': 0.6156545222452543, 'learning_rate': 8.423376898168246e-08, 'epoch': 0.94}
VC:s3://st2pj/20250222/images/multi_modal/agent_data/AndroidUI/20240312/20240312_filtered/zhihu/action_153501.396470finished.jpg 2025-08-29 03:35:57.477421 load time: 1027.66 ms
 94%|█████████▍| 20843/22095 [35:38:02<4:27:31, 12.82s/it] {'loss': 0.3041, 'grad_norm': 0.6573285195409124, 'learning_rate': 8.409985615288218e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8622116 in VC:s3://mm-dataset/ocrvqa/images/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 19165, 'image': '425163636.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWho is the author of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What is the title of this book? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'What type of book is this? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Humor & Entertainment? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}, {'from': 'human', 'value': 'Is this book related to Comics & Graphic Novels? Answer the question using a single word or phrase.'}, {'from': 'gpt', 'value': 'No'}]}
 94%|█████████▍| 20844/22095 [35:38:24<5:25:27, 15.61s/it] {'loss': 0.2964, 'grad_norm': 0.6037230252645684, 'learning_rate': 8.396604895167748e-08, 'epoch': 0.94}
 94%|█████████▍| 20845/22095 [35:38:28<4:12:16, 12.11s/it] {'loss': 0.3154, 'grad_norm': 0.5870445554468589, 'learning_rate': 8.383234738094381e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (44332 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129089 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44662 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20846/22095 [35:38:31<3:13:33, 9.30s/it] {'loss': 0.2493, 'grad_norm': 0.5362564404723827, 'learning_rate': 8.3698751443555e-08, 'epoch': 0.94}
 94%|█████████▍| 20847/22095 [35:38:34<2:33:02, 7.36s/it] {'loss': 0.3024, 'grad_norm': 0.6489866126328017, 'learning_rate': 8.356526114237983e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20848/22095 [35:39:03<4:48:44, 13.89s/it] {'loss': 0.4851, 'grad_norm': 0.2631653817827076, 'learning_rate': 8.343187648028772e-08, 'epoch': 0.94}
 94%|█████████▍| 20849/22095 [35:39:06<3:41:50, 10.68s/it] {'loss': 0.2714, 'grad_norm': 0.6463116506629104, 'learning_rate': 8.329859746014468e-08, 'epoch': 0.94}
 94%|█████████▍| 20850/22095 [35:39:46<6:45:26, 19.54s/it] {'loss': 0.2543, 'grad_norm': 0.5690519206908605, 'learning_rate': 8.316542408481398e-08, 'epoch': 0.94}
 94%|█████████▍| 20851/22095 [35:40:28<9:00:28, 26.07s/it] {'loss': 0.2665, 'grad_norm': 0.5890163713323255, 'learning_rate': 8.303235635715723e-08, 'epoch': 0.94}
 94%|█████████▍| 20852/22095 [35:40:48<8:26:38, 24.46s/it] {'loss': 0.3173, 'grad_norm': 0.6416623026573095, 'learning_rate': 8.289939428003491e-08, 'epoch': 0.94}
 94%|█████████▍| 20853/22095 [35:41:11<8:12:20, 23.78s/it] {'loss': 0.2747, 'grad_norm': 0.5765374229975274, 'learning_rate': 8.276653785630195e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20854/22095 [35:41:20<6:43:24, 19.50s/it] {'loss': 0.4645, 'grad_norm': 0.25657991387142726, 'learning_rate': 8.263378708881443e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 94%|█████████▍| 20855/22095 [35:41:42<6:58:37, 20.26s/it] {'loss': 0.2769, 'grad_norm': 0.5873869400501082, 'learning_rate': 8.250114198042392e-08, 'epoch': 0.94}
 94%|█████████▍| 20856/22095 [35:42:22<8:58:53, 26.10s/it] {'loss': 0.3104, 'grad_norm': 0.5907343998929974, 'learning_rate': 8.236860253398094e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (56138 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83165 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43228 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20857/22095 [35:42:44<8:31:38, 24.80s/it] {'loss': 0.3117, 'grad_norm': 0.6388619579014826, 'learning_rate': 8.223616875233376e-08, 'epoch': 0.94}
 94%|█████████▍| 20858/22095 [35:42:47<6:21:01, 18.48s/it] {'loss': 0.2738, 'grad_norm': 0.6244062370435659, 'learning_rate': 8.210384063832678e-08, 'epoch': 0.94}
 94%|█████████▍| 20859/22095 [35:43:27<8:34:28, 24.97s/it] {'loss': 0.2964, 'grad_norm': 0.6299226406317449, 'learning_rate': 8.197161819480493e-08, 'epoch': 0.94}
 94%|█████████▍| 20860/22095 [35:43:30<6:16:49, 18.31s/it] {'loss': 0.2986, 'grad_norm': 0.7781736528219134, 'learning_rate': 8.183950142460761e-08, 'epoch': 0.94}
 94%|█████████▍| 20861/22095 [35:43:52<6:40:36, 19.48s/it] {'loss': 0.2877, 'grad_norm': 0.5846529738886026, 'learning_rate': 8.170749033057534e-08, 'epoch': 0.94}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 94%|█████████▍| 20862/22095 [35:43:57<5:05:10, 14.85s/it] {'loss': 0.2746, 'grad_norm': 0.5571171188715623, 'learning_rate': 8.157558491554306e-08, 'epoch': 0.94}
 94%|█████████▍| 20863/22095 [35:45:14<11:31:53, 33.70s/it] {'loss': 0.2819, 'grad_norm': 0.6518076238522412, 'learning_rate': 8.144378518234574e-08, 'epoch': 0.94}
 94%|█████████▍| 20864/22095 [35:45:55<12:12:38, 35.71s/it] {'loss': 0.2865, 'grad_norm': 0.6662587604637679, 'learning_rate': 8.131209113381556e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20865/22095 [35:46:23<11:25:03, 33.42s/it] {'loss': 0.4741, 'grad_norm': 0.28615234380710475, 'learning_rate': 8.118050277278245e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (55380 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85908 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (77296 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80521 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20866/22095 [35:46:26<8:20:09, 24.42s/it] {'loss': 0.2808, 'grad_norm': 0.6165350758593174, 'learning_rate': 8.104902010207249e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (46039 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20867/22095 [35:47:23<11:40:02, 34.20s/it] {'loss': 0.2474, 'grad_norm': 0.945412991945462, 'learning_rate': 8.091764312451122e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8931429 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54582, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nA. 6\nB. 8\nC. 10\nD. 12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 94%|█████████▍| 20868/22095 [35:48:24<14:26:07, 42.35s/it] {'loss': 0.3092, 'grad_norm': 0.6484266098431832, 'learning_rate': 8.078637184292304e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (92899 > 40960). Running this sequence through the model will result in indexing errors
 94%|█████████▍| 20869/22095 [35:49:04<14:10:32, 41.63s/it] {'loss': 0.3084, 'grad_norm': 0.5862338401789485, 'learning_rate': 8.065520626012735e-08, 'epoch': 0.94}
 94%|█████████▍| 20870/22095 [35:50:07<16:18:45, 47.94s/it] {'loss': 0.2862, 'grad_norm': 0.6182423665804847, 'learning_rate': 8.052414637894246e-08, 'epoch': 0.94}
 94%|█████████▍| 20871/22095 [35:50:30<13:44:21, 40.41s/it] {'loss': 0.3038, 'grad_norm': 0.6010714458374049, 'learning_rate': 8.039319220218444e-08, 'epoch': 0.94}
 94%|█████████▍| 20872/22095 [35:50:52<11:49:31, 34.81s/it] {'loss': 0.2724, 'grad_norm': 0.6317935772444221, 'learning_rate': 8.026234373266773e-08, 'epoch': 0.94}
 94%|█████████▍| 20873/22095 [35:51:13<10:28:59, 30.88s/it] {'loss': 0.2702, 'grad_norm': 0.5788671859175848, 'learning_rate': 8.013160097320339e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 94%|█████████▍| 20874/22095 [35:51:44<10:24:51, 30.71s/it] {'loss': 0.4617, 'grad_norm': 0.24808045688663874, 'learning_rate': 8.000096392660029e-08, 'epoch': 0.94}
 94%|█████████▍| 20875/22095 [35:51:47<7:39:06, 22.58s/it] {'loss': 0.3128, 'grad_norm': 0.5853744341393615, 'learning_rate': 7.987043259566618e-08, 'epoch': 0.94}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8944623 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 67776, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 12cm\nB. 15cm\nC. 13cm\nD. 11cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 94%|█████████▍| 20876/22095 [35:52:31<9:46:38, 28.87s/it] {'loss': 0.2834, 'grad_norm': 0.5850339197503641, 'learning_rate': 7.974000698320495e-08, 'epoch': 0.94}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 94%|█████████▍| 20877/22095 [35:52:59<9:39:44, 28.56s/it] {'loss': 0.4502, 'grad_norm': 0.23458010581167918, 'learning_rate': 7.960968709202044e-08, 'epoch': 0.94}
 94%|█████████▍| 20878/22095 [35:53:03<7:08:59, 21.15s/it] {'loss': 0.3207, 'grad_norm': 0.6946033096154024, 'learning_rate': 7.947947292491154e-08, 'epoch': 0.94}
 94%|█████████▍| 20879/22095 [35:53:43<9:04:28, 26.87s/it] {'loss': 0.2688, 'grad_norm': 0.5948525066627732, 'learning_rate': 7.9349364484676e-08, 'epoch': 0.94}
Token indices sequence length is longer than the specified maximum sequence length for this model (45967 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82490 > 40960). Running this sequence through the model will result in indexing errors
 95%|█████████▍| 20880/22095 [35:54:24<10:28:27, 31.04s/it] {'loss': 0.3158, 'grad_norm': 0.585106418093436, 'learning_rate': 7.921936177411049e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 20, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8163837 in VC:s3://multi-modal/laion_gpt4v/images/. Exception: Image size [20, 20, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 1923, 'image': '622f702ea27fea3f4bb430f8c01572f8.jpg', 'image_wh': [[20, 20]], 'conversations': [{'from': 'human', 'value': '\nWrite a detailed description of this image.'}, {'from': 'gpt', 'value': 'This image appears to be the cover of a book. The top portion of the image contains text that reads "Literature and Language." Below this, in larger, bold font, is the title "Plexippus: or, the aspiring plebeian. ... Volume 2 of 2." The author\'s name, "Richard Graves," is printed below the title. The central portion of the image features two illustrations of spherical objects, resembling armillary spheres, which are models representing the celestial sphere with rings and hoops representing the equator and other astronomical circles. The background of the cover is a light beige color, and there is a visible crease or shadow suggesting that the book cover may be slightly bent or that the image shows two adjoining pages. The overall style of the image is simple and straightforward, with a focus on the text and the illustrations of the armillary spheres.'}]}
 95%|█████████▍| 20881/22095 [35:54:46<9:35:11, 28.43s/it] {'loss': 0.2977, 'grad_norm': 0.7232573710512356, 'learning_rate': 7.908946479600777e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (59910 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (89369 > 40960). Running this sequence through the model will result in indexing errors
 95%|█████████▍| 20882/22095 [35:55:45<12:41:51, 37.68s/it] {'loss': 0.3173, 'grad_norm': 0.6729210253022881, 'learning_rate': 7.895967355315948e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8897251 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [170, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 20404, 'image': 'images/5071.png', 'image_wh': [[170, 21]], 'conversations': [{'from': 'human', 'value': '\n如图所示,D为CB段中点,Cd=3,AB=11,则AC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '5'}]}
 95%|█████████▍| 20883/22095 [35:55:49<9:13:23, 27.40s/it] {'loss': 0.2871, 'grad_norm': 0.5913024246309978, 'learning_rate': 7.88299880483534e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (107982 > 40960). Running this sequence through the model will result in indexing errors
 95%|█████████▍| 20884/22095 [35:56:14<9:03:23, 26.92s/it] {'loss': 0.2656, 'grad_norm': 0.6504311777673383, 'learning_rate': 7.870040828437675e-08, 'epoch': 0.95}
 95%|█████████▍| 20885/22095 [35:56:18<6:40:09, 19.84s/it] {'loss': 0.2983, 'grad_norm': 0.6857428537796664, 'learning_rate': 7.857093426401397e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 95%|█████████▍| 20886/22095 [35:56:22<5:04:29, 15.11s/it] {'loss': 0.3062, 'grad_norm': 0.6297503614601261, 'learning_rate': 7.844156599004671e-08, 'epoch': 0.95}
 95%|█████████▍| 20887/22095 [35:57:20<9:26:26, 28.13s/it] {'loss': 0.2848, 'grad_norm': 0.5954023304747791, 'learning_rate': 7.831230346525443e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 95%|█████████▍| 20888/22095 [35:57:48<9:23:50, 28.03s/it] {'loss': 0.4977, 'grad_norm': 0.26616954929128794, 'learning_rate': 7.818314669241544e-08, 'epoch': 0.95}
 95%|█████████▍| 20889/22095 [35:59:06<14:25:56, 43.08s/it] {'loss': 0.2996, 'grad_norm': 0.6184736597521032, 'learning_rate': 7.805409567430367e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (99686 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43788 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88650 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (75949 > 40960). Running this sequence through the model will result in indexing errors
 95%|█████████▍| 20890/22095 [35:59:34<12:50:07, 38.35s/it] {'loss': 0.4563, 'grad_norm': 0.25006736595217566, 'learning_rate': 7.792515041369353e-08, 'epoch': 0.95}
 95%|█████████▍| 20891/22095 [35:59:37<9:19:53, 27.90s/it] {'loss': 0.3086, 'grad_norm': 0.5676641219555973, 'learning_rate': 7.779631091335505e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (121831 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67288 > 40960). Running this sequence through the model will result in indexing errors
 95%|█████████▍| 20892/22095 [35:59:47<7:28:47, 22.38s/it] {'loss': 0.4475, 'grad_norm': 0.2454819629724571, 'learning_rate': 7.7667577176056e-08, 'epoch': 0.95}
 95%|█████████▍| 20893/22095 [35:59:51<5:38:10, 16.88s/it] {'loss': 0.265, 'grad_norm': 0.7133383522561514, 'learning_rate': 7.753894920456251e-08, 'epoch': 0.95}
 95%|█████████▍| 20894/22095 [36:00:50<9:52:09, 29.58s/it] {'loss': 0.3141, 'grad_norm': 0.7529118268346086, 'learning_rate': 7.741042700164014e-08, 'epoch': 0.95}
 95%|█████████▍| 20895/22095 [36:01:12<9:04:21, 27.22s/it] {'loss': 0.2876, 'grad_norm': 0.6052921223708356, 'learning_rate': 7.72820105700478e-08, 'epoch': 0.95}
 95%|█████████▍| 20896/22095 [36:02:10<12:09:28, 36.50s/it] {'loss': 0.2644, 'grad_norm': 0.7209021765111919, 'learning_rate': 7.715369991254662e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8401948 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4112, 'image': 'vrdu_table_final_2/astro-ph.CO/bac8bb57-ceea-4e5a-9cae-277d540ff36c.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you turn the table in the image into LaTeX code?'}, {'from': 'gpt', 'value': "Absolutely, I'll turn the table in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 95%|█████████▍| 20897/22095 [36:02:52<12:46:07, 38.37s/it] {'loss': 0.3039, 'grad_norm': 0.5757050535425359, 'learning_rate': 7.702549503189272e-08, 'epoch': 0.95} 95%|█████████▍| 20897/22095 [36:02:52<12:46:07, 38.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20898/22095 [36:03:02<9:51:41, 29.66s/it] {'loss': 0.4918, 'grad_norm': 0.2770258305772582, 'learning_rate': 7.689739593084166e-08, 'epoch': 0.95} 95%|█████████▍| 20898/22095 [36:03:02<9:51:41, 29.66s/it] 95%|█████████▍| 20899/22095 [36:03:27<9:25:17, 28.36s/it] {'loss': 0.4559, 'grad_norm': 0.22921896598491626, 'learning_rate': 7.676940261214516e-08, 'epoch': 0.95} 95%|█████████▍| 20899/22095 [36:03:27<9:25:17, 28.36s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20900/22095 [36:03:59<9:48:11, 29.53s/it] {'loss': 0.4843, 'grad_norm': 0.276942567838856, 'learning_rate': 7.664151507855378e-08, 'epoch': 0.95} 95%|█████████▍| 20900/22095 [36:03:59<9:48:11, 29.53s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20901/22095 [36:04:03<7:14:14, 21.82s/it] {'loss': 0.314, 'grad_norm': 0.605310768362177, 
'learning_rate': 7.651373333281532e-08, 'epoch': 0.95} 95%|█████████▍| 20901/22095 [36:04:03<7:14:14, 21.82s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [331, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8505568 in VC:s3://internvl-moe-sft-data/. Exception: Image size [331, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 50426, 'image': 'vrdu_texteq/astro-ph.CO/e99b55e5-9547-49e0-95d2-276efc631adc.png', 'image_wh': [[331, 23]], 'conversations': [{'from': 'human', 'value': '\n请处理图像以识别并提取其中包含的任何文本内容。'}, {'from': 'gpt', 'value': 'At redshifts below $z\\sim0.02$'}]} 95%|█████████▍| 20902/22095 [36:04:27<7:27:12, 22.49s/it] {'loss': 0.2826, 'grad_norm': 0.6351944506005454, 'learning_rate': 7.638605737767534e-08, 'epoch': 0.95} 95%|█████████▍| 20902/22095 [36:04:28<7:27:12, 22.49s/it] 95%|█████████▍| 20903/22095 [36:05:08<9:16:58, 28.04s/it] {'loss': 0.2872, 'grad_norm': 0.6064184675760638, 'learning_rate': 7.625848721587725e-08, 'epoch': 0.95} 95%|█████████▍| 20903/22095 [36:05:08<9:16:58, 28.04s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20904/22095 [36:05:29<8:35:51, 25.99s/it] {'loss': 0.2878, 'grad_norm': 0.6023337318092542, 'learning_rate': 7.613102285016216e-08, 'epoch': 0.95} 95%|█████████▍| 20904/22095 [36:05:29<8:35:51, 25.99s/it] 95%|█████████▍| 20905/22095 [36:06:10<10:00:04, 30.26s/it] {'loss': 0.3028, 'grad_norm': 0.5897002049598271, 'learning_rate': 7.600366428326845e-08, 'epoch': 0.95} 95%|█████████▍| 20905/22095 
[36:06:10<10:00:04, 30.26s/it]VC:s3://internvl2/datasets/MMMUDataset/MMMU_Pro/standard/test_1342_image_1.png 2025-08-29 04:04:08.433095 load time: 1023.52 ms VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-0_1423533-split-1.jpg 2025-08-29 04:04:08.433051 load time: 1018.6 ms VC:s3://multi-modal/Super-CLEVR/images/superCLEVR_new_011652.png 2025-08-29 04:04:08.431205 load time: 1034.4 ms VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_707007.png 2025-08-29 04:04:08.431252 load time: 1045.26 ms VC:s3://gui-agent/mind2web_train/images/e832e1f9-3a9d-440e-a96f-8cbbf241e4af/images/16.png 2025-08-29 04:04:08.433676 load time: 1020.15 ms VC:s3://gui/aguvis/aguvis-stage2/gui-odyssey/images/3553013387320893_8.png 2025-08-29 04:04:08.431175 load time: 1034.38 ms 95%|█████████▍| 20906/22095 [36:06:50<10:58:49, 33.25s/it] {'loss': 0.2779, 'grad_norm': 0.5920218219632091, 'learning_rate': 7.58764115179339e-08, 'epoch': 0.95} 95%|█████████▍| 20906/22095 [36:06:50<10:58:49, 33.25s/it] 95%|█████████▍| 20907/22095 [36:07:14<10:04:26, 30.53s/it] {'loss': 0.2515, 'grad_norm': 0.5523652834187022, 'learning_rate': 7.574926455689136e-08, 'epoch': 0.95} 95%|█████████▍| 20907/22095 [36:07:14<10:04:26, 30.53s/it] 95%|█████████▍| 20908/22095 [36:07:17<7:21:34, 22.32s/it] {'loss': 0.2723, 'grad_norm': 0.5483838828103694, 'learning_rate': 7.562222340287362e-08, 'epoch': 0.95} 95%|█████████▍| 20908/22095 [36:07:17<7:21:34, 22.32s/it] 95%|█████████▍| 20909/22095 [36:07:21<5:29:10, 16.65s/it] {'loss': 0.2727, 'grad_norm': 0.6156417525577635, 'learning_rate': 7.549528805861017e-08, 'epoch': 0.95} 95%|█████████▍| 20909/22095 [36:07:21<5:29:10, 16.65s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54332 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (113144 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60004 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82545 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20910/22095 [36:08:02<7:55:26, 24.07s/it] {'loss': 0.3334, 'grad_norm': 0.6154673128491037, 'learning_rate': 7.536845852682884e-08, 'epoch': 0.95} 95%|█████████▍| 20910/22095 [36:08:02<7:55:26, 24.07s/it] 95%|█████████▍| 20911/22095 [36:08:24<7:41:36, 23.39s/it] {'loss': 0.2951, 'grad_norm': 0.5543715790612631, 'learning_rate': 7.52417348102541e-08, 'epoch': 0.95} 95%|█████████▍| 20911/22095 [36:08:24<7:41:36, 23.39s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71856 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90305 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110822 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20912/22095 [36:08:27<5:40:41, 17.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (137777 > 40960). Running this sequence through the model will result in indexing errors {'loss': 0.2926, 'grad_norm': 0.5990124247903211, 'learning_rate': 7.511511691160933e-08, 'epoch': 0.95} 95%|█████████▍| 20912/22095 [36:08:27<5:40:41, 17.28s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54743 > 40960). 
Running this sequence through the model will result in indexing errors 95%|█████████▍| 20913/22095 [36:08:31<4:20:43, 13.23s/it] {'loss': 0.2876, 'grad_norm': 0.6182521099000854, 'learning_rate': 7.498860483361459e-08, 'epoch': 0.95} 95%|█████████▍| 20913/22095 [36:08:31<4:20:43, 13.23s/it]VC:s3://gui/uground_web_processing/screenshots/web_direct_150k_description_filtered_39552.png 2025-08-29 04:06:29.436498 load time: 1019.93 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240822_155719_before_screenshot.png 2025-08-29 04:06:29.438677 load time: 1030.04 ms VC:s3://gui/aguvis/aguvis-stage2/android_control/images/17541/screenshot_1.png 2025-08-29 04:06:29.437346 load time: 1045.09 ms 95%|█████████▍| 20914/22095 [36:08:34<3:23:46, 10.35s/it] {'loss': 0.307, 'grad_norm': 0.6492851657534132, 'learning_rate': 7.486219857898935e-08, 'epoch': 0.95} 95%|█████████▍| 20914/22095 [36:08:34<3:23:46, 10.35s/it] 95%|█████████▍| 20915/22095 [36:09:20<6:50:16, 20.86s/it] {'loss': 0.2528, 'grad_norm': 1.1963892970319248, 'learning_rate': 7.473589815044924e-08, 'epoch': 0.95} 95%|█████████▍| 20915/22095 [36:09:20<6:50:16, 20.86s/it] 95%|█████████▍| 20916/22095 [36:09:42<6:58:06, 21.28s/it] {'loss': 0.2931, 'grad_norm': 0.6083346740598554, 'learning_rate': 7.460970355070763e-08, 'epoch': 0.95} 95%|█████████▍| 20916/22095 [36:09:42<6:58:06, 21.28s/it] 95%|█████████▍| 20917/22095 [36:09:45<5:11:39, 15.87s/it] {'loss': 0.2864, 'grad_norm': 0.7154186931695922, 'learning_rate': 7.448361478247624e-08, 'epoch': 0.95} 95%|█████████▍| 20917/22095 [36:09:45<5:11:39, 15.87s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20918/22095 [36:09:55<4:33:41, 13.95s/it] {'loss': 0.4749, 'grad_norm': 0.28426325171555816, 'learning_rate': 7.4357631848464e-08, 'epoch': 0.95} 95%|█████████▍| 20918/22095 [36:09:55<4:33:41, 13.95s/it] 
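The two recurring data failures in this log are samples whose images fall below the loader's 28-pixel minimum ("ValueError: Image size ... is too small. Minimum size is 28") and samples whose tokenized conversations exceed the model's 40960-token maximum. A minimal sketch of a pre-filter that would reject such samples before they reach `__getitem__` is shown below; the helper name `is_trainable` and the constants are illustrative, not part of the actual `data_qwen_2.py` code, and the token length is assumed to be computed elsewhere:

```python
# Hypothetical pre-filter for the failures logged above. Samples whose image is
# smaller than the 28-px minimum, or whose token count exceeds the 40960-token
# context, are dropped up front instead of raising during __getitem__.
# MIN_IMAGE_SIDE / MAX_TOKENS / is_trainable are illustrative names.

MIN_IMAGE_SIDE = 28   # "Minimum size is 28" from the ValueError in the log
MAX_TOKENS = 40960    # maximum sequence length from the tokenizer warnings

def is_trainable(sample: dict, token_len: int) -> bool:
    """Return True if the sample avoids both error classes seen in the log."""
    width, height = sample.get("image_wh", [[0, 0]])[0]
    if min(width, height) < MIN_IMAGE_SIDE:
        return False  # e.g. image_wh [[170, 21]] is rejected (21 < 28)
    if token_len > MAX_TOKENS:
        return False  # e.g. 59910 > 40960 is rejected
    return True

# The problematic sample with id 20404 from the log would be filtered out:
bad = {"id": 20404, "image_wh": [[170, 21]]}
good = {"id": 1, "image_wh": [[640, 480]]}
assert not is_trainable(bad, token_len=1000)
assert is_trainable(good, token_len=1000)
assert not is_trainable(good, token_len=59910)
```

An alternative to dropping over-length samples is truncation, which the log shows the trainer already does in at least one path ("Truncating to 40960"); dropping is the simpler choice when losing a small fraction of samples is acceptable.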
95%|█████████▍| 20919/22095 [36:10:16<5:16:21, 16.14s/it] {'loss': 0.3012, 'grad_norm': 0.6014997826042021, 'learning_rate': 7.423175475137934e-08, 'epoch': 0.95} 95%|█████████▍| 20919/22095 [36:10:16<5:16:21, 16.14s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (55454 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42600 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65600 > 40960). Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (46075 > 40960) for 4 sample(s). Truncating to 40960 with 1 samples. Token indices sequence length is longer than the specified maximum sequence length for this model (98547 > 40960). 
Running this sequence through the model will result in indexing errors 95%|█████████▍| 20920/22095 [36:10:25<4:36:25, 14.12s/it] {'loss': 0.4796, 'grad_norm': 0.28871653746278625, 'learning_rate': 7.410598349392506e-08, 'epoch': 0.95} 95%|█████████▍| 20920/22095 [36:10:25<4:36:25, 14.12s/it] 95%|█████████▍| 20921/22095 [36:10:32<3:53:17, 11.92s/it] {'loss': 0.4565, 'grad_norm': 0.2493791500104858, 'learning_rate': 7.398031807880456e-08, 'epoch': 0.95} 95%|█████████▍| 20921/22095 [36:10:32<3:53:17, 11.92s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 95%|█████████▍| 20922/22095 [36:10:35<3:02:46, 9.35s/it] {'loss': 0.2726, 'grad_norm': 0.6420186837671348, 'learning_rate': 7.385475850871793e-08, 'epoch': 0.95} 95%|█████████▍| 20922/22095 [36:10:35<3:02:46, 9.35s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (81254 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20923/22095 [36:10:39<2:27:08, 7.53s/it] {'loss': 0.3089, 'grad_norm': 0.7493605256465465, 'learning_rate': 7.372930478636353e-08, 'epoch': 0.95} 95%|█████████▍| 20923/22095 [36:10:39<2:27:08, 7.53s/it] 95%|█████████▍| 20924/22095 [36:11:01<3:52:54, 11.93s/it] {'loss': 0.2867, 'grad_norm': 0.6441946120542986, 'learning_rate': 7.360395691443644e-08, 'epoch': 0.95} 95%|█████████▍| 20924/22095 [36:11:01<3:52:54, 11.93s/it] 95%|█████████▍| 20925/22095 [36:11:42<6:42:35, 20.65s/it] {'loss': 0.2897, 'grad_norm': 0.6416847443775258, 'learning_rate': 7.347871489562952e-08, 'epoch': 0.95} 95%|█████████▍| 20925/22095 [36:11:42<6:42:35, 20.65s/it] 95%|█████████▍| 20926/22095 [36:12:03<6:47:44, 20.93s/it] {'loss': 0.3384, 'grad_norm': 0.6957522478918147, 'learning_rate': 7.335357873263449e-08, 'epoch': 0.95} 95%|█████████▍| 20926/22095 [36:12:04<6:47:44, 20.93s/it]Traceback (most recent call last): File 
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [481, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8453232 in VC:s3://internvl-moe-sft-data/. Exception: Image size [481, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 34582, 'image': 'vrdu_texteq/astro-ph.CO/d8d94aac-b678-48c2-be84-306145f9b077.png', 'image_wh': [[481, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': 'and $\\sigma$ is the rms fluctuation of the field'}]} 95%|█████████▍| 20927/22095 [36:12:25<6:51:44, 21.15s/it] {'loss': 0.3174, 'grad_norm': 0.6612940508036357, 'learning_rate': 7.322854842814031e-08, 'epoch': 0.95} 95%|█████████▍| 20927/22095 [36:12:25<6:51:44, 21.15s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▍| 20928/22095 [36:12:35<5:43:06, 17.64s/it] {'loss': 0.4583, 'grad_norm': 0.2789703973879626, 'learning_rate': 7.310362398483262e-08, 'epoch': 0.95} 95%|█████████▍| 20928/22095 [36:12:35<5:43:06, 17.64s/it] 95%|█████████▍| 20929/22095 [36:12:38<4:21:42, 13.47s/it] {'loss': 0.31, 'grad_norm': 0.6310776014752988, 'learning_rate': 7.297880540539648e-08, 'epoch': 0.95} 95%|█████████▍| 20929/22095 [36:12:38<4:21:42, 13.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57068 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68557 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77491 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52885 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44042 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20930/22095 [36:13:18<6:52:43, 21.26s/it] {'loss': 0.2681, 'grad_norm': 0.6304965415205801, 'learning_rate': 7.28540926925142e-08, 'epoch': 0.95} 95%|█████████▍| 20930/22095 [36:13:18<6:52:43, 21.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▍| 20931/22095 [36:13:25<5:31:37, 17.09s/it] {'loss': 0.4656, 'grad_norm': 0.2765236675299349, 'learning_rate': 7.27294858488642e-08, 'epoch': 0.95} 95%|█████████▍| 20931/22095 [36:13:25<5:31:37, 17.09s/it] 95%|█████████▍| 20932/22095 [36:13:49<6:10:33, 19.12s/it] {'loss': 0.3207, 'grad_norm': 0.6720546270612172, 'learning_rate': 7.260498487712487e-08, 'epoch': 0.95} 95%|█████████▍| 20932/22095 [36:13:49<6:10:33, 19.12s/it] 95%|█████████▍| 20933/22095 [36:14:34<8:38:06, 26.75s/it] {'loss': 0.2713, 'grad_norm': 0.627060354808679, 'learning_rate': 7.24805897799713e-08, 'epoch': 0.95} 95%|█████████▍| 20933/22095 [36:14:34<8:38:06, 26.75s/it] 95%|█████████▍| 20934/22095 [36:14:57<8:16:18, 25.65s/it] {'loss': 0.2702, 'grad_norm': 0.5966061175146216, 'learning_rate': 7.23563005600758e-08, 'epoch': 0.95} 95%|█████████▍| 20934/22095 [36:14:57<8:16:18, 25.65s/it] 95%|█████████▍| 20935/22095 [36:15:01<6:09:51, 19.13s/it] {'loss': 0.311, 'grad_norm': 0.6318013396150959, 'learning_rate': 7.223211722010959e-08, 'epoch': 0.95} 95%|█████████▍| 20935/22095 
[36:15:01<6:09:51, 19.13s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45241 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (107304 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (117795 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (78272 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63974 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44837 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (116943 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20936/22095 [36:16:23<12:18:53, 38.25s/it] {'loss': 0.2959, 'grad_norm': 0.5954183406323375, 'learning_rate': 7.21080397627405e-08, 'epoch': 0.95} 95%|█████████▍| 20936/22095 [36:16:23<12:18:53, 38.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45092 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63737 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60834 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42054 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50434 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20937/22095 [36:16:26<8:53:57, 27.67s/it] {'loss': 0.3436, 'grad_norm': 0.6338239230437382, 'learning_rate': 7.198406819063419e-08, 'epoch': 0.95} 95%|█████████▍| 20937/22095 [36:16:26<8:53:57, 27.67s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20938/22095 [36:16:36<7:08:41, 22.23s/it] {'loss': 0.4425, 'grad_norm': 0.2799655198032767, 'learning_rate': 7.186020250645576e-08, 'epoch': 0.95} 95%|█████████▍| 20938/22095 [36:16:36<7:08:41, 22.23s/it] 95%|█████████▍| 20939/22095 [36:16:39<5:19:21, 16.58s/it] {'loss': 0.2909, 'grad_norm': 0.5993272085024067, 'learning_rate': 7.173644271286584e-08, 'epoch': 0.95} 95%|█████████▍| 20939/22095 [36:16:39<5:19:21, 16.58s/it] 95%|█████████▍| 20940/22095 [36:17:01<5:49:12, 18.14s/it] {'loss': 0.2581, 'grad_norm': 0.6036991084042239, 'learning_rate': 7.161278881252398e-08, 'epoch': 0.95} 95%|█████████▍| 20940/22095 [36:17:01<5:49:12, 18.14s/it] 95%|█████████▍| 20941/22095 [36:17:04<4:21:47, 13.61s/it] {'loss': 0.292, 'grad_norm': 0.5657049971902813, 'learning_rate': 7.14892408080864e-08, 'epoch': 0.95} 95%|█████████▍| 20941/22095 [36:17:04<4:21:47, 13.61s/it] 95%|█████████▍| 20942/22095 [36:17:07<3:21:30, 10.49s/it] {'loss': 0.2978, 
'grad_norm': 0.619094047430233, 'learning_rate': 7.136579870220817e-08, 'epoch': 0.95} 95%|█████████▍| 20942/22095 [36:17:07<3:21:30, 10.49s/it] 95%|█████████▍| 20943/22095 [36:17:11<2:39:17, 8.30s/it] {'loss': 0.2904, 'grad_norm': 0.6030486256087684, 'learning_rate': 7.124246249754218e-08, 'epoch': 0.95} 95%|█████████▍| 20943/22095 [36:17:11<2:39:17, 8.30s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▍| 20944/22095 [36:17:20<2:46:43, 8.69s/it] {'loss': 0.4689, 'grad_norm': 0.275658799458521, 'learning_rate': 7.1119232196738e-08, 'epoch': 0.95} 95%|█████████▍| 20944/22095 [36:17:20<2:46:43, 8.69s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (90172 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44499 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (56477 > 40960). 
Running this sequence through the model will result in indexing errors 95%|█████████▍| 20945/22095 [36:17:26<2:29:34, 7.80s/it] {'loss': 0.4701, 'grad_norm': 0.3242225239131589, 'learning_rate': 7.099610780244348e-08, 'epoch': 0.95} 95%|█████████▍| 20945/22095 [36:17:26<2:29:34, 7.80s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 95%|█████████▍| 20946/22095 [36:17:29<2:03:21, 6.44s/it] {'loss': 0.2765, 'grad_norm': 0.6053260765775155, 'learning_rate': 7.087308931730485e-08, 'epoch': 0.95} 95%|█████████▍| 20946/22095 [36:17:29<2:03:21, 6.44s/it] 95%|█████████▍| 20947/22095 [36:17:33<1:46:03, 5.54s/it] {'loss': 0.283, 'grad_norm': 0.5822627920801777, 'learning_rate': 7.075017674396445e-08, 'epoch': 0.95} 95%|█████████▍| 20947/22095 [36:17:33<1:46:03, 5.54s/it] 95%|█████████▍| 20948/22095 [36:17:36<1:31:28, 4.78s/it] {'loss': 0.283, 'grad_norm': 0.6450773095173824, 'learning_rate': 7.062737008506404e-08, 'epoch': 0.95} 95%|█████████▍| 20948/22095 [36:17:36<1:31:28, 4.78s/it] 95%|█████████▍| 20949/22095 [36:17:57<3:03:52, 9.63s/it] {'loss': 0.3242, 'grad_norm': 0.6037121474357465, 'learning_rate': 7.050466934324207e-08, 'epoch': 0.95} 95%|█████████▍| 20949/22095 [36:17:57<3:03:52, 9.63s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8909864 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'id': 33017, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C,D是AB段上的两点,E是AC的中点,F是BD的中点,如果EF=8,CD=4,AB的长度为()\nA. 9\nB. 10\nC. 12\nD. 16'}, {'from': 'gpt', 'value': '【解答】解:由题意得,EC+FD=EF-CD=8-4=4,∵E是AC的中点,F是BD的中点,∴AE+FB=EC+FD=4,∴AB=AE+FB+EF=4+8=12.'}]} 95%|█████████▍| 20950/22095 [36:18:01<2:32:06, 7.97s/it] {'loss': 0.2438, 'grad_norm': 0.5702545627996378, 'learning_rate': 7.038207452113422e-08, 'epoch': 0.95} 95%|█████████▍| 20950/22095 [36:18:01<2:32:06, 7.97s/it] 95%|█████████▍| 20951/22095 [36:18:41<5:38:32, 17.76s/it] {'loss': 0.291, 'grad_norm': 0.5819989483938457, 'learning_rate': 7.025958562137559e-08, 'epoch': 0.95} 95%|█████████▍| 20951/22095 [36:18:41<5:38:32, 17.76s/it] 95%|█████████▍| 20952/22095 [36:18:44<4:13:08, 13.29s/it] {'loss': 0.2897, 'grad_norm': 0.6063900047506062, 'learning_rate': 7.013720264659851e-08, 'epoch': 0.95} 95%|█████████▍| 20952/22095 [36:18:44<4:13:08, 13.29s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▍| 20953/22095 [36:18:54<3:51:28, 12.16s/it] {'loss': 0.4894, 'grad_norm': 0.25345238884624277, 'learning_rate': 7.001492559943201e-08, 'epoch': 0.95} 95%|█████████▍| 20953/22095 [36:18:54<3:51:28, 12.16s/it] 95%|█████████▍| 20954/22095 [36:18:57<3:03:02, 9.63s/it] {'loss': 0.3302, 'grad_norm': 0.607414809250055, 'learning_rate': 6.989275448250288e-08, 'epoch': 0.95} 95%|█████████▍| 20954/22095 [36:18:57<3:03:02, 9.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70525 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50586 > 40960). 
Running this sequence through the model will result in indexing errors 95%|█████████▍| 20955/22095 [36:19:01<2:26:13, 7.70s/it] {'loss': 0.2831, 'grad_norm': 0.6563703110364023, 'learning_rate': 6.977068929843678e-08, 'epoch': 0.95} 95%|█████████▍| 20955/22095 [36:19:01<2:26:13, 7.70s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (93812 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▍| 20956/22095 [36:19:04<2:01:44, 6.41s/it] {'loss': 0.2811, 'grad_norm': 0.6503675742457773, 'learning_rate': 6.964873004985717e-08, 'epoch': 0.95} 95%|█████████▍| 20956/22095 [36:19:04<2:01:44, 6.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8306654 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1tzXFmhTI8KJjSspiXXbM4FXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nCould you read and tell me the text in this image?'}, {'from': 'gpt', 'value': 'All words in the image:\n加厚轻松移动太方便了。。所有型号全部带轮\n破\n损包赔\n超强承受'}]} 95%|█████████▍| 20957/22095 [36:19:08<1:46:47, 5.63s/it] {'loss': 0.3489, 'grad_norm': 0.6344037115955137, 'learning_rate': 6.952687673938363e-08, 'epoch': 0.95} 95%|█████████▍| 20957/22095 [36:19:08<1:46:47, 5.63s/it] 95%|█████████▍| 20958/22095 [36:19:29<3:17:36, 10.43s/it] {'loss': 0.2568, 'grad_norm': 0.5878882777015434, 'learning_rate': 6.940512936963461e-08, 'epoch': 0.95} 95%|█████████▍| 20958/22095 [36:19:29<3:17:36, 10.43s/it] 95%|█████████▍| 20959/22095 [36:20:14<6:28:56, 20.54s/it] {'loss': 0.2659, 'grad_norm': 0.5402159930154613, 'learning_rate': 6.928348794322637e-08, 'epoch': 0.95} 95%|█████████▍| 20959/22095 [36:20:14<6:28:56, 20.54s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (70614 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59066 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61996 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (60520 > 40960). 
Running this sequence through the model will result in indexing errors
95%|█████████▍| 20960/22095 [36:20:17<4:49:37, 15.31s/it] {'loss': 0.2994, 'grad_norm': 0.5676910831484602, 'learning_rate': 6.916195246277291e-08, 'epoch': 0.95}
VC:s3://gui-agent/data_20250421/web/images/accuweather_com/trajectory_64/img/step_5.png 2025-08-29 04:18:15.400820 load time: 1060.23 ms
95%|█████████▍| 20961/22095 [36:20:19<3:38:44, 11.57s/it] {'loss': 0.2892, 'grad_norm': 0.6005688409644764, 'learning_rate': 6.904052293088437e-08, 'epoch': 0.95}
95%|█████████▍| 20962/22095 [36:20:23<2:54:27, 9.24s/it] {'loss': 0.2766, 'grad_norm': 0.5844076062105606, 'learning_rate': 6.891919935017089e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▍| 20963/22095 [36:20:34<3:01:22, 9.61s/it] {'loss': 0.4712, 'grad_norm': 0.2680542891543253, 'learning_rate': 6.879798172323926e-08, 'epoch': 0.95}
95%|█████████▍| 20964/22095 [36:21:00<4:36:31, 14.67s/it] {'loss': 0.274, 'grad_norm': 1.233302978572291, 'learning_rate': 6.867687005269408e-08, 'epoch': 0.95}
95%|█████████▍| 20965/22095 [36:21:45<7:28:36, 23.82s/it] {'loss': 0.2768, 'grad_norm': 0.5678739452683957, 'learning_rate': 6.855586434113771e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (84555 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▍| 20966/22095 [36:21:49<5:31:19, 17.61s/it] {'loss': 0.2816, 'grad_norm': 0.6447409783343343, 'learning_rate': 6.843496459116917e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (47040 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (85770 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▍| 20967/22095 [36:21:52<4:08:36, 13.22s/it] {'loss': 0.2733, 'grad_norm': 0.6330282871837848, 'learning_rate': 6.83141708053875e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▍| 20968/22095 [36:22:01<3:45:21, 12.00s/it] {'loss': 0.481, 'grad_norm': 0.2887291296955998, 'learning_rate': 6.819348298638839e-08, 'epoch': 0.95}
95%|█████████▍| 20969/22095 [36:22:05<3:01:21, 9.66s/it] {'loss': 0.3026, 'grad_norm': 0.6455807936183818, 'learning_rate': 6.807290113676423e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▍| 20970/22095 [36:22:08<2:25:28, 7.76s/it] {'loss': 0.2907, 'grad_norm': 0.5974785045800158, 'learning_rate': 6.795242525910573e-08, 'epoch': 0.95}
VC:s3://gui/aguvis/aguvis-stage1/guienvs/images/C4web50k-0_3966672-split-1.jpg 2025-08-29 04:20:06.950690 load time: 1024.93 ms
95%|█████████▍| 20971/22095 [36:22:11<1:57:02, 6.25s/it] {'loss': 0.3221, 'grad_norm': 0.5544173397992318, 'learning_rate': 6.783205535600191e-08,
'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (59033 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72892 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▍| 20972/22095 [36:22:14<1:39:29, 5.32s/it] {'loss': 0.3031, 'grad_norm': 0.6038550671982219, 'learning_rate': 6.771179143003958e-08, 'epoch': 0.95}
95%|█████████▍| 20973/22095 [36:22:17<1:25:29, 4.57s/it] {'loss': 0.2946, 'grad_norm': 0.6218445340632932, 'learning_rate': 6.759163348380282e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (100560000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/30428.png 2025-08-29 04:20:17.367771 load time: 1551.48 ms
95%|█████████▍| 20974/22095 [36:22:26<1:52:46, 6.04s/it] {'loss': 0.4569, 'grad_norm': 0.2522104366288927, 'learning_rate': 6.747158151987232e-08, 'epoch': 0.95}
95%|█████████▍| 20975/22095 [36:22:34<2:00:49, 6.47s/it] {'loss': 0.4977, 'grad_norm': 0.2725387958153841, 'learning_rate': 6.73516355408288e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 364, but got module 1
95%|█████████▍| 20976/22095 [36:22:39<1:51:00, 5.95s/it] {'loss': 0.2763, 'grad_norm': 0.5699970378852044, 'learning_rate': 6.723179554924908e-08, 'epoch': 0.95}
95%|█████████▍| 20977/22095 [36:22:43<1:41:26, 5.44s/it] {'loss': 0.2901, 'grad_norm': 0.5946194671942663, 'learning_rate': 6.711206154770833e-08, 'epoch': 0.95}
95%|█████████▍| 20978/22095 [36:22:46<1:30:19, 4.85s/it] {'loss': 0.2875, 'grad_norm': 0.5964805198685067, 'learning_rate': 6.699243353877949e-08, 'epoch': 0.95}
95%|█████████▍| 20979/22095 [36:22:50<1:24:27, 4.54s/it] {'loss': 0.3049, 'grad_norm': 1.2702132941453168, 'learning_rate': 6.687291152503217e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▍| 20980/22095 [36:23:00<1:53:55, 6.13s/it] {'loss': 0.4688, 'grad_norm': 0.35882497371212974, 'learning_rate': 6.675349550903488e-08, 'epoch': 0.95}
95%|█████████▍| 20981/22095 [36:23:04<1:42:13, 5.51s/it] {'loss': 0.2948, 'grad_norm': 0.7600225740849603, 'learning_rate': 6.663418549335443e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918292 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41445, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': "\n如图所示,C点位于AB段,D点为AC的中点,如果Cd=4,AB=14,则BC长度为()\nA. 6.5\nB. 4\nC. 5\nD. 6\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
95%|█████████▍| 20982/22095 [36:23:08<1:33:08, 5.02s/it] {'loss': 0.3708, 'grad_norm': 0.6181063297081756, 'learning_rate': 6.651498148055324e-08, 'epoch': 0.95}
95%|█████████▍| 20983/22095 [36:23:11<1:22:07, 4.43s/it] {'loss': 0.2737, 'grad_norm': 1.4409835173105592, 'learning_rate': 6.639588347319315e-08, 'epoch': 0.95}
95%|█████████▍| 20984/22095 [36:23:14<1:13:58, 4.00s/it] {'loss': 0.275, 'grad_norm': 0.5926133598300354, 'learning_rate': 6.627689147383265e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▍| 20985/22095 [36:23:17<1:09:25, 3.75s/it] {'loss': 0.2557, 'grad_norm': 0.621328656285908, 'learning_rate': 6.615800548502971e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (44875 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68142 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (52096 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67481 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▍| 20986/22095 [36:23:21<1:09:59, 3.79s/it] {'loss': 0.2822, 'grad_norm': 0.6553306928571592, 'learning_rate': 6.603922550933783e-08, 'epoch': 0.95}
95%|█████████▍| 20987/22095 [36:23:24<1:06:50, 3.62s/it] {'loss': 0.29, 'grad_norm': 0.5898052926929013, 'learning_rate': 6.592055154930887e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▍| 20988/22095 [36:23:27<1:01:48, 3.35s/it] {'loss': 0.2807, 'grad_norm': 0.60423853194223, 'learning_rate': 6.580198360749412e-08, 'epoch': 0.95}
95%|█████████▍| 20989/22095 [36:23:31<1:03:12, 3.43s/it] {'loss': 0.2714, 'grad_norm': 0.6083911386276145, 'learning_rate': 6.568352168644043e-08, 'epoch': 0.95}
95%|█████████▍| 20990/22095 [36:23:33<59:27, 3.23s/it] {'loss': 0.3046,
'grad_norm': 0.8479645873483991, 'learning_rate': 6.556516578869299e-08, 'epoch': 0.95}
95%|█████████▌| 20991/22095 [36:23:37<1:01:44, 3.36s/it] {'loss': 0.274, 'grad_norm': 0.5760836337681511, 'learning_rate': 6.544691591679531e-08, 'epoch': 0.95}
95%|█████████▌| 20992/22095 [36:24:01<2:55:01, 9.52s/it] {'loss': 0.271, 'grad_norm': 0.6151864967390867, 'learning_rate': 6.532877207328813e-08, 'epoch': 0.95}
95%|█████████▌| 20993/22095 [36:24:04<2:18:07, 7.52s/it] {'loss': 0.2904, 'grad_norm': 0.6503544528267623, 'learning_rate': 6.521073426070945e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (45859 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88275 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 20994/22095 [36:24:13<2:28:34, 8.10s/it] {'loss': 0.4739, 'grad_norm': 0.26711729261399775, 'learning_rate': 6.509280248159721e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8478922 in VC:s3://internvl-moe-sft-data/. Exception: Image size [50, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 113998, 'image': 'vrdu_texteq/astro-ph.CO/ac1e197e-fa18-43ad-bcb0-bb2e2c6c1a86.png', 'image_wh': [[50, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': '$\\approx$42'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8931433 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [185, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54586, 'image': 'images/5361.png', 'image_wh': [[185, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,已知C点将AB段分为两部分1:3,D点为AB的中点,如果CD=2,AB段的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8'}]}
95%|█████████▌| 20995/22095 [36:24:17<2:03:51, 6.76s/it] {'loss': 0.2758, 'grad_norm': 0.571338219335804, 'learning_rate': 6.49749767384833e-08, 'epoch': 0.95}
95%|█████████▌| 20996/22095 [36:24:19<1:41:27, 5.54s/it] {'loss': 0.3139, 'grad_norm': 0.6613673798392251, 'learning_rate': 6.485725703390067e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (62056 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65431 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (50568 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58874 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 20997/22095 [36:24:22<1:26:34, 4.73s/it] {'loss': 0.2549, 'grad_norm': 0.6584892264971919, 'learning_rate': 6.473964337037842e-08, 'epoch': 0.95}
95%|█████████▌| 20998/22095 [36:24:26<1:21:40, 4.47s/it] {'loss': 0.2781, 'grad_norm': 0.5943344110402399, 'learning_rate': 6.462213575044396e-08, 'epoch': 0.95}
95%|█████████▌| 20999/22095 [36:24:30<1:19:55, 4.38s/it] {'loss': 0.2842, 'grad_norm': 0.6127678986376591, 'learning_rate': 6.45047341766214e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [475, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8465299 in VC:s3://internvl-moe-sft-data/. Exception: Image size [475, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 138611, 'image': 'vrdu_texteq/astro-ph.CO/a9d805bc-12eb-4175-857e-9d651e31f48a.png', 'image_wh': [[475, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'The wave function $\\Psi$ can be written as'}]}
95%|█████████▌| 21000/22095 [36:24:34<1:16:20, 4.18s/it] {'loss': 0.3379, 'grad_norm': 0.6000744206177664, 'learning_rate': 6.438743865143371e-08, 'epoch': 0.95}
95%|█████████▌| 21001/22095 [36:24:37<1:11:36, 3.93s/it] {'loss': 0.2908, 'grad_norm': 0.5657602497024272, 'learning_rate': 6.42702491774022e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21002/22095 [36:24:45<1:32:34, 5.08s/it] {'loss': 0.4503, 'grad_norm': 0.2609860411322821, 'learning_rate': 6.415316575704378e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21003/22095 [36:24:49<1:25:13, 4.68s/it] {'loss': 0.2892, 'grad_norm': 0.604504207006534, 'learning_rate': 6.403618839287418e-08, 'epoch': 0.95}
95%|█████████▌| 21004/22095 [36:24:53<1:20:07, 4.41s/it] {'loss': 0.2736, 'grad_norm': 0.6030571400391765, 'learning_rate': 6.391931708740806e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21005/22095 [36:25:04<1:58:33, 6.53s/it] {'loss': 0.4747, 'grad_norm': 0.2807062113118566, 'learning_rate': 6.380255184315509e-08, 'epoch': 0.95}
95%|█████████▌| 21006/22095 [36:25:09<1:48:54,
6.00s/it] {'loss': 0.2965, 'grad_norm': 0.5946543907442272, 'learning_rate': 6.368589266262493e-08, 'epoch': 0.95}
95%|█████████▌| 21007/22095 [36:25:13<1:36:18, 5.31s/it] {'loss': 0.3242, 'grad_norm': 0.590482601249089, 'learning_rate': 6.356933954832501e-08, 'epoch': 0.95}
95%|█████████▌| 21008/22095 [36:25:16<1:25:34, 4.72s/it] {'loss': 0.2708, 'grad_norm': 0.5794334337238063, 'learning_rate': 6.345289250275777e-08, 'epoch': 0.95}
95%|█████████▌| 21009/22095 [36:25:19<1:18:51, 4.36s/it] {'loss': 0.3122, 'grad_norm': 0.5793012980790783, 'learning_rate': 6.333655152842676e-08, 'epoch': 0.95}
95%|█████████▌| 21010/22095 [36:25:23<1:12:05, 3.99s/it] {'loss': 0.303, 'grad_norm': 0.5797144339972501, 'learning_rate': 6.322031662783167e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (50270 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21011/22095 [36:25:26<1:07:11, 3.72s/it] {'loss': 0.283, 'grad_norm': 0.5331724832690391, 'learning_rate': 6.310418780346993e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21012/22095 [36:25:30<1:07:35, 3.74s/it] {'loss': 0.3098, 'grad_norm': 0.6539748251622324, 'learning_rate': 6.298816505783623e-08, 'epoch': 0.95}
95%|█████████▌| 21013/22095 [36:25:33<1:08:14, 3.78s/it] {'loss': 0.2896, 'grad_norm': 0.5968445176581573, 'learning_rate': 6.28722483934241e-08, 'epoch': 0.95}
95%|█████████▌| 21014/22095 [36:25:38<1:13:29, 4.08s/it] {'loss': 0.3001, 'grad_norm': 0.7165491692322432, 'learning_rate': 6.275643781272489e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21015/22095 [36:25:43<1:16:38, 4.26s/it] {'loss': 0.3247, 'grad_norm': 0.5754934026250936, 'learning_rate': 6.264073331822551e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21016/22095 [36:25:46<1:10:02, 3.90s/it] {'loss': 0.3054, 'grad_norm': 0.6371439596459173, 'learning_rate': 6.252513491241285e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (88246 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (73175 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21017/22095 [36:25:49<1:05:09, 3.63s/it] {'loss': 0.304, 'grad_norm': 0.6393908936424378, 'learning_rate': 6.240964259777104e-08, 'epoch': 0.95}
95%|█████████▌| 21018/22095 [36:25:52<1:01:18, 3.42s/it] {'loss': 0.2724, 'grad_norm': 0.6371497263366444, 'learning_rate': 6.229425637678088e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21019/22095 [36:26:01<1:30:39, 5.06s/it] {'loss': 0.4779, 'grad_norm': 0.26785550578434497, 'learning_rate': 6.217897625192259e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8402035 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4199, 'image': 'vrdu_table_final_2/astro-ph.CO/d6b3a072-dac7-4b9e-8673-6c04344e0312.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
95%|█████████▌| 21020/22095 [36:26:04<1:23:47, 4.68s/it] {'loss': 0.301, 'grad_norm': 0.5961553134529505, 'learning_rate': 6.206380222567254e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8397066 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63919, 'image': 'vrdu_table_final_2/astro-ph.EP/3960adf1-c302-48d3-b95e-aee788eeed8b.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nCan you render the table in the image as LaTeX code?'}, {'from': 'gpt', 'value': 'Sure, I can render the table in the image as LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```'}]}
95%|█████████▌| 21021/22095 [36:26:08<1:17:37, 4.34s/it] {'loss': 0.2966, 'grad_norm': 0.5852208105910898, 'learning_rate': 6.194873430050596e-08, 'epoch': 0.95}
95%|█████████▌| 21022/22095 [36:26:11<1:11:15, 3.98s/it] {'loss': 0.2742, 'grad_norm': 0.6041103603260393, 'learning_rate': 6.183377247889422e-08, 'epoch': 0.95}
95%|█████████▌| 21023/22095 [36:26:15<1:07:55, 3.80s/it] {'loss': 0.2631, 'grad_norm': 0.6030801014720969, 'learning_rate': 6.171891676330922e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918293 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41446, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点位于AB段,D点为AC的中点,如果Cd=4,AB=14,则BC长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '6'}]}
95%|█████████▌| 21024/22095 [36:26:18<1:05:57, 3.70s/it] {'loss': 0.268, 'grad_norm': 0.6407847667172143, 'learning_rate': 6.160416715621786e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21025/22095 [36:26:28<1:37:26, 5.46s/it] {'loss': 0.4682, 'grad_norm': 0.2788988931928052, 'learning_rate': 6.148952366008487e-08, 'epoch': 0.95}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
95%|█████████▌| 21026/22095 [36:26:31<1:28:37, 4.97s/it] {'loss': 0.2491, 'grad_norm': 0.5766844844108057, 'learning_rate': 6.137498627737492e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (48150 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (108086 > 40960).
Running this sequence through the model will result in indexing errors
95%|█████████▌| 21027/22095 [36:26:35<1:22:31, 4.64s/it] {'loss': 0.2971, 'grad_norm': 0.6204674046222174, 'learning_rate': 6.126055501054995e-08, 'epoch': 0.95}
95%|█████████▌| 21028/22095 [36:26:38<1:13:48, 4.15s/it] {'loss': 0.2732, 'grad_norm': 0.566262868538345, 'learning_rate': 6.114622986206575e-08, 'epoch': 0.95}
95%|█████████▌| 21029/22095 [36:26:42<1:12:21, 4.07s/it] {'loss': 0.3184, 'grad_norm': 0.6266490854478113, 'learning_rate': 6.103201083438149e-08, 'epoch': 0.95}
95%|█████████▌| 21030/22095 [36:26:46<1:12:38, 4.09s/it] {'loss': 0.3184, 'grad_norm': 0.6233291681881801, 'learning_rate': 6.091789792995018e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (49988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (74820 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43653 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70355 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21031/22095 [36:26:50<1:09:18, 3.91s/it] {'loss': 0.3531, 'grad_norm': 0.6900403839984441, 'learning_rate': 6.080389115122432e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (41749 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45881 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21032/22095 [36:26:53<1:08:08, 3.85s/it] {'loss': 0.3536, 'grad_norm': 0.6418354601151535, 'learning_rate': 6.06899905006525e-08, 'epoch': 0.95}
95%|█████████▌| 21033/22095 [36:26:57<1:06:33, 3.76s/it] {'loss': 0.3051, 'grad_norm': 0.7346705264008148, 'learning_rate': 6.057619598068332e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (43620 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111286 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21034/22095 [36:27:00<1:01:25, 3.47s/it] {'loss': 0.2993, 'grad_norm': 0.6679759636947534, 'learning_rate': 6.046250759376148e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21035/22095 [36:27:06<1:17:05, 4.36s/it] {'loss': 0.4927, 'grad_norm': 0.27520480360869226, 'learning_rate': 6.034892534233006e-08, 'epoch': 0.95}
95%|█████████▌| 21036/22095 [36:27:10<1:13:28, 4.16s/it] {'loss': 0.3048, 'grad_norm': 0.6746084316484376, 'learning_rate': 6.023544922882874e-08, 'epoch': 0.95}
95%|█████████▌| 21037/22095 [36:27:13<1:06:15, 3.76s/it] {'loss': 0.3036, 'grad_norm': 0.6528323441208711, 'learning_rate': 6.012207925569613e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (95496 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21038/22095 [36:27:17<1:06:55, 3.80s/it] {'loss': 0.2905, 'grad_norm': 0.6145097092357207, 'learning_rate': 6.000881542536863e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21039/22095 [36:27:27<1:39:15, 5.64s/it] {'loss': 0.4491, 'grad_norm': 0.24463072379857484, 'learning_rate': 5.989565774027983e-08, 'epoch': 0.95}
95%|█████████▌| 21040/22095 [36:27:36<2:00:21, 6.84s/it] {'loss': 0.4778, 'grad_norm': 0.2627344849839754, 'learning_rate': 5.978260620286058e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 364, but got module 1
95%|█████████▌| 21041/22095 [36:27:41<1:47:21, 6.11s/it] {'loss': 0.242, 'grad_norm': 0.596168684386448, 'learning_rate': 5.96696608155406e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21042/22095 [36:27:48<1:55:12, 6.56s/it] {'loss': 0.4841, 'grad_norm': 0.2646225649141157, 'learning_rate': 5.955682158074627e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 364, but got module 1
95%|█████████▌| 21043/22095 [36:27:52<1:39:29, 5.67s/it] {'loss': 0.3003, 'grad_norm': 0.7122779503387644, 'learning_rate': 5.944408850090289e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (83383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (100876 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21044/22095 [36:27:55<1:27:37, 5.00s/it] {'loss': 0.2806, 'grad_norm': 0.6594758890068033, 'learning_rate': 5.933146157843239e-08, 'epoch': 0.95}
95%|█████████▌| 21045/22095 [36:27:59<1:22:01, 4.69s/it] {'loss': 0.2961, 'grad_norm': 0.5710099357492796, 'learning_rate': 5.921894081575397e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
95%|█████████▌| 21046/22095 [36:28:09<1:48:12, 6.19s/it] {'loss': 0.477, 'grad_norm': 0.26264952959296717, 'learning_rate': 5.9106526215286786e-08, 'epoch': 0.95}
95%|█████████▌| 21047/22095 [36:28:12<1:33:42, 5.37s/it] {'loss': 0.25, 'grad_norm': 0.667966157173857, 'learning_rate': 5.899421777944503e-08, 'epoch': 0.95}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21048/22095 [36:28:16<1:22:01, 4.70s/it] {'loss': 0.2955, 'grad_norm': 0.6473696352491063, 'learning_rate': 5.888201551064288e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (118180 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (129080 >
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52208 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99499 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21049/22095 [36:28:19<1:12:41, 4.17s/it] {'loss': 0.2818, 'grad_norm': 0.6820985118600682, 'learning_rate': 5.876991941129062e-08, 'epoch': 0.95} 95%|█████████▌| 21049/22095 [36:28:19<1:12:41, 4.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71392 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (154424 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21050/22095 [36:28:23<1:15:58, 4.36s/it] {'loss': 0.2987, 'grad_norm': 0.6109836936427275, 'learning_rate': 5.8657929483796336e-08, 'epoch': 0.95} 95%|█████████▌| 21050/22095 [36:28:23<1:15:58, 4.36s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▌| 21051/22095 [36:28:26<1:09:18, 3.98s/it] {'loss': 0.287, 'grad_norm': 0.651082778888153, 'learning_rate': 5.854604573056755e-08, 'epoch': 0.95} 95%|█████████▌| 21051/22095 [36:28:26<1:09:18, 3.98s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21052/22095 [36:28:33<1:21:52, 4.71s/it] {'loss': 0.4622, 'grad_norm': 0.30875576273308114, 'learning_rate': 5.843426815400788e-08, 'epoch': 0.95} 95%|█████████▌| 21052/22095 [36:28:33<1:21:52, 4.71s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43763 > 40960). 
Running this sequence through the model will result in indexing errors 95%|█████████▌| 21053/22095 [36:28:36<1:14:03, 4.26s/it] {'loss': 0.2646, 'grad_norm': 0.655355098365421, 'learning_rate': 5.8322596756518744e-08, 'epoch': 0.95} 95%|█████████▌| 21053/22095 [36:28:36<1:14:03, 4.26s/it] 95%|█████████▌| 21054/22095 [36:28:39<1:09:29, 4.01s/it] {'loss': 0.306, 'grad_norm': 0.6104545493531314, 'learning_rate': 5.821103154049934e-08, 'epoch': 0.95} 95%|█████████▌| 21054/22095 [36:28:39<1:09:29, 4.01s/it] 95%|█████████▌| 21055/22095 [36:28:43<1:05:57, 3.81s/it] {'loss': 0.2829, 'grad_norm': 0.5416426331637725, 'learning_rate': 5.809957250834774e-08, 'epoch': 0.95} 95%|█████████▌| 21055/22095 [36:28:43<1:05:57, 3.81s/it] 95%|█████████▌| 21056/22095 [36:28:46<1:01:39, 3.56s/it] {'loss': 0.2843, 'grad_norm': 0.6268521076725558, 'learning_rate': 5.7988219662458714e-08, 'epoch': 0.95} 95%|█████████▌| 21056/22095 [36:28:46<1:01:39, 3.56s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (123916 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45494 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57930 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50113 > 40960). 
Running this sequence through the model will result in indexing errors
95%|█████████▌| 21057/22095 [36:28:56<1:33:47, 5.42s/it] {'loss': 0.4533, 'grad_norm': 0.2669820642962163, 'learning_rate': 5.787697300522421e-08, 'epoch': 0.95}
95%|█████████▌| 21058/22095 [36:29:06<2:01:34, 7.03s/it] {'loss': 0.4546, 'grad_norm': 0.2477758080254133, 'learning_rate': 5.7765832539035113e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21059/22095 [36:29:11<1:50:03, 6.37s/it] {'loss': 0.2887, 'grad_norm': 0.648880992568745, 'learning_rate': 5.765479826627951e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (79361 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21060/22095 [36:29:15<1:34:52, 5.50s/it] {'loss': 0.3575, 'grad_norm': 0.6113640942856314, 'learning_rate': 5.754387018934271e-08, 'epoch': 0.95}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8922711 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 45864, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB上有C、D两点,且AD=\\frac{1}{3}AB,C是AD的中点,若AB=12,则线段AC的长为()\nA. 3\nB. 4\nC. 1\nD. 2\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
95%|█████████▌| 21061/22095 [36:29:24<1:56:06, 6.74s/it] {'loss': 0.4683, 'grad_norm': 0.2506811649761924, 'learning_rate': 5.743304831060836e-08, 'epoch': 0.95}
95%|█████████▌| 21062/22095 [36:29:29<1:43:35, 6.02s/it] {'loss': 0.3598, 'grad_norm': 0.6305241344926028, 'learning_rate': 5.7322332632458454e-08, 'epoch': 0.95}
Token indices sequence length is longer than the specified maximum sequence length for this model (47383 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (87382 > 40960). Running this sequence through the model will result in indexing errors
95%|█████████▌| 21063/22095 [36:29:32<1:27:32, 5.09s/it] {'loss': 0.3155, 'grad_norm': 0.9082959487210251, 'learning_rate': 5.721172315727108e-08, 'epoch': 0.95}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (104104 > 40960).
Running this sequence through the model will result in indexing errors 95%|█████████▌| 21064/22095 [36:29:42<1:56:29, 6.78s/it] {'loss': 0.4894, 'grad_norm': 0.30722117733165927, 'learning_rate': 5.7101219887423233e-08, 'epoch': 0.95} 95%|█████████▌| 21064/22095 [36:29:42<1:56:29, 6.78s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▌| 21065/22095 [36:29:51<2:06:48, 7.39s/it] {'loss': 0.4934, 'grad_norm': 0.30885965328323056, 'learning_rate': 5.6990822825289115e-08, 'epoch': 0.95} 95%|█████████▌| 21065/22095 [36:29:51<2:06:48, 7.39s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 95%|█████████▌| 21066/22095 [36:29:55<1:50:33, 6.45s/it] {'loss': 0.2562, 'grad_norm': 0.6392135432575204, 'learning_rate': 5.688053197324073e-08, 'epoch': 0.95} 95%|█████████▌| 21066/22095 [36:29:55<1:50:33, 6.45s/it] 95%|█████████▌| 21067/22095 [36:29:59<1:35:53, 5.60s/it] {'loss': 0.288, 'grad_norm': 0.5728801787248158, 'learning_rate': 5.677034733364839e-08, 'epoch': 0.95} 95%|█████████▌| 21067/22095 [36:29:59<1:35:53, 5.60s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21068/22095 [36:30:08<1:55:21, 6.74s/it] {'loss': 0.4584, 'grad_norm': 0.2522632360951553, 'learning_rate': 5.66602689088791e-08, 'epoch': 0.95} 95%|█████████▌| 21068/22095 [36:30:08<1:55:21, 6.74s/it] 95%|█████████▌| 21069/22095 [36:30:12<1:40:16, 5.86s/it] {'loss': 0.293, 'grad_norm': 0.7143571198515337, 'learning_rate': 5.655029670129875e-08, 'epoch': 0.95} 95%|█████████▌| 21069/22095 [36:30:12<1:40:16, 5.86s/it] 95%|█████████▌| 21070/22095 [36:30:15<1:25:47, 5.02s/it] {'loss': 0.2969, 'grad_norm': 0.5981811006835039, 'learning_rate': 5.6440430713269325e-08, 'epoch': 0.95} 95%|█████████▌| 21070/22095 [36:30:15<1:25:47, 5.02s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21071/22095 [36:30:23<1:40:37, 5.90s/it] {'loss': 
0.4681, 'grad_norm': 0.2668000224402627, 'learning_rate': 5.633067094715228e-08, 'epoch': 0.95} 95%|█████████▌| 21071/22095 [36:30:23<1:40:37, 5.90s/it] 95%|█████████▌| 21072/22095 [36:30:28<1:32:54, 5.45s/it] {'loss': 0.2771, 'grad_norm': 0.6051485800178262, 'learning_rate': 5.622101740530572e-08, 'epoch': 0.95} 95%|█████████▌| 21072/22095 [36:30:28<1:32:54, 5.45s/it] 95%|█████████▌| 21073/22095 [36:30:31<1:23:28, 4.90s/it] {'loss': 0.303, 'grad_norm': 0.6327385317687241, 'learning_rate': 5.6111470090086106e-08, 'epoch': 0.95} 95%|█████████▌| 21073/22095 [36:30:31<1:23:28, 4.90s/it] 95%|█████████▌| 21074/22095 [36:30:34<1:14:44, 4.39s/it] {'loss': 0.3054, 'grad_norm': 0.7189651235790038, 'learning_rate': 5.6002029003847105e-08, 'epoch': 0.95} 95%|█████████▌| 21074/22095 [36:30:34<1:14:44, 4.39s/it] 95%|█████████▌| 21075/22095 [36:30:39<1:13:43, 4.34s/it] {'loss': 0.3036, 'grad_norm': 0.6291207355588576, 'learning_rate': 5.589269414893961e-08, 'epoch': 0.95} 95%|█████████▌| 21075/22095 [36:30:39<1:13:43, 4.34s/it] 95%|█████████▌| 21076/22095 [36:30:42<1:06:52, 3.94s/it] {'loss': 0.3429, 'grad_norm': 0.6193220306777247, 'learning_rate': 5.5783465527713964e-08, 'epoch': 0.95} 95%|█████████▌| 21076/22095 [36:30:42<1:06:52, 3.94s/it] 95%|█████████▌| 21077/22095 [36:30:45<1:04:10, 3.78s/it] {'loss': 0.2816, 'grad_norm': 0.5910164438973713, 'learning_rate': 5.567434314251663e-08, 'epoch': 0.95} 95%|█████████▌| 21077/22095 [36:30:45<1:04:10, 3.78s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21078/22095 [36:30:52<1:21:21, 4.80s/it] {'loss': 0.4609, 'grad_norm': 0.2552213221032225, 'learning_rate': 5.5565326995691835e-08, 'epoch': 0.95} 95%|█████████▌| 21078/22095 [36:30:52<1:21:21, 4.80s/it] 95%|█████████▌| 21079/22095 [36:30:56<1:16:32, 4.52s/it] {'loss': 0.2692, 'grad_norm': 0.5937592714720844, 'learning_rate': 5.5456417089582715e-08, 'epoch': 0.95} 95%|█████████▌| 21079/22095 [36:30:56<1:16:32, 4.52s/it] 95%|█████████▌| 
21080/22095 [36:31:00<1:11:42, 4.24s/it] {'loss': 0.2704, 'grad_norm': 0.5623907371733852, 'learning_rate': 5.534761342652906e-08, 'epoch': 0.95} 95%|█████████▌| 21080/22095 [36:31:00<1:11:42, 4.24s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44582 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21081/22095 [36:31:03<1:06:08, 3.91s/it] {'loss': 0.2716, 'grad_norm': 0.6982056335613128, 'learning_rate': 5.523891600886955e-08, 'epoch': 0.95} 95%|█████████▌| 21081/22095 [36:31:03<1:06:08, 3.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 95%|█████████▌| 21082/22095 [36:31:06<1:02:10, 3.68s/it] {'loss': 0.2886, 'grad_norm': 0.5974350115498644, 'learning_rate': 5.513032483893843e-08, 'epoch': 0.95} 95%|█████████▌| 21082/22095 [36:31:06<1:02:10, 3.68s/it] 95%|█████████▌| 21083/22095 [36:31:09<1:01:04, 3.62s/it] {'loss': 0.2659, 'grad_norm': 0.5876815070067601, 'learning_rate': 5.50218399190694e-08, 'epoch': 0.95} 95%|█████████▌| 21083/22095 [36:31:09<1:01:04, 3.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (65279 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48672 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86929 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53073 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45454 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21084/22095 [36:31:13<59:49, 3.55s/it] {'loss': 0.2632, 'grad_norm': 0.5721221155130534, 'learning_rate': 5.491346125159391e-08, 'epoch': 0.95} 95%|█████████▌| 21084/22095 [36:31:13<59:49, 3.55s/it] 95%|█████████▌| 21085/22095 [36:31:17<1:01:10, 3.63s/it] {'loss': 0.2783, 'grad_norm': 0.5523906313642887, 'learning_rate': 5.4805188838841226e-08, 'epoch': 0.95} 95%|█████████▌| 21085/22095 [36:31:17<1:01:10, 3.63s/it] 95%|█████████▌| 21086/22095 [36:31:21<1:03:22, 3.77s/it] {'loss': 0.2775, 'grad_norm': 0.6276116118037559, 'learning_rate': 5.4697022683136145e-08, 'epoch': 0.95} 95%|█████████▌| 21086/22095 [36:31:21<1:03:22, 3.77s/it] 95%|█████████▌| 21087/22095 [36:31:24<1:00:43, 3.61s/it] {'loss': 0.2863, 'grad_norm': 0.5721432827948391, 'learning_rate': 5.4588962786804035e-08, 'epoch': 0.95} 95%|█████████▌| 21087/22095 [36:31:24<1:00:43, 3.61s/it] 95%|█████████▌| 21088/22095 [36:31:27<58:40, 3.50s/it] {'loss': 0.2458, 'grad_norm': 0.5807363563910248, 'learning_rate': 5.448100915216636e-08, 'epoch': 0.95} 95%|█████████▌| 21088/22095 [36:31:27<58:40, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21089/22095 [36:31:33<1:08:41, 4.10s/it] {'loss': 0.4643, 'grad_norm': 0.238644150188603, 'learning_rate': 5.437316178154295e-08, 'epoch': 0.95} 95%|█████████▌| 21089/22095 [36:31:33<1:08:41, 4.10s/it] 95%|█████████▌| 21090/22095 [36:31:36<1:04:13, 3.83s/it] {'loss': 0.3157, 'grad_norm': 0.7048671117012055, 'learning_rate': 5.4265420677250267e-08, 'epoch': 0.95} 95%|█████████▌| 21090/22095 [36:31:36<1:04:13, 3.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79540 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (61788 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55113 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21091/22095 [36:31:40<1:05:55, 3.94s/it] {'loss': 0.3047, 'grad_norm': 0.7289959440252665, 'learning_rate': 5.4157785841604805e-08, 'epoch': 0.95} 95%|█████████▌| 21091/22095 [36:31:40<1:05:55, 3.94s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (48944 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21092/22095 [36:31:50<1:33:31, 5.60s/it] {'loss': 0.4595, 'grad_norm': 0.28604388476548864, 'learning_rate': 5.4050257276918036e-08, 'epoch': 0.95} 95%|█████████▌| 21092/22095 [36:31:50<1:33:31, 5.60s/it] 95%|█████████▌| 21093/22095 [36:31:53<1:21:09, 4.86s/it] {'loss': 0.271, 'grad_norm': 0.6084096838677485, 'learning_rate': 5.3942834985501455e-08, 'epoch': 0.95} 95%|█████████▌| 21093/22095 [36:31:53<1:21:09, 4.86s/it] 95%|█████████▌| 21094/22095 [36:31:56<1:13:09, 4.38s/it] {'loss': 0.2735, 'grad_norm': 0.5654045262842151, 'learning_rate': 5.383551896966266e-08, 'epoch': 0.95} 95%|█████████▌| 21094/22095 [36:31:56<1:13:09, 4.38s/it] 95%|█████████▌| 21095/22095 [36:31:59<1:06:20, 3.98s/it] {'loss': 0.3032, 'grad_norm': 0.6056723134167218, 'learning_rate': 5.372830923170702e-08, 'epoch': 0.95} 95%|█████████▌| 21095/22095 [36:31:59<1:06:20, 3.98s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69932 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45790 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63150 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (110779 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (53769 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85751 > 40960). Running this sequence through the model will result in indexing errors 95%|█████████▌| 21096/22095 [36:32:02<1:03:18, 3.80s/it] {'loss': 0.2952, 'grad_norm': 0.6271978895462672, 'learning_rate': 5.362120577393881e-08, 'epoch': 0.95} 95%|█████████▌| 21096/22095 [36:32:02<1:03:18, 3.80s/it] 95%|█████████▌| 21097/22095 [36:32:05<59:30, 3.58s/it] {'loss': 0.2753, 'grad_norm': 0.578230439300833, 'learning_rate': 5.351420859865952e-08, 'epoch': 0.95} 95%|█████████▌| 21097/22095 [36:32:05<59:30, 3.58s/it] 95%|█████████▌| 21098/22095 [36:32:09<1:00:07, 3.62s/it] {'loss': 0.2957, 'grad_norm': 0.6369091384458982, 'learning_rate': 5.340731770816843e-08, 'epoch': 0.95} 95%|█████████▌| 21098/22095 [36:32:09<1:00:07, 3.62s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 95%|█████████▌| 21099/22095 [36:32:19<1:31:29, 5.51s/it] {'loss': 0.4708, 'grad_norm': 0.2938941194995703, 'learning_rate': 5.330053310476091e-08, 'epoch': 0.95} 95%|█████████▌| 21099/22095 [36:32:19<1:31:29, 5.51s/it] 95%|█████████▌| 21100/22095 [36:32:29<1:52:56, 6.81s/it] 
{'loss': 0.4724, 'grad_norm': 0.2623577246616529, 'learning_rate': 5.319385479073236e-08, 'epoch': 0.95} 95%|█████████▌| 21100/22095 [36:32:29<1:52:56, 6.81s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (65436 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57308 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55260 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21101/22095 [36:32:33<1:37:54, 5.91s/it] {'loss': 0.2841, 'grad_norm': 1.2555377118351114, 'learning_rate': 5.308728276837538e-08, 'epoch': 0.96} 96%|█████████▌| 21101/22095 [36:32:33<1:37:54, 5.91s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▌| 21102/22095 [36:32:36<1:25:42, 5.18s/it] {'loss': 0.2868, 'grad_norm': 0.6404644636725458, 'learning_rate': 5.298081703997926e-08, 'epoch': 0.96} 96%|█████████▌| 21102/22095 [36:32:36<1:25:42, 5.18s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 96%|█████████▌| 21103/22095 [36:32:40<1:17:52, 4.71s/it] {'loss': 0.3046, 'grad_norm': 0.5611261911554767, 'learning_rate': 5.287445760783161e-08, 'epoch': 0.96} 96%|█████████▌| 21103/22095 [36:32:40<1:17:52, 4.71s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (102063920 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 96%|█████████▌| 21104/22095 [36:32:49<1:41:51, 6.17s/it] {'loss': 0.4611, 'grad_norm': 0.26375488095717653, 'learning_rate': 5.276820447421782e-08, 'epoch': 0.96} 96%|█████████▌| 21104/22095 [36:32:49<1:41:51, 6.17s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45698 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79030 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52898 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21105/22095 [36:32:53<1:30:13, 5.47s/it] {'loss': 0.2924, 'grad_norm': 0.7815139458922309, 'learning_rate': 5.266205764142107e-08, 'epoch': 0.96} 96%|█████████▌| 21105/22095 [36:32:53<1:30:13, 5.47s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (50852 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42640 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (100592 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (55647 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (66639 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21106/22095 [36:32:57<1:19:20, 4.81s/it] {'loss': 0.3019, 'grad_norm': 0.6860109324350689, 'learning_rate': 5.2556017111722315e-08, 'epoch': 0.96} 96%|█████████▌| 21106/22095 [36:32:57<1:19:20, 4.81s/it] 96%|█████████▌| 21107/22095 [36:33:00<1:10:59, 4.31s/it] {'loss': 0.2745, 'grad_norm': 0.5940709035253605, 'learning_rate': 5.245008288740028e-08, 'epoch': 0.96} 96%|█████████▌| 21107/22095 [36:33:00<1:10:59, 4.31s/it] 96%|█████████▌| 21108/22095 [36:33:04<1:08:39, 4.17s/it] {'loss': 0.2929, 'grad_norm': 0.6224783076128134, 'learning_rate': 5.234425497072981e-08, 'epoch': 0.96} 96%|█████████▌| 21108/22095 [36:33:04<1:08:39, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (50706 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21109/22095 [36:33:10<1:21:04, 4.93s/it] {'loss': 0.4686, 'grad_norm': 0.25188962605758014, 'learning_rate': 5.223853336398632e-08, 'epoch': 0.96} 96%|█████████▌| 21109/22095 [36:33:10<1:21:04, 4.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (40988 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21110/22095 [36:33:14<1:13:22, 4.47s/it] {'loss': 0.2819, 'grad_norm': 0.5638666587277356, 'learning_rate': 5.213291806944076e-08, 'epoch': 0.96} 96%|█████████▌| 21110/22095 [36:33:14<1:13:22, 4.47s/it] 96%|█████████▌| 21111/22095 [36:33:16<1:05:23, 3.99s/it] {'loss': 0.2873, 'grad_norm': 0.6578418934123136, 'learning_rate': 5.2027409089362434e-08, 'epoch': 0.96} 96%|█████████▌| 21111/22095 [36:33:16<1:05:23, 3.99s/it] 96%|█████████▌| 21112/22095 [36:33:20<1:00:41, 3.70s/it] {'loss': 0.2969, 'grad_norm': 0.6370928784488743, 'learning_rate': 5.192200642601841e-08, 'epoch': 0.96} 96%|█████████▌| 21112/22095 [36:33:20<1:00:41, 3.70s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21113/22095 [36:33:28<1:23:11, 5.08s/it] {'loss': 0.4389, 'grad_norm': 0.25498088535690616, 'learning_rate': 5.181671008167355e-08, 'epoch': 0.96} 96%|█████████▌| 21113/22095 [36:33:28<1:23:11, 5.08s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49254 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21114/22095 [36:33:38<1:46:25, 6.51s/it] {'loss': 0.473, 'grad_norm': 0.2758628803552411, 'learning_rate': 5.171152005859159e-08, 'epoch': 0.96} 96%|█████████▌| 21114/22095 [36:33:38<1:46:25, 6.51s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▌| 21115/22095 [36:33:41<1:30:19, 5.53s/it] {'loss': 0.3202, 'grad_norm': 0.5835708463549837, 'learning_rate': 5.1606436359030174e-08, 'epoch': 0.96} 96%|█████████▌| 21115/22095 [36:33:41<1:30:19, 5.53s/it] 96%|█████████▌| 21116/22095 [36:33:44<1:18:59, 4.84s/it] {'loss': 0.3031, 'grad_norm': 0.616351688744552, 'learning_rate': 5.150145898524916e-08, 'epoch': 0.96} 96%|█████████▌| 21116/22095 [36:33:44<1:18:59, 4.84s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21117/22095 [36:33:50<1:26:22, 5.30s/it] {'loss': 0.454, 'grad_norm': 0.25333214668607923, 'learning_rate': 5.139658793950342e-08, 'epoch': 0.96} 96%|█████████▌| 21117/22095 [36:33:50<1:26:22, 5.30s/it] 96%|█████████▌| 21118/22095 [36:34:00<1:46:05, 6.52s/it] {'loss': 0.4685, 'grad_norm': 0.2571882919873599, 'learning_rate': 5.1291823224046687e-08, 'epoch': 0.96} 96%|█████████▌| 21118/22095 [36:34:00<1:46:05, 6.52s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 96%|█████████▌| 21119/22095 [36:34:03<1:31:12, 5.61s/it] {'loss': 0.3117, 'grad_norm': 0.7099556179776506, 'learning_rate': 5.1187164841129954e-08, 'epoch': 0.96} 96%|█████████▌| 21119/22095 [36:34:03<1:31:12, 5.61s/it] 96%|█████████▌| 21120/22095 [36:34:07<1:20:58, 4.98s/it] {'loss': 0.2631, 'grad_norm': 0.5463801828022469, 'learning_rate': 5.1082612793001976e-08, 'epoch': 0.96} 96%|█████████▌| 21120/22095 [36:34:07<1:20:58, 4.98s/it] 96%|█████████▌| 21121/22095 [36:34:10<1:12:42, 4.48s/it] {'loss': 
0.2613, 'grad_norm': 0.5868453745955741, 'learning_rate': 5.0978167081908726e-08, 'epoch': 0.96} 96%|█████████▌| 21121/22095 [36:34:10<1:12:42, 4.48s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41708 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (161510 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21122/22095 [36:34:18<1:27:11, 5.38s/it] {'loss': 0.468, 'grad_norm': 0.2854284040132491, 'learning_rate': 5.0873827710095636e-08, 'epoch': 0.96} 96%|█████████▌| 21122/22095 [36:34:18<1:27:11, 5.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56558 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21123/22095 [36:34:21<1:18:40, 4.86s/it] {'loss': 0.2847, 'grad_norm': 1.618474871427813, 'learning_rate': 5.076959467980369e-08, 'epoch': 0.96} 96%|█████████▌| 21123/22095 [36:34:21<1:18:40, 4.86s/it] 96%|█████████▌| 21124/22095 [36:34:24<1:10:01, 4.33s/it] {'loss': 0.2828, 'grad_norm': 0.5758606216633175, 'learning_rate': 5.066546799327221e-08, 'epoch': 0.96} 96%|█████████▌| 21124/22095 [36:34:24<1:10:01, 4.33s/it] 96%|█████████▌| 21125/22095 [36:34:28<1:06:20, 4.10s/it] {'loss': 0.2697, 'grad_norm': 0.5602948965142438, 'learning_rate': 5.0561447652739404e-08, 'epoch': 0.96} 96%|█████████▌| 21125/22095 [36:34:28<1:06:20, 4.10s/it] 96%|█████████▌| 21126/22095 [36:34:32<1:05:58, 4.08s/it] {'loss': 0.236, 'grad_norm': 0.6075309864313594, 'learning_rate': 5.045753366044015e-08, 'epoch': 0.96} 96%|█████████▌| 21126/22095 [36:34:32<1:05:58, 4.08s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
96%|█████████▌| 21127/22095 [36:34:35<1:00:08, 3.73s/it] {'loss': 0.2855, 'grad_norm': 0.6679248281529808, 'learning_rate': 5.035372601860766e-08, 'epoch': 0.96} 96%|█████████▌| 21127/22095 [36:34:35<1:00:08, 3.73s/it] 96%|█████████▌| 21128/22095 [36:34:39<1:01:02, 3.79s/it] {'loss': 0.2807, 'grad_norm': 0.6291431479098318, 'learning_rate': 5.0250024729470714e-08, 'epoch': 0.96} 96%|█████████▌| 21128/22095 [36:34:39<1:01:02, 3.79s/it] 96%|█████████▌| 21129/22095 [36:34:42<57:48, 3.59s/it] {'loss': 0.3114, 'grad_norm': 0.564615306221422, 'learning_rate': 5.0146429795259745e-08, 'epoch': 0.96} 96%|█████████▌| 21129/22095 [36:34:42<57:48, 3.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [509, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8418674 in VC:s3://internvl-moe-sft-data/. Exception: Image size [509, 23, 100, 100] is too small. Minimum size is 28. 
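The traceback above ends in `ValueError: Image size [509, 23, 100, 100] is too small. Minimum size is 28.` — the sample's height (23 px, per its `image_wh` of `[[509, 23]]`) is below the vision patchifier's minimum side. A pre-filter over the manifest avoids paying for these failures (and the retries they trigger) at step time; this assumes `image_wh` holds `[width, height]` pairs, as the dumped samples suggest:

```python
MIN_SIDE = 28  # minimum side length accepted by the patch pipeline, per the log

def size_ok(image_wh, min_side=MIN_SIDE):
    """Return True only if every (width, height) pair attached to a sample
    meets the minimum side length. Run over the manifest before training
    to drop samples that would raise the ValueError seen above."""
    return all(w >= min_side and h >= min_side for w, h in image_wh)
```

All three failing samples in this log segment would be caught up front: `[[509, 23]]`, `[[183, 19]]`, and `[[0, 0]]` each contain a side below 28.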
Problematic sample: {'id': 29596, 'image': 'vrdu_texteq/astro-ph.CO/4aeb5c27-8d53-473d-bd85-dd227d37643c.png', 'image_wh': [[509, 23]], 'conversations': [{'from': 'human', 'value': '\n请使用文本识别技术解读图像中的文字。'}, {'from': 'gpt', 'value': 'whose solution for $t < 0$ can be written as'}]} Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21130/22095 [36:34:50<1:17:45, 4.84s/it] {'loss': 0.4783, 'grad_norm': 0.26899426000202586, 'learning_rate': 5.004294121819908e-08, 'epoch': 0.96} 96%|█████████▌| 21130/22095 [36:34:50<1:17:45, 4.84s/it] 96%|█████████▌| 21131/22095 [36:34:57<1:31:10, 5.67s/it] {'loss': 0.4898, 'grad_norm': 0.29653148382525596, 'learning_rate': 4.993955900051362e-08, 'epoch': 0.96} 96%|█████████▌| 21131/22095 [36:34:57<1:31:10, 5.67s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 Token indices sequence length is longer than the specified maximum sequence length for this model (85136 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47365 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44544 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21132/22095 [36:35:01<1:21:38, 5.09s/it] {'loss': 0.293, 'grad_norm': 0.6173135466935873, 'learning_rate': 4.983628314442324e-08, 'epoch': 0.96} 96%|█████████▌| 21132/22095 [36:35:01<1:21:38, 5.09s/it] 96%|█████████▌| 21133/22095 [36:35:05<1:17:08, 4.81s/it] {'loss': 0.318, 'grad_norm': 0.5586290414287138, 'learning_rate': 4.973311365214894e-08, 'epoch': 0.96} 96%|█████████▌| 21133/22095 [36:35:05<1:17:08, 4.81s/it] 96%|█████████▌| 21134/22095 [36:35:09<1:12:42, 4.54s/it] {'loss': 0.3139, 'grad_norm': 0.5546851153757404, 'learning_rate': 4.9630050525905635e-08, 'epoch': 0.96} 96%|█████████▌| 21134/22095 [36:35:09<1:12:42, 4.54s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047853 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': '\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 2cm\nB. 3cm\nC. 4cm\nD. 
6cm'}, {'from': 'gpt', 'value': '【解答】解:∵AB=10cm,BC=4cm,∴AC=6cm,∵D是线段AC的中点,∴AD=3cm.'}]} 96%|█████████▌| 21135/22095 [36:35:12<1:06:40, 4.17s/it] {'loss': 0.2762, 'grad_norm': 0.5424416970865831, 'learning_rate': 4.9527093767908765e-08, 'epoch': 0.96} 96%|█████████▌| 21135/22095 [36:35:12<1:06:40, 4.17s/it] 96%|█████████▌| 21136/22095 [36:35:15<1:00:30, 3.79s/it] {'loss': 0.2852, 'grad_norm': 0.6311911428226128, 'learning_rate': 4.942424338037044e-08, 'epoch': 0.96} 96%|█████████▌| 21136/22095 [36:35:15<1:00:30, 3.79s/it] 96%|█████████▌| 21137/22095 [36:35:19<59:11, 3.71s/it] {'loss': 0.2514, 'grad_norm': 0.5741090634293206, 'learning_rate': 4.932149936550057e-08, 'epoch': 0.96} 96%|█████████▌| 21137/22095 [36:35:19<59:11, 3.71s/it] 96%|█████████▌| 21138/22095 [36:35:22<56:40, 3.55s/it] {'loss': 0.2994, 'grad_norm': 0.6936989040590055, 'learning_rate': 4.9218861725506825e-08, 'epoch': 0.96} 96%|█████████▌| 21138/22095 [36:35:22<56:40, 3.55s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42813 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (138018 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46857 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21139/22095 [36:35:26<58:01, 3.64s/it] {'loss': 0.2769, 'grad_norm': 0.6044609637642747, 'learning_rate': 4.9116330462594677e-08, 'epoch': 0.96} 96%|█████████▌| 21139/22095 [36:35:26<58:01, 3.64s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21140/22095 [36:35:36<1:28:38, 5.57s/it] {'loss': 0.446, 'grad_norm': 0.25341767036578816, 'learning_rate': 4.9013905578967346e-08, 'epoch': 0.96} 96%|█████████▌| 21140/22095 [36:35:36<1:28:38, 5.57s/it] 96%|█████████▌| 21141/22095 [36:35:43<1:37:50, 6.15s/it] {'loss': 0.4946, 'grad_norm': 0.2809826063683718, 'learning_rate': 4.8911587076825305e-08, 'epoch': 0.96} 96%|█████████▌| 21141/22095 [36:35:43<1:37:50, 6.15s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 96%|█████████▌| 21142/22095 [36:35:47<1:25:19, 5.37s/it] {'loss': 0.3133, 'grad_norm': 0.663001023558929, 'learning_rate': 4.8809374958366796e-08, 'epoch': 0.96} 96%|█████████▌| 21142/22095 [36:35:47<1:25:19, 5.37s/it] 96%|█████████▌| 21143/22095 [36:35:51<1:17:10, 4.86s/it] {'loss': 0.2811, 'grad_norm': 0.61431465932955, 'learning_rate': 4.870726922578839e-08, 'epoch': 0.96} 96%|█████████▌| 21143/22095 [36:35:51<1:17:10, 4.86s/it] 96%|█████████▌| 21144/22095 [36:35:54<1:11:03, 4.48s/it] {'loss': 0.298, 'grad_norm': 0.5896603758861003, 'learning_rate': 4.8605269881284446e-08, 'epoch': 0.96} 96%|█████████▌| 21144/22095 [36:35:54<1:11:03, 4.48s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (74715 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52743 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (45400 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (90248 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21145/22095 [36:35:58<1:06:41, 4.21s/it] {'loss': 0.2603, 'grad_norm': 0.6509342502713384, 'learning_rate': 4.8503376927045984e-08, 'epoch': 0.96} 96%|█████████▌| 21145/22095 [36:35:58<1:06:41, 4.21s/it] 96%|█████████▌| 21146/22095 [36:36:01<1:01:00, 3.86s/it] {'loss': 0.2688, 'grad_norm': 0.5676415541024242, 'learning_rate': 4.840159036526237e-08, 'epoch': 0.96} 96%|█████████▌| 21146/22095 [36:36:01<1:01:00, 3.86s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21147/22095 [36:36:10<1:27:21, 5.53s/it] {'loss': 0.4955, 'grad_norm': 0.2541454717914221, 'learning_rate': 4.8299910198121304e-08, 'epoch': 0.96} 96%|█████████▌| 21147/22095 [36:36:10<1:27:21, 5.53s/it] 96%|█████████▌| 21148/22095 [36:36:14<1:18:28, 4.97s/it] {'loss': 0.2653, 'grad_norm': 0.6064039473492522, 'learning_rate': 4.819833642780713e-08, 'epoch': 0.96} 96%|█████████▌| 21148/22095 [36:36:14<1:18:28, 4.97s/it] 96%|█████████▌| 21149/22095 [36:36:17<1:09:11, 4.39s/it] {'loss': 0.265, 'grad_norm': 0.6175290510783422, 'learning_rate': 4.809686905650257e-08, 'epoch': 0.96} 96%|█████████▌| 21149/22095 [36:36:17<1:09:11, 4.39s/it] 96%|█████████▌| 21150/22095 [36:36:21<1:07:54, 4.31s/it] {'loss': 0.288, 'grad_norm': 0.6565275491143019, 'learning_rate': 4.7995508086386975e-08, 'epoch': 0.96} 96%|█████████▌| 21150/22095 [36:36:21<1:07:54, 4.31s/it] 96%|█████████▌| 21151/22095 [36:36:24<1:01:42, 3.92s/it] {'loss': 0.2669, 'grad_norm': 0.762414490219462, 'learning_rate': 4.789425351963972e-08, 'epoch': 0.96} 
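The repeated "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings mean some samples tokenize past the model's context window. One option, sketched here under the assumption that dropping (rather than truncating, which would cut supervision targets) is acceptable, is to partition the dataset by tokenized length before training; `encode` stands in for any tokenizer's encode callable:

```python
def drop_overlong(samples, encode, max_len=40960):
    """Partition samples by tokenized length so sequences longer than the
    model's context window never reach the collator. `encode` is any
    callable returning a token list (e.g. a tokenizer's encode method)."""
    kept, skipped = [], []
    for s in samples:
        (kept if len(encode(s["text"])) <= max_len else skipped).append(s)
    return kept, skipped
```

The `skipped` list can then be audited separately (e.g. re-chunked into shorter conversations) instead of producing indexing errors mid-epoch.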
96%|█████████▌| 21151/22095 [36:36:24<1:01:42, 3.92s/it] 96%|█████████▌| 21152/22095 [36:36:28<1:00:28, 3.85s/it] {'loss': 0.26, 'grad_norm': 0.6262336664562198, 'learning_rate': 4.779310535843573e-08, 'epoch': 0.96} 96%|█████████▌| 21152/22095 [36:36:28<1:00:28, 3.85s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21153/22095 [36:36:37<1:27:06, 5.55s/it] {'loss': 0.4672, 'grad_norm': 0.31724872112389707, 'learning_rate': 4.769206360494771e-08, 'epoch': 0.96} 96%|█████████▌| 21153/22095 [36:36:37<1:27:06, 5.55s/it] 96%|█████████▌| 21154/22095 [36:36:41<1:17:39, 4.95s/it] {'loss': 0.2793, 'grad_norm': 0.5928255327048539, 'learning_rate': 4.759112826134782e-08, 'epoch': 0.96} 96%|█████████▌| 21154/22095 [36:36:41<1:17:39, 4.95s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62914 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (77543 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21155/22095 [36:36:44<1:07:51, 4.33s/it] {'loss': 0.2363, 'grad_norm': 0.6557021358975665, 'learning_rate': 4.749029932980431e-08, 'epoch': 0.96} 96%|█████████▌| 21155/22095 [36:36:44<1:07:51, 4.33s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (59725 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (87268 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21156/22095 [36:36:47<1:00:46, 3.88s/it] {'loss': 0.2506, 'grad_norm': 0.6553815062030756, 'learning_rate': 4.73895768124838e-08, 'epoch': 0.96} 96%|█████████▌| 21156/22095 [36:36:47<1:00:46, 3.88s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (92535 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44934 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41069 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21157/22095 [36:36:51<1:04:03, 4.10s/it] {'loss': 0.3, 'grad_norm': 2.1409793774596806, 'learning_rate': 4.7288960711550644e-08, 'epoch': 0.96} 96%|█████████▌| 21157/22095 [36:36:51<1:04:03, 4.10s/it] 96%|█████████▌| 21158/22095 [36:36:56<1:05:01, 4.16s/it] {'loss': 0.2757, 'grad_norm': 0.5651822107390357, 'learning_rate': 4.718845102916592e-08, 'epoch': 0.96} 96%|█████████▌| 21158/22095 [36:36:56<1:05:01, 4.16s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308538 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
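The `[Try #0] Failed to fetch sample ... Exception: ...` lines show the dataset recovering from bad samples by logging the failure and fetching a substitute rather than crashing the epoch. A minimal sketch of that pattern (class name, retry count, and random-resample fallback are illustrative assumptions, not the actual `data_qwen_2.py` implementation):

```python
import random

class RetryingGetter:
    """On a failed fetch, log '[Try #N] Failed to fetch sample ...' as in
    the training log, then fall back to a random other index."""
    def __init__(self, fetch, length, max_tries=10):
        self.fetch, self.length, self.max_tries = fetch, length, max_tries

    def __getitem__(self, idx):
        for attempt in range(self.max_tries):
            try:
                return self.fetch(idx)
            except Exception as exc:
                print(f"[Try #{attempt}] Failed to fetch sample {idx}. Exception: {exc}")
                idx = random.randrange(self.length)
        raise RuntimeError("exhausted retries fetching a usable sample")
```

This keeps long runs alive through sporadic bad samples, at the cost of silently shifting the sampled distribution — which is why the substituted samples are logged.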
Problematic sample: {'image': 'TB2hNrqXol7MKJjSZFDXXaOEpXa_!!2896121446.jpg.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n你能帮忙识别图中的所有文字吗?'}, {'from': 'gpt', 'value': '图中所有文字:\nHK\n8888\n彩色爆闪\n火焰爆闪框+随机牌子\n需要自己回去打眼字国定哦)\n(螺丝眼不对的\n送固定螺丝\n(牌子随机发哦\n或者联系客服换其他的)'}]} 96%|█████████▌| 21159/22095 [36:37:00<1:04:19, 4.12s/it] {'loss': 0.2822, 'grad_norm': 0.625507174834113, 'learning_rate': 4.708804776749121e-08, 'epoch': 0.96} 96%|█████████▌| 21159/22095 [36:37:00<1:04:19, 4.12s/it] 96%|█████████▌| 21160/22095 [36:37:03<59:51, 3.84s/it] {'loss': 0.2983, 'grad_norm': 0.6486296978176692, 'learning_rate': 4.6987750928682017e-08, 'epoch': 0.96} 96%|█████████▌| 21160/22095 [36:37:03<59:51, 3.84s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42964 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21161/22095 [36:37:07<1:03:45, 4.10s/it] {'loss': 0.297, 'grad_norm': 0.5840867435621814, 'learning_rate': 4.688756051489385e-08, 'epoch': 0.96} 96%|█████████▌| 21161/22095 [36:37:07<1:03:45, 4.10s/it] 96%|█████████▌| 21162/22095 [36:37:12<1:04:49, 4.17s/it] {'loss': 0.2637, 'grad_norm': 0.6474200844795487, 'learning_rate': 4.678747652827997e-08, 'epoch': 0.96} 96%|█████████▌| 21162/22095 [36:37:12<1:04:49, 4.17s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21163/22095 [36:37:22<1:34:36, 6.09s/it] {'loss': 0.4583, 'grad_norm': 0.2523292538945871, 'learning_rate': 4.668749897099034e-08, 'epoch': 0.96} 96%|█████████▌| 21163/22095 [36:37:22<1:34:36, 6.09s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (69078 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46200 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (49012 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (92302 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21164/22095 [36:37:32<1:53:21, 7.31s/it] {'loss': 0.4774, 'grad_norm': 0.26201240745249477, 'learning_rate': 4.6587627845173786e-08, 'epoch': 0.96} 96%|█████████▌| 21164/22095 [36:37:33<1:53:21, 7.31s/it] 96%|█████████▌| 21165/22095 [36:37:39<1:48:15, 6.98s/it] {'loss': 0.474, 'grad_norm': 0.2650016651303161, 'learning_rate': 4.648786315297582e-08, 'epoch': 0.96} 96%|█████████▌| 21165/22095 [36:37:39<1:48:15, 6.98s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 96%|█████████▌| 21166/22095 [36:37:43<1:36:12, 6.21s/it] {'loss': 0.3042, 'grad_norm': 0.6206100835637757, 'learning_rate': 4.6388204896539724e-08, 'epoch': 0.96} 96%|█████████▌| 21166/22095 [36:37:43<1:36:12, 6.21s/it] 96%|█████████▌| 21167/22095 [36:37:47<1:24:50, 5.49s/it] {'loss': 0.2747, 'grad_norm': 0.61239911731407, 'learning_rate': 4.628865307800712e-08, 'epoch': 0.96} 96%|█████████▌| 21167/22095 [36:37:47<1:24:50, 5.49s/it] 96%|█████████▌| 21168/22095 [36:37:51<1:17:49, 5.04s/it] {'loss': 0.3023, 'grad_norm': 0.6238800569552224, 'learning_rate': 4.618920769951796e-08, 'epoch': 0.96} 96%|█████████▌| 21168/22095 [36:37:51<1:17:49, 5.04s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21169/22095 [36:38:02<1:43:26, 6.70s/it] {'loss': 0.4675, 'grad_norm': 0.24533446552736418, 'learning_rate': 4.6089868763207756e-08, 'epoch': 0.96} 96%|█████████▌| 21169/22095 [36:38:02<1:43:26, 6.70s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the 
conversation 96%|█████████▌| 21170/22095 [36:38:05<1:29:51, 5.83s/it] {'loss': 0.3292, 'grad_norm': 0.927201016908863, 'learning_rate': 4.5990636271211474e-08, 'epoch': 0.96} 96%|█████████▌| 21170/22095 [36:38:05<1:29:51, 5.83s/it] 96%|█████████▌| 21171/22095 [36:38:09<1:21:59, 5.32s/it] {'loss': 0.2917, 'grad_norm': 0.5815065294128933, 'learning_rate': 4.58915102256613e-08, 'epoch': 0.96} 96%|█████████▌| 21171/22095 [36:38:09<1:21:59, 5.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (61292 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (62809 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (76034 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21172/22095 [36:38:19<1:40:23, 6.53s/it] {'loss': 0.4839, 'grad_norm': 0.28944106861197766, 'learning_rate': 4.5792490628687734e-08, 'epoch': 0.96} 96%|█████████▌| 21172/22095 [36:38:19<1:40:23, 6.53s/it] 96%|█████████▌| 21173/22095 [36:38:22<1:25:18, 5.55s/it] {'loss': 0.2879, 'grad_norm': 0.5599205054206638, 'learning_rate': 4.569357748241743e-08, 'epoch': 0.96} 96%|█████████▌| 21173/22095 [36:38:22<1:25:18, 5.55s/it] 96%|█████████▌| 21174/22095 [36:38:25<1:14:38, 4.86s/it] {'loss': 0.2428, 'grad_norm': 0.7083504678979596, 'learning_rate': 4.55947707889759e-08, 'epoch': 0.96} 96%|█████████▌| 21174/22095 [36:38:25<1:14:38, 4.86s/it] 96%|█████████▌| 21175/22095 [36:38:29<1:08:14, 4.45s/it] {'loss': 0.3069, 'grad_norm': 0.5732163183389878, 'learning_rate': 4.549607055048699e-08, 'epoch': 0.96} 96%|█████████▌| 21175/22095 [36:38:29<1:08:14, 4.45s/it] 96%|█████████▌| 21176/22095 [36:38:32<1:01:08, 3.99s/it] {'loss': 0.2736, 'grad_norm': 0.5684435449877221, 'learning_rate': 4.539747676907069e-08, 'epoch': 0.96} 96%|█████████▌| 21176/22095 [36:38:32<1:01:08, 3.99s/it] 96%|█████████▌| 21177/22095 [36:38:35<56:54, 3.72s/it] {'loss': 0.2662, 'grad_norm': 0.7016450015645664, 'learning_rate': 4.529898944684585e-08, 'epoch': 0.96} 96%|█████████▌| 21177/22095 [36:38:35<56:54, 3.72s/it] 96%|█████████▌| 21178/22095 [36:38:38<54:58, 3.60s/it] {'loss': 0.2779, 'grad_norm': 0.6133835173395702, 'learning_rate': 4.5200608585928566e-08, 'epoch': 0.96} 96%|█████████▌| 21178/22095 [36:38:38<54:58, 3.60s/it] 96%|█████████▌| 21179/22095 [36:38:41<53:29, 3.50s/it] {'loss': 0.2647, 'grad_norm': 0.8378032151828722, 'learning_rate': 4.510233418843213e-08, 'epoch': 0.96} 96%|█████████▌| 21179/22095 [36:38:41<53:29, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21180/22095 [36:38:52<1:26:00, 5.64s/it] {'loss': 0.4614, 'grad_norm': 0.3407268560380774, 
'learning_rate': 4.5004166256469305e-08, 'epoch': 0.96} 96%|█████████▌| 21180/22095 [36:38:52<1:26:00, 5.64s/it] 96%|█████████▌| 21181/22095 [36:38:56<1:18:33, 5.16s/it] {'loss': 0.2638, 'grad_norm': 0.6134658495980012, 'learning_rate': 4.490610479214841e-08, 'epoch': 0.96} 96%|█████████▌| 21181/22095 [36:38:56<1:18:33, 5.16s/it] 96%|█████████▌| 21182/22095 [36:38:59<1:08:14, 4.48s/it] {'loss': 0.3505, 'grad_norm': 0.6981064429014657, 'learning_rate': 4.480814979757719e-08, 'epoch': 0.96} 96%|█████████▌| 21182/22095 [36:38:59<1:08:14, 4.48s/it] 96%|█████████▌| 21183/22095 [36:39:03<1:04:32, 4.25s/it] {'loss': 0.2621, 'grad_norm': 0.5817051446699949, 'learning_rate': 4.471030127486009e-08, 'epoch': 0.96} 96%|█████████▌| 21183/22095 [36:39:03<1:04:32, 4.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (62121 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (79126 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21184/22095 [36:39:06<58:57, 3.88s/it] {'loss': 0.2656, 'grad_norm': 0.6006523021407582, 'learning_rate': 4.461255922609986e-08, 'epoch': 0.96} 96%|█████████▌| 21184/22095 [36:39:06<58:57, 3.88s/it] 96%|█████████▌| 21185/22095 [36:39:09<54:59, 3.63s/it] {'loss': 0.2884, 'grad_norm': 0.6019792523072536, 'learning_rate': 4.451492365339594e-08, 'epoch': 0.96} 96%|█████████▌| 21185/22095 [36:39:09<54:59, 3.63s/it] 96%|█████████▌| 21186/22095 [36:39:12<54:47, 3.62s/it] {'loss': 0.3031, 'grad_norm': 0.6145690423990191, 'learning_rate': 4.4417394558846636e-08, 'epoch': 0.96} 96%|█████████▌| 21186/22095 [36:39:12<54:47, 3.62s/it] 96%|█████████▌| 21187/22095 [36:39:16<53:19, 3.52s/it] {'loss': 0.355, 'grad_norm': 0.6242685469563993, 'learning_rate': 4.431997194454807e-08, 'epoch': 0.96} 96%|█████████▌| 21187/22095 [36:39:16<53:19, 3.52s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51896 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50458 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21188/22095 [36:39:21<1:00:41, 4.01s/it] {'loss': 0.2737, 'grad_norm': 0.6240215002808371, 'learning_rate': 4.4222655812592995e-08, 'epoch': 0.96} 96%|█████████▌| 21188/22095 [36:39:21<1:00:41, 4.01s/it] 96%|█████████▌| 21189/22095 [36:39:24<55:49, 3.70s/it] {'loss': 0.3129, 'grad_norm': 0.6140724268164072, 'learning_rate': 4.412544616507253e-08, 'epoch': 0.96} 96%|█████████▌| 21189/22095 [36:39:24<55:49, 3.70s/it] 96%|█████████▌| 21190/22095 [36:39:27<52:23, 3.47s/it] {'loss': 0.2695, 'grad_norm': 0.8783380764141445, 'learning_rate': 4.402834300407499e-08, 'epoch': 0.96} 96%|█████████▌| 21190/22095 [36:39:27<52:23, 3.47s/it] 96%|█████████▌| 21191/22095 [36:39:30<52:43, 3.50s/it] {'loss': 0.3249, 'grad_norm': 0.5972526903824302, 'learning_rate': 4.3931346331688165e-08, 'epoch': 0.96} 96%|█████████▌| 21191/22095 [36:39:30<52:43, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21192/22095 [36:39:39<1:15:53, 5.04s/it] {'loss': 0.4751, 'grad_norm': 0.25982751541373955, 'learning_rate': 4.383445614999426e-08, 'epoch': 0.96} 96%|█████████▌| 21192/22095 [36:39:39<1:15:53, 5.04s/it] 96%|█████████▌| 21193/22095 [36:39:42<1:07:06, 4.46s/it] {'loss': 0.2567, 'grad_norm': 0.6209074527492965, 'learning_rate': 4.373767246107718e-08, 'epoch': 0.96} 96%|█████████▌| 21193/22095 [36:39:42<1:07:06, 4.46s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43258 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44549 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (82602 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44451 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57776 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (133267 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43868 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21194/22095 [36:39:45<1:00:41, 4.04s/it] {'loss': 0.2274, 'grad_norm': 0.6748164420339022, 'learning_rate': 4.3640995267014704e-08, 'epoch': 0.96} 96%|█████████▌| 21194/22095 [36:39:45<1:00:41, 4.04s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [6, 14, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8400529 in VC:s3://internvl-moe-sft-data/. Exception: Image size [6, 14, 100, 100] is too small. Minimum size is 28. 
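The recurring "Invalidate trace cache @ step 2: expected module 364, but got module 1" lines come from DeepSpeed ZeRO-3's parameter prefetcher: it caches the order in which submodules execute so it can prefetch partitioned weights ahead of use, and when the forward graph differs between steps (here, plausibly batches that exercise the vision tower versus text-only batches), the cached trace is discarded and rebuilt. It is a performance warning, not an error. For reference, a sketch of the `zero_optimization` knobs that govern this prefetching — the concrete values below are placeholders, not recommendations:

```python
# Sketch of the ZeRO-3 settings behind the "Invalidate trace cache" warning.
zero_optimization = {
    "stage": 3,
    # Elements prefetched ahead of use; 0 disables prefetching
    # (and with it the trace-dependent speedup).
    "stage3_prefetch_bucket_size": 5e8,
    # Max distance between two uses of a parameter before it is
    # re-partitioned in between.
    "stage3_max_reuse_distance": 1e9,
}
```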
Problematic sample: {'id': 2690, 'image': 'vrdu_table_final_2/astro-ph.CO/a7273bdf-7d69-4d48-981d-d149d665104a.png', 'image_wh': [[6, 14]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{@{}lccll}\n :\n \\end{tabular}\n```"}]} 96%|█████████▌| 21195/22095 [36:39:49<1:00:40, 4.05s/it] {'loss': 0.2833, 'grad_norm': 0.6207451339187819, 'learning_rate': 4.354442456988517e-08, 'epoch': 0.96} 96%|█████████▌| 21195/22095 [36:39:49<1:00:40, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21196/22095 [36:39:59<1:24:39, 5.65s/it] {'loss': 0.4636, 'grad_norm': 0.2598314089164235, 'learning_rate': 4.3447960371763575e-08, 'epoch': 0.96} 96%|█████████▌| 21196/22095 [36:39:59<1:24:39, 5.65s/it] 96%|█████████▌| 21197/22095 [36:40:02<1:14:23, 4.97s/it] {'loss': 0.3045, 'grad_norm': 0.6280994414834518, 'learning_rate': 4.335160267472216e-08, 'epoch': 0.96} 96%|█████████▌| 21197/22095 [36:40:02<1:14:23, 4.97s/it] 96%|█████████▌| 21198/22095 [36:40:05<1:07:15, 4.50s/it] {'loss': 0.3126, 'grad_norm': 0.6668469020947756, 'learning_rate': 4.325535148083204e-08, 'epoch': 0.96} 96%|█████████▌| 21198/22095 [36:40:05<1:07:15, 4.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79844 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46328 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21199/22095 [36:40:08<1:00:32, 4.05s/it] {'loss': 0.2935, 'grad_norm': 0.7184303504115127, 'learning_rate': 4.3159206792160455e-08, 'epoch': 0.96} 96%|█████████▌| 21199/22095 [36:40:08<1:00:32, 4.05s/it] 96%|█████████▌| 21200/22095 [36:40:11<56:05, 3.76s/it] {'loss': 0.2885, 'grad_norm': 0.6537703551533632, 'learning_rate': 4.3063168610774084e-08, 'epoch': 0.96} 96%|█████████▌| 21200/22095 [36:40:11<56:05, 3.76s/it] 96%|█████████▌| 21201/22095 [36:40:14<52:30, 3.52s/it] {'loss': 0.2797, 'grad_norm': 0.6323361763200717, 'learning_rate': 4.2967236938735725e-08, 'epoch': 0.96} 96%|█████████▌| 21201/22095 [36:40:14<52:30, 3.52s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (41509 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57116 > 40960). 
Running this sequence through the model will result in indexing errors 96%|█████████▌| 21202/22095 [36:40:24<1:19:09, 5.32s/it] {'loss': 0.4609, 'grad_norm': 0.2735815182206019, 'learning_rate': 4.287141177810761e-08, 'epoch': 0.96} 96%|█████████▌| 21202/22095 [36:40:24<1:19:09, 5.32s/it] 96%|█████████▌| 21203/22095 [36:40:28<1:13:56, 4.97s/it] {'loss': 0.3092, 'grad_norm': 0.6295577981996854, 'learning_rate': 4.2775693130948094e-08, 'epoch': 0.96} 96%|█████████▌| 21203/22095 [36:40:28<1:13:56, 4.97s/it] 96%|█████████▌| 21204/22095 [36:40:32<1:07:28, 4.54s/it] {'loss': 0.2894, 'grad_norm': 0.5876943016991591, 'learning_rate': 4.268008099931387e-08, 'epoch': 0.96} 96%|█████████▌| 21204/22095 [36:40:32<1:07:28, 4.54s/it] 96%|█████████▌| 21205/22095 [36:40:35<1:01:02, 4.12s/it] {'loss': 0.2857, 'grad_norm': 0.5478724164381471, 'learning_rate': 4.25845753852594e-08, 'epoch': 0.96} 96%|█████████▌| 21205/22095 [36:40:35<1:01:02, 4.12s/it] 96%|█████████▌| 21206/22095 [36:40:38<56:37, 3.82s/it] {'loss': 0.2725, 'grad_norm': 0.7570418593038322, 'learning_rate': 4.248917629083693e-08, 'epoch': 0.96} 96%|█████████▌| 21206/22095 [36:40:38<56:37, 3.82s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▌| 21207/22095 [36:40:47<1:19:31, 5.37s/it] {'loss': 0.4692, 'grad_norm': 0.2699582044210566, 'learning_rate': 4.2393883718096495e-08, 'epoch': 0.96} 96%|█████████▌| 21207/22095 [36:40:47<1:19:31, 5.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▌| 21208/22095 [36:40:50<1:09:23, 4.69s/it] {'loss': 0.2818, 'grad_norm': 0.6258507229904574, 'learning_rate': 4.2298697669084785e-08, 'epoch': 0.96} 96%|█████████▌| 21208/22095 [36:40:50<1:09:23, 4.69s/it] 96%|█████████▌| 21209/22095 [36:40:53<1:02:50, 4.26s/it] {'loss': 0.3102, 'grad_norm': 
0.6338652523526765, 'learning_rate': 4.2203618145847946e-08, 'epoch': 0.96} 96%|█████████▌| 21209/22095 [36:40:53<1:02:50, 4.26s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▌| 21210/22095 [36:41:01<1:18:13, 5.30s/it] {'loss': 0.4548, 'grad_norm': 0.25974894190440184, 'learning_rate': 4.210864515042878e-08, 'epoch': 0.96} 96%|█████████▌| 21210/22095 [36:41:01<1:18:13, 5.30s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▌| 21211/22095 [36:41:05<1:13:53, 5.02s/it] {'loss': 0.297, 'grad_norm': 0.6166636330929279, 'learning_rate': 4.2013778684867335e-08, 'epoch': 0.96} 96%|█████████▌| 21211/22095 [36:41:05<1:13:53, 5.02s/it] 96%|█████████▌| 21212/22095 [36:41:09<1:06:09, 4.50s/it] {'loss': 0.3011, 'grad_norm': 0.5978516768881983, 'learning_rate': 4.191901875120308e-08, 'epoch': 0.96} 96%|█████████▌| 21212/22095 [36:41:09<1:06:09, 4.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (57606 > 40960). 
Running this sequence through the model will result in indexing errors
96%|█████████▌| 21213/22095 [36:41:11<58:51, 4.00s/it] {'loss': 0.3409, 'grad_norm': 0.6019325817783381, 'learning_rate': 4.182436535147105e-08, 'epoch': 0.96}
96%|█████████▌| 21214/22095 [36:41:14<54:46, 3.73s/it] {'loss': 0.2794, 'grad_norm': 0.6240142157887236, 'learning_rate': 4.1729818487706297e-08, 'epoch': 0.96}
96%|█████████▌| 21215/22095 [36:41:18<53:09, 3.62s/it] {'loss': 0.3074, 'grad_norm': 0.6209004645487891, 'learning_rate': 4.163537816193885e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▌| 21216/22095 [36:41:21<50:11, 3.43s/it] {'loss': 0.2902, 'grad_norm': 0.5959644019366294, 'learning_rate': 4.154104437619877e-08, 'epoch': 0.96}
96%|█████████▌| 21217/22095 [36:41:24<47:47, 3.27s/it] {'loss': 0.2721, 'grad_norm': 0.6120197629494267, 'learning_rate': 4.144681713251275e-08, 'epoch': 0.96}
96%|█████████▌| 21218/22095 [36:41:28<51:34, 3.53s/it] {'loss': 0.2833, 'grad_norm': 0.6320329580407882, 'learning_rate': 4.1352696432906405e-08, 'epoch': 0.96}
96%|█████████▌| 21219/22095 [36:41:31<51:34, 3.53s/it] {'loss': 0.3063, 'grad_norm': 0.5909436023906957, 'learning_rate': 4.125868227940033e-08, 'epoch': 0.96}
96%|█████████▌| 21220/22095 [36:41:35<50:08, 3.44s/it] {'loss': 0.3018, 'grad_norm': 0.5730815741653685, 'learning_rate': 4.116477467401625e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21221/22095 [36:41:44<1:17:03, 5.29s/it] {'loss': 0.4632, 'grad_norm': 0.26835348997717295, 'learning_rate': 4.107097361877088e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (53562 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (55966 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107934 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▌| 21222/22095 [36:41:48<1:10:07, 4.82s/it] {'loss': 0.3173, 'grad_norm': 0.63553753421761, 'learning_rate': 4.097727911568039e-08, 'epoch': 0.96}
96%|█████████▌| 21223/22095 [36:41:52<1:05:53, 4.53s/it] {'loss': 0.2952, 'grad_norm': 0.6397865086894752, 'learning_rate': 4.088369116675761e-08, 'epoch': 0.96}
96%|█████████▌| 21224/22095 [36:41:55<59:05, 4.07s/it] {'loss': 0.2655, 'grad_norm': 0.599380152021612, 'learning_rate': 4.0790209774013156e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▌| 21225/22095 [36:41:59<59:28, 4.10s/it] {'loss': 0.2557, 'grad_norm': 0.5637309960267389, 'learning_rate': 4.069683493945598e-08, 'epoch': 0.96}
96%|█████████▌| 21226/22095 [36:42:02<54:48, 3.78s/it] {'loss': 0.298, 'grad_norm': 0.6125425675800832, 'learning_rate': 4.060356666509335e-08, 'epoch': 0.96}
96%|█████████▌| 21227/22095 [36:42:05<50:47, 3.51s/it] {'loss': 0.3023, 'grad_norm': 0.6476639084705905, 'learning_rate': 4.051040495292757e-08, 'epoch': 0.96}
96%|█████████▌| 21228/22095 [36:42:08<49:43, 3.44s/it] {'loss': 0.3043, 'grad_norm': 0.5677392671931741, 'learning_rate': 4.041734980496148e-08, 'epoch': 0.96}
96%|█████████▌| 21229/22095 [36:42:14<1:00:22, 4.18s/it] {'loss': 0.2896, 'grad_norm': 0.6092973251190281, 'learning_rate': 4.032440122319459e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (71758 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44198 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (61387 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41035 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▌| 21230/22095 [36:42:24<1:23:08, 5.77s/it] {'loss': 0.4789, 'grad_norm': 0.26849380700297054, 'learning_rate': 4.0231559209624185e-08, 'epoch': 0.96}
96%|█████████▌| 21231/22095 [36:42:28<1:15:35, 5.25s/it] {'loss': 0.3167, 'grad_norm': 1.0232158053660931, 'learning_rate': 4.013882376624423e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21232/22095 [36:42:33<1:15:45, 5.27s/it] {'loss': 0.456, 'grad_norm': 0.30389324861319167, 'learning_rate': 4.004619489504813e-08, 'epoch': 0.96}
96%|█████████▌| 21233/22095 [36:42:36<1:06:30, 4.63s/it] {'loss': 0.3061, 'grad_norm': 0.6564086381621276, 'learning_rate': 3.995367259802596e-08, 'epoch': 0.96}
96%|█████████▌| 21234/22095 [36:42:39<1:01:03, 4.25s/it] {'loss': 0.2643, 'grad_norm': 0.6095148878709185, 'learning_rate': 3.986125687716558e-08, 'epoch': 0.96}
96%|█████████▌| 21235/22095 [36:42:43<1:00:10, 4.20s/it] {'loss': 0.3019, 'grad_norm': 0.6100252027921181, 'learning_rate': 3.976894773445261e-08, 'epoch': 0.96}
96%|█████████▌| 21236/22095 [36:42:47<55:08, 3.85s/it] {'loss': 0.2719, 'grad_norm': 0.5839556175104145, 'learning_rate': 3.967674517187159e-08, 'epoch': 0.96}
96%|█████████▌| 21237/22095 [36:42:50<55:23, 3.87s/it] {'loss': 0.3032, 'grad_norm': 0.5543483765055761, 'learning_rate': 3.9584649191402034e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8396692 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 63543, 'image': 'vrdu_table_final_2/astro-ph.EP/5f862c51-72b4-4cd8-86ba-61a3b310a190.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease convert the depicted table in the image into LaTeX format.'}, {'from': 'gpt', 'value': "Of course, I'll convert the depicted table in the image into LaTeX format for you.\n```latex\n\\begin{tabular}[t]{@{}l@{}}#1\\end{tabular}\n```"}]} 96%|█████████▌| 21238/22095 [36:43:00<1:19:10, 5.54s/it] {'loss': 0.4663, 'grad_norm': 0.2859477139223187, 'learning_rate': 3.9492659795024035e-08, 'epoch': 0.96} 96%|█████████▌| 21238/22095 [36:43:00<1:19:10, 5.54s/it] 96%|█████████▌| 21239/22095 [36:43:07<1:26:48, 6.08s/it] {'loss': 0.4635, 'grad_norm': 0.2531076016653137, 'learning_rate': 3.940077698471378e-08, 'epoch': 0.96} 96%|█████████▌| 21239/22095 [36:43:07<1:26:48, 6.08s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 96%|█████████▌| 21240/22095 [36:43:11<1:16:43, 5.38s/it] {'loss': 0.3218, 'grad_norm': 0.6157171437043197, 'learning_rate': 3.930900076244526e-08, 'epoch': 0.96} 96%|█████████▌| 21240/22095 [36:43:11<1:16:43, 5.38s/it] 96%|█████████▌| 21241/22095 [36:43:14<1:08:31, 4.81s/it] {'loss': 0.2452, 'grad_norm': 0.6477974935783631, 'learning_rate': 3.921733113019077e-08, 'epoch': 0.96} 96%|█████████▌| 21241/22095 [36:43:14<1:08:31, 4.81s/it] 96%|█████████▌| 
21242/22095 [36:43:18<1:03:30, 4.47s/it] {'loss': 0.3111, 'grad_norm': 0.6188385285914921, 'learning_rate': 3.912576808991986e-08, 'epoch': 0.96} 96%|█████████▌| 21242/22095 [36:43:18<1:03:30, 4.47s/it] 96%|█████████▌| 21243/22095 [36:43:22<1:01:01, 4.30s/it] {'loss': 0.3082, 'grad_norm': 0.6520923746735432, 'learning_rate': 3.903431164360094e-08, 'epoch': 0.96} 96%|█████████▌| 21243/22095 [36:43:22<1:01:01, 4.30s/it] 96%|█████████▌| 21244/22095 [36:43:25<55:25, 3.91s/it] {'loss': 0.2757, 'grad_norm': 0.6093413924671904, 'learning_rate': 3.8942961793197456e-08, 'epoch': 0.96} 96%|█████████▌| 21244/22095 [36:43:25<55:25, 3.91s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (104418 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42157 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▌| 21245/22095 [36:43:28<52:02, 3.67s/it] {'loss': 0.2801, 'grad_norm': 0.6180322542059126, 'learning_rate': 3.885171854067282e-08, 'epoch': 0.96} 96%|█████████▌| 21245/22095 [36:43:28<52:02, 3.67s/it] 96%|█████████▌| 21246/22095 [36:43:32<53:36, 3.79s/it] {'loss': 0.2981, 'grad_norm': 0.6173825785972774, 'learning_rate': 3.8760581887987706e-08, 'epoch': 0.96} 96%|█████████▌| 21246/22095 [36:43:32<53:36, 3.79s/it] 96%|█████████▌| 21247/22095 [36:43:36<51:32, 3.65s/it] {'loss': 0.281, 'grad_norm': 0.6535693212898909, 'learning_rate': 3.866955183710108e-08, 'epoch': 0.96} 96%|█████████▌| 21247/22095 [36:43:36<51:32, 3.65s/it] 96%|█████████▌| 21248/22095 [36:43:39<51:14, 3.63s/it] {'loss': 0.2282, 'grad_norm': 0.6009252696082468, 'learning_rate': 3.857862838996751e-08, 'epoch': 0.96} 96%|█████████▌| 21248/22095 [36:43:39<51:14, 3.63s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (111181 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43166 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▌| 21249/22095 [36:43:43<51:25, 3.65s/it] {'loss': 0.3151, 'grad_norm': 0.7029568060532387, 'learning_rate': 3.8487811548542086e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21250/22095 [36:43:52<1:15:17, 5.35s/it] {'loss': 0.4705, 'grad_norm': 0.27285242924989755, 'learning_rate': 3.839710131477492e-08, 'epoch': 0.96}
96%|█████████▌| 21251/22095 [36:43:56<1:09:31, 4.94s/it] {'loss': 0.3379, 'grad_norm': 0.6684912899665435, 'learning_rate': 3.8306497690615564e-08, 'epoch': 0.96}
96%|█████████▌| 21252/22095 [36:43:59<1:01:24, 4.37s/it] {'loss': 0.2976, 'grad_norm': 0.5495280270916768, 'learning_rate': 3.8216000678011344e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21253/22095 [36:44:08<1:22:03, 5.85s/it] {'loss': 0.4836, 'grad_norm': 0.2520299691807092, 'learning_rate': 3.812561027890571e-08, 'epoch': 0.96}
96%|█████████▌| 21254/22095 [36:44:12<1:10:51, 5.06s/it] {'loss': 0.2578, 'grad_norm': 0.559546923586854, 'learning_rate': 3.8035326495242106e-08, 'epoch': 0.96}
96%|█████████▌| 21255/22095 [36:44:15<1:04:02, 4.57s/it] {'loss': 0.3164, 'grad_norm': 0.6358503324588595, 'learning_rate': 3.794514932895954e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▌| 21256/22095 [36:44:18<57:16, 4.10s/it] {'loss': 0.2585, 'grad_norm': 0.5869712565273543, 'learning_rate': 3.78550787819959e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21257/22095 [36:44:28<1:20:02, 5.73s/it] {'loss': 0.4614, 'grad_norm': 0.25232073452894843, 'learning_rate': 3.7765114856286866e-08, 'epoch': 0.96}
96%|█████████▌| 21258/22095 [36:44:31<1:11:28, 5.12s/it] {'loss': 0.2838, 'grad_norm': 0.6596492126163388, 'learning_rate': 3.7675257553764224e-08, 'epoch': 0.96}
96%|█████████▌| 21259/22095 [36:44:35<1:04:55, 4.66s/it] {'loss': 0.312, 'grad_norm': 0.625749386386828, 'learning_rate': 3.7585506876360865e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▌| 21260/22095 [36:44:43<1:18:17, 5.63s/it] {'loss': 0.4784, 'grad_norm': 0.2912275382564058, 'learning_rate': 3.749586282600359e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▌| 21261/22095 [36:44:46<1:09:26, 5.00s/it] {'loss': 0.2934, 'grad_norm': 0.7980256338173004, 'learning_rate': 3.740632540461864e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (86093 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▌| 21262/22095 [36:44:49<1:00:38, 4.37s/it] {'loss': 0.2908, 'grad_norm': 0.5957946792402637, 'learning_rate': 3.731689461413113e-08, 'epoch': 0.96}
96%|█████████▌| 21263/22095 [36:44:52<54:20, 3.92s/it] {'loss': 0.2563, 'grad_norm': 0.6516521315572986, 'learning_rate': 3.7227570456461194e-08, 'epoch': 0.96}
96%|█████████▌| 21264/22095 [36:44:56<53:52, 3.89s/it] {'loss': 0.3169, 'grad_norm': 0.6167601726247165, 'learning_rate': 3.7138352933528965e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▌| 21265/22095 [36:45:00<54:53, 3.97s/it] {'loss': 0.2906, 'grad_norm': 0.6262162319203259, 'learning_rate': 3.70492420472518e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (101812 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▌| 21266/22095 [36:45:04<53:29, 3.87s/it] {'loss': 0.2909, 'grad_norm': 0.6135124490731227, 'learning_rate': 3.6960237799543166e-08, 'epoch': 0.96}
96%|█████████▋| 21267/22095 [36:45:08<53:20, 3.86s/it] {'loss': 0.2873, 'grad_norm': 0.7785797765189498, 'learning_rate': 3.6871340192315974e-08, 'epoch': 0.96}
96%|█████████▋| 21268/22095 [36:45:11<52:19, 3.80s/it] {'loss': 0.2909, 'grad_norm': 0.6334405969382095, 'learning_rate': 3.6782549227481476e-08, 'epoch': 0.96}
96%|█████████▋| 21269/22095 [36:45:15<51:45, 3.76s/it] {'loss': 0.268, 'grad_norm': 0.5805036307580624, 'learning_rate': 3.669386490694593e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (50657 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80649 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▋| 21270/22095 [36:45:22<1:07:15, 4.89s/it] {'loss': 0.4674, 'grad_norm': 0.2630489032207125, 'learning_rate': 3.6605287232616137e-08, 'epoch': 0.96}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8955482 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6317, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2\nB. 2.5\nC. 4.5\nD. 7'}, {'from': 'gpt', 'value': '【解答】解:∵AB=9,BC=5,∴AC=AB+BC=14,∵D为线段AC的中点,∴AD=\\frac{1}{2}AC=7,'}]}
96%|█████████▋| 21271/22095 [36:45:26<1:00:16, 4.39s/it] {'loss': 0.2704, 'grad_norm': 0.6354005481006761, 'learning_rate': 3.651681620639447e-08, 'epoch': 0.96}
96%|█████████▋| 21272/22095 [36:45:29<54:36, 3.98s/it] {'loss': 0.2801, 'grad_norm': 0.6971951264532625, 'learning_rate': 3.642845183018273e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (83114 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (41656 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64291 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48533 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▋| 21273/22095 [36:45:32<52:48, 3.86s/it] {'loss': 0.2253, 'grad_norm': 0.618233409724753, 'learning_rate': 3.63401941058783e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▋| 21274/22095 [36:45:39<1:06:36, 4.87s/it] {'loss': 0.4863, 'grad_norm': 0.2852131682171573, 'learning_rate': 3.625204303537855e-08, 'epoch': 0.96}
96%|█████████▋| 21275/22095 [36:45:43<1:00:42, 4.44s/it] {'loss': 0.3192, 'grad_norm': 0.6274027089335161, 'learning_rate': 3.6163998620578065e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (128520000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
96%|█████████▋| 21276/22095 [36:45:46<54:22, 3.98s/it] {'loss': 0.2792, 'grad_norm': 0.639063695040162, 'learning_rate': 3.6076060863367565e-08, 'epoch': 0.96}
96%|█████████▋| 21277/22095 [36:45:50<54:15, 3.98s/it] {'loss': 0.3416, 'grad_norm': 0.657216145416747, 'learning_rate': 3.598822976563665e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▋| 21278/22095 [36:45:56<1:01:49, 4.54s/it] {'loss': 0.4811, 'grad_norm': 0.314988580497501, 'learning_rate': 3.5900505329273804e-08, 'epoch': 0.96}
96%|█████████▋| 21279/22095 [36:45:59<58:39, 4.31s/it] {'loss': 0.2978, 'grad_norm': 0.6421338261470828, 'learning_rate': 3.581288755616197e-08, 'epoch': 0.96}
96%|█████████▋| 21280/22095 [36:46:03<57:07, 4.21s/it] {'loss': 0.3207, 'grad_norm': 0.6104777277721297, 'learning_rate': 3.5725376448185744e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49829 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102504 > 40960).
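[Editor's note] The `DecompressionBombWarning` above fires because one image (128,520,000 pixels) exceeds Pillow's default `MAX_IMAGE_PIXELS` threshold of 89,478,485 pixels. One mitigation is to downscale oversized images up front so decoding stays under the limit. A sketch of the target-size arithmetic only (no Pillow dependency; the threshold constant mirrors the value in the warning, and `capped_size` is an illustrative helper, not part of the training code):

```python
PIL_DEFAULT_LIMIT = 89_478_485  # pixels, as reported in the warning above

def capped_size(w: int, h: int, max_pixels: int = PIL_DEFAULT_LIMIT) -> tuple[int, int]:
    """Return (w, h) scaled down so w*h <= max_pixels, preserving aspect ratio."""
    if w * h <= max_pixels:
        return w, h
    scale = (max_pixels / (w * h)) ** 0.5
    return max(1, int(w * scale)), max(1, int(h * scale))
```

Raising `PIL.Image.MAX_IMAGE_PIXELS` instead would silence the warning, but for untrusted data a hard downscale is the safer default.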
Running this sequence through the model will result in indexing errors
96%|█████████▋| 21281/22095 [36:46:11<1:10:12, 5.17s/it] {'loss': 0.4712, 'grad_norm': 0.25787119378948375, 'learning_rate': 3.563797200722363e-08, 'epoch': 0.96}
96%|█████████▋| 21282/22095 [36:46:14<1:04:00, 4.72s/it] {'loss': 0.2714, 'grad_norm': 0.6496668162613294, 'learning_rate': 3.555067423515523e-08, 'epoch': 0.96}
96%|█████████▋| 21283/22095 [36:46:19<1:01:27, 4.54s/it] {'loss': 0.3019, 'grad_norm': 0.5899650319063284, 'learning_rate': 3.5463483133855726e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▋| 21284/22095 [36:46:22<55:55, 4.14s/it] {'loss': 0.2778, 'grad_norm': 0.6227285001305403, 'learning_rate': 3.5376398705198603e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (69754 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▋| 21285/22095 [36:46:25<50:36, 3.75s/it] {'loss': 0.2984, 'grad_norm': 0.5848115284332623, 'learning_rate': 3.5289420951055145e-08, 'epoch': 0.96}
96%|█████████▋| 21286/22095 [36:46:28<48:29, 3.60s/it] {'loss': 0.3182, 'grad_norm': 0.613686346774837, 'learning_rate': 3.5202549873293304e-08, 'epoch': 0.96}
96%|█████████▋| 21287/22095 [36:46:31<46:09, 3.43s/it] {'loss': 0.3004, 'grad_norm': 0.5834137635129636, 'learning_rate': 3.5115785473781026e-08, 'epoch': 0.96}
96%|█████████▋| 21288/22095 [36:46:34<45:04, 3.35s/it] {'loss': 0.2664, 'grad_norm': 0.6097829559220829, 'learning_rate': 3.502912775438183e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▋| 21289/22095 [36:46:43<1:06:05, 4.92s/it] {'loss': 0.4597, 'grad_norm': 0.25201403422770474, 'learning_rate': 3.494257671695811e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▋| 21290/22095 [36:46:46<59:26, 4.43s/it] {'loss': 0.3064, 'grad_norm': 0.6905157574890572, 'learning_rate': 3.4856132363369485e-08, 'epoch': 0.96}
96%|█████████▋| 21291/22095 [36:46:50<57:46, 4.31s/it] {'loss': 0.269, 'grad_norm': 1.02299639303271, 'learning_rate': 3.476979469547337e-08, 'epoch': 0.96}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
"/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8308555 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'TB2hsEndInI8KJjSspeXXcwIpXa_!!2832689975.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nPlease decode and provide the text from this image.'}, {'from': 'gpt', 'value': 'All words in the image:\n星巴克\n同款挂式黑板\n榉木框\n西巴克黑\n咖啡框'}]} 96%|█████████▋| 21292/22095 [36:46:53<52:00, 3.89s/it] {'loss': 0.3031, 'grad_norm': 0.6560738269000224, 'learning_rate': 3.468356371512438e-08, 'epoch': 0.96} 96%|█████████▋| 21292/22095 [36:46:53<52:00, 3.89s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 96%|█████████▋| 21293/22095 [36:47:02<1:14:04, 5.54s/it] {'loss': 0.4655, 'grad_norm': 0.26279930128979606, 'learning_rate': 3.459743942417604e-08, 'epoch': 0.96} 96%|█████████▋| 21293/22095 [36:47:02<1:14:04, 5.54s/it] 96%|█████████▋| 21294/22095 [36:47:06<1:06:23, 4.97s/it] {'loss': 0.2765, 'grad_norm': 0.6623621867401419, 'learning_rate': 3.451142182447908e-08, 'epoch': 0.96} 96%|█████████▋| 21294/22095 [36:47:06<1:06:23, 4.97s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49383 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48359 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (104873 > 40960). 
Running this sequence through the model will result in indexing errors
96%|█████████▋| 21295/22095 [36:47:09<59:33, 4.47s/it] {'loss': 0.266, 'grad_norm': 0.5857663907388148, 'learning_rate': 3.442551091788038e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (54017 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116220 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▋| 21296/22095 [36:47:12<53:23, 4.01s/it] {'loss': 0.3012, 'grad_norm': 0.6146602373853328, 'learning_rate': 3.4339706706227326e-08, 'epoch': 0.96}
96%|█████████▋| 21297/22095 [36:47:16<52:11, 3.92s/it] {'loss': 0.3118, 'grad_norm': 0.5996634569620617, 'learning_rate': 3.425400919136346e-08, 'epoch': 0.96}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▋| 21298/22095 [36:47:19<47:59, 3.61s/it] {'loss': 0.2982, 'grad_norm': 0.5527183377984016, 'learning_rate': 3.416841837512952e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (104482 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81773 > 40960).
Running this sequence through the model will result in indexing errors
96%|█████████▋| 21299/22095 [36:47:22<46:54, 3.54s/it] {'loss': 0.2939, 'grad_norm': 0.6135812886345209, 'learning_rate': 3.40829342593646e-08, 'epoch': 0.96}
96%|█████████▋| 21300/22095 [36:47:26<49:36, 3.74s/it] {'loss': 0.2768, 'grad_norm': 0.6100060335767012, 'learning_rate': 3.399755684590611e-08, 'epoch': 0.96}
96%|█████████▋| 21301/22095 [36:47:30<49:54, 3.77s/it] {'loss': 0.3029, 'grad_norm': 0.6862427360829833, 'learning_rate': 3.39122861365887e-08, 'epoch': 0.96}
96%|█████████▋| 21302/22095 [36:47:33<46:10, 3.49s/it] {'loss': 0.2559, 'grad_norm': 1.103074870883008, 'learning_rate': 3.382712213324313e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▋| 21303/22095 [36:47:40<1:01:37, 4.67s/it] {'loss': 0.4603, 'grad_norm': 0.27218569974804424, 'learning_rate': 3.374206483770071e-08, 'epoch': 0.96}
96%|█████████▋| 21304/22095 [36:47:50<1:19:29, 6.03s/it] {'loss': 0.4548, 'grad_norm': 0.2551901000735597, 'learning_rate': 3.365711425178886e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 364, but got module 1
96%|█████████▋| 21305/22095 [36:47:53<1:10:30, 5.36s/it] {'loss': 0.2514, 'grad_norm': 0.6019272256332595, 'learning_rate': 3.357227037733224e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (44547 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (114010 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (51211 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▋| 21306/22095 [36:47:57<1:02:26, 4.75s/it] {'loss': 0.2902, 'grad_norm': 0.5836409045668817, 'learning_rate': 3.3487533216154386e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
96%|█████████▋| 21307/22095 [36:48:05<1:15:52, 5.78s/it] {'loss': 0.4798, 'grad_norm': 0.3000098505611248, 'learning_rate': 3.340290277007607e-08, 'epoch': 0.96}
96%|█████████▋| 21308/22095 [36:48:09<1:07:39, 5.16s/it] {'loss': 0.3096, 'grad_norm': 0.6196779072403756, 'learning_rate': 3.3318379040915284e-08, 'epoch': 0.96}
96%|█████████▋| 21309/22095 [36:48:13<1:05:48, 5.02s/it] {'loss': 0.2708, 'grad_norm': 0.6049054192400112, 'learning_rate': 3.3233962030489453e-08, 'epoch': 0.96}
96%|█████████▋| 21310/22095 [36:48:17<1:01:10, 4.68s/it] {'loss': 0.299, 'grad_norm': 0.6443006566661341, 'learning_rate': 3.3149651740610464e-08, 'epoch': 0.96}
96%|█████████▋| 21311/22095 [36:48:20<53:57, 4.13s/it] {'loss': 0.2754, 'grad_norm': 0.5848403475285998, 'learning_rate': 3.3065448173091873e-08, 'epoch': 0.96}
96%|█████████▋| 21312/22095 [36:48:24<53:18, 4.08s/it] {'loss': 0.2824, 'grad_norm': 0.6277945205866138, 'learning_rate': 3.298135132974112e-08, 'epoch': 0.96}
96%|█████████▋| 21313/22095 [36:48:28<51:27, 3.95s/it] {'loss': 0.3334, 'grad_norm': 0.6249229890940834, 'learning_rate': 3.289736121236675e-08, 'epoch': 0.96}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
96%|█████████▋| 21314/22095 [36:48:37<1:13:54, 5.68s/it] {'loss': 0.4655, 'grad_norm': 0.23627308508275272, 'learning_rate': 3.2813477822772885e-08, 'epoch': 0.96}
96%|█████████▋| 21315/22095 [36:48:42<1:08:25, 5.26s/it] {'loss': 0.293, 'grad_norm': 0.6215681145266133, 'learning_rate': 3.2729701162760865e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (48148 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44754 > 40960). Running this sequence through the model will result in indexing errors
96%|█████████▋| 21316/22095 [36:48:45<1:00:09, 4.63s/it] {'loss': 0.2539, 'grad_norm': 0.6019029106587596, 'learning_rate': 3.264603123413257e-08, 'epoch': 0.96}
Token indices sequence length is longer than the specified maximum sequence length for this model (48090 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79227 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102903 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (69642 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42843 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (111711 > 40960). Running this sequence through the model will result in indexing errors 96%|█████████▋| 21317/22095 [36:48:49<56:46, 4.38s/it] {'loss': 0.298, 'grad_norm': 0.7327028825493089, 'learning_rate': 3.25624680386849e-08, 'epoch': 0.96} 96%|█████████▋| 21317/22095 [36:48:49<56:46, 4.38s/it] 96%|█████████▋| 21318/22095 [36:48:52<51:20, 3.97s/it] {'loss': 0.2691, 'grad_norm': 0.5796198943379359, 'learning_rate': 3.247901157821365e-08, 'epoch': 0.96} 96%|█████████▋| 21318/22095 [36:48:52<51:20, 3.97s/it] 96%|█████████▋| 21319/22095 [36:48:56<50:51, 3.93s/it] {'loss': 0.3111, 'grad_norm': 0.628252502714485, 'learning_rate': 3.2395661854511264e-08, 'epoch': 0.96} 96%|█████████▋| 21319/22095 [36:48:56<50:51, 3.93s/it] 96%|█████████▋| 21320/22095 [36:48:59<50:06, 3.88s/it] {'loss': 0.2912, 'grad_norm': 0.6548347401051914, 'learning_rate': 3.23124188693702e-08, 'epoch': 0.96} 96%|█████████▋| 21320/22095 [36:48:59<50:06, 3.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 96%|█████████▋| 21321/22095 [36:49:03<49:22, 3.83s/it] {'loss': 0.2759, 'grad_norm': 0.6173020789342651, 'learning_rate': 3.222928262457736e-08, 'epoch': 0.96} 96%|█████████▋| 21321/22095 [36:49:03<49:22, 3.83s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" 
in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [204, 26, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8918294 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [204, 26, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 41447, 'image': 'images/5464.png', 'image_wh': [[204, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点位于AB段,D点为AC的中点,如果Cd=4,AB=14,则BC长度为()\nA. 4\nB. 5\nC. 6\nD. 6.5'}, {'from': 'gpt', 'value': '【解答】解:∵点D是AC的中点,如果CD=4,∴AC=2CD=8∵AB=14∴BC=AB-AC=6'}]} 97%|█████████▋| 21322/22095 [36:49:06<45:26, 3.53s/it] {'loss': 0.3075, 'grad_norm': 0.6195357469478212, 'learning_rate': 3.2146253121920215e-08, 'epoch': 0.97} 97%|█████████▋| 21322/22095 [36:49:06<45:26, 3.53s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53086 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21323/22095 [36:49:09<43:46, 3.40s/it] {'loss': 0.2806, 'grad_norm': 0.6253934878655681, 'learning_rate': 3.2063330363182323e-08, 'epoch': 0.97} 97%|█████████▋| 21323/22095 [36:49:09<43:46, 3.40s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047104 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. 
Minimum size is 28. Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]} Traceback (most recent call last): Invalidate trace cache @ step 2: expected module 1, but got module 364 File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [211, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047668 in VC:s3://multi-modal/UniGeo/. Exception: Image size [211, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5302.png', 'image_wh': [[211, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点M是线段AB的中点,点N在线段MB上,若AB=12,AM:BN=3:1,则线段MN的长为()\nA. 6\nB. 5\nC. 4\nD. 
3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 97%|█████████▋| 21324/22095 [36:49:18<1:06:40, 5.19s/it] {'loss': 0.4556, 'grad_norm': 0.24519266579212162, 'learning_rate': 3.19805143501456e-08, 'epoch': 0.97} 97%|█████████▋| 21324/22095 [36:49:18<1:06:40, 5.19s/it] 97%|█████████▋| 21325/22095 [36:49:28<1:23:56, 6.54s/it] {'loss': 0.4621, 'grad_norm': 0.2714398967046984, 'learning_rate': 3.1897805084589726e-08, 'epoch': 0.97} 97%|█████████▋| 21325/22095 [36:49:28<1:23:56, 6.54s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 97%|█████████▋| 21326/22095 [36:49:31<1:11:59, 5.62s/it] {'loss': 0.2697, 'grad_norm': 0.7212628197749601, 'learning_rate': 3.1815202568291625e-08, 'epoch': 0.97} 97%|█████████▋| 21326/22095 [36:49:31<1:11:59, 5.62s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (43396 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46712 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52311 > 40960). 
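[Editor's note] The repeated "Token indices sequence length is longer than the specified maximum sequence length" warnings above come from tokenizing samples that exceed the model's 40960-token context window. A minimal sketch (not the training script's actual code; the function name and `(id, token_count)` input shape are illustrative) of screening such samples before they reach the model:

```python
# Hedged sketch: separate samples whose tokenized length exceeds the
# 40960-token context window reported in the warnings. Overlong samples
# would otherwise cause the "indexing errors" the tokenizer warns about.
MAX_SEQ_LEN = 40960  # model_max_length from the log

def split_by_length(samples, max_len=MAX_SEQ_LEN):
    """samples: iterable of (sample_id, token_count) pairs.

    Returns (kept_ids, dropped_ids); dropped ids are the overlong ones.
    """
    kept, dropped = [], []
    for sample_id, n_tokens in samples:
        # Route each sample by comparing its token count to the window.
        (dropped if n_tokens > max_len else kept).append(sample_id)
    return kept, dropped
```

In practice the count would come from something like `len(tokenizer(text)["input_ids"])`; this sketch only shows the filtering step itself.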
97%|█████████▋| 21327/22095 [36:49:35<1:03:16, 4.94s/it] {'loss': 0.2547, 'grad_norm': 0.5578222247668211, 'learning_rate': 3.173270680302598e-08, 'epoch': 0.97}
97%|█████████▋| 21328/22095 [36:49:39<1:00:09, 4.71s/it] {'loss': 0.321, 'grad_norm': 0.6141571513729598, 'learning_rate': 3.165031779056582e-08, 'epoch': 0.97}
97%|█████████▋| 21329/22095 [36:49:42<53:55, 4.22s/it] {'loss': 0.2925, 'grad_norm': 0.5825078984311327, 'learning_rate': 3.156803553268084e-08, 'epoch': 0.97}
97%|█████████▋| 21330/22095 [36:49:45<49:50, 3.91s/it] {'loss': 0.2951, 'grad_norm': 0.5824074331733144, 'learning_rate': 3.1485860031140183e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [239, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8515242 in VC:s3://internvl-moe-sft-data/. Exception: Image size [239, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 69893, 'image': 'vrdu_texteq/astro-ph.CO/661fa671-0a85-4975-a6f2-006b86044c18.png', 'image_wh': [[239, 23]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': 'where $N=\\ln{a}$ and'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047105 in VC:s3://multi-modal/UniGeo/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nA. 4cm\nB. 8cm\nC. 16cm\nD. 32cm'}, {'from': 'gpt', 'value': '∵点D是AC的中点,∴bc=\\frac{1}{2}AC,∵点E是BC的中点,∴CE=\\frac{1}{2}CB,∴DE=DC+CE=\\frac{1}{2}(AC+CB)=8cm,故选B'}]}
97%|█████████▋| 21331/22095 [36:49:55<1:11:32, 5.62s/it] {'loss': 0.4912, 'grad_norm': 0.26065467366216083, 'learning_rate': 3.1403791287707986e-08, 'epoch': 0.97}
97%|█████████▋| 21332/22095 [36:49:58<1:03:27, 4.99s/it] {'loss': 0.3223, 'grad_norm': 0.6281474671279208, 'learning_rate': 3.1321829304148954e-08, 'epoch': 0.97}
97%|█████████▋| 21333/22095 [36:50:01<55:52, 4.40s/it] {'loss': 0.3109, 'grad_norm': 0.6394881447406083, 'learning_rate': 3.1239974082223347e-08, 'epoch': 0.97}
97%|█████████▋| 21334/22095 [36:50:05<53:04, 4.18s/it] {'loss': 0.2848, 'grad_norm': 0.6804236092716724, 'learning_rate': 3.115822562369086e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
97%|█████████▋| 21335/22095 [36:50:15<1:13:51, 5.83s/it] {'loss': 0.451, 'grad_norm': 0.2707283479344151, 'learning_rate': 3.107658393030677e-08, 'epoch': 0.97}
97%|█████████▋| 21336/22095 [36:50:18<1:03:37, 5.03s/it] {'loss': 0.2892, 'grad_norm': 0.6753311154891868, 'learning_rate': 3.0995049003826325e-08, 'epoch': 0.97}
97%|█████████▋| 21337/22095 [36:50:21<56:07, 4.44s/it] {'loss': 0.3038, 'grad_norm': 0.6298728408666605, 'learning_rate': 3.0913620846000916e-08, 'epoch': 0.97}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8495292 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 9555, 'image': 'vrdu_texteq/astro-ph.CO/eda306d8-b754-429c-8e8d-d70057bb2b3f.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease extract the text from the image.'}, {'from': 'gpt', 'value': '\\vspace{0.2cm}\n$\\Downarrow$'}]}
97%|█████████▋| 21338/22095 [36:50:25<53:16, 4.22s/it] {'loss': 0.3137, 'grad_norm': 0.5729926603889947, 'learning_rate': 3.083229945858079e-08, 'epoch': 0.97}
97%|█████████▋| 21339/22095 [36:50:29<52:15, 4.15s/it] {'loss': 0.3039, 'grad_norm': 0.7725364840099211, 'learning_rate': 3.075108484331235e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [189, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8455382 in VC:s3://internvl-moe-sft-data/. Exception: Image size [189, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 55779, 'image': 'vrdu_texteq/astro-ph.CO/6266d502-329d-4862-a51f-b3197311fd61.png', 'image_wh': [[189, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the text within the image recognized and converted into machine-readable format, please.'}, {'from': 'gpt', 'value': '$=-1.15\\pm0.05$.'}]}
97%|█████████▋| 21340/22095 [36:50:32<50:20, 4.00s/it] {'loss': 0.2881, 'grad_norm': 0.6693582053497801, 'learning_rate': 3.066997700194197e-08, 'epoch': 0.97}
97%|█████████▋| 21341/22095 [36:50:36<49:36, 3.95s/it] {'loss': 0.2904, 'grad_norm': 0.6188826859538752, 'learning_rate': 3.0588975936211017e-08, 'epoch': 0.97}
97%|█████████▋| 21342/22095 [36:50:39<45:35, 3.63s/it] {'loss': 0.3559, 'grad_norm': 0.6369808490204127, 'learning_rate': 3.05080816478609e-08, 'epoch': 0.97}
97%|█████████▋| 21343/22095 [36:50:42<43:14, 3.45s/it] {'loss': 0.3077, 'grad_norm': 0.6384628270851854, 'learning_rate': 3.042729413862966e-08, 'epoch': 0.97}
97%|█████████▋| 21344/22095 [36:50:45<42:19, 3.38s/it] {'loss': 0.275, 'grad_norm': 0.6037050679353614, 'learning_rate': 3.034661341025258e-08, 'epoch': 0.97}
97%|█████████▋| 21345/22095 [36:50:49<42:57, 3.44s/it] {'loss': 0.2826, 'grad_norm': 0.576146704475577, 'learning_rate': 3.0266039464463823e-08, 'epoch': 0.97}
97%|█████████▋| 21346/22095 [36:50:53<45:00, 3.61s/it] {'loss': 0.3294, 'grad_norm': 0.5736812009031691, 'learning_rate': 3.0185572302994795e-08, 'epoch': 0.97}
97%|█████████▋| 21347/22095 [36:50:56<43:33, 3.49s/it] {'loss': 0.2729, 'grad_norm': 0.6388020999905242, 'learning_rate': 3.0105211927574096e-08, 'epoch': 0.97}
97%|█████████▋| 21348/22095 [36:50:59<41:45, 3.35s/it] {'loss': 0.3027, 'grad_norm': 0.5807264304566786, 'learning_rate': 3.002495833992813e-08, 'epoch': 0.97}
97%|█████████▋| 21349/22095 [36:51:02<39:49, 3.20s/it] {'loss': 0.3396, 'grad_norm': 0.6342121065316381, 'learning_rate': 2.994481154178164e-08, 'epoch': 0.97}
97%|█████████▋| 21350/22095 [36:51:05<40:06, 3.23s/it] {'loss': 0.2826, 'grad_norm': 0.6107434989636781, 'learning_rate': 2.9864771534857114e-08, 'epoch': 0.97}
97%|█████████▋| 21351/22095 [36:51:08<39:44, 3.21s/it] {'loss': 0.3127, 'grad_norm': 0.6483472713840857, 'learning_rate': 2.978483832087431e-08, 'epoch': 0.97}
97%|█████████▋| 21352/22095 [36:51:11<38:59, 3.15s/it] {'loss': 0.2856, 'grad_norm': 0.5739299116426992, 'learning_rate': 2.970501190154962e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
97%|█████████▋| 21353/22095 [36:51:21<1:04:19, 5.20s/it] {'loss': 0.4672, 'grad_norm': 0.25963244238987054, 'learning_rate': 2.9625292278600005e-08, 'epoch': 0.97}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
97%|█████████▋| 21354/22095 [36:51:25<57:52, 4.69s/it] {'loss': 0.2673, 'grad_norm': 0.5650708348270717, 'learning_rate': 2.9545679453736874e-08, 'epoch': 0.97}
97%|█████████▋| 21355/22095 [36:51:28<51:53, 4.21s/it] {'loss': 0.2856, 'grad_norm': 0.5911926544516285, 'learning_rate': 2.9466173428672197e-08, 'epoch': 0.97}
VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/DUE_Benchmark/InfographicsVQA/pngs/38380.png 2025-08-29 04:49:28.541142 load time: 1024.81 ms
97%|█████████▋| 21356/22095 [36:51:32<50:33, 4.11s/it] {'loss': 0.3381, 'grad_norm': 0.6348876111669358, 'learning_rate': 2.9386774205112934e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (42225 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (81191 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21357/22095 [36:51:35<46:02, 3.74s/it] {'loss': 0.2879, 'grad_norm': 0.5792450882193889, 'learning_rate': 2.9307481784766057e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (44988 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47261 > 40960). Running this sequence through the model will result in indexing errors
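[Editor's note] The recurring `ValueError: Image size [...] is too small. Minimum size is 28.` tracebacks come from the dataset's size check: vision encoders that patchify at 28 px reject images whose shorter side is below one patch. A minimal sketch of that check, assuming a simple `(width, height)` interface (the function name is illustrative, not the script's actual API):

```python
# Hedged sketch of a minimum-image-size gate like the one raising the
# "Minimum size is 28" errors in the log. The real check also carries
# resize targets in its message ([w, h, 100, 100]); this keeps just w/h.
MIN_SIDE = 28  # one vision-encoder patch, per the log's error text

def validate_image_size(width, height, min_side=MIN_SIDE):
    """Raise ValueError when either side is under the patch size."""
    if min(width, height) < min_side:
        raise ValueError(
            f"Image size [{width}, {height}] is too small. "
            f"Minimum size is {min_side}."
        )
    return True
```

Running this over `image_wh` before training would surface bad records (like the 204x26 and 113x13 samples above) at data-preparation time instead of mid-epoch.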
97%|█████████▋| 21358/22095 [36:51:38<44:13, 3.60s/it] {'loss': 0.3035, 'grad_norm': 0.6526299208429099, 'learning_rate': 2.92282961693352e-08, 'epoch': 0.97}
97%|█████████▋| 21359/22095 [36:51:42<43:55, 3.58s/it] {'loss': 0.3048, 'grad_norm': 0.5981744444686902, 'learning_rate': 2.9149217360521788e-08, 'epoch': 0.97}
97%|█████████▋| 21360/22095 [36:51:45<42:35, 3.48s/it] {'loss': 0.2464, 'grad_norm': 0.6631230697732848, 'learning_rate': 2.907024536002501e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047575 in VC:s3://multi-modal/UniGeo/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 7\nB. 2\nC. 2.5\nD. 4.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
97%|█████████▋| 21361/22095 [36:51:48<40:22, 3.30s/it] {'loss': 0.3114, 'grad_norm': 0.6415576068812007, 'learning_rate': 2.8991380169541284e-08, 'epoch': 0.97}
97%|█████████▋| 21362/22095 [36:51:51<39:56, 3.27s/it] {'loss': 0.2719, 'grad_norm': 0.6141966263098978, 'learning_rate': 2.8912621790765373e-08, 'epoch': 0.97}
97%|█████████▋| 21363/22095 [36:51:54<40:57, 3.36s/it] {'loss': 0.3383, 'grad_norm': 0.6159770387398168, 'learning_rate': 2.883397022538981e-08, 'epoch': 0.97}
97%|█████████▋| 21364/22095 [36:51:58<41:53, 3.44s/it] {'loss': 0.2737, 'grad_norm': 0.6686155973457726, 'learning_rate': 2.8755425475104904e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
97%|█████████▋| 21365/22095 [36:52:06<57:31, 4.73s/it] {'loss': 0.4554, 'grad_norm': 0.3101889005652721, 'learning_rate': 2.8676987541597646e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (47491 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46232 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21366/22095 [36:52:13<1:06:30, 5.47s/it] {'loss': 0.4551, 'grad_norm': 0.2625387624468325, 'learning_rate': 2.859865642655335e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 364, but got module 1
97%|█████████▋| 21367/22095 [36:52:19<1:08:44, 5.67s/it] {'loss': 0.2948, 'grad_norm': 0.6784526673158491, 'learning_rate': 2.8520432131655673e-08, 'epoch': 0.97}
97%|█████████▋| 21368/22095 [36:52:22<59:47, 4.93s/it] {'loss': 0.2791, 'grad_norm': 0.7018253409325259, 'learning_rate': 2.8442314658584936e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (62068 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44282 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21369/22095 [36:52:26<54:15, 4.48s/it] {'loss': 0.2803, 'grad_norm': 0.6030150366424402, 'learning_rate': 2.8364304009020348e-08, 'epoch': 0.97}
97%|█████████▋| 21370/22095 [36:52:28<47:44, 3.95s/it] {'loss': 0.3103, 'grad_norm': 0.5687094575670066, 'learning_rate': 2.8286400184637242e-08, 'epoch': 0.97}
97%|█████████▋| 21371/22095 [36:52:32<45:08, 3.74s/it] {'loss': 0.269, 'grad_norm': 0.604944121680553, 'learning_rate': 2.820860318710983e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (54815 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (134846 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105224 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21372/22095 [36:52:35<42:40, 3.54s/it] {'loss': 0.3115, 'grad_norm': 0.6127066407259337, 'learning_rate': 2.813091301811066e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914674 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [215, 18, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 37827, 'image': 'images/5459.png', 'image_wh': [[215, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,点D是线段AB的中点,C是线段AD的中点,若AB=16cm,则线段CD=cm.()\nA. 16\nB. 2\nC. 4\nD. 8\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]}
97%|█████████▋| 21373/22095 [36:52:38<41:30, 3.45s/it] {'loss': 0.2924, 'grad_norm': 0.5783694521108265, 'learning_rate': 2.8053329679307293e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (83604 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42062 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21374/22095 [36:52:48<1:03:32, 5.29s/it] {'loss': 0.4593, 'grad_norm': 0.28195681477137985, 'learning_rate': 2.797585317236784e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (172738 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47926 > 40960). Running this sequence through the model will result in indexing errors
97%|█████████▋| 21375/22095 [36:52:51<57:44, 4.81s/it] {'loss': 0.27, 'grad_norm': 0.6422170672348851, 'learning_rate': 2.789848349895763e-08, 'epoch': 0.97}
97%|█████████▋| 21376/22095 [36:52:54<50:21, 4.20s/it] {'loss': 0.2965, 'grad_norm': 0.5907711761183103, 'learning_rate': 2.782122066073756e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
97%|█████████▋| 21377/22095 [36:53:04<1:09:52, 5.84s/it] {'loss': 0.4785, 'grad_norm': 0.2995926429078034, 'learning_rate': 2.7744064659369073e-08, 'epoch': 0.97}
97%|█████████▋| 21378/22095 [36:53:08<1:02:37, 5.24s/it] {'loss': 0.3488, 'grad_norm': 0.8353705596359702, 'learning_rate': 2.7667015496509187e-08, 'epoch': 0.97}
97%|█████████▋| 21379/22095 [36:53:11<55:22, 4.64s/it] {'loss': 0.3093, 'grad_norm': 0.6500850117419013, 'learning_rate': 2.7590073173813792e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
97%|█████████▋| 21380/22095 [36:53:17<59:58, 5.03s/it] {'loss': 0.4659, 'grad_norm': 0.2531423524179933, 'learning_rate': 2.7513237692936567e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (57984 > 40960). Running this sequence through the model will result in indexing errors
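[Editor's note] The paired messages "Rank 0: Number of image tokens 0 does not match number of images 1" / "Rank 0: Fixed image tokens in the conversation" indicate the loader repairs conversations whose text lacks an image placeholder for an attached image. A hedged sketch of one plausible repair, assuming a `<image>` placeholder literal and the `{'from', 'value'}` message layout seen in the logged samples (the function name and repair strategy are assumptions, not the script's confirmed logic):

```python
# Hedged sketch: if a conversation has fewer "<image>" placeholders than
# attached images, prepend the missing ones to the first human turn, then
# report the fix the way the log does.
def fix_image_tokens(conversation, n_images, token="<image>"):
    """conversation: list of {'from': ..., 'value': ...} dicts (mutated in place)."""
    first_user = next(m for m in conversation if m["from"] == "human")
    n_tokens = first_user["value"].count(token)
    if n_tokens != n_images:
        print(f"Rank 0: Number of image tokens {n_tokens} "
              f"does not match number of images {n_images}")
        missing = max(n_images - n_tokens, 0)
        first_user["value"] = (token + "\n") * missing + first_user["value"]
        print("Rank 0: Fixed image tokens in the conversation")
    return conversation
```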
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21381/22095 [36:53:21<56:11, 4.72s/it] {'loss': 0.2858, 'grad_norm': 0.649250415015185, 'learning_rate': 2.743650905552786e-08, 'epoch': 0.97} 97%|█████████▋| 21381/22095 [36:53:21<56:11, 4.72s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21382/22095 [36:53:30<1:12:27, 6.10s/it] {'loss': 0.436, 'grad_norm': 0.24597576493641854, 'learning_rate': 2.7359887263236352e-08, 'epoch': 0.97} 97%|█████████▋| 21382/22095 [36:53:30<1:12:27, 6.10s/it] 97%|█████████▋| 21383/22095 [36:53:40<1:24:17, 7.10s/it] {'loss': 0.4512, 'grad_norm': 0.23698909783832273, 'learning_rate': 2.7283372317708502e-08, 'epoch': 0.97} 97%|█████████▋| 21383/22095 [36:53:40<1:24:17, 7.10s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 97%|█████████▋| 21384/22095 [36:53:43<1:10:48, 5.98s/it] {'loss': 0.3041, 'grad_norm': 0.5635119405187751, 'learning_rate': 2.720696422058855e-08, 'epoch': 0.97} 97%|█████████▋| 21384/22095 [36:53:43<1:10:48, 5.98s/it] 97%|█████████▋| 21385/22095 [36:53:47<1:03:42, 5.38s/it] {'loss': 0.3106, 'grad_norm': 0.6416710534046843, 'learning_rate': 2.713066297351852e-08, 'epoch': 0.97} 97%|█████████▋| 21385/22095 [36:53:47<1:03:42, 5.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56826 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (63539 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (99875 > 40960). 
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21386/22095 [36:53:51<58:03, 4.91s/it] {'loss': 0.2724, 'grad_norm': 0.6454105343058075, 'learning_rate': 2.7054468578137093e-08, 'epoch': 0.97} 97%|█████████▋| 21386/22095 [36:53:51<58:03, 4.91s/it] 97%|█████████▋| 21387/22095 [36:53:54<51:15, 4.34s/it] {'loss': 0.2513, 'grad_norm': 0.8120809368879789, 'learning_rate': 2.6978381036081857e-08, 'epoch': 0.97} 97%|█████████▋| 21387/22095 [36:53:54<51:15, 4.34s/it] 97%|█████████▋| 21388/22095 [36:53:58<50:00, 4.24s/it] {'loss': 0.284, 'grad_norm': 0.5913980149953101, 'learning_rate': 2.6902400348987613e-08, 'epoch': 0.97} 97%|█████████▋| 21388/22095 [36:53:58<50:00, 4.24s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21389/22095 [36:54:07<1:07:53, 5.77s/it] {'loss': 0.4646, 'grad_norm': 0.2593861442090093, 'learning_rate': 2.6826526518487496e-08, 'epoch': 0.97} 97%|█████████▋| 21389/22095 [36:54:07<1:07:53, 5.77s/it] 97%|█████████▋| 21390/22095 [36:54:11<59:47, 5.09s/it] {'loss': 0.2709, 'grad_norm': 0.5799799435681116, 'learning_rate': 2.6750759546211312e-08, 'epoch': 0.97} 97%|█████████▋| 21390/22095 [36:54:11<59:47, 5.09s/it] 97%|█████████▋| 21391/22095 [36:54:14<52:17, 4.46s/it] {'loss': 0.2577, 'grad_norm': 0.8052886730121234, 'learning_rate': 2.6675099433787212e-08, 'epoch': 0.97} 97%|█████████▋| 21391/22095 [36:54:14<52:17, 4.46s/it] 97%|█████████▋| 21392/22095 [36:54:17<47:46, 4.08s/it] {'loss': 0.2765, 'grad_norm': 0.6162388408613758, 'learning_rate': 2.6599546182840553e-08, 'epoch': 0.97} 97%|█████████▋| 21392/22095 [36:54:17<47:46, 4.08s/it] 97%|█████████▋| 21393/22095 [36:54:21<46:49, 4.00s/it] {'loss': 0.3015, 'grad_norm': 0.6170886181395424, 'learning_rate': 2.652409979499504e-08, 'epoch': 0.97} 97%|█████████▋| 21393/22095 [36:54:21<46:49, 4.00s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21394/22095 [36:54:29<1:02:17, 5.33s/it] 
{'loss': 0.4623, 'grad_norm': 0.25442359485009247, 'learning_rate': 2.6448760271872152e-08, 'epoch': 0.97} 97%|█████████▋| 21394/22095 [36:54:29<1:02:17, 5.33s/it] 97%|█████████▋| 21395/22095 [36:54:39<1:20:16, 6.88s/it] {'loss': 0.4517, 'grad_norm': 0.2603122844717185, 'learning_rate': 2.6373527615090044e-08, 'epoch': 0.97} 97%|█████████▋| 21395/22095 [36:54:40<1:20:16, 6.88s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 97%|█████████▋| 21396/22095 [36:54:43<1:09:53, 6.00s/it] {'loss': 0.2465, 'grad_norm': 0.603507615130539, 'learning_rate': 2.6298401826265195e-08, 'epoch': 0.97} 97%|█████████▋| 21396/22095 [36:54:43<1:09:53, 6.00s/it] 97%|█████████▋| 21397/22095 [36:54:48<1:03:55, 5.49s/it] {'loss': 0.3009, 'grad_norm': 0.5992182971411182, 'learning_rate': 2.6223382907012428e-08, 'epoch': 0.97} 97%|█████████▋| 21397/22095 [36:54:48<1:03:55, 5.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (48228 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (86418 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (47068 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21398/22095 [36:54:51<56:09, 4.83s/it] {'loss': 0.2966, 'grad_norm': 0.6311970985757361, 'learning_rate': 2.6148470858943787e-08, 'epoch': 0.97} 97%|█████████▋| 21398/22095 [36:54:51<56:09, 4.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (41679 > 40960). 
Running this sequence through the model will result in indexing errors
 97%|█████████▋| 21399/22095 [36:54:54<50:54, 4.39s/it] {'loss': 0.2755, 'grad_norm': 0.6006286490801301, 'learning_rate': 2.607366568366798e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8948714 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [235, 26, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 71867, 'image': 'images/5308.png', 'image_wh': [[235, 26]], 'conversations': [{'from': 'human', 'value': '\n如图所示,AB段上有两个点C和D,AD=\\frac{1}{3}AB,C是AD的中点,如果AB=12,则AC段的长度为()\nA. 3\nB. 4\nC. 1\nD. 2'}, {'from': 'gpt', 'value': '【解答】解:∵AD=\\frac{1}{3}AB,AB=12,∴AD=4,∵C是AD的中点,∴AC=\\frac{1}{2}AD=2.'}]}
 97%|█████████▋| 21400/22095 [36:54:58<49:37, 4.28s/it] {'loss': 0.3063, 'grad_norm': 0.6708687272743549, 'learning_rate': 2.5998967382792618e-08, 'epoch': 0.97}
 97%|█████████▋| 21401/22095 [36:55:02<45:37, 3.94s/it] {'loss': 0.2948, 'grad_norm': 0.5393526089892805, 'learning_rate': 2.592437595792363e-08, 'epoch': 0.97}
 97%|█████████▋| 21402/22095 [36:55:05<43:41, 3.78s/it] {'loss': 0.2808, 'grad_norm': 0.5912557610649447, 'learning_rate': 2.584989141066252e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (41621 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44104 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (64134 > 40960).
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21403/22095 [36:55:08<40:00, 3.47s/it] {'loss': 0.2584, 'grad_norm': 0.6030725729127032, 'learning_rate': 2.577551374261078e-08, 'epoch': 0.97} 97%|█████████▋| 21403/22095 [36:55:08<40:00, 3.47s/it] 97%|█████████▋| 21404/22095 [36:55:12<41:04, 3.57s/it] {'loss': 0.2946, 'grad_norm': 0.6478863741545718, 'learning_rate': 2.5701242955365468e-08, 'epoch': 0.97} 97%|█████████▋| 21404/22095 [36:55:12<41:04, 3.57s/it] 97%|█████████▋| 21405/22095 [36:55:14<38:22, 3.34s/it] {'loss': 0.2576, 'grad_norm': 0.6117868458022317, 'learning_rate': 2.562707905052364e-08, 'epoch': 0.97} 97%|█████████▋| 21405/22095 [36:55:14<38:22, 3.34s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (66767 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (59964 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91171 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (51050 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (120497 > 40960). 
Running this sequence through the model will result in indexing errors
 97%|█████████▋| 21406/22095 [36:55:24<59:36, 5.19s/it] {'loss': 0.4557, 'grad_norm': 0.2594879696804052, 'learning_rate': 2.555302202967791e-08, 'epoch': 0.97}
 97%|█████████▋| 21407/22095 [36:55:31<1:06:21, 5.79s/it] {'loss': 0.4701, 'grad_norm': 0.2998048691048826, 'learning_rate': 2.5479071894420337e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 97%|█████████▋| 21408/22095 [36:55:34<58:17, 5.09s/it] {'loss': 0.3045, 'grad_norm': 0.6665315299044354, 'learning_rate': 2.5405228646339096e-08, 'epoch': 0.97}
 97%|█████████▋| 21409/22095 [36:55:38<51:36, 4.51s/it] {'loss': 0.3092, 'grad_norm': 0.6795854446055811, 'learning_rate': 2.5331492287021252e-08, 'epoch': 0.97}
 97%|█████████▋| 21410/22095 [36:55:42<50:15, 4.40s/it] {'loss': 0.2733, 'grad_norm': 0.6512646157878369, 'learning_rate': 2.5257862818051092e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 97%|█████████▋| 21411/22095 [36:55:51<1:08:04, 5.97s/it] {'loss': 0.4437, 'grad_norm': 0.2645611035802779, 'learning_rate': 2.5184340241010687e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8364436 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 31176, 'image': 'vrdu_table_final_2/astro-ph.CO/ab3cd34b-fb49-44a1-abd1-e0e7fa227d07.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}c@{}}\\strut#1\\strut\\end{tabular}\n```"}]}
 97%|█████████▋| 21412/22095 [36:55:59<1:11:50, 6.31s/it] {'loss': 0.4545, 'grad_norm': 0.24962282761171117, 'learning_rate': 2.511092455747932e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 97%|█████████▋| 21413/22095 [36:56:03<1:05:37, 5.77s/it] {'loss': 0.2791, 'grad_norm': 0.5633033273133744, 'learning_rate': 2.503761576903574e-08, 'epoch': 0.97}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 97%|█████████▋| 21414/22095 [36:56:13<1:20:04, 7.06s/it] {'loss': 0.4514, 'grad_norm': 0.24964261861415946, 'learning_rate': 2.4964413877254233e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 364, but got module 1
 97%|█████████▋| 21415/22095 [36:56:17<1:10:20, 6.21s/it] {'loss': 0.2838, 'grad_norm': 0.8924893770088967, 'learning_rate': 2.489131888370744e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (53776 > 40960).
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21416/22095 [36:56:21<1:00:44, 5.37s/it] {'loss': 0.2717, 'grad_norm': 0.6368048568730261, 'learning_rate': 2.4818330789966872e-08, 'epoch': 0.97} 97%|█████████▋| 21416/22095 [36:56:21<1:00:44, 5.37s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21417/22095 [36:56:29<1:08:46, 6.09s/it] {'loss': 0.4867, 'grad_norm': 0.25813157938113807, 'learning_rate': 2.474544959760017e-08, 'epoch': 0.97} 97%|█████████▋| 21417/22095 [36:56:29<1:08:46, 6.09s/it] 97%|█████████▋| 21418/22095 [36:56:32<1:00:21, 5.35s/it] {'loss': 0.3078, 'grad_norm': 0.6155721713424308, 'learning_rate': 2.4672675308173298e-08, 'epoch': 0.97} 97%|█████████▋| 21418/22095 [36:56:32<1:00:21, 5.35s/it] 97%|█████████▋| 21419/22095 [36:56:36<53:34, 4.75s/it] {'loss': 0.2826, 'grad_norm': 0.5923410351217169, 'learning_rate': 2.460000792324946e-08, 'epoch': 0.97} 97%|█████████▋| 21419/22095 [36:56:36<53:34, 4.75s/it] 97%|█████████▋| 21420/22095 [36:56:39<48:24, 4.30s/it] {'loss': 0.3003, 'grad_norm': 0.6407965605028224, 'learning_rate': 2.4527447444391838e-08, 'epoch': 0.97} 97%|█████████▋| 21420/22095 [36:56:39<48:24, 4.30s/it] 97%|█████████▋| 21421/22095 [36:56:42<43:16, 3.85s/it] {'loss': 0.3277, 'grad_norm': 0.6538379741907768, 'learning_rate': 2.445499387315753e-08, 'epoch': 0.97} 97%|█████████▋| 21421/22095 [36:56:42<43:16, 3.85s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49736 > 40960). 
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21422/22095 [36:56:45<40:11, 3.58s/it] {'loss': 0.3262, 'grad_norm': 0.6457363395930453, 'learning_rate': 2.4382647211104173e-08, 'epoch': 0.97} 97%|█████████▋| 21422/22095 [36:56:45<40:11, 3.58s/it] 97%|█████████▋| 21423/22095 [36:56:47<37:41, 3.36s/it] {'loss': 0.2973, 'grad_norm': 0.6266991708334412, 'learning_rate': 2.4310407459786634e-08, 'epoch': 0.97} 97%|█████████▋| 21423/22095 [36:56:47<37:41, 3.36s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (109112 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21424/22095 [36:56:55<50:46, 4.54s/it] {'loss': 0.471, 'grad_norm': 0.26378314561444793, 'learning_rate': 2.423827462075701e-08, 'epoch': 0.97} 97%|█████████▋| 21424/22095 [36:56:55<50:46, 4.54s/it] 97%|█████████▋| 21425/22095 [36:56:59<49:04, 4.39s/it] {'loss': 0.3047, 'grad_norm': 0.5846522553477222, 'learning_rate': 2.416624869556461e-08, 'epoch': 0.97} 97%|█████████▋| 21425/22095 [36:56:59<49:04, 4.39s/it] 97%|█████████▋| 21426/22095 [36:57:02<43:56, 3.94s/it] {'loss': 0.2762, 'grad_norm': 0.5748228073096529, 'learning_rate': 2.409432968575709e-08, 'epoch': 0.97} 97%|█████████▋| 21426/22095 [36:57:02<43:56, 3.94s/it] 97%|█████████▋| 21427/22095 [36:57:04<40:11, 3.61s/it] {'loss': 0.3131, 'grad_norm': 0.5776693574561639, 'learning_rate': 2.402251759288099e-08, 'epoch': 0.97} 97%|█████████▋| 21427/22095 [36:57:04<40:11, 3.61s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (71358 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (68407 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97874 > 40960). Running this sequence through the model will result in indexing errors
 97%|█████████▋| 21428/22095 [36:57:07<37:38, 3.39s/it] {'loss': 0.2992, 'grad_norm': 0.6249029756593694, 'learning_rate': 2.3950812418477852e-08, 'epoch': 0.97}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
 97%|█████████▋| 21429/22095 [36:57:11<37:46, 3.40s/it] {'loss': 0.2795, 'grad_norm': 1.7975295639136513, 'learning_rate': 2.3879214164088672e-08, 'epoch': 0.97}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
 97%|█████████▋| 21430/22095 [36:57:17<46:57, 4.24s/it] {'loss': 0.4724, 'grad_norm': 0.34538818894279005, 'learning_rate': 2.3807722831252768e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8914854 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [198, 24, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 38007, 'image': 'images/5014.png', 'image_wh': [[198, 24]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=12,延长线段AB至点C,使得BC=\\frac{1}{2}AB,点D是线段AC的中点,则线段BD的长是()\nA. 6\nB. 3\nC. 4\nD. 5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
 97%|█████████▋| 21431/22095 [36:57:20<43:30, 3.93s/it] {'loss': 0.3004, 'grad_norm': 0.6697070486805683, 'learning_rate': 2.3736338421505578e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 97%|█████████▋| 21432/22095 [36:57:30<1:01:38, 5.58s/it] {'loss': 0.4629, 'grad_norm': 0.26052467045218836, 'learning_rate': 2.366506093638088e-08, 'epoch': 0.97}
 97%|█████████▋| 21433/22095 [36:57:33<54:07, 4.91s/it] {'loss': 0.3041, 'grad_norm': 0.6098721901226468, 'learning_rate': 2.359389037741022e-08, 'epoch': 0.97}
 97%|█████████▋| 21434/22095 [36:57:36<47:21, 4.30s/it] {'loss': 0.2828, 'grad_norm': 0.6389878370521055, 'learning_rate': 2.3522826746123496e-08, 'epoch': 0.97}
 97%|█████████▋| 21435/22095 [36:57:39<42:43, 3.88s/it] {'loss': 0.2912, 'grad_norm': 0.6152278199424831, 'learning_rate': 2.3451870044046698e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 97%|█████████▋| 21436/22095 [36:57:50<1:06:14, 6.03s/it] {'loss': 0.4857, 'grad_norm': 0.25780271796413584, 'learning_rate': 2.338102027270528e-08, 'epoch': 0.97}
 97%|█████████▋| 21437/22095 [36:57:55<1:02:07, 5.67s/it] {'loss': 0.2852, 'grad_norm': 0.6092924189798018, 'learning_rate': 2.33102774336208e-08, 'epoch': 0.97}
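The repeated `ValueError: Image size ... is too small. Minimum size is 28` failures above come from dataset samples whose images fall below the 28-pixel minimum side length the vision preprocessor accepts; each one costs a fetch-and-retry. A minimal pre-filter sketch, assuming the `{'image_wh': [[w, h]], ...}` sample layout shown in the log (the helper names are hypothetical, not the repo's actual code):

```python
# Sketch: drop samples whose image is below the minimum side length, so they
# never reach the dataloader. The 28-pixel floor and the sample dict layout
# are taken from the error messages in the log; the helpers are hypothetical.
MIN_SIDE = 28

def is_image_large_enough(sample: dict) -> bool:
    """Return True if every image in the sample meets the minimum side length."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIDE or h < MIN_SIDE:
            return False
    return True

def filter_samples(samples: list) -> list:
    """Keep only samples that will not raise 'Image size ... is too small'."""
    return [s for s in samples if is_image_large_enough(s)]
```

Running such a pass once over the annotation files would remove the `geoqa+`, `internvl-moe-sft-data`, and `UniGeo` samples that keep failing here instead of retrying them at training time.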
97%|█████████▋| 21438/22095 [36:57:59<57:02, 5.21s/it] {'loss': 0.3193, 'grad_norm': 0.5870916846508486, 'learning_rate': 2.323964152831426e-08, 'epoch': 0.97} 97%|█████████▋| 21438/22095 [36:57:59<57:02, 5.21s/it] 97%|█████████▋| 21439/22095 [36:58:02<51:47, 4.74s/it] {'loss': 0.3011, 'grad_norm': 0.5496351854261411, 'learning_rate': 2.3169112558302232e-08, 'epoch': 0.97} 97%|█████████▋| 21439/22095 [36:58:02<51:47, 4.74s/it] 97%|█████████▋| 21440/22095 [36:58:06<49:18, 4.52s/it] {'loss': 0.2424, 'grad_norm': 0.5707888549972117, 'learning_rate': 2.3098690525101275e-08, 'epoch': 0.97} 97%|█████████▋| 21440/22095 [36:58:06<49:18, 4.52s/it] 97%|█████████▋| 21441/22095 [36:58:09<44:37, 4.09s/it] {'loss': 0.2863, 'grad_norm': 0.5956520405613818, 'learning_rate': 2.302837543022407e-08, 'epoch': 0.97} 97%|█████████▋| 21441/22095 [36:58:09<44:37, 4.09s/it] 97%|█████████▋| 21442/22095 [36:58:13<42:31, 3.91s/it] {'loss': 0.2945, 'grad_norm': 0.6552895738865977, 'learning_rate': 2.2958167275181076e-08, 'epoch': 0.97} 97%|█████████▋| 21442/22095 [36:58:13<42:31, 3.91s/it] 97%|█████████▋| 21443/22095 [36:58:16<39:38, 3.65s/it] {'loss': 0.2832, 'grad_norm': 0.5961285069005348, 'learning_rate': 2.288806606148164e-08, 'epoch': 0.97} 97%|█████████▋| 21443/22095 [36:58:16<39:38, 3.65s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21444/22095 [36:58:20<41:10, 3.79s/it] {'loss': 0.3379, 'grad_norm': 0.6151196913769361, 'learning_rate': 2.281807179063178e-08, 'epoch': 0.97} 97%|█████████▋| 21444/22095 [36:58:20<41:10, 3.79s/it] 97%|█████████▋| 21445/22095 [36:58:24<40:36, 3.75s/it] {'loss': 0.2987, 'grad_norm': 0.7587253293887296, 'learning_rate': 2.2748184464134736e-08, 'epoch': 0.97} 97%|█████████▋| 21445/22095 [36:58:24<40:36, 3.75s/it] 97%|█████████▋| 21446/22095 [36:58:27<39:42, 3.67s/it] {'loss': 0.3182, 'grad_norm': 0.6081690588146057, 'learning_rate': 2.26784040834932e-08, 'epoch': 0.97} 
97%|█████████▋| 21446/22095 [36:58:27<39:42, 3.67s/it] 97%|█████████▋| 21447/22095 [36:58:31<38:56, 3.61s/it] {'loss': 0.278, 'grad_norm': 0.6881531310815816, 'learning_rate': 2.2608730650205966e-08, 'epoch': 0.97} 97%|█████████▋| 21447/22095 [36:58:31<38:56, 3.61s/it] 97%|█████████▋| 21448/22095 [36:58:34<37:40, 3.49s/it] {'loss': 0.2851, 'grad_norm': 0.642456803386314, 'learning_rate': 2.2539164165770178e-08, 'epoch': 0.97} 97%|█████████▋| 21448/22095 [36:58:34<37:40, 3.49s/it] 97%|█████████▋| 21449/22095 [36:58:38<40:22, 3.75s/it] {'loss': 0.2962, 'grad_norm': 0.6942711063770738, 'learning_rate': 2.2469704631680743e-08, 'epoch': 0.97} 97%|█████████▋| 21449/22095 [36:58:38<40:22, 3.75s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21450/22095 [36:58:47<57:16, 5.33s/it] {'loss': 0.4807, 'grad_norm': 0.24719462371086962, 'learning_rate': 2.2400352049429807e-08, 'epoch': 0.97} 97%|█████████▋| 21450/22095 [36:58:47<57:16, 5.33s/it] 97%|█████████▋| 21451/22095 [36:58:51<51:11, 4.77s/it] {'loss': 0.2831, 'grad_norm': 0.7658959323580632, 'learning_rate': 2.2331106420507843e-08, 'epoch': 0.97} 97%|█████████▋| 21451/22095 [36:58:51<51:11, 4.77s/it] 97%|█████████▋| 21452/22095 [36:58:55<48:39, 4.54s/it] {'loss': 0.3444, 'grad_norm': 0.6173714523200496, 'learning_rate': 2.2261967746402545e-08, 'epoch': 0.97} 97%|█████████▋| 21452/22095 [36:58:55<48:39, 4.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21453/22095 [36:58:59<46:32, 4.35s/it] {'loss': 0.2827, 'grad_norm': 0.638420072583342, 'learning_rate': 2.2192936028599953e-08, 'epoch': 0.97} 97%|█████████▋| 21453/22095 [36:58:59<46:32, 4.35s/it] 97%|█████████▋| 21454/22095 [36:59:02<42:46, 4.00s/it] {'loss': 0.2824, 'grad_norm': 0.5560272357588397, 'learning_rate': 2.212401126858277e-08, 'epoch': 0.97} 97%|█████████▋| 21454/22095 [36:59:02<42:46, 4.00s/it] 97%|█████████▋| 21455/22095 
[36:59:06<42:14, 3.96s/it] {'loss': 0.2749, 'grad_norm': 0.5624028101250275, 'learning_rate': 2.2055193467832582e-08, 'epoch': 0.97} 97%|█████████▋| 21455/22095 [36:59:06<42:14, 3.96s/it] 97%|█████████▋| 21456/22095 [36:59:10<41:48, 3.93s/it] {'loss': 0.3075, 'grad_norm': 0.6437701220694999, 'learning_rate': 2.1986482627827098e-08, 'epoch': 0.97} 97%|█████████▋| 21456/22095 [36:59:10<41:48, 3.93s/it] 97%|█████████▋| 21457/22095 [36:59:13<39:35, 3.72s/it] {'loss': 0.2736, 'grad_norm': 0.6032683639248221, 'learning_rate': 2.1917878750043475e-08, 'epoch': 0.97} 97%|█████████▋| 21457/22095 [36:59:13<39:35, 3.72s/it] 97%|█████████▋| 21458/22095 [36:59:16<36:31, 3.44s/it] {'loss': 0.3054, 'grad_norm': 0.6148573920082251, 'learning_rate': 2.1849381835956084e-08, 'epoch': 0.97} 97%|█████████▋| 21458/22095 [36:59:16<36:31, 3.44s/it] 97%|█████████▋| 21459/22095 [36:59:19<35:47, 3.38s/it] {'loss': 0.3534, 'grad_norm': 0.5825952684250644, 'learning_rate': 2.1780991887035973e-08, 'epoch': 0.97} 97%|█████████▋| 21459/22095 [36:59:19<35:47, 3.38s/it] 97%|█████████▋| 21460/22095 [36:59:23<37:18, 3.53s/it] {'loss': 0.3276, 'grad_norm': 0.87701726721472, 'learning_rate': 2.1712708904752522e-08, 'epoch': 0.97} 97%|█████████▋| 21460/22095 [36:59:23<37:18, 3.53s/it] 97%|█████████▋| 21461/22095 [36:59:26<37:10, 3.52s/it] {'loss': 0.2737, 'grad_norm': 0.6016188627438112, 'learning_rate': 2.1644532890573444e-08, 'epoch': 0.97} 97%|█████████▋| 21461/22095 [36:59:26<37:10, 3.52s/it] 97%|█████████▋| 21462/22095 [36:59:30<36:54, 3.50s/it] {'loss': 0.2692, 'grad_norm': 0.6370604104920156, 'learning_rate': 2.1576463845964236e-08, 'epoch': 0.97} 97%|█████████▋| 21462/22095 [36:59:30<36:54, 3.50s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (42245 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (81276 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73130 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (50768 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21463/22095 [36:59:34<39:38, 3.76s/it] {'loss': 0.3225, 'grad_norm': 0.56852633621876, 'learning_rate': 2.150850177238595e-08, 'epoch': 0.97} 97%|█████████▋| 21463/22095 [36:59:34<39:38, 3.76s/it] 97%|█████████▋| 21464/22095 [36:59:38<39:13, 3.73s/it] {'loss': 0.3534, 'grad_norm': 0.6540170279504566, 'learning_rate': 2.1440646671300193e-08, 'epoch': 0.97} 97%|█████████▋| 21464/22095 [36:59:38<39:13, 3.73s/it] 97%|█████████▋| 21465/22095 [36:59:41<36:34, 3.48s/it] {'loss': 0.2592, 'grad_norm': 0.6270797679263911, 'learning_rate': 2.1372898544164134e-08, 'epoch': 0.97} 97%|█████████▋| 21465/22095 [36:59:41<36:34, 3.48s/it] 97%|█████████▋| 21466/22095 [36:59:45<37:59, 3.62s/it] {'loss': 0.3286, 'grad_norm': 0.6053132155608527, 'learning_rate': 2.1305257392433832e-08, 'epoch': 0.97} 97%|█████████▋| 21466/22095 [36:59:45<37:59, 3.62s/it] 97%|█████████▋| 21467/22095 [36:59:49<40:26, 3.86s/it] {'loss': 0.2471, 'grad_norm': 0.5849943563276676, 'learning_rate': 2.1237723217562566e-08, 'epoch': 0.97} 97%|█████████▋| 21467/22095 [36:59:49<40:26, 3.86s/it] 97%|█████████▋| 21468/22095 [36:59:53<39:46, 3.81s/it] {'loss': 0.3247, 'grad_norm': 0.9313910065181935, 'learning_rate': 2.1170296021001956e-08, 'epoch': 0.97} 97%|█████████▋| 21468/22095 [36:59:53<39:46, 3.81s/it] 97%|█████████▋| 21469/22095 [36:59:56<37:04, 3.55s/it] {'loss': 0.2969, 'grad_norm': 0.6476378321677396, 'learning_rate': 2.1102975804200287e-08, 'epoch': 0.97} 97%|█████████▋| 21469/22095 [36:59:56<37:04, 3.55s/it] 97%|█████████▋| 21470/22095 [36:59:59<36:25, 
3.50s/it] {'loss': 0.3016, 'grad_norm': 0.6802598270516329, 'learning_rate': 2.1035762568603623e-08, 'epoch': 0.97}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [128, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8504809 in VC:s3://internvl-moe-sft-data/. Exception: Image size [128, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10208, 'image': 'vrdu_texteq/astro-ph.CO/8f2dbefe-ccf1-4823-96d1-3024d1b2fc4f.png', 'image_wh': [[128, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'where $A$ is'}]}
 97%|█████████▋| 21471/22095 [37:00:08<54:57, 5.28s/it] {'loss': 0.437, 'grad_norm': 0.25566338750406215, 'learning_rate': 2.096865631565692e-08, 'epoch': 0.97}
 97%|█████████▋| 21472/22095 [37:00:12<49:38, 4.78s/it] {'loss': 0.2753, 'grad_norm': 0.6144524105325259, 'learning_rate': 2.090165704680236e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (45128 > 40960).
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (91907 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (54014 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21473/22095 [37:00:15<43:27, 4.19s/it] {'loss': 0.3103, 'grad_norm': 0.5939623725491535, 'learning_rate': 2.083476476347823e-08, 'epoch': 0.97} 97%|█████████▋| 21473/22095 [37:00:15<43:27, 4.19s/it] 97%|█████████▋| 21474/22095 [37:00:19<42:13, 4.08s/it] {'loss': 0.299, 'grad_norm': 0.6245506587877645, 'learning_rate': 2.076797946712339e-08, 'epoch': 0.97} 97%|█████████▋| 21474/22095 [37:00:19<42:13, 4.08s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21475/22095 [37:00:22<39:38, 3.84s/it] {'loss': 0.2893, 'grad_norm': 0.6063673889028894, 'learning_rate': 2.0701301159171683e-08, 'epoch': 0.97} 97%|█████████▋| 21475/22095 [37:00:22<39:38, 3.84s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21476/22095 [37:00:25<37:09, 3.60s/it] {'loss': 0.2745, 'grad_norm': 0.6002761281196382, 'learning_rate': 2.0634729841056966e-08, 'epoch': 0.97} 97%|█████████▋| 21476/22095 [37:00:25<37:09, 3.60s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (56793 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (57067 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (43677 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (44037 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (52221 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21477/22095 [37:00:28<36:47, 3.57s/it] {'loss': 0.2948, 'grad_norm': 0.6330634611823746, 'learning_rate': 2.0568265514208097e-08, 'epoch': 0.97} 97%|█████████▋| 21477/22095 [37:00:28<36:47, 3.57s/it] 97%|█████████▋| 21478/22095 [37:00:32<35:40, 3.47s/it] {'loss': 0.2946, 'grad_norm': 0.5653621329054418, 'learning_rate': 2.0501908180054486e-08, 'epoch': 0.97} 97%|█████████▋| 21478/22095 [37:00:32<35:40, 3.47s/it] 97%|█████████▋| 21479/22095 [37:00:34<33:19, 3.25s/it] {'loss': 0.2776, 'grad_norm': 0.6638168852132872, 'learning_rate': 2.0435657840021104e-08, 'epoch': 0.97} 97%|█████████▋| 21479/22095 [37:00:34<33:19, 3.25s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (53730 > 40960). 
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21480/22095 [37:00:38<35:01, 3.42s/it] {'loss': 0.2782, 'grad_norm': 0.5561328122632877, 'learning_rate': 2.0369514495532373e-08, 'epoch': 0.97} 97%|█████████▋| 21480/22095 [37:00:38<35:01, 3.42s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21481/22095 [37:00:46<49:32, 4.84s/it] {'loss': 0.4444, 'grad_norm': 0.24582704186734805, 'learning_rate': 2.0303478148008813e-08, 'epoch': 0.97} 97%|█████████▋| 21481/22095 [37:00:46<49:32, 4.84s/it] 97%|█████████▋| 21482/22095 [37:00:50<44:55, 4.40s/it] {'loss': 0.3019, 'grad_norm': 0.6173504157082066, 'learning_rate': 2.02375487988693e-08, 'epoch': 0.97} 97%|█████████▋| 21482/22095 [37:00:50<44:55, 4.40s/it] 97%|█████████▋| 21483/22095 [37:00:53<41:35, 4.08s/it] {'loss': 0.3275, 'grad_norm': 0.6212625353881241, 'learning_rate': 2.0171726449531025e-08, 'epoch': 0.97} 97%|█████████▋| 21483/22095 [37:00:53<41:35, 4.08s/it] 97%|█████████▋| 21484/22095 [37:00:58<43:56, 4.32s/it] {'loss': 0.3348, 'grad_norm': 0.6496282193749826, 'learning_rate': 2.010601110140786e-08, 'epoch': 0.97} 97%|█████████▋| 21484/22095 [37:00:58<43:56, 4.32s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21485/22095 [37:01:08<1:01:18, 6.03s/it] {'loss': 0.461, 'grad_norm': 0.5653231467934803, 'learning_rate': 2.0040402755912013e-08, 'epoch': 0.97} 97%|█████████▋| 21485/22095 [37:01:08<1:01:18, 6.03s/it] 97%|█████████▋| 21486/22095 [37:01:12<55:25, 5.46s/it] {'loss': 0.2806, 'grad_norm': 0.605088017779233, 'learning_rate': 1.9974901414452907e-08, 'epoch': 0.97} 97%|█████████▋| 21486/22095 [37:01:12<55:25, 5.46s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21487/22095 [37:01:15<47:05, 
4.65s/it] {'loss': 0.2612, 'grad_norm': 0.6410627323825772, 'learning_rate': 1.9909507078438307e-08, 'epoch': 0.97} 97%|█████████▋| 21487/22095 [37:01:15<47:05, 4.65s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (44415 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21488/22095 [37:01:20<48:07, 4.76s/it] {'loss': 0.481, 'grad_norm': 0.2572270975628381, 'learning_rate': 1.984421974927375e-08, 'epoch': 0.97} 97%|█████████▋| 21488/22095 [37:01:20<48:07, 4.76s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21489/22095 [37:01:23<42:56, 4.25s/it] {'loss': 0.277, 'grad_norm': 0.5717832368892245, 'learning_rate': 1.9779039428360904e-08, 'epoch': 0.97} 97%|█████████▋| 21489/22095 [37:01:23<42:56, 4.25s/it] 97%|█████████▋| 21490/22095 [37:01:26<39:18, 3.90s/it] {'loss': 0.2534, 'grad_norm': 0.5836692238315675, 'learning_rate': 1.971396611710086e-08, 'epoch': 0.97} 97%|█████████▋| 21490/22095 [37:01:26<39:18, 3.90s/it] 97%|█████████▋| 21491/22095 [37:01:30<38:49, 3.86s/it] {'loss': 0.2539, 'grad_norm': 0.6329943429107763, 'learning_rate': 1.9648999816891944e-08, 'epoch': 0.97} 97%|█████████▋| 21491/22095 [37:01:30<38:49, 3.86s/it] 97%|█████████▋| 21492/22095 [37:01:33<37:41, 3.75s/it] {'loss': 0.2916, 'grad_norm': 0.6262710039347652, 'learning_rate': 1.958414052913027e-08, 'epoch': 0.97} 97%|█████████▋| 21492/22095 [37:01:33<37:41, 3.75s/it] 97%|█████████▋| 21493/22095 [37:01:36<35:57, 3.58s/it] {'loss': 0.3064, 'grad_norm': 0.6340901437998767, 'learning_rate': 1.951938825520916e-08, 'epoch': 0.97} 97%|█████████▋| 21493/22095 [37:01:37<35:57, 3.58s/it] 97%|█████████▋| 21494/22095 [37:01:40<36:50, 3.68s/it] {'loss': 0.3294, 'grad_norm': 0.613620906601682, 'learning_rate': 1.9454742996519726e-08, 'epoch': 0.97} 
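The `Rank 0: Number of image tokens ... does not match number of images ...` messages above (both 0-vs-1 and 2-vs-1 cases appear) are followed by `Fixed image tokens in the conversation`, i.e. the loader rewrites the prompt so placeholder count equals image count. A sketch of that repair under the same assumptions as the log (a literal `<image>` placeholder in the first human turn; the function name and dict layout are hypothetical, not the actual `data_qwen_2.py` implementation):

```python
# Sketch of the "Fixed image tokens" repair hinted at by the log: make the
# number of <image> placeholders in the first human turn match the number of
# images attached to the sample. Names and layout are assumptions.
IMG_TOK = "<image>"

def fix_image_tokens(sample: dict) -> dict:
    """Rewrite the first turn so placeholder count == image count."""
    n_images = len(sample.get("image_wh", []))
    turn = sample["conversations"][0]
    if turn["value"].count(IMG_TOK) != n_images:
        # Strip all existing placeholders, then prepend one per image.
        stripped = turn["value"].replace(IMG_TOK, "").lstrip("\n")
        turn["value"] = IMG_TOK * n_images + "\n" + stripped
    return sample
```

This handles both failure modes in the log: a prompt with no placeholder gains one, and a prompt with two placeholders for one image is collapsed to one.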
97%|█████████▋| 21494/22095 [37:01:40<36:50, 3.68s/it] 97%|█████████▋| 21495/22095 [37:01:44<35:17, 3.53s/it] {'loss': 0.2749, 'grad_norm': 0.5792752864447169, 'learning_rate': 1.9390204754451967e-08, 'epoch': 0.97} 97%|█████████▋| 21495/22095 [37:01:44<35:17, 3.53s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21496/22095 [37:01:51<48:02, 4.81s/it] {'loss': 0.4725, 'grad_norm': 0.26855686320942884, 'learning_rate': 1.9325773530391446e-08, 'epoch': 0.97} 97%|█████████▋| 21496/22095 [37:01:51<48:02, 4.81s/it] 97%|█████████▋| 21497/22095 [37:01:55<43:52, 4.40s/it] {'loss': 0.277, 'grad_norm': 0.5983834047394934, 'learning_rate': 1.926144932572316e-08, 'epoch': 0.97} 97%|█████████▋| 21497/22095 [37:01:55<43:52, 4.40s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (147058 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (48507 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21498/22095 [37:01:58<39:04, 3.93s/it] {'loss': 0.2616, 'grad_norm': 0.7360701247702012, 'learning_rate': 1.9197232141829335e-08, 'epoch': 0.97} 97%|█████████▋| 21498/22095 [37:01:58<39:04, 3.93s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (54204 > 40960). 
Running this sequence through the model will result in indexing errors
 97%|█████████▋| 21499/22095 [37:02:01<36:38, 3.69s/it] {'loss': 0.3041, 'grad_norm': 0.6197883921720705, 'learning_rate': 1.913312198008943e-08, 'epoch': 0.97}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047667 in VC:s3://multi-modal/UniGeo/. Exception: Image size [163, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5299.png', 'image_wh': [[163, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=16cm,点C为线段AB上的一个动点,点D、E分别是AC和BC的中点,则DE的长()\nA. 4cm\nB. 8cm\nC. 10cm\nD. 16cm'}, {'from': 'gpt', 'value': '【解答】解:∵点D、E分别是AC和BC的中点,∴DE=DC+CE=\\frac{1}{2}AC+\\frac{1}{2}BC=\\frac{1}{2}AB而AB=16cm,∴DE=\\frac{1}{2}×16=8(cm).'}]}
 97%|█████████▋| 21500/22095 [37:02:05<39:08, 3.95s/it] {'loss': 0.2714, 'grad_norm': 0.5844609404817493, 'learning_rate': 1.9069118841881228e-08, 'epoch': 0.97}
 97%|█████████▋| 21501/22095 [37:02:09<39:03, 3.94s/it] {'loss': 0.3048, 'grad_norm': 0.5603732149038518, 'learning_rate': 1.9005222728579742e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (46443 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (76868 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56706 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69955 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (69141 > 40960). Running this sequence through the model will result in indexing errors
 97%|█████████▋| 21502/22095 [37:02:12<35:25, 3.58s/it] {'loss': 0.2807, 'grad_norm': 0.6533007099497831, 'learning_rate': 1.8941433641558315e-08, 'epoch': 0.97}
Token indices sequence length is longer than the specified maximum sequence length for this model (86669 > 40960).
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21503/22095 [37:02:16<35:44, 3.62s/it] {'loss': 0.2856, 'grad_norm': 0.6925200042670877, 'learning_rate': 1.8877751582186966e-08, 'epoch': 0.97} 97%|█████████▋| 21503/22095 [37:02:16<35:44, 3.62s/it] 97%|█████████▋| 21504/22095 [37:02:19<35:30, 3.60s/it] {'loss': 0.3288, 'grad_norm': 0.6110740454862127, 'learning_rate': 1.8814176551834595e-08, 'epoch': 0.97} 97%|█████████▋| 21504/22095 [37:02:19<35:30, 3.60s/it] 97%|█████████▋| 21505/22095 [37:02:22<34:07, 3.47s/it] {'loss': 0.3062, 'grad_norm': 0.6500400906628822, 'learning_rate': 1.8750708551867336e-08, 'epoch': 0.97} 97%|█████████▋| 21505/22095 [37:02:22<34:07, 3.47s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21506/22095 [37:02:31<49:24, 5.03s/it] {'loss': 0.4977, 'grad_norm': 0.38635779551762117, 'learning_rate': 1.8687347583647985e-08, 'epoch': 0.97} 97%|█████████▋| 21506/22095 [37:02:31<49:24, 5.03s/it] 97%|█████████▋| 21507/22095 [37:02:35<44:48, 4.57s/it] {'loss': 0.2942, 'grad_norm': 0.6038331706183906, 'learning_rate': 1.8624093648539344e-08, 'epoch': 0.97} 97%|█████████▋| 21507/22095 [37:02:35<44:48, 4.57s/it] 97%|█████████▋| 21508/22095 [37:02:38<41:50, 4.28s/it] {'loss': 0.2343, 'grad_norm': 0.6262687242978424, 'learning_rate': 1.856094674789921e-08, 'epoch': 0.97} 97%|█████████▋| 21508/22095 [37:02:38<41:50, 4.28s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21509/22095 [37:02:42<39:52, 4.08s/it] {'loss': 0.3216, 'grad_norm': 0.6102937548707332, 'learning_rate': 1.8497906883085394e-08, 'epoch': 0.97} 97%|█████████▋| 21509/22095 [37:02:42<39:52, 4.08s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (58061 > 40960). 
Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (65626 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21510/22095 [37:02:49<48:18, 4.96s/it] {'loss': 0.4984, 'grad_norm': 0.5662823014312907, 'learning_rate': 1.8434974055451248e-08, 'epoch': 0.97} 97%|█████████▋| 21510/22095 [37:02:49<48:18, 4.96s/it] 97%|█████████▋| 21511/22095 [37:02:52<43:18, 4.45s/it] {'loss': 0.2854, 'grad_norm': 0.6200777181397742, 'learning_rate': 1.8372148266350696e-08, 'epoch': 0.97} 97%|█████████▋| 21511/22095 [37:02:52<43:18, 4.45s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21512/22095 [37:02:56<40:58, 4.22s/it] {'loss': 0.2692, 'grad_norm': 0.6630690064524812, 'learning_rate': 1.830942951713266e-08, 'epoch': 0.97} 97%|█████████▋| 21512/22095 [37:02:56<40:58, 4.22s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (52166 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (102025 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (46317 > 40960). 
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21513/22095 [37:02:59<38:15, 3.94s/it] {'loss': 0.3083, 'grad_norm': 0.6760977205002116, 'learning_rate': 1.8246817809144392e-08, 'epoch': 0.97} 97%|█████████▋| 21513/22095 [37:02:59<38:15, 3.94s/it] 97%|█████████▋| 21514/22095 [37:03:02<36:38, 3.78s/it] {'loss': 0.3112, 'grad_norm': 0.6257431106132486, 'learning_rate': 1.8184313143732035e-08, 'epoch': 0.97} 97%|█████████▋| 21514/22095 [37:03:03<36:38, 3.78s/it] 97%|█████████▋| 21515/22095 [37:03:06<34:24, 3.56s/it] {'loss': 0.3552, 'grad_norm': 0.5883702422264766, 'learning_rate': 1.812191552223841e-08, 'epoch': 0.97} 97%|█████████▋| 21515/22095 [37:03:06<34:24, 3.56s/it] 97%|█████████▋| 21516/22095 [37:03:09<33:42, 3.49s/it] {'loss': 0.2976, 'grad_norm': 0.610938178761219, 'learning_rate': 1.8059624946004105e-08, 'epoch': 0.97} 97%|█████████▋| 21516/22095 [37:03:09<33:42, 3.49s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (67989 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (73941 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21517/22095 [37:03:12<33:10, 3.44s/it] {'loss': 0.3352, 'grad_norm': 0.6088542630214259, 'learning_rate': 1.79974414163675e-08, 'epoch': 0.97} 97%|█████████▋| 21517/22095 [37:03:12<33:10, 3.44s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Token indices sequence length is longer than the specified maximum sequence length for this model (88781 > 40960). 
Running this sequence through the model will result in indexing errors 97%|█████████▋| 21518/22095 [37:03:21<49:10, 5.11s/it] {'loss': 0.4551, 'grad_norm': 0.23932636750915598, 'learning_rate': 1.7935364934664744e-08, 'epoch': 0.97} 97%|█████████▋| 21518/22095 [37:03:21<49:10, 5.11s/it] 97%|█████████▋| 21519/22095 [37:03:25<44:51, 4.67s/it] {'loss': 0.3013, 'grad_norm': 0.6661497356134606, 'learning_rate': 1.7873395502229774e-08, 'epoch': 0.97} 97%|█████████▋| 21519/22095 [37:03:25<44:51, 4.67s/it] 97%|█████████▋| 21520/22095 [37:03:28<39:53, 4.16s/it] {'loss': 0.256, 'grad_norm': 0.5806129832544668, 'learning_rate': 1.7811533120394296e-08, 'epoch': 0.97} 97%|█████████▋| 21520/22095 [37:03:28<39:53, 4.16s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21521/22095 [37:03:37<54:57, 5.74s/it] {'loss': 0.4692, 'grad_norm': 0.28544530450591754, 'learning_rate': 1.7749777790487256e-08, 'epoch': 0.97} 97%|█████████▋| 21521/22095 [37:03:37<54:57, 5.74s/it] 97%|█████████▋| 21522/22095 [37:03:40<47:33, 4.98s/it] {'loss': 0.3194, 'grad_norm': 0.5703201834714856, 'learning_rate': 1.7688129513835915e-08, 'epoch': 0.97} 97%|█████████▋| 21522/22095 [37:03:40<47:33, 4.98s/it] 97%|█████████▋| 21523/22095 [37:03:44<42:08, 4.42s/it] {'loss': 0.2476, 'grad_norm': 0.5872381541669788, 'learning_rate': 1.7626588291764225e-08, 'epoch': 0.97} 97%|█████████▋| 21523/22095 [37:03:44<42:08, 4.42s/it] 97%|█████████▋| 21524/22095 [37:03:47<40:06, 4.21s/it] {'loss': 0.2815, 'grad_norm': 0.5319838055635173, 'learning_rate': 1.7565154125595006e-08, 'epoch': 0.97} 97%|█████████▋| 21524/22095 [37:03:47<40:06, 4.21s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21525/22095 [37:03:58<57:34, 6.06s/it] {'loss': 0.4754, 'grad_norm': 0.28202626818847565, 'learning_rate': 1.7503827016648876e-08, 'epoch': 0.97} 97%|█████████▋| 21525/22095 [37:03:58<57:34, 6.06s/it] 97%|█████████▋| 21526/22095 [37:04:01<49:08, 5.18s/it] 
{'loss': 0.2987, 'grad_norm': 0.6586370358676067, 'learning_rate': 1.7442606966242005e-08, 'epoch': 0.97} 97%|█████████▋| 21526/22095 [37:04:01<49:08, 5.18s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 97%|█████████▋| 21527/22095 [37:04:07<52:37, 5.56s/it] {'loss': 0.4714, 'grad_norm': 0.2766401223790382, 'learning_rate': 1.7381493975691667e-08, 'epoch': 0.97} 97%|█████████▋| 21527/22095 [37:04:07<52:37, 5.56s/it] 97%|█████████▋| 21528/22095 [37:04:10<45:21, 4.80s/it] {'loss': 0.2988, 'grad_norm': 0.6413995570584645, 'learning_rate': 1.7320488046309593e-08, 'epoch': 0.97} 97%|█████████▋| 21528/22095 [37:04:10<45:21, 4.80s/it] 97%|█████████▋| 21529/22095 [37:04:14<41:44, 4.42s/it] {'loss': 0.3021, 'grad_norm': 0.6030670799447602, 'learning_rate': 1.7259589179406953e-08, 'epoch': 0.97} 97%|█████████▋| 21529/22095 [37:04:14<41:44, 4.42s/it] 97%|█████████▋| 21530/22095 [37:04:17<37:58, 4.03s/it] {'loss': 0.2757, 'grad_norm': 0.8385567228907049, 'learning_rate': 1.7198797376292708e-08, 'epoch': 0.97} 97%|█████████▋| 21530/22095 [37:04:17<37:58, 4.03s/it] 97%|█████████▋| 21531/22095 [37:04:20<34:24, 3.66s/it] {'loss': 0.3029, 'grad_norm': 0.6998410794379304, 'learning_rate': 1.7138112638272476e-08, 'epoch': 0.97} 97%|█████████▋| 21531/22095 [37:04:20<34:24, 3.66s/it] 97%|█████████▋| 21532/22095 [37:04:23<33:12, 3.54s/it] {'loss': 0.3045, 'grad_norm': 0.668868767753376, 'learning_rate': 1.7077534966650767e-08, 'epoch': 0.97} 97%|█████████▋| 21532/22095 [37:04:23<33:12, 3.54s/it] 97%|█████████▋| 21533/22095 [37:04:26<31:42, 3.39s/it] {'loss': 0.2379, 'grad_norm': 0.6094772471471711, 'learning_rate': 1.7017064362728764e-08, 'epoch': 0.97} 97%|█████████▋| 21533/22095 [37:04:26<31:42, 3.39s/it] 97%|█████████▋| 21534/22095 [37:04:29<30:14, 3.23s/it] {'loss': 0.2848, 'grad_norm': 0.6392838917586049, 'learning_rate': 1.6956700827806538e-08, 'epoch': 0.97} 97%|█████████▋| 21534/22095 [37:04:29<30:14, 3.23s/it]Rank 0: Number of image tokens 0 does not 
match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21535/22095 [37:04:32<29:48, 3.19s/it] {'loss': 0.3148, 'grad_norm': 0.7005772112789435, 'learning_rate': 1.689644436317972e-08, 'epoch': 0.97} 97%|█████████▋| 21535/22095 [37:04:32<29:48, 3.19s/it] 97%|█████████▋| 21536/22095 [37:04:36<31:30, 3.38s/it] {'loss': 0.2571, 'grad_norm': 0.6039779318860282, 'learning_rate': 1.6836294970144495e-08, 'epoch': 0.97} 97%|█████████▋| 21536/22095 [37:04:36<31:30, 3.38s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (80326 > 40960). Running this sequence through the model will result in indexing errors 97%|█████████▋| 21537/22095 [37:04:40<32:35, 3.50s/it] {'loss': 0.2869, 'grad_norm': 0.6087022627637411, 'learning_rate': 1.6776252649992608e-08, 'epoch': 0.97} 97%|█████████▋| 21537/22095 [37:04:40<32:35, 3.50s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 97%|█████████▋| 21538/22095 [37:04:43<32:16, 3.48s/it] {'loss': 0.276, 'grad_norm': 0.6536183340738748, 'learning_rate': 1.6716317404014136e-08, 'epoch': 0.97} 97%|█████████▋| 21538/22095 [37:04:43<32:16, 3.48s/it] 97%|█████████▋| 21539/22095 [37:04:46<31:20, 3.38s/it] {'loss': 0.2726, 'grad_norm': 0.5890982188008039, 'learning_rate': 1.665648923349694e-08, 'epoch': 0.97} 97%|█████████▋| 21539/22095 [37:04:46<31:20, 3.38s/it] 97%|█████████▋| 21540/22095 [37:04:49<29:59, 3.24s/it] {'loss': 0.3057, 'grad_norm': 0.6485504916729523, 'learning_rate': 1.659676813972666e-08, 'epoch': 0.97} 97%|█████████▋| 21540/22095 [37:04:49<29:59, 3.24s/it] 97%|█████████▋| 21541/22095 [37:04:53<31:24, 3.40s/it] {'loss': 0.3385, 'grad_norm': 0.6579237092730789, 'learning_rate': 1.6537154123986156e-08, 'epoch': 0.97} 97%|█████████▋| 21541/22095 [37:04:53<31:24, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 
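The paired messages above — `Rank 0: Number of image tokens 0 does not match number of images 1` followed by `Rank 0: Fixed image tokens in the conversation` — indicate the data pipeline repairing conversations whose `<image>` placeholder count disagrees with the number of attached images. The actual fix-up code in `data_qwen_2.py` is not shown in this log; the following is only a minimal hypothetical sketch of such a repair, with all names (`fix_image_tokens`, `IMAGE_TOKEN`) invented for illustration:

```python
# Hypothetical sketch, NOT the repo's actual logic: reconcile "<image>"
# placeholders with the number of attached images by stripping any existing
# placeholders and re-inserting exactly num_images of them in the first
# human turn, mirroring the "Fixed image tokens" log message.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(conversation: list[dict], num_images: int) -> list[dict]:
    """Return a copy of the conversation with exactly num_images placeholders."""
    fixed = [dict(msg) for msg in conversation]
    for msg in fixed:
        # Remove any stray or duplicated placeholders first.
        msg["value"] = msg["value"].replace(IMAGE_TOKEN, "").strip()
    for msg in fixed:
        if msg.get("from") == "human":
            # Re-insert one placeholder per image at the start of the first
            # user turn (a common convention; the real insertion point may differ).
            msg["value"] = (IMAGE_TOKEN + "\n") * num_images + msg["value"]
            break
    return fixed

conv = [{"from": "human", "value": "Describe the picture."},
        {"from": "gpt", "value": "A cat on a mat."}]
repaired = fix_image_tokens(conv, num_images=1)
```

This is only one plausible repair strategy (placeholders prepended to the first user turn); the training code may instead distribute placeholders across turns.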
97%|█████████▋| 21542/22095 [37:04:56<30:14, 3.28s/it] {'loss': 0.2696, 'grad_norm': 0.69823033168968, 'learning_rate': 1.647764718755718e-08, 'epoch': 0.97} 97%|█████████▋| 21542/22095 [37:04:56<30:14, 3.28s/it] 98%|█████████▊| 21543/22095 [37:04:59<30:49, 3.35s/it] {'loss': 0.3085, 'grad_norm': 0.5525465895809587, 'learning_rate': 1.641824733171815e-08, 'epoch': 0.98} 98%|█████████▊| 21543/22095 [37:04:59<30:49, 3.35s/it] 98%|█████████▊| 21544/22095 [37:05:02<29:42, 3.23s/it] {'loss': 0.3365, 'grad_norm': 0.5679357769200283, 'learning_rate': 1.6358954557744166e-08, 'epoch': 0.98} 98%|█████████▊| 21544/22095 [37:05:02<29:42, 3.23s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 98%|█████████▊| 21545/22095 [37:05:12<47:01, 5.13s/it] {'loss': 0.4753, 'grad_norm': 0.250665352604849, 'learning_rate': 1.629976886691087e-08, 'epoch': 0.98} 98%|█████████▊| 21545/22095 [37:05:12<47:01, 5.13s/it] 98%|█████████▊| 21546/22095 [37:05:16<42:47, 4.68s/it] {'loss': 0.3078, 'grad_norm': 0.5955064802272411, 'learning_rate': 1.6240690260488913e-08, 'epoch': 0.98} 98%|█████████▊| 21546/22095 [37:05:16<42:47, 4.68s/it] 98%|█████████▊| 21547/22095 [37:05:19<39:29, 4.32s/it] {'loss': 0.3097, 'grad_norm': 0.608417079637003, 'learning_rate': 1.6181718739748388e-08, 'epoch': 0.98} 98%|█████████▊| 21547/22095 [37:05:19<39:29, 4.32s/it] 98%|█████████▊| 21548/22095 [37:05:22<36:06, 3.96s/it] {'loss': 0.2917, 'grad_norm': 0.6143087542971571, 'learning_rate': 1.6122854305955505e-08, 'epoch': 0.98} 98%|█████████▊| 21548/22095 [37:05:22<36:06, 3.96s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 98%|█████████▊| 21549/22095 [37:05:30<47:54, 5.26s/it] {'loss': 0.4708, 'grad_norm': 0.2602212956379094, 'learning_rate': 1.6064096960376476e-08, 'epoch': 0.98} 98%|█████████▊| 21549/22095 [37:05:30<47:54, 5.26s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (135727 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (71690 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46694 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42079 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (47631 > 40960). Running this sequence through the model will result in indexing errors
 98%|█████████▊| 21550/22095 [37:05:34<43:28, 4.79s/it] {'loss': 0.331, 'grad_norm': 0.6126645269940824, 'learning_rate': 1.600544670427251e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (88872 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56998 > 40960).
Running this sequence through the model will result in indexing errors
 98%|█████████▊| 21551/22095 [37:05:37<39:27, 4.35s/it] {'loss': 0.2803, 'grad_norm': 0.6449883399574192, 'learning_rate': 1.5946903538904267e-08, 'epoch': 0.98}
 98%|█████████▊| 21552/22095 [37:05:41<36:09, 3.99s/it] {'loss': 0.2897, 'grad_norm': 0.6410152600174044, 'learning_rate': 1.5888467465529632e-08, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [706, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8470831 in VC:s3://internvl-moe-sft-data/. Exception: Image size [706, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 4467, 'image': 'vrdu_texteq/astro-ph.CO/d8885fcc-bed1-47b5-ba2a-9085740bcb4a.png', 'image_wh': [[706, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease use text recognition techniques to decipher the text within the image.'}, {'from': 'gpt', 'value': 'where $a$ and $H$ are to be evaluated at the end of inflation.'}]}
 98%|█████████▊| 21553/22095 [37:05:44<34:26, 3.81s/it] {'loss': 0.3022, 'grad_norm': 1.1902445501114625, 'learning_rate': 1.583013848540482e-08, 'epoch': 0.98}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
 98%|█████████▊| 21554/22095 [37:05:48<35:25, 3.93s/it] {'loss': 0.2639, 'grad_norm': 0.562064160439384, 'learning_rate': 1.5771916599782167e-08, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408532 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10726, 'image': 'vrdu_table_final_2/astro-ph.CO/f6d7151c-25da-4f3f-b639-0747e8f15f34.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nCan you convert the table from the image into LaTeX notation?'}, {'from': 'gpt', 'value': "Certainly, I'll convert the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]}
 98%|█████████▊| 21555/22095 [37:05:51<32:40, 3.63s/it] {'loss': 0.2595, 'grad_norm': 0.5666439513306696, 'learning_rate': 1.5713801809913443e-08, 'epoch': 0.98}
 98%|█████████▊| 21556/22095 [37:05:54<31:05, 3.46s/it] {'loss': 0.253, 'grad_norm': 0.6185233159292626, 'learning_rate': 1.5655794117047097e-08, 'epoch': 0.98}
 98%|█████████▊| 21557/22095 [37:05:57<30:10, 3.37s/it] {'loss': 0.304, 'grad_norm': 0.6169772789722926, 'learning_rate': 1.5597893522428796e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
 98%|█████████▊| 21558/22095 [37:06:07<46:16, 5.17s/it] {'loss': 0.4642, 'grad_norm': 0.27782104133037505, 'learning_rate': 1.5540100027304217e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (131662 > 40960).
Running this sequence through the model will result in indexing errors 98%|█████████▊| 21559/22095 [37:06:15<53:37, 6.00s/it] {'loss': 0.4763, 'grad_norm': 0.2715187409490002, 'learning_rate': 1.5482413632914028e-08, 'epoch': 0.98} 98%|█████████▊| 21559/22095 [37:06:15<53:37, 6.00s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 98%|█████████▊| 21560/22095 [37:06:19<47:57, 5.38s/it] {'loss': 0.3033, 'grad_norm': 0.5957365122967457, 'learning_rate': 1.5424834340497796e-08, 'epoch': 0.98} 98%|█████████▊| 21560/22095 [37:06:19<47:57, 5.38s/it] 98%|█████████▊| 21561/22095 [37:06:22<42:28, 4.77s/it] {'loss': 0.3173, 'grad_norm': 0.6438850183819776, 'learning_rate': 1.5367362151292863e-08, 'epoch': 0.98} 98%|█████████▊| 21561/22095 [37:06:22<42:28, 4.77s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (51089 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (85266 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (72214 > 40960). Running this sequence through the model will result in indexing errors 98%|█████████▊| 21562/22095 [37:06:26<39:26, 4.44s/it] {'loss': 0.2965, 'grad_norm': 0.8646528602373577, 'learning_rate': 1.5309997066534354e-08, 'epoch': 0.98} 98%|█████████▊| 21562/22095 [37:06:26<39:26, 4.44s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (49508 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (42884 > 40960). 
Running this sequence through the model will result in indexing errors Rank 0: Token indices sequence length is longer than the specified maximum sequence length (59079 > 40960) for 4 sample(s). Truncating to 14268 with 2 samples. 98%|█████████▊| 21563/22095 [37:06:29<35:26, 4.00s/it] {'loss': 0.3036, 'grad_norm': 0.6283703457696358, 'learning_rate': 1.5252739087454617e-08, 'epoch': 0.98} 98%|█████████▊| 21563/22095 [37:06:29<35:26, 4.00s/it] 98%|█████████▊| 21564/22095 [37:06:32<34:09, 3.86s/it] {'loss': 0.2437, 'grad_norm': 0.5739317342776467, 'learning_rate': 1.5195588215283773e-08, 'epoch': 0.98} 98%|█████████▊| 21564/22095 [37:06:32<34:09, 3.86s/it] 98%|█████████▊| 21565/22095 [37:06:35<31:36, 3.58s/it] {'loss': 0.3071, 'grad_norm': 0.5952634209800168, 'learning_rate': 1.5138544451250292e-08, 'epoch': 0.98} 98%|█████████▊| 21565/22095 [37:06:35<31:36, 3.58s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (79732 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (93901 > 40960). Running this sequence through the model will result in indexing errors Token indices sequence length is longer than the specified maximum sequence length for this model (41385 > 40960). 
Running this sequence through the model will result in indexing errors 98%|█████████▊| 21566/22095 [37:06:38<30:32, 3.46s/it] {'loss': 0.3059, 'grad_norm': 0.6403235870149089, 'learning_rate': 1.5081607796579856e-08, 'epoch': 0.98} 98%|█████████▊| 21566/22095 [37:06:38<30:32, 3.46s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 98%|█████████▊| 21567/22095 [37:06:46<43:02, 4.89s/it] {'loss': 0.4823, 'grad_norm': 0.27436692115992706, 'learning_rate': 1.502477825249593e-08, 'epoch': 0.98} 98%|█████████▊| 21567/22095 [37:06:46<43:02, 4.89s/it] 98%|█████████▊| 21568/22095 [37:06:50<38:58, 4.44s/it] {'loss': 0.2929, 'grad_norm': 0.6302113621127572, 'learning_rate': 1.4968055820218653e-08, 'epoch': 0.98} 98%|█████████▊| 21568/22095 [37:06:50<38:58, 4.44s/it] 98%|█████████▊| 21569/22095 [37:06:53<35:30, 4.05s/it] {'loss': 0.2988, 'grad_norm': 0.6228658000514758, 'learning_rate': 1.4911440500968155e-08, 'epoch': 0.98} 98%|█████████▊| 21569/22095 [37:06:53<35:30, 4.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 98%|█████████▊| 21570/22095 [37:07:03<50:19, 5.75s/it] {'loss': 0.4579, 'grad_norm': 0.2591940642929855, 'learning_rate': 1.4854932295959578e-08, 'epoch': 0.98} 98%|█████████▊| 21570/22095 [37:07:03<50:19, 5.75s/it] 98%|█████████▊| 21571/22095 [37:07:12<59:52, 6.86s/it] {'loss': 0.4573, 'grad_norm': 0.2359238616287538, 'learning_rate': 1.4798531206408617e-08, 'epoch': 0.98} 98%|█████████▊| 21571/22095 [37:07:12<59:52, 6.86s/it]Invalidate trace cache @ step 2: expected module 364, but got module 1 98%|█████████▊| 21572/22095 [37:07:16<50:41, 5.82s/it] {'loss': 0.2585, 'grad_norm': 0.5557793876162257, 'learning_rate': 1.474223723352597e-08, 'epoch': 0.98} 98%|█████████▊| 21572/22095 [37:07:16<50:41, 5.82s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (44295 > 40960). 
Running this sequence through the model will result in indexing errors 98%|█████████▊| 21573/22095 [37:07:19<43:58, 5.05s/it] {'loss': 0.2887, 'grad_norm': 0.6008817741426201, 'learning_rate': 1.4686050378521221e-08, 'epoch': 0.98} 98%|█████████▊| 21573/22095 [37:07:19<43:58, 5.05s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 98%|█████████▊| 21574/22095 [37:07:28<55:19, 6.37s/it] {'loss': 0.4633, 'grad_norm': 0.25315653321249215, 'learning_rate': 1.4629970642602298e-08, 'epoch': 0.98} 98%|█████████▊| 21574/22095 [37:07:28<55:19, 6.37s/it] 98%|█████████▊| 21575/22095 [37:07:32<47:10, 5.44s/it] {'loss': 0.2273, 'grad_norm': 0.5625693320994961, 'learning_rate': 1.457399802697379e-08, 'epoch': 0.98} 98%|█████████▊| 21575/22095 [37:07:32<47:10, 5.44s/it] 98%|█████████▊| 21576/22095 [37:07:36<43:21, 5.01s/it] {'loss': 0.2596, 'grad_norm': 0.5776294944714752, 'learning_rate': 1.4518132532838624e-08, 'epoch': 0.98} 98%|█████████▊| 21576/22095 [37:07:36<43:21, 5.01s/it] 98%|█████████▊| 21577/22095 [37:07:39<38:30, 4.46s/it] {'loss': 0.273, 'grad_norm': 0.6204162617245031, 'learning_rate': 1.4462374161396952e-08, 'epoch': 0.98} 98%|█████████▊| 21577/22095 [37:07:39<38:30, 4.46s/it] 98%|█████████▊| 21578/22095 [37:07:42<34:12, 3.97s/it] {'loss': 0.2819, 'grad_norm': 0.5925661056508587, 'learning_rate': 1.440672291384726e-08, 'epoch': 0.98} 98%|█████████▊| 21578/22095 [37:07:42<34:12, 3.97s/it] 98%|█████████▊| 21579/22095 [37:07:45<31:53, 3.71s/it] {'loss': 0.3058, 'grad_norm': 0.5902542261005069, 'learning_rate': 1.4351178791384702e-08, 'epoch': 0.98} 98%|█████████▊| 21579/22095 [37:07:45<31:53, 3.71s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 98%|█████████▊| 21580/22095 [37:07:48<30:15, 3.52s/it] {'loss': 0.3097, 'grad_norm': 0.6734892936433218, 'learning_rate': 1.4295741795203322e-08, 'epoch': 0.98} 98%|█████████▊| 21580/22095 [37:07:48<30:15, 3.52s/it]Invalidate trace 
cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21581/22095 [37:07:57<45:44, 5.34s/it] {'loss': 0.4596, 'grad_norm': 0.25765264371655133, 'learning_rate': 1.4240411926493835e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (111548 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (97346 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (49775 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (105203 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96181 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21582/22095 [37:08:01<41:08, 4.81s/it] {'loss': 0.2836, 'grad_norm': 0.569592922648196, 'learning_rate': 1.4185189186445292e-08, 'epoch': 0.98}
98%|█████████▊| 21583/22095 [37:08:04<36:31, 4.28s/it] {'loss': 0.2876, 'grad_norm': 0.6310649915158688, 'learning_rate': 1.4130073576244518e-08, 'epoch': 0.98}
98%|█████████▊| 21584/22095 [37:08:07<34:29, 4.05s/it] {'loss': 0.289, 'grad_norm': 0.5954175314647394, 'learning_rate': 1.4075065097075013e-08, 'epoch': 0.98}
98%|█████████▊| 21585/22095 [37:08:11<33:02, 3.89s/it] {'loss': 0.3152, 'grad_norm': 0.584969965100779, 'learning_rate': 1.402016375011972e-08, 'epoch': 0.98}
98%|█████████▊| 21586/22095 [37:08:14<30:45, 3.63s/it] {'loss': 0.2726, 'grad_norm': 0.6513597095588576, 'learning_rate': 1.3965369536557694e-08, 'epoch': 0.98}
98%|█████████▊| 21587/22095 [37:08:19<33:41, 3.98s/it] {'loss': 0.3172, 'grad_norm': 0.6068463477849974, 'learning_rate': 1.3910682457566327e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (42739 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (48917 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (93575 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21588/22095 [37:08:22<30:33, 3.62s/it] {'loss': 0.2749, 'grad_norm': 0.6569562461124709, 'learning_rate': 1.3856102514321345e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (42705 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21589/22095 [37:08:25<30:05, 3.57s/it] {'loss': 0.2997, 'grad_norm': 0.6050937931355871, 'learning_rate': 1.3801629707994035e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21590/22095 [37:08:35<46:13, 5.49s/it] {'loss': 0.4529, 'grad_norm': 0.26659203847765517, 'learning_rate': 1.3747264039756236e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21591/22095 [37:08:38<40:28, 4.82s/it] {'loss': 0.2708, 'grad_norm': 0.6891775542701496, 'learning_rate': 1.3693005510775903e-08, 'epoch': 0.98}
98%|█████████▊| 21592/22095 [37:08:42<37:17, 4.45s/it] {'loss': 0.2726, 'grad_norm': 0.6208702276175495, 'learning_rate': 1.3638854122218214e-08, 'epoch': 0.98}
98%|█████████▊| 21593/22095 [37:08:45<33:03, 3.95s/it] {'loss': 0.2751, 'grad_norm': 0.5831698339732241, 'learning_rate': 1.358480987524724e-08, 'epoch': 0.98}
98%|█████████▊| 21594/22095 [37:08:47<30:20, 3.63s/it] {'loss': 0.321, 'grad_norm': 0.6343592251514779, 'learning_rate': 1.3530872771024273e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (52019 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21595/22095 [37:08:51<29:27, 3.54s/it] {'loss': 0.2605, 'grad_norm': 0.5875276035704957, 'learning_rate': 1.3477042810707829e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21596/22095 [37:09:00<44:15, 5.32s/it] {'loss': 0.4601, 'grad_norm': 0.24808703230881213, 'learning_rate': 1.3423319995454765e-08, 'epoch': 0.98}
98%|█████████▊| 21597/22095 [37:09:11<58:35, 7.06s/it] {'loss': 0.4717, 'grad_norm': 0.26599602174500386, 'learning_rate': 1.3369704326419709e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [409, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8498440 in VC:s3://internvl-moe-sft-data/. Exception: Image size [409, 23, 100, 100] is too small. Minimum size is 28.
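The "Image size ... is too small. Minimum size is 28" failures above recur across several datasets, and each one costs a fetch retry at training time. Since the annotation records already carry an `image_wh` field, such samples can be dropped once, offline. A minimal sketch, assuming the record layout shown in the log ("image_wh": [[w, h]]); the function name and the pre-filtering step itself are ours, not part of data_qwen_2.py:

```python
def filter_small_images(records, min_side=28):
    """Split records into (kept, dropped) by the 28-px minimum side seen in the log.

    Records without an image_wh entry (text-only samples) are kept.
    """
    kept, dropped = [], []
    for rec in records:
        sizes = rec.get("image_wh") or []
        if all(w >= min_side and h >= min_side for w, h in sizes):
            kept.append(rec)
        else:
            dropped.append(rec)
    return kept, dropped

# Shapes taken from the problematic samples reported above.
samples = [
    {"id": 141012, "image_wh": [[409, 23]]},   # 23 px tall: rejected at runtime
    {"id": 61251, "image_wh": [[137, 25]]},    # 25 px tall: rejected at runtime
    {"id": 1, "image_wh": [[640, 480]]},
]
kept, dropped = filter_small_images(samples)
```

Running this once over the annotation JSON and saving the filtered copy would silence these retries for the rest of the run.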
Problematic sample: {'id': 141012, 'image': 'vrdu_texteq/astro-ph.CO/5ce717f2-8412-4144-aa60-b1f44eca4267.png', 'image_wh': [[409, 23]], 'conversations': [{'from': 'human', 'value': '\nCould you please perform optical character recognition (OCR) on the image to retrieve the text?'}, {'from': 'gpt', 'value': 'and the volume of $\\mathcal{R}$ is defined as'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8918153 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 41306, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': "\n如图所示,AB=16cm,C为AB上任意点,D为AC中点,E为BC中点,则段长为()\nA. 16cm\nB. 32cm\nC. 4cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21598/22095 [37:09:16<53:22, 6.44s/it] {'loss': 0.2957, 'grad_norm': 0.6308260174148227, 'learning_rate': 1.3316195804753962e-08, 'epoch': 0.98}
98%|█████████▊| 21599/22095 [37:09:21<48:45, 5.90s/it] {'loss': 0.3095, 'grad_norm': 0.6417862465844587, 'learning_rate': 1.3262794431608272e-08, 'epoch': 0.98}
98%|█████████▊| 21600/22095 [37:09:25<43:50, 5.31s/it] {'loss': 0.3006, 'grad_norm': 0.5779867840850115, 'learning_rate': 1.32095002081295e-08, 'epoch': 0.98}
98%|█████████▊| 21601/22095 [37:09:29<40:59, 4.98s/it] {'loss': 0.2623, 'grad_norm': 0.6374846073793348, 'learning_rate': 1.3156313135462284e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (64148 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46985 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56629 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21602/22095 [37:09:33<37:37, 4.58s/it] {'loss': 0.3378, 'grad_norm': 0.6026661398778114, 'learning_rate': 1.310323321475071e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21603/22095 [37:09:42<49:48, 6.07s/it] {'loss': 0.4632, 'grad_norm': 0.24092280344486997, 'learning_rate': 1.3050260447133866e-08, 'epoch': 0.98}
98%|█████████▊| 21604/22095 [37:09:46<43:52, 5.36s/it] {'loss': 0.295, 'grad_norm': 0.6303012480618946, 'learning_rate': 1.2997394833750842e-08, 'epoch': 0.98}
98%|█████████▊| 21605/22095 [37:09:50<40:11, 4.92s/it] {'loss': 0.2873, 'grad_norm': 0.5629119592200705, 'learning_rate': 1.2944636375737952e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (95767 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21606/22095 [37:09:53<35:44, 4.39s/it] {'loss': 0.248, 'grad_norm': 0.587805715190683, 'learning_rate': 1.289198507422762e-08, 'epoch': 0.98}
98%|█████████▊| 21607/22095 [37:09:57<34:35, 4.25s/it] {'loss': 0.3042, 'grad_norm': 0.7009930509067366, 'learning_rate': 1.2839440930352276e-08, 'epoch': 0.98}
98%|█████████▊| 21608/22095 [37:10:00<31:43, 3.91s/it] {'loss': 0.2643, 'grad_norm': 0.624810524584758, 'learning_rate': 1.2787003945239906e-08, 'epoch': 0.98}
98%|█████████▊| 21609/22095 [37:10:03<29:10, 3.60s/it] {'loss': 0.2777, 'grad_norm': 0.62834319904867, 'learning_rate': 1.2734674120018497e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21610/22095 [37:10:12<40:55, 5.06s/it] {'loss': 0.4483, 'grad_norm': 0.2711195268524511, 'learning_rate': 1.268245145581104e-08, 'epoch': 0.98}
98%|█████████▊| 21611/22095 [37:10:15<37:08, 4.60s/it] {'loss': 0.2703, 'grad_norm': 0.6405341538941641, 'learning_rate': 1.2630335953740524e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (50333 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46594 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21612/22095 [37:10:18<33:30, 4.16s/it] {'loss': 0.3029, 'grad_norm': 0.7713359262031876, 'learning_rate': 1.257832761492661e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (80554 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21613/22095 [37:10:22<31:36, 3.93s/it] {'loss': 0.297, 'grad_norm': 0.6050305225094347, 'learning_rate': 1.2526426440486738e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21614/22095 [37:10:32<48:21, 6.03s/it] {'loss': 0.4589, 'grad_norm': 0.2565092506379215, 'learning_rate': 1.2474632431536126e-08, 'epoch': 0.98}
98%|█████████▊| 21615/22095 [37:10:37<44:49, 5.60s/it] {'loss': 0.3016, 'grad_norm': 0.6426891785778904, 'learning_rate': 1.2422945589187774e-08, 'epoch': 0.98}
98%|█████████▊| 21616/22095 [37:10:41<40:39, 5.09s/it] {'loss': 0.3074, 'grad_norm': 0.7105239256678185, 'learning_rate': 1.2371365914551903e-08, 'epoch': 0.98}
98%|█████████▊| 21617/22095 [37:10:45<37:27, 4.70s/it] {'loss': 0.3187, 'grad_norm': 0.6318290634122832, 'learning_rate': 1.2319893408737072e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21618/22095 [37:10:49<35:17, 4.44s/it] {'loss': 0.2676, 'grad_norm': 0.7091202626299058, 'learning_rate': 1.2268528072849063e-08, 'epoch': 0.98}
98%|█████████▊| 21619/22095 [37:10:52<33:43, 4.25s/it] {'loss': 0.2756, 'grad_norm': 0.5991788178591732, 'learning_rate': 1.221726990799199e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21620/22095 [37:11:02<46:47, 5.91s/it] {'loss': 0.4842, 'grad_norm': 0.2644385392660403, 'learning_rate': 1.21661189152672e-08, 'epoch': 0.98}
98%|█████████▊| 21621/22095 [37:11:06<40:55, 5.18s/it] {'loss': 0.3265, 'grad_norm': 0.6044750980358486, 'learning_rate': 1.2115075095773255e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (73909 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21622/22095 [37:11:10<38:21, 4.86s/it] {'loss': 0.3153, 'grad_norm': 0.6782799426016777, 'learning_rate': 1.206413845060761e-08, 'epoch': 0.98}
98%|█████████▊| 21623/22095 [37:11:13<34:29, 4.38s/it] {'loss': 0.2665, 'grad_norm': 0.6519410449382191, 'learning_rate': 1.2013308980863836e-08, 'epoch': 0.98}
98%|█████████▊| 21624/22095 [37:11:17<32:46, 4.17s/it] {'loss': 0.2875, 'grad_norm': 0.5774124278120938, 'learning_rate': 1.1962586687634947e-08, 'epoch': 0.98}
98%|█████████▊| 21625/22095 [37:11:20<30:02, 3.83s/it] {'loss': 0.2441, 'grad_norm': 0.6004423970128941, 'learning_rate': 1.1911971572010073e-08, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [137, 25, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8394416 in VC:s3://internvl-moe-sft-data/. Exception: Image size [137, 25, 100, 100] is too small. Minimum size is 28.
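The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings come from the tokenizer side: a packed sample exceeds the model's context window, and unless it is truncated or skipped it will index past the position embeddings. A minimal sketch of the check involved; the function name, the `truncate` option, and the skip-vs-truncate policy are illustrative, not taken from the training code:

```python
MAX_LEN = 40960  # the model maximum reported in the warnings above


def check_length(token_ids, max_len=MAX_LEN, truncate=False):
    """Return (token_ids, was_truncated); raise if over-length and not truncating."""
    if len(token_ids) <= max_len:
        return token_ids, False
    if truncate:
        # Hard truncation; for multimodal data this can cut through an image
        # span, so skipping the sample is often the safer policy.
        return token_ids[:max_len], True
    raise ValueError(f"sequence length {len(token_ids)} exceeds max {max_len}")
```

For sequences like the 111548-token one in this log, truncation would discard well over half the sample, which is why flagging and dropping such records during data preparation is usually preferable.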
Problematic sample: {'id': 61251, 'image': 'vrdu_table_final_2/astro-ph.EP/766f187b-4ecd-4784-953c-e2ff39c8c397.png', 'image_wh': [[137, 25]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}[c]{@{}l@{}} \\hspace{-1cm}Continuum \\\\ \\\\ \\end{tabular}\n```"}]}
98%|█████████▊| 21626/22095 [37:11:24<30:00, 3.84s/it] {'loss': 0.3083, 'grad_norm': 0.5843913260285941, 'learning_rate': 1.1861463635077785e-08, 'epoch': 0.98}
98%|█████████▊| 21627/22095 [37:11:27<29:39, 3.80s/it] {'loss': 0.3188, 'grad_norm': 0.6598961792562245, 'learning_rate': 1.181106287792222e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21628/22095 [37:11:32<30:23, 3.91s/it] {'loss': 0.2678, 'grad_norm': 0.6118189962772933, 'learning_rate': 1.1760769301626951e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (41682 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21629/22095 [37:11:35<29:20, 3.78s/it] {'loss': 0.2965, 'grad_norm': 0.7576996983657155, 'learning_rate': 1.1710582907272783e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21630/22095 [37:11:45<42:55, 5.54s/it] {'loss': 0.4581, 'grad_norm': 0.28542584526267356, 'learning_rate': 1.166050369593774e-08, 'epoch': 0.98}
98%|█████████▊| 21631/22095 [37:11:48<37:35, 4.86s/it] {'loss': 0.2755, 'grad_norm': 0.5788657187354852, 'learning_rate': 1.1610531668697633e-08, 'epoch': 0.98}
98%|█████████▊| 21632/22095 [37:11:51<33:44, 4.37s/it] {'loss': 0.2844, 'grad_norm': 0.6078411782319739, 'learning_rate': 1.1560666826627154e-08, 'epoch': 0.98}
98%|█████████▊| 21633/22095 [37:11:54<30:25, 3.95s/it] {'loss': 0.3157, 'grad_norm': 0.6201364270028205, 'learning_rate': 1.1510909170796558e-08, 'epoch': 0.98}
98%|█████████▊| 21634/22095 [37:11:57<28:25, 3.70s/it] {'loss': 0.2549, 'grad_norm': 0.5508890145642606, 'learning_rate': 1.14612587022761e-08, 'epoch': 0.98}
98%|█████████▊| 21635/22095 [37:12:00<26:54, 3.51s/it] {'loss': 0.2701, 'grad_norm': 0.6795842409627164, 'learning_rate': 1.1411715422131598e-08, 'epoch': 0.98}
98%|█████████▊| 21636/22095 [37:12:04<26:11, 3.42s/it] {'loss': 0.2996, 'grad_norm': 0.5777709409082231, 'learning_rate': 1.1362279331428305e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21637/22095 [37:12:12<37:54, 4.97s/it] {'loss': 0.4574, 'grad_norm': 0.2564448952346507, 'learning_rate': 1.1312950431228153e-08, 'epoch': 0.98}
98%|█████████▊| 21638/22095 [37:12:15<34:06, 4.48s/it] {'loss': 0.2868, 'grad_norm': 0.7122237193809913, 'learning_rate': 1.126372872259085e-08, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408533 in VC:s3://internvl-moe-sft-data/. Exception: Image size [17, 28, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10727, 'image': 'vrdu_table_final_2/astro-ph.CO/a1102c05-6ff0-4965-af9b-cfab23096f41.png', 'image_wh': [[17, 28]], 'conversations': [{'from': 'human', 'value': '\nPlease craft the table from the image into LaTeX notation.'}, {'from': 'gpt', 'value': "Certainly, I'll craft the table from the image into LaTeX notation for you.\n```latex\n\\begin{tabular}{c}$\\beta$\\end{tabular}\n```"}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8388074 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 54889, 'image': 'vrdu_table_final_2/astro-ph.CO/13ad7ce5-67cf-43dd-bb89-2d336ddce023.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
98%|█████████▊| 21639/22095 [37:12:18<30:18, 3.99s/it] {'loss': 0.2437, 'grad_norm': 0.5895747232643146, 'learning_rate': 1.1214614206574437e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21640/22095 [37:12:22<29:40, 3.91s/it] {'loss': 0.3135, 'grad_norm': 0.630322761241128, 'learning_rate': 1.1165606884234182e-08, 'epoch': 0.98}
98%|█████████▊| 21641/22095 [37:12:25<28:30, 3.77s/it] {'loss': 0.2899, 'grad_norm': 0.6556277953449368, 'learning_rate': 1.111670675662313e-08, 'epoch': 0.98}
98%|█████████▊| 21642/22095 [37:12:29<28:19, 3.75s/it] {'loss': 0.2835, 'grad_norm': 0.5784803966331807, 'learning_rate': 1.1067913824791553e-08, 'epoch': 0.98}
98%|█████████▊| 21643/22095 [37:12:32<26:39, 3.54s/it] {'loss': 0.2543, 'grad_norm': 0.547317704540307, 'learning_rate': 1.1019228089788613e-08, 'epoch': 0.98}
98%|█████████▊| 21644/22095 [37:12:36<26:41, 3.55s/it] {'loss': 0.2773, 'grad_norm': 0.5978956990884308, 'learning_rate': 1.0970649552659585e-08, 'epoch': 0.98}
98%|█████████▊| 21645/22095 [37:12:39<25:47, 3.44s/it] {'loss': 0.2687, 'grad_norm': 0.5614664206206553, 'learning_rate': 1.092217821444863e-08, 'epoch': 0.98}
98%|█████████▊| 21646/22095 [37:12:42<25:12, 3.37s/it] {'loss': 0.2588, 'grad_norm': 0.672027978277878, 'learning_rate': 1.0873814076197142e-08, 'epoch': 0.98}
98%|█████████▊| 21647/22095 [37:12:46<26:31, 3.55s/it] {'loss': 0.2885, 'grad_norm': 0.6207091688396222, 'learning_rate': 1.0825557138944843e-08, 'epoch': 0.98}
98%|█████████▊| 21648/22095 [37:12:49<24:54, 3.34s/it] {'loss': 0.2752, 'grad_norm': 0.6139079486988628, 'learning_rate': 1.0777407403728123e-08, 'epoch': 0.98}
98%|█████████▊| 21649/22095 [37:12:52<25:02, 3.37s/it] {'loss': 0.2969, 'grad_norm': 0.5793301468223664, 'learning_rate': 1.0729364871581716e-08, 'epoch': 0.98}
98%|█████████▊| 21650/22095 [37:12:56<26:05, 3.52s/it] {'loss': 0.309, 'grad_norm': 0.7081059442224155, 'learning_rate': 1.0681429543538125e-08, 'epoch': 0.98}
98%|█████████▊| 21651/22095 [37:13:00<26:30, 3.58s/it] {'loss': 0.31, 'grad_norm': 0.6522737384702686, 'learning_rate': 1.0633601420626528e-08, 'epoch': 0.98}
98%|█████████▊| 21652/22095 [37:13:03<25:41, 3.48s/it] {'loss': 0.3273, 'grad_norm': 0.6194217668405, 'learning_rate': 1.0585880503875546e-08, 'epoch': 0.98}
98%|█████████▊| 21653/22095 [37:13:06<24:33, 3.33s/it] {'loss': 0.2502, 'grad_norm': 0.7235806562053905, 'learning_rate': 1.0538266794309914e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (76301 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21654/22095 [37:13:09<23:38, 3.22s/it] {'loss': 0.3025, 'grad_norm': 0.6185869570994706, 'learning_rate': 1.049076029295326e-08, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21655/22095 [37:13:13<25:00, 3.41s/it] {'loss': 0.3451, 'grad_norm': 0.628706820093385, 'learning_rate': 1.044336100082588e-08, 'epoch': 0.98}
98%|█████████▊| 21656/22095 [37:13:16<24:30, 3.35s/it] {'loss': 0.3198, 'grad_norm': 0.6164354143700633, 'learning_rate': 1.03960689189464e-08, 'epoch': 0.98}
98%|█████████▊| 21657/22095 [37:13:20<24:36, 3.37s/it] {'loss': 0.2957, 'grad_norm': 0.6485613291924816, 'learning_rate': 1.0348884048331232e-08, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8305386 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
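The "Number of image tokens N does not match number of images M ... Fixed image tokens in the conversation" pairs in this log indicate a self-repair step: the first user turn carries a different count of image placeholders than attached images, and the loader reconciles them before tokenization. A minimal sketch of one such repair, assuming a "<image>" placeholder string and a rebuild-from-scratch strategy; the real logic in data_qwen_2.py may differ:

```python
IMG_TOK = "<image>"  # assumed placeholder string


def fix_image_tokens(text, num_images):
    """Return (text, was_fixed) with exactly num_images placeholders.

    If the count already matches, the text is returned untouched; otherwise
    stray placeholders are stripped and the right number is prepended,
    mirroring the "Fixed image tokens" log lines.
    """
    found = text.count(IMG_TOK)
    if found == num_images:
        return text, False
    stripped = text.replace(IMG_TOK, "")
    return IMG_TOK * num_images + stripped, True
```

This covers both directions seen in the log: zero placeholders with one image, and two placeholders with one image.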
Problematic sample: {'image': 'TB1iJ.phLDH8KJjy1XcXXcpdXXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\n图中有哪些所有的文字内容?请列举出来。'}, {'from': 'gpt', 'value': '图中所有文字:\n真正上海发货\n皇冠\n信誉\n8\n8\n88\nNewAmvel\n新品推荐\n强货\n365天包换全新!'}]}
98%|█████████▊| 21658/22095 [37:13:23<25:10, 3.46s/it] {'loss': 0.3375, 'grad_norm': 0.6598439358952244, 'learning_rate': 1.0301806389994006e-08, 'epoch': 0.98}
98%|█████████▊| 21659/22095 [37:13:27<24:41, 3.40s/it] {'loss': 0.3017, 'grad_norm': 0.5931629157219409, 'learning_rate': 1.025483594494614e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (95943 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68216 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (67458 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (65032 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21660/22095 [37:13:30<25:35, 3.53s/it] {'loss': 0.2703, 'grad_norm': 0.5585563100963906, 'learning_rate': 1.0207972714196824e-08, 'epoch': 0.98}
98%|█████████▊| 21661/22095 [37:13:35<27:19, 3.78s/it] {'loss': 0.329, 'grad_norm': 0.5768770066117372, 'learning_rate': 1.0161216698753029e-08, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21662/22095 [37:13:43<36:28, 5.05s/it] {'loss': 0.4897, 'grad_norm': 0.266834695010642, 'learning_rate': 1.0114567899620066e-08, 'epoch': 0.98}
98%|█████████▊| 21663/22095 [37:13:46<33:16, 4.62s/it] {'loss': 0.2643, 'grad_norm': 0.6164671838543433, 'learning_rate': 1.0068026317799906e-08, 'epoch': 0.98}
98%|█████████▊| 21664/22095 [37:13:49<29:16, 4.08s/it] {'loss': 0.2698, 'grad_norm': 0.6455551846103544, 'learning_rate': 1.0021591954291754e-08, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (75187 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54003 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21665/22095 [37:13:52<27:17, 3.81s/it] {'loss': 0.2947, 'grad_norm': 0.5680472792211821, 'learning_rate': 9.975264810094254e-09, 'epoch': 0.98}
98%|█████████▊| 21666/22095 [37:13:56<25:42, 3.59s/it] {'loss': 0.2885, 'grad_norm': 0.6534256833339015, 'learning_rate': 9.929044886203276e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21667/22095 [37:13:59<25:13, 3.54s/it] {'loss': 0.2808, 'grad_norm': 0.6006793231315621, 'learning_rate': 9.882932183610806e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21668/22095 [37:14:08<37:44, 5.30s/it] {'loss': 0.4742, 'grad_norm': 0.25155329493710854, 'learning_rate': 9.836926703307714e-09, 'epoch': 0.98}
98%|█████████▊| 21669/22095 [37:14:12<34:48, 4.90s/it] {'loss': 0.3059, 'grad_norm': 0.5974539516861112, 'learning_rate': 9.791028446283768e-09, 'epoch': 0.98}
98%|█████████▊| 21670/22095 [37:14:15<31:00, 4.38s/it] {'loss': 0.248, 'grad_norm': 0.6617095694002861, 'learning_rate': 9.745237413523733e-09, 'epoch': 0.98}
98%|█████████▊| 21671/22095 [37:14:18<27:39, 3.91s/it] {'loss': 0.2827, 'grad_norm': 0.6060716678997371, 'learning_rate': 9.69955360601238e-09, 'epoch': 0.98}
98%|█████████▊| 21672/22095 [37:14:22<26:54, 3.82s/it] {'loss': 0.3029, 'grad_norm': 0.5792760913170358, 'learning_rate': 9.653977024731143e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21673/22095 [37:14:31<38:50, 5.52s/it] {'loss': 0.4947, 'grad_norm': 0.2910242789051668, 'learning_rate': 9.608507670659239e-09, 'epoch': 0.98}
98%|█████████▊| 21674/22095 [37:14:36<35:50, 5.11s/it] {'loss': 0.2854, 'grad_norm': 0.6148235384331048, 'learning_rate': 9.563145544773666e-09, 'epoch': 0.98}
98%|█████████▊| 21675/22095 [37:14:39<31:21, 4.48s/it] {'loss': 0.2991, 'grad_norm': 0.615447357686388, 'learning_rate': 9.517890648049199e-09, 'epoch': 0.98}
98%|█████████▊| 21676/22095 [37:14:42<29:43, 4.26s/it] {'loss': 0.2366, 'grad_norm': 0.6092808326297857, 'learning_rate': 9.472742981458393e-09, 'epoch': 0.98}
98%|█████████▊| 21677/22095 [37:14:47<30:08, 4.33s/it] {'loss': 0.3255, 'grad_norm': 0.6672915344250399, 'learning_rate': 9.427702545970474e-09, 'epoch': 0.98}
98%|█████████▊| 21678/22095 [37:14:50<27:30, 3.96s/it] {'loss': 0.2415, 'grad_norm': 0.6059301178060165, 'learning_rate': 9.38276934255411e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21679/22095 [37:14:54<27:50, 4.02s/it] {'loss': 0.272, 'grad_norm': 0.5872986394992463, 'learning_rate': 9.337943372175195e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (57113 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46314 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83114 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21680/22095 [37:14:57<26:10, 3.78s/it] {'loss': 0.2612, 'grad_norm': 0.5859666654621924, 'learning_rate': 9.293224635795184e-09, 'epoch': 0.98}
98%|█████████▊| 21681/22095 [37:15:00<24:22, 3.53s/it] {'loss': 0.278, 'grad_norm': 0.6206243334069153, 'learning_rate': 9.248613134376638e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21682/22095 [37:15:04<24:52, 3.61s/it] {'loss': 0.3476, 'grad_norm': 0.5897081586642346, 'learning_rate': 9.204108868877127e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21683/22095 [37:15:07<24:29, 3.57s/it] {'loss': 0.3461, 'grad_norm': 0.626181183441953, 'learning_rate': 9.15971184025366e-09, 'epoch': 0.98}
98%|█████████▊| 21684/22095 [37:15:11<24:30, 3.58s/it] {'loss': 0.3023, 'grad_norm': 0.5570704921342378, 'learning_rate': 9.115422049459365e-09, 'epoch': 0.98}
98%|█████████▊| 21685/22095 [37:15:15<25:45, 3.77s/it] {'loss': 0.3333, 'grad_norm': 0.6182692913792313, 'learning_rate': 9.071239497446815e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (48144 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (107723 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21686/22095 [37:15:19<25:06, 3.68s/it] {'loss': 0.3385, 'grad_norm': 0.6284031326139313, 'learning_rate': 9.027164185164694e-09, 'epoch': 0.98}
98%|█████████▊| 21687/22095 [37:15:22<24:41, 3.63s/it] {'loss': 0.2968, 'grad_norm': 0.7153580755772212, 'learning_rate': 8.983196113560023e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (78497 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (63636 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21688/22095 [37:15:32<36:30, 5.38s/it] {'loss': 0.4477, 'grad_norm': 0.27580635853133384, 'learning_rate': 8.939335283577599e-09, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in _get_item
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1360, in <lambda>
    sizes = reduce(lambda x, y: x + y, sizes)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 905, in process_image_unified
    cur_len = sum(
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 161, in load_image
    raise ValueError("tcs_loader is required to load image from s3://")
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 110, in __call__
    duration = round((time.time() - start_time) * 1000, 2)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 94, in pil_loader
    buff = io.BytesIO(img_str)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3515, in open
    im = _open_core(fp, filename, prefix, formats)
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3504, in _open_core
_decompression_bomb_check(im.size) File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py", line 3403, in _decompression_bomb_check raise DecompressionBombError(msg) PIL.Image.DecompressionBombError: Image size (189987600 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. [Try #0] Failed to fetch sample 7927876 in VC:s3://mm-dataset/DocOwl/DocReason25K/imgs/. Exception: Image size (189987600 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack. Problematic sample: {'image': 'DUE_Benchmark/InfographicsVQA/pngs/38238.png', 'image_wh': [[23340, 8140]], 'conversations': [{'from': 'human', 'value': "\nWhat percentage of candidates prefer CV's to an application form? Answer the question with detailed explanation."}, {'from': 'gpt', 'value': '67% of candidates prefer CV\'s to an application form. \nAccording to the image, the text under the "THE APPLICATION PROCESS" section states that 67% of candidates prefer CV\'s to an application form. This information is part of a list of top tips on how to attract candidates, and it suggests that simplifying the application process by accepting CV\'s can make the hiring process more appealing to potential candidates.'}]} 98%|█████████▊| 21689/22095 [37:15:35<32:45, 4.84s/it] {'loss': 0.2952, 'grad_norm': 0.615050617869175, 'learning_rate': 8.895581696160006e-09, 'epoch': 0.98} 98%|█████████▊| 21689/22095 [37:15:35<32:45, 4.84s/it] 98%|█████████▊| 21690/22095 [37:15:39<29:45, 4.41s/it] {'loss': 0.3283, 'grad_norm': 0.5893081313297105, 'learning_rate': 8.851935352247597e-09, 'epoch': 0.98} 98%|█████████▊| 21690/22095 [37:15:39<29:45, 4.41s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (45892 > 40960). 
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (112238 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (80324 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21691/22095 [37:15:42<27:07, 4.03s/it] {'loss': 0.2646, 'grad_norm': 0.6061131730742584, 'learning_rate': 8.808396252777962e-09, 'epoch': 0.98}
98%|█████████▊| 21692/22095 [37:15:45<25:03, 3.73s/it] {'loss': 0.2645, 'grad_norm': 0.602182475757348, 'learning_rate': 8.76496439868646e-09, 'epoch': 0.98}
98%|█████████▊| 21693/22095 [37:15:48<24:04, 3.59s/it] {'loss': 0.3046, 'grad_norm': 0.5596327104464522, 'learning_rate': 8.721639790906788e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21694/22095 [37:15:58<35:51, 5.37s/it] {'loss': 0.4679, 'grad_norm': 0.24469286039476712, 'learning_rate': 8.67842243036876e-09, 'epoch': 0.98}
98%|█████████▊| 21695/22095 [37:16:01<31:35, 4.74s/it] {'loss': 0.2419, 'grad_norm': 0.7909436392895103, 'learning_rate': 8.635312318002742e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21696/22095 [37:16:09<37:28, 5.64s/it] {'loss': 0.4627, 'grad_norm': 0.28832400803941166, 'learning_rate': 8.59230945473355e-09, 'epoch': 0.98}
98%|█████████▊| 21697/22095 [37:16:12<32:47, 4.94s/it] {'loss': 0.2726, 'grad_norm': 0.6002097689708532, 'learning_rate': 8.549413841485443e-09, 'epoch': 0.98}
98%|█████████▊| 21698/22095 [37:16:15<29:04, 4.40s/it] {'loss': 0.3065, 'grad_norm': 0.5583495432364033, 'learning_rate': 8.506625479181018e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21699/22095 [37:16:23<36:02, 5.46s/it] {'loss': 0.4479, 'grad_norm': 0.26688574661396586, 'learning_rate': 8.46394436873843e-09, 'epoch': 0.98}
98%|█████████▊| 21700/22095 [37:16:26<31:37, 4.80s/it] {'loss': 0.2828, 'grad_norm': 0.6449065606928776, 'learning_rate': 8.421370511075833e-09, 'epoch': 0.98}
98%|█████████▊| 21701/22095 [37:16:30<28:43, 4.37s/it] {'loss': 0.2696, 'grad_norm': 0.6317061710680665, 'learning_rate': 8.378903907106938e-09, 'epoch': 0.98}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (104400000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
98%|█████████▊| 21702/22095 [37:16:33<27:07, 4.14s/it] {'loss': 0.2752, 'grad_norm': 0.577668120306397, 'learning_rate': 8.336544557745463e-09, 'epoch': 0.98}
98%|█████████▊| 21703/22095 [37:16:37<25:13, 3.86s/it] {'loss': 0.2829, 'grad_norm': 0.6065787159864158, 'learning_rate': 8.294292463900123e-09, 'epoch': 0.98}
98%|█████████▊| 21704/22095 [37:16:41<25:52, 3.97s/it] {'loss': 0.2909, 'grad_norm': 0.600593619384963, 'learning_rate': 8.25214762648019e-09, 'epoch': 0.98}
98%|█████████▊| 21705/22095 [37:16:44<23:32, 3.62s/it] {'loss': 0.2916, 'grad_norm': 0.6630398083335145, 'learning_rate': 8.210110046390496e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (78109 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45545 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (57715 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21706/22095 [37:16:47<23:21, 3.60s/it] {'loss': 0.3217, 'grad_norm': 0.7609258393204628, 'learning_rate': 8.168179724534209e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (70723 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (82020 > 40960).
Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (43285 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45708 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21707/22095 [37:16:50<22:21, 3.46s/it] {'loss': 0.2742, 'grad_norm': 0.6582641164460659, 'learning_rate': 8.126356661812829e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (134874 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21708/22095 [37:16:54<22:26, 3.48s/it] {'loss': 0.3159, 'grad_norm': 0.6759465229322568, 'learning_rate': 8.084640859124527e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (49011 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21709/22095 [37:16:57<22:39, 3.52s/it] {'loss': 0.3072, 'grad_norm': 0.6703636419723167, 'learning_rate': 8.043032317365807e-09, 'epoch': 0.98}
98%|█████████▊| 21710/22095 [37:17:00<21:22, 3.33s/it] {'loss': 0.2814, 'grad_norm': 1.1165883505399121, 'learning_rate': 8.001531037430954e-09, 'epoch': 0.98}
98%|█████████▊| 21711/22095 [37:17:03<20:26, 3.19s/it] {'loss': 0.2428, 'grad_norm': 0.6061301568076047, 'learning_rate': 7.960137020210923e-09, 'epoch': 0.98}
98%|█████████▊| 21712/22095 [37:17:07<21:39, 3.39s/it] {'loss': 0.2852, 'grad_norm': 0.6607322205290386, 'learning_rate': 7.918850266596112e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (49893 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (116228 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54216 > 40960).
Running this sequence through the model will result in indexing errors
98%|█████████▊| 21713/22095 [37:17:11<23:38, 3.71s/it] {'loss': 0.3121, 'grad_norm': 1.1731956802419958, 'learning_rate': 7.877670777473035e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21714/22095 [37:17:21<35:11, 5.54s/it] {'loss': 0.4782, 'grad_norm': 0.4363254521210518, 'learning_rate': 7.836598553726538e-09, 'epoch': 0.98}
98%|█████████▊| 21715/22095 [37:17:29<38:32, 6.09s/it] {'loss': 0.4794, 'grad_norm': 0.4375526802347568, 'learning_rate': 7.79563359623925e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 364, but got module 1
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21716/22095 [37:17:33<34:23, 5.44s/it] {'loss': 0.2834, 'grad_norm': 0.6432974621505368, 'learning_rate': 7.754775905891576e-09, 'epoch': 0.98}
98%|█████████▊| 21717/22095 [37:17:40<38:49, 6.16s/it] {'loss': 0.4849, 'grad_norm': 0.2588387571579923, 'learning_rate': 7.714025483561149e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 364, but got module 1
98%|█████████▊| 21718/22095 [37:17:44<34:36, 5.51s/it] {'loss': 0.2615, 'grad_norm': 0.68694740461852, 'learning_rate': 7.673382330123936e-09, 'epoch': 0.98}
98%|█████████▊| 21719/22095 [37:17:48<30:45, 4.91s/it] {'loss': 0.2903, 'grad_norm': 1.042378564690906, 'learning_rate': 7.63284644645257e-09, 'epoch': 0.98}
98%|█████████▊| 21720/22095 [37:17:52<28:54, 4.62s/it] {'loss': 0.3058, 'grad_norm': 0.6446807855150728, 'learning_rate': 7.59241783341913e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21721/22095 [37:18:01<38:10, 6.12s/it] {'loss': 0.4596, 'grad_norm': 0.2694289263740075, 'learning_rate': 7.552096491891259e-09, 'epoch': 0.98}
98%|█████████▊| 21722/22095 [37:18:05<33:11, 5.34s/it] {'loss': 0.2648, 'grad_norm': 0.559283975254006, 'learning_rate': 7.511882422735483e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21723/22095 [37:18:13<38:21, 6.19s/it] {'loss': 0.4668, 'grad_norm': 0.26536821524762305, 'learning_rate': 7.471775626816114e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 2 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21724/22095 [37:18:17<34:02, 5.51s/it] {'loss': 0.3224, 'grad_norm': 0.7440514698602366, 'learning_rate': 7.431776104994681e-09, 'epoch': 0.98}
98%|█████████▊| 21725/22095 [37:18:21<31:02, 5.03s/it] {'loss': 0.2766, 'grad_norm': 0.6962424596967072, 'learning_rate': 7.39188385813161e-09, 'epoch': 0.98}
98%|█████████▊| 21726/22095 [37:18:24<27:05, 4.41s/it] {'loss': 0.2606, 'grad_norm': 0.8940563277181036, 'learning_rate': 7.352098887082881e-09, 'epoch': 0.98}
98%|█████████▊| 21727/22095 [37:18:28<25:28, 4.15s/it] {'loss': 0.3176, 'grad_norm': 0.7214958076035248, 'learning_rate': 7.312421192703923e-09, 'epoch': 0.98}
98%|█████████▊| 21728/22095 [37:18:31<24:17, 3.97s/it] {'loss': 0.2644, 'grad_norm': 0.6343731980873564, 'learning_rate': 7.2728507758468295e-09, 'epoch': 0.98}
98%|█████████▊| 21729/22095 [37:18:35<24:46, 4.06s/it] {'loss': 0.3388, 'grad_norm': 0.5961209533114099, 'learning_rate': 7.233387637362587e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (59679 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44678 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (88906 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (45951 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21730/22095 [37:18:39<23:17, 3.83s/it] {'loss': 0.2604, 'grad_norm': 0.6028494217289609, 'learning_rate': 7.194031778098853e-09, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8402862 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
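The "Image size [...] is too small. Minimum size is 28" failures above are raised inside `__getitem__`, which costs a fetch retry per bad sample. A minimal sketch of a metadata pre-filter that would catch these before loading; `has_valid_image_size`, the `MIN_SIDE` constant, and the `image_wh` field layout (a list of `[width, height]` pairs, as seen in the dumped samples) are assumptions for illustration, not names from the actual data_qwen_2.py:

```python
# Hypothetical pre-filter for the "Minimum size is 28" errors in this log:
# skip samples whose recorded image_wh has any side shorter than the
# processor's minimum (28 px here), instead of failing inside __getitem__.
MIN_SIDE = 28

def has_valid_image_size(sample, min_side=MIN_SIDE):
    """Return True if every recorded [width, height] pair meets min_side.

    Samples without an 'image_wh' field pass through unfiltered.
    """
    sizes = sample.get("image_wh", [])
    return all(w >= min_side and h >= min_side for w, h in sizes)
```

Note that both failing samples in this log would be rejected: `[[14, 23]]` fails on width, and `[[650, 23]]` fails on height alone.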
Problematic sample: {'id': 5031, 'image': 'vrdu_table_final_2/astro-ph.CO/1daab5d7-c135-4764-8810-034956f0a661.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': "\nI'd like the table in the image converted into LaTeX format, please."}, {'from': 'gpt', 'value': "Certainly, I'd like the table in the image converted into LaTeX format for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
98%|█████████▊| 21731/22095 [37:18:42<22:12, 3.66s/it] {'loss': 0.3093, 'grad_norm': 0.6506180980165432, 'learning_rate': 7.1547831989005055e-09, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [650, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8367641 in VC:s3://internvl-moe-sft-data/. Exception: Image size [650, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 34389, 'image': 'vrdu_table_final_2/astro-ph.CO/84d89cde-6975-4821-a8e7-7c871fe6f42c.png', 'image_wh': [[650, 23]], 'conversations': [{'from': 'human', 'value': '\nI need the table in this image converted into LaTeX script, please.'}, {'from': 'gpt', 'value': "Sure, I'll convert the table in this image into LaTeX script for you.\n```latex\n\\begin{tabular}{l}\n$^{a}$ Residual colors relative to fits described in the text. \\\\\n\\end{tabular}\n```"}]}
98%|█████████▊| 21732/22095 [37:18:46<22:24, 3.70s/it] {'loss': 0.2921, 'grad_norm': 0.6274426767704868, 'learning_rate': 7.1156419006118695e-09, 'epoch': 0.98}
98%|█████████▊| 21733/22095 [37:18:49<20:49, 3.45s/it] {'loss': 0.3085, 'grad_norm': 0.6356227393707362, 'learning_rate': 7.076607884073939e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21734/22095 [37:18:53<21:41, 3.61s/it] {'loss': 0.3198, 'grad_norm': 0.6440067946101039, 'learning_rate': 7.037681150124931e-09, 'epoch': 0.98}
98%|█████████▊| 21735/22095 [37:18:55<20:19, 3.39s/it] {'loss': 0.301, 'grad_norm': 0.622467940468477, 'learning_rate': 6.998861699600845e-09, 'epoch': 0.98}
98%|█████████▊| 21736/22095 [37:18:58<19:15, 3.22s/it] {'loss': 0.2924, 'grad_norm': 0.5591829293615435, 'learning_rate': 6.960149533337124e-09, 'epoch': 0.98}
98%|█████████▊| 21737/22095 [37:19:02<20:27, 3.43s/it] {'loss': 0.328, 'grad_norm': 0.568720885745079, 'learning_rate': 6.921544652164769e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
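The `DecompressionBombError` and `DecompressionBombWarning` entries in this log come from PIL's pixel-count safety cap: Pillow's default `Image.MAX_IMAGE_PIXELS` is 89,478,485, a `DecompressionBombWarning` fires above that, and a hard `DecompressionBombError` fires at twice that (178,956,970, matching the limits printed above). A minimal sketch of a pre-decode check against the recorded width/height; `within_pil_limits` is a hypothetical helper, not part of Pillow or the training code:

```python
# PIL's default Image.MAX_IMAGE_PIXELS; the warning fires above this value
# and DecompressionBombError fires above twice this value, which matches
# the 89478485 / 178956970 limits printed in the log.
DEFAULT_MAX_IMAGE_PIXELS = 89_478_485

def within_pil_limits(width, height, max_pixels=DEFAULT_MAX_IMAGE_PIXELS):
    """Classify a sample's recorded size the way PIL's bomb check would."""
    pixels = width * height
    if pixels > 2 * max_pixels:
        return "error"  # PIL would raise DecompressionBombError
    if pixels > max_pixels:
        return "warn"   # PIL would emit DecompressionBombWarning
    return "ok"
```

For trusted data one could instead raise or disable the cap (`PIL.Image.MAX_IMAGE_PIXELS = None`) before decoding, at the cost of losing the zip-bomb protection entirely.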
[Try #0] Failed to fetch sample 8379668 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 46453, 'image': 'vrdu_table_final_2/astro-ph.CO/89654155-4837-4a4a-a5ee-93db65ec1cb3.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}c@{}}#2\\end{tabular}\n```"}]}
98%|█████████▊| 21738/22095 [37:19:12<31:42, 5.33s/it] {'loss': 0.4822, 'grad_norm': 0.34489008410081096, 'learning_rate': 6.883047056913117e-09, 'epoch': 0.98}
98%|█████████▊| 21739/22095 [37:19:15<27:57, 4.71s/it] {'loss': 0.265, 'grad_norm': 0.6254423314914499, 'learning_rate': 6.844656748409284e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (45707 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21740/22095 [37:19:19<25:48, 4.36s/it] {'loss': 0.2559, 'grad_norm': 0.5603541966537482, 'learning_rate': 6.8063737274787214e-09, 'epoch': 0.98}
98%|█████████▊| 21741/22095 [37:19:23<24:45, 4.20s/it] {'loss': 0.2705, 'grad_norm': 0.5814920279218694, 'learning_rate': 6.768197994944103e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (43194 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (58443 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21742/22095 [37:19:25<22:29, 3.82s/it] {'loss': 0.2855, 'grad_norm': 0.6190468275501919, 'learning_rate': 6.730129551625331e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (44390 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (54067 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (46199 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21743/22095 [37:19:28<20:46, 3.54s/it] {'loss': 0.3286, 'grad_norm': 0.5794138544317892, 'learning_rate': 6.692168398340082e-09, 'epoch': 0.98}
Token indices sequence length is longer than the specified maximum sequence length for this model (121569 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56192 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (79519 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (106649 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21744/22095 [37:19:32<20:47, 3.55s/it] {'loss': 0.2499, 'grad_norm': 0.6187931500635636, 'learning_rate': 6.6543145359043714e-09, 'epoch': 0.98}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (96630000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  warnings.warn(
98%|█████████▊| 21745/22095 [37:19:35<20:27, 3.51s/it] {'loss': 0.3007, 'grad_norm': 0.7168440597895542, 'learning_rate': 6.616567965131992e-09, 'epoch': 0.98}
98%|█████████▊| 21746/22095 [37:19:39<20:45, 3.57s/it] {'loss': 0.2904, 'grad_norm': 0.5912575346461676, 'learning_rate': 6.578928686832853e-09, 'epoch': 0.98}
98%|█████████▊| 21747/22095 [37:19:42<19:44, 3.40s/it] {'loss': 0.2431, 'grad_norm': 0.5860976780626126, 'learning_rate': 6.54139670181686e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21748/22095 [37:19:52<30:12, 5.22s/it] {'loss': 0.457, 'grad_norm': 0.26903828166599186, 'learning_rate': 6.503972010890036e-09, 'epoch': 0.98}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
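The recurring "Token indices sequence length is longer than the specified maximum sequence length for this model (N > 40960)" warnings above show many samples tokenizing past the 40960-token context window. A minimal sketch of a pre-training length filter; `filter_overlong` and its `tokenize_fn` parameter are hypothetical names standing in for whatever tokenizer the pipeline actually uses, not functions from data_qwen_2.py:

```python
# Hypothetical pre-filter: drop samples whose token count exceeds the
# model's context window (40960 in this run), so the model never sees
# out-of-range position indices.
MAX_LEN = 40960

def filter_overlong(samples, tokenize_fn, max_len=MAX_LEN):
    """Split samples into (kept, dropped_count) by tokenized length.

    tokenize_fn maps a sample to its sequence of token ids.
    """
    kept, dropped = [], 0
    for sample in samples:
        if len(tokenize_fn(sample)) <= max_len:
            kept.append(sample)
        else:
            dropped += 1
    return kept, dropped
```

The alternative is truncation at tokenization time, but for multi-image conversations naive truncation can cut through image placeholder spans, so dropping (or splitting) overlong samples up front is often safer.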
[Try #0] Failed to fetch sample 8955480 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6315, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': "\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 7\nB. 2\nC. 2.5\nD. 4.5\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]}
98%|█████████▊| 21749/22095 [37:19:55<26:40, 4.63s/it] {'loss': 0.2648, 'grad_norm': 0.6051058553500327, 'learning_rate': 6.466654614856183e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (49051 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96975 > 40960). Running this sequence through the model will result in indexing errors
98%|█████████▊| 21750/22095 [37:20:00<28:14, 4.91s/it] {'loss': 0.4579, 'grad_norm': 0.25852584920291466, 'learning_rate': 6.42944451451799e-09, 'epoch': 0.98}
98%|█████████▊| 21751/22095 [37:20:07<31:21, 5.47s/it] {'loss': 0.4677, 'grad_norm': 0.2643033016293149, 'learning_rate': 6.392341710674266e-09, 'epoch': 0.98}
98%|█████████▊| 21752/22095 [37:20:17<38:28, 6.73s/it] {'loss': 0.4427, 'grad_norm': 0.25894655658059895, 'learning_rate': 6.355346204122148e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 364, but got module 1
98%|█████████▊| 21753/22095 [37:20:21<33:10, 5.82s/it] {'loss': 0.2816, 'grad_norm': 0.6426058287872317, 'learning_rate': 6.318457995657113e-09, 'epoch': 0.98}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
98%|█████████▊| 21754/22095 [37:20:24<28:55, 5.09s/it] {'loss': 0.2634, 'grad_norm': 0.6257485278487301, 'learning_rate': 6.281677086071303e-09, 'epoch': 0.98}
98%|█████████▊| 21755/22095 [37:20:27<25:03, 4.42s/it] {'loss': 0.2717, 'grad_norm': 0.6100901183104055, 'learning_rate': 6.245003476155198e-09, 'epoch': 0.98}
98%|█████████▊| 21756/22095 [37:20:30<23:16, 4.12s/it] {'loss': 0.2676, 'grad_norm': 0.6110668534928377, 'learning_rate': 6.208437166697056e-09, 'epoch': 0.98}
98%|█████████▊| 21757/22095 [37:20:33<21:45, 3.86s/it] {'loss': 0.2739, 'grad_norm': 0.584749554284075, 'learning_rate': 6.171978158482361e-09, 'epoch': 0.98}
Invalidate trace cache @ step 2: expected module 1, but got module 364
98%|█████████▊| 21758/22095 [37:20:42<29:13, 5.20s/it] {'loss': 0.4525, 'grad_norm': 0.2766780699015175, 'learning_rate': 6.135626452294374e-09, 'epoch': 0.98}
98%|█████████▊| 21759/22095 [37:20:46<26:55, 4.81s/it] {'loss': 0.3081, 'grad_norm': 0.6570081472633232, 'learning_rate': 6.099382048914138e-09, 'epoch': 0.98}
98%|█████████▊| 21760/22095 [37:20:49<24:27, 4.38s/it] {'loss': 0.2725, 'grad_norm': 0.6301607344700081, 'learning_rate': 6.063244949120473e-09, 'epoch': 0.98}
98%|█████████▊| 21761/22095 [37:20:52<21:46, 3.91s/it] {'loss': 0.2909, 'grad_norm': 0.8074035664818978, 'learning_rate': 6.027215153689981e-09, 'epoch': 0.98}
98%|█████████▊| 21762/22095 [37:20:56<21:45, 3.92s/it] {'loss': 0.2785, 'grad_norm': 0.5661608368630222, 'learning_rate': 5.9912926633970415e-09, 'epoch': 0.98}
98%|█████████▊| 21763/22095 [37:20:59<21:05, 3.81s/it] {'loss': 0.3317, 'grad_norm': 0.699512720197519, 'learning_rate': 5.955477479013816e-09, 'epoch': 0.98}
99%|█████████▊| 21764/22095 [37:21:03<21:34, 3.91s/it] {'loss': 0.3022, 'grad_norm': 0.5609269535348155, 'learning_rate': 5.919769601308578e-09, 'epoch': 0.99}
99%|█████████▊| 21765/22095 [37:21:08<22:32, 4.10s/it] {'loss': 0.3011, 'grad_norm': 0.5993231895671898, 'learning_rate': 5.8841690310496024e-09, 'epoch': 0.99}
99%|█████████▊| 21766/22095 [37:21:11<20:20, 3.71s/it] {'loss': 0.2157, 'grad_norm': 0.5689363005629489, 'learning_rate': 5.8486757690012775e-09, 'epoch': 0.99}
99%|█████████▊| 21767/22095 [37:21:14<20:00, 3.66s/it] {'loss': 0.3054, 'grad_norm': 0.6758401330216106, 'learning_rate': 5.8132898159268815e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (71053 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (98379 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21768/22095 [37:21:19<21:34, 3.96s/it] {'loss': 0.2651, 'grad_norm': 0.567882054273113, 'learning_rate': 5.778011172586362e-09, 'epoch': 0.99}
99%|█████████▊| 21769/22095 [37:21:22<20:12, 3.72s/it] {'loss': 0.2943, 'grad_norm': 0.5951791861094041, 'learning_rate': 5.742839839738001e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (57186 > 40960).
Running this sequence through the model will result in indexing errors 99%|█████████▊| 21770/22095 [37:21:26<20:36, 3.80s/it] {'loss': 0.2947, 'grad_norm': 0.6362072194919404, 'learning_rate': 5.7077758181367516e-09, 'epoch': 0.99} 99%|█████████▊| 21770/22095 [37:21:26<20:36, 3.80s/it] 99%|█████████▊| 21771/22095 [37:21:30<21:02, 3.90s/it] {'loss': 0.3013, 'grad_norm': 0.5708793133016763, 'learning_rate': 5.6728191085370085e-09, 'epoch': 0.99} 99%|█████████▊| 21771/22095 [37:21:30<21:02, 3.90s/it] 99%|█████████▊| 21772/22095 [37:21:34<20:57, 3.89s/it] {'loss': 0.2648, 'grad_norm': 0.5973514269004946, 'learning_rate': 5.637969711689839e-09, 'epoch': 0.99} 99%|█████████▊| 21772/22095 [37:21:34<20:57, 3.89s/it] 99%|█████████▊| 21773/22095 [37:21:38<20:32, 3.83s/it] {'loss': 0.2827, 'grad_norm': 0.5688949733432468, 'learning_rate': 5.603227628342978e-09, 'epoch': 0.99} 99%|█████████▊| 21773/22095 [37:21:38<20:32, 3.83s/it] 99%|█████████▊| 21774/22095 [37:21:41<20:07, 3.76s/it] {'loss': 0.3101, 'grad_norm': 0.6476887178964443, 'learning_rate': 5.56859285924416e-09, 'epoch': 0.99} 99%|█████████▊| 21774/22095 [37:21:41<20:07, 3.76s/it] 99%|█████████▊| 21775/22095 [37:21:44<18:41, 3.50s/it] {'loss': 0.2817, 'grad_norm': 0.6396008335007618, 'learning_rate': 5.534065405136679e-09, 'epoch': 0.99} 99%|█████████▊| 21775/22095 [37:21:44<18:41, 3.50s/it]Invalidate trace cache @ step 2: expected module 1, but got module 364 Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [125, 25, 100, 100] is too small. Minimum size is 28. 
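The `ValueError: Image size ... is too small. Minimum size is 28.` failures above all come from samples whose recorded image dimensions fall below the loader's 28-pixel minimum side length. A minimal sketch (hypothetical helper names, not the actual training code) of pre-filtering such samples by their logged `image_wh` field:

```python
# Hypothetical pre-filter based on the "Minimum size is 28" check seen in the log.
# Samples carry an 'image_wh' list of [width, height] pairs, as in the logged records.
MIN_SIDE = 28

def is_valid_sample(sample: dict) -> bool:
    """True if every image in the sample meets the minimum side length."""
    return all(w >= MIN_SIDE and h >= MIN_SIDE for w, h in sample.get("image_wh", []))

def filter_samples(samples: list[dict]) -> list[dict]:
    """Drop samples that would raise the 'Image size ... is too small' error."""
    return [s for s in samples if is_valid_sample(s)]
```

For example, the problematic sample above with `'image_wh': [[125, 25]]` would be dropped, since its height of 25 px is below the 28 px minimum.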
[Try #0] Failed to fetch sample 8485456 in VC:s3://internvl-moe-sft-data/. Exception: Image size [125, 25, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 88058, 'image': 'vrdu_texteq/astro-ph.CO/2c529472-f204-4e2a-8673-73ee34288b9d.png', 'image_wh': [[125, 25]], 'conversations': [{'from': 'human', 'value': '\nKindly process the image to identify and retrieve any textual content it contains.'}, {'from': 'gpt', 'value': "$b=1000''$:"}]}
99%|█████████▊| 21776/22095 [37:21:54<28:15, 5.31s/it] {'loss': 0.448, 'grad_norm': 0.2376931055703006, 'learning_rate': 5.499645266762721e-09, 'epoch': 0.99}
99%|█████████▊| 21777/22095 [37:21:58<25:42, 4.85s/it] {'loss': 0.3093, 'grad_norm': 0.6520128967399444, 'learning_rate': 5.465332444862248e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
99%|█████████▊| 21778/22095 [37:22:08<33:31, 6.34s/it] {'loss': 0.4658, 'grad_norm': 0.2865227420837884, 'learning_rate': 5.431126940172449e-09, 'epoch': 0.99}
99%|█████████▊| 21779/22095 [37:22:11<28:34, 5.43s/it] {'loss': 0.2714, 'grad_norm': 0.5659488627340931, 'learning_rate': 5.397028753427735e-09, 'epoch': 0.99}
99%|█████████▊| 21780/22095 [37:22:14<25:24, 4.84s/it] {'loss': 0.3034, 'grad_norm': 0.6335095816920012, 'learning_rate': 5.363037885360856e-09, 'epoch': 0.99}
99%|█████████▊| 21781/22095 [37:22:18<23:18, 4.45s/it] {'loss': 0.2621, 'grad_norm': 0.7757831393322988, 'learning_rate': 5.329154336702891e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (41701 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (68653 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83750 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (96482 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21782/22095 [37:22:21<20:39, 3.96s/it] {'loss': 0.261, 'grad_norm': 0.5468882390931025, 'learning_rate': 5.295378108181592e-09, 'epoch': 0.99}
99%|█████████▊| 21783/22095 [37:22:24<19:43, 3.79s/it] {'loss': 0.2598, 'grad_norm': 0.5784789337992116, 'learning_rate': 5.261709200521936e-09, 'epoch': 0.99}
99%|█████████▊| 21784/22095 [37:22:27<18:14, 3.52s/it] {'loss': 0.3111, 'grad_norm': 0.6304084729105638, 'learning_rate': 5.228147614448342e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (105670 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (42278 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21785/22095 [37:22:30<17:19, 3.35s/it] {'loss': 0.2934, 'grad_norm': 0.6380056859811842, 'learning_rate': 5.194693350681901e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (60729 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (56962 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (111273 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (133300 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (62579 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (72371 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (44805 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21786/22095 [37:22:34<18:12, 3.54s/it] {'loss': 0.2863, 'grad_norm': 0.5925074282805367, 'learning_rate': 5.161346409940371e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▊| 21787/22095 [37:22:42<25:32, 4.98s/it] {'loss': 0.494, 'grad_norm': 0.28274373509663, 'learning_rate': 5.128106792941512e-09, 'epoch': 0.99}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▊| 21788/22095 [37:22:46<23:08, 4.52s/it] {'loss': 0.2801, 'grad_norm': 0.9279097453101085, 'learning_rate': 5.094974500399197e-09, 'epoch': 0.99}
99%|█████████▊| 21789/22095 [37:22:49<21:55, 4.30s/it] {'loss': 0.2814, 'grad_norm': 0.6266657228728079, 'learning_rate': 5.061949533025079e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
Token indices sequence length is longer than the specified maximum sequence length for this model (84026 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (70702 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21790/22095 [37:22:59<29:39, 5.84s/it] {'loss': 0.4443, 'grad_norm': 0.3617893690237543, 'learning_rate': 5.02903189152859e-09, 'epoch': 0.99}
99%|█████████▊| 21791/22095 [37:23:02<26:06, 5.15s/it] {'loss': 0.3358, 'grad_norm': 0.603404324430023, 'learning_rate': 4.996221576617499e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (67711 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (127862 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (78808 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (102081 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21792/22095 [37:23:06<23:31, 4.66s/it] {'loss': 0.3327, 'grad_norm': 0.5743585459695362, 'learning_rate': 4.9635185889967966e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
99%|█████████▊| 21793/22095 [37:23:13<26:27, 5.26s/it] {'loss': 0.4703, 'grad_norm': 0.25070337964915884, 'learning_rate': 4.930922929368698e-09, 'epoch': 0.99}
99%|█████████▊| 21794/22095 [37:23:22<33:06, 6.60s/it] {'loss': 0.4523, 'grad_norm': 0.29054184356976154, 'learning_rate': 4.89843459843431e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 364, but got module 1
99%|█████████▊| 21795/22095 [37:23:26<28:58, 5.80s/it] {'loss': 0.3137, 'grad_norm': 0.5935151628381033, 'learning_rate': 4.8660535968908515e-09, 'epoch': 0.99}
99%|█████████▊| 21796/22095 [37:23:30<25:50, 5.18s/it] {'loss': 0.3458, 'grad_norm': 0.6235034034439103, 'learning_rate': 4.833779925434434e-09, 'epoch': 0.99}
Token indices sequence length is longer than the specified maximum sequence length for this model (49935 > 40960). Running this sequence through the model will result in indexing errors
Token indices sequence length is longer than the specified maximum sequence length for this model (83567 > 40960). Running this sequence through the model will result in indexing errors
99%|█████████▊| 21797/22095 [37:23:34<23:35, 4.75s/it] {'loss': 0.29, 'grad_norm': 0.6738464231815146, 'learning_rate': 4.801613584758946e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
99%|█████████▊| 21798/22095 [37:23:44<31:03, 6.27s/it] {'loss': 0.4779, 'grad_norm': 0.2779606632713415, 'learning_rate': 4.769554575554947e-09, 'epoch': 0.99}
99%|█████████▊| 21799/22095 [37:23:47<27:01, 5.48s/it] {'loss': 0.2765, 'grad_norm': 0.5716203526929273, 'learning_rate': 4.737602898511884e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
99%|█████████▊| 21800/22095 [37:23:57<32:54, 6.69s/it] {'loss': 0.4396, 'grad_norm': 0.24347567952338942, 'learning_rate': 4.705758554315876e-09, 'epoch': 0.99}
99%|█████████▊| 21801/22095 [37:24:00<28:11, 5.75s/it] {'loss': 0.2783, 'grad_norm': 0.5874975033790542, 'learning_rate': 4.674021543651374e-09, 'epoch': 0.99}
99%|█████████▊| 21802/22095 [37:24:03<23:51, 4.88s/it] {'loss': 0.2863, 'grad_norm': 0.7032677349580255, 'learning_rate': 4.642391867199503e-09, 'epoch': 0.99}
99%|█████████▊| 21803/22095 [37:24:06<21:34, 4.43s/it] {'loss': 0.3122, 'grad_norm': 0.6305097485833882, 'learning_rate': 4.610869525641382e-09, 'epoch': 0.99}
99%|█████████▊| 21804/22095 [37:24:11<20:56, 4.32s/it] {'loss': 0.2941, 'grad_norm': 0.5912198400207207, 'learning_rate': 4.579454519653137e-09, 'epoch': 0.99}
Invalidate trace cache @ step 2: expected module 1, but got module 364
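The repeated tokenizer warnings above (e.g. `71053 > 40960`) mean some serialized conversations exceed the model's maximum sequence length of 40960 tokens. A minimal sketch, under the assumption that token ids arrive as a plain Python list (the helper name is hypothetical, not part of the training code), of guarding against the overflow before batching:

```python
# Maximum sequence length reported in the warnings above.
MAX_LEN = 40960

def clip_to_max_len(input_ids: list[int], max_len: int = MAX_LEN) -> list[int]:
    """Truncate a token id sequence that would otherwise overflow the context window."""
    if len(input_ids) > max_len:
        # Keep the first max_len tokens; a real pipeline might instead
        # drop the sample or truncate turn-by-turn to preserve structure.
        return input_ids[:max_len]
    return input_ids
```

Truncating a 71053-token sequence this way avoids the "indexing errors" the warning refers to, at the cost of losing the conversation's tail.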
99%|█████████▊| 21805/22095 [37:24:16<22:58, 4.75s/it] {'loss': 0.466, 'grad_norm': 0.2503455997627201, 'learning_rate': 4.5481468499097845e-09, 'epoch': 0.99}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8408364 in VC:s3://internvl-moe-sft-data/. Exception: Image size [25, 62, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10557, 'image': 'vrdu_table_final_2/astro-ph.CO/e2a53cc9-7e7b-44ae-9ef9-b43c75618f97.png', 'image_wh': [[25, 62]], 'conversations': [{'from': 'human', 'value': '\nTranscribe the table visible in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Certainly, I'll transcribe the table visible in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n$\\theta_{i\\pcopy}^p$ \\\\\n$\\theta_\\unsplitcopy^p$\n\\end{tabular}\n```"}]}
99%|█████████▊| 21806/22095 [37:24:20<21:01, 4.37s/it] {'loss': 0.2868, 'grad_norm': 0.6940940885386304, 'learning_rate': 4.516946517084675e-09, 'epoch': 0.99}
99%|█████████▊| 21807/22095 [37:24:23<19:24, 4.04s/it] {'loss': 0.3202, 'grad_norm': 0.644379368900686, 'learning_rate': 4.485853521848382e-09, 'epoch': 0.99}
99%|█████████▊| 21808/22095 [37:24:26<17:40, 3.70s/it] {'loss': 0.3505, 'grad_norm': 0.6286080941019534, 'learning_rate': 4.4548678648681506e-09, 'epoch': 0.99}
99%|█████████▊| 21809/22095 [37:24:29<16:31, 3.47s/it] {'loss': 0.2872, 'grad_norm': 0.6199985918169026, 'learning_rate': 4.423989546810115e-09, 'epoch': 0.99}
99%|█████████▊| 21810/22095 [37:24:32<16:31, 3.48s/it] {'loss': 0.3019, 'grad_norm': 0.5828107840187536, 'learning_rate': 4.3932185683376316e-09, 'epoch': 0.99}
99%|█████████▊| 21811/22095 [37:24:37<17:32, 3.71s/it] {'loss': 0.2626, 'grad_norm': 0.591062507801819, 'learning_rate': 4.362554930112395e-09, 'epoch': 0.99}
99%|█████████▊| 21812/22095 [37:24:39<16:12, 3.43s/it] {'loss': 0.3347, 'grad_norm': 0.6613664730270813, 'learning_rate': 4.331998632792766e-09, 'epoch': 0.99}
99%|█████████▊| 21813/22095 [37:24:43<16:58, 3.61s/it] {'loss': 0.2522, 'grad_norm': 0.591931367567967, 'learning_rate': 4.3015496770354435e-09, 'epoch': 0.99}
99%|█████████▊| 21814/22095 [37:24:47<16:20, 3.49s/it] {'loss': 0.3038, 'grad_norm': 0.6118863093822567, 'learning_rate': 4.2712080634949024e-09, 'epoch': 0.99}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▊| 21815/22095 [37:24:51<17:34, 3.76s/it] {'loss': 0.3698, 'grad_norm': 0.9204855378714794, 'learning_rate': 4.240973792822845e-09, 'epoch': 0.99}
99%|█████████▊| 21816/22095 [37:24:54<17:02, 3.66s/it] {'loss': 0.2452, 'grad_norm': 0.5471052873028251, 'learning_rate': 4.210846865668749e-09, 'epoch': 0.99}
99%|█████████▊| 21817/22095 [37:24:58<16:07, 3.48s/it] {'loss': 0.2873, 'grad_norm': 0.648162350342267, 'learning_rate': 4.180827282680433e-09, 'epoch': 0.99}
99%|█████████▊| 21818/22095 [37:25:00<15:11, 3.29s/it] {'loss': 0.283, 'grad_norm': 0.6168060659402789, 'learning_rate': 4.1509150445023794e-09, 'epoch': 0.99}
99%|█████████▉| 21819/22095 [37:25:03<14:48, 3.22s/it] {'loss': 0.2466, 'grad_norm': 0.6184546215910344, 'learning_rate': 4.121110151777407e-09, 'epoch': 0.99}
99%|█████████▉| 21820/22095 [37:25:06<14:16, 3.11s/it] {'loss': 0.2808, 'grad_norm': 0.7674514700369797, 'learning_rate': 4.0914126051466715e-09, 'epoch': 0.99}
99%|█████████▉| 21821/22095 [37:25:09<13:52, 3.04s/it] {'loss': 0.3179, 'grad_norm': 0.597187270312596, 'learning_rate': 4.06182240524744e-09, 'epoch': 0.99}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▉| 21822/22095 [37:25:12<13:42, 3.01s/it] {'loss': 0.3038, 'grad_norm': 0.668955246942683, 'learning_rate': 4.032339552715869e-09, 'epoch': 0.99}
99%|█████████▉| 21823/22095 [37:25:16<14:39, 3.23s/it] {'loss': 0.2824, 'grad_norm': 0.6083238151005289, 'learning_rate': 4.002964048185342e-09, 'epoch': 0.99}
99%|█████████▉| 21824/22095 [37:25:19<14:07, 3.13s/it] {'loss': 0.3158, 'grad_norm': 0.6671406711905182, 'learning_rate': 3.973695892287022e-09, 'epoch': 0.99}
99%|█████████▉| 21825/22095 [37:25:22<14:03, 3.12s/it] {'loss': 0.2742, 'grad_norm': 0.7626532643269714, 'learning_rate': 3.944535085649848e-09, 'epoch': 0.99}
99%|█████████▉| 21826/22095 [37:25:25<13:46, 3.07s/it] {'loss': 0.2999, 'grad_norm': 0.6286400094111584, 'learning_rate': 3.915481628900541e-09, 'epoch': 0.99}
99%|█████████▉| 21827/22095 [37:25:28<13:28, 3.02s/it] {'loss': 0.2548, 'grad_norm': 0.5968127011830571, 'learning_rate': 3.8865355226630484e-09, 'epoch': 0.99}
/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images
  warnings.warn(
99%|█████████▉| 21828/22095 [37:25:31<14:20, 3.22s/it] {'loss': 0.3023, 'grad_norm': 0.6086665535501046, 'learning_rate': 3.857696767559649e-09, 'epoch': 0.99}
99%|█████████▉| 21829/22095 [37:25:34<13:40, 3.08s/it] {'loss': 0.278, 'grad_norm': 0.6130992682575301, 'learning_rate': 3.828965364209847e-09, 'epoch': 0.99}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▉| 21830/22095 [37:25:37<13:10, 2.98s/it] {'loss': 0.2965, 'grad_norm': 0.6310926783173733, 'learning_rate': 3.8003413132309265e-09, 'epoch': 0.99}
99%|█████████▉| 21831/22095 [37:25:40<13:26, 3.05s/it] {'loss': 0.3078, 'grad_norm': 0.6549428048986663, 'learning_rate': 3.771824615237951e-09, 'epoch': 0.99}
99%|█████████▉| 21832/22095 [37:25:43<13:43, 3.13s/it] {'loss': 0.2942, 'grad_norm': 0.6044611572338786, 'learning_rate': 3.7434152708437645e-09, 'epoch': 0.99}
99%|█████████▉| 21833/22095 [37:25:46<13:23, 3.07s/it] {'loss': 0.2848, 'grad_norm': 0.6306510752902703, 'learning_rate': 3.7151132806589885e-09, 'epoch': 0.99}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 9047850 in VC:s3://multi-modal/UniGeo/. Exception: Image size [183, 19, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'calculation_images/5415.png', 'image_wh': [[183, 19]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,且D是线段AC的中点.若AB=10cm,BC=4cm,则AD的长为()\nA. 4cm\nB. 6cm\nC. 2cm\nD. 3cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]}
99%|█████████▉| 21834/22095 [37:25:50<14:20, 3.30s/it] {'loss': 0.2606, 'grad_norm': 0.5962400120103846, 'learning_rate': 3.6869186452909157e-09, 'epoch': 0.99}
99%|█████████▉| 21835/22095 [37:25:53<14:16, 3.29s/it] {'loss': 0.3017, 'grad_norm': 0.6985939720579019, 'learning_rate': 3.6588313653468377e-09, 'epoch': 0.99}
99%|█████████▉| 21836/22095 [37:25:57<14:05, 3.26s/it] {'loss': 0.2337, 'grad_norm': 0.6139462080468838, 'learning_rate': 3.6308514414284958e-09, 'epoch': 0.99}
99%|█████████▉| 21837/22095 [37:26:00<13:52, 3.23s/it] {'loss': 0.3009, 'grad_norm': 0.6032328510366322, 'learning_rate': 3.6029788741387407e-09, 'epoch': 0.99}
99%|█████████▉| 21838/22095 [37:26:03<14:02, 3.28s/it] {'loss': 0.2985, 'grad_norm': 0.5821709047204009, 'learning_rate': 3.5752136640754278e-09, 'epoch': 0.99}
99%|█████████▉| 21839/22095 [37:26:06<13:18, 3.12s/it] {'loss': 0.2756, 'grad_norm': 0.6916801065461516, 'learning_rate': 3.5475558118353015e-09, 'epoch': 0.99}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8883164 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 6317, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图,已知线段AB=9,BC=5,点D为线段AC的中点,则线段AD的长度是()\nA. 2\nB. 2.5\nC. 4.5\nD. 7'}, {'from': 'gpt', 'value': '【解答】解:∵AB=9,BC=5,∴AC=AB+BC=14,∵D为线段AC的中点,∴AD=\\frac{1}{2}AC=7,'}]}
99%|█████████▉| 21840/22095 [37:26:09<13:14, 3.11s/it] {'loss': 0.2696, 'grad_norm': 0.5836411571093898, 'learning_rate': 3.5200053180128867e-09, 'epoch': 0.99}
99%|█████████▉| 21841/22095 [37:26:12<13:01, 3.08s/it] {'loss': 0.2749, 'grad_norm': 0.5775515617423247, 'learning_rate': 3.4925621831999325e-09, 'epoch': 0.99}
99%|█████████▉| 21842/22095 [37:26:15<12:45, 3.02s/it] {'loss': 0.274, 'grad_norm': 0.6257537502517646, 'learning_rate': 3.4652264079859666e-09, 'epoch': 0.99}
99%|█████████▉| 21843/22095 [37:26:19<13:32, 3.23s/it] {'loss': 0.2858, 'grad_norm': 0.5803599968862889, 'learning_rate': 3.4379979929588526e-09, 'epoch': 0.99}
99%|█████████▉| 21844/22095 [37:26:22<13:29, 3.23s/it] {'loss': 0.3185, 'grad_norm': 0.6363360456382771, 'learning_rate': 3.410876938703678e-09, 'epoch': 0.99}
99%|█████████▉| 21845/22095 [37:26:25<13:17, 3.19s/it] {'loss': 0.2686, 'grad_norm': 0.5635976745712896, 'learning_rate': 3.383863245802754e-09, 'epoch': 0.99}
99%|█████████▉| 21846/22095 [37:26:28<12:50, 3.09s/it] {'loss': 0.2938, 'grad_norm': 0.5524016813596654, 'learning_rate': 3.3569569148367286e-09, 'epoch': 0.99}
99%|█████████▉| 21847/22095 [37:26:32<14:08, 3.42s/it] {'loss': 0.2986, 'grad_norm': 0.5676520951342483, 'learning_rate': 3.3301579463834722e-09, 'epoch': 0.99}
99%|█████████▉| 21848/22095 [37:26:36<14:38, 3.56s/it] {'loss': 0.3036, 'grad_norm': 0.5711577922083861, 'learning_rate': 3.30346634101919e-09, 'epoch': 0.99}
99%|█████████▉| 21849/22095 [37:26:39<13:44, 3.35s/it] {'loss': 0.2543, 'grad_norm': 0.6275128453665675, 'learning_rate': 3.276882099316758e-09, 'epoch': 0.99}
99%|█████████▉| 21850/22095 [37:26:42<13:27, 3.29s/it] {'loss': 0.2959, 'grad_norm': 0.6967194406864456, 'learning_rate': 3.250405221848496e-09, 'epoch': 0.99}
99%|█████████▉| 21851/22095 [37:26:45<13:40, 3.36s/it] {'loss': 0.269, 'grad_norm': 0.5869615549138006, 'learning_rate': 3.224035709182283e-09, 'epoch': 0.99}
99%|█████████▉| 21852/22095 [37:26:49<13:37, 3.36s/it] {'loss': 0.319, 'grad_norm': 0.6351290356709387, 'learning_rate': 3.1977735618854423e-09, 'epoch': 0.99}
99%|█████████▉| 21853/22095 [37:26:52<13:17, 3.29s/it] {'loss': 0.2715, 'grad_norm': 0.6284376408209436, 'learning_rate': 3.1716187805214127e-09, 'epoch': 0.99}
99%|█████████▉| 21854/22095 [37:26:55<12:35, 3.13s/it] {'loss': 0.2756, 'grad_norm': 0.624262732120063, 'learning_rate': 3.1455713656530774e-09, 'epoch': 0.99}
99%|█████████▉| 21855/22095 [37:26:58<12:40, 3.17s/it] {'loss': 0.278, 'grad_norm': 0.5863698484875334, 'learning_rate': 3.1196313178399882e-09, 'epoch': 0.99}
99%|█████████▉| 21856/22095 [37:27:02<13:39, 3.43s/it] {'loss': 0.2609, 'grad_norm': 0.5864017516835073, 'learning_rate': 3.093798637638923e-09, 'epoch': 0.99}
99%|█████████▉| 21857/22095 [37:27:05<13:37, 3.43s/it] {'loss': 0.313, 'grad_norm': 0.6361268027097685, 'learning_rate': 3.0680733256055475e-09, 'epoch': 0.99}
99%|█████████▉| 21858/22095 [37:27:09<13:32, 3.43s/it] {'loss': 0.2762, 'grad_norm': 0.5593585491784253, 'learning_rate': 3.0424553822921977e-09, 'epoch': 0.99}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8424681 in VC:s3://internvl-moe-sft-data/. Exception: Image size [231, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 150176, 'image': 'vrdu_texteq/astro-ph.CO/54af9edb-aac6-4ee8-9dec-7b66b6df2aac.png', 'image_wh': [[231, 23]], 'conversations': [{'from': 'human', 'value': '\n你能够将图像中可见的文本转录出来吗?'}, {'from': 'gpt', 'value': 'For $m = 0$ we have'}]}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8887139 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 10292, 'image': 'images/5364.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': "\n如图,C是线段AB上一点,AC=4,BC=6,点M、N分别是线段AC、BC的中点,则MN=()\nA. 10\nB. 5\nC. 2\nD. 3\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]}
99%|█████████▉| 21859/22095 [37:27:13<14:33, 3.70s/it] {'loss': 0.323, 'grad_norm': 0.5934333708764978, 'learning_rate': 3.0169448082495446e-09, 'epoch': 0.99}
99%|█████████▉| 21860/22095 [37:27:16<13:45, 3.51s/it] {'loss': 0.2079, 'grad_norm': 0.5554011995679686, 'learning_rate': 2.991541604025483e-09, 'epoch': 0.99}
99%|█████████▉| 21861/22095 [37:27:20<13:38, 3.50s/it] {'loss': 0.3135, 'grad_norm': 0.6551671338868413, 'learning_rate': 2.9662457701662428e-09, 'epoch': 0.99}
99%|█████████▉| 21862/22095 [37:27:23<13:01, 3.35s/it] {'loss': 0.2909, 'grad_norm': 0.5727764315689057, 'learning_rate': 2.9410573072152783e-09, 'epoch': 0.99}
99%|█████████▉| 21863/22095 [37:27:26<12:23, 3.21s/it] {'loss': 0.2915, 'grad_norm': 0.5694765003150496, 'learning_rate': 2.915976215713268e-09, 'epoch': 0.99}
99%|█████████▉| 21864/22095 [37:27:28<11:58, 3.11s/it] {'loss': 0.3184, 'grad_norm': 0.6835921802362995, 'learning_rate': 2.8910024962003347e-09, 'epoch': 0.99}
99%|█████████▉| 21865/22095 [37:27:32<11:51, 3.09s/it] {'loss': 0.3253, 'grad_norm': 0.597071520318116, 'learning_rate': 2.866136149211607e-09, 'epoch': 0.99}
99%|█████████▉| 21866/22095 [37:27:36<12:49, 3.36s/it] {'loss': 0.2917, 'grad_norm': 0.6053926412628252, 'learning_rate': 2.8413771752822116e-09, 'epoch': 0.99}
99%|█████████▉| 21867/22095 [37:27:39<13:01, 3.43s/it] {'loss': 0.2666, 'grad_norm': 0.6335645278894662, 'learning_rate': 2.816725574943946e-09, 'epoch': 0.99}
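The `Rank 0: Number of image tokens 0 does not match number of images 1` / `Rank 0: Fixed image tokens in the conversation` pairs above indicate samples whose first user turn lacks an image placeholder for an attached image, which the loader then repairs. A minimal sketch of that kind of repair (the `<image>` placeholder string and helper name are assumptions, not taken from the actual training code):

```python
# Hypothetical repair mirroring the "Fixed image tokens" log messages:
# if the message contains fewer image placeholders than there are images,
# prepend the missing ones so the counts match.
IMAGE_TOKEN = "<image>"  # placeholder string; the real token is an assumption

def fix_image_tokens(message: str, num_images: int) -> str:
    """Prepend missing image placeholders so the count matches num_images."""
    missing = num_images - message.count(IMAGE_TOKEN)
    if missing > 0:
        message = (IMAGE_TOKEN + "\n") * missing + message
    return message
```

A message with zero placeholders but one attached image would thus gain a single leading placeholder, matching the "0 does not match 1" case logged above.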
99%|█████████▉| 21867/22095 [37:27:39<13:01, 3.43s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [106, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8451495 in VC:s3://internvl-moe-sft-data/. Exception: Image size [106, 25, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 111706, 'image': 'vrdu_texteq/astro-ph.CO/15631fcb-3471-4959-bf1a-405f426745a0.png', 'image_wh': [[106, 25]], 'conversations': [{'from': 'human', 'value': '\n请从图像中提取文本'}, {'from': 'gpt', 'value': '$s^3 e^{-3 i Ks}$'}]} 99%|█████████▉| 21868/22095 [37:27:42<12:27, 3.29s/it] {'loss': 0.2856, 'grad_norm': 0.6513409620336323, 'learning_rate': 2.792181348726941e-09, 'epoch': 0.99} 99%|█████████▉| 21868/22095 [37:27:42<12:27, 3.29s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 9047543 in VC:s3://multi-modal/UniGeo/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'image': 'calculation_images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 5cm\nB. 
8cm\nC. 9cm\nD. 4cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'A'}]} 99%|█████████▉| 21869/22095 [37:27:46<12:59, 3.45s/it] {'loss': 0.3292, 'grad_norm': 0.6472487900335032, 'learning_rate': 2.767744497157998e-09, 'epoch': 0.99} 99%|█████████▉| 21869/22095 [37:27:46<12:59, 3.45s/it] 99%|█████████▉| 21870/22095 [37:27:50<13:15, 3.53s/it] {'loss': 0.2938, 'grad_norm': 0.6328915428497753, 'learning_rate': 2.7434150207622525e-09, 'epoch': 0.99} 99%|█████████▉| 21870/22095 [37:27:50<13:15, 3.53s/it] 99%|█████████▉| 21871/22095 [37:27:53<13:09, 3.52s/it] {'loss': 0.3124, 'grad_norm': 0.676358863988407, 'learning_rate': 2.719192920063174e-09, 'epoch': 0.99} 99%|█████████▉| 21871/22095 [37:27:53<13:09, 3.52s/it] 99%|█████████▉| 21872/22095 [37:27:56<12:20, 3.32s/it] {'loss': 0.2925, 'grad_norm': 0.7129888996080095, 'learning_rate': 2.6950781955803475e-09, 'epoch': 0.99} 99%|█████████▉| 21872/22095 [37:27:56<12:20, 3.32s/it] 99%|█████████▉| 21873/22095 [37:28:00<13:04, 3.53s/it] {'loss': 0.2965, 'grad_norm': 0.5912825030245769, 'learning_rate': 2.6710708478316914e-09, 'epoch': 0.99} 99%|█████████▉| 21873/22095 [37:28:00<13:04, 3.53s/it] 99%|█████████▉| 21874/22095 [37:28:03<12:37, 3.43s/it] {'loss': 0.2893, 'grad_norm': 0.6327579557572981, 'learning_rate': 2.6471708773340154e-09, 'epoch': 0.99} 99%|█████████▉| 21874/22095 [37:28:03<12:37, 3.43s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:1054: UserWarning: Palette images with Transparency expressed in bytes should be converted to RGBA images warnings.warn( 99%|█████████▉| 21875/22095 [37:28:07<12:57, 3.54s/it] {'loss': 0.2831, 'grad_norm': 0.6187239759164697, 'learning_rate': 2.623378284600797e-09, 'epoch': 0.99} 99%|█████████▉| 21875/22095 [37:28:07<12:57, 3.54s/it] 99%|█████████▉| 21876/22095 [37:28:11<13:08, 3.60s/it] {'loss': 0.274, 'grad_norm': 0.5763850127898063, 'learning_rate': 
2.599693070142739e-09, 'epoch': 0.99} 99%|█████████▉| 21876/22095 [37:28:11<13:08, 3.60s/it] 99%|█████████▉| 21877/22095 [37:28:14<12:32, 3.45s/it] {'loss': 0.3054, 'grad_norm': 0.6044053598183847, 'learning_rate': 2.576115234468324e-09, 'epoch': 0.99} 99%|█████████▉| 21877/22095 [37:28:14<12:32, 3.45s/it] 99%|█████████▉| 21878/22095 [37:28:17<12:17, 3.40s/it] {'loss': 0.3095, 'grad_norm': 0.6342602639735581, 'learning_rate': 2.552644778085478e-09, 'epoch': 0.99} 99%|█████████▉| 21878/22095 [37:28:17<12:17, 3.40s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21879/22095 [37:28:20<11:35, 3.22s/it] {'loss': 0.2937, 'grad_norm': 0.5974125313774147, 'learning_rate': 2.5292817014976877e-09, 'epoch': 0.99} 99%|█████████▉| 21879/22095 [37:28:20<11:35, 3.22s/it] 99%|█████████▉| 21880/22095 [37:28:24<12:27, 3.48s/it] {'loss': 0.2344, 'grad_norm': 0.5630186640044591, 'learning_rate': 2.5060260052067742e-09, 'epoch': 0.99} 99%|█████████▉| 21880/22095 [37:28:24<12:27, 3.48s/it] 99%|█████████▉| 21881/22095 [37:28:27<12:18, 3.45s/it] {'loss': 0.3481, 'grad_norm': 0.6598742022611298, 'learning_rate': 2.4828776897128925e-09, 'epoch': 0.99} 99%|█████████▉| 21881/22095 [37:28:27<12:18, 3.45s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (117990000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
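The two Pillow warnings in this log (palette images with transparency expressed in bytes, and a `DecompressionBombWarning` for a 117,990,000-pixel file) can both be handled where images are opened. A sketch under the assumption that images are loaded with Pillow; note that raising `Image.MAX_IMAGE_PIXELS` weakens a deliberate safety check and should only be done for trusted datasets:

```python
from PIL import Image

# The decompression-bomb check guards against hostile files; raise the
# ceiling (or set it to None) only for data you trust, such as a curated
# training corpus. Pillow's default limit is 89,478,485 pixels.
Image.MAX_IMAGE_PIXELS = 200_000_000

def load_rgb(fp):
    """Open an image and normalize it to RGB, going through RGBA for
    palette ('P') images so byte-level transparency is converted rather
    than silently dropped (this avoids the palette-transparency warning)."""
    img = Image.open(fp)
    if img.mode == "P":
        img = img.convert("RGBA")
    if img.mode != "RGB":
        img = img.convert("RGB")
    return img
```

With this normalization at load time, both warning classes disappear from the worker logs and downstream code can assume 3-channel input.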
warnings.warn( 99%|█████████▉| 21882/22095 [37:28:31<12:08, 3.42s/it] {'loss': 0.31, 'grad_norm': 0.679798663117042, 'learning_rate': 2.459836755513423e-09, 'epoch': 0.99} 99%|█████████▉| 21882/22095 [37:28:31<12:08, 3.42s/it] 99%|█████████▉| 21883/22095 [37:28:34<11:50, 3.35s/it] {'loss': 0.2829, 'grad_norm': 0.6326383323678978, 'learning_rate': 2.4369032031029695e-09, 'epoch': 0.99} 99%|█████████▉| 21883/22095 [37:28:34<11:50, 3.35s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21884/22095 [37:28:37<11:19, 3.22s/it] {'loss': 0.2817, 'grad_norm': 0.6012953654016868, 'learning_rate': 2.4140770329750264e-09, 'epoch': 0.99} 99%|█████████▉| 21884/22095 [37:28:37<11:19, 3.22s/it] 99%|█████████▉| 21885/22095 [37:28:40<11:01, 3.15s/it] {'loss': 0.2729, 'grad_norm': 0.6390496404163677, 'learning_rate': 2.391358245619202e-09, 'epoch': 0.99} 99%|█████████▉| 21885/22095 [37:28:40<11:01, 3.15s/it] 99%|█████████▉| 21886/22095 [37:28:43<11:16, 3.24s/it] {'loss': 0.2952, 'grad_norm': 0.6142964865838836, 'learning_rate': 2.3687468415245494e-09, 'epoch': 0.99} 99%|█████████▉| 21886/22095 [37:28:43<11:16, 3.24s/it] 99%|█████████▉| 21887/22095 [37:28:47<11:18, 3.26s/it] {'loss': 0.3201, 'grad_norm': 0.7176865699863244, 'learning_rate': 2.346242821176237e-09, 'epoch': 0.99} 99%|█████████▉| 21887/22095 [37:28:47<11:18, 3.26s/it] 99%|█████████▉| 21888/22095 [37:28:50<11:37, 3.37s/it] {'loss': 0.3243, 'grad_norm': 0.6598530408570948, 'learning_rate': 2.3238461850583206e-09, 'epoch': 0.99} 99%|█████████▉| 21888/22095 [37:28:50<11:37, 3.37s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21889/22095 [37:28:54<11:52, 3.46s/it] {'loss': 0.3047, 'grad_norm': 0.6331095745795835, 'learning_rate': 2.3015569336509724e-09, 'epoch': 0.99} 99%|█████████▉| 21889/22095 [37:28:54<11:52, 3.46s/it]Rank 0: Number of image tokens 0 does 
not match number of images 1
Rank 0: Fixed image tokens in the conversation
99%|█████████▉| 21890/22095 [37:28:57<11:15, 3.30s/it] {'loss': 0.2782, 'grad_norm': 0.6414719345668272, 'learning_rate': 2.279375067434919e-09, 'epoch': 0.99}
99%|█████████▉| 21891/22095 [37:29:00<10:45, 3.16s/it] {'loss': 0.3231, 'grad_norm': 0.638176475097762, 'learning_rate': 2.2573005868853358e-09, 'epoch': 0.99}
99%|█████████▉| 21892/22095 [37:29:03<10:29, 3.10s/it] {'loss': 0.2795, 'grad_norm': 0.6025064831274091, 'learning_rate': 2.2353334924768435e-09, 'epoch': 0.99}
99%|█████████▉| 21893/22095 [37:29:06<10:53, 3.23s/it] {'loss': 0.2546, 'grad_norm': 0.5871043366477097, 'learning_rate': 2.213473784681286e-09, 'epoch': 0.99}
99%|█████████▉| 21894/22095 [37:29:09<10:50, 3.24s/it] {'loss': 0.2715, 'grad_norm': 0.6708388062940376, 'learning_rate': 2.1917214639693985e-09, 'epoch': 0.99}
99%|█████████▉| 21895/22095 [37:29:12<10:34, 3.17s/it] {'loss': 0.2948, 'grad_norm': 0.6180303987275302, 'learning_rate': 2.1700765308074743e-09, 'epoch': 0.99}
99%|█████████▉| 21896/22095 [37:29:17<11:38, 3.51s/it] {'loss': 0.3112, 'grad_norm': 0.5761702880401627, 'learning_rate': 2.1485389856606973e-09, 'epoch': 0.99}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8396955 in VC:s3://internvl-moe-sft-data/. Exception: Image size [20, 14, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 63808, 'image': 'vrdu_table_final_2/astro-ph.EP/a508408e-f0de-4f69-b50e-1ca207289f49.png', 'image_wh': [[20, 14]], 'conversations': [{'from': 'human', 'value': '\nPlease rewrite the table in the image into LaTeX syntax.'}, {'from': 'gpt', 'value': "Sure, I'll rewrite the table in the image into LaTeX syntax for you.\n```latex\n\\begin{tabular}[t]{l}x\\end{tabular}\n```"}]}
99%|█████████▉| 21897/22095 [37:29:21<11:53, 3.60s/it] {'loss': 0.292, 'grad_norm': 0.5283381025407042, 'learning_rate': 2.1271088289920304e-09, 'epoch': 0.99}
99%|█████████▉| 21898/22095 [37:29:23<11:05, 3.38s/it] {'loss': 0.2763, 'grad_norm': 0.6000773893981395, 'learning_rate': 2.1057860612627713e-09, 'epoch': 0.99}
99%|█████████▉| 21899/22095 [37:29:27<10:59, 3.36s/it] {'loss': 0.3268, 'grad_norm': 0.5964113949286953, 'learning_rate': 2.0845706829297762e-09, 'epoch': 0.99}
99%|█████████▉| 21900/22095 [37:29:30<11:00, 3.38s/it] {'loss': 0.251, 'grad_norm': 0.5875991897924737, 'learning_rate': 2.0634626944493475e-09, 'epoch': 0.99}
99%|█████████▉| 21901/22095 [37:29:33<10:29, 3.24s/it] {'loss': 0.3334, 'grad_norm': 0.6623000008080091, 'learning_rate': 2.0424620962750107e-09, 'epoch': 0.99}
99%|█████████▉| 21902/22095 [37:29:36<10:18, 3.20s/it] {'loss': 0.2595, 'grad_norm': 0.5850960457521891, 'learning_rate': 2.021568888858627e-09, 'epoch': 0.99}
99%|█████████▉| 
21903/22095 [37:29:40<10:22, 3.24s/it] {'loss': 0.2888, 'grad_norm': 0.5840999288698242, 'learning_rate': 2.0007830726481716e-09, 'epoch': 0.99} 99%|█████████▉| 21903/22095 [37:29:40<10:22, 3.24s/it] 99%|█████████▉| 21904/22095 [37:29:43<10:31, 3.31s/it] {'loss': 0.2938, 'grad_norm': 0.7335081157665665, 'learning_rate': 1.980104648090508e-09, 'epoch': 0.99} 99%|█████████▉| 21904/22095 [37:29:43<10:31, 3.31s/it] 99%|█████████▉| 21905/22095 [37:29:46<09:58, 3.15s/it] {'loss': 0.321, 'grad_norm': 0.6805641085989351, 'learning_rate': 1.9595336156308375e-09, 'epoch': 0.99} 99%|█████████▉| 21905/22095 [37:29:46<09:58, 3.15s/it] 99%|█████████▉| 21906/22095 [37:29:49<09:45, 3.10s/it] {'loss': 0.2899, 'grad_norm': 0.6457916470088789, 'learning_rate': 1.9390699757099174e-09, 'epoch': 0.99} 99%|█████████▉| 21906/22095 [37:29:49<09:45, 3.10s/it] 99%|█████████▉| 21907/22095 [37:29:53<10:47, 3.44s/it] {'loss': 0.2824, 'grad_norm': 0.6183658680749494, 'learning_rate': 1.9187137287685065e-09, 'epoch': 0.99} 99%|█████████▉| 21907/22095 [37:29:53<10:47, 3.44s/it] 99%|█████████▉| 21908/22095 [37:29:57<10:55, 3.51s/it] {'loss': 0.287, 'grad_norm': 0.6144721012585188, 'learning_rate': 1.8984648752429222e-09, 'epoch': 0.99} 99%|█████████▉| 21908/22095 [37:29:57<10:55, 3.51s/it] 99%|█████████▉| 21909/22095 [37:30:00<10:17, 3.32s/it] {'loss': 0.2704, 'grad_norm': 0.6641942731452835, 'learning_rate': 1.878323415568928e-09, 'epoch': 0.99} 99%|█████████▉| 21909/22095 [37:30:00<10:17, 3.32s/it] 99%|█████████▉| 21910/22095 [37:30:03<10:14, 3.32s/it] {'loss': 0.2873, 'grad_norm': 1.0685510439896335, 'learning_rate': 1.8582893501795096e-09, 'epoch': 0.99} 99%|█████████▉| 21910/22095 [37:30:03<10:14, 3.32s/it] 99%|█████████▉| 21911/22095 [37:30:06<10:11, 3.32s/it] {'loss': 0.2665, 'grad_norm': 0.5948347391105585, 'learning_rate': 1.8383626795048793e-09, 'epoch': 0.99} 99%|█████████▉| 21911/22095 [37:30:06<10:11, 3.32s/it] 99%|█████████▉| 21912/22095 [37:30:09<10:04, 3.30s/it] {'loss': 0.2922, 
'grad_norm': 0.625767466906673, 'learning_rate': 1.8185434039730277e-09, 'epoch': 0.99} 99%|█████████▉| 21912/22095 [37:30:09<10:04, 3.30s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8406608 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 8796, 'image': 'vrdu_table_final_2/astro-ph.CO/11f8bae5-1ffa-4d1f-9d91-6f76d147d823.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nPlease reformat the table from the image into LaTeX code.'}, {'from': 'gpt', 'value': "Absolutely, I'll reformat the table from the image into LaTeX code for you.\n```latex\n\\begin{tabular}{@{}#1@{}}#2\\end{tabular}\n```"}]} 99%|█████████▉| 21913/22095 [37:30:13<09:49, 3.24s/it] {'loss': 0.2751, 'grad_norm': 0.5766022669904595, 'learning_rate': 1.7988315240097254e-09, 'epoch': 0.99} 99%|█████████▉| 21913/22095 [37:30:13<09:49, 3.24s/it] 99%|█████████▉| 21914/22095 [37:30:16<09:42, 3.22s/it] {'loss': 0.2615, 'grad_norm': 0.6275741354154193, 'learning_rate': 1.7792270400390776e-09, 'epoch': 0.99} 99%|█████████▉| 21914/22095 [37:30:16<09:42, 3.22s/it] 99%|█████████▉| 21915/22095 [37:30:19<09:56, 3.31s/it] {'loss': 0.3006, 'grad_norm': 0.5845001298448244, 'learning_rate': 1.759729952481859e-09, 'epoch': 0.99} 99%|█████████▉| 21915/22095 [37:30:19<09:56, 3.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21916/22095 
[37:30:23<10:29, 3.52s/it] {'loss': 0.2822, 'grad_norm': 1.86394089138806, 'learning_rate': 1.7403402617571785e-09, 'epoch': 0.99} 99%|█████████▉| 21916/22095 [37:30:23<10:29, 3.52s/it] 99%|█████████▉| 21917/22095 [37:30:27<10:40, 3.60s/it] {'loss': 0.3051, 'grad_norm': 0.6429230000792644, 'learning_rate': 1.72105796828137e-09, 'epoch': 0.99} 99%|█████████▉| 21917/22095 [37:30:27<10:40, 3.60s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21918/22095 [37:30:30<10:08, 3.44s/it] {'loss': 0.2702, 'grad_norm': 0.6895858610389437, 'learning_rate': 1.7018830724691016e-09, 'epoch': 0.99} 99%|█████████▉| 21918/22095 [37:30:30<10:08, 3.44s/it] 99%|█████████▉| 21919/22095 [37:30:33<09:34, 3.26s/it] {'loss': 0.2767, 'grad_norm': 0.592948558171695, 'learning_rate': 1.682815574732266e-09, 'epoch': 0.99} 99%|█████████▉| 21919/22095 [37:30:33<09:34, 3.26s/it] 99%|█████████▉| 21920/22095 [37:30:37<09:48, 3.36s/it] {'loss': 0.2975, 'grad_norm': 0.5687485020465194, 'learning_rate': 1.6638554754805358e-09, 'epoch': 0.99} 99%|█████████▉| 21920/22095 [37:30:37<09:48, 3.36s/it] 99%|█████████▉| 21921/22095 [37:30:40<09:40, 3.34s/it] {'loss': 0.2997, 'grad_norm': 0.5920284006547106, 'learning_rate': 1.6450027751213626e-09, 'epoch': 0.99} 99%|█████████▉| 21921/22095 [37:30:40<09:40, 3.34s/it] 99%|█████████▉| 21922/22095 [37:30:43<09:16, 3.22s/it] {'loss': 0.3029, 'grad_norm': 0.6121210158543754, 'learning_rate': 1.6262574740599778e-09, 'epoch': 0.99} 99%|█████████▉| 21922/22095 [37:30:43<09:16, 3.22s/it] 99%|█████████▉| 21923/22095 [37:30:46<08:58, 3.13s/it] {'loss': 0.2618, 'grad_norm': 0.5701848941441401, 'learning_rate': 1.6076195726982824e-09, 'epoch': 0.99} 99%|█████████▉| 21923/22095 [37:30:46<08:58, 3.13s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg 
and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [113, 13, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8924587 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [113, 13, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 47740, 'image': 'images/982.png', 'image_wh': [[113, 13]], 'conversations': [{'from': 'human', 'value': '\n如图,AB=16cm,C是AB上任意一点,D是AC的中点,E是BC的中点,则线段DE的长为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '8cm'}]} /mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (139814040 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. warnings.warn( 99%|█████████▉| 21924/22095 [37:30:49<09:02, 3.17s/it] {'loss': 0.3071, 'grad_norm': 0.6239870659742223, 'learning_rate': 1.5890890714381769e-09, 'epoch': 0.99} 99%|█████████▉| 21924/22095 [37:30:49<09:02, 3.17s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [23, 25, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8369314 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 25, 100, 100] is too small. Minimum size is 28. 
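The `Rank 0: Number of image tokens 0 does not match number of images 1` / `Fixed image tokens in the conversation` messages earlier in the log indicate the loader patches conversations whose `<image>` placeholder count disagrees with the images attached to the sample. A minimal sketch of that kind of repair, assuming `<image>` is the placeholder string used by the data format; the actual fix-up logic inside `data_qwen_2.py` is not shown in this log:

```python
# Sketch of a placeholder-count repair: make the number of '<image>'
# placeholders in a conversation turn match the number of attached images.
# '<image>' as the token is an assumption based on the qwen-vl-finetune
# data format; the real repair in data_qwen_2.py may differ.

IMAGE_TOKEN = "<image>"

def fix_image_tokens(text, num_images):
    """Return text with exactly num_images occurrences of IMAGE_TOKEN."""
    found = text.count(IMAGE_TOKEN)
    if found < num_images:
        # Prepend the missing placeholders, one per line, as the sample
        # dumps in this log suggest ('<image>\n' before the question).
        text = (IMAGE_TOKEN + "\n") * (num_images - found) + text
    else:
        # Drop surplus placeholders from the left.
        for _ in range(found - num_images):
            text = text.replace(IMAGE_TOKEN, "", 1)
    return text
```

Logging whenever a repair fires, as the training script does here, is worth keeping: a systematic mismatch usually points at a preprocessing bug rather than a few bad samples.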
Problematic sample: {'id': 36066, 'image': 'vrdu_table_final_2/astro-ph.CO/17f48c56-863b-4ef9-a7a9-019c111b5ae4.png', 'image_wh': [[23, 25]], 'conversations': [{'from': 'human', 'value': '\nTranslate the table shown in the image into LaTeX code, please.'}, {'from': 'gpt', 'value': "Sure, I'll translate the table shown in the image into LaTeX code for you.\n```latex\n\\begin{tabular}{c}$\\theta_s$\\end{tabular}\n```"}]} 99%|█████████▉| 21925/22095 [37:30:52<09:08, 3.22s/it] {'loss': 0.2781, 'grad_norm': 0.5982485102238448, 'learning_rate': 1.5706659706771211e-09, 'epoch': 0.99} 99%|█████████▉| 21925/22095 [37:30:52<09:08, 3.22s/it] 99%|█████████▉| 21926/22095 [37:30:56<09:25, 3.35s/it] {'loss': 0.304, 'grad_norm': 0.5887485429538946, 'learning_rate': 1.5523502708103544e-09, 'epoch': 0.99} 99%|█████████▉| 21926/22095 [37:30:56<09:25, 3.35s/it] 99%|█████████▉| 21927/22095 [37:31:00<10:21, 3.70s/it] {'loss': 0.2928, 'grad_norm': 0.6232348804393142, 'learning_rate': 1.5341419722325612e-09, 'epoch': 0.99} 99%|█████████▉| 21927/22095 [37:31:00<10:21, 3.70s/it] 99%|█████████▉| 21928/22095 [37:31:04<09:55, 3.57s/it] {'loss': 0.2651, 'grad_norm': 0.6679441376503535, 'learning_rate': 1.51604107533454e-09, 'epoch': 0.99} 99%|█████████▉| 21928/22095 [37:31:04<09:55, 3.57s/it] 99%|█████████▉| 21929/22095 [37:31:07<10:00, 3.62s/it] {'loss': 0.3113, 'grad_norm': 0.6308610655766753, 'learning_rate': 1.4980475805048688e-09, 'epoch': 0.99} 99%|█████████▉| 21929/22095 [37:31:07<10:00, 3.62s/it] 99%|█████████▉| 21930/22095 [37:31:11<09:35, 3.49s/it] {'loss': 0.2558, 'grad_norm': 0.6038910862193314, 'learning_rate': 1.4801614881304604e-09, 'epoch': 0.99} 99%|█████████▉| 21930/22095 [37:31:11<09:35, 3.49s/it] 99%|█████████▉| 21931/22095 [37:31:14<09:16, 3.39s/it] {'loss': 0.2528, 'grad_norm': 0.566272493740601, 'learning_rate': 1.462382798595452e-09, 'epoch': 0.99} 99%|█████████▉| 21931/22095 [37:31:14<09:16, 3.39s/it] 99%|█████████▉| 21932/22095 [37:31:17<08:59, 3.31s/it] {'loss': 
0.2585, 'grad_norm': 0.6423144924625247, 'learning_rate': 1.4447115122817601e-09, 'epoch': 0.99} 99%|█████████▉| 21932/22095 [37:31:17<08:59, 3.31s/it] 99%|█████████▉| 21933/22095 [37:31:20<08:52, 3.29s/it] {'loss': 0.2633, 'grad_norm': 0.6201093438260887, 'learning_rate': 1.4271476295696363e-09, 'epoch': 0.99} 99%|█████████▉| 21933/22095 [37:31:20<08:52, 3.29s/it] 99%|█████████▉| 21934/22095 [37:31:23<08:31, 3.17s/it] {'loss': 0.28, 'grad_norm': 0.6085470147719076, 'learning_rate': 1.4096911508365564e-09, 'epoch': 0.99} 99%|█████████▉| 21934/22095 [37:31:23<08:31, 3.17s/it] 99%|█████████▉| 21935/22095 [37:31:27<08:57, 3.36s/it] {'loss': 0.2697, 'grad_norm': 0.637300112023428, 'learning_rate': 1.3923420764566653e-09, 'epoch': 0.99} 99%|█████████▉| 21935/22095 [37:31:27<08:57, 3.36s/it] 99%|█████████▉| 21936/22095 [37:31:30<09:07, 3.44s/it] {'loss': 0.2711, 'grad_norm': 0.5883667578079982, 'learning_rate': 1.3751004068035534e-09, 'epoch': 0.99} 99%|█████████▉| 21936/22095 [37:31:30<09:07, 3.44s/it] 99%|█████████▉| 21937/22095 [37:31:34<09:04, 3.45s/it] {'loss': 0.2861, 'grad_norm': 0.5961564073876877, 'learning_rate': 1.35796614224748e-09, 'epoch': 0.99} 99%|█████████▉| 21937/22095 [37:31:34<09:04, 3.45s/it] 99%|█████████▉| 21938/22095 [37:31:38<09:23, 3.59s/it] {'loss': 0.2655, 'grad_norm': 0.6375560477188555, 'learning_rate': 1.3409392831564838e-09, 'epoch': 0.99} 99%|█████████▉| 21938/22095 [37:31:38<09:23, 3.59s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [195, 23, 100, 100] is too small. Minimum size is 28. 
[Try #0] Failed to fetch sample 8944626 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [195, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 67779, 'image': 'images/5460.png', 'image_wh': [[195, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,线段AB=18cm,BC=6cm,D为BC的中点,则线段AD的长为()\nA. 11cm\nB. 12cm\nC. 15cm\nD. 13cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 99%|█████████▉| 21939/22095 [37:31:42<09:46, 3.76s/it] {'loss': 0.2948, 'grad_norm': 0.6792700890001889, 'learning_rate': 1.3240198298963836e-09, 'epoch': 0.99} 99%|█████████▉| 21939/22095 [37:31:42<09:46, 3.76s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [210, 22, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8937805 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [210, 22, 100, 100] is too small. Minimum size is 28. 
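The `[Try #0] Failed to fetch sample ...` lines suggest the dataset catches per-sample failures and resamples instead of crashing the dataloader worker. A sketch of such a guard; `fetch`, `max_tries`, and `get_with_fallback` are illustrative names, not the actual `data_qwen_2.py` API:

```python
import random

# Sketch of a skip-and-resample __getitem__ guard matching the
# "[Try #0] Failed to fetch sample ..." lines in this log: on a bad
# sample, log the failure and fall back to a different random index
# rather than killing the worker 37 hours into the run.

def get_with_fallback(fetch, index, dataset_len, max_tries=10, rng=random):
    for attempt in range(max_tries):
        try:
            return fetch(index)
        except ValueError as exc:  # e.g. "Image size ... is too small"
            print(f"[Try #{attempt}] Failed to fetch sample {index}. Exception: {exc}")
            index = rng.randrange(dataset_len)
    raise RuntimeError(f"no valid sample after {max_tries} tries")
```

The trade-off is that silently resampling changes the effective data distribution slightly, which is why pre-filtering bad records is preferable when the failure mode is known in advance.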
Problematic sample: {'id': 60958, 'image': 'images/5028.png', 'image_wh': [[210, 22]], 'conversations': [{'from': 'human', 'value': '\n如图所示,bc=\\ frac{1}{2}ab,d是ac的中点,dc=3cm,则ab的长度为()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]} 99%|█████████▉| 21940/22095 [37:31:46<09:50, 3.81s/it] {'loss': 0.3265, 'grad_norm': 0.5840208606838262, 'learning_rate': 1.3072077828307772e-09, 'epoch': 0.99} 99%|█████████▉| 21940/22095 [37:31:46<09:50, 3.81s/it] 99%|█████████▉| 21941/22095 [37:31:49<09:31, 3.71s/it] {'loss': 0.2734, 'grad_norm': 0.6286625929006481, 'learning_rate': 1.2905031423210423e-09, 'epoch': 0.99} 99%|█████████▉| 21941/22095 [37:31:49<09:31, 3.71s/it] 99%|█████████▉| 21942/22095 [37:31:53<09:02, 3.54s/it] {'loss': 0.3262, 'grad_norm': 0.7285734220275707, 'learning_rate': 1.2739059087263362e-09, 'epoch': 0.99} 99%|█████████▉| 21942/22095 [37:31:53<09:02, 3.54s/it] 99%|█████████▉| 21943/22095 [37:31:57<09:47, 3.87s/it] {'loss': 0.2495, 'grad_norm': 0.5779318308145721, 'learning_rate': 1.257416082402485e-09, 'epoch': 0.99} 99%|█████████▉| 21943/22095 [37:31:57<09:47, 3.87s/it] 99%|█████████▉| 21944/22095 [37:32:00<09:07, 3.63s/it] {'loss': 0.3073, 'grad_norm': 0.6083077717806407, 'learning_rate': 1.2410336637047604e-09, 'epoch': 0.99} 99%|█████████▉| 21944/22095 [37:32:00<09:07, 3.63s/it] 99%|█████████▉| 21945/22095 [37:32:03<08:31, 3.41s/it] {'loss': 0.2834, 'grad_norm': 0.6214203937496386, 'learning_rate': 1.2247586529845479e-09, 'epoch': 0.99} 99%|█████████▉| 21945/22095 [37:32:03<08:31, 3.41s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: 
Image size [158, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8959591 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [158, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 10426, 'image': 'images/5267.png', 'image_wh': [[158, 18]], 'conversations': [{'from': 'human', 'value': '\n如图所示,C点为AB段的顶点,M点为CA的中点,N点为BC的中点。如果MC比NC长2 cm,AC比BC长()\nAnswer the question using a single word or phrase.'}, {'from': 'gpt', 'value': '4cm'}]} 99%|█████████▉| 21946/22095 [37:32:07<08:31, 3.44s/it] {'loss': 0.3056, 'grad_norm': 0.5481423624767995, 'learning_rate': 1.2085910505915677e-09, 'epoch': 0.99} 99%|█████████▉| 21946/22095 [37:32:07<08:31, 3.44s/it] 99%|█████████▉| 21947/22095 [37:32:11<08:52, 3.60s/it] {'loss': 0.2804, 'grad_norm': 0.572538572374642, 'learning_rate': 1.1925308568733197e-09, 'epoch': 0.99} 99%|█████████▉| 21947/22095 [37:32:11<08:52, 3.60s/it] 99%|█████████▉| 21948/22095 [37:32:14<08:51, 3.62s/it] {'loss': 0.273, 'grad_norm': 0.6241378536217829, 'learning_rate': 1.176578072175083e-09, 'epoch': 0.99} 99%|█████████▉| 21948/22095 [37:32:14<08:51, 3.62s/it] 99%|█████████▉| 21949/22095 [37:32:17<08:20, 3.43s/it] {'loss': 0.2911, 'grad_norm': 0.5799838848065667, 'learning_rate': 1.1607326968393617e-09, 'epoch': 0.99} 99%|█████████▉| 21949/22095 [37:32:17<08:20, 3.43s/it] 99%|█████████▉| 21950/22095 [37:32:20<07:55, 3.28s/it] {'loss': 0.3535, 'grad_norm': 0.6480647784604268, 'learning_rate': 1.1449947312064392e-09, 'epoch': 0.99} 99%|█████████▉| 21950/22095 [37:32:20<07:55, 3.28s/it] 99%|█████████▉| 21951/22095 [37:32:24<08:05, 3.37s/it] {'loss': 0.3251, 'grad_norm': 0.6990225541403731, 'learning_rate': 1.1293641756154883e-09, 'epoch': 0.99} 99%|█████████▉| 21951/22095 [37:32:24<08:05, 3.37s/it] 99%|█████████▉| 21952/22095 [37:32:27<07:58, 3.35s/it] {'loss': 0.3409, 'grad_norm': 0.5795906833178194, 'learning_rate': 1.1138410304012415e-09, 'epoch': 0.99} 99%|█████████▉| 21952/22095 
[37:32:27<07:58, 3.35s/it] 99%|█████████▉| 21953/22095 [37:32:30<07:44, 3.27s/it] {'loss': 0.3098, 'grad_norm': 0.605951227260699, 'learning_rate': 1.0984252958973207e-09, 'epoch': 0.99} 99%|█████████▉| 21953/22095 [37:32:30<07:44, 3.27s/it] 99%|█████████▉| 21954/22095 [37:32:33<07:21, 3.13s/it] {'loss': 0.321, 'grad_norm': 0.6328547372958854, 'learning_rate': 1.0831169724356828e-09, 'epoch': 0.99} 99%|█████████▉| 21954/22095 [37:32:33<07:21, 3.13s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21955/22095 [37:32:36<07:22, 3.16s/it] {'loss': 0.3066, 'grad_norm': 0.6120233663098031, 'learning_rate': 1.0679160603449533e-09, 'epoch': 0.99} 99%|█████████▉| 21955/22095 [37:32:36<07:22, 3.16s/it] 99%|█████████▉| 21956/22095 [37:32:40<07:26, 3.21s/it] {'loss': 0.3099, 'grad_norm': 0.5841427755427862, 'learning_rate': 1.0528225599515385e-09, 'epoch': 0.99} 99%|█████████▉| 21956/22095 [37:32:40<07:26, 3.21s/it] 99%|█████████▉| 21957/22095 [37:32:43<07:17, 3.17s/it] {'loss': 0.3139, 'grad_norm': 0.6222734322557101, 'learning_rate': 1.037836471579623e-09, 'epoch': 0.99} 99%|█████████▉| 21957/22095 [37:32:43<07:17, 3.17s/it] 99%|█████████▉| 21958/22095 [37:32:46<07:05, 3.10s/it] {'loss': 0.3219, 'grad_norm': 0.6193352639862391, 'learning_rate': 1.0229577955517267e-09, 'epoch': 0.99} 99%|█████████▉| 21958/22095 [37:32:46<07:05, 3.10s/it] 99%|█████████▉| 21959/22095 [37:32:49<07:12, 3.18s/it] {'loss': 0.2576, 'grad_norm': 0.6356822729983301, 'learning_rate': 1.008186532187594e-09, 'epoch': 0.99} 99%|█████████▉| 21959/22095 [37:32:49<07:12, 3.18s/it] 99%|█████████▉| 21960/22095 [37:32:52<07:02, 3.13s/it] {'loss': 0.3192, 'grad_norm': 0.5978070819650005, 'learning_rate': 9.93522681803638e-10, 'epoch': 0.99} 99%|█████████▉| 21960/22095 [37:32:52<07:02, 3.13s/it] 99%|█████████▉| 21961/22095 [37:32:56<07:24, 3.32s/it] {'loss': 0.3025, 'grad_norm': 0.6261767044610226, 'learning_rate': 
9.789662447157178e-10, 'epoch': 0.99} 99%|█████████▉| 21961/22095 [37:32:56<07:24, 3.32s/it] 99%|█████████▉| 21962/22095 [37:32:59<07:04, 3.19s/it] {'loss': 0.2947, 'grad_norm': 0.593104996327592, 'learning_rate': 9.645172212369158e-10, 'epoch': 0.99} 99%|█████████▉| 21962/22095 [37:32:59<07:04, 3.19s/it] 99%|█████████▉| 21963/22095 [37:33:02<06:54, 3.14s/it] {'loss': 0.2828, 'grad_norm': 0.6462433332947585, 'learning_rate': 9.501756116769844e-10, 'epoch': 0.99} 99%|█████████▉| 21963/22095 [37:33:02<06:54, 3.14s/it] 99%|█████████▉| 21964/22095 [37:33:06<07:29, 3.43s/it] {'loss': 0.3159, 'grad_norm': 0.5524732648542988, 'learning_rate': 9.359414163445657e-10, 'epoch': 0.99} 99%|█████████▉| 21964/22095 [37:33:06<07:29, 3.43s/it] 99%|█████████▉| 21965/22095 [37:33:09<07:16, 3.36s/it] {'loss': 0.2795, 'grad_norm': 0.583561381683474, 'learning_rate': 9.218146355449709e-10, 'epoch': 0.99} 99%|█████████▉| 21965/22095 [37:33:09<07:16, 3.36s/it] 99%|█████████▉| 21966/22095 [37:33:12<06:51, 3.19s/it] {'loss': 0.3043, 'grad_norm': 0.6135715357641147, 'learning_rate': 9.07795269582401e-10, 'epoch': 0.99} 99%|█████████▉| 21966/22095 [37:33:12<06:51, 3.19s/it] 99%|█████████▉| 21967/22095 [37:33:15<06:47, 3.19s/it] {'loss': 0.2594, 'grad_norm': 0.6229441688619807, 'learning_rate': 8.938833187577267e-10, 'epoch': 0.99} 99%|█████████▉| 21967/22095 [37:33:15<06:47, 3.19s/it] 99%|█████████▉| 21968/22095 [37:33:19<07:02, 3.33s/it] {'loss': 0.28, 'grad_norm': 0.6177528880302975, 'learning_rate': 8.800787833695978e-10, 'epoch': 0.99} 99%|█████████▉| 21968/22095 [37:33:19<07:02, 3.33s/it] 99%|█████████▉| 21969/22095 [37:33:22<07:00, 3.34s/it] {'loss': 0.2535, 'grad_norm': 1.8804680331595953, 'learning_rate': 8.663816637149991e-10, 'epoch': 0.99} 99%|█████████▉| 21969/22095 [37:33:22<07:00, 3.34s/it] 99%|█████████▉| 21970/22095 [37:33:26<07:41, 3.69s/it] {'loss': 0.3012, 'grad_norm': 0.7889549714801867, 'learning_rate': 8.527919600886947e-10, 'epoch': 0.99} 99%|█████████▉| 21970/22095 
[37:33:26<07:41, 3.69s/it] 99%|█████████▉| 21971/22095 [37:33:30<07:20, 3.55s/it] {'loss': 0.3013, 'grad_norm': 0.6087356486241385, 'learning_rate': 8.393096727815631e-10, 'epoch': 0.99} 99%|█████████▉| 21971/22095 [37:33:30<07:20, 3.55s/it] 99%|█████████▉| 21972/22095 [37:33:33<06:53, 3.36s/it] {'loss': 0.3094, 'grad_norm': 0.6173904701348589, 'learning_rate': 8.259348020844827e-10, 'epoch': 0.99} 99%|█████████▉| 21972/22095 [37:33:33<06:53, 3.36s/it] 99%|█████████▉| 21973/22095 [37:33:35<06:33, 3.23s/it] {'loss': 0.2814, 'grad_norm': 0.6153369456157168, 'learning_rate': 8.126673482838909e-10, 'epoch': 0.99} 99%|█████████▉| 21973/22095 [37:33:35<06:33, 3.23s/it] 99%|█████████▉| 21974/22095 [37:33:39<06:48, 3.37s/it] {'loss': 0.3302, 'grad_norm': 0.6292016348416848, 'learning_rate': 7.995073116656704e-10, 'epoch': 0.99} 99%|█████████▉| 21974/22095 [37:33:39<06:48, 3.37s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8310894 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28. 
Problematic sample: {'image': 'TB1EjJ_a4k98KJjSZFoXXXS6pXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nWhat is the all word in the image?'}, {'from': 'gpt', 'value': 'All words in the image:\n奥立德\n好评\n如潮\n100%实物拍摄,盗图必究\n厂家直销\n随意定做\n正品保证'}]} Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [14, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8348591 in VC:s3://internvl-moe-sft-data/. Exception: Image size [14, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 15260, 'image': 'vrdu_table_final_2/astro-ph.CO/2c7ce79d-85f6-4459-8564-4a79ebbc8e66.png', 'image_wh': [[14, 23]], 'conversations': [{'from': 'human', 'value': '\nTransform the depicted table into LaTeX code, kindly.'}, {'from': 'gpt', 'value': "Certainly, I'll transform the depicted table into LaTeX code for you.\n```latex\n\\begin{tabular}[#1]{@{}l@{}}#2\\end{tabular}\n```"}]} 99%|█████████▉| 21975/22095 [37:33:42<06:38, 3.32s/it] {'loss': 0.2664, 'grad_norm': 0.5631017655648378, 'learning_rate': 7.864546925118177e-10, 'epoch': 0.99} 99%|█████████▉| 21975/22095 [37:33:42<06:38, 3.32s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 99%|█████████▉| 21976/22095 [37:33:45<06:25, 3.24s/it] {'loss': 0.2962, 'grad_norm': 0.6069259072107689, 'learning_rate': 7.735094911032193e-10, 'epoch': 0.99} 99%|█████████▉| 21976/22095 [37:33:45<06:25, 3.24s/it] 99%|█████████▉| 21977/22095 [37:33:49<06:15, 3.18s/it] {'loss': 0.2682, 'grad_norm': 0.531390973141288, 
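The repeated `ValueError: Image size [...] is too small. Minimum size is 28` failures above all come from samples whose recorded width or height (the `image_wh` field in each "Problematic sample" dump) is below the 28-pixel minimum side the vision patchifier accepts. A minimal sketch of a pre-filter that drops such samples before training — the helper names are assumptions, not the actual `data_qwen_2.py` API:

```python
# Hedged sketch: skip samples whose recorded image size is below the
# 28-pixel minimum reported by the "Minimum size is 28" errors above.
# `sample["image_wh"]` mirrors the log's format: a list of [w, h] pairs.
MIN_SIDE = 28  # assumption: taken from the error message, not a config value

def is_trainable(sample: dict) -> bool:
    """Return True only if every image in the sample meets the minimum side."""
    for w, h in sample.get("image_wh", []):
        if w < MIN_SIDE or h < MIN_SIDE:
            return False
    return True

def filter_samples(samples: list) -> list:
    """Keep only samples whose images are all large enough to patchify."""
    return [s for s in samples if is_trainable(s)]
```

Filtering at dataset-build time avoids the retry loop (`[Try #0] Failed to fetch sample ...`) that each undersized image currently triggers mid-epoch.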
'learning_rate': 7.606717077179859e-10, 'epoch': 0.99} 99%|█████████▉| 21977/22095 [37:33:49<06:15, 3.18s/it] 99%|█████████▉| 21978/22095 [37:33:51<05:59, 3.08s/it] {'loss': 0.3383, 'grad_norm': 0.7526144498408558, 'learning_rate': 7.47941342631453e-10, 'epoch': 0.99} 99%|█████████▉| 21978/22095 [37:33:51<05:59, 3.08s/it] 99%|█████████▉| 21979/22095 [37:33:55<06:31, 3.38s/it] {'loss': 0.323, 'grad_norm': 0.6084599902690812, 'learning_rate': 7.353183961184007e-10, 'epoch': 0.99} 99%|█████████▉| 21979/22095 [37:33:55<06:31, 3.38s/it] 99%|█████████▉| 21980/22095 [37:33:58<06:12, 3.24s/it] {'loss': 0.2905, 'grad_norm': 0.5413245411564832, 'learning_rate': 7.228028684486132e-10, 'epoch': 0.99} 99%|█████████▉| 21980/22095 [37:33:58<06:12, 3.24s/it] 99%|█████████▉| 21981/22095 [37:34:02<06:29, 3.42s/it] {'loss': 0.3424, 'grad_norm': 0.598947853073034, 'learning_rate': 7.103947598918747e-10, 'epoch': 0.99} 99%|█████████▉| 21981/22095 [37:34:02<06:29, 3.42s/it] 99%|█████████▉| 21982/22095 [37:34:05<06:11, 3.29s/it] {'loss': 0.2437, 'grad_norm': 0.6844650599597334, 'learning_rate': 6.980940707146388e-10, 'epoch': 0.99} 99%|█████████▉| 21982/22095 [37:34:05<06:11, 3.29s/it] 99%|█████████▉| 21983/22095 [37:34:08<05:49, 3.12s/it] {'loss': 0.2969, 'grad_norm': 0.6625091436369556, 'learning_rate': 6.859008011816937e-10, 'epoch': 0.99} 99%|█████████▉| 21983/22095 [37:34:08<05:49, 3.12s/it] 99%|█████████▉| 21984/22095 [37:34:11<05:44, 3.10s/it] {'loss': 0.2529, 'grad_norm': 0.6723159430338779, 'learning_rate': 6.738149515539416e-10, 'epoch': 0.99} 99%|█████████▉| 21984/22095 [37:34:11<05:44, 3.10s/it] 100%|█████████▉| 21985/22095 [37:34:14<05:50, 3.19s/it] {'loss': 0.3146, 'grad_norm': 1.4999966827879183, 'learning_rate': 6.618365220917299e-10, 'epoch': 1.0} 100%|█████████▉| 21985/22095 [37:34:14<05:50, 3.19s/it] 100%|█████████▉| 21986/22095 [37:34:18<06:02, 3.33s/it] {'loss': 0.3073, 'grad_norm': 0.6497521224251643, 'learning_rate': 6.499655130526306e-10, 'epoch': 1.0} 
100%|█████████▉| 21986/22095 [37:34:18<06:02, 3.33s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [192, 21, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8908202 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [192, 21, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 31355, 'image': 'images/5249.png', 'image_wh': [[192, 21]], 'conversations': [{'from': 'human', 'value': '\n如图所示,如果段AB=9,BC=5,点D为AC段的中点,则段AD的长度为()\nA. 2\nB. 2.5\nC. 4.5\nD. 7'}, {'from': 'gpt', 'value': '【解答】解:∵AB=9,BC=5,∴AC=AB+BC=14,∵D为线段AC的中点,∴AD=\\frac{1}{2}AC=7,'}]} 100%|█████████▉| 21987/22095 [37:34:22<06:20, 3.52s/it] {'loss': 0.2842, 'grad_norm': 0.6168352312353976, 'learning_rate': 6.382019246908844e-10, 'epoch': 1.0} 100%|█████████▉| 21987/22095 [37:34:22<06:20, 3.52s/it] 100%|█████████▉| 21988/22095 [37:34:25<05:53, 3.30s/it] {'loss': 0.2974, 'grad_norm': 0.5962748762799298, 'learning_rate': 6.265457572601774e-10, 'epoch': 1.0} 100%|█████████▉| 21988/22095 [37:34:25<05:53, 3.30s/it] 100%|█████████▉| 21989/22095 [37:34:28<05:52, 3.33s/it] {'loss': 0.2968, 'grad_norm': 0.6219469205840408, 'learning_rate': 6.149970110108649e-10, 'epoch': 1.0} 100%|█████████▉| 21989/22095 [37:34:28<05:52, 3.33s/it] 100%|█████████▉| 21990/22095 [37:34:31<05:39, 3.24s/it] {'loss': 0.256, 'grad_norm': 0.5677401620285023, 'learning_rate': 6.035556861905268e-10, 'epoch': 1.0} 100%|█████████▉| 21990/22095 [37:34:31<05:39, 3.24s/it] 100%|█████████▉| 21991/22095 [37:34:34<05:37, 3.25s/it] {'loss': 0.2936, 'grad_norm': 0.6820100081540836, 'learning_rate': 
5.922217830450772e-10, 'epoch': 1.0} 100%|█████████▉| 21991/22095 [37:34:34<05:37, 3.25s/it] 100%|█████████▉| 21992/22095 [37:34:38<05:53, 3.43s/it] {'loss': 0.2635, 'grad_norm': 0.5594585909652834, 'learning_rate': 5.809953018187652e-10, 'epoch': 1.0} 100%|█████████▉| 21992/22095 [37:34:38<05:53, 3.43s/it] 100%|█████████▉| 21993/22095 [37:34:42<06:01, 3.54s/it] {'loss': 0.32, 'grad_norm': 0.6206856525103511, 'learning_rate': 5.698762427519544e-10, 'epoch': 1.0} 100%|█████████▉| 21993/22095 [37:34:42<06:01, 3.54s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 21994/22095 [37:34:46<06:01, 3.58s/it] {'loss': 0.258, 'grad_norm': 0.5969689557141281, 'learning_rate': 5.588646060838976e-10, 'epoch': 1.0} 100%|█████████▉| 21994/22095 [37:34:46<06:01, 3.58s/it] 100%|█████████▉| 21995/22095 [37:34:49<05:56, 3.56s/it] {'loss': 0.328, 'grad_norm': 0.6377851737737551, 'learning_rate': 5.479603920516275e-10, 'epoch': 1.0} 100%|█████████▉| 21995/22095 [37:34:49<05:56, 3.56s/it] 100%|█████████▉| 21996/22095 [37:34:53<05:50, 3.54s/it] {'loss': 0.2919, 'grad_norm': 0.5864974798279645, 'learning_rate': 5.371636008888459e-10, 'epoch': 1.0} 100%|█████████▉| 21996/22095 [37:34:53<05:50, 3.54s/it] 100%|█████████▉| 21997/22095 [37:34:56<05:45, 3.53s/it] {'loss': 0.3034, 'grad_norm': 0.599331196829423, 'learning_rate': 5.264742328275896e-10, 'epoch': 1.0} 100%|█████████▉| 21997/22095 [37:34:56<05:45, 3.53s/it] 100%|█████████▉| 21998/22095 [37:34:59<05:23, 3.33s/it] {'loss': 0.3054, 'grad_norm': 0.5859840998410031, 'learning_rate': 5.158922880976747e-10, 'epoch': 1.0} 100%|█████████▉| 21998/22095 [37:34:59<05:23, 3.33s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/PIL/Image.py:3406: DecompressionBombWarning: Image size (92139516 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. 
warnings.warn( 100%|█████████▉| 21999/22095 [37:35:02<05:05, 3.18s/it] {'loss': 0.2732, 'grad_norm': 0.6133896988379982, 'learning_rate': 5.054177669266969e-10, 'epoch': 1.0} 100%|█████████▉| 21999/22095 [37:35:02<05:05, 3.18s/it] 100%|█████████▉| 22000/22095 [37:35:06<05:36, 3.54s/it] {'loss': 0.2989, 'grad_norm': 0.6355733324276068, 'learning_rate': 4.950506695394763e-10, 'epoch': 1.0} 100%|█████████▉| 22000/22095 [37:35:06<05:36, 3.54s/it]/mnt/shared-storage-user/intern7shared/liuzhaoyang/programs/conda/envs/qwen2_5vl/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn( 100%|█████████▉| 22001/22095 [37:35:53<25:43, 16.42s/it] {'loss': 0.2672, 'grad_norm': 0.6405292750738774, 'learning_rate': 4.847909961586128e-10, 'epoch': 1.0} 100%|█████████▉| 22001/22095 [37:35:53<25:43, 16.42s/it] 100%|█████████▉| 22002/22095 [37:35:56<19:04, 12.31s/it] {'loss': 0.2925, 'grad_norm': 0.6117498728482897, 'learning_rate': 4.746387470044855e-10, 'epoch': 1.0} 100%|█████████▉| 22002/22095 [37:35:56<19:04, 12.31s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22003/22095 [37:35:59<14:44, 9.62s/it] {'loss': 0.293, 'grad_norm': 0.5603673735108489, 'learning_rate': 4.645939222963636e-10, 'epoch': 1.0} 100%|█████████▉| 22003/22095 [37:35:59<14:44, 9.62s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [356, 23, 100, 100] is too small. Minimum size is 28. 
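The `DecompressionBombWarning` above fires because one training image has 92,139,516 pixels, over PIL's default ceiling of 89,478,485. A small guard like the following can flag such images ahead of time; for a trusted, curated dataset the ceiling can instead be raised via `PIL.Image.MAX_IMAGE_PIXELS`. The default limit here is copied from the warning text, and the helper itself is an illustrative assumption:

```python
# Hedged sketch: pre-check oversized images instead of relying on PIL's
# runtime DecompressionBombWarning. PIL's default ceiling (the value in
# the warning above) is 89,478,485 pixels; for trusted data one may raise
# PIL.Image.MAX_IMAGE_PIXELS explicitly rather than suppress the warning.
DEFAULT_PIXEL_LIMIT = 89_478_485

def exceeds_pixel_limit(width: int, height: int,
                        limit: int = DEFAULT_PIXEL_LIMIT) -> bool:
    """True if the image's total pixel count is over the given ceiling."""
    return width * height > limit
```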
[Try #0] Failed to fetch sample 8438363 in VC:s3://internvl-moe-sft-data/. Exception: Image size [356, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 57511, 'image': 'vrdu_texteq/astro-ph.CO/24ec5276-536d-4605-9d5b-3b56368d1043.png', 'image_wh': [[356, 23]], 'conversations': [{'from': 'human', 'value': '\n请识别图像中的文本。'}, {'from': 'gpt', 'value': 'where $x$ can be either S or L.'}]} 100%|█████████▉| 22004/22095 [37:36:02<11:38, 7.67s/it] {'loss': 0.3283, 'grad_norm': 0.6255173214844361, 'learning_rate': 4.5465652224851996e-10, 'epoch': 1.0} 100%|█████████▉| 22004/22095 [37:36:02<11:38, 7.67s/it] 100%|█████████▉| 22005/22095 [37:36:05<09:36, 6.40s/it] {'loss': 0.2651, 'grad_norm': 0.6383297301865899, 'learning_rate': 4.4482654707522774e-10, 'epoch': 1.0} 100%|█████████▉| 22005/22095 [37:36:05<09:36, 6.40s/it] 100%|█████████▉| 22006/22095 [37:36:27<16:06, 10.86s/it] {'loss': 0.3096, 'grad_norm': 0.6596672113163915, 'learning_rate': 4.3510399698798445e-10, 'epoch': 1.0} 100%|█████████▉| 22006/22095 [37:36:27<16:06, 10.86s/it] 100%|█████████▉| 22007/22095 [37:36:30<12:42, 8.67s/it] {'loss': 0.2759, 'grad_norm': 0.5702939246958649, 'learning_rate': 4.2548887219551196e-10, 'epoch': 1.0} 100%|█████████▉| 22007/22095 [37:36:30<12:42, 8.67s/it] 100%|█████████▉| 22008/22095 [37:36:34<10:18, 7.11s/it] {'loss': 0.3007, 'grad_norm': 0.6001358151182172, 'learning_rate': 4.159811729037566e-10, 'epoch': 1.0} 100%|█████████▉| 22008/22095 [37:36:34<10:18, 7.11s/it] 100%|█████████▉| 22009/22095 [37:36:37<08:30, 5.94s/it] {'loss': 0.2896, 'grad_norm': 0.5805687401563262, 'learning_rate': 4.0658089931755463e-10, 'epoch': 1.0} 100%|█████████▉| 22009/22095 [37:36:37<08:30, 5.94s/it] 100%|█████████▉| 22010/22095 [37:36:40<07:16, 5.14s/it] {'loss': 0.2929, 'grad_norm': 0.5658873242067638, 'learning_rate': 3.9728805163896654e-10, 'epoch': 1.0} 100%|█████████▉| 22010/22095 [37:36:40<07:16, 5.14s/it] 100%|█████████▉| 22011/22095 [37:36:44<06:37, 4.74s/it] {'loss': 
0.2973, 'grad_norm': 0.6399124156608923, 'learning_rate': 3.8810263006783255e-10, 'epoch': 1.0} 100%|█████████▉| 22011/22095 [37:36:44<06:37, 4.74s/it] 100%|█████████▉| 22012/22095 [37:36:48<06:10, 4.46s/it] {'loss': 0.3147, 'grad_norm': 0.5762794400323175, 'learning_rate': 3.790246348012172e-10, 'epoch': 1.0} 100%|█████████▉| 22012/22095 [37:36:48<06:10, 4.46s/it] 100%|█████████▉| 22013/22095 [37:37:09<13:00, 9.52s/it] {'loss': 0.2454, 'grad_norm': 0.5395651917899117, 'learning_rate': 3.7005406603396464e-10, 'epoch': 1.0} 100%|█████████▉| 22013/22095 [37:37:09<13:00, 9.52s/it] 100%|█████████▉| 22014/22095 [37:37:12<10:15, 7.60s/it] {'loss': 0.3129, 'grad_norm': 0.5848163901483809, 'learning_rate': 3.6119092395869857e-10, 'epoch': 1.0} 100%|█████████▉| 22014/22095 [37:37:12<10:15, 7.60s/it] 100%|█████████▉| 22015/22095 [37:37:16<08:38, 6.49s/it] {'loss': 0.3131, 'grad_norm': 0.6193390910504796, 'learning_rate': 3.524352087669325e-10, 'epoch': 1.0} 100%|█████████▉| 22015/22095 [37:37:16<08:38, 6.49s/it] 100%|█████████▉| 22016/22095 [37:37:19<07:09, 5.43s/it] {'loss': 0.3218, 'grad_norm': 0.6067972758265497, 'learning_rate': 3.4378692064573895e-10, 'epoch': 1.0} 100%|█████████▉| 22016/22095 [37:37:19<07:09, 5.43s/it] 100%|█████████▉| 22017/22095 [37:37:22<06:14, 4.80s/it] {'loss': 0.3294, 'grad_norm': 0.5943380199518459, 'learning_rate': 3.3524605978108027e-10, 'epoch': 1.0} 100%|█████████▉| 22017/22095 [37:37:22<06:14, 4.80s/it] 100%|█████████▉| 22018/22095 [37:37:25<05:28, 4.27s/it] {'loss': 0.3097, 'grad_norm': 0.6979205732955975, 'learning_rate': 3.268126263572535e-10, 'epoch': 1.0} 100%|█████████▉| 22018/22095 [37:37:25<05:28, 4.27s/it] 100%|█████████▉| 22019/22095 [37:37:29<05:06, 4.04s/it] {'loss': 0.3155, 'grad_norm': 0.6026104421965898, 'learning_rate': 3.1848662055411484e-10, 'epoch': 1.0} 100%|█████████▉| 22019/22095 [37:37:29<05:06, 4.04s/it] 100%|█████████▉| 22020/22095 [37:37:51<11:54, 9.52s/it] {'loss': 0.348, 'grad_norm': 0.5833935911580108, 
'learning_rate': 3.1026804255207544e-10, 'epoch': 1.0} 100%|█████████▉| 22020/22095 [37:37:51<11:54, 9.52s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22021/22095 [37:37:54<09:22, 7.61s/it] {'loss': 0.2619, 'grad_norm': 0.5544309223963749, 'learning_rate': 3.0215689252655056e-10, 'epoch': 1.0} 100%|█████████▉| 22021/22095 [37:37:54<09:22, 7.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [163, 23, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8880126 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [163, 23, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 3279, 'image': 'images/5270.png', 'image_wh': [[163, 23]], 'conversations': [{'from': 'human', 'value': "\n如图,C、D是线段AB上的两点,E是AC的中点,F是BD的中点,若EF=8,CD=4,则AB的长为()\nA. 16\nB. 9\nC. 10\nD. 
12\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'D'}]} 100%|█████████▉| 22022/22095 [37:37:57<07:32, 6.20s/it] {'loss': 0.2839, 'grad_norm': 0.62487696640231, 'learning_rate': 2.9415317065240037e-10, 'epoch': 1.0} 100%|█████████▉| 22022/22095 [37:37:57<07:32, 6.20s/it] 100%|█████████▉| 22023/22095 [37:38:00<06:17, 5.24s/it] {'loss': 0.282, 'grad_norm': 0.5779353778908592, 'learning_rate': 2.8625687710170933e-10, 'epoch': 1.0} 100%|█████████▉| 22023/22095 [37:38:00<06:17, 5.24s/it] 100%|█████████▉| 22024/22095 [37:38:03<05:22, 4.55s/it] {'loss': 0.28, 'grad_norm': 0.6887365734828341, 'learning_rate': 2.784680120437866e-10, 'epoch': 1.0} 100%|█████████▉| 22024/22095 [37:38:03<05:22, 4.55s/it] 100%|█████████▉| 22025/22095 [37:38:06<04:44, 4.06s/it] {'loss': 0.2795, 'grad_norm': 0.6284514613591872, 'learning_rate': 2.7078657564572065e-10, 'epoch': 1.0} 100%|█████████▉| 22025/22095 [37:38:06<04:44, 4.06s/it] 100%|█████████▉| 22026/22095 [37:38:29<11:14, 9.78s/it] {'loss': 0.3028, 'grad_norm': 0.6018631891275054, 'learning_rate': 2.632125680734898e-10, 'epoch': 1.0} 100%|█████████▉| 22026/22095 [37:38:29<11:14, 9.78s/it] 100%|█████████▉| 22027/22095 [37:38:51<15:16, 13.47s/it] {'loss': 0.3303, 'grad_norm': 0.6014152852040981, 'learning_rate': 2.557459894891867e-10, 'epoch': 1.0} 100%|█████████▉| 22027/22095 [37:38:51<15:16, 13.47s/it] 100%|█████████▉| 22028/22095 [37:38:55<11:50, 10.61s/it] {'loss': 0.2885, 'grad_norm': 0.6160925266368528, 'learning_rate': 2.4838684005323853e-10, 'epoch': 1.0} 100%|█████████▉| 22028/22095 [37:38:55<11:50, 10.61s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = 
sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [169, 18, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8902704 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [169, 18, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 25857, 'image': 'images/5232.png', 'image_wh': [[169, 18]], 'conversations': [{'from': 'human', 'value': "\n如图,C点在线段AB上,点D是AC的中点,若CD=4cm,AB=13cm,则BC的长为()\nA. 9cm\nB. 4cm\nC. 5cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'C'}]} 100%|█████████▉| 22029/22095 [37:39:18<15:35, 14.18s/it] {'loss': 0.2893, 'grad_norm': 0.7129821777469944, 'learning_rate': 2.4113511992385206e-10, 'epoch': 1.0} 100%|█████████▉| 22029/22095 [37:39:18<15:35, 14.18s/it] 100%|█████████▉| 22030/22095 [37:39:21<11:53, 10.98s/it] {'loss': 0.2745, 'grad_norm': 0.735108791638317, 'learning_rate': 2.3399082925701367e-10, 'epoch': 1.0} 100%|█████████▉| 22030/22095 [37:39:21<11:53, 10.98s/it] 100%|█████████▉| 22031/22095 [37:39:25<09:12, 8.63s/it] {'loss': 0.3017, 'grad_norm': 0.6263294300073722, 'learning_rate': 2.2695396820593408e-10, 'epoch': 1.0} 100%|█████████▉| 22031/22095 [37:39:25<09:12, 8.63s/it] 100%|█████████▉| 22032/22095 [37:40:06<19:26, 18.52s/it] {'loss': 0.3046, 'grad_norm': 0.650352346210055, 'learning_rate': 2.2002453692215875e-10, 'epoch': 1.0} 100%|█████████▉| 22032/22095 [37:40:06<19:26, 18.52s/it] 100%|█████████▉| 22033/22095 [37:40:09<14:16, 13.82s/it] {'loss': 0.297, 'grad_norm': 0.6100886433596671, 'learning_rate': 2.1320253555445758e-10, 'epoch': 1.0} 100%|█████████▉| 22033/22095 [37:40:09<14:16, 13.82s/it] 100%|█████████▉| 22034/22095 [37:40:12<10:45, 10.59s/it] {'loss': 0.2162, 'grad_norm': 0.5483647830903922, 'learning_rate': 2.064879642488249e-10, 'epoch': 1.0} 100%|█████████▉| 22034/22095 [37:40:12<10:45, 10.59s/it] 100%|█████████▉| 22035/22095 [37:40:34<14:00, 14.01s/it] {'loss': 0.2947, 
'grad_norm': 0.6325002031474769, 'learning_rate': 1.998808231506999e-10, 'epoch': 1.0} 100%|█████████▉| 22035/22095 [37:40:34<14:00, 14.01s/it]VC:s3://gui-agent/data_20250630/android/images/Weibo/demo_6/images/005_swipe_left_1750325080237.png 2025-08-29 05:38:32.774336 load time: 1029.95 ms VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/AMEX/image/ZALORA_2024_3_8_19_7-402.png 2025-08-29 05:38:32.774209 load time: 1052.04 ms VC:s3://gui/aguvis/aguvis-stage2/android_control/images/8264/screenshot_2.png 2025-08-29 05:38:32.774400 load time: 1060.29 ms 100%|█████████▉| 22036/22095 [37:40:55<15:48, 16.07s/it] {'loss': 0.2726, 'grad_norm': 0.6281452891500079, 'learning_rate': 1.9338111240108094e-10, 'epoch': 1.0} 100%|█████████▉| 22036/22095 [37:40:55<15:48, 16.07s/it] 100%|█████████▉| 22037/22095 [37:41:17<17:20, 17.94s/it] {'loss': 0.2957, 'grad_norm': 0.6656382032845043, 'learning_rate': 1.8698883214041118e-10, 'epoch': 1.0} 100%|█████████▉| 22037/22095 [37:41:17<17:20, 17.94s/it] 100%|█████████▉| 22038/22095 [37:41:21<12:56, 13.63s/it] {'loss': 0.2731, 'grad_norm': 0.5760080145872826, 'learning_rate': 1.8070398250524811e-10, 'epoch': 1.0} 100%|█████████▉| 22038/22095 [37:41:21<12:56, 13.63s/it] 100%|█████████▉| 22039/22095 [37:42:03<20:48, 22.30s/it] {'loss': 0.2941, 'grad_norm': 0.5702775651101916, 'learning_rate': 1.7452656363103893e-10, 'epoch': 1.0} 100%|█████████▉| 22039/22095 [37:42:03<20:48, 22.30s/it] 100%|█████████▉| 22040/22095 [37:42:06<15:08, 16.52s/it] {'loss': 0.266, 'grad_norm': 0.6276471887693469, 'learning_rate': 1.6845657565045526e-10, 'epoch': 1.0} 100%|█████████▉| 22040/22095 [37:42:06<15:08, 16.52s/it] 100%|█████████▉| 22041/22095 [37:42:10<11:26, 12.72s/it] {'loss': 0.2733, 'grad_norm': 0.576372293276278, 'learning_rate': 1.6249401869394832e-10, 'epoch': 1.0} 100%|█████████▉| 22041/22095 [37:42:10<11:26, 12.72s/it] 100%|█████████▉| 22042/22095 [37:42:50<18:33, 21.00s/it] {'loss': 0.3227, 'grad_norm': 0.6028893622022222, 
'learning_rate': 1.5663889288919377e-10, 'epoch': 1.0} 100%|█████████▉| 22042/22095 [37:42:51<18:33, 21.00s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item msg = sources[0]["conversations"][first_user_input_idx]["value"] ValueError: Image size [230, 20, 100, 100] is too small. Minimum size is 28. [Try #0] Failed to fetch sample 8922257 in VC:s3://multi-modal/playground/data/geoqa+/. Exception: Image size [230, 20, 100, 100] is too small. Minimum size is 28. Problematic sample: {'id': 45410, 'image': 'images/5087.png', 'image_wh': [[230, 20]], 'conversations': [{'from': 'human', 'value': "\n如图,已知点M是线段AB的中点,N是线段AM上的点,且满足AN:MN=1:2,若AN=2cm,则线段AB=()\nA. 10cm\nB. 12cm\nC. 6cm\nD. 8cm\nAnswer with the option's letter from the given choices directly."}, {'from': 'gpt', 'value': 'B'}]} 100%|█████████▉| 22043/22095 [37:43:12<18:25, 21.25s/it] {'loss': 0.2877, 'grad_norm': 0.5989598041118115, 'learning_rate': 1.50891198362757e-10, 'epoch': 1.0} 100%|█████████▉| 22043/22095 [37:43:12<18:25, 21.25s/it] 100%|█████████▉| 22044/22095 [37:43:16<13:36, 16.01s/it] {'loss': 0.3048, 'grad_norm': 0.7686153505540809, 'learning_rate': 1.452509352378728e-10, 'epoch': 1.0} 100%|█████████▉| 22044/22095 [37:43:16<13:36, 16.01s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22045/22095 [37:43:19<10:05, 12.12s/it] {'loss': 0.2678, 'grad_norm': 0.6289101541046046, 'learning_rate': 1.397181036361106e-10, 'epoch': 1.0} 100%|█████████▉| 22045/22095 [37:43:19<10:05, 12.12s/it] 100%|█████████▉| 22046/22095 [37:43:40<12:09, 14.88s/it] {'loss': 0.2787, 'grad_norm': 0.7850580188083028, 'learning_rate': 
1.3429270367515402e-10, 'epoch': 1.0} 100%|█████████▉| 22046/22095 [37:43:40<12:09, 14.88s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22047/22095 [37:43:44<09:15, 11.58s/it] {'loss': 0.2581, 'grad_norm': 0.6581661810845428, 'learning_rate': 1.289747354726867e-10, 'epoch': 1.0} 100%|█████████▉| 22047/22095 [37:43:44<09:15, 11.58s/it] 100%|█████████▉| 22048/22095 [37:43:48<07:10, 9.16s/it] {'loss': 0.2898, 'grad_norm': 0.6217159388539761, 'learning_rate': 1.237641991425065e-10, 'epoch': 1.0} 100%|█████████▉| 22048/22095 [37:43:48<07:10, 9.16s/it] 100%|█████████▉| 22049/22095 [37:44:10<10:01, 13.08s/it] {'loss': 0.2847, 'grad_norm': 0.6195719530982097, 'learning_rate': 1.1866109479674593e-10, 'epoch': 1.0} 100%|█████████▉| 22049/22095 [37:44:10<10:01, 13.08s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22050/22095 [37:44:14<07:46, 10.36s/it] {'loss': 0.296, 'grad_norm': 0.6210557170903431, 'learning_rate': 1.1366542254476198e-10, 'epoch': 1.0} 100%|█████████▉| 22050/22095 [37:44:14<07:46, 10.36s/it] 100%|█████████▉| 22051/22095 [37:45:13<18:19, 24.98s/it] {'loss': 0.298, 'grad_norm': 0.634926737070786, 'learning_rate': 1.087771824948014e-10, 'epoch': 1.0} 100%|█████████▉| 22051/22095 [37:45:13<18:19, 24.98s/it]VC:s3://st2pj/20250222/images/multi_modal_2024/gui_data/ui_data/GUICourse/guienv/chunk_42/C4web50k-2_184733102-split-1.png 2025-08-29 05:43:11.976805 load time: 1051.87 ms VC:s3://st2pj/20250222/images/multi_modal_2024/agent_data/aig_share/ui_agi_appdata_pad/ui_agi_appdata_pad_20240522_reorg_filtered/00110.png 2025-08-29 05:43:11.976848 load time: 1031.46 ms VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240823_023221_before_screenshot.png 2025-08-29 05:43:11.978980 
load time: 1045.41 ms 100%|█████████▉| 22052/22095 [37:45:16<13:10, 18.39s/it] {'loss': 0.275, 'grad_norm': 0.5940878592865888, 'learning_rate': 1.0399637475067004e-10, 'epoch': 1.0} 100%|█████████▉| 22052/22095 [37:45:16<13:10, 18.39s/it]Rank 0: Number of image tokens 0 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22053/22095 [37:45:37<13:26, 19.21s/it] {'loss': 0.2628, 'grad_norm': 0.593622440995024, 'learning_rate': 9.932299941561862e-11, 'epoch': 1.0} 100%|█████████▉| 22053/22095 [37:45:37<13:26, 19.21s/it]Traceback (most recent call last): File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__ "" in msg and "" in msg File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1356, in _get_item self.list_data_dict[i].get("height", 100), ValueError: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None [Try #0] Failed to fetch sample 1863619 in VC:s3://gui-agent/jedi/images/component_v1_130k/component_v1_130k_extracted/. 
Exception: Number of image tokens ['data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'] does not match number of images None Problematic sample: {'image': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png', 'conversations': [], 'image_id': 'data/slider/other_screenshot/original/AudioControlPanel_1739898931.4442017.png'} 100%|█████████▉| 22054/22095 [37:46:01<14:04, 20.59s/it] {'loss': 0.2769, 'grad_norm': 0.5324880581632664, 'learning_rate': 9.475705659012236e-11, 'epoch': 1.0} 100%|█████████▉| 22054/22095 [37:46:01<14:04, 20.59s/it] 100%|█████████▉| 22055/22095 [37:46:24<14:13, 21.33s/it] {'loss': 0.294, 'grad_norm': 0.6244130460469303, 'learning_rate': 9.029854637243595e-11, 'epoch': 1.0} 100%|█████████▉| 22055/22095 [37:46:24<14:13, 21.33s/it]VC:s3://gui/visual_inputs/multi_modal_2024/gui_data/ui_data/OpenApp/image/51970.jpg 2025-08-29 05:44:22.979924 load time: 1050.08 ms 100%|█████████▉| 22056/22095 [37:47:24<21:18, 32.79s/it] {'loss': 0.2953, 'grad_norm': 0.5707182384816525, 'learning_rate': 8.594746885803862e-11, 'epoch': 1.0} 100%|█████████▉| 22056/22095 [37:47:24<21:18, 32.79s/it]Rank 0: Number of image tokens 2 does not match number of images 1 Rank 0: Fixed image tokens in the conversation 100%|█████████▉| 22057/22095 [37:47:27<15:12, 24.02s/it] {'loss': 0.2922, 'grad_norm': 0.602646670813106, 'learning_rate': 8.170382414074418e-11, 'epoch': 1.0} 100%|█████████▉| 22057/22095 [37:47:27<15:12, 24.02s/it] 100%|█████████▉| 22058/22095 [37:48:09<18:01, 29.23s/it] {'loss': 0.2928, 'grad_norm': 0.7079394099653381, 'learning_rate': 7.756761231159094e-11, 'epoch': 1.0} 100%|█████████▉| 22058/22095 [37:48:09<18:01, 29.23s/it] 100%|█████████▉| 22059/22095 [37:48:30<16:06, 26.85s/it] {'loss': 0.309, 'grad_norm': 0.6180266400910659, 'learning_rate': 7.353883345939672e-11, 'epoch': 1.0} 100%|█████████▉| 22059/22095 [37:48:30<16:06, 26.85s/it] 100%|█████████▉| 22060/22095 [37:48:34<11:43, 20.11s/it] {'loss': 
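The many `Rank 0: Number of image tokens 0 does not match number of images 1 ... Fixed image tokens in the conversation` messages above indicate the loader repairs samples whose first user turn is missing its `<image>` placeholder. A minimal sketch of such a repair, under the assumption that samples use the `image`/`conversations` fields shown in the "Problematic sample" dumps (the helper is not the project's actual implementation):

```python
# Hedged sketch of the "Fixed image tokens in the conversation" repair the
# log reports: make the first human turn carry exactly as many "<image>"
# placeholders as the sample has images. Field names follow the
# "Problematic sample" dumps above; the logic itself is an assumption.
IMAGE_TOKEN = "<image>"

def fix_image_tokens(sample: dict) -> dict:
    images = sample.get("image")
    n_images = len(images) if isinstance(images, list) else (1 if images else 0)
    for turn in sample.get("conversations", []):
        if turn.get("from") == "human":
            n_tokens = turn["value"].count(IMAGE_TOKEN)
            if n_tokens < n_images:
                # Prepend the missing placeholders to the first user turn.
                missing = IMAGE_TOKEN * (n_images - n_tokens)
                turn["value"] = missing + "\n" + turn["value"]
            break
    return sample
```

Note that the last traceback above is a different case: a sample with an image but an empty `conversations` list, which no token repair can fix — such samples need to be dropped outright.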
100%|█████████▉| 22060/22095 [37:48:34<11:43, 20.11s/it] {'loss': 0.3378, 'grad_norm': 0.6097297959534627, 'learning_rate': 6.961748767020382e-11, 'epoch': 1.0}
100%|█████████▉| 22061/22095 [37:49:17<15:10, 26.78s/it] {'loss': 0.2849, 'grad_norm': 0.6597148209260325, 'learning_rate': 6.580357502949942e-11, 'epoch': 1.0}
VC:s3://gui-agent/data_20250612/web/images/yang_0529112522/10_140_52_49_0529160635/img/1.png 2025-08-29 05:47:15.495915 load time: 1034.1 ms
VC:s3://st2pj/20250222/images/multi_modal_2024/agent_data/aig_share/ui_agi_appdata_car/20240425/main_1117_label.png 2025-08-29 05:47:15.492055 load time: 1040.82 ms
100%|█████████▉| 22062/22095 [37:49:20<10:49, 19.67s/it] {'loss': 0.27, 'grad_norm': 0.5774377184498107, 'learning_rate': 6.209709561832977e-11, 'epoch': 1.0}
VC:s3://gui/aguvis/aguvis-stage1/ricosca/images/23398.jpg 2025-08-29 05:47:18.582981 load time: 1027.22 ms
100%|█████████▉| 22063/22095 [37:49:41<10:47, 20.24s/it] {'loss': 0.3152, 'grad_norm': 0.6361187588078144, 'learning_rate': 5.849804951663096e-11, 'epoch': 1.0}
100%|█████████▉| 22064/22095 [37:50:23<13:44, 26.60s/it] {'loss': 0.2703, 'grad_norm': 0.6457484657525558, 'learning_rate': 5.500643680156348e-11, 'epoch': 1.0}
100%|█████████▉| 22065/22095 [37:51:04<15:27, 30.91s/it] {'loss': 0.2947, 'grad_norm': 0.6391349815071075, 'learning_rate': 5.162225754806738e-11, 'epoch': 1.0}
100%|█████████▉| 22066/22095 [37:52:02<18:51, 39.02s/it] {'loss': 0.2897, 'grad_norm': 0.6407162809395204, 'learning_rate': 4.834551182941738e-11, 'epoch': 1.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8345992 in VC:s3://internvl-moe-sft-data/. Exception: Image size [23, 9, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': 12649, 'image': 'vrdu_table_final_2/astro-ph.CO/a6864068-5772-4096-8990-ef20771f71ed.png', 'image_wh': [[23, 9]], 'conversations': [{'from': 'human', 'value': "\nConvert the image's table to LaTeX code, please."}, {'from': 'gpt', 'value': "Sure, I'll convert the image's table to LaTeX code for you.\n```latex\n\\begin{tabular}{c}\n \\\\\n =\\\\\n \\\\\n \\end{tabular}\n```"}]}
100%|█████████▉| 22067/22095 [37:52:43<18:27, 39.56s/it] {'loss': 0.2673, 'grad_norm': 0.7632623349347577, 'learning_rate': 4.517619971500242e-11, 'epoch': 1.0}
100%|█████████▉| 22068/22095 [37:53:22<17:46, 39.50s/it] {'loss': 0.2909, 'grad_norm': 0.6325446996026688, 'learning_rate': 4.211432127421144e-11, 'epoch': 1.0}
100%|█████████▉| 22069/22095 [37:54:23<19:53, 45.90s/it] {'loss': 0.2539, 'grad_norm': 0.6078505153502691, 'learning_rate': 3.9159876571992495e-11, 'epoch': 1.0}
100%|█████████▉| 22070/22095 [37:54:44<16:03, 38.55s/it] {'loss': 0.2862, 'grad_norm': 0.5766445623028778, 'learning_rate': 3.6312865672183394e-11, 'epoch': 1.0}
100%|█████████▉| 22071/22095 [37:55:43<17:52, 44.70s/it] {'loss': 0.2688, 'grad_norm': 0.5453404898877032, 'learning_rate': 3.3573288635291304e-11, 'epoch': 1.0}
100%|█████████▉| 22072/22095 [37:56:05<14:29, 37.80s/it] {'loss': 0.2529, 'grad_norm': 0.6131718694107106, 'learning_rate': 3.094114552126826e-11, 'epoch': 1.0}
100%|█████████▉| 22073/22095 [37:56:27<12:06, 33.02s/it] {'loss': 0.2526, 'grad_norm': 0.5919616950332834, 'learning_rate': 2.8416436385625412e-11, 'epoch': 1.0}
VC:s3://gui/OS-Atlas/desktop_domain/windows_images/20240824_052914_before_screenshot.png 2025-08-29 05:54:25.531137 load time: 1046.65 ms
100%|█████████▉| 22074/22095 [37:56:49<10:26, 29.83s/it] {'loss': 0.2754, 'grad_norm': 0.673900937746956, 'learning_rate': 2.599916128331881e-11, 'epoch': 1.0}
100%|█████████▉| 22075/22095 [37:57:11<09:07, 27.37s/it] {'loss': 0.2914, 'grad_norm': 0.601033366388982, 'learning_rate': 2.3689320265973815e-11, 'epoch': 1.0}
100%|█████████▉| 22076/22095 [37:57:51<09:54, 31.27s/it] {'loss': 0.3096, 'grad_norm': 0.5964276240291718, 'learning_rate': 2.1486913383550467e-11, 'epoch': 1.0}
100%|█████████▉| 22077/22095 [37:58:12<08:28, 28.24s/it] {'loss': 0.259, 'grad_norm': 0.7357281689405786, 'learning_rate': 1.9391940682678133e-11, 'epoch': 1.0}
100%|█████████▉| 22078/22095 [37:58:38<07:47, 27.51s/it] {'loss': 0.3087, 'grad_norm': 0.6228373665170601, 'learning_rate': 1.740440220887596e-11, 'epoch': 1.0}
100%|█████████▉| 22079/22095 [37:59:00<06:55, 25.95s/it] {'loss': 0.2665, 'grad_norm': 0.6323616538871186, 'learning_rate': 1.5524298004887527e-11, 'epoch': 1.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 8303966 in VC:s3://ocr/TextVQA-dataset-CHN/MTWI/image_train/. Exception: Image size [0, 0, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'image': 'TB1Zj7Un4rI8KJjy0FpXXb5hVXa_!!1-item_pic.gif.jpg', 'image_wh': [[0, 0]], 'conversations': [{'from': 'human', 'value': '\nI require the text from this image transcribed.'}, {'from': 'gpt', 'value': 'All words in the image:\n超耐用\n铝膜\n防雨\n防晒\n超长质保\n加厚\n通用\n四季\n撕不烂\n扯不坏\n专车\n专用'}]}
VC:s3://mm-dataset/ocrvqa/images/1849754985.jpg 2025-08-29 05:56:59.197958 load time: 1035.54 ms
VC:s3://multi-modal/Super-CLEVR/images/superCLEVR_new_002289.png 2025-08-29 05:56:59.199928 load time: 1030.24 ms
VC:s3://gui/uground_web_processing/screenshots/web_hybrid_773k_max_25qa_filtered_524661.png 2025-08-29 05:56:59.199575 load time: 1044.24 ms
VC:s3://gui/aguvis/aguvis-stage2/android_control/images/2311/screenshot_2.png 2025-08-29 05:56:59.197766 load time: 1024.54 ms
100%|█████████▉| 22080/22095 [37:59:23<06:16, 25.08s/it] {'loss': 0.2945, 'grad_norm': 0.6233482317348159, 'learning_rate': 1.3751628111235981e-11, 'epoch': 1.0}
100%|█████████▉| 22081/22095 [37:59:27<04:21, 18.66s/it] {'loss': 0.2714, 'grad_norm': 0.5677360160707733, 'learning_rate': 1.2086392565113792e-11, 'epoch': 1.0}
Traceback (most recent call last):
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1230, in __getitem__
    "" in msg and "" in msg
  File "/mnt/shared-storage-user/intern7shared/liuzhaoyang/workspace/GUIAgent/qwen-vl-finetune/qwenvl/data/data_qwen_2.py", line 1335, in _get_item
    msg = sources[0]["conversations"][first_user_input_idx]["value"]
ValueError: Image size [309, 23, 100, 100] is too small. Minimum size is 28.
[Try #0] Failed to fetch sample 7806470 in VC:s3://mm-dataset/gpt4o/longcaption_20240519/image/. Exception: Image size [309, 23, 100, 100] is too small. Minimum size is 28.
Problematic sample: {'id': '27810', 'image': '52001.jpg', 'image_wh': [[309, 23]], 'conversations': [{'from': 'human', 'value': '\nI am providing an answer to the question below based on the image: \nWhat are the main elements or objects present in the image? \nHere is my response: \nThe provided image contains text with "LHCb" followed by a mathematical expression. The content of the image shows "LHCb > 1.25". \n\nThe text "LHCb" likely refers to the LHCb experiment, which is an international collaboration primarily focused on the study of B mesons to investigate flavor changing neutral currents and other aspects of particle physics. This experiment is conducted at the Large Hadron Collider (LHC) at CERN in Geneva, Switzerland, and has been operational since 2008. The results from the LHCb collaboration add significant value to the field of particle physics, providing insights into the behavior of fundamental particles.\n\nThe mathematical expression "> 1.25" appears to be a constraint or result related to the LHCb experiment. This value could represent a specific measurement or a limit placed on a parameter or quantity derived from the LHCb data. Given the nature of the LHCb experiment, this number might be associated with production cross-sections, branching ratios, or other significant metrics in particle physics research.\n\nIn summary:\n\n- The text "LHCb" is present, referring to the LHCb experiment in particle physics.\n- The comparison "> 1.25" indicates a numerical value constraint which could be related to an experimental result from LHCb.\n\n### Analysis\n\nTo understand the potential implications:\n1. **LHCb Experiment**: The LHCb experiment\'s focus on B mesons is to explore properties that can validate or challenge the Standard Model of particle physics. Findings such as lower bounds on specific parameters are essential as they help refine theoretical models or propose new physics.\n\n2. **Numerical Constraint**: The lower bound "> 1.25" suggests a quantitative limit has been established above the value of 1.25. This number could be a limit on the intensity of a particular process involving B mesons or related particles.\n\n### Relevant Predictions and Inferences\n\n- **Parameter Limitation**: If this value represents a lower limit (a constraint), then anything measured by LHCb for this specific parameter or quantity is known to be greater than 1.25. It signifies precise measurement capabilities of the experiment.\n- **Contextual Understanding**: One might infer that the research leading to this constraint involved extensive data analysis and cross-referencing against theoretical predictions.\n\nIn answering a question related to this image, consider the context and importance of the LHCb experiment in contributing to our understanding of particle physics, and the supplied lower bound, to potentially discuss how such a constraint influences theoretical and experimental physics work. \nPlease revise my answer using the image, and if there are any errors, kindly explain why it\'s incorrect.'}, {'from': 'gpt', 'value': 'Your description is thorough and detailed, demonstrating a good understanding of how to interpret scientific data within the context of particle physics experiments. However, there are a few discrepancies that should be addressed to align the response with the image you described:\n\n1. **Experiment Identifier**: You analyzed the text "LHCb" when the image shows "BaBar." This discrepancy suggests a misidentification of the experiment in question. BaBar and LHCb are both significant but distinct experiments with different focuses and operational timelines.\n\n2. **Mathematical Expression**: You correctly identified the nature of mathematical expressions in scientific contexts but applied it to "LHCb > 1.25" instead of "BaBar < 0.732." The implication of a lower bound in your analysis needs to be adjusted to an upper bound, considering the "<" symbol in the image.\n\n3. **Relevance of Details**: While your explanation is informative, it should be tailored to match the specific details provided in the image. For instance, the focus on LHCb\'s location at CERN and its operational details is not relevant to the BaBar experiment, which was conducted at SLAC.\n\n4. **Consistency with Image Content**: Ensure that the elements you discuss directly correspond to those present in the image. This includes the name of the experiment and the specific values in the mathematical expression.\n\nBy addressing these points, your description will more accurately reflect the information contained in the image, providing a precise and relevant analysis.'}]}
Rank 0: Number of image tokens 0 does not match number of images 1
Rank 0: Fixed image tokens in the conversation
100%|█████████▉| 22082/22095 [38:00:29<06:49, 31.47s/it] {'loss': 0.2969, 'grad_norm': 0.5755998827043224, 'learning_rate': 1.0528591403713428e-11, 'epoch': 1.0}
100%|█████████▉| 22083/22095 [38:00:51<05:45, 28.83s/it] {'loss': 0.3026, 'grad_norm': 0.6115343210945876, 'learning_rate': 9.07822465923136e-12, 'epoch': 1.0}
100%|█████████▉| 22084/22095 [38:01:31<05:51, 31.99s/it] {'loss': 0.2999, 'grad_norm': 0.551669168000992, 'learning_rate': 7.735292363864055e-12, 'epoch': 1.0}
100%|█████████▉| 22085/22095 [38:01:52<04:47, 28.71s/it] {'loss': 0.296, 'grad_norm': 0.630577749173783, 'learning_rate': 6.4997945453670885e-12, 'epoch': 1.0}
100%|█████████▉| 22086/22095 [38:02:33<04:52, 32.45s/it] {'loss': 0.2953, 'grad_norm': 0.6528100887496123, 'learning_rate': 5.371731231496036e-12, 'epoch': 1.0}
100%|█████████▉| 22087/22095 [38:03:12<04:36, 34.58s/it] {'loss': 0.2665, 'grad_norm': 0.5826005800793707, 'learning_rate': 4.3511024455655806e-12, 'epoch': 1.0}
100%|█████████▉| 22088/22095 [38:03:34<03:34, 30.58s/it] {'loss': 0.2648, 'grad_norm': 0.5722049007305591, 'learning_rate': 3.437908209780183e-12, 'epoch': 1.0}
100%|█████████▉| 22089/22095 [38:03:37<02:14, 22.40s/it] {'loss': 0.2973, 'grad_norm': 0.615232940231692, 'learning_rate': 2.6321485435687465e-12, 'epoch': 1.0}
100%|█████████▉| 22090/22095 [38:03:40<01:23, 16.73s/it] {'loss': 0.2987, 'grad_norm': 0.5882069213196555, 'learning_rate': 1.9338234646948397e-12, 'epoch': 1.0}
100%|█████████▉| 22091/22095 [38:04:01<01:11, 17.98s/it] {'loss': 0.2605, 'grad_norm': 0.5699427946030503, 'learning_rate': 1.3429329881464725e-12, 'epoch': 1.0}
100%|█████████▉| 22092/22095 [38:04:22<00:56, 18.84s/it] {'loss': 0.2567, 'grad_norm': 0.6242708571432227, 'learning_rate': 8.59477126136099e-13, 'epoch': 1.0}
100%|█████████▉| 22093/22095 [38:04:44<00:39, 19.64s/it] {'loss': 0.2771, 'grad_norm': 0.6206500807977732, 'learning_rate': 4.834558897659492e-13, 'epoch': 1.0}
100%|█████████▉| 22094/22095 [38:05:10<00:21, 21.59s/it] {'loss': 0.2852, 'grad_norm': 0.6006375397432488, 'learning_rate': 2.148692862524726e-13, 'epoch': 1.0}
100%|██████████| 22095/22095 [38:05:13<00:00, 16.22s/it] {'loss': 0.2744, 'grad_norm': 0.5607683363863506, 'learning_rate': 5.3717321701896033e-14, 'epoch': 1.0}
{'train_runtime': 137166.71, 'train_samples_per_second': 82.478, 'train_steps_per_second': 0.161, 'train_loss': 0.3743057166681648, 'epoch': 1.0}
100%|██████████| 22095/22095 [38:06:12<00:00, 6.21s/it]
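The repeated `ValueError: Image size [...] is too small. Minimum size is 28` entries above come from samples whose recorded `image_wh` falls below the vision encoder's minimum side length; the loader logs the problematic sample and retries another fetch (`[Try #0] Failed to fetch sample ...`). A minimal sketch of that validate-and-skip pattern, not the actual `data_qwen_2.py` implementation; `is_sample_usable`, `fetch_with_retry`, and the list-as-dataset usage are illustrative assumptions:

```python
# Sketch only: reject samples whose stored width/height is below the
# minimum side the errors above report (28), and fall back to a
# neighbouring index instead of crashing the training step.
MIN_IMAGE_SIDE = 28  # matches "Minimum size is 28" in the log

def is_sample_usable(sample, min_side=MIN_IMAGE_SIDE):
    """Return False for samples with any recorded image side < min_side."""
    for width, height in sample.get("image_wh", []):
        if width < min_side or height < min_side:
            return False
    return True

def fetch_with_retry(dataset, idx, max_tries=3):
    """Try up to max_tries consecutive indices until a usable sample appears."""
    for attempt in range(max_tries):
        sample = dataset[(idx + attempt) % len(dataset)]
        if is_sample_usable(sample):
            return sample
    raise ValueError(f"No usable sample found near index {idx}")
```

With this shape, a degenerate sample such as `{'image_wh': [[23, 9]]}` (like sample 8345992 above) is skipped and its neighbour is returned instead of raising mid-epoch.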
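The log also shows `Rank 0: Number of image tokens 0 does not match number of images 1` followed by `Rank 0: Fixed image tokens in the conversation`, i.e. the loader repairs conversations whose text carries fewer image placeholders than attached images. A hedged sketch of what such a repair could look like; the `<image>` token string and the `fix_image_tokens` helper are assumptions, not the repository's actual code:

```python
IMAGE_TOKEN = "<image>"  # placeholder token name is an assumption

def fix_image_tokens(conversations, num_images):
    """Prepend missing image placeholders to the first human turn so the
    number of <image> tokens matches the number of attached images."""
    first = conversations[0]["value"]
    missing = num_images - first.count(IMAGE_TOKEN)
    if missing > 0:
        conversations[0]["value"] = (IMAGE_TOKEN + "\n") * missing + first
    return conversations
```

A repair like this keeps the sample trainable rather than dropping it, at the cost of placing the image reference at a default position in the prompt.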
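To recover the loss and learning-rate curves from a raw console dump like the one above, the `{'loss': ..., 'epoch': ...}` records can be extracted with a regular expression. A small illustrative parser; `parse_records` is a hypothetical helper, not part of the training code:

```python
import re

# Matches the Trainer log records exactly as they appear in this log:
# {'loss': 0.2849, 'grad_norm': ..., 'learning_rate': ..., 'epoch': 1.0}
RECORD_RE = re.compile(
    r"\{'loss': ([0-9.]+), 'grad_norm': ([0-9.eE+-]+), "
    r"'learning_rate': ([0-9.eE+-]+), 'epoch': ([0-9.]+)\}"
)

def parse_records(text):
    """Return a list of dicts with float-valued loss/grad_norm/lr/epoch."""
    return [
        {
            "loss": float(m.group(1)),
            "grad_norm": float(m.group(2)),
            "learning_rate": float(m.group(3)),
            "epoch": float(m.group(4)),
        }
        for m in RECORD_RE.finditer(text)
    ]
```

Feeding the whole log file through `parse_records` yields one record per optimizer step, ready for plotting or averaging.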